A look at how we built a semantic search layer that understands what people mean, not just what they type

Most search bars promise more than they deliver. Type "melancholy road trip" and you'll get nothing. Type "drive" and you'll get everything with the word drive in it. For content-rich websites, that gap between what a user is reaching for and what they actually find is where discovery dies.
Vector embeddings change that. For agencies building editorial platforms, archive sites, or anything with deep catalogues of rich content, they're worth understanding.

How content becomes something a machine can reason about
The process starts before any search happens. Each piece of content goes through an enrichment step, where a language model (e.g. Claude) reads it and generates structured metadata. Not just categories, but interpretive signals: mood, era, instrumentation, emotional register, cultural reference points. The kind of context a knowledgeable person would reach for when describing something to a friend.
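As a rough illustration of what that enriched record might look like, here is a minimal Python sketch. The class name, field names and sample values are hypothetical, chosen to mirror the signals listed above; they are not the actual schema, and the call to the language model itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichedContent:
    """Hypothetical shape for one piece of content after enrichment."""

    title: str
    body: str
    mood: list[str] = field(default_factory=list)
    era: str = ""
    instrumentation: list[str] = field(default_factory=list)
    emotional_register: str = ""
    cultural_references: list[str] = field(default_factory=list)

    def to_embedding_text(self) -> str:
        """Flatten the content and its interpretive signals into one string.

        Empty fields are skipped so they do not add noise to the text
        that is later passed to the embedding model.
        """
        parts = [
            self.title,
            self.body,
            f"Mood: {', '.join(self.mood)}" if self.mood else "",
            f"Era: {self.era}" if self.era else "",
            f"Instrumentation: {', '.join(self.instrumentation)}"
            if self.instrumentation else "",
            f"Emotional register: {self.emotional_register}"
            if self.emotional_register else "",
            f"Cultural references: {', '.join(self.cultural_references)}"
            if self.cultural_references else "",
        ]
        return "\n".join(part for part in parts if part)


# Illustrative values only, not real catalogue data.
track = EnrichedContent(
    title="Example Track",
    body="A slow instrumental that closes the album.",
    mood=["melancholy", "nocturnal"],
    era="late 1970s",
)
print(track.to_embedding_text())
```

Keeping the metadata as labelled lines of text means the interpretive signals end up in the same string as the content itself, so both can influence where the item lands once it is embedded.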
That enriched content is then passed to an embedding model. We use Cohere's multilingual model, which converts text into a vector, a long list of numbers that positions the content within a space where meaning, not keywords, determines proximity. Two pieces of content that share a feeling or a theme end up close together in that space, even if they share no words at all.
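To make "close together" concrete, the sketch below computes cosine similarity, a common way of measuring proximity between embeddings. The three-dimensional vectors are invented for illustration; real embeddings have hundreds or thousands of dimensions and come from the embedding model, not from hand-written lists.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example.
rainy_ballad = [0.9, 0.1, 0.2]
late_night_blues = [0.8, 0.2, 0.3]
stadium_anthem = [0.1, 0.9, 0.7]

similar = cosine_similarity(rainy_ballad, late_night_blues)
dissimilar = cosine_similarity(rainy_ballad, stadium_anthem)

# The two quieter tracks point in nearly the same direction,
# so they score higher than the ballad and the anthem do.
print(similar > dissimilar)
```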
Those vectors are stored in Supabase using the pgvector extension, which turns a standard Postgres database into a vector store. When someone searches, their query is embedded the same way, and the database finds the nearest neighbours: the content that means something similar, matched on those numbers rather than on the words themselves. To keep that fast at scale, we use HNSW indexing, a graph-based approximate nearest-neighbour approach that makes similarity search feel instant even across tens of thousands of records.
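A minimal sketch of what the storage and query side could look like with pgvector follows. The table name, column names and the 1024-dimension vector size are assumptions made for illustration (the dimension has to match whatever the embedding model outputs), and the `%(name)s` placeholders assume a psycopg-style Postgres driver. The SQL is held in Python strings and nothing here connects to a database.

```python
# Assumed embedding size; it must match the embedding model's output.
EMBEDDING_DIMENSIONS = 1024

# Hypothetical schema: one row per piece of content, with its vector.
SETUP_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS content_items (
    id bigserial PRIMARY KEY,
    title text NOT NULL,
    content_type text NOT NULL,
    embedding vector({EMBEDDING_DIMENSIONS})
);

-- HNSW index using cosine distance, matching the <=> operator below.
CREATE INDEX IF NOT EXISTS content_items_embedding_hnsw
    ON content_items USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine distance operator; 1 - distance gives similarity.
SEARCH_SQL = """
SELECT id, title, content_type,
       1 - (embedding <=> %(query_embedding)s::vector) AS similarity
FROM content_items
ORDER BY embedding <=> %(query_embedding)s::vector
LIMIT %(limit)s;
"""


def to_vector_literal(values: list[float]) -> str:
    """Format a list of floats in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def build_search_params(query_embedding: list[float], limit: int = 10) -> dict:
    """Build the parameter dict to pass alongside SEARCH_SQL."""
    return {
        "query_embedding": to_vector_literal(query_embedding),
        "limit": limit,
    }


params = build_search_params([0.1, 0.2, 0.3], limit=5)
print(params["query_embedding"])
```

Ordering by the distance operator directly is what allows Postgres to use the HNSW index; the index operator class and the operator in the query need to agree on the distance measure.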

An example for a music artist with a deep catalogue
We built a proof of concept internally, using a band with decades of recorded work as a test case. The premise was simple: how could we connect different types of content (long-form essays, reviews, products, audio, video) in a programmatic way, so that a user could follow a logical trail through it?
With a semantic layer in place, someone typing "late night, a bit sad, sounds like rain" finds tracks that match that feeling, not that phrase. The experience stops feeling like a search engine and starts feeling like a recommendation from someone who actually knows the music.
This opens up the potential for personalised content journeys, where users with different relationships to the content are each served in their own way.

A different kind of search
Any client whose value lives in the depth of their content (a publisher, a cultural institution, a fashion archive) is sitting on an asset their users can't fully reach.
Semantic search doesn't just improve findability. It changes what the site is for. Instead of a search for what you already know exists, it becomes a way to explore what you didn't know to look for. That's a different proposition: search that actually delivers on intention.