Oct 30, 2007 115 reads by Navneet Kaushal

Following the "Do what I mean, not what I say" post, the official Live Search blog is back with "Do What I Mean, Not What I Say" Part II, which introduces more features in Live Search.


"We do really badly on the query ca chp," a coworker complained in one email.

"Ca chp?" I thought. "What the heck does that mean?"

It turned out it was pretty simple: "ca" was short for California and "chp" was short for California highway patrol. Obviously, my coworker knew what he meant by the query ca chp, but I didn't know it, and our search engine definitely didn't know it. After seeing many customer complaints of this sort, we had more confirmation that, to truly improve the relevance of our search engine, we had to move past simple keyword matching and into understanding the intent of your query.

So when you search for crossroads mall in OKC we take this to mean crossroads mall in Oklahoma City. When you search for Julia child bio we'll also look for Julia child biography to give you better results. But of course, the same word could mean something different in another context. Hence, when you search for nw university we'll search for northwestern university, but if you search for nw co-ed soccer we'll search for northwest co-ed soccer instead.
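The context-dependent expansion described above can be sketched in a few lines of Python. This is only an illustration, not how Live Search actually works: the `RULES` table, the `expand_query` function and the idea of keying each expansion on other words in the query are all assumptions made for the example.

```python
# Hypothetical rule table: abbreviation -> ordered list of
# (expansion, context words that must all appear in the query).
# An empty context set acts as the default expansion.
RULES = {
    "okc": [("oklahoma city", frozenset())],
    "bio": [("biography", frozenset())],
    "nw": [
        ("northwestern", frozenset({"university"})),
        ("northwest", frozenset()),
    ],
}


def expand_query(query: str) -> str:
    """Return the query with known abbreviations expanded by context."""
    tokens = query.lower().split()
    context = set(tokens)
    expanded = []
    for token in tokens:
        replacement = token
        for expansion, required in RULES.get(token, []):
            # The first rule whose context words are all present wins.
            if required <= context:
                replacement = expansion
                break
        expanded.append(replacement)
    return " ".join(expanded)


print(expand_query("nw university"))
print(expand_query("nw co-ed soccer"))
```

With this table, "nw" becomes "northwestern" only when "university" is also in the query, and falls back to "northwest" otherwise. A real engine would presumably search for the expanded form in addition to the original words rather than replacing them.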

Intelligent "stop word" retention

Another area that fell under the "Do what I mean, not what I say!" category was "stop words".

What are "stop words", you ask?

Well, in search engine parlance they are words that often carry little "meaning" in the query, such as "a", "the" and "in", so it may not be crucial whether they are found on the desired results page. For example, if the query was the aurora borealis, you probably wouldn't be too concerned as to whether the word "the" was found on the top page returned, since "the" doesn't contain much meaning here. Hence, it may be perfectly acceptable to drop it from the query when retrieving pages.

However, if your query was The Office (the title of a popular television show) it would be absolutely ridiculous to drop the word "the", since the query would essentially change meaning, and we received a lot of emails about how we were doing just that. In fact, we were previously dropping all stop words routinely, and knew this needed dramatic improvement.

In our recent release we've overhauled our logic, and if you search for something where the "stop words" contain crucial meaning, we can sense that and realize that "the" in The Office is crucial, or the "A" in Avenue A is crucial; whereas if you query for something like the aurora borealis we realize that the word "the" isn't as crucial as the other query words.
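One simple way to picture this kind of stop word retention is a list of phrases in which stop words are known to matter. The Python sketch below is an assumption-laden illustration, not the Live Search logic: `STOP_WORDS`, `PROTECTED_PHRASES` and `retain_terms` are hypothetical names, and a production system would more likely learn which stop words are crucial than look them up in a hand-written list.

```python
# Hypothetical stop word list and phrase list, for illustration only.
STOP_WORDS = {"a", "the", "in", "of"}
PROTECTED_PHRASES = {("the", "office"), ("avenue", "a")}


def retain_terms(query: str) -> list[str]:
    """Drop stop words unless they are part of a protected phrase."""
    tokens = query.lower().split()
    # Start by keeping every token that is not a stop word.
    keep = [token not in STOP_WORDS for token in tokens]
    for phrase in PROTECTED_PHRASES:
        width = len(phrase)
        for start in range(len(tokens) - width + 1):
            if tuple(tokens[start:start + width]) == phrase:
                # Every word inside a protected phrase is retained.
                for index in range(start, start + width):
                    keep[index] = True
    return [token for token, kept in zip(tokens, keep) if kept]


print(retain_terms("the aurora borealis"))
print(retain_terms("The Office"))
print(retain_terms("Avenue A"))
```

Here "the" is dropped from the aurora borealis but kept in The Office, and the "A" in Avenue A survives because the whole phrase is protected.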

The evolution and growth of Live Search have been unstoppable.

Navneet Kaushal

Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.