How a search engine works


If you ask a marketer to define a search engine, he or she will say, "It's the dreamland. With it, you can sell whatever you want to whoever wants to buy it right now."
If you ask a webmaster, he or she will say, "It's the mechanism that allows people to find quality web pages," and will probably try to tell you about all the pages he or she has.
For a student, it is a way of finding all the information needed to finish homework earlier, and for gamers it is the only way to find cheats, information about new games and, of course, the games themselves.
In general, for all of us it is the medium that allows us to find whatever we are looking for, at least until something better comes along.
But how did it all start?


A brief history of search engines

The entire web as you know it was the vision of Tim Berners-Lee. While working at CERN in Europe, he created a site on which he described how the web worked, and placed its documents on the first web server, at info.cern.ch.
Since interest in the web was small back then, the first group of sites was formed by universities linking to one another. Tim created a directory, known as the Virtual Library, and that was all.
From that time until the late 90s, there were no search engines as you know them. How things worked back then isn't important but, as you can understand, as the web grew, the need for a way to retrieve information grew with it.
That need drove the creation of directories such as Yahoo and, later, of search engines as you know them today.
Let's take a look inside them and see how things work, with a view to how they can help you get traffic and a higher ranking in their results.


Relevancy of search engines

The only purpose of a search engine is to deliver the most relevant content it can for whatever a user asks. If a search engine does that, it's considered successful. And as these lines are typed, the most successful search engine is Google.


Listing a new site

So you finished your site. There is one problem: not a single search engine knows that it exists. And even if they find it, they will have a headache trying to figure out the quality of its content. But how do they find you, and what do they do to determine your site's quality?


What are the components of a search engine

Regardless of how a search engine is organized, all of them have three major parts:

  • a spider (also called crawler)
  • an archive (also called catalogue or index)
  • and a search interface


What is a spider

Also called a crawler. It does the dirty job of scanning the web, adding new pages to the index and refreshing the stored copies of pages it checked before that have since changed.
Crawlers have two types of scanning, deep and shallow. How often they scan a web page also varies: pages that change frequently are scanned more often than pages that don't change. This doesn't affect your ranking or relevancy; it just helps the search engine keep an up-to-date copy in its index.
Truth be told, the more frequently you get scanned, the better, simply because newly written pages get scanned almost immediately when you link to them from your frequently crawled pages.
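
To make the idea of a spider a bit more concrete, here is a minimal sketch in Python. It is not how any real crawler is written. The "web" is a made-up dictionary (TOY_WEB), the URLs are invented, and treating "shallow" and "deep" as a limit on how many links the spider follows from the start page is an assumption made for this example.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "a.example/": ["a.example/news", "a.example/about"],
    "a.example/news": ["a.example/news/today"],
    "a.example/about": [],
    "a.example/news/today": ["a.example/"],
}

def crawl(start_url, max_depth, web=TOY_WEB):
    """Breadth-first crawl that follows links up to max_depth hops from start_url."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited_order = []
    while queue:
        url, depth = queue.popleft()
        visited_order.append(url)
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in web.get(url, []):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append((link, depth + 1))
    return visited_order

shallow = crawl("a.example/", max_depth=1)  # start page plus its direct links
deep = crawl("a.example/", max_depth=3)     # keeps following links further down
print(shallow)
print(deep)
```

In this toy setup the shallow scan never reaches a.example/news/today, while the deep scan does. That is also why a link from a frequently crawled page helps a new page get picked up quickly.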


What is the index

The index, or database, of a search engine is where all the data from the spiders is stored. When you search, you actually search the index, not the web.
Searching the web is the job of spiders. The index is a catalogue whose content is sorted and organized by keyword for each language.
So when you search Google and it returns 500,000 results, it means that 500,000 pages contain the keywords you requested or have inbound link texts that contain the words in a phrase.
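
A catalogue organized by keyword is often built as what is called an inverted index: a table that maps each word to the pages containing it. The sketch below is only an illustration under simple assumptions. The pages and URLs are invented, words are split on spaces, and it ignores link texts, languages and ranking entirely.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the sorted URLs of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep only pages that have this word too
    return sorted(results)

# Hypothetical pages that a spider might have collected.
pages = {
    "cars.example/best": "the best car of the year",
    "cars.example/cheap": "cheap car deals",
    "food.example/best": "the best pizza in town",
}
index = build_index(pages)
print(search(index, "best car"))
```

Notice that search() never looks at the pages themselves, only at the index. The number of results a search engine reports is, roughly speaking, the size of the set that such a lookup returns.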


What is the search interface

If you go to Google and ask something, you use the search interface to interact with the algorithm it uses, which scans through its index in an effort to return the most relevant results.
To do that, it breaks your input into keywords, takes special commands into account and then checks its database.
In most search engines, the results for most keywords are stored in advance and delivered from there, and only some are calculated in real time. In simple words, when you ask for "what is the best car", it doesn't actually search the index; the results for that keyword are already calculated and delivered to you (usually because the request is so frequent).
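
The idea of delivering stored results instead of searching every time can be sketched as a simple cache. This is a simplification: the class and function names (CachedSearch, toy_lookup) are made up for this example, and here an answer is stored the first time a query is asked, whereas a real engine may compute popular queries in advance.

```python
class CachedSearch:
    """Wraps a lookup function and stores answers for repeated queries."""

    def __init__(self, lookup):
        self.lookup = lookup
        self.cache = {}
        self.index_hits = 0  # how many times the index was actually consulted

    def ask(self, query):
        key = " ".join(query.lower().split())  # normalise case and spacing
        if key not in self.cache:
            self.index_hits += 1
            self.cache[key] = self.lookup(key)
        return self.cache[key]

def toy_lookup(query):
    # Hypothetical stand-in for a real (and much slower) index search.
    return ["result for " + query]

engine = CachedSearch(toy_lookup)
first = engine.ask("what is the best car")
second = engine.ask("What is the   best car")  # same question, different typing
print(engine.index_hits)
```

Both calls return the same stored answer, and the lookup function runs only once.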


What is a search engine algorithm

It's a mathematical equation that, in simple words, does its best to imitate the user. It's common to see your place in the results fall, not because you changed something, but because the algorithm changed.


What is a Meta tag

Meta tags exist to help search engines organize the web. In meta tags, the author of a site writes keywords and a description, which were used to match users' keyword requests.
When they were introduced they were very important, but over time they lost their relevance, because people began to abuse them by stuffing in lots of repeated words like "free, free, free ...", and that was enough to get a high ranking.
Still, even though meta tags have lost their importance, you should put some well-structured keywords in there, because every bit helps.
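
For readers who have never seen one, the sketch below shows what keyword and description meta tags look like in a page and how a program could read them. It uses Python's standard html.parser module. The sample page and the MetaTagReader class are invented for this example, and nothing here claims to reflect how any particular search engine treats these tags.

```python
from html.parser import HTMLParser

class MetaTagReader(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        content = attrs.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content

# Hypothetical page head with the two classic meta tags.
SAMPLE_PAGE = """
<html><head>
<meta name="keywords" content="search engine, crawler, index">
<meta name="description" content="How a search engine works.">
</head><body></body></html>
"""

reader = MetaTagReader()
reader.feed(SAMPLE_PAGE)
print(reader.meta["keywords"])
print(reader.meta["description"])
```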


Advanced search engine options

It is beyond the purpose of this blog to list all the options for filtering results, but some are useful.
  • site:whatever.com or site:.com (or other TLDs like .org, .net, etc.)
It lists results that come only from the specified domain or TLD.

  • Quoted keywords with "". Example: "advanced search tips"
It lists only sites that contain the exact phrase.
The list is quite long, but since you can combine these options, you can get tightly filtered results or do things that you otherwise couldn't.
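
As a rough illustration of how a search interface could pick such options out of what you type, the sketch below separates a site: filter, quoted phrases and plain keywords, then checks a page against them. The function names and sample pages are invented, URLs are assumed to be written without "http://", and the matching is deliberately naive (for instance, the site check only compares the end of the host name).

```python
import re

def parse_query(query):
    """Split a query into a site filter, exact phrases, and loose keywords."""
    phrases = re.findall(r'"([^"]+)"', query)  # text between double quotes
    rest = re.sub(r'"[^"]*"', " ", query)      # the query without quoted parts
    site = None
    keywords = []
    for token in rest.split():
        if token.lower().startswith("site:"):
            site = token[5:].lower()
        else:
            keywords.append(token.lower())
    return {"site": site, "phrases": phrases, "keywords": keywords}

def matches(page_url, page_text, parsed):
    """Check one page against the parsed query."""
    host = page_url.split("/")[0].lower()
    text = page_text.lower()
    if parsed["site"] and not host.endswith(parsed["site"]):
        return False
    if any(phrase.lower() not in text for phrase in parsed["phrases"]):
        return False
    return all(keyword in text for keyword in parsed["keywords"])

parsed = parse_query('site:.com "advanced search tips" google')
print(parsed)
print(matches("blog.example.com/tips", "Advanced search tips for Google users", parsed))
print(matches("blog.example.org/tips", "Advanced search tips for Google users", parsed))
```

The first page passes all three checks. The second has the same text but fails the site: filter, which shows how combining options narrows the results.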


Today's search engines

Today, the technology that powers search engines is evolving at a fast pace towards emulating the user. Competition has also led to the appearance of a lot of them, each one using a different algorithm and approach. That affects you, because if you optimize your content to rank well in one, it doesn't necessarily mean that you will rank well in another.
And speaking of optimization, if you read this blog to see how things work and get ideas on how to generate traffic by manipulating the search engine, I would suggest focusing on quality content instead. It is much safer, and in the long run it pays off better. If you want to know more, you can also read other ways of making money online.
If you are interested in submitting your site, you can visit the complete search engine submitting links blog (although it's by far not complete yet). If you want to know more about them, then you enter the paths of SEO, or search engine optimization.
Regardless of intention, in the end we all enjoy it when a search engine returns exactly what we want.

As for tomorrow? Who knows.