Just Google it - How Search Engines Work

How do Search Engines Work?

Although it has grown into a large part of everyday life, the modern search engine is still a mystery to most people. Google alone handles over 3.5 billion searches every day. Typing in a request and getting an answer in mere seconds can be quite addictive and can easily become a regular part of your life. But how does the tool find your results and present them so quickly? These are just a few of the many questions to ask about search engines and the techniques working behind the scenes.

Who does the grunt work?

To provide results for your search, most of the work is done by something called a “crawler”, or “spider”. This is a generic term for an Internet bot whose role is to travel around the internet, crawling different web pages and making note of their content. These bots follow links from site to site and send the data they collect back to their servers. This only works for pages that have not asked crawlers to stay away. Pages that search engines do not index are often described as part of the deep web (not to be confused with the dark web, which requires special software to access). Marking a page as “noindex” tells search engines to skip it during indexing, which keeps it out of their databases.
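As a rough illustration of the crawling idea described above, here is a minimal Python sketch. It does not touch the real internet: the pages in FAKE_WEB are invented, and the names PageParser and crawl are hypothetical helpers made up for this example. It follows links breadth-first and leaves out any page that carries a robots "noindex" meta tag, which is one common way a page can opt out of indexing.

```python
from collections import deque
from html.parser import HTMLParser

# A tiny, made-up "web": URL -> HTML. These pages are invented for illustration.
FAKE_WEB = {
    "a.example/": '<a href="b.example/">B</a> <a href="c.example/">C</a> cake recipes',
    "b.example/": '<a href="a.example/">A</a> chocolate cake',
    "c.example/": '<meta name="robots" content="noindex"><a href="d.example/">D</a> private notes',
    "d.example/": "award winning cake",
}


class PageParser(HTMLParser):
    """Collects outgoing links and detects a robots noindex meta tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True


def crawl(start_url, web):
    """Breadth-first crawl; returns {url: html} for pages that may be indexed."""
    seen = {start_url}
    queue = deque([start_url])
    stored = {}
    while queue:
        url = queue.popleft()
        html = web.get(url)
        if html is None:
            continue  # dead link: nothing to fetch
        parser = PageParser()
        parser.feed(html)
        if not parser.noindex:
            stored[url] = html  # "send the data back to the servers"
        for link in parser.links:  # links are still followed on noindex pages
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return stored


pages = crawl("a.example/", FAKE_WEB)
print(sorted(pages))
```

In this toy web, the page c.example/ is visited but not stored, while d.example/ is still discovered through the link on it.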

What is indexing?

Indexing is the process that makes it possible for search engines to respond quickly. After the crawlers have gone through websites by following links and sent the pages back to the servers, the search engine stores a copy of each page and adds its URL to the index. Storing all of this takes a massive amount of space: in Google’s case, the index is well over 100,000,000 gigabytes in size. To put this in perspective, just one percent of this is 1,000,000 gigabytes, which at roughly 1 gigabyte per hour of video would be on the order of a million hours of television. This is a crazy amount of information that keeps getting larger and larger, and mind you, this is only one search engine.
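One common way to build such an index is an "inverted index", which maps each word to the pages that contain it. The sketch below assumes that structure purely for illustration; it is not a description of how any particular search engine stores its data. The page texts and the names tokenize, build_index, and lookup are made up for this example.

```python
import re
from collections import defaultdict

# Invented page texts, keyed by URL.
PAGE_TEXT = {
    "a.example/": "cake recipes for beginners",
    "b.example/": "chocolate cake",
    "d.example/": "award winning chocolate cake",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(PAGE_TEXT)
print(sorted(lookup(index, "chocolate cake")))
```

The point of the design is speed: answering a query only needs a few set lookups and intersections, instead of rereading every stored page.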

How do they reproduce this information?

Indexing is only part of the process, however. Between those databases and the results you are offered there is another step, called retrieval. Each search engine uses its own methods here, which is why the results are not always the same. Each one has a set of criteria it uses to choose the webpages it thinks best fit your search. These algorithms compare billions of pages to see which would suit you best, for example by checking whether your search terms appear in the title, whether they appear next to each other or in a matching order, and how many other pages link to the page. The exact workings of these algorithms are never completely revealed; if they were, people would try to take advantage of them to get better placement. This has happened before: when search engines relied heavily on counting keywords, website creators placed an extreme number of keywords on their pages. That is where the term “keyword stuffing” comes from.
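To make those criteria concrete, here is a toy scoring function in Python. Real retrieval algorithms are not public, so everything here is an assumption: the weights (3.0, 1.0, 5.0, 0.5) are arbitrary, the pages are invented, and score and retrieve are hypothetical names. It simply rewards terms in the title, terms in the body, terms appearing next to each other in matching order, and pages with more inbound links.

```python
import re

# Invented pages; "inbound_links" is the number of other pages linking here.
PAGES = [
    {"url": "x.example/", "title": "Chocolate Cake Recipe",
     "body": "an easy chocolate cake for beginners", "inbound_links": 4},
    {"url": "y.example/", "title": "Baking Blog",
     "body": "cake with chocolate chips", "inbound_links": 10},
    {"url": "z.example/", "title": "Car Repair",
     "body": "fix your engine", "inbound_links": 50},
]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query):
    """Combine a few simple signals into one number (weights are arbitrary)."""
    terms = tokenize(query)
    title = tokenize(page["title"])
    body = tokenize(page["body"])
    title_hits = sum(1 for t in terms if t in title)
    body_hits = sum(1 for t in terms if t in body)
    if title_hits + body_hits == 0:
        return 0.0  # no query term appears at all: not a candidate
    total = 3.0 * title_hits + 1.0 * body_hits
    # Bonus when the terms appear next to each other in matching order.
    n = len(terms)
    if n > 1 and any(body[i:i + n] == terms for i in range(len(body) - n + 1)):
        total += 5.0
    total += 0.5 * page["inbound_links"]
    return total


def retrieve(pages, query):
    """Return URLs of matching pages, best score first."""
    scored = [(score(p, query), p["url"]) for p in pages]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [url for _, url in scored]


print(retrieve(PAGES, "chocolate cake"))
```

Note that the heavily linked car repair page is left out entirely: links only count once a page actually matches the query. A scheme that only counted keyword occurrences would be easy to game, which is the weakness keyword stuffing exploited.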

This is where things get really cool: the software that generates the results uses machine learning, meaning that the more pages are analyzed, the more accurate the outcome. Eventually it can understand even the underlying meaning of a word. Nevertheless, the most precise way to search isn’t to type a full question into the search bar, for example “How do I make award winning chocolate cake?”. You would be better off searching with the keywords “award”, “winning”, “chocolate”, and “cake”. This makes it much easier for the retrieval step to do its job and supplies more exact results centered on those keywords.
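Reducing a question to its keywords can be sketched in a few lines of Python. The stop-word list below is a small, hand-picked assumption chosen to fit the chocolate cake example (it treats "make" as filler, as the example above does); real systems use much larger lists or learned models, and keywords is a hypothetical helper name.

```python
import re

# A small, hand-picked list of filler words (an assumption for this example).
STOP_WORDS = {"how", "do", "i", "make", "a", "an", "the", "to", "is", "of", "what"}


def keywords(question):
    """Lowercase the question, split it into words, and drop filler words."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [w for w in words if w not in STOP_WORDS]


print(keywords("How do I make award winning chocolate cake?"))
# prints ['award', 'winning', 'chocolate', 'cake']
```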

How is the order then decided?

After the results have been picked by whatever method the search engine deems fit, they are arranged in the order considered most helpful to your needs. This step is called ranking, and Google’s best-known ranking algorithm is PageRank or PR (named after Google co-founder Lawrence Edward “Larry” Page, who developed it with Sergey Brin). PageRank scores a page by the number and quality of the links pointing to it, and it is combined with signals similar to those used in retrieval, such as the order, frequency, and quality of keywords. Of course, these are just a few examples of how it is really done: Google’s algorithm takes more than 200 criteria into account to determine the ranking. Placement in this ranking can be improved with a technique called SEO (search engine optimization), which gives an advantage over competitors if done correctly. SEO is made up of a vast number of smaller techniques, ranging from keyword optimization to link building, and each of them plays a part.
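The link-based idea behind PageRank can be sketched with the textbook iterative formulation. This is a simplified illustration, not Google's production ranking: the four-page link graph is invented, the damping factor of 0.85 is the commonly quoted default, and the sketch assumes every linked page also appears as a key in the graph.

```python
# Invented link graph: page -> pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: a page is important if important pages link to it."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page gets a small base share, independent of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph, page "c" comes out on top because three pages link to it, while page "d", which nobody links to, ends up last. This is also why link building shows up as an SEO technique.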
