How a search engine works

What is a search engine?

In simple words, a search engine is a tool that finds results when someone searches the web.

There is a search engine within our computers too, which searches files, images, etc. stored on the computer, but the term "search engine" mostly refers to searching on the web.

How does a search engine work on the web?

To make a search, you give an input (type a query, submit a photo or speak a query to a voice assistant) and you expect the results within a second. No program or tool can search the entire web (the World Wide Web) in real time in a few seconds, because the web is enormous and millions of searches are made every minute.

So, what the search engine does is make an index of webpages. It is like a list of webpages that the search engine believes are useful for answering search queries. To populate this index, a program (called a search bot or web crawler) keeps crawling the web all the time, and whenever it finds something of use, it includes it in the index.
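The crawling idea can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the "web" here is a made-up dictionary (TOY_WEB) instead of real pages fetched over the network, the site names are hypothetical, and "useful" simply means "the page has some text". Real crawlers are far more elaborate.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to (page text, outgoing links).
TOY_WEB = {
    "a.example": ("wedding gown shops in Minaz", ["b.example", "c.example"]),
    "b.example": ("tips on wearing a wedding gown", ["a.example"]),
    "c.example": ("", ["d.example"]),  # empty page: nothing of use
    "d.example": ("dress for the bride", []),
}

def crawl(start_url, web):
    """Follow links breadth-first and return an index of the useful pages."""
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web.get(url, ("", []))
        if text.strip():  # stand-in for a real usefulness check (assumption)
            index[url] = text
        for link in links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example", TOY_WEB)
print(sorted(index))  # ['a.example', 'b.example', 'd.example']
```

Note that the empty page is still followed for its links, but it does not enter the index.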

Let me now expand on some concepts relating to how search engines work:

Search indexing

The indexes maintained by search engines are huge. They hold enormous amounts of data on big servers in different parts of the world. Even so, an index is much smaller than the whole web, and its data is highly organized so that search results can be given out instantly.
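One common way to organize such data for instant lookup is an "inverted index", which maps each word to the pages that contain it. The sketch below is a minimal illustration of that idea and not a description of any particular search engine; the page names and texts are invented for the example.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Map every word to the set of pages whose text contains it."""
    inverted = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            inverted[word].add(url)
    return inverted

def lookup(inverted, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(inverted.get(words[0], set()))
    for word in words[1:]:
        result &= inverted.get(word, set())
    return result

# Hypothetical pages, for illustration only.
PAGES = {
    "page1": "wedding gown shops in Minaz",
    "page2": "tips on wearing a wedding gown",
    "page3": "summer dress ideas",
}

inverted = build_inverted_index(PAGES)
print(sorted(lookup(inverted, "wedding gown")))  # ['page1', 'page2']
print(sorted(lookup(inverted, "gown Minaz")))    # ['page1']
```

The point of the structure is that answering a query only touches the entries for the query's words, not every page in the index.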

Search ranking

Search engines want to give the most relevant results in response to a search query. Different search engines have different sets of logic for this. Based on what the technologists running a search engine think is valuable, they make complex formulas in which points are given to a number of ranking parameters. These formulas, called search algorithms, are used for ranking each webpage, and a webpage enters the index based on its ranking. Some commonly used parameters are: how many other websites link to the webpage, whether its content is original, how long the webpage is, whether the content is legally and socially appropriate, and whether the webpage identifies its owner. Google is said to use a few hundred such factors, periodically adding new ones and changing their relative importance.
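To get a feel for "points given to ranking parameters", here is a toy scoring formula in Python. Everything in it is an assumption made for illustration: the four factors are taken from the list above, but the weights, the logarithm on link counts and the 1,000-word cap are invented. Real search engines do not publish their formulas.

```python
import math

# Hypothetical weights; chosen only to make the example work.
WEIGHTS = {
    "inbound_links": 3.0,
    "is_original": 2.0,
    "word_count": 1.0,
    "names_owner": 0.5,
}

def score(page):
    """Combine a few ranking parameters into a single number."""
    return (
        # log1p dampens very large link counts
        WEIGHTS["inbound_links"] * math.log1p(page["inbound_links"])
        + WEIGHTS["is_original"] * (1.0 if page["is_original"] else 0.0)
        # length counts only up to 1,000 words
        + WEIGHTS["word_count"] * min(page["word_count"] / 1000, 1.0)
        + WEIGHTS["names_owner"] * (1.0 if page["names_owner"] else 0.0)
    )

def rank(pages):
    """Return the pages ordered from highest to lowest score."""
    return sorted(pages, key=score, reverse=True)

# Hypothetical pages, for illustration only.
PAGES = [
    {"url": "b.example", "inbound_links": 5, "is_original": True,
     "word_count": 800, "names_owner": False},
    {"url": "a.example", "inbound_links": 120, "is_original": True,
     "word_count": 1500, "names_owner": True},
    {"url": "c.example", "inbound_links": 40, "is_original": False,
     "word_count": 300, "names_owner": False},
]

print([page["url"] for page in rank(PAGES)])
# ['a.example', 'c.example', 'b.example']
```

Changing the weights changes the order, which is roughly what happens when a search engine adjusts the relative importance of its factors.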

Keywords

Once a webpage is found to be worthy of being included in the index, it is tagged with one or more search keywords.

A search keyword is the word (or set of words) that people use for making searches. For example, when looking for a wedding dress, people might type queries such as these in the search box: "wedding dress", "dress for the bride", "what dress should I wear on my wedding?", "wedding gown", "wedding gown in Minaz", etc.

In this example, out of the many billions of webpages in a search engine's index, a billion or so may have useful information on dresses. Out of them, a few million webpages will have information on wedding dresses, and a still smaller number on wedding gowns. A few thousand webpages may be talking about wedding gowns available in a city called Minaz.

Search engines keep researching keywords using artificial intelligence (described below) and tagging webpages in the index with them. Thus, a good article covering a wide range of related topics may be tagged with hundreds of keywords.

Keywords may refer to a very wide subject (e.g. dress), a narrow subject (wedding dress) or an extremely narrow one (wedding gown in Minaz). Keywords that refer to a narrow subject are called long-tail keywords.

Search relevance

As a searcher, you might have experienced that sometimes the query comes to your mind fully formed and sometimes it comes in bits. You sometimes do not like to type out the entire query that comes to your mind, especially when you are travelling on a rough road and typing on a mobile phone. Search engines use human knowledge and also machine learning tools to make the best guess about what you are looking for.

In the above example, if you live in Minaz and make the query "wedding gown", the search engine will try to give you the most useful results. So, you are likely to get results for wedding dress shops in Minaz. You might also get some buying information on other dresses. If the town does not have many dress shops, you might get more results with general information on wedding gowns, some tips on wearing a wedding gown, etc.

But what happens if you type something like this in your search box: "Apple red best"? Depending upon how smart the search engine is, it will use all the information it has gathered about you (your earlier searches, what results people click after making this type of query, whether people in your area search for the fruit apple or for Apple electronics products, etc.) and make an intelligent guess.

Keywords are the key to connecting a search query with the right webpages. So, when the search engine has tagged some webpages for the keyword "wedding gowns in Minaz", and someone looks for this information, the search engine will look in its index for webpages that are tagged for "wedding gown" and "Minaz". Then, because the searcher seems to be trying to find a shop in Minaz, the search engine will filter the selected webpages for buying intent. Thus, the top results on the search page will likely be those of shops selling wedding gowns in Minaz.

Matching keywords is easy, but finding the search intent (the intent behind a search query) and matching it with the entries in the index is the really tough job for the search engine, because searchers often do not phrase their queries in a straightforward manner. So, all the major search engines invest heavily in this.
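The two steps described here (match the keyword tags, then prefer pages that fit the intent) can be sketched in a few lines of Python. This is a simplified illustration: the index entries, the "buy" and "learn" intent labels and the URLs are all hypothetical, and in reality the intent has to be guessed by the search engine rather than passed in by hand.

```python
# Hypothetical index entries: each page carries keyword tags and an intent label.
INDEX = [
    {"url": "shop.example/minaz",
     "keywords": {"wedding gown", "minaz"}, "intent": "buy"},
    {"url": "blog.example/gown-tips",
     "keywords": {"wedding gown"}, "intent": "learn"},
    {"url": "shop.example/otherville",
     "keywords": {"wedding gown", "otherville"}, "intent": "buy"},
    {"url": "guide.example/minaz-history",
     "keywords": {"minaz"}, "intent": "learn"},
]

def search(index, wanted_keywords, intent=None):
    """Keep pages tagged with all wanted keywords; list intent matches first."""
    matches = [e for e in index if wanted_keywords <= e["keywords"]]
    preferred = [e for e in matches if e["intent"] == intent]
    others = [e for e in matches if e["intent"] != intent]
    return [e["url"] for e in preferred + others]

print(search(INDEX, {"wedding gown", "minaz"}, intent="buy"))
# ['shop.example/minaz']
print(search(INDEX, {"wedding gown"}, intent="buy"))
# ['shop.example/minaz', 'shop.example/otherville', 'blog.example/gown-tips']
```

In the second query the informational page is not thrown away; it is only moved below the shops, which is one possible design choice among many.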

Recent developments in the way people search have added complexity to online search. People now search more on mobile devices. They write short queries and are often interested in local information. People also use voice assistants for search. Such queries can have nuances based on dialects and the peculiar way one speaks. A large number of queries are made within apps and social media platforms. Images, audio and videos present more challenges to search engines because these media do not have search keywords associated with them unless the creator adds them manually.

How Google search engine works

As we all know, Google is the big daddy of web search. It works on the lines discussed above: it has huge servers that hold data on billions of webpages in its index. Its crawlers are extremely fast and powerful. Its processing speed is also tremendous.

Google's big workforce keeps inventing new ways to filter useful/ valuable content out of all that is produced every moment. It must also be efficient in discarding content that is not useful but is presented as useful by using unethical SEO techniques. Based on its research, Google brings out algorithm updates.

For judging the search intent, Google keeps doing research into natural language processing and other relevant fields. It makes extensive use of artificial intelligence, of which machine learning is one branch.


Most popular search engines

The present breakdown of searches on the web is as follows:

  • Google: 92.7% of all searches (74% of all desktop searches and 94% of mobile searches) are made on this search engine.
  • Bing: About 2.7% of searches are made on this search engine (12.5% of desktop searches, less than 1% of mobile searches). 
  • Yahoo!: Of all searches, 1.5% are done on this search engine.
  • Baidu: This is China's own search engine and gets about 1.1% of all search traffic. 
  • DuckDuckGo, Yandex and Ask are other popular search engines.

Many social media and ecommerce platforms have a huge number of webpages of their own, so they have their own search engines. The following are worth special mention: YouTube, Amazon, Facebook.
