Working of a search engine
Let's understand how a search engine works through this chart:
Returning to the earlier example, the search engine acts as a librarian who gathers the relevant books, that is, the required information, from the library of data available on the internet.
To summarize, web crawlers continuously scan, or crawl, the data available on the web and gather information from it (Crawling). The gathered information is then organized into a catalog or database so that the relevant web pages can be selected quickly (Indexing). When a user searches for something, the search engine picks the most relevant results from this catalog according to their ranking and displays them on the search engine results page, or SERP (Ranking). It is quite a technical process, but because crawling and indexing are done ahead of time, the user gets results almost as soon as they submit a search.
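The three stages summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the in-memory `TOY_WEB` dictionary stands in for the real web, the `*.example` URLs are hypothetical, and the score (a plain count of matching words) is far cruder than what a real search engine uses.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("search engines crawl the web", ["b.example", "c.example"]),
    "b.example": ("an index maps words to pages", ["c.example"]),
    "c.example": ("ranking orders pages for a search query", []),
}

def crawl(seed, web):
    """Crawling: follow links breadth-first from a seed URL, collecting page text."""
    seen, queue, pages = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Indexing: build an inverted index mapping word -> {url: occurrence count}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(query, index):
    """Ranking: order pages by how many times they contain the query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

pages = crawl("a.example", TOY_WEB)   # done ahead of time
index = build_index(pages)            # done ahead of time
results = search("search query", index)  # done when the user searches
print(results)
```

Only the last call runs at query time, which is why the lookup feels instant: the expensive crawling and indexing work has already been done.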
Search Engine
Imagine you are in a library looking for a particular book. If you had to go through every book in each category, it would be a tedious and difficult task. Moreover, if the library has more than a million books, the task seems next to impossible. You are definitely going to need a librarian who can bring you the relevant books without any delay. Well, that’s where a search engine comes in.
Search engine spamming refers to the practice of creating Web pages, or sets of Web pages, designed to get a high relevance rank for some queries even though the sites are not actually popular. Popularity ranking schemes such as PageRank make search engine spamming more difficult, since just repeating words to get a high TF-IDF score is no longer sufficient. However, even these schemes can be spammed by creating a collection of Web pages that point to each other, which increases their popularity rank. Techniques such as using sites instead of pages as the unit of ranking (with appropriately normalized jump probabilities) have been proposed to counter some spamming techniques, but they are not fully effective against others. The war between search engine spammers and search engines continues even today.
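The effect of a collection of pages pointing to each other can be seen in a small sketch. This is a simplified PageRank by power iteration, not the production algorithm of any search engine; the damping factor of 0.85, the iteration count, and all page names (`home`, `news`, `blog`, `spam`, `farm1` to `farm3`) are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps every page to its list of out-links."""
    nodes = list(links)
    n = len(nodes)
    rank = {page: 1.0 / n for page in nodes}
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread evenly over all pages.
        dangling = sum(rank[page] for page in nodes if not links[page])
        new_rank = {page: (1 - damping) / n + damping * dangling / n
                    for page in nodes}
        # Each page shares its rank equally among the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Hypothetical honest web: nobody links to the spam page.
honest_web = {
    "home": ["news", "blog"],
    "news": ["home"],
    "blog": ["home"],
    "spam": ["home"],
}

# Same web after the spammer adds three farm pages that link to the
# spam page and to each other, with the spam page linking back to them.
farmed_web = dict(honest_web)
farmed_web.update({
    "spam": ["farm1", "farm2", "farm3"],
    "farm1": ["spam", "farm2", "farm3"],
    "farm2": ["spam", "farm1", "farm3"],
    "farm3": ["spam", "farm1", "farm2"],
})

honest = pagerank(honest_web)
farmed = pagerank(farmed_web)
print("before farm:", honest["spam"] < honest["news"])
print("after farm: ", farmed["spam"] > farmed["news"])
```

In this toy graph the spam page ranks below `news` before the farm is added and above it afterwards, even though no legitimate page ever links to it.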
The hubs and authorities approach of the HITS algorithm is more susceptible to spamming. A spammer can create a Web page containing links to good authorities on a topic, which gains a high hub score as a result. In addition, the spammer’s Web page includes links to pages that the spammer wishes to popularize, which may not have any relevance to the topic. Because these linked pages are pointed to by a page with a high hub score, they get a high but undeserved authority score.
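A minimal sketch of the hub and authority iteration makes this attack concrete. It is an illustration under stated assumptions, not a reference implementation: the graph, the page names (`hub1`, `hub2`, `auth1`, `auth2`, `spam_page`, `junk`), and the fixed iteration count are all hypothetical.

```python
import math

def hits(links, iterations=50):
    """Basic HITS iteration. `links` maps a page to the pages it links to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking to the node.
        auth = {node: 0.0 for node in nodes}
        for page, targets in links.items():
            for target in targets:
                auth[target] += hub[page]
        # Hub score: sum of the authority scores of the pages the node links to.
        hub = {node: sum(auth[t] for t in links.get(node, [])) for node in nodes}
        # Normalize both score vectors to unit length so they stay bounded.
        a_norm = math.sqrt(sum(v * v for v in auth.values()))
        h_norm = math.sqrt(sum(v * v for v in hub.values()))
        auth = {node: v / a_norm for node, v in auth.items()}
        hub = {node: v / h_norm for node, v in hub.items()}
    return hub, auth

# Spam page links only to the page it wants to promote.
plain = {
    "hub1": ["auth1", "auth2"],
    "hub2": ["auth1", "auth2"],
    "spam_page": ["junk"],
}

# Spam page also links to the good authorities to earn a hub score.
tricked = {
    "hub1": ["auth1", "auth2"],
    "hub2": ["auth1", "auth2"],
    "spam_page": ["auth1", "auth2", "junk"],
}

_, plain_auth = hits(plain)
tricked_hub, tricked_auth = hits(tricked)
print("junk authority without trick:", plain_auth["junk"])
print("junk authority with trick:   ", tricked_auth["junk"])
```

Without the links to good authorities, the authority score of `junk` shrinks toward zero. With them, `spam_page` becomes the strongest hub in this toy graph and passes a substantial authority score on to `junk`.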
Table of Contents
- What is a Search Engine?
- History of search engines
- Working of a search engine
- Architecture Of Search Engine
- How queries are processed in search engine?
- Search Engine Advantages
- Examples Of Popularly Used Search Engines