How Does Google Work? Search Engines and Site Crawlers

This is a beginner’s introduction, in everyday language, to how search engines and site crawlers like Google and Googlebot work.

Back in the infancy of the internet, back before AI overviews and search engines, there was chaos. Only users who had direct URLs could find what they were looking for, and even then it wasn’t always easy to know you had the right site.

I like to use the ocean as an analogy for the internet here. If the internet is the ocean, then websites are like marine life – from tiny minnows to great whale sharks. Users in the days before search engines needed to strap on their scuba gear and dive deep to find the fish they wanted. They had to navigate plumes of spammy plankton, electric eel viruses, and predatory shark sites waiting to feed on their personal data.

Since users were getting eaten alive out there, analysts started charting courses manually in the early 1990s, building databases from publicly available FTP file listings. However, the number of websites soon surged far beyond what manual cataloging could handle.

With millions of websites bursting to life in the internet ocean, analysts built prototype after prototype engineered to compile them for navigation. They eventually determined that, with so many sites living and dying every moment, the solution couldn’t be as simple as a static database. It needed to be alive, too. It needed to grow, evolve, and change.

Enter the idea that seeking and compiling needed to be two separate functions – and crawling and indexing were born. The search engine erupted onto the scene by the late 1990s, and Google emerged from the dot-com bubble as the dominant force in the early 2000s thanks to its focus on simplicity and user appeal.

To return to our metaphor: instead of diving in themselves, all users with a search engine needed to do was sit at the edge of a dock, throw out a fishing line using their query as bait, and wait for search results to nibble. Let’s say the site this particular user is after is a trout.

Deeper underwater, Googlebot crawlers circle at all times like picky sharks, sniffing out potential matches to keywords and topics. They look for sitemaps, follow links, check for updates, and build up a school of potential fish that users might want to catch.
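To peek behind the analogy, here’s a tiny Python sketch of the link-following idea. The “web” is just a made-up dictionary of pages and their links (all the URLs are invented), and this is nothing like Googlebot’s real scale or logic – just the bare-bones principle of following links without visiting the same page twice:

```python
from collections import deque

# A pretend web: each URL maps to the links found on that page.
# (All URLs here are made up for illustration.)
TOY_WEB = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b"],
    "https://example.com/b": ["https://example.com/"],  # a link cycle
}

def crawl(start_url, web):
    """Visit pages breadth-first, following links and skipping repeats."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in web.get(url, []):
            if link not in seen:  # don't crawl the same fish twice
                seen.add(link)
                queue.append(link)
    return visited

print(crawl("https://example.com/", TOY_WEB))
```

Every page reachable from the starting URL gets visited exactly once, even though the pages link back to each other in a loop.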

These Googlebot sharks corral the fish closer to the surface – the index – and act as a filter that keeps the search engine results pages (SERPs), the fish that nibble at the search query bait, relevant.
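And here’s a matching sketch of the index answering a query. The page names and texts are invented, and real search engines rank results with far more signals than simple word counts – this only shows the basic idea of an inverted index: words pointing to pages, and pages scored by how well they match the bait:

```python
# A toy set of crawled pages. Names and texts are invented for illustration.
PAGES = {
    "trout-guide": "rainbow trout fishing guide for rivers",
    "trout-recipes": "easy trout recipes for dinner",
    "herring-facts": "red herring facts and history",
}

def build_index(pages):
    """Map each word to the set of pages containing it (an inverted index)."""
    index = {}
    for page, text in pages.items():
        for word in text.split():
            index.setdefault(word, set()).add(page)
    return index

def search(query, index):
    """Rank pages by how many of the query's words they contain."""
    scores = {}
    for word in query.split():
        for page in index.get(word, set()):
            scores[page] = scores.get(page, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)

index = build_index(PAGES)
print(search("trout fishing", index))
```

Looking up words in a prebuilt index is what makes a query fast: the slow work of reading every page happened earlier, at crawl time.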

This time, the nibbling SERPs are five trout and one red herring: the Googlebot shark mostly gets it right. When it doesn’t, the user can always throw the herring back and rerun the query.

