Search engines are part of almost everyone’s day-to-day life. The benefits of search engines, to individuals and to society as a whole, are immense. Without them, the web wouldn’t be as useful and the e-commerce environment in particular would be vastly different.
With an estimated 30 trillion individual pages on the web, search engines play a crucial role in locating the information that’s essential to the professional, social and private lives of millions of people.
Whether we’re looking for information about healthcare, businesses, education, the government, national and local news, entertainment, celebrities, e-commerce, or pretty much anything else, they help us to organise and streamline the wealth of data at our fingertips.
How do search engines work?
When you know the web address of a site, it’s easy to find it by typing the URL into the address bar at the top of your browser. If you don’t know the URL, a search engine enables you to look for things by typing in keywords.
A search engine is a program that scans its index of web pages for keywords and content related to your search, then displays the results. It builds the index using another program, known as a web crawler, which browses the web and stores information about every page that it visits.
When a web crawler visits a web page, it copies the URL and adds it to the index. It then follows each of the page’s links, adding the linked pages to the index and following their links in turn. This process is repeated continually, building up a massive index of web pages.
The information that has been compiled by the web crawler is used by the search engine and becomes its index. Every web page that a search engine recommends will have been visited by the web crawler.
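The crawl-and-index loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than how any real search engine is built: the `TOY_WEB` dictionary, its example URLs and the `crawl` and `search` functions are all hypothetical, and the "web" is held in memory so that nothing is fetched over the network.

```python
from collections import deque

# Hypothetical miniature "web": each URL maps to its page text and outgoing links.
TOY_WEB = {
    "http://a.example/": {
        "text": "search engines index the web",
        "links": ["http://b.example/", "http://c.example/"],
    },
    "http://b.example/": {
        "text": "a crawler follows links",
        "links": ["http://a.example/", "http://c.example/"],
    },
    "http://c.example/": {
        "text": "the index maps words to pages",
        "links": ["http://d.example/"],
    },
    "http://d.example/": {
        "text": "web pages link to other pages",
        "links": [],
    },
}


def crawl(seed, web):
    """Visit pages breadth-first from the seed URL and build a word index."""
    index = {}            # word -> set of URLs containing that word
    visited = set()       # URLs already processed, so no page is indexed twice
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url in visited or url not in web:
            continue
        visited.add(url)
        page = web[url]
        for word in page["text"].lower().split():
            index.setdefault(word, set()).add(url)
        queue.extend(page["links"])  # follow every link found on the page
    return index, visited


def search(index, query):
    """Return the URLs that contain every word in the query, sorted by URL."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


index, visited = crawl("http://a.example/", TOY_WEB)
print(search(index, "index web"))
```

The `visited` set is what stops the crawler from looping forever when pages link back to each other, as the first two example pages do.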
First search engine
The first search engine for public use was launched in 1990, when Archie was developed by Bill Heelan and Alan Emtage. Heelan worked at McGill University in Montreal, where Emtage was a postgraduate student.
Archie began as a student and volunteer staff project at the university’s School of Computer Science in 1987. The systems manager, Peter Deutsch, helped to connect the School of Computer Science to the internet. Emtage wrote the first version of Archie. Relatively simple, it contacted a list of FTP archives once a month, so that it wasn’t wasting the remote servers’ resources.
It would request a listing and the listings were subsequently stored in local files, which were searched using the Unix Grep (Global regular expression print) command. Over time, improved versions of Archie were developed.
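To give a feel for this approach, here is a rough Python sketch of searching stored listings with a regular expression, in the spirit of running grep over local files. It is not Archie’s actual code: the host names, file paths and the `archie_search` function are invented for illustration.

```python
import re

# Hypothetical stored listings: archive host -> file paths from its last listing.
LISTINGS = {
    "ftp.alpha.example": ["/pub/gnu/emacs-18.59.tar.Z", "/pub/docs/README"],
    "ftp.beta.example": ["/mirrors/x11/xterm.tar.Z", "/pub/gnu/gcc-2.0.tar.Z"],
}


def archie_search(pattern, listings):
    """Return (host, path) pairs whose path matches the regular expression."""
    regex = re.compile(pattern)
    return [
        (host, path)
        for host in sorted(listings)      # fixed host order for repeatable output
        for path in listings[host]
        if regex.search(path)             # grep-style match anywhere in the line
    ]


for host, path in archie_search(r"gnu/.*\.tar\.Z$", LISTINGS):
    print(host, path)
```

Because the listings are stored locally, a search never touches the remote archives, which only need to be contacted when the listings are refreshed.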
Archie expanded from a local tool to a network-wide resource, available from sites across the internet. Initially, the Archie servers processed around 50,000 queries each day, generated worldwide by just a few thousand users. In 1990, Archie was upgraded every two months as internet use grew.
By 1992, Archie had 2.6 million files, containing 150 gigabytes of information. Although this seems small compared with today’s figures, at the time, it was bigger than anyone could have imagined.
Work on Archie stopped in the late 1990s.
Launch of World Wide Web
In the early years, before the World Wide Web, data was shared via File Transfer Protocol (FTP). Computer scientist Tim Berners-Lee created the World Wide Web, developing a system based on the concept of hypertext for updating and sharing information. He had earlier built a prototype hypertext system, named Enquire, and later worked with Robert Cailliau on the proposal for the Web.
Working as a fellow of CERN (Europe’s largest Internet node) in 1989, Berners-Lee grasped the opportunity to join hypertext with the Internet. He connected it to the DNS and TCP and the World Wide Web was born. He incorporated similar ideas to those which had been used in the Enquire system.
He called his first web browser and editor WorldWideWeb, developing it on NeXTSTEP. His first Web server was called httpd: HyperText Transfer Protocol daemon. The first Web site at http://info.cern.ch/ was put online on 6th August 1991. It was the first Web directory in the world.
Web’s first robot
The World Wide Web Wanderer, the Web’s first robot, was introduced by Matthew Gray in June 1993. Initially created to count active web servers, it was then upgraded to capture URLs, resulting in a database called the Wandex.
However, The Wanderer created as many problems as it solved: it slowed systems down because it accessed the same page hundreds of times per day. Although Gray rectified the software problem, people questioned the value of bots.
In response to The Wanderer, Martijn Koster launched ALIWEB (Archie-Like Indexing of the Web) in October 1993. ALIWEB indexed meta information and enabled users to submit the pages they wanted indexed, along with their own page description.
As it didn’t need a bot to collect data, it didn’t use excessive bandwidth. A criticism of ALIWEB was that a lot of users didn’t understand how to submit their site.
Primitive Web Search
By December 1993, three new bot-fed search engines had been launched: JumpStation, the Repository-Based Software Engineering (RBSE) spider and the World Wide Web Worm.
JumpStation gathered information about the header and title from Web pages and retrieved them using a linear search. The WWW Worm indexed URLs and titles. The problem with both of these search engines was that they listed results in the order in which they had been found, without discrimination. The RBSE spider implemented a ranking system.
However, the early search algorithms didn’t cache full page content or do adequate link analysis, so results were hard to find if users didn’t know the exact name of what they were searching for.
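The difference between listing results in the order found and ranking them can be shown with a small Python sketch. The page list and both functions are hypothetical, and the scoring rule (counting how often the term appears in the title) is only an assumption for illustration, not the method the RBSE spider actually used.

```python
# Hypothetical pages as (url, title) pairs, stored in the order they were found.
PAGES = [
    ("http://one.example/", "Weather maps"),
    ("http://two.example/", "Maps of maps: a map collection"),
    ("http://three.example/", "Cooking at home"),
    ("http://four.example/", "Map projections and maps"),
]


def linear_search(term, pages):
    """Scan every title in turn and return matches in discovery order."""
    term = term.lower()
    return [url for url, title in pages if term in title.lower()]


def ranked_search(term, pages):
    """Return matches ordered by how often the term occurs in the title."""
    term = term.lower()
    scored = [(title.lower().count(term), url) for url, title in pages]
    hits = [(score, url) for score, url in scored if score > 0]
    hits.sort(key=lambda hit: -hit[0])   # highest count first; ties keep discovery order
    return [url for _, url in hits]


print(linear_search("map", PAGES))
print(ranked_search("map", PAGES))
```

Both functions return the same three pages, but the linear search puts the page that happened to be found first at the top, while the ranked search leads with the title that mentions the term most often.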
In February 1993, the Architext project was started by six Stanford undergraduate students, who used statistical analysis of word relationships to carry out a more effective search. They went on to create the Excite search engine.
By mid-1993, their search software was released for use on the Web. Broadband provider @Home purchased Excite in January 1999 for $6.5 billion, renaming it Excite@Home.
In January 1994, the EINet Galaxy web directory was launched. It became a big success, largely due to the fact it contained Telnet and Gopher search features, as well as its web search feature. Even though the web size in 1994 didn’t particularly need a web directory, other developers began to follow suit.
Jerry Yang and David Filo created the Yahoo Directory in April 1994. It was unique in that it included a human-compiled description for each URL. The Yahoo Directory grew and it began charging commercial sites, although many informational sites were included free of charge.
On 20th April 1994, Brian Pinkerton of the University of Washington released WebCrawler. This was the first crawler to index whole pages. It became immensely popular and in 1997, Excite bought it out.
WebCrawler opened the door for many other search engines and spurred the release of OpenText, Infoseek and Lycos.
Lycos was designed and launched at Carnegie Mellon University in July 1994 by Michael Mauldin. It had a catalogue of 54,000 documents when it went public and provided ranked relevance retrieval, word proximity bonuses and prefix matching. By August 1994, it had expanded to identify 394,000 documents and by November 1996, it had indexed more than 60 million documents.
Also launched in 1994, Infoseek offered a few add-ons, in comparison to Lycos. Netscape began to use Infoseek as its default search engine in December 1995. This gave Infoseek major exposure. Its most popular feature was enabling webmasters to submit a page in real time to the search index.
Launched in December 1995, the same month that Netscape adopted Infoseek, AltaVista introduced new and important features, including almost unlimited bandwidth, advanced searching techniques and natural language queries. It also permitted users to add or delete their own URL, while enabling inbound link checking and providing search tips.
On 20th May 1996, the Inktomi Corporation launched its search engine, HotBot, using improved technology created by research at the University of California, Berkeley. Inktomi was founded by Prof Eric Brewer and student Paul Gauthier. The site became popular quickly, after being listed by HotWired.
Further developments in the 21st century have taken search engines to a whole new level. The World Wide Web today is vital to daily life.
Contact The Cornwall SEO Co. to discover how our SEO expertise can help your business.
We will bring you up-to-date with more recent search engine developments in our next blog, so keep your eyes peeled!