Search Engines Then and Now

The Internet started to develop around 1983, and I began using it in the 1990s. I have used it ever since, as a professional who has specialized in information retrieval systems since the early 1960s.

In the 1960s we started developing online information retrieval networks in Europe. Indexing was based not only on relevant search terms contained within a document but also on out-of-context terms. In those early days an information technician had to read and understand scientific and technical documents to arrive at relevant search terms usable for indexing. My studies at Syracuse University, at the beginning of the 1970s, attempted to include AI principles in prototype online information retrieval systems.

One can notice marked changes in the quality of content between then and now. One just needs to look at the so-called “quality of content” of YouTube videos. What was then perceived as technologically relevant can in most cases not be compared to what is considered relevant now.

It is almost impossible to retrieve precisely the information, or the links to documents, that one is searching for, even when a search is constructed with the most complicated Boolean operators (AND/OR/NOT, including all sorts of nesting). On top of that, today’s searches deliver unwanted ballast: advertisements that are irrelevant to the search.

[ https://library.albany.edu/subject/tutorials/education/boolean.html ].
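To make the nesting of AND/OR/NOT concrete, here is a minimal sketch in Python. The three sample documents and the helper name `matching` are my own inventions for illustration, not part of any real retrieval system; each Boolean operator is simply mapped onto a set operation.

```python
# Toy collection: document id -> text (invented for illustration only).
DOCS = {
    1: "online information retrieval networks in europe",
    2: "boolean operators for online search",
    3: "advertisements in search results",
}

ALL = set(DOCS)  # every document id, needed to express NOT


def matching(term):
    """Return the set of document ids whose text contains the whole word."""
    return {doc_id for doc_id, text in DOCS.items() if term in text.split()}


# Nested query: (online AND search) OR (retrieval AND NOT advertisements)
# AND -> set intersection, OR -> set union, NOT -> difference from ALL.
result = (matching("online") & matching("search")) | (
    matching("retrieval") & (ALL - matching("advertisements"))
)

print(sorted(result))  # [1, 2]
```

Document 3, the one about advertisements, is excluded only because the query says so explicitly. A Boolean search returns exactly what the operators describe and nothing more.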

According to Google, search results are returned in the form of “Search Engine Results Pages (SERP). These are the pages displayed by search engines in response to a query by a searcher. The main component of the SERP is the listing of results that are returned by the search engine in response to a keyword query, although the pages may also contain other results such as advertisements.”

PageRank was named after Larry Page, one of the founders of Google. PageRank is a way of measuring the importance of website pages.

[ https://en.wikipedia.org/wiki/PageRank ]

The biggest difference seems to lie in today’s technology, which more or less ranks the importance of retrieved information by its commercial value.

This means that a successful information retrieval activity delivers results based on “measuring the importance of website pages.” According to Google, importance is determined by counting the number and quality of links to a page to arrive at a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites. [This reminds me of various social networks where the importance of information is determined by how many LIKEs it receives.]
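The idea of counting the number and quality of links can be sketched with the simplified textbook form of PageRank. This is only an illustration under my own assumptions and not Google's production algorithm: the four-page link graph, the function name `pagerank`, the damping factor of 0.85 and the fixed number of iterations are all chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to; every
    linked page must itself appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph page C, which three other pages link to, comes out on top, and page D, which nobody links to, comes out last. Nothing in the calculation looks at what the pages actually say.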

Unfortunately, today’s information stores are controlled by several tech mega-giants and corporations, who infiltrate each online search with commercial data that is not necessarily relevant to a user’s search.

What, then, are the most important elements of a search for meaningful information, and what are they based on? INDEXING. Indexing establishes the basis for retrieval of information. A search does not return the actual data, only a path to documents or pages.
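A minimal sketch of such an index, again in Python, shows that a lookup yields only paths and never the documents themselves. The file paths, the sample texts and the function name `build_index` are hypothetical, and the indexing rule (lowercase, split on whitespace) is deliberately naive.

```python
from collections import defaultdict

# Hypothetical documents: path -> text (invented for illustration only).
DOCUMENTS = {
    "docs/retrieval.txt": "Indexing establishes the basis for retrieval",
    "docs/pagerank.txt": "PageRank measures the importance of pages",
    "docs/boolean.txt": "Boolean operators narrow a retrieval query",
}


def build_index(documents):
    """Build an inverted index: search term -> set of document paths."""
    index = defaultdict(set)
    for path, text in documents.items():
        for word in text.lower().split():
            index[word].add(path)
    return index


index = build_index(DOCUMENTS)

# A lookup returns paths to documents, not their content.
print(sorted(index["retrieval"]))

# A term the indexer never recorded finds nothing, even though a closely
# related word ("indexing") is present in the collection.
print(sorted(index.get("indexes", set())))
```

The second lookup comes back empty because this naive indexer stores only the exact word forms it has seen. Whatever the indexing step misses cannot be recovered at search time.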

In other words, how can a user retrieve meaningful information or links to relevant websites or documents if the indexing has been faulty? No matter how complex the AI algorithms being developed are, they are still not good enough to master the complexities of human languages, which moreover are continually evolving.
