The Whys and Hows of Measuring Search Relevance

Answers to questions about almost everything are just a few clicks away in the digital world. With access to a search engine, human knowledge is at our fingertips. However, with a plethora of engines on the market, how do you find the one that offers the best value for money? Evaluate how much improvement a new search engine brings, and voila! Yet quantifying the quality of search remains a mystery to many people. To help you untangle that mystery and choose the best search engine, we have compiled a comprehensive guide to measuring the quality of search. Let’s get rolling.

Key Factors to Evaluate the Quality of Search

It is difficult for users to verbalize why they prefer one search portal over another. That is why the following factors are taken into consideration to assess the overall quality of the search.

  • Search Efficiency: The amount of effort that a user puts in to search for and locate the desired information is defined as search efficiency. The less effort involved, the more efficient the search engine.
  • Search Effectiveness: The frequency of typing, retyping, and revising the search terms in order to reach the desired information is known as search effectiveness. The less often users have to redefine the search query, the more effective the search engine.
  • Search Relevance: The evaluation of how closely a search result is related to the query is known as search relevance. With a high degree of search relevance, you can find the right information at the right time effortlessly.

Key Metrics to Evaluate the Quality of Search

Measuring the effectiveness of a search engine is a whole new ballgame. However, by gauging the search behavior of a customer, you can derive insights to ascertain the effectiveness of your search engine. Here are a few metrics that help us understand and measure search behavior.

  • Precision: The fraction of relevant search results among the total search results is known as precision. Let’s say you run a query “tote bags” and the engine returns 30 results – out of which, 15 are tote bags, 10 are jute lunch bags and 5 are laptop bags. The precision in this instance is 50% (15/30). There might be several reasons behind a low rate of precision but lack of context remains the core problem.
  • Recall: The ability of a search engine to retrieve all the relevant results from the corpus is known as recall. Let’s take the same search query, tote bags, as a separate example. Suppose 50 products in the catalog qualify as tote bags, but the search engine returns only 30 of them. The recall is then 30 out of 50, or 60%.
  • Mean Reciprocal Rank (MRR): Not all searches demand an equal amount of effort. Therefore, not all clicks are treated equally. A click on the top result is treated as a full click, whereas a click farther down the list counts as less than a full click because more effort is required to locate the desired information. We therefore need a more granular metric that accounts for the position of the clicked result in order to improve search relevance.

    Mean Reciprocal Rank (MRR) is one such approach. It calculates the average of the reciprocal ranks of the query responses and guides the search engine to put the most-desired result on top. How is a reciprocal rank calculated? It is the multiplicative inverse of the rank of the first correct answer: 1 for clicking the first result, ½ for clicking the second result, ⅓ for the third result, and so on. The higher the MRR, the more relevant and effective the search.

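    $$\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$$

    Here |Q| is the total number of clicks (one per query), and rank_i is the position of the clicked result for the i-th query; its inverse, 1/rank_i, is the reciprocal rank.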

  • Normalized Discounted Cumulative Gain (NDCG): This metric asserts that the highly relevant documents are more useful than moderately relevant documents, which are in turn more useful than irrelevant documents.

    Before we dive into NDCG, let’s understand its predecessors: Cumulative Gain (CG) and Discounted Cumulative Gain (DCG). CG is the sum total of all relevance scores in a recommendation set, irrespective of the rank of a result in the search results whereas DCG bridges the gap between relevance scores and the positioning of search results.

    Let’s consider the following two search result lists with relevance scores of individual documents: Set A [2+3+3+0+1+2] and Set B [3+3+2+2+1+0]. The cumulative gain for both sets is equal to 11, so both are equally good as per CG. However, a visitor is likely to be happier with Set B, as its results are displayed in decreasing order of relevance: the most-desired result is on top and takes negligible effort to reach. That’s exactly where DCG plays its part, because it discounts the relevance score of a result the farther down the list it appears.

    Although DCG sounds good at first, it is still incomplete: depending on multiple factors, the number of recommendations served may vary for each user. While calculating DCG for different queries, you’ll also discover that some queries are just harder than others and therefore produce lower DCG scores than easier queries. This is where NDCG steps in. It scales the DCG by the score of the best possible ordering of the results (called the ideal DCG or iDCG), so scores fall between 0 and 1 and can be compared across queries.

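    CG, DCG and NDCG at a particular rank p are defined as follows (DCG is shown with the commonly used logarithmic discount):

    $$CG_p = \sum_{i=1}^{p} rel_i \qquad DCG_p = \sum_{i=1}^{p} \frac{rel_i}{\log_2(i+1)} \qquad NDCG_p = \frac{DCG_p}{IDCG_p}$$

    rel_i is the graded relevance of the result at position i, and IDCG_p is the DCG at rank p of the ideal ordering, in which results are sorted by decreasing relevance.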

  • Clickthrough Rate: The fraction of search results that receive clicks is the clickthrough rate (CTR). Let’s say that of the 30 results returned by a search engine, you click on 3. The CTR in this case would be 10% (3/30).
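
The metrics above can be tried out on the article's own numbers with a minimal Python sketch. It is an illustration rather than a reference implementation: the function names are hypothetical, DCG is assumed to use the common log2(position + 1) discount, and the ideal DCG is assumed to be the DCG of the same result list sorted by decreasing relevance.

```python
import math


def precision(relevant_returned, total_returned):
    """Fraction of the returned results that are relevant."""
    return relevant_returned / total_returned if total_returned else 0.0


def recall(relevant_returned, total_relevant):
    """Fraction of all relevant items in the corpus that were returned."""
    return relevant_returned / total_relevant if total_relevant else 0.0


def clickthrough_rate(clicked, shown):
    """Fraction of the displayed results that received a click."""
    return clicked / shown if shown else 0.0


def mean_reciprocal_rank(click_ranks):
    """Average of 1/rank over searches.

    Ranks are 1-based positions of the clicked result; a value of None
    (no click) contributes 0 to the sum but still counts as a search.
    """
    if not click_ranks:
        return 0.0
    return sum(1.0 / rank for rank in click_ranks if rank) / len(click_ranks)


def cumulative_gain(relevances):
    """Sum of relevance scores, ignoring position."""
    return sum(relevances)


def dcg(relevances):
    """Discounted cumulative gain (assumed discount: log2(position + 1))."""
    return sum(rel / math.log2(pos + 1)
               for pos, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the same scores in ideal (descending) order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0


# Numbers taken from the examples in this article.
set_a = [2, 3, 3, 0, 1, 2]
set_b = [3, 3, 2, 2, 1, 0]

if __name__ == "__main__":
    print("precision:", precision(15, 30))          # tote bags example
    print("recall:", recall(30, 50))
    print("CTR:", clickthrough_rate(3, 30))
    print("MRR:", mean_reciprocal_rank([1, 2, 3]))  # clicks at ranks 1, 2, 3
    for name, scores in (("A", set_a), ("B", set_b)):
        print(name, cumulative_gain(scores), round(dcg(scores), 3),
              round(ndcg(scores), 3))
```

Under these assumptions, Set A and Set B share a CG of 11, but their DCG comes out at roughly 6.49 and 7.14 respectively, which gives Set A an NDCG of about 0.91 against 1.0 for Set B. That matches the intuition that Set B, sorted by decreasing relevance, is the better ordering.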

How Search Built on a Cognitive Framework Improves the Quality of Search Results

One search query often returns thousands of results. With the information jungle only getting bigger with each passing day, it becomes imperative to understand user intent and surface pertinent results with ease. Built on a foundation of natural language processing, cognitive search engines prioritize contextual relevance over word-for-word keyword matching. By taking into account user intent, spelling errors, synonyms, acronyms, and so on, they improve the precision of search.

Additionally, the taxonomy and nomenclature properties of cognitive search engines help with entity extraction and semantic tagging, thus narrowing the search results to precisely match the context, concept, or idea that the user is searching for. After all, if the search only returns a bunch of useless, unrelated results, your users might feel frustrated and start looking at alternatives to find what they need. Last but not least, cognitive search engines also let you manually tune results in order to meet specific user expectations and strike a balance between precision and recall.

Want to Incorporate Relevance Into Your Search Engine?

You may have an amazing portfolio of products or services, but a mediocre search experience can easily turn it all to dust. Therefore, incorporating relevance into a search engine is non-negotiable. To learn how you can build relevance into your search engine, download our white paper: The Anatomy of Search Relevance.
