Diversity Metrics

INFO

This tutorial outlines part of the workflow for the Informfully Recommenders repository. The Recommenders Pipeline provides an overview of all components, and the Tutorial Notebook offers hands-on examples of everything outlined here.

Gini Coefficient

The Gini coefficient is calculated for three features: category, sentiment, and party. For each feature, it quantifies how unequally the items in a given recommendation list are distributed across that feature's values. The smaller the value, the more equal the distribution: a value of 0 indicates perfect equality, while a value of 1 indicates perfect inequality. (In this context, diversity is equated with equality.)
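As a rough illustration of the idea, the sketch below computes a Gini coefficient over per-category item counts in a recommendation list. The function names (`gini`, `category_gini`) and the choice to count items per category are assumptions made for this example, not the repository's actual API, and the repository may use a different normalization. With this particular formula, a finite list of n values reaches at most (n - 1) / n, so 1 is only approached as n grows.

```python
from collections import Counter


def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def category_gini(item_categories, all_categories):
    """Hypothetical helper: Gini over how often each category appears."""
    counts = Counter(item_categories)
    # Categories absent from the list count as zero.
    return gini([counts.get(c, 0) for c in all_categories])


balanced = category_gini(
    ["sports", "politics", "sports", "politics"],
    ["sports", "politics"],
)
skewed = category_gini(
    ["sports", "sports", "sports", "sports"],
    ["sports", "politics", "culture", "economy"],
)
```

Here `balanced` evaluates to 0 because both categories appear equally often, while `skewed` is larger because a single category takes every slot.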

(Expected) Intra-List Distance

The intra-list distance is computed for four features: category, title (embeddings), sentiment, and party. It calculates the average pairwise dissimilarity between items in the recommendation list. The smaller the value, the more similar the items: a value of 0 indicates perfect similarity, while a value of 1 indicates perfect dissimilarity. (In this context, diversity is equated with dissimilarity.) Expected intra-list distance is a rank-aware version of intra-list distance; the same principles and interpretation apply. The main difference is that it takes the position and relevance of each item into account when assigning a value.
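The plain (not rank-aware) intra-list distance can be sketched as the mean of a distance function over all unordered item pairs. The helper names below and the two example distance functions are assumptions chosen for illustration; the repository may define distances differently for each feature. Note that cosine distance stays within [0, 1] only for vectors with non-negative components.

```python
import math
from itertools import combinations


def mismatch_distance(a, b):
    """Distance for categorical features: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0


def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def intra_list_distance(items, distance):
    """Average pairwise distance over all unordered pairs in the list."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0  # a list with fewer than two items has no pairs
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


ild_categories = intra_list_distance(
    ["sports", "sports", "politics"], mismatch_distance
)
ild_embeddings = intra_list_distance(
    [[1.0, 0.0], [0.0, 1.0]], cosine_distance
)
```

A rank-aware variant would additionally weight each pair by the positions and relevance of its items; that weighting is not shown here because its exact form depends on the implementation.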

RADio Divergence

| Metric | Item Feature | Value Range | Details | Interpretation |
| --- | --- | --- | --- | --- |
| Activation | Sentiment | [0,1] | Compares the sentiment distribution between the recommendation list and the item pool. | A higher value indicates greater divergence in sentiment distribution between the recommendation list and the item pool. |
| Calibration Category | Category | [0,1] | Compares the category distribution of the recommendation list with the user's category preferences (based on their reading history). | A higher value indicates greater deviation from the user's category preferences. |
| Calibration Complexity | Complexity | [0,1] | Compares the complexity distribution of the recommendation list with the user's complexity preferences (based on their reading history). | A higher value indicates greater deviation from the user's complexity preferences. |
| Fragmentation | Story | [0,1] | Quantifies the differences between story chain distributions in recommendation lists across users. | A higher value indicates greater variation in news story chains across the recommendation lists. |
| Alternative Voices | Minority and majority ratio | [0,1] | Compares the proportion of minority and majority viewpoints between the recommendation list and the item pool. | A higher value indicates greater disparity between minority and majority representation in the recommendation list and the item pool. |
| Representation | Party mentions | [0,1] | Measures divergence in the representation of political parties between the recommendation list and the item pool. | A higher value indicates greater divergence between the party mentions in the recommended articles and the item pool. |
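Each metric above compares two discrete distributions and reports a value in [0,1]. One divergence with exactly that range is the Jensen-Shannon divergence with a base-2 logarithm, so the sketch below uses it as a stand-in. This is an assumption for illustration: the function names are hypothetical, and the repository's implementation may differ (for example, by applying rank-aware weighting or another divergence).

```python
import math
from collections import Counter


def distribution(labels, support):
    """Relative frequency of each value in `support` among `labels`."""
    counts = Counter(labels)
    total = sum(counts[s] for s in support)
    if total == 0:
        raise ValueError("no labels fall within the given support")
    return [counts[s] / total for s in support]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, bounded in [0, 1]."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


sentiments = ["negative", "neutral", "positive"]
pool = distribution(
    ["negative", "neutral", "neutral", "positive"], sentiments
)
recommended = distribution(["positive", "positive", "neutral"], sentiments)
activation_like = js_divergence(recommended, pool)
```

In this example `activation_like` is greater than 0 because the recommendation list over-represents positive items relative to the pool; identical distributions would yield 0, and distributions with no shared values would yield 1.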