This thesis systematically analyzes offline fairness evaluation measures for recommender systems, revealing flaws in interpretability, expressiveness, and applicability. It proposes novel evaluation approaches and practical guidelines for selecting appropriate measures, directly addressing the confusion caused by un-validated metrics. Key Takeaways This thesis systematically analyzes offline fairness evaluation measures for recommender systems, revealing flaws in interpretability, expressiveness, and applicability. It proposes novel evaluation approaches and practical guidelines for selecting appropriate measures, directly addressing the confusion caused by un-validated metrics. What Happened A new thesis published on arXiv (2604.25032) tackles a growing problem in the AI community: the widespread use of fairness evaluation measures for recommender systems without proper scrutiny of their robustness.…