Ranking: Compare, don’t score

A. Ammar, D. Shah
2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 776-783

The need to rank items based on user input arises in many practical applications such as elections, group-decision making and recommendation systems. The primary challenge in such scenarios is to decide a global ranking based on partial preferences provides by users. The standard approach to address this challenge is to ask users to provide explicit numerical ratings (cardinal information) of a subset of items. The main appeal of such an approach is the ease of aggregation. However, the rating scale as well as the individual ratings are often arbitrary and may not be consistent from one user to another. A more natural alternative to numerical ratings requires users to compare pairs of items (ordinal information). In contrast to cardinal information, such comparisons provide an “absolute” indicator of the user’s preference. However, it is often hard to combine or aggregate comparisons to obtain a consistent global ranking. In this work, we provide a tractable framework for utilizing comparison data as well as first-order marginal information for the purpose of ranking. We treat the available information as partial samples from an unknown distribution over permutations. Using the Principle of Maximum Entropy, we devise a concise parameterization of distribution consistent with observations using only O(n2) parameters, where n is the number of items in question. We propose a distributed, iterative algorithm for estimating the parameters of the distribution. We establish the correctness of the algorithm as well as identify the rate of convergence explicitly. Using the learnt distribution, we provide efficient approach to (a) learn the mode of the distribution using ‘maximum weight matching’, (b) identification of top k items, and (c) an aggregate ranking of all n items. Through evaluation of our approach on real-data, we verify effectiveness of our solutions as well as the scalability of the algorithm.