Wednesday, September 15, 2010

Thoughtful Thursday, Slater and Kemeny Voting Systems

Computing Slater Rankings Using Similarities Among Candidates

by Vincent Conitzer, Electronic Commerce 2006 (Search google.com or citeseerx.ist.psu.edu or scholar.google.com to find the full text of this article.)

As Dr. Conitzer wisely reminds us, the Slater index and Kemeny index are different but related. Let's say every voter gives us their preferences or rankings; maybe the faculty in a department give us their rankings of the incoming graduate students. How do we get a single ranking over all the alternatives? The Slater method gives us the ranking that gets the fewest pairwise elections wrong.

We know from Arrow's theorem that no mechanism for combining the rankings can satisfy every reasonable criterion. One of the problems is that a mechanism might be disturbed by an irrelevant alternative. Assume that A wins an election against B, and we throw in a candidate Q whom everyone dislikes compared to A and B. Adding Q to the list of alternatives, or deleting it, should not change whether A or B wins.

20%   20%   40%   20%
 C     A     D     A
 A     B     B     V
 B     D     A     w
 Q     Q     Q     B
 V     V     V     Q

(Each column is a group of voters, shown with the fraction of the electorate holding that ranking; candidates are listed from most to least preferred.)
If A and B had a pairwise election in the above example, A would win. Any final ranking should ideally put A above B, but under Condorcet-paradox conditions that may not be achievable; that is, the final ranking might have B above A.

The Slater index charges such a ranking a "blech" factor of one for putting B above the more preferred A, even though 40% of the voters prefer B. If the ranking also put V above Q, that would be another "blech" factor of one, even though only twenty percent of voters would like that bad comparison. The Kemeny index instead looks at how many voters dislike each comparison: in the above case, putting B above A gets a blech factor of 60%, and putting V above Q a blech factor of 80%. The goal of both the Slater and the Kemeny method is to minimize the total "blech" factor over all the bad comparisons.
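
To make the two penalties concrete, here is a small Python sketch (my own illustration, not code from the paper) that scores a proposed ranking against the example profile above: the Slater penalty charges one per reversed pairwise majority, while the Kemeny penalty charges the fraction of voters who disagree with each pair.

```python
from itertools import combinations

# The example profile above: (fraction of voters, ranking from best to worst).
# Not every column ranks the same candidates, so a pair is only counted among
# voters who rank both; the third entry of the last column is kept exactly as
# it appears in the table.
profile = [
    (0.20, ["C", "A", "B", "Q", "V"]),
    (0.20, ["A", "B", "D", "Q", "V"]),
    (0.40, ["D", "B", "A", "Q", "V"]),
    (0.20, ["A", "V", "w", "B", "Q"]),
]

def support(a, b):
    """Fraction of voters (who rank both) placing a above b."""
    return sum(w for w, r in profile
               if a in r and b in r and r.index(a) < r.index(b))

def blech(ranking):
    """(Slater penalty, Kemeny penalty) of a proposed ranking, best to worst."""
    slater, kemeny = 0, 0.0
    for a, b in combinations(ranking, 2):   # the proposed ranking puts a above b
        against = support(b, a)             # voters who wanted b above a instead
        kemeny += against                   # Kemeny: weight by how many disagree
        if against > support(a, b):         # Slater: flat penalty of one per
            slater += 1                     # reversed pairwise majority
    return slater, kemeny

print(support("A", "B"))            # 0.6 -> putting B above A costs 0.6 under Kemeny
print(support("Q", "V"))            # 0.8 -> putting V above Q costs 0.8 under Kemeny
print(blech(["A", "B", "Q", "V"]))  # (0, ...): no pairwise majority is reversed
```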

Unfortunately, finding the optimal ranking under either measure is NP-hard; that is, there is no known efficient algorithm for computing what the best ranking would be.
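
For a handful of candidates one can still brute-force the optimum by scoring every possible ranking; the toy sketch below (my own, not an algorithm from the paper) does exactly that, and the factorial loop over all orderings is precisely why this approach stops working as the number of candidates grows.

```python
from itertools import permutations

# A small cyclic profile: (number of voters, ranking best to worst).
# Pairwise, A beats B, B beats C, and C beats A, so no ranking can agree
# with every majority.
profile = [(4, ["A", "B", "C"]), (3, ["B", "C", "A"]), (2, ["C", "A", "B"])]
candidates = ["A", "B", "C"]

def kemeny_cost(ranking):
    """Number of voter pairwise comparisons the ranking disagrees with."""
    return sum(n
               for i, a in enumerate(ranking)
               for b in ranking[i + 1:]          # the ranking says a above b
               for n, r in profile
               if r.index(b) < r.index(a))       # ...but these voters say b above a

# Exhaustive search over all m! orderings; fine for 3 candidates,
# hopeless for 30, which is the whole difficulty.
best = min(permutations(candidates), key=kemeny_cost)
print(best, kemeny_cost(best))   # ('A', 'B', 'C') with 10 disagreements
```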

Dr. Conitzer found a good way to compute the best ranking by finding groups of candidates that can be treated the same with respect to all other candidates, which he calls "similar" candidates. Example: A, B, C, D, E, F, G, H, I, J are running.

A defeats E
A defeats F
B defeats E
B defeats F
C defeats E
C defeats F
D defeats E
D defeats F

E defeats G
F defeats G
E defeats H
F defeats H
E defeats I
F defeats I
E defeats J
F defeats J
Dr. Conitzer's algorithm would find that E and F are similar: every other candidate either beats both of them or loses to both of them. He builds a hierarchy of sets and subsets in which each level of the hierarchy is a similarity set.
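
As a rough illustration of what "similar" means here (a sketch of my own, not Dr. Conitzer's actual algorithm, which goes on to build a whole hierarchy of such sets), two candidates are similar when every other candidate fares exactly the same against both of them:

```python
# The pairwise results listed above, as (winner, loser) pairs.
defeats = {("A", "E"), ("A", "F"), ("B", "E"), ("B", "F"),
           ("C", "E"), ("C", "F"), ("D", "E"), ("D", "F"),
           ("E", "G"), ("F", "G"), ("E", "H"), ("F", "H"),
           ("E", "I"), ("F", "I"), ("E", "J"), ("F", "J")}
candidates = sorted({c for pair in defeats for c in pair})

def similar(x, y):
    """True if every other candidate beats, or loses to, x and y alike."""
    return all(((z, x) in defeats) == ((z, y) in defeats) and
               ((x, z) in defeats) == ((y, z) in defeats)
               for z in candidates if z not in (x, y))

# The example only spells out results involving E and F, so that is the pair
# we can meaningfully check; with a full tournament one would scan all pairs
# the same way.
print(similar("E", "F"))   # True: A through D beat both, G through J lose to both
```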

So what? If we have thirty candidates (they could be the alternatives in a referendum with ten issues) and 191 voters, his algorithm can find the ranking almost instantaneously, while other algorithms take forty seconds or so. And it finds a solution in cases where the others cannot.

Now this doesn't handle millions of voters--should we use some sort of approximation?

Improved Bounds for Computing Kemeny Rankings

Vincent Conitzer, Andrew Davenport, Jayant Kalagnanam.

Assume there is an election between two alternatives, A and B. Each voter can only tell which is better, or more correct, with a probability slightly above 50%, and votes for the one that seems best to them. We saw that if one simply takes the majority, there is a very high likelihood, in a reasonably sized electorate, of getting the correct answer or best alternative. This is Condorcet's jury theorem.
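
A quick back-of-the-envelope calculation (my own, just to put numbers on the jury theorem) shows how fast a small individual edge turns into near certainty for the majority:

```python
from math import comb

def majority_correct(n, p):
    """Probability that a strict majority of n independent voters,
    each right with probability p, picks the better of two alternatives."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

for n in (11, 101, 1001):
    print(n, round(majority_correct(n, 0.55), 3))
# With p = 0.55, the majority is right roughly 63% of the time with 11 voters,
# about 84% with 101, and better than 99.9% with 1001.
```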

But what if there are dozens of alternatives? It turns out that if each voter has an imperfect idea of which decisions are best, the Kemeny method gives the best answer, in the sense of being the maximum-likelihood estimate of the correct ranking. Other voting systems correspond to other noise models, which is a different paper and a later Thoughtful Thursday.
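
Here is a toy simulation in the spirit of that claim; it is my own illustration, not the paper's model, and it makes the simplifying Condorcet-style assumption that each voter judges every pair independently and is right with probability p. The ranking that minimizes total pairwise disagreement with the tallies recovers the hidden "true" ranking far more often than chance:

```python
import random
from itertools import combinations, permutations

random.seed(0)
truth = ["A", "B", "C", "D"]      # the hidden "correct" ranking
p, voters, trials = 0.6, 25, 200  # each pairwise judgment is right with probability p

def kemeny_ranking(tally):
    """Ranking minimizing total disagreement with the pairwise tallies."""
    def cost(r):
        return sum(tally[(b, a)]              # voters who said b above a...
                   for i, a in enumerate(r)
                   for b in r[i + 1:])        # ...while r says a above b
    return list(min(permutations(truth), key=cost))

hits = 0
for _ in range(trials):
    tally = {}
    for a, b in combinations(truth, 2):       # the truth says a above b
        right = sum(random.random() < p for _ in range(voters))
        tally[(a, b)], tally[(b, a)] = right, voters - right
    hits += kemeny_ranking(tally) == truth
print(hits / trials)   # well above the 1/24 chance level, despite p being only 0.6
```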

Of course, there is no known efficient way to solve large instances of these problems. Dr. Conitzer raised the issue of approximations to the Kemeny method, but is that what people were promised? We have seen very close elections, and we certainly would not want the result to be an approximation that could be wrong. Dr. Conitzer and his coauthors developed algorithms to compute the Kemeny ranking exactly. I won't give the detailed description, because I doubt that the readers are interested, and if they are, the article describes it far better than I could. But three ideas from operations research and computer science are important:

  1. graph theory
  2. linear programming
  3. integer programming
Many other techniques in social choice theory represent elections as a graph. The nodes are the candidates and the edges represent the vote counts of the pairwise elections: if A would beat B by 32 votes, then Dr. Conitzer's graph would have an arrow from A to B with a 32 on it. The linear programming relaxation deviates from optimality by less than one percent. And if most people agree on part of the order, e.g., that the Nazi Party candidate is not wanted, that dramatically improves the ability to find a good ranking. That seems good, but what if an electorate is highly polarized, say between socialized medicine and no government intervention? One interesting study would be to look at a population where there are two consensuses around very different positions.
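
To give a feel for how the integer program looks, here is a minimal sketch of the standard formulation (using the PuLP library purely as an illustrative solver; the paper's actual formulation, bounds, and solver are its own). There is a binary variable per ordered pair of candidates, a constraint that exactly one direction is chosen per pair, and triangle constraints that rule out cycles, so the chosen variables describe a genuine ranking; the objective charges each ordered pair by the number of voters who wanted the opposite order.

```python
from itertools import combinations, permutations
import pulp

# Pairwise tallies from a toy profile: votes[(a, b)] = voters preferring a to b.
# These are the edge weights of the majority graph described above.
candidates = ["A", "B", "C"]
votes = {("A", "B"): 6, ("B", "A"): 3,
         ("B", "C"): 7, ("C", "B"): 2,
         ("A", "C"): 4, ("C", "A"): 5}

prob = pulp.LpProblem("kemeny", pulp.LpMinimize)
# x[(a, b)] = 1 means a is ranked above b in the final ranking.
x = {(a, b): pulp.LpVariable(f"x_{a}_{b}", cat="Binary")
     for a, b in permutations(candidates, 2)}

# Objective: ordering a above b costs one "blech" per voter who wanted b above a.
prob += pulp.lpSum(votes[(b, a)] * x[(a, b)] for a, b in permutations(candidates, 2))

# Exactly one order per pair, and no 3-cycles, so the x's describe a real ranking.
for a, b in combinations(candidates, 2):
    prob += x[(a, b)] + x[(b, a)] == 1
for a, b, c in permutations(candidates, 3):
    prob += x[(a, b)] + x[(b, c)] + x[(c, a)] >= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
ranking = sorted(candidates,
                 key=lambda c: -sum(x[(c, d)].value() for d in candidates if d != c))
print(ranking, pulp.value(prob.objective))   # A > B > C with 10 disagreements
```

Relaxing the variables from binary to continuous values between 0 and 1 gives the linear programming relaxation mentioned above.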

An interesting idea would be to hold a competition after an election to determine the optimal Kemeny ranking. Different groups of computing scientists could each produce a ranking along with its Kemeny distance. (It is very easy to verify the Kemeny distance of a given ranking.) The ranking with the minimum Kemeny distance would be the winner. Thus, an aggrieved group that narrowly lost an election could pay for supercomputer time to try to find a better ranking that, they hope, would give them what they wanted. One could have some sort of prize for the team of operations researchers and programmers that found the winner.

As I believe I mentioned, if there are thousands of alternatives, e.g., tax codes with slightly different structures, each voter would vote on only a subset of the alternatives. Drs. Kenyon-Mathieu and Schudy spoke about this in their article "How to Rank with Few Errors," where they said, "In statistics and psychology, one motivation is ranking by paired comparisons: here, you wish to sort some set by some objective but you do not have access to the objective, only a way to compare a pair and see which is better; for example, determining people's preferences for types of food" or perhaps preferences for tax codes. (I will Thoughtful Thursday this material here.) These partial ballots would be combined in the same manner as described here, as if each voter had given a full preference order.
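
As a hedged sketch of how such partial ballots might feed the same machinery (my own illustration with made-up alternative names, not a method from either paper), each ballot ranks only the alternatives its voter actually considered, and the pairwise tallies simply accumulate over whatever pairs each ballot covers:

```python
from itertools import combinations
from collections import defaultdict

# Each ballot ranks only the subset of alternatives its voter considered.
ballots = [
    ["flat_tax", "current_code"],
    ["progressive", "flat_tax", "vat"],
    ["current_code", "vat"],
    ["progressive", "current_code", "flat_tax"],
]

# Accumulate pairwise counts over whichever pairs each ballot happens to cover;
# the resulting tally feeds the same Kemeny/Slater machinery as full ballots would.
tally = defaultdict(int)
for ballot in ballots:
    for a, b in combinations(ballot, 2):    # a is listed above b on this ballot
        tally[(a, b)] += 1

for (a, b), n in sorted(tally.items()):
    print(f"{a} over {b}: {n}")
```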
