Sensible Analysis of Censored Rank Data

This problem commonly arises in market research data, and the solutions I have seen to date just don’t seem to cut it (typically, they throw away too much data or do some averaging of ranks which I just don’t believe in).

The problem has something in common with MADM (Multi Attribute Decision Making), which is a very interesting and challenging area in its own right. Google ELECTRE if you are interested, or have a look at the paper “Ranking Projects Using the Electre Method” by John Buchanan: www.esc.auckland.ac.nz/organisations/orsnz/conf33/papers/p58.pdf

However, MADM is not directly applicable, so we need to forge ahead on our own.

The problem: I have censored rank order data. Electors have been asked to rank the 4 most important issues out of a list of 20.

For each individual we therefore have a vector of 20 measurements, each coded 1 to 5, where 1 to 4 are ranks and 5 means not ranked (that is, less important than the nominated 4).
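To make the coding concrete, here is a minimal Python sketch of how one elector's answer could be turned into such a vector. The function name `encode_ballot` and the 0-based issue indices are my own illustrative choices, not part of any survey tool.

```python
NOT_RANKED = 5  # code for "not among the nominated 4"


def encode_ballot(top_choices, n_issues=20, not_ranked=NOT_RANKED):
    """Turn an ordered list of chosen issues into a coded vector.

    top_choices[0] is the most important issue (0-based index),
    top_choices[1] the next most important, and so on.
    """
    if len(set(top_choices)) != len(top_choices):
        raise ValueError("an issue cannot be ranked twice")
    codes = [not_ranked] * n_issues
    for rank, issue in enumerate(top_choices, start=1):
        codes[issue] = rank
    return codes


# One elector who ranked issue 7 first, then issues 0, 12 and 3.
ballot = encode_ballot([7, 0, 12, 3])
```

Every unranked issue gets the same code, which is exactly the censoring: we know those 16 issues sit below the nominated 4, but not how they compare with each other.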

I would like to be able to use the full information in the ranking, not just rely on first preferences. Nor do I want to average ranks, which appears to be common practice.

I would like to make statements of the sort ‘I am at least 80% confident that the most important issue is “B”’.

The immediate thought is some sort of simulation/bootstrapping, which should be straightforward enough if I use just the first rank to denote “most important”… but that ignores the information contained in the lower ranks.
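A rough sketch of the first-rank-only bootstrap, in Python, might look like the following. The toy ballots (5 issues rather than 20) and the function name `bootstrap_top_issue` are made up for illustration, and reading the resulting share as a "confidence" is itself an assumption that deserves scrutiny.

```python
import random
from collections import Counter


def bootstrap_top_issue(ballots, n_boot=2000, seed=0):
    """Share of bootstrap resamples in which each issue has the most first ranks."""
    rng = random.Random(seed)
    firsts = [ballot.index(1) for ballot in ballots]  # issue coded 1 per elector
    wins = Counter()
    for _ in range(n_boot):
        # Resample electors with replacement, keeping the sample size fixed.
        counts = Counter(rng.choices(firsts, k=len(firsts)))
        best = max(counts.values())
        winners = [issue for issue, c in counts.items() if c == best]
        for issue in winners:  # split the credit when there is a tie
            wins[issue] += 1.0 / len(winners)
    return {issue: w / n_boot for issue, w in wins.items()}


# Toy data: 100 electors, 5 issues, codes 1..4 are ranks and 5 is "not ranked".
ballots = ([[2, 1, 3, 4, 5]] * 60
           + [[1, 2, 3, 4, 5]] * 30
           + [[3, 2, 1, 4, 5]] * 10)
shares = bootstrap_top_issue(ballots)
```

Here `shares[1]` is the fraction of resamples in which issue 1 collected the most first ranks. Note that only the position of the code 1 is ever read, so ranks 2 to 4 contribute nothing, which is precisely the weakness of this approach.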

My next thought was that I should attempt some form of ordination: unidimensional scaling, perhaps using a 1 dimensional cmdscale solution (cmdscale being R's classical multidimensional scaling function). This rests on the ability to build a suitable distance matrix, which I think is possible. Or maybe a form of Thurstone scaling.
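As a sketch of what the scaling idea could look like, the Python below builds one possible distance matrix and extracts a single classical-scaling dimension. Two things here are assumptions of mine: the distance (mean absolute difference in coded rank between two issues) is just one candidate, and the leading eigenvector is found by plain power iteration rather than the full eigendecomposition that cmdscale performs.

```python
import math


def issue_distances(ballots):
    """Assumed distance: mean absolute difference in coded rank between issues."""
    n, m = len(ballots), len(ballots[0])
    return [[sum(abs(b[i] - b[j]) for b in ballots) / n for j in range(m)]
            for i in range(m)]


def classical_scaling_1d(dist, n_iter=500):
    """First classical (Torgerson) scaling coordinate via power iteration."""
    m = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / m for row in sq]
    grand = sum(row_mean) / m
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(m)] for i in range(m)]
    v = [float(i + 1) for i in range(m)]  # arbitrary non-constant start vector
    for _ in range(n_iter):
        w = [sum(b[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient estimates the eigenvalue that goes with v.
    bv = [sum(b[i][j] * v[j] for j in range(m)) for i in range(m)]
    eig = sum(x * y for x, y in zip(v, bv))
    scale = math.sqrt(max(eig, 0.0))  # guard against a negative eigenvalue
    return [x * scale for x in v]


# Toy data: 100 electors, 5 issues, codes 1..4 are ranks and 5 is "not ranked".
ballots = ([[2, 1, 3, 4, 5]] * 60
           + [[1, 2, 3, 4, 5]] * 30
           + [[3, 2, 1, 4, 5]] * 10)
coords = classical_scaling_1d(issue_distances(ballots))
order = sorted(range(len(coords)), key=lambda i: coords[i])
```

The sign of the axis is arbitrary, so `order` lists the issues from one end of the scale to the other without saying which end is "most important"; that has to be read off from the data, for example from which end holds the issues with the most first ranks.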

Finally, we might build a model. A suggestion (courtesy of Jonathon Baron) is to build a model in which each issue has a mean and standard deviation for its utility. The ranks for each subject are based on the 20 numbers drawn from the model, and to fit the model we need to optimize some error function (the mismatch between observed and predicted ranks). That sounds promising.
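Here is a minimal Python sketch of the model idea, under assumptions of my own: utilities are independent normal draws, the predicted ranks are obtained by simulation, and the error function is the squared difference between observed and simulated shares of each code for each issue. The names `simulate_ballots`, `code_shares` and `mismatch` are hypothetical, and no optimizer is included.

```python
import random


def simulate_ballots(means, sds, n_respondents, n_ranked=4, seed=0):
    """Draw utilities per respondent and keep only the top n_ranked as ranks."""
    rng = random.Random(seed)
    n_issues = len(means)
    ballots = []
    for _ in range(n_respondents):
        utils = [rng.gauss(m, s) for m, s in zip(means, sds)]
        best_first = sorted(range(n_issues), key=lambda i: utils[i], reverse=True)
        codes = [n_ranked + 1] * n_issues  # censored: "not ranked"
        for rank, issue in enumerate(best_first[:n_ranked], start=1):
            codes[issue] = rank
        ballots.append(codes)
    return ballots


def code_shares(ballots, n_codes=5):
    """For each issue, the share of respondents giving it each code 1..n_codes."""
    n = len(ballots)
    shares = [[0.0] * n_codes for _ in ballots[0]]
    for ballot in ballots:
        for issue, code in enumerate(ballot):
            shares[issue][code - 1] += 1.0 / n
    return shares


def mismatch(observed, means, sds, n_sim=2000, seed=1):
    """Sum of squared differences between observed and simulated code shares."""
    obs = code_shares(observed)
    sim = code_shares(simulate_ballots(means, sds, n_sim, seed=seed))
    return sum((o - s) ** 2
               for obs_row, sim_row in zip(obs, sim)
               for o, s in zip(obs_row, sim_row))


# Artificial data: 8 issues, the first three genuinely more important.
true_means = [2.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
sds = [1.0] * 8
observed = simulate_ballots(true_means, sds, n_respondents=500, seed=42)
loss_true = mismatch(observed, true_means, sds)
loss_flat = mismatch(observed, [0.0] * 8, sds)
```

On this artificial dataset the parameters that generated the data should give a much smaller mismatch than a model in which all issues are equally important. Fitting would mean searching over the means and standard deviations to minimize `mismatch`, bearing in mind that utilities are only identified up to location and scale, so some parameters would need to be pinned down first.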

If I get the time I will build an artificial dataset and see how the various approaches perform.

In the meantime, if your MR supplier assures you that “XYZ is the MOST IMPORTANT issue”, question them a bit more closely as to how they did the analysis (and whether it was just based on first ranks or silly averaging).

Update: there are some possible clues in the “choice modelling” literature:

http://links.jstor.org/sici?sici=0735-0015(199901)17%3A1%3C117%3AMLMIPR%3E2.0.CO%3B2-X

http://www.springerlink.com/index/GV2264QWG1476842.pdf
