Tuesday, December 2, 2014

The Ginormous Stackrank of Human Experiences

I've decided to accouche an idea that began over four years ago.

Back then, I was freshly emerging from the ethics-heavy portion of my graduate education. The moral reasoning models I was learning copulated with the decision analysis tools I was exposed to, and my brain conceived The Carmack Vector Addition Theory of Ethics: Advancing the Ball.

In its first trimester, the idea was mostly geared toward enabling a more rigorous mathematical approach to ethical decision making. As the idea continued to gestate, I developed some alternative titles for the approach: "Mathematizing Morality" or "Quantifying Compassion." I also debated various designs and objectives. Eventually, though, the example of Facemash from The Social Network (Mark Zuckerberg developed a website that let visitors compare two student pictures side by side and choose who was "hot" and who was "not") prevailed due to its simplicity. My intention now is to create a giant stack rank of human experiences.

How would the system work? I'll go into greater detail below, but the crux of the system is pretty simple: users choose which experience they prefer out of a pair. For example, you might be asked, "Which do you prefer?" between (A) graduating from college and (B) falling in love. You select one, then move on to the next pair the system feeds you and repeat. The end result after millions of selections is a giant, robust stack rank of human experiences.

Below I detail (1) Rules, (2) Approach, (3) Next Steps, (4) Problems/Solutions, (5) Initial List, and (6) Further Commentary.

Rules

  1. You can only select between experiences you've actually had
  2. No experience in the list can exceed a "2" level of detail
    1. 1 = Having coffee
    2. 2 = Having coffee with a friend
    3. 3 = Having coffee with a friend in the morning
  3. You must answer honestly

Approach

A user clicks on a link and arrives on the landing page/app home. There are two options: "View the List" or "Participate." If the user chooses "View the List", they are taken to the stackrank where they can search and browse.

If the user chooses "Participate," he or she is given a Batch (40 experiences). The experiences all have three options: "I have experienced this", "I haven't experienced this", or "this experience doesn't qualify, e.g. it exceeds the level of detail or is not an actual human experience" (the last option is for quality control). The default, "I haven't experienced this", is selected for all. The user selects the appropriate option for all 40 experiences, based on his/her own past.

The user is then fed a set of between 10 and 100 experience pairs (only experiences the user indicated s/he has had are presented). Each pair has two options; for example, (A) graduating from college and (B) falling in love. The user selects A or B, and is then shown the next pair as well as a progress bar (e.g. pair 2 of 100). At the end of the set of pairs, the user is given the option to add a question of his/her own. If s/he chooses "no thanks," s/he is returned to the landing page/app home. If s/he chooses "add a question," s/he is taken to a screen where s/he submits a question (some brief submission guidelines display).

The system randomly-ish presents the new submissions to subsequent participants, and uses the results to update G-SHE (Ginormous Stackrank of Human Experiences) in real time, much as a chess rating system would. The system also feeds pairs in a strategic way (e.g. doesn't often ask participants whether they'd rather fall in love vs. lose a child) in order to elicit the most differential inputs, similar to the methodology for pairing opponents in large-bracket sport competitions.
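The real-time update step could look a lot like a chess rating update. Here's a minimal Python sketch using the plain Elo formulas with a conventional K-factor of 32; the function names are my own illustration, not part of any existing system:

```python
# Elo-style update for one "Which do you prefer?" answer.
# K = 32 is a conventional starting value, not a tuned choice.

def expected_score(rating_a, rating_b):
    """Probability that experience A is preferred over experience B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_ratings(rating_a, rating_b, a_chosen, k=32.0):
    """Return new (rating_a, rating_b) after one pairwise selection."""
    e_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_chosen else 0.0
    rating_a += k * (score_a - e_a)
    rating_b += k * ((1.0 - score_a) - (1.0 - e_a))
    return rating_a, rating_b

# Example: both experiences start at 1500; the user prefers "falling in love".
love, graduating = update_ratings(1500.0, 1500.0, a_chosen=True)
print(round(love), round(graduating))  # → 1516 1484
```

Each selection nudges the two experiences apart by an amount proportional to how surprising the result was, which is exactly why millions of selections converge toward a stable stack rank.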

Next steps


  1. Determine if G-SHE (or a list substantially like it) is already out there in the world somewhere. If so, consider abandoning or redirecting the effort
  2. Decide which rating system to use
    1. Elo - the most common; used to rank chess players
    2. Glicko - Elo + ratings reliability
    3. Glicko-2 - Glicko + ratings volatility
    4. There may be a better rating system - these are just the first three I've researched so far
    5. I'm thinking Glicko-2
  3. Identify an existing list of human experiences to start with
  4. Develop the tool
  5. Distribute the tool
  6. Manage the tool


Problems/Solutions

This effort will doubtless run into numerous problems as it proceeds; I'll start capturing them below, each problem followed by candidate solutions.

Similar experiences submitted (duplicates)
-Use existing tech to detect similar submissions and have a human decide whether they're essentially duplicates, then merge if yes
-Whatever approach is taken to solve this problem in comparable settings, such as user feedback fora
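As a low-tech starting point for flagging likely duplicates for human review, fuzzy string matching would go a long way. A sketch using Python's standard-library difflib, where the 0.8 similarity threshold is an arbitrary illustration, not a tested value:

```python
# Flag existing experiences whose wording closely matches a new submission,
# so a human can decide whether to merge them. Threshold is illustrative.
from difflib import SequenceMatcher

def likely_duplicates(new_submission, existing, threshold=0.8):
    """Return existing experiences that closely match the new one."""
    new_norm = new_submission.lower().strip()
    return [e for e in existing
            if SequenceMatcher(None, new_norm, e.lower().strip()).ratio() >= threshold]

existing = ["Having coffee with a friend", "Falling in love", "Going fishing"]
print(likely_duplicates("having coffee with friends", existing))
# → ['Having coffee with a friend']
```

Anything flagged would go to a human (or later, a crowd-moderation queue) rather than being merged automatically.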

Too little participation
-Could display leaderboards - e.g. who's submitted the most qualifying questions, who's submitted the most selections, etc.
-Could pay folks on mechanical turk to participate
-Could ask volunteers or ethics students to participate
-Could display the full list only if the user participates (only give a sample of the list until the user participates)
-Could exchange statistical analysis of the results for participation
-Whatever other solutions survey firms use to solve this problem

Quality of questions
-Enable a button on the selection screen to "recommend removing this experience" (usually because it (1) is not an actual human experience or (2) exceeds a "2" level of detail)
-Enable a button on the "have you had this experience" screen to recommend removal
-Enable Wiki-style comments, or some crowd-based moderation approach used in comparable settings such as Wikipedia

Bots complete batches
Leverage existing human-detection tech and restrict participation to humans

Same person selects between the same pair 2+ times
Authenticate users, or require a sign-in so the system can avoid presenting a pair that a given user has already seen
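The sign-in idea reduces to persisting, per user, the set of unordered pairs already shown. A minimal sketch (all names are illustrative, not from any existing codebase):

```python
# Track which unordered pairs each signed-in user has already judged,
# so the same user is never asked about the same pair twice.
seen_pairs = {}  # user_id -> set of frozenset({experience_a, experience_b})

def should_present(user_id, exp_a, exp_b):
    """True if this user hasn't yet chosen between these two experiences."""
    pair = frozenset((exp_a, exp_b))  # unordered: (A, B) == (B, A)
    return pair not in seen_pairs.get(user_id, set())

def record_presented(user_id, exp_a, exp_b):
    seen_pairs.setdefault(user_id, set()).add(frozenset((exp_a, exp_b)))

record_presented("alice", "Falling in love", "Skydiving")
print(should_present("alice", "Skydiving", "Falling in love"))  # → False
print(should_present("alice", "Skydiving", "Going fishing"))    # → True
```

Using a frozenset makes the check order-independent, so swapping which experience appears as (A) versus (B) doesn't slip past the filter.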

Participant lies
-Exclude from the effective data set any results from participants whose selection profiles vary more than 3 standard deviations from the median
-Use some other "smart" techniques to detect likely liars and underweight or eliminate their responses from the calculations
-Require a set amount of time on each question (similar to completing the blood donation questionnaire) to disincentivize speeding through the questions
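The 3-standard-deviation filter could be sketched as follows, scoring each participant by how often their picks agree with the current consensus ranking. The agreement-rate framing and all names are my own assumptions about how "selection profile" would be operationalized:

```python
# Drop participants whose agreement rate with the consensus ranking sits
# more than 3 standard deviations from the median rate. Illustrative only.
import statistics

def filter_outliers(agreement_rates, n_sd=3.0):
    """agreement_rates: {participant_id: fraction of picks matching consensus}."""
    values = list(agreement_rates.values())
    med = statistics.median(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return dict(agreement_rates)  # everyone identical; nothing to drop
    return {pid: r for pid, r in agreement_rates.items()
            if abs(r - med) <= n_sd * sd}

rates = {"u1": 0.78, "u2": 0.79, "u3": 0.80, "u4": 0.80, "u5": 0.81,
         "u6": 0.81, "u7": 0.82, "u8": 0.82, "u9": 0.83, "u10": 0.84,
         "bot": 0.05}
kept = filter_outliers(rates)
print("bot" in kept)  # → False
```

One caveat: with only a handful of participants, a single extreme value inflates the standard deviation enough to hide itself, so a cut this strict only bites once participation is reasonably large.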

Participant tires due to quantity of pairings
-Cap the number of pairs a user can complete per day/week
-Allow completion in batches that don't exceed a defined number of pairs


Initial List

I hope to find an existing list of human experiences that comply with rule #2, so I don't have to reinvent the wheel. However, the approach is scalable even if I do have to start from scratch. Here's a candidate initial list:

Being displaced due to a civil war
Waking up after a good sleep
Having sex
Skydiving
Giving birth
Mastering a foreign language
Being tortured for over six months
Losing a life partner
Going fishing
Having an accomplishment recognized at work
Eating lunch
Reading a book for pleasure
Your child dying
Voting in a meaningful government election
Having coffee with a friend
Taking a nap
Falling in love

Further commentary


  • I hope this list will be a useful tool for preference utilitarians. Though I'm not 100% sure yet of all the applications for this stack rank, I expect creative applications will be identified and developed by those who become acquainted with the result. I can imagine think tanks, policy analysts, ethicists, and others being interested in the data; demographers might collect rich data on the participants, then categorize and analyze the results. I also think the average person would be fascinated by the list itself: how interesting it would be to browse and see how various experiences rank!
  • Q. Why the "2" level of detail? A. To engender consistency and simplicity. The greater the complexity, the more difficult (and potentially less reliable) the preferences become. Plus, constraining the base unit worked well for Twitter... 
  • Q. As sales and marketing professionals will tell you, people's actual choices are better measures of their preferences than what they choose in survey responses. How do you solve for that? A. I don't: that's a weakness in my approach. However, since not all experiences are chosen (say, being raised Catholic), my approach enables a comparison of a greater breadth of human experience than would be possible with a choice-based approach. 
  • Q. Your baby has a long way to go before it matures into a robust adult. How will you get this effort there, given your limited expertise? A. I'm convinced that once smart people see what I'm going for, they'll identify and share improvements. I believe we only need a strong proof of concept to inspire better future versions (like how thefacebook.com of 2004 inspired the far more sophisticated version we now know in 2014 as Facebook).
  • In future iterations I'd like to provide a more sophisticated approach to letting the participant choose the experiences they've had, which populates the pool from which their presented pairings are drawn. 
  • I'd like to capture the data from the batch phase where participants indicate whether they've had the experience. That data element itself is interesting, in addition to being a useful basis for the system to decide what experience pairs to present to a participant (e.g. present several pairs that include rare experiences to participants who have had that experience).
  • So far the best title I have is "The Ginormous Stackrank of Human Experiences", acronym G-SHE; lmk if you have a catchier one.

