Turkopticon: motivation, design, status, lessons, questions
[Slide 1.]
0. Mechanical Turk.
0.1. Mechanical Turk is a web site run by Amazon.
0.2. It is a market for small information tasks, for "external crowdsourcing." For example, "what is in this picture?", "are these two directory entries for a business the same?", "rewrite this sentence in your own words", "transcribe this audio clip", and so on.
0.3. The prices for the tasks range from 1 cent to a few dollars. Most work on Mechanical Turk is precarious and for low pay. Experienced workers can earn more, but most earn a few US dollars per hour.
0.4. You can be paid in US dollars, Indian rupees, or Amazon gift card points.
0.5. Amazon says there are between 250,000 and 500,000 workers on Mechanical Turk. This is sort of a meaningless number though, because workers can be more or less active. Some do one task a month when they are bored; some work on it 40 hours a week.
0.6. One researcher estimates that there are about a thousand workers on it at any time.
0.7. In a survey we ran from 2008 to 2010, a little less than half the workers were in the US, a little less than half were in India, and the rest were from elsewhere.
0.8. In the same survey, about 20% said they relied on income from Mechanical Turk to meet basic needs.
0.9. This survey is out of date, but if anything we underestimated that number -- because, we were told later by workers who criticized our paper, our survey paid too little to attract the "professional" crowd workers.
0.10. There were diverse motivations for doing work on Mechanical Turk, but most respondents to the survey said their main motivation for working on Mechanical Turk was money, not fun.
0.11. Employers can "reject" work -- that is, not pay.
[Slide 2.]
1. Motivation for Turkopticon.
1.1. The design of Mechanical Turk is not fair for workers.
1.2. Employers do not have to pay for work.
1.3. They do not have to give a reason for not paying.
1.4. Workers can complain, but employers do not have to answer. They don't even have to read the emails.
1.5. Amazon charges employers for *posting* tasks. So it does not hurt Amazon if employers do not pay workers.
1.6. Mechanical Turk keeps track of workers' approval rates. A worker's "approval rate" is the percentage of their work that employers have paid for.
1.7. Employers use approval rates to screen workers. They can allow only workers with an approval rate above a certain number to work on their tasks. The default is 95%. The approval rate is assumed to be a proxy for worker quality.
1.8. But of course this is wrong, because employers may refuse to pay for any reason -- for example, because they feel like it.
1.9. So there is an inaccurate reputation system for workers.
1.10. And there is no reputation system for employers.
1.11. For example, if you are a worker choosing a task you might want to know how frequently the employer who posted the task has refused to pay other workers. You might call this the employer's approval rate. You might even want to see a list of all tasks from employers who have an approval rate above a certain number. But this is not currently possible.
1.12. I want to pause for a moment. You might think, well, that is too bad for the crowd workers. But this has nothing to do with me. My job is safe. I am a skilled worker. Maybe. But you should know that unless your job involves doing something with your body, there is probably a researcher in a computer science department in the US, or a programmer in Silicon Valley, who is right now trying to figure out how to crowdsource your job -- to save your employer money and take part of the savings for themselves. I'm not saying this to stir up fear, but you should know about it. There was a project at Carnegie Mellon University where researchers and professional journalists tried to crowdsource the production of science journalism. I hear it didn't work so well, which I am pretty happy about.
But next time it might work. And it's not far from crowdsourcing science journalism to crowdsourcing academic writing. Now, I think writing produced that way will not be high quality. But it may be good enough that universities will see an opportunity to cut costs -- and certainly in the US universities are trying very hard to cut costs (except in administrative pay and athletics budgets). I think the idea is that if research can be crowdsourced, then any "information work" -- anything that can be done with only a computer and a brain -- can be crowdsourced. So the issue here is not only about establishing fair conditions for current crowd workers. The issue is that many of us who have secure "information work" jobs now may be crowd workers sooner than we expect.
[Slide 3.]
2. Turkopticon, our tiny stand against this future.
2.1. Turkopticon is a third-party employer reputation system for Mechanical Turk.
2.1.1. By "third-party" I mean it is not associated with Amazon at all. We have no special access to their data or anything like that.
2.2. The original goals of Turkopticon were (a) to call attention to the unfairness of Mechanical Turk and (b) to pressure Amazon to build an employer reputation system into it.
2.3. Turkopticon has two parts: a browser add-on and a web database application.
2.3.1. The web application lets workers review employers.
2.3.2. The browser add-on adds these reviews to the Mechanical Turk interface.
[Slide 4.]
2.4. Here are some pictures.
2.4.1. This is what Mechanical Turk looks like normally.
[Slide 5.]
2.4.2. This is what it looks like when you have Turkopticon.
[Slide 6.]
2.4.3. If you mouse over one of the arrows, you see this.
2.4.4. We have four scores for each employer:
2.4.4.1. "Communicativity": how well do they respond to worker communications?
2.4.4.2. "Generosity": how well do their tasks pay?
2.4.4.3. "Fairness": do they reject fairly, or do they reject without good reason?
2.4.4.4. "Promptness": how fast do they pay?
Employers in Mechanical Turk have up to 30 days to pay. But workers prefer faster pay.
2.4.5. If you click on the link for the number of reviews, you can see the individual reviews.
[Slide 7.]
2.4.6. And you can leave your own review.
[Slide 8.]
[Slide 9.]
2.5. Here are some numbers.
2.5.1. We have about 20,000 users.
2.5.2. We have almost 100,000 reviews, covering almost 22,000 employers.
2.5.3. These have been posted by about 8,000 workers.
2.5.4. Almost 16% of the reviews have been posted in the last three months, so the users are quite active.
2.5.5. We have about 12,000 daily visits to the different parts of the service.
2.5.6. And we have reviews for most of the employers on Mechanical Turk.
[Slide 10.]
2.6. But so what? Did we achieve what we set out to do?
2.6.1. Not really.
2.6.2. We did call attention to the unfairness of Mechanical Turk.
2.6.3. But Amazon did not build an employer reputation system.
2.6.4. In fact, I was at an event where somebody asked an Amazon executive why there is no employer reputation system in Mechanical Turk. She said "the community handles the problems" with employer misbehavior.
2.6.4.1. Now, I want to point out how important this answer is. Imagine if Siemens had a factory. And every once in a while the machines would break down and they had to hire an outside contractor to come repair them. But the workers had to keep a pot of money on the side, collected out of their own paychecks, to pay the contractor. And somebody asked management, why doesn't the company pay for this out of their profits? And management said, well, the workers handle the problem. It's really not a satisfactory answer. But this is the spirit of the American technology industry these days. It's very opportunistic and it is in love with anything "self-organizing" or "user-generated" because those are free inputs. So Turkopticon, combined with other tools and forums made by workers, makes Mechanical Turk *less unfair*. But it is a free input.
We do a free service for workers by keeping this up, but we also do a free service for Amazon by keeping this up. We make the situation a little less intolerable, so it goes on longer like this, without Amazon having to take responsibility. So, is that success? I don't know.
2.6.5. Also, a built-in reputation system would be much better because it could display objective data and let workers search or automatically screen by employer statistics, like how often they reject work. With Turkopticon it is all manual, because it can only be integrated so much.
2.6.6. There are other problems with Turkopticon. I won't talk about these in detail. But they all come from the fact that our day jobs are to do research, not to maintain or improve Turkopticon.
2.6.7. There is also one thing that we could not fix even if we could work on Turkopticon full time, which is that people do not trust each other on Mechanical Turk. I think a new crowd work market needs to be built to fix this, but that is just my opinion.
[Slide 11.]
2.7. What have we learned?
2.7.1. I learned some things about making software, like:
2.7.1.1. If you listen to people and build what they want, they will use it. I should add, "sometimes." But the point is, it is possible, but you must really listen and respond to people's concerns, and you have to keep listening. Amazon gets away with not listening to workers because it is fairly unique and workers need the money. But if there were a serious competitor to Mechanical Turk that really addressed workers' issues but still managed to attract employers, I think it would do well.
2.7.1.2. Maintenance is both technical and social. And both parts are time consuming.
2.7.2. I learned some things about markets, which some American economists still think are god-given entities like atoms or squirrels, but are really institutions created by people.
2.7.2.1. Most workers and employers have good intentions, but not all of them.
2.7.2.2. The small fraction of participants with selfish intentions affects the market. You have to account for them in the design of the market or they will mess it up for everyone else.
2.7.2.3. No system can solve all problems, so you need a human administrator around. This is obvious to most people here, but it is not obvious to programmers. Remember that I said that Silicon Valley loves things that are, or at least seem to be, "self-organizing"? They also love to have systems solve problems for them so they don't have to deal with them. So the idea that a perfect system can be built that will not require any human oversight is a popular fantasy. I think it is a dangerous fantasy, so I have to keep making this claim that it is not possible. It may sound silly to you but it is an important reminder for technologists, and I am telling you so if you ever run into anybody who believes this you won't be surprised. If they are an American programmer who maybe studied some economics in school, but no sociology, you should be even less surprised.
2.7.2.4. To maintain trust, there should be a record of why judgments were made the way they were. We have had bugs that have made people ask things like "Has Turkopticon sold out?" These were not even things that we did on purpose! They were accidents! And people got worried. We also did some things, early on, on purpose without talking to workers about it first, or explaining our motivations. Bad mistake.
2.7.2.5. There is one more thing, which is not on the slide. That is that in a complex system, sometimes you cannot see the consequences of your actions. So even well-intentioned people can harm others by accident. So in very distributed systems like Mechanical Turk, or even in more traditional outsourcing arrangements, we need to establish ways for people to communicate what is going on with them. One of the problems with Mechanical Turk is that people are almost anonymous to each other.
Workers don't have names or photos or anything, they just have numbers. They are long numbers, like ten or twelve digits long. So they all look the same to the employer. Actually, they all look like robots to the employer, so the employer doesn't feel bad when she or he chooses not to pay them. But my point is that we need more lines of communication if we want to support market participants to achieve fair outcomes. We need more lines of communication so that market participants can think of each other as human beings and treat each other like human beings.
2.7.3. I also learned something about Amazon, which is that they are not all-powerful. In 2008 I really thought that after we made Turkopticon -- we made the first version in a weekend -- they would be so ashamed that two grad students could just throw this thing together that they would get the point and make a good one themselves. It would be a thousand times better than ours and have real data and workers would be able to screen and search by employers' rejection rates and pay speeds and all of this, all of which Amazon has, or could easily track... It didn't happen. None of it happened. Nothing has changed in terms of workers' ability to judge employers inside Mechanical Turk itself. Not one thing, in five years. So, obviously they do not have infinite technical resources to work on this sort of thing, and it is not high on their to-do list. Somebody else is just going to have to make a new market. Maybe somebody in this room.
[Slide 12.]
2.8. What now for Turkopticon?
2.8.1. Somebody asked me if there will be a commercial version. No. There are three reasons for this:
2.8.1.1. First, Turkopticon started off non-profit and we would like to keep it that way.
2.8.1.2. Second, the people who need Turkopticon the most could not really afford to pay for it.
2.8.1.3. Third, the workers would hate us and would stop using it.
2.8.2. There are some improvements we could make to Turkopticon.
Some are more organizational and some are straightforward and technical. This is just part of my to-do list for Turkopticon. But there is no timeline for this to-do list because of the day job. It is possible that after I finish grad school we will make a non-profit organization and ask for some grants to keep this going.
2.8.3. But really, I would like to avoid having to do all of this, because I would like Turkopticon to become unnecessary. Turkopticon shouldn't need to exist. This should all be built into Mechanical Turk... or into whatever replaces it.
[Slide 13.]
2.9. I want to steal the slogan from the World Social Forum -- you know, "another world is possible" -- I really believe that; if I didn't believe that I would have been too depressed to work on this for five years -- and say, much less ambitiously perhaps but I think part of the bigger picture, "another crowd work is possible." Or, in keeping with the theme of the conference, a more cooperative crowd work is possible. Here is a first try at a to-do list for building another crowd work. Anybody can sign up to any part of this to-do list: workers, employers, system builders, trade unionists, policy makers, researchers, ...
2.9.1. First, we need to understand what is going on. The situation in crowd work is very complex. For example, many US crowd workers don't actually want minimum wage, because they are afraid that will mean less work. They get very angry when you argue that government should regulate crowd work. I think this anger is based in fear. I have not ever said this in the US, but I think the basic income idea is very interesting and relevant here. The point here is we cannot jump to simple solutions like "oh, just make a minimum wage." People don't want it and it would probably not even be enforceable for technical reasons. So, obviously I would say this because I am a researcher, but I think we need more research. We need research from different perspectives though, not just from academics.
I would love to see cross-sector work groups -- collaborations between workers, employers, system builders, trade unionists, policy makers, researchers, and so on. It would be a big pain to manage, but I think it is definitely possible and could be very beneficial.
2.9.2. We need to build a community around the dream of a more ethical future for crowd work. This is starting to emerge now but it is far from mature.
2.9.3. And finally, we need to build and maintain new models, new systems, and new cross-sector conversations that lead to, and sustain, learning and action.
2.10. Thank you.