A dollar for your face: Meet the humans behind Machine Learning models

To train machine learning models, tech companies are hiring a Germany-based service provider to buy selfies and pictures of ID cards from underpaid gig workers, whose rights are often disregarded.
Author: Josephine Lulamae

M. was looking for a side hustle to support her family when she found Clickworker. She was offered a task: send two pictures of her ID papers from the Philippines and earn 1 U.S. dollar. She found it suspicious. After she uploaded the first picture, the link expired. “I don’t know if the task was valid,” she writes, adding a laughing/sweating emoji: 😅. She hasn’t been paid for it.

A., who has been on Clickworker to earn a bit of extra money for the past months, says he was also asked to send pictures of his government ID, to be used to train apps that do ID recognition. He didn’t do it. “It’s just a habit not to send personal ID because scams are left and right in the Philippines,” he writes. (M. and A. asked not to be named in this article.)

The marketplace for personal data 

Apps like Clickworker, Amazon’s Mechanical Turk, and Appen recruit gig workers, or “microworkers”, who are paid a few cents for tasks like labeling data for clients who develop machine learning models. In recent years, some of Clickworker’s initial tasks (copywriting, categorizing products) have been increasingly automated.

These days, getting people to sell personal data, such as a selfie (smiling, frowning, looking surprised) or a video of their personal ID card, is the main source of revenue at Clickworker, which has offices in New York and Essen, a large city in western Germany. Indeed, “most of our earlier competitors have now focused on AI training data,” Ines Maione, Clickworker’s marketing manager, says. “It’s future-proof, because human training data cannot be replaced so quickly.”

Thanks to platforms like Clickworker and Appen, there is now a “marketplace for personal data” that exists in a legal gray area, says Paola Tubaro, a professor at the French National Center for Scientific Research. For a collection of selfies or selfie-videos, workers can earn 3 to 5 euros. When the task asks for selfies taken over time (to train face recognition to recognize aging), prices can go slightly higher.

Worker suspicion

Tubaro has spent the past years interviewing people in Venezuela who took jobs on Clickworker and Microworker. She says that some of the people she interviewed felt uncomfortable doing tasks that involved personal data. “Some workers who ended up doing these tasks are the poorest, who cannot afford to refuse a task paid 5 dollars,” she says.

Venezuelans have been suffering from high inflation and a deep economic crisis since at least 2010. The minimum wage has fallen to 3-5 dollars per month. “One man told me: Here I could earn 5 dollars, by working half an hour taking selfies of myself and I did it,” Tubaro recalls, “though he added that before, in another situation, he wouldn’t have done that.”

In another case, one man told Tubaro that he knew of someone who agreed to upload some selfies and later found his face in an advertisement on some website, which he then had to contact repeatedly to get the picture taken down. “I don’t know if the story was true,” Tubaro says. “But clearly this is something that workers feel is of concern, because otherwise they wouldn’t have discussed it amongst themselves in an online forum.”

The anonymous client

Last summer, the German immigration authority BAMF said that it uses Clickworker for voice recordings to train its controversial dialect recognition software. “It’s very possible that [BAMF] asked us [to provide data],” says Maione, the Clickworker manager.

Who else can be a client? Clickworker allows any company or academic institution to buy personal data. A tech company may ask for “a series of portraits over 10 years,” in which case there would be a check that the company is official, Maione says. The client also has to sign a contract, promising “only to use the crowdworkers’ personal data to train their AI” (and not to share the data with anyone else). Often, Clickworker doesn’t know what product the model will be used for. According to Tubaro, machine learning supply chains are so long that often the client doesn’t know either.

Once, a client asked for pictures of elderly people in retirement homes. Clickworker rejected this task, for practical reasons. “One can’t just march into a retirement home and be like: hi, just a few pictures,” Maione says. Nor can they provide images of babies, because “everyone who works for us is over 18.” For ethical reasons, she adds, they wouldn’t ask for nudes. Or porn. “We wouldn’t take any requests that violate clickworkers’ rights.”

The clients almost never want to be featured on the Clickworker website. They also do not, in general, want workers to know who they are. “That would be total chaos, they’d be getting questions from thousands of people,” Maione explains.

Questions about missing payments, for example. Or about a task not working. Under the EU’s privacy law GDPR, workers can also ask the client or Clickworker to delete their personal data. (Although, Tubaro says, “this protection isn’t really there” for people outside of Europe who don’t know what GDPR is.) Clickworker is supposed to deal with these questions: “Our service for the client is that we take care of everything,” Maione says.

If a worker found their picture on the internet somewhere and contacted Clickworker, “we would have to sue the client and say that we are taking legal steps,” Maione says. They haven’t had such a case yet, she adds: “Thank God.”

Enforcing data protection rights 

Jamael Jacob, a legal and policy advisor at the Philippines-based Foundation for Media Alternative, describes this “anonymous client” approach as follows: “It’s like Clickworker is Airbnb, but it doesn’t inform people who rent the house who the actual owners of the house are.” What if the client suffers a data breach, for example, and the microworker’s data is compromised? “They won’t even know that their information was affected by the breach.” What if the client engages in unethical practices?

Jacob adds: “It’s really difficult to even call the data subjects workers because they are more like commodities–at least their data is.” (At Clickworker, microworkers are also called “the crowd” or, when referring to the company logo of bright human shadows, “the colorful crowd”.)

Under the Filipino data protection law, which can also apply to international companies, crowdworkers in the Philippines have a right to know who they are selling their ID card-snaps to. S., who didn’t want to take a picture of his Philippine ID card, told us that when he has taken selfies for Clickworker tasks in the past, he has never been able to see the client’s name. Potentially, the Filipino data protection authority could go after Clickworker.

However, when a woman tried to bring a case against Meta last year in the wake of the Cambridge Analytica scandal (millions of Filipinos were part of the datasets involved), the authority dismissed the case, because the woman couldn’t provide the evidence of Facebook’s wrongdoing herself. Nor did the authority end up bringing a case against Uber when a data breach affected many Filipino drivers, and it has yet to bring one against any other foreign company.

Resources for investigations are scarce. Under former president Rodrigo Duterte, employees were not appointed for their knowledge of data protection. And the new deputy privacy commissioner’s only visible qualification for her job, observers say, is that she used to work for the law firm of current president Ferdinand Marcos Jr.’s wife. “To make things short, we still have a weak data protection authority,” Jacob says. He was part of the data protection authority when it was first established.

Exploitation 

Given the current dire economic situation in the Philippines, Jacob says he wouldn’t be surprised if people are turning over data like their government-issued IDs to platforms like Clickworker.

“And I suppose there is something to be said about companies taking advantage of situations like that,” he says. “There’s also a degree of exploitation there. I don’t know, do you think that EU citizens, German citizens would actually agree to something like this, considering this kind of information that the company is asking for?”

Half a year ago, Clickworker started a campaign to try to get more people in Germany and the US to agree to upload selfies or videos of themselves, under pressure from clients who, Maione says, “say exactly what kind of pictures they want, often according to country and ethnicity.” So far, Maione says, microworkers in Germany and the US have mostly refused to do these kinds of tasks.

Not everyone gets paid the same for a selfie, Maione says. “It depends, if we know we need pictures from all over the world, and, as an extreme example, we still need Swiss people, then it can happen that we fix the price for them [the hypothetical Swiss], because otherwise we wouldn’t get any data.”

The article was first published on Algorithm Watch
