Among the artificial intelligence stories that have surfaced lately, a new one by Josh Dzieza, a collaboration between New York and The Verge, stands out as both compelling and surprising. He examines a simple-sounding premise: For AI models to work, they need to be fed data, in almost unimaginable amounts. Enter the “annotators”: millions of people around the world who work for generally low wages on monotonous tasks, such as labeling photos of clothes, to make AI models smarter. Behind even the most impressive AI system, Dzieza writes, are large numbers of people labeling data to train it and clarifying data when the system gets confused.
In what he calls a burgeoning “global industry,” they work for companies that sell that data to large corporations at a high price, all of which encourages a culture of secrecy.
In fact, annotators are usually forbidden to talk about their work, even though they remain in the dark about the bigger picture anyway. (One big player is Scale AI, a Silicon Valley data provider.) As a result, Dzieza writes, little is known, with few exceptions, about the information that shapes the behavior of these systems, and even less about the people doing the shaping. Dzieza interviewed two dozen annotators around the world, and even worked as one himself, to get the full picture. At one point, describing the entire human-machine feedback loop, he offers this amazing gem: “ChatGPT seems so human because it was trained by an AI mimicking humans mimicking an AI pretending to be a better version of an AI trained to write human.” (The whole story is worth reading.)