The Dangers of Data-Based Certainty
Economists are rushing to embrace the use of big data in their research, while many policymakers believe artificial intelligence offers scope for greater cost-effectiveness and better policy outcomes. But before we entrust more decisions to data-based machine-learning and AI systems, we must be clear about the limitations of the underlying data.
CAMBRIDGE – Friends of mine who work in the arts and humanities have started doing something unusual, at least for them: poring over data. This is due to the pandemic, of course. Every day, they check COVID-19 case numbers, how slowly or quickly the R number is declining, and how many people in our area got vaccinated the day before.
Meanwhile, social media are full of claims and counterclaims about all manner of other data. Is global poverty declining or increasing? What is the real level of US unemployment? The scrutiny, sometimes leading to tetchy arguments, results from people’s desire to cite – or challenge – the authority of data to support their position or worldview.
But in other areas where data are used, there is remarkably little focus on their reliability or interpretation. One striking example I have noticed recently concerns the “CAPTCHA” tests designed to protect websites against bots, which ask you to prove your humanity by identifying images containing common features such as boats, bicycles, or traffic lights. If your choice – even if correct – differs from that of the machine system using your selection to train an image-recognition algorithm, you will be deemed inhuman.