The bigger the pot of data, the more likely that inaccuracies will creep into the analysis. The consequences depend on what happens next…

If algorithms represent a new ungoverned space, a hidden and potentially ever-evolving unknowable public good, then they are an affront to our democratic system, one that requires transparency and accountability in order to function – Taylor Owen

A recent article – The Violence of Algorithms – highlighted the increasing breadth and depth of data being collected about people and how the increasing sophistication of technology is enabling some decisions in society to be automated that perhaps shouldn’t be.

Well worth a read, it concludes with three concerns:

  1. Acts of war have become spatially and conceptually boundless. The lines between war and peace and between domestic and international engagements are disappearing
  2. We are heading to a place of predictive governance, based on unaccountable and often unknowable algorithms and the biases, values, and ambiguities that are built into them
  3. Spaces of dissent in society are being eroded. Acts of digital civil disobedience are increasingly being targeted and prosecuted not as protest but as terrorism

There are two questions to consider: 1) When should we automate and when should we intervene? and 2) What checks and balances should be built in to validate the recommendation?

Overall, machines outperform humans when it comes to computations involving data. The bigger the data, the more advanced the analysis and the more we rely on computers to do the digital heavy lifting. But the logical, rational decision isn’t always the best one. Not if emotional wellbeing or civilised behaviour is your goal.

Sometimes, even if the data is right, the decision will be wrong

The bigger the data, the more likely there will be ambiguity and inaccuracies (or the simple inevitable randomness within large enough data sets) that can lead to misinterpretation. Empathy and awareness about the context of a given situation enable us to detect a data anomaly that our digital friends cannot.

These issues were highlighted in another recent article – UK report details what happens when police spying goes wrong – that catalogued a range of mistakes caused by simple data errors, such as a mistyped email address that led to the wrong person being arrested. Whilst the mistake was later found and corrected, some arrests and accusations linger far longer than they should.

Being able to critique data analysis is becoming more important as sensor-based and data-driven automation increases within urban spaces. Even when an analysis ‘feels right’, it should always be checked for misleading assumptions and potential inaccuracies. And careful consideration needs to be applied to any decision to automate ‘intelligent’ actions. Sometimes even if the data is right, the outcome will be wrong.

Never forget, with a large enough data set the improbable becomes probable.

To finish on a lighter note, the point is amusingly illustrated by xkcd:

Comic poking fun at significance

Source: xkcd – significant
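
For anyone who wants the arithmetic behind "the improbable becomes probable", here is a minimal Python sketch. It assumes the tests are independent, that there is no real effect to find, and that 0.05 is the significance threshold; the function names are made up for illustration. Under those assumptions, running 20 tests gives roughly a 64% chance that at least one looks "significant" purely by chance.

```python
import random


def chance_of_false_alarm(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests of a true
    null hypothesis comes out 'significant' at threshold alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_false_alarms(n_tests: int, alpha: float = 0.05,
                          trials: int = 20_000, seed: int = 0) -> float:
    """Estimate the same probability by simulation.

    Assumption: with no real effect, each test's p-value is uniformly
    distributed between 0 and 1, so it falls below alpha with
    probability alpha.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One "study": run n_tests tests and see if any looks significant.
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (1, 20, 100):
        exact = chance_of_false_alarm(n)
        simulated = simulate_false_alarms(n)
        print(f"{n} tests: exact {exact:.3f}, simulated {simulated:.3f}")
```

The more questions you ask of a data set, the closer that chance gets to certainty, which is why a single striking result from a large trawl deserves a second look before anyone acts on it.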


Hat tips to Euan Semple and Toni Sanchez for sharing the two articles (via LinkedIn and Twitter)

Featured image: iStockPhoto

