
Should we be recruiting instead of procuring AI? What is the lifecycle for an AI product or service? Who is responsible for unexpected outcomes? These were some of the challenges discussed at a recent workshop…
During October 2019, I participated in a workshop organised by the UK government’s Office for Artificial Intelligence (AI) in collaboration with the World Economic Forum Centre for the Fourth Industrial Revolution and hosted by TechUK. The purpose of the workshop was to discuss guidelines for procuring AI for use in government services. The specific scenarios for our workshop were in defence-related procurement. However, much of the discussion focused on AI procurement challenges for government use in general, given the growing smart-city narrative and the assumption that AI will become embedded within modern urban infrastructure.
The workshop was run under the Chatham House Rule, so no comments or specifics can be attributed to any of the individuals or organisations present. Approximately 30 people attended, representing well-known organisations across academia, industry, government and non-profits. The workshop report and updated guidelines are due to be published in spring 2020. The following is a selection of the topics discussed during the day.
The day began with a robust debate about whether guidelines for AI were even needed, or whether this was just another IT purchase sharing the same challenges – a tendency towards large-scale projects intended to run for much longer than the lifecycle of the technology being procured. The argument for treating AI differently centred on the fast-moving nature of AI development at present. An algorithm considered ‘state of the art’ today may be rendered obsolete tomorrow by a breakthrough in some aspect of machine learning. The algorithm will have been developed on test data and may not perform as expected once deployed on live data. In essence, traditional IT projects require risk management; AI adds uncertainty that requires a different approach to the procurement process and project lifecycle.
“Should procuring an AI be more like hiring a new employee?”
One of the most interesting suggestions was: “Should we treat AI procurement more like recruiting people instead of like purchasing technology?” If we imagine that we are recruiting an AI, what different approaches might we consider for its lifecycle, from hiring through to handing the AI its P45*? Instead of evaluating against a precise set of requirements, would evaluating ‘competencies’ be a better technique? Should the AI have a probation period to check it is fit for purpose? What about an annual performance review? It was a great thought experiment for exploring which aspects of the traditional IT procurement process may not be fit for purpose when evaluating AIs, and what new approaches should be considered to produce a better outcome.
“What happens if the AI is found to contain a flawed algorithm?”
There were lengthy discussions about the role of data. Most algorithms will likely have been trained on test data. How should the AI be evaluated for production use? Should a sample of government data be provided? How do we ensure it complies with data protection and privacy acts? If the sample is filtered to comply with those acts, what procurement steps need to be included to ensure the AI is also evaluated once it is operational with live data? What controls are needed to ensure access permissions are adhered to? How do we know which scenarios the AI will and won’t work in? What happens if it is used for a purpose that the producer/seller of the AI did not consider or test for? Or if the producer discovers, or is alerted to, a flaw in the algorithm? This is a particular concern of mine given the flawed theories on which emotion detection algorithms are currently being built. What industry benchmarks might be beneficial for controlling and evaluating the use of AI? Should there be a method to recall an AI? How would that impact a complex system that is reliant on the AI?
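To make the ‘evaluate once operational with live data’ question a little more concrete, here is a minimal sketch of one possible post-deployment check: compare the accuracy the supplier demonstrated at acceptance against a rolling window of live, human-reviewed cases, and flag the model for review if the gap exceeds an agreed tolerance. This is purely illustrative and was not proposed at the workshop; the class, thresholds and simulated feed are all my own assumptions.

```python
# Illustrative sketch only: a post-deployment performance check that compares
# the accuracy agreed at acceptance testing with accuracy measured over a
# rolling window of live, human-reviewed cases. All names, thresholds and the
# simulated feed below are assumptions, not anything from the workshop.

import random
from collections import deque


def accuracy(pairs):
    """Fraction of (prediction, actual) pairs that agree."""
    return sum(p == a for p, a in pairs) / len(pairs)


class LivePerformanceMonitor:
    def __init__(self, acceptance_accuracy, tolerance=0.05, window=500):
        self.acceptance_accuracy = acceptance_accuracy  # accuracy agreed at procurement
        self.tolerance = tolerance                      # allowed drop before triggering a review
        self.recent = deque(maxlen=window)              # rolling window of live outcomes

    def record(self, prediction, actual):
        self.recent.append((prediction, actual))

    def needs_review(self):
        # Only raise the flag once there is a full window of live evidence.
        if len(self.recent) < self.recent.maxlen:
            return False
        return accuracy(self.recent) < self.acceptance_accuracy - self.tolerance


# Simulated live feed: the model was accepted at 92% accuracy, but on live
# data it only gets ~80% of cases right (i.e. its performance has drifted).
monitor = LivePerformanceMonitor(acceptance_accuracy=0.92)
for _ in range(2000):
    actual = random.choice([0, 1])
    prediction = actual if random.random() < 0.80 else 1 - actual
    monitor.record(prediction, actual)
    if monitor.needs_review():
        print("Live accuracy below the agreed threshold: escalate for review.")
        break
```

The interesting procurement question is not the code itself but who owns this check, who supplies the labelled live cases, and what the contract says must happen when the flag is raised.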
An interesting perspective came out of the discussion on data. There was a broad consensus amongst suppliers and purchasers in my group that the AIs would always contain pre-trained classifiers. This means that the AI would not continue to learn once in production. Yet many of the recent breakthroughs in AI have come from deep learning and, in particular, reinforcement learning, where the AI learns by responding to feedback from the environment in which it operates. The belief is that such algorithms keep improving with more data, as opposed to traditional machine learning algorithms that typically reach a peak, after which adding more data has minimal or no effect on performance. If that consensus is representative, such continually learning techniques are not yet considered suitable for real-world applications.
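To illustrate the distinction (my own sketch, not something shown at the workshop, using scikit-learn and synthetic data purely as assumptions): a pre-trained classifier is fitted by the supplier and then frozen, so its behaviour is fixed at procurement time, while an online learner keeps updating its parameters on live data, which is exactly what makes it harder to assure.

```python
# Illustrative sketch of "pre-trained and frozen" versus "keeps learning in
# production". The data is synthetic and the models are stand-ins; nothing
# here comes from the workshop itself.

import numpy as np
from sklearn.linear_model import LogisticRegression, SGDClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 5))                       # supplier's training data
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

# 1. Pre-trained, frozen classifier: fitted once, then used for inference only.
frozen = LogisticRegression().fit(X_train, y_train)

# 2. Online learner: can keep updating as labelled live data arrives.
online = SGDClassifier()
online.partial_fit(X_train, y_train, classes=[0, 1])

# Live data arrives in batches (simulated here).
for _ in range(10):
    X_live = rng.normal(size=(100, 5))
    y_live = (X_live[:, 0] + X_live[:, 1] > 0).astype(int)

    frozen.predict(X_live)               # behaviour fixed at procurement time
    online.partial_fit(X_live, y_live)   # behaviour continues to change in production
```

The governance consequence follows directly: a frozen model can be evaluated once and recalled if a flaw is found, whereas a continually learning one would need evaluation throughout its operational life.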
A key aspect in producing guidelines for procuring AI is the intended use of the AI. Can a universal, ‘one size fits all’ set of guidelines be produced, or are tailored guidelines needed? Two scenarios were presented. One related to the detection of misinformation. This raised minimal ethical concerns about the use of AI or about knowing how it makes its predictions; all that matters is how good the AI is. The second scenario related to making a decision that would carry consequences for the humans involved in the process. This would require far more rigour in evaluating the predictions made by the AI and the role of the AI in the decision cycle. Perhaps there should be some sort of ranking system to determine what boundaries, or checks and balances, are needed in the use of different types of AI, such as when a ‘black box’ algorithm whose predictions cannot be challenged can or cannot be used in a decision.
“Do we need an AI ‘health and safety’ officer?”
There was a lot of discussion about the lifecycle of an AI, from procurement to decommissioning, and what new roles and activities may be needed for running AI-embedded systems. For example, companies over a certain size are required to have a dedicated health and safety officer and to implement mandatory health and safety training for employees. If you don’t like your desk set-up, it’s probably not OK to perform some DIY customisations to it, unless it is your home office… Do we need the equivalent of a health and safety officer for AI? Someone responsible for ensuring the AI is used as intended, and that the correct procedures are followed when changes are made to the AI or to its intended use?
What should the governance plan for AI look like? How do you evaluate its use and performance? How do you decide when an AI is no longer fit for purpose? What about ethical considerations? What if an AI is used for a scenario that was never anticipated and has human consequences? Who is accountable? This led on to another robust debate. As one attendee drily observed, given the focus areas of this workshop, we are talking about procuring AIs that may be intended for use in weapons systems. An opinion expressed was that ethics was the responsibility of the company using the AI and not of the company providing the AI… It was a challenging discussion.
Discussions and questions posed through the day revealed just how little guidance, standards or regulation there is for the use of AI in government systems currently. There is a danger, when heading into ‘what if’ scenarios, of being distracted by the most unlikely of outcomes at the expense of more common and immediate needs. But there is good reason to be concerned. A presentation on AI in Government at Stanford University’s Human-centred Artificial Intelligence Conference, also held in October 2019, stated that at least 76 out of 176 countries are actively using AI for surveillance purposes, with 64 countries using facial recognition and 52 using predictive policing (Sharon Bradford Franklin, Open Technology Institute).
This was a summary of just some of the points discussed during the day. Hopefully it has provided some food for thought about the procurement and deployment of AI in real-world systems. The post will be updated when the final report is published. In the meantime, a guide to using AI in the public sector has recently been released on gov.uk.
* For non-UK readers, a P45 is the form you are issued when you leave an employer.
References
- Draft guidelines for AI procurement – gov.uk, published 20 September 2019
- Shaping the Future of Technology Governance: Artificial Intelligence and Machine Learning – World Economic Forum for the Fourth Industrial Revolution
- Chatham House Rule – Chatham House
- Human-centred Artificial Intelligence Conference, Fall 2019 – Stanford University
- Failing to detect emotions – blog post, May 2019
- A guide to using AI in the public sector – gov.uk, published 27 January 2020
Featured image: Storm over St Ives in Cornwall, 22nd December 2019 (author’s photo, for want of a more appropriate image that hasn’t already been used for an AI-related post… stormy waters felt appropriate 🙂 )
These two statements seem scary when considered together. Am I missing the big picture?
“An opinion expressed was that ethics was the responsibility of the company using the AI and not of the company providing the AI… ”
“There was a broad consensus amongst suppliers and purchasers in my group that the AIs would always contain pre-trained classifiers. This means that the AI would not continue to learn once in production.”
No, I don’t think you are. When I said there were some robust debates during the workshop, they were indeed ‘robust’. The day raised a number of big challenges, and also possible solutions, that I had not considered from this angle. I held off writing it up because there was so much to cover. I try not to let blog posts run too long and am toying with producing a pamphlet/short book on the topic.