February 8, 2024
On 2 February 2024, the House of Lords Communications and Digital Committee published a lengthy report on large language models (“LLMs”) and generative AI, based on extensive oral and written evidence and roundtables. An LLM is a type of AI foundation model which is trained on large datasets, typically using deep learning (i.e. processing data in ways inspired by how the human brain works), and which is capable of generating a range of outputs. To build an LLM, which is focused on language (i.e. written text), developers design the underlying software and collect extensive data, often using automated bots to obtain text from websites (“web scraping”). LLMs are designed to learn relationships between these pieces of data and to predict sequences, making them capable of generating natural language text.
The Committee points out that LLMs are, at present, structurally designed around probability and plausibility, rather than around creating factually accurate assessments which correspond to the real world. This is partly responsible for the phenomenon of “hallucinations” whereby the model generates plausible but inaccurate or invented answers. Further, the process for arriving at an answer is typically described as a “black box” because it is not always possible to trace exactly how a model uses a particular input to generate particular outputs. These factors, together with the fact that LLMs can display bias, regurgitate private data and struggle with multi-step tasks, raise concerns around their use in high-stakes applications (e.g. critical national infrastructure).
The Report examines trends over the next three years and identifies priority actions for the Government. AI could bring huge economic benefits and drive ground-breaking scientific advances by synthesising vast amounts of data to identify patterns and insights. However, the Report states that it also comes with formidable risks, including threats to public safety, societal values and competition, where market power could be concentrated in the owners of a small number of the largest cutting-edge LLMs. In the Committee’s view, the Government is not striking the right balance between innovation and risk and, in particular, has pivoted too far towards a narrow focus on high-stakes AI safety. This appears to be a reference to the Government’s earlier papers on “frontier AI” (details of which were previously reported by Wiggin). Seeking to build on rather than recap the extensive literature on AI, the Committee makes several recommendations to the Government, which are briefly summarised below:
- In considering a beneficial regulatory framework, the Government must address potential conflicts of interest and “regulatory capture” which could arise as it engages in further consultation with industry and brings more private sector expertise within policymaking.
- Market competition must be an explicit policy objective and a nuanced approach between open and closed AI systems must be adopted. Open models, which tend to make more of the underlying system code, architecture and training data available, offer greater access and competition but raise concerns about the proliferation of dangerous capabilities. Closed models, which tend to publish less information about how they have been developed and the data used, offer more control but also more risk of concentrated power.
- Measures should be implemented to boost skills as well as computing power and infrastructure (maximising energy efficiency and equitable access, e.g. for researchers and SMEs).
- The Government should explore options for, and the feasibility of, developing a sovereign LLM capability, built to the highest security and ethical standards, to support public sector functions.
- The Government should publish its view on whether copyright law provides sufficient protection for rightsholders and, if necessary, set out options for updating legislation. The Government’s proposed IPO Code (previously reported by Wiggin) should enable rightsholders to exercise their legal rights, ensure developers are transparent about the use of web crawlers to acquire data and allow rightsholders to check training data. The Government should invest in large, high-quality datasets to encourage tech firms to use licensed material.
- Immediate security risks (which the Committee considers arise from making existing malicious activities easier and cheaper) should be addressed by scaling existing mitigations relating to cybersecurity, counterterrorism, child sexual abuse material and disinformation. Societal harms around discrimination, bias and data protection should also be tackled, including by clarifying the use of personal data in model training.
- Although the Committee does not consider that catastrophic risks (more than 1,000 UK deaths and tens of billions in financial damages) are likely within the next three years, an agreed system of warning indicators and mandatory safety testing are needed.
- The Government should speed up delivery of the tools needed to enable sector regulators to regulate AI (as proposed in the Government’s March 2023 white paper, “A pro-innovation approach to AI regulation”), including Government-led central support teams, investigatory and sanctioning powers, cross-sector guidelines and a legal review of liability by the Law Commission.
- In the Committee’s view, UK regulation should learn from but not copy the US, EU and China, and extensive primary legislation aimed solely at LLMs is not currently appropriate. Rather, the immediate priority is to develop accredited standards and common auditing methods relating to the testing of high-risk, high-impact models.