Generative artificial intelligence (AI) is a specialised branch of AI that deals with the generation of content that is often creative, human-like, or novel. Generative AI systems are designed to create new content, such as text, images, music, or video, that mimics human creativity or is contextually relevant.
AI has the potential to reshape our societies as a whole. For this reason, the active engagement of parliaments is critical to safeguarding the public interest and ensuring that AI technologies are both developed and deployed responsibly.
This report is the result of the collaborative efforts of the European Parliamentary Technology Assessment (EPTA) network, presented to the 2023 Conference on Generative Artificial Intelligence – Opportunities, Risks, and Policy Challenges, held at the Parliament of Catalonia in Barcelona on 9 October 2023.
This report consists of contributions from 15 EPTA members on the following issues:
- Generative AI and democracy
- Generative AI and health
- Generative AI and education
- Generative AI and work
In addition, the participating institutions have conducted interviews with members of parliament concerning their knowledge of and views on generative AI, especially on the above-mentioned issues.
Generative Artificial Intelligence. Opportunities, Risks and Policy Challenges
In the last decade, deep learning methods (a subfield of machine learning, in turn a subfield of AI) have seen remarkable improvements in accuracy and generalisation, leading to breakthroughs in healthcare (e.g. cancer diagnosis) and science (e.g. weather prediction and protein modelling).
Some of the most successful developments using a deep learning methodology are large language models (LLMs). These models are aimed at generating sequences of words that are plausible but not necessarily truthful. The developers' goal is to achieve a good statistical approximation of what a human might write, based on the vast collection of human-written texts used to train the model.
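The idea of statistically approximating human text can be illustrated with a deliberately toy example (not a real LLM): a bigram model that counts, in a tiny hypothetical corpus, which word tends to follow which, and then samples statistically plausible, but not necessarily truthful, continuations. All names and the corpus below are illustrative assumptions.

```python
import random
from collections import defaultdict, Counter

# Tiny illustrative corpus; real LLMs are trained on vastly larger text collections.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count, for each word, which words were observed right after it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(prev, rng):
    """Sample a plausible next word in proportion to observed frequencies."""
    candidates = follows[prev]
    if not candidates:
        return None  # dead end: this word never had a successor in the corpus
    words = list(candidates)
    weights = [candidates[w] for w in words]
    return rng.choices(words, weights=weights)[0]

# Generate a short sequence that is statistically plausible given the corpus,
# with no regard for whether the resulting statement is true.
rng = random.Random(0)
text = ["the"]
for _ in range(5):
    nxt = next_word(text[-1], rng)
    if nxt is None:
        break
    text.append(nxt)
print(" ".join(text))
```

Every adjacent word pair in the output occurs somewhere in the corpus, so the result "sounds like" the training data; whether the generated sentence is true is simply never considered, which is the core of the trustworthiness issue discussed below.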
Several corporations have built applications, such as ChatGPT or Bard, on top of LLMs. These applications serve many purposes, like searching for information, summarising texts, or creating content based on a dialogue with a user. A portion of the public has perceived these systems as an example of machines acquiring human-like intelligence, which has led to warnings from the scientific community against that misperception.
A number of issues need to be raised to inform a discussion about the correct use of this family of applications:
No trustworthiness. As mentioned above, the output generated by these systems aims at plausibility, not truthfulness. This is a risky property, as uninformed members of the public may make decisions based on erroneous information produced by the systems. Appropriate safeguards must be introduced to protect citizens. Unlike current search engines, LLMs lack a connection to the sources of the information they present.
Black box behaviour. These models are the result of a lengthy optimisation process that computes billions of parameters, which are then used to generate the results. The capacity of these systems to explain why they generate something is extraordinarily limited, as each output is the result of millions of arithmetic operations over those parameters.
Rigidity. Systems are trained on a fixed set of data to determine the parameters; after that, no further training is done and no further changes are introduced. If the training set is limited to documents prior to a given date, no information in documents produced after that date will influence the generated output.
High energy consumption. Training these systems consumes thousands of megawatt-hours of electricity. In addition, as systems need to be retrained frequently due to their lack of adaptability, this consumption recurs with every retraining.
Proprietary systems. Most of these applications belong to a handful of companies that can afford their high development cost. This legitimate business has to be monitored to protect citizens from potential biases and misuses. For instance, the data used to train the systems is opaque, and it is therefore unclear whether these systems comply with regulations such as the GDPR.
Copyright. As texts are generated from the documents used for training, and no reasonable trace back to those documents can be made, the notion of copyright is at stake. A discussion is needed on how to preserve the rights of human creators.
In summary, although the new developments based on deep learning methods are producing significant advances in several areas of science, their use in the context of LLMs needs to be closely scrutinised to avoid misinformation and cyber-security threats.