Artificial intelligence, the ability of machines to perform tasks the way humans do, has existed for decades, but recent advances have opened up new frontiers with the potential to impact virtually every aspect of society.
On this page, you will find an overview of:
key aspects of artificial intelligence
important considerations and ethical issues, especially as they relate to social science research
what's happening in AI at UMD
Artificial intelligence refers to the ability of machines to perform human tasks. In fact, machines outperform us at some tasks, which is why AI is sometimes framed as "augmented intelligence." The tasks they perform better (at least so far) involve pattern matching: comparing input data to a repository of existing information and the patterns and relationships within it. In the early days of AI, computers accomplished pattern matching by following preset algorithms. More recently, systems have been trained to predict outcomes or characteristics of input observations iteratively, based on the results of their past attempts to process similar observations. This approach is known broadly as "machine learning" (ML) and is where much of the growth in AI has happened in recent years. The exponential growth of data available to train these systems was made possible by the internet, smartphones, and related technologies. This data explosion converged with advances in computer hardware and networks to enable leaps forward in the scope and usefulness of AI applications.
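To make that iterative learning loop concrete, here is a minimal sketch in Python (using only NumPy, with made-up toy data, not drawn from any particular system): a simple model repeatedly adjusts its parameters based on the errors it made on past observations.

```python
import numpy as np

# Toy, made-up data: hours studied (input) and pass/fail outcome (label).
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=200)
passed = (hours + rng.normal(0, 1.5, size=200) > 5).astype(float)

# A one-feature logistic model, trained by gradient descent: on each pass,
# the parameters are nudged to reduce the errors made on the observations.
w, b = 0.0, 0.0
learning_rate = 0.1
for step in range(1000):
    pred = 1 / (1 + np.exp(-(w * hours + b)))  # predicted pass probability
    err = pred - passed                        # how wrong the model was
    w -= learning_rate * np.mean(err * hours)
    b -= learning_rate * np.mean(err)

# The trained model can now score observations it has never seen.
new_hours = np.array([2.0, 8.0])
print(1 / (1 + np.exp(-(w * new_hours + b))))  # low vs. high probability
```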
Generative AI takes machine learning to the next level. These models don't just make predictions or classifications based on an underlying set of related data; rather, they process, assimilate, and find patterns within and across vast and diverse data sources. This enables generative AIs to produce original content in response to user prompts. While generative AI writ large uses all kinds of data and media, the subtype called Large Language Models (LLMs) focuses exclusively on modeling language: these models are trained on text data and generate original written responses to prompts. GPT-4, the model that underlies ChatGPT, is one such LLM.
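The sketch below illustrates the core idea at a trivially small scale, using a toy "bigram" model and a made-up corpus. Real LLMs use neural networks trained on vastly more text, but the underlying loop is the same in spirit: learn statistical patterns from text, then generate new text one token at a time by sampling from those patterns.

```python
import random
from collections import defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat saw the dog .").split()

# "Training": record which words follow which in the text.
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

# "Generation": starting from a prompt word, repeatedly sample a plausible
# next word from the learned statistics, one token at a time.
random.seed(1)
word, output = "the", ["the"]
for _ in range(8):
    word = random.choice(follows[word])
    output.append(word)
print(" ".join(output))
```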
As AI models become more powerful and interwoven into more day-to-day aspects of society, concerns continue to grow about their limitations and possible harms. This is an active area of research, discussion, and policy debate. Some of the critical considerations as they relate to academic research applications include:
Mis- and disinformation: AI can make mistakes due to inaccurate data underlying the models, as well as misinterpretation (or manipulation) of the nature of the training data or the relationships within it.
Bias: Models trained on data that do not representatively cover the entire population being modeled will inevitably produce biased results. This type of bias can lead to skewed and harmful impacts, particularly for disadvantaged populations, in areas such as health, policing, and hiring when AI is used to automate or assist in decision-making (the first sketch after this list illustrates the mechanism).
Privacy: AI's ability to integrate and draw conclusions across disparate data sources poses challenges for protecting identities and sensitive information. While any single data set by itself may be de-identified, when AI is applied to pooled, individual-level information, anonymity cannot necessarily be guaranteed (the second sketch after this list shows how such linkage works).
Copyright: Active court cases are testing whether generative AI models can lawfully be trained on content such as books, newspaper articles, and other creative works. At issue is whether training these models is protected under the "fair use" doctrine of US copyright law.
Prediction v. Explanation: AI's limited ability to explain its results, or to have them interrogated, has been a major limitation, especially in academic research, where explanation is usually the primary goal. This is beginning to change with the development of causal AI, which may prove crucial for building trust in AI and its governance.
Model Fit: Fitting accurate models is a challenge in all statistical endeavors, and AI is no different. A model that fits the training data well may not make accurate predictions for new, unseen test data (the final sketch below makes this overfitting problem concrete).
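The first sketch uses fabricated data to illustrate the bias point above: a simple decision rule fit to training data in which one group is heavily underrepresented ends up serving that group noticeably worse.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_group(n, threshold):
    # Outcome is determined by a score, with a group-specific cutoff.
    score = rng.normal(0, 1, n)
    return score, (score > threshold).astype(int)

# Training data: group A is heavily overrepresented (950 vs. 50 records).
score_a, label_a = make_group(950, threshold=0.0)
score_b, label_b = make_group(50, threshold=1.0)
scores = np.concatenate([score_a, score_b])
labels = np.concatenate([label_a, label_b])

# A single-cutoff "model" fit to the pooled data lands near group A's rule.
fitted = min(np.linspace(-2, 2, 81),
             key=lambda t: np.mean((scores > t).astype(int) != labels))

# Evaluated on fresh data, the model is far less accurate for group B.
for name, thr in [("group A", 0.0), ("group B", 1.0)]:
    s, y = make_group(5000, thr)
    print(name, "accuracy:", np.mean((s > fitted).astype(int) == y))
```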
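The second sketch shows a classic linkage scenario, with entirely made-up records: joining a de-identified data set to a public one on shared quasi-identifiers can re-attach names to sensitive information.

```python
import pandas as pd

# A "de-identified" data set: names removed, quasi-identifiers retained.
# All records here are fabricated for illustration.
health = pd.DataFrame({
    "zip": ["20742", "20740", "20742"],
    "birth_year": [1985, 1990, 1985],
    "sex": ["F", "M", "M"],
    "diagnosis": ["diabetes", "asthma", "hypertension"],
})

# A public record (for example, a voter roll) sharing those same fields.
public = pd.DataFrame({
    "name": ["A. Smith", "B. Jones", "C. Lee"],
    "zip": ["20742", "20740", "20742"],
    "birth_year": [1985, 1990, 1985],
    "sex": ["F", "M", "M"],
})

# Joining on the shared fields re-attaches names to sensitive diagnoses.
linked = public.merge(health, on=["zip", "birth_year", "sex"])
print(linked[["name", "diagnosis"]])
```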
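The final sketch demonstrates overfitting directly on simulated data: a high-degree polynomial fits a handful of noisy training points almost perfectly, yet typically predicts new observations worse than a simpler model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    # Noisy observations of a simple underlying relationship.
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + rng.normal(0, 0.3, n)

x_train, y_train = sample(12)
x_test, y_test = sample(500)

# A flexible (degree-9) polynomial can chase the noise in 12 training
# points; a simpler cubic typically generalizes better to unseen data.
for degree in (3, 9):
    coef = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coef, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coef, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```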