GENERATIVE ARTIFICIAL INTELLIGENCE (GENERATIVE AI)
Definition and History
Generative AI is a type of artificial intelligence technology that can produce various types of content, including text, imagery, audio and synthetic data. The recent buzz around generative AI has been driven by the simplicity of new user interfaces for creating high-quality text, graphics and videos in a matter of seconds. Generative artificial intelligence (generative AI, GenAI,[1] or GAI) is artificial intelligence capable of generating text, images or other data using generative models,[2] often in response to prompts.[3][4] Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.
Improvements in transformer-based deep neural networks enabled an AI boom of generative AI systems in the early 2020s. These include large language model (LLM) chatbots such as ChatGPT, Copilot, Gemini and LLaMA, text-to-image artificial intelligence art systems such as Stable Diffusion, Midjourney and DALL-E, and text-to-video AI generators such as Sora. Companies such as OpenAI, Anthropic, Microsoft, Google, and Baidu as well as numerous smaller firms have developed generative AI models.
Applications of Generative AI
Generative AI has uses across a wide range of industries, including software development, healthcare, finance, entertainment, customer service,[13] sales and marketing,[14] art, writing,[15] fashion,[16] and product design.[17] However, concerns have been raised about the potential misuse of generative AI such as cybercrime, the use of fake news or deepfakes to deceive or manipulate people, and the mass replacement of human jobs.
Modalities
A generative AI system is constructed by applying unsupervised or self-supervised machine learning to a data set. The capabilities of a generative AI system depend on the modality or type of the data set used.
Generative AI can be either unimodal or multimodal; unimodal systems take only one type of input, whereas multimodal systems can take more than one type of input.[38] For example, one version of OpenAI's GPT-4 accepts both text and image inputs.[39]
Code
In addition to natural language text, large language models can be trained on programming language text, allowing them to generate source code for new computer programs.[42] Examples include OpenAI Codex.
Audio
Generative AI can also be trained extensively on audio clips to produce natural-sounding speech synthesis and text-to-speech capabilities, exemplified by ElevenLabs' context-aware synthesis tools or Meta Platform's Voicebox.[45]
Duration:
AI-generated music from the Riffusion Inference Server, prompted with bossa nova with electric guitar
Generative AI systems such as MusicLM[46] and MusicGen[47] can also be trained on the audio waveforms of recorded music along with text annotations, in order to generate new musical samples based on text descriptions such as a calming violin melody backed by a distorted guitar riff.
Video
Runway Gen2, prompt A golden retriever in a suit sitting at a podium giving a speech to the white house press corps
Generative AI trained on annotated video can generate temporally-coherent, detailed and photorealistic video clips. Examples include Sora by OpenAI,[10] Gen-1 and Gen-2 by Runway,[48] and Make-A-Video by Meta Platforms.[49]
Molecules
Generative AI systems can be trained on sequences of amino acids or molecular representations such as SMILES representing DNA or proteins. These systems, such as AlphaFold, are used for protein structure prediction and drug discovery.[50] Datasets include various biological datasets.
Robotics
Generative AI can also be trained on the motions of a robotic system to generate new trajectories for motion planning or navigation.
Planning
The terms generative AI planning or generative planning were used in the 1980s and 1990s to refer to AI planning systems, especially computer-aided process planning, used to generate sequences of actions to reach a specified goal
Data
Generative AI systems are often used to develop synthetic data as an alternative to data produced by real-world events.
Computer aided design
Artificially intelligent computer-aided design (CAD) can use text-to-3D, image-to-3D, and video-to-3D to automate 3D modeling.
References
Newsom, Gavin; Weber, Shirley N. (September 6, 2023). "Executive Order N-12-23" (PDF). Executive Department, State of California. Retrieved September 7, 2023.
Pinaya, Walter H. L.; Graham, Mark S.; Kerfoot, Eric; Tudosiu, Petru-Daniel; Dafflon, Jessica; Fernandez, Virginia; Sanchez, Pedro; Wolleb, Julia; da Costa, Pedro F.; Patel, Ashay (2023). "Generative AI for Medical Imaging: extending the MONAI Framework". arXiv:2307.15208 [eess.IV].
Jump up to:a b Griffith, Erin; Metz, Cade (January 27, 2023). "Anthropic Said to Be Closing In on $300 Million in New A.I. Funding". The New York Times. Retrieved March 14, 2023.
Lanxon, Nate; Bass, Dina; Davalos, Jackie (March 10, 2023). "A Cheat Sheet to AI Buzzwords and Their Meanings". Bloomberg News. Retrieved March 14, 2023.
Pasick, Adam (March 27, 2023). "Artificial Intelligence Glossary: Neural Networks and Other Terms Explained". The New York Times. ISSN 0362-4331. Retrieved April 22, 2023.
Karpathy, Andrej; Abbeel, Pieter; Brockman, Greg; Chen, Peter; Cheung, Vicki; Duan, Yan; Goodfellow, Ian; Kingma, Durk; Ho, Jonathan; Rein Houthooft; Tim Salimans; John Schulman; Ilya Sutskever; Wojciech Zaremba (June 16, 2016). "Generative models". OpenAI.