Edmond de Belamy, by Obvious (2018)
Today’s large language models (LLMs), which are based on the so-called “transformer” architecture developed by Google, have a limited “context window”—akin to short-term memory. Doubling the length of the window increases the computational load fourfold, which limits how fast the models can improve. Many researchers are working on post-transformer architectures that can support far bigger context windows—an approach that has been dubbed “long learning” (as opposed to “deep learning”). Modern LLMs can generate an impressive amount of text, but beware: the models may get some facts wrong, and are prone to flights of fancy that their creators call “hallucinations”, an inherent feature of how they work. Leading models are now racing to become “multimodal”, capable of dealing with various types of data. Training on datasets that combine text, images and video gives them a richer sense of how the world works and may reduce hallucinations, which matters all the more as systems that use LLMs to control other components proliferate. Another approach augments LLMs with formal reasoning capabilities, or with external modules such as task lists and long-term memory.
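To make the quadratic scaling concrete, here is a small illustrative sketch (plain Python, not any model's actual code) counting the token-to-token comparisons a full self-attention layer performs as the context window grows:

```python
# Illustrative sketch of why long context windows are expensive: a full
# self-attention layer compares every token in the window with every other
# token, so the work grows with the square of the window length.

def attention_pair_count(n_tokens: int) -> int:
    """Token-to-token comparisons made by one full self-attention layer."""
    return n_tokens * n_tokens

for n in (1_000, 2_000, 4_000):
    print(f"context {n:>5} tokens -> {attention_pair_count(n):>12,} comparisons")

# Doubling the window from 1,000 to 2,000 tokens quadruples the comparisons
# (1,000,000 -> 4,000,000), matching the fourfold jump in compute described above.
```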
Another problem is the algorithm by which LLMs learn, called backpropagation. All LLMs are neural networks arranged in layers, which receive inputs and transform them to predict outputs. When the LLM is in its learning phase, it compares its predictions against the version of reality available in its training data. If these diverge, the algorithm makes small tweaks to each layer of the network to improve future predictions. That makes learning computationally intensive and incremental. Since 2017 most AI models have used a type of neural-network architecture known as a transformer (the “T” in GPT), which allows them to establish relationships between bits of data that are far apart within a data set. To overcome the scaling problems, alternative architectures such as Mamba and the Joint Embedding Predictive Architecture (JEPA) are being explored. For a new generation of AI models to stun the world, fundamental breakthroughs may be needed.
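The learning loop described here can be sketched in a few lines. The following is a minimal, illustrative backpropagation example (a toy two-layer network learning XOR with NumPy), not the training code of any real LLM:

```python
# Toy backpropagation loop: predict, compare with the training data, and nudge
# every layer's weights to shrink the divergence.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # inputs
y = np.array([[0.], [1.], [1.], [0.]])                   # targets (XOR)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)            # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)            # output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for step in range(10_000):
    # forward pass: transform the inputs layer by layer into a prediction
    h = sigmoid(X @ W1 + b1)
    pred = sigmoid(h @ W2 + b2)
    # backward pass: push small corrections back through each layer
    d_pred = (pred - y) * pred * (1 - pred)
    d_h = (d_pred @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_pred
    b2 -= lr * d_pred.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(pred.round(2).ravel())   # typically close to [0, 1, 1, 0] after training
```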
The Jetsons: Tech companies are venturing into the realm of complex AI agents capable of multitasking across various applications, from planning itineraries to booking accommodations, restaurants, and taxis. Bill Gates foresees a future where these agents, fueled by natural language understanding, will revolutionize daily tasks, potentially replacing traditional apps. These agents, which Microsoft describes as having "agency", hint at a future where AI can autonomously execute complex sequences of actions, raising significant ethical concerns about control and autonomy in our interactions with technology. A Large Action Model (LAM), which is built on LLMs, functions as the agent's operating system, turning it into a personal voice assistant. The LAM uses its long-term memory of you to translate your requests into actionable steps and responses; it learns what apps and services you use daily. In principle, a LAM can learn to see and act in the world much as humans do. It is still early days, but the app-led devices of today will likely give way to new agent-led devices. As we edge toward an agent-centric technological landscape, as seen in early iterations like OpenAI's GPT Store and Microsoft's GitHub Copilot, the balance of human oversight versus AI autonomy becomes increasingly delicate, potentially redefining our relationship with digital assistants. As companies like Microsoft and OpenAI race to develop these sophisticated agents, concerns over relinquishing human agency to AI loom large, highlighting the ethical complexities inherent in this technological shift.
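The agent pattern described above can be caricatured in code. The sketch below is purely hypothetical: the class, the remembered preference and the action names are invented for illustration and do not correspond to any vendor's real LAM API.

```python
# Hypothetical agent sketch: a planner maps a request to a sequence of app
# actions, and a memory of past preferences personalises the plan.
from dataclasses import dataclass, field

@dataclass
class Agent:
    memory: dict = field(default_factory=dict)   # stands in for long-term memory

    def plan(self, request: str) -> list[str]:
        # A real agent would ask an LLM to decompose the request; here one
        # itinerary-style decomposition is hard-coded for illustration.
        steps = ["search_flights", "book_hotel", "reserve_restaurant", "order_taxi"]
        if self.memory.get("diet") == "vegetarian":
            steps[2] = "reserve_restaurant(vegetarian=True)"
        return steps

    def act(self, request: str) -> None:
        for step in self.plan(request):
            print(f"executing: {step}")          # a real agent would call app APIs here

agent = Agent(memory={"diet": "vegetarian"})
agent.act("Plan my weekend trip to Lisbon")
```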
A Journey Through the Electrifying History of Energy
Draft text under preparation.
Deep-dive into the Hidden Language of Computer Hardware and Software
Starting with the challenge of sending messages across a distance at night without electronic devices, the narrator devises an initial, inefficient method using simple blinks of light. Discovering Morse code, which employs a system of short and long blinks (dots and dashes), proves much more efficient, reducing the complexity and length of messages. The text highlights the broader theme of codes as systems for communication, comparing Morse code to spoken and written language, and emphasizes how codes, including those used in computers, are essential for transmitting information effectively.

It then turns to the development and intricacies of Morse code, invented by Samuel Morse in the 1830s and further refined by Alfred Vail, and details how Morse code, used with the telegraph, illustrates the concept of codes and their applications. The challenge of decoding Morse code is contrasted with its simpler transmission. The text introduces the concept of organizing Morse code based on the number of dots and dashes, revealing that the number of possible codes doubles with each additional dot or dash, following powers of two (2, 4, 8, 16, etc.). This binary nature of Morse code, and its correlation with binary systems such as computer codes, highlights the significance of powers of two in information theory.

A further chapter covers the development and mechanics of Braille, invented by Louis Braille, a blind French teenager, in the early 19th century. After a childhood accident left him blind, Braille refined an existing system of raised dots used for military communication into a more efficient tactile code for the blind. Braille's system, which uses a 2x3 cell of raised dots, provides up to 64 unique combinations, though only 25 are used for the Latin alphabet. The text also discusses how Braille incorporates binary principles, similar to Morse code, and includes advanced features such as Grade 2 Braille with contractions and symbols for mathematical and musical notation. It highlights how Braille's design incorporates redundancy and shift codes to handle variations like uppercase letters and numerals, demonstrating its adaptability and complexity.
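The counting argument here (codes doubling with each extra dot or dash, and the 64 combinations of a Braille cell) can be checked directly with a quick illustrative snippet:

```python
# Quick check of the counting argument: the number of possible Morse sequences
# doubles with each extra dot or dash, and a 2x3 Braille cell of raised/flat
# dots gives 2**6 combinations.

for length in range(1, 5):
    print(f"Morse codes using exactly {length} dots/dashes: {2 ** length}")   # 2, 4, 8, 16

print("Codes using 1-4 dots/dashes in total:", sum(2 ** k for k in range(1, 5)))  # 30, enough for 26 letters
print("Braille 2x3 cell combinations:", 2 ** 6)   # 64, of which 25 encode the basic Latin letters
```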
A flashlight typically includes batteries, a lightbulb (or LED), and a switch, all housed in a case. The lightbulb, often incandescent, contains a tungsten filament that glows when electricity flows through it. The electricity in a flashlight circuit is powered by batteries through a chemical reaction that creates a flow of electrons, which travels through a closed circuit, illuminating the bulb. Key concepts include the nature of electric current (the flow of electrons), voltage (the potential difference), resistance (opposition to flow), and how these interact in a circuit. The text also highlights the importance of the switch, which controls whether the circuit is complete or open. Additionally, it notes the efficiency of LEDs over incandescent bulbs in terms of energy consumption. By wiring up the components in a specific way, you can create two independent circuits that allow for two-way communication. The text introduces the concept of a "common" or shared connection between circuits to reduce the number of wires needed. It also explains how using the Earth as a conductor can further reduce wiring requirements, although this is limited by the Earth's resistance and the voltage of the batteries used. The chapter concludes by noting the challenges of long-distance communication due to wire resistance and the historical context of telegraph systems.
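The relationships among voltage, resistance and current mentioned here follow Ohm's law; a back-of-the-envelope sketch with illustrative numbers (not figures from the text):

```python
# Ohm's law: current = voltage / resistance.

def current_amps(voltage_volts: float, resistance_ohms: float) -> float:
    return voltage_volts / resistance_ohms

# Two 1.5 V cells in series driving a small bulb:
print(current_amps(3.0, 10.0), "A through a 10-ohm bulb")        # 0.3 A

# Why long-distance telegraphy is hard: wire resistance grows with length,
# so the same battery pushes far less current through a very long line.
print(current_amps(3.0, 1000.0), "A through a 1,000-ohm line")   # 0.003 A
```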
The evolution of logic from Aristotle's syllogisms to Boolean algebra, emphasizing the impact of George Boole's innovations in abstract mathematics. Aristotle's logic, rooted in syllogisms like "All men are mortal; Socrates is a man; hence, Socrates is mortal," laid the groundwork for formal logical reasoning. Over centuries, mathematicians sought to formalize logic mathematically, culminating in Boole’s development of Boolean algebra. This system abstracts algebra from numerical calculations to classes and operations, using symbols like + (union) and × (intersection) to represent logical operations. Boolean algebra proved essential in modern computing by providing a mathematical framework for logical operations, exemplified by circuits where switches and lightbulbs model Boolean AND and OR operations. This text illustrates how Boolean algebra's principles, such as the laws of contradiction and the application of AND/OR operations, underpin digital logic and circuitry.
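A small illustration (not code from the text) of Boole's algebra of classes, using Python sets for classes and booleans for truth values:

```python
# Boole's + (union) and x (intersection) over classes, plus the law of
# contradiction, and the same operations as truth values.

universe = {"Socrates", "Plato", "a rock"}
mortals = {"Socrates", "Plato"}

print(mortals | {"a rock"})            # union, Boole's +
print(mortals & {"Socrates"})          # intersection, Boole's x
complement = universe - mortals
print(mortals & complement == set())   # True: the law of contradiction

# The same operations as wired into AND/OR circuits with switches and bulbs:
a, b = True, False
print("a AND b:", a and b, "| a OR b:", a or b)
```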
Samuel Finley Breese Morse, born in 1791, was a multifaceted figure known for his contributions to both art and technology. Educated at Yale and trained in art in London, Morse achieved success as a portrait artist and was also an early photographer. However, his most significant impact was in telecommunications with his invention of the telegraph and Morse code. Prior to Morse, long-distance communication was slow and limited, but his telegraph revolutionized this by enabling instantaneous communication over vast distances. The telegraph operated using electromagnetism, a concept explored by scientists like Hans Christian Ørsted and Michael Faraday. Morse’s system employed a binary code transmitted via a key and sounder, laying the groundwork for modern communication technologies. His innovation also introduced the concept of the relay, an electrical switch that amplifies signals, which became crucial for extending communication networks. The telegraph, with its ability to transmit messages quickly and over long distances, marked the beginning of the modern era of communication.
The foundational concepts of computer logic, focusing on how Boolean algebra and electrical components like relays and switches create logic gates essential for computing. It illustrates this through a practical example of designing a circuit to select a cat based on specific criteria. Logic gates, including AND, OR, NAND, NOR, and inverters, are fundamental in constructing these circuits. By using relays as switches, the text explains how to implement these gates and perform Boolean operations in real-world scenarios. The text also touches on De Morgan's laws, which offer methods for simplifying Boolean expressions and demonstrate the versatility and foundational nature of these logic components in computing systems.
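The gates and De Morgan's laws can be modelled as ordinary functions; the cat-selection criteria below are invented for illustration, since the text does not spell them out:

```python
# Logic gates as plain functions, plus a brute-force check of De Morgan's laws
# over every input combination.

AND  = lambda a, b: a and b
OR   = lambda a, b: a or b
NOT  = lambda a: not a
NAND = lambda a, b: NOT(AND(a, b))
NOR  = lambda a, b: NOT(OR(a, b))

inputs = [(a, b) for a in (False, True) for b in (False, True)]

# De Morgan: NOT(a AND b) == (NOT a) OR (NOT b), and NOT(a OR b) == (NOT a) AND (NOT b)
print(all(NAND(a, b) == OR(NOT(a), NOT(b)) for a, b in inputs))   # True
print(all(NOR(a, b) == AND(NOT(a), NOT(b)) for a, b in inputs))   # True

# An invented selection rule in the spirit of the example: neutered AND (male OR white)
select_cat = lambda neutered, male, white: AND(neutered, OR(male, white))
print(select_cat(True, False, True))    # True
print(select_cat(False, True, False))   # False
```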
Human numerical systems are influenced by our physiology and the number of fingers and toes we have. It starts with the decimal system, based on ten digits, and explains how this system feels natural due to our ten fingers and toes. It then introduces alternative systems like octal (base eight) and binary (base two), illustrating how different base systems could have developed depending on our anatomy. Octal and binary systems are compared, emphasizing that while decimal is familiar, octal relates closely to binary due to its power-of-two structure. The text also briefly touches on the quaternary system (base four) for creatures like lobsters. It demonstrates conversions between these systems and explains binary arithmetic, highlighting the fundamental differences in how numbers are represented and manipulated across various systems. The overall theme is the deep connection between human physiology, number systems, and their applications in computing.
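A short, illustrative sketch of the base conversions discussed, using Python's built-ins plus a hand-rolled converter for other bases such as quaternary:

```python
# Converting between decimal, binary, octal and quaternary representations.

n = 100
print(bin(n), oct(n), hex(n))            # 0b1100100 0o144 0x64

def to_base(n: int, base: int) -> str:
    digits = ""
    while n:
        digits = str(n % base) + digits
        n //= base
    return digits or "0"

print(to_base(100, 8), to_base(100, 4), to_base(100, 2))   # 144 1210 1100100
print(int("1100100", 2), int("144", 8))                    # both 100 again

# Binary arithmetic works like decimal arithmetic, only with carries at two:
print(bin(0b1011 + 0b0110))              # 0b10001  (11 + 6 = 17)
```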
The concept of binary digits, or bits, and their role in encoding and communicating information. It starts with a historical anecdote about a man requesting a simple sign to determine his welcome home, illustrating how a binary choice (a bit) can convey a clear message. The text then transitions to explaining binary as the most fundamental system of representing information with just two values: 0 and 1. It demonstrates how binary is used in various contexts, from traditional signals like lanterns in Paul Revere's ride to modern applications such as UPC barcodes. Through these examples, the text emphasizes that bits, despite their simplicity, are powerful tools for encoding complex information and facilitating communication. The overarching theme is the efficiency and universality of binary encoding in representing and processing information.
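The arithmetic behind these examples is simple enough to show in a couple of lines (illustrative only): n bits, i.e. n independent yes/no choices, can distinguish 2^n different messages.

```python
import math

def bits_needed(choices: int) -> int:
    return math.ceil(math.log2(choices))

print(bits_needed(2))    # 1 bit: a single lantern, lit or unlit
print(bits_needed(26))   # 5 bits are enough to pick out one letter of the alphabet
print(2 ** 8)            # and 8 bits distinguish 256 messages
```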
The significance of bytes and hexadecimal notation in computing. It begins by discussing how individual bits are grouped into "words" of varying lengths, traditionally multiples of 6 bits, but modern computers standardize on 8-bit bytes. The term "byte," introduced by IBM in the 1950s, refers to an 8-bit unit that efficiently represents data, with hexadecimal (base-16) providing a more succinct way to express byte values compared to binary or octal systems. Hexadecimal uses digits 0-9 and letters A-F to represent values, simplifying the representation of binary data. The text covers the conversion between binary, decimal, and hexadecimal systems, highlights the use of hexadecimal in HTML for color coding, and explains the importance of hexadecimal in computing for compact data representation.
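A quick sketch of bytes and hexadecimal as described, including the HTML colour example:

```python
# One 8-bit byte holds values 0-255, which hexadecimal writes in two digits.

value = 0b11000111            # one byte, written in binary
print(value, hex(value))      # 199 0xc7

# HTML colours pack three bytes (red, green, blue) into six hex digits:
r, g, b = 255, 165, 0         # orange
print(f"#{r:02X}{g:02X}{b:02X}")       # #FFA500

# Converting back: each hex digit stands for four bits.
print(int("FF", 16), int("A5", 16))    # 255 165
```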
The evolution of text representation in computing, focusing on character encoding standards. It begins with early binary codes like Morse code and Braille, which were foundational but had limitations. Morse code, for instance, uses variable bit lengths for different characters, while Braille is a straightforward 6-bit code. The Baudot code, a 5-bit system used in telegraphs and teletypewriters, introduced the concept of shift codes to handle different character sets, although this led to issues like misinterpreted data when shift codes were not managed correctly. The transition to ASCII (American Standard Code for Information Interchange), a 7-bit code formalized in 1967, marked a significant advancement, providing a standardized method for text representation that avoids shift codes and includes a comprehensive set of characters: uppercase and lowercase letters, numbers and punctuation. ASCII's simplicity and consistency made it dominant, though IBM's EBCDIC (Extended Binary Coded Decimal Interchange Code), an 8-bit code used in IBM's systems, remained a notable alternative. ASCII's influence extends to modern computing, with files often stored in ASCII format or its 8-bit extension, reflecting its practicality and widespread adoption. The text also notes the challenges of text-file compatibility across different systems, such as varying end-of-line markers, and highlights the transition to richer text formats like HTML, which build on ASCII by incorporating markup for formatting.

It then examines the limitations of character encoding systems, focusing on ASCII's shortcomings in representing diverse global scripts. ASCII, a 7-bit code, is primarily designed for American English and lacks support for accented characters and non-Latin scripts, which led to the development of extended ASCII sets and various regional character encodings like ANSI, ISO-8859-1 and Windows-1252. The introduction of Unicode aimed to address these limitations by providing a universal character set, supporting over 65,000 characters in its original 16-bit form and over 1 million code points in the current 21-bit version. Unicode includes characters from a wide range of languages and symbols, including emojis. UTF-8, a variable-length encoding format, is widely used on the internet due to its backward compatibility with ASCII and efficient storage. However, issues like character-encoding mismatches, such as those seen in garbled emails, highlight the importance of correctly specifying and using encoding standards.
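The variable-length behaviour of UTF-8 and its backward compatibility with ASCII can be demonstrated directly:

```python
# ASCII characters keep their single-byte values under UTF-8, while characters
# outside ASCII expand to multi-byte sequences.

for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(ch, [hex(b) for b in encoded], f"({len(encoded)} byte(s))")

# 'A'  -> ['0x41'], 1 byte, the same as its ASCII code 65
# 'é'  -> 2 bytes, '€' -> 3 bytes, '😀' -> 4 bytes under the variable-length scheme
print(ord("A"))   # 65, the ASCII/Unicode code point for 'A'
```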
How to construct a binary adding machine using basic logic gates and electrical components. It begins by emphasizing the fundamental role of addition in computing and outlines how a simple adding machine can be built on paper using binary numbers and logic gates like AND, OR, NAND, and XOR. The adding machine will have switches for input and lightbulbs for output, simulating how binary addition works. The addition process involves handling sum and carry bits separately, simplified through binary arithmetic tables. The machine employs half adders for single-bit addition and full adders for multi-bit numbers, with a total of 144 relays needed for an 8-bit adder. The text concludes by discussing how multiple 8-bit adders can be cascaded to handle larger numbers, mirroring basic computer arithmetic operations.
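The adder itself translates naturally into code. The sketch below uses Python's bitwise operators to stand in for the relay-built gates: a half adder (XOR for the sum, AND for the carry), a full adder, and an 8-bit ripple-carry chain.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    return a ^ b, a & b                       # (sum, carry)

def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2                        # (sum, carry-out)

def add_8bit(x: int, y: int) -> int:
    carry, result = 0, 0
    for i in range(8):                        # least significant bit first
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result                             # the carry out of bit 7 is discarded

print(add_8bit(0b01100101, 0b00101110))       # 147, i.e. 101 + 46
print(bin(add_8bit(200, 100)))                # wraps around: (300) mod 256 = 44
```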
The evolution of computing technology from relay-based systems to modern transistor-based computers. It begins with George Stibitz's early relay-based 1-bit adder, which led to the Complex Number Computer in 1940, and details other early relay computers like the Harvard Mark I. It contrasts these with analog computers, like the Differential Analyzer, and highlights Charles Babbage's pioneering work on the Analytical Engine, which featured programmable elements. The narrative then shifts to the development of transistors by John Bardeen, Walter Brattain, and William Shockley in 1947, marking a significant leap from vacuum tubes to solid-state electronics. Transistors, which are smaller, more efficient, and longer-lasting than vacuum tubes, revolutionized computing, leading to the creation of early transistor computers and the establishment of Silicon Valley. The text concludes with a mention of Geoffrey Dummer’s visionary concept of integrated circuits, which would eventually become a reality in electronic equipment. Overall, the theme emphasizes the continuous advancement in computer technology from mechanical and analog systems to electronic and integrated circuits.
How binary subtraction differs from addition and introduces techniques to simplify it. Addition involves a straightforward process of carrying values from one column to the next, while subtraction requires borrowing, which complicates the operation. To address this, the text presents the concept of nines' complement and ones' complement. The nines' complement method avoids borrowing by converting the subtraction problem into an addition problem with complements, simplifying calculations. For binary numbers, this translates to the ones' complement, which is calculated by inverting bits. The text also explores how binary subtraction can be managed using two’s complement notation, which allows both positive and negative numbers to be handled seamlessly within a fixed bit-width format. This system simplifies arithmetic operations by converting subtraction into addition of complements, although it requires careful attention to overflow issues.
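A compact sketch of the complement arithmetic described, working within a fixed 8-bit width:

```python
# Ones' complement inverts the bits, two's complement adds one, and subtraction
# becomes addition of the complement, modulo 256.

BITS = 8
MASK = (1 << BITS) - 1            # 0b11111111

def ones_complement(x: int) -> int:
    return ~x & MASK

def twos_complement(x: int) -> int:
    return (ones_complement(x) + 1) & MASK

def subtract(a: int, b: int) -> int:
    return (a + twos_complement(b)) & MASK    # a - b within 8 bits

print(subtract(150, 23))                      # 127
print(subtract(23, 150))                      # 129, the two's-complement pattern for -127
print(twos_complement(1))                     # 255, the 8-bit pattern for -1
```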
How electricity powers various devices, with a focus on how it makes things move using basic electric circuits. It highlights the function of electric buzzers and bells and introduces the concept of relays and oscillators. Relays can create buzzing sounds by opening and closing circuits repeatedly, while oscillators continuously alternate between on and off states, providing essential timing functions in electronics. The text delves into flip-flops, circuits that "remember" their state, and distinguishes between different types, including the R-S (Reset-Set) flip-flop and the D-type (Data-type) flip-flop. The latter, especially when level-triggered, serves as a basic form of memory storage in digital circuits. The discussion also touches on practical applications like adding circuits and issues with level-triggered flip-flops in complex operations, emphasizing their role in automating and storing data.

The text then describes the implementation and functionality of a Hold That Bit signal in digital circuits, particularly in D-type flip-flops. A Hold That Bit signal maintains the output state regardless of changes in the Data input when the signal is low (0). The text details how this signal is integrated into circuits using R-S flip-flops and how it simplifies to a level-triggered D-type flip-flop, which saves the Data input value when the Clock signal is high (1). The discussion then transitions to edge-triggered D-type flip-flops, which update their output only on the transition of the Clock signal from low to high (positive edge). The final part introduces ripple counters and frequency dividers, demonstrating how multiple flip-flops can be cascaded to count in binary and measure oscillator frequencies. Practical enhancements such as Clear and Preset signals are also covered, emphasizing the importance of these components in creating reliable and accurate digital circuits.
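A small simulation of the sequential logic described above. The sketch below models a D-type flip-flop wired to toggle and chains three of them into a ripple counter; for simplicity it triggers on the falling clock edge, which is one common wiring, not necessarily the exact one in the text.

```python
class DFlipFlop:
    """D flip-flop with D tied to its own inverted output, so each trigger toggles Q."""
    def __init__(self):
        self.q = 0
        self._last_clock = 0

    def tick(self, clock: int) -> None:
        if clock == 0 and self._last_clock == 1:   # falling clock edge
            self.q = 1 - self.q
        self._last_clock = clock

stages = [DFlipFlop() for _ in range(3)]           # three bits: counts 0 to 7

def apply_clock(level: int) -> None:
    signal = level                                 # the external oscillator
    for ff in stages:
        ff.tick(signal)
        signal = ff.q                              # each Q clocks the next stage

for pulse in range(1, 9):
    apply_clock(1)
    apply_clock(0)                                 # one full oscillator cycle
    count = sum(ff.q << i for i, ff in enumerate(stages))
    print(f"after pulse {pulse}: {count:03b}")     # counts up in binary, wrapping to 000 after 8
```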
The story of the Internet is all about layers.
The resilience and adaptability of the internet are rooted in its layered architecture, which allows it to function seamlessly despite constant physical changes and challenges. Since Bob Metcalfe's Ethernet innovation of 1973, the internet has evolved from a network dogged by predictions of imminent collapse into a robust and scalable system. Initially designed with a fundamental lack of built-in memory, the internet's decentralised protocols were supplemented by layers such as TCP/IP and application protocols like HTTP, enabling flexibility and rapid growth. This layered abstraction allows users to interact with digital services without considering the complex physical infrastructure beneath, which includes billions of kilometres of fibre-optic cables and numerous data centres. Advances such as optical fibre have bolstered this infrastructure, allowing it to handle more data and support emerging technologies like AI and the metaverse. However, the physical limitations still influence what is digitally possible, reminding tech innovators of the essential role that infrastructure plays in the evolution of digital technologies.
Advances in physical storage and retrieval made the cloud possible
The evolution of physical storage has been pivotal in the rise of cloud computing, yet sustaining this digital infrastructure demands continuous innovation. It all began in 1956 with IBM's 305 RAMAC, a behemoth of a machine that stored a mere 4.4 megabytes on hefty magnetic disks, costing an astronomical sum compared to today's trivial expenses for gigabytes of storage. Modern cloud storage represents an abstraction of this early technology, dispersing data across a global network of servers to the extent that users no longer need to concern themselves with the physical details. Yet, beneath this seamless facade lies a complex web of physical storage advancements essential to meet the escalating data demands—123 zettabytes in 2023 alone. Technologies like magnetic tape, once deemed outdated, continue to underpin vast data repositories for institutions like the Rutherford Appleton Laboratory due to their cost-effectiveness and durability. Meanwhile, flash memory and hard-disk drives cater to different needs with varying speeds and costs. Redundancy systems such as RAID safeguard against hardware failures, ensuring data integrity. However, magnetic tape's limitations—requiring meticulous storage conditions and periodic replacement—highlight the need for new solutions. Enter glass storage, a cutting-edge technology with the potential to revolutionize data preservation by encoding information in laser-etched dots on glass slides, promising longevity of up to 10,000 years. As the cloud evolves, so too must its physical underpinnings, balancing the need for rapid access with enduring reliability.
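The redundancy idea behind RAID mentioned here can be illustrated with XOR parity (a simplified sketch of RAID-style parity, not any vendor's implementation):

```python
# An XOR parity block lets any single lost block be rebuilt from the survivors.

data_blocks = [b"hello world!!", b"cloud storage", b"parity demo!!"]   # equal-length blocks

def xor_blocks(blocks):
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

parity = xor_blocks(data_blocks)

# Simulate losing one block (a failed drive) and reconstructing it:
lost = 1
survivors = [b for i, b in enumerate(data_blocks) if i != lost]
recovered = xor_blocks(survivors + [parity])
print(recovered == data_blocks[lost])   # True: the lost block is recovered exactly
```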
The evolution of the internet highlights a shift from its initial decentralized vision to a highly centralized reality dominated by a few tech giants. The concept of "network effects," once championed to explain the success of Microsoft and other early digital players, has not entirely captured the complexities of today's internet landscape. As Carl Shapiro and Hal Varian's ideas about network effects have been scrutinized, the real narrative unfolds in the interplay of layers within the internet. Initially decentralized at the protocol level, the internet's web and application layers have become increasingly concentrated, with giants like Google, Facebook, and Amazon consolidating power through massive data accumulation and advanced technologies. This centralization is further intensified by the advertising-driven business model, which prioritizes user data and scales, ultimately fueling further concentration. As these companies pivot to artificial intelligence, concerns mount about their potential to dominate not just online spaces but various industries, prompting ongoing efforts by blockchain proponents to reclaim the internet's decentralized ethos.
How do machines learn?
Artificial intelligence could be the next major leap in evolution, but it remains a contentious issue. The universe began 13.8 billion years ago with the Big Bang, and roughly four billion years ago self-replicating structures emerged on Earth. This life can be categorized into three stages: Life 1.0, biological entities like bacteria, which evolve slowly and cannot adapt within their lifetimes; Life 2.0, exemplified by humans, who can acquire new knowledge and adapt throughout their lives; and Life 3.0, a theoretical stage where technological life forms could design both their hardware and software. The debate over AI's role in this evolutionary trajectory is polarized: digital utopians view it as a natural progression, techno-skeptics doubt its near-term impact, and beneficial AI advocates urge a focus on ensuring AI research yields universally positive outcomes.

Artificial intelligence promises both groundbreaking advancements and significant risks. Its potential benefits are immense, but flaws in AI systems could be exacerbated by their speed and complexity, making trial-and-error approaches potentially disastrous. Max Tegmark's exploration in Life 3.0 highlights the urgent need for robust AI systems, emphasizing four key areas: Verification (building the system correctly), Validation (ensuring the right system is built), Control (maintaining human oversight), and Security (protecting against malware and hacks). The concept of an "intelligence explosion" further complicates the future, in which an Artificial General Intelligence (AGI) with full cognitive capabilities could evolve too quickly for humans to manage, potentially leading to scenarios ranging from totalitarian regimes to autonomous AI entities. As we look ahead, the trajectory of AGI could lead to various outcomes over the next 10,000 years, including harmonious coexistence with AI, a future without advanced AI, or even the end of humanity.
The advent of advanced AI tools, like ChatGPT, has the potential to transform economies similarly to past innovations such as steam engines and electricity. AI, being a general-purpose technology, promises to enhance productivity across various industries but could also displace workers. Historical patterns show that while new technologies often take time to impact the economy significantly—e.g., steam power and electrification—the eventual benefits can be profound. Current AI advancements, though impressive, may initially depress productivity growth as businesses adapt. Despite concerns about potential job losses, history suggests that technology typically creates new opportunities and shifts employment rather than causing mass unemployment. As AI continues to develop, it could lead to significant economic changes, demanding careful management to ensure it translates into broad-based prosperity.
Lawyers are a conservative bunch, befitting a profession that rewards preparedness, sagacity and respect for precedent.
Lawyers, renowned for their adherence to tradition and reverence for precedent, encountered an unexpected twist in the tale when Steven Schwartz, a personal-injury lawyer at New York's Levidow, Levidow & Oberman, ventured into uncharted territory. Last month, Schwartz sought assistance from ChatGPT, an artificial intelligence (AI) chatbot, to aid in preparing a court filing. Yet, in a curious turn of events, Schwartz placed undue reliance on the AI, resulting in a motion brimming with fabricated cases, rulings, and quotes—an erroneous submission made under the misguided assurance from the bot that the provided cases were genuine and accessible in reputable legal databases (which, in reality, they were not).
For skeptics of technology within the legal profession, Schwartz's misstep might seem a validation of their reservations—an affirmation that traditional methods reign supreme. But such a conclusion would miss the mark entirely. Assigning blame to AI for Schwartz's blunder is as nonsensical as attributing printing errors in a typed document to the invention of the printing press. In both scenarios, culpability lies squarely with the lawyer who failed to verify the motion before submission—not with the tool that facilitated its creation.
AI, far from being a passing fad or harbinger of doom, represents a nascent yet potent tool that has the potential to revolutionize legal practice and reshape the economics of law firms. While various fields stand ripe for disruption, few match the legal profession in the clarity of their need for innovation—coupled with the inherent risks involved.
Enterprises that successfully integrate AI into their operations stand poised to reap substantial benefits, while those lagging behind risk obsolescence akin to the fate of typesetters. According to a recent report by Goldman Sachs, a staggering 44% of legal tasks are amenable to AI automation, a figure surpassed only by clerical and administrative support functions. Lawyers often find themselves mired in the laborious scrutiny of voluminous documents—a task that AI has proven adept at handling. From due diligence to research and data analytics, AI applications have already made significant inroads in legal practice.
While the extractive capabilities of AI have hitherto dominated its usage—facilitating the extraction of specific information from texts—more potent generative AIs, like ChatGPT, are poised to elevate legal research and document review to unprecedented levels. By discerning nuanced connections within legal texts, these advanced AI systems promise to streamline and enhance legal analysis.
Leading firms, such as Allen & Overy in London, have embraced AI tools like Harvey for contract analysis, due diligence, and litigation preparation, heralding a paradigm shift in legal practice. However, skepticism abounds among legal practitioners. Despite 82% acknowledging the potential utility of generative AI in legal work, a mere 51% endorse its adoption—a testament to prevailing apprehensions regarding AI's propensity for generating inaccuracies and the inadvertent disclosure of privileged information.
Yet, with advancements in technology and judicious oversight, these concerns can be addressed. Consider the response of a federal judge in Texas following Schwartz's gaffe, mandating attorneys to certify that they either abstained from employing generative AI or thoroughly vetted its outputs before submission. Just as the transition from library-based legal research to digital databases revolutionized legal practice, the widespread adoption of generative AI holds the promise of catalyzing further innovation.
AI's potential impact on the legal landscape extends beyond mere efficiency gains—it has the power to fundamentally alter the dynamics of legal service delivery and consumption. By diminishing the reliance on vast armies of junior lawyers—traditionally the backbone of large firms—AI threatens to upend traditional billing models. Firms may pivot towards value-based billing or impose technology surcharges, adapting to a new paradigm where AI assumes a central role in legal service provision.
Moreover, the advent of AI raises existential questions about the future of legal employment. If AI can execute tasks that previously necessitated scores of associates in a fraction of the time, the imperative for large firms to maintain bloated associate ranks diminishes. A recalibration of the associate-to-partner ratio looms on the horizon, potentially precipitating a seismic shift in the legal labor market.
However, this transformation is unlikely to transpire overnight. Nevertheless, AI holds the promise of democratizing access to legal services, rendering them more affordable and accessible—particularly for small and medium-sized enterprises. Aspiring legal professionals may find solace in AI as a facilitator of solo practice, ushering in a new era of legal entrepreneurship.
In the final analysis, AI's ascendance augurs well for legal consumers. As Richard Susskind, technology adviser to the Lord Chief Justice of England, aptly observes, clients seek solutions to their problems—not necessarily the intervention of lawyers. If AI can deliver outcomes efficiently and effectively, its adoption is all but inevitable. After all, individuals have embraced tax software for filing returns, prioritizing efficacy over interpersonal interactions with tax advisers.
In essence, Schwartz's misstep serves as a cautionary tale, not against the perils of AI, but against complacency and uncritical reliance on technology. Embracing AI as a transformative force within the legal realm demands a nuanced understanding of its capabilities and limitations—a recognition that AI, in its nascent stage, represents a tool of immense potential, awaiting adept wielders to unlock its transformative power. ■
Software that gives legal advice could shake up the legal profession by dispensing faster and fairer justice.
GIVEN the choice, who would you rather trust to safeguard your future: a bloodsucking lawyer or a cold, calculating computer? Granted, it's not much of a choice, since neither lawyers nor computers are renowned for their compassion. But it is a choice that you may well encounter in the not-too-distant future, as software based on “artificial intelligence” (AI) starts to dispense legal advice. Instead of paying a lawyer by the hour, you will have the option of consulting intelligent legal services via the web. While this might sound outlandish, experts believe that the advent of smart software capable of giving good, solid legal advice could revolutionise the legal profession.
What is arguably one of the most conservative of all professions has already been quietly undergoing a technological revolution: many lawyers now use automated document-retrieval systems to store, sort and search through mountains of documents. But the introduction of smarter programs, capable of not just assisting lawyers but actually performing some of their functions, could turn the profession on its head. Such software could both improve access to justice and massively reduce legal costs, both for the client and the courts.
That is not to say that laptops will soon be representing people in court. But when a civil case goes to court it is usually a good indication that all other options have failed. Technology has the potential to preclude this last resort. “You move from a culture of dispute resolution to dispute avoidance,” says Richard Susskind, a law professor who is technology adviser to Britain's Lord Chief Justice. Making legal advice more accessible, he says, means people are more likely to seek advice before getting themselves into trouble.
Some such programs already exist online and are currently being used by lawyers, says John Zeleznikow, a computer scientist at Victoria University in Australia and one of the orchestrators of this transformation. Although current programs are designed to help lawyers give advice, this is just the beginning. The trend, he says, is to make such services available to the masses. One service is designed to help resolve property disputes between divorcing couples. Aptly named SplitUp, the system can examine a client's case and, by scrutinising previous rulings, predict what the likely outcome would be if it went to court. The system, developed and now operating in Australia, is proving to be very helpful in getting couples to settle their disputes without having to go to court, says Andrew Stranieri, an AI expert at the University of Ballarat, in the Australian state of Victoria.
Dr Zeleznikow and Dr Stranieri have teamed up and launched a company, called JustSys, to develop AI-based legal systems. GetAid, another of their creations, is being used in Australia by Victoria Legal Aid (VLA) to assess applicants for legal aid. This is a complicated process that normally consumes about 60% of the authority's operational budget, because it involves assessing both the client's financial status and the likelihood that his or her case will succeed. Although both these systems are only available for use by lawyers and mediators, it is the clients who benefit, says Dr Zeleznikow. With SplitUp, a client can avoid going to court with a claim that will surely lose and is instead given a chance to find a more realistic solution. With GetAid, although it may appear to be the legal professionals who are directly benefiting, there is a real knock-on effect for the client, says Domenico Calabro, a lawyer with VLA. Automating the application process frees up lawyers and paralegals so they can spend more of their time actually representing people rather than processing applications, he says.
Anatomy of an Artificial Lawyer
What makes both these programs so smart is that they do more than just follow legal rules. Both tasks involve looking back through past cases and drawing inferences from them about how the courts are likely to view a new case. To do this, the programs use a combination of two common AI techniques: expert systems and machine learning. Expert systems are computer-based distillations of the rules of thumb used by experts in a particular field. SplitUp, for example, uses an expert “knowledge base” of 94 different variables, which are the factors identified by legal experts as most important to judges dealing with domestic-property disputes. Because no two cases are ever the same, and because judges use different degrees of discretion, it is not enough simply to apply a set of rules to these variables, however.
Hence the need for machine learning, a technique in which a decision-making system is “tuned” using historical examples, with the model adjusted until it produces the correct answers. The system is trained on a sample of previous cases to learn how these variables have been combined by judges in the past. All of this builds an accurate model of the decision-making process a judge might use, and allows it to be applied to new cases, says Dr Zeleznikow. GetAid also makes inferences, but instead of working out what the courts will award the client, its intelligence lies in its ability to predict whether the client has a winnable case. Both systems are incredibly accurate, says Mr Calabro. Tests of GetAid, carried out by VLA, showed that when 500 past applications were fed into the system it gave the same result as the actual outcome 98% of the time. The remaining 2% were then re-examined and found to be borderline cases. All 14 of VLA's offices now use GetAid, and the Australian authorities are considering rolling it out in the country's other seven states and territories.
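The combination described here, an expert-system safeguard plus a model "tuned" on historical cases, can be caricatured as follows. This is a generic illustration with invented features and toy data, not JustSys's code or its real 94 variables.

```python
# A tiny model is tuned on historical cases until it reproduces past decisions,
# and a rule keeps a human in the loop for anything it would reject.

# Hypothetical features per application: (case_merit, financial_need); label: aid granted?
history = [((0.9, 0.8), 1), ((0.2, 0.9), 0), ((0.8, 0.3), 0),
           ((0.7, 0.9), 1), ((0.3, 0.2), 0), ((0.95, 0.7), 1)]

weights, bias, lr = [0.0, 0.0], 0.0, 0.1

def predict(features):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

for _ in range(500):                      # perceptron-style tuning on past cases
    for features, outcome in history:
        error = outcome - predict(features)
        weights = [w + lr * error * x for w, x in zip(weights, features)]
        bias += lr * error

print(all(predict(f) == o for f, o in history))   # True: it now mirrors the past rulings

def assess(features):
    # Safeguard from the text: the software may approve an application,
    # but never rejects one outright; doubtful cases go to a legal officer.
    return "approve" if predict(features) == 1 else "refer to a legal officer"

print(assess((0.9, 0.8)))    # approve
print(assess((0.2, 0.9)))    # refer to a legal officer
```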
“Smart software could make legal advice more readily available and rulings more consistent.”
Some may regard all this as too impersonal, but those people can probably continue to afford a human lawyer, says Dr Susskind. Most of the people on the receiving end of this technology are not getting any legal advice at all at the moment. Stuart Forsyth, a consultant for the American Bar Association's Futures Committee, points to a growing trend in America of people representing themselves in court. This happens in more than half of all domestic disputes and an even larger proportion of some other types of case. This is worrying, says Mr Forsyth, because these people are probably not doing a very good job for themselves. Internet-based legal-advice software could not only create a more level playing field but in doing so could also dramatically alter the nature of legal guidance, says Dr Susskind. Instead of being a one-to-one advisory service, it could become a one-to-many information service. Lawyers, of course, might not regard this as such a good thing. So it is not surprising that AI has traditionally been frowned upon within the legal profession.
Lawyer v computer
In the 1980s, a program designed to help lawyers interpret immigration law laid down by the British Nationality Act caused consternation among academics and lawyers alike. Shockingly, it could be used by lawyers and non-lawyers alike. Critics were worried that bypassing lawyers might pose a threat to democracy, because of the important role lawyers play in re-interpreting statutes laid down by Parliament, says Blay Whitby, an AI expert at the University of Sussex. “Any change to the status quo should be the subject of proper, informed democratic debate,” he says. Such concerns still linger, but attitudes seem to be shifting, says Mr Forsyth, as a new generation of more technology-savvy lawyers emerges. In 1999, a Texas court banned a basic self-help software package, Quicken Family Lawyer, on the grounds that the software was, in effect, practising law without a licence. Yet within 90 days this decision was overturned. This indicates a willingness among judges, at least, to tolerate the technology. Americans may like lawsuits, but they like technology even more.
One reason for optimism, suggests Dr Zeleznikow, is the way in which the programs are designed to be used. To have a machine making legal decisions about a person's welfare would be morally untenable in many situations, he says. So these days, programs are designed to have built-in safety checks to prevent them from overstepping this ethical line. For example, GetAid cannot reject applicants, but can only approve them: the rest are referred to a legal officer for reconsideration. Another example concerns the systems used by judges to help them in the complex and arcane process of sentencing. There is a real drive for sentencing to become more transparent and consistent, says Mr Forsyth. “People have great difficulty rationalising why one person gets one punishment, while someone else ends up with a lesser sentence,” he says. Some judges are already using software tools to address this issue, but these are mainly statistical packages which give judges nothing more than a sense of how similar convictions have been sentenced in the past, says Uri Schild, a computer scientist at Bar-Ilan University in Israel. However, these programs are now becoming more sophisticated. Dr Schild has developed a system that attempts to go one stage further, by considering not just the nature of the crime, but also the offender's previous conduct.
Magistrates and judges are often under considerable time constraints when working out sentences, and are unable to give detailed consideration to the offender's previous convictions. So Dr Schild's system evaluates an offender's record and creates a brief overview for the judge to peruse, including the number of previous offences, how serious they are, their frequency, and so on. For each category the program indicates how significant it is to the case in hand. Another program, from JustSys, appears to push things even further. The Sentencing Information System helps judges construct and record their arguments for deciding upon a sentence. The decisions still come from the judges, says Dr Zeleznikow, but the system helps them justify their decisions by mapping out their reasons.
People have to be kept in the loop because of accountability, says Dr Whitby. But the technology itself need not be feared as a new entity. On the contrary, the same AI techniques have been helping engineers and businesses for years, in fields from marketing to oil-drilling—and they would not have been so widely adopted if they did not work. The real issue is one of acceptance, he says.
None of these systems threatens to put lawyers and judges out of a job, nor is that the intention. They do things that people do at the moment, says Dr Zeleznikow, “but they could be quicker and cheaper”. What the systems still lack is the ability to exercise discretion, and that is not likely to change for the foreseeable future—so humans need not worry about losing their jobs to an army of robo-lawyers. But smart software has the potential to make legal advice more readily available, unnecessary court battles less frequent, and rulings more consistent. Surely not even a lawyer could argue with that.
Humans will add to AI’s limitations - slowing progress even more, but another AI winter is unlikely.
Generative AI: What is it good for?
The strengths and weaknesses behind the hype
Although AI is still in its infancy, some industries have been eager adopters. A close look at three of these—translation, customer service and sales—is broadly supportive of the optimistic shift among economists, though not without complications. Will AI boost the incomes of superstars more than those of stragglers, much as the internet revolution did? Or will it be a “great equaliser”, raising the incomes of the worst off but not those of high flyers? The answer may depend on the type of employment in question.
Generative artificial intelligence is the technology behind a wave of new online tools used by millions of people around the world. As the technology proliferates, so do concerns about the accuracy of information provided by these tools and how reliable they might be in safety-critical areas like finance and health care. Leveraging AI successfully depends significantly on people, especially in relation to company culture and skills. The flipside of AI disruption is new jobs elsewhere.
The technological limits of naive, fallible AI, in other words, will lead humans to impose additional political and social limits upon it. Clever algorithms will have to fit into a world that is full of humans, and, in theory at least, run by them. Artificial intelligence is both powerful and limited. As that realisation spreads, some of the dreams of high summer will fade in the autumnal chill.
What would humans do in a world of super-AI?
Consider areas where humans have an advantage in providing a good or service—call it a “human premium”. This premium would preserve demand for labour even in an age of super-advanced AI. One place where this might be true is in making private information public. So long as people are more willing to share their secrets with other people than machines, there will be a role for those who are trusted to reveal that information to the world selectively, ready for it then to be ingested by machines. Your correspondent would like to think that investigative journalists will still have jobs. The human premium might appear elsewhere, too. People value history, myths and meaning. Non-fungible tokens, for which provenance can be verified on a blockchain, are typically valued at many multiples more than images with identical pixels but a different history. In areas such as caregiving and therapy, humans derive value from others spending their scarce time with them, which adds feeling to an interaction. Artificial diamonds, which have the same molecular structure as those from the ground, trade at an enormous discount—around 70% by one estimate. In the future, items with a “made by a human” tag might be especially desirable.
People problems - If this premium is big enough, it could even weigh on growth. Divide the sectors of the economy into those with a large human premium and those without. If humans do not substitute machine-produced goods and services for those made by fellow humans, the Baumol effect would only deepen. Measured economic growth could even hit zero. Indeed, if extremely powerful AI failed to supercharge growth, it would suggest that the economy had already moved beyond materiality towards play, politics and areas where what people value most of all is interacting with others.
Perhaps one day AIs will produce entirely new goods and services that will outcompete the desire to please and interact with other humans. The manner in which such a contest played out would reveal something profound: just how much of a “social animal” is a human? ■
Generative AI is a marvel. Is it also built on theft? The wonder-technology faces accusations of copyright infringement.
Generative artificial intelligence (AI) has caused a creative explosion of new writing, music, images and video. The internet is alive with AI-made content, while markets fizz with AI-inspired investment. But some wonder how creative the technology really is—and whether those cashing in have fairly compensated those on whose work the models were trained. To those who hold the rights to these creative works, generative AI is an outrage—and perhaps an opportunity. A frenzy of litigation and dealmaking is under way, as rights-holders angle for compensation for providing the fuel on which the machines of the future are run. For the AI model-makers, it is an anxious period: “They have created an amazing edifice that’s built on a foundation of sand.”
AIs are trained on vast quantities of human-made work, from novels to photos and songs. These training data are broken down into “tokens”—numerical representations of bits of text, image or sound—and the model learns by trial and error how tokens are normally combined. Following a prompt from a user, a trained model can then make creations of its own. More and better training data means better outputs.
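A toy illustration of the tokenization step described here (real models use learned subword vocabularies with tens of thousands of entries, not a hand-written word list):

```python
# Text is mapped to numerical IDs, and the model learns which IDs tend to
# follow which; unknown words fall back to a catch-all token.

vocabulary = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4, ".": 5, "<unk>": 6}

def tokenize(text: str) -> list[int]:
    words = text.lower().replace(".", " .").split()
    return [vocabulary.get(w, vocabulary["<unk>"]) for w in words]

print(tokenize("The cat sat on the mat."))   # [0, 1, 2, 3, 0, 4, 5]
print(tokenize("The dog sat."))              # [0, 6, 2, 5]  ('dog' is out of vocabulary)
```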
Many AI companies have become cagey about what data their models are trained on, citing competitive confidentiality (and, their detractors suspect, fear of legal action). But it is widely acknowledged that, at least in their early stages, many hoovered up data that was subject to copyright. OpenAI’s past disclosures show that its GPT-3 model was trained on sources including the Common Crawl, a scraping of the open internet which includes masses of copyrighted data. Most of its rivals are thought to have taken a similar approach. The tech firms argue that there is nothing wrong with using others’ data simply to train their models. Absorbing copyrighted works and then creating original ones is, after all, what humans do. Those who own the rights say there is a difference. AI companies “are spending literally billions of dollars on computer chips and energy, but they’re unwilling to put a similar investment into content”.
Media companies were badly burned by an earlier era of the internet. Publishers’ advertising revenue drained away to search engines and social networks, while record companies’ music was illegally shared on applications like Napster. The content-makers are determined not to be caught out again. Publishers are blocking AI companies’ automated “crawlers” from scraping words from their websites: nearly half of the most popular news websites block OpenAI’s bots. Record companies have told music-streaming services to stop AI companies from scraping their tunes. There is widespread irritation that tech firms are again seeking forgiveness rather than permission. The lawyering is now happening. The biggest rights-holders in various creative industries are leading the charge. All tech firms deny wrongdoing.
Fair Use?
In America the tech companies are relying on the legal concept of fair use, which provides broad exemptions from the country’s otherwise-ferocious copyright laws. They have an encouraging precedent in the form of a ruling on Google Books in 2015. Then, the Authors Guild sued the search company for scanning copyrighted books without permission. But a court found that Google’s use of the material—making books searchable, but showing only small extracts—was sufficiently “transformative” to be considered fair use. Generative-AI firms argue that their use of the copyrighted material is similarly transformative. Rights-holders, meanwhile, are pinning their hopes on a Supreme Court judgment last year which tightened the definition of transformativeness, with its ruling that a series of artworks by Andy Warhol, which had altered a copyrighted photograph of Prince, a pop star, were insufficiently transformative to constitute fair use.
Not all media types enjoy equal protection. Copyright law covers creative expression, rather than ideas or information. This means that computer code, for example, is only thinly protected, since it is mostly functional rather than expressive. (A group of programmers are aiming to test this idea in court, claiming that Microsoft’s GitHub Copilot and OpenAI’s Codex infringed their copyright by training on their work.) News can be tricky to protect for the same reason: the information within a scoop cannot itself be copyrighted. Newspapers in America were not covered by copyright at all until 1909, notes Jeff Jarvis, a journalist and author. Before then, many employed a “scissors editor” to literally cut and paste from rival titles.
At the other end of the spectrum, image-rights holders are better protected. AI models struggle to avoid learning how to draw copyrightable characters—the “Snoopy problem”, referring to the cartoon beagle. Model-makers can try to stop their AIs drawing infringing images by blocking certain prompts, but they often fail. When prompted, Microsoft’s image creator, based on OpenAI’s DALL-E, happily drew images of “Captain America smoking a Marlboro” and “The Little Mermaid drinking Guinness”, despite lacking express permission from the brands in question. (Artists and organisations can report any concerns via an online form.)
Musicians are also on relatively strong ground: music copyright in America is strictly enforced, with artists requiring licences even for short samples. Perhaps for this reason, many AI companies have been cautious in releasing their music-making models. Outside America, the legal climate is mostly harsher for tech firms. The European Union, home to Mistral, a hot French AI company, has a limited copyright exception for data-mining, but no broad fair-use defence. Much the same is true in Britain, where Getty has brought its case against Stability AI, which is based in London (and had hoped to fight the lawsuit in America). Some jurisdictions offer safer havens. Israel and Japan, for instance, have copyright laws that are friendly for AI training. Tech companies hint at the potential threat to American business, should the country’s courts take a tough line. OpenAI says of its dispute with the New York Times that its use of copyrighted training data is “critical for US competitiveness”. Rights-holders bridle at the notion that America should lower its protections to the level of other jurisdictions just to keep the tech business around. One describes it as un-American. But it is one reason why the big cases may end up being decided in favour of the AI companies. Courts may rule that models should not have trained on certain data, or that they committed too much to memory. “But I don’t believe any US court is going to reject the big fair-use argument. Partly because I think it’s a good argument. And partly because, if they do, we’re just sending a great American industry to Israel or Japan or the EU.”
Copyrights, copywrongs - Licensing
While the lawyers sharpen their arguments, deals are being done. In some cases, suing is being used as leverage. “Lawsuits are negotiation by other means,” admits a party to one case. Even once trained, AIs need ongoing access to human-made content to stay up-to-date, and some rights-holders have done deals to keep them supplied with fresh material. OpenAI says it has sealed about a dozen licensing deals, with “many more” in the works. Partners so far include the Associated Press, Axel Springer (owner of Bild and Politico), Le Monde and Spain’s Prisa Media. Rupert Murdoch’s News Corp, which owns the Wall Street Journal and Sun among other titles, said in February that it was in “advanced negotiations” with unnamed tech firms. “Courtship is preferable to courtrooms—we are wooing, not suing.” Shutterstock, a photo library, has licensed its archive to both OpenAI and Meta, the social-media empire that is pouring resources into AI. Reddit and Tumblr, online forums, are reportedly licensing their content to AI firms as well.
Most rights-holders are privately pessimistic. A survey of media executives in 56 countries by the Reuters Institute found that 48% expected there to be “very little” money from AI licensing deals. Even the biggest publishers have not made a fortune. Axel Springer, which reported revenue of €3.9bn ($4.1bn) in 2022, will reportedly earn “tens of millions of euros” from its three-year deal with Openai. “There is not a big licensing opportunity. I don’t think the aim of [the AI models] is to provide alternatives to news,” says Alice Enders of Enders Analysis, a media-research firm. The licensing deals on offer are “anaemic”. “When companies are…saying, ‘We don’t need to license this content, we have full rights to scrape it,’ it diminishes their motivations to come together and negotiate fair economics.” Some owners of copyrighted material are therefore going it alone.
Specialised AI
Getty last year launched its own generative AI, in partnership with Nvidia, a chipmaker. Getty’s image-maker has been trained only on Getty’s own library, making it “commercially safe” and “worry-free”, the company promises. It plans to launch an AI video-maker this year, powered by Nvidia and Runway, another AI firm. As well as removing copyright risk, Getty has weeded out anything else that could get its customers into trouble with IP lawyers: brands, personalities and many less obvious things, from tattoo designs to firework displays. Only a small percentage of Getty’s subscribers have tried out the tools so far, the firm admits. But it hopes that recurring revenue from the service will eventually exceed the “one-time royalty windfall” of a licensing deal.
A number of news publishers have reached a similar conclusion. Bloomberg said last year that it had trained an AI on its proprietary data and text. Schibsted, a big Norwegian publisher, is leading an effort to create a Norwegian-language model, using its content and that of other media companies. Others have set up chatbots. Last month the Financial Times unveiled Ask FT, which lets readers interrogate the paper’s archive. The San Francisco Chronicle’s Chowbot, launched in February, lets readers seek out the city’s best tacos or clam chowder, based on the paper’s restaurant reviews. The BBC said last month that it was exploring developing AI tools around its 100-year archive “in partnership or unilaterally”. Most big publications are experimenting behind the scenes. It is too early to say if audiences will take to such formats. Specialised AI tools may also find it hard to compete with the best generalist ones. OpenAI’s ChatGPT outperforms Bloomberg’s AI even on finance-specific tasks.
Licensing content to tech firms has its own risks
Rights-holders “have to be thinking very hard about the degree to which this is being used to train their replacements”. The new questions raised by AI may lead to new laws. “We’re stretching current laws about as far as they can go to adapt to this,” says James Grimmelmann, a law professor at Cornell University. Tennessee last month passed the Ensuring Likeness Voice and Image Security (ELVIS) Act, banning unauthorised deepfakes in the state. But Congress seems more likely to let the courts sort it out. Some European politicians want to tighten up the law in favour of rights-holders; the EU’s directive on digital copyright was passed in 2019, before generative AI had taken off. “There is no way the Europeans would pass such a directive today.” Another question is whether copyright will extend to AI-made content.
So far judges have been of the view that works created by AI are not themselves copyrightable. In August an American federal court ruled that “human authorship is a bedrock requirement of copyright”, dismissing a request by a computer scientist to copyright a work of art he had created using AI. This may change as AIs create a growing share of the world’s content. It took several decades of photography for courts to recognise that the person who took a picture could claim copyright over the image. The current moment recalls a different legal case earlier this century. A wildlife photographer tried to claim copyright over photographs that macaque monkeys had taken of themselves, using a camera he had set up in an Indonesian jungle. A judge ruled that because the claimant had not taken the photos himself, no one owned the copyright. (A petition by an animal-rights group to grant the right to the monkeys was dismissed.)
Generative AI promises to fill the world with content that lacks a human author, and therefore has no copyright protection. ■
Might future law-school graduates look to machines rather than the judges, rules and standards that have underpinned the legal system?
Handing complicated tasks to computers is not new. But a recent spurt of progress in machine learning, a subfield of artificial intelligence (AI), has enabled computers to tackle many problems which were previously beyond them. The result has been an AI boom, with computers moving into everything from medical diagnosis and insurance to self-driving cars. There is a snag, though. Machine learning works by giving computers the ability to train themselves, which adapts their programming to the task at hand. People struggle to understand exactly how those self-written programs do what they do. When algorithms are handling trivial tasks, such as playing chess or recommending a film to watch, this “black box” problem can be safely ignored. When they are deciding who gets a loan, whether to grant parole or how to steer a car through a crowded city, it is potentially harmful. And when things go wrong—as, even with the best system, they inevitably will—then customers, regulators and the courts will want to know why. For some people this is a reason to hold back AI.
France’s digital-economy minister, Mounir Mahjoubi, has said that the government should not use any algorithm whose decisions cannot be explained. But that is an overreaction. Despite their futuristic sheen, the difficulties posed by clever computers are not unprecedented. Society already has plenty of experience dealing with problematic black boxes; the most common are called human beings. Adding new ones will pose a challenge, but not an insuperable one. In response to the flaws in humans, society has evolved a series of workable coping mechanisms, called laws, rules and regulations. With a little tinkering, many of these can be applied to machines as well.
Be open-minded
Start with human beings. They are even harder to understand than a computer program. When scientists peer inside their heads, using expensive brain-scanning machines, they cannot make sense of what they see. And although humans can give explanations for their own behaviour, they are not always accurate. It is not just that people lie and dissemble. Even honest humans have only limited access to what is going on in their subconscious mind. The explanations they offer are more like retrospective rationalisations than summaries of all the complex processing their brains are doing. Machine learning itself demonstrates this. If people could explain their own patterns of thought, they could program machines to replicate them directly, instead of having to get them to teach themselves through the trial and error of machine learning.
Away from such lofty philosophy, humans have worked with computers on complex tasks for decades. As well as flying aeroplanes, computers watch bank accounts for fraud and adjudicate insurance claims. One lesson from such applications is that, wherever possible, people should supervise the machines. For all the jokes, pilots are vital in case something happens that is beyond the scope of artificial intelligence. As computers spread, companies and governments should ensure the first line of defence is a real person who can overrule the algorithms if necessary.
Even when people are not “in the loop”, as with entirely self-driving cars, today’s liability laws can help. Courts may struggle to assign blame when neither an algorithm nor its programmer can properly account for its actions. But it is not necessary to know exactly what went on in a brain—of either the silicon or biological variety—to decide whether an accident could have been avoided. Instead courts can ask the familiar question of whether a different course of action might have reasonably prevented the mistake. If so, liability could fall back onto whoever sold the product or runs the system. There are other worries. A machine trained on old data might struggle with new circumstances, such as changing cultural attitudes. There are examples of algorithms which, after being trained by people, end up discriminating over race and sex.
But the choice is not between prejudiced algorithms and fair-minded humans. It is between biased humans and the biased machines they create. A racist human judge may go uncorrected for years. An algorithm that advises judges might be applied to thousands of cases each year. That will throw off so much data that biases can rapidly be spotted and fixed.
AI is bound to suffer some troubles—how could it not? But it also promises extraordinary benefits and the difficulties it poses are not unprecedented. People should look to the data, as machines do. Regulators should start with a light touch and demand rapid fixes when things go wrong. If the new black boxes prove tricky, there will be time to toughen the rules.
Humans are inscrutable too. Existing rules and regulations can apply to artificial intelligence
The idea had been sketched in the first draft of an academic paper, written in 2015 by two professors, one at the University of Chicago, the other at the University of Toronto. They envisaged machines able to assemble data and produce predictive outcomes, and then distribute these everywhere, instantly—turning rules and standards upside down and replacing them with micro-directives that were more responsive to circumstances, and more rational.
One of the paper’s co-authors had gone so far as to join a startup combining law and machine learning to provide answers about complex areas of tax, such as how to determine whether a person is an employee or an independent contractor, or whether an expenditure should be expensed immediately or depreciated—murky stuff that even tax authorities preferred to leave to machines.
That was novel in 2016. Each year since then the practice had expanded. Students aspiring to work in investment management now routinely used machines to assess whether a shareholder in a firm sold through a leveraged buy-out would be retrospectively liable for a “fraudulent transfer” if the company subsequently collapsed—a risk that had been hard to manage because it was so hard to measure.
The entire world of negligence had been transformed. If you lived in a remote location, it was fine to install a swimming pool. If a child moved in nearby, a computer sent out a notification that the pool had become an “attractive nuisance” and that a fence should be built immediately. The physical topography may not have changed, but the legal one had. Criminal law once revolved around externally observed facts.
Then DNA evidence entered the picture. Now, cases often hinged on data about pulse rates, intoxication and location, drawn from the wristbands that replaced watches. It was much fairer—but creepy, because the facts came from perpetual monitoring.
A formula for justice
The most important introductory course faced by Sonia and her classmates had long ceased to be about contracts or procedure; it was now algorithms and the law. One student melded data on work attendance, high-school grades, standardised tests and documented preferences in music into a program that states could use to determine an individualised age of consent for sex and alcohol. That student seemed the most likely to have a portrait added to the library wall—the first of many that would replace those of old judges, who had somehow gained fame for making decisions that now seemed hopelessly devoid of data.
Could it help boost human capital, and ultimately growth?
New technology brings with it both the sweet hope of greater prosperity and the cruel fear of missing out. Satya Nadella, the boss of Microsoft, says he is haunted by the fact that the Industrial Revolution left behind India, his country of birth. (Indian manufacturers hardly enjoyed a level playing-field—Britain was then both their rival and their ruler.) Many technologies, such as online-education courses, have generated more hype than economic growth in the emerging world. Some people worry that generative artificial intelligence (AI), too, will disappoint the global south. The big winners so far seem to be a bunch of Western early adopters, as well as startups in San Francisco and America’s “magnificent seven” tech firms, which include Microsoft and have together added an astonishing $4.6trn to their market value since ChatGPT’s launch in November 2022.
Yet AI stands to transform lives in the emerging world, too. As it spreads, the technology could raise productivity and shrink gaps in human capital faster than many before it. People in developing countries need not be passive recipients of AI, but can shape it to suit their own needs. Most exciting of all, it could help income levels catch up with those in the rich world.
The promise of AI in developing countries is tantalising. As in the West, it will be a useful all-purpose tool for consumers and workers, making it easier to obtain and interpret information. Some jobs will go, but new ones will be created. Because emerging countries have fewer white-collar workers, the disruption and the gain to existing firms may be smaller than in the West. The IMF says that a fifth to a quarter of workers there are most exposed to replacement, compared with a third in rich countries.
But a potentially transformative benefit may come from better and more accessible public services. Developing economies have long been held back by a lack of educated, healthy workers. Primary-school teachers in India have twice as many pupils as their American counterparts, but are ill-equipped for the struggle. Doctors in Africa are scarce; properly trained ones are scarcer. Whole generations of children grow up badly schooled, in poor health and unable to fulfil their potential in an increasingly global labour market.
Entrepreneurs around the world are exploring ways that AI can help. India is combining large language models with speech-recognition software to enable illiterate farmers to ask a bot how to apply for government loans. Pupils in Kenya will soon be asking a chatbot questions about their homework, and the chatbot will be tweaking and improving its lessons in response. Researchers in Brazil are testing a medical AI that helps undertrained primary-care workers treat patients. Medical data collected worldwide and fed into AIs could help improve diagnosis. If AI can make people in poorer countries healthier and better educated, it should in time also help them catch up with the rich world.
Pleasingly, these benefits could spread faster than earlier waves of technology. New technologies invented in the early 20th century took more than 50 years to reach most countries. By contrast, AI will spread through the gadget that many people across the emerging world already have, and many more soon will: the phone in their pockets. In time, chatbots will become much cheaper to provide and acquire.
Moreover, the technology can be tailored to local needs. So far there is little sign that AI is ruled by the winner-takes-all effects that benefited America’s social-media and internet-search firms. That means a variety of approaches could prosper. Some developers in India are already taking Western models and fine-tuning them with local data to provide a whizzy language-translation service, avoiding the heavy capital costs of model-building.
Another idea that is also taking off in the West is to build smaller, cheaper models of your own. A narrower set of capabilities, rather than the ability to get every bit of information under the sun, can suit specific needs just fine. A medical AI is unlikely to need to generate amusing limericks in the style of William Shakespeare, as ChatGPT does so successfully. This still requires computing power and bespoke data sets. But it could help adapt AI in more varied and useful ways.
Some countries are already harnessing AI. China’s prowess is second only to America’s, thanks to its tech know-how and the deep pockets of its internet giants. India’s outsourcing industry could be disrupted, as some back-office tasks are taken on by generative AI. But it is home to a vibrant startup scene, as well as millions of tech developers and a government that is keen to use AI to improve its digital infrastructure. These leave it well placed to innovate and adapt. Countries in the Gulf, such as the United Arab Emirates and Saudi Arabia, are determined to build an AI industry as they shift from oil. They already have the capital and are importing the talent.
Each country will shape the technology in its own way. Chinese chatbots have been trained to keep off the subject of Xi Jinping; India’s developers are focused on lowering language barriers; the Gulf is building an Arabic large language model. Though the global south will not dislodge America’s crown, it could benefit widely from all this expertise.
Plenty could yet go wrong, obviously. The technology is still evolving. Computing power could become too expensive; local data will need to be gathered and stored. Some practitioners may lack the ability to take advantage of the knowledge at their fingertips, or the incentive to try new things. Although countries in sub-Saharan Africa stand to gain the most from improvements to human capital and government services, the technology will spread more slowly there than elsewhere without better connectivity, governance and regulation.
The good news is that investments to speed AI’s diffusion will be richly rewarded. Much about the AI revolution is still uncertain, but there is no doubt that the technology will have many uses and that it will only get better. Emerging countries have suffered disappointments before. This time they have a wonderful opportunity—and the power to seize it. ■
At present two families of AI model dominate: large language models (LLMs) for text and diffusion models for images. What's next?
Most AI relies on a neural network, trained on massive amounts of information—text, images and the like—relevant to how it will be used. Through much trial and error the weights of connections between simulated neurons are tuned on the basis of these data, akin to adjusting billions of dials until the output for a given input is satisfactory.
There are many ways to connect and layer neurons into a network. A series of advances in these architectures has helped researchers build neural networks which can learn more efficiently and which can extract more useful findings from existing datasets, driving much of the recent progress in AI. Most of the current excitement has been focused on two families of models: large language models (LLMs) for text, and diffusion models for images. These are deeper (ie, have more layers of neurons) than what came before, and are organised in ways that let them churn quickly through reams of data.
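To make that dial-adjusting image concrete, here is a minimal illustrative sketch in Python (not drawn from any production system) that tunes a single weight by trial and error: nudge the dial, keep the nudge if the prediction error shrinks, and try finer nudges otherwise. Real networks adjust billions of weights at once using calculus rather than one-at-a-time guessing, and the toy data below are invented for the example.

# Toy example: learn y = 3*x from a few examples by nudging one "dial" (weight).
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]   # (input, desired output) pairs

def error(w):
    # Mean squared gap between the model's predictions (w * x) and the targets (y)
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

w = 0.5            # initial setting of the dial
step = 0.1         # how far to nudge it each time
for _ in range(1000):
    for nudge in (step, -step):
        if error(w + nudge) < error(w):   # keep a nudge only if it reduces the error
            w += nudge
            break
    else:
        step *= 0.5                       # no nudge helped, so make finer adjustments

print(round(w, 3))                        # settles close to 3.0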
LLMs—such as GPT, Gemini, Claude and Llama—are all built on the so-called transformer architecture. Introduced in 2017 by Ashish Vaswani and his colleagues at Google Brain, transformers rest on the key principle of “attention”. An attention layer allows a model to learn how multiple aspects of an input—such as words at certain distances from each other in text—are related to each other, and to take that into account as it formulates its output. Many attention layers in a row allow a model to learn associations at different levels of granularity—between words, phrases or even paragraphs.
This approach is also well suited to implementation on graphics-processing unit (GPU) chips, which has allowed these models to scale up and has, in turn, ramped up the market capitalisation of Nvidia, the world’s leading GPU-maker. Transformer-based models can generate images as well as text. The first version of DALL-E, released by OpenAI in 2021, was a transformer that learned associations between groups of pixels in an image, rather than words in a text. In both cases the neural network is translating what it “sees” into numbers and performing maths (specifically, matrix operations) on them.
But transformers have their limitations. They struggle to learn consistent world-models. For example, when fielding a human’s queries they will contradict themselves from one answer to the next, without any “understanding” that the first answer makes the second nonsensical (or vice versa), because they do not really “know” either answer—just associations of certain strings of words that look like answers. And as many now know, transformer-based models are prone to so-called “hallucinations”, in which they make up plausible-looking but wrong answers, along with citations to support them. Similarly, the images produced by early transformer-based models often broke the rules of physics and were implausible in other ways (which may be a feature for some users, but was a bug for designers who sought to produce photo-realistic images). A different sort of model was needed.
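Before turning to those other models, the “attention” mechanism described above can be illustrated with a short sketch. The Python snippet below computes scaled dot-product attention for a tiny sequence of made-up token vectors using numpy; the dimensions are arbitrary, and learned projections, multiple heads and training are all omitted, so this is a simplification rather than the code inside any real LLM.

import numpy as np

def attention(Q, K, V):
    # Each position scores every other position for relevance, turns the scores
    # into weights with a softmax, and returns a weighted blend of the values.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
seq_len, dim = 4, 8                   # a "sentence" of 4 tokens, each an 8-number vector
x = rng.normal(size=(seq_len, dim))   # stand-in for token embeddings
out = attention(x, x, x)              # self-attention: queries, keys and values all come from x
print(out.shape)                      # (4, 8): one updated vector per token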
Enter diffusion models, which are capable of generating far more realistic images. The main idea for them was inspired by the physical process of diffusion. If you put a tea bag into a cup of hot water, the tea leaves start to steep and the colour of the tea seeps out, blurring into clear water. Leave it for a few minutes and the liquid in the cup will be a uniform colour. The laws of physics dictate this process of diffusion. Much as you can use the laws of physics to predict how the tea will diffuse, you can also reverse-engineer this process—to reconstruct where and how the tea bag might first have been dunked. In real life the second law of thermodynamics makes this a one-way street; one cannot get the original tea bag back from the cup. But learning to simulate that entropy-reversing return trip makes realistic image-generation possible.
Training works like this. You take an image and apply progressively more blur and noise, until it looks completely random. Then comes the hard part: reversing this process to recreate the original image, like recovering the tea bag from the tea. This is done using “self-supervised learning”, similar to how LLMs are trained on text: covering up words in a sentence and learning to predict the missing words through trial and error. In the case of images, the network learns how to remove increasing amounts of noise to reproduce the original image. As it works through billions of images, learning the patterns needed to remove distortions, the network gains the ability to create entirely new images out of nothing more than random noise.
Most state-of-the-art image-generation systems use a diffusion model, though they differ in how they go about “de-noising” or reversing distortions. Stable Diffusion (from Stability AI) and Imagen, both released in 2022, used variations of an architecture called a convolutional neural network (CNN), which is good at analysing grid-like data such as rows and columns of pixels. CNNs, in effect, move small sliding windows up and down across their input looking for specific artefacts, such as patterns and corners. But though CNNs work well with pixels, some of the latest image-generators use so-called diffusion transformers, including Stability AI’s newest model, Stable Diffusion 3. Once trained on diffusion, transformers are much better able to grasp how various pieces of an image or frame of video relate to each other, and how strongly or weakly they do so, resulting in more realistic outputs (though they still make mistakes).
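The training recipe just described—add noise step by step, then learn to undo it—can be sketched in a few lines. In the illustrative Python below, the forward “noising” schedule is genuine in spirit, but the denoiser is a placeholder standing in for the neural network that would actually be trained, and the image, schedule values and shapes are made up for the example.

import numpy as np

rng = np.random.default_rng(0)
T = 1000                                  # number of noising steps
betas = np.linspace(1e-4, 0.02, T)        # how much noise to add at each step
alphas_bar = np.cumprod(1.0 - betas)      # how much of the original signal survives after t steps

def add_noise(image, t):
    # Forward process: blend the image with Gaussian noise; larger t means more noise.
    noise = rng.normal(size=image.shape)
    noisy = np.sqrt(alphas_bar[t]) * image + np.sqrt(1.0 - alphas_bar[t]) * noise
    return noisy, noise

def denoiser(noisy, t):
    # Placeholder for the network that would be trained to predict the noise that was added.
    return np.zeros_like(noisy)

image = rng.uniform(size=(8, 8))          # a toy 8x8 "image"
t = int(rng.integers(0, T))               # pick a random point in the schedule
noisy, true_noise = add_noise(image, t)
loss = np.mean((denoiser(noisy, t) - true_noise) ** 2)   # training would push this towards zero
print(t, round(float(loss), 4))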
Recommendation systems are another kettle of fish. It is rare to get a glimpse at the innards of one, because the companies that build and use recommendation algorithms are highly secretive about them. But in 2019 Meta, then Facebook, released details about its deep-learning recommendation model (DLRM). The model has three main parts.
First, it converts inputs (such as a user’s age or “likes” on the platform, or content they consumed) into “embeddings”. It learns in such a way that similar things (like tennis and ping pong) are close to each other in this embedding space. The DLRM then uses a neural network to do something called matrix factorisation. Imagine a spreadsheet where the columns are videos and the rows are different users. Each cell says how much each user likes each video. But most of the cells in the grid are empty. The goal of recommendation is to make predictions for all the empty cells. One way a DLRM might do this is to split the grid (in mathematical terms, factorise the matrix) into two grids: one that contains data about users, and one that contains data about the videos. By recombining these grids (or multiplying the matrices) and feeding the results into another neural network for more number-crunching, it is possible to fill in the grid cells that used to be empty—ie, predict how much each user will like each video.
The same approach can be applied to advertisements, songs on a streaming service, products on an e-commerce platform, and so forth. Tech firms are most interested in models that excel at commercially useful tasks like this. But running these models at scale requires extremely deep pockets, vast quantities of data and huge amounts of processing power.
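The spreadsheet analogy above is, in mathematical terms, matrix factorisation. The toy Python sketch below learns a small vector for each user and each video so that their dot products match the known cells, then uses those vectors to fill in the empty ones. The ratings, sizes and learning rate are invented for illustration, and a real system such as Meta’s DLRM adds neural networks on top of this step.

import numpy as np

# Known "likes" (rows are users, columns are videos); nan marks the empty cells to predict.
R = np.array([[5.0, 3.0, np.nan, 1.0],
              [4.0, np.nan, np.nan, 1.0],
              [1.0, 1.0, np.nan, 5.0],
              [np.nan, 1.0, 5.0, 4.0]])
known = ~np.isnan(R)

rng = np.random.default_rng(0)
k = 2                                             # length of each embedding vector
U = rng.normal(scale=0.1, size=(R.shape[0], k))   # one vector per user
V = rng.normal(scale=0.1, size=(R.shape[1], k))   # one vector per video

lr = 0.05
for _ in range(2000):
    pred = U @ V.T                                # recombine the two grids
    err = np.where(known, R - pred, 0.0)          # measure error only where ratings exist
    U += lr * err @ V                             # nudge the embeddings to shrink the error
    V += lr * err.T @ U

print(np.round(U @ V.T, 1))                       # the once-empty cells now hold predictions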
In academic contexts, where datasets are smaller and budgets are constrained, other kinds of models are more practical. These include recurrent neural networks (for analysing sequences of data), variational autoencoders (for spotting patterns in data), generative adversarial networks (where one model learns to do a task by repeatedly trying to fool another model) and graph neural networks (for predicting the outcomes of complex interactions). Just as deep neural networks, transformers and diffusion models all made the leap from research curiosities to widespread deployment, features and principles from these other models will be seized upon and incorporated into future AI models. Transformers are highly efficient, but it is not clear that scaling them up can solve their tendencies to hallucinate and to make logical errors when reasoning. The search is already under way for “post-transformer” architectures, from “state-space models” to “neuro-symbolic” AI, that can overcome such weaknesses and enable the next leap forward. Ideally such an architecture would combine attention with greater prowess at reasoning. Right now no human yet knows how to build that kind of model. Maybe someday an AI model will do the job.
What Do the Gods of Generative AI Have in Store for 2025?
OpenAI released the world’s first ‘reasoning model’, o1, in September 2024. It uses a ‘chain of thought’ (CoT), mimicking the slow, deliberate ‘System 2’ style of human thinking. It answers difficult questions by breaking problems down into their constituent steps and testing various approaches behind the scenes before presenting a conclusion to the user.
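In spirit, ‘reasoning’ of this sort means spending extra computation at answer time: generate candidate step-by-step solutions, check them, and only then commit to an answer. The toy Python sketch below illustrates that generate-and-verify loop on a deliberately trivial puzzle; it is not how o1 works internally, which OpenAI has not disclosed, and the candidate generator here simply enumerates possibilities rather than sampling from a language model.

import itertools

# Toy "reasoning" task: find digits a and b such that a * b == 36 and a + b == 13.

def candidate_chains():
    # Each candidate comes with its intermediate steps spelled out, chain-of-thought style.
    for a, b in itertools.product(range(10), repeat=2):
        steps = [f"try a={a}, b={b}", f"check a*b = {a * b}", f"check a+b = {a + b}"]
        yield steps, (a, b)

def verify(pair):
    a, b = pair
    return a * b == 36 and a + b == 13

for steps, answer in candidate_chains():
    if verify(answer):                  # test approaches behind the scenes...
        print("\n".join(steps))         # ...and only then present the working and the conclusion
        print("answer:", answer)
        break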
o1’s unveiling set off a race to replicate the method. In December 2024 Google introduced Gemini 2.0, including Gemini 2.0 Flash and a reasoning variant, ‘Gemini 2.0 Flash Thinking’, alongside its agentic AI prototypes Astra and Mariner. OpenAI responded within days by previewing o3, a successor to o1, and releasing Sora, a video-generation tool, and Canvas, a workspace for writing and coding.
Then came competition. China’s AI industry, backed by regulatory reforms and a wave of returning ‘sea turtles’ (a term for Chinese scientists who have studied and worked abroad), has nearly caught up with America’s. It is now more open and more efficient as well. When OpenAI released o1 in September 2024, Alibaba responded within three months with QwQ (‘Qwen with Questions’), a version of its Qwen chatbot incorporating similar reasoning capabilities. Meanwhile, another Chinese firm, DeepSeek, previewed its own reasoning model, R1, just a week before Alibaba’s launch. Despite Uncle Sam’s efforts to curb China’s AI ambitions, these firms had reduced America’s technological lead to a matter of weeks. Then, on January 20, 2025, DeepSeek disrupted the scene again by releasing R1 in full as an open-source model, built on its frugally trained V3 base model—a cost-efficient breakthrough that reshaped the economics of AI.
In response to this competitive pressure, OpenAI released ChatGPT ‘Operator’, an agent that can use a web browser on a user’s behalf, on January 24, 2025, and ‘Deep Research’, an agent that compiles lengthy, cited research reports, in February 2025. Both were made available to Pro subscribers for $200 per month, hinting at further gains in natural language understanding, speed and autonomous task execution. These innovations have fueled the prospect of AI agents interacting seamlessly with the web. More advanced models, including GPT-5, are expected later in 2025.
As AI models become increasingly sophisticated, computer scientists are seeking more efficient ways to train them. Training AI involves feeding it data with some sections hidden; the model then makes a guess about the missing content. If the guess is incorrect, a mathematical process called ‘backpropagation’ tweaks the model so that it makes a slightly better prediction next time. The challenge arises when training is done ‘in parallel’, with multiple GPUs (such as the RTX 40 SUPER series, the H200 and the Blackwell GeForce RTX 50) working on backpropagation simultaneously. After each step these GPUs must share data about the changes they have made—a communication step known as synchronisation. For large-scale training runs, nearly half the time can be spent on this communication alone. To address the bottleneck, Google DeepMind engineers have proposed DiLoCo (‘Distributed Low-Communication’ training), which reduces how often the GPUs must synchronise.
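The trade-off DiLoCo exploits—more local work per GPU, fewer rounds of communication—can be caricatured in a few lines. The illustrative Python below simulates two ‘workers’ fitting the same toy model on their own data shards, averaging their weights either after every step or only every 20 steps. Everything here (the model, the data, the numbers) is invented for the sketch, and the real method involves considerably more machinery, such as a separate outer optimiser.

import numpy as np

rng = np.random.default_rng(0)
true_w = 3.0
# Each worker holds its own shard of (x, y) data drawn from y = 3x plus noise.
shards = []
for _ in range(2):
    x = rng.normal(size=200)
    shards.append((x, true_w * x + 0.1 * rng.normal(size=200)))

def local_step(w, x, y, lr=0.05):
    grad = np.mean(2.0 * (w * x - y) * x)   # backpropagation collapses to a single derivative here
    return w - lr * grad

def train(sync_every, total_steps=100):
    weights = [0.0, 0.0]                    # one copy of the model per worker (GPU)
    rounds = 0
    for step in range(1, total_steps + 1):
        weights = [local_step(w, x, y) for w, (x, y) in zip(weights, shards)]
        if step % sync_every == 0:          # communication round: average the two copies
            avg = sum(weights) / len(weights)
            weights = [avg, avg]
            rounds += 1
    return sum(weights) / len(weights), rounds

for h in (1, 20):                           # synchronise every step vs. every 20 steps
    w, rounds = train(sync_every=h)
    print(f"sync every {h:>2} steps: w = {w:.3f}, communication rounds = {rounds}")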
Silicon Valley sees agentic AI (‘intelligent agents’), which shift from chat-based interactions to action-oriented AI, as a major breakthrough for 2025. However, each innovation has faced its own challenges related to data reliability, trust, and high operational costs. This sets the stage for an intensely competitive yet complex AI landscape in 2025. Organizations still focused on earlier waves of AI must now explore agentic AI to remain relevant. This transition isn’t merely about adopting new technology—it requires reimagining how work is done.
As we stand on the cusp of this third wave of AI, one thing is clear: 2025 is a pivotal year for agentic AI. The future of work will be shaped by our ability to collaborate effectively with AI agents. Those who embrace this transformation early will be best positioned to harness its potential, fostering more efficient, innovative, and productive workplaces where humans and AI agents work together seamlessly.
On a quest for the scientific ideas that have led to the current moment in AI. We get behind the hype, buzzwords and jargon to explore eight ideas that explain the story of generative AI.
Part One: What is intelligence? In the middle of the 20th century, the inner workings of the human brain inspired computer scientists to build the first “thinking machines”. But how does human intelligence actually relate to the artificial kind?
Part Two: How do machines learn? Learning is fundamental to artificial intelligence. It’s how computers can recognise speech or identify objects in images. But how can networks of artificial neurons be deployed to find patterns in data, and what is the mathematics that makes it all possible?
Part Three: What made AI take off? A decade ago many computer scientists were focused on building algorithms that would allow machines to see and recognise objects. In doing so they hit upon two innovations—big datasets and specialised computer chips—that quickly transformed the potential of artificial intelligence. How did the growth of the World Wide Web and the design of 3D arcade games create a turning point for AI?
Part Four: What made AI models generative? In 2022, it seemed as though the much-anticipated AI revolution had finally arrived. Large language models swept the globe, and deepfakes were becoming ever more pervasive. Underneath it all were old algorithms that had been taught some new tricks. Suddenly, artificial intelligence seemed to have the skill of creativity. Generative AI had arrived and promised to transform…everything.