The Internet is best understood as a tall, multi-layered cake, each layer depending entirely on the foundation below it. At the bottom are physical cables, then communication protocols, then addressing systems. On top of these sits the hosting layer with all the cloud infrastructure, and finally, at the summit, are the service applications we use daily — websites, mobile apps, social platforms and, increasingly, AI/ML services — all entirely dependent on the layers below.
Every time a new layer emerged or matured, markets treated it as a gold rush—and a bubble followed.
The internet's layered architecture (loosely inspired by the OSI model)
1. Physical Layer - The Actual Cables: At the base are undersea links and terrestrial conduits carrying fibre-optic cables—thin glass strands that shoot light pulses, carrying data in packets across oceans and continents—the physical highways of global data. This wiring layer fuelled the late-1990s–2001 fibre/telecom bubble: massive overinvestment in fibre infrastructure led to overcapacity, and companies like WorldCom collapsed when demand failed to materialise as quickly as projected.
2. Internet Protocol and Language Layer - This is where TCP/IP lives: Above the cables sit open standards—TCP/IP, routing mechanisms and their open-source implementations—that let devices “converse in the same language”. These are the fundamental rules for how data packets move around. Next comes the addressing tier: systems such as DNS, IP addresses and URLs that translate human-friendly names into numerical network addresses so packets can find their way through the network. This layer was built mostly by unpaid developers, academics and global standards bodies rather than corporations. The Linux Foundation maintains much of the open-source software that implements these protocols, but no single entity owns them. This layer never produced a traditional financial bubble, because it consists of open standards freely available to all. It did, however, enable every bubble that followed.
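The addressing tier can be illustrated with a toy sketch of hierarchical name resolution: a hypothetical zone tree, walked the way a real resolver queries root, TLD and authoritative servers in turn. The names and the address below are illustrative, not real DNS records.

```python
# Toy zone hierarchy: root knows the TLDs, the TLD knows the domains,
# the authoritative zone knows the hosts. Entries are invented.
ZONES = {
    ".": {"com": "tld-server"},
    "com": {"example": "authoritative-server"},
    "example": {"www": "93.184.216.34"},
}

def resolve(name: str) -> str:
    """Resolve a name by walking the zone hierarchy right-to-left."""
    labels = name.rstrip(".").split(".")   # 'www.example.com' -> ['www','example','com']
    zone = "."
    answer = None
    for label in reversed(labels):         # com -> example -> www
        answer = ZONES[zone][label]        # each zone delegates to the next
        zone = label
    return answer                          # the final answer is the address

print(resolve("www.example.com"))          # → 93.184.216.34
```

Real resolvers add caching, timeouts and many record types, but the delegation walk is the essential shape.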
3. Security and Identity Layer - Between protocols and applications sits a critical but often invisible layer: the trust infrastructure that makes secure communication possible. It includes authentication systems (OAuth, SAML) that verify who you are; certificate authorities that issue the SSL/TLS certificates behind the padlock icon in your browser; encryption protocols that scramble data in transit; and identity-management systems that control access to resources. Without this layer, online banking, e-commerce and private communication would be impossible. It is the silent guardian that prevents your credit card from being stolen every time you shop online.
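One ingredient of this trust layer, message integrity, can be sketched with a keyed hash: a shared secret plus an HMAC tag proves a message was not tampered with in transit. Real TLS layers this with certificates and asymmetric key exchange; this is only a minimal sketch of the integrity check, with an invented key and messages.

```python
import hashlib
import hmac

SECRET = b"shared-session-key"   # in real TLS, agreed during the handshake

def sign(message: bytes) -> bytes:
    """Produce an authentication tag for the message."""
    return hmac.new(SECRET, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """Check the tag; compare_digest avoids timing side channels."""
    return hmac.compare_digest(sign(message), tag)

msg = b"transfer $10 to Alice"
tag = sign(msg)
print(verify(msg, tag))                          # untampered → True
print(verify(b"transfer $10 to Mallory", tag))   # tampered   → False
```

Flip a single byte of the message and the tag no longer matches, which is exactly the guarantee the padlock icon quietly relies on.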
4. Transport and Application Protocol Layer: This layer defines how web browsers, email clients and apps communicate, and it is the base on which websites and services are built. The 1995–2000 dot-com bubble was driven by new web businesses built on HTTP.
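What "built on HTTP" means at the wire level is surprisingly plain: a request is just structured text. This sketch builds an HTTP/1.1 GET by hand and parses a canned response, with no network involved.

```python
# An HTTP/1.1 request is plain text: a request line, headers,
# then a blank line marking the end of the headers.
request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: close\r\n"
    "\r\n"
)

# A canned response, as a server might send it back.
canned_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)

# Headers and body are separated by the first blank line.
head, body = canned_response.split("\r\n\r\n", 1)
status_line = head.split("\r\n")[0]
print(status_line)   # → HTTP/1.1 200 OK
print(body)          # → <html><body>Hello</body></html>
```

Every dot-com era website ultimately reduced to exchanges of this shape.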
5. Server (Hosting) and Compute Abstraction Layer: This is where cloud providers sit. Cloud platforms such as AWS, Google Cloud and Azure run enormous data centres full of servers, renting out the compute, storage, databases and virtual machines that power websites, apps and services.
At the base of this layer sits Data Storage and Management. Before applications can run, they need somewhere to store and retrieve data. Relational databases like PostgreSQL and MySQL organise data in structured tables. NoSQL databases like MongoDB and Cassandra handle unstructured data at scale. Data warehouses like Snowflake and BigQuery store massive datasets for analysis. Vector databases like Pinecone and Weaviate enable AI applications to search by semantic meaning rather than exact matches. These storage systems are as foundational as compute itself—every application depends on them.
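The core trick behind vector databases can be reduced to a toy: documents are stored as embedding vectors, and a query returns the nearest one by cosine similarity. Real systems such as Pinecone and Weaviate use learned embeddings and approximate indexes; the 3-dimensional vectors below are made up purely for illustration.

```python
import math

# Invented "embeddings": nearby vectors mean similar topics.
DOCS = {
    "cat care":   [0.9, 0.1, 0.0],
    "dog care":   [0.8, 0.3, 0.0],
    "tax filing": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec):
    """Return the document whose vector points most like the query."""
    return max(DOCS, key=lambda doc: cosine(DOCS[doc], query_vec))

print(nearest([0.9, 0.15, 0.0]))   # a pet-flavoured query → cat care
```

Exact-match search cannot do this; the query never contains the word "cat", yet semantic proximity still finds the right document.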
Within this layer, next comes container orchestration and deployment. These technologies manage how applications run at scale: Docker packages applications into portable, standardised, self-contained 'containers', and Kubernetes orchestrates those containers across clusters of cloud servers—handling scaling, health checks, traffic routing and failover. These tools do not execute the application's logic; they simply ensure that applications run consistently and are deployed reliably, no matter how many servers or users are involved.
Load balancers sit between incoming traffic and application servers, distributing requests evenly so no single server gets overwhelmed. Message queues (Kafka, RabbitMQ) enable different parts of applications to communicate asynchronously, decoupling services so they can scale independently.
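The two patterns above can be sketched in a few lines: a round-robin load balancer spreading requests across servers, and a queue decoupling a producer from a consumer. Kafka and RabbitMQ add durability and distribution on top, but the shapes are the same; server names and messages here are invented.

```python
from collections import deque
from itertools import cycle

# Round-robin: each request goes to the next server in rotation.
servers = cycle(["server-a", "server-b", "server-c"])

def route(request_id: int) -> str:
    return f"req-{request_id} -> {next(servers)}"

for i in range(4):
    print(route(i))   # the fourth request wraps back to server-a

# Message queue: the producer publishes and moves on; the consumer
# drains at its own pace. deque stands in for a real broker.
queue = deque()
queue.append("order-placed")
queue.append("email-receipt")
while queue:
    print("worker handled:", queue.popleft())
```

The decoupling matters: the producer never waits for the consumer, so the two sides can be scaled, restarted or replaced independently.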
Content Delivery and Protection: Sitting in front of these cloud data centres are specialist networks such as Cloudflare, which speed up websites through CDNs by caching content closer to users, block attacks through DDoS protection, and provide DNS services that help users reach websites quickly and safely.
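The caching idea behind a CDN reduces to a sketch like this: an edge node answers from its local store when it can (a "hit") and fetches from the origin only on a "miss". The paths and contents below are invented for illustration.

```python
# Stand-in for the origin server's content.
ORIGIN = {"/logo.png": "<bytes of logo>", "/app.js": "<bytes of js>"}

class EdgeCache:
    """A toy CDN edge node: serve locally when possible."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> str:
        if path in self.store:
            self.hits += 1              # fast: answered near the user
            return self.store[path]
        self.misses += 1                # slow: a round trip to the origin
        self.store[path] = ORIGIN[path]
        return self.store[path]

edge = EdgeCache()
edge.get("/logo.png")
edge.get("/logo.png")                   # second request is a hit
edge.get("/app.js")
print(edge.hits, edge.misses)           # → 1 2
```

Real CDNs add expiry, invalidation and thousands of edge locations, but the hit/miss economics above are why cached sites feel fast worldwide.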
A simple analogy helps. The cloud is a massive rented warehouse that stores corporate data and runs applications. Docker is the system that breaks large applications into many small, neat packages called containers, keeping each part organised so it can run independently without loading the whole system every time. Kubernetes is the automated system that arranges and delivers these containers across cloud servers so the app stays fast even when millions of users are online. Once a container is running, the runtime engine inside it executes the actual application logic—fetching the right data, applying algorithms and showing users what they need. Cloudflare is the security guard and delivery network standing in front of the warehouse, protecting the building and speeding up access for users.

Almost all large social-media and e-commerce platforms use Docker and Kubernetes because their systems are too large, too complex and too traffic-heavy to run on single servers. They break their platforms into hundreds or thousands of small services—login, notifications, payments, product search, recommendations, chat, checkout and so on—and need millions of small components working together at all times. Docker packages each service into standardised containers so the code runs the same way across global data centres. Kubernetes manages those containers at massive scale, automatically starting new ones when traffic spikes, shutting them down when traffic drops, restarting them when they crash, and spreading them across thousands of machines to prevent any single point of failure. AWS controls roughly a third of the global cloud-infrastructure market, while Cloudflare handles traffic for about a fifth of all websites.
When AWS or Cloudflare experiences outages—or when Kubernetes clusters within cloud providers misconfigure deployments—large chunks of the internet can become unreachable. This cloud layer powered a decade of hyperscaler valuations: the 2010s Cloud Boom, as companies raced to “move to the cloud,” inflated infrastructure providers’ valuations long before profitability models were proven. The late 2010s then saw an infra-tooling boom around Docker and Kubernetes (not an internet layer but a deployment abstraction that packages code, schedules it across clusters and scales it automatically), as DevOps, microservices and enterprise spending surged.
6. Application Layers (Runtime Execution Layer): This is where actual websites and services run. Everything familiar about the Internet lives here: social media, e-commerce, streaming services, and now AI applications.
Runtime Execution Engines: Within the application layer, runtime engines (runtime execution environments) such as V8 (2009), the JVM, CPython and Node.js, together with browser engines like WebKit, Blink and Gecko and with WebAssembly (WASM), execute the 'logic' of web pages, web apps and server-side applications. They enable JavaScript to run both in web browsers for interactive front-ends and on servers for backend applications and APIs. Much as an operating system mediates between software and the hardware it runs on, runtime engines translate high-level programming languages into machine code: they are the intermediary layer between human-readable code and CPU instructions that makes applications actually function.
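The translation step described above is directly observable in CPython: source code is first compiled into bytecode, which the virtual machine then executes. The exact opcode names vary between Python versions, but the pipeline is the same.

```python
import dis

def add(a, b):
    return a + b

# dis exposes the bytecode CPython compiled from the source above:
# load the two arguments, apply the binary operation, return the result.
bytecode = dis.Bytecode(add)
print([instr.opname for instr in bytecode])
```

On one interpreter this prints something like `['RESUME', 'LOAD_FAST', 'LOAD_FAST', 'BINARY_OP', 'RETURN_VALUE']`; V8 and the JVM perform an analogous translation for JavaScript and Java, typically with just-in-time compilation down to machine code.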
Languages, Formats, and Communication: Programming languages like C, Java, Python, and JavaScript are used to write code that executes logic, makes decisions, and processes data. Markup languages like HTML, XML, and Markdown describe the structure and presentation of content. Data formats like JSON, XML, CSV, and YAML store and exchange data between systems. C doesn't need a runtime engine because it compiles directly to standalone executables that run natively on the CPU. In contrast, Java Virtual Machine (JVM), CPython, and Node.js exist because Java, Python, and JavaScript need something to translate and execute their code at runtime. The flow works like this: Code written in a programming language (JavaScript, Python) makes an API request → API defines how systems communicate → JSON is the data format exchanged → Runtime engine (Node.js, JVM, CPython) executes the code → HTML displays the result to users. These technologies revolutionised and supercharged the application layer by enabling fast, rich applications that run in web browsers and server-side environments.
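The flow in the paragraph above can be simulated end to end without a network: code issues a request, a pretend API answers with JSON, the runtime parses it, and HTML is produced for display. The endpoint and fields are invented for illustration.

```python
import json

def fake_api(endpoint: str) -> str:
    """Stand-in for a real service: returns canned JSON for the endpoint."""
    return json.dumps({"endpoint": endpoint, "user": "ada", "unread": 3})

raw = fake_api("/api/inbox")    # 1. code makes the API request
data = json.loads(raw)          # 2. JSON is the exchanged data format,
                                #    parsed by the runtime
html = f"<p>{data['user']} has {data['unread']} unread messages</p>"
print(html)                     # 3. HTML displays the result to the user
# → <p>ada has 3 unread messages</p>
```

Swap the canned string for a real HTTP call and this is, structurally, how most web front-ends talk to their back-ends.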
Microservices Architecture: Modern large-scale applications are typically built as microservices—breaking monolithic applications into hundreds or thousands of small, independent services. Each service handles one specific function: login service, notifications service, payments service, product search, recommendations, chat, checkout, etc. This architectural pattern is what made Docker and Kubernetes necessary. Instead of one giant application, you have many tiny ones that need to be packaged, deployed, scaled, and managed independently.
The rise of SaaS and rich front-ends fuelled the 2008–2018 mobile and web-app boom, and with it demand for the infrastructure that converts code into machine instructions and runs application logic: browsers run JavaScript engines, while servers run the JVM, Node.js or Python.
The application layer (apps, platforms, AI models, e-commerce, social media) has sparked multiple investment frenzies:
Social media boom (2010s): Facebook, YouTube, Twitter, and Instagram created closed "walled garden" ecosystems with massive valuations despite initially limited revenue models.
Mobile app mania (2010–2018): Every company felt pressure to build dedicated mobile apps, triggering a commercial land-grab that was really just a new interface to existing services, not a fundamental technical innovation.
AI/ML (2023–Present): The current AI frenzy is not a separate foundational layer of the Internet stack. It is simply the latest tier within the application layer, similar to search engines, recommendation systems or payment gateways. AI does introduce infrastructure primitives that previous applications didn't require: GPU clusters optimised for parallel computation rather than general-purpose processing; vector databases for semantic search and retrieval; model registries that version and manage trained AI models; inference engines that optimise model serving at scale; and a training/inference split that creates fundamentally different compute patterns from traditional request-response applications. What makes AI different is its appetite: AI workloads are far more compute-intensive than previous applications, which amplifies dependence on the cloud layer rather than creating a new one. This is why hyperscalers like AWS, Azure and Google Cloud, alongside GPU chip makers, have become the primary beneficiaries of the AI boom; they are selling the pickaxes in this gold rush. Every company now adds "AI-powered" to its offerings, valuations skyrocket, and massive infrastructure investment flows despite unclear paths to profitability for many players. In that sense, today's AI frenzy is not new at all. It is simply the highest floor built on a tower that has bubbled, crashed and rebuilt itself many times before.
The Internet has always grown in layers, and every time a new layer appeared or evolved, markets treated it as the next transformative opportunity. The technology was real. The infrastructure got built. But the valuations often ran ahead of the revenue, creating cycles of boom and correction that have defined the digital age.
Edge Computing and Decentralisation: The Emerging Shift. Not everything flows through centralised cloud servers. Two parallel trends are reshaping the hosting layer. Edge computing pushes computation closer to users—into cell towers, local data centres, or even IoT devices—reducing latency for real-time applications like autonomous vehicles, AR/VR and industrial automation. It blurs the line between the hosting layer and the application layer, creating distributed compute that is neither purely cloud nor purely local. Peer-to-peer architectures—BitTorrent, blockchain networks, IPFS—distribute data and computation across many nodes without central servers. While not dominant, these represent an architectural alternative to the cloud-centric model, particularly for applications prioritising censorship resistance or decentralisation over efficiency.
The Consulting Industrial Complex: Professional consulting firms have long ridden each technological wave by selling “digitisation,” “transformation,” and “enablement” programmes to corporations, often packaging familiar process-mapping and IT integration work as the next frontier of innovation; yet the outcomes have frequently fallen short of the grand promises, with clients discovering that expensive roadmaps do not necessarily translate into real change. This cycle mirrors the broader pattern: technological storytelling becomes a monetizable asset in itself, independent of whether the underlying transformation actually succeeds.
This cycle mirrors a broader pattern: major technological booms and busts are predominantly Western phenomena, particularly centred in the United States. Emerging markets experience technology adoption, but rarely the speculative frenzy that characterizes Silicon Valley and Wall Street. The reason is straightforward: real innovation happens in the West, and it takes years to trickle down to other economies. Western markets—specifically the US—host the venture capital ecosystem, research universities, and risk-tolerant capital markets that fund experimental technologies before they're profitable or even fully formed. This creates the boom: investors pour money into nascent technologies, valuations soar based on potential rather than revenue, and eventually reality intervenes. By the time these technologies reach emerging markets, they're mature, proven, and commoditised. Emerging economies adopt cloud services after the cloud wars have been fought. They build mobile apps after the mobile bubble has deflated. They'll deploy AI after the current frenzy has settled into practical applications. They skip the speculation and inherit the infrastructure. This isn't a disadvantage—it's often an advantage. Emerging markets avoid the costly mistakes, the overinvestment, and the wreckage of failed startups. They adopt what works. But they also don't capture the enormous wealth creation that happens during the bubble phase, when early investors and employees of successful companies extract generational fortunes. The West tolerates boom-bust cycles because the booms create real infrastructure and occasional trillion-dollar companies, even if most participants lose money. The busts are painful, but they're seen as the price of being first. Emerging markets get stability and lower risk, but they're always building on foundations laid—and paid for—elsewhere. 
In that sense, the fibre bubble, the cloud boom, the app frenzy, and today’s AI mania all follow the same script: a Western market inflates a vision of the future, global consultants monetise the storytelling under the banner of transformation, and the rest of the world eventually adopts the technology only once the dust has settled.
An outlier has been China. It didn't just "adopt after the dust settled"—it built parallel ecosystems (WeChat, Alipay, ByteDance) that often surpassed Western models in features and scale. WeChat integrated social messaging, payments, ride-hailing, and e-commerce years before Western "super apps" attempted similar integration. Mobile payment infrastructure was arguably pioneered at scale in India (UPI) and China before the West caught up. These weren't adoptions of Western technology—they were indigenous innovations solving local problems. E-commerce models like Alibaba's Taobao created merchant ecosystems distinct from Amazon's approach, and platforms like Pinduoduo pioneered social commerce at massive scale. The "West innovates, rest adopts" story worked better in the 1990s–2000s than today. What remains true is that speculative boom-bust cycles are still concentrated in US capital markets, where risk tolerance for unproven technologies remains highest.
The fibre bubble, the cloud boom, the app frenzy, and today's AI mania all follow the same script: markets inflate a vision of the future, infrastructure gets built (sometimes excessively), and eventually the technology matures into practical utility. Each layer was real. The infrastructure remained. But the valuations often ran ahead of the revenue. What changes is where value accrues and who captures it. Sometimes it's the infrastructure providers. Sometimes it's the application builders. And increasingly, it's not just Western companies capturing that value—it's whoever builds the best solution for their market, regardless of geography.
Next up, though, Agents Assemble: The next version of the web will be built for machines, not humans. AI will surf, shop and act on your behalf.
In 1999, a decade after inventing the world wide web, Sir Tim Berners-Lee, a British computer scientist, imagined an intelligent version of his creation. In that vision, much of daily life—finding information, making plans, handling mundane tasks—would be done not by people, but by “intelligent agents”: machines able to read, interpret and act. The web has evolved dramatically since its invention but the experience has remained manual—users still type, click and browse before they buy, read or watch. Artificial intelligence (AI) may now bring Sir Tim’s dream within reach. Today’s large language models (LLMs) can summarise documents, answer questions and reason. What they cannot do for the moment is act. That, however, is changing with “agents”: software that gives LLMs tools which let them perform tasks, not just generate text. The shift started in 2022 with the launch of ChatGPT. Many users began asking questions of chatbots, rather than putting keywords into search engines, to assimilate information that might be spread around the web. Such “answer engines” barely scratch the surface of the potential, however. Kevin Scott, chief technology officer of Microsoft, a software giant, reckons agents able to handle more complex tasks “are not that far away”. But for them to take over more of the work, the web’s plumbing must change. A central obstacle is language: giving agents a way to talk to online services and each other. A website or online service normally talks to the outside world through an application programming interface (API), which tells visitors what it can do, such as booking a doctor’s appointment or supplying a map location. APIs, however, are written for humans, and each has its own quirks and documentation. This is a tough environment for AI agents, because they reason in natural language. Dealing with each new API requires learning its dialect. To act independently on the web, therefore, agents will need a standardised way to communicate. 
This is the aim of the Model Context Protocol (MCP), developed by Anthropic, an AI lab. Mike Krieger, its chief product officer, says the idea came while linking Claude, its chatbot, to services like Gmail, an email platform, and GitHub, a repository of code. Instead of integrating each application with Claude on a case-by-case basis, the firm wanted a shared set of rules to help agents directly access a user’s emails or files. Rather than study technical guides, an agent can ask an MCP server what a system does—book a flight, cancel a subscription, issue a refund and so on—and then take an action on behalf of the user, without bespoke code. Say you want to book a trip from London to New York. You start by giving your travel plans to a trip agent, which then subdivides the task between specialised agents that can look for flights, hotels and cars. These agents contact the MCP servers of airlines, hotels and car-hire firms, gather information, compare possibilities and create a list of potential itineraries. Once you pick an option, the trip agent would book the whole lot. This type of co-ordination requires rules for how individual agents identify, talk to and trust each other. Google’s proposed solution is the A2A (agent-to-agent) protocol for this purpose. Agents can advertise their abilities to each other through this and negotiate which agent does what. Laurie Voss of Arize AI, a startup, says companies are in a “landrush” to define the dominant standards for the agentic web. The most widely adopted protocol will let its backers’ tools do more, sooner and better. On December 9th, 2025, Anthropic, OpenAI, Google, Microsoft and others announced the Agentic AI Foundation, which will develop open-source standards for AI agents. Anthropic’s MCP will be part of this, signalling its wider adoption as an industry standard for agentic communication. Still, most of the web that these agents will surf is made for human eyes. 
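The discovery-then-invoke idea behind MCP-style communication can be sketched as a toy, which is emphatically not the real protocol: a server advertises its capabilities in machine-readable form, and an agent picks and invokes one by name instead of learning a bespoke API. All names and capabilities below are hypothetical.

```python
# Hypothetical capabilities a travel service might expose.
def book_flight(origin, destination):
    return f"booked flight {origin} -> {destination}"

def book_hotel(city, nights):
    return f"booked {nights} nights in {city}"

# What a server might advertise: capability names, their parameters,
# and the callable behind each one.
SERVER = {
    "book_flight": {"params": ["origin", "destination"], "fn": book_flight},
    "book_hotel":  {"params": ["city", "nights"], "fn": book_hotel},
}

def agent_call(capability: str, **kwargs):
    """The agent discovers a tool by name and invokes it, no bespoke glue."""
    tool = SERVER[capability]
    return tool["fn"](**kwargs)

print(list(SERVER))   # step 1: the agent asks what the system can do
print(agent_call("book_flight", origin="LHR", destination="JFK"))
```

The point of a shared standard is that the same `agent_call` shape works against any compliant server, which is what removes the per-API "dialect" problem the previous paragraph describes.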
Finding a product still means clicking through menus. To let language models access sites more easily, Microsoft has built Natural Language Web (NLWeb), which lets users “chat” to any web page in natural language. Users could ask the NLWeb interface of a travel website, for example, for tips on where to go on holiday with three children; or what the best wine shops are in a particular place. Whereas traditional search might require clicking through filters for location, occasion and cuisine across several menus, NLWeb is able to capture the full intent of a question in a single natural sentence, and respond accordingly. Each NLWeb site can also act as an MCP server, exposing its content to agents. Thus NLWeb bridges the modern visual internet and one that agents can use. As agents grow more capable, a new platform contest is taking shape, this time over the agents themselves. It echoes the browser wars of the 1990s, when firms fought to control access to the web. Now, browsers are being reimagined with agents at their core. OpenAI and Perplexity, a generative-AI startup, have launched agent-powered browsers that can track flights, fetch documents and manage email. Their ambitions go further. In September OpenAI enabled direct purchases from select websites inside ChatGPT. It has also integrated with services like Spotify and Figma, letting users play music or edit designs without switching apps. Such moves worry incumbents. In November Amazon, a shopping site, sued Perplexity, alleging the startup was violating its terms of service by failing to disclose that its browser was shopping instead of a real person. Airbnb, a short-term-rentals app, chose not to integrate with ChatGPT, saying the feature was not “quite ready”. Advertising, too, will have to adapt. Today’s web runs on monetising human attention, through search ads and social feeds. 
Alphabet and Meta, among the biggest tech firms, are expected to earn nearly half a trillion dollars a year this way, accounting for more than 80% of their revenues. Dawn Song, a computer scientist at the University of California, Berkeley, says marketers may need to pitch not to people, but to “agent attention”. Travel sites, for instance, will not persuade the traveller, but their digital proxy. The tactics may stay the same, optimising rankings, targeting preferences, paying for placement, but the audience will be algorithms. Despite the risks, software developers are optimistic. They foresee a shift from a “pull” internet, where people initiate actions, to a “push” model, where agents act unprompted—setting up meetings, flagging research or handling small tasks. It could be the foundation of a new and very different version of the web.
Beneath the visible Internet stack lies a deeper industrial and geopolitical foundation: chip fabs (TSMC, Intel), the ultra-sophisticated tools that make fabrication possible (ASML, Lam, Tokyo Electron), the design software that architects chips (Synopsys, Cadence), the packaging/test ecosystem, and beneath all of that the global capital markets—anchored by the US dollar—and the Western security architecture that protects supply chains. These layers form the real substructure of modern computing: a mix of physics, capital, and geopolitics on which cloud platforms, AI models, applications, runtimes like V8, and the entire Web ultimately depend.
Semiconductor Fabrication Layer: This layer represents the companies that physically manufacture chips—the CPUs, GPUs, and memory that computers and data centres rely on. These firms take circuit designs and manufacture them using advanced lithography, chemicals, and wafer-processing technologies. Everything above it—cloud servers, phones, AI accelerators, laptops—exists only because these fabs can produce chips at unimaginably small scales (3nm, 2nm). If they stop, the entire digital economy collapses. TSMC manufactures ~90% of the world’s leading-edge chips, making it the single most important industrial node on Earth. This is why the chip supply chain is a geopolitical flashpoint.
Semiconductor Manufacturing Equipment (SME) Layer: Companies like ASML, Tokyo Electron, Applied Materials, Lam Research and KLA build the machines used by chip fabs. ASML makes EUV lithography machines, among the most complex machines ever built (~$150M each). Lam Research supplies etching and deposition tools; Tokyo Electron provides coating, developing and deposition equipment; KLA dominates metrology and inspection. Fabrication cannot exist at all without the machinery these firms produce. ASML’s EUV tools are so advanced and so few that ASML alone practically decides which countries can make cutting-edge chips. This is the industrial bedrock beneath the semiconductor layer.
EDA (Electronic Design Automation) Software Layer: Synopsys, Cadence, Siemens EDA and Ansys make the software used to design chips, while ARM licenses the processor designs built with such tools. These tools turn human circuit designs into layouts that fabs can manufacture. Chip design today is impossible without EDA tools. These companies—mostly American—hold oligopolistic control over chip design software. They are the intellectual infrastructure beneath the physical infrastructure.
Outsourced Assembly & Testing (OSAT) Layer: ASE and Amkor represent the companies that package chips, attach them to substrates and test them for defects. This is the layer after fabrication but before chips are usable. Without this step, raw silicon dies cannot be inserted into laptops, servers or phones. OSAT is dominated by Taiwan and South Korea.
The meme below humorously depicts the modern web infrastructure as a towering, unstable stack of technologies—from AI and cloud services like AWS to open-source contributions—resting on the foundational work of C developers implementing dynamic arrays manually, a core concept absent from C's standard library.
Why Modern Processors Represent Humanity's Greatest Productivity Arbitrage
A modern processor is an affront to intuition. A sliver of silicon, no larger than a fingernail and thinner than cling film, serenely renders 3D worlds, listens for satellites orbiting thousands of kilometres above Earth, deciphers voices from across oceans and still has time left over to check your messages. To make sense of this, scale it up two million times and drop it into Manhattan. The result is not a hulking monolith but a strangely elegant city: a glassy slab scarcely ten storeys tall, resting on a brute pedestal built purely to stop it snapping. Inside runs a dense copper metropolis—layer upon layer of microscopic highways entombed in glass—funnelling signals downwards to street level, where the real business of thinking takes place. There, packed more tightly than any crowd New York has ever seen, stand billions of switches. Each transistor is no cleverer than a wall-mounted light switch. It knows only “on” or “off”. And yet, from this binary boredom, astonishing complexity arises.
The processor’s genius is not intelligence but obedience. Transistors are chained into logic gates—AND, OR, NOT, NOR—which are themselves chained into instructions. Memory is merely geography: addresses in this city where lights flick briefly on or off, like sticky notes slapped on doors and torn down moments later. Each bit of fast on-chip memory needs a small cluster of transistors (a typical SRAM cell uses six), arranged into tiny, perfectly ordered cells. Numbers are patterns of light. Images too. Even a monochrome Mona Lisa is nothing more than a neighbourhood of illuminated houses. Elsewhere sit the arithmetic logic units, asking relentlessly dull questions at terrifying speed: if this is true, do that; if not, do something else. Computers can count only to one. They just do it flawlessly, without fatigue or error, billions of times over. With enough instructions, monotony becomes motion: wind rippling through digital trees, shadows sliding across virtual flags, cinematic battles fought not with steel but with trigonometry. And they do not work alone. Modern chips enlist entire divisions—the CPU, GPU and NPU—processing in parallel, so that even when slowed to human pace they could perform hundreds, even tens of thousands, of simple operations every few minutes.
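The chaining described above can be shown in miniature: gates are built from a single universal primitive, adders from gates, and arithmetic from adders. This is a logical sketch in software, of course, not silicon; the gates below are all derived from NAND, the classic universal gate.

```python
# Every gate below is built from NAND alone.
def NAND(a, b): return 0 if (a and b) else 1
def NOT(a):     return NAND(a, a)
def AND(a, b):  return NOT(NAND(a, b))
def OR(a, b):   return NAND(NOT(a), NOT(b))
def XOR(a, b):  return AND(OR(a, b), NAND(a, b))

def full_adder(a, b, carry_in):
    """One column of binary addition: sum bit plus carry out."""
    s = XOR(XOR(a, b), carry_in)
    carry_out = OR(AND(a, b), AND(carry_in, XOR(a, b)))
    return s, carry_out

def add4(x, y):
    """Add two 4-bit numbers the way an ALU does: one column at a time."""
    carry, total = 0, 0
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= bit << i
    return total   # the final carry simply overflows, as in real hardware

print(add4(5, 3))   # → 8: "counting only to one", column by column
```

Nothing in the chain is cleverer than on/off, yet chaining a handful of such adders side by side is literally how an ALU does arithmetic.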
The final absurdity reveals itself only when time is dragged to a crawl. A processor does not tick once a second, like a human thought, but billions of times per second: its clock speed, measured in gigahertz. To make it “think” at our pace, time itself must be stretched grotesquely: one second in the real world becomes thousands of years inside the chip. In this frozen universe, electrical signals creep through silicon at millimetres per second; light itself ambles along at barely twice that pace. And still the processor outpaces us. A 4GHz chip ticks four billion times a second and performs more calculations in that fleeting moment than a human savant could manage in four thousand years of uninterrupted effort. This is why modern life runs on processors—not because they are clever, but because we have built near-atomic megastructures that perform simple acts, in parallel, at speeds that bend time out of shape. They are silicon time machines in our pockets, quietly remaking the world while the rest of us experience it one second at a time.
A traditional computer CPU is a specialist. It is designed to be the fastest possible general-purpose thinker, assuming that memory, graphics, networking and storage live elsewhere on the motherboard. It excels at complex, sequential tasks, heavy branching logic and high peak performance, but it depends on external components—RAM chips, a discrete GPU, chipset controllers, Wi-Fi cards—each with their own power budgets and communication delays. This modular design favours flexibility and upgradeability, but it costs energy and physical space. Performance comes first; efficiency is negotiated later.
A smartphone SoC, by contrast, is an entire computer compressed into a single piece of silicon. Alongside CPU cores sit the GPU, neural processing unit, image signal processor, video encoders, memory controllers, security enclaves and cellular radios, all sharing power and memory on the same die. This radical integration minimises distance: data moves microns instead of centimetres, saving both time and energy. The trade-off is that each component is narrower in scope and ruthlessly optimised for its task. Rather than one very fast brain, an SoC uses many smaller, specialised minds working in parallel, coordinated by software that decides which unit should wake, work and sleep. The biggest difference, then, is economic rather than technical. A PC CPU is built to maximise performance per second; an SoC is built to maximise performance per joule. Laptops and desktops can afford fans, large batteries and wall sockets. Phones cannot. SoCs therefore sacrifice peak speed for integration, specialisation and efficiency—allowing a device that fits in a pocket to render graphics, recognise faces, process photos and maintain a global radio link all day on a single charge. One is a powerful engine waiting for a vehicle; the other is an entire city engineered to run on a sip of fuel.
A modern smartphone system-on-a-chip (SoC) is an affront not just to intuition, but to scale itself. What looks like a single black rectangle buried beneath glass is, in truth, an entire civilisation compressed into a few square millimetres. If you enlarged a phone SoC two million times and dropped it into a city, you would not see one building but many: a dense federation of specialised districts sharing power, memory and time. A CPU quarter handles general reasoning, a GPU renders images and motion, an NPU recognises faces and voices, an ISP turns raw photons into photographs, and radios whisper constantly to cell towers and satellites. What once required entire rooms of dedicated machines now cohabits inside something thinner than a fingernail clipping, drawing power from a battery smaller than a deck of cards.
The SoC’s genius lies in orchestration rather than brute force. Each block is ruthlessly specialised, built to do one narrow job with maximal efficiency. The CPU remains the bureaucrat, good at handling instructions and exceptions; the GPU is the industrial workforce, moving vast quantities of simple mathematics in parallel; the NPU is a pattern savant, trained to recognise faces, speech and intent without understanding any of it. Between them sits a shared memory fabric, shuttling data at speeds that make geography almost irrelevant. Information flows not because any part “knows” what it is doing, but because the architecture ensures it arrives at exactly the right place, at exactly the right moment. Intelligence, such as it is, emerges from scheduling, arbitration and bandwidth—an economy of silicon where wasted motion is the cardinal sin.
Slow time enough and the absurdity becomes clearer still. Each district pulses to its own clock, ticking hundreds of millions or billions of times per second, coordinated so tightly that a missed beat would mean a dropped call, a blurred photo, a stuttering game. To make an SoC operate at human tempo, time must be dilated until a single real-world second stretches into millennia. In that frozen landscape, radios negotiate with orbiting satellites, neural units classify faces, and graphics engines paint entire worlds—simultaneously. This is why phones feel alive. Not because they think, but because their silicon societies operate in a realm where time is cheap, error is forbidden, and parallel effort is essentially free. A smartphone SoC is not a chip so much as a city-state: compact, specialised, and astonishingly productive, quietly running civilisation from the palm of your hand.
Beware malware: how to protect the internet from malicious attacks
Few inventions in history have been as important for human civilisation and as poorly understood as the internet. It developed not as a centrally planned system, but as a patchwork of devices and networks connected by makeshift interfaces. Decentralisation makes it possible to run such a complex system. But every so often comes a chilling reminder that the whole edifice is uncomfortably precarious. On March 29th a lone security researcher announced that he had discovered, largely by chance, a secret backdoor in XZ Utils.
This obscure but vital piece of software is incorporated into the Linux operating systems that control the world’s internet servers. Had the backdoor not been spotted in time, everything from critical national infrastructure to the website hosting your cat pictures would have been vulnerable. The backdoor was implanted by an anonymous contributor who had won the trust of other coders by making helpful contributions for over two years. That patience and diligence bear the fingerprints of a state intelligence agency. Such large-scale “supply chain” attacks—which target not individual devices or networks, but the underlying software and hardware that they rely on—are becoming more frequent. In 2019-20 the SVR, Russia’s foreign-intelligence agency, penetrated American-government networks by compromising a network-management platform called SolarWinds Orion. More recently Chinese state hackers modified the firmware of Cisco routers to gain access to economic, commercial and military targets in America and Japan.
The internet is inherently vulnerable to schemes like the XZ Utils backdoor. Like so much else that it relies on, this program is open-source—which means that its code is publicly available; rather like Wikipedia, changes to it can be suggested by anyone. The people who maintain open-source code often do so in their spare time. A headline from 2014, after the uncovering of a catastrophic vulnerability in OpenSSL—a tool widely used for secure communication, maintained on a budget of just $2,000—captured the absurdity of the situation: “The Internet Is Being Protected By Two Guys Named Steve.” It is tempting to assume that the solution lies in establishing central control, either by states or companies.
In fact, history suggests that closed-source software is no more secure than is the open-source type. Only this week America’s Cyber Safety Review Board, a federal body, rebuked Microsoft for woeful security standards that allowed Russia to steal a signing key—“the cryptographic equivalent of crown jewels for any cloud service provider”. This gave it sweeping access to data. By comparison, open-source software holds many advantages because it allows for collective scrutiny and accountability.
The way forward therefore is to make the most of open-source, while easing the huge burden it places on a small number of unpaid, often harried individuals. Technology can help, too. Let’s Encrypt, a non-profit, has made the internet safer over the past decade by using clever software to make it simple to encrypt users’ connections to websites. More advanced artificial intelligence might eventually be able to spot anomalies in millions of lines of code at a stroke. Other fixes are regulatory. America’s cyber strategy, published last year, makes clear that the responsibility for failures should lie not with open-source developers but with “the stakeholders most capable of taking action to prevent bad outcomes”.
In practice that means governments and tech giants, both of which benefit enormously from free software libraries. Both should expand funding for and co-operation with non-profit institutions, like the Open Source Initiative and the Linux Foundation, which support the open-source ecosystem. The New Responsibility Foundation, a German think-tank, suggests that governments might, for example, allow employees to contribute to open-source software in their spare time and ease laws that criminalise “white hat” or ethical hacking. They should act quickly. The XZ Utils backdoor is thought to be the first publicly discovered supply-chain attack against a crucial piece of open-source software. But that does not mean it was the first attempt. Nor is it likely to be the last.■
Users of the internet can ignore its physical underpinnings but for technologies like artificial intelligence and the metaverse to work, others need to pay attention
In 1973 Bob Metcalfe, a researcher for Xerox at its Palo Alto Research Center, helped think up a way for the company’s computers to send information to each other via co-axial cables. He called this concept Ethernet after the medium by which, in 19th-century physics, electromagnetic forces were thought to be transmitted. Ethernet would become a cornerstone of the internet.
Despite his role in its foundations, Dr Metcalfe later doubted the hardiness of the internet as it became a global phenomenon. In late 1995 he noticed that a quarter of internet traffic was getting lost on its way, and that the system did not seem to be responding well to that volume of loss. He predicted that the whole shebang would “go spectacularly supernova and, in 1996, catastrophically collapse”. The collapse never happened, and Dr Metcalfe literally ate his words. At a conference in California, he produced a print-out of his prediction, pureed it in a blender and slurped it up with a spoon. “I learned my lesson,” Dr Metcalfe says now. “The internet is more robust than I had estimated.”
In its more than 40 years the internet as a whole has never completely stopped working. Parts of it break all the time, but resilience was built into the internet from day one. It is a decentralised, distributed network of billions of computers and billions of routers, connected to each other by perhaps billions of kilometres of cables. The network works seamlessly for end-users because of layers of software above this hardware that manage how the computers communicate, building in multiple redundancies and leaving no single point of failure. This power of abstraction—the ability to create, transmit and consume digital artefacts without needing to think about the physical realities behind them—is the secret sauce of the internet. And, indeed, of all computer science.
Abstraction is also the key to why Dr Metcalfe’s prediction ended up proving wrong. To see why, one has to grasp the internet’s layered structure. Some engineers think of the internet as having five layers (though others say there are four or seven depending on whether certain functions get layers of their own). At the bottom is the most physical of layers, where photons and electrical signals whizz from one server to another via routers and cables. Just above the cables are local-network protocols like Ethernet, Dr Metcalfe’s contribution, which allow computers and other devices near each other to interpret this traffic as groups of ones and zeros.
Above the cables and local-network protocols are two communications layers, “transmission control protocol” and “internet protocol” (TCP/IP), which enable computers to interpret messages as “packets”: short strings of data with a tag at one end which describes their destination. TCP/IP interacts with Ethernet but need not know about the cables at the very bottom. Sitting above TCP/IP is the application layer of software and language that users will begin to find more familiar, like “HTTP” (as seen on the world wide web). That allows webby stuff to interact with TCP/IP without worrying about Ethernet, cables and the like.
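The nesting of layers can be sketched as a toy example of encapsulation: each layer wraps the payload in its own header and knows nothing about the layers beneath it. The headers here are drastically simplified inventions of ours; real TCP, IP and Ethernet headers carry many more fields.

```python
# Toy encapsulation: application -> transport -> internet -> link.

def http_layer(body: bytes) -> bytes:            # application (HTTP)
    return b"GET / HTTP/1.1\r\n\r\n" + body

def tcp_layer(segment: bytes, port: int) -> bytes:   # transport (TCP-ish)
    return port.to_bytes(2, "big") + segment

def ip_layer(packet: bytes, dst: str) -> bytes:      # internet (IP-ish)
    return bytes(int(x) for x in dst.split(".")) + packet

def ethernet_layer(frame: bytes, mac: bytes) -> bytes:  # link (Ethernet-ish)
    return mac + frame

# Each call only knows about the layer directly below it:
wire = ethernet_layer(
    ip_layer(
        tcp_layer(http_layer(b""), port=80),
        dst="93.184.216.34"),
    mac=bytes(6))

# At the receiver, each layer strips only its own header, in reverse.
assert wire[6:10] == bytes([93, 184, 216, 34])  # IP sees only its own header
```

The HTTP function never mentions ports, addresses or cables; swap Ethernet for Wi-Fi, or copper for fibre, and nothing above the link layer needs to change.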
These levels of abstraction made the internet flexible and allowed it to scale beyond what many—including Dr Metcalfe—imagined. Each intermediate layer is designed to manage disruptions below and to present a clean image above. A well-designed layered system like the internet dampens chaos caused by errors, rather than spiralling out of control with them. And it didn’t hurt that, all the while, the physical foundation itself was strengthening. Optical fibre became increasingly available throughout the 1990s, which increased bandwidth to send more packets faster, losing fewer of them. The problem Dr Metcalfe was worried about got resolved without the rest of the internet really noticing. And as applications became more data-intensive, the plumbing below continued to hold up admirably.
The internet’s seemingly limitless adaptability has been enabled by layers of abstraction
To take an example, originally the internet was designed to carry text—a restricted set of 128 different characters—at a rate of 50 kilobits per second. Now video makes up more than 65% of traffic, travelling at hundreds of megabits per second, without gumming up the pipes. Changing web protocols from HTTP to the more secure HTTPS did not affect lower layers. As copper wire is upgraded to fibre-optic cable, applications do not have to change. The internet’s seemingly limitless adaptability has been enabled by those layers of abstraction between the user and the cables.
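That restricted set of 128 characters is ASCII, which fits in seven bits. A back-of-the-envelope comparison makes the scale of the change concrete; the video bit-rate below is an assumed ballpark, not a figure from the text.

```python
# ASCII: 7 bits are enough for 128 characters.
assert 2 ** 7 == 128

text_link_bps = 50_000            # the original 50 kbit/s text link
chars_per_sec = text_link_bps // 7
print(chars_per_sec)              # about 7,000 characters per second

video_bps = 300_000_000           # assumed: a few hundred Mbit/s of video
print(video_bps / text_link_bps)  # thousands of times the original capacity
```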
But Dr Metcalfe was not entirely wrong. The benefits of abstraction are still ultimately limited by infrastructure. In its early days Google was able to beat its competitors in part because it kept things simple. Others tried loading huge pages with lots of adverts. But they misjudged how much modems could handle at a reasonable speed. Since no one wants to wait for a web page to load, you now “google” things rather than “AltaVista-ing” them.
AltaVista learned the hard way that abstraction comes at a cost: it can obscure the frailties of hardware. Tech visionaries of today should take notice. Their most ambitious schemes will not work without the appropriate infrastructure to deliver them. From autonomous cars to augmented reality, from artificial intelligence (AI) to the metaverse, decisions at the physical layer constrain or expand what is digitally possible. Underneath all the layers of abstraction, the physical infrastructure of the internet is the foundation of the digital future. Without it, the internet is just an idea.
This special report will demystify the physical building blocks of the internet in order to explain how they constrain what is possible in the abstractions which sit on top of them. It will explore what about the physical layer must change for the internet to remain sustainable—in the physical sense, but also environmentally—as the internet’s uses multiply far beyond its original remit.
A good place to start would be to explain how this article reached your screen. Each digital article starts somewhere in the “cloud”. To users this is the infinite attic where they toss their digital stuff—articles, photos, videos. But the cloud is actually composed of tens of millions of computers connected to the internet.
Your click on a mouse or tap on a screen created packets that were turned into signals which travelled tens or even thousands of kilometres through metal, glass and air to a machine in a data centre.
Depending on where you are in the world, the data centre that your article will have come from will be different. This is because The Economist, along with most content providers on the internet, gets to users via something called a content-delivery network (CDN). This stores ready-to-read articles in data centres across the world, rather than having our main servers in northern Virginia put all the components together every time. This spreads out the load so that the main servers do not get overwhelmed. And it helps an article get to your screen faster because memory devices with the data needed are physically located much closer to you.
This means that when your correspondent just clicked on an Economist headline while on her laptop, it came from a data centre in London, made a short trip through fibre-optic cable and then, for the “last mile”, perhaps by way of old-fashioned copper wiring until arriving at a cable box and Wi-Fi router in her flat. An instant later, packets of data reassembled on her laptop in front of her eyes, a digital article rendered on a digital screen.
If your correspondent had been the very first person in a region to ask for the article, the trip would have been slower, as if over the primordial internet of decades ago, because a cached copy would not yet have been available at a data centre nearby. Instead her request would have travelled through thin strands of glass that lie at the bottom of the Atlantic Ocean, to a data centre in northern Virginia, and back again. These fibre-optic cables form the backbone of the physical internet. It is through them that nearly all intercontinental internet traffic flows.
The internet relies on these cables, but not on any single cable; it relies on data centres, but not any single one. Its distributed nature and its abstractions make the internet difficult to pin down. But not so for the tech giants. They are vertically integrating the internet: laying cables, building data centres, providing cloud services and ai. As the internet becomes more powerful, it is becoming crucial to grasp both its physical and corporate composition. Only by peeling back the layers of abstraction can one lay bare the internet’s foundations and understand its future.■
Feb 3rd 2024
Advances in physical storage and retrieval made the cloud possible but more progress is needed to sustain it
On September 14th 1956 IBM announced the first commercial computer to use a magnetic hard disk for storage. Weighing in at about 1,000 kilograms, the 305 RAMAC (Random Access Method of Accounting and Control) was the world’s most expensive jukebox. It stored 4.4 megabytes on 50 double-sided disks, each one measuring two feet in diameter and spinning 1,200 times a minute. Two access arms located and retrieved information in an average time of six-tenths of a second. Companies could lease the machine for $3,200 per month—roughly equivalent to paying $100m annually for a gigabyte of storage today.
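A quick check of that comparison. The inflation multiplier from 1956 dollars to today's is a rough assumption of ours; the lease price and capacity come from the figures above.

```python
# Cost per gigabyte-year of the 305 RAMAC, then adjusted for inflation.

LEASE_PER_MONTH = 3_200          # dollars per month, 1956
CAPACITY_GB = 4.4 / 1024         # 4.4 megabytes expressed in gigabytes

cost_per_gb_year_1956 = LEASE_PER_MONTH * 12 / CAPACITY_GB
print(f"${cost_per_gb_year_1956:,.0f} per GB-year in 1956 dollars")

INFLATION = 11                   # assumed multiplier to today's dollars
print(f"~${cost_per_gb_year_1956 * INFLATION / 1e6:,.0f}m per GB-year today")
```

About $9m per gigabyte-year in 1956 dollars, which inflation-adjusts to roughly the $100m in the text.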
Almost 70 years later, a gigabyte of storage costs pennies. Businesses and consumers can retrieve information much faster, from anywhere in the world, than they could have from a 305 RAMAC in the same room. What is more, they can work with this stored data where it is stored, rather than having to schlep it around. That is because their bytes are stored not in one jukebox, but in a great many of them: sliced up, replicated and distributed over a vast collection of computers and storage devices in massive data centres scattered across the world. In a word, the cloud.
The cloud is an abstraction of everything one could do on a 305 RAMAC and more. It endeavours to separate the actions of storing, retrieving and computing on data from the physical constraints of doing so. The concept intentionally obfuscates (clouds, one might say) the user’s ability to see the existence of hardware. To users, the cloud is a big virtual drawer or backpack into which you can put your digital stuff for safe-keeping, and later retrieve it to work on (or play with) anywhere at any time. It does not matter to you where or how—or indeed in how many pieces divided among various hardware devices strewn across the planet—your data is kept; you pay to not have to worry about it.
But to cloud providers the cloud is profoundly physical. They must build and maintain the physical components of the cloud and the illusion that goes with it, keeping up as the world produces more data that needs storing, sorting and crunching. The quantities of data being created are ever growing too. In 2023 the world generated around 123 zettabytes (that is, 123 trillion gigabytes) of data, according to International Data Corporation, a market-research firm. Picture a tower of DVDs growing more than 1km higher every second until, after a year, it reaches more than halfway to Mars. This data must be stored in different ways for different purposes, from spreadsheets that need to be available instantly, as on a bookshelf, to archival material that can be put in an attic. How is it possible to do all this in an orderly, easily retrievable way?
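The tower image checks out under standard assumptions about the discs (single-layer DVDs of 4.7GB capacity and 1.2mm thickness, our figures rather than the text's):

```python
# Sanity-checking the DVD-tower comparison for 123 zettabytes.

DATA_BYTES = 123e21              # data generated in 2023
DVD_BYTES = 4.7e9                # assumed: single-layer DVD capacity
DVD_THICKNESS_M = 1.2e-3         # assumed: disc thickness in metres
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

discs = DATA_BYTES / DVD_BYTES
height_km = discs * DVD_THICKNESS_M / 1000
print(f"tower height: {height_km / 1e6:.0f} million km")

growth_m_per_s = discs * DVD_THICKNESS_M / SECONDS_PER_YEAR
print(f"growth: about {growth_m_per_s:,.0f} m per second")  # about 1km/s
```

Roughly 31 million kilometres tall, against a closest Earth-Mars approach of some 55 million: a little over halfway, growing just about a kilometre each second.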
For a start, it helps to recognise the technical leaps in storage that have made the cloud possible. For each type of data and computational task there are different kinds of physical storage with trade-offs between cost, durability and speed of access. Much like the layers of the internet, the cloud needs these multiple layers of storage to be flexible enough to adapt to any kind of future use.
Inside an unassuming building in Didcot, England, in the Scientific Computing Department (SCD) at the Rutherford Appleton Laboratory, one of Britain’s national scientific-research labs, sit Asterix and Obelix, two stewards of massive quantities of data. They are robotically managed tape libraries—respectively the largest and second-largest in Europe. Together Asterix and Obelix store and keep organised the deluge of scientific data that comes in from particle-physics experiments at the Large Hadron Collider, at CERN, along with various other sorts of climate and astronomy research. The data produced by all this research has scaled up by orders of magnitude, says Tom Griffin, SCD’s director, which means they have had to switch from scientists coming in with laptops and USB sticks to creating a cloud of their own.
Asterix and Obelix form a sizeable chunk of the lab’s self-contained cloud (its computing power is conveniently located in the same room). Together the two can store 440,000 terabytes of data—equivalent to a million copies of the three “Lord of the Rings” films, extended edition, in 4K resolution. Each is made up of a row of cabinets packed with tape cartridges; if all the cartridges were unspooled, the tape would stretch from Athens to Sydney. When a scientist requests data from an experiment, one of several robots zooms horizontally on a set of rails to find the right cabinet, and vertically on another set of rails to find the right tape. It then removes the tape and scans through the reel in order to find the requested information. The whole process can take up to a minute.
Magnetic tape, similar to that used in old audio cassette tapes, might seem like an odd choice for storing advanced scientific research. But modern tape is incredibly cheap and dense (its data density has increased by an average of 34% annually for decades). This has been made possible by reducing the size of the magnetic particles—called “grains”—in which information is stored and by packing them more closely together. A single cartridge, maybe the size of two side-by-side audio cassettes, can hold 40 terabytes of data. That equates to almost 1m 305 RAMACs. Plus, it is durable and requires little energy to maintain. These qualities make tape the storage medium of choice not only for this scientific data, but also for big chunks of the cloud at Amazon, Google and Microsoft.
But if you are not a scientist at Didcot—or if you are, but you are taking a break to scroll your recent group chats and Instagram posts on your phone—you will want your data from the cloud much more quickly than you can get it from tape. Flash memory, in common use on laptops and phones, is best for when data needs to be frequently looked up or modified, like recent photos. Solid-state drives save data by trapping or releasing electrons in a grid of flash-memory cells. Retrieving the data is as simple as checking for the presence of electrons in each cell, and involves no moving mechanical parts; it takes about one-tenth of a millisecond, though if it is in the cloud instead of on your phone, add a few dozen milliseconds for delivery from the data centre. The data remains even when the power is turned off, though memory will eventually degrade as electrons leak out of the cells.
As new photos you take go to a data centre, your older ones get demoted from flash to old-fashioned hard-disk drives spread across multiple data centres, most likely including some in the country, or at least on the continent, where you live. These read and write data mechanically onto a spinning magnetic disk, not dissimilar from the 305 RAMAC, and are more than five times cheaper per gigabyte of storage than flash (though that gap is closing). Retrieval takes a sloth-like 5-10 milliseconds. Then years-old stuff that you forgot about might get further relegated, from disk drives to magnetic tape like they have at the Didcot lab.
Even on the side of the cloud provider, the exact physical device on which data is stored is abstracted away. One way that this is often done is called RAID (redundant array of independent disks). This takes a bunch of storage hardware devices and treats them as one virtualised storage shed. Some versions of RAID split up a photo into multiple parts so that no single piece of hardware has all of it, but rather several storage devices have slightly overlapping fragments. Even if two pieces of hardware break (and hardware failures happen all the time), the photo is still recoverable.
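A minimal sketch in the spirit of parity-based RAID (RAID 5, say): data is striped across several disks with one XOR parity block, so any single failed disk can be rebuilt from the survivors. This is an illustration of the principle, not any vendor's implementation, and it tolerates only one failure; schemes that survive two, as the text describes, use a second, differently computed parity block.

```python
# Striping with XOR parity: lose any one "disk" and rebuild it.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def stripe(data: bytes, n_data_disks: int):
    """Split data into equal blocks and append one XOR parity block."""
    size = -(-len(data) // n_data_disks)            # ceiling division
    data = data.ljust(size * n_data_disks, b"\0")   # pad the final block
    blocks = [data[i*size:(i+1)*size] for i in range(n_data_disks)]
    parity = blocks[0]
    for b in blocks[1:]:
        parity = xor_blocks(parity, b)
    return blocks + [parity]

def rebuild(blocks, lost: int):
    """Reconstruct one missing block by XOR-ing all the survivors."""
    survivors = [b for i, b in enumerate(blocks) if i != lost]
    out = survivors[0]
    for b in survivors[1:]:
        out = xor_blocks(out, b)
    return out

disks = stripe(b"a holiday photo", 3)
assert rebuild(disks, lost=1) == disks[1]  # disk 1 dies; the photo survives
```

Because XOR is its own inverse, the parity block lets any single missing piece be recomputed from the rest, which is why no one device needs to hold the whole photo.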
The cloud is also redundant in another way. Each piece of data will be stored in at least three separate locations. This means that were a hurricane or tornado or wildfire to destroy one of the data centres that had a copy of your photo, it would have two copies left to fall back on. This redundancy helps make cloud storage reliable. It also means that most of the time, millions of hard-disk drives are spinning on standby, just in case.
Still, companies are working on making the infrastructure of the cloud more robust. Tape, in particular, has its disadvantages as a long-term storage medium. It must be kept within a certain range of temperatures and humidities, and away from strong magnetic fields, which could erase the information. And it requires replacing every decade or two. So the hunt is on for something that takes up less room, lasts longer and requires less maintenance.
One promising medium is glass. A fast and precise laser etches tiny dots in multiple layers within platters of glass 75mm square and 2mm thick. Information is stored in the length, width, depth, size and orientation of each dot. Encoding information in glass in this way is the modern equivalent of etching in stone, says Peter Kazansky, one of the inventors of the technology, based at the University of Southampton in Britain. If you fry, boil, scratch or even microwave glass slides, you can still read the data.
Researchers at Microsoft are harnessing this tech to build a cloud out of glass. They increased capacity so that each slide can hold just over 75 gigabytes, and used machine-learning to improve reading speed. They claim their slides will last for 10,000 years. Microsoft has developed a system (much like the tape robots) that can handle thousands, or even millions, of these slides.
Achieving this kind of scale, without the need to supply power to storage shelves or to replace the storage devices themselves, is necessary to build a truly durable foundation for the cloud. Necessary, but not sufficient. For the cloud is not just a storage shed. Its users are demanding that it do a lot more computing work, and more quickly, than ever before.■