Information Technology forms an integral part of our everyday lives and has been key in driving the development of today's increasingly data-driven world. Computers have become tightly interwoven with education, health and manufacturing. Thus, constant iteration, predominantly on hardware but also on software infrastructure, is a cornerstone of progress within society and an indispensable necessity.
There is a general need for growth in computational power. The last few decades have seen an exponential rise in the amount of raw data produced and stored, a growth rate which far outstrips the computational growth described by Moore's Law, the famous observation. Data is no longer simply numerical or text-based: audio, images and video are all growing rapidly in size as resolutions increase. All of this data is worthless if it is not processed. Processing data simply means conducting operations on a set of input data to create more useful output data. Arithmetic and logical operations, along with the movement, storage and retrieval of data (the four fundamental pillars of computing), transform one type of data into another, more useful form.
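As a concrete, deliberately trivial illustration of these four pillars, the short Python sketch below takes a set of made-up sensor readings as input data and, through storage, retrieval, arithmetic and logic, produces a more useful output; every value in it is invented purely for illustration.

```python
# Minimal sketch: raw input data is transformed into more useful output data
# using the four pillars named above: arithmetic, logic, storage and retrieval.
# The readings and the 20.0 threshold are made-up illustrative values.

raw_readings = [18.2, 21.7, 19.9, 23.4, 20.1]   # input data (e.g. sensor temperatures)

storage = {}                                     # storage: keep the raw data for later use
storage["temperatures"] = raw_readings

retrieved = storage["temperatures"]              # retrieval: move the data back into the computation
average = sum(retrieved) / len(retrieved)        # arithmetic: reduce many numbers to one summary
too_warm = average > 20.0                        # logic: turn the summary into a yes/no decision

print(f"average = {average:.2f} C, too warm = {too_warm}")  # output data, more useful than the raw list
```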
Over the course of the past four decades, gains in performance have largely been driven by CMOS scaling. However, due to physical and economic limitations, future growth will have to be sustained by Beyond CMOS technologies. Each of these has its own benefits and drawbacks, affecting its viability for commercialisation. Due to limited resources, only a few can be selected for further research, and these will shape the devices of the future.
Computers have been particularly effective in revolutionising how theories are formed and experiments are conducted in the sciences. Simulations are extremely useful when the actual experiment is too dangerous, too expensive or even impossible to conduct with current technology. Examples of such models include the Big Bang, protein structures and engineering problems concerning structures such as bridges or skyscrapers. Often, the achievements of scientists and other researchers are limited by the power of their computers. There are two main drivers for increasing computational power in the scientific field: multiphysics and confidence.
The former is essential to better understanding complex real-world systems, where a wide mix of computational models is used. Here, one object may interact only weakly with another, but these effects compound, leading to an exponential increase in complexity. An example is the addition of chemical bonds to the structural simulation of proteins. The ability to model ever more complex problems will not only speed up research such as drug discovery, saving lives, but also allow for a completely new level of simulation, where the entire human body could be modelled to test the efficacy of a drug.
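To give a sense of why this compounding is so costly, the following sketch counts the pairwise interactions that must be evaluated at every time step if every object in a simulation can affect every other; the object counts are arbitrary illustrative figures.

```python
# Toy illustration of why coupling more objects/models makes simulations explode in cost.
# The numbers are made up; the point is the growth of pairwise interactions that must
# be evaluated at every time step when every object weakly affects every other.

def pairwise_interactions(n_objects: int) -> int:
    """Number of object-object couplings if everything interacts with everything."""
    return n_objects * (n_objects - 1) // 2

for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} objects -> {pairwise_interactions(n):>12,} interactions per time step")
# 10 objects need 45 evaluations; 10,000 objects need ~50 million, before any extra
# physics (e.g. chemical bonds) multiplies the cost of evaluating each interaction.
```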
The latter relates to how accurately the model represents the real world. With more powerful computers, models can be simulated over longer time scales or with greater numerical precision, increasing the accuracy of the results.
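A minimal sketch of this idea is shown below: the same toy model (exponential decay, whose exact answer is known) is simulated with progressively smaller time steps, and the error against the exact result shrinks as more computation is spent. The model and step sizes are chosen purely for illustration.

```python
import math

# Sketch: the same simple model (dy/dt = -y, exact solution e^-t) simulated with
# coarser and finer time steps. More compute (more, smaller steps) gives results
# closer to the true value, which is the "confidence" benefit described above.

def euler_decay(dt: float, t_end: float = 5.0) -> float:
    y = 1.0
    for _ in range(round(t_end / dt)):
        y += dt * (-y)      # forward Euler update
    return y

exact = math.exp(-5.0)
for dt in (0.5, 0.1, 0.01, 0.001):
    approx = euler_decay(dt)
    print(f"dt = {dt:<6} result = {approx:.6f}  error = {abs(approx - exact):.6f}")
```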
The growth of computational capability is essential not only for the sciences, but also for education, defence, entertainment and countless other sectors.
The definition of a computer has changed significantly as our technology has developed and matured. When we refer to a computer today, we usually picture an electronic device capable of storing and processing data according to a set of instructions provided in the form of a program. However, the term was originally coined as a reference to a person who helped perform calculations, usually with the help of a machine. These people were key to encoding the problem, generally onto punch cards, which were then fed into a machine. The machine would return a result, which the "computers" would then decode.
Vacuum tubes brought a revolution to computing, transforming passive electrical circuits, which relied primarily on electromechanical relay technology, into active electrical circuits capable of modulating current. The introduction of the vacuum tube, an active electrical component, ushered in a new era of electronic computing.
One of the most common vacuum tube designs was the triode, invented by Lee de Forest in 1906. The device was simple, containing a grid placed between a cathode and a plate, all enclosed within an evacuated tube, hence the name. Heating the cathode caused electrons to be freed. By controlling the voltage applied to the grid, the electrons were either repelled back towards the cathode (OFF state) or allowed to travel to the plate (ON state). This formed the basic binary system underpinning the majority of modern computing. However, the device was inherently inefficient, since a great deal of heat was generated. The tubes were unreliable, consumed a large amount of energy and took up a lot of space. This made the technology very expensive, and commercialisation at the individual consumer level impossible.
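The sketch below reduces the triode described above to the binary abstraction it made possible; it is a toy model rather than device physics, and the cut-off value is purely illustrative.

```python
# Toy model of the triode described above, reduced to the binary abstraction it enabled.
# Real triodes are analogue devices; the cut-off value here is purely illustrative.

def triode_state(grid_voltage: float, cutoff: float = -5.0) -> int:
    """Return 1 (ON: electrons reach the plate) or 0 (OFF: electrons repelled)."""
    return 1 if grid_voltage > cutoff else 0

# A stream of grid voltages becomes a stream of bits, the basis of binary computing.
print([triode_state(v) for v in (-10.0, -2.0, 0.0, -8.0)])   # -> [0, 1, 1, 0]
```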
The transistor was invented in 1947 and announced shortly after, in 1948, by Bell Laboratories engineers John Bardeen and Walter Brattain. It was the ideal replacement for the vacuum tube: not only did it consume less power, it was also smaller, faster and easier to manufacture. This first transistor was essentially a solid-state replacement for the relay switch, very different from the modern devices seen in today's electronics. William Shockley, who also worked at Bell, invented the junction transistor a few months later. The junction transistor utilised semiconductors such as germanium and silicon. Although this type is closer to a modern transistor, the first metal oxide semiconductor field effect transistor (MOSFET) would not be invented until 1959, by Dawon Kahng and Martin Atalla. The MOSFET was the device to have the most profound impact on computation and the development of electronic components.
The development of the MOSFET coincided with the invention of the integrated circuit by Jack Kilby (Texas Instruments) and Robert Noyce (Fairchild) in 1958 and 1959 respectively. Rather than hundreds or thousands of separate transistors needing to be connected to each other by wires, all of the transistors (and resistors and capacitors) could now be laid onto a single substrate. This made the entire semiconductor circuit far more compact, and it could easily be scaled up. The era of rapid scaling had begun, with a large rise in computing performance seen every year, leading Gordon Moore, who would later co-found Intel, to make his famous observation in 1965. Just as the vacuum tube once transformed the industry, is there a modern alternative poised to succeed the integrated circuit?
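The sketch below works through the arithmetic behind the observation in its commonly quoted form, a doubling of transistor counts roughly every two years; the starting count is an illustrative figure of the early-1970s era rather than data from any specific roadmap.

```python
# Back-of-the-envelope arithmetic for Moore's observation: transistor counts roughly
# doubling every two years (the commonly quoted form). The starting count is an
# arbitrary illustrative figure for an early-1970s chip.

transistors = 2_300
for year in range(0, 21, 2):
    print(f"year +{year:>2}: ~{transistors:,} transistors")
    transistors *= 2
# Ten doublings in twenty years is a factor of 2**10 = 1024, which is why even a
# modest-sounding doubling period compounds into enormous gains.
```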
The Metal Oxide Semiconductor Field Effect Transistor (MOSFET) forms the basis of all modern electronics. It is the active electrical device which produces the patterns of ones and zeros commonly known as the foundation of nearly all modern computing. It is an active component because it can modulate the flow of charge carriers through a circuit; the opposite is true of passive components such as relays, resistors or capacitors. Although it is commonly thought that MOSFETs, or transistors, are the very smallest parts of our computers, the MOSFET itself is a much larger component formed from many smaller parts.
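To make the idea of "modulating the flow of charge carriers" concrete, the sketch below implements the textbook long-channel ("square-law") model of an n-type MOSFET; the threshold voltage and device constant are illustrative values, not parameters of any real process.

```python
# Sketch of the textbook long-channel ("square-law") model of an n-type MOSFET,
# showing how the gate voltage modulates the drain current, i.e. the "active"
# behaviour described above. Parameter values are illustrative only.

def drain_current(v_gs: float, v_ds: float, v_th: float = 0.7, k: float = 2e-3) -> float:
    """Drain current in amps; k = mu_n * C_ox * (W/L) lumps the device constants."""
    v_ov = v_gs - v_th                           # overdrive voltage
    if v_ov <= 0:
        return 0.0                               # cutoff: the switch is OFF
    if v_ds < v_ov:
        return k * (v_ov * v_ds - v_ds**2 / 2)   # triode (linear) region
    return 0.5 * k * v_ov**2                     # saturation: current set by the gate alone

for v_gs in (0.0, 0.9, 1.2):
    print(f"V_GS = {v_gs} V -> I_D = {drain_current(v_gs, v_ds=1.0) * 1e3:.3f} mA")
```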
To learn more about how this device works, visit the following link or read the full EPQ:
To learn more about SCEs, visit the following link or read the full EPQ:
Traditional CMOS (Complementary Metal Oxide Semiconductor) scaling has become increasingly difficult, especially over the course of the last decade. This is mainly due to the rising cost of development, as ever smaller gate lengths have led to more severe short-channel effects (SCEs). Many people believe that a technological limit is what slows down or stops Moore's Law. However, this is simply not true. As Gordon Moore said himself, "Moore's Law is really about economics". As long as there is demand for faster, more power-efficient semiconductors, technological progress will continue.
Even though demand for semiconductors has been rising, the rising cost of development has eclipsed the capabilities of many foundries. Thus, only four companies plan to continue their efforts in scaling down transistors: Intel, TSMC, Samsung and GlobalFoundries. However, the last of these is an example of how even these large companies are struggling in an era of post-Dennard scaling.
In early 2018, GlobalFoundries installed two new EUV machines at its Fab 8 facility in an effort to keep pace with the competition. However, soon after, in mid-2018, it halted its plans to move to the new 7 nm node and to the use of EUV. Its investors considered selling the EUV machines, prioritising profitability over innovation. This signalled the start of a fall in year-on-year growth in computing power, but it also marked the path which the industry will likely take over the next few years.
It shows that, in the short term, growth will not be sustained primarily by the shrinking of transistors. Instead, growth will be facilitated by a greater focus on parallel CPU architectures and advancements in specialised silicon. The demand for chips with a specific purpose, e.g. those used to power neural networks or manage vast server farms, will rise. The focus on increased parallelism will drive CPU core counts upwards, and the demand for HBM (High Bandwidth Memory) will increase as there is a shift in how memory and compute are integrated.
Although the growth in profits for these companies has slowed, they still turn large profits every year. For example, TSMC was responsible for manufacturing the 7 nm chipsets used by Apple and Huawei in the iPhone XS and the Huawei Mate 20 Pro. In 2018, TSMC remarked that it is confident scaling will continue down to 3 nm and 2 nm, showing that the hurdles to Moore's Law are economic, not technological.
Furthermore, SEMI's World Fab Forecast reinforces the strength of the semiconductor market. It forecasts that investment in new fabrication technologies will grow 14% to $62.8 billion annually, with investment in research growing to $17 billion. The increased cost of research and manufacturing is countered by the ever increasing demand for silicon: Q3 2018 silicon wafer area shipments increased 3% on Q2 2018 to set another all-time high, according to the SEMI Silicon Manufacturers Group (SMG), and were 8.6% higher than in the same period of 2017. All of this data shows that there is incredibly strong demand for further innovation within the semiconductor industry. Since Moore's Law is driven by economics, this reinforces the idea that we will not see a significant slowdown in the growth of computational power. So, if transistors cannot be shrunk, what are the alternative methods, and which is most likely to succeed?
As the difficulty of scaling rises, companies are taking longer between major architectural revisions. However, investment in parallelisation has dramatically increased. In early June 2019, AMD announced its new Ryzen 3000 series of processors, truly bringing high-core-count processing to the general public. For example, the Ryzen 9 3900X offers 12 cores and 24 threads to the general consumer for a price of $499. Such high core counts were previously reserved for enterprise customers, often also requiring specialised hardware such as custom motherboards and cooling solutions.
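A standard way to reason about what those extra cores actually buy is Amdahl's law (not something discussed by AMD here, but a classic rule of thumb): the achievable speedup is limited by the fraction of the work that can run in parallel. The sketch below applies it to a hypothetical 12-core workload; the parallel fractions are illustrative, not measurements of the 3900X.

```python
# Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n), where p is the fraction of the
# work that can run in parallel and n is the core count. The p values below are
# illustrative, not measurements of any particular workload or CPU.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for p in (0.50, 0.90, 0.99):
    print(f"p = {p:.2f}: 12 cores -> {amdahl_speedup(p, 12):.2f}x speedup")
# Even 12 cores give under 2x if only half the work is parallel, which is why
# software must be restructured to benefit from high-core-count CPUs.
```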
Ryzen 9 is based on the new Zen 2 architecture, which emphasises the use of chiplets. Consequently, the die on the new AMD processors consists of a combination of chiplets from a variety of vendors. For example, the IO is separated from the "Core Complex" (also known as the CCX) of the processor, and the CCX itself comprises four cores. The IO die must communicate with the CCX in a fast and efficient manner. This is where AMD's Infinity Fabric, announced in April 2017, comes into play, allowing the CCX to be connected to a plethora of different chiplets.
It is important to note that it is not just AMD driving the use of chiplets. Intel, the largest manufacturer of desktop processors, also plans to join the chiplet revolution.
A chiplet is a specialised piece of silicon, designed with one particular task in mind. A collection of these chiplets can be assembled and connected with an interconnect architecture, allowing them all to function as one large chip. One of the primary benefits of this modular approach is that it allows equipment to be specialised easily. Furthermore, it can increase semiconductor yield, since a faulty chiplet can be discarded, provided it has not already been connected to the main die. It could also lead to a fabless semiconductor industry, where a specialised package is made by assembling more general-purpose chiplets. This would greatly accelerate innovation and competition within the industry, but a few major hurdles must be overcome, as detailed in later sections.
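The yield argument can be made concrete with the simple Poisson yield model often used for back-of-the-envelope estimates; the defect density and die areas in the sketch below are illustrative figures, not data from any foundry.

```python
import math

# Sketch of why smaller chiplets can raise yield, using the simple Poisson yield
# model Y = exp(-area * defect_density). The defect density and die areas below
# are illustrative figures, not data from any real process.

DEFECT_DENSITY = 0.1   # defects per cm^2 (illustrative)

def die_yield(area_cm2: float) -> float:
    return math.exp(-area_cm2 * DEFECT_DENSITY)

monolithic = die_yield(8.0)                  # one large 800 mm^2 die
chiplet = die_yield(1.0)                     # one small 100 mm^2 chiplet
print(f"800 mm^2 monolithic die yield: {monolithic:.1%}")
print(f"100 mm^2 chiplet yield:        {chiplet:.1%}")
# Good chiplets can be tested and picked before assembly, whereas a single defect
# anywhere on the monolithic die scraps the whole chip.
```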
As time has progressed, the types of computing have diversified. Since there are a variety of different workloads, different architectures are needed. All of these different architectures need support, which becomes quite difficult when there are so many variations. The solution to this problem is heterogeneous integration, of which chiplets are one form. In the future, the use of these heterogeneous technologies may make the adoption of germanium or III-V semiconductors much easier. Currently, we are only using silicon with 2D integration, but we are already witnessing the benefits, as can be observed with the rise of HBM. However, memory is just the first phase, as demonstrated by Ryzen 3000 and Zen 2.
After spending years in development and millions in development costs, Intel believes that its approach is finally ready for production. It is called the Embedded Multi-Die Interconnect Bridge, or EMIB for short. It comprises a high-density bridge which connects the individual chiplets together. A silicon interposer brings all the individual chiplets together on the package substrate. (A silicon interposer is a silicon substrate with dense interconnects and TSVs (Through-Silicon Vias) built into it, allowing for high-bandwidth connections.) The high-density interconnects on the silicon substrate are called microbumps (as named by Intel), which Intel says provide a higher density than standard packaging substrates. Using this new technology will be costly, so Intel plans to use it only where it is needed, localising it to particular areas of the die.
To demonstrate the viability of EMIB, Intel has already made some products powered by this technology commercially available. A prominent example is Kaby Lake-G. It represents a partnership between Intel and AMD, integrating an AMD GPU and HBM alongside Intel's CPU. More interestingly, a standard HBM interface was used for the memory in the package, whereas PCI-E (a standard circuit-board-level interface) was used within the package to integrate the CPU and GPU; both are industry standards. The use of standard interfaces shows that Intel is actively encouraging the shift to a fabless industry, where the cost of chip development can be dramatically lowered.
Along with the cost reductions which Intel desperately needs to compete today, a chiplet arrangement like the one in Kaby Lake-G allows chip designers to focus solely on their strengths and overcome their weaknesses by partnering with other chip designers. AMD built the GPU and Intel the CPU for this product because these are the components each company is better at designing and building. This project was simply a testing ground for future partnerships between the two companies, which could ultimately lead to much better products for the consumer. As a result, EMIB is able to benefit the consumers who use, the companies which design and the foundries which manufacture the chips.
Today, the major chipset vendors have a choice to make. They can either continue down the path of traditional CMOS scaling, where the pitch of the transistors is made ever smaller, or start a revolution: a shift not only in how chips are developed and manufactured, but also in how business itself is conducted within the industry. In an ideal world, both options would be pursued to their fullest extent. However, in the real world, even the largest companies have limited resources for research and development, so a firm decision has to be made.
Choosing the first option provides a predictable path for the next half-decade. It is a relatively safe option: although scaling is becoming more of a challenge, both from an economic and a scientific perspective, these companies have decades' worth of experience in extracting performance from scaling. The improvements to power consumption and efficiency can be predicted easily and accurately, making the creation of a roadmap quite easy.
However, choosing this path will lead to imminent stagnation, which would truly slow Moore's Law after five years. Even if the majority of research and development resources were poured into scaling, a hard wall would eventually be hit. With no other technology to replace it, this relatively safe option becomes one of unsustainable growth. Hitting a dead end in performance growth would have disastrous consequences for all sectors of society, not just the semiconductor industry, since almost all sectors rely on computation and hence on computational growth.
The second option is unpredictable, but boundless in terms of potential. Firstly, it will allow growth to continue well past the decade mark. Currently, we are only seeing the results of small experiments with chiplets. In the near future, it is almost certain that logic-to-logic integration in 3D will become possible. Extending the die into 3D space will allow chip designs never possible before and a rate of growth never witnessed. Furthermore, it provides a path to a future where silicon is not the primary semiconductor, since this modular architecture would allow the integration of traditional CMOS with Beyond CMOS technologies.
The use of chiplets will also be a catalyst for innovation and collaboration between companies. For example, specialised hardware for growing fields such as machine learning would be much easier to construct and maintain, in addition to being orders of magnitude faster than anything available today or achievable through traditional scaling in the future. The barriers to entry into the semiconductor industry would be greatly lowered, since a foundry would no longer be required; the resulting competition would increase quality and reduce prices. In addition to this, chiplets promote IP (Intellectual Property) reuse, hugely reducing the research and development costs of chip design, as architectures do not have to be built from scratch. As highlighted previously, reducing these costs will be key if Moore's Law is to continue.
Even though chiplets provide the potential for a bright future, they too have hurdles to overcome. Currently, as stated by Intel, the use of chiplets is still relatively expensive. This is a barrier which will certainly be overcome, and in some ways already has been: with its Ryzen 3000 series, AMD not only brings 7 nm to the consumer, but also its Infinity Fabric, an interconnect architecture similar to EMIB (but proprietary).
The larger barrier to EMIB is the lack of a universal standard across the industry. For example, although chiplets have the potential to increase yield, they can also reduce it: without a universal standard test to verify quality, if even one faulty chiplet is used in a package, all the other good chiplets become useless. Furthermore, vendors must do more to support power and thermal management, so that each individual chiplet provides sufficient information for power and thermals to be managed correctly. This is especially important in the future with 3D stacking, where the confined spaces around chiplets in the middle of the stack pose a major challenge for heat dissipation. Finally, a mechanical standard for the placement of the microbumps is also needed, to increase compatibility across vendors.
Investment solely in chiplets is not viable either, because although chiplets can provide increases in performance, the primary source of increased efficiency is still traditional scaling. As a result, I believe that there must be investment in both scaling and chiplets, but with a greater focus on the development of chiplet-based technologies until the materials to replace silicon are perfected.
Extreme Ultraviolet Lithography (EUV) is a long-awaited technology which will make the production of chipsets with pitch lengths of 7 nm and under much easier and cheaper. It has been pursued for over a decade and is finally ready to replace the traditional immersion lithography technique.
The traditional technique works by shining light with a wavelength of 193 nm through a photomask, a patterned surface which shapes the wide beam of light into the specific patterns required. The light then falls onto the silicon wafer, where it reacts with a photosensitive chemical, allowing circuits and features to be etched onto the wafer. However, modern transistors have pitch lengths far smaller than the wavelength of this light, so each layer must be built up from multiple exposures. The whole process must be repeated around 80 times, which makes lithography difficult and expensive with the older technology.
In comparison, EUV is a massive leap forward, since it uses light with a wavelength of just 13.5 nm. Since this is much closer to the feature size of the transistor, the number of lithography steps required is greatly reduced. For example, GlobalFoundries aimed to combine 15 lithography steps into just 5. If transistors with feature sizes of 5 nm and below were manufactured using immersion lithography, the number of lithography steps required would be over 100, which is simply impractical.
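The step-count advantage follows from the resolution limit of the optics, usually expressed through the Rayleigh criterion. The sketch below compares the two wavelengths using typical published numerical apertures and a typical k1 factor; the exact values are assumptions chosen for illustration.

```python
# Sketch of the standard Rayleigh resolution criterion used to compare the two
# technologies: smallest half-pitch ~= k1 * wavelength / NA. The k1 and NA values
# are typical published figures and are used here only for illustration.

def min_half_pitch(wavelength_nm: float, numerical_aperture: float, k1: float = 0.3) -> float:
    return k1 * wavelength_nm / numerical_aperture

immersion = min_half_pitch(193.0, 1.35)   # 193 nm ArF immersion lithography
euv = min_half_pitch(13.5, 0.33)          # 13.5 nm EUV lithography
print(f"Immersion single exposure: ~{immersion:.0f} nm half-pitch")
print(f"EUV single exposure:       ~{euv:.0f} nm half-pitch")
# Immersion resolves only ~40 nm half-pitch in one exposure, so smaller features
# need several exposures per layer (multiple patterning); EUV reaches ~12 nm in one.
```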
The limiting factor for EUV technology for many years, and even today, has been the power of the required light source. At the end of 2018, there was a light source capable of outputting 250 W, but in order to manufacture chips with feature sizes of 3 nm and 1 nm, powers of 500 W and 1000 W respectively will be required. The challenge is creating a light source with sufficient power output and high enough efficiency to reduce the immense amount of energy required. For example, the 7 nm EUV machine at GlobalFoundries uses an astonishing megawatt of power to deliver just a few watts of light onto the silicon wafer.
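Taking "a few watts" as 5 W purely for illustration, the arithmetic below shows just how small the wall-plug efficiency implied by these figures is.

```python
# Quick arithmetic on the figures quoted above. "A few watts" is taken as 5 W
# here purely for illustration.

input_power_w = 1_000_000          # ~1 MW drawn by the tool
light_at_wafer_w = 5               # illustrative "few watts" of EUV reaching the wafer

efficiency = light_at_wafer_w / input_power_w
print(f"Wall-plug to wafer efficiency: {efficiency:.6%}")   # ~0.0005%
# This is why raising source power from 250 W towards 500-1000 W without also
# improving efficiency would make EUV tools even more power-hungry.
```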
In addition to this, EUV is also plagued by other, much smaller problems. The masks required for EUV work completely differently to those used in immersion lithography: instead of transmitting the light, they reflect it, so slight imperfections on the surface of the mask can render the end product, the wafer, useless. Furthermore, these masks must be protected from dust using pellicles. However, pellicles from immersion lithography cannot be used, since they are opaque to EUV light. The pellicles are relatively easy to develop; the larger concern is the mask itself. Thus, researchers are working on developing actinic patterned-mask inspection, a technology to detect imperfections on the surface of the mask. However, this is proving to be much more difficult than previously expected. Currently, manufacturers work around this through the use of "naked" masks (masks without pellicles), but this will not be possible in the future with the advent of smaller nodes.
In conclusion, foundries currently face a choice between immersion and EUV lithography. EUV is more efficient in its use of energy per finished wafer, since far fewer steps are required, but the upfront cost of EUV machines may outweigh the demand for the resulting chipsets, as shown by GlobalFoundries. However, in the very near future, EUV will almost certainly become the dominant technology, since the cost of running immersion lithography machines on nodes smaller than 7 nm will be greater than the cost of an entire EUV machine.