The mechanism used by ROMs to “store” information varies with different ROM technologies. Modern ROMs use an MOS transistor, one per stored bit, to distinguish between a 0 and a 1.
The figure below is the schematic of a primitive 8 × 4 ROM that you could build yourself using a 3-to-8 decoder and a handful of discrete NMOS transistors. The address inputs select one of the decoder outputs to be asserted. Each decoder output is called a word line because it selects one row or word of the table stored in the ROM. The figure shows the situation with A2–A0 = 101 and the ROW5 decoder output asserted.
Each vertical line in the figure below is called a bit line because it corresponds to one output bit of the ROM. An asserted word line turns on a transistor, if one is present, at the intersection of the word line and a bit line. A transistor pulls the bit line LOW when turned on. There is only one transistor in row 5, and when ROW5 is asserted, the corresponding bit line (D1_L) is pulled LOW. All of the other bit lines remain HIGH, since none of the other decoder outputs are asserted and all the other transistors in the array are off. The bit lines are buffered through inverters to produce the D3–D0 ROM outputs, 0010 for the case shown.
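The word-line/bit-line mechanism above can be sketched in a few lines of code. This is a behavioral model, not a circuit: a transistor at a (row, bit-line) intersection pulls that bit line LOW when its word line is asserted, and the output inverter turns the LOW bit line into a 1. Only row 5's contents come from the figure (output 0010); all other rows are left empty for illustration.

```python
# transistor_map[row] is the set of bit positions (3 = D3 ... 0 = D0)
# where a transistor is present on that word line.
# Row 5 has a single transistor on the D1 bit line, as in the figure.
transistor_map = {5: {1}}

def read_rom(address):
    """Return (D3, D2, D1, D0) for a 3-bit address of the 8 x 4 ROM."""
    selected = transistor_map.get(address, set())
    # A bit line stays HIGH unless a transistor pulls it LOW;
    # the output inverter turns a pulled-LOW bit line into a 1.
    return tuple(1 if bit in selected else 0 for bit in (3, 2, 1, 0))

print(read_rom(0b101))  # (0, 0, 1, 0) -- the D3-D0 = 0010 case shown
```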
Suppose you wanted to build a 128 × 1 ROM using the kind of structure described in the preceding subsection. Have you ever thought about what it would take to build a 7-to-128 decoder in two levels of logic? Try 128 7-input NAND gates to begin with, and add 14 buffers and inverters with a fanout of 64 each! ROMs with millions of bits and more are available commercially; trust me, they do not contain 20-to-1,048,576 decoders or worse. Instead, they use a different structure, called two-dimensional decoding, to reduce the decoder size to something proportional to the square root of the number of addresses.
The basic idea in two-dimensional decoding is to arrange the ROM cells in an array that is as close as possible to square. For example, the figure below shows a possible internal structure for a 128 × 1 ROM. The three high-order address bits, A6–A4, are used to select a row. Each row stores 16 bits starting at address (A6, A5, A4, 0, 0, 0, 0). When an address is applied to the ROM, all 16 bits in the selected row are “read out” in parallel on the bit lines. A 16-input multiplexer selects the desired data bit based on the low-order address bits.
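The address split described above can be made concrete with a short sketch. The 128 cells are modeled as an 8-row by 16-column array; A6-A4 select the row (a 3-to-8 decoder), and A3-A0 select one of the 16 bit lines (a 16-input multiplexer). The cell contents here are an arbitrary placeholder pattern, not from the figure.

```python
# 8 rows x 16 columns = 128 one-bit cells; the contents are hypothetical.
cells = [[(r * 16 + c) % 2 for c in range(16)] for r in range(8)]

def read_128x1(address):
    """Read one bit from the 128 x 1 ROM using two-dimensional decoding."""
    row = (address >> 4) & 0b111   # A6-A4: row decoder (3-to-8)
    col = address & 0b1111         # A3-A0: column multiplexer select
    word = cells[row]              # all 16 bits read out in parallel
    return word[col]               # the mux picks the single output bit
```

Note the payoff: a 3-to-8 decoder plus a 16-input multiplexer replaces a 7-to-128 decoder, and the same idea scales, keeping decoder size roughly proportional to the square root of the number of addresses.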
The following sections contain more information on ROMs, including a section on EPROMs, which you may or may not have used in this class. If you are interested, you can start with this information; only a basic understanding of this topic is needed for this class.
Unless you visit the Computer History Museum in Mountain View, CA, you won’t find any ROM modules built with discrete transistors or diodes. A modern ROM is fabricated as a single IC chip; one that stores 4 gigabits (2^32 bits) can be purchased for under $5. Various methods have been used to “program” the information stored in a ROM, as discussed next and summarized in the table below.
Most of the early integrated-circuit ROMs were mask-programmable ROMs (or, simply, mask ROMs). A mask ROM is programmed by the pattern of connections and no-connections in one of the masks used in the IC manufacturing process. To program or write information into the ROM, the customer would give the manufacturer a listing of the desired ROM contents on a disk or other medium. The manufacturer would use this information to create one or more customized masks to manufacture ROMs with the required pattern. Because of mask costs and the four-week delay typically required to obtain programmed chips, mask ROMs were used only in very high-volume applications. For low-volume applications there were more cost-effective choices, discussed next.
A programmable read-only memory (PROM) is similar to a mask ROM, except that the customer could store data values (i.e., “program the PROM”) in just a few minutes using a PROM programmer. A PROM chip is manufactured with all of its diodes or transistors “connected.” This corresponds to having all bits at a particular value, typically 1. The PROM programmer was used to set desired bits to the opposite value. In bipolar technology, this was done by vaporizing tiny fusible links inside the PROM corresponding to each bit.
Introduced later, an erasable programmable read-only memory (EPROM) could be programmed like a PROM, but could also be “erased” to the all-1s state by exposing it to ultraviolet light. No, the light does not cause fuses to grow back! Rather, EPROMs use a different technology, called “floating-gate MOS.”
As shown in the figure below, an EPROM has a floating-gate MOS transistor at every bit location. Each transistor has two gates. The “floating” gate is not connected to anything and is surrounded by extremely high-impedance insulating material. To program an EPROM, the programmer applies a high voltage to the nonfloating gate at each bit location where a 0 is to be stored. This causes a temporary breakdown in the insulating material and allows a negative charge to accumulate on the floating gate. When the high voltage is removed, the negative charge remains. During subsequent read operations, the negative charge prevents the MOS transistor from turning on when it is selected.
Early EPROM manufacturers guaranteed that a properly programmed bit would retain 70% of its charge for at least 10 years, even if the part was stored at 125°C, so EPROMs definitely fell into the category of “nonvolatile memory.” However, they could also be erased. The insulating material surrounding the floating gate becomes slightly conductive if it is exposed to ultraviolet light with a certain wavelength. Thus, EPROMs could be erased by exposing the chips to ultraviolet light, typically for 5–20 minutes, when the chip was housed in a package with a transparent quartz lid. Less expensive one-time programmable (OTP) versions of these devices were also offered without the quartz lid.
An electrically erasable programmable read-only memory (EEPROM) is like an EPROM, except that individual stored bits may be erased electrically. The floating gates in an EEPROM are surrounded by a much thinner insulating layer and can be erased by applying a voltage of the opposite polarity as the charging voltage to the nonfloating gate. Large EEPROMs (1 Mbit or larger) can be erased only in fixed-size blocks of 128 Kbits to 8 Mbits (16 Kbytes to 1 Mbyte). These memories are called flash EPROMs or flash memories, because an entire block can be erased “in a flash,” like the flash of a camera. The last flash EPROMs in the table above use a “NAND architecture” internally that brings benefits and limitations, as we’ll soon discuss.
As noted in the table, writing an EEPROM location takes much longer than reading it, so an EEPROM is no substitute for the volatile read/write memories discussed later in this chapter. Also, the insulating layer can be worn out by repeated programming operations. As a result, EEPROMs can be reprogrammed only a limited number of times, typically 10,000 to 100,000 times per location; that’s a second reason they’re no substitute for read/write memory.
EEPROMs are quite suitable for storing information that doesn’t change very often, such as the default configuration data and bootstrap programs for computers large and small, or the application software for the embedded processors in all sorts of equipment. On the other hand, when flash memory is used in a computer file system, where some files may be rewritten very frequently, special methods must be used to avoid “wearing out” some locations; we’ll say more about that later.
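One of the “special methods” alluded to above is wear leveling: spreading rewrites of frequently updated data across many physical locations so no single block exhausts its 10,000-to-100,000-cycle budget. The sketch below shows the simplest possible (round-robin) version of the idea; real flash controllers are far more sophisticated, and every name here is hypothetical.

```python
class WearLeveledSlot:
    """Crude round-robin wear leveling over a pool of flash locations."""

    def __init__(self, n_locations, max_writes_per_location=10_000):
        self.writes = [0] * n_locations  # per-location write counters
        self.limit = max_writes_per_location
        self.next = 0

    def write(self, value):
        # Rotate each rewrite to the next location instead of hammering
        # one cell; wear is shared evenly across the pool.
        loc = self.next
        self.writes[loc] += 1
        self.next = (loc + 1) % len(self.writes)
        return loc  # the location that would hold the value

slot = WearLeveledSlot(n_locations=4)
locs = [slot.write(x) for x in range(8)]
# With 4 locations, 8 rewrites cost each cell 2 write cycles, not 8.
```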
The arrangement of transistors in the previous figure is called a NOR architecture, because any of the transistors in a column can pull their bit line low, reminiscent of the parallel arrangement of NMOS transistors in a NOR gate. In the mid-1990s, the industry sought ways to build higher-density EEPROMs for new applications like digital-camera memories and ultimately high-capacity “solid-state disks” (SSDs) to replace mechanical magnetic disk drives, and they turned to another transistor arrangement called NAND architecture.
As shown in the new figure below, the NAND architecture does not have a ground connection for every bit of storage. Instead, a group of transistors is connected in series, as in a NAND gate, with only the last one connected to ground; all must be “on” to pull their bit line low. Typically, there are 16 to 32 transistors in series, and omitting most of the ground connections allows them to be packed more closely together. Compared to an array of NOR cells, a NAND array may be 40% smaller. Note that a complete memory chip will have more groups of 16 to 32 words below the topmost group shown in the figure below, connected to the same bit lines and using the same circuitry to read the values placed on the bit lines.
In a NAND memory, the transistor thresholds and programming levels are set up so that a transistor whose word line is HIGH will be on regardless of whether it stores a 1 or 0. A transistor whose word line is LOW will be off or on depending on whether or not a charge has been stored on its floating gate. Thus, a word (row) is read by setting the group and ground select lines HIGH, and setting all of the word lines HIGH except for the desired word, whose word line is set LOW. Each long column of 16 to 32 transistors in series will pass current or not, depending on the value of its stored bit for the selected word.
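The series-string read rule above can be captured in a small behavioral model. Following the floating-gate convention used earlier, assume a stored charge (a programmed 0) keeps a cell off when its word line is LOW, while an erased cell (a 1) still conducts; cells whose word lines are HIGH conduct regardless. The cell contents below are hypothetical.

```python
def read_nand_word(column, selected_word):
    """Read one bit from a NAND series string.

    column: stored bits, one per series transistor
            (1 = erased, 0 = programmed charge on the floating gate).
    selected_word: index of the word whose word line is driven LOW.
    """
    conducts = all(
        True if word != selected_word  # word line HIGH: always on
        else bit == 1                  # word line LOW: on only if erased
        for word, bit in enumerate(column)
    )
    # Current through the whole string means the selected cell is erased.
    return 1 if conducts else 0

string = [1, 0, 1, 1] + [1] * 12  # one 16-transistor series string
print(read_nand_word(string, 1))  # 0: word 1 holds a programmed 0
```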
The higher density of NAND memory comes at a price in performance, in particular, access time. A row in a NOR memory array can be read out fairly quickly, typically within tens of nanoseconds. In a NAND array, the current that flows in a column is an order of magnitude lower than with NOR, and it can only be sensed reliably by integrating the current (charge transfer) over a relatively long time, on the order of microseconds. Thus, NAND memories are unsuitable for providing random access to instructions or data in a microprocessor system, which requires access times in the tens of nanoseconds.
However, on-chip memory arrays are big, NAND arrays are bigger still, and they can access a lot of data in parallel. So, manufacturers of NAND memory have targeted applications that benefit from relatively fast access to large chunks of data, rather than fast word-by-word random access. It’s no surprise that this characterizes NAND memory’s most popular applications, including photo storage in digital cameras, program and data storage in notebooks and smartphones, and SSDs in larger computers. When programs or data must be accessed randomly in these applications (for example, when a program is actually invoked and is running), the content is first copied into volatile, read/write random-access memory.
The difference between NAND and NOR memory is often described in terms of their external interfaces, which are quite different. But the difference between those interfaces has been driven by their different applications, rather than their internal array architectures, as we’ll see in the next subsections.