Consumer and Professional Digital Video Recording Formats

Copyright © 1995, 1996, 1997, 1998, Roger Jennings, all rights reserved. Revised and updated October, 1998. This document may be copied and/or distributed for non-commercial purposes only. Please send comments and corrections to the author at Roger_Jennings@compuserve.com.

Sony Corporation's September 1995 announcement of two digital video (DV) camcorders sparked a flurry of articles in consumer-oriented video magazines. The first "affordable" digital camcorders with digital inputs and outputs had arrived about six months earlier than projected by most industry pundits. Sony's 3-CCD DCR-VX1000 (U.S. MSRP US$4,199) and, to a lesser extent, the single-CCD Sony DCR-VX700 (MSRP US$2,999, now discontinued) captured the imagination of both professional and amateur videographers. Direct digital output to non-linear editing (NLE) systems and lossless multigenerational editing using conventional linear methods seemed about to become a reality. Although Matsushita beat Sony to the punch by a couple of weeks with a DV camcorder announcement, most reviewers considered the Panasonic model's lack of an IEEE-1394 (FireWire®) connector to be a critical omission. The Sony and Panasonic camcorders claimed 500 TV lines of horizontal resolution achieved by a luminance sampling rate of 13.5 MHz and chrominance sampling rate of 6.75 MHz (4:1:1 YCrCb format for NTSC), which is a substantial improvement over the bandwidth of Hi8 and S-VHS formats that offer about 400-line resolution.

In 1996, JVC and Sony released two highly miniaturized DV camcorders to the U.S. market. The US$3,000 Sony DCR-PC7, which also comes in a DVCAM version, sports a 2.5-inch fold-out LCD monitor in addition to a color viewfinder, and offers an IEEE-1394 connector, which Sony now calls i.LINK™. In January 1997, Matsushita released details of two miniature DV camcorders, one of which has a 4-inch LCD monitor. Both of the new Panasonic camcorders have IEEE-1394 connectors. Sharp's 1996 DV entry featured a large LCD display, but neither the early JVC nor the early Sharp models included IEEE-1394 capability. Sony's DHR-1000 DVCR (MSRP US$4,199) gained widespread U.S. availability in early 1998. Later in 1998, Sony introduced the GV-D900 and GV-D300 "DV Video Walkman," with and without a 5.5-inch active-matrix LCD display, respectively. According to In-Stat, a market research organization, more than three million consumer DV camcorders had been sold world-wide as of August 1998.

Three "professional" variants of the consumer DV format have set off a new skirmish in the video "format war." Panasonic took the wraps off its then-nascent DVCPRO product line at the National Association of Broadcasters (NAB) show and convention in April 1995. Panasonic's DVCPRO uses the same basic video and audio encoding format as the consumer DV format, but most DVCPRO gear is priced beyond even affluent prosumers' pocketbooks (US$10,000 and up). Sony unveiled its DVCAM format, also based on the DV standard, at NAB '96. As of mid-1998, Sony offers a complete line of DVCAM camcorders and DVTRs. At Amsterdam's International Broadcast Conference (IBC) in September 1996, Panasonic announced its intent to deliver a DVCPRO variant (DVCPRO-50) with 3.3:1 compression in a 4:2:2 YCrCb format with a 50-Mbps data rate, twice that of the consumer DV format. The full line of DVCPRO-50 gear became available in fall 1998. So far, sales of DVCPRO gear have far outpaced those of DVCAM gear; the fate of DVCPRO-50 remains to be seen. Paxson and other broadcasters recently have made major buys of DVCPRO-50 VTRs.

This paper describes the technology behind consumer DV camcorders and DV tape decks (consumer DVCRs and professional DVTRs), including recording formats, basic electronic architecture for DV, and the data format for communication between consumer and professional DV devices. For the most part, this paper is based on the Digital Interface for Consumer Electronic Audio/Video Equipment, Draft 2 (DI Draft 2), jointly presented by Philips, Matsushita, Thomson Multimedia, and Sony to the October 1995 meeting of the (IEEE) 1394 Trade Association. Additional details are derived from General Specifications for Consumer-Use Digital Interface (Specification), a proposed International Electrotechnical Commission (IEC) standard for transmitting digital audio/video data over the IEEE-1394-1995 High Performance Data Bus, "The Development of Audio and Video Signal Processing LSI for SD-DVC" (IEEE Transactions on Consumer Electronics, Vol. 41, No. 3, August 1995), and The Digital Video Tape Recorder by John Watkinson (1994, Focal Press, ISBN 0-240-51373-8.) Information about Panasonic's DVCPRO product line is based on a paper entitled "DVCPRO: A comprehensive Format Overview," presented at the SMPTE meeting in New Orleans (Fall 1995), and Panasonic product bulletins. Details of Sony's DVCAM format are based on published Sony specification sheets.

Cassette Dimensions and Interchangeability

Part 1 of the Specifications defines the dimensions for standard and small DV cassettes, and recording geometry. The standard (4.9 x 3 x 0.57 inch) cassette is designed for use in DV videocassette recorders (DVCRs) for both home recording and playing prerecorded cassettes up to 4.5 hours in length. Figure 1 shows the bottom view and insertion end view of the standard (4.5-hour) DV cassette, which is slightly larger than an audio Compact Cassette. The small (2 x 2.2 x 0.5 inch) DV cassette is intended for use in DV camcorders. Figure 2 shows the bottom view and insertion end view of the small (30-minute or 1-hour) DV cassette.

Fig. 1. Standard (4.5-Hour) DV Cassette for VCRs

Fig. 2. Small (30-Minute or 1-Hour) DV Cassette for Camcorders

Tape width is 6.35 mm (1/4-inch) and tape speed in normal play/record mode is 18.81 mm/sec (about 0.74 inch/sec). Thus a one-hour mini-DV cassette has a total tape length of about 68 m and the 4.5-hour standard cassette contains about 305 m of tape. Consumer DVCRs accommodate both standard and small cassettes, without the need for an adapter, by means of moving reel motors. One of the interesting features of both cassette sizes is Memory in Cassette (MIC); cassettes include non-volatile RAM that, under the specifications, can store up to 16 MB (megabytes). MIC can store a variety of information in a combination of fixed and optional data areas. Sony consumer DV cassettes have 512 bytes of MIC RAM that hold information on the tape type, cassette grade, plus date and time of multiple recording segments. MIC data is used internally by the VCR or camcorder, and is transmitted over the digital interface. Only Sony DV tapes currently offer MIC; MIC is not required for DV cassette compatibility.
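The tape lengths follow directly from the tape speed and the running time. The short Python sketch below reproduces the arithmetic; the function name is illustrative only, and the result counts recording length alone (leader and any spare tape are ignored), so the figures should be read as estimates.

```python
# Tape consumed at the normal DV play/record speed quoted in the text.
TAPE_SPEED_MM_PER_S = 18.81


def tape_length_m(minutes, speed_mm_per_s=TAPE_SPEED_MM_PER_S):
    """Length of tape, in meters, that passes the heads in the given time."""
    return speed_mm_per_s * minutes * 60 / 1000.0


if __name__ == "__main__":
    for label, minutes in (("mini-DV, 1 hour", 60), ("standard, 4.5 hours", 270)):
        print(f"{label}: {tape_length_m(minutes):.1f} m")
```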

Sony DV tape consists of the following five layers:

  • Back coating (lubricant) to reduce friction between the tape and guide pins.
  • Base film that acts as the substrate of the tape.
  • Double metal-evaporated magnetic layer.
  • Evaporated carbon overcoat layer (Diamond-Like Carbon, DLC) to protect the magnetic layer.
  • Surface preparation layer (lubricant) to reduce friction between the recording drum and the tape, and to act as a barrier to oxidation of the magnetic layer.

Hi8 metal-evaporated (ME) tapes have a history of problems with dropouts, both in new and heavily-used tapes. Thus many Hi8 shooters use a metal particle tape, such as Fuji's ME-221, designed for the flux density and having the coercivity of metal-evaporated tape. Initial reports indicate that neither Sony nor Panasonic ME tapes for DV camcorders exhibit dropout problems, even with repeated reuse. One user of Sony tapes recently reported logging 24 hours of footage without a single dropout. One-hour mini-DV cassettes have a street price from about US$12 (Panasonic, no MIC) to US$20 (Sony with MIC, US$14 without).

Tape Track Geometry and Recording Sectors

DV recording is an extraordinary example of miniaturization of electronic devices. DV uses helical azimuth recording (+/- 10 degrees), which requires a minimum of two heads on a drum rotating at approximately 9,000 rpm for NTSC video (exactly 9,000 rpm for PAL and SECAM.) The track width is 10 microns (millionths of a meter), compared with Hi8's 20.5 microns and VHS's 58 microns. (A human hair is about 100 microns in diameter.) The track slant angle is about 9 degrees, resulting in a nominal track length of 35 mm and an active track length of 33 mm. DV does not provide a conventional control track to maintain head tracking; instead, pilot tones are embedded in the data tracks. The speed of the capstan varies in accordance with the relative intensity of low- and high-frequency pilot tones created by "flipping" unused bits in the data. In addition to the helical tracks, the Specification shows two optional linear tracks; DVCPRO uses these two linear tracks for linear timecode (LTC) and audio cueing during shuttle. DVCPRO also provides an internally-generated vertical interval time code (VITC) for backward compatibility with existing editing systems. Figure 3 shows the track geometry and division of the helical (slant) track into standard sectors for the DV format.

Note: The basic track geometry shown in figure 3 can be used for a variety of video and audio applications, in addition to DV. As an example, the audio and video sectors could be combined into a single sector containing MPEG-2 video and audio data for timeshift recording of direct-to-home satellite broadcasts from programmers such as DirecTV and EchoStar. An application code (APT) in the ITI sector specifies the number and location of track sectors, as well as their purpose. The version of the Specification discussed here is limited to the DV format in which the APT value is 000. The Panasonic DVCPRO format uses optional linear track 1 as an audio cue track and track 2 as a control track providing linear timecode (LTC). The track width of the DVCPRO format is 18 microns with a tape speed of 33.8539/1.001 mm/sec (33.82 mm/sec). The increased track width and the concomitant increase in tape speed reduce recording density to allow the use of metal particle tape, rather than DV's metal-evaporated tape. DVCPRO uses a pair of separate erase, record, and playback heads, for a total of 6 heads on a 21.7 mm drum.

Fig. 3. Track Recording Geometry (Dimensions in millimeters)

To maintain maximum commonality between 525/60 (NTSC) and European 625/50 video formats, the DV format changes the number of tracks per frame. NTSC uses 10 tracks per frame (at 29.97 fps +/- 1 percent) and PAL requires 12 tracks per frame (at 25 fps). Figure 4 shows the frame sequence for NTSC systems.
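The track counts per frame tie back to the drum speeds quoted earlier. The following Python sketch is a rough consistency check, assuming (this paper does not state it explicitly) that each of the two heads writes one track per drum revolution; the function name is hypothetical.

```python
HEADS_ON_DRUM = 2  # minimum head count for helical azimuth recording, per the text


def drum_rpm(tracks_per_frame, frames_per_s, heads=HEADS_ON_DRUM):
    """Drum speed needed if every head lays down one track per revolution."""
    tracks_per_s = tracks_per_frame * frames_per_s
    revolutions_per_s = tracks_per_s / heads
    return revolutions_per_s * 60


if __name__ == "__main__":
    # NTSC frame rate taken as 30/1.001 (about 29.97) frames per second.
    print(f"NTSC: {drum_rpm(10, 30 / 1.001):.1f} rpm")
    print(f"PAL:  {drum_rpm(12, 25):.1f} rpm")
```

Under that assumption, 12 tracks at 25 fps give exactly 9,000 rpm, and 10 tracks at 29.97 fps give 9,000/1.001 (about 8,991) rpm, which matches the "approximately 9,000 rpm" figure for NTSC.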

Fig. 4. Frame Sequence for Recording (NTSC)

To provide for video-over-sound (VOS) insert editing, audio dubbing, and timecode during shuttle, the helical track is divided into the following four track sectors:

  • Insert and Track Information (ITI) Sector. The ITI sector contains information on track status and serves in place of a conventional control track during video insert editing, because the tracking pilot tones in the insert region are not accessible during overwrite.
  • Audio sector. The audio sector contains both audio data and auxiliary data (AAUX). DV accommodates two 32-kHz, 12-bit (nonlinear) stereo channels (1 and 2) or one 48-, 44.1-, or 32-kHz, 16-bit stereo channel (1). Sampling of both 32-kHz audio formats or the 48-kHz format can be (and ordinarily is) locked to the horizontal frequency for frame synchronization. The Sony consumer camcorders record only a single PCM 32-kHz, 12-bit stereo channel, which has a high-frequency cutoff below 16 kHz. Unlike Hi8, which provides independent stereo FM and 16-bit PCM (pulse-code modulation) tracks with about the same frequency response, a DVCR lets you dub background music or narration only to the second 32-kHz, 12-bit channel. If you want the full frequency range of a CD (44.1 kHz, 16-bit), you'll need to use a digital audio editing and mixing system and settle for a single stereo channel.
  • Video sector. The video sector contains video data and auxiliary video data (VAUX). Video data is compressed about 5:1, depending upon the amount of motion within a frame. Discrete cosine transform (DCT) compression is used, variations of which also are used for spatial compression by Motion-JPEG and MPEG. Groups of 27 8- by 8-pixel compressed blocks are gathered into macroblocks for recording. VAUX data includes recording date and time, lens aperture, shutter speed, color balance, and other camera setting data.
  • Subcode sector. This sector stores a variety of information, the most important of which is timecode. The subcode sector uses very short blocks of data (called packs) to maximize the probability of recovering continuous timecode data during shuttle.

Note: The NTSC versions of DV camcorders and VCRs use SMPTE dropframe timecode (29.97 fps), rather than the non-dropframe timecode (based on 30 fps) used by consumer and prosumer Hi8 VCRs (Sony Rewritable Consumer Timecode, RCTC).
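For readers unfamiliar with dropframe counting, the Python sketch below shows the commonly used conversion from a running frame count to a SMPTE dropframe label. It is a general illustration, not taken from the Specification: frame labels 00 and 01 are skipped at the start of every minute except each tenth minute, so that the label keeps pace with clock time at 29.97 fps. All names are illustrative.

```python
DROPPED_PER_MIN = 2                          # labels skipped per dropped minute
FRAMES_PER_DROP_MIN = 30 * 60 - DROPPED_PER_MIN          # 1798 real frames
FRAMES_PER_10_MIN = 10 * 30 * 60 - 9 * DROPPED_PER_MIN   # 17982 real frames


def frames_to_dropframe(frame_count):
    """Convert a zero-based frame count to an HH:MM:SS;FF dropframe label."""
    tens, rem = divmod(frame_count, FRAMES_PER_10_MIN)
    # Nine of every ten minutes skip two labels.
    skipped = DROPPED_PER_MIN * 9 * tens
    if rem > DROPPED_PER_MIN:
        skipped += DROPPED_PER_MIN * ((rem - DROPPED_PER_MIN) // FRAMES_PER_DROP_MIN)
    label = frame_count + skipped
    ff = label % 30
    ss = (label // 30) % 60
    mm = (label // 1800) % 60
    hh = label // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"


if __name__ == "__main__":
    for n in (0, 1799, 1800, 17982, 107892):
        print(n, frames_to_dropframe(n))
```

The semicolon before the frame field is the usual convention for marking a dropframe label.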

An advantage of digital recording is that a substantial amount of auxiliary data can accompany the digitized audio and video. As an example, the Specification specifies formats for the following types of auxiliary data in subcode packs:

  • Titles, tables of contents, chapters, and parts for prerecorded tapes
  • Program identification for user-recorded tapes
  • Teletext (Japanese and European versions only)
  • Subtitles and karaoke lyrics in multiple languages
  • Closed captioning in multiple languages

The DV format includes a provision for copy management, although the copy management mechanism remains "to be determined" in the Specification. A combination of high price and copy management issues led to the demise of DAT as a major consumer audio recording format. Sony explained the delay in U.S. availability of a companion DVCR for the DV camcorders as due to copy management issues. (The Sony DHR-1000 PAL DVCR was available in Europe about six months before the U.S. version was announced.) Copy management relates to DVCR owners copying digital data from Digital Video/Versatile Discs (DVDs), not today's movies recorded in VHS format. DV technology creates a duplicate indistinguishable from the original, so U.S. movie makers want the DVD format and DVD players to include an encryption system that prevents copying and a geographical playback restriction so that, for example, European users can't play back DVDs released in the U.S.

Encrypted programming with and without a pre-recorded decryption key is provided for a "restricted audience." A recorded decryption key allows only those who know the key to play a prerecorded tape. Thus youngsters can be thwarted from watching adult fare and children can be protected from overly violent movies and even user-recorded TV programs. This feature assumes a market for pre-recorded DV tapes, a highly unlikely event. An unrecorded decryption key might be used for DBS timeshift recording in conjunction with a DBS set-top box. (MPEG-2 video compression has a substantially lower data rate than DV, and the DV format provides an MPEG-2 option.) As long as you pay your monthly programming fee, the encryption code is transmitted by satellite, and you can watch your recorded tapes; stop paying and your tapes won't play.

DV Camcorder and DVCR Video Quality and Architecture

The architecture of DV camcorders and DVCRs doesn't differ materially from today's high-end compressed digital video recording systems, such as Digital Betacam (DB). Both DV and DB are component formats, encoding luminance (Y) and two separate chrominance (R-Y and B-Y) signals on tape. DV uses a 13.5-MHz sampling rate (as does DB), but DB uses 4:2:2 encoding to obtain increased chrominance fidelity compared with DV's 4:1:1 sampling. DB also offers 10-bit encoding (versus DV's 8 bits) for improved signal-to-noise ratio (SNR). Digital Betacam uses almost 2:1 spatial compression (with 8-bit sampling), while DV employs nominal 5:1 compression. (The next section discusses relative compressed data rates.) DV gains some of its compression ratio by adding interfield (not interframe) compression of video images that don't have substantial motion. Because interfield compression results in a variable amount of data per frame and DV requires a constant digital data rate, adaptive intraframe spatial compression is necessary. As the amount of motion in a scene increases, the spatial compression increases (and vice-versa).

Most video professionals conclude that the DV format compares favorably with analog Betacam SP. DV's video signal-to-noise ratio of 54 dB is somewhat better than Betacam SP's 51 dB, and DV's 5.75 MHz luminance bandwidth beats Betacam SP's 4.1 MHz by a significant margin. Professional Betacam SP camcorders have substantially better lenses and larger CCDs than consumer DV gear, but cost at least five times as much as the street price of the DCR-VX1000. Most professional-quality interchangeable camcorder lenses (Fujinon, Canon) cost more than the entire DCR-VX1000. Direct comparison of DV vs. Betacam SP image quality requires the use of a professional dockable (digital) camcorder with a DV recorder, such as JVC's BR-DV1U that mates directly with most JVC dockable camcorders. (An optional adapter fits Sony dockable camcorders. Sony offers only a DVCAM dockable recorder.)

Note: A companion paper, "DV vs. Betacam SP: 4:1:1 vs. 4:2:2, Artifacts and Other Controversies," provides a technical comparison of the luma and chroma bandwidth of the DV format and several Betacam SP VTRs.

Fig. 5. Simplified Block Diagram of a DV Camcorder or DVCR in Record Mode

Figure 5 is a simplified block diagram of a DV camcorder or DVCR based on the designs of the Sony DCR-VX1000 and DHR-1000. Figure 5 shows record mode; reversing the data flow direction provides an approximation of the playback architecture. The Sony camcorders have digital (IEEE 1394-1995) and analog video (S-video/composite) and single-channel (32-kHz, 12-bit, non-linear stereo) audio outputs, but only a digital input. (European versions of the DCR-VX1000 don't have a digital input because of the much higher tariff for video recorders.) The Canon XL-1 has four 32-kHz, 12-bit or two 48-kHz, 16-bit audio inputs; an accessory shoulder pad provides two balanced (XLR) audio inputs. DVCRs, such as the Sony DHR-1000, incorporate full digital and composite/S-video (Y/C) analog and audio I/O, including support for two stereo channels using the 32-kHz, 12-bit format or one stereo 48/44.1-kHz 16-bit linear channel. As shown in figure 5, interfield motion detection is used to determine if a substantial amount of information is duplicated between fields. If such is the case, only the differences between fields are encoded and the quantization threshold is changed to maintain a constant data rate.

Video, Audio, and Data Recording Formats

Figure 6 shows the recording format for video, audio, and subcode data for a single frame of NTSC video. A frame of compressed video data consists of 10 tracks of 138 data blocks containing 76 bytes of actual video data and a 1-byte header. Multiplying 10 tracks * 138 blocks * 76 bytes * 8 bits/byte * 29.97 frames/sec results in a video data rate of 25.146 Mbps (Megabits per second), which corresponds to the nominal 25 Mbps video data rate ascribed to the DV format. The theoretical data rate of 8-bit ITU-R BT.601 (formerly CCIR-601) with non-standard 4:1:1 decoding (720 active video samples per line, 8 luminance bits and 4 average chrominance bits per sample, 485 active lines, 29.97 frames/sec) is 125.5 Mbps. 125.5/25.146 gives the 5:1 compression ratio generally attributed to DV, which encodes 480 active lines per frame. If the compression ratio is based on BT.601's standard 4:2:2 decoding, the compression ratio is about 167.5/25.146 or about 6.6:1. Because of the adaptive quantizing process that results in lower compression for frames of relatively low motion content, the overall compression ratio, compared to BT.601 can be considered between 5:1 and 6:1.
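The data-rate and compression-ratio arithmetic in the preceding paragraph can be checked with a few lines of Python. This is a sketch using only the figures quoted above; the function names are illustrative, and the printed values land within rounding of the numbers in the text.

```python
FRAME_RATE = 29.97  # NTSC frames per second


def dv_video_rate_mbps(tracks=10, blocks_per_track=138, payload_bytes=76):
    """Recorded DV video payload rate in megabits per second."""
    bits_per_frame = tracks * blocks_per_track * payload_bytes * 8
    return bits_per_frame * FRAME_RATE / 1e6


def uncompressed_rate_mbps(bits_per_sample, samples_per_line=720, lines=485):
    """Uncompressed active-picture rate for a given average bit depth per sample."""
    return samples_per_line * lines * bits_per_sample * FRAME_RATE / 1e6


if __name__ == "__main__":
    dv = dv_video_rate_mbps()
    rate_411 = uncompressed_rate_mbps(8 + 4)  # 8 luma + 4 average chroma bits
    rate_422 = uncompressed_rate_mbps(8 + 8)  # 8 luma + 8 average chroma bits
    print(f"DV video rate:  {dv:.3f} Mbps")
    print(f"4:1:1 source:   {rate_411:.1f} Mbps, ratio {rate_411 / dv:.2f}:1")
    print(f"4:2:2 source:   {rate_422:.1f} Mbps, ratio {rate_422 / dv:.2f}:1")
```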

Fig. 6. Video, Audio, and Subcode Recording Format

Audio data consists of 9 blocks of 76 bytes, giving a maximum audio data rate of 10 tracks/frame * 9 blocks/track * 76 bytes/block * 8 bits/byte * 29.97 frames/sec = 1.64 Mbps. Four 32-kHz, 12-bit channels require a data rate of 4 channels * 32/1000 MHz * 12 bits = 1.536 Mbps, the maximum digital audio data rate. (Two 48-kHz, 16-bit channels also need 2 * 48/1000 * 16 = 1.536 Mbps.) The aggregate recording rate, including parity but less the ITI sector, is about 10 * ((90 * 163) + (12 * 12)) * 8 * 29.97 = 35.5 Mbps. Thus about 35.5 - 25.15 - 1.64 = 8.7 Mbps or about 25 percent of the recorded data is devoted to subcode data, error detection, and error correction.
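The same bookkeeping can be written out as a short Python sketch, again using only the block counts and byte sizes quoted above. The helper names are illustrative, not part of the Specification.

```python
FRAME_RATE = 29.97
TRACKS_PER_FRAME = 10


def block_rate_mbps(blocks_per_track, bytes_per_block):
    """Rate in Mbps for a sector made of fixed-size blocks on every track."""
    bits = TRACKS_PER_FRAME * blocks_per_track * bytes_per_block * 8
    return bits * FRAME_RATE / 1e6


def pcm_rate_mbps(channels, sample_rate_hz, bits_per_sample):
    """Raw PCM audio rate in Mbps."""
    return channels * sample_rate_hz * bits_per_sample / 1e6


def aggregate_rate_mbps():
    """Recorded rate including parity but excluding the ITI sector."""
    bytes_per_track = (90 * 163) + (12 * 12)
    return TRACKS_PER_FRAME * bytes_per_track * 8 * FRAME_RATE / 1e6


if __name__ == "__main__":
    video = block_rate_mbps(138, 76)
    audio = block_rate_mbps(9, 76)
    total = aggregate_rate_mbps()
    overhead = total - video - audio
    print(f"audio capacity: {audio:.2f} Mbps")
    print(f"4 x 32 kHz x 12 bit: {pcm_rate_mbps(4, 32_000, 12):.3f} Mbps")
    print(f"2 x 48 kHz x 16 bit: {pcm_rate_mbps(2, 48_000, 16):.3f} Mbps")
    print(f"aggregate: {total:.1f} Mbps, overhead {overhead:.1f} Mbps "
          f"({100 * overhead / total:.0f} percent)")
```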

Video and audio data is accompanied by error-correcting rank (inner) and file (outer) parity codes. Figure 7 shows a simple error correction system created by a single outer and inner parity bit for an 8- by 8-bit matrix (8 bytes). To create the parity bit, you count the number of 1 bits in each column and row. Add a 1 to the parity cell for columns or rows with an uneven number of 1 bits (called even parity.) If a single bit gets "flipped" during recording or playback, the inner and outer parity bits can locate the flipped bit and correct the error. Single-bit parity, however, can't correct multiple-bit errors in a single column or row for a bit matrix larger than 3 by 3. More sophisticated error correction codes based on polynomial expansions are used for large blocks of data. Like CD-ROMs, DV uses Reed-Solomon (RS) error detection and correction coding. RS can correct localized errors, but seldom can reconstruct data damaged by a dropout of significant size (burst error). Error concealment techniques use an estimate of the missing data based on preceding and succeeding fields or frames; error concealment techniques are not described in the Specification. If concealment is implemented in DV devices, it is assumed that proprietary methods will be used, because concealment is a device-internal function. Concealment is a very complex process and requires a substantial amount of RAM to be successful.
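The row-and-column parity scheme described above is simple enough to demonstrate in a few lines of Python. The sketch below is an illustration of that toy scheme only (it is not Reed-Solomon coding), and it assumes the stored parity bits themselves arrive intact; the function names are hypothetical.

```python
def parity_bits(matrix):
    """Return even-parity bits for every row and every column of a bit matrix."""
    row_parity = [sum(row) % 2 for row in matrix]
    col_parity = [sum(col) % 2 for col in zip(*matrix)]
    return row_parity, col_parity


def correct_single_error(matrix, row_parity, col_parity):
    """Flip back a single corrupted bit in place; return its (row, col) or None."""
    rows_now, cols_now = parity_bits(matrix)
    bad_rows = [i for i, (a, b) in enumerate(zip(rows_now, row_parity)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(cols_now, col_parity)) if a != b]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        r, c = bad_rows[0], bad_cols[0]
        matrix[r][c] ^= 1  # the failing row and column intersect at the bad bit
        return r, c
    return None  # no error, or more damage than this scheme can locate


if __name__ == "__main__":
    data = [[(3 * r + c) % 2 for c in range(8)] for r in range(8)]  # 8 bytes
    rows, cols = parity_bits(data)
    data[2][5] ^= 1  # simulate a bit flipped during recording or playback
    print("corrected bit at", correct_single_error(data, rows, cols))
```

Two flipped bits in the same row leave the row parity unchanged and mark two columns, which is why the function gives up in that case rather than guessing.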

Fig. 7. Simplified Diagram of Error Correction with Parity Bits

LSI Implementation of the DV Recording System

Matsushita Electric Ind. Co., Ltd. (MEI) describes a large-scale integration (LSI) implementation of the circuitry required for consumer-use DV products in a paper, "The Development of Audio and Video Signal Processing LSI for SD-DVC," published in the August 1995 IEEE Transactions on Consumer Electronics. The LSI chipset is used in Panasonic's DVCPRO product line, as well as its 1-CCD and 2-CCD consumer camcorders, and is the primary component of Panasonic Broadcast & Television Systems Company's DVCPRO developer's kit. One of the most interesting features of the MEI LSI chipset is its capability to interpolate DV's 4:1:1 YCrCb digital component data to and from standard 8-bit parallel BT.601 4:2:2. Audio I/O uses the industry-standard Philips I2S interface. Thus commercial ICs can be used to deliver serial D-1 (SMPTE 259M) video and AES/EBU serial audio for integration of DVCPRO gear into existing digital patch bay systems. Although neither the current Panasonic DVCPRO products nor the consumer camcorders include IEEE-1394 digital I/O as a standard feature, the partitioning of the LSI chipset is suited to addition of standard LSI components to implement IEEE-1394 connectivity. Truevision and Matrox have announced IEEE-1394 adapter cards that use the Panasonic chipset as a hardware DV codec (coder-decoder). FAST Multimedia's DVMaster adapter uses a Sony hardware DV codec, which (at least theoretically) reduces the time needed to render transitions and special effects.

Note: DVCPRO uses 4:1:1 encoding for the PAL 625/50 format, rather than DV's standard PAL 4:2:0 encoding. According to Panasonic representatives, 4:1:1 encoding provides better image quality than 4:2:0. (This claim is disputed by the DV Consortium; Panasonic probably uses 4:1:1 encoding for both NTSC and PAL because it simplifies the chipset.) Panasonic uses the 16-bit, 48-kHz uniformly quantized digital stereo audio recording format in both its consumer and DVCPRO product lines. The LSI chipset also supports the 12-bit, 32-kHz stereo format used by Sony and other consumer camcorders.

Fig. 8. Simplified Block Diagram of Matsushita's LSI Implementation of DV Circuitry

Matsushita's LSI chips divide the processing chores into Application and Tape Format layers, as shown in figure 8. The Application layer handles shuffle/deshuffle and compression/decompression of the DV data, while the Tape Format layer implements RS error correction/detection and modulation/demodulation for the record/playback heads. The two layers are interconnected by Matsushita's proprietary DVC_BUS architecture, which consists of eight data lines (BD0...BD7) and three control signals: BDEN (data enable), BDCK (data clock), and BQUIET (data start). Recording or playing back the audio and video data involves a two-frame delay, as shown in figure 9.

Fig. 9. Timing Diagram for Matsushita's LSI Chipset for DV and DVCPRO

Digital Interface (DIF) Format

The organization of digital input and output data differs from the recording format shown in preceding figure 6, primarily because error correction is not required for digital transmission. (Error detection, which is simpler than error correction, is used to resend data packets containing errors.) Figure 10 shows the organization of data for digital transmission via IEEE 1394-1995. Isochronous data is packaged into digital interface (DIF) sequences. One NTSC frame consists of 10 DIF sequences, each of which contains 150 DIF blocks of 80 bytes (20 quadlets) each. The maximum number of quadlets (called payload) in a 100-Mbps (S100) 1394 packet is 256, so a DIF sequence (3,000 quadlets) must be sent in multiple packets, with padding to equalize the size of the packets. (The organization of DIF blocks and packets is covered by the Specifications; six DIF blocks padded with two quadlets comprise a 122-quadlet packet for S100 525/60 SD video systems; 12 DIF blocks padded with four quadlets comprise an 1125/60 HD packet.) Thus 25 packets are required for an NTSC DIF sequence, which translates into 250 packets per NTSC frame. Six DIF blocks are devoted to header, subcode, and video auxiliary (VAUX) data. The remaining 144 video and audio DIF blocks contain data identical to that recorded in the shaded video data and audio data blocks of figure 6. The DIF data rate is 10 sequences/frame * 150 blocks/sequence * 80 bytes/block * 8 bits/byte * 29.97 frames/sec = 28.77 Mbps (not including quadlet padding). The single-byte header in the DIF block, identified as H in preceding figure 6, often is considered part of the DIF data, so many diagrams show 77 bytes/DIF block.
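The packet and data-rate arithmetic for the 525/60 SD case can be laid out as a short Python sketch. It uses only the block sizes, block counts, and padding quoted in the paragraph above; the constant names are illustrative, and the packet size is simply the sum of six 20-quadlet DIF blocks and two padding quadlets.

```python
FRAME_RATE = 29.97
SEQUENCES_PER_FRAME = 10      # NTSC
BLOCKS_PER_SEQUENCE = 150
BYTES_PER_BLOCK = 80
BYTES_PER_QUADLET = 4
BLOCKS_PER_PACKET = 6         # S100 525/60 SD case
PADDING_QUADLETS = 2
MAX_PAYLOAD_QUADLETS = 256    # S100 limit quoted in the text

quadlets_per_block = BYTES_PER_BLOCK // BYTES_PER_QUADLET
quadlets_per_sequence = BLOCKS_PER_SEQUENCE * quadlets_per_block
packet_quadlets = BLOCKS_PER_PACKET * quadlets_per_block + PADDING_QUADLETS
packets_per_sequence = BLOCKS_PER_SEQUENCE // BLOCKS_PER_PACKET
packets_per_frame = packets_per_sequence * SEQUENCES_PER_FRAME
dif_rate_mbps = (SEQUENCES_PER_FRAME * BLOCKS_PER_SEQUENCE * BYTES_PER_BLOCK
                 * 8 * FRAME_RATE / 1e6)

if __name__ == "__main__":
    print(f"quadlets per DIF sequence: {quadlets_per_sequence}")
    print(f"quadlets per packet:       {packet_quadlets}")
    print(f"packets per sequence:      {packets_per_sequence}")
    print(f"packets per NTSC frame:    {packets_per_frame}")
    print(f"DIF data rate:             {dif_rate_mbps:.2f} Mbps")
```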

Note: A companion paper, "Fire on the Wire: The IEEE 1394 High Speed Digital Bus," by this author deals with the physical implementation of the IEEE 1394-1995 bus in a digital video environment. Although 200-Mbps (S200) and 400-Mbps (S400) implementations of IEEE-1394 are available, the data rate of the DV implementation of the bus remains at 100-Mbps. Fortunately, 100-Mbps devices are downwardly-compatible with higher-speed buses.

Fig. 10. Organization of DIF Blocks within Video Frames

Figure 11 shows the transmission sequence of DIF blocks within a single DIF sequence. Nine audio DIF blocks are interleaved with 135 video blocks in a 9- by 16-block matrix following the six blocks of header, subcode, and VAUX data. (Audio auxiliary data, AAUX, is included within the audio blocks.) Although the video blocks are numbered in sequence in figure 11, the sequence does not correspond to left-to-right, top-to-bottom transmission of blocks of video data. Compressed macroblocks are shuffled in the recording process to minimize the effect of contiguous errors on the appearance of a single video frame; shuffling also aids in error correction and makes error concealment more effective. Audio data also is shuffled. Data is transmitted in the same shuffle order as recorded.
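A small Python sketch makes the block accounting of a DIF sequence concrete. The split of the first six blocks (one header, two subcode, three VAUX) and the placement of each audio block at the head of a row of 15 video blocks are assumptions made for illustration; this paper gives only the totals (6 + 9 + 135 = 150 blocks).

```python
from collections import Counter


def dif_sequence_layout(rows=9, video_per_row=15):
    """Return the assumed block-type order for one 150-block DIF sequence."""
    layout = ["header"] + ["subcode"] * 2 + ["vaux"] * 3  # assumed 1/2/3 split
    for _ in range(rows):
        layout.append("audio")                  # assumed: audio leads each row
        layout.extend(["video"] * video_per_row)
    return layout


if __name__ == "__main__":
    layout = dif_sequence_layout()
    print(len(layout), dict(Counter(layout)))
```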

Fig. 11. Transmission Sequence of Data, Audio, and Video Content in DIF Blocks

Derivative DV Formats

The following four 1/4-inch formats, each of which is based on the DV standard, were announced in 1995 and 1996:

  • DVCPRO is a proprietary Panasonic Broadcast and Digital (formerly Television) Systems Co. (PB&DSC) format that uses metal-particle (MP) tape, an 18-micron track width, and a tape speed of 33.82 mm/sec, but otherwise complies with the consumer DV format for NTSC video recording, except for cassette size. The Society of Motion Picture and Television Engineers (SMPTE) has designated DVCPRO as the D-7 digital recording format. Philips-BTS also produces DVCPRO equipment under license from Matsushita. As noted earlier in this paper, DVCPRO uses optional linear tracks 1 and 2 for audio cue and control tracks, respectively. PB&DSC's primary market for DVCPRO is electronic news gathering (ENG) and corporate/industrial video production. PB&DSC claims that the 18-micron track width and use of MP tape provide the positioning accuracy and media durability required for conventional linear editing applications. Studio-grade DVCPRO decks sell in the range of US$10,000 to US$18,000 and offer 4X transfer between DVCPRO VTRs; PB&DSC announced under-US$10,000 DVCPRO decks in late 1996. Panasonic showed at Fall 1996 Comdex a line of lower-priced "multimedia" DVCPRO gear, including the AJ-D230 desktop VTR and the AJ-D200 camcorder, which uses 1/3-inch conversion lenses. Optional IEEE-1394 I/O adapters were announced for both the AJ-D230 and AJ-D200 for 1997. DVCPRO decks play back mini-DV tapes with a cassette adapter, and also can read DVCAM cassettes.
  • DVCAM is Sony's DV variant for industrial videomakers. DVCAM uses ME tape and differs from the DV format only in track pitch (15 microns) and tape speed (28.22 mm/s). Sony claims the 15-micron track pitch is necessary for frame-accurate linear editing. 4.5-hour DV cassettes provide 3 hours of DVCAM recording, in contrast to DVCPRO's 123-minute maximum with the large DVCPRO cassettes. DVCAM cassettes include a 2-Mbit MIC that permits storing data on up to 198 scenes. Unlike Sony's consumer DV camcorders, which use the 12-bit, 32-kHz audio format, DVCAM records 16-bit, 48 kHz stereo audio for DAT-like quality. Sony offers the DSR-130, a dockable DV camcorder, the DSR-85 DVCAM VTR, and ES-7 edit station for hybrid and non-linear editing. The low-end DSR-200 camcorder is a DVCAM version of the DCR-VX9000 (a shoulder-mount DCR-VX1000 with manual control by knobs and switches, rather than menu operations.) Like DVCPRO, DVCAM offers 4X digital transfer between VTRs and disk drives. DVCAM VTRs can read conventional consumer DV tapes, but cannot play DVCPRO tapes.
  • DVCPRO-50 is a recently-announced PB&DSC format for recording video on 1/4-inch tape at 3.3:1 compression with 4:2:2 sampling, specifications similar to those for JVC's 1/2-inch Digital-S format. This new DV-related format doubles the number of tracks per frame to 20 (NTSC) or 24 (PAL) and tape speed (to 67.64 mm/s); it uses an additional two heads (for a total of four) to record and play back the added tracks. The data rate doubles to 50 Mbps, still well within the 80-Mbps maximum isochronous rate of 100-Mbps IEEE-1394. 4:2:2 DVCPRO is directed to broadcast signal transport and video archiving applications where 4:2:2 R.601 sampling is considered mandatory. PB&DSC intends 4:2:2 DVCPRO to be interoperable with tapes recorded in the standard DVCPRO format.
  • SDL is a long-play consumer DV format introduced by Sony in conjunction with the release of the DCR-PC7 miniature camcorder. Along with the HD (16:9 analog High-Definition) and MPEG-2 formats, SDL is included in the DVC "Blue Book" specification. SDL achieves 50 percent more recording time on mini-DV cassettes by reducing the track width and slowing the tape speed commensurately.

With the exception of DVCPRO-50, each of the preceding formats uses the standard DV data format described in the Specification. Thus there is no significant difference in the underlying technology, signal transport, or integrated-circuit chipsets needed to implement these formats.
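The recording times quoted for DVCAM and SDL follow from a simple relationship: for a given length of tape, running time varies inversely with tape speed. The sketch below works through that arithmetic. The 18.81 mm/s consumer DV tape speed and the 60-minute mini-DV cassette are assumptions brought in for illustration (neither figure is stated in this section), and the function name is hypothetical.

```python
# Running time on a fixed length of tape is inversely proportional to tape speed.
DV_SPEED_MM_S = 18.81      # ASSUMPTION: standard consumer DV tape speed, not given in this section
DVCAM_SPEED_MM_S = 28.22   # DVCAM tape speed quoted in the list above


def scaled_minutes(minutes_at_ref, ref_speed, new_speed):
    """Running time of the same tape when recorded at new_speed instead of ref_speed."""
    return minutes_at_ref * ref_speed / new_speed


# A 4.5-hour (270-minute) DV cassette recorded at the DVCAM tape speed.
dvcam_minutes = scaled_minutes(270, DV_SPEED_MM_S, DVCAM_SPEED_MM_S)

# SDL: 50 percent more recording time implies a tape speed of 1/1.5 of the DV speed.
sdl_speed = DV_SPEED_MM_S / 1.5
sdl_minutes = scaled_minutes(60, DV_SPEED_MM_S, sdl_speed)  # ASSUMPTION: 60-minute mini-DV cassette

print(f"DVCAM on a 270-minute DV cassette: {dvcam_minutes:.1f} minutes")
print(f"SDL on a 60-minute mini-DV cassette: {sdl_minutes:.1f} minutes")
```

Under these assumptions the DVCAM result comes out at very nearly 3 hours, which agrees with the figure quoted for 4.5-hour DV cassettes.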

Software vs. Hardware DV Codecs for Editing on the PC

Relatively low-cost (US$750 and under) PCI adapter cards for Wintel and Macintosh PCs, such as Adaptec Corp.'s AHA-8940 and AHA-8945, employ proprietary software implementations of the DV codec that run on Pentium-class PCs. Intel's MMX extensions significantly speed the process of DV-to-RGB-to-DV transcoding required to add transitions and effects (fades, dissolves, titles, and special effects) during the editing process. However, DV-to-RGB and RGB-to-DV software transcoding isn't a real-time process. Depending on the complexity of the transition, CPU and bus speed, and other PC hardware-related factors, rendering in either direction is 6 to 15 (or even more) times slower than real time. At present, the primary bottleneck is moving an extremely large amount of RGB or YUV video data. At 30 fps, 720x480 RGB images with 24-bit (3-byte) color depth theoretically require delivery of 720 * 480 * 3 * 30 = 31.1 MBps to or from the codec for transcoding. YUV (rather than RGB) encoding reduces the data rate, but it remains unlikely that software codecs will enable two-stream (or more) real-time DV transcoding with current PC architecture. Encoding to DV is a more CPU-intensive process than decoding, because the encoder must use a lookup table to pre-determine the optimum DCT quantizing algorithm for the scene.
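The uncompressed data rate quoted above, and the saving from YUV encoding, can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes 8 bits per sample, the nominal 30 fps used in the text, and MBps meaning 10^6 bytes per second; the function and variable names are illustrative only.

```python
# Uncompressed video data rates for 720x480 frames at a nominal 30 fps.
WIDTH, HEIGHT, FPS = 720, 480, 30


def rate_mbytes_per_s(bytes_per_pixel):
    """Data rate in MBps (10^6 bytes per second) for an average number of bytes per pixel."""
    return WIDTH * HEIGHT * bytes_per_pixel * FPS / 1e6


rgb24 = rate_mbytes_per_s(3)     # 24-bit RGB: 3 bytes per pixel
yuv422 = rate_mbytes_per_s(2)    # 8-bit 4:2:2: Y for every pixel, Cr and Cb for every second pixel
yuv411 = rate_mbytes_per_s(1.5)  # 8-bit 4:1:1: Y for every pixel, Cr and Cb for every fourth pixel

for name, rate in (("RGB 24-bit", rgb24), ("YUV 4:2:2", yuv422), ("YUV 4:1:1", yuv411)):
    print(f"{name}: {rate:.1f} MBps")
```

Even at 4:1:1 sampling, a single uncompressed stream is several times DV's compressed data rate, which is why moving two or more decoded streams at once strains the PC's memory and bus bandwidth.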

Early hardware codecs, such as the Sony DVBK-1, operate at a fixed clock rate and thus can't take advantage of increases in CPU processing speed and bus frequency. Further, the DVBK-1 can't handle DVCPRO50's 50-Mbps data rate. Two new hardware DV codecs, introduced in fall 1998, promise to break the price-performance barrier for consumer and professional DV adapter cards:

  • divio, Inc. claims its NW701 device is "the world's first single-chip DV codec." In addition to DV's standard 3.6 MBps data rate, the NW701 offers optional low data-rate modes of 3.0, 2.4, 1.8, 1.5, and 1.0 MBps to conserve disk space and accommodate older IDE fixed-disk drives. The NW701 also has an "[i]ntegrated multi-tap video filter for 4:2:2 to 4:1:1, or 4:2:2 to 4:2:0 conversions." Only 256K of 32-bit (1 MB) EDO RAM is required for internal data buffering. Two NW701s in parallel support the DVCPRO50 format. The NW701 is targeted at consumer, industrial, and broadcast DV capture and editing applications, with emphasis on the high-volume consumer market, and for use in the production of DV camcorders.
  • C-Cube Microsystems, Inc.'s DVXpress-MX product line consists of two chips, a mid-range (industrial/ENG) MX25 product for conventional 25-Mbps DV streams and a higher-priced (broadcast) MX50 version that handles DVCPRO50. Both chips have the capability of seamlessly transcoding DV and MPEG-2, including the MPEG-2 4:2:2 production format used by Sony Betacam SX gear. The two DVXpress-MX chips let editors mix DV and MPEG-2 content during the edit, and output the final product in DV or as fixed- or variable-bit-rate MPEG-2. One of the most intriguing features of the DVXpress-MX chips is their "integrated special effects engine, which enables on the fly blending -- on a pixel-by-pixel basis -- of up to two video streams plus four 24-bit colors." This means that a "full spectrum of 2D effects, including fades, wipes, and dissolves, can be implemented in real time."

There were no commercial implementations of the preceding new DV codec chips when this paper was last updated (October 1998). Announcements of adapters based on the divio NW701 are expected at Fall Comdex, and boards using the DVXpress-MX chips are likely to be announced at the April 1999 NAB show.

Competing Digital Recording Formats

Following are the three primary competitors to professional DV formats:

  • Digital-S is Japan Victor Corp.'s (JVC) 1/2-inch competitor to DVCPRO and Betacam SP. Digital-S uses 3.3:1 DCT compression of 4:2:2-sampled video recorded on 1/2-inch tape in a cassette housing derived from the VHS consumer standard. Standard tape lengths are 34, 64, and 104 minutes. The Digital-S format provides two linear audio cue tracks and a linear control track. Although the Digital-S format provides for two stereo 16-bit, 48-kHz pairs, JVC's initial product line only supports one stereo channel. Some Digital-S decks can play back S-VHS tapes. According to JVC, the production cost of Digital-S equipment is lowered by taking advantage of existing VHS transport mechanisms, and the special 1/2-inch tape is less likely to suffer from dropouts than 1/4-inch tape. JVC offers a dockable Digital-S recorder, two editing VTRs, and a playback deck. Currently, Digital-S equipment costs about the same as that for the high-end Panasonic DVCPRO equipment, but tape costs are in the same range as Betacam SP.
  • Betacam SX is Sony's new 1/2-inch digital format for ENG applications. Betacam SX uses MPEG-2 compression and 4:2:2 sampling to create a new 4:2:2P@ML (4:2:2 Profile at Main Level) MPEG-2 format with 720 samples per line and 512 (not DV's 480) active lines per frame. (The Standard-Definition MPEG-2 format is MP@ML, Main Profile at Main Level, which uses 4:2:0 sampling, delivers 480 lines per frame, and is limited to a 15 Mbps data rate.) Although the proposed 4:2:2P@ML standard has a maximum data rate of 50 Mbps, Sony defines its "Studio Profile" as having a data rate of 18 Mbps. The 18-Mbps data rate targets satellite news gathering (SNG) by allowing 2X uplinks or two simultaneous feeds to a single satellite transponder. Sony claims that its 18-Mbps format "can maintain picture quality equal to that achieved by other intra-only compression systems requiring bit-rates as high as 30 Mbps." Sony wants to restrict DVCAM to industrial videography, while positioning Betacam SX for broadcast ENG/SNG use. The equipment cost for Betacam SX is in the same range as Betacam SP, but hourly tape costs are lower.
  • D-VHS is JVC's long-awaited 1/2-inch bitstream format intended for consumer time-shift recording of satellite MPEG-2 TV programming from EchoStar's DISH Network. D-VHS uses standard VHS cassettes with digital-grade tape (based on the S-VHS formulation) and offers data rates of approximately 7, 14, and 28 Mbps. D-VHS VCRs can play, but not record, conventional VHS tapes. Head rotational speed is a constant 1,800 rpm, so the three D-VHS data rates are based on the use of 1, 2, and 4 tracks/revolution, respectively, doubling the tape speed for each bandwidth increment. At the 14-Mbps "standard mode" data rate, tape speed is 16.67 mm/s, and JVC says a standard tape will hold 5 hours of data. (According to JVC's technical specification, standard tape capacity is 31.7 GB; a calculation based on the stated recording rate of 19.14 Mbps, which includes Reed-Solomon error-correction overhead, yields 3 hours and 40 minutes of recording time.) JVC first announced D(igital)-VHS in April 1995; there are recent references to D(ata)-VHS and use of the format as a "data refrigerator." JVC says that the D-VHS interface "is based on IEEE-1394," which implies that the connection does not fully comply with the IEEE-1394 specification. Toshiba and Thomson Consumer Electronics, the U.S. marketer of RCA- and GE-branded consumer electronic products, promised D-VHS recorders for RCA DSS (DirecTV and USSB) receivers would be available in 1996. They weren't, but Hitachi claims a D-VHS recorder will accompany its forthcoming DSS set-top box in 1997. The Hitachi DX815 D-VHS VCR, displayed at 1997 Winter CES, uses consumer-style IEEE-1394 connectors, but doesn't conform to the IEEE-1394 standard for data transmission. Non-conforming use of IEEE-1394 connectors is anathema to many members of the 1394 Trade Association because of the potential for consumer confusion as A/V equipment with real IEEE-1394 compliance becomes widely available.

Note: D-VHS is not a video acquisition format; as of mid-1998 there have been no announcements of a D-VHS camcorder and none are expected. The short-term commercial viability of a format that is capable of recording only MPEG-2 MP@ML (Main Profile at Main Level, 720 by 480 pixels, 4:3, 4:2:0) digital programming, playing back VHS tapes, and possibly backing up PC data is questionable, at best. The 28-Mbps D-VHS data rate appears to be capable of recording DTV (digital TV, formerly ATV and HDTV) MP@HL (Main Profile at High Level, 1,920 by 1,080, 16:9, 4:2:0) broadcast content, which is limited by 8-VSB (vestigial sideband modulation) to 19.3 Mbps for a 6-MHz channel, so long-term prospects for the D-VHS format are brighter.
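The D-VHS capacity and recording-time figures quoted in the list above can be cross-checked with a short calculation. This is a rough sketch that assumes the 31.7-GB capacity is expressed in decimal gigabytes (10^9 bytes) and that the quoted rates are in 10^6 bits per second; the function name is illustrative.

```python
CAPACITY_BYTES = 31.7e9  # standard D-VHS tape capacity; ASSUMPTION: decimal gigabytes


def recording_hours(capacity_bytes, rate_mbps):
    """Hours of recording that fit in capacity_bytes at rate_mbps (10^6 bits per second)."""
    seconds = capacity_bytes * 8 / (rate_mbps * 1e6)
    return seconds / 3600


# 19.14 Mbps is the stated recording rate including error-correction overhead;
# 14 Mbps is the nominal "standard mode" data rate.
hours_at_recording_rate = recording_hours(CAPACITY_BYTES, 19.14)
hours_at_payload_rate = recording_hours(CAPACITY_BYTES, 14)

print(f"At 19.14 Mbps: {hours_at_recording_rate:.2f} hours")
print(f"At 14 Mbps:    {hours_at_payload_rate:.2f} hours")
```

Under these assumptions, the 5-hour claim is roughly what results if the 31.7 GB is treated as payload recorded at the 14-Mbps rate, while dividing the same capacity by the 19.14-Mbps recording rate gives the shorter time cited in the parenthetical remark.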

Summary

There is a variety of traditional component digital and analog video recording formats and standardized digital data transmission schemes, such as SMPTE 259M (serial D-1), for professional video equipment. Users of consumer and "prosumer" grade video gear in the under US$20,000/device range have been forced to accept analog color-under recording methods offered by VHS, 8-mm, S-VHS, and Hi8 formats. Decoding consumer-quality analog video signals to a digital data stream for computer editing, then encoding back to NTSC or PAL analog format decreases an already low SNR. The DV component video format, which delivers via the IEEE-1394 High-Performance Serial Bus recorded material with a luminance SNR of > 54 dB, has the potential to provide a level of video quality formerly reserved to Betacam SP, M-II, and even higher-cost formats. DV critics complain that 4:1:1 (NTSC) or 4:2:0 (PAL) sampling isn't adequate for chroma-keyed effects; however, DV's chrominance bandwidth of 1.5 MHz (-5 dB, analog component out) is about equal to Betacam SP's 1.5 MHz (-4 dB with MP tape). Chrominance bandwidth is the primary determinant of chroma-keying accuracy. The difference in perceived image quality between industry-standard Betacam SP and digital content recorded on DV tape primarily is determined by the quality of the camcorder's camera/lens combination, not by the recording format. For example, one of the major complaints of DCR-VX1000 users has been the camera section's noise and artifacts when shooting at low light levels, which is a function of the lens and CCD components.

The constant 3.5-MBps (MegaBytes per second) data rate of the DV format is well within the read/write rate of today's Wide SCSI-2 disk drives, making low-cost non-linear editing (NLE) of DV content practical for relatively short-form productions. Capturing DV content with an IEEE-1394 adapter card fills 1 GB of drive space in about 4.75 minutes. Taking into consideration temporary files and the final movie file for recording to DV tape, figure on about 2 minutes/GB for the average production. Thus 9-GB (3.5-inch) and 23-GB (5.25-inch) Wide Ultra-SCSI drives using embedded-servo tracks to eliminate thermal recalibration are the best bet for long-form NLE. One of the trade-offs between linear and non-linear editing for long-form work is the relative cost of disk drives and tape decks; the cost per GB of SCSI drives is decreasing rapidly and reached less than US$100/GB in late 1997. High-performance Ultra-EIDE (Ultra-DMA33) drives now cost less than US$40/GB, and many models handle DV capture and playback quite well in Pentium II-class computers.
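The disk-space estimates above are straightforward to reproduce. The following sketch assumes 1 GB means 10^9 bytes and uses the 3.5-MBps data rate and the 2 minutes/GB planning figure given in the preceding paragraph; the function names are hypothetical.

```python
DV_RATE_MBYTES_PER_S = 3.5  # DV data rate quoted in the text (10^6 bytes per second)


def capture_minutes_per_gb(rate_mbytes_per_s, gb_bytes=1e9):
    """Minutes of captured video that fill one gigabyte at the given data rate."""
    return gb_bytes / (rate_mbytes_per_s * 1e6) / 60


def production_minutes(drive_gb, minutes_per_gb=2.0):
    """Finished running time a drive supports at the 2 minutes/GB planning figure."""
    return drive_gb * minutes_per_gb


print(f"Raw capture: {capture_minutes_per_gb(DV_RATE_MBYTES_PER_S):.2f} minutes per GB")
for size_gb in (9, 23):
    print(f"{size_gb}-GB drive: about {production_minutes(size_gb):.0f} minutes of finished production")
```

The gap between the raw capture figure and the 2 minutes/GB planning figure reflects the allowance for temporary files and the final movie file.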

Several suppliers of professional video editing products have introduced non-linear editing hardware for DV-over-1394. Pinnacle Systems, which produces the Genie 3D effects processor, offers a $1,695 DV interface for its ReelTime desktop video editing system. At the lower end of the price spectrum, Pinnacle offers the miroVideo DV 300, a DV-over-1394 video capture and playback card that uses the Sony software DV codec. Truevision, famous for its line of Targa analog video adapters, offers its Bravado DV2000 DV capture and playback card with a full version of Adobe Premiere 5.0 editing software for $499 (direct sales only). Unlike most DV capture/playback cards, which use Microsoft's AVI (Audio-Video Interleave) format, the Bravado uses Apple Computer's QuickTime 3.0 for Windows. The Bravado is a Wintel adaptation of Radius Inc.'s 1394 FireWire Host Adapter Card sold by Radius and its channel partners to Macintosh users. Radius's MotoDV software consists of a software DV codec and capture software with device control; EditDV (US$499 MSRP with the adapter card) and EditDV Unplugged (US$99, less adapter) are Radius's DV editing applications for the Mac. Digital Processing Systems (DPS), Inc. offers its Spark capture/playback card and Spark Plus (a Spark with an Ultrawide SCSI-2 port) for both Wintel and Mac users. The Spark products are based on OEM adapter cards manufactured by Adaptec, Inc.

Sony's newest mini-tower PCs, the VAIO PCV-E302DS (US$1,499) and PCV-E308DS (US$2,299, both less monitor), come with Sony DVgate DV editing software. Sony's Kunitake Ando, head of the Information and Technology group's PC business unit, said in July 1998 at Windows World Expo in Tokyo: "We are not planning to make profits out of [PC] hardware sales. We hope our PC system VAIO works as a core in the total value chain of Sony's digital audio and video products."

Today's $2,000 to $4,000 "consumer" DV camcorders and $3,000 DVCRs are intended to replace the Hi8, S-VHS, and 3/4-inch U-matic gear used for a variety of video productions, ranging from instructional tapes to local cable-TV programming. Recent reports indicate that DV camcorders now represent close to 100 percent of camcorder unit sales in Japan. Microsoft's PC97, PC98, and PC99 hardware specifications require or recommend that the "Entertainment PC" version include at least one IEEE-1394 connector. Despite Microsoft's encouragement, only a desktop PC from Compaq and a laptop and two tower PCs from Sony came with built-in IEEE-1394 connectors as of late 1998.

DVCPRO, DV, and, to a lesser extent, DVCAM target budget-minded broadcast news directors and cable network producers. As an example, Video News International (VNI, owned by the New York Times) used Panasonic DV camcorders and DVCPRO decks to produce the weekly 30-minute "Trauma—Life and Death in the ER" series for Discovery Communications' The Learning Channel. VNI also has replaced the Sony Hi8 CCD-VX3 camcorders previously used by its video news stringers with Panasonic AG-EZ1 DV camcorders. (Sony uses the lens and CCD assembly of the venerable CCD-VX3 in the DCR-VX1000, DCR-VX9000, and DSR-200.) As more DV-compatible camcorders with better lens/CCD combinations, such as the $5,000 Canon XL-1, have become widely available, DV, DVCPRO, and DVCAM are challenging Betacam SP's status as the standard format for ENG, low-budget electronic field production (EFP), and industrial videography. Once producers are able to take advantage of fully-digital acquisition, editing, and duplication, the issue of suitability of the 4:1:1 format for the majority of today's video productions is likely to be resolved in favor of DV.

Acknowledgments: FireWire is a registered trademark of Apple Computer, Inc. Thanks to Johann Safar, Technology Manager for Panasonic Broadcast & Television Systems Company, for corrections to figures 3 and 4, plus clarification of DV/DVCPRO compression terminology. Larry Blackledge and Alan Wetzel of Texas Instruments, Inc. provided valuable guidance for the discussion of the IEEE-1394 High-Performance Serial Bus, the digital interface for consumer DV products. Gary Bartlett of Sony Electronics contributed the correction to the size of the video data blocks in figure 6 (76 not 77 data bytes, plus the header byte). Consumer camcorder sales estimates are from U.S. Consumer Electronics Sales & Forecasts 1992 - 1997 published by the Consumer Electronics Manufacturers Association (CEMA) in September 1996. DV's share of the 1997 camcorder market is based on several November 1996 reports in the trade press, including Audio-Video International. The market share of DV camcorders increased dramatically in 1998, especially in Japan (where DV camcorders now have almost 100% of the consumer market).
