■Multiple-language dubbing of up to eight languages
■Multiple-language subtitling of up to 32 languages
Stream and packet transmission
■“Stream” data flow
■Transmission by packet multiplexing
■Title menu and DVD menu
■Multi-story function
■Multi-angle functioning up to nine angles
■Parental lock function (optional standard)
■Multi-aspect ratio capability
Manufactured under license from Dolby Laboratories Licensing Corporation.
“Dolby”, the double-D symbol, “Pro Logic” and “AC-3” are trademarks of Dolby Laboratories Licensing Corporation.
Welcome to the next generation in
optical disc media
Announcing the Diverse, High Performance World of DVD
The digital audio era began in the fall of 1982 with
the release of the Compact Disc and the CDP-101,
Sony’s first CD player. The CD represented a
true breakthrough in sound reproduction. Just 12
cm in diameter, it was only a fraction of the size of
vinyl analog records popular at the time. But the
CD, with its highly dynamic, crystal clear sound,
not to mention the sheer convenience of random
access, soon eclipsed analog records to become a
leading form of music software.
The CD format proved too good for music alone. In
the 1980s, it was applied in the rapidly progressing
computer field as a quick-retrieval data storage
system called CD-ROM. Not only is CD-ROM
software extremely convenient, it is also
remarkably inexpensive to manufacture. The
subsequent development of Video CD and Photo
CD was proof that the era of optical disc media
had arrived.
Interestingly enough, the CD was not the first
consumer-use optical disc medium. It was preceded
by the LaserDisc in 1980. This 30 cm disc, roughly
the same size as the analog record, was designed
to hold up to one hour of high quality audio and
video per disc side.
Although the LaserDisc and CD differ in size, signal
recording system, and contents, they are
fundamentally similar in their method of recording
on micron-order pit rows and playback by laser
pickup. As the first step toward practical
application of optical disc media, the development
of the LaserDisc began in the 1960s with the
dream of creating “records which can also display
images.”
In the nearly two decades since the release of the
LaserDisc, a broad array of technological advances
in basic materials, hardware, software, and digital
processing have given birth to an exciting new
optical disc video format. DVD-Video is now a
reality, offering superior performance in every
parameter of operation.
DVD-Video, just one of the many proposed forms
of the DVD format, is the same size as the CD, yet
boasts far higher data storage capacity. In
combination with the highly advanced MPEG2
data compression technology, an entire movie —
with multiple soundtracks and subtitle choices —
can be recorded and played back on a single disc
with full-quality image and sound reproduction.
DVD-Video was standardized at the same time as
DVD-ROM, and DVD-R and DVD-RAM were
subsequently standardized as universal formats.
DVD-Audio is currently being studied for
standardization. These media are expected to play
a central role in the coming age of multi-media and
information technology.
■History of Optical Discs
[Figure: timeline of optical disc introductions, labeled LaserDisc introduced, Compact Disc introduced, MO (magneto-optical disc) introduced, and DVD introduced. Year labels: 1980, 1982, 1991, 1997.]
■Evolution of the CD format
[Figure: chart of CD formats by standard book and year. Books: Red Book, Yellow Book, Green Book, Orange Book, White Book, Blue Book. Formats: CD-DA, CD-G, CD-ROM, ISO9660 (International Organization for Standardization format), CD-V, CD-I, CD-ROM XA (extended format), CD-R, Photo CD, Video CD, CD TEXT, CD Extra, CD-RW. Year labels: 1981, 1985, 1986, 1987, 1988, 1989, 1992, 1994, 1996.]
Chapter One
The High Capacity Optical Digital Disc for
a New Era
A remarkably large storage capacity, the primary feature
of DVD, has been realized by higher recording density
made possible by advanced technology developed after
the introduction of the CD. A basic single sided, single
layer DVD stores 4.7 GB (gigabytes). This is
approximately seven times the capacity of the CD.
In addition to this basic single sided, single layer disc,
there are three other types of discs: 1) the single sided,
dual layer disc which can store 8.5 GB of data; 2) the
double sided, single layer disc; and 3) the double sided,
dual layer disc. The latter two are the double-sided
versions of the single sided, single layer and single
sided, dual layer discs, respectively. At present, these
four types of discs have been standardized.
In addition to DVD-Video, DVD-ROM, DVD-R (write-once
media) and DVD-RAM (rewritable media), for which
specifications have already been published, DVD-Audio
is being studied for standardization. This makes for a
total of five DVD categories.
In this chapter, our discussion is focused on the physical
format and fundamental signal process. DVD-Video will
be taken up in Chapter 2.
Key Points
●High density optical disc for the next generation which realizes a large capacity of 4.7 GB (single
side, single layer), approximately seven times greater than that of CD.
●Four disc types: single sided, single layer (4.7 GB); single sided dual layer (8.5 GB); double sided,
single layer (4.7 GBx2); double sided, dual layer (8.5 GBx2).
●Minimum pit length and track pitch are approximately one half of those of the CD. Primary
technologies which made this high density storage possible are:
(1) Short wavelength red semiconductor laser.
(2) Improved numerical aperture (NA) of the objective lens.
(3) Disc construction using thin 0.6 mm substrate.
●Signal processing capability for the optical disc system was improved in both modulation and
error correction.
●Five disc categories have been defined: DVD-Video, DVD-ROM, DVD-Audio, DVD-R, and DVD-
RAM.
●Making use of its large storage capacity and interactive characteristics, DVD is more than just a
video disc. In the coming multi-media era, DVD represents a new data medium for a variety of
applications in many different fields.
DVD disc construction and high-density recording
[Figure labels: disc thickness t=1.2mm and t=0.6mm; beam paths A and A'.]
Even a single sided, single layer DVD has about seven
times the storage capacity of CD. What has made this
large storage capacity possible is new technology in
high density recording and reproduction.
Minimum pit length in DVD is 0.4 microns, with a track
pitch of 0.74 microns. Both are approximately half of
those of the CD. This has been made possible by
making the laser beam spot much smaller than that of
the CD. The main technical approach used to make the
laser beam spot smaller is the use of a laser beam with
shorter wavelength, and the adoption of a larger
objective lens numerical aperture (NA). Improved signal
processing of data encoding (modulation format) and
error correction and reduction of data redundancy also
contributed to an increase in storage capacity. In
addition, the storage capacity of DVD can be greatly
increased by using both dual layer and double sided
configurations.
■Basic Specifications of DVD
DVD-Video/DVD-ROM (Single sided, single layer type)
  disc thickness: two 0.6mm substrates bonded
  sector alignment: CLV
  file system: UDF Bridge (UDF & ISO9660)
  modulation: EFM plus (8-16)
  error correction: RS-PC
  readout wavelength of laser diode (reference): 650/635nm
  NA of objective lens (reference): 0.6
  data capacity: 4.7GB
DVD-Video/DVD-ROM (Single sided, dual layer type)
  disc thickness: two 0.6mm substrates bonded
  sector alignment: CLV
  file system: UDF Bridge (UDF & ISO9660)
  modulation: EFM plus (8-16)
  error correction: RS-PC
  readout wavelength of laser diode (reference): 650/635nm
  NA of objective lens (reference): 0.6
  data capacity: 8.5GB
CD-ROM
  sector alignment: CLV
  file system: ISO9660
  modulation: EFM
  error correction: CIRC
  readout wavelength of laser diode (reference): 780nm
  NA of objective lens (reference): 0.45
  data capacity: 0.68GB
[The table also has columns for disc diameter, minimum pit length, maximum pit length, track pitch, and reference scanning linear velocity; their values are not legible in this copy.]
■Two 0.6 mm substrates bonded back-to-back
The DVD disc itself is 1.2 mm thick and 120 mm in
diameter and is made of polycarbonate, the same as a
CD. The disc consists of two 0.6 mm thick substrates
bonded back-to-back, while the CD is composed of a
single substrate. The thinner substrate minimizes the
effects of tilt angle, that is, the inclination of the recorded
surface away from perpendicular to the incident laser
beam, thus reducing reading errors and contributing to
higher storage density.
■Effects of tilt angle differ by disc thickness.
The thicker a disc is, the larger the difference in length there
is between A and A’
■Relationship between the laser beam’s wavelength and the NA of the objective lens
The diameter of the laser beam spot is in proportion to
the laser beam’s wavelength, and is in inverse proportion
to the numerical aperture of the objective lens. In DVD, a
red semiconductor laser beam with a short wavelength
of 650 nanometers or 635 nanometers is employed,
while the CD uses a 780 nanometer laser beam. The
numerical aperture (NA) of the objective lens of a DVD
pickup is 0.6, larger than the 0.45 of the CD. This
enables focusing the laser beam to a smaller spot than
with the CD, so pit sizes and track pitch can be made
smaller, resulting in higher storage density on a given
disc size.
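The proportionality described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the relation stated in the text (spot diameter proportional to wavelength divided by NA) and the 650 nm wavelength option; the function name is illustrative, and the result is a relative figure, not an absolute spot diameter.

```python
def relative_spot_size(wavelength_nm, numerical_aperture):
    """Quantity proportional to spot diameter: wavelength / NA (illustrative helper)."""
    return wavelength_nm / numerical_aperture

# Values quoted in the text: DVD 650 nm, NA 0.6; CD 780 nm, NA 0.45.
dvd = relative_spot_size(650, 0.6)
cd = relative_spot_size(780, 0.45)

diameter_ratio = dvd / cd          # DVD spot diameter relative to CD
area_ratio = diameter_ratio ** 2   # DVD spot area relative to CD

print(f"DVD spot diameter is about {diameter_ratio:.3f} of the CD spot diameter")
print(f"DVD spot area is about {area_ratio:.3f} of the CD spot area")
```

Under these assumptions the DVD spot diameter comes out at 0.625 of the CD value, which is in line with the roughly halved pit length and track pitch.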
The larger the NA of the objective lens, the smaller the
tolerance of the tilt angle. However, because the DVD
uses two 0.6 mm thick substrates bonded back-to-back,
the necessary reading precision is maintained. The
physical recording density of the DVD is 4.6 times that
of the CD. In addition to these improvements in
physical precision, improved signal processing in data
encoding (modulation) and error correction also
contributes to the higher data capacity of the DVD.
■DVD vs CD in pit length and track pitch
DVD: track pitch 0.74 µm; beam spot ø0.9 µm; minimum pit length 0.4 µm
CD: track pitch 1.6 µm; beam spot ø1.4 µm; minimum pit length 0.83 µm
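As a rough cross-check of the physical density figure, the track pitch and minimum pit length shown in the figure can be multiplied out. This is only a sketch: it assumes that areal density scales with the inverse of track pitch times minimum pit length, which is a simplification, and the variable names are illustrative.

```python
# Dimensions from the figure, in micrometres.
cd_track_pitch, cd_min_pit = 1.6, 0.83
dvd_track_pitch, dvd_min_pit = 0.74, 0.4

# Gain across tracks (radial direction) and along the track (linear direction).
radial_gain = cd_track_pitch / dvd_track_pitch
linear_gain = cd_min_pit / dvd_min_pit

# Simplified areal density gain: product of the two.
density_gain = radial_gain * linear_gain

print(f"radial gain  {radial_gain:.2f}x")
print(f"linear gain  {linear_gain:.2f}x")
print(f"areal gain   {density_gain:.2f}x")
```

The product comes out at roughly 4.5, in the same range as the 4.6 times quoted in the text; the small difference presumably reflects rounding of the dimensions shown here.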
Four types of high-capacity disc construction
Four different types of disc construction have been
standardized: single sided, single layer; single sided,
dual layer; double sided, single layer; double sided, dual
layer disc. The single sided, single layer disc is quite
similar to CD except that its substrate is 0.6 mm thick.
The double sided, single layer disc is composed of two
single side, single layer substrates bonded back-to-back.
The single sided, dual layer disc is a single sided,
single layer disc having one more storage layer beyond
the first layer. To enable the pickup to read data on the
second layer, the first layer is covered by semi-reflective
material.
■Single sided, single layer disc: 4.7 GB storage capacity
■Single sided, dual layer disc: 8.5 GB storage capacity
■Double sided, single layer disc: 9.4 GB storage capacity
■Double sided, dual layer disc: 17 GB storage capacity
[Figures: cross section of each disc type (disc thickness 1.2mm, two 0.6mm substrates), showing the substrate, reflective layer, and semi-reflective layer, and the side or sides from which the laser beam reads.]
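The four capacities follow directly from the per-side figures given above. The short sketch below assumes only that a double sided disc holds twice the data of the corresponding single sided disc, and uses the 0.68 GB CD capacity quoted elsewhere in this guide; the names are illustrative.

```python
# Per-side capacities in GB, as stated in the text.
SIDE_CAPACITY_GB = {"single layer": 4.7, "dual layer": 8.5}
CD_CAPACITY_GB = 0.68

def disc_capacity_gb(sides, layers):
    """Total capacity for a disc with 1 or 2 sides (illustrative helper)."""
    return sides * SIDE_CAPACITY_GB[layers]

capacities = {
    (sides, layers): disc_capacity_gb(sides, layers)
    for sides in (1, 2)
    for layers in ("single layer", "dual layer")
}

for (sides, layers), gb in capacities.items():
    print(f"{sides} side(s), {layers}: {gb:.1f} GB")

# "Approximately seven times" the CD capacity for the basic disc.
ratio_to_cd = capacities[(1, "single layer")] / CD_CAPACITY_GB
print(f"basic DVD vs CD: about {ratio_to_cd:.1f}x")
```

Note that a dual layer side holds 8.5 GB rather than twice 4.7 GB; the sketch simply takes both per-side figures as given.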
■Data reading and recording methods
In the case of the single sided, single layer disc and the
single sided, dual layer disc, the data is read from one
side of the disc as is done with CD. With the double
sided, single layer disc and the double sided, dual layer
disc, the data must be read from both sides of the disc.
Data is recorded from the inner circumference to the
outer, as with CD, except for dual layer discs (both
single sided and double sided).
There are two methods of recording a dual layer DVD. One
method is to record data on both layers from the inner
circumference to the outer (parallel track path) and the
other is to record data from the inner circumference to
the outer on the first layer and from the outer to the inner
on the second layer (opposite track path). Data is
regarded as a single volume in either method, and
producers can select either method depending on the
content of the software and the purpose of its
expression.
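■Recording method for the dual layer disc
[Figure: track layout of layers 1 and 2 for the two methods. Parallel track path: each layer has its own lead-in area, data area, and lead-out area. Opposite track path: layer 1 runs from the lead-in area through its data area to the middle area, and layer 2 runs back from the middle area through its data area to the lead-out area.]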
Modulation and error correction
■EFM plus modulation in the DVD system
In recording data on the disc, the “EFM plus (8-16
modulation) format” is employed to encode the source
data for storage on the DVD. Digital signals are normally
encoded prior to recording them on the storage media
instead of recording them as they are (taking the CD and
DVD as examples, the “0s” and “1s” of the original data
are not formed in pits as they are). By applying EFM to
the signals, a much larger volume of data can be
recorded on a track of the same length. EFM is
performed to improve the efficiency (linear storage
density) of the recorded wave form.
In CD and DVD, the original data constructed in 8-bit
units are translated into blocks of 14-bit or 16-bit
patterns using the conversion table. (This is called 8-14
modulation or 8-16 modulation.) The edges of pits
pressed on the disc indicate where “1s” appear in the
successive patterns of “0s” and “1s.” The principle
of the modulation format called Non Return to Zero
Inverse (NRZI) is used here.
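The NRZI principle can be illustrated with the 16-bit pattern that this guide's figure gives for the 8-bit data 00000111. The sketch below is not the EFM plus conversion table itself; it only shows how a channel-bit pattern maps to a waveform in which every “1” reverses the level. Which of the two levels corresponds to a pit is an assumption left open here, and the function names are illustrative.

```python
def nrzi_levels(channel_bits, start_level=0):
    """Return the waveform level for each channel bit: a '1' flips the level, a '0' keeps it."""
    level = start_level
    levels = []
    for bit in channel_bits:
        if bit == "1":
            level ^= 1
        levels.append(level)
    return levels

def edge_positions(levels, start_level=0):
    """Indices where the level changes, i.e. where a pit edge would fall."""
    edges = []
    previous = start_level
    for index, level in enumerate(levels):
        if level != previous:
            edges.append(index)
        previous = level
    return edges

pattern = "0010000001001000"   # 16-bit pattern from the figure
levels = nrzi_levels(pattern)
edges = edge_positions(levels)

print("".join(str(level) for level in levels))
print("edges at bit positions:", edges)
```

The edges land exactly at the positions of the “1s” in the pattern, which is the relationship the text describes between pit edges and “1s”.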
The 8-16 modulation of the EFM plus format used in
DVD is an improved version of the 8-14 modulation of
the EFM format adopted in the CD. This 8-16
modulation may appear inefficient when compared to
8-14 modulation because 16 bits are used instead of 14
bits. In actual practice, however, it achieves a higher
storage density. In EFM, a total of 17 bits (14+3=17) are
required for each 8-bit unit because 3 merging bits are
used to link the blocks of 14 bits. In EFM plus, multiple
conversion tables are used (only one conversion table is
used in EFM) to suppress DC components, which is
indispensable for the optical disc system, so merging
bits are not required. The EFM plus format increases
storage density by approximately 6 % when compared
to the EFM format. (Note 1)
Note 1: (17-16)/16=0.0625
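The arithmetic in Note 1 can be written out as follows. This is a small sketch using only the bit counts given in the text; the variable names and the one-million-bit track length are illustrative.

```python
# Channel bits needed to record one 8-bit unit.
efm_bits = 14 + 3        # EFM: 14-bit pattern plus 3 merging bits
efm_plus_bits = 16       # EFM plus: 16-bit pattern, no merging bits

# Relative gain in data stored per channel bit, as in Note 1.
gain = (efm_bits - efm_plus_bits) / efm_plus_bits

# Equivalent view: 8-bit units that fit in the same number of channel bits.
channel_bits = 1_000_000
units_efm = channel_bits // efm_bits
units_efm_plus = channel_bits // efm_plus_bits

print(f"gain: {gain:.4f}")
print(f"EFM: {units_efm} units, EFM plus: {units_efm_plus} units")
```

Both views give a gain of a little over 6 %, matching the figure quoted above.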
■8-16 modulation and pit pattern
8-bit data: 00000111
The original 8-bit data is translated into a 16-bit pattern by the conversion table.
EFM plus modulation (16 bits): 0010000001001000
recorded wave form: Recorded wave form reverses phase if "1" comes and maintains the phase when "0" comes.
pit pattern: Boundaries between the pits and intervening reflective area on the disc, or pit edges, are "1s".
■DVD error correction system
In digital data media, errors or data dropouts in
recorded data due to scratches on the disc, disc
vibration and other causes need to be corrected. Error
correction is done by adding error correcting codes.
Data with those codes are modulated, and then pits are
formed. By adding those codes, mistakes made in
reading data are corrected.
The ratio of added codes in the total data, which
includes the original data, is called data redundancy. The
RS-PC (Reed-Solomon Product Code) system is used
for error correction in DVD. This has more error
correcting capability than the CIRC (Cross Interleaved
Reed-Solomon Code) used in the CD, and also
boasts lower data redundancy. Burst error correction
capability, expressed as the length of pit row on the disc
that can be corrected, is about 6 mm for a DVD player
versus about 2 mm for a CD player.
■Error correction system for DVD
The following explains the principle of data correction in digital data recording:
original data: Digitized data are arranged in a table.
error correcting codes are added: For example, a code is added to make the number of “1s” in each row and column even (row error correcting codes and column error correcting codes). Taking (a) as an example, a code “1” is added to make the number of “1s” in the row “001” even.
An error occurs when reading data on the disc.
locating and correcting an error: The number of “1s” in each row and column is checked. The number was made even when the codes were added, so an error must have been generated somewhere in row (A) and column (B), where the number is now odd. The data where the row (A) and column (B) cross is erroneous data. There is a “1” which is an error and we know the correct value is “0”. The data is restored to the correct value.
[Figure: tables of 3-bit data rows with row and column error correcting codes at each of the steps above.]
Naturally, the error correction system actually employed is much more
complicated, and it can correct more complicated errors.
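The even-parity principle explained above can be sketched in a few lines. This is only an illustration of that simplified principle, not of the RS-PC system actually used in DVD. The 3 x 3 data table is a hypothetical example (it includes the row “001” mentioned in the explanation), the function names are illustrative, and the sketch assumes that exactly one data bit is misread and that the codes themselves are read correctly.

```python
def parity_codes(table):
    """Even-parity code for each row and each column of a table of 0/1 values."""
    row_codes = [sum(row) % 2 for row in table]
    col_codes = [sum(col) % 2 for col in zip(*table)]
    return row_codes, col_codes

def correct_single_error(received, row_codes, col_codes):
    """Locate the one row and one column whose parity no longer matches and flip that bit."""
    new_rows, new_cols = parity_codes(received)
    bad_rows = [i for i, (new, old) in enumerate(zip(new_rows, row_codes)) if new != old]
    bad_cols = [j for j, (new, old) in enumerate(zip(new_cols, col_codes)) if new != old]
    fixed = [row[:] for row in received]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed

# Hypothetical original data, arranged in a table.
original = [[0, 1, 1],
            [0, 1, 1],
            [0, 0, 1]]
row_codes, col_codes = parity_codes(original)

# Simulate a read error: one "0" is read as "1".
damaged = [row[:] for row in original]
damaged[1][0] ^= 1

restored = correct_single_error(damaged, row_codes, col_codes)
print("row codes:", row_codes, "column codes:", col_codes)
print("restored correctly:", restored == original)
```

A scheme this simple cannot handle two errors in the same row or column, which is one reason the system actually employed is far more elaborate.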
Five diverse formats make up the DVD format group
In the DVD format group, there are five format variations:
DVD-Video, DVD-ROM, DVD-R, DVD-Audio, and DVD-RAM.
They can be classified by the kind of data application and
by whether the disc is read-only or writable.
The physical and logical formats for each variation are
defined in published specifications: DVD-Video in Book
B, DVD-ROM in Book A, DVD-R in Book D and DVD-RAM
in Book E. Specifications for DVD-Audio are currently
under study as Book C. Looking at these five DVD
categories from the standpoint of the physical format,
the read-only types DVD-ROM (Book A), DVD-Video
(Book B), and DVD-Audio (Book C) use the same format,
while the recordable DVD-R (Book D) and DVD-RAM
(Book E) use independent, separate formats. As a file
system for use with personal computers, the UDF subset
is employed on the read-only disc to facilitate
compatibility.
■Five DVD format variations
DVD-Video
  Type: Read only
  Application: High quality package media for movie film software with pictures and sound
  Status of specs: Specs published in Sept. 1996 (Book B)
DVD-ROM
  Type: Read only
  Application: Large capacity data media that allows high quality multi-media application programs for use with computers.
  Status of specs: Specs published in Sept. 1996 (Book A)
DVD-Audio
  Type: Read only
  Application: High sound quality package media specifically for music
  Status of specs: Under study
DVD-R (Write Once)
  Type: Write once
  Application: Large capacity and write once data media for computers
  Status of specs: *Specs published in July 1997 (Book D)
DVD-RAM (Rewritable)
  Type: Rewritable
  Application: Large capacity and rewritable data media for computers
  Status of specs: **Specs published in July 1997 (Book E)
*3.95 GB/single sided **2.6 GB/single sided
As of January 1998
DVD file systems
UDF (Universal Disk Format) is adopted as the file format
for DVD. It was devised by OSTA (Optical Storage
Technology Association), an international organization
which consists of some fifty optical storage media
related companies. The “UDF-Bridge” shown in the
figure, which extends to DVD-ROM, DVD-Video,
DVD-Audio, and DVD-R, is a bridge format which keeps
the conventional ISO 9660 valid as a subset of the UDF
for DVD. ISO 9660, which was originally devised as the
file format for CD-ROM, is used to make the DVD
compatible with CD-ROM.
■Format construction and applicable books
[Figure: layered format structure by book.
application layer: video data format (DVD-Video); audio data format (DVD-Audio)
volume layer: UDF-Bridge; UDF
physical layer: read-only disc physical format; write-once disc physical format; rewritable disc physical format
format name and book: DVD-ROM (Book A), DVD-Video (Book B), DVD-Audio (Book C), DVD-R (Book D), DVD-RAM (Book E)]
Questions & Answers
What is write-once DVD?
Write-once DVD, DVD-R, can be recorded only
once. Its data recording layer is coated with a
proprietary organic dye. Recording is performed by
applying a powerful laser beam to the layer.
A thermal change takes place at the spot irradiated by
the powerful laser beam, and the light absorbing
property of the spot becomes smaller than that of the
part not irradiated. In reading data, digital signals
consisting of “0s” and “1s” are retrieved from the
change in reflectance of the data-reading laser beam,
just as data is retrieved from the presence of pits in the CD.
Thermally changed spots do not change again even
when they are irradiated by the relatively weak
data-reading laser beam, so data once recorded cannot
be rewritten or overwritten.
Data recording and playback principle of DVD-R
[Figure: cross section of the disc (protective coating, reflective coating, organic dye, polycarbonate), showing recorded spots and unrecorded parts.
recording (powerful laser beam): Dye on the spot chemically changes by radiation of a powerful laser beam and its light absorbing property also changes.
reading (weak laser beam): Differences in the reflective ratio are obtained from the spot radiated by laser beam and the part not radiated.]
Can pictures be recorded on DVD?
We have explained that DVD-Video is a read-only
format and the rewritable DVD-RAM is currently still
being discussed. Will a disc that can record video
and audio with quality and time duration equivalent
to DVD-Video be available from other format
variations in the DVD group?
The answer is that it will be some time before
such a disc is available for home use. The
reasons are the lack of storage capacity (the data
capacity of DVD-RAM is about 2.6 GB, which is still
smaller than the 4.7 GB of DVD-Video) and the need for
technical breakthroughs that have yet to be achieved,
such as video signal encoding LSIs. In addition to these
technical obstacles, there are very important and
unavoidable problems in protecting the copyrights
which cover various digital information, not to
mention moving pictures and sound. Sony is
presently engaged in various activities to address
these problems.
What about Rewritable DVD?
This is called DVD-RAM, and is a rewritable type of
DVD disc. For the data recording layer, it uses a
metal compound that changes phase from a
crystallized state (crystalline phase) to an
amorphous state (non-crystalline phase) and vice
versa depending on how the compound is heated and
cooled. To record data, the compound is rapidly heated
and cooled by irradiation with a powerful laser beam,
which creates an amorphous state spot on the crystalline
state recording layer. This type of disc is
generally called a “phase change optical disc”
because data is read using the different reflectance of
the amorphous state spots and the crystalline state
parts caused by phase transition.
Data can be written repeatedly on the disc
because amorphous state spots revert to a
crystalline state if they are irradiated by a weak laser
beam and then cooled gradually. By controlling the
power of the laser beam applied to the recording
layer, simultaneous erasing and writing, or direct
overwriting, can be performed.
phase change recording principle
[Figure: laser power levels applied to the media (recording power, erasing power, read-out power), with molecular arrangement diagrams of the amorphous state and the crystalline state.]
Chapter Two
Features of the DVD-Video Format
The world of DVD begins with the advent of DVD-Video,
just as the world of CD began with the music CD (CD-DA).
As the storage capacity of DVD was initially aimed at
enabling the recording of an entire movie on a single
disc, DVD-Video is designed to realize the dream of
enjoying pictures and sound with theater quality at
home. As a new home entertainment
medium, DVD-Video has many features which have not
been available from conventional package media. In this
chapter, we will highlight the technology which has
realized high quality video and audio from a compact
120 mm disc, and describe its many unique features.
Key Points
●An entire movie can be stored on a disc the same size as a CD by making use of its large
storage capacity.
●Picture quality close to that of the Digital Video Format D1 used for professional video
masters has been realized by the adoption of MPEG2 compression.
●The same realistic sound and excitement as in movie theaters can be enjoyed in homes
through the high-fidelity 5.1 Channel Surround of the Dolby Digital (AC-3) System, and the
high-fidelity 5.1/7.1 Channel Surround of the MPEG system.
●Interactive software, with features such as the multi-story function which allows users to
select a story and the multi-angle function for the selection of viewing angles, can be
designed.
●The multi-language function allows dubbing of up to eight language soundtracks, and
subtitles in a maximum of 32 languages. The multi-aspect function enables the reception
of wide pictures on your TV screen.
DVD-Video technology in detail
■DVD-Video realizes high quality pictures, high quality sound, and multiple
playback functions.
The main features of DVD-Video can be summarized as
high quality pictures, high quality sound, and multiple
playback functions which are made possible by its large
storage capacity and digital signal processing. In terms
of picture quality, high quality images have been realized
with 500 TV-line resolution, better than either laserdisc
or CD-Video.
The Dolby Digital (AC-3) System and the MPEG Audio
System are adopted for sound in addition to the linear
PCM audio. The 7.1/5.1 channel surround in the MPEG
system reproduces high quality sound, while the 5.1
channel surround in the Dolby Digital (AC-3) system
offers high fidelity sound with a very impressive and
realistic sound field not available with Dolby Pro Logic.
Using these high quality pictures and high quality sound
as a basis, interactive functions such as the multi-story,
multi-angle, and multi-language functions are also
achieved.
The most remarkable feature of DVD-Video
is that it offers both the highest
picture quality of home video media and
the capability to record an entire movie, which requires a
long recording time, all available on a single disc the
same size as a CD.
■Video and sound specifications for DVD-Video
DVD-Video
  video compression system: MPEG2 (MP@ML)
  resolution (pixels): 720 x 576 pixels (Note 1)
  horizontal resolution: approx. 500 TV lines
  compression ratio: approx. 1/40
  video bit rate: 9.8 Mbps, max. (variable)
  field/frame: field/frame
  aspect ratio: 4:3/16:9 (pan scan/letter box)
  audio: 8 streams, max. (Note 2)
    Dolby Digital (AC-3): max. 448 kbps (per stream); max 5.1 ch./stream; 48 kHz
    MPEG: max. 912 kbps (per stream); max 7.1 ch./stream; 48 kHz
    Linear PCM: max. 6.144 Mbps (per stream); max 8 ch./stream; 16-bit, 20-bit, 24-bit; 48 kHz, 96 kHz
  subtitles: 2 bits, run length bit map system, 32 streams, max.
Video CD
  video compression system: MPEG1
  resolution (pixels): 352 x 286 pixels (Note 1)
  horizontal resolution: approx. 250 TV lines (same as VHS)
  compression ratio: approx. 1/140
  video bit rate: 1.15 Mbps (fixed)
  field/frame: frame
  aspect ratio: 4:3
  audio: 2 channel (stereo)
  audio system: MPEG1 layer 2
  audio bit rate: 224 kbps (fixed)
  number of channels: 2 ch only
  quantization bit, sampling frequency: 16-bit, 44.1 kHz
  subtitles: open caption only
Laserdisc
  video: analog
  horizontal resolution: approx. 420 TV lines
  aspect ratio: 4:3
  audio: analog 2 channel, digital 2 channel (16-bit/44.1 kHz)
    or Dolby Digital (AC-3): 1 stream, analog 1 channel, digital 2 channel (16-bit/44.1 kHz)
  subtitles: open caption, closed caption
Note 1: In the case of PAL, DVD-Video is not compatible with the high definition system.
Note 2: Either Dolby Digital, MPEG or Linear PCM can be selected for each audio system.
■Data volume can be distributed according to contents to be recorded.
DVD may be regarded as a large container of digital data
and DVD-Video stores pictures and sounds in it as the
main contents. DVD-Video features superior flexibility in
accommodating the source, including subtitles and
dubbing in multiple languages. Data is appropriately
distributed depending on contents of the source within
the framework of the total storage capacity. For
example, when recording a movie, whether subtitles and
dubbing are recorded in several languages, or just in a
single language, will make the recording time much
different for the same title of software. When recording a
music clip which does not require as much recording
time as a movie, its sound can be recorded in the linear
PCM format without compressing the sound since there
is plenty of room in terms of total storage capacity.
■One movie (av. 3.5 Mbps) + subtitle in one language + surround soundtrack in one language = about 160 minutes of recording time = 4.673 GB data volume
  video: 3.5 Mbps
  surround audio x 1 (ex: English): 0.384 Mbps
  subtitles x 1 (ex: Japanese): 0.010 Mbps
  total: 3.894 Mbps × 160 minutes = 4.673 GB < 4.7 GB
■One movie (av. 3.5 Mbps) + subtitles in 4 languages + surround soundtracks in 3 languages = about 133 minutes of recording time = 4.680 GB data volume
  video: 3.5 Mbps
  surround audio x 3: 0.384 Mbps each
  subtitles x 4 (ex: Japanese/English/Chinese/Arabic): 0.040 Mbps
  total: 4.692 Mbps × 133 minutes = 4.680 GB
■One music clip (av. 6.5 Mbps) + 48 kHz 20-bit 2 channel linear PCM audio = about 74 minutes of recording time = 4.673 GB
  video: 6.5 Mbps
  20-bit 2 channel linear PCM audio: 1.92 Mbps
  total: 8.42 Mbps × 74 minutes = 4.673 GB < 4.7 GB
Recordable data volume is the same in all examples.
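The three examples above can be reproduced with a short calculation. The sketch below assumes 1 GB = 1000 MB and 1 byte = 8 bits, which is the convention that matches the figures in this guide; the function name is illustrative.

```python
def data_volume_gb(stream_rates_mbps, minutes):
    """Data volume in GB for streams recorded in parallel for the given playing time."""
    total_mbps = sum(stream_rates_mbps)
    megabits = total_mbps * minutes * 60
    return megabits / 8 / 1000

# Movie with one soundtrack and one subtitle stream, 160 minutes.
movie_one_language = data_volume_gb([3.5, 0.384, 0.010], 160)

# Movie with three soundtracks and four subtitle streams (0.040 Mbps in total), 133 minutes.
movie_multi_language = data_volume_gb([3.5, 0.384, 0.384, 0.384, 0.040], 133)

# Music clip with uncompressed linear PCM: 48 kHz x 20 bits x 2 channels.
pcm_rate_mbps = 48_000 * 20 * 2 / 1_000_000
music_clip = data_volume_gb([6.5, pcm_rate_mbps], 74)

for name, gb in [("movie, one language", movie_one_language),
                 ("movie, multiple languages", movie_multi_language),
                 ("music clip, linear PCM", music_clip)]:
    print(f"{name}: {gb:.3f} GB (fits in 4.7 GB: {gb < 4.7})")
```

All three totals land just under 4.7 GB, which is the point of the figure: the same capacity is traded between playing time, video bit rate, and the number of audio and subtitle streams.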
The high image quality technology of DVD-Video (1) MPEG2 picture compression system
■Why is data compressed?
The data volume which can be stored on a CD is 5440
Mbits [Note 1] (680 MB). In the case of a music CD,
digital audio data equivalent to 74 minutes of playing
time can be stored on one disc. If picture signals in the
PAL format are digitized without compression, the data
volume for one second will be more than 160 Mbits
[Note 2]. At that rate, one disc holds less than 4 minutes
of recording time even if a 4.7 GB DVD disc is used, and
less than 34 seconds if a conventional CD is used.
Storing 74 minutes worth of picture data on a Video CD
was made possible by reducing the number of picture
elements by half in both the vertical and horizontal
directions to 352 x 286 pixels, and then finally reducing
the data to 1/140 by using the MPEG1 compression
system.
What made it possible to store picture data for 133
minutes (note that most popular movies are within 133
minutes) with a resolution of 720 x 576 pixels
(corresponding to PAL) on a DVD-Video disc (single
side, single layer) was the adoption of the efficient data
compression [Note 3] of the MPEG2 system in addition
to tremendous improvements in disc storage capacity.
Note 1: 5440 Mbits is a simply converted figure of 680
MB based on 1 byte = 8 bits.
Note 2: Calculated on 25 pictures with 720 x 576 pixels
a second providing 8 bits to luminance and 8 bits to
color per pixel.
Note 3: The compression ratio is about 1/40, a lower
degree of compression than the 1/140 of Video CD.
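The uncompressed figures follow from the parameters in Note 2. The sketch below assumes those parameters (25 pictures of 720 x 576 pixels per second, 8 bits of luminance and 8 bits of color per pixel), 1 GB = 1000 MB, and the 5440 Mbit CD capacity from Note 1; the variable names are illustrative.

```python
# Uncompressed PAL-size video, per Note 2.
width, height, pictures_per_second = 720, 576, 25
bits_per_pixel = 8 + 8   # luminance + color

rate_mbits_per_second = width * height * pictures_per_second * bits_per_pixel / 1_000_000

# Disc capacities in Mbits.
dvd_mbits = 4.7 * 1000 * 8   # 4.7 GB single sided, single layer DVD
cd_mbits = 5440              # 680 MB CD, per Note 1

dvd_minutes = dvd_mbits / rate_mbits_per_second / 60
cd_seconds = cd_mbits / rate_mbits_per_second

print(f"uncompressed rate: {rate_mbits_per_second:.1f} Mbits per second")
print(f"DVD holds about {dvd_minutes:.2f} minutes")
print(f"CD holds about {cd_seconds:.1f} seconds")
```

The rate works out to about 166 Mbits for each second of video, so a 4.7 GB disc would hold under 4 minutes and a CD about 33 seconds, consistent with the limits stated in the text.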
■Playing time of movies
movies of over 133 minutes
playing time: 7%
movies of less than 133 minutes
playing time: 93%
At present, about 93% of popular
movies are less than 133 minutes long
■Profile and level of MPEG2
MPEG2 was originally designed as a universal encoding
system which can also be used in transmission media
for communication applications. The “profile (five types)”
is used to indicate combinations of functions suitable to
a number of applications to maintain compatibility
between media, while the “level (four types)” is used
to indicate the picture size and frame frequency, and
both are prescribed in the MPEG2 format
specifications. Of the eleven currently proposed
combinations of profiles and levels, what was adopted
for DVD-Video is “MP@ML (Main Profile/Main Level).”
Until digital HDTV formats become popular in the future,
many different applications other than DVD-Video may
be put into practical use using the MP@ML standards.
■Currently prescribed profiles/levels
Profiles: Simple, Main, SNR Scalable (Note 2), Spatial Scalable, High
High level: 1920 × 1080 × 30 or 1920 × 1152 × 25 (Note 1)
  MP@HL: US digital HDTV
  HP@HL
High-1440 level: 1440 × 1080 × 30 or 1440 × 1152 × 25 (Note 1)
  MP@H1440
  SSP@H1440: European digital HDTV
  HP@H1440
Main level: 720 × 480 × 29.97 or 720 × 576 × 25 (Note 1)
  SP@ML: Digital transmission cable TV
  MP@ML (Note 3): DVD-Video, Digital satellite broadcasting (PerfecTV and others)
  SNP@ML
  HP@ML
Low level: 352 × 288 × 29.97 (Note 1)
  MP@LL
  SNP@LL
Note 1: Shows the standard number of horizontal pixels x vertical pixels x frame frequency
Note 2: SNR = Signal Noise Ratio
Note 3: MP@ML = Main Profile at Main Level
■Hybrid coding by combining three main methods
Roughly speaking, the MPEG2 motion picture compression
method (MPEG1 is the same in basic concept) is a
combination of three main methods: “DCT (Discrete
Cosine Transform),” which uses correlation within
pictures to compress data; “Motion Compensation,”
which uses correlation between pictures to compress
data; and “Huffman Code Processing,” which employs
correlation of code rows.
DCT uses the correlation of the spatial direction of
pictures to remove the redundancy of data in the spatial
axis, and Motion Compensation uses the correlation of
the time direction in pictures to remove the redundancy
of data in the time axis. In Motion Compensation, data is
compressed to about 1/2; by DCT, 1/10 to 1/20; and by
Huffman Code Processing, 2/3 to 1/2. Data is
compressed to roughly 1/40 of the original volume in
total. Reducing data to 1/40 means that picture signals
of 240 Mbps can be sent at the rate of 6 Mbps, or 160
Mbps at 4 Mbps.
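As a rough check of these figures, the per-stage ratios can simply be multiplied together. The sketch below is illustrative only (the names are not from any standard): it shows that the quoted overall figure of about 1/40 lies between the strongest and weakest combinations of the quoted ranges, and reproduces the two bit-rate examples.

```python
from fractions import Fraction


def combined_ratio(*stage_ratios):
    """Overall compression ratio when the stages are applied one after another."""
    total = Fraction(1)
    for ratio in stage_ratios:
        total *= ratio
    return total


# Ranges quoted in the text: motion compensation about 1/2,
# DCT 1/10 to 1/20, Huffman code processing 2/3 to 1/2.
strongest = combined_ratio(Fraction(1, 2), Fraction(1, 20), Fraction(1, 2))
weakest = combined_ratio(Fraction(1, 2), Fraction(1, 10), Fraction(2, 3))
typical = Fraction(1, 40)  # the overall figure given in the text


def compressed_rate_mbps(source_mbps, ratio=typical):
    """Bit rate after compression, in Mbps."""
    return source_mbps * ratio


print(strongest, weakest)            # bounds on the overall ratio
print(compressed_rate_mbps(240))     # 240 Mbps source
print(compressed_rate_mbps(160))     # 160 Mbps source
```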
Compressing and decompressing motion pictures by
these methods requires an enormous amount of
complicated calculation. The LSI of MPEG2 decoder
circuits performing these calculations at high speed has
made commercialization of the DVD-Video player
possible.
The superior features of DVD-Video (employing MPEG2)
over the Video CD (which uses MPEG1) are 4 times the
number of picture elements (2 times each in the
horizontal and vertical directions) and pictures
interlaced at 60 fields/second (MPEG1 is non-interlaced
with 30 frames/sec.). MPEG2, the higher standard, is
compatible with MPEG1.
■Picture compression in DVD-Video
[Figure: the compression chains of DVD-Video and Video CD compared]
DVD-Video: data is compressed to about 1/40 on average by MPEG2
1. Video data input
2. Motion compensation (compressed by correlation between pictures), about 1/2
3. Discrete Cosine Transform (compressed by correlation within pictures), about 1/10
4. Huffman Coding (compressed by correlation of code rows), about 1/2
5. Variable rate compression: data is distributed efficiently according to differences in video data volume
6. Picture quality nearly equivalent to studio-use master tape (D1)
Video CD: data is compressed to about 1/140 on average by MPEG1
1. Video data input
2. Motion compensation (compressed by correlation between pictures), about 1/5
3. Discrete Cosine Transform (compressed by correlation within pictures), about 1/14
4. Huffman Coding (compressed by correlation of code rows), about 1/2
5. Fixed rate compression: data is distributed evenly regardless of differences in video data volume
6. Picture quality equivalent to home use VHS video
Questions & Answers
What is MPEG?
MPEG stands for the Moving Picture Experts
Group. This is the popular name of the working
committee of experts who worked on encoding
motion pictures as a sub-group (WG11) of JTC1/SC29
of ISO/IEC. The MPEG name is also used for
the standard agreed on in this committee and
approved as the international standard by ISO/IEC.
MPEG2 is a more sophisticated, more powerful
successor to MPEG1, which was adopted in 1991 as
the standard for CD-ROM. MPEG2 was adopted in
1994 as the universal coding system for various
transmission media including broadcasting and
communication as well as for storage media such
as optical disc.
The formal names of the standards are
ISO/IEC 11172 for MPEG1 and ISO/IEC
13818 for MPEG2.
Spatial axis compression -
discrete cosine transform
■DCT removes unnoticeable frequency elements.
The nature of human perception is deeply considered in
the compression of data for both video and audio.
Human vision perceives a picture with some of its high
frequency elements omitted as the same as a picture
without any omissions. The total data is reduced by
selectively omitting unnoticeable frequency elements
after the picture signals are resolved into several
frequency element groups. The frequency element
resolving process of MPEG2 (MPEG1 is basically the
same) is called the “Discrete Cosine Transform” method,
and the selective omission of specified frequency
elements, taking advantage of limitations in human
perception, is called the “Quantize Matrix” method.
These processes are done by dividing a frame of the
picture into blocks of 8 x 8 elements (6480 blocks in
total); the luminance levels of the elements in each
block are converted into values, and then converted
into frequency element data.
■Basic concept of spatial axis compression
[Figure: the original wave form is resolved into frequency A,
frequency B, frequency C and frequency D. Frequency D (high
frequency elements) is removed selectively, leaving
freq A + freq B + freq C.]
■Encoding process for spatial axis compression
[Figure: one frame of 720 dots x 576 elements is divided into
6480 blocks of 8 x 8 elements. The example blocks show a few
nonzero values (10, 5, 1, 5) near the upper left and 0s
everywhere else.]
Changes in luminance and color are small in some parts
of a frame (low frequency) while changes are large in
other parts (high frequency), and a frame can be
considered as an overlaid composite picture comprised
of a very low frequency picture, a low frequency
picture, a high frequency picture, and a very high
frequency picture.
One frame is divided into small blocks, and the
luminance and color values of the picture elements in
each block are converted into numerical data. Then, the
data is converted by DCT into an 8 x 8 block of
frequency data.
Larger picture element values, distributed at random
before the DCT conversion, tend to gather in the low
frequency region (upper left area of the block) after
the DCT conversion.
Values after conversion are divided by a specific number
and the results are rounded (quantization step) so that
many 0s line up in the higher frequency region (lower
right area of the block). Removing these zeros (higher
frequency region) compresses the total data.
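The two steps described in this section, DCT followed by quantization, can be tried on a single 8 x 8 block. The following is a minimal sketch, not an MPEG2 encoder: it uses a textbook orthonormal 2-D DCT, a single quantization step of 16 for every coefficient (an assumption made here; MPEG2 uses a quantize matrix with a different value per frequency), and a made-up block whose luminance changes gently.

```python
import math

N = 8  # block size used in this section


def dct_2d(block):
    """Orthonormal type-II 2-D DCT of an N x N block of picture element values."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coefficients, step):
    """Divide every coefficient by the step and round to the nearest integer."""
    return [[round(value / step) for value in row] for row in coefficients]


# Hypothetical block: luminance rises slowly, so it is mostly low frequency.
block = [[100 + 2 * x + y for y in range(N)] for x in range(N)]

quantized = quantize(dct_2d(block), step=16)
zero_count = sum(value == 0 for row in quantized for value in row)

# Number of 8 x 8 blocks in one 720 x 576 frame, as quoted in the text.
blocks_per_frame = (720 // N) * (576 // N)

for row in quantized:
    print(row)
print(zero_count, "of", N * N, "values are 0;", blocks_per_frame, "blocks per frame")
```

For this smooth block the only surviving values sit in the upper left corner, which is the behavior the captions above describe.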
Time axis compression -
predictive coding with motion compensation
■Predictive coding constructs image by predicting motion from preceding and
succeeding pictures
An image in PAL television and video is composed of 25
frames/second (a frame consists of 625 scanning lines)
and the display equipment reproduces a picture of 50
fields/second since it is scanned by interlacing to
eliminate flickering. A large part of each picture is made
of the same elements if the 25 continuous pictures per
second are compared with each other, and they do not
change much in a short time.
If there is a changing part in the 25 pictures, the data
volume necessary to store and reproduce the pictures
will be much smaller if only the changed part
(differential) is stored and the other part is stored for
one picture for synthesizing with the differential to
reproduce pictures.
The method to reconstruct the original picture by
extracting and coding the differential between
continuous pictures is called “Predictive Coding”
because the current picture is predicted from the picture
immediately preceding it. As described in detail in the
following section, “Time axis compression, GOP
construction and bi-directional prediction,” MPEG
provides the coding to predict the current picture from
the immediately preceding and succeeding pictures by
bi-directional prediction.
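The idea of storing only the differential can be shown on one short row of picture elements. This is a simplified sketch with made-up values and function names. It takes the differential as current minus preceding, which is one possible sign convention, and it ignores the transform and quantization that a real encoder applies to the differential.

```python
def differential(previous, current):
    """Element-by-element change from the preceding picture to the current one."""
    return [c - p for p, c in zip(previous, current)]


def reconstruct(previous, diff):
    """Rebuild the current picture from the preceding picture and the differential."""
    return [p + d for p, d in zip(previous, diff)]


# One row of luminance values from two consecutive pictures (hypothetical data).
previous_row = [50, 50, 50, 80, 80, 50, 50, 50]
current_row = [50, 50, 50, 80, 90, 60, 50, 50]

diff = differential(previous_row, current_row)
unchanged = diff.count(0)

print(diff)                                             # mostly zeros
print(reconstruct(previous_row, diff) == current_row)   # decoder recovers the row
```

Because most entries of the differential are 0, it is much cheaper to code than the full picture.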
■Motion Compensation extracts only motion changes
There are two types of changing parts or parts with
motion in pictures: one does not change its shape but
changes its position as time elapses, while the other
changes its shape as time elapses. For the former, the
data of the shape should be used as it is. The amount
of position change without a change of shape which
takes place with time, or the amount of movement, is
called the “motion vector.” By coding only the
movement, the original picture can be reconstructed
when reproduced (in the decoding process) from a
smaller data volume. This method, which uses the
motion vector, is called “Motion Compensation.”
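The text does not say how an encoder finds the motion vector. One common approach, assumed here purely for illustration, is an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). In this sketch the vector points from a block in the current frame back to the best-matching position in the preceding frame; all names and the test frames are hypothetical.

```python
def block_at(frame, top, left, size):
    """Cut a size x size block out of a frame (a list of rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def find_motion_vector(previous, current, top, left, size, search):
    """Return ((dy, dx), cost) of the best match in `previous` for a block of `current`."""
    target = block_at(current, top, left, size)
    height, width = len(previous), len(previous[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(target, block_at(previous, y, x, size))
                if best is None or cost < best[1]:
                    best = ((dy, dx), cost)
    return best


def make_frame(object_top, object_left, frame_size=8, object_size=2):
    """Dark frame with one bright square object at the given position."""
    frame = [[0] * frame_size for _ in range(frame_size)]
    for y in range(object_top, object_top + object_size):
        for x in range(object_left, object_left + object_size):
            frame[y][x] = 200
    return frame


# The object keeps its shape but moves from (row 2, col 1) to (row 3, col 4).
preceding_frame = make_frame(2, 1)
current_frame = make_frame(3, 4)

vector, cost = find_motion_vector(preceding_frame, current_frame,
                                  top=3, left=4, size=2, search=3)
print(vector, cost)
```

A cost of 0 means the block is reproduced exactly from the preceding frame, so only the vector needs to be coded for it.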
■Predictive coding and motion compensation
[Figure: pictures (1) and (2) and their differential (1) - (2);
a preceding frame and a current frame, each marked with the
position of the moving part.]
1. Pictures (1) and (2) to be reproduced as time elapses
have many elements common to each other. The
differential, (1) minus (2), is extracted and (2) is
reproduced by synthesizing the differential with (1).
2. In the motion compensation process, pictures
are divided into blocks and the motion vector
which indicates the amount of movement is
extracted and coded for the part whose
position changes, without changing its shape.
By combining this motion compensation with
predictive coding, data is more efficiently
compressed.
Time axis compression -
GOP construction and bi-directional prediction
■GOP construction by I picture, P picture, and B picture
To perform predictive coding employing motion
compensation, MPEG2 uses a GOP (group of pictures)
construction made of three elements called the I picture,
P picture, and B picture, and organizes roughly
15 consecutive video frames into each GOP. The I picture
(Intracoded picture) is compressed by DCT using
information within the frame only, without predicting
the motion from the preceding frame.
If pictures constructed by predictive coding are simply
lined up one after another, pictures cannot be displayed
instantly when accessed at random. A reference point
for access is therefore provided periodically to allow
random access. The I picture serves to keep each GOP
independent, so to speak.
The frequency of the I picture is normally one I picture
per 15 frames; it is decided based upon the random
access capability required for applications. The data
volume of an I picture is 2 to 3 times that of the P
picture and 5 to 6 times that of the B picture. The GOP
is the group of pictures from one I picture to the next
I picture. Thus, in simple terms, picture prediction is
performed within pictures in the same group.
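■An example of picture arrangement in GOP
.... I B B P B B P B B P B B P B B I ....
(time runs from left to right; one GOP runs from an I picture
up to the picture before the next I picture)
I picture = Intracoded picture
P picture = Predictive coded picture
B picture = Bi-directionally predictive coded picture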
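The arrangement above can be generated and costed with a few lines of code. The sketch below is illustrative: the function names are made up, and the size ratios are the "2 to 3 times" and "5 to 6 times" figures quoted in the text, used here to estimate how much data one 15-frame GOP needs when measured in I pictures.

```python
from fractions import Fraction


def gop_pattern(length=15, p_spacing=3):
    """Picture types of one GOP in display order: I first, a P every p_spacing frames."""
    return "".join(
        "I" if i == 0 else "P" if i % p_spacing == 0 else "B"
        for i in range(length)
    )


def gop_cost_in_i_pictures(pattern, i_to_p, i_to_b):
    """Data volume of a GOP in units of one I picture.

    i_to_p / i_to_b: how many times larger an I picture is than a P / B picture.
    """
    size = {"I": Fraction(1), "P": Fraction(1, i_to_p), "B": Fraction(1, i_to_b)}
    return sum(size[picture] for picture in pattern)


pattern = gop_pattern()
smallest = gop_cost_in_i_pictures(pattern, i_to_p=3, i_to_b=6)
largest = gop_cost_in_i_pictures(pattern, i_to_p=2, i_to_b=5)

print(pattern)
print(pattern.count("I"), pattern.count("P"), pattern.count("B"))
print(smallest, largest)   # compare with 15 if every frame were an I picture
```

With these ratios a 15-frame GOP takes roughly the data of 4 to 5 I pictures, instead of the 15 that coding every frame as an I picture would need.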
■The P picture and B picture
The P picture (predictive coded picture) is made by
predictive coding from the immediately preceding I
picture or P picture. The P picture may be
defined as an “interframe forward direction predictive
coded picture” in relation to the I picture, which itself
is an “intraframe coded picture.” The B picture is a
“bi-directional predictive coded picture” and is made
by prediction from two reference pictures (I or P
pictures), the preceding and the succeeding. Taking a
close look at the relationship between the I picture, P
picture, and B picture in a GOP, the first step
prediction is made in the forward direction from the I
picture located at the beginning of the GOP, and the P
pictures are made. At that time, the P pictures are
positioned jumping over a number of B pictures to be
constructed later.
A number of B pictures are made by the second
step prediction from the first I picture and the P picture
coded at the first step prediction (bi-directional
prediction) and they are positioned between the I picture
and the P picture. Another set of B pictures is made
between the first P picture and the second P picture.
The B picture, when decoded, compensates motion
using two motion vectors and two reference pictures,
preceding and succeeding. Bi-directional prediction, a
feature of MPEG, uses two pictures, the past and future
pictures timewise, for prediction to achieve highly
efficient prediction.
■Forward and bi-directional prediction
[Figure: picture sequence I B B P B B P, with forward
prediction from I to P and from P to P, and bi-directional
prediction for the B pictures in between.]
The P picture is constructed by forward predicting
the immediately preceding I picture or P picture
while the B picture is made by bi-directional
prediction using the immediately preceding and
succeeding pictures.
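Because a B picture needs both of its reference pictures before it can be reconstructed, the reference that follows it in display order has to be handled first. The sketch below illustrates that reordering under a simplifying assumption made here: each B picture depends on the nearest I or P picture on either side. The picture labels and the function name are hypothetical, and the stored order on an actual disc is not described in this text.

```python
def coding_order(display_order):
    """Reorder pictures so that every B picture follows both of its references."""
    ordered = []
    waiting_b = []                      # B pictures waiting for their later reference
    for picture in display_order:
        if picture.startswith("B"):
            waiting_b.append(picture)
        else:
            ordered.append(picture)     # the I or P reference goes first
            ordered.extend(waiting_b)   # then the B pictures that depend on it
            waiting_b.clear()
    ordered.extend(waiting_b)           # trailing B pictures, if the list ends on them
    return ordered


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
```

In the result, P3 is placed ahead of B1 and B2, matching the description of P pictures "jumping over" the B pictures that are constructed later.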
Huffman coding
■Huffman coding: a statistical method for data compression
In addition to motion compensation, which removes data
redundancy using correlations of the time axis in
pictures, and DCT, which removes data redundancy
employing correlations of the spatial axis in pictures,
MPEG2 (MPEG1 is basically the same) uses a method
called Huffman coding to achieve a further reduction in
data redundancy.
Huffman coding uses correlation in the code row made
by MPEG processing to compress data (for example, five
consecutive 0s are expressed as 0 x 5 instead of lining
up five 0s). This may be called a “statistical” method if
motion compensation is called a “time” method and
DCT a “spatial” method. Picture data is compressed
spatially, statistically, and chronologically in MPEG2, and
the total volume of codes generated is controlled by the
quantization step for efficient data transmission.
Questions & Answers
What is 4:2:0 coding in MP@ML?
In MP@ML (Main Profile at Main Level) of MPEG2, a
component system called “4:2:0” is adopted for
encoding video signals. Numbers 4, 2, and 0
indicate the sampling frequency ratio of the
luminance signal (Y) and two color differential
signals (Cb and Cr) included in the horizontal
scanning lines of video pictures, or the resolution
ratio. For one picture element [Note 1], 8 bits are
used for luminance and 8 bits each are used for
color differential. The eye is not so sensitive to color
as it is to luminance, and because the human eye is
unable to perceive the reduction of color information
as picture quality degradation in relation to
luminance information, data can be compressed
with no perceivable visual difference.
Pictures without color information reduction are
called “4:4:4,” pictures reduced to half in the
horizontal direction are called “4:2:2,” and pictures
reduced to half in both the horizontal and vertical
directions are called “4:2:0.” In the “4:2:0” system,
color information is one fourth of luminance
information. The DV system, which has achieved
excellent color reproduction for home use digital
video, employs a “4:1:1” type component signal
system that reduces color information to one fourth
of luminance information, and its data volume is the
same as the “4:2:0” of the MP@ML of MPEG2. The
“4:2:0” [Note 2] coding system is normally
processed and output as “4:2:2” after lines are
supplemented by processing within the LSI.
Note 1: In MPEG2, 8 bits are used for each of
luminance Y, and color differential Cb and Cr in one
picture element (3 x 8 = 24 bits).
Note 2: The sampling frequency for all of Y, Cb, and
Cr is 13.5 MHz in “4:4:4.” In “4:2:2”, 13.5 MHz for
Y, 6.75 MHz for Cb and Cr. In “4:1:1”, 13.5 MHz for
Y, 3.375 MHz for Cb and Cr. In “4:2:0”, 2 lines with
different sampling rates are alternately repeated.
One is 13.5 MHz for Y, 6.75 MHz for Cb and Cr is
not sampled, while the other is 13.5 MHz for Y, Cb
is not sampled, and 6.75 MHz for Cr.
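The data volumes of these component systems can be compared by counting the average number of bits per picture element. The sketch below is a simplified illustration with made-up names: it assumes 8 bits per sample (Note 1) and reads "one fourth" as each color differential signal keeping one sample for every four luminance samples.

```python
from fractions import Fraction

# Fraction of luminance samples kept for EACH color differential signal (Cb, Cr).
CHROMA_FRACTION = {
    "4:4:4": Fraction(1),      # no color information reduction
    "4:2:2": Fraction(1, 2),   # half in the horizontal direction
    "4:2:0": Fraction(1, 4),   # half horizontally and half vertically
    "4:1:1": Fraction(1, 4),   # one fourth in the horizontal direction (DV)
}


def average_bits_per_pixel(system, bits_per_sample=8):
    """One Y sample per picture element plus the reduced Cb and Cr samples."""
    return bits_per_sample * (1 + 2 * CHROMA_FRACTION[system])


for system in CHROMA_FRACTION:
    print(system, average_bits_per_pixel(system))
```

This gives 24 bits for "4:4:4" as in Note 1, and the same value for "4:2:0" and "4:1:1", in line with the statement that DV and MP@ML carry the same data volume of color information.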
Are there any differences between DV and DVD data compression?
The DV format of digital video recording is available
as home use video media. It is quite different from
DVD-Video in terms of compression because of the
basic physical difference between the two forms of
media, magnetic tape and optical disc. Moreover,
the DV format uses only “within picture correlation”
compression while DVD employs both “within
picture correlation” and “time axis correlation”
compression.
What is the reason for this? DV was designed from
the very beginning as a means to provide video
recording and playback in the home, so tape editing
at any point is as important a requisite as
recording and playback. Data compressed on the
time axis has to be decompressed each time it is
played back or edited, and data must be
re-compressed on the time axis to keep the data on
the tape. This requires tremendous signal
processing power, and is very inefficient.
For this reason, taking advantage of the far larger
storage capacity of tape over disc, “within picture
compression” is used in DV for compressing data
on the tape while DVD must also rely on “time axis
compression” because of capacity limitations.
In addition to DVD, MPEG2 can also be applied in
digital TV broadcasting, and is expected to be
advantageous in multi-channel broadcasting. To
make this possible, the bandwidth per channel
must be kept as narrow as possible, so time axis
compression will be employed.