1. What is surround?
1-1. Stereo and surround
4-4. Monitoring the decoder output
5. Monitor systems
Surround sound has evolved into more than the experience heard in cinemas. Through the introduction of
the DVD, it has invaded almost every aspect of our lives: our homes, our cars, and even our workplaces.
We now listen to multi-channel audio delivered via television programs, video games, and even by the
music of our favorite bands. With the introduction of the DM2000, DM1000, and O2R96 Digital
Consoles, Yamaha provides a platform that includes complete surround sound mixing and monitoring
capabilities for studios of all types. These consoles offer a vast array of features and functions that enable
the user to create a world of multi-channel content.
Masataka Nakahara (the celebrated acoustician/studio designer and the author of this booklet) and SONA
Corporation have designed and supported numerous THX
representatives in Japan, they continually inform and educate studio owners in the calibration and design
of studio playback systems. During the development of these consoles, Mr. Nakahara offered his years of
experience to assist in the design of the surround monitoring capabilities. In conjunction with THX
engineers, the release of the Version2 software expands their features even further. This THX pm3
Approved revision includes the addition of THX presets for film, DVD, and music mixing. These are the
same settings used in THX certified studios.
Studios have a long track record in mixing mono and stereo content, but for some industry professionals,
multi-channel mixing is relatively new. There are more channels, more equipment, and more techniques
to be learned. How do you set up your studio? Do you use bass management? There are many questions to
be answered. This booklet offers an excellent compilation of the knowledge required to construct a
properly configured surround playback environment. Much of this document shares the same principles
as the THX pm3 program. We are proud of our association with Yamaha, Mr. Nakahara, and SONA
Corporation and their efforts to create a manual to help guide the user. It is my sincere wish that engineers
carefully read this guidebook in order to obtain an accurate understanding of the surround monitoring
functionality provided by the Yamaha digital consoles. Here are the tools. Now, it's up to you to create the
perfect mix.
As one whose profession is the acoustical design of studios, I place great value on the parting ceremony
of handing over to its new owner my creation (studio) whose playback environment and acoustical
response I have ensured.
In order to actualize these characteristics in a multichannel studio, it is necessary to collect the
fragmentary technical information provided by various standards organizations and manufacturers, and
then to organize and understand this information.
Doing so takes an enormous amount of time, but one of the most valuable things I gained from the
process has been friendships with many superb professionals in the field, including Mr. Steven Martz
from THX.
As the lessons I learned from them began to take root in me, I have been acquiring valuable new strategies
and techniques for studio design.
Initially, I had doubts regarding techniques that seemed at first glance to conflict with a professional
approach, such as bass management and diffused surround, but as I spent time with professionals of
multi-channel audio, I came to see why many top-ranked experts with far more experience than myself
held these opinions and requirements for surround studios. In the process, I gradually obtained a glimpse
of various problems and aspects of surround playback that lie behind such questions.
This publication brings together much valuable information obtained from first-rate professionals such as Steven from THX. I consider myself to have been a “ghost-writer” for these
experts, and think of them as the real authors of this booklet.
I would like to take this opportunity to extend my thanks to each of them.
In view of these intentions, portions of this booklet dealing with various standards have been written so as
to list the various multichannel formats as broadly, fairly, and accurately as possible.
I beg the indulgence of the reader for allowing me to include material that represents my own opinion as
an acoustic designer.
In my opinion, user experience as a listener is of great value in the production process.
In order for this to be so, a space for hearing multichannel audio in a correct playback environment is a
requirement not only for commercial applications but also for personal applications.
This is a case of “one hearing is better than a hundred views.”
It is my hope that this booklet will be a step toward obtaining the “hundred views” that will give you the
confidence to construct your own multichannel playback environment.
The most important consideration for a studio monitoring environment is that “the response of all
channels be consistent.”
The second most important consideration is that this consistent response be “good response.”
We could list numerous parameters for deciding whether the response is “good,” ranging from subjective
to physical, but the key point is that there be no large peaks or dips in the frequency response.
In the case of two-channel, it is fairly easy to create an environment in which “the response of all
channels — i.e., L and R — is consistent.” We simply need to ensure that the shape of the room and the
placement of the speakers are symmetrical between left and right.
In the case of multi-channel, on the other hand, it is often difficult to obtain a consistent playback
response for all channels simply by creating a symmetrical speaker placement and room shape.
Mixing of the final product must be done in a properly configured playback environment.
No matter how high the grade of your equipment, it is impossible to create a final mix unless you have a
good-sounding playback environment.
The essential identity of a professional studio is in its good monitoring environment.
The arrival of multi-channel is a good opportunity for us to reconsider the question of “what is a studio
monitoring environment?”
1-2-1. 3-1 ch
This method is based on a two-channel system (L, R), and adds a center channel (C) and surround
channel (S).
Although there are two surround speakers, one each at left and right, the playback is monaural.
The “3” in “3-1” indicates L, C, and R, and the “–1” indicates S.
Note that if “3-1” is expressed as “3.1,” this means “L, C, R” + “LFE.”
1-2-2. 5.1 ch
This method is based on the 3-1 ch system, but changes the surround to stereo (LS, RS) and adds an LFE
(Low Frequency Effect) channel for low-frequency effects.
The LFE channel is played back through a dedicated subwoofer designed for low-frequency playback.
1-2-3. 6.1 ch
This method is based on the 5.1 ch system, and adds a new back-surround channel (BS).
If two speakers are provided to play back the back-surround channel, these are sometimes called BSl and
BSr, but the signal that is played back is a monaural signal where BSl = BSr.
1-2-4. Other
Other formats include 3-2 (without LFE) and 2-2 (without C and LFE), which are based on 5.1ch but
omit specific channels.
Formats with a greater number of channels than 6.1ch include 7.1ch.
7.1ch can be subdivided into the SDDS format which is used in film, and Dolby ProLogic IIx which is
used in DVD-Video etc.
SDDS is a discrete 7.1ch format which adds LC and RC channels between L and C and between R and C
respectively, and is used in applications such as supplementing the center gap between screen speakers in
large movie theaters. Since the 7.1ch SDDS format is compatible with 5.1ch, we can say that SDDS
supports both 5.1ch and 7.1ch configurations.
Dolby ProLogic IIx uses matrix logic processing within the decoder to stereoize BS (BSl, BSr), and at
present is targeted for surround processing in the playback system of consumer decoders (receivers).
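The channel formats described in this chapter can be summarized as a small lookup table. The following Python sketch is illustrative only: the dictionary name `CHANNEL_FORMATS` and the helper `count_channels` are hypothetical, and the channel labels simply follow the abbreviations used above.

```python
# Channel layouts as described in this chapter (names are illustrative).
# In 3-1, "S" is one monaural surround signal, even though it is usually
# played back through two speakers. In 6.1, "BS" is likewise monaural.
CHANNEL_FORMATS = {
    "3-1": ["L", "C", "R", "S"],
    "5.1": ["L", "C", "R", "LS", "RS", "LFE"],
    "6.1": ["L", "C", "R", "LS", "RS", "BS", "LFE"],
    "3-2": ["L", "C", "R", "LS", "RS"],        # 5.1 without LFE
    "2-2": ["L", "R", "LS", "RS"],             # 5.1 without C and LFE
    "7.1 (SDDS)": ["L", "LC", "C", "RC", "R", "LS", "RS", "LFE"],
}


def count_channels(fmt):
    """Return (number of main channels, number of LFE channels)."""
    channels = CHANNEL_FORMATS[fmt]
    lfe = channels.count("LFE")
    return len(channels) - lfe, lfe
```

For example, `count_channels("5.1")` returns five main channels plus one LFE channel, which is where the ".1" in the format name comes from.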
Current multi-channel systems were developed to maintain compatibility with previous systems, and have
not been researched or developed in order to reproduce a 360° virtual acoustic space.
This means that if you expect current multi-channel systems to deliver full virtual acoustic playback
capability, you will be at your wits' end. In particular, sound images directly to the side (the phantom
sound image of L and LS, or the phantom sound image of R and RS) are difficult to portray with current
speaker configurations, due to the physiology of hearing.
The key to multi-channel production is how to make effective use of the newly-obtained channels to
create a product with the maximum “entertainment value.”
In our consideration of multi-channel monitoring, it is important to understand the following three key
points.
[Fig. 2] Three keys of multichannel monitoring: multichannel formats, playback environment, and bass management
In addition to the above three points, this document will discuss the construction of a monitor system, and
the measurements and adjustments that are necessary in order to create a multi-channel playback
environment.
It should be noted that this booklet is written for medium-to-small multichannel studios, and that much of
the material (e.g., speaker placement, delay adjustment, bass management) will not apply to surround
monitoring in a large space, such as in a movie theater or in a dubbing studio where the final mix of a film
is being made.
At present, multi-channel playback is supported by numerous types of consumer media, of which DVD is one.
The playback response for each of these types of media is defined by the organizations or manufacturers listed
below.
| Media | Playback response specification organization | Storage method used (Note) | Media standards organization |
|---|---|---|---|
| Film | SMPTE | Dolby DIGITAL, DTS, SDDS, and others | << SMPTE, ISO |
| DVD-Video *1 | Dolby lab., DTS | Dolby DIGITAL, DTS, and others | << DVD Forum WG1 |
| DVD-Audio | DVD Forum WG4 | LPCM, PPCM (Packed PCM, MLP) *2 *3 | = DVD Forum WG4 |
| Super Audio CD | Sony, Philips | DST coded DSD *4 | = Sony, Philips |
| Digital broadcast | ARIB *5 | MPEG-2 AAC | < ISO, IEC |
| Digital broadcast | Dolby lab. *6 | Dolby DIGITAL ** | - |
| Digital broadcast | DTS *7 | DTS ** | - |
| Digital broadcast | Administrative body *8 | MPEG-2 | < ISO, IEC |
| GAME | Dolby lab., DTS | Dolby, DTS | << Hardware manufacturers |

Other: matrix methods *9 such as Dolby Surround, Dolby ProLogic II(x), and Circle Surround

(Notes)
“<<” Within the recording format specified by the standards organization, the actual recording method and playback response are provided by another party.
“<” The recording method specified by the standards organization is used, and the applying organization considers the playback response.
“=” The standards organization directly specifies the recording method and the playback response.
*1 DVD-Video also allows LPCM multichannel recording.
*2 The PPCM algorithm is provided by Meridian Audio Ltd.
*3 For PPCM, maximum 96 kHz/24-bit/6ch.
   For LPCM, maximum 96 kHz/24-bit/4ch, 96 kHz/20-bit/5ch, 96 kHz/16-bit/6ch.
   (For 2ch, maximum is 192 kHz/24-bit)
*4 (For 2ch, Plain DSD (uncompressed DSD) is also possible)
*5 Japan
*6 Europe, USA and Korea
*7 Europe, etc.
*8 Europe, etc.
*9 Can also be applied to analog broadcast.
** Indicates that this is not a broadcast media standard, but a recording format standard.
[Table 1] Multi-channel formats and standards organizations
Each format of multi-channel media is characterized by a combination of “surround processing method,”
“encoding and compression method,” “recording response,” and “playback response.”
Most of these types of media provide “downmixing” functionality to allow two-channel playback.
[Fig. 3] Factors that feature multichannel media: surround processing; A/D and D/A, compression; record specification; playback specification; downmixing
Currently, the following major multi-channel formats exist as mass consumer media.
There are two types of surround processing method: “matrix” and “discrete.”
2-1-1. Matrix
This method uses phase synthesis technology to record a larger number of channels on a limited number
of tracks.
This means that for some channels, there may be restrictions in playback bandwidth and channel
separation (crosstalk).
Matrix processing is often used for analog recording where the number of tracks is limited, such as for the
analog tracks of a film, or on video cassette tape.
However in principle, it could also be applied to digital media such as CD.
Recently, 5.0 matrix formats using Dolby Pro Logic II have been used frequently in game media.
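The idea of folding four channels into two tracks can be illustrated with a passive sum/difference matrix. The following Python sketch is a simplification under stated assumptions, not the specification of any commercial system: the -3 dB (1/√2) coefficients are an assumed textbook choice, the phase shift normally applied to the surround signal is omitted, and the function names `encode_3_1` and `decode_3_1` are hypothetical. It shows why the decoded center is roughly the in-phase component of Lt and Rt, why the decoded surround is roughly the anti-phase component, and why channel separation is limited.

```python
import math

G = 1.0 / math.sqrt(2.0)  # assumed -3 dB mixing coefficient


def encode_3_1(l, r, c, s):
    """Fold one sample of L, R, C, S into two tracks (Lt, Rt).

    C is added in phase to both tracks; S is added in anti-phase.
    """
    lt = l + G * c + G * s
    rt = r + G * c - G * s
    return lt, rt


def decode_3_1(lt, rt):
    """Passive decode: returns (L', R', C', S') for one sample.

    C' is the in-phase (sum) part and S' the anti-phase (difference)
    part of Lt and Rt. No steering logic is applied.
    """
    return lt, rt, G * (lt + rt), G * (lt - rt)
```

Feeding a signal into L only, `decode_3_1(*encode_3_1(1.0, 0.0, 0.0, 0.0))` also produces output in C' and S' at about -3 dB. This leakage is the crosstalk restriction mentioned above, which active decoders reduce with steering logic.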
[Fig. 4] 3-1 matrix. Production: the L, R, C, and S channels of the master are surround-processed into Lt (L total) and Rt (R total) and stored on the media (movie, VHS, etc.). Playback by end-users: surround processing of Lt and Rt yields L, R, C’ (≒ in-phase signal of Lt and Rt), and S’ (≒ anti-phase signal of Lt and Rt).
[Fig. 5] 5.0 matrix. Production: the L, R, C, (LFE), LS, and RS channels of the master are surround-processed into Lt (L total) and Rt (R total) and stored on the media (game, etc.). Playback by end-users: surround processing of Lt and Rt yields L (+LFE), R (+LFE), C’, LS’, and RS’.
If the LFE channel of the master source contains important information that needs to be played back,
it should be mixed into L and R in advance.
2-1-2. Discrete
This method allows each channel to be recorded as a completely independent track.
This became possible with the advent of high-capacity media such as DVD, and with the advance of
digital compression technology.
[Fig. 7] 3-1 discrete. Production: the L, R, C, and S channels of the master are recorded as independent tracks on the media (DVD-Video, DVD-Audio, DTV, etc.). Playback by end-users: L, R, C, and S.
[Figure: 5.1 discrete. Production: the L, R, C, LFE, LS, and RS channels of the master are recorded as independent tracks on the media (movie, DVD-Video, DVD-Audio, Super Audio CD, DTV, game, etc.).]
When encoding an analog signal into a digital signal, the encoding performance is largely dependent on
two parameters: the sampling frequency (fs[Hz]) which corresponds to the sampling precision of the time
axis (frequency axis), and the number of bits used for quantization (Qb[bit]) which corresponds to the
sampling precision of the amplitude (loudness). For both fs[Hz] and Qb[bit], higher values allow the
occurrence of digital encoding noise to be minimized. This means that for both fs[Hz] and Qb[bit], higher
values are generally interpreted as “higher audio quality.”
In two-channel media, a CD is encoded at fs=44.1 kHz/Qb=16 bit, and DAT is encoded at fs=48
kHz/Qb=16 bit. The dynamic range for these types of media is approximately 96 dB. In multi-channel media,
DVD-Audio is encoded with six channels of fs=96 kHz/Qb=24 bit, giving a dynamic range of
approximately 144 dB. This type of encoding is known as multi-bit encoding; the upper limit of the
frequencies that can be reproduced is determined by fs/2, and Qb essentially determines the dynamic
range.
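The figures of approximately 96 dB and 144 dB follow from the common rule of thumb of about 6 dB per quantization bit. The short Python sketch below works through that arithmetic; the function names are hypothetical, and the formula 20·log10(2^Qb) is the idealized ratio between full scale and one quantization step, not a measurement of any real converter.

```python
import math


def pcm_dynamic_range_db(qb_bits):
    """Idealized dynamic range of multi-bit PCM: 20*log10(2**Qb)."""
    return 20.0 * math.log10(2 ** qb_bits)


def nyquist_hz(fs_hz):
    """Upper limit of the reproducible frequency range: fs / 2."""
    return fs_hz / 2.0


for name, fs, qb in [("CD", 44_100, 16), ("DAT", 48_000, 16),
                     ("DVD-Audio", 96_000, 24)]:
    print(f"{name}: up to {nyquist_hz(fs):.0f} Hz, "
          f"about {pcm_dynamic_range_db(qb):.0f} dB")
```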
In contrast, the single-bit high-speed sampling method uses the minimum number of quantization bits
(Qb = 1 bit) and instead samples at an extremely high sampling frequency. In the Super Audio CD (SACD) developed by Sony and Philips, this is called the DSD (Direct Stream Digital) method.
Because single-bit high-speed sampling expresses the amplitude of the sound not as a stepwise amplitude
of Qb but rather by the density of the sound pressure, it is said that this encoding method is closer to the
physical characteristics of the sound wave itself. However, since Qb=1 bit, the quantization noise when
encoding is much greater than with multi-bit methods and an extremely high sampling frequency is
required in order to remedy this. The Super Audio CD uses a very high sampling frequency of 2.8224
MHz with Delta-Sigma conversion, shifting (noise shaping) quantization noise outside the audible range,
and delivering better than approximately 120 dB of dynamic range in the audible range. The recording
bandwidth is said to be DC through 100 kHz.
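To make the idea of expressing amplitude by density concrete, the following Python sketch implements a first-order delta-sigma modulator. It is a teaching model under stated assumptions, not the modulator used for Super Audio CD: production converters use higher-order noise shaping, and the function name `delta_sigma_1bit` is hypothetical. The only fact taken from the format itself is that 2.8224 MHz equals 64 times 44.1 kHz.

```python
DSD_FS_HZ = 64 * 44_100  # 2,822,400 Hz, the 2.8224 MHz quoted above


def delta_sigma_1bit(samples):
    """First-order delta-sigma modulation of samples in the range -1..+1.

    Returns a list of bits (0 or 1). The running density of 1s tracks
    the input amplitude: more 1s for positive input, more 0s for negative.
    """
    integrator = 0.0
    feedback = 0.0            # previous output expressed as -1.0 or +1.0
    bits = []
    for x in samples:
        integrator += x - feedback   # accumulate the quantization error
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


bits = delta_sigma_1bit([0.5] * 4000)
mean_level = sum(2 * b - 1 for b in bits) / len(bits)
print(mean_level)  # close to the constant input level of 0.5
```

Averaging (low-pass filtering) the bit stream recovers the input level, while the quantization error is pushed toward high frequencies. This is the noise-shaping behavior described above.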
In this way, there are currently two ways to digitally encode an audio signal: “multi-bit methods” and
“single-bit high-speed sampling methods.” Generally, “PCM” or “LPCM” indicate “multi-bit methods.”
In contrast, since the Super Audio CD is currently the only mass-market media that uses single-bit high-speed sampling, single-bit high-speed sampling and DSD are often used as synonyms.
Compression methods can be broadly divided into two types: lossy compression and lossless
compression.
With lossy compression, the original signal cannot be recovered in its entirety from the compressed signal
that is recorded; i.e., this is irreversible compression.
This method generally takes advantage of psychoacoustic phenomena to lower the redundancy of the
original signal, thus compressing it.
Lossless compression allows the original signal to be completely recovered from the compressed signal
that is recorded; i.e., this is reversible compression. This is the same kind of method used to compress files on a computer.
It uses mathematical means to lower the redundancy of the original signal, compressing it.
Because no information may be discarded, lossless compression delivers a lower compression ratio than lossy compression.
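The difference between reversible and irreversible compression can be demonstrated in a few lines of Python. This sketch uses stand-ins and makes no claim about the actual codecs: `zlib` (general-purpose DEFLATE from the Python standard library) takes the place of a lossless audio coder such as MLP or DST, and simply discarding low-order bits takes the place of a lossy coder, which in practice relies on psychoacoustic models instead. The function names are hypothetical.

```python
import zlib


def lossless_roundtrip(pcm_bytes):
    """Compress and decompress; returns (restored bytes, compressed size)."""
    packed = zlib.compress(pcm_bytes)
    return zlib.decompress(packed), len(packed)


def lossy_requantize(samples, drop_bits):
    """Crude lossy stand-in: discard the lowest bits of each integer sample."""
    return [(s >> drop_bits) << drop_bits for s in samples]


data = bytes(range(256)) * 16
restored, size = lossless_roundtrip(data)
print(restored == data, size < len(data))   # bit-exact, and smaller

samples = [1000, 1001, -3]
print(lossy_requantize(samples, 4))         # original values are gone
```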
Examples of lossy compression
  Method: Dolby AC-3, DTS Coherent Acoustics, ATRAC, MPEG-2 (AAC), etc.
  Media: Film, DVD-Video, digital broadcast, games, etc.
Examples of lossless compression
  Method: MLP (PPCM: Packed PCM), DST (Direct Stream Transfer)
  Media: DVD-Audio, Super Audio CD
• SDDS (film, ATRAC) allows 7.1 ch (8 ch) which adds LC (between L and C) and RC (between R and
C) to 5.1 ch.
• Mandatory audio signals for DVD-Video: LPCM signal or Dolby Digital (AC-3) signal (MPEG signal
is also required in TV system 625/50 regions). DVD-Video players must have Dolby Digital (AC-3)
playback capability.
• Optional audio signals for DVD-Video: DTS, MPEG, SDDS
[Table 3-2] Examples of lossy compression formats
Examples of lossless (reversible) compression formats

| Media | CH | Compression | fs [Hz] | Qb [bit] | Bitrate [bps] | Dynamic range [dB] |
|---|---|---|---|---|---|---|
| DVD-Audio | 1 - 5.1 (6) ch | PPCM (Packed PCM, MLP) | 44.1k, 88.2k, 176.4k*; 48k, 96k, 192k* | 16, 20, 24 | Max 9.6M | Max 144 dB |
| Super Audio CD | 2 - 5.1 (6) ch | DST (Direct Stream Transfer) | 2.8224M | 1 | Max 14.99136M | More than 120 dB ** |

* Only one or two channels at fs=176.4k or 192k
** Value in the audible bandwidth. Includes the effect of noise shaping from Delta-Sigma modulation.

• Super Audio CD requires that a two-channel source be stored (discs containing only a multi-channel
source are not allowed).
• DVD-Audio allows either of two methods: storing both a two-channel source and a multi-channel
source, or storing only a multi-channel source together with downmixing coefficients provided as
meta-data.
[Table 3-3] Examples of lossless compression formats
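The LPCM channel limits for DVD-Audio given in the notes to Table 1 (96 kHz/24-bit/4ch, 96 kHz/20-bit/5ch, 96 kHz/16-bit/6ch, and 192 kHz/24-bit for 2ch) are consistent with the maximum bit rate of 9.6 Mbps. The Python sketch below works through the arithmetic; the function names are hypothetical, and the calculation counts raw audio bits only, ignoring any container overhead.

```python
MAX_DVD_AUDIO_BPS = 9_600_000  # "Max 9.6M" for DVD-Audio


def lpcm_bitrate_bps(fs_hz, qb_bits, channels):
    """Raw (uncompressed) LPCM bit rate: fs x Qb x number of channels."""
    return fs_hz * qb_bits * channels


def fits_dvd_audio(fs_hz, qb_bits, channels):
    """True if uncompressed LPCM stays within the maximum bit rate."""
    return lpcm_bitrate_bps(fs_hz, qb_bits, channels) <= MAX_DVD_AUDIO_BPS


for fs, qb, ch in [(96_000, 24, 4), (96_000, 20, 5), (96_000, 16, 6),
                   (192_000, 24, 2), (96_000, 24, 6)]:
    rate = lpcm_bitrate_bps(fs, qb, ch) / 1e6
    print(f"{fs // 1000} kHz/{qb}-bit/{ch}ch: {rate:.3f} Mbps, "
          f"fits as LPCM: {fits_dvd_audio(fs, qb, ch)}")
```

Six channels at 96 kHz/24-bit would need 13.824 Mbps uncompressed, which exceeds the limit. This is the case in which lossless PPCM compression is required.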
By “recording response” we mean the response allowed when the master tape produced by the studio is
recorded onto the production target media.
The response of each channel recorded on the media will depend on the encoding method and
compression method as described above.
In the case of analog recording, the response will depend on the specifications of the recording media.
However, for lossy compression (irreversible compression), it is important to note that “fs” and “Qb” do
not directly determine the recording response (in particular, the dynamic range).
Currently for most media, full-range recording is possible for all channels.
However, in the case of LFE and surround channels, there will be differences depending on the media.
2-3-1. LFE channel
For media that is recorded in Dolby DIGITAL, such as film and DVD-Video, the bandwidth is restricted
to 120 Hz at the time of encoding*.
This also applies to DTS. In film, however, the recording band for the LFE channel of DTS extends
only to 80 Hz.
Similarly for the MPEG-2 used in digital broadcast (Europe), the upper limit of the LFE storage
bandwidth is restricted to 125 Hz.
In MPEG-2 AAC (digital broadcast, Japan), full-range recording is possible for encoding, but due to
considerations of the propagation spectrum, there may be a bandwidth limitation on the LFE channel.
Thus, it is necessary to be aware of the recording bandwidth of the LFE channel when the propagation
system is taken into account (see ISO/IEC and ARIB).
For music media (DVD-Audio, Super Audio CD), the LFE channel allows full-range recording in the
same way as the main channels.
* To be precise, Dolby Digital can record signals of up to about 600 Hz on the LFE channel of DVD-Video,
but since the LFE channel LPF (fc=120 Hz) is applied by default as an option during encoding, it
is best to consider 120 Hz as the upper frequency limit for recording and playback on the LFE channel,
except in special cases.
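The LFE recording limits described in this section can be collected into a lookup table for checking a mix before delivery. The Python sketch below only restates the values given above; the dictionary keys and the helper `lfe_content_is_safe` are hypothetical names, and `None` is used here to mean "full-range at the encoding stage".

```python
# Upper limit of the LFE recording bandwidth in Hz, as described above.
# None means full-range recording is possible at the encoding stage.
LFE_UPPER_LIMIT_HZ = {
    "Dolby Digital (film, DVD-Video)": 120,
    "DTS (DVD-Video)": 120,
    "DTS (film)": 80,
    "MPEG-2 (digital broadcast, Europe)": 125,
    # Full-range when encoding, but transmission may impose a limit.
    "MPEG-2 AAC (digital broadcast, Japan)": None,
    "DVD-Audio": None,
    "Super Audio CD": None,
}


def lfe_content_is_safe(fmt, highest_lfe_freq_hz):
    """True if LFE content up to the given frequency survives recording."""
    limit = LFE_UPPER_LIMIT_HZ[fmt]
    return limit is None or highest_lfe_freq_hz <= limit
```

For example, LFE content extending to 100 Hz is within the limit for Dolby Digital on DVD-Video but not for DTS on film.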
2-3-2. Surround channels (S, LS, RS, BS)
For 3-1 matrix (Dolby stereo, Dolby surround, DTS stereo), the recording bandwidth of the S channel is
restricted to 100 Hz–7 kHz. For 5.0 matrix (Dolby Pro Logic II), the LS and RS recording channels are
restricted to 100 Hz–20 kHz.
In DTS for film (5.1, 6.1), the recording bandwidth of the surround channels (LS, RS, BS) is restricted to
80 Hz and above, but since sound recorded on the master tape that is lower than this point is collectively
recorded on the LFE channel, the resulting playback is full-range. This is known as “bass management”
(described in section 4).
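The redirection of low-frequency surround content to the LFE track can be sketched as a band split followed by a sum. The Python code below is a conceptual illustration under stated assumptions and not the filter used by any encoder: it uses a first-order (one-pole) low-pass at 80 Hz, whereas real systems use much steeper crossover filters, it ignores the level offset of the LFE channel, and the function names are hypothetical. The high band is computed as the complement of the low band, so nothing is lost when the parts are summed again.

```python
import math


def split_at(samples, fc_hz, fs_hz):
    """Split a signal into (low, high) bands with a one-pole low-pass.

    high is defined as input minus low, so low + high equals the input.
    """
    a = 1.0 - math.exp(-2.0 * math.pi * fc_hz / fs_hz)
    low, high, state = [], [], 0.0
    for x in samples:
        state += a * (x - state)     # one-pole low-pass filter
        low.append(state)
        high.append(x - state)       # complementary high band
    return low, high


def redirect_surround_bass(ls, rs, lfe, fc_hz=80.0, fs_hz=48_000.0):
    """Move content below fc from LS and RS onto the LFE track."""
    ls_low, ls_high = split_at(ls, fc_hz, fs_hz)
    rs_low, rs_high = split_at(rs, fc_hz, fs_hz)
    lfe_out = [x + a + b for x, a, b in zip(lfe, ls_low, rs_low)]
    return ls_high, rs_high, lfe_out
```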
By “playback response” we mean the desired (recommended) response of the playback system that plays
back the media. For example, this corresponds to the frequency response of each speaker and the level
balance.
It is important to be aware that depending on the media and the channel format, playback response may
not be the same as the recording response.
The following pages describe playback response for typical media.
[Fig. 10] Playback specification for DVD-Video program. Input signal: wide-band pink noise, approx. 0 VU (-20 dBrms). Vertical axis: 1/3 octave band level; horizontal axis: 1/3 octave band center frequency [Hz]. LFE: approx. 81 dB (+10 dB relative to the main channels). L = C = R = LS = RS: approx. 71 dB (5.1ch and 6.1ch: L = C = R, LS = RS = BS). For 3-1ch, LS = RS = -3 dB: approx. 68 dB.
In DVD-Video (Dolby, DTS), the playback level of the LFE channel (20–120 Hz) is set so that it will be
+10 dB relative to the level of the main channel bands. In the case of 3-1, LS and RS are set
approximately 3 dB lower so that the playback levels of L, C, R, and S (LS+RS) will be the same.
[Front channel]
Level
L = C = R (= 85 dBC)
Match the playback level of all channels.
Playback bandwidth
Full-range
[Surround channels]
Level
3-1: S (LS+RS) = L/C/R (= 85 dBC)
Set the LS and RS playback levels lower than for 5.1 (LS = RS ≈ 82 dBC)
5.1: LS = RS = L/C/R (= 85 dBC)
6.1: LS = RS = BS = L/C/R (= 85 dBC)
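The 3 dB reduction for the two surround speakers in 3-1 playback can be checked with a short calculation. The Python sketch below assumes that the outputs of the two speakers add on a power basis at the listening position, which is an assumption of this example, not a statement from any standard; the function name `sum_levels_db` is hypothetical.

```python
import math


def sum_levels_db(levels_db):
    """Power sum of several sound levels given in dB."""
    return 10.0 * math.log10(sum(10.0 ** (lv / 10.0) for lv in levels_db))


# Two surround speakers at 82 dBC each combine to about 85 dBC,
# matching the level of each front channel.
print(round(sum_levels_db([82.0, 82.0]), 2))

# LFE band level: +10 dB relative to the main channel band level.
main_band_db = 71.0
print(main_band_db + 10.0)
```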