As a pioneer of IP video monitoring systems, Sony has incessantly innovated and
continually enhanced its line of network cameras throughout the years. Among a
wealth of products, the SNC-RX550, SNC-RZ50, and SNC-CS50*1 – which belong to
the Sony IPELA™ family – incorporate a variety of intelligent features that provide
high-quality images and efficient operation over IP networks.
These cameras incorporate a number of unique features that have been designed
for surveillance applications, yet can also be useful in other types of monitoring
applications. Among the many features adopted by these cameras, our customers
have specifically requested more detailed information on the following:

• JPEG, MPEG-4, and H.264 video compression and dual encoding
• Intelligent Motion Detection (IMD) and Intelligent Object Detection (IOD)
• Spherical Privacy Zone Masking
• Image Stabilizer
• Dynamic Frame Integration (DFI)
• CCD technologies (Super HAD CCD/Exwave HAD/SuperExwave)
This manual is a comprehensive guide covering the above topics and explaining
how each of the intelligent network cameras utilizes the technology, while
concurrently identifying user benefits. It has been written in a manner that is easy
to read and comprehend. Illustrations have been used to depict concepts that are
difficult to explain in words alone. And each section is written so that it can be
read independently from the rest – it is not necessary to read the document from
cover to cover. This manual is targeted at product and marketing managers,
account managers, resellers, system integrators, and end users who have a strong
desire to understand these technologies.
We hope that by reading through this manual, you will fully understand the
innovative technologies that Sony has incorporated in the SNC-RX550, SNC-RZ50,
and SNC-CS50 Series of network cameras. And we hope that you find these
technological benefits to be a great advantage when you think about taking your
surveillance and remote monitoring solution to the next step.
SNC-RZ50 / SNC-CS50 / SNC-RX550 (Black and White)
*1 In the following text, “SNC-RX550,” “SNC-RZ50,” and “SNC-CS50” refer to both NTSC and PAL models (i.e. SNC-RX550N/SNC-RX550P, SNC-RZ50N/SNC-RZ50P, and SNC-CS50N/SNC-CS50P).
The Sony SNC-RX550/RZ50/CS50 Series of network
cameras is capable of encoding images using any of the
following three compression formats: JPEG, MPEG-4, and
H.264. This multi-codec capability allows users to flexibly
choose the appropriate compression format to match
their network environment and monitoring applications.
This section provides a general explanation of these three
compression formats beginning with the basics of video
compression.
Basics of Video Compression
Most practical video compression techniques are based on
lossy compression, under which there are two basic
methods of compressing video: intra-frame compression
and inter-frame compression.
Intra-frame compression is a technique that compresses
each video frame independently without reference to any
other frame of video, while inter-frame compression
makes use of data from previous and/or subsequent video
frames. Note that inter-frame compression is generally
used in conjunction with intra-frame compression.
With intra-frame compression, each frame of video is
compressed spatially (i.e. redundant or nonessential data
is removed from the image).
Inter-frame compression, however, is a technique that
compresses multiple video frames by utilizing data from
adjacent frames (i.e. temporal prediction). Inter-frame
compression takes advantage of the characteristics of
video by “capturing” only the difference between
successive frames. By doing so, redundant information
between two frames can be eliminated, resulting in high
compression ratios.
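To make the distinction concrete, here is a minimal Python/NumPy sketch of the idea. It is illustrative only; real codecs add transforms, quantization, and entropy coding on top of simple differencing, and all names here are our own.

```python
import numpy as np

def encode_sequence(frames):
    """Store the first frame whole (intra) and only the
    frame-to-frame differences afterwards (inter)."""
    encoded = [("intra", frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        # In a mostly static scene this difference is close to zero
        # almost everywhere, which is where inter-frame gains come from.
        encoded.append(("inter", cur.astype(np.int16) - prev.astype(np.int16)))
    return encoded

def decode_sequence(encoded):
    """Rebuild the frames by accumulating the stored differences."""
    frames = [encoded[0][1]]
    for kind, diff in encoded[1:]:
        frames.append((frames[-1].astype(np.int16) + diff).astype(np.uint8))
    return frames
```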
JPEG
JPEG (standardized by ISO/IEC IS 10918-1/ITU-T T.81) is
the “industry-standard” image compression format for
surveillance applications and is ideal for use when high-quality still images are
required. These individual still images are captured in a sequence of 30 (NTSC)
or 25 (PAL) frames per second to form video, a scheme sometimes referred
to as “Motion JPEG.” All these images are independently
compressed using intra-frame compression (Fig. 1).
Because intra-frame compression is the only method used,
JPEG data is larger than MPEG-4 and H.264, which
employ both intra-frame and inter-frame compression
techniques.
With the SNC-RX550/RZ50/CS50 Series of network
cameras, the JPEG picture quality can be set to a level
within the range of one to ten as shown in the table
below. By presetting the picture quality level, these
cameras output images with a “near-constant” data size,
meaning that the data size fluctuates about a pre-defined
constant value. This is useful for calculating the required
storage capacity and bandwidth for streaming JPEG
images over a network.
Before looking at the MPEG-4 compression format
adopted by these cameras, it is important to clarify the
term “MPEG-4.” MPEG-4 is a series of standards
developed by ISO/IEC MPEG (Moving Picture Experts
Group) and has many “Parts,” “Profiles,” and “Levels”
related to multimedia content. Among these “Parts,”
“Profiles,” and “Levels,” the SNC-RX550/RZ50/CS50
Series of network cameras employs MPEG-4 Part 2
(ISO/IEC 14496-2) Simple Profile Level 3, and MPEG-4 Part
10 (ISO/IEC 14496-10), which is also called H.264 and
was jointly developed with ITU-T. In the following text,
“MPEG-4” refers to MPEG-4 Part 2 Simple Profile Level 3
and “H.264” refers to MPEG-4 Part 10.
Structure of MPEG-4
Let’s take a look at the structure of MPEG-4. A video
“frame” in MPEG-4 is referred to as a Video Object Plane
(VOP). There are two types of VOPs: an I-VOP (intra-coded) and a
P-VOP (predictive). A Group of VOPs (GOV) consists of an
I-VOP and several P-VOPs. In these cameras, a GOV makes
up one second*2 of video (Fig. 2).
An I-VOP is compressed using the intra-frame compression
technique and is similar to a single JPEG image. This initial
“frame” of a GOV is often called an “anchor.” I-VOPs are
much larger in data size than P-VOPs; however, they are
essential in the GOV structure, and are required when
searching image data.
P-VOP data is generated by predicting the difference
between the “current image” and the previously encoded
I-VOP or P-VOP (reference frame). This is performed using
inter-frame compression. As explained in the section on
“Basics of Video Compression,” this method of prediction
takes advantage of the video property that two consecutive
“frames” are very similar. Because P-VOP data contains
information related only to the difference between two
frames (i.e. VOPs) and not the image data itself, the data size
of P-VOPs is greatly reduced when compared to I-VOPs.
P-VOPs and Motion Compensation
“Motion Compensation” is the key to predicting
movement within an image and forming P-VOP data to
efficiently compress MPEG-4 video. This section briefly
introduces this technique.
As described above, P-VOP data is generated by predicting
the difference between the previous VOP (reference VOP)
and the current image that is input from the camera. To
predict this movement, “blocks” consisting of 16 x 16
pixels, called macroblocks are first formed within the
image. Next, motion vectors are calculated based on the
predicted movement within each macroblock. The
prediction process is such that the movement within each
macroblock between the reference VOP and the current
image is compared. The resultant “shift” of the
comparison is represented as a motion vector.*3
In MPEG-4, sub-blocks consisting of 8 x 8 pixels within
the 16 x 16 macroblocks can also be used to predict the
current VOP (Fig. 3). The smaller the “frame” is divided,
the more accurately movement can be predicted, which
can result in an even higher compression ratio.
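The following Python sketch shows the core of this process for fixed 16 x 16 macroblocks: an exhaustive search over a small window in the reference VOP, scoring candidates by the sum of absolute differences. This is purely illustrative; as footnote *3 notes, real encoders use far more elaborate search strategies.

```python
import numpy as np

def motion_vectors(reference, current, block=16, search=8):
    """Exhaustive block matching: for each macroblock of `current`,
    find the best-matching position in `reference` within a
    +/- `search` pixel window and record the shift."""
    h, w = current.shape
    vectors = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            target = current[y:y+block, x:x+block].astype(np.int32)
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = y + dy, x + dx
                    if 0 <= ry <= h - block and 0 <= rx <= w - block:
                        cand = reference[ry:ry+block, rx:rx+block].astype(np.int32)
                        # Sum of absolute differences scores the candidate.
                        sad = int(np.abs(target - cand).sum())
                        if best_sad is None or sad < best_sad:
                            best_sad, best_mv = sad, (dy, dx)
            vectors[(y, x)] = best_mv
    return vectors
```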
Fig. 3 MPEG-4 Motion Compensation Blocks
Fig. 2 MPEG-4 GOV Structure
*2 The default GOV setting of SNC-RX550/RZ50/CS50 Series of network cameras is one second. The length of a GOV can be set between one and five seconds.
*3 The actual prediction process utilizes a number of feedback loops and complicated algorithms, including triggers to reset the I-VOP when there are extreme movement patterns. This method helps to accurately produce motion vectors. Further technical details are beyond the scope of this paper.
H.264
H.264 (or MPEG-4 Part 10) has been developed with the
aim of providing high-quality video at a much lower bit rate
than MPEG-4. A number of techniques for achieving
efficient compression are incorporated in H.264. One major
contributing factor is the improvement in motion prediction.
As in the case of MPEG-4, each image is divided into
blocks to predict movement. However, with H.264, the
block patterns can be a 16 x 16 pixel macroblock or any
combination of the seven options shown in Fig. 4 (e.g. 4 x
4 sub-blocks in the upper right quadrant of the
macroblock, an 8 x 8 sub-block in the upper left
quadrant, and an 8 x 16 sub-block in the lower half, as
shown in Fig. 5). The block pattern is variably determined
depending on the amount and speed of movement within
the image. If an area of the image has little movement,
the algorithm utilizes large blocks (such as 16 x 16 pixels
or 8 x 8 pixels) to predict the difference between the
previous VOP and the current image. However, where an
area of the image includes significant motion, the
algorithm utilizes smaller blocks for prediction. By
dynamically adapting the size of each block to the
amount of motion, the prediction accuracy for each block
is significantly improved. Because the predicted data is
more accurate, less image data needs to be transmitted;
therefore, compression efficiency is greatly improved
when compared to MPEG-4.
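A heavily simplified Python sketch of this adaptive idea: keep a 16 x 16 block where one prediction already fits well, and split into 8 x 8 sub-blocks where it does not. The threshold and the two-way choice are our own simplifications; a real H.264 encoder evaluates all seven partition patterns against rate-distortion costs.

```python
import numpy as np

def partition_macroblock(reference, current, y, x, threshold=1000):
    """Choose a partition for the macroblock whose top-left corner
    is at (y, x): one 16 x 16 block if it is already predicted well,
    otherwise four 8 x 8 sub-blocks."""
    mb = current[y:y+16, x:x+16].astype(np.int32)
    ref = reference[y:y+16, x:x+16].astype(np.int32)
    if np.abs(mb - ref).sum() < threshold:
        # Little movement: one motion vector covers the whole block.
        return [("16x16", y, x)]
    # Significant movement: predict each 8 x 8 quadrant separately.
    return [("8x8", y + dy, x + dx) for dy in (0, 8) for dx in (0, 8)]
```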
Though motion prediction using variable block sizes
increases prediction accuracy and minimizes the amount
of data to be transmitted, it does require greater
processing power within the codec.
JPEG/MPEG-4/H.264 Comparison

The difference between JPEG, MPEG-4, and H.264
compression formats has been explained in the above
sections. Here, let’s relate picture quality to transmission
bit rate.
Fig. 6 is a graph depicting the picture quality vs. the bit
rate of these three compression formats.*4 The vertical
axis (PSNR level) expresses the picture quality, and the
horizontal axis expresses the transmission bit rate. PSNR
(Peak Signal-to-Noise Ratio) is a metric widely used by
engineers to measure the “quality” of compressed video
images.
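For 8-bit video, PSNR is commonly computed as 10 log10(255² / MSE), where MSE is the mean squared error between the original and the compressed frame. A minimal Python implementation:

```python
import numpy as np

def psnr(original, compressed):
    """PSNR in dB between two 8-bit frames of the same size."""
    mse = np.mean((original.astype(np.float64) -
                   compressed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10 * np.log10(255.0 ** 2 / mse)
```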
At a PSNR of 35 dB, JPEG images are transmitted at
approximately 260 Kb/s, while MPEG-4 transmits at
approximately 85 Kb/s and H.264 transmits at 50 Kb/s. To
put this into perspective, MPEG-4 requires approximately
one-third of the bandwidth used by JPEG, and H.264
requires just one-fifth.
In summary, both MPEG-4 and H.264 are ideal for image
transfer over a network because they require much less
network bandwidth than JPEG.
Fig. 4 H.264 Motion Compensation Blocks
Fig. 5 H.264 Combination Block Pattern
*4 The graph shows just one example of comparing bit rates at which JPEG, MPEG-4, and H.264 images can be transmitted. Actual bit rates for transmitting data using these three compression formats differ with image quality and image size settings.
Fig. 6 Comparison Between H.264, MPEG-4, and JPEG (picture quality vs. bit rate; video parameters: 10 frames/s, QCIF (176 x 144 pixels), 10 seconds of video (100 frames))
Bandwidth and Storage Capacity Calculations
When designing your surveillance system, it is essential to
prepare a sufficient amount of storage and network
bandwidth. The following is an example showing how to
calculate the required network bandwidth and storage
capacity to transmit and store JPEG images.

JPEG

Formulas for Calculating Bandwidth and Storage Capacity

Bandwidth (Mb/s) = Image data size (KB)/frame x #frames/sec x 8 bits/Byte x 1 Mb/1000 Kb

Storage capacity (MB/hour) = Image data size (KB)/frame x #frames/sec x 3600 sec/hour x 1 MB/1000 KB

Storage capacity (GB/day) = Storage capacity (MB)/hour x 24 hours/day x 1 GB/1000 MB

Sample Calculation for a Four-Camera Installation*5

| | Resolution | Compression level | Image data size | fps*6 | Required bandwidth | Required storage capacity/hour | Required storage capacity/day |
| Camera 1 | VGA | Level 5 | 30 KB | 30 fps | 7.2 Mb/s | 3240 MB | 77.76 GB |
| Camera 2 | QVGA | Level 5 | 7.5 KB | 10 fps | 0.6 Mb/s | 270 MB | 6.48 GB |
| Camera 3 | QVGA | Level 2 | 4.5 KB | 10 fps | 0.36 Mb/s | 162 MB | 3.88 GB |
| Camera 4 | QQVGA | Level 5 | 1.875 KB | 20 fps | 0.3 Mb/s | 135 MB | 3.24 GB |
| Totals | | | | | 8.46 Mb/s | 3807 MB/hour | 91.36 GB/day |
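The formulas above translate directly into a few lines of Python; the example values reproduce the Camera 1 row of the table:

```python
def bandwidth_mbps(image_kb, fps):
    # KB/frame x frames/sec x 8 bits/Byte x 1 Mb/1000 Kb
    return image_kb * fps * 8 / 1000

def storage_mb_per_hour(image_kb, fps):
    # KB/frame x frames/sec x 3600 sec/hour x 1 MB/1000 KB
    return image_kb * fps * 3600 / 1000

def storage_gb_per_day(mb_per_hour):
    # MB/hour x 24 hours/day x 1 GB/1000 MB
    return mb_per_hour * 24 / 1000

print(bandwidth_mbps(30, 30))        # 7.2 (Mb/s)
print(storage_mb_per_hour(30, 30))   # 3240.0 (MB/hour)
print(storage_gb_per_day(3240))      # 77.76 (GB/day)
```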
MPEG-4/H.264
Because of the nature of MPEG-4 and H.264, it is very
difficult to accurately calculate required network
bandwidth and storage capacity as we did with JPEG. As
explained above, these compression methods are based
on the difference in movement within a scene; therefore,
scenes with little movement require less data than scenes
with significant movement.
The MPEG-4 and H.264 bandwidth settings in these
cameras can be preset to any of the nine levels as follows:
*5 This table shows sample calculations for reference purposes only. Actual bandwidth and storage requirements should be properly tested with each system installation.
*6 Maximum frame rate might be limited depending on compression level, resolution, and camera function settings. For more details, please contact your local Sony sales office or authorized dealer.
Dual Encoding Capability
The Sony SNC-RX550/RZ50/CS50 Series of network
cameras is equipped with a dual encoding capability that
generates both MPEG-4 and JPEG images simultaneously.
This feature further expands your surveillance and
monitoring applications by offering flexible system
configurations. For example, it allows live monitoring of
clear and smooth MPEG-4 streams over a WAN or an
Internet VPN, where network bandwidth is limited, while
storing high-resolution JPEG images on removable media
inserted into the camera’s built-in card slot(s) (Fig. 6). Or,
you can record high-quality JPEG images using the IMZ-RS
Series software configured with a server on a LAN, while
distributing MPEG-4 streams to multiple PCs running the
Microsoft® Internet Explorer® browser over a WAN/VPN
(Fig. 7).
Fig. 6 Dual Encoding (streaming while recording locally on removable media)
Fig. 7 Dual Encoding (streaming while recording locally on server)
Motion Detection Basics
To understand the “Intelligent Motion Detection” and
“Intelligent Object Detection” functions built into the
Sony SNC-RX550/RZ50/CS50 Series of network cameras, it
is important to first understand motion detection in
general.
What is Motion Detection?
Motion detection is a relatively common feature built into
surveillance equipment.
One benefit of motion detection is that it can greatly
reduce the required storage capacity of a recorder.
A surveillance system can be configured in several
different ways depending on what is being monitored.
For example, the recorder can be set up to store
low-resolution images at a low frame rate or to record nothing
at all to save storage capacity under normal conditions.
When an alarm is triggered by movement, the recorder
automatically begins to record higher-resolution images at
higher frame rates so that critical scenes can be clearly
captured.
Another benefit of motion detection is that it can alert
operators when movement has been detected in several
ways, for example by sending an e-mail notification,
providing an audible alert with a pre-recorded audio file,
flashing an alarm message on the monitor, or by
displaying a full-screen image from the camera that
detected movement. What’s more, the alarm can trigger
local image recording or move the camera to a preset
pan/tilt/zoom position to get a closer look at the
object being monitored. In addition, the system can be
configured to perform any of the following actions and
more when movement has been detected: sounding an
audible alarm, turning lights on/off, triggering a door
lock, etc.
Conventional Methods of Movement
Detection
A variety of detecting methods have been designed into
surveillance equipment from the time these systems were
first sold. Sony has incorporated movement detection
features not only in recorders but also in cameras.
Sony first-generation network cameras that employ the
JPEG compression format, such as the
SNC-RZ30/Z20/CS3, incorporate a basic detection method
that compares the average change in luminance levels
between adjacent frames (i.e. adjacent JPEG images) on a
pixel basis. If the result is greater than a preset threshold,
then it is treated as motion in the monitoring area, and
triggers an alarm (Fig. 1).
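In Python terms, this first-generation method reduces to a few lines; the threshold value below is illustrative only.

```python
import numpy as np

def luminance_motion_alarm(prev_frame, cur_frame, threshold=10.0):
    """Average per-pixel luminance change between adjacent frames."""
    diff = np.abs(cur_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff.mean() > threshold   # exceeding the threshold triggers an alarm
```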
Another detecting method that was incorporated in Sony
second-generation network cameras, such as the
SNC-RZ25/DF70/DF40/CS11/CS10, employing the MPEG-4
compression format, utilizes motion compensation inherent
in MPEG-4 compression. Motion compensation is based on
movement within 16 x 16 pixel areas of an image called
“macroblocks.” In the motion-compensation process,
motion vectors, which are based on the direction, speed,
and distance of a moving object within each macroblock,
are determined. These motion vectors are then added and
if the resultant vector exceeds a preset threshold level, an
alarm is triggered (Fig. 2).
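A matching sketch for this second-generation method reuses the motion vectors that the MPEG-4 encoder already computes; the dictionary format follows the block-matching sketch shown earlier, and the threshold is again illustrative.

```python
import numpy as np

def vector_motion_alarm(vectors, threshold=50.0):
    """`vectors` maps macroblock positions to (dy, dx) motion vectors,
    e.g. the output of the earlier block-matching sketch."""
    if not vectors:
        return False
    resultant = np.sum(list(vectors.values()), axis=0)   # add all vectors
    return float(np.hypot(*resultant)) > threshold       # resultant magnitude
```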
Fig. 1 Detection Using Average Change in Luminance of Pixels
Fig. 2 Detection Using Motion Vector
Intelligent Motion Detection (IMD)*1
The “IMD” function incorporated in the Sony
SNC-RX550/RZ50/CS50 Series of network cameras makes
further strides in motion-detection functionality. By
utilizing a sophisticated and robust movement-detection
algorithm, these cameras drastically lower the number of
false alarms.
Conventional methods, as described earlier, compare the
difference between two adjacent frames; however, “IMD”
utilizes 15 frames to determine whether or not to trigger
an alarm (Fig. 1). By analyzing more frames and
movement patterns, “IMD” can distinguish the difference
between the movement of actual objects or persons that
are supposed to trigger an alarm and repeated motion
patterns such as shaking leaves on a tree and ripples in
water that are not supposed to trigger an alarm. As a
result, accidental alarm triggers are minimized.
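Sony’s actual IMD algorithm is proprietary, but the following heavily simplified Python sketch conveys the multi-frame idea: a change must persist across nearly all of the 15 buffered frames before an alarm is raised, so motion that keeps returning the scene to its original state (shaking leaves, ripples) is discounted. All parameter values are illustrative assumptions.

```python
import numpy as np
from collections import deque

class MotionAnalyzer:
    def __init__(self, depth=15, threshold=8.0, persistence=12):
        self.frames = deque(maxlen=depth)   # rolling 15-frame history
        self.threshold = threshold
        self.persistence = persistence

    def update(self, frame):
        self.frames.append(frame.astype(np.int16))
        if len(self.frames) < self.frames.maxlen:
            return False   # not enough history yet
        base = self.frames[0]
        # A genuine object keeps the scene changed in nearly every one
        # of the buffered frames; oscillating motion differs only
        # intermittently and keeps returning close to the base scene.
        changed = [np.abs(f - base).mean() > self.threshold
                   for f in list(self.frames)[1:]]
        return sum(changed) >= self.persistence
```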
This sophisticated algorithm also minimizes false alarms
that can result from camera vibration, and can
differentiate between moving objects and shadows,
further increasing the accuracy of motion detection.
Moreover, the “IMD” function can be used when the
camera is operating in MPEG-4, JPEG, and dual-encoding
(MPEG-4/JPEG) modes.
Sample images: IMD triggered by truck movement; repeated motion patterns are disregarded*2
Fig. 1 IMD Frame Comparison (detection by comparing 15 frames)
*1 The “IMD” function cannot be used when the camera is operating in H.264 mode.
*2 The figure above shows sample images to depict “IMD.” The markings on the images do not appear on the monitoring display; they are used for illustration purposes only.
Intelligent Object Detection (IOD)
In addition to “Intelligent Motion Detection,” the
SNC-RX550/RZ50/CS50 Series of network cameras is
equipped with an “Intelligent Object Detection”
function.*1 With the “IOD” function,*2 these cameras can
determine whether or not there are abandoned objects or
objects that have been removed from the monitoring
area. This feature can prove useful for detecting
suspicious objects left in public spaces, illegal parking,
stalled cars or accidents on the road, or for detecting
articles that have been removed from museums,
warehouses, or other places of business.
Whether detecting an abandoned object or an object
removed from the monitoring area, the camera’s
detection methods and algorithms are identical. The IOD
algorithm is such that the camera first creates a “base
image” and stores it in the camera’s memory. This base
image can be the entire area being monitored or a
pre-specified area in the scene. The image currently being
monitored is compared to the base image and areas
where a change has occurred are regarded as “potential
alert areas.” The camera then continues processing
subsequent frames. After a period of time the camera
determines that an object was either removed or was left
behind, and defines an “alert area,” triggering an alarm
(Fig. 1). In order to accommodate scene environment
changes, the base image is regularly updated as time
passes.
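The described mechanism can be sketched in Python as a base-image comparison with a persistence counter and a slow background refresh. The class below is our own illustration with arbitrary parameter values; it is not the camera’s actual algorithm.

```python
import numpy as np

class ObjectDetector:
    def __init__(self, base_image, pixel_threshold=25,
                 hold_frames=150, refresh=0.01):
        self.base = base_image.astype(np.float64)
        self.age = np.zeros(base_image.shape, dtype=np.int32)
        self.pixel_threshold = pixel_threshold
        self.hold_frames = hold_frames   # how long a change must persist
        self.refresh = refresh           # slow base-image update rate

    def update(self, frame):
        frame = frame.astype(np.float64)
        changed = np.abs(frame - self.base) > self.pixel_threshold
        self.age = np.where(changed, self.age + 1, 0)  # potential alert areas
        alert = self.age > self.hold_frames             # confirmed alert areas
        # Regularly update the base image to absorb gradual scene
        # changes (lighting etc.), but not where a change is pending.
        self.base = np.where(changed, self.base,
                             (1 - self.refresh) * self.base
                             + self.refresh * frame)
        return alert
```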
*1 The “IOD” function cannot be used when the camera is operating in H.264 mode.
*2 “IOD” and “IMD” cannot be used simultaneously.
*3 These are sample images to depict “IOD.” The highlighted areas and red rectangle do not appear on the monitoring display.
Fig. 1 IOD Mechanism (a briefcase is left unattended; the camera identifies a “potential alert area”; the camera defines an “alert area”)*3
Spherical Privacy Zone Masking
Privacy concerns have become an important worldwide
issue in all aspects of society. Likewise, in surveillance and
monitoring applications, privacy protection is not only
desired but sometimes mandatory as defined by laws and
ordinances.
The Spherical Privacy Zone Masking function incorporated
in the SNC-RX550 and SNC-RZ50 Series of network
cameras is a sophisticated masking method that responds
to these requirements.
Privacy zone masking is a feature in surveillance
applications that is used to protect personal privacy by
masking private areas in the camera’s field of view, such
as windows and doorways that are within the monitoring
area but not subject to surveillance, and other private
property. Many manufacturers incorporate a privacy zone
masking function in surveillance recorders and monitoring
software, but may not incorporate this function in
cameras. However, in networked video surveillance
applications, privacy zone masking on the recorder or
processing device poses a major security concern, because
images captured by a camera are streamed over a
network before the mask is generated. What’s more,
delays can occur when processing masking data on
recorders or with software because the mask is generated
after streamed image data is received.
To mitigate the risks associated with this type of
processing, the Sony SNC-RX550 and SNC-RZ50 Series of
network cameras incorporate the Privacy Zone Masking
function in the camera itself.
With these network cameras, the masked areas of the
image are dynamically interlocked with the camera’s
Pan/Tilt/Zoom (PTZ) movements for comprehensive image
masking (Fig. 1) – this is called “Spherical Privacy Zone
Masking.” The spherical masking mechanism works as
follows: when you specify the area and color of a mask,
the camera produces a signal to generate an image that
replaces the data for that area with a mask. This data,
along with the camera’s PTZ settings, are stored in the
camera’s memory. When the camera pans, tilts, and/or
zooms, the PTZ settings are updated, and the mask is
repositioned so that it continues to cover the original
masked area. This data is updated every 50 ms so that
regardless of the PTZ speeds, masked areas within the
image remain covered. A maximum of eight masking
areas and one of nine masking colors can be set for each
camera.
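The repositioning idea can be illustrated as follows: the mask is stored in pan/tilt angle space and re-projected into pixel coordinates from the current PTZ state on every update cycle. The linear projection, field-of-view figures, and helper function below are all hypothetical stand-ins for what the camera performs internally on the video signal.

```python
import time

def read_ptz_state():
    # Hypothetical stand-in for querying the camera's current
    # pan (deg), tilt (deg), and zoom factor.
    return 0.0, 0.0, 1.0

def mask_to_image(mask_pan, mask_tilt, cam_pan, cam_tilt, zoom,
                  width=640, height=480, fov_pan=48.0, fov_tilt=37.0):
    """Re-project a mask stored in pan/tilt angle space into pixel
    coordinates, assuming a simple linear model in which the field
    of view shrinks in proportion to the zoom factor."""
    x = width / 2 + (mask_pan - cam_pan) * zoom / fov_pan * width
    y = height / 2 + (mask_tilt - cam_tilt) * zoom / fov_tilt * height
    return x, y

# One update cycle, repeated by the camera every 50 ms.
cam_pan, cam_tilt, zoom = read_ptz_state()
print(mask_to_image(10.0, -5.0, cam_pan, cam_tilt, zoom))
time.sleep(0.05)
```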
Fig. 1 Spherical Privacy Zone Masking (the masking area is set, and the mask moves to follow PTZ movement; PTZ position data is updated every 50 ms)
Image Stabilizer
In outdoor surveillance and monitoring applications,
surveillance cameras are usually attached to poles or
mounted on buildings. Depending on the installation site,
captured video might be displayed as shaky images
resulting from vibration caused by wind and other
environmental effects. The image stabilizer function
incorporated in the Sony SNC-RX550/RZ50/CS50 Series of
network cameras minimizes the effect caused by high- and low-frequency vibration to provide stable images.
This function is especially useful for outdoor surveillance
and traffic-monitoring applications.
The image stabilizer mechanism works as follows: when
the image stabilizer function is activated, the camera
assigns a 5% border area in the image to compensate for
camera vibration (Fig. 1).*1 An 8 x 8 matrix of calculation
blocks (reference areas in the image) is assigned as
calculation points for motion vectors (Fig. 2). The camera
stores 30 frames of this image data in its memory.
Movement of each calculation block is compared frame by
frame using what is called a Block Matching Algorithm
(BMA),*2
and individual motion vectors are calculated
(Fig. 3). The system is such that individual motion vectors
associated with movement of objects or beings within the
image are disregarded. Individual motion vectors
associated with movement of the camera are then
processed to obtain an average resultant motion vector
for the entire image (Fig. 4). The camera then shifts the
image to the direction opposite that of the resultant
motion vector to correct for any camera movement (Fig.
5). Because the camera stores data from the past 30
frames, these motion vectors are continually updated and
corrected in such a manner that the resultant image is
smooth and natural (i.e. the correction process is performed
incrementally so as not to cause abrupt transitions in the
image).
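A condensed Python sketch of this correction step: average the calculation-block vectors attributed to camera movement and shift the image the opposite way. Here a simple median filter stands in for the camera’s outlier rejection, and the wrap-around of np.roll is tolerated for brevity; both are our own simplifications.

```python
import numpy as np

def stabilize(frame, block_vectors):
    """Shift `frame` opposite to the average camera-motion vector
    estimated from the matrix of calculation blocks."""
    vectors = np.asarray(block_vectors, dtype=np.float64)  # (N, 2): dy, dx
    med = np.median(vectors, axis=0)
    # Vectors far from the median likely belong to moving objects
    # (e.g. a passing bird) and are disregarded.
    keep = np.linalg.norm(vectors - med, axis=1) < 4.0
    if not keep.any():
        return frame
    dy, dx = vectors[keep].mean(axis=0)
    # Shift opposite to the camera movement; the 5% border area
    # absorbs the shift so the visible image stays filled.
    return np.roll(frame, (-int(round(dy)), -int(round(dx))), axis=(0, 1))
```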
The image stabilizer function is effective in environments
where the vibration frequency of the camera is
approximately 2 Hz. This function can be used when the
camera is operating in MPEG-4, JPEG, and dual-encoding
(MPEG-4/JPEG) modes as well as when “Intelligent Motion
Detection” or “Intelligent Object Detection” are active.
Fig. 1 Border Area (5% of image: 32 pixels horizontally and 24 pixels vertically in VGA)
Fig. 2 Calculation Blocks (8 x 8 matrix)
Fig. 3 Individual Motion Vectors
Fig. 4 Resultant Motion Vector for Camera Movement
*1 Digital zoom is used to allow for compensation; therefore, the effective viewing area is reduced. Depending on the settings, the camera might operate at a lower frame rate due to the amount of processing required by the camera.
*2 BMA is an algorithm for locating matching blocks in a sequence of video frames for the purposes of motion compensation.
Fig. 5 Image Shift to Compensate for Camera Movement
Dynamic Frame Integration (DFI)
Fig. 1-A Frame Mode: high vertical resolution for still areas, but blurry for fast-moving objects
Fig. 1-B Field Mode: reduced blur for fast-moving objects, but jagged edges and decreased resolution in still areas
Fig. 1-C Auto Mode (DFI ON): high vertical resolution for still areas (Frame mode) and reduced blur for fast-moving objects (Field mode)
(Odd and even interlaced fields are converted to a progressive signal. Blue: still areas in image; Red: moving objects in image)
The Sony SNC-RX550/RZ50/CS50 Series of network cameras
incorporates “Dynamic Frame Integration” technology to
reproduce clear and smooth images for both still and
moving areas within an image. This technology takes
advantage of the relatively high sensitivity inherent in
interlaced scanning CCDs, which are incorporated in these
cameras.
A technology called I/P (Interlace/Progressive) conversion is
required to produce progressive pictures from a camera
that employs interlaced scanning CCDs.
One method of producing these progressive pictures is to
simply combine two adjacent picture fields into one
picture frame. This is called “Frame Mode” in Sony
network cameras. This method provides high vertical
resolution and works well for still areas within an image;
however, if a fast-moving object appears in the image,
those areas with movement become blurry (Fig. 1-A).
In contrast, “Field Mode,” which is an optional
setting with these network cameras, produces progressive
pictures by utilizing data from the even field only (i.e. lines
0, 2, 4, 6...). This method reproduces absent lines of the
interlaced field by interpolating data from the lines above
and below them. “Field Mode” can reduce blurred
images caused by fast-moving objects; however, vertical
resolution is half that of “Frame Mode” and this method
of processing images can produce jagged edges in still
areas of the image, particularly in angled lines with high
contrast (Fig. 1-B).
Combining the advantages of the two I/P conversion
techniques, DFI technology adaptively selects from “Frame
Mode” and “Field Mode” within an image to reproduce a
progressive picture. The algorithm is such that it detects
“Motion” within an image on a two-pixel basis. For areas
where motion is detected, DFI applies “Field Mode” to
minimize blurring, and at the same time, it applies “Frame
Mode” to still areas to maintain high resolution without
jagged edges (Fig. 1-C).*1 In summary, DFI technology takes
advantage of the high sensitivity inherent in the
SNC-RX550/RZ50/CS50 Series of cameras to produce clear
and smooth images even under low-light conditions (Fig. 2).
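As a rough Python illustration of the adaptive selection (not Sony’s actual DFI implementation; the motion test and threshold are assumptions): weave the two fields together where the scene is still, and interpolate within one field where motion is detected.

```python
import numpy as np

def dfi_frame(odd_field, even_field, threshold=12):
    """Build a progressive frame from two interlaced fields,
    choosing weave (frame mode) or interpolation (field mode)
    per pixel based on a simple motion test."""
    h, w = even_field.shape
    frame = np.empty((2 * h, w), dtype=np.uint8)
    frame[0::2] = even_field                  # even lines 0, 2, 4, ...
    # Field-mode candidate for the odd lines: average of the even
    # lines above and below (interpolation within one field).
    interp = ((even_field.astype(np.int16) +
               np.roll(even_field, -1, axis=0).astype(np.int16)) // 2)
    moving = np.abs(odd_field.astype(np.int16) - interp) > threshold
    # Weave the real odd field where still; interpolate where moving.
    frame[1::2] = np.where(moving, interp, odd_field).astype(np.uint8)
    return frame
```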
Fig. 1 I/P Conversion Mechanism
Fig. 2 Comparison Between Auto Mode (DFI ON) and Frame Mode
*1 Depending on the scene, the DFI algorithm may not process the image properly; however, the image will always be clearer than that of Frame Mode.
CCDs (Super HAD CCD/Exwave HAD/SuperExwave Technology)
[Figs. 1-3: CCD cross-sections. Fig. 1 (first-generation OCL): on-chip lenses separated by gaps above the color filter, photo shielding film, sensor, and poly-Si vertical CCD on the N-substrate. Fig. 2 (Super HAD CCD): gapless on-chip lens structure. Fig. 3 (Exwave HAD): internal lens added between the color filter and photo shielding film, with a reduced insulating-film thickness.]
As a leading manufacturer of image sensor products, Sony
has developed a wide variety of CCDs for a number of
decades, and these CCDs have been used worldwide in a
great number of camera products. Sony CCDs utilize a
common HAD structure that reproduces images with
reduced smear and low noise characteristics. In addition
to the benefits of the HAD structure, refinements in the
On-Chip Lens (OCL) layer and the CCD’s photo sensors have
significantly contributed to improved CCD picture
performance.
The SNC-RZ50, SNC-RX550, and SNC-CS50 Series of
network cameras incorporate a Super HAD CCD, CCD
with Exwave HAD technology, and CCD with
SuperExwave technology, respectively. These CCDs have
been incorporated into each camera making them ideal
for applications ranging from surveillance to web
attractions. In the following sections, we will take a
detailed look at the different CCD types.
Super HAD CCD
When compared to the first generation of On-Chip Lenses
developed in 1989, the Super HAD CCD has an improved
OCL layer providing much greater sensitivity. As shown in
Fig. 1, the first-generation On-Chip Lenses above each
pixel are separated by gaps, and the light that falls on
these gaps is wasted. In contrast, the Super HAD CCD OCL
structure is virtually gapless (Fig. 2), which raises its
light-convergence capability and provides a drastic
improvement in sensitivity.
Exwave HAD Technology

The CCD with Exwave HAD technology was developed to
provide extra sensitivity for both visible and near infrared
regions of the spectrum, allowing the camera to capture
bright images both during the day and night.
Inheriting the OCL structure from the Super HAD CCD,
Exwave HAD technology further increases sensitivity by
incorporating an internal lens layer between the color
filter and photo shielding film (Fig. 3). These internal
lenses are used to efficiently converge light that was not
converged toward the photo sensor by the OCL. This
double convergence structure enables more light to be
directed to the photo sensors.
As a result, the CCD incorporating Exwave HAD
technology has a higher sensitivity than the Super HAD
CCD.
In addition to this higher light convergence capability,
Exwave HAD technology uses a unique structure to
improve the CCD sensitivity to near infrared light.
This enhancement allows much brighter images to be
captured in the dark using the Day/Night function.
With earlier CCD structures, near infrared light was
difficult to capture because of its nature of being
converted to electric charges in areas deeper than the
photo sensor surface. By extending the photo sensor
deeper into the silicon substrate, the CCD with Exwave
HAD technology achieves a much higher sensitivity to
infrared light, allowing images to be captured in extreme
darkness.
Furthermore, Exwave HAD technology incorporates a
thinner insulating film between the silicon substrate and
the electrodes. Compared to earlier-generation CCDs,
this structure reduces the amount of light that leaks
directly into the vertical shift register, and suppresses the
smear level.
SuperExwave Technology

SuperExwave technology adds a further improvement to
the sensitivity of Exwave HAD technology, especially for
near infrared light. SuperExwave technology improves on
Exwave HAD technology by changing the structure of the
photodiodes in such a manner that it has even higher
photoelectric conversion efficiency. This structure can
capture even more of the light in the near infrared region
that would normally escape to the substrate in normal
CCD image sensors. As a result, the sensitivity in the near
infrared region is increased by approximately 50%, and
sensitivity of visible light is increased by approximately
10% compared to the CCD with Exwave HAD technology
(Fig. 4).
Sample images: <SuperExwave> vs. <Exwave HAD>. Shooting environment: LED lights (wavelength 950 nm, irradiation distance 1 m), dark room.
*1 This chart has been simplified to show the difference in sensitivity between SuperExwave and Exwave HAD. The values are for reference only.
Fig. 4 Comparison Between SuperExwave and Exwave HAD*1