Sony SNC-CS50P User Manual

Product Information Manual
SNC-RX550N SNC-RX550P SNC-RZ50N SNC-RZ50P SNC-CS50N SNC-CS50P
Introduction
As a pioneer of IP video monitoring systems, Sony has continually innovated and enhanced its line of network cameras over the years. Among these, the SNC-RX550, SNC-RZ50, and SNC-CS50 Series*1, which belong to the Sony IPELA™ family, incorporate a variety of intelligent features that provide high-quality images and efficient operation over IP networks.
These cameras incorporate a number of unique features that have been designed for surveillance applications, yet can also be useful in other types of monitoring applications. Among the many features adopted by these cameras, our customers have specifically requested more detailed information on the following:
- Selectable JPEG, MPEG-4, H.264 Compression Formats/Dual Encoding Capability
- Intelligent Motion Detection
- Intelligent Object Detection
- Spherical Privacy Zone Masking
- Image Stabilizer
- Dynamic Frame Integration
- CCDs (Super HAD CCD™/Exwave HAD™ Technology/SuperExwave™ Technology)
This manual is a comprehensive guide covering the above topics and explaining how each of the intelligent network cameras utilizes the technology, while concurrently identifying user benefits. It has been written in a manner that is easy to read and comprehend. Illustrations have been used to depict concepts that are difficult to explain in words alone. And each section is written so that it can be read independently from the rest – it is not necessary to read the document from cover to cover. This manual is targeted at product and marketing managers, account managers, resellers, system integrators, and end users who have a strong desire to understand these technologies.
We hope that by reading through this manual, you will fully understand the innovative technologies that Sony has incorporated in the SNC-RX550, SNC-RZ50, and SNC-CS50 Series of network cameras. And we hope that you find these technological benefits to be a great advantage when you think about taking your surveillance and remote monitoring solution to the next step.
SNC-RZ50 SNC-CS50 SNC-RX550 (Black and White)
*1 In the following text, “SNC-RX550,” “SNC-RZ50,” and “SNC-CS50” refer to both NTSC and PAL models (i.e. SNC-RX550N/SNC-RX550P, SNC-RZ50N/SNC-RZ50P, and SNC-CS50N/SNC-CS50P).
Selectable JPEG, MPEG-4, H.264 Compression Formats
(Fig. 1 labels: 3 frames shown; at 30 fps, 1 sec = 30 images)
The Sony SNC-RX550/RZ50/CS50 Series of network cameras is capable of encoding images using any of the following three compression formats: JPEG, MPEG-4, and H.264. This multi-codec capability allows users to flexibly choose the appropriate compression format to match their network environment and monitoring applications. This section provides a general explanation of these three compression formats beginning with the basics of video compression.
Basics of Video Compression
Most practical video compression techniques are based on lossy compression, under which there are two basic methods of compressing video: intra-frame compression and inter-frame compression. Intra-frame compression is a technique that compresses each video frame independently without reference to any other frame of video, while inter-frame compression makes use of data from previous and/or subsequent video frames. Note that inter-frame compression is generally used in conjunction with intra-frame compression.
With intra-frame compression, each frame of video is compressed spatially (i.e. redundant or nonessential data is removed from the image). Inter-frame compression, however, is a technique that compresses multiple video frames by utilizing data from adjacent frames (i.e. temporal prediction). Inter-frame compression takes advantage of the characteristics of video by “capturing” only the difference between successive frames. By doing so, redundant information between two frames can be eliminated, resulting in high compression ratios.
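The idea of encoding only the difference between successive frames can be illustrated with a minimal sketch. The frames, pixel values, and function names below are hypothetical, and real codecs add motion compensation, transforms, quantization, and entropy coding on top of this basic step.

```python
# Illustrative only: shows why a frame difference can carry far less
# information than the frame itself when two frames are nearly identical.

def frame_difference(previous, current):
    """Return per-pixel differences between two equally sized frames."""
    return [
        [cur - prev for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]

def count_nonzero(frame):
    """Count the values that would actually need to be encoded."""
    return sum(1 for row in frame for value in row if value != 0)

# Two hypothetical 4 x 4 grayscale frames; only one pixel changes.
frame_a = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
frame_b = [row[:] for row in frame_a]
frame_b[0][3] = 200  # a small change in one corner

residual = frame_difference(frame_a, frame_b)
print(count_nonzero(frame_b))   # 16 values when the frame is coded on its own
print(count_nonzero(residual))  # 1 value when only the difference is coded
```

In this toy case the difference image is almost entirely zero, which is the redundancy that inter-frame compression removes.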
JPEG
JPEG (standardized by ISO/IEC IS 10918-1/ITU-T T.81) is the “industry-standard” image compression format for surveillance applications and is ideal for use when high-quality still images are required. These individual still images are captured in a sequence of 30 (NTSC) or 25 (PAL) frames per second to form video, which is sometimes referred to as “Motion JPEG.” All these images are independently compressed using intra-frame compression (Fig. 1). Because intra-frame compression is the only method used, JPEG data is larger than that of MPEG-4 and H.264, which employ both intra-frame and inter-frame compression techniques.
With the SNC-RX550/RZ50/CS50 Series of network cameras, the JPEG picture quality can be set to a level within the range of one to ten as shown in the table below. By presetting the picture quality level, these cameras output images with a “near-constant” data size, meaning that the data size fluctuates about a pre-defined constant value. This is useful for calculating the required storage capacity and bandwidth for streaming JPEG images over a network.
Approximate Data Size by Resolution (pixels)

Level   VGA (640 x 480)   QVGA (320 x 240)   QQVGA (160 x 120)   Compression Ratio (approx.)
10      150 KB            45 KB              11.25 KB            1/6
9       90 KB             22.5 KB            5.625 KB            1/10
8       60 KB             15 KB              3.75 KB             1/15
7       45 KB             11.25 KB           2.8125 KB           1/20
6       36 KB             9 KB               2.25 KB             1/25
5       30 KB             7.5 KB             1.875 KB            1/30
4       25.7 KB           6.43 KB            1.607 KB            1/35
3       22.5 KB           5.625 KB           1.406 KB            1/40
2       18 KB             4.5 KB             1.125 KB            1/50
1       15 KB             3.75 KB            0.9375 KB           1/60
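Because the data size per image is near-constant, the bandwidth and storage estimate mentioned above reduces to simple arithmetic. The sketch below uses the VGA column of the table. The function name is hypothetical, and it assumes that 1 KB = 1,000 bytes and that the stream runs at a constant frame rate, neither of which the manual states.

```python
# Assumption: 1 KB = 1,000 bytes and 1 Mb = 1,000,000 bits.
# Approximate VGA data size per image (KB), keyed by picture quality level.
JPEG_VGA_SIZE_KB = {
    10: 150, 9: 90, 8: 60, 7: 45, 6: 36,
    5: 30, 4: 25.7, 3: 22.5, 2: 18, 1: 15,
}

def jpeg_stream_estimate(size_kb, fps, hours):
    """Return (bit rate in Mb/s, storage in GB) for a near-constant image size."""
    bytes_per_second = size_kb * 1000 * fps
    mbps = bytes_per_second * 8 / 1_000_000
    storage_gb = bytes_per_second * 3600 * hours / 1_000_000_000
    return mbps, storage_gb

mbps, gb = jpeg_stream_estimate(JPEG_VGA_SIZE_KB[5], fps=30, hours=24)
print(f"{mbps:.1f} Mb/s, {gb:.2f} GB per day")  # 7.2 Mb/s, 77.76 GB per day
```

Actual figures fluctuate around these values, since the data size is only near-constant.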
Fig. 1 “Motion JPEG” Structure
MPEG-4
(Fig. 2 labels: 1 GOV, 1 sec = 1 I-VOP and 29 P-VOPs. Fig. 3 labels: 16 x 16 pixel macroblock; 8 x 8 pixel sub-blocks)
Before looking at the MPEG-4 compression format adopted by these cameras, it is important to clarify the term “MPEG-4.” MPEG-4 is a series of standards developed by ISO/IEC MPEG (Moving Picture Experts Group) and has many “Parts,” “Profiles,” and “Levels” related to multimedia content. Among these “Parts,” “Profiles,” and “Levels,” the SNC-RX550/RZ50/CS50 Series of network cameras employs MPEG-4 Part 2 (ISO/IEC 14496-2) Simple Profile Level 3, and MPEG-4 Part 10 (ISO/IEC 14496-10), which is also called H.264 and was jointly developed with ITU-T. In the following text, “MPEG-4” refers to MPEG-4 Part 2 Simple Profile Level 3 and “H.264” refers to MPEG-4 Part 10.
Structure of MPEG-4
Let’s take a look at the structure of MPEG-4. A video “frame” in MPEG-4 is referred to as a Video Object Plane (VOP). There are two types of VOPs: an I-VOP (intra-coded) and a P-VOP (predictive). A Group of VOPs (GOV) consists of an I-VOP and several P-VOPs. In these cameras, a GOV makes up one second*2 of video (Fig. 2).
An I-VOP is compressed using the intra-frame compression technique and is similar to a single JPEG image. This initial “frame” of a GOV is often called an “anchor.” I-VOPs are much larger in data size than P-VOPs; however, they are essential in the GOV structure, and are required when searching image data.
P-VOP data is generated by predicting the difference between the “current image” and the previously encoded I-VOP or P-VOP (reference frame). This is performed using inter-frame compression. As explained in the section on “Basics of Video Compression,” this method of prediction takes advantage of the video property that two consecutive “frames” are very similar. Because P-VOP data contains information related only to the difference between two frames (i.e. VOPs) and not the image data itself, the data size of P-VOPs is greatly reduced when compared to I-VOPs.
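The GOV layout described here can be sketched as a sequence of VOP types. The function name and parameters are hypothetical. The sketch assumes a fixed frame rate and that every VOP after the first in a GOV is a P-VOP, so it does not model an encoder inserting an extra I-VOP in response to extreme movement.

```python
def gov_frame_types(fps=30, gov_seconds=1, total_seconds=2):
    """List VOP types for a stream in which each GOV starts with one I-VOP."""
    gov_length = fps * gov_seconds  # number of VOPs in one GOV
    return [
        "I" if index % gov_length == 0 else "P"
        for index in range(fps * total_seconds)
    ]

types = gov_frame_types()
first_gov = types[:30]
print(first_gov.count("I"), first_gov.count("P"))  # 1 29
```

With `gov_seconds=5` and `total_seconds=5`, the same function yields one I-VOP followed by 149 P-VOPs, which shows how a longer GOV reduces the share of large I-VOPs in the stream.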
P-VOPs and Motion Compensation
“Motion Compensation” is the key to predicting movement within an image and forming P-VOP data to efficiently compress MPEG-4 video. This section briefly introduces this technique. As described above, P-VOP data is generated by predicting the difference between the previous VOP (reference VOP) and the current image that is input from the camera. To predict this movement, “blocks” consisting of 16 x 16 pixels, called macroblocks, are first formed within the image. Next, motion vectors are calculated based on the predicted movement within each macroblock. In the prediction process, each macroblock in the current image is compared with the reference VOP, and the resultant “shift” between the two is represented as a motion vector.*3
In MPEG-4, sub-blocks consisting of 8 x 8 pixels within the 16 x 16 macroblocks can also be used to predict the current VOP (Fig. 3). The more finely the “frame” is divided, the more accurately movement can be predicted, which can result in an even higher compression ratio.
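The comparison of a block against the reference VOP can be illustrated with exhaustive block matching, a common textbook technique. This is a minimal sketch and not the algorithm used in these cameras: the function names, the sum-of-absolute-differences cost, the search range, and the tiny 2 x 2 block (in place of a 16 x 16 macroblock) are all assumptions made for brevity.

```python
def sad(reference, current, ref_y, ref_x, cur_y, cur_x, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(reference[ref_y + dy][ref_x + dx] - current[cur_y + dy][cur_x + dx])
        for dy in range(size)
        for dx in range(size)
    )

def find_motion_vector(reference, current, block_y, block_x, size, search):
    """Search +/- `search` pixels in the reference for the best-matching block."""
    height, width = len(reference), len(reference[0])
    best_cost, best_vector = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ref_y, ref_x = block_y + dy, block_x + dx
            if not (0 <= ref_y <= height - size and 0 <= ref_x <= width - size):
                continue  # candidate block would fall outside the frame
            cost = sad(reference, current, ref_y, ref_x, block_y, block_x, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_vector = cost, (dy, dx)
    return best_vector, best_cost

def make_frame(top, left, size=8, obj=2):
    """Build a black frame containing one bright obj x obj square."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + obj):
        for x in range(left, left + obj):
            frame[y][x] = 255
    return frame

reference = make_frame(2, 2)  # object at row 2, column 2
current = make_frame(3, 4)    # object has moved down 1 and right 2

vector, cost = find_motion_vector(reference, current, block_y=3, block_x=4, size=2, search=2)
print(vector, cost)  # (-1, -2) 0
```

Here the vector points from the block in the current image back to its best match in the reference, and a cost of zero means the block is predicted perfectly, so only the vector would need to be transmitted.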
Fig. 3 MPEG-4 Motion Compensation Blocks
Fig. 2 MPEG-4 GOV Structure
*2 The default GOV setting of the SNC-RX550/RZ50/CS50 Series of network cameras is one second. The length of a GOV can be set between one and five seconds.
*3 The actual prediction process utilizes a number of feedback loops and complicated algorithms including triggers to reset the I-VOP when there are extreme movement patterns. This method helps to accurately produce motion vectors. Further technical details are beyond the scope of this paper.
H.264
(Fig. 6 graph: PSNR (dB), from 28 to 40, vs. bit rate (Kb/s), from 0 to 300, for JPEG, MPEG-4, and H.264. Video parameters: 10 frames/s, QCIF (176 x 144 pixels), 10 seconds of video (100 frames).)
(Fig. 4 and Fig. 5 labels: block dimensions of 16, 8, and 4 pixels)
H.264 (or MPEG-4 Part 10) has been developed with the aim of providing high-quality video at a much lower bit rate than MPEG-4. A number of techniques for achieving efficient compression are incorporated in H.264. One major contributing factor is the improvement in motion prediction.
As in the case of MPEG-4, each image is divided into blocks to predict movement. However, with H.264, the block patterns can be a 16 x 16 pixel macroblock or any combination of the seven options shown in Fig. 4 (e.g. 4 x 4 sub-blocks in the upper right quadrant of the macroblock, an 8 x 8 sub-block in the upper left quadrant, and an 8 x 16 sub-block in the lower half, as shown in Fig. 5). The block pattern is variably determined depending on the amount and speed of movement within the image. If an area of the image has little movement, the algorithm utilizes large blocks (such as 16 x 16 pixels or 8 x 8 pixels) to predict the difference between the previous VOP and the current image. However, where an area of the image includes significant motion, the algorithm utilizes smaller blocks for prediction. By dynamically adapting the size of each block to the amount of motion, the prediction accuracy for each block is significantly improved. Because the predicted data is more accurate, less image data needs to be transmitted; therefore, compression efficiency is greatly improved when compared to MPEG-4. Though motion prediction using variable block sizes increases prediction accuracy and minimizes the amount of data to be transmitted, it does require greater processing power within the codec.
JPEG/MPEG-4/H.264 Comparison
The differences between the JPEG, MPEG-4, and H.264 compression formats have been explained in the above sections. Here, let’s relate picture quality to transmission bit rate. Fig. 6 is a graph depicting the picture quality vs. the bit rate of these three compression formats.*4 The vertical axis (PSNR level) expresses the picture quality, and the horizontal axis expresses the transmission bit rate. PSNR (Peak Signal-to-Noise Ratio) is a metric widely used by engineers to measure the “quality” of compressed video images.
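For readers unfamiliar with PSNR, the sketch below applies its standard definition, 10 · log10(peak² / MSE), to two small images. The manual does not describe how the values in Fig. 6 were measured, so the function name, the 8-bit peak value of 255, and the sample pixel values are assumptions for illustration.

```python
import math

def psnr(original, compressed, peak=255):
    """Peak signal-to-noise ratio in dB for two equally sized images."""
    pairs = [
        (o, c)
        for o_row, c_row in zip(original, compressed)
        for o, c in zip(o_row, c_row)
    ]
    # Mean squared error between corresponding pixels.
    mse = sum((o - c) ** 2 for o, c in pairs) / len(pairs)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)

original = [[100, 100], [100, 100]]
compressed = [[101, 99], [100, 102]]  # hypothetical decoded pixel values
print(round(psnr(original, compressed), 2))
```

A higher PSNR means the decoded image is closer to the original, which is why it serves as the picture quality axis in the comparison.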
At a PSNR of 35 dB, JPEG images are transmitted at approximately 260 Kb/s, while MPEG-4 transmits at approximately 85 Kb/s and H.264 transmits at 50 Kb/s. To put this into perspective, MPEG-4 requires approximately one-third of the bandwidth used by JPEG, and H.264 requires just one-fifth. In summary, both MPEG-4 and H.264 are ideal for image transfer over a network because they require much less network bandwidth than JPEG.
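The fractions quoted above follow directly from the three bit rates. The short check below uses only the approximate values given in the text for a PSNR of 35 dB; the variable names are illustrative.

```python
# Approximate bit rates at a PSNR of 35 dB, as quoted in the text (Kb/s).
bit_rates_kbps = {"JPEG": 260, "MPEG-4": 85, "H.264": 50}

for codec, rate in bit_rates_kbps.items():
    fraction = rate / bit_rates_kbps["JPEG"]
    print(f"{codec}: {rate} Kb/s, {fraction:.2f} of the JPEG bit rate")
```

The results, about 0.33 for MPEG-4 and 0.19 for H.264, match the stated one-third and one-fifth.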
Fig. 4 H.264 Motion Compensation Blocks
Fig. 5 H.264 Combination Block Pattern
*4 The graph shows just one example of comparing bit rates at which JPEG, MPEG-4, and H.264 images can be transmitted. Actual bit rates for transmitting data using these three compression formats differ with image quality and image size settings.
Fig. 6 Comparison Between H.264, MPEG-4, and JPEG (picture quality vs. bit rate)