Executive summary
This white paper provides performance-related information for HP Data Protector software 6.0
together with some typical examples. The emphasis is on backup servers and two common backup
and restore performance questions:
• Why are backups so slow?
• Why are restores so slow?
The first step toward accurately estimating backup requirements and performance is a complete understanding of the environment. Many performance issues can be traced to hardware or environmental issues. A basic understanding of the complete backup data path is essential in determining the maximum performance that can be expected from the installation. Every backup environment has a bottleneck. It may be a fast bottleneck, but it will determine the maximum performance obtainable in the system.
There are many configuration options and procedures available that can help IT professionals to improve the performance of their backup environment. This white paper focuses on servers running HP-UX 11.11 (11i v1) and Microsoft® Windows® Server 2003, backing up file server and Microsoft Exchange Server 2003 data in local area network (LAN) and storage area network (SAN) environments. All data was located on an HP StorageWorks 6000 Enterprise Virtual Array (EVA6000) and backed up or restored with an HP StorageWorks Ultrium 960 tape drive.
The following hardware and software was installed and configured:
• HP9000 rp3440 with HP-UX 11.11 (11iv1)
• HP ProLiant DL380 G5 server with Microsoft Windows Server 2003 SP2 and Exchange Server
2003
• HP ProLiant DL380 G4 server with Microsoft Windows Server 2003 SP2 and HP Data Protector
software 6.0
• HP ProCurve LAN switch with 20 x 10/100/1000 ports
• HP ProCurve SAN Switch 4/16
• EVA6000
• Two Ultrium 960 tape drives
The HP-UX test environment showed good performance results across all tests. The NULL device backup performance was 386 MB/s or 1,359 GB/h, and the Ultrium 960 tape backup performance was 156 MB/s or 548 GB/h.
The Windows test environment also showed good performance results except for the test with millions of small files. In such environments, the Windows NTFS file system responds very slowly when a file is restored and its attributes are recovered. The restore performance from the Ultrium 960 tape drive was just 3.38 MB/s. See the Local restore of small files section.

As a result of these tests, several recommendations and rules of thumb have emerged:

1. HP Data Protector software tuning can help to improve the performance, for example, for file systems with millions of small files, where the first tree walk is disabled and the backup concurrency increased. See the Tuning Recommendations section.
2. Data Protector's default configuration parameters are well sized for most use cases.
3. Some changes of configuration parameters have almost no performance impact, for example, the Disk Agent buffers. See the Disk Agent buffers section.
4. Software compression causes high CPU loads and lower backup performance than the Ultrium 960 built-in compression. See the Software compression section.
Overview
HP servers, storage, and software can help to provide a seamless enterprise backup and recovery
solution. The solution starts with understanding what server and storage components work best for the
required workload. Being able to determine performance baselines and uncovering potential
bottlenecks in the solution can help to focus on areas that may need improvement, and can also
provide information that helps with planning for data growth. Understanding the backup and
recovery requirements for a data center can also help to maintain a consistent process and set
expectations for data backup and recovery. Many times these business expectations are documented
in a Service Level Agreement (SLA).
HP has conducted testing in typical solution configurations for backup and recovery. While a solution can be designed by directly attaching storage devices by way of SCSI or Fibre Channel, by attaching to backup servers on the LAN, by attaching to devices over a Fibre Channel SAN, or by offloading backup processes to a dedicated backup server for a non-disruptive (Zero Downtime Backup) solution, this paper focuses only on SCSI-attached tape devices in LAN and SAN configurations.
Objectives and target audience
The objective of this white paper is to educate and inform users of the HP Data Protector software
about what levels of performance are achievable in typical backup scenarios.
The emphasis is on mid-size environments rather than very large installations with academic performance figures. This white paper highlights where the performance bottlenecks are and how they might be overcome. User loads are disregarded on the assumption that backups and restores are executed in idle environments, for example, executing backups at night when nobody is online.
The target audience is system integrators and solution architects and indeed anyone involved in
getting the best backup performance out of their HP infrastructure investments.
Introduction and review of test configuration
Testing was conducted in a Microsoft Windows Server 2003 and HP-UX 11.11 environment utilizing
an HP StorageWorks 6000 Enterprise Virtual Array for file server and Microsoft Exchange Server
2003 data. The HP StorageWorks Ultrium 960 tape drive was utilized for backups and restores by
way of LAN and SAN. The configuration details are described in the following sections.
Figure 1 illustrates the topology layout of the test environment.
Figure 1. Topology layout of test environment
[Figure: The Cell Manager and domain controller (HP ProLiant DL380 G4) connects over a 1-Gbit LAN (simplified) to the Windows backup and Exchange server (HP ProLiant DL380 G5) and to the HP-UX backup server (HP 9000 rp3440). Both backup servers attach through a 4-Gbit SAN switch (HP ProCurve 4/16) to the HP EVA6000 with two FC controllers and 56 x 146-GB 10K disks in VRaid1, holding the typical files, the small files, and the Exchange DB. Each backup server also has an Ultra320 SCSI-attached HP Ultrium 960 (LTO-3) tape drive.]
EVA6000
Storage array
The HP StorageWorks 6000 Enterprise Virtual Array is an enterprise-class, high-performance, high-capacity, and high-availability “virtual” array storage solution. The EVA6000 is designed to meet the needs of the data center to support critical application demands for consistently high transaction I/O.
This document assumes that the reader has previous knowledge of the EVA family of storage arrays. However, there are several terms that are exclusive to the EVA and referenced throughout the document. These terms are defined to provide a knowledge baseline. For further information on the EVA6000, visit http://www.hp.com/go/eva6000.

The EVA6000 storage array was set up in a 2C4D configuration (2 controllers and 4 disk shelves) with a total of 56 x 146-GB 10K Fibre Channel disk drives.

EVAs perform best when all the disks are in a single disk group, so in this example the disk group consists of all 56 disks. A total of 15 x 100-GB LUNs (Vdisks) were created on the array and presented to the two client servers. The RAID level applied to these LUNs was VRaid 1 to get the best read performance out of the available drives.
The firmware version was XCS 6.000.
Storage management
The HP StorageWorks Command View general purpose server running on an HP ProLiant DL580 G2
server was configured with Microsoft Windows Server 2003 Enterprise Edition. HP StorageWorks
Command View EVA V5.0 was used to manage the EVA6000.
LAN infrastructure
The LAN consisted of one HP ProCurve switch with 20 x 10/100/1000 ports. All network links were configured at 1 Gb.
SAN infrastructure
The fabrics consisted of one HP ProCurve SAN switch 4/16. All links between servers and storage
devices were 4 Gb.
The HP-UX and Windows servers were each equipped with one dual-channel host bus adapter (HBA) and connected to the SAN switch for optimal performance.
The EVA6000 was equipped with two Fibre Channel controllers and four host ports total (two per
controller).
HP StorageWorks Ultrium 960 tape drive
The backup devices consisted of two HP StorageWorks Ultrium 960 tape drives, each connected to
one backup server by way of an Ultra320 SCSI interface.
Management server
The HP ProLiant DL380 G4 management server was configured as a Windows domain controller and
installed with:
• Windows Server 2003 SP1, 32-bit Enterprise Edition
• Data Protector 6.0 Cell Manager including patches DPWIN_00243, DPWIN_00244,
DPWIN_00245, DPWIN_00246, and DPWIN_00260
HP ProLiant DL380 G4 server
The HP ProLiant DL380 G4 management server was equipped with one Intel® Xeon™ 3.60-GHz
dual-core processor (hyper-threaded to two) and 1-GB RAM.
Both internal 146-GB disks were configured with RAID 1, which gave the Data Protector Internal Database (IDB) excellent performance. For the LAN connection, only one adapter of the integrated dual-port NC7782 Gigabit Server Adapter was configured.
Windows backup and Exchange Server 2003
The HP ProLiant DL380 G5 Windows backup and Exchange server was installed with:
• Windows Server 2003 SP2, 32-bit Enterprise Edition
• Microsoft Exchange Server 2003 SP2
HP ProLiant DL380 G5 server
The HP ProLiant DL380 G5 Windows backup and Exchange Server 2003 was equipped with two
Intel Xeon 3.00-GHz dual-core processors (hyper-threaded to four) and 16-GB RAM.
Both network adapters of the dual embedded NC373i Multifunction Gigabit Server Adapter were
configured for teaming with failover functionality. The total network performance was 1 Gb/s. Fibre Channel connectivity was provided by way of one Emulex PLUS 4-Gb PCIe dual-channel HBA, with the native Windows MPIO driver used for host connectivity, for a total of eight physical paths to the fabric (two HBA ports, one switch, four EVA controller ports). Ultra320 SCSI connectivity to the tape drive was provided by the HP SC11Xe PCIe HBA.
HP-UX backup server
The HP9000 rp3440 backup server was installed with HP-UX 11.11.
HP 9000 rp3440 server
The HP9000 rp3440 backup server was equipped with two 1.0-GHz dual-core PA-8900 processors
and 8-GB RAM. Fibre connectivity was provided by way of one 4-Gb dual port host HBA.
Configuration guidelines
For general information, refer to the HP StorageWorks Enterprise Backup Solutions (EBS) design guide at http://www.hp.com/go/ebs.
Backup environment
The backup manager is the core of any backup environment and must be checked for requirements and compatibility. Visit http://www.hp.com/go/dataprotector for the following documents:
• Data Protector concepts guide
• Data Protector installation and licensing guide
• Data Protector product announcements, software notes, and references
• Data Protector support matrices

Note
Carefully check the Data Protector support matrices and install the latest patches.
Backup server
The following components of a backup server environment must be additionally checked for
compatibility:
• HBA
• OS/HBA drivers
• HBA BIOS
• Tape drive firmware
For detailed information, refer to the EBS compatibility matrix at http://www.hp.com/go/ebs.

For fast tape devices, the PCI bus performance should be checked. Use separate PCI busses for disks and tape if possible. The total PCI bandwidth needs to be more than double the backup rate.
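As a rough illustration of this rule of thumb, a minimal Python sketch follows; the bus and drive figures in the example are illustrative assumptions, not measurements from this paper.

    # Rule-of-thumb check: total PCI bandwidth should be more than double
    # the backup rate, because backup data crosses the bus twice
    # (from disk or network in, to tape out).

    def pci_headroom_ok(pci_bandwidth_mb_s: float, backup_rate_mb_s: float) -> bool:
        """Return True if the bus satisfies the 2x rule of thumb."""
        return pci_bandwidth_mb_s > 2 * backup_rate_mb_s

    # Example (assumed figures): a 64-bit/133-MHz PCI-X bus (~1,066 MB/s
    # theoretical) feeding an Ultrium 960 writing at ~160 MB/s.
    print(pci_headroom_ok(1066, 160))  # True: 1,066 MB/s > 2 x 160 MB/s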
Windows 2000/2003 tape drive polling
The Windows Removable Storage Manager service (RSM) polls tape drives on a frequent basis (every
three seconds in Microsoft Windows 2000, and every second in Windows Server 2003). This
polling, which involves sending a Test Unit Ready (TUR) command to each tape or library device with
a loaded driver, is enabled by way of the AUTORUN feature of Windows Plug and Play. Windows
built-in backup software (NTBACKUP) relies on the device polling to detect media changes in the tape
drive.
Note
In SAN configurations, this polling can have a significant negative impact
on tape drive performance. For SAN configurations, HP strongly
recommends disabling RSM polling.
For additional information, refer to the Microsoft Knowledge Base article number 842411.
Data Protector overview and architecture
HP Data Protector software automates high-performance backup and recovery, from disk or tape, over unlimited distances, to ensure 24x7 business continuity and maximize IT resource utilization. Data Protector offers acquisition and deployment costs that are 30–70% lower than the competition's, responding to the pressure for lower IT costs and greater operational efficiency. The license model is very simple and easy to understand, helping to reduce complexity. Data Protector lowers data protection costs while ensuring backup/recovery flexibility, scalability, and performance. Data Protector is a key member of the fast-growing HP storage software portfolio, which includes storage resource management, archiving, replication, and device management software. No software integrates better with the HP market-leading line of StorageWorks disk and tape products, as well as with other heterogeneous storage infrastructures, than Data Protector.
Data Protector can be used in environments ranging from a single system to thousands of systems on
several sites. The basic management unit is the Data Protector cell.
The Data Protector cell, as shown in Figure 2, is a network environment consisting of a Cell Manager, one or more Installation Servers, client systems, and backup and disk devices. The Cell Manager and Installation Server can be on the same system, which is the default option, or on separate systems.
Figure 2. Data Protector architecture
Cell Manager
The Cell Manager is the main system that controls the Data Protector cell from a central point, where
the Data Protector core software with the Internal Database (IDB) is installed. The Cell Manager runs
Session Managers that control backup and restore sessions and write session information to the IDB.
The IDB keeps track of the backed up files as well as of the configuration of the Data Protector cell.
Installation Server
The Installation Server is the computer where the software repository of Data Protector is stored. At least one Installation Server for UNIX® and one for the Windows environment is required so that remote installations can be executed through the network and the software components distributed to the client systems in the cell.
Client Systems
After installing the Cell Manager and the Installation Server of Data Protector, components can be installed on every system in the cell. These systems become Data Protector clients. The role of a client depends on the Data Protector software installed on that system.
Systems to be backed up
Client systems (application servers) to be backed up must have the Data Protector Disk Agent installed. The Disk Agent reads or writes data from/to disks on the system and sends or receives data from/to a Media Agent.

The Disk Agent is installed on the Cell Manager as well, allowing the administrator to back up data on the Cell Manager, the Data Protector configuration, and the IDB.
Systems with backup devices
Client systems with connected backup devices must have a Data Protector Media Agent installed. A
Media Agent reads or writes data from/to media in the device and sends or receives data from/to
the Disk Agent. A backup device can be connected to any system and not only to the Cell Manager.
Client systems with backup devices are also called Backup Servers.
Backup and restore designs
Design of the backup environment can be performed using a variety of configurations. The following
list describes three designs typically used in a backup and restore solution.
Direct attached storage (DAS)
Direct attached storage (DAS) implies that the storage you are backing up to and restoring from is
directly attached to the server through a SCSI or Fibre Channel bus. This can be a good solution for
smaller data centers with only a few servers that do not require the sharing of a large tape or disk
resource. While this solution can provide good performance, it cannot be leveraged across multiple servers, as illustrated in Figure 3.
Figure 3. DAS
Local area network (LAN)
When combined with a server that has a DAS design, this solution allows servers and clients to back up and restore over the LAN to the machine with the DAS tape library or disk array. This is the traditional method employed throughout most data centers. It allows backup devices to be shared by other servers on the network. However, all backup data must pass over the public LAN, or a dedicated backup LAN, which can reduce the overall performance (see Figure 4).
Figure 4. LAN
Storage area network (SAN)
SANs are very similar to DAS solutions. However, the storage on the SAN may be shared between
multiple servers. This shared storage can be both disk arrays and tape libraries. To the server it will
appear as if it owns the storage device, but the shared Fibre Channel bus and the controllers for the
disk and tape devices allow the devices to be shared. This configuration is very advantageous for
both resource sharing and performance (see Figure 5).
Figure 5. SAN
Others
The following technologies go beyond the scope of this white paper:
• Direct backup (server-less backup solution for SAN environments)
• Disk backup
• Split mirror backup
• Snapshot backup
• Microsoft Volume Shadow Copy Service (VSS)
For further information, see the HP Data Protector Software Zero Downtime Concepts Guide at
http://www.hp.com/go/dataprotector.
Performance bottlenecks
The goal is for the tape drive to be the performance bottleneck. In that case, backup and restore times are predictable.

Backup performance

Backup performance will always be limited by one or more bottlenecks in the system, of which the tape drive is just one part. As such, these bottlenecks affect the backup time more than the restore time.
The flow of data throughout the system must be fast enough to provide the tape drive with data at its
desired rates. High-speed tape drives, such as the HP StorageWorks Ultrium 960 tape drive, are so
fast that making them the bottleneck can be very challenging.
All components must be considered and analyzed to get the best backup performance. Standard configurations with default values are usually not sufficient. In enterprise environments, performance estimates are very difficult to provide without benchmarks.
Restore performance
Many backup environments are tuned for backup performance because frequent backups require the most resources. Large restores are less common and their runtimes less predictable. Backup runtimes cannot be used for calculating exact restore runtimes; restore runtimes can only be proved by restore tests.
Backup and restore data path
The limiting component of the overall backup and restore performance is of course the slowest
component in the backup environment. For example, a fast tape drive combined with an overloaded
server results in poor performance. Similarly, a fast tape drive on a slow network also results in poor
performance.
The backup and restore performance depends on the complete data transfer path.
Figure 6 shows that the backup path usually starts at the data on the disk and ends with a copy on tape. The restore path is the reverse. The tape drive should be the bottleneck in the data path. All other components should process data faster than the tape drive, so the tape drive never has to wait for data.
Figure 6. Backup and restore data path
Manager server (Data Protector Cell Manager)
The backup manager server should be sized according to its installation requirements.
Note
Additional applications or backups on the manager server would increase
the installation requirements. Carefully check any application installation
requirements. In larger backup environments, it is not recommended to
share this server with other applications or to utilize it as a backup server.
The server should be clustered so that there is no single point of failure (SPOF) in the backup environment. The network connections to each client are essential and should be protected against single link failures.
Backup and application server (Data Protector Client System)
Backup and application servers should be sized according to the installation requirements.
During backup, these servers should not be under heavy load from additional applications that run I/O- and CPU-intensive operations, for example, virus scans or a large number of database transactions.

Backup servers demand special attention for proper sizing because they are central to the backup process and run the required agents. The data is passed into and out of the server's main memory as it is read from the disk subsystem or network and written to tape. The server memory should be sized accordingly, for example, in the case of an online database backup where the database utilizes a large amount of memory. Backup servers that receive data from networks also rely on fast connections. If the connection is too slow, a dedicated backup LAN or moving to a SAN architecture could improve the performance.

Application servers without any backup devices depend basically on good performance of the connected networks and disks. In some cases, file systems with millions of small files (for example, Windows NTFS) can be an issue.
Backup application
For database applications (for example, Oracle®, Microsoft SQL and Exchange), use the backup
integration provided by those applications as they are tuned to make best use of their data structures.
Use concurrency (multi-threading) if possible—this allows multiple backups to be interleaved to the
tape, thus reducing the effect of slow APIs and disk seeks for each one. Note that this can have an
impact on restore times as a particular file set is interleaved among other data.
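As a toy illustration (not Data Protector code), the following Python sketch shows how blocks from several streams are interleaved onto tape and why restoring one file set means skipping past blocks from the other streams:

    # Blocks from three concurrent Disk Agent streams are interleaved
    # round-robin into a single tape stream.
    from itertools import chain, zip_longest

    streams = {
        "A": ["A1", "A2", "A3"],
        "B": ["B1", "B2"],
        "C": ["C1", "C2", "C3", "C4"],
    }

    tape = [block for block in chain.from_iterable(zip_longest(*streams.values()))
            if block is not None]
    print(tape)  # ['A1', 'B1', 'C1', 'A2', 'B2', 'C2', 'A3', 'C3', 'C4']

    # Restoring file set A must read (and discard) the interleaved B and C
    # blocks, which is why multiplexing can slow restores down.
    restore_a = [block for block in tape if block.startswith("A")]
    print(restore_a)  # ['A1', 'A2', 'A3']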
File system
There is a significant difference between the raw read data rate of a disk and the file system read
rate. This is because traversing a file system requires multiple, random disk accesses whereas a
continuous read is limited only by the optimized data rates of the disk.
The difference between these two modes becomes more significant as file size reduces. For file
systems where the files are typically smaller than 64 KB, sequential backups (for example, from raw
disk) could be considered to achieve the required data rates for high-speed tape drives.
File system fragmentation can also be an issue: it causes additional seeks and slower throughput.
Disk
If the system performing the backup has a single hard disk drive, the major factor restricting the backup performance is most likely the maximum transfer rate of that single disk. In a typical environment, the maximum throughput of a single spindle can be as low as 8 MB/s.
High-capacity disk
One high-capacity disk is still one spindle with its own physical limitations. Vendors tend to sell such disks on the argument of the best price per MB. This is just one side of the story; the drawbacks can cause serious problems in high-performance environments.

Two smaller spindles provide double the performance of one large spindle. The backup performance of large disks may be acceptable without any application load, but if an application writes in parallel to such a disk, the total disk performance can drop below 5 MB/s and the hit ratio of a disk array read cache below 60%.
Disk array
Benchmarks have shown that the theoretical disk array performance cannot be achieved with standard backup tools. The problem is that the associated concurrent read processes cannot be distributed equally among all I/O channels and disk drives. The disk array can be seen as a bunch of disks whose internal organization and configuration are hidden from the backup software.

High-capacity disks can cause additional problems, and intelligent disk array caches do not improve the situation: they are not able to provide reasonable throughput for backup and restore tasks because the number of sequential reads and writes is just too high.

The 50% backup performance rule has become a standard for disk array sizing.
RAID
The use of RAID for a disk backup should be carefully considered; each level has its own strengths, and the raw I/O speed of the disk backup device significantly affects the overall backup performance. Five main levels of RAID are commonly referred to (a rough throughput sketch follows the list). These are:
• RAID 0 (Striping)
Data is striped across the available disks, using at least two disk drives. This method offers
increased performance, but there is no protection against disk failure. It will allow you to use the
maximum performance the disks have to offer, but without redundancy.
• RAID 1+0 (Disk Drive Mirroring and Striping)
The disks are mirrored in pairs and data blocks are striped across the mirrored pairs, using at least
four disk drives. In each mirrored pair, the physical disk that is not busy answering other requests
answers any read request sent to the array; this behavior is called load balancing. If a physical disk
fails, the remaining disk in the mirrored pair can still provide all the necessary data. Several disks in
the array can fail without incurring data loss, as long as no two failed disks belong to the same
mirrored pair. This fault-tolerance method is useful when high performance and data protection are
more important than the cost of physical disks. The major advantage of RAID 1+0 is the highest
read and write performance of any fault-tolerant configuration.
• RAID 1 (Mirroring)
Data is written to two or more disks simultaneously, providing a spare “mirror copy” in case one
disk fails. This implementation offers high assurance of the disk transfers completing successfully, but
adds no performance benefit. It also adds cost, as you do not benefit from the additional disk
capacity.
• RAID 3 (Striping and Dedicated Parity)
This type of array uses a separate data protection disk to store encoded data. RAID 3 is designed
to provide a high transfer rate. RAID 3 organizes data by segmenting a user data record into either
bit- or byte-sized chunks and evenly spreading the data across N drives in parallel. One of the
drives acts as a parity drive. In this manner, every record that is accessed is delivered at the full
media rate of the N drives that comprise the stripe group. The drawback is that every record I/O
stripe accesses every drive in the group. RAID 3 architecture should only be chosen in a case where
the user is virtually guaranteed that there will be only a single, long process accessing sequential
data. A video server and a graphics server would be good examples of proper RAID 3
applications. RAID 3 architecture is also beneficial for backups but becomes a poor choice in most
other cases due to its limitations.
• RAID 5 (Striping and Distributed Parity)
Data is striped across all available drives, with parity information distributed across all of them. This method provides high performance combined with failure protection, but requires at least three disk drives. If one disk fails, all data will be recoverable from the remaining disks due to the parity information. The disk write performance of RAID 5 will be slower than RAID 0 because the parity has to be generated and written to disk.
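The following minimal Python sketch turns the qualitative rules above into rough sequential-throughput estimates. The per-disk rate and the RAID 5 write penalty factor are illustrative assumptions, not measured values.

    # Rough (read, write) MB/s estimates per RAID level for a set of
    # identical spindles. These are idealized scaling rules, not benchmarks.

    def raid_throughput(level: str, disks: int, disk_mb_s: float) -> tuple[float, float]:
        if level == "raid0":   # striping: reads and writes scale with all spindles
            return disks * disk_mb_s, disks * disk_mb_s
        if level == "raid1":   # mirroring: no striping benefit
            return disk_mb_s, disk_mb_s
        if level == "raid10":  # stripe of mirrors: reads load-balance across copies
            return disks * disk_mb_s, (disks // 2) * disk_mb_s
        if level == "raid5":   # distributed parity: assumed 4x penalty on small writes
            return (disks - 1) * disk_mb_s, (disks - 1) * disk_mb_s / 4
        raise ValueError(f"unknown RAID level: {level}")

    # Example: eight spindles at an assumed 50 MB/s each.
    for level in ("raid0", "raid1", "raid10", "raid5"):
        print(level, raid_throughput(level, 8, 50.0))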
Tape drive
The mechanical impact of under-feeding tape drives is generally underestimated; customers suffer from slow performance and broken-down tape drives.
The tape drive should be operating at or above its lowest streaming data rate to achieve the best life
for the head, mechanism, and tape media. If the data is not sent fast enough, the internal buffer will
empty and the drive will not write a continuous stream of data. At that point, the drive will start
exhibiting what is called head repositioning. Head repositioning is also known as shoe shining and it
causes excessive wear on the tape media, on the tape drive heads, and on the mechanical tape drive
components. Tape drives have buffers in the data path that are large enough to prevent head
repositioning from explicitly slowing the backup down further. However, the increased error rate from
worn heads/media causes more tape to be used and additional retries to be performed. This will
slow the backup down and it will get worse over time.
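A minimal sketch of this streaming check follows; the 27-MB/s minimum used here is an illustrative assumption for an LTO-3 class drive, not a figure from this paper.

    # If the host cannot feed the drive at or above its lowest streaming
    # rate, the buffer runs empty and the drive starts repositioning
    # ("shoe shining").

    def drive_streams(feed_rate_mb_s: float, min_stream_mb_s: float = 27.0) -> bool:
        return feed_rate_mb_s >= min_stream_mb_s

    # Example: the ~23-MB/s small-file backups measured later in this paper
    # would under-feed the drive.
    print(drive_streams(22.98))  # False: shoe-shining risk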
Storage area network (SAN)
The standard topology for a mid-size and large environment is SAN-based. The SAN has its own
components and data paths. Each of them could become a bottleneck:
• Any HBA (of server, disk, tape and tape library bridge)
• SAN switch (total and single port bandwidth)
• Cabling (each cable type has unique performance and distance specifications)
Finding bottlenecks
Process for identifying bottlenecks
To get the maximum performance of a backup and restore environment, it is important to understand
that many elements influence backup and restore throughput.
A process is needed that breaks even a complex SAN infrastructure down into simple parts that can then be analyzed, measured, and compared. The results can then be used to plan a backup strategy that maximizes performance.
Figure 7 illustrates an example for identifying bottlenecks in SAN environments.
Figure 7. Example for identifying bottlenecks
The following steps are used to evaluate the throughput of a complex SAN infrastructure:
1. Tape subsystem's WRITE performance
2. Tape subsystem's READ performance
3. Disk subsystem's WRITE performance
4. Disk subsystem's READ performance
5. Backup and restore application's effect on disk and tape performance
Details of each of the steps are demonstrated in the test environment. See the Evaluating tape and
disk drive performance section.
With the test and subsequent analysis on a component level, it is possible to identify bottlenecks in the
SAN.
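A minimal Python sketch of this evaluation follows, using the HP-UX typical-file rates measured later in this paper: each stage is measured in isolation, and the slowest stage is the expected ceiling for the complete data path.

    # Measured rates per evaluation step (MB/s); the slowest stage sets
    # the performance ceiling for the whole backup or restore.
    rates_mb_s = {
        "tape write (L&TT)": 158.54,
        "tape read (L&TT)": 160.71,
        "disk write (parallel)": 193.07,
        "disk read (parallel)": 265.75,
    }

    bottleneck = min(rates_mb_s, key=rates_mb_s.get)
    print(f"Expected ceiling: {bottleneck} at {rates_mb_s[bottleneck]} MB/s")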
Performance tools
Understanding the performance of a server, its HBAs, CPU, memory, and storage can be vital in
determining the expected performance of a backup solution. Much of the theoretical performance
information may be known when sizing the server and its components, but the true performance may
not be known until baseline testing is conducted on the system. HP offers tools that help in determining
the raw performance of each component involved in the backup and restore solution.
The following performance tools were utilized:
• HP StorageWorks Library and Tape Tools diagnostics (L&TT)
• HPCreateData
• HPReadData
• LoadSim 2003
HP StorageWorks Library and Tape Tools diagnostics (L&TT)
The HP industry-leading Library and Tape Tools diagnostics assist in troubleshooting backup and restore issues in the overall system. L&TT includes tools that help identify where bottlenecks exist in a system, as well as valuable tools for HP tape drive performance and diagnostic needs. The Windows version of L&TT uses a graphical user interface (GUI) as shown in Figure 8. All other versions of the program, for example, the HP-UX version, use a command screen interface (CSI).
Figure 8. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) GUI—Drive information
L&TT version 4.2 SR1 was used in this white paper for:
• Determining the rate at which the tape drive reads and writes data
• Determining the rate at which the disk subsystem can write and supply data
The corresponding features can be found in the Tape Drive Performance GUI as shown in Figure 9. The embedded tools behind them are HPCreateData and HPReadData.
Figure 9. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) GUI—Tape drive performance
L&TT can be downloaded free from http://www.hp.com/support/tapetools.
HPCreateData
The HPCreateData PAT utility is a file system generator for restore performance measurements. It is
useful in assessing the rate at which your disk subsystem can write data, and this is ultimately what
will limit the restore performance. To write more than one stream, initiate multiple instances of
HPCreateData.
HPCreateData version 1.2.3 was used for testing. Figure 10 shows an example for creating Windows
test data.
Figure 10. HPCreateData for Windows example
Executing HPCreateData results in a directory and file structure similar to that shown in Figure 11. In
each directory, 750 files were created with a depth of 5 and breadth of 6.
Figure 11. Executing HPCreateData for Windows—Directory and file structure
HPCreateData can be downloaded free from http://www.hp.com/support/pat.
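For illustration, here is a minimal Python sketch that builds a similar directory tree with dummy files. The layout rule (a fixed number of files per directory and a fixed breadth of subdirectories per level) is an assumption based on Figure 11, not HPCreateData's actual algorithm.

    import os

    def create_tree(root: str, depth: int, breadth: int,
                    files_per_dir: int, file_size: int = 64 * 1024) -> None:
        """Create 'files_per_dir' fixed-size dummy files in 'root' and
        recurse into 'breadth' subdirectories down to 'depth' levels."""
        os.makedirs(root, exist_ok=True)
        for i in range(files_per_dir):
            with open(os.path.join(root, f"file{i:05d}.dat"), "wb") as f:
                f.write(b"\0" * file_size)
        if depth > 0:
            for b in range(breadth):
                create_tree(os.path.join(root, f"dir{b}"),
                            depth - 1, breadth, files_per_dir)

    # Small example; a full depth-5, breadth-6 tree with 750 files per
    # directory (as in Figure 11) would create thousands of directories.
    create_tree("testdata", depth=2, breadth=3, files_per_dir=10)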
HPReadData
The HPReadData PAT utility is useful in assessing the rate at which your disk subsystem can supply
data, and this is ultimately what will limit the backup performance. It simulates the way Data Protector
reads files. A single instance of HPReadData can read eight streams simultaneously from your array.
To read more than eight streams, initiate multiple instances of HPReadData. HPReadData is available
for Windows, HP-UX, Solaris, and Linux.
HPReadData version 1.2.4 was used for testing. Figure 12 shows an example for reading Windows
test data.
Figure 12. HPReadData for Windows example
Figure 12 shows HPReadData reading one LUN from a file system in a manner similar to the way a
backup application will read files. We can see that the maximum read rate from this configuration is
105.25 MB/s, so we cannot expect any higher backup transfer performance to tape than this figure.
Note
For simulating Data Protector disk read agents, start the equivalent number of HPReadData instances, where each instance reads from only one file system.
HPReadData can be downloaded free from http://www.hp.com/support/pat.
LoadSim 2003
The Microsoft Exchange Server 2003 Load Simulator (LoadSim) is basically a benchmarking tool to simulate the performance load of MAPI clients. For this white paper, its additional data creation features were very useful:
• Create Topology
• Initialize Test
Figure 13 shows the initial screen of LoadSim 2003.
Figure 13. LoadSim 2003—Initial screen
LoadSim provides many configuration parameters for simulating customers' Microsoft Exchange Server environments. If 500 users are initialized with the default initialization values for either the MMB3, Heavy, or Cached Mode profile, the Exchange database grows to approximately 54 GB.
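A rough sizing sketch based on these figures follows; linear scaling with the user count is an assumption, but it comes close to the 2,000-user result reported later in this paper.

    # Derived from the figure above: 500 default-initialized users grew
    # the Exchange database to about 54 GB.
    GB_PER_USER = 54.0 / 500  # ~0.108 GB per initialized mailbox

    def estimated_db_size_gb(users: int) -> float:
        return users * GB_PER_USER

    # ~216 GB estimated; the 2,000-user test later in this paper measured
    # about 200 GB in total.
    print(estimated_db_size_gb(2000))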
Figure 14 illustrates the default LoadSim configuration parameters for distribution lists.
Figure 14. LoadSim—Distribution Lists screen of topology properties
LoadSim 2003 is downloadable from http://go.microsoft.com/fwlink/?LinkId=27882.
Test data
For the following proof points, two different types of file system data and one Microsoft Exchange Server 2003 data set were created, so that the results shown are realistically achievable in similar situations:
• Typical file server data with fewer files and a broad range of size (KB/MB)
• Problematic file server data with millions of small files
• Typical Microsoft Exchange Server 2003 data
Data creation tools
All test data was created using the following public tools:
• HP L&TT version 4.2 SR1a
• HP HPCreateData version 1.2.3
• Microsoft Exchange Server 2003 Load Simulator
HPCreateData and L&TT
The datasets were developed using the following utilities:
• HPCreateData for Windows
• L&TT for HP-UX
HPCreateData for HP-UX was not used because it is only available as a CLI and cannot create more than one directory—all files would be created in one single directory, which does not correspond to what actually occurs in real file server environments.

HPCreateData and L&TT generate different file sizes with different data contents (fixed, random, up to 4:1 compression ratio) and different distributions (file-based, MB-based). All created files had names of at most 16 characters to avoid corner cases.
HPCreateData and L&TT are downloadable from http://www.hp.com/support/pat.
LoadSim 2003
LoadSim 2003 was used for creating the Exchange test data. Before it was used, the Exchange organization had to be prepared with the Exchange System Manager. For optimal performance, all database files were moved to separate disk volumes so that, when data is read or written, the storage system does not become a bottleneck for the test.
Figure 15 illustrates an example of an Exchange Server that is configured for LoadSim testing. In this
example, the Exchange System Manager was used to add three additional storage groups named
StorageGroup1–StorageGroup3. Also, one mailbox store was added to each storage group. These
mailbox stores were named Store1–Store4.
Figure 15. Exchange System Manager with additional storage groups and additional mailbox stores
Two LoadSim features were utilized with their default initialization parameters:
• Create Topology
After topology parameters were specified, the topology was created. This step creates the LoadSim
users and distribution groups (also referred to as DLs or distribution lists), in Active Directory on the
Exchange Server.
• Initialize Test
When the test was initialized, LoadSim added or deleted messages in the Inbox and folders of each user so that each mailbox had the number of messages specified in the Initialization tab of the Customize Tasks window. LoadSim also created new folders, smart folders, Inbox rules, appointments, and contacts.
LoadSim 2003 is downloadable from http://go.microsoft.com/fwlink/?LinkId=27882.
Creating typical files for Windows NTFS
The typical file system was created with file sizes between 64 KB and 64 MB and a data compressibility of 2:1. The utility created an equal distribution of files in each directory.
Figure 16 shows the HPCreateData input parameters and results.
Figure 16. HPCreateData for Windows—Creating typical files
Finally, the file system contained 49.85 GB with 4,389 files in 20 folders.
Creating typical files for HP-UX VxFS
The typical file system for HP-UX VxFS was created with file sizes between 64 KB and 64 MB and a data compressibility of 2:1. The utility created an equal distribution of files in each directory.
Figure 17 illustrates the L&TT input parameters and Figure 18 the results.
Figure 17. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for HP-UX—Creating typical files—Input parameters
Figure 18. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for HP-UX—Creating typical files—Results
Finally, the file system contained 49.85 GB with 4,389 files in 20 folders.
Creating millions of small files for Windows NTFS
The file system with millions of small files for Windows NTFS was created with file sizes between 4 KB and 16 KB and a data compressibility of 2:1. The utility created an equal distribution of files in each directory.
Figure 19 illustrates the HPCreateData input parameters and results.
Figure 19. HPCreateData for Windows—Creating small files
Finally, the file system contained 49.27 GB with 5,535,750 files in 7,380 folders.
Creating millions of small files for HP-UX VxFS
The file system with millions of small files for HP-UX VxFS was created similarly to the Windows one. The file sizes were between 4 KB and 16 KB and the data compressibility 2:1. The utility created an equal distribution of files in each directory.

Note
The number of files per directory was set to 120 due to limitations of L&TT version 4.2 SR1a. The restore pre-test was executed six times to get an HP-UX VxFS layout similar to the Windows NTFS layout described in the previous section, Creating millions of small files for Windows NTFS. In later L&TT releases, the number of files should be increased to 750.

Figure 20 illustrates the L&TT input parameters and Figure 21 the results.

Figure 20. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for HP-UX—Creating small files—Input parameters
Figure 21. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for HP-UX—Creating small files—Results
Finally, the file system contained 47.29 GB with 5,314,320 files in 44,286 folders.
Creating data for Microsoft Exchange Server 2003
LoadSim was configured for simulating 2,000 users. Four equal storage groups were created with
one store each as illustrated in Figure 22.
Figure 22. LoadSim Topology for simulating 2,000 users
The LoadSim initialization resulted in four storage groups with one 50-GB database each. The total
size, which is relevant for backups and restores, was 200 GB.
Evaluating tape and disk drive performance
The performance tests of tape and disk drives give a good overview of what the source and target devices are able to provide. Backup applications cannot perform better than these basic tools.
Tape write and read performance
The tape drive performance was determined with the HP StorageWorks Library and Tape Tools
Diagnostics (L&TT).
L&TT for Windows was configured for writing and reading:
• Zeros with fixed block mode, 256-KB block size, 1M I/O size, 32-GB test size and no file marks
• 2:1 compressible data with fixed block mode, 256-KB block size, 1M I/O size, 32-GB test size
and no file marks
Figure 23 illustrates the L&TT for Windows input parameters and Figure 24 some test results.
Figure 23. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for Windows—Input parameters
Figure 24. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for Windows—Test results
L&TT for HP-UX was configured for writing and reading with the following parameters:
• 4.3/1 compressible data with 128-KB block size, 32768 repeats and 8 blocks
• 2:1 compressible data with 128-KB block size, 32768 repeats and 8 blocks
Note
L&TT for HP-UX does not offer zero patterns. Therefore, the highest
available compression pattern of 4.3/1 was specified.
Figure 25 illustrates the L&TT for HP-UX input parameters and Figure 26 some test results.
Figure 25. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for HP-UX—Input parameters
Figure 26. HP StorageWorks Library and Tape Tools Diagnostics (L&TT) for HP-UX—Test results
The results of all L&TT tape drive tests are illustrated in Figure 27, which shows the performance limits of a direct-attached Ultrium 960 tape drive. This figure is a reference for the later Data Protector tests: the tape drive cannot be faster than it is with L&TT, and this applies to any backup application.
Figure 27. Library and Tape Tools—Results of tape drive tests

[Chart: Basic Tape Drive Tests, MB/s]
Tape Write, Zeros, Windows:       176
Tape Read, Zeros, Windows:        180
Tape Write, Compr. 2:1, Windows:  154
Tape Read, Compr. 2:1, Windows:   156
Tape Write, Compr. 4.3:1, HP-UX:  188
Tape Read, Compr. 4.3:1, HP-UX:   181
Tape Write, Compr. 2:1, HP-UX:    158
Tape Read, Compr. 2:1, HP-UX:     160
The tests revealed that the Ultrium 960 tape drive performs slightly better in the HP-UX environment than in the Windows environment, but the difference is only marginal. Both environments demonstrated that they are fully able to utilize the Ultrium 960 tape drive.
Disk write performance
The disk write performance was determined with the HP StorageWorks Library and Tape Tools
Diagnostics (L&TT) for HP-UX and with the HPCreateData utility for Windows.
Note
The performance data of parallel writes is determined by the data volume
divided by the time of the slowest write process. This approach simulates a
backup application that does not finish before the last byte is written.
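A minimal Python sketch of this measurement rule follows; the stream volumes and runtimes are illustrative.

    # Aggregate throughput = total volume / runtime of the slowest stream,
    # because the backup is not finished before the last byte is written.

    def parallel_throughput_mb_s(streams):
        """streams: list of (volume_mb, seconds) tuples, one per process."""
        total_mb = sum(volume for volume, _ in streams)
        slowest_s = max(seconds for _, seconds in streams)
        return total_mb / slowest_s

    # Five 10-GB write streams; the slowest takes 310 seconds.
    print(parallel_throughput_mb_s([(10240, 290), (10240, 300), (10240, 310),
                                    (10240, 295), (10240, 305)]))  # ~165 MB/s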
Figure 28 demonstrates that single writes of typical files (one stream to one disk volume) already perform well compared to multiple writes (five streams to five disk volumes). With Windows, single writes of small files (4–16 KB) are very slow because its NTFS file system overhead is much higher than with HP-UX VxFS.
Figure 28. Results of disk write tests
[Chart: Disk Write Performance—Typical and Millions of Small Files, MB/s]
Typical files, single stream, Windows:     134
Typical files, parallel streams, Windows:  165
Typical files, single stream, HP-UX:       166
Typical files, parallel streams, HP-UX:    193
Small files, single stream, Windows:        20
Small files, single stream, HP-UX:          51
For Windows, parallel writes (five streams to five disk volumes) of small files failed due to an overflow of the Windows system paged pool. This cannot be solved without Windows kernel tuning, which is not a focus of this white paper.

Note
For Windows, it is recommended that file systems with millions of small files are restored with a single stream only.
Disk read performance
The disk read performance was determined for HP-UX with the HP StorageWorks Library and Tape
Tools Diagnostics (L&TT) and for Windows with the HPReadData utility.
Figure 29 demonstrates that multiple reads (five streams from five disk volumes) perform much better
than single reads (one stream from one disk volume). With Windows, single reads of small files
(4–16 KB) are very slow because its NTFS file system overhead is much higher than with HP-UX VxFS.
Figure 29. Results of disk read tests
[Chart: Disk Read Performance—Typical and Millions of Small Files, MB/s]
Typical files, single stream, Windows:     105
Typical files, parallel streams, Windows:  262
Typical files, single stream, HP-UX:       131
Typical files, parallel streams, HP-UX:    266
Small files, single stream, Windows:        11
Small files, parallel streams, Windows:     29
Small files, single stream, HP-UX:          24
Small files, parallel streams, HP-UX:       53
Note
The performance data of parallel reads is determined by the data volume
divided by the time of the slowest read process. This approach simulates a
backup application, which does not finish before the last byte is read.
Evaluating Data Protector performance
This section highlights the Data Protector performance and uses the results of the previous sections to find the bottlenecks and give recommendations. The procedures in this section should be transferable to many customer environments.
The Data Protector performance was tested and analyzed for two different types of file servers and a
classic Microsoft Exchange Server 2003:
• Typical file server data with a common amount of files and a broad range of size (KB/MB)
• Problematic file server data with millions of small files (KB)
• Typical Microsoft Exchange Server 2003 data
The performance tests were executed by configuring single and multiple backup/restore streams—also known as concurrency or multiplexing. The results of single and multiple stream tests were compared and analyzed. In some cases, multiple streams could be slower than single streams.
The tests included the measurement of performance (MB/s), CPU load (%), and memory usage (MB)
of the backup server and if applicable the client. Windows performance data was measured with the
built-in tool Perfmon and HP-UX performance data with the built-in tool vmstat.
Note
The memory usage is not documented in the following sections because all tests showed that the memory usage of Data Protector itself was very small and never exceeded 46 MB. For example, Data Protector did not use more than 19 MB of memory for the HP-UX local backup of typical files, and in the worst case, the Windows local backup of typical files, it did not use more than 46 MB. These days, 46 MB of usage is not relevant for servers with gigabytes of memory.
Note
The Cell Manager performance was not logged because it was not a focus of this white paper. During testing, some quick Cell Manager performance checks confirmed that the Cell Manager was far from becoming a bottleneck. The CPU and IDB load was very low, even while saving and restoring the file server with millions of small files. The test scenarios of this paper were not sufficient to put the Cell Manager under pressure.
Data Protector configuration
The backup and restore tests were executed with the following changes of Data Protector’s default
configuration:
• 256-KB tape drive block size, as described in the HP StorageWorks Ultrium 960 tape drive configuration section
• Only one file system tree walk for Windows NTFS file systems with millions of small files, as described in the File system tree walk section
HP StorageWorks Ultrium 960 tape drive configuration
For all tests with the HP StorageWorks Ultrium 960 tape drive, the block size was configured in Data Protector with 256 KB (64-KB default) as shown in Figure 30. For further details, see the white paper Getting the most performance from your HP StorageWorks Ultrium 960 tape drive.

File system tree walk
File system tree walks are typically problematic in Windows NTFS file systems with millions of small files. UNIX file systems are less sensitive and respond much faster.
• The first file system tree walk is required for:
– Running backup statistics
By default Data Protector creates backup statistics during runtime. The tree walk scans the files
selected for the backup and calculates its size, so that the progress (percentage done) can be
calculated and displayed in the Data Protector Monitor GUI.
– Detecting Windows NTFS hard links
NTFS hard links are detected if the advanced WINFS filesystem option Detect NTFS Hardlinks is
selected (ON).
– Detecting UNIX POSIX hard links
POSIX hard links are detected and backed up as links if the advanced UNIX filesystem option
Backup POSIX Hard Links as Files is not selected (OFF).
• The second file system tree walk is required for cataloging, indexing, and saving the selected files.
Note
Disabling the first file system tree walk is only recommended in the
following scenarios:
• Running unattended backup sessions
• Backing up millions of small files from Windows NTFS or UNIX file
systems with the requirement of an overall runtime decrease
• Having no NTFS or POSIX hard links, otherwise Data Protector would
back up the entire file contents for each hard link, which would occupy
more space on the backup media.
How to disable first tree walk for Windows NTFS: If the advanced WINFS file system option Detect
NTFS Hardlinks is not set, the first tree walk of NTFS can be disabled by setting “NoTreeWalk=1” in
the client’s local “omnirc” file (<Data_Protector_home>\omnirc).
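For illustration, the resulting entry would be a single line in the client's omnirc file (a sketch; the remaining file contents stay untouched):

    # <Data_Protector_home>\omnirc on the Windows client:
    # disable the first tree walk (no progress statistics during backup;
    # only safe if Detect NTFS Hardlinks is not set)
    NoTreeWalk=1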
How to disable first tree walk for UNIX file systems: If the advanced UNIX file system option Backup
POSIX Hard Links as Files is selected (ON), the first tree walk of UNIX file systems is disabled and
POSIX hard links are backed up as files. Figure 31 illustrates the correct configuration for disabling
the tree walk.
Figure 31. Disabling the first tree walk for UNIX file systems—Backup POSIX hard links as files
Data Protector IDB considerations
Note
The IDB logging level was not changed from its default value "Log All." This logging level is the worst case for the IDB because each file is tracked. For further details, see the IDB logging level section.

The tests in the following sections showed that more tape space was consumed for small files than for typical files. The backup of 5,535,750 small files with a total volume of 249.36 GB resulted in 267.32 GB of tape space usage. The overhead was caused by the high number of small files and the catalog information written to tape for each single file, for example, its file name and attributes.
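As a worked example of this overhead, the following sketch derives the average per-file cost from the numbers above; the per-file figure is a derived average, not a documented constant.

    # 5,535,750 small files with 249.36 GB of data consumed 267.32 GB of tape.
    files = 5_535_750
    data_gb = 249.36
    tape_gb = 267.32

    overhead_gb = tape_gb - data_gb                  # ~17.96 GB of catalog data
    per_file_kb = overhead_gb * 1024 * 1024 / files  # ~3.4 KB per file
    print(f"{overhead_gb:.2f} GB overhead, {per_file_kb:.1f} KB per file")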
Backup of typical and small files
This section covers the local and the network backup of typical and small files. For Windows NTFS file systems with millions of small files, the first tree walk was disabled as described in the File system tree walk section. For UNIX file systems, the first tree walk was always enabled (default) because it is less critical.
Figure 32 illustrates the effect of the first file system tree walk during the HP-UX network backup of
small files. At the beginning, the Windows server loaded the tape and waited for data from the
remote HP-UX server, which was busy with the execution of the HP-UX file system tree walk.
Figure 32. CPU Load of the Windows backup server during the HP-UX network backup of millions of small files
Local backup of typical files
The typical files were saved to the SCSI-attached Ultrium 960 tape drive.
For Windows, Test 1 of Table 1 shows that the tape device wrote at 154.00 MB/s. This was sufficient for backing up a single file system, as shown in Test 2 with 105.25 MB/s. When backing up multiple file systems, as shown in Test 3, the file systems were able to provide 262.24 MB/s, which was faster than the Ultrium 960 tape drive. Test 5 showed that the Data Protector performance of 151.69 MB/s was very close to the tape drive performance of Test 1 with 154.00 MB/s. In Test 5, the tape drive was able to stream at its highest performance and therefore became the bottleneck. This is a very good example of a well-balanced environment for an Ultrium 960 tape drive.
For HP-UX, Test 1 of Table 1 shows that the results were even better than for Windows, but the same rules applied: the Ultrium 960 tape drive was the bottleneck.

The CPU load was low except for the HP-UX NULL device backup in Test 4, which resulted in a 43% load and an excellent backup performance of 368.76 MB/s. Test 5, the backup to the Ultrium 960 tape drive, did not consume many resources: the CPU load was just 13% for Windows and 16% for HP-UX.
Table 1. Local backup of typical files—bottleneck determination

Test                                  Performance (MB/s)   CPU Load   Bottleneck
1. Windows L&TT Tape Write (1)        154.00               -          Yes (Tape)
2. Windows HPReadData Single (1)      105.25               -          No
3. Windows HPReadData Parallel (1)    262.24               -          No
4. Windows DP NULL Parallel           197.15               18%        No
5. Windows DP Ultrium 960 Parallel    151.69               13%        No
1. HP-UX L&TT Tape Write (1)          158.54               -          Yes (Tape)
2. HP-UX HPReadData Single (1)        130.58               -          No
3. HP-UX HPReadData Parallel (1)      265.75               -          No
4. HP-UX DP NULL Parallel             368.76               43%        No
5. HP-UX DP Ultrium 960 Parallel      156.11               16%        No
Recommendation
If backing up typical files locally to the SCSI-attached Ultrium 960 tape
drive, parallel backups (multiplexing/concurrency) are recommended
because one single stream cannot fully utilize the tape drive.
(1) Tested in the Evaluating tape and disk drive performance section.
Local backup of small files
The small files were saved to the SCSI-attached Ultrium 960 tape drive.
Table 2 shows that file systems with millions of small files were not able to fully utilize the Ultrium 960 tape drive. The file systems were the bottleneck for both operating systems. This also had some impact on Data Protector because a large amount of file information had to be written to tape and into Data Protector's IDB, which resulted in an additional performance loss of 6.04 MB/s for Windows (Test 3 with 29.02 MB/s minus Test 5 with 22.98 MB/s = 6.04 MB/s). For further details, see the Data Protector IDB considerations section.

The CPU load of Tests 4 and 5 was higher for Windows (61%) than for HP-UX (25% and 16%). This is the result of the inefficiency of the Windows NTFS file system when holding millions of small files.
Note
For Windows Tests 4 and 5, the first file system tree walk was disabled,
which improved the overall performance. Without this, the performance
would have been even worse.
Table 2. Local backup of small files—bottleneck determination

Test                                  Performance (MB/s)   CPU Load   Bottleneck
1. Windows L&TT Tape Write (2)        154.00               -          No
2. Windows HPReadData Single (2)      10.59                -          Yes (File System)
3. Windows HPReadData Parallel (2)    29.02                -          Yes (File System)
4. Windows DP NULL Parallel           22.54                61%        No
5. Windows DP Ultrium 960 Parallel    22.98                61%        No
1. HP-UX L&TT Tape Write (2)          158.54               -          No
2. HP-UX HPReadData Single (2)        24.41                -          Yes (File System)
3. HP-UX HPReadData Parallel (2)      53.08                -          Yes (File System)
4. HP-UX DP NULL Parallel             50.50                25%        No
5. HP-UX DP Ultrium 960 Parallel      48.76                16%        No
Recommendation
If backing up file systems with millions of small files, parallel backups with a higher concurrency are recommended. These kinds of file systems are very slow and should be multiplexed. In the case of the fast Ultrium 960 tape drive, a slower Ultrium tape drive or backup-to-disk technology could be considered instead.
(2) Tested in the Evaluating tape and disk drive performance section.
Network backup of typical files
The typical files were saved from the client server by way of the network (Gigabit Ethernet) to the
remote backup server and its SCSI-attached Ultrium 960 tape drive.
Table 3 shows that the tape device in Test 1 and the disk device in Test 3 were faster than Data
Protector in Test 4 and Test 5. In this scenario, both operating systems showed the same backup
performance by way of the network, which was the bottleneck. The Gigabit Ethernet itself has a
1,000-Mb/s or 120-MB/s limitation, which is very close to the results of Test 5 (Windows
108.29 MB/s and HP-UX 111.84 MB/s).
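As a quick plausibility check, the practical limit can be derived from the link speed. The following sketch (plain Python, purely illustrative and not part of the test setup) performs the conversion and compares it with the measured Test 5 results:

  # Gigabit Ethernet: 1,000 Mb/s line rate divided by 8 bits per byte
  # gives 125 MB/s; protocol overhead reduces this to roughly 120 MB/s.
  link_mbps = 1000
  raw_limit = link_mbps / 8              # 125 MB/s before overhead
  practical_limit = 120                  # approximate usable throughput (MB/s)

  for host, measured in (("Windows", 108.29), ("HP-UX", 111.84)):
      utilization = measured / practical_limit
      print(f"{host}: {measured} MB/s = {utilization:.0%} of the practical limit")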
Table 3. Network backup of typical files—bottleneck determination

Test                                  Performance (MB/s)   CPU Load Client   CPU Load Backup Server   Bottleneck
1. Windows L&TT Tape Write (3)        154.00               -                 -                        No
2. Windows HPReadData Single (3)      105.25               -                 -                        No
3. Windows HPReadData Parallel (3)    262.24               -                 -                        No
4. Windows DP NULL Parallel           108.16               12%               12%                      Yes (Network)
5. Windows DP Ultrium 960 Parallel    108.29               -                 -                        Yes (Network)
1. HP-UX L&TT Tape Write (3)          158.54               -                 -                        No
2. HP-UX HPReadData Single (3)        130.58               -                 -                        No
3. HP-UX HPReadData Parallel (3)      265.75               -                 -                        No
5. HP-UX DP Ultrium 960 Parallel      111.84               -                 -                        Yes (Network)

Recommendation
If backing up typical files from a fast disk by way of Gigabit Ethernet to a remote Ultrium 960 tape drive, backups without multiplexing/concurrency could be considered. In this case, the network is the bottleneck.
(3) Tested in the Evaluating tape and disk drive performance section.
Network backup of small files
The small files were saved by way of the network (Gigabit Ethernet) to the remote backup server and
its SCSI-attached Ultrium 960 tape drive.
Table 4 shows that file systems with millions of small files were not able to fully utilize the remote
Ultrium 960 tape drive with the network (Gigabit Ethernet) limit of 120 MB/s. The file systems were
the bottleneck for both operating systems.
Note
For Windows Tests 4 and 5, the first file system tree walk was disabled,
which improved the overall performance. Without this, the performance
would have been even worse.
Table 4. Network backup of small files—bottleneck determination

Test                                  Performance (MB/s)   CPU Load Client   CPU Load Backup Server   Bottleneck
1. Windows L&TT Tape Write (4)        154.00               -                 -                        No
2. Windows HPReadData Single (4)      10.59                -                 -                        No
3. Windows HPReadData Parallel (4)    29.02                -                 -                        No
4. Windows DP NULL Parallel           22.02                3%                62%                      Yes (Network)

Recommendation
Parallel backups with a higher concurrency are recommended for file systems with millions of small files, also when backing up by way of the network (Gigabit Ethernet). These kinds of file systems are very slow and should be multiplexed. Given the fast Ultrium 960 tape drive and the Gigabit Ethernet, a slower Ultrium tape drive or backup-to-disk technology could be considered.
(4) Tested in the Evaluating tape and disk drive performance section.
Restore of typical and small files
This section covers the local- and network-based restore of typical and small files.
Local restore of typical files
The files were restored directly from the SCSI-attached Ultrium 960 tape drive.
For Windows, Test 1 of Table 5 shows that the tape device did not read faster than 156.00 MB/s. Test 3 shows that the disk device achieved a 164.67-MB/s write performance, and Test 4 a 145.07-MB/s restore performance for Data Protector. If the performance of the two devices is this close, the tape sometimes has to wait for I/O, which explains the slightly slower Data Protector performance. Note that a disk device does not show 100% consistent performance from the operating system point of view. It always fluctuates between higher and lower values, and monitoring tools display only the average value over a configured timeframe. This can be verified, for example, with Perfmon, the built-in Windows performance tool.
For HP-UX, Table 5 shows that the tape device result of Test 1 (160.71 MB/s) was lower than the disk device result of Test 3 (193.07 MB/s). The tape device was the bottleneck.
Table 5. Local restore of typical files—bottleneck determination

Test                                  Performance (MB/s)   Bottleneck
1. Windows L&TT Tape Read (5)         156.00               Yes (Tape)
2. Windows HPCreateData Single (5)    133.98               No
3. Windows HPCreateData Parallel (5)  164.67               No
4. Windows DP Ultrium 960 Parallel    145.07               No
1. HP-UX L&TT Tape Read (5)           160.71               Yes (Tape)
2. HP-UX HPCreateData Single (5)      166.28               No
3. HP-UX HPCreateData Parallel (5)    193.07               No
4. HP-UX DP Ultrium 960 Parallel      153.30               No
(5) Tested in the Evaluating tape and disk drive performance section.
Local restore of small files
The files were restored directly from the SCSI-attached Ultrium 960 tape drive.
For Windows, Test 1 of Table 6 shows that the tape device did not read faster than 156.00 MB/s, and Test 2 shows that the disk device achieved a 19.94-MB/s single-stream write performance. Test 3 resulted in just 3.38-MB/s single-stream restore performance for Data Protector. The bottleneck was the file system, which was very busy during the recovery: millions of small files were written back with their original names and file attributes, so the tape device was constantly in start/stop mode.
Note
With Windows, Test 4 with parallel writes to file systems was not possible due to problems with an overflow of the Windows system paged pool. For further details, see the Disk write performance section. For that reason, the HP-UX Test 4 was skipped as well.
For HP-UX, Test 2 of Table 6 shows a much better disk write performance (51.41 MB/s) than for Windows (19.94 MB/s). If the Test 3 results are compared between Windows (3.38 MB/s) and HP-UX (20.76 MB/s), the difference is even bigger. This shows how much more efficient the HP-UX file system is.
Table 6. Local restore of small files—bottleneck determination

Test                                  Performance (MB/s)   Bottleneck
1. Windows L&TT Tape Read (6)         156.00               No
2. Windows HPCreateData Single (6)    19.94                Yes (File System)
3. Windows DP Ultrium 960 Single      3.38                 Yes (File System)
4. Windows DP Ultrium 960 Parallel    -                    -
1. HP-UX L&TT Tape Read (6)           160.71               No
2. HP-UX HPCreateData Single (6)      51.41                Yes (File System)
3. HP-UX DP Ultrium 960 Single        20.76                Yes (File System)
4. HP-UX DP Ultrium 960 Parallel      -                    -
(6) Tested in the Evaluating tape and disk drive performance section.
Network restore of typical files
The typical files were restored by way of the network (Gigabit Ethernet) from the remote backup
server and its SCSI-attached Ultrium 960 tape drive.
Table 7 shows that the tape device of Test 1 and the disk device of Test 3 were faster than Data Protector in Test 4. In this scenario, both operating systems showed the same performance by way of the network, which was the bottleneck. The Gigabit Ethernet itself has a 1,000-Mb/s or
120-MB/s limitation, which is very close to the results of Test 4 (Windows with 104.96 MB/s and
HP-UX with 104.22 MB/s).
Table 7. Network restore of typical files—bottleneck determination

Test                                  Performance (MB/s)   Bottleneck
1. Windows L&TT Tape Read (7)         156.00               No
2. Windows HPCreateData Single (7)    133.98               No
3. Windows HPCreateData Parallel (7)  164.67               No
4. Windows DP Ultrium 960 Parallel    104.96               Yes (Network)
1. HP-UX L&TT Tape Read (7)           160.71               No
2. HP-UX HPCreateData Single (7)      166.28               No
3. HP-UX HPCreateData Parallel (7)    193.07               No
4. HP-UX DP Ultrium 960 Parallel      104.22               Yes (Network)
(7) Tested in the Evaluating tape and disk drive performance section.
Network restore of small files
The small files were restored by way of the network (Gigabit Ethernet) from the remote backup server
and its SCSI-attached Ultrium 960 tape drive.
For Windows, Table 8 shows that the tape device did not read faster than 156.00 MB/s, and Test 2 shows that the disk device can write at 19.94 MB/s. But Test 3 resulted in just 3.66-MB/s restore performance for Data Protector. The bottleneck was the file system, which was very busy during the recovery: millions of small files were written back with their original names and file attributes, so the tape device was constantly in start/stop mode.
Note
For Windows, Test 4 with parallel writes to file systems was not possible due to problems with an overflow of the Windows system paged pool. For further details, see the Disk write performance section. For that reason, the HP-UX Test 4 was skipped as well.
Table 8. Network restore of small files—bottleneck determination

Test                                  Performance (MB/s)   Bottleneck
1. Windows L&TT Tape Read (8)         156.00               No
2. Windows HPCreateData Single (8)    19.94                Yes (File System)
3. Windows DP Ultrium 960 Single      3.66                 Yes (File System)
4. Windows DP Ultrium 960 Parallel    -                    -
1. HP-UX L&TT Tape Read (8)           160.71               No
2. HP-UX HPCreateData Single (8)      51.41                Yes (File System)
3. HP-UX DP Ultrium 960 Single        20.18                Yes (File System)
4. HP-UX DP Ultrium 960 Parallel      -                    -
Backup and restore of Microsoft Exchange Server 2003
The Microsoft Exchange Server was principally tested for backup performance. Four storage groups
with one 50-GB database each were saved to the SCSI-attached Ultrium 960 tape drive. Other scenarios, for example with multiple databases per storage group, were not tested because they would not significantly change the performance results (MB/s). Multiple databases within one storage group are backed up one after the other.
Note
Multiple storage groups are backed up in parallel but multiple databases
(stores) within a storage group only sequentially.
Basic Microsoft Exchange Server 2003 test tools for backup and restore were not available during the
creation of this white paper. Therefore, only L&TT and Data Protector were used for testing.
(8) Tested in the Evaluating tape and disk drive performance section.
Data Protector configuration parameters
The maximum device concurrency for backing up the Exchange Server data is two. A higher concurrency would consume too many Exchange Server resources and ultimately not improve the backup performance.
Recommendation
The recommended device concurrency for backing up the Exchange Server
data is two for devices connected directly to the server, and one for devices
connected remotely.
The buffer size is based on the formula buffer size = concurrency * 16 KB. The minimum buffer size is
32 KB, which is the default buffer size as well.
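As an illustration only, the sizing rule stated above can be written as a few lines of Python; the constants are exactly the ones from the text:

  # Exchange integration buffer sizing: buffer size = concurrency x 16 KB,
  # with a 32-KB floor (which is also the default).
  def exchange_buffer_kb(concurrency):
      return max(32, concurrency * 16)

  for conc in (1, 2, 4):
      print(f"concurrency {conc}: {exchange_buffer_kb(conc)} KB")
  # concurrency 1: 32 KB, concurrency 2: 32 KB, concurrency 4: 64 KB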
Local backup of Microsoft Exchange Server 2003
All storage groups were saved to the SCSI-attached Ultrium 960 tape drive with the default
configuration values for concurrency (2) and backup buffer size (32 KB).
The performance was calculated based only on the backup time of the storage groups and their databases. The backup time for the subsequent transaction log backup was excluded.
Table 9 demonstrates with Test 2 (226.99 MB/s) and Test 3 (289.55 MB/s) that the Exchange Server integration of Data Protector was able to deliver more performance than the Ultrium 960 tape drive can handle (154 MB/s). The Exchange Server backup of Test 4 (138.70 MB/s) did not completely reach the Ultrium tape drive performance of Test 1 (154 MB/s). This is because the data was not always streamed at the same rate: throughput peaked while a database was being streamed and dropped sharply whenever one database backup finished and the next one started.
Table 9. Local backup of Exchange Server 2003—bottleneck determination

Test                              Performance (MB/s)   Bottleneck
1. Windows L&TT Tape Write (9)    154.00               No
2. DP NULL Single (conc.=1)       226.99               No
3. DP NULL Parallel (conc.=2)     289.55               No
4. DP Ultrium 960 Parallel        138.70               Yes (Tape)

(9) Tested in the Evaluating tape and disk drive performance section.
Local restore of Microsoft Exchange Server 2003
All storage groups were restored from the SCSI-attached Ultrium 960 tape drive.
The performance was calculated based only on the restore time of the storage groups and their databases. The time for the subsequent transaction log restore and recovery was excluded.
As shown in Table 10, the Exchange Server restore of Test 2 (109.61 MB/s) was slower than the Ultrium tape drive performance of Test 1 (156 MB/s). This is because the data was not always streamed at the same rate: throughput peaked during streaming and dropped sharply, for example, while a storage group was being initialized.
Table 10. Local restore of Exchange Server 2003—bottleneck determination

Test                              Performance (MB/s)   Bottleneck
1. Windows L&TT Tape Read (9)     156.00               No
2. DP Ultrium 960 Parallel        109.61               Yes (Tape)
Tuning Data Protector performance for typical files
Data Protector’s backup and restore performance can be improved by modifying its configuration for
backups, devices, and media.
Because most customers tune their environment for backup performance, only backups are discussed here. The focus of this white paper is on the most important backup parameters, which are configurable in the GUI. The client-based parameter for the File system tree walk is not considered.
For simplicity, all tests of this section were executed on HP-UX with the typical files dataset as created in the Creating typical files for HP-UX VxFS section.
Note
In this section, all tests were based on HP-UX typical files with a data compressibility of 4:1.
Backup options
Data Protector offers a comprehensive set of backup options for fine tuning. The most relevant options
for performance are:
• Load balancing
• Software compression
• IDB logging level
• Detect Windows NTFS hardlinks
50
Page 51
Load balancing
By default, Data Protector automatically balances the load (usage) of devices so that they are used evenly. Because load balancing is performed automatically at backup time, it is not necessary to manage the assignment of objects to devices; all assigned devices stay busy during the backup session. If this option was selected in the Create New Backup dialog (the default), it cannot be deselected later.
Figure 33 illustrates a backup configuration with the default load balancing switched ON.
Figure 33. Backup configuration—load balancing
If load balancing was not selected (OFF) in the Create New Backup dialog, single devices can be
chosen, which will be used for each object in a backup specification. This could be very beneficial for
bundling objects (for example, file systems) based on their speed and not based on their size. Data
Protector does not track the backup speed of each object and therefore cannot automatically balance
based on that. But the manual load balancing option provides an alternative solution for fine tuning
the distribution of backup objects to devices.
Software compression
This option enables compressing the data that is read by the Disk Agent. It is based on the Lempel-Ziv compression algorithm, which is compatible with the standard UNIX compress utility.
Software compression can be advantageous in low-bandwidth network environments where the network is the bottleneck. The Disk Agent compresses the data and then sends it across the network to the remote Media Agent. Sometimes this procedure improves the backup performance. The default compression setting is OFF.
Note
Most modern backup devices provide built-in hardware compression that
you can configure when adding a device to a client. In this case, do not
use this option, since double compression only decreases performance
without giving better compression results.
51
Page 52
Figure 34 shows the default configuration parameters for file systems. The software compression is set
to OFF by default.
Figure 34. Backup configuration—default file system options—software compression OFF
52
Page 53
Figure 35 demonstrates how performance-efficient the Ultrium 960 tape drive compression is. With
the software compression disabled (default), the backup performance was 156 MB/s. With the
software compression enabled, the performance was just 46 MB/s.
Figure 35. Performance of HP-UX local backup to Ultrium 960 with software compression OFF/ON
Figure 36 shows that enabling the software compression increased the CPU load from 13% to 99%.
The CPU load was very high because Data Protector compressed five file systems in parallel.
Figure 36. CPU load of HP-UX local backup to Ultrium 960 with software compression OFF/ON
Note
As a rule of thumb, each software compression process takes all available
resources of a single CPU. For instance, one Disk Agent would utilize only
one out of four CPUs, which would result in an overall 25% CPU load.
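The rule of thumb in the note can be expressed as a small, illustrative calculation (an estimate only; real loads also include Media Agent and I/O overhead):

  # Each software-compression stream saturates roughly one CPU, so the
  # estimated overall load is the number of busy CPUs over the total.
  def estimated_cpu_load(compressing_agents, cpus):
      return min(compressing_agents, cpus) / cpus

  print(f"{estimated_cpu_load(1, 4):.0%}")   # one Disk Agent on 4 CPUs -> 25%
  print(f"{estimated_cpu_load(5, 4):.0%}")   # five parallel agents -> 100%, close to the observed 99%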
Figure 37 shows that the ratios differed between software and hardware compression. HP-UX gzip was additionally tested because it is the current standard compression utility for UNIX and a good indicator for achievable ratios. The compression ratio of HP-UX gzip was 59%, the Ultrium 960 built-in compression 50%, and the Data Protector software compression 45%.
Figure 37. Compression ratios for HP-UX—compression ratios for typical files (2:1): Data Protector compression ON, Ultrium 960 built-in compression, and HP-UX gzip (default parameters)
Following are some reasons why Data Protector's software compression ratio was lower:
• The Data Protector compression is based on the Lempel-Ziv compression algorithm, which is less space-efficient than, for instance, the newer gzip (GNU zip) compression utility. On the other hand, better compression has its price in terms of speed.
• gzip offers different compression levels, ranging from fastest to best compression. This test was executed with the default level, which is biased toward high compression at the expense of speed.
• Ultrium tape drives use the Advanced Lossless Data Compression (ALDC) algorithm for data compression. ALDC is an implementation of the Lempel-Ziv method of compressing data and can be implemented very efficiently in hardware. ALDC can switch into a non-compressed mode according to the structure of the data pattern, which means that highly random data does not actually expand when compressed. This approach is fast and space-efficient at the same time.
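The level-versus-ratio trade-off described in the second bullet can be reproduced in miniature with Python's standard zlib module, which implements DEFLATE (an LZ77 derivative). The sample data and levels below are illustrative only and unrelated to the white paper's test data:

  import time
  import zlib

  # Repetitive, highly compressible sample data (illustrative only).
  data = b"backup and restore performance white paper " * 50000

  for level in (1, 6, 9):   # fastest, default, best compression
      start = time.perf_counter()
      compressed = zlib.compress(data, level)
      elapsed = time.perf_counter() - start
      saved = 1 - len(compressed) / len(data)
      print(f"level {level}: {saved:.0%} space saved in {elapsed * 1000:.1f} ms")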
IDB logging level
The logging level determines the amount of details on files and directories, which are written to the
IDB during backup, object copying, or object consolidation. Regardless of the logging level, the
backed up data can always be restored. Data Protector provides four logging levels: Log All, Log
Directories, Log Files, and No Log. The different logging level settings influence the IDB growth,
backup speed, and the convenience of browsing data for restore. Since the impact is mostly relevant
for the Cell Manager, which is not a focus of this white paper, it was not tested.
Note
Any Data Protector test was executed with the default configuration
parameter Log All, which represents the worst case in terms of
performance.
Figure 38 shows the default configuration parameters for file systems. The logging level is set to Log All.
Figure 38. Backup configuration—default file system options—logging level Log All
Detect Windows NTFS hardlinks
This option enables the detection of NTFS hard links. By default, Data Protector does not detect NTFS hard links and backs them up as files. This significantly improves backup performance, but the files occupy more space on the media. The original structure is not preserved, and at restore time, hard links are restored as files.
Note
NTFS hard links are not commonly used in Windows environments.
Therefore, it is not a focus of this white paper.
Figure 39 shows the default configuration parameters for Windows file systems. The default NTFS
hardlinks detection is OFF.
Advanced options can be set for devices and media when configuring a new device, or when
changing device properties. The availability of these options depends on the device type.
Some of these options can also be set when configuring a backup, for example, the concurrency.
Device options set in a backup specification override options set for the device in general.
The most relevant options for performance are:
• Concurrency
• CRC check
• Block size
• Segment size
• Disk Agent buffers
• Hardware compression
Concurrency
The number of Disk Agents started for each Media Agent is called Disk Agent (backup) concurrency and can be modified using the advanced options for the device or when configuring a backup. The concurrency set in the backup specification takes precedence over the concurrency set in the device definition.
Data Protector provides a default number of Disk Agents that is sufficient for most cases. The fast HP StorageWorks Ultrium 960 device is configured with a default number of four.
For example, if you have a library with two Ultrium 960 devices, each controlled by a Media Agent, and each Media Agent receives data from four Disk Agents concurrently, data from eight disks is backed up simultaneously.
Figure 40 shows the default configuration parameters for Ultrium tape drives. The concurrency is set to four. Note that other tape drives or backup devices could have different values.
Figure 40. HP StorageWorks Ultrium 960 advanced options—concurrency
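A rough lower bound for the required concurrency follows from the measured single-stream and tape rates. The sketch below (illustrative only) uses the Windows values from Table 1; in practice, some headroom on top of this bound helps keep the drive streaming:

  import math

  # Minimum number of concurrent Disk Agent streams needed so that the
  # combined read rate at least matches the tape drive's write rate.
  tape_write_mb_s = 154.0      # Windows L&TT tape write, Table 1
  single_stream_mb_s = 105.25  # Windows HPReadData single stream, Table 1

  needed = math.ceil(tape_write_mb_s / single_stream_mb_s)
  print(f"at least {needed} concurrent streams")   # -> 2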
Figure 41 illustrates the performance of a local backup to the Ultrium 960 tape drive with different concurrencies. It shows that an Ultrium 960 concurrency value of three was already sufficient for this test environment with its SAN-connected disk volumes. The default value of four and the higher value of five did not increase the performance, but they did not reduce it either. Higher concurrency values are useful for environments with slower disks and file systems.
Figure 41. Performance of HP-UX local backup to Ultrium 960 with concurrency 1–5 (concurrency 1: 127 MB/s; 2: 145 MB/s; 3: 155 MB/s; 4: 153 MB/s; 5: 156 MB/s)
Figure 42 demonstrates that the CPU load was low during all backup tests. Concurrency values of three to five resulted in the same CPU load, mirroring the backup performance shown in Figure 41.
Figure 42. CPU load of HP-UX local backup to Ultrium 960 with concurrency 1–5 (concurrency 1: 11%; 2: 15%; 3: 16%; 4: 16%; 5: 16%)
CRC check
The CRC check is an enhanced checksum function. When this option is selected, cyclic redundancy checksums (CRC) are written to the media during backup. The CRC checks allow you to verify the media after the backup. Data Protector recalculates the CRC during a restore and compares it to the CRC on the medium. The CRC is also used while verifying and copying the media. This option can be specified for backup, object copy, and object consolidation operations. The default value is OFF.
Figure 43 shows the default configuration parameters for Ultrium tape drives. The CRC check was
OFF.
Figure 43. HP StorageWorks Ultrium 960 advanced options—CRC check OFF
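Conceptually, the check works like the following sketch; zlib's CRC-32 is used here only as a stand-in and is not necessarily the checksum Data Protector computes internally:

  import zlib

  # At backup time, a checksum is computed per block and written to the
  # medium; at restore time, it is recomputed and compared.
  block = b"example backup block"
  stored_crc = zlib.crc32(block)    # stored on the medium during backup

  restored = block                  # the block as read back during restore
  if zlib.crc32(restored) != stored_crc:
      raise IOError("CRC mismatch: block was corrupted on the medium or in transfer")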
Figure 44 demonstrates that enabling the CRC check required additional system resources. The CRC ON test resulted in an approximately 20% performance decrease.
Figure 44. Performance of HP-UX local backup to Ultrium 960 with CRC check OFF/ON (OFF: 156 MB/s; ON: 124 MB/s)
Figure 45 shows that enabling the CRC check required additional CPU resources. The CRC ON test
resulted in more than twice the CPU load.
Figure 45. CPU load of HP-UX local backup to Ultrium 960 with CRC check OFF/ON (OFF: 16%; ON: 37%)
Block size
Segments are not written as a whole unit, but rather in smaller subunits called blocks. Data Protector uses a default device block size that depends on the device type. The block size applies to all devices created by Data Protector and to Media Agents running on the different platforms.
Increasing the block size can improve performance. You can adjust the block size sent to the device while configuring a new device or when changing the device properties using the advanced options for the device. A restore automatically adjusts the block size.
For the HP StorageWorks Ultrium 960, the block size was configured with a fixed value of 256 KB, as shown in Figure 46 and as described in the Data Protector configuration section.
Figure 46. HP StorageWorks Ultrium 960 configuration—block size, segment size, and Disk Agent buffers
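As an illustration of what fixed-size blocking means (a sketch only, not Data Protector's actual I/O path), a data stream can be cut into device blocks like this:

  # Split a data stream into fixed-size blocks, as the tape device sees them.
  BLOCK_SIZE = 256 * 1024   # 256 KB, the value configured for the Ultrium 960

  def blocks(stream, block_size=BLOCK_SIZE):
      """Yield successive fixed-size blocks; the final block may be shorter."""
      while True:
          block = stream.read(block_size)
          if not block:
              return
          yield block

  # Usage sketch: for block in blocks(open("payload.bin", "rb")): write(block)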
Segment size
A medium is divided into data segments, catalog segments, and a header segment. Header
information is stored in the header segment, which is the same size as the block size. Data is stored in
data blocks of data segments. Information about each data segment is stored in the corresponding
catalog segment. This information is first stored in the Media Agent memory and then written to a
catalog segment on the medium as well as to the IDB.
All segments are divided by file marks as shown in Figure 47.
Figure 47. Data Protector medium—data segments, catalog segment, header segment, and file marks
Segment size, measured in megabytes, is the maximum size of data segments. If you back up a large
number of small files, the actual segment size can be limited by the maximum size of catalog
segments. Segment size is user configurable for each device and influences the restore performance.
You can adjust the segment size while configuring a new device or when changing the device
properties using the Advanced options for the device.
Optimal segment size depends on the media type used in the device and the kind of data to be
backed up. The average number of segments per tape is 50. The default segment size can be
calculated by dividing the native capacity of a tape by 50. The maximum catalog size is limited to a
fixed number (12 MB) for all media types.
Data Protector finishes a segment when the first limit is reached. When backing up a large number of
small files, the media catalog limit could be reached faster, which could result in smaller segment
sizes.
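The sizing rule above can be illustrated with a short calculation; the 400-GB figure is the native capacity of Ultrium 960 (LTO-3) media:

  # Default segment size = native tape capacity / 50. The catalog segment
  # is capped at 12 MB for all media types, which can shrink the effective
  # segment size when millions of small files are backed up.
  def default_segment_size_mb(native_capacity_gb):
      return native_capacity_gb * 1024 / 50

  print(f"{default_segment_size_mb(400):.0f} MB")   # Ultrium 960 media -> 8192 MB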
Figures 48 and 49 illustrate the results of backups with different segment sizes. Larger segment sizes improved the backup performance but also resulted in some additional CPU load.
Figure 48. Performance of HP-UX local backup to Ultrium 960 with segment sizes 10–10,000 MB (10 MB: 28 MB/s; 100 MB: 147 MB/s; 1,000 MB: 160 MB/s; 10,000 MB: 161 MB/s)
Figure 49. CPU load of HP-UX local backup to Ultrium 960 with segment sizes 10–10,000 MB (10 MB: 3%; 100 MB: 16%; 1,000 MB: 17%; 10,000 MB: 17%)
Note
Larger segment sizes improve the backup performance but could have a negative impact on the restore performance, because data blocks are found faster when restoring from a backup written with smaller segment sizes.
Disk Agent buffers
Data Protector Media Agents and Disk Agents use memory buffers to hold data waiting to be
transferred. This memory is divided into a number of buffer areas (one for each Disk Agent and
depending on device concurrency). Each buffer area consists of eight Disk Agent buffers (of the same
size as the block size configured for the device).
This value can be changed while configuring a new device or when changing the device properties
using the advanced options for the device, although this is rarely necessary. There are two basic
reasons to change this setting:
• Shortage of memory
If there is a shortage of memory, the shared memory required for a Media Agent can be calculated as follows (see the example after this list):
Media Agent Shared Memory = Disk Agent Concurrency x Number of Buffers x Block Size
• Streaming
If the available network bandwidth varies significantly during backup, it is important that a Media
Agent has enough data ready for writing to keep the device in the streaming mode. In this case,
increasing the number of buffers could help.
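As an example of the shared-memory formula, here is the calculation for the Ultrium 960 defaults used in this paper (illustrative arithmetic only):

  # Media Agent shared memory = concurrency x number of buffers x block size.
  concurrency = 4        # default Disk Agent concurrency for the Ultrium 960
  num_buffers = 8        # default number of Disk Agent buffers
  block_size_kb = 256    # block size configured for the Ultrium 960

  shared_memory_mb = concurrency * num_buffers * block_size_kb / 1024
  print(f"{shared_memory_mb:.0f} MB of shared memory")   # -> 8 MB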
Figures 50 and 51 demonstrate that different numbers of Disk Agent buffers had no real performance impact in this local backup scenario. The Media Agent always had enough data ready for writing.
Figure 50. Performance of HP-UX local backup to Ultrium 960 with 1–32 Disk Agent buffers (1 buffer: 158 MB/s; 8: 156 MB/s; 16: 160 MB/s; 24: 160 MB/s; 32: 161 MB/s)
Figure 51. CPU load of HP-UX local backup to Ultrium 960 with 1–32 Disk Agent buffers (1 buffer: 17%; 8: 16%; 16: 16%; 24: 17%; 32: 17%)
Hardware compression
Most modern backup devices provide built-in hardware compression that can be enabled by selecting the corresponding device file or SCSI address in the device configuration procedure. Hardware compression increases the speed at which a tape drive can receive data, because less data is written to the tape.
Consider the following regarding hardware compression:
• Do not use software and hardware compression simultaneously, because double compression decreases performance without giving better compression results.
• Keep the software compression option disabled when an Ultrium drive is configured with Data Protector.
When configuring a device, the SCSI address can be selected from the dropdown list. Data Protector
automatically determines whether the device can use hardware compression.
Figure 52 illustrates which devices are detected during the automatic configuration process for an
Ultrium 960 drive. In this example, the hardware compression is selected. The end of the
device/drive SCSI address is extended with the “C” option: Tape0:0:6:0C.
Figure 52. HP StorageWorks Ultrium 960 automatic configuration—SCSI address for hardware compression
Figure 53 shows, at the bottom left, that with the correct SCSI address selected, the Hardware compression option is grayed out.
Figure 53. HP StorageWorks Ultrium 960 configuration—Hardware compression enabled
Tuning Data Protector performance for Microsoft Exchange
Server 2003
Data Protector’s backup and restore performance for Microsoft Exchange Server 2003 can be
improved by modifying its configuration parameters for backups. All tests were executed with the
NULL device to remove the tape drive as a bottleneck.
Figure 54 demonstrates how efficient a higher concurrency value could be. All tests were executed
with the same buffer size of 64 KB because this is the minimum value for a concurrency of four
(4 x 16 KB). The minimum buffer for each stream (concurrency = 1) is 16 KB.
Figure 54. Performance of MS Exchange Server 2003 local backup to Ultrium 960 with concurrency 1–4
Note
For the test with the concurrency of four, an additional NULL device had to be configured, because the maximum concurrency of one device is two for the Exchange Server integration.
Figure 55 shows that a higher concurrency caused a higher CPU load. However, this slight increase can be disregarded when compared with the remarkable performance increase shown in Figure 54.
Figure 55. CPU load of MS Exchange Server 2003 local backup to Ultrium 960 with concurrency 1–4
Figure 56 demonstrates that different buffer sizes resulted in similar backup performance. The default
buffer size of 32 KB (per backup device) was already a good choice.
Figure 56. Performance of MS Exchange Server 2003 local backup to Ultrium 960 with buffer size 32–1,024 KB (32 KB: 227 MB/s; 64 KB: 227 MB/s; 128 KB: 222 MB/s; 256 KB: 224 MB/s; 512 KB: 230 MB/s; 1,024 KB: 227 MB/s)
Figure 57 shows that different buffer sizes resulted in the same CPU load (12%). This was expected
because there was only a small performance difference as demonstrated in Figure 56.
Figure 57. CPU load of Exchange Server 2003 local backup to Ultrium 960 with buffer size 32–1,024 KB
Tuning recommendations
Due to the high number of variables and permutations, it is not possible to give universal recommendations that fit all user requirements and investment levels. However, the following should be considered when trying to improve backup or restore performance:
• Ensure that the server is sized for the backup requirements. For example, fast tape devices like the
HP StorageWorks Ultrium 960 tape drive should be placed on a dedicated 133-MHz PCI-X bus
and not share the bus with other HBAs like a Gigabit network adapter.
• During high-performance backups, Windows memory problems could occur. This is a general
Windows kernel problem and applies to almost all backup applications—including NTBACKUP.
The Data Protector error message looks like:
[Major] From: VBDA@tpc131.bbn.hp.com "G:" Time: 29.03.2005 19:08:57
[81:78] G:\1\file67108864_000003
Cannot read 57256 bytes at offset 0(:1): ([1450] Insufficient system resources
exist to complete the requested service.).
74
Page 75
A Microsoft article (Q304101) explains the circumstances that can lead to this error and how to fix it.
Note
It is suggested to follow the instructions in Microsoft article Q304101, where causes and settings for avoiding memory problems are described. The article is available from the Microsoft support and knowledge base (see Appendix A).
• If backup devices are SCSI- or SAN-attached, software compression should be disabled. On the
other hand, software compression could make sense if executing network backups across slow LAN
(100 Mb/s) environments. This could provide a better backup performance but it will also cause
high CPU loads on the client server.
• If backing up typical files directly to the SCSI-attached Ultrium 960 tape drive, parallel backups
(multiplexing/concurrency) are recommended because one single stream cannot fully utilize the
tape drive. Data Protector’s default concurrency is four.
• If backing up typical files by way of Gigabit Ethernet HBAs to a remote Ultrium 960 tape drive,
backups without multiplexing/concurrency should be considered. The network is often the
bottleneck and enabling multiplexing/concurrency does not really improve the performance.
• For Windows NTFS file systems with millions of small files, parallel backups with a high concurrency are recommended. These kinds of file systems are very slow and should be multiplexed. Instead of the fast Ultrium 960 tape drive, a slower Ultrium tape drive or backup-to-disk technology should be considered.
• For Windows NTFS file systems with millions of small files, double tree walks should be disabled. The first tree walk briefly scans the files selected for the backup and calculates their size, so that the percentage done can be calculated during the backup. The second tree walk is executed during the actual file backup. On these particular systems, it is recommended to set the option "NoTreeWalk=1" in the Data Protector template "<Data_Protector_home>\omnirc" (see the example after this list).
• For Windows, it is recommended that file systems with millions of small files are restored with a single stream only, due to problems with an overflow of the Windows system paged pool during parallel restores. This cannot be solved without Windows kernel tuning, which is not a focus of this white paper.
• For Exchange Servers, the recommended device concurrency is two for devices connected directly
and one for devices connected remotely (backup server).
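For reference, disabling the first tree walk, as recommended above, is a single line in the omnirc template; the snippet below is an illustrative fragment only:

  # In <Data_Protector_home>\omnirc — disable the first file system tree walk
  # (the percentage-done display is lost, but the backup of millions of
  # small files starts sooner):
  NoTreeWalk=1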
Appendix A. Reference documentation
HP documents and links
Storage
• HP StorageWorks Enterprise Virtual Array configuration best practices
• Enterprise Backup Solution (EBS) design guide and compatibility matrix
http://www.hp.com/go/ebs
Tools
• Performance troubleshooting and using performance assessment tools
http://www.hp.com/support/pat
• Library and tape tools
http://www.hp.com/support/tapetools
Microsoft documents and links
• Microsoft support and knowledge base
http://support.microsoft.com
Glossary
• ALDC (Advanced Lossless Data Compression)—ALDC is a technique for data compression.
• Cell Manager—The main system in the cell where the essential software of Data Protector is installed
and from which all backup and restore activities are managed.
• CSI (Command Screen Interface)—CSI is a non-graphical user interface, which is based on a menu structure.
• DAS (direct attached storage)—DAS is a digital storage system directly attached to a server or workstation, without a storage network in between.
• Disk Agent—The Disk Agent is a Data Protector component needed on a client to back it up and restore it. The Disk Agent controls reading from and writing to a disk. During a backup session, the Disk Agent reads data from a disk and sends it to the Media Agent, which then moves it to the device. During a restore session, the Disk Agent receives data from the Media Agent and writes it to the disk.
• GUI (graphical user interface)—A Data Protector–provided GUI is a cross-platform (HP-UX, Solaris, Windows) graphical user interface, for easy access to all configuration, administration, and operation tasks.
• HBA (host bus adapter)—An HBA connects a host system (the computer) to other network and storage devices.
• IDB—The Data Protector IDB is an internal database, located on the Cell Manager, that keeps information regarding what data is backed up, on which media it resides, the result of backup, restore, copy, object consolidation, and media management sessions, and which devices and libraries are configured.
• LAN (local area network)—A computer network covering a small geographic area, like a home, office, or group of buildings.
• MB (megabyte)—1 megabyte = 1,048,576 bytes (2^20 bytes).
• Mb (megabit)—1 megabit = 1,000,000 bits (10^6 bits).
• Media Agent—A Data Protector process that controls reading from and writing to a device, which
reads from or writes to a medium (typically a tape). During a backup session, a Media Agent
receives data from the Disk Agent and sends it to the device for writing it to the medium. During a
restore session, a Media Agent locates data on the backup medium and sends it to the Disk Agent.
The Disk Agent then writes the data to the disk. A Media Agent also manages the robotics control
of a library.
• PCIe (PCI Express)—PCI Express, officially abbreviated as PCIe (and sometimes confused with PCI Extended, which is officially abbreviated as PCI-X), is a computer system bus/expansion card interface format.
• SAN (storage area network)—A SAN is an architecture to attach remote computer storage devices such as disk array controllers and tape libraries to servers in such a way that, to the operating system, the devices appear as locally attached.
• SPOF (single point of failure)—SPOF describes any part of the system that can, if it fails, cause an interruption of required service. This can be as simple as a process failure or as catastrophic as a computer system crash.
Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation.
Intel and Xeon are trademarks of Intel Corporation in the U.S. and other countries.
UNIX is a registered trademark of The Open Group. Oracle is a registered U.S.
trademark of Oracle Corporation, Redwood City, California.
4AA1-3836ENW, July 2007