
QuickSpecs
HP TruCluster Server V5.1B-1

Overview

HP TruCluster Server Version 5.1B-1 for HP Tru64 UNIX Version 5.1B-1 provides highly available and scalable solutions for mission-critical computing environments. TruCluster Server delivers powerful but easy-to-use UNIX clustering capabilities, allowing AlphaServer systems and storage devices to operate as a single virtual system.
By combining the advantages of symmetric multiprocessing (SMP), distributed computing, and fault resilience, a cluster running TruCluster Server offers high availability while providing scalability beyond the limits of a single system. On a single-system server, a hardware or software failure can severely disrupt a client's access to critical services. In a TruCluster Server cluster, a hardware or software failure on one member system results in the other members providing these services to clients.
TruCluster Server reduces the effort and complexity of cluster administration by extending single-system management capabilities to clusters. It provides a clusterwide namespace for files and directories, including a single root file system that all cluster members share. A common cluster address (cluster alias) for the Internet protocol suite (TCP/IP) makes the cluster appear as a single system to its network clients while load balancing client connections across member systems.
A single system image allows a cluster to be managed more easily than distributed systems. TruCluster Server cluster members share a single root file system and common system configuration files. Therefore, most management tasks need to be done only once for the entire cluster rather than repeatedly for each cluster member. The cluster can be managed either locally from any of its members or remotely using Tru64 UNIX Web-based management tools. Tru64 UNIX and TruCluster Server software, and applications, are installed only once. Most network applications, such as the Apache Web server, need to be configured only once in the cluster and can be managed more easily in a cluster than on distributed systems.
A choice of graphical, Web-based, or command-line user interfaces makes management tasks easier for the administrator, flexible for those with large configurations, and streamlined for expert users.
TruCluster Server facilitates deployment of services that remain highly available even though they have no embedded knowledge that they are running in a cluster. Applications can access their disk data from any cluster member. TruCluster Server also provides support for components of distributed applications to run in parallel, providing high availability while taking advantage of cluster-specific synchronization mechanisms and performance optimizations.
TruCluster Server allows the processing components of an application to concurrently access raw devices or files, regardless of where the storage is located in the cluster. Member-private storage and clusterwide shared storage are equally accessible to all cluster members. Using either standard UNIX file locks or the distributed lock manager (DLM), an application can synchronize clusterwide access to shared resources, maintaining data integrity.
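The file-lock path mentioned above can be sketched with an ordinary POSIX advisory lock; the example below (shown in Python's fcntl wrapper) is a generic illustration, not TruCluster-specific code, and the file path is arbitrary:

```python
# Minimal sketch: serializing access to shared data with a standard
# POSIX advisory lock (via Python's fcntl wrapper). Under CFS the same
# fcntl locks are honored clusterwide; nothing below is
# TruCluster-specific, and the file path is arbitrary.
import fcntl
import os

def update_shared(path, data):
    """Append to a shared file under an exclusive advisory lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # blocks until the lock is granted
        os.write(fd, data)               # critical section: update shared data
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)   # release for other processes/members
        os.close(fd)

update_shared("/tmp/shared.dat", b"record\n")
```

Because the lock is advisory, every cooperating process must take it before touching the shared file; that is exactly the discipline the paragraph above describes for clusterwide data integrity.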
TruCluster Server is an efficient and reliable platform for providing services to networked clients. To a client, the cluster appears to be a powerful single-server system; a client is impacted minimally, if at all, by hardware and software failures in the cluster. TruCluster Server simplifies the mechanisms of making applications highly available. A cluster application availability (CAA) facility records the dependencies of, and transparently monitors the state of, registered applications. If a hardware or software failure prevents a system from running a service, the failover mechanism automatically relocates the service to a viable system in the cluster, which maintains the availability of applications and data. Administrators can manually relocate applications for load balancing or hardware maintenance.
TCP-based and UDP-based applications can also take advantage of the cluster alias subsystem. These applications, depending on their specific characteristics, can run on a single cluster member or simultaneously on multiple members. The cluster alias subsystem routes client requests to any member participating in that cluster alias. During normal operations, client connections are dynamically distributed among multiple service instances according to administrator-provided metrics.
TruCluster Server supports a variety of hardware configurations that are cost-effective and meet performance needs and availability requirements. Hardware configurations can include different types of systems and storage units, and can be set up to allow easy maintenance. In addition, administrators can set up hardware configurations that allow the addition of a system or storage unit without shutting down the cluster.
For the fastest communication with the lowest latency, use the PCI-based Memory Channel cluster interconnect for communication between cluster members. TruCluster Server Version 5.1B-1 also supports 100 Mbps or 1000 Mbps Ethernet hardware as a private LAN cluster interconnect. The LAN interconnect is suitable for clusters with low-demand workloads, such as failover-style clusters of highly available applications in which limited application data is shared between the nodes over the cluster interconnect. Refer to the Cluster Technical Overview manual for a discussion of the merits of each cluster interconnect.
DA - 11444 Worldwide — Version 4 — December 8, 2003
Page 1

Features - TruCluster Server V5.1B-1

Cluster members in a given cluster must all use Memory Channel or must all use LAN. These interconnects cannot be mixed for cluster communication in the same cluster. Using multiple shared buses and redundant Memory Channel or LAN interconnect hardware promotes no-single-point-of-failure (NSPOF) characteristics for mission-critical applications.
A TruCluster Server cluster acts as a single virtual system, even though it is made up of multiple systems. Cluster members can share resources, data, and clusterwide file systems under a single security and management domain, yet they can boot or shut down independently without disrupting the cluster.
Cluster File System
The Cluster File System (CFS) makes all files, including the root (/), /usr, and /var file systems, visible to and accessible by all cluster members. It does not matter whether a file is stored on a device connected to all cluster members or on one that is private to a single member. Each file system is served by a single cluster member; other members access that file system as CFS clients with significant optimizations for shared access. CFS preserves full X/Open and POSIX semantics for file system access and maintains cache coherency across cluster members. For instance, an application can use standard UNIX file locks to synchronize access to shared files. The member that will serve a given file system can be specified at file system mount time.
For higher performance, applications can use direct I/O through the file system to bypass the buffer cache. CFS also provides a load-balancing daemon (cfsd) that monitors and analyzes file system usage and can make relocation recommendations. The daemon can be configured to automatically relocate file systems based on CFS memory usage, during cluster transitions such as when members join or leave, or when storage connectivity changes.
CFS supports the Advanced File System (AdvFS) for both read and write access, including AdvFS with BSD-type user and group quotas. NFS client and NFS server are supported for both read and write access, and NFS services are accessible to clients through alternate cluster aliases in addition to the default cluster alias. The UNIX File System (UFS) is supported for read and write access from the CFS server, and for read-only access on client members. The Memory File System (MFS) is supported for either read-only or read-and-write access by the member on which the file system is mounted; remote access and failover are not supported for MFS. The CD-ROM File System (CDFS) and Digital Video Disc File System (DVDFS) are supported for read access only.
Device Request Dispatcher

The device request dispatcher supports clusterwide access to character and block disk devices, and to tape and tape changer devices. All local and remote cluster disk and tape I/O passes through the device request dispatcher. A member does not need a direct connection to a disk, tape, or tape changer device to access data on that device. This permits great flexibility in selecting a hardware configuration that is both economical and useful.

Cluster Alias
A cluster alias is an IP address that makes the cluster look like a single system to clients and other hosts on the network. Cluster aliases free clients from having to connect to specific members for services. If the member providing the service goes down, a client reconnects to another member elected by the cluster alias to provide the service. With applications that run concurrently on multiple members, scaling is achieved by permitting multiple clients to connect to instances of the service on multiple cluster members, each using a cluster alias to address the service.
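From a client's point of view, addressing a service through a cluster alias is ordinary socket programming against a single host name. A minimal sketch follows; the alias name in the comment is hypothetical, not taken from this document:

```python
# Sketch of a client reaching a service through a cluster alias. To the
# client, the alias is a single host name; the cluster alias subsystem
# decides which member actually accepts the connection.
# "mycluster.example.com" is a hypothetical alias name.
import socket

def connect_to_cluster(alias, port, timeout=5.0):
    """Open a TCP connection to whichever member the alias routes to."""
    return socket.create_connection((alias, port), timeout=timeout)

# Usage (not executed here):
#   sock = connect_to_cluster("mycluster.example.com", 80)
```

The point is that no cluster-specific client code is needed: if the member serving the connection fails, the client simply reconnects to the same alias.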
The cluster alias subsystem provides an optional virtual MAC (vMAC) address that can be associated with each cluster alias IP address. When configured, the same MAC address is used in all Address Resolution Protocol (ARP) responses for the cluster alias address, independent of which cluster node is responding to cluster alias ARP requests. This permits faster failover when a new node assumes responsibility for responding to cluster alias ARP requests.
Flexible Network Configuration

TruCluster Server offers flexible network configuration options. Cluster members do not need identical routing configurations. An administrator can enable IP forwarding and configure a cluster member as a full-fledged router. Administrators can use routing daemons such as gated or routed, or they can configure a cluster member to use only static routing. When static routing is used, administrators can configure load balancing between multiple network interface cards (NICs) on the same member. Whether gated, routed, or static routing is used, in the event of a NIC failure the cluster alias reroutes network traffic to another member of the cluster. As long as the cluster interconnect is working, cluster alias traffic can get in or out of the cluster.

Cluster Application Availability Facility
The cluster application availability (CAA) facility delivers the ability to deploy highly available single-instance applications in a cluster by providing resource monitoring and application relocation, failover, and restart capabilities. CAA is used to define which members can run a service, the criteria under which to relocate a service, and the location of an application-specific action script. Monitored resources include network adapters, tape devices, media changers, and applications. CAA allows services to manage and monitor resources by using entry points within their action scripts. Applications do not need to be modified in any way to utilize CAA.
Administrators can request that CAA reevaluate the placement of registered applications within the cluster, either at a regularly scheduled time or on demand with the caa_balance command. Balancing decisions are based on the standard CAA placement mechanisms. Similarly, administrators can request that CAA schedule an automatic failback of a resource for a specific time. This allows an administrator to benefit from CAA automatically moving a resource to the most-favored cluster member without the worry of the relocation occurring at a critical time. The caa_report utility provides a report of availability statistics for application resources. Administrators can redirect the output of CAA resource action scripts so that it is visible during execution. Lastly, user-defined attributes can be added to a resource profile; they are available to the action script when it executes.
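Real CAA action scripts are typically shell scripts supplied by the administrator; the sketch below only models the start/stop/check entry-point contract described above, in Python, and the daemon command and pid-file path are hypothetical:

```python
# Hedged sketch of the start/stop/check entry-point contract that CAA
# action scripts implement. Real action scripts are usually shell
# scripts; this Python version only illustrates the contract. The
# daemon command and pid-file path are hypothetical.
import os
import subprocess

PIDFILE = "/var/run/mydaemon.pid"       # hypothetical
DAEMON = ["/usr/local/bin/mydaemon"]    # hypothetical

def running(pidfile=PIDFILE):
    """Return True if the pid recorded in pidfile is alive."""
    try:
        pid = int(open(pidfile).read().strip())
        os.kill(pid, 0)                 # signal 0: existence check only
        return True
    except (OSError, ValueError):
        return False

def main(action):
    """Dispatch a CAA entry point; return the script's exit status."""
    if action == "check":
        return 0 if running() else 1
    if action == "start":
        proc = subprocess.Popen(DAEMON)
        with open(PIDFILE, "w") as f:
            f.write(str(proc.pid))
        return 0
    if action == "stop":
        if running():
            os.kill(int(open(PIDFILE).read().strip()), 15)
        return 0
    return 2                            # unknown entry point

if __name__ == "__main__":
    # A real action script would exit with main()'s return value.
    print("check status:", main("check"))
```

CAA invokes the script's entry points to start, stop, and probe the application, and uses the exit status to decide whether relocation or restart is needed.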
Rolling Upgrade
TruCluster Server allows a rolling upgrade from the previous version of the base operating system and TruCluster software to the next release of each, and also allows patches to be rolled into the cluster. Updating the operating system and cluster software does not require a shutdown of the entire cluster. A utility is provided to roll the cluster in a controlled and orderly fashion, and the upgrade procedure allows the status of the upgrade to be monitored while it is in progress. Clients accessing services are not aware that a rolling upgrade is in progress.
To speed the process of upgrading the cluster, the administrator can use the parallel rolling upgrade procedure that upgrades more than one cluster member at a time in qualifying configurations.
Administrators looking for a quicker alternative to a rolling upgrade when installing patches have the option of a patch procedure that favors upgrade speed over cluster high availability. After the first member receives the patch, all remaining members receive it at the same time, and the entire cluster is then rebooted as a single operation.
See the Cluster Installation manual for recommended and supported paths to upgrade or roll to the latest version of TruCluster Server.

Cluster Management

The SysMan system management utilities provide a graphical view of the cluster configuration and can be used to determine the current state of availability and connectivity in the cluster. The administrator can invoke management tools from SysMan, allowing the cluster to be managed locally or remotely.

Clusterwide signaling allows applications to send UNIX signals to processes running on other members.
Performance Management

The performance management capability of Tru64 UNIX has been restructured from one large performance management tool (pmgr) into several smaller and more versatile tools. The performance management tool suite consists of collect, collgui, and two Simple Network Management Protocol (SNMP) agents (pmgrd and clu_mibs).

The collect tool gathers operating system and process data under Tru64 UNIX Versions 4.x and 5.x. Data can be collected for any subset of the subsystems (Process, Memory, Disk, LSM Volumes, Network, CPU, Filesystems) and Header. Collect is designed for high reliability and low system-resource overhead. Accompanying collect are two highly integrated tools: collgui, a graphical front end, and cfilt, which allows arbitrary extraction of data from the output of collect to standard output. Collgui is a laborsaving tool that allows a user to quickly analyze collect data. The Performance Manager metrics server (pmgrd) is a UNIX daemon process that provides general UNIX performance metrics on request. The pmgrd metrics server supports the extensible SNMP agent mechanism (eSNMP).

Cluster MIB

TruCluster Server supports the HP Common Cluster MIB. HP Insight Manager uses this Cluster MIB to discover cluster member relationships, and to provide a coherent view of clustered systems across supported platforms.

Highly Available NFS Server

When configured as an NFS server, a TruCluster Server cluster can provide highly available access to the file systems it exports. No special cluster management operations are required to configure the cluster as a highly available NFS server. In the event of a system failure, another cluster member becomes the NFS server for the file system, transparently to external NFS clients. NFS file locking is supported, as are both NFS V2 and V3 with UDP and TCP.
TruCluster Server allows NFS file systems to be served from the cluster through both the default cluster alias and alternate aliases. Alternate cluster aliases can be defined to limit NFS server activity to those members that are actually connected to the storage that contains the exported file systems. NFS clients can use an alternate alias when they mount the file systems served by the cluster.

Fast File System Recovery
The Advanced File System (AdvFS) log-based file system provides higher availability and greater flexibility than traditional UNIX file systems. AdvFS journaling protects file system integrity. TruCluster Server supports AdvFS for both read and write access.
An optional, separately licensed product, the Advanced File System Utilities, performs online file system management functions. See the OPTIONAL SOFTWARE section of this document for more information on the AdvFS utilities.
Increased Data Integrity
Tru64 UNIX Logical Storage Manager (LSM) is a cluster-integrated, host-based solution to data storage management. In a TruCluster Server cluster, LSM operations continue despite the loss of cluster members, as long as the cluster itself continues operation and a physical path to the storage is available. LSM disk groups can be used simultaneously by all cluster members and the LSM configuration can be managed from any cluster member.
Basic LSM functionality, including disk spanning and concatenation, is provided with the Tru64 UNIX operating system. Extended functions, such as striping (RAID 0), mirroring (RAID 1), and online management, are available with a separate license. Mirroring of LSM is RAID Advisory Board (RAB) certified for RAID Levels 0 and 1.
LSM is supported in a TruCluster Server cluster for any volume, including swap and the cluster root, but excluding the quorum disk and member boot disks. Hardware mirroring is supported for all volumes in a cluster without exception.
LSM RAID 5 volumes are not supported in clusters. See the OPTIONAL SOFTWARE section of this document for more information on LSM.
Global Error Logger & Event Manager
TruCluster Server can log messages about events that occur in the TruCluster environment to one or more systems. Cluster administrators can also receive notification through electronic mail when critical problems occur.
Cluster Storage I/O Failover
TruCluster Server provides two levels of protection in the event of storage interconnect failure. When configured with redundant storage adapters, the storage interconnect will be highly available. Should one interconnect fail, traffic will transparently fail over to the surviving adapter. When a member system is connected to shared storage with a single storage interconnect and it fails, transactions are transparently performed via the cluster interconnect to another cluster member with a working storage interconnect.
Cluster Client Network Failover
TruCluster Server supports highly available client network interfaces via the Tru64 UNIX redundant array of independent network adapters (NetRAIN) feature.
Cluster Interconnect Failover
TruCluster Server allows the elimination of the cluster interconnect as a single point of failure by supporting redundant cluster interconnect hardware. You can configure dual-rail Memory Channel, allowing the cluster to survive the failure of a single rail. For LAN interconnect, two or more network adapters on each member are configured as a NetRAIN virtual interface. When properly configured across two or more switches, the cluster will survive any LAN component failure. This not only guards against rare network hardware failures, but also facilitates the upgrade and maintenance of the network without disrupting the cluster.
Support for Parallelized Database Applications
TruCluster Server provides the software infrastructure that supports parallelized database applications, such as Oracle 9i Real Application Clusters (RAC) and Informix Extended Parallel Server (XPS), in achieving high performance and high availability. 9i RAC and XPS are offered and supported separately by Oracle Corporation and Informix Software, Inc., respectively.
Distributed Lock Manager
The distributed lock manager (DLM) synchronizes access to resources that are shared among cooperating processes throughout the cluster. DLM provides a software library with an expansive set of lock modes that applications use to implement complex resource-sharing policies. DLM provides services to notify a process owning a resource that it is blocking another process requesting the resource. An application can also use DLM routines to efficiently coordinate the application's activities within the cluster.
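The DLM's "expansive set of lock modes" follows the classic six-mode compatibility matrix (null, concurrent read, concurrent write, protected read, protected write, exclusive). The sketch below models only the grant decision under that standard matrix; it is illustrative policy code, not the DLM library API:

```python
# Illustrative model of the six classic DLM lock modes and their
# compatibility matrix (NL=null, CR=concurrent read, CW=concurrent
# write, PR=protected read, PW=protected write, EX=exclusive).
# This models the grant policy only; it is not the DLM library API.
COMPAT = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}

def can_grant(requested, held_modes):
    """A new lock is granted only if it is compatible with every mode
    already held on the resource by other lock holders."""
    return all(requested in COMPAT[h] for h in held_modes)
```

For example, two members can both hold protected-read (PR) locks on a resource, but a protected-write (PW) request must wait until the PR locks are released; richer sharing policies are built by combining these modes.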
Support for Memory Channel API
TruCluster Server provides a special application programming interface (API) library that gives applications access to Memory Channel data transfer and locking functions. This Memory Channel API library enables highly optimized applications that require high-performance data delivery over the Memory Channel interconnect, and it is supported solely for use with Memory Channel.
High performance within the cluster is achieved by providing user applications with direct access to the capabilities of the Memory Channel. For example, a single store instruction on the sending host is sufficient for the data to become available for reading in the memory of another host.
The Memory Channel API library allows a programmer to create and control access to regions of the clusterwide address space by specifying UNIX style protections. Access to shared data can be synchronized using Memory Channel spin locks for clusterwide locking.
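As a rough, portable analogy for this programming model (a shared region guarded by a spin lock), the sketch below uses two local processes in place of two cluster members; it does not use the Memory Channel API itself, and all names are illustrative:

```python
# Rough, portable analogy for the Memory Channel programming model: a
# shared region guarded by a spin lock. Two local processes stand in
# for two cluster members; this does NOT use the Memory Channel API,
# and all names are illustrative.
import ctypes
import multiprocessing as mp

def writer(counter, lock, iters):
    """Increment the shared counter, spinning on the lock each time."""
    for _ in range(iters):
        while not lock.acquire(block=False):   # busy-wait, spin-lock style
            pass
        counter.value += 1                     # the guarded "store"
        lock.release()

def run_demo(nprocs=2, iters=1000):
    counter = mp.Value(ctypes.c_long, 0, lock=False)   # the shared region
    lock = mp.Lock()
    procs = [mp.Process(target=writer, args=(counter, lock, iters))
             for _ in range(nprocs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value

if __name__ == "__main__":
    print(run_demo())   # 2 workers x 1000 guarded increments -> 2000
```

With Memory Channel, the analogous store becomes visible in the memory of the other host without a system call, which is why spin locks are a natural fit for clusterwide synchronization there.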
The Memory Channel API library facilitates highly optimized implementations of Parallel Virtual Machine (PVM), Message Passing Interface (MPI), and High Performance Fortran (HPF), providing seamless scalability from SMP systems to clusters of SMP machines. This provides the programmer with comprehensive access to the current and emerging de facto standard software development tools for parallel applications while supporting portability of existing applications without source code changes.
NOTE: To users of the Memory Channel API V1.6 product on Tru64 UNIX Version 4.0*: On Tru64 UNIX V5.*, the TruCluster Memory Channel Software V1.6 product is bundled as a feature of TruCluster Server V5.*. To run the Memory Channel API library on Tru64 UNIX Version 5.1B-1, you must install a TruCluster Server license to configure a valid TruCluster Server cluster.

NOTE: To users of the Memory Channel application programming interface with a Memory Channel virtual hub (vhub) configuration: The Memory Channel API is not supported for data transfers larger than 8 KB when loopback mode is enabled in two-member clusters configured with an MC virtual hub. For more information on loopback mode, go to http://www.tru64unix.hp.com/docs/updates/TCR51A/TITLE.HTM and refer to the 21.Mar.2002 issue titled "MC API Applications May Not Use Transfers Larger Than 8 KB with Loopback Mode Enabled on Clusters Utilizing Virtual Hubs."
Connection Manager
The connection manager is a distributed kernel component that ensures that cluster members communicate with each other and enforces the rules of cluster membership. The connection manager forms a cluster, and adds and removes cluster members. It tracks whether members in a cluster are active and maintains a cluster membership list that is consistent on all cluster members.
Support for Fibre Channel Solutions
TruCluster Server supports the use of switched Fibre Channel storage and Fibre Channel arbitrated loop. Compared with parallel SCSI storage, Fibre Channel provides superior performance, greater scalability, higher reliability and availability, and better serviceability; it is also easier to configure, and its longer supported distances permit greater flexibility in configurations. Fibre Channel can be used for clusterwide shared storage, cluster file systems, swap partitions, and boot disks.
Compared with a switched Fibre Channel topology, arbitrated loop offers a lower-cost solution by trading off bandwidth, and therefore some performance. Arbitrated loop is supported for two-member configurations only.
For more information on supported TruCluster Server configurations and specific cabling restrictions when using Fibre Channel, see the Cluster Hardware Configuration manual at the following URL: http://tru64unix.compaq.com/docs/pub_page/cluster_list.html

Enhanced Security with Distributed Authentication

TruCluster Server supports the Enhanced Security option on all cluster members. This includes support for enhanced login checks and password management. Audit and access control list (ACL) support can also be enabled independently of the Enhanced Security option on cluster members.