HP Serviceguard Extended Distance Cluster for Linux A.01.00 Deployment Guide
Manufacturing Part Number: T2808-90006
May 2008 Second Edition
Legal Notices

© Copyright 2006-2008 Hewlett-Packard Development Company, L.P.
Publication Date: 2008

Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license.

The information contained herein is subject to change without notice. The only warranties for HP products a
Contents

1. Disaster Tolerance and Recovery in a Serviceguard Cluster
   Evaluating the Need for Disaster Tolerance . . . 14
   What is a Disaster Tolerant Architecture? . . . 16
   Understanding Types of Disaster Tolerant Clusters . . . 18
   Extended Distance Clusters
   Creating a Multiple Disk Device . . . 72
   To Create and Assemble an MD Device . . . 72
   Creating Volume Groups and Configuring VG Exclusive Activation on the MD Mirror . . . 74
   Configuring the Package Control Script and RAID Configuration File . . . 76
   Creating and Editing the Package Control Scripts
Printing History

Table 1 Editions and Releases

Printing Date   Part Number   Edition     Operating System Releases (see Note below)
November 2006   T2808-90001   Edition 1   Red Hat 4 U3 or later
                                          Novell SUSE Linux Enterprise Server 9 SP3 or later
                                          Novell SUSE Linux Enterprise Server 10 or later
May 2008        T2808-90006   Edition 2   Red Hat 4 U3 or later
                                          Novell SUSE Linux Enterprise Server 9 SP3 or later
                                          Novell SUSE Linux Enterprise Server 10 or later

The printing date and part number indicate th
HP Printing Division:
Business Critical Computing Business Unit
Hewlett-Packard Co.
19111 Pruneridge Ave.
Cupertino, CA 95014
Preface

This guide introduces the concept of Extended Distance Clusters (XDC). It describes how to configure and manage HP Serviceguard Extended Distance Clusters for Linux and the associated Software RAID functionality. In addition, this guide includes information on a variety of Hewlett-Packard (HP) high availability cluster technologies that provide disaster tolerance for your mission-critical applications. Serviceguard has supported disaster tolerant clusters on HP-UX for several year
Related Publications

The following documents contain additional useful information:

   Clusters for High Availability: a Primer of HP Solutions, Second Edition. Hewlett-Packard Professional Books: Prentice Hall PTR, 2001 (ISBN 0-13-089355-2)
   Designing Disaster Tolerant HA Clusters Using Metrocluster and Continentalclusters (B7660-900xx)
   HP StorageWorks Cluster Extension EVA user guide
   HP StorageWorks Cluster Extension XP for HP Serviceguard for Linux
   HP Serviceguard for Linux
1 Disaster Tolerance and Recovery in a Serviceguard Cluster

This chapter introduces a variety of Hewlett-Packard high availability cluster technologies that provide disaster tolerance for your mission-critical applications. It is assumed that you are already familiar with Serviceguard high availability concepts and configurations. This chapter covers the following topics:

   “Evaluating the Need for Disaster Tolerance” on page 14
   “
Evaluating the Need for Disaster Tolerance

Disaster tolerance is the ability to restore applications and data within a reasonable period of time after a disaster. Most people think of fire, flood, and earthquake as disasters, but a disaster can be any event that unexpectedly interrupts service or corrupts data in an entire data center: the backhoe that digs too deep and severs a network conne
line inoperable as well as the computers. In this case disaster recovery would be moot, and local failover is probably the more appropriate level of protection. On the other hand, you may have an order processing center that is prone to floods in the winter. The business loses thousands of dollars a minute while the order processing servers are down. A disaster tolerant architecture is appro
What is a Disaster Tolerant Architecture?

In a Serviceguard cluster configuration, high availability is achieved by using redundant hardware to eliminate single points of failure. This protects the cluster against hardware faults, such as the node failure in Figure 1-1.

Figure 1-1 High Availability Architecture [diagram labels: node 1, node 2, pkg A, pkg B, disks, mirrors; node 1 fails]
impact. For these types of installations, and many more like them, it is important to guard not only against single points of failure, but against multiple points of failure (MPOF), or against single massive failures that cause many components to fail, such as the failure of a data center, of an entire site, or of a small area. A data center, in the context of disaster recovery, is a physical
Understanding Types of Disaster Tolerant Clusters

To protect against multiple points of failure, cluster components must be geographically dispersed: nodes can be put in different rooms, on different floors of a building, or even in separate buildings or separate cities. The distance between the nodes depends on the types of disaster from which you need protection, and on the tec
architecture are followed. Extended distance clusters were formerly known as campus clusters, but that term is not always appropriate because the supported distances have increased beyond the typical size of a single corporate campus. The maximum distance between nodes in an extended distance cluster is set by the limits of the data replication technology and of the network. An ext
Figure 1-3 Extended Distance Cluster

In the above configuration, the network and FC links between the data centers are combined and sent over common DWDM links. Two DWDM links provide redundancy. When one of them fails, the other may still be active and may keep the two data centers connected. Using the DWDM link, clusters can now be extended to greater distances which were not possib