Pacemaker Administration

Managing Pacemaker Clusters

Edition 2

Written by the Pacemaker project contributors

Legal Notice

Copyright © 2009-2019 The Pacemaker project contributors.
The text of and illustrations in this document are licensed under version 4.0 or later of the Creative Commons Attribution-ShareAlike International Public License ("CC-BY-SA")[1].
In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
In addition to the requirements of this license, the following activities are looked upon favorably:
  1. If you distribute Open Publication works in hardcopy or on CD-ROM, notify the authors by email of your intent to redistribute at least thirty days before your manuscript or media freeze, so that they have time to provide updated documents. The notification should describe any modifications made to the document.
  2. Clearly mark all substantive modifications (including deletions) in the document, or describe them in an attachment to the document.
  3. Finally, although this license does not require it, it is considered good form to offer a free copy of any hardcopy or CD-ROM expression of the work to its author(s).

Abstract

This document has instructions and tips for system administrators who need to manage high-availability clusters using Pacemaker.

Table of Contents

Preface
  1. Document Conventions
    1.1. Typographic Conventions
    1.2. Pull-quote Conventions
    1.3. Notes and Warnings
  2. We Need Feedback!
1. Read-Me-First
1.1. The Scope of this Document
1.2. What Is Pacemaker?
1.3. Cluster Architecture
1.4. Pacemaker Architecture
1.5. Node Redundancy Designs
2. Installing Cluster Software
2.1. Installing the Software
2.2. Enabling Pacemaker
2.2.1. Enabling Pacemaker For Corosync version 2 and greater
3. The Cluster Layer
3.1. Pacemaker and the Cluster Layer
3.2. Managing Nodes in a Corosync-Based Cluster
3.2.1. Adding a New Corosync Node
3.2.2. Removing a Corosync Node
3.2.3. Replacing a Corosync Node
4. Configuring Pacemaker
4.1. Configuration Using Higher-level Tools
4.2. Configuration Using Pacemaker’s Command-Line Tools
4.3. Working with CIB Properties
4.4. Querying and Setting Cluster Options
4.4.1. When Options are Listed More Than Once
4.5. Connecting from a Remote Machine
5. Using Pacemaker Command-Line Tools
5.1. Controlling Command Line Output
5.2. Monitor a Cluster with crm_mon
5.2.1. Styling crm_mon output
5.3. Edit the CIB XML with cibadmin
5.4. Batch Configuration Changes with crm_shadow
5.5. Simulate Cluster Activity with crm_simulate
5.5.1. Replaying cluster decision-making logic
5.5.2. Why decisions were made
5.5.3. Visualizing the action sequence
5.5.4. What-if scenarios
5.6. Manage Node Attributes, Cluster Options and Defaults with crm_attribute and attrd_updater
5.7. Other Commonly Used Tools
6. Troubleshooting Cluster Problems
6.1. Logging
6.2. Transitions
6.3. Further Information About Troubleshooting
7. Upgrading a Pacemaker Cluster
7.1. Pacemaker Versioning
7.2. Upgrading Cluster Software
7.2.1. Complete Cluster Shutdown
7.2.2. Rolling (node by node)
7.2.3. Detach and Reattach
7.3. Upgrading the Configuration
7.4. What Changed in 2.0
7.5. What Changed in 1.0
7.5.1. New
7.5.2. Changed
7.5.3. Removed
8. Resource Agents
8.1. Resource Agent Actions
8.2. OCF Resource Agents
8.2.1. Location of Custom Scripts
8.2.2. Actions
8.2.3. How are OCF Return Codes Interpreted?
8.2.4. OCF Return Codes
8.3. LSB Resource Agents (Init Scripts)
8.3.1. LSB Compliance
A. Revision History
Index

List of Figures

1.1. Example Cluster Stack
1.2. Internal Components
1.3. Active/Passive Redundancy
1.4. Shared Failover
1.5. N to N Redundancy

List of Tables

4.1. Environment Variables Used to Connect to Remote Instances of the CIB
4.2. Extra top-level CIB properties for remote access
5.1. Types of Node Attributes
7.1. Upgrade Methods
7.2. Version Compatibility Table
8.1. Required Actions for OCF Agents
8.2. Optional Actions for OCF Resource Agents
8.3. Types of recovery performed by the cluster
8.4. OCF Return Codes and their Recovery Types

List of Examples

2.1. Corosync configuration file for two nodes myhost1 and myhost2
2.2. Corosync configuration file for three nodes myhost1, myhost2 and myhost3
4.1. XML attributes set for a cib element
4.2. Deleting an option that is listed twice
5.1. Sample output from crm_mon -1
5.2. Sample output from crm_mon -n -1
5.3. Safely using an editor to modify the cluster configuration
5.4. Safely using an editor to modify only the resources section
5.5. Searching for STONITH-related configuration items
5.6. Creating and displaying the active sandbox
5.7. Use sandbox to make multiple changes all at once, discard them, and verify real configuration is untouched
5.8. Simulate cluster response to a given CIB
5.9. Simulate cluster response to current live CIB or shadow CIB
5.10. Generate a visual graph of cluster actions from a saved CIB
5.11. Small Cluster Transition
5.12. Complex Cluster Transition