corosync

HowTo: Create an HA Cluster on CentOS 6.7

Worked on versions

OS: CentOS 6.7
Building the cluster
To build this simple cluster, we need a few basic components:
Resource manager that can start and stop resources (like Pacemaker)
Messaging component which is responsible for communication and membership (like Corosync or Heartbeat)
Optionally: Cluster manager to easily manage the cluster settings on all nodes (like PCS)

Preparation
Start by configuring both cluster nodes with a static IP address and a hostname. Make sure that they are in the same subnet and can reach each other by node name.

1, Local name resolution using /etc/hosts

cat /etc/hosts
10.0.0.11 dir01 dir01.cluster.domain.com
10.0.0.12 dir02 dir02.cluster.domain.com
10.0.0.13 dir03 dir03.cluster.domain.com
10.0.0.10 ldap-ha ldap-ha.cluster.domain.com
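
Before moving on, it can help to confirm that every node name really has an entry in the hosts file on each node. The following is only a minimal sketch and not part of the original setup: the function name missing_hosts is hypothetical, and it inspects the file rather than testing real connectivity.

```shell
# Print every node name that has no entry in the given hosts file.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    file=$1; shift
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the IP.
        awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$file" || echo "$name"
    done
}
```

For example, missing_hosts /etc/hosts dir01 dir02 dir03 ldap-ha prints nothing when all four names are defined.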

2, Disable SELINUX

vi /etc/selinux/config
SELINUX=disabled
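
If the change has to be repeated on several nodes, editing the file by hand can be replaced by a one-line substitution. This is a sketch under assumptions: set_selinux_mode is a hypothetical helper name, and it relies on GNU sed's -i option as shipped with CentOS.

```shell
# Set the SELINUX= line in an SELinux config file to the given mode.
# Usage: set_selinux_mode FILE MODE   (MODE: enforcing, permissive or disabled)
set_selinux_mode() {
    # The pattern anchors on "SELINUX=", so the SELINUXTYPE= line is untouched.
    sed -i "s/^SELINUX=.*/SELINUX=$2/" "$1"
}
```

It would be called as set_selinux_mode /etc/selinux/config disabled. The new mode only takes full effect after a reboot; until then, setenforce 0 switches the running system to permissive mode.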

3, Clean the yum cache and update the server

yum clean all
yum update

Basic firewall settings for all the nodes in the cluster:
When testing the cluster, we can temporarily disable the firewall to make sure that blocked ports aren’t causing unexpected problems.

1, Open UDP-ports 5404 and 5405 for Corosync:

iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT

2, Open TCP-port 2224 for PCS

iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT

3, Allow IGMP-traffic

iptables -I INPUT -p igmp -j ACCEPT

4, Allow multicast-traffic

iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT

5, Save the changes made in iptables and restart

service iptables save
service iptables restart
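
After saving, the rules file written by service iptables save (/etc/sysconfig/iptables on CentOS 6) can be checked for the four cluster rules. The sketch below is hedged: missing_cluster_rules is a hypothetical helper, and the search strings assume the rules are stored the way iptables-save usually prints them, which may differ between versions.

```shell
# Report which of the cluster firewall rules are absent from a rules file
# in iptables-save format (one "-A INPUT ..." line per rule).
# Usage: missing_cluster_rules FILE
missing_cluster_rules() {
    for pat in '--dports 5404,5405' '--dport 2224' '-p igmp' '--dst-type MULTICAST'; do
        # -F: match a fixed string; -e: needed because the pattern starts with a dash
        grep -qF -e "$pat" "$1" || echo "missing: $pat"
    done
}
```

Running missing_cluster_rules /etc/sysconfig/iptables on each node should print nothing if all four rules were saved.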

Installation
1, After setting up the basics, install the packages for the components on all the servers:

yum install corosync pcs pacemaker cman

2, To manage the cluster nodes, we will use PCS. This allows us to have a single interface to manage all cluster nodes. By installing the necessary packages, Yum also created a user, hacluster, which can be used together with PCS to do the configuration of the cluster nodes. Before we can use PCS, we need to configure public key authentication or give the user a password on all the nodes:

echo "hapasswd" | passwd hacluster --stdin

3, Start pacemaker and the pcsd manager on all the nodes

service pacemaker start
service pcsd start
chkconfig pacemaker on
chkconfig pcsd on

4, Create a new corosync multicast configuration with the content given below:

vi /etc/corosync/corosync.conf

compatibility: whitetank
totem {
 version: 2
 # Time (in ms) to wait for a token (1)
 token: 10000
 # How many token retransmits before forming a new
 # configuration
 token_retransmits_before_loss_const: 10
 # How long to wait for join messages in the membership protocol (ms)
 join: 1000
 # How long to wait for consensus to be achieved before starting a new
 # round of membership configuration (ms)
 consensus: 7500
 # Number of messages that may be sent by one processor on receipt of the token
 max_messages: 20
 # Stagger sending the node join messages by 1..send_join ms
 send_join: 45
 # Limit generated nodeids to 31-bits (positive signed integers)
 clear_node_high_bit: yes
 # Turn off the virtual synchrony filter
 vsftype: none
 # Enable encryption (2)
 secauth: on
 # How many threads to use for encryption/decryption
 threads: 0
 # This specifies the redundant ring protocol, which may be
 # none, active, or passive. (3)
 rrp_mode: active

 # The following is a single-ring multicast configuration; a second
 # interface block with ringnumber: 1 would be needed for a redundant ring. (4)
 interface {
  ringnumber: 0
  bindnetaddr: 10.0.0.11
  mcastaddr: 239.255.1.1
  mcastport: 5405
 }
}

amf {
 mode: disabled
}

service {
 # Load the Pacemaker Cluster Resource Manager (5)
 ver: 1
 name: pacemaker
}

aisexec {
 user: root
 group: root
}

logging {
 fileline: off
 to_stderr: yes
 to_logfile: yes
 logfile: /var/log/cluster/corosync.log
 to_syslog: yes
 syslog_facility: daemon
 debug: off
 timestamp: on
 logger_subsys {
 subsys: AMF
 debug: off
 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
 }
}
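
The configuration above sets secauth: on, and with that setting corosync expects a shared key in /etc/corosync/authkey on every node; corosync-keygen creates this file. The original steps do not show how the key is distributed, so the following is only a sketch under assumptions: root SSH access between the nodes, and two hypothetical helpers named run and distribute_authkey. With DRY_RUN=1 the commands are printed instead of executed.

```shell
# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Generate the key on this node if it does not exist yet, then copy it
# to the other nodes. scp -p keeps the restrictive file mode of the key.
# Usage: distribute_authkey NODE...
distribute_authkey() {
    key=/etc/corosync/authkey
    [ -f "$key" ] || run corosync-keygen
    for node in "$@"; do
        run scp -p "$key" "root@$node:$key"
    done
}
```

On dir01 this could be invoked as distribute_authkey dir02, after a first pass with DRY_RUN=1 to review the commands.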

5, Since we will configure all nodes from one point, we need to authenticate on all nodes before we are allowed to change the configuration. Use the previously configured hacluster user and password.

pcs cluster auth dir01 dir02 -u hacluster

From here, we can control the cluster by using PCS from dir01. It’s no longer required to repeat all commands on all the nodes.
Authorisation tokens are stored in the file /var/lib/pcsd/tokens.

Create the cluster and add nodes
1, Start by adding all nodes to a cluster named LDAP-HA-Cluster

pcs cluster setup --name LDAP-HA-Cluster dir01 dir02

2, After creating the cluster and adding the nodes, start the cluster packages from this single point. This starts pacemaker and corosync on all the nodes.

pcs cluster start --all

3, Optionally, depending on requirements, we can enable cluster services to start on boot,

pcs cluster enable --all

To check the status of the cluster after starting it:

pcs status
service pacemaker status
service corosync status

To check the status of the nodes in the cluster

pcs status nodes
corosync-objctl runtime.totem.pg.mrp.srp.members
corosync-cfgtool -s
pcs status corosync

Cluster configuration
1, Check the configuration for errors (at this point there still are some)

crm_verify -L -V

The command above reports that there are still errors in the cluster. The first time, we see an error regarding STONITH (Shoot The Other Node In The Head), a mechanism that ensures you don’t end up with two nodes that both think they are active and claim to be the owner of the service and virtual IP, also called a split-brain situation. Since we have a simple cluster, we’ll just disable the stonith option:

pcs property set stonith-enabled=false

2, Ignore loss of quorum

pcs property set no-quorum-policy=ignore

The setting below is needed only if we have 3 servers in the cluster

pcs property set expected-quorum-votes=3
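
The reason a two-node cluster needs no-quorum-policy=ignore follows from simple majority arithmetic: quorum requires more than half of the votes. The sketch below assumes one vote per node; the function names are illustrative only.

```shell
# Votes needed for quorum in a cluster of N one-vote nodes: floor(N/2) + 1.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that may fail while the remaining ones still hold quorum.
tolerated_failures() {
    echo $(( $1 - $(quorum_votes "$1") ))
}

for n in 2 3; do
    echo "$n nodes: quorum=$(quorum_votes "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# prints:
# 2 nodes: quorum=2 tolerated_failures=0
# 3 nodes: quorum=2 tolerated_failures=1
```

With two nodes, losing either one already loses quorum, so without the ignore policy the surviving node would stop its resources. With three nodes, one failure is tolerated.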

3, Set the basic cluster properties

pcs property set pe-warn-series-max=1000 \
 pe-input-series-max=1000 \
 pe-error-series-max=1000 \
 cluster-recheck-interval=5min

4, HA-Proxy is assumed to be configured on the servers already. If not, start with a basic install and start haproxy, because we need to configure the haproxy LSB script in the cluster.

yum install haproxy
service haproxy start

5, Add a floating IP using the heartbeat IPaddr2 resource agent, which also monitors it. This IP is used to connect to HA-Proxy and should not be assigned to any server where haproxy failed to start

pcs resource create LDAPfrontendIP0 ocf:heartbeat:IPaddr2 ip=10.0.0.10 cidr_netmask=32 op monitor interval=30s

To check the status:

pcs status resources

Now we can get a response from the floating IP:

ping -c1 10.0.0.10

To see who is the current owner of the resource/virtual IP:

pcs status | grep LDAPfrontendIP0
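
pcs status shows which node the cluster believes owns the IP. The address can also be checked directly on a node by looking at its interfaces. This is a minimal sketch, assuming the one-line output format of ip -o -4 addr show; has_ip is a hypothetical helper name.

```shell
# Succeed if the given IPv4 address appears in "ip -o -4 addr show" output
# read from standard input.
# Usage: ip -o -4 addr show | has_ip ADDRESS
has_ip() {
    awk -v ip="$1" '
        # Field 4 holds ADDRESS/PREFIX; compare only the address part.
        $3 == "inet" { split($4, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }
    '
}
```

On each node, ip -o -4 addr show | has_ip 10.0.0.10 && echo "floating IP is here" should print the message on exactly one node.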

Adding HA-Proxy to Pacemaker configuration
1, Because there is no OCF agent for HA-Proxy, we define an LSB resource haproxy. (Note: the name must match the startup script in /etc/init.d, and the script must comply with the LSB standard. The expected behavior of the startup scripts is described in the Linux-HA documentation.) Fortunately the haproxy script can be used, so a resource LDAP-HA-Proxy will be created:

pcs resource create LDAP-HA-Proxy lsb:haproxy op monitor interval=5s

The resource will start on the node with the LDAPfrontendIP0 resource but complain about the other hosts in the HA-Cluster:

pcs status

2, Obviously the haproxy service fails to start if the IP address of the load balancer does not exist on the node. By default, Pacemaker spreads the resources across all cluster nodes. Because the LDAPfrontendIP0 and LDAP-HA-Proxy resources depend on each other, LDAP-HA-Proxy can only run on the node with the LDAPfrontendIP0 resource. To achieve this, a “colocation constraint” is needed. The score of INFINITY makes it mandatory to start the LDAP-HA-Proxy resource on the node with the LDAPfrontendIP0 resource:

pcs constraint colocation add LDAP-HA-Proxy LDAPfrontendIP0 INFINITY

3, The order of the resources is important; otherwise the LDAPfrontendIP0 resource will be started on the node with the LDAP-HA-Proxy resource (which cannot start, because the LDAPfrontendIP0 resource provides the interface configuration for LDAP-HA-Proxy). Furthermore, the LDAPfrontendIP0 resource should always start before the LDAP-HA-Proxy resource, so we have to enforce the resource start/stop ordering:

pcs constraint order LDAPfrontendIP0 then LDAP-HA-Proxy

After configuring the cluster with the correct constraints, restart it and check the status:

pcs cluster stop --all && pcs cluster start --all
pcs status

This completes the cluster setup with HA-Proxy. The next step is to learn how to switch, add, and remove resources.
