In a previous post we saw how to build a Multi-Master infrastructure with Percona XtraDB Cluster, but if we ignore the recommendations and build a 2-node cluster, losing the connection between the two nodes means losing quorum (each node gets 50%). And this not only breaks the cluster: each node, on its own, also stops serving. The Percona failover documentation covers exactly this point:

*Cluster Failover*

The size of the cluster is used to determine the required votes to achieve quorum. A quorum vote is done when a node or nodes are suspected to no longer be part of the cluster (they do not respond). This no-response timeout is the evs.suspect_timeout setting in the wsrep_provider_options (default 5 sec), and when a node goes down ungracefully, write operations will be blocked on the cluster for slightly longer than that timeout.

Once the node (or nodes) is determined to be disconnected, the remaining nodes cast a quorum vote, and if a majority remains out of the total nodes connected before the disconnect, then that partition remains up. In the case of a network partition, some nodes will be alive and active on each side of the network disconnect. In this case, only the quorum will continue; the partition(s) without quorum will go to the non-Primary state.

Because of this, it’s not possible to safely have automatic failover in a 2-node cluster, because the failure of one node will cause the remaining node to go non-Primary. Further, clusters with an even number of nodes (say two nodes in two different switches) have some possibility of a split brain condition: if network connectivity is lost between the two partitions, neither would retain quorum, and so both would go to non-Primary.

Therefore: for automatic failover, the “rule of 3s” is recommended. It applies at various levels of infrastructure, depending on how far the cluster is spread out, to avoid single points of failure.

Given this scenario, what solutions do we have? This MySQL Performance Blog post discusses the subject:

*Is it possible anyway to have a two node cluster?*

So, by default, Percona XtraDB Cluster does the right thing; this is how it needs to work, and you don’t suffer critical problems when you have enough nodes. But how can we deal with that and avoid the resource stopping? If we check the list of parameters on Galera's wiki (http://www.codership.com/wiki/doku.php?id=galera_parameters_0.8), we can see that there are two options referring to that:

- pc.ignore_quorum: Completely ignore quorum calculations. E.g. in case master splits from several slaves it still remains operational. Use with extreme caution even in master-slave setups, because slaves won’t automatically reconnect to master in this case.
- pc.ignore_sb: Should we allow nodes to process updates even in the case of split brain? This is a dangerous setting in multi-master setup, but should simplify things in master-slave cluster (especially if only 2 nodes are used).

By ignoring the quorum (*wsrep_provider_options = “pc.ignore_quorum = true”*), we ask the cluster not to perform the majority calculation that defines the Primary Component (PC). A component is a set of nodes which are connected to each other, and when everything is OK, the whole cluster is one component.
For example, if you have 3 nodes and 1 node gets isolated (2 nodes can see each other and 1 node can see only itself), we then have 2 components, and the quorum calculation will be 2/3 (66%) on the 2 nodes communicating with each other and 1/3 (33%) on the isolated one. In this case the service is stopped on the nodes where the majority is not reached. The quorum algorithm helps to select a PC and guarantees that there is no more than one primary component in the cluster. In our 2-node setup, when the communication between the 2 nodes is broken, the quorum will be 1/2 (50%) on both nodes, which is not a majority… therefore the service is stopped on both nodes. In this case, *service* means accepting queries.

By ignoring the split-brain situation (*wsrep_provider_options = “pc.ignore_sb = true”*): when the quorum algorithm fails to select a Primary Component, we have a split-brain condition. In our 2-node setup, when a node loses connection to its only peer, the default is to stop accepting queries to avoid database inconsistency. If we bypass this behaviour by ignoring the split-brain, then we can insert on both nodes without any problem while the connection between the nodes is gone. When the connection is back, the two servers behave as independent ones; they are now two single-node clusters. This is why a two-node cluster is not recommended at all.

Now, if you only have storage for 2 nodes, using the Galera arbitrator (http://www.codership.com/wiki/doku.php?id=galera_arbitrator) is a very good alternative. On a third node, instead of running Percona XtraDB Cluster (mysqld), just run *garbd*. Currently there is no init script for garbd, but this is something easy to write, as it can run in daemon mode using -d:

    garbd -a gcomm://192.168.70.2:4567 -g trimethylxanthine

If the communication fails between node1 and node2, they will communicate and eventually send the changes through the node running garbd (node3); and if one node dies, the other one keeps working without any problem, and when the dead node comes back it will perform its IST or SST.

*In conclusion:* a 2-node cluster is possible with Percona XtraDB Cluster, but it is not advised at all, because it will generate a lot of problems in case of an issue on one of the nodes. It is much safer to use a 3rd node, even a fake one using garbd. If you plan anyway to have a cluster with only 2 nodes, don't forget that:

- by default, if one peer dies or if the communication between both nodes is unstable, both nodes won't accept queries;
- if you plan to ignore split-brain or quorum, you can end up with inconsistent data very easily.

So we can see that the best solution for a 2-node cluster is to add a new 'thin node' running a daemon (garbd, also known as the Galera arbitrator), so that we do not end up with a split brain every time the 2 nodes lose contact with each other. It is true that this means using a third node, but it only counts for quorum purposes: it holds no data, nor even the MySQL client/server layer. Before proceeding, note that we use the network addresses from the previous post on the subject, plus a new IP address for the 'thin node' (192.168.0.63). We also assume that in this scenario (with 2 "real" nodes) we will use a Unicast configuration. So, let's get to it:
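First, the two "real" nodes must point their cluster address at all three members, arbitrator included. A minimal my.cnf sketch for the relevant lines, assuming 192.168.0.61 and 192.168.0.62 keep the rest of their configuration from the previous post (the wsrep_provider path and the SST method shown here are illustrative and may differ on your system; only the IPs and the cluster name come from this setup):

    # /etc/mysql/my.cnf on each "real" node (192.168.0.61 shown; use .62 on the other)
    [mysqld]
    wsrep_provider        = /usr/lib/libgalera_smm.so
    # Unicast member list: both data nodes plus the garbd 'thin node'
    wsrep_cluster_address = gcomm://192.168.0.61,192.168.0.62,192.168.0.63
    wsrep_cluster_name    = my_percona_cluster
    wsrep_node_address    = 192.168.0.61
    wsrep_sst_method      = xtrabackup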
On the thin node we install the Galera package and start the arbitrator:

    $ apt-get install percona-xtradb-cluster-galera-2.x
    $ garbd -d -a gcomm://192.168.0.61,192.168.0.62,192.168.0.63 -g my_percona_cluster -l /var/log/garbd.log

Keep in mind that the connection string (gcomm://...) has to be adjusted so that it is the same on all nodes, adapted to our own case. From MySQL on one of the "real" nodes we can then check that the arbitrator counts as a cluster member:

    mysql> show status like 'wsrep%';
    | wsrep_cluster_conf_id | 19 |
    | wsrep_cluster_size    | 3  |

With this in place, we can test bringing down one of the 2 "real" nodes for a while, and we will see that we can keep sending queries to the remaining one. And not only that: when the fallen node comes back, it will catch up on the changes the surviving node went through.
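A quick way to exercise that failover by hand, sketched here only as an illustration (the service commands and the test table are assumptions, not part of the original setup), could look like this:

    # On node 2 (192.168.0.62): simulate an outage
    $ service mysql stop

    # On node 1 (192.168.0.61): garbd keeps the quorum, so this node stays Primary
    mysql> show status like 'wsrep_cluster_status';   -- expected: Primary
    mysql> show status like 'wsrep_cluster_size';     -- expected: 2
    mysql> insert into test.t values (42);            -- writes still work (assumes test.t exists)

    # Bring node 2 back: it rejoins and syncs via IST (or SST if needed)
    $ service mysql start

If the surviving node had gone non-Primary instead, its queries would fail with an error like "WSREP has not yet prepared node for application use", which is precisely the situation the arbitrator avoids.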