Changing the master quorum
Currently the master leverages a paxos-based replicated log as its storage backend (
--registry=replicated_log is the only storage backend supported). Each master participates in the ensemble as a log replica. The
--quorum flag determines a majority of the masters.
The following table shows the tolerance to master failures, for each quorum size:
|Masters||Quorum Size||Failure Tolerance|
|2N - 1||N||N - 1|
It is recommended to run with 3 or 5 masters, when desiring high availability.
When configuring the quorum, it is essential to ensure that there are only so many masters running as specified in the table above. If additional masters are running, this violates the quorum and the log may be corrupted! As a result, it is recommended to gate the running of the master process with something that enforces a static whitelist of the master hosts. See MESOS-1546 for adding a safety whitelist within Mesos itself.
For online reconfiguration of the log, see: MESOS-683.
Increasing the quorum size
As the size of a cluster grows, it may be desired to increase the quorum size for additional fault tolerance.
The following steps indicate how to increment the quorum size, using 3 → 5 masters as an example (quorum size 2 → 3):
- Initially, 3 masters are running with
- Restart the original 3 masters with
- Start 2 additional masters with
To increase the quorum by N, repeat this process to increment the quorum size N times.
Decreasing the quorum size
The following steps indicate how to decrement the quorum size, using 5 → 3 masters as an example (quorum size 3 → 2):
- Initially, 5 masters are running with
- Remove 2 masters from the cluster, ensure they will not be restarted (See NOTE section above). Now 3 masters are running with
- Restart the 3 masters with
To decrease the quorum by N, repeat this process to decrement the quorum size N times.
Replacing a master
Please see the NOTE section above. So long as the failed master is guaranteed to not re-join the ensemble, it is safe to start a new master with an empty log and allow it to catch up.