Mesos Configuration
The Mesos master and slave accept a variety of configuration options, either as command-line arguments or as environment variables. To list the available options, run mesos-master --help or mesos-slave --help. Each option can be set in two ways:
- By passing it to the binary using --option_name=value.
- By setting the environment variable MESOS_OPTION_NAME (the option name with a MESOS_ prefix added to it).
Configuration values are searched for first in the environment, then on the command-line.
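As a rough illustration of the naming rule above, the following sketch derives the environment variable name for an option. The helper mesos_env_name is hypothetical and not part of Mesos; it only upper-cases the option name and prepends MESOS_.

```shell
# Hypothetical helper (not part of Mesos): map an option name such as
# "log_dir" to the environment variable name described above.
mesos_env_name() {
  printf 'MESOS_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

mesos_env_name log_dir   # prints MESOS_LOG_DIR
mesos_env_name port      # prints MESOS_PORT
```

Under this rule, passing --log_dir=VALUE on the command line and exporting MESOS_LOG_DIR=VALUE before starting the binary are two ways of setting the same option.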
Important Options
If you have special compilation requirements, please refer to ./configure --help when configuring Mesos. Additionally, this documentation lists only a subset of the options. For the definitive list of flags that your version of Mesos supports, run the binary with the --help flag, for example mesos-master --help.
Master and Slave Options
These options can be supplied to both masters and slaves.
Flag | Explanation |
---|---|
--ip=VALUE | IP address to listen on |
--[no-]help | Prints this help message (default: false) |
--[no-]initialize_driver_logging | Whether to automatically initialize google logging of scheduler and/or executor drivers. (default: true) |
--log_dir=VALUE | Location to put log files (no default, nothing is written to disk unless specified; does not affect logging to stderr) |
--logbufsecs=VALUE | How many seconds to buffer log messages for (default: 0) |
--logging_level=VALUE | Log message at or above this level; possible values: 'INFO', 'WARNING', 'ERROR'; if quiet flag is used, this will affect just the logs from log_dir (if specified) (default: INFO) |
--port=VALUE | Port to listen on (master default: 5050 and slave default: 5051) |
--[no-]quiet | Disable logging to stderr (default: false) |
--[no-]version | Show version and exit. (default: false) |
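
The --[no-]name notation in the table marks boolean flags. The sketch below shows how such a flag might be rendered on a command line, assuming the usual convention that --name turns the option on and --no-name turns it off. That convention is not spelled out above, so confirm it against the --help output of your binary. The helper bool_flag is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: render a boolean option as --name or --no-name.
# Assumes the --name / --no-name convention; verify with --help.
bool_flag() {
  case "$2" in
    true)  printf '%s\n' "--$1" ;;
    false) printf '%s\n' "--no-$1" ;;
    *)     echo "bool_flag: value must be true or false" >&2; return 1 ;;
  esac
}

bool_flag quiet true                        # prints --quiet
bool_flag initialize_driver_logging false   # prints --no-initialize_driver_logging
```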
Master Options
Required Flags
Flag | Explanation |
---|---|
--quorum=VALUE | The size of the quorum of replicas when using 'replicated_log' based registry. It is imperative to set this value to be a majority of masters i.e., quorum > (number of masters)/2. |
--work_dir=VALUE | Where to store the persistent information stored in the Registry. |
--zk=VALUE |
ZooKeeper URL (used for leader election amongst masters)
May be one of:
|
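
The majority rule for --quorum can be worked through with integer arithmetic. The helper quorum_size below is a hypothetical name; it returns the smallest value that satisfies quorum > (number of masters)/2. Larger values also satisfy the inequality, so this is a sketch of the minimum rather than a recommendation.

```shell
# Smallest quorum that is a strict majority of the masters:
# quorum > masters / 2  =>  quorum = floor(masters / 2) + 1
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

quorum_size 3   # prints 2
quorum_size 4   # prints 3
quorum_size 5   # prints 3
```

For example, a cluster with 5 masters would use --quorum=3 under this rule.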
Optional Flags
Flag | Explanation |
---|---|
--acls=VALUE |
The value could be a JSON formatted string of ACLs
or a file path containing the JSON formatted ACLs used
for authorization. Path could be of the form file:///path/to/file
or /path/to/file.
See the ACLs protobuf in mesos.proto for the expected format.
JSON file example:
|
--allocation_interval=VALUE | Amount of time to wait between performing (batch) allocations (e.g., 500ms, 1sec, etc). (default: 1secs) |
--[no-]authenticate | If authenticate is 'true' only authenticated frameworks are allowed to register. If 'false' unauthenticated frameworks are also allowed to register. (default: false) |
--[no-]authenticate_slaves | If 'true' only authenticated slaves are allowed to register. If 'false' unauthenticated slaves are also allowed to register. (default: false) |
--authenticators=VALUE |
Authenticator implementation to use when authenticating frameworks
and/or slaves. Use the default crammd5, or
load an alternate authenticator module using --modules. (default: crammd5)
|
--cluster=VALUE | Human readable name for the cluster, displayed in the webui. |
--credentials=VALUE |
Either a path to a text file with a list of credentials,
each line containing 'principal' and 'secret' separated by whitespace,
or, a path to a JSON-formatted file containing credentials.
Path could be of the form file:///path/to/file or /path/to/file.
JSON file Example:
Text file Example:
|
--framework_sorter=VALUE | Policy to use for allocating resources between a given user's frameworks. Options are the same as for user_sorter. (default: drf) |
--hostname=VALUE | The hostname the master should advertise in ZooKeeper. If left unset, the hostname is resolved from the IP address that the master binds to. |
--[no-]log_auto_initialize | Whether to automatically initialize the replicated log used for the registry. If this is set to false, the log has to be manually initialized when used for the very first time. (default: true) |
--modules=VALUE |
List of modules to be loaded and be available to the internal
subsystems.
Use --modules=filepath to specify the list of modules via a
file containing a JSON formatted string. 'filepath' can be
of the form file:///path/to/file or /path/to/file.
Use --modules="{...}" to specify the list of modules inline.
JSON file example:
|
--offer_timeout=VALUE | Duration of time before an offer is rescinded from a framework. This helps fairness when running frameworks that hold on to offers, or frameworks that accidentally drop offers. |
--rate_limits=VALUE |
The value could be a JSON formatted string of rate limits
or a file path containing the JSON formatted rate limits used
for framework rate limiting.
Path could be of the form file:///path/to/file
or /path/to/file.
See the RateLimits protobuf in mesos.proto for the expected format.
Example:
|
--recovery_slave_removal_limit=VALUE | For failovers, limit on the percentage of slaves that can be removed from the registry *and* shut down after the re-registration timeout elapses. If the limit is exceeded, the master will fail over rather than remove the slaves. This can be used to provide safety guarantees for production environments. Production environments may expect that across master failovers, at most a certain percentage of slaves will fail permanently (e.g. due to rack-level failures). Setting this limit would ensure that a human needs to get involved should an unexpected widespread failure of slaves occur in the cluster. Values: [0%-100%] (default: 100%) |
--registry=VALUE | Persistence strategy for the registry; available options are 'replicated_log', 'in_memory' (for testing). (default: replicated_log) |
--registry_fetch_timeout=VALUE | Duration of time to wait in order to fetch data from the registry after which the operation is considered a failure. (default: 1mins) |
--registry_store_timeout=VALUE | Duration of time to wait in order to store data in the registry after which the operation is considered a failure. (default: 5secs) |
--[no-]registry_strict | Whether the Master will take actions based on the persistent information stored in the Registry. Setting this to false means that the Registrar will never reject the admission, readmission, or removal of a slave. Consequently, 'false' can be used to bootstrap the persistent state on a running cluster. NOTE: This flag is *experimental* and should not be used in production yet. (default: false) |
--roles=VALUE | A comma separated list of the allocation roles that frameworks in this cluster may belong to. |
--[no-]root_submissions | Can root submit frameworks? (default: true) |
--slave_reregister_timeout=VALUE | The timeout within which all slaves are expected to re-register when a new master is elected as the leader. Slaves that do not re-register within the timeout will be removed from the registry and will be shut down if they attempt to communicate with the master. NOTE: This value has to be at least 10mins. (default: 10mins) |
--user_sorter=VALUE | Policy to use for allocating resources between users. May be one of: dominant_resource_fairness (drf) (default: drf) |
--webui_dir=VALUE | Directory path of the webui files/assets (default: /usr/local/share/mesos/webui) |
--weights=VALUE | A comma separated list of role/weight pairs of the form 'role=weight,role=weight'. Weights are used to indicate forms of priority. |
--whitelist=VALUE |
Path to a file with a list of slaves
(one per line) to advertise offers for.
Path could be of the form file:///path/to/file or /path/to/file. (default: *)
|
--zk_session_timeout=VALUE | ZooKeeper session timeout. (default: 10secs) |
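
The text format accepted by --credentials (one 'principal' and 'secret' pair per line, separated by whitespace) can be sketched as follows. The principals and secrets are placeholders, and check_credentials is a hypothetical helper that only checks the two-field shape of each line; it is not how Mesos itself validates the file.

```shell
# Write a text-format credentials file: one "principal secret" pair per line.
# The principals and secrets below are placeholders, not real values.
creds_file="$(mktemp)"
cat > "$creds_file" <<'EOF'
framework-a secret-a
slave-b secret-b
EOF

# General precaution for a file holding secrets (not a requirement stated above).
chmod 600 "$creds_file"

# Hypothetical check: every line must have exactly two whitespace-separated fields.
check_credentials() {
  awk 'NF != 2 { bad = 1 } END { exit bad }' "$1"
}

check_credentials "$creds_file" && echo "ok: $creds_file"
```

The resulting path could then be supplied to the master as the value of --credentials.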
Slave Options
Required Flags
Flag | Explanation |
---|---|
--master=VALUE |
This specifies how to connect to a master or a quorum of masters. This flag works with 3 different techniques. It may be one of:
|
Optional Flags
Flag | Explanation |
---|---|
--attributes=VALUE |
Attributes of machine, in the form:
rack:2 or 'rack:2;u:1'
|
--[no-]cgroups_enable_cfs | Cgroups feature flag to enable hard limits on CPU resources via the CFS bandwidth limiting subfeature. (default: false) |
--cgroups_hierarchy=VALUE | The path to the cgroups hierarchy root (default: /sys/fs/cgroup) |
--[no-]cgroups_limit_swap | Cgroups feature flag to enable memory limits on both memory and swap instead of just memory. (default: false) |
--cgroups_root=VALUE | Name of the root cgroup (default: mesos) |
--cgroups_subsystems=VALUE | This flag has been deprecated and is no longer used, please update your flags |
--[no-]checkpoint | This flag is deprecated and will be removed in a future release. Whether to checkpoint slave and frameworks information to disk. This enables a restarted slave to recover status updates and reconnect with (--recover=reconnect) or kill (--recover=cleanup) old executors (default: true) |
--containerizer_path=VALUE | The path to the external containerizer executable used when external isolation is activated (--isolation=external). |
--containerizers=VALUE | Comma separated list of containerizer implementations to compose in order to provide containerization. Available options are 'mesos', 'external', and 'docker' (on Linux). The order the containerizers are specified is the order they are tried (--containerizers=mesos). (default: mesos) |
--credential=VALUE |
Either a path to a text file with a single line
containing 'principal' and 'secret' separated by whitespace,
or a path to a file containing the JSON formatted information used for one credential.
Path could be of the form file:///path/to/file or /path/to/file.
|
--default_container_image=VALUE | The default container image to use if not specified by a task, when using external containerizer. |
--default_container_info=VALUE |
JSON formatted ContainerInfo that will be included into
any ExecutorInfo that does not specify a ContainerInfo.
See the ContainerInfo protobuf in mesos.proto for
the expected format.
Example:
|
--docker_stop_timeout=VALUE | The time as a duration for docker to wait after stopping an instance before it kills that instance. (default: 0secs) |
--network_enable_socket_statistics | Whether to collect socket statistics (e.g., TCP RTT) for each container. (default: false) |
--default_role=VALUE | Any resources in the --resources flag that omit a role, as well as any resources that are not present in --resources but that are automatically detected, will be assigned to this role. (default: *) |
--disk_watch_interval=VALUE | Periodic time interval (e.g., 10secs, 2mins, etc) to check the disk usage (default: 1mins) |
--docker=VALUE | The absolute path to the docker executable for docker containerizer. (default: docker) |
--docker_remove_delay=VALUE | The amount of time to wait before removing docker containers (e.g., 3days, 2weeks, etc). (default: 6hrs) |
--docker_sandbox_directory=VALUE | The absolute path for the directory in the container where the sandbox is mapped to. (default: /mnt/mesos/sandbox) |
--executor_registration_timeout=VALUE | Amount of time to wait for an executor to register with the slave before considering it hung and shutting it down (e.g., 60secs, 3mins, etc) (default: 1mins) |
--executor_shutdown_grace_period=VALUE | Amount of time to wait for an executor to shut down (e.g., 60secs, 3mins, etc) (default: 5secs) |
--frameworks_home=VALUE | Directory path prepended to relative executor URIs (default: ) |
--gc_delay=VALUE | Maximum amount of time to wait before cleaning up executor directories (e.g., 3days, 2weeks, etc). Note that this delay may be shorter depending on the available disk usage. (default: 1weeks) |
--hadoop_home=VALUE | Path to find Hadoop installed (for fetching framework executors from HDFS) (no default, look for HADOOP_HOME in environment or find hadoop on PATH) (default: ) |
--hostname=VALUE | The hostname the slave should report. If left unset, the hostname is resolved from the IP address that the slave binds to. |
--isolation=VALUE |
Isolation mechanisms to use, e.g., 'posix/cpu,posix/mem', or
'cgroups/cpu,cgroups/mem', or network/port_mapping
(configure with flag: --with-network-isolator to enable),
or 'external', or load an alternate isolator module using
the --modules flag. (default: posix/cpu,posix/mem)
|
--launcher_dir=VALUE | Directory path of Mesos binaries (default: /usr/local/lib/mesos) |
--modules=VALUE |
List of modules to be loaded and be available to the internal
subsystems.
Use --modules=filepath to specify the list of modules via a
file containing a JSON formatted string. 'filepath' can be
of the form file:///path/to/file or /path/to/file.
Use --modules="{...}" to specify the list of modules inline.
JSON file example:
|
--perf_duration=VALUE | Duration of a perf stat sample. The duration must be less than the perf_interval. (default: 10secs) |
--perf_events=VALUE | List of comma-separated perf events to sample for each container when using the perf_event isolator. Default is none. Run command 'perf list' to see all events. Event names are sanitized by downcasing and replacing hyphens with underscores when reported in the PerfStatistics protobuf, e.g., cpu-cycles becomes cpu_cycles; see the PerfStatistics protobuf for all names. |
--perf_interval=VALUE | Interval between the start of perf stat samples. Perf samples are obtained periodically according to perf_interval and the most recently obtained sample is returned rather than sampling on demand. For this reason, perf_interval is independent of the resource monitoring interval (default: 1mins) |
--recover=VALUE | Whether to recover status updates and reconnect with old executors. Valid values for 'recover' are reconnect (reconnect with any old live executors) and cleanup (kill any old live executors and exit; use this option when doing an incompatible slave or executor upgrade). NOTE: If a checkpointed slave doesn't exist, no recovery is performed and the slave registers with the master as a new slave. (default: reconnect) |
--recovery_timeout=VALUE | Amount of time allotted for the slave to recover. If the slave takes longer than recovery_timeout to recover, any executors that are waiting to reconnect to the slave will self-terminate. NOTE: This flag is only applicable when checkpoint is enabled. (default: 15mins) |
--registration_backoff_factor=VALUE | Slave initially picks a random amount of time between [0, b], where b = registration_backoff_factor, to (re-)register with a new master. Subsequent retries are exponentially backed off based on this interval (e.g., 1st retry uses a random value between [0, b * 2^1], 2nd retry between [0, b * 2^2], 3rd retry between [0, b * 2^3] etc) up to a maximum of 1mins (default: 1secs) |
--resource_monitoring_interval=VALUE | Periodic time interval for monitoring executor resource usage (e.g., 10secs, 1min, etc) (default: 1secs) |
--resources=VALUE |
Total consumable resources per slave, in the form
name(role):value;name(role):value... .
|
--slave_subsystems=VALUE |
List of comma-separated cgroup subsystems to run the slave binary
in, e.g., memory,cpuacct. The default is none.
Present functionality is intended for resource monitoring and
no cgroup limits are set, they are inherited from the root mesos
cgroup.
|
--[no-]strict | If strict=true, any and all recovery errors are considered fatal. If strict=false, any expected errors (e.g., slave cannot recover information about an executor, because the slave died right before the executor registered.) during recovery are ignored and as much state as possible is recovered. (default: true) |
--[no-]switch_user | Whether to run tasks as the user who submitted them rather than the user running the slave (requires setuid permission) (default: true) |
--work_dir=VALUE | Directory path to place framework work directories (default: /tmp/mesos) |
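
The retry windows described for --registration_backoff_factor can be worked through numerically. The sketch below assumes b is given in whole seconds and treats the initial attempt as n = 0. The helper backoff_upper_bound is a hypothetical name; it computes only the upper end of each interval, min(b * 2^n, 60 seconds), not the random value the slave actually picks.

```shell
# Upper bound (in seconds) of the random wait before the n-th retry,
# with n = 0 for the initial attempt: min(b * 2^n, 60).
backoff_upper_bound() {
  b=$1
  n=$2
  bound=$(( b * (1 << n) ))
  if [ "$bound" -gt 60 ]; then
    bound=60   # capped at the 1mins maximum
  fi
  echo "$bound"
}

# With the default factor of 1 second:
for n in 0 1 2 3 6; do
  echo "attempt $n: [0, $(backoff_upper_bound 1 "$n")] secs"
done
```

With b = 1 the sixth retry would reach 64 seconds without the cap, so it is limited to 60.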
Mesos Build Configuration Options
The configure script has the following flags for optional features:
Flag | Explanation |
---|---|
--enable-shared[=PKGS] | build shared libraries [default=yes] |
--enable-static[=PKGS] | build static libraries [default=yes] |
--enable-fast-install[=PKGS] | optimize for fast installation [default=yes] |
--disable-libtool-lock | avoid locking (might break parallel builds) |
--disable-java | don't build Java bindings |
--disable-python | don't build Python bindings |
--enable-debug | enable debugging. If CFLAGS/CXXFLAGS are set, this option won't change them [default=no] |
--enable-optimize | enable optimizations. If CFLAGS/CXXFLAGS are set, this option won't change them [default=no] |
--disable-bundled | build against preinstalled dependencies instead of bundled libraries |
--disable-bundled-distribute | excludes building and using the bundled distribute package in lieu of an installed version in PYTHONPATH |
--disable-bundled-pip | excludes building and using the bundled pip package in lieu of an installed version in PYTHONPATH |
--disable-bundled-wheel | excludes building and using the bundled wheel package in lieu of an installed version in PYTHONPATH |
--disable-python-dependency-install | when the python packages are installed during make install, no external dependencies are downloaded or installed |
The configure script has the following flags for optional packages:
Flag | Explanation |
---|---|
--with-gnu-ld | assume the C compiler uses GNU ld [default=no] |
--with-sysroot=DIR | Search for dependent libraries within DIR (or the compiler's sysroot if not specified). |
--with-zookeeper[=DIR] | excludes building and using the bundled ZooKeeper package in lieu of an installed version at a location prefixed by the given path |
--with-leveldb[=DIR] | excludes building and using the bundled LevelDB package in lieu of an installed version at a location prefixed by the given path |
--with-glog[=DIR] | excludes building and using the bundled glog package in lieu of an installed version at a location prefixed by the given path |
--with-protobuf[=DIR] | excludes building and using the bundled protobuf package in lieu of an installed version at a location prefixed by the given path |
--with-gmock[=DIR] | excludes building and using the bundled gmock package in lieu of an installed version at a location prefixed by the given path |
--with-curl[=DIR] | specify where to locate the curl library |
--with-sasl[=DIR] | specify where to locate the sasl2 library |
--with-zlib[=DIR] | specify where to locate the zlib library |
--with-apr[=DIR] | specify where to locate the apr-1 library |
--with-svn[=DIR] | specify where to locate the svn-1 library |
--with-network-isolator | builds the network isolator |
Some influential environment variables for the configure script:
Use these variables to override the choices made by configure or to help it find libraries and programs with nonstandard names/locations.
Variable | Explanation |
---|---|
JAVA_HOME | location of Java Development Kit (JDK) |
JAVA_CPPFLAGS | preprocessor flags for JNI |
JAVA_JVM_LIBRARY | full path to libjvm.so |
MAVEN_HOME | looks for mvn at MAVEN_HOME/bin/mvn |
PROTOBUF_JAR | full path to protobuf jar on prefixed builds |
PYTHON | which Python interpreter to use |
PYTHON_VERSION | The installed Python version to use, for example '2.3'. This string will be appended to the Python interpreter canonical name. |
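
As an illustration of how the flags and variables above combine, the sketch below assembles a configure invocation and prints it instead of running it. The JAVA_HOME and ZooKeeper paths are hypothetical placeholders, and the choice of flags is arbitrary; check ./configure --help for what your source tree supports.

```shell
# Hypothetical locations; adjust for your system.
JAVA_HOME=/usr/lib/jvm/default-java
export JAVA_HOME

# Flags taken from the tables above; the selection is only an example.
configure_args="--disable-python --enable-optimize --with-zookeeper=/usr/local"
configure_cmd="./configure $configure_args"

# Print the command rather than executing it.
echo "JAVA_HOME=$JAVA_HOME $configure_cmd"
```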