Software Projects Built on Mesos

DevOps Tooling

  • Vamp is a deployment and workflow tool for container orchestration systems, including Mesos/Marathon. It brings canary releasing, A/B testing, auto scaling and self healing through a web UI, CLI and REST API.

Long Running Services

  • Aurora is a service scheduler that runs on top of Mesos, enabling you to run long-running services that take advantage of Mesos' scalability, fault-tolerance, and resource isolation.
  • Marathon is a private PaaS built on Mesos. It automatically handles hardware or software failures and ensures that an app is "always on".
  • Singularity is a scheduler (HTTP API and web interface) for running Mesos tasks: long running processes, one-off tasks, and scheduled jobs.
  • SSSP is a simple web application that provides a white-label "Megaupload" for storing and sharing files in S3.

Big Data Processing

  • Cray Chapel is a productive parallel programming language. The Chapel Mesos scheduler lets you run Chapel programs on Mesos.
  • Dpark is a Python clone of Spark: a MapReduce-like framework, written in Python, that runs on Mesos.
  • Exelixi is a distributed framework for running genetic algorithms at scale.
  • Hadoop: running Hadoop on Mesos distributes MapReduce jobs efficiently across an entire cluster.
  • Hama is a distributed computing framework based on Bulk Synchronous Parallel computing techniques for massive scientific computations, e.g., matrix, graph, and network algorithms.
  • MPI is a message-passing system designed to function on a wide variety of parallel computers.
  • Spark is a fast and general-purpose cluster computing system which makes parallel jobs easy to write.
  • Storm is a distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.

Batch Scheduling

  • Chronos is a distributed job scheduler that supports complex job topologies. It can be used as a more fault-tolerant replacement for Cron.
  • Cook is a job scheduler like Torque that supports not only individual tasks but also Spark jobs. Cook provides automatic preemption and multitenancy features for shared clusters, guaranteeing throughput to all users while allowing individuals to temporarily "burst" to additional resources as needed. Cook provides a simple REST API and a Java client for interaction.
  • Elastic-Job-Cloud is a distributed scheduled job cloud solution designed with HA and fault-tolerance in mind. It focuses on horizontal scaling, and provides transient and daemon jobs, event-based and schedule-based job triggers, job dependencies, and job history.
  • GoDocker is a batch computing job scheduler like SGE, Torque, etc. It schedules batch computing tasks via a web UI, API, or CLI for system or LDAP users, mounting their home directory or other shared resources in a Docker container. It targets scientists, not developers, and provides plugin mechanisms to extend or modify the default behavior.
  • Jenkins is a continuous integration server. The mesos-jenkins plugin allows it to dynamically launch workers on a Mesos cluster depending on the workload.
  • JobServer is a distributed job scheduler and processor that allows developers to build custom batch processing Tasklets using a point-and-click web UI.

Data Storage

  • Alluxio is a memory-centric distributed storage system enabling reliable data sharing at memory-speed across cluster frameworks.
  • Cassandra is a performant and highly available distributed database. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.
  • Ceph is a resilient, auto-healing, general-purpose, open-source distributed storage solution. It provides mountable block storage, an object storage API (S3 / Swift APIs supported), and a distributed file system (CephFS). While the Mesos framework is young, Ceph itself is mature and has many large-scale deployments.
  • ElasticSearch is a distributed search engine. Mesos makes it easy to run and scale.
  • Hypertable is a high performance, scalable, distributed storage and processing system for structured and unstructured data.
  • MrRedis is a Mesos framework for provisioning Redis in-memory cache instances. The scheduler provides automatic Redis master election and automatic recovery of Redis slaves, and comes with a CLI and a UI.

Machine Learning

  • TFMesos is a lightweight framework that helps run distributed TensorFlow machine learning tasks on Apache Mesos, with GPU support.