Celery
Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.
The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
Task Queue
Task queues are used as a mechanism to distribute work across threads or machines.
A task queue's input is a unit of work called a task. Dedicated worker processes constantly monitor the queue for new work to perform.
Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task, a client puts a message on the queue; the broker then delivers the message to a worker.
A Celery system can consist of multiple workers and brokers, giving way to high availability and horizontal scaling.
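The client, broker, and worker roles described above can be modelled with the standard library alone. This is a toy illustration, not Celery's implementation: an in-process `queue.Queue` stands in for the broker, threads stand in for worker servers, and the names `broker`, `results`, and `worker` are invented for the sketch.

```python
import queue
import threading

broker = queue.Queue()          # stands in for the message broker
results = {}                    # stands in for a result store
results_lock = threading.Lock()


def worker():
    """Pull messages off the queue until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:     # sentinel: shut this worker down
            broker.task_done()
            break
        task_id, func, args = message
        value = func(*args)
        with results_lock:
            results[task_id] = value
        broker.task_done()


# Several workers share one queue, which is what makes horizontal scaling possible.
workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

# The client only puts messages on the queue; it never calls a worker directly.
for task_id in range(5):
    broker.put((task_id, pow, (task_id, 2)))

broker.join()                   # wait until every task message has been processed
for _ in workers:
    broker.put(None)
for w in workers:
    w.join()
```

Because the client and the workers share nothing but the queue, more workers can be added without changing the client code.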
Celery is written in Python, but the protocol can be implemented in any language. In addition to Python there's node-celery for Node.js, and a PHP client.
Language interoperability can also be achieved by using webhooks in such a way that the client enqueues a URL to be requested by a worker.
Brokers
- RabbitMQ
- Redis
- Amazon SQS
The recommended message brokers are RabbitMQ and Redis.
Concurrency
- prefork (multiprocessing), Eventlet, gevent
Result Stores
- AMQP, Redis
- Memcached
- SQLAlchemy, Django ORM
- Apache Cassandra, Elasticsearch, Riak
- MongoDB, CouchDB, Couchbase, ArangoDB
- Amazon DynamoDB, Amazon S3
- Microsoft Azure Block Blob, Microsoft Azure Cosmos DB
- File system
Serialization
- pickle, json, yaml, msgpack.
- zlib, bzip2 compression.
- Cryptographic message signing.
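To show how the three concerns above stack (serialize, then compress, then sign), here is a standard-library sketch. It is not Celery's wire format: it uses an HMAC with a shared secret as a stand-in for Celery's certificate-based signing, and `SECRET_KEY`, `encode`, and `decode` are hypothetical names.

```python
import hashlib
import hmac
import json
import zlib

SECRET_KEY = b"demo-secret"  # hypothetical shared secret, for illustration only


def encode(payload):
    """Serialize to JSON, compress with zlib, then sign the compressed bytes."""
    body = zlib.compress(json.dumps(payload).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body, signature


def decode(body, signature):
    """Verify the signature first, then decompress and deserialize."""
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return json.loads(zlib.decompress(body).decode("utf-8"))


message = {"task": "add", "args": [2, 3]}
body, signature = encode(message)
roundtrip = decode(body, signature)
```

Verifying before decompressing means a tampered message is rejected without its content ever being parsed.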
States
- celery.states.FAILURE = 'FAILURE' - Task failed.
- celery.states.PENDING = 'PENDING' - Task state is unknown (assumed pending since you know the id).
- celery.states.RECEIVED = 'RECEIVED' - Task was received by a worker (only used in events).
- celery.states.RETRY = 'RETRY' - Task is waiting for retry.
- celery.states.REVOKED = 'REVOKED' - Task was revoked.
- celery.states.STARTED = 'STARTED' - Task was started by a worker (task_track_started).
- celery.states.SUCCESS = 'SUCCESS' - Task succeeded.
- celery.states.precedence(state: str) → int - Get the precedence index for state.
Source: States, Celery 5.4.0 documentation.
Features
Monitoring
A stream of monitoring events is emitted by workers and is used by built-in and external tools to tell you what your cluster is doing in real time.
Workflows
Simple and complex workflows can be composed using a set of powerful primitives we call the "canvas", including grouping, chaining, chunking, and more.
Time & Rate Limits
You can control how many tasks can be executed per second/minute/hour, or how long a task can be allowed to run, and this can be set as a default, for a specific worker or individually for each task type.
Scheduling
You can specify the time to run a task in seconds or a datetime, or you can use periodic tasks for recurring events based on a simple interval, or Crontab expressions supporting minute, hour, day of week, day of month, and month of year.
Resource Leak Protection
The --max-tasks-per-child option limits how many tasks a worker process executes before it is replaced by a new one. It is meant for user tasks that leak resources, like memory or file descriptors, that are simply out of your control.