Task Queue

tags : System Design, Distributed Systems, Web Development, Message Passing, Queues, Scheduling and orchestrator

What?

A queue of tasks. It’s a specific usecase of Message Queue

Distributed task queue

See Distributed Task Queue :: SysDesignMeetup :: 2022-July-02 - YouTube

What is a distributed task queue?

This tripped me a little in the start. People be using the word “distributed” to mean multiple things.

“Sare milke humko …”

Distributed what?
- Worker: When you distribute the task over to multiple workers
  - workers can be processes in the same machine/node(same core, multiple cores etc)
  - workers can be processes running on different machine/node
  - Challenges: Maybe you want execute exactly once guarantee, maybe you want to make sure no task is being scheduled concurrently on two workers etc.
- Broker(queue manager): When the broker itself is distributed.
- Queue: This is only valid if the broker is distributed aswell, the queue is replicated and there is some shared consensus with the brokers around how to proceed with the items in the queue.
Not all distributed queue implementation out there do all 3(client/broker/queue), some do it with combination of something or some setting etc. So it’s hard to classify.

Distributed transactions and Distributed queue

If the usecase is really this, you might want to look at something like temporal (see Orchestrators and Scheduling)
Distributed task queues are not supposed to handle distributed transactions! (The naming confused me)

The exactly once guarantee

If your usecase allows for idempotence, dropped tasks etc. You don’t really need consensus.
But if you need to make sure of exactly once execution of the task, we need to have some kind of distributed Consensus
contrast: At-least-once Delivery | Cloud Computing Patterns

Durability vs Distributed

Whether the task is durable(in-memory) or persisted is different thing that whether the task queue is called “distributed” (as “distributed” can mean many things)
Levels of durability of items in the queue
- In-memory
- In-memory+persisted to disk: Survives program crash
- In-memory+persisted to disk+shared across machines: Survives machine crash

Delivery Gurantees

See Data Delivery

FAQ

TODO Components of a Task Queue (Doubt)

Components
- Queue (the store)
- Broker (the queue manager)
  - Sort of middleware that can handle things like routing, validate, store, handle backpressure, deliver stuff etc.
- The <thing>. (thing can be something like celery, hatchet etc.)
I am confused around, if the queue and broker does everything, what doe the <thing> do?
Who’s responsibility is task scheduling? Broker or the <thing> This additionally might do task sceduling(trigger)

Job Queue vs Task Queue vs Task Scheduler

A job queue/task queue is just the thing that stores the tasks
A task scheduler is the thing that does things with the items (usually the broker is involved in all this)

How to choose one?

A task queue may be simple or complext, based on your use case you might or might not need the following or more, see what fits.

Task scheduling (this is the only feature we need)
Delaying or priority (schedule at time - not needed)
Output (return values - we do not need as well)
Re-submitting failed task (we do not want)
Error fallback handlers (not needed)
Message headers (not needed)
Grouping (dependant tasks, again, not needed)
Rate limiting (ditto)
Compression (ditto)

Async task vs Background Tasks

In terms of an incoming request, an async task will operate in an out-of-band manner but the request cannot complete unless the async task itslef returns.
If we want the the request to respond faster, we will have to use a in-memory/external job queue. i.e We need a background task.

Handling task failures

Level 1: Writing tasks to handle failure

You write/design your tasks expecting some failures and be able to re-run safely.
So even if the

Level 2: Using a distributed queue (w broker redundancy)

Level 3: Using a workflow engine

Task Queue vs Workflow Engine(Eg. Temporal)?

See Orchestrators and Scheduling

They serve are different usecases
You can be using a job queue and can build out a workflow like thing on top of it. In other cases you might want to combine a job queue with an workflow engine for certain things.
In many cases, just a Task Queue might be enough, you might not need a workflow thing at all. Tradeoffs.
A workflow engine is better suited for more complex workflow-oriented tasks, while a task queue is better suited for simpler tasks such as sending emails.
If you need a high throughput of small tasks that need to be run ASAP, a message broker(task queue) is the better choice.

Debate around Database as Job Queue

It’s done widely in the industry, but there are edge cases we need to be aware of

Case of Redis

This is what most actual job queues use

Case of PostgreSQL

Task/Job Queue Patterns

“Concept Job Queue”
- Eg. in this blogpost slack mentions that they use the concept of job queue to refer to ”Kafka+Redis” in which Kafka is being used for persistence and redis as a store for the actual executor. But this is just one example of an usecase, the combination can be infinite only.
Job Queue with an additional staging area (aka Transactional enqueueing)
- Eg. With something like only sidekiq, in in case of a transaction rollback, we could end up with an invalid background job in the queue!
- To get around this, we put things into a staging db table before things get into the actual Job Queue.
- Instead of sending jobs to the queue directly, they’re sent to a staging table first, and an enqueuer pulls them out in batches and puts them to the job queue. This gives us nice ACID support for our workloads that go into the queue.
- See https://brandur.org/job-drain and Pattern: Transactional outbox

Pub/Sub Patterns

Fan in

1 sub, Multiple broadcasters

Fan out

Multiple subscribers, 1 broadcaster

Comparison of Background Task/Job Queue projects

Name	Distributed	Backend	Broker	Language	Comments
Celery	Yes	Redis	RabbitMQ/Redis	Mostly python	The queue itself is async but doesn’t support async functions
RQ	No	Redis		Python
SAQ	No	Redis		Python	like RQ but supports async using asyncio (See Python Concurrency)
Huey	No	Redis/in-mem/fs/sqlite	in-mem/redis	Python
APS	Yes	PG/Sqlite/mongo	PG/Redis	Python
tasiq	Yes			Python
wakaq	Yes(?)	redis	redis	Python
BullMQ				NodeJS
asynq	Yes			Go
machinery				Go
Hatchet	Yes	Postgres	RabbitMQ, NATS	Go	More than just a task queue, overrlaps with Temporal usecases
River		Postgres based		Go
Sidekiq				Ruby
GoodJob		Postgres based		Ruby, RoR
Que		Postgres based		Ruby	Alternative to trying to write your custom db based queue system
Custom DB based		DB of choice		-

Celery

Celery is a distributed task queue
It dispatches tasks/jobs to others servers and gets result back
It handles the task part of it but for the “distributed” part, it needs a Message Queue (MQ)
Needs a broker (Message Transports) and a backend (Result Store).
- RabbitMQ (as the broker) and Redis (as the backend) are very commonly used together.
- Can also be used in other combinations.

What does celery doesn’t support async mean?

The queue itself is async
But it doesn’t support async operations in the sense that the task operation itself will be blocking (no asyncio support)
In that case we want to use someting like SAQ

Against Celery

There’s lot to be said against celery

Lack of guarantee
I don’t think you need something like Celery in Go. Goroutines + Stdlib Timers/Tickers + Some minor libraries for additional time based stuff is all you need. And NATS if you’re distribution tasks over multiple instances of your application.
The Many Problems with Celery | Log Blog Kebab
Suggestion for blog post about covering celery problems - Show & Tell - Temporal

Hatchet

Comparison of Message Brokers (can be used by a Task/Job/Message queue)

See Message Queue

Name	Comments
RabbitMQ
NATS	Lot of good stuff
NSQ	Lesser options than NATS

RabbitMQ

NATS

see NATS dot io

🐏 mogoz

Table of Contents

Task Queue

What?

Distributed task queue

What is a distributed task queue?

Distributed transactions and Distributed queue

The exactly once guarantee

Durability vs Distributed

Delivery Gurantees

FAQ

TODO Components of a Task Queue (Doubt)

Job Queue vs Task Queue vs Task Scheduler

How to choose one?

Async task vs Background Tasks

Handling task failures

Level 1: Writing tasks to handle failure

Level 2: Using a distributed queue (w broker redundancy)

Level 3: Using a workflow engine

Task Queue vs Workflow Engine(Eg. Temporal)?

Debate around Database as Job Queue

Case of Redis

Case of PostgreSQL

Task/Job Queue Patterns

Pub/Sub Patterns

Fan in

Fan out

Comparison of Background Task/Job Queue projects

Celery

What does celery doesn’t support async mean?

Against Celery

Hatchet

Comparison of Message Brokers (can be used by a Task/Job/Message queue)

RabbitMQ

NATS

Links

Graph View

Backlinks

🐏 mogoz

Table of Contents

Task Queue

What? §

Distributed task queue §

What is a distributed task queue? §

Distributed transactions and Distributed queue §

The exactly once guarantee §

Durability vs Distributed §

Delivery Gurantees §

FAQ §

TODO Components of a Task Queue (Doubt) §

Job Queue vs Task Queue vs Task Scheduler §

How to choose one? §

Async task vs Background Tasks §

Handling task failures §

Level 1: Writing tasks to handle failure §

Level 2: Using a distributed queue (w broker redundancy) §

Level 3: Using a workflow engine §

Task Queue vs Workflow Engine(Eg. Temporal)? §

Debate around Database as Job Queue §

Case of Redis §

Case of PostgreSQL §

Task/Job Queue Patterns §

Pub/Sub Patterns §

Fan in §

Fan out §

Comparison of Background Task/Job Queue projects §

Celery §

What does celery doesn’t support async mean? §

Against Celery §

Hatchet §

Comparison of Message Brokers (can be used by a Task/Job/Message queue) §

RabbitMQ §

NATS §

Links §

Graph View

Backlinks

What?

Distributed task queue

What is a distributed task queue?

Distributed transactions and Distributed queue

The exactly once guarantee

Durability vs Distributed

Delivery Gurantees

FAQ

TODO Components of a Task Queue (Doubt)

Job Queue vs Task Queue vs Task Scheduler

How to choose one?

Async task vs Background Tasks

Handling task failures

Level 1: Writing tasks to handle failure

Level 2: Using a distributed queue (w broker redundancy)

Level 3: Using a workflow engine

Task Queue vs Workflow Engine(Eg. Temporal)?

Debate around Database as Job Queue

Case of Redis

Case of PostgreSQL

Task/Job Queue Patterns

Pub/Sub Patterns

Fan in

Fan out

Comparison of Background Task/Job Queue projects

Celery

What does celery doesn’t support async mean?

Against Celery

Hatchet

Comparison of Message Brokers (can be used by a Task/Job/Message queue)

RabbitMQ

NATS

Links