They mostly need Celery and Redis because in the Python world concurrency was an afterthought. In most other languages you can get away with just running tasks in the background for a really long time before you need to spin up a distributed task queue. In Python I’ve seen Celery setups on a single machine.
You should still use some sort of work queue - your application process may need to restart (deploys), or for a period of time work could exceed the available resources (bursts), so having some place to put a task before it gets processed is useful regardless of the language's underlying concurrency primitives.
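For a single process, that buffer does not have to be anything fancy: a bounded stdlib queue already gives you the "some place to put the task" property. A minimal sketch (the worker and the doubling "work" are placeholders):

```python
import queue
import threading

# A bounded in-process queue: during a burst, producers block (or can
# fail fast) once capacity is hit, instead of overwhelming the workers.
tasks = queue.Queue(maxsize=100)
results = []

def worker():
    while True:
        item = tasks.get()
        if item is None:       # sentinel: shut down gracefully
            break
        results.append(item * 2)  # stand-in for real work
        tasks.task_done()

t = threading.Thread(target=worker)
t.start()

for i in range(10):            # a burst of work lands in the buffer first
    tasks.put(i)

tasks.join()                   # drain the queue before shutting down
tasks.put(None)
t.join()
print(sorted(results))         # [0, 2, 4, ..., 18]
```

The `maxsize` bound is the interesting part: it is the in-process version of "work could overflow the amount of available resources".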
Transactionality is enough for most systems. Your order either succeeds or fails; it never stays in an incomplete state. Queues are not a panacea, and they introduce their own problems, like obscuring failures by delaying them long enough to bring everything to a halt.
Yes, you wouldn't do all work in the task queue - commonly, we make some change in the database which can happen pretty fast, and after that transaction commits, we might defer a task that sends a notification, email, whatever.
It may not make sense to retain jobs across deployments. What if the contract of the job is changed by the code being deployed? Might be easier to keep it all in-process, letting queues drain in a graceful shutdown.
I haven't found that to be the case typically -- you could always serialize some information into the task to check for things like this.
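One way to do that check, sketched with invented names (`TASK_VERSION`, `enqueue`, and `handle` are all hypothetical): stamp each serialized task with the contract version it was written against, and have the worker reject or migrate anything stale after a deploy.

```python
import json

TASK_VERSION = 2  # bumped whenever the task's contract changes

def enqueue(payload):
    # serialize the contract version alongside the task data
    return json.dumps({"version": TASK_VERSION, "payload": payload})

def handle(raw):
    task = json.loads(raw)
    if task["version"] != TASK_VERSION:
        # stale job from before the deploy: drop, dead-letter, or migrate
        return None
    return task["payload"]["email"]  # stand-in for the real work

# a job enqueued by the previous version of the code
old_job = json.dumps({"version": 1, "payload": {"email": "a@example.com"}})
new_job = enqueue({"email": "b@example.com"})
print(handle(old_job))  # None -- rejected as stale
print(handle(new_job))  # b@example.com
```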
Also consider if the machine running that process just disappears and that process dies. Putting work into a task queue allows you to do it durably until it can be processed so that it's not lost in some typical "that machine/instance died" scenario.
This might not be viable all the time - what if you have a stream of tasks being put onto the queue? The alternative would be to ensure that any change to the job contract is backward compatible, and that any breaking change comes with a remediation/migration plan for handling pending tasks.
In the business, customer, end-user sense, in many situations you are much better off having a transaction fail with an error than finally succeeding after 2 hours once a service disruption is cleared.
Not every problem is the same. Work queues introduce a magnitude or more of complexity to an HTTP application. Sometimes that is very much needed. Sometimes it's overengineering. Sometimes it's a gray area.
> They mostly need Celery and Redis because in the Python world concurrency was an afterthought
You have an operating system that you can use. You don't really _need_ concurrency when you have a machine that can timeshare amongst processes. That's the world Python was designed for
> In most other languages you can get away with just running tasks in the background for a really long time before you need to spin up a distributed task queue
This is partly true, but not a lot of people do this because you need to persist those tasks unless you want them dropped during a reboot. Same logic goes for entire machines disappearing. You _can_ get by without a distributed system, but you will need to tolerate loss in those scenarios. Those losses are non-trivial for most apps/companies, so it doesn't seem all that practical to me to consider a world without a distributed system (and yes, persisting things in postgres/mysql before they are being worked on is still using a distributed system)
I agree. That is why I stopped using languages that have a poor concurrency story. Also, using an external queue and workers considerably increases the operational complexity of the system. An in-process solution with some light persistence for the work queue in an embedded database can go along way, at least until you outgrow a single machine. Doing both ops and development I’ve learned to appreciate simple solutions.
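A sketch of that idea, using sqlite3 from the stdlib as the embedded database (the `DurableQueue` class and its schema are made up for illustration, not a real library):

```python
import pickle
import sqlite3

class DurableQueue:
    """A tiny persistent in-process work queue backed by an embedded DB."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            " id INTEGER PRIMARY KEY, payload BLOB, done INTEGER DEFAULT 0)"
        )
        self.db.commit()

    def put(self, job):
        self.db.execute(
            "INSERT INTO jobs (payload) VALUES (?)", (pickle.dumps(job),)
        )
        self.db.commit()

    def get(self):
        row = self.db.execute(
            "SELECT id, payload FROM jobs WHERE done = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None, None
        return row[0], pickle.loads(row[1])

    def ack(self, job_id):
        self.db.execute("UPDATE jobs SET done = 1 WHERE id = ?", (job_id,))
        self.db.commit()

q = DurableQueue(":memory:")  # use a file path to survive restarts
q.put({"kind": "send_email", "to": "x@example.com"})
job_id, job = q.get()
q.ack(job_id)          # only mark done after the work actually succeeded
print(job["kind"])     # send_email
```

Acknowledging only after the work succeeds is what keeps a crash from silently dropping a job.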
You could use something like async / await without Celery and Redis but Celery brings a lot to the table.
What happens when you want to retry jobs with exponential back off, or rate limit a task, or track completed / failed jobs?
You can wire all of this stuff up yourself, but it's a hugely complicated problem and a massive time sink; Celery gives you this stuff out of the box. With a decorator or two you can do all of those things on any task you want.
I use it in pretty much every Flask project, even on single box deploys.
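For a sense of what "wiring it up yourself" means, here is just the exponential back-off slice of that feature list, hand-rolled with the stdlib (names are invented; rate limiting, persistence, and completed/failed tracking would all still be missing):

```python
import functools
import time

def retry(max_attempts=5, base_delay=0.01):
    """A hand-rolled sliver of what a task queue's retry machinery does."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise
                    # exponential back-off: 0.01s, 0.02s, 0.04s, ...
                    time.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator

calls = []

@retry(max_attempts=4)
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(flaky())  # ok, after two retried failures
```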
> In most other languages you can get away with just running tasks in the background for a really long time before you need to spin up a distributed task queue
In Python too:
  # or ThreadPoolExecutor depending on the type of work
  from concurrent.futures import ProcessPoolExecutor, as_completed

  with ProcessPoolExecutor(max_workers=2) as executor:
      a = executor.submit(any_function)
      b = executor.submit(any_function)
      for future in as_completed((a, b)):
          print(future.result())
Any task you would put in celery would be a good candidate for being first passed to a process pool executor.
No, the real reason we use celery is that it solves many problems at once:
- it features configurable task queues with priorities
- it comes as an independent service accessible from multiple processes
- its activity can be inspected and monitored with stuff like flower
- it has the ability to create persistent queues that survive restart
- error handling is baked in: errors are logged and won't break your process
- it offers an optional result backend for persisting results and errors
- many distribution and error handling strategies can be configured
- tasks are composable, can depend on each other or be grouped
- celery also does recurring tasks, and better than cron
- you can start tasks from other languages than Python
Now some people use celery when they should use a ProcessPool, because they don't know it exists. But that's hardly because of the language: you didn't seem to know about it either.
It is true that Python's concurrency story is not comparable to something like Go, Erlang or Rust, but very common use cases are solved.
In the same way, we could perfectly well use the stdlib shelve module instead of redis.
We use redis because it solves many problems at once:
- it's very performant on a single machine, and can be load balanced if needed
- it has expiration baked in
- it offers many powerful data structures
- it embeds numerous strategies to deal with persistence and resilience
- it's accessible from multiple processes, and multiple languages
- its ecosystem is great, and it's hugely versatile
It's almost never a bad choice to add redis to your website. It's easy to set up, cheap to run, and it shines at managing sessions and caching for any service with up to a few million users a day.
While you're at it, why not later use it for other stuff, like storing the results of background tasks, log streams, geographical pinpointing, HyperLogLog stats, etc.? There are so many things it does better than your DBMS: faster, easier, or with fewer resources.
It's such fantastic software really.
But no, nothing prevents you from creating a queue manually in Python and serializing it manually. It's just more work for fewer features.
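For instance, a crude persistent queue with nothing but the stdlib's shelve module (the job payload here is made up):

```python
import os
import shelve
import tempfile

# "shelves instead of redis": a throwaway persistent queue on disk.
path = os.path.join(tempfile.mkdtemp(), "queue")

with shelve.open(path) as db:
    db["jobs"] = db.get("jobs", []) + [{"task": "resize_image", "id": 42}]

# ...later, possibly after a restart, drain it:
with shelve.open(path) as db:
    jobs = db.get("jobs", [])
    db["jobs"] = []

print(jobs[0]["task"])  # resize_image
```

It survives a restart, but that's about it: no priorities, no retries, no monitoring, no multi-process access.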
Thanks, I am well aware of every kind of multiprocessing and quasi-threading in the standard library, having built several large Python systems over the years.
I also understand the benefits of task queues. However, there are many cases in which you do not need any of those. Specifically, everything you wrote applies to the web backend/distributed systems use case. Doing things in the background in a simple application, not so much. My problem is exactly with introducing distributed systems machinery for a local process on a single machine that doesn't need any of it.