
I see a lot of comments talking about how python does not have a go-like concurrency story.

FYI: Python ASGI frameworks like FastAPI/Starlette are the same developer experience as Go. They also compete on TechEmpower benchmarks, and are used in production by Uber, Microsoft, etc.

A queue-based system is used for a very different tradeoff of persistence vs concurrency. It's similar to saying that the use case for Kafka doesn't exist because Go can do concurrency.

Running "python -m asyncio" launches a natively async REPL. https://www.integralist.co.uk/posts/python-asyncio/#running-...

Go play with it ;)



I think a hard part with lots of these “what do I use x for” examples is that they start with the tool and then discuss the problem it solves. I find it more helpful to start with a problem, and discuss the various tools that address it in different ways.

Forget email; say you have an app that scans links in comments for maliciousness. You rely on an internal API for checking against a known blacklist, which follows shortened links first, and an external API from a third party. You want the comment to appear to submit instantly for the poster, but you are comfortable waiting for it to appear for everyone else. What are your options?

You could certainly use message queues and workers. If you’re cloud native maybe you leverage lambdas. Maybe you spin up an independent service that does the processing and inserting into the database in the background, and all you need to do is send a simple HTTP request on an internal network.

Your solution depends on your throughput requirements, the size of your team and their engineering capabilities, and what existing solutions you have in place. Everything has its pros and cons. Pretending that celery/redis is useless and that everything would be solved if everyone just used Java ignores the fact that celery and redis are widely popular and drive many successful applications and use cases.
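The simplest of those options can be sketched in-process with just the stdlib; `check_links` here is a made-up stand-in for the two link-checking APIs, and a real deployment would put the queue in a broker so the work survives a restart:

```python
import queue
import threading

scan_queue = queue.Queue()
published = []

def check_links(comment):
    # Stand-in for the internal blacklist API and the third-party API.
    return "evil.example" not in comment

def worker():
    while True:
        comment = scan_queue.get()
        if check_links(comment):
            published.append(comment)
        scan_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

# The "submit" path: enqueue and return to the poster immediately.
scan_queue.put("see http://evil.example/x")
scan_queue.put("a harmless comment")
scan_queue.join()  # only here so the example can show the outcome
print(published)  # ['a harmless comment']
```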


We've been experimenting with nice compromises for using pydata on-the-fly compute with caching. The basic tension is that individual requests get blocked by their blocking APIs, while caches (e.g., IP-pinned) work best when they live in the same Python app thread.

Right now, we do hypercorn multiproc -> per-proc quart/asyncio/aiohttp IP-pinned event loop -> Apache arrow in-app cache -> on-gpu rapids.ai cache.

But we're not happy with the event loop, due to pandas/rapids blocking it when concurrent users bring heavy datasets. (Which takes us back to celery, redis, etc., which we don't want due to the extra data movement...) Maybe we can get immutable Arrow buffers shared across Python proc threads..

Ideas welcome!


While I agree with the rest of your comment, the sentence "if you’re cloud native maybe you leverage lambdas" made me irrationally angry.


Can you explain why? I use lambdas often and they seem to solve the problems they're meant for well.


It wasn't the lambdas, it was the combination of "cloud-native", which is a very salesmany term, and "leverage", which is my pet hate word. It's exactly as useful as "use", only much more pretentious. I'm just easily triggered with language :P

More off-topic (or, rather, on-topic), I find lambdas great for things like a static website that needs a few functions. I especially like how Netlify uses them, they seem to fit that purpose exactly.


> I'm just easily triggered with language :P

Me too! It makes me irrationally angry when people regurgitate linguistic clichés. I was already mad with:

"python does not have a go-like concurrency story"

when it would be enough (and 1000x less cringe) to say:

"python does not have go-like concurrency"

I think these mindless clichés make language really ugly and dysfunctional, and even worse they are thought-stoppers, because they make the reader/listener feel like something smart is being said, because they recognize the "in-group" lingo. In my experience, people get really offended when you point this out. It's kind of an HN taboo to discuss this. Which is also interesting in itself.

Going forward we should pay more attention to our communication use cases. Btw: I wonder if we can stack several of these clichés. For example: "leverage" + "use case" = "leverage case".


I agree; I think Orwell's "Politics and the English Language" is spot on here. I try to use simpler language whenever possible. People think that using longer words makes them sound smart, but it is just worse for communication.

I've found it's a taboo to discuss anything even slightly personal. People are averse to feeling bad, so criticism needs to be extremely subtle in order to not offend.

> Btw: I wonder if we can stack several of these clichés. For example: "leverage" + "use case" = "leverage case".

I hate you for even thinking of this.


> I've found it's a taboo to discuss anything even slightly personal. People are averse to feeling bad, so criticism needs to be extremely subtle in order to not offend.

The personal association you made between "discussing anything even slightly personal" and "criticism needs to be extremely subtle" makes it sound like your problem isn't language or Orwellian discourse, but the way you subconsciously link discussing personal matters with harshly criticising those you speak with for no good reason.

If your personal conversations boil down to appeasing your own personal need to criticise others, then I'm sorry to break it to you, but your problem isn't language.


You just misconstrued my saying "personal" and clearly meaning "personal criticism" as meaning personal things in general and then criticized me on that straw man. I don't hold that opinion at all.

You also went to "Orwellian discourse", which has a specific meaning, from a text by Orwell I mentioned. It seems to me like you got personally offended, interpreted my comment in the most uncharitable way, and chose to lash out at me instead, and I'm not sure why. I wasn't even talking about anyone specifically.


You were the one associating discussing remotely personal stuff with criticising others, and as if that were not bad enough, your personal take was that you felt the need to keep criticising others while resorting to subtlety, just to keep shoveling criticism without sparking the reactions you're getting for doing the thing you want to do to others.

I repeat, your problem is not language. Your problem is that you manifest a need to criticize others. That problem is all on you.


That is quite a stretch of the imagination, and I sure didn't read it that way. I may be wrong, but here's one fun example from this comment section that I wanted to “respond” to and demand some clarification on.

https://news.ycombinator.com/item?id=22911497

PS: I work with like 65-70% of that stack daily


Ah, okay. If I had said "any remotely personal criticism", like I meant, you'd have an entirely different conclusion, but I guess that doesn't matter when you just want to jump to conclusions.

This shows from the fact that you're saying "criticize" like it's a bad thing.


Besides, serious projects in Go do use additional tools for creating task queues, because they need to handle various stages of persistence, error handling, serialization, workflow, etc. That's not stuff you want to write by hand yourself.

If you don't need all that, then it's not a problem in Python either. You don't even need asyncio, you can just use a ThreadPool or a ProcessPool, dump stuff with pickle/shelve/sqlite3 and be on your way.
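A minimal sketch of that stdlib-only route, with a trivial `fetch` standing in for real I/O-bound work:

```python
from concurrent.futures import ThreadPoolExecutor
import os
import pickle
import tempfile

def fetch(url):
    # Stand-in for a blocking I/O task (HTTP call, DB query, ...).
    return len(url)

urls = ["https://a.example", "https://bb.example"]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch, urls))

# "Dump stuff with pickle" and be on your way -- no broker, no workers.
path = os.path.join(tempfile.mkdtemp(), "results.pkl")
with open(path, "wb") as f:
    pickle.dump(results, f)
print(results)  # [17, 18]
```

Swap `ThreadPoolExecutor` for `ProcessPoolExecutor` if the work is CPU-bound rather than I/O-bound.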


I was using gevent in Python about 10 years ago, and from memory, it's roughly similar to goroutines. It's not exactly the same of course, but just like goroutines it's pretty easy to just spawn a few jobs off, wait for them to finish, and get the results.

It's not in the standard library, and there are probably other options too now (not a heavy Python user any more), but Python has had easy parallelism for at least a decade (probably longer).


Gevent is like goroutines with GOMAXPROCS=1. Which is to say not nearly as useful. It gives you concurrency without parallelism, because Python never did shake the GIL. Which goes to show some technical debt will haunt you forever.


Funny you mention the GIL - this is being worked on as part of PEP 554, which was slated for Python 3.9 (now in alpha).

And it's showing great promise, though it could be delayed.

https://www.mail-archive.com/python-dev@python.org/msg108063...

2021 is probably going to be the "Gone GIL" moment!


Why don't they use/fork the Go runtime?


There are a few reasons: 1. This would definitely break the CPython API, which is not an option for mainstream Python. 2. The Golang runtime isn't really well-understood as a backend for languages that aren't Golang, although I acknowledge that there's no reason in principle why you couldn't compile $LANGUAGE to Golang.


To add nuance to your comment: you can still get some form of parallelism, "just not" thread parallelism, in Python. You can still spawn multiple handler processes, or have threaded code in a C extension.


Yes, you can use multiple processes to get parallelism. But that's quite limiting compared with goroutines. Passing data back and forth is hard, and you can pretty much forget about shared data structures. Memory usage is also much higher.


It’s arguable whether either of those are ‘in python’.


The multiprocessing module in the standard library is absolutely a Python-native way to do parallelism:

    from multiprocessing import Pool

    def f(x):
        return x * x

    if __name__ == "__main__":
        with Pool(5) as p:
            print(p.map(f, [1, 2, 3]))

This runs f(1), f(2), and f(3) in parallel, using a pool of five processes (the import, f, and the `if __name__` guard are spelled out so the snippet runs as-is).

https://docs.python.org/3.6/library/multiprocessing.html


Whoa! It never occurred to me that `Pool` could be used as a context manager. I've always typed out `.close()` manually like a sucker.


https://docs.python.org/2/library/multiprocessing.html

Python standard library since 2.6. Pretty much the definition of "in python".


> Gevent is like goroutines with GOMAXPROCS=1.

On its own, yes. For webapps, you can easily combine it with a multi-process WSGI server (like gunicorn or similar).


Running four processes with GOMAXPROCS=1 is strictly worse than 1 process with GOMAXPROCS=4. The difference is coarse-vs-fine grained parallelism. Notably, a WSGI+gevent system doesn't allow you to do parallelism within a request, not to mention configuring these WSGI implementations (especially for production) is a bunch of extra headache for which there is no analogy in Go.


The GIL is not a total ban on thread parallelism. It's a significant obstacle, but not a complete stop.

Besides, Go has its own set of problems with parallelism. None of those are best in class.


This comment makes it sound like Go and Python are pretty much in the same class because neither is perfect. I invite anyone who thinks this way to write considerable (shared-memory) parallel code in both languages and see which they prefer.

As for Go's "set of problems with parallelism", they're pretty much just that sharing memory is hard to do correctly without giving up performance. No languages do this well; Rust and Haskell make it appear easier by making single-threaded code more difficult to write--requiring you to adhere to invariants like functional purity or borrowing. If you're writing Python, you very likely have values that are incompatible with these invariants (you want to onboard new developers quickly and you want your developers to write code quickly and you're willing to trade off on correctness to do so).

Go is absolutely best-in-class if you have typical Python values.


Why bother writing shared-memory parallel code if it makes your life so hard? Most of the time you're i/o bound, or network bound, or storage bound. Being compute bound is exceptionally rare these days.


I do it when I have to do it. Most things are I/O bound, but sometimes things are compute bound. And since Python is about two orders of magnitude slower than Go (and Python also makes it much harder to optimize straight-line execution, because it lacks semantics for expressing memory layout), you tend to need parallelism more often than you would in a fast language. Sometimes you can leverage pandas or write a small chunk in C, but very often those options aren't available, and naively throwing C at the problem can make your performance worse.


> Go is absolutely best-in-class if you have typical Python values.

"Best in class" is not a relative term. Neither Go nor Python is an appropriate choice for highly parallel intercommunicating code. Yes, Python is more limited than Go here, but that hardly makes a difference when you avoid such code.

Haskell and Rust do make it easier, by forcing developers to organize their code in a completely different way. Erlang does the same, with a different kind of organization. None of those languages are more difficult to program in, but yes, they are hard to learn.


> Neither Go nor Python is an appropriate choice for highly parallel intercommunicating code.

That's an extraordinary claim, which needs evidence.


Extraordinary?

Name any single Go feature aimed at helping parallel computing.


I still use and greatly prefer gevent to this day. They are indeed quite similar to goroutines. Asyncio is a different model, and more irritating for me to work with. (I'm sure this isn't the case for everyone. I'm just more productive with gevent, personally.)

Performance is pretty close for both. I was disappointed to not see Python get an official green thread implementation. The counter-argument I commonly see cited is https://glyph.twistedmatrix.com/2014/02/unyielding.html. I personally don't find it to be a very convincing argument.


What's missing in asyncio when compared to gevent?

The coroutine and queue model is the same, right?

Cool thing in 3.8:

Running python -m asyncio launches a natively async REPL.


They're just completely different models. gevent is green threads, asyncio is explicit coroutines. In gevent, you don't use an "async" or "await" syntax. You just spawn greenlets (green threads), which run in the background, and if you want, you can have them block until they return a result. And they're efficient, so you can spawn thousands without a problem.


I always considered "green threads" and "coroutines" to be the same thing? Are they not?


The difference is that with coroutines, yielding is explicit. Tight-looping will hog the CPU, and a blocking call will block other coroutines too. Typically "green threads" are semantically just threads but cheaper. They're scheduled independently, so there's no risk of them hogging the CPU, and you can use synchronous apis. The one downside is that you need explicit synchronization between them, whereas with coroutines you can mutate shared data structures and not worry about race conditions as long as you don't yield in between.
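A toy asyncio example of the explicit-yield behaviour; control only switches at await points, which the interleaving makes visible:

```python
import asyncio

log = []

async def task(name):
    log.append(f"{name} start")
    await asyncio.sleep(0)  # the explicit yield point
    log.append(f"{name} end")

async def main():
    await asyncio.gather(task("a"), task("b"))

asyncio.run(main())
print(log)  # ['a start', 'b start', 'a end', 'b end']
```

Remove the `await asyncio.sleep(0)` and each coroutine runs start-to-end without interleaving; with green threads the scheduler makes that choice for you.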


They are both built on the same technology in the CPython runtime.


If you have a significant Python code base that is not async, then all of it needs to be ported to the async model, whereas with gevent I can do monkey patching and move to a concurrency model. If I am starting a fresh project with Python and need concurrency, then yes, "async" is the better choice; but if you already have a code base, moving to async is a fair amount of work.


With asyncio, your whole app falls over if you accidentally call a library function that makes a sync API call under the covers. gevent (as I understand it; I haven't actually used it) will patch all sync APIs and make them async. Also, if you do `aiohttp.get("www.example.com/foo.json").json()`, you get an AttributeError because the coroutine has no method `.json()` (you forgot `await`), unless you're using Mypy.
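For anyone who hasn't hit it, the forgotten-await failure mode looks like this (a stand-in `get` coroutine instead of real aiohttp; at runtime it surfaces as an AttributeError on the coroutine object):

```python
import asyncio

class Response:
    def json(self):
        return {"ok": True}

async def get(url):
    # Stand-in for an aiohttp-style request coroutine.
    return Response()

async def main():
    resp = get("https://www.example.com/foo.json")  # forgot await!
    try:
        resp.json()
    except AttributeError as e:
        print(e)  # 'coroutine' object has no attribute 'json'
    resp.close()  # silence the "never awaited" warning
    return (await get("https://www.example.com/foo.json")).json()

print(asyncio.run(main()))  # {'ok': True}
```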


Yep, that about sums it up. gevent can't monkeypatch synchronous code that's implemented in non-Python native modules, but I think pretty much all native Python libraries struggle with those sorts of things, and asyncio of course also can't deal with it.

The vast majority of the time, gevent's monkeypatching works without any issues. With asyncio, you basically have to rewrite everything from the ground up to always use the new async APIs, and you can't interact with libraries that do sync I/O.


> FYI: Python ASGI frameworks like FastAPI/Starlette are the same developer experience as Go.

Can you provide some context for this statement? I've used Python asyncio extensively in Fargate (no ASGI frontend), and the developer experience is far from Go; I don't see how an ASGI framework can fix this. It seems to offer the same general coarse-grained parallelism that you get from a containerized environment like Fargate, except that Fargate abstracts over a cluster while ASGI frameworks presumably just abstract over a handful of CPUs.

For example, we have a large data structure that we have to load and process for each request. We want to parallelize the processing of that structure, but the costs to pickle it for a multiprocessing approach are much too large. We've considered the memory mapped file approach, but it has its own issues. We're also looking at stateful-server solutions, like Dask, but now we're talking about running infrastructure. In Go, we could just fork a few goroutines and be on our way.


You should ask this exact same question here - https://github.com/tiangolo/fastapi/

I don't claim to have expertise in your business domain, but you should get an answer there.


The difference is that it doesn't play well with large parts of the existing Python ecosystem. Concurrency is a pain if you are writing a library or trying to integrate with existing code. Python async is powerful, but:

(1) it's undersupported (e.g. you can in theory download s3 files using async botocore, but in practice it is hard to use because of strict botocore version dependencies)

(2) it isn't natural - once you're in the event loop it makes sense, but using an event loop alongside normal python code is confusing at best.

(3) it got introduced too late. The best async primitives are only available in pretty recent versions of python3.

The difference with Go is that Go has these primitives built in from the beginning, and using them doesn't introduce interoperability problems.


And on top of that, Go is performant enough that 90% of the time you don't need them to begin with.


The most common use case for concurrency is IO-intensive workloads. When it comes down to IO, the programming language hardly matters.

If you have a CPU-intensive workload, an optimizing compiler can help. Well, for that you have C or other languages with more mature and aggressive compilers.


I've played with the concurrency in Python, and it's simply not worth it. Much better to use Node.js or Go where the async story is not an afterthought.

Of course if you are stuck with Python it's better than nothing.


> same developer experience as Go

You can't say that when Python shoves Pip/Poetry/VirtualEnv/Black/Flake8 and whatnot at people. In contrast, Go has built-in package management and gofmt. Python is essentially gate-keeping new devs.


Go has had an equally bad packaging experience.

Kubernetes - which is one of the biggest projects built in Go - has been struggling with dependency and package management.

Here's the CTO of Rancher commenting on his struggles

https://twitter.com/ibuildthecloud/status/118752909888666419...

https://twitter.com/ibuildthecloud/status/118753821015230873...

This is not trivial stuff, and it shouldn't be trivialised into a Go vs Python flamewar. Because it can't be.


I've used Python and Go extensively. Go's packaging story has a few rough edges, but Python's is an impenetrable maze of competing tools that each purport to address others' major hidden pitfalls.

To work with Python packages, you have to pick the right subset of these technologies to work with, and you'll probably have to change course several times because all of them have major hidden pitfalls:

* wheels

* eggs

* pex

* shiv

* setuptools

* sdist

* bdist

* virtualenv

* pipenv

* pyenv

* sys.path

* pyproject.toml

* pip

* pipfile

* poetry

* twine

* anaconda

To work with Go:

* Publish package (including documentation): git tag $VERSION && git push $VERSION

* Add a dependency: add the import declaration in your file and `go build`

* Distribution: Build a static binary for every target platform and send it to whomever. No need to have a special runtime (or version thereof) installed nor any kind of virtual environment nor any particular set of dependencies.


Yeah, until github is unreachable and the entire Go universe grinds to an immediate halt because nothing will build.

Python packaging is a mess, but Go doesn't even bother. "Just download from some VCS we'll pretend is 100% reliable and compile from source" is not a packaging solution.


How is that any different than the entire Python universe grinding to an immediate halt if there's an issue with pypi.python.org? (Hint: it's not.)

You can certainly debate the difference in uptime between specific services; I don't know either way, but if you told me that PyPI had higher uptime than GitHub, I'd believe you... but that's kinda missing the point. If you depend on an online service to host your release artifacts, then if and when that service goes down, it's gonna hurt.

Meanwhile, Python's packaging wars continue to rage on. Go's is simple: a release is a tag in a VCS repository. I'm sure there are issues with that as well, but that should come as no surprise, considering there are issues with literally every packaging solution. At any rate, there's little moral difference between downloading a tarball (or a wheel, or... whatever), vs. pulling a tag from a git repo. It requires equal levels of trust to believe that no one has tampered with prior releases in both cases.

I'd like to also point out that I don't have a dog in this race. I've done a little Go here and there, but frankly I don't like the ergonomics of the language too much, so I stay away from it. I've done (and continue to do) a decent amount of Python. I like the language, but tend to prefer strongly-typed, functional languages, and languages with performant runtimes, so I tend to only use it for smaller projects.


You can trivially run your own local PyPI mirror or install packages directly from some other source (e.g. S3 bucket, LAN storage). Is there a way to do that for Go? If so, I've never seen it done.
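For reference, pointing pip at a mirror is a one-line config change; the mirror URL below is a placeholder:

```
# ~/.config/pip/pip.conf (or /etc/pip.conf)
[global]
index-url = https://pypi.mirror.internal/simple/
```

Installing straight from local storage works too, e.g. `pip install --no-index --find-links /path/to/wheelhouse somepackage`.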


It's possible as of the module proxy in Go 1.13, but this was not well documented, suffered from competing implementations, and was introduced in a way that probably broke more builds than it helped.


Yes, you just vendor your dependencies.

Go has support for a proxy system; the tooling is still immature though.


I’ll take “doesn’t even try, but just works” all day every day. What is GitHub’s downtime for cloning in the last year, and how does it compare to PyPI? And if you’re really worried, why not use a caching proxy, just like you do with PyPI? In my experience (using Python since 2008 and Go since 2012), Go package management has far fewer problems.


I can't build lego from source due to a failed dependency. The docs don't help either; they're plain wrong. To make matters worse, Go pulls the latest dev version, so good luck trying to build a stable binary of some complex package. I opened an issue which was promptly closed, and I was told to "just download the binary dist, source builds are for devs". To add insult to injury, each project is built in its own, usually broken, way. Out of date software? Good luck. Sorry, but I've had overall better experiences installing random Python programs with pip or building D libs with dub. Pulling half of GitHub rarely qualifies as "package management". It only encourages a giant mess, which is precisely what software development has been lately. Go is probably worse than npm in this respect.


I have no idea what you're talking about. These problems don't exist anymore, since godep and now Go modules, which are built into the standard Go tooling.


> Yeah, until github is unreachable and the entire Go universe grinds to an immediate halt because nothing will build.

That's what vendoring and the proxy cache are for. This problem hasn't existed since like Go 1.8, and is completely resolved in Go 1.14.


You're combining multiple problems: maintaining a package for redistribution, and using packages. For the second, the much more common case, 2/3 of the things on your list are irrelevant.


For either case, Python’s story is more complex than Go’s.


Go may be unique in being the only ecosystem built after Python that can't claim it avoided Python's packaging disasters.


How do you figure? Go's packaging is wayyyy better than Python's. I've done considerable work with each and while Go's ecosystem has warts here and there, it's far from disastrous. I can't say that about Python.

If nothing else, Go lets you distribute a static binary with everything built in, including the runtime. Python's closest analog is PEX files, but these don't include the runtime and often require you to have the right `.so` files installed on your system, and they also don't work with libraries that assume they are unpacked to the system packages directory or similar. In general, it also takes much longer to build a PEX file than to compile a Go project. Unfortunately, PEX files aren't even very common in the Python ecosystem.


In the context of

> Pip/Poetry/VirtualEnv

"packaging" refers to the way the language manages dependencies during the build and import process, not how you distribute programs you have built.

Python has a deservedly poor reputation here, having churned through a dozen major overlapping different-but-not-really tools in my decade and a half using it. And even the most recent one is only about a year into wide adoption, so I wouldn't count on this being over.

Go tried to ignore modules entirely, using the incredibly idiosyncratic GOPATH approach, got (I think) four major competing implementations within half as long, finally started converging, then Google blew a huge amount of political capital countermanding the community's decision. My experience with Go modules has been mostly positive, but there's no really major new idea in it that needed a decade to stew nor the amount of emotional energy. (MVS is nice but an incremental improvement over lockfiles, especially as go.sum ends up morally a lockfile anyway.)


I'm slowly deprecating a Python system at work and replacing it with Elixir. We don't use containerization or anything, and installing the Python system is a nightmare. You have to set up virtualenvs, not to mention celery and rabbit, and god help you if you're trying to operate it and you forget something or other.

With Elixir, you run "mix release", and the release pipeline is set up to automatically gzip the release (it's one line of code to include that). Shoot the gzip over (actually I upload to S3 and redownload), unzip, and the entire environment, the dependencies, the VM, literally everything comes over. The only thing I have to do is sudo setcap cap_net_bind_service=+ep on the VM binary inside the distribution, because Linux is weird, and, as they say, "it just works".


I fully agree with this assessment, but I don’t see how this puts Python’s story on par with Go’s. While GOPATH was certainly idiosyncratic, it generally just worked for me. While go modules aren’t perfect and the history was frustrating, they generally work fine for me. Python feels like an uphill battle by comparison.


If Go sticks with modules and doesn't keep making significant changes (e.g. the proxy introduced in 1.13 was not handled well), then it will be better than Python.

But if Python finally "picks" poetry, sticks with it for a few years and incrementally fixes problems rather than rolling out yet another new tool, that will also be better.

You can only identify the end of the churn for either retroactively. Python just looks worse right now because it's been around longer.


The difference here is track record.

Go: tends to wait and implement something once the problem is understood. Took 2 years after the Go maintainers decided to solve the dependency issues, and as of the latest release it's finally been labelled production ready.

And honestly, the proxy issues were not real. Go modules were still optional; you could just turn them off.

Python is how old now? A couple decades? And it has only gotten worse over time.


Another thing Python and Go unfortunately have in common is a community (not necessarily core developers) with knee-jerk reactions to any criticism.

> Go: tends to wait and implement something once the problem is understood.

Go's modules provide no additional "understanding" over any of the other Bundler-derived solutions in the world. MVS was the primary innovation, but wanting checksum validation means I have to track all the same data anyway.

> Took 2 years after the Go maintainers decided to solve the dependency issues

This is revisionist history. There were other official "solutions" before ("you don't need it", "vgo is good enough", and "we'll follow community decisions"). If this one sticks, it's fine. But you can't say it's good now just because it's the one we have now - it's good now if it's the one we still manage to have in five years.

Go's track record is not "good" (in that regard I think only Cargo qualifies). At best it's "mercifully short."

> And honestly, the proxy issues were not real.

Documentation was poor, the needed flags changed shortly before release, the design risks information leaks, and the entire system should not have been on by default for at least one more minor version.

> Python is how old now? A couple decades? And it has only gotten worse over time.

Yeah, that's exactly why I said "Python just looks worse right now because it's been around longer." It hasn't gotten worse though, it just also hasn't stopped churning. And if Go doesn't stop churning, in 10 years it will look the same.

The age argument works both ways - multiple major versions of Python predate Bundler. Go has no excuse for taking so long to reinvent "Bundler with incidentals", just like every other language.


I believe Python suffers from a lack of leadership in that space (everyone creates their own packaging, every tutorial advocates something different, many tutorials are outright wrong).

There was also the bad decision of using Python code for installation (setup.py) instead of a declarative format.

Most of those issues are actually fixed in setuptools, if you put all settings in setup.cfg and just call an empty setup() in setup.py.

Like here: https://github.com/takeda/example_python_project/blob/master...
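For reference, a minimal sketch of that declarative layout (project name and dependency are placeholders):

```
# setup.cfg
[metadata]
name = example-project
version = 0.1.0

[options]
packages = find:
install_requires =
    requests
```

with setup.py reduced to `from setuptools import setup; setup()`.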


Cargo works pretty well too.


Cargo is not from Go


As a name for a package manager, "cargo" certainly appears more suitable for "go" than for "rust".



