A conceptual overview of asyncio (github.com)
Submitted by anordin95 2 days ago
  • anordin95 2 days ago

    I've used Python's asyncio a couple of times now, but never really felt confident in my mental model of how it fundamentally works and therefore how I can best leverage it. The official docs provide decent documentation for each specific function in the package but, in my opinion, lack a cohesive overview of the system's design and architecture: something that could help the user understand the why and how behind the recommended patterns, make informed decisions about which tool in the asyncio toolkit they ought to grab, or recognize when asyncio is entirely the wrong toolkit. This is my attempt to fill that gap.

    • sandeep1998 2 days ago

      thank you for that.

    • omh1280 2 days ago

      Great read!

      Python asyncio can really screw up your runtime performance if you use it poorly. And it's _really_ easy to use poorly.

      Consider a FastAPI server using asyncio instead of threading. _Any_ time you drop down into a synchronous API, you'd better be sure that you're not doing anything slow. For example, encoding or decoding JSON in Python actually grabs the GIL depending on what library you're using, and then you have no hope of releasing control back to asyncio.

      • kccqzy 2 days ago

        That's a GIL problem not an async problem. Even if you choose to ditch asyncio and use threads, you still need to care about the GIL. And when I use asyncio I don't worry about CPU-bound tasks like encoding or decoding JSON; I worry about some library doing I/O synchronously regardless of whether such library releases the GIL or not.

        • bb88 2 days ago

          This is spot on. GIL-less Python will be a thing, and when it happens, there will still be no reason to combine asyncio with thread primitives. Waiting for IO can be spun off into a new thread, and it will work as you expect it would.

          Trying to combine mental models of asyncio and threading is a model for pure insanity.

          • boomer_joe 2 days ago

            I fail to see why. You can have an event loop per thread, plus a hypothetical requirement that all compute in each thread is spent inside its own event loop (assuming OS-level parallelism). E.g. a latency-sensitive server in thread A and a logger in thread B (you don't even need the event loop there for this example).
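
A minimal sketch of the one-event-loop-per-thread idea, assuming plain `threading` plus `asyncio.run`. The names `serve`, `thread_main`, and `loops` are hypothetical and exist only to show that each thread ends up with its own independent loop.

```python
import asyncio
import threading

loops = {}  # hypothetical registry: thread name -> the loop that ran there

async def serve(name):
    # Record which event loop is driving this coroutine.
    loops[name] = asyncio.get_running_loop()
    await asyncio.sleep(0.01)  # stand-in for real async work

def thread_main(name):
    # asyncio.run() creates a fresh event loop in the calling thread
    # and closes it when the coroutine finishes.
    asyncio.run(serve(name))

threads = [
    threading.Thread(target=thread_main, args=(name,))
    for name in ("server", "logger")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread was driven by a different loop object.
print(loops["server"] is not loops["logger"])
```

Anything shared between the two threads still needs ordinary thread-safe handoff; the loops themselves do not coordinate with each other.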

        • deathanatos 2 days ago

          JSON encoding is, as someone else points out, a GIL problem, but I want to add that even if you do JSON encoding in an async context:

            import json

            async def foo(d):
              json.dumps(d)  # you're blocking the event loop

          You're still going to block on it.

            def sync_foo(d):
              json.dumps(d)  # you're holding the GIL … and so blocking here too
          
          Short of resolving the GIL somehow (either by getting rid of it, which I believe is still a work in progress even though it has been "merged", or via subinterpreters, etc.), JSON encoding is inherently going to need to hold the GIL while it walks the structure it is encoding. (Unlike large file I/O, where it might be possible to release the GIL during the I/O if we have a strong ref to an immutable buffer.)

          • kevmo314 2 days ago

            This is more of a usability problem. In the second example, it's obvious that `json.dumps()` blocks everything else and it can be readily observed. It's not obvious that it blocks in the former and I've encountered many surprised coworkers despite it seeming obvious to me.

            I think a lot of people assume you can slap `async` onto the function signature and it will not block anything anymore. I've had PRs come through that literally added `async` to a completely synchronous function with that misunderstanding.
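
A small sketch of that misunderstanding, assuming `time.sleep` as a stand-in for any slow synchronous call such as a large `json.dumps`. The helper names (`blocking_worker`, `cooperative_worker`, `ticker`, `run`) are made up for illustration. The log order shows whether another task got a turn while the worker was busy.

```python
import asyncio
import time

async def blocking_worker(log):
    log.append("worker start")
    time.sleep(0.05)           # synchronous call: the loop cannot switch tasks
    log.append("worker end")

async def cooperative_worker(log):
    log.append("worker start")
    await asyncio.sleep(0.05)  # yields to the event loop while waiting
    log.append("worker end")

async def ticker(log):
    # A second task that only needs one brief turn on the loop.
    log.append("tick")

async def run(worker):
    log = []
    await asyncio.gather(worker(log), ticker(log))
    return log

blocking_log = asyncio.run(run(blocking_worker))
cooperative_log = asyncio.run(run(cooperative_worker))
print(blocking_log)     # ['worker start', 'worker end', 'tick']
print(cooperative_log)  # ['worker start', 'tick', 'worker end']
```

Both workers are declared `async def`; only the one that actually awaits something lets `ticker` run in the middle.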

        • alex5207 2 days ago

          [About the event loop]

          > She's behind the scenes managing resources. Some power is explicitly granted to her, but a lot of her ability to get things done comes from the respect & cooperation of her subordinates.

          What a wonderful paragraph. Playful, yet with a deep meaning. It makes the article a joy to read.

          • ISO-morphism 2 days ago

            This is great, thank you! Python's asyncio has certainly confused me more than other languages' async-await implementations.

            Nit in [1]: When timing durations inside of a program it's best to avoid the system clock as it can and does jump around. For Python, prefer time.monotonic() or time.perf_counter() over time.time() in those situations.

            [1] https://github.com/anordin95/a-conceptual-overview-of-asynci...
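
A minimal sketch of that timing advice, using `time.perf_counter()` from the standard library. The `timed` helper is hypothetical and is not taken from the linked article.

```python
import time

def timed(fn, *args):
    # perf_counter() is monotonic and high resolution, so the difference
    # of two readings is unaffected by system clock adjustments.
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

result, elapsed = timed(sum, range(1_000_000))
print(f"sum={result}, took {elapsed:.6f}s")
```

`time.monotonic()` can be swapped in the same way; only differences between two readings are meaningful for either clock.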

            • quotemstr 2 days ago

              Why would anyone want to use asyncio over trio? The latter is one of the few structured concurrency systems that doesn't make me want to pry my eyeballs out with a spoon.

              • nromiun 2 days ago

                Not every program needs thread/task cancellation. Somehow people have been convinced that threading is the same as goto and it is obviously the wrong thing to do. goto is goto, you can't take anything you dislike and say it will die like goto did.

              • rtpg 2 days ago

                I like how asyncio could just be built off of generators, and how it all ... well it mostly works, and it works well enough for people who care enough to make a whole async stack.

                I am very unhappy that asyncio led to a gold rush of people writing "async-capable" libraries that all make (IMO) really gnarly design decisions in the process. I have seen loads of newer Python projects adopt async-capable libraries that make life harder for people who like shipping stable software.

                Meanwhile a lot of existing libraries/frameworks that just have more "serious" overall designs have to churn quite a bit to support sync and async workflows.

                In theory I care a lot about Django getting async ORM support, but at this point I don't know how that's happening. My current mentality is crossing my fingers that something akin to virtual threads[0] happens.

                [0]: https://discuss.python.org/t/add-virtual-threads-to-python/9...

                • eurleif 2 days ago

                  You could use gevent. It uses green threads, so that the code you write looks like synchronous code. It can also monkeypatch core networking modules so that existing code will work without changes (including the Django ORM).

                • essnine 2 days ago

                  This is a good read. I remember first using eventlet for writing concurrent code, and then having to do a bit of mental adjustment when moving to asyncio.

                  Another piece of writing I found useful for perspective at the time was What Color is Your Function?[1], which I bumped into after looking at the Node.js model of concurrency and being confused.

                  [1] https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...

                  • Izkata 2 days ago

                    > Frankly, I'm not sure why that design decision was made and find it rather confuses the meaning of await: asynchronously wait.

                    I've always understood it to mean "wait for asynchronous object", not that the wait itself is asynchronous. It's just an English word that roughly means "wait for", that was chosen for the nice "a" prefix for asynchronous stuff.

                    • anordin95 2 days ago

                      Mmm fair point! Though, coroutines aren't really asynchronous objects in that usage, right? Since `await coroutine` would run that coroutine synchronously.

                      • Izkata 2 days ago

                        But the coroutine object itself isn't synchronous, it represents suspended processing and by stuffing it into a task can be run asynchronously. If we don't try to consider it a separate third thing, I'd still put the coroutine object in the asynchronous bucket and say await is the thing synchronously waiting for it.
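
A short sketch of the distinction being discussed, using hypothetical names (`step`, `awaited_directly`, `wrapped_in_tasks`): awaiting a coroutine object directly runs it to completion before the next line, while wrapping coroutines in tasks lets them interleave.

```python
import asyncio

async def step(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0)  # give the event loop a chance to switch tasks
    log.append(f"{name} end")

async def awaited_directly():
    log = []
    await step("a", log)  # runs to completion before "b" even starts
    await step("b", log)
    return log

async def wrapped_in_tasks():
    log = []
    task_a = asyncio.create_task(step("a", log))  # scheduled, not yet run
    task_b = asyncio.create_task(step("b", log))
    await task_a
    await task_b
    return log

direct_log = asyncio.run(awaited_directly())
task_log = asyncio.run(wrapped_in_tasks())
print(direct_log)  # ['a start', 'a end', 'b start', 'b end']
print(task_log)    # ['a start', 'b start', 'a end', 'b end']
```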

                        • undefined 2 days ago
                          [deleted]
                          • throwawaymaths 2 days ago

                            I don't think that's the case. An `await` on a coroutine requires you to be asynchronous, because you are implicitly suspending yourself until the awaited function completes (including however many suspensions the awaited function itself goes through). An await can never be synchronous; you need to pull in an event loop to cross between asynchronous functions and sync-land, not an await.
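
A minimal sketch of crossing from sync-land into async code, assuming `asyncio.run` as the bridge. `fetch_answer` and `sync_caller` are hypothetical names used only for illustration.

```python
import asyncio

async def fetch_answer():
    await asyncio.sleep(0)  # a suspension point; needs a running event loop
    return 42

def sync_caller():
    # Writing `await fetch_answer()` here would be a SyntaxError, because
    # this is not an `async def`. An event loop is the bridge instead:
    return asyncio.run(fetch_answer())

answer = sync_caller()
print(answer)  # 42
```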

                        • paulgb 2 days ago

                          This is great! Thanks for writing it.

                          One nit: the unescaped quotes in this file seem to cause a parse error (I replaced the inner ones with single quotes and it ran): https://github.com/anordin95/a-conceptual-overview-of-asynci...

                          • anordin95 2 days ago

                            Ah, I'm so glad to hear it. And thank you for the nit/feedback! I generally use Python 3.12 for my work, which doesn't error out on that line. However, Python 3.11 and below will raise a SyntaxError on it. I've fixed the issue there and in a few other places and pushed the changes :)
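
The thread doesn't show the offending line, so the following is only a sketch under the assumption that it was an f-string reusing its outer quote character inside a replacement field, which Python 3.12 accepts (PEP 701) and earlier versions reject. The `user` dict is a made-up example.

```python
user = {"name": "Ada"}

# Portable: the inner quotes differ from the outer ones, so this parses
# on older Python versions as well as 3.12+.
greeting = f"Hello, {user['name']}!"

# Only parses on Python 3.12+; earlier versions raise SyntaxError:
#   greeting = f"Hello, {user["name"]}!"

print(greeting)  # Hello, Ada!
```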

                          • cadamsdotcom 2 days ago

                            Awesome job closing a gap in the asyncio docs - wonder if it could be contributed back & be added!

                            • dboreham 2 days ago

                              Change title to "The Fundamentals of Python Asyncio"? As is, it seems like the article is going to be about the generic subject of async I/O.

                              • crvdgc 2 days ago

                                asyncio by itself doesn't support asynchronous file I/O, see their wiki: https://github.com/python/asyncio/wiki/ThirdParty#filesystem

                                You have to use something like aiofiles to do that.

                                • anordin95 2 days ago

                                  Mhm. You need another thread to accomplish async file reads, which is basically what aiofiles does. This isn't really the fault of asyncio; the necessary OS primitive isn't available. See the Linux documentation for the O_NONBLOCK flag and note this part: "Note that this flag has no effect for regular files" [1]. I actually originally wrote the sockets example in this article using file I/O until I came across this bump in the road.

                                  [1] https://man7.org/linux/man-pages/man2/open.2.html

                                  • jufter 2 days ago

                                    Yep. Even io_uring sends all block device commands to a backend thread pool.

                                    I think the only benefit is reduced syscall overhead.

                                  • jufter 2 days ago

                                    Instead of adding another dependency you can just call `loop.run_in_executor` yourself: https://github.com/Tinche/aiofiles/blob/main/src/aiofiles/ba...
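
A rough sketch of that approach, assuming the loop's default thread pool is acceptable. `read_text` and `read_text_async` are hypothetical helpers, not part of asyncio or aiofiles, and the temp file exists only to make the example self-contained.

```python
import asyncio
import os
import tempfile

def read_text(path):
    # Ordinary blocking read; it runs in a worker thread below.
    with open(path, encoding="utf-8") as f:
        return f.read()

async def read_text_async(path):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, read_text, path)

# Demo with a throwaway temp file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("hello from a worker thread")
    tmp_path = tmp.name

contents = asyncio.run(read_text_async(tmp_path))
os.remove(tmp_path)
print(contents)  # hello from a worker thread
```

The read itself is still blocking; it just blocks a pool thread instead of the event loop, so other tasks keep running while it waits.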

                                  • foresto 2 days ago

                                    Nit: I think you forgot a closing quote in part 1 after "asynchronous-function or coroutine-function".

                                    • whinvik 2 days ago

                                      This is excellent. Thanks.