Welcome to – Trying to be easy – Article 1.
I have read through a pile of articles. Some of them were easy to follow, and some of them went over my head. So, I have decided to try writing in the easiest way possible.
My simple understanding of these terms
- Concurrent Computing: Two or more programs/tasks run in overlapping time periods and don’t necessarily start, or make progress, at the same instant. Multitasking on a single-processor machine is an example of concurrent computing.
- Parallel Computing: Two or more programs literally run at the same instant, side by side, which needs more than one processor or core.
- Distributed Computing: Two or more programs run on multiple processors (same as parallel computing) but on separate machines that communicate through a network, whereas parallel computing processes typically share memory.
This post focuses on concurrent programming in Python and the tools available for it.
What options do we have for concurrency in Python?
We have many options to implement concurrency in Python. In this post, I will mostly discuss the asyncio library:
Asyncio
asyncio is a Python library that helps you write concurrent logic using async/await. The async/await pattern itself is popular in many languages (JavaScript, Python, C#, ...). Let’s talk about async without further await.
async is placed before def when defining a function in Python (like the example below).
async def main():
    # asynchronous function logic
    ...
Now, why do we have async in the function definition above? Simply put, it changes the way the function call behaves.
How does it change? Calling a regular synchronous function runs all the logic inside it. Calling an asynchronous function/method instead returns a co-routine object without running the logic inside.
async def main():
    print("hello world!")

# calling main() does not run its body, it only builds a co-routine object
print(main())
Output:
<coroutine object main at 0x7f49f3357840>
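To make the "instead of running the inside logic" part visible, here is a minimal sketch. The calls list and the variable names are my own additions for illustration. It uses asyncio.run (covered later in this article) only to actually execute the co-routine at the end.

```python
import asyncio
import inspect

calls = []  # records whether the body of main has executed

async def main():
    calls.append("body ran")
    return "hello world!"

coro = main()                             # only creates the co-routine object
ran_on_call = bool(calls)                 # still False: the body has not run
is_coroutine = inspect.iscoroutine(coro)  # True: we got a co-routine object back

result = asyncio.run(coro)                # now the body actually runs
ran_after_run = bool(calls)
```

So the function body only executes once something drives the co-routine object, not when the function is called.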
The next obvious question is: what is this co-routine object, and why does the async function call return one?
Subroutines (regular functions/methods) are entered at one point and run their logic straight through until they exit. Co-routines, on the other hand, can be entered, paused at specific points (each await), and resumed later from where they left off, which is what makes concurrent logic possible.
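The pausing and resuming can be observed with a small sketch like the one below. The worker function and the events list are hypothetical names of mine. asyncio.sleep(0) is used purely as a pause point, and asyncio.gather and asyncio.run (both discussed later in this article) are only there to start the two co-routines.

```python
import asyncio

events = []  # order in which the steps actually execute

async def worker(name):
    events.append(f"{name}: step 1")
    await asyncio.sleep(0)   # pause here so the other worker can run
    events.append(f"{name}: step 2")

async def main():
    await asyncio.gather(worker("a"), worker("b"))

asyncio.run(main())
# events is now:
# ["a: step 1", "b: step 1", "a: step 2", "b: step 2"]
```

Worker a does not finish before worker b starts: each one pauses at its await and is resumed later, which is the difference from a plain subroutine.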
How can we pause/resume a co-routine?
To understand pausing and resuming, we first need to know how a co-routine gets run: by an “event loop”.
An event loop is a loop that deals with events. It exists because a co-routine can’t hand control directly to another co-routine when it decides to pause and wait for something. It can only hand control back to its caller, which can then pick another co-routine to run (the whole point of concurrent programming). This juggler of all co-routines is generally called an event loop. To put it another way, something has to pause the different co-routines and resume them as necessary, and that something is the “event loop”.
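To get a feel for the juggling, here is a toy loop built on plain Python generators. This is not how asyncio is implemented, just a simplified sketch of the idea; countdown, run_all and log are hypothetical names of mine.

```python
from collections import deque

def countdown(name, n, log):
    while n > 0:
        log.append(f"{name} {n}")
        n -= 1
        yield  # pause and hand control back to the loop

def run_all(routines):
    ready = deque(routines)        # routines waiting for their turn
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume until the next pause
        except StopIteration:
            continue               # this routine is finished, drop it
        ready.append(current)      # not finished, back of the queue

log = []
run_all([countdown("a", 2, log), countdown("b", 1, log)])
# log is now ["a 2", "b 1", "a 1"]
```

Each routine only ever gives control back to run_all, and run_all decides who goes next. A real event loop additionally keeps track of what each co-routine is waiting for (timers, sockets, and so on).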
Continuing on pausing and resuming co-routines
Let’s understand the “await” keyword. await can only be used inside async functions/methods. Outside of them, awaitables are handed to functions that know how to run them (we will talk about these in a bit). Look at the example below:
async def main():
    await check_connection_async()
    print("hello world!")
As you may have noticed, await is placed in front of an async function call (check_connection_async). Here is what happens: when main reaches the await, its execution is paused, its state is stored, and check_connection_async begins running. Whenever check_connection_async itself has to wait for something (a network reply, for example), control goes back to the event loop, which can run other co-routines in the meantime. No extra threads are involved here: by default asyncio runs all co-routines on a single thread and simply switches between them at await points. Once check_connection_async is complete, main is resumed right after the await and carries on with the print.
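One way to check what happens with threads while a co-routine awaits is to record the thread id before, during and after the await. This is a minimal sketch: check_connection_async here is my hypothetical stand-in that just sleeps, and the seen list is an addition for illustration.

```python
import asyncio
import threading

async def check_connection_async(seen):
    # hypothetical stand-in for a real connection check
    seen.append(threading.get_ident())
    await asyncio.sleep(0.01)
    seen.append(threading.get_ident())

async def main():
    seen = [threading.get_ident()]       # before the await
    await check_connection_async(seen)   # main is paused here
    seen.append(threading.get_ident())   # after main is resumed
    return seen

thread_ids = asyncio.run(main())
same_thread = len(set(thread_ids)) == 1
```

Under these assumptions, all four recorded ids are identical: the pausing and resuming is done by the event loop on one thread, not by switching threads.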
So calling check_connection_async() gives us a co-routine object, which can be awaited. Extending this, there are three main types of awaitable objects: co-routines (already discussed), Tasks, and Futures.
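As a rough sketch, this is what awaiting each of the three kinds looks like. The compute function is a hypothetical example of mine, and the Future is filled in by hand with loop.call_soon just to have something to await. In everyday code you rarely create Futures yourself.

```python
import asyncio

async def compute():
    return 1

async def main():
    # 1. a co-routine object
    from_coroutine = await compute()

    # 2. a Task: a co-routine wrapped and scheduled on the event loop
    task = asyncio.create_task(compute())
    from_task = await task

    # 3. a Future: a low-level placeholder for a result that arrives later
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    loop.call_soon(future.set_result, 3)  # someone fills it in later
    from_future = await future

    return from_coroutine, from_task, from_future

results = asyncio.run(main())
# results is (1, 1, 3)
```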
Checkpoint
By now, you should be comfortable with the idea of a co-routine, the difference between a subroutine and a co-routine, and using async and await. Next, we’re going to look at how to submit these co-routine objects to the event loop so that they run concurrently.
asyncio.run:
This function creates a new event loop, runs the given co-routine until it completes, and then closes the loop (it is one of the functions that can handle awaitables, as mentioned above).
import asyncio

async def test(plug):
    await check_connection_async(plug)
    print("hello world!")

# plug is whichever plug you want to test
asyncio.run(test(plug))
But, in this example we are trying to test one plug. What if there are multiple plugs and you want to test all of them concurrently? That’s where tasks come in –
Task
- Tasks are used to run multiple co-routines concurrently. How is that? Let’s write one simple example:
async def test(plug):
    await check_connection_async(plug)
    print(f"{plug} is working")

def testing_plugs():
    # asyncio.create_task needs a running event loop,
    # so call this from inside a co-routine
    plugs = get_all_plugs()
    plug_testing_tasks = []
    for plug in plugs:
        plug_test_task = asyncio.create_task(test(plug))
        plug_testing_tasks.append(plug_test_task)
What did we do in the above code?
- Method 1: test, which tests a plug by awaiting check_connection_async and then printing a working message.
- Method 2: testing_plugs, which does the following:
- It gets all the plugs.
- It iterates through each one of them and creates a task (which takes the co-routine test(plug)). Creating the task also schedules the co-routine to run on the event loop.
- create_task returns a Task object, an awaitable handle, which we store in plug_testing_tasks.
- Why did we store the handles returned by create_task? Each handle can be awaited (like await task), just as we did earlier with co-routines. But here we have a bunch of tasks, so what else can we do?
asyncio.gather
async def test(plug):
    await check_connection_async(plug)
    print(f"{plug} is working")

async def testing_plugs():
    plugs = get_all_plugs()
    plug_testing_tasks = []
    for plug in plugs:
        plug_test_task = asyncio.create_task(test(plug))
        plug_testing_tasks.append(plug_test_task)
    # wait until every task has finished
    await asyncio.gather(*plug_testing_tasks)

asyncio.run(testing_plugs())
Notice the difference between the previous code snippet and this one: the second method (testing_plugs) became asynchronous, because it awaits the awaitable object returned by asyncio.gather(), and await is only allowed inside an async function. That in turn means testing_plugs itself has to be run with asyncio.run. Is there a way to do it other than asyncio.run? Of course, there is.
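Before getting to that alternative, one more detail about asyncio.gather that is handy in practice: it collects the return values in the order the awaitables were passed in, not in the order they finished. A minimal sketch, assuming a fake check_connection_async with a configurable delay (the delay values, plug names and the finished list are my own placeholders):

```python
import asyncio

finished = []  # order in which the checks actually complete

async def check_connection_async(plug, delay):
    # hypothetical stand-in: slower plugs take longer to answer
    await asyncio.sleep(delay)
    finished.append(plug)
    return f"{plug} is working"

async def testing_plugs():
    delays = {"plug-1": 0.05, "plug-2": 0.01}
    tasks = [
        asyncio.create_task(check_connection_async(plug, delay))
        for plug, delay in delays.items()
    ]
    return await asyncio.gather(*tasks)

messages = asyncio.run(testing_plugs())
# messages keeps the input order: plug-1 first, then plug-2,
# even though plug-2 finished first
```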
loop.run_until_complete
Let’s jump directly to an example of this approach, and then discuss what exactly this loop is.
async def test(plug):
    await check_connection_async(plug)
    print(f"{plug} is working")

def testing_plugs():
    loop = asyncio.get_event_loop()
    plugs = get_all_plugs()
    plug_testing_tasks = []
    for plug in plugs:
        # the loop is not running yet, so create the task on the loop itself
        plug_test_task = loop.create_task(test(plug))
        plug_testing_tasks.append(plug_test_task)
    loop.run_until_complete(asyncio.gather(*plug_testing_tasks))
- Here, the second method (testing_plugs) remains a synchronous method but still manages to run awaitable objects. How?
- Two steps differ in this code:
- loop = asyncio.get_event_loop() fetches the current event loop. In recent Python versions, calling it when no event loop exists yet is deprecated, and asyncio.new_event_loop() is the explicit alternative.
- Once you have the event loop, you create the tasks on it and submit the gathered awaitable to it with loop.run_until_complete, which runs the loop until everything completes.
- So, on the whole, the second method (testing_plugs) acts as a wrapper for getting the event loop, creating tasks, gathering them, and submitting them to the event loop to run until every task completes.
Remember, these are just two commonly used conventions for running async methods, and there are many other ways to run these co-routines. The methods discussed above also have many more options. I would strongly encourage you to go through the official Python docs to read further about them. I feel this article gives you enough information to go on reading about, and writing, concurrent programs!
See ya in the next article, on parallel programming!
References:
- https://docs.python.org/3/library/asyncio-task.html
- https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.get_running_loop
- https://stackoverflow.com/questions/62528272/what-does-asyncio-create-task-do
- https://stackoverflow.com/questions/62286255/difference-between-ways-to-run-a-function-asynchronously