Tech Corner - 6. May 2022

Asynchronous Programming World (Part 1)

What will this multi-episode blog post be about?

In short:

Async programming principles…

…starting with Thread API, Executors and Callbacks

…through the Promises & Futures

…back to the Threads (Green ones!)

…on to the Actors

…perhaps ending with Effects & Reactive magic!

#JVM #Java #Scala #Kotlin #Rust #JavaScript #Go

The motivation

Life is too short, we can’t afford to do things synchronously!

Fortunately, with the help of others, we can do multiple things in parallel. And in the case of work that is not fully continuous, where we have windows in which we have to wait before we can proceed – that’s our opportunity to optimize our processes. Let’s use that time for another task! Waiting is a waste of life – isn’t it enough that people have to sleep?

Don’t worry… I’m actually only referring to programming here. Sleep is definitely a good thing ;) So, how can we get more done in less time? 

Have you ever heard something like “let’s trigger some asynchronous computation for these non-blocking I/Os with the language construct that looks synchronous, but also spawn blocking tasks with the help of managed blocking in separate threads so that we will not affect the non-blocking computation performance”?

What the heck, really? Like... I just wanted to run something asynchronously to the main flow of my program.

“Everything simply works, and the abstraction is here to make the life of software developers easier.” – This is the sweet illusion that a programming language often promises with its amazing levels of abstraction and syntactic sugar. But to write effective code, it ultimately comes down to the necessity of having a deep understanding of asynchronous models, language features, library APIs and their internal implementations. One can’t rely on abstraction alone.

Either you are the one who has to explain during the code review why the code is written the way it is, the one suggesting async ideas that will make the solution better, or the one responsible for the quality of the product. Whichever you are, let’s take this first blog episode as the foundation on the way to mastering your craft.

Episode #1 Intro

In this episode we will start with concepts, principles, language constructs and APIs, and we will touch on a couple of hidden mysteries that we will return to in later episodes.

Let's start with the early age of software development, where the language abstraction was about having the simple thread API that we can simply map to underlying OS threads, then continue on to talk about how to use this thread API for asynchronous programming with the famous concept of callbacks – and the related callback hell that this approach can cause.

We’ll follow this up with thread pools and executors, and continue on with a higher level of abstraction using Futures and/or Promises and the simple ways of how to work with them. 

Later, in the next episodes, we’ll move a step above to async/await constructs or even “for comprehension” constructs, and return to the threads – well, to the green ones!

“Everyone likes theory”

I really don’t want to bother you with a theoretical part in this blog, but there are some “buzzwords” I want to recap, as they are critical to everything I’ll be discussing here and form the very foundations of the solution design.

Asynchronous vs Synchronous

When talking about synchronous computation, we execute tasks one at a time, waiting until the first one is done before continuing with the second. In an async world, we can spawn one task and continue with another without waiting for the first to finish. Usually we have some main flow of the program and we spawn new tasks to run asynchronously while the main flow continues; at some point we either use the results of all the tasks, or we have registered callbacks to handle the results of the async computations. It depends on the async features of the language.

In general (and in this blog for sure) when talking about asynchronous programming we are actually pointing at the features and APIs of the programming language that allow us to execute a job asynchronously.

Blocking vs Non-Blocking

Simply put, some operations block the underlying thread so that nothing else can be done on that thread until the operation is done. Those that are non-blocking allow the thread to be used for something else in the meantime.

Concurrent vs Parallel

In a concurrent environment we execute tasks by switching between them, so they are not really running at the exact same instant of time. Parallelism is the only way to really execute something at the very same instant of time – either on multiple processors/cores, or on separate computers in a distributed system.

Cooperative vs Preemptive

When scheduling the next task to be done, the system can either decide in a preemptive way, which means it will stop the current computation, switch the context and continue with the other task, or it can be done in a collaborative (cooperative) way, letting the task decide when to hand over control.

Demystification using examples

Early age async programming API in JVM languages

java


import java.util.function.Consumer;

Consumer<String> callback = System.out::println;

// spawn a dedicated thread for the async task and invoke the callback with the result
new Thread(() -> {
    try {
        Thread.sleep(1000); // simulate some work
        callback.accept("Async task completed!");
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}).start();

System.out.println("Hotovo!"); 

kotlin


import kotlin.concurrent.thread

fun callback(result: String) = println(result)

// kotlin.concurrent.thread creates and starts the thread for us
thread(start = true) {
    try {
        Thread.sleep(1000) // simulate some work
        callback("Async task completed!")
    } catch (e: InterruptedException) {
        e.printStackTrace()
    }
}

println("Hotovo!") 

scala


def callback(result: String): Unit = println(result)

new Thread(() => {
  try {
    Thread.sleep(1000)
  } catch {
    case e: InterruptedException =>
      e.printStackTrace()
  }
  callback("Async task completed!")
}).start()

println("Hotovo!") 

The code above is very simple. In the main program flow there is something we would like to make efficient, so it’s running as a separate piece of code asynchronously. Let’s consider this piece of code a Task that will be executed in a newly created Thread, with a callback invoked at the end, after the internal computation is done.

What's the main problem here? We have to create the Thread ourselves, which definitely leads to inefficient and unmaintainable code if there’s hundreds of async tasks we’d like to execute. The other problem is that the Thread + callback as an async feature is like sitting in a very old car – it does the job but it’s not eco-friendly, it’s expensive and there are a lot of things to do manually. And of course, the hell the callbacks could cause is still a valid and scary point here.

So what's the next level?

See the following code:

java


import java.util.concurrent.Executors;
import java.util.function.Consumer;

Consumer<String> callback = System.out::println;

var es = Executors.newFixedThreadPool(2);
es.execute(() -> {
    try {
        Thread.sleep(1000); // simulate some work
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    callback.accept("Async task completed!");
});

System.out.println("Hotovo!");
es.shutdown(); // let the pool's threads terminate once the task is done

scala


def callback(result: String): Unit = println(result)

import scala.concurrent.ExecutionContext.Implicits.global
global.execute(() => {
  try {
    Thread.sleep(1000)
  } catch {
    case e: InterruptedException =>
      e.printStackTrace()
  }
  callback("Async task completed!")
})

println("Hotovo!") 

As you can see, here we have an executor in place (Java’s executor service and Scala’s execution context). This definitely makes our lives easier, as we no longer need to maintain the threads directly, and apart from the word “Thread” in “ThreadPool”, nothing exposes the fact that there are threads under the hood. Note that the executors need a Runnable to run (or a Callable to call), but let’s discuss that later.

So, we didn’t get rid of the callbacks, but at least the threads are not bothering us anymore. Right? Nice illusion, but unfortunately it’s not that simple! There are multiple implementations of executors in different languages, each with its own strategy for maintaining existing threads and spawning new ones when needed.

If we are not aware of the internal implementation, we can still end up with inefficient async code that makes our computation slower. Everything depends on the nature of the task: is it an I/O-bound one that will mostly wait for a network result, or a CPU-expensive computation that will block the thread completely?

But there are more solutions. We can maintain two thread pools (or create two executors): one for I/Os and one for CPU-expensive operations. This is a bit better, because we will not clog the I/O pool with tasks we already know will block the whole executor from making progress, and the CPU cores can do real work while the I/O tasks are waiting.
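As a minimal Scala sketch of that split (the pool types and sizes here are illustrative assumptions, not a recommendation):

scala

import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

// CPU pool: one thread per core, more would only add context switching
val cpuBound = ExecutionContext.fromExecutor(
  Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors()))

// IO pool: its threads mostly wait, so we can afford far more of them
val ioBound = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())

cpuBound.execute(() => println(s"CPU task on ${Thread.currentThread.getName}"))
ioBound.execute(() => { Thread.sleep(1000); println("slow IO task done") })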

There are also “managed blocking” features in some languages. As an example, in Scala we have the “blocking { ... }” construct, which notifies the internal logic of the execution context that this is a blocking task, so that it can spawn a new thread; otherwise the task would block a worker thread with the job it does. Whether the feature is supported depends on the executor.
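A minimal sketch with Scala’s global execution context, whose underlying ForkJoinPool does understand managed blocking:

scala

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.blocking

global.execute(() => {
  blocking {           // hint to the pool: this task is about to block
    Thread.sleep(1000) // the pool may compensate by spawning an extra thread
  }
  println("Blocking task done without starving the pool!")
})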

And I want to raise another issue here. Even if something appears to be free, remember that there is almost always a hidden cost! (”TANSTAAFL” – search for it ;)). 

Have a look at this theoretical example demonstrating a computation on multiple threads, and a solution to the problem – one which unfortunately comes with a hidden cost:

Assumptions

  • Let's have the executor with a thread pool of fixed size X according to the number of your CPU cores. 
  • Once we spawn X tasks that are going to do IO operation, all of the threads will wait until the operation is done. 
  • One operation takes 2 seconds. 

Calculation

  • X threads start IO and wait 2 seconds. In a very simple calculation omitting the overhead, it takes ~2 seconds in total to finish all of the tasks and have the threads free again.

Problem

  • In case we have 10*X tasks, only X of them can occupy the pool at a time, which means we will wait ~10*2 = 20 seconds...

Solution

  • The solution, with its hidden cost, lies in an executor that spawns more threads than the number of CPU cores, so that the OS scheduler can run the next set of threads while the first set is waiting (see the sketch after this list).
  • Do you see the hidden cost in the number of OS threads? Stack sizes, and context switching at the OS level. What if our server has to process thousands (or even more) of requests in order to serve all the clients? The scheduler at the runtime level is responsible for managing this overhead; we will speak more about this later.
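Here is a small Scala sketch of the trade-off described above (pool sizes and timings are illustrative; measure on your own machine):

scala

import java.util.concurrent.{Executors, TimeUnit}

val x = Runtime.getRuntime.availableProcessors()

// a fixed pool of X threads processes 10*X blocking tasks in ~10*2 seconds;
// sizing the pool to 10*X brings that down to ~2 seconds, paid for with
// 10*X thread stacks and extra context switching at the OS level
val pool = Executors.newFixedThreadPool(x) // try 10 * x and compare

val start = System.nanoTime()
(1 to 10 * x).foreach { _ =>
  pool.execute(() => Thread.sleep(2000)) // stand-in for a 2-second IO wait
}
pool.shutdown()
pool.awaitTermination(1, TimeUnit.MINUTES)
println(s"Took ${(System.nanoTime() - start) / 1e9} seconds")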

So let's recap this bit of abstraction that’s introduced in languages by the executors: 

  • they help with the management, 
  • they do not solve the issues in an intelligent way, 
  • they still need callbacks, 
  • they create a better API allowing us to not think about the Threads, 
  • but they still require a deeper knowledge of the implementation.

And another step higher in abstraction, perhaps a bright Future.

Promise vs Future – there is a range of definitions for these two words and concepts, but most aren’t very helpful when it comes to asynchronous programming.

In the context of async programming, a Future is a read-only reference, an API to a value that may or may not be completed at some point in the future.

A Promise, on the other hand, is an API for setting the value of the computation. Usually a Promise allows the read operation as well. To simplify: the Promise should be used from the internal computation to set the value when it is ready, and the Future from the client side as access to that value. Promises can also expose their API externally, which means external effects can sometimes set the value early, even if the real computation is not done yet.
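A minimal Scala sketch of that split – the sleeping thread stands in for a real computation:

scala

import scala.concurrent.{Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global

val promise = Promise[String]()             // write side, owned by the computation
val future: Future[String] = promise.future // read side, handed to the client

// the internal computation completes the Promise once the value is ready...
new Thread(() => {
  Thread.sleep(1000)
  promise.trySuccess("Async task completed!") // trySuccess, as an external effect may have won the race
}).start()

// ...while the client only ever sees the read-only Future
future.foreach(println)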

Now, does this match the features of your favorite language? I bet only partially. This is the theory, but languages use different names, or merge/split the functionality into one or multiple objects.

In Java, the Future looks like a true Future. CompletableFuture seems to match the Promise definition, as it covers the write operation as well as the read one, and can even be completed externally. Scala’s Future is a kind of true Future with additional operations that help in async programming use cases. A Promise in JavaScript is really a Promise that by default does not expose an API for resolving the value from outside the internal implementation, but we can still work around this by exposing the resolve and reject callbacks with some HOC implementation on top of the Promise object. It also covers the read operation and has composition as well as error-handling API operations.

The JVM examples demonstrating those definitions

java

import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

var es = Executors.newFixedThreadPool(2);

// submit a Callable and get back a plain (read-only, blocking) Future
Future<String> future = es.submit(() -> {
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    return "Async task completed!";
});

try {
    String result = future.get(); // blocks until the value is available
    System.out.println(result);
} catch (InterruptedException | ExecutionException e) {
    e.printStackTrace();
}

System.out.println("Hello!");
es.shutdown();

java


import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;

var es = Executors.newFixedThreadPool(2);
var completableFuture = new CompletableFuture<String>();

es.submit(() -> {
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    completableFuture.complete("Async task completed!"); // the write side
});

// the read side: compose follow-up steps instead of nesting callbacks
completableFuture
        .thenApply(result -> result + " Amazing!")
        .thenAccept(System.out::println);

System.out.println("Hotovo!");
es.shutdown();

You can see that the plain Java Future is a really poor guy here: it gets the value in a blocking way, so its usage is very limited – but of course, there are use cases for it. If we want the nice async features in the API, we have to work with CompletableFuture. It covers composition – the first example of a solution to the issue with the callbacks that we were looking to solve. It is undoubtedly more readable, and it is harder to create a mess. I haven’t specifically mentioned it yet, but it is really easy to create a mess in the async way of programming.

scala


import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

Future {
  Thread.sleep(1000)
  "Async task completed!"
} onComplete {
  case Success(result) => println(result)
  case Failure(err) => println("An error has occurred: " + err.getMessage)
}

println("Hotovo!") 

Scala's Future covers the same as CompletableFuture, as well as having an amazing functional interface. The JavaScript one in the TS example is kind of the most simplified way to cover those nice features – readable API, but nothing extra special.

An example in Rust (with Tokio) and TypeScript

rust

use std::time::Duration;

#[tokio::main]
async fn main() {
    println!("Hello!");
    let future1 = async_task1();
    let future2 = async_task2();
    // join! drives both futures concurrently on the Tokio runtime
    let (result1, result2) = tokio::join!(future1, future2);
    println!("This is {} and this is {}", result1, result2);
}

async fn async_task1<'a>() -> &'a str {
    tokio::time::sleep(Duration::from_secs(4)).await;
    println!("First task completed.");
    "Result 1!"
}

async fn async_task2<'a>() -> &'a str {
    tokio::time::sleep(Duration::from_secs(1)).await;
    println!("Second task completed.");
    "Result 2!"
}

typescript

const asyncTask = (): Promise<string> => {
  return new Promise<string>((resolve, reject) => {
    setTimeout(() => {
      resolve("Async task completed!")
    }, 1000)
  });
};

try {
  console.log(await asyncTask())
} catch (e) {
  console.log(e)
}

Have a look at the lazy one! Rust has a lazy implementation of the Future because the language does not come with a runtime, and so the Future only holds the closure to be executed once the executor (from an external runtime implementation) calls the poll() method on the future for the first time. The API for handling values and errors is pretty straightforward, as we already know from the other languages. I will get back to lazy/eager evaluation in the next episodes for sure!

Again, let's recap what this abstraction of Futures brought in comparison to previous concepts. 

  • Simply put, a better development experience,
  • a better way to handle results,
  • a better API allowing advanced composition of task results,
  • no raw callbacks,
  • but still the same issue: deeper knowledge of the internal implementation is required.

The interesting part starts with the fact that the executor implementation of a Rust runtime seems to be a real black box. For JVM languages it was quite easy to understand how a JVM thread maps to an OS thread, as the mapping is 1:1. Runtimes with M:N threading, however, move async features to another level. They are not that easy to understand anymore; even the M:1 model of JavaScript’s event loop is conceptually a level above simple JVM threading.

So let's start with JavaScript. It is single threaded (let's omit web workers for a second). Tasks are managed and scheduled by the event loop. Now we are coming to the Parallel vs Concurrent definition, the single threaded languages perform only one computation at a time. There is no parallel programming with single threaded languages. Web workers API comes to solve this by allowing us to execute some tasks in another thread. It solves the bottleneck of a single threaded computation. 

So it looks like we are stuck with the fact that we always need to know the internal implementation of the async features a language provides; otherwise we are not aware of the bottlenecks, and we might choose the wrong solution just because we use what we personally prefer. We can use web workers to code asynchronously as well as in parallel for tasks that are suitable for such an environment.

Now let’s step back to the event loop, for asynchronous programming without additional threads. The event loop in the JavaScript engine is actually a very simple concept, and it is very easy to visualize as well. It consists of one queue of messages that need to be executed, one by one. The function attached to the oldest message is executed once the previous one is fully done with its whole stack. If the message queue is empty, the loop simply waits for a new message. Except for a few legacy exceptions, the engine is always non-blocking, as all the I/Os are handled by events and callbacks, which allows the other messages to be processed in the meantime.
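The concept is small enough to model in a few lines. Here is a toy Scala sketch of such a run-to-completion message loop – purely conceptual, nothing like a real JavaScript engine:

scala

import scala.collection.mutable

// a toy model of a message loop: one queue, one message at a time
object ToyEventLoop {
  private val queue = mutable.Queue.empty[() => Unit]

  def post(message: () => Unit): Unit = queue.enqueue(message)

  def run(): Unit =
    while (queue.nonEmpty) {
      val message = queue.dequeue()
      message() // runs with its whole stack before the next message is touched
    }
}

ToyEventLoop.post(() => println("first message"))
ToyEventLoop.post(() => println("second message"))
ToyEventLoop.run() // processes messages strictly in order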

The “asynchronous programming world episode #1 summary” and a teaser for the next one

So by now, we’ve hopefully reached a point where the JVM way is sort of clear, and the single threaded event loop makes sense as well. 

And now for a teaser of the next episode: we’ll look closer at M:N mapping between runtime threads and OS threads, and bring in buzzwords like Green Threads, Fibers, Coroutines and Goroutines. (Seems the sky's the limit when it comes to creativity for the names developers use for similar concepts!) 

In the next episode I would also like to go a bit deeper into the concept of M:N (multiple tasks scheduled onto multiple threads) – a kind of M:1 super-boosted to use multiple underlying threads, with the idea of using the maximum possible from both the hardware and software in order to optimize computations for this model. And this is the most interesting part: we need a super scheduler that is able to use the environment as efficiently as possible.

And of course the next episode will also bring examples that will show how async features should look in the modern languages! 

Interesting fact: Did you know that Java had support for green threads (with limitations) and used them instead of native threads as the standard threading model in its very early versions? Since Java 1.2 there has not been any support for this at the JVM level. Or rather, not yet – as we know, Project Loom is on the way…

Stay tuned, the next episode is coming soon!

About the author

Viktor Hanko

VP of Engineering at Hotovo. As a member of the Hotovo Technology team, I manage the company's technology direction.
