When developing software, especially software that runs performance-critical tasks, it’s important to understand how the code will run on the target machine. This behavior is defined by the language’s execution model, and optimizing code according to that model helps it run as fast as possible by minimizing idle CPU time.
But what is an execution model? In simple terms, an execution model defines how code executes: how the instructions (statements) are run by the host device (PC, server, cloud container, mobile phone, etc.). Every programming language has an execution model, which is implemented as part of the language implementation.
This covers things such as what is an indivisible unit of work (i.e. instructions, statements), and what are the constraints on the order in which those units of work can take place.
Normally, statements run linearly, one after another (with occasional jumps determined by control-flow statements such as IF and WHILE). While this holds true for most statements, many programs interact with things outside the processor, and that is where the differences between execution models arise.
For example, they may communicate over a computer network or request data from the hard disk, both of which are far slower than reading from memory or computing on the CPU. While such an operation is in progress, some languages simply wait for it to complete (leaving the processor idle), while others perform other work in the meantime, making better use of CPU resources.
In part, this optimization is achieved at the operating system level through multitasking (i.e. switching the processor between multiple running programs), but that doesn’t help within a single program, since there’s nothing else for it to run while waiting for its read/write operations to complete.
This is where a language execution model comes into play, defining how the program execution behaves while interacting with elements outside the CPU/memory domain.
Synchronous Execution Models
Many programming languages (such as C, Pascal and some others) feature a synchronous execution model, where statements are executed linearly, regardless of their type. In these languages, when invoking a function that performs a long-running action (compared to fast CPU/memory operations), the program “freezes” while waiting for the result and continues execution only after the action completes and returns it.
Comparing this to an office worker who sends emails and uses the responses to do further work, a synchronous execution model would be like sending an email and waiting for the response before doing anything else. If the worker needs to send a second email, that happens only after the first one has been answered.
Under this linear execution model, since the program continues only after the request function has done its work, developers know the results of these requests are always available on the very next line of code. While this makes development simpler, it has the drawback of wasting a lot of processing power, as the processor sits idle until requests complete and results become available.
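The blocking behavior can be sketched in a few lines of Python; here `time.sleep` stands in for a slow I/O request (a hypothetical network call in this example):

```python
import time

def fetch_data(source, delay):
    """Simulate a slow I/O request (e.g. a network call) with a sleep."""
    time.sleep(delay)
    return f"result from {source}"

start = time.time()
# Each call blocks: the second request starts only after the first returns.
first = fetch_data("server A", 0.2)
second = fetch_data("server B", 0.2)
elapsed = time.time() - start

print(first, second)
print(f"total: {elapsed:.1f}s")  # roughly the sum of both delays (~0.4s)
```

The total time is the sum of the individual waits: while each request is pending, the program (and the CPU core running it) does nothing else.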
Synchronous Multi-threaded Execution Model
One approach to avoid freezing (blocking) a program during long-running operations is to keep a synchronous execution model but start additional “threads” of control, each handling a different request.
Although much more complicated in real programs, a thread is simply another running part of the program that executes in parallel with the main program (the master thread) or with other threads handling other requests.
In a multi-threaded programming language (such as Java, C#, Go or Scala, among many others), the master thread (the main program logic) starts a second thread to handle the first request, and another for the second one. Each thread runs in parallel with the rest. As each thread completes (they can finish at different times), it re-synchronizes with the main thread, which combines the partial results into the final one.
Since most modern computers contain multiple CPU cores (and some even multiple CPUs, each with multiple cores), several threads can run at the same time, each on a different core, so the CPU isn’t blocked while waiting for results.
Going back to the office worker example, a multi-threaded execution model would be like sending several emails (one per request) in parallel and waiting for all the responses before completing the work.
This execution model has the advantage of optimizing CPU resources by running several threads in parallel (making use of different CPU cores), reducing the time it takes to complete a program. However, managing several (many, in some cases) threads and synchronizing them with the master thread can make the code quite complex and difficult to debug. This also makes the learning curve quite steep for newcomers.
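The spawn-then-join pattern described above can be sketched with Python’s `threading` module (a simplified illustration; note that in CPython threads overlap well for I/O waits, while CPU-bound work is limited by the global interpreter lock):

```python
import threading
import time

results = {}

def fetch_data(source, delay):
    # Simulate a slow I/O request; while one thread sleeps,
    # the other threads keep making progress.
    time.sleep(delay)
    results[source] = f"result from {source}"

start = time.time()
threads = [
    threading.Thread(target=fetch_data, args=("server A", 0.2)),
    threading.Thread(target=fetch_data, args=("server B", 0.2)),
]
for t in threads:
    t.start()   # both requests now run concurrently
for t in threads:
    t.join()    # re-synchronize with the master thread
elapsed = time.time() - start

print(results)
print(f"total: {elapsed:.1f}s")  # roughly one delay (~0.2s): the waits overlap
```

Because the two waits overlap, the total time is close to the longest single request rather than the sum of all of them. The `join` calls are the re-synchronization step: the master thread continues only once every worker has finished.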
Asynchronous Execution Model
In this execution model, the execution of program statements is blocking, but I/O operations are not: they are performed in parallel with the execution of your code. However, only one thread of execution is used; no additional threads are created.
Unlike the previous models, in the asynchronous execution model the main program flow continues to run right after the call is made (before the results arrive), as in a multi-threaded model but without spawning extra threads of control. But how does the program get access to the data produced by these I/O actions?
When making such a call, the program attaches a special type of function, known as a “callback function”, to it. When the action completes (e.g. a JSON file has been fetched from the Internet or a database record set retrieved), the engine runs the callback function as soon as the main program flow is idle, passing it the results of the call.
To put it another way, the main program flow continues its normal execution until a result arrives (from a previous function call) and its callback function needs to run. This makes the program more efficient, as the CPU is always busy, either with the main program or with a callback.
Following the office worker example, an asynchronous program would send emails, continue to work on something else, and read/process responses (pausing the current work) as they arrive in the inbox.
In this model, it’s not the program that manages I/O operations but the underlying “engine” (such as Node’s runtime). This engine is in charge of performing I/O actions and invoking the callback functions when results are ready.
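Although Node is the best-known example of this model, the callback mechanism can be sketched in Python with `asyncio`, whose event loop plays the role of the engine (a simplified illustration; `asyncio.sleep` again stands in for a slow I/O request):

```python
import asyncio
import time

async def fetch_data(source, delay):
    # Simulate a slow I/O request handed off to the event loop (the "engine").
    await asyncio.sleep(delay)
    return f"result from {source}"

results = []

def on_done(task):
    # Callback invoked by the event loop once a "request" completes.
    results.append(task.result())

async def main():
    # Start both requests; neither blocks the single thread of execution.
    for source in ("server A", "server B"):
        task = asyncio.create_task(fetch_data(source, 0.2))
        task.add_done_callback(on_done)
    # The main flow is free to do other work while results are pending.
    await asyncio.sleep(0.3)

start = time.time()
asyncio.run(main())
elapsed = time.time() - start

print(results)
print(f"total: {elapsed:.1f}s")
```

Both requests are in flight at the same time, yet everything runs on a single thread: the event loop invokes `on_done` with each result as it becomes ready, while the main flow keeps executing in between.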
Other execution models
The models described above are not the only ones. There are others, such as Erlang’s actor model or Hadoop’s MapReduce, each with its own features, optimal use cases and restrictions. However, the ones above cover most of the programming languages in use these days.
Which execution model is best?
Although there’s no absolute answer to this question, most of the time it is best to choose a language with either a multi-threaded or an asynchronous execution model, as this provides better performance by optimizing CPU usage.
A multi-threaded language works best when heavy CPU processing is accompanied by many I/O operations (as in data processing systems). On the other hand, programming with threads can be notoriously hard (understanding what a program does is much more difficult when it’s doing multiple things at once), so this may become a problem for complex systems. In addition, this model is subject to system-specific constraints, such as the maximum number of threads, which may also weigh against it.
An asynchronous execution model is ideal for systems that must keep track of many things at once, such as handling UI interaction or responding to many communication events, as the main program flow is interrupted only when a new result arrives. Asynchronous code is also critical for writing scalable server-side systems: the server does not stall while waiting for resources (as it would in a synchronous system), so it can keep responding to other client requests.