- Project Loom & Virtual Threads in Java
- Why?
- Using Project Loom in a Java application
- Threads
- Native threads results
- Virtual threads results
- Executor services
- CachedThreadPool Executor service
- VirtualThreadExecutor Executor service
- Conclusion
- INTRODUCTION TO CONCURRENCY AND THREADS IN JAVA WEB APPS
- What is a thread pool
- Why thread pools?
Project Loom & Virtual Threads in Java
Disclaimer: This post mentions terms such as concurrency, threads, and multitasking without explaining them in detail, but it includes references to other posts or pages about them.

Concurrency in Java is natively managed using threads (java.lang.Thread), where each Java thread is essentially a wrapper/mapper for a native OS thread. This model fits well in a system that does not need too many threads, but it brings some drawbacks when we want to use threads at a large scale, say hundreds or thousands of them.
Why?
- Native OS threads must support all programming languages; they are not optimised for any specific one (maybe C 🤔).
- Expensive context switching.
- High memory resource usage (stack size).
These issues make it hard to scale Java applications that do concurrent jobs using one native thread per job.
E.g. a web server application processing 500 req/sec could need 500 threads, which at roughly 1 MB of stack each means about 500 MB, counting only the threads created per request; a web server certainly uses other threads and memory as well.
Here is where Project Loom comes as a solution, so first of all let’s define what Project Loom is and what it brings to the Java world.
Note: Project Loom is still an ongoing project; all the information, names, and definitions about it may change, and there is no official JDK release to work with yet.
Project Loom is intended to explore, incubate and deliver Java VM features and APIs built on top of them for the purpose of supporting easy-to-use, high-throughput lightweight concurrency and new programming models on the Java platform — Project Loom Wiki
Project Loom is a platform-based solution introducing the concept of the virtual thread (a.k.a. fiber), a lightweight thread managed by the Java Virtual Machine rather than the operating system. It fits into the existing Java APIs, allowing synchronous (blocking) code.
Using Project Loom in a Java application
Note: This post aims to give a general overview and some intuition about how Project Loom and virtual threads perform, rather than a micro-benchmark or anything deeper.
We will use jconsole to monitor performance: how many threads are created, how much memory is used, and how the CPU behaves.
For these tests, a MacBook Pro 15” 2018 was used.
Since there is no official release of the JDK including Project Loom yet, we must use the early-access binaries provided by the project.
We can download them from the official page of the project; these binaries are based on JDK 17, and there are options for Linux, macOS, and Windows.
Project Loom Early-Access Builds
Once the binaries are downloaded and added to the PATH, we can use the Project Loom JDK:
```
❯ java --version
openjdk 17-loom 2021-09-14
OpenJDK Runtime Environment (build 17-loom+2-42)
OpenJDK 64-Bit Server VM (build 17-loom+2-42, mixed mode, sharing)
```
Threads
Let’s do a naive test: an app that creates 1000 threads, each running random math operations on two numbers for a couple of minutes, and then compare the performance of the usual native threads vs virtual threads.
```java
static class RandomNumbers implements Runnable {
    Random random = new Random();

    @Override
    public void run() {
        for (int i = 0; i < 120; i++) { // during 120 seconds approx
            try {
                int a = random.nextInt();
                int b = random.nextInt();
                int c = a * b;
                System.out.println("c = " + c); // print to avoid the compiler removing the operation
                Thread.sleep(1000); // one operation every second
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
```
Then let’s create the threads.
```java
public static void main(String[] args) throws InterruptedException {
    // Create 1000 native threads
    for (int i = 0; i < 1000; i++) {
        Thread t = new Thread(new RandomNumbers());
        t.start();
    }
    Thread.sleep(120000);
}
```
Native threads results
- Around 1000 threads.
- CPU usage between 2 and 3 %.
- Constant memory usage around 150 Mb.
Now let’s do the same operation using virtual threads:
```java
public static void main(String[] args) throws InterruptedException {
    // Create 1000 virtual threads
    for (int i = 0; i < 1000; i++) {
        Thread.startVirtualThread(new RandomNumbers());
    }
    Thread.sleep(120000);
}
```
Virtual threads results
- Around 30 threads
- CPU usage under 2%
- Incremental memory usage from 20Mb to 60 Mb.
Executor services
Now let’s run a more elaborate test, using an executor service to schedule the threads.
CachedThreadPool Executor service
```java
public static void main(String[] args) throws InterruptedException {
    ExecutorService executor = Executors.newCachedThreadPool();
    for (int i = 0; i < 1000; i++) {
        executor.submit(new RandomNumbers());
    }
    executor.shutdown();
    if (!executor.awaitTermination(60, TimeUnit.SECONDS)) {
        executor.shutdownNow();
    }
}
```
This executor service implementation creates a new (native) thread per scheduled task unless a free thread is available to take it.
These results look pretty similar to the plain-threads approach.
VirtualThreadExecutor Executor service
```java
public static void main(String[] args) throws InterruptedException {
    ExecutorService executor = Executors.newVirtualThreadExecutor();
    for (int i = 0; i < 1000; i++) {
        executor.submit(new RandomNumbers());
    }
    executor.shutdown();
    if (!executor.awaitTermination(60, TimeUnit.SECONDS)) {
        executor.shutdownNow();
    }
}
```
These results show a slight decrease in memory and CPU usage.
- Around 30 threads.
- CPU usage between 1 and 2 %.
- Incremental memory usage from 30Mb up to 40 Mb.
Conclusion
We can see how virtual threads could help when we need concurrency across thousands of threads, without compromising performance and while making optimal use of resources.
Project Loom promises a good future for Java and its concurrency API, bringing it to the level of other languages that already have lightweight concurrency models.
INTRODUCTION TO CONCURRENCY AND THREADS IN JAVA WEB APPS
Threads, concurrency, and synchronization are not easy concepts to understand, and when concurrency is involved in our applications it’s pretty hard to avoid making mistakes. Although Java provides mechanisms to deal with parallel programming, sometimes there are just too many options, and often some essential options are missing. For web applications, Jakarta EE provides a simplified programming model to deal with parallel tasks. But in order to use it effectively and avoid mistakes, you need to understand the basic concepts, which I’d like to explain here.

Java provides a lot of mechanisms to help with threads and concurrent tasks. As Java evolved, new mechanisms were added while old ones stayed, and it’s often not clear which of them are better and recommended for new applications. Jakarta EE builds on these features and makes them easier to understand and use. The standard Jakarta EE API intentionally specifies only interfaces and essential conceptual behavior. A lot of the complexity is abstracted away and provided by Jakarta EE runtimes to keep things simple. As a result, developers have a concise set of features: easy to learn and understand, but enough to build applications.
What is a thread pool
The basic concept of threading and parallelism in Jakarta EE runtimes is a thread pool. This is connected with the request processing model, where most of the tasks originate as a request from an external caller, they are then processed sequentially in a single thread, and produce some response that is usually sent back to the caller, persisted into a database or sent as a message to another system. Many separate tasks can run in parallel, each using its own separate thread. So there’s usually a simple mapping – one request needs one thread. After a task is finished, a thread doesn’t have to be destroyed. It can be reused to run another task, in order to avoid creating and destroying threads too often.
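The thread reuse described above is easy to observe with the JDK’s own executors. The sketch below (class and method names are illustrative, not part of any Jakarta EE API) submits 100 tasks to a fixed pool of 4 threads and records which threads actually did the work; no matter how many tasks we submit, at most 4 distinct threads run them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadReuseDemo {
    // Submit more tasks than threads and record which threads ran them.
    static Set<String> runTasks(int poolSize, int tasks) throws InterruptedException {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return threadNames;
    }

    public static void main(String[] args) throws InterruptedException {
        // 100 tasks are processed by at most 4 distinct, reused threads.
        Set<String> names = runTasks(4, 100);
        System.out.println(names.size() + " distinct threads ran 100 tasks");
    }
}
```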
Incoming tasks are not tied to any specific thread. A thread pool always makes sure there’s a thread available for a task. This is called “scheduling” and is often referred to as “thread scheduling”. Tasks are scheduled and processed either:
- by an existing thread that has finished its previous task,
- by a new thread if no free thread is available,
- or they are queued and wait until a thread is available, if all threads are busy.
Vice versa, threads aren’t tied to tasks either. They just take a task from a queue and process it. When threads are done with their tasks, they start processing another new task from the queue, or wait for a task if the queue is empty. A group of such threads, together with the logic for how they are managed and scheduled, is called a thread pool.
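These scheduling outcomes can be sketched with the JDK’s `ThreadPoolExecutor`, the general-purpose pool implementation that runtimes typically build on (`SchedulingDemo` and `snapshot` are illustrative names). With 2 pool threads and 6 sleeping tasks, 2 tasks run immediately while the remaining 4 wait in the queue:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SchedulingDemo {
    // Submit 6 sleeping tasks to a pool of 2 threads with a queue of 10,
    // then take a snapshot of {active threads, queued tasks}.
    static int[] snapshot() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(10));
        for (int i = 0; i < 6; i++) {
            pool.submit(() -> {
                try { Thread.sleep(200); } catch (InterruptedException e) { }
            });
        }
        Thread.sleep(50); // let the pool start its core threads
        int[] result = { pool.getActiveCount(), pool.getQueue().size() };
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] s = snapshot();
        System.out.println("running: " + s[0] + ", queued: " + s[1]);
    }
}
```

Once the first two tasks finish sleeping, the pool’s threads pick up the queued tasks, exactly the reuse-and-queue behavior described above.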
Why thread pools?
- Threads allocate memory for their stacks and keep that memory until they are disposed of.
- The CPU can run only a very small number of threads at once; having too many threads can even lead to performance degradation.
- Each task requires some heap memory; it makes sense to limit the maximum number of parallel tasks to avoid overwhelming the system.
The memory argument is pretty clear, and it’s often evident when it becomes an issue. There’s always a limited amount of memory on the system. When threads consume a big portion of that memory, it’s easy to reach the limit. The default stack size for each thread is 1MB, which means that each thread needs 1MB of system memory. If there are 1000 threads in a JVM, they clearly need 1GB of memory on top of the Java heap size just to exist. You can probably imagine the consequences if there are even more threads.
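The per-thread stack cost can be tuned. The JVM-wide default can be lowered with the `-Xss` flag (e.g. `java -Xss512k ...`), and the `Thread(ThreadGroup, Runnable, String, long stackSize)` constructor lets you request a different stack size for a single thread; per the Javadoc, the value is only a hint and some JVMs ignore it. A minimal sketch, with `StackSizeDemo` as an illustrative name:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class StackSizeDemo {
    // Run a task on a thread with an explicitly requested stack size.
    // Note: stackSize is only a hint to the JVM and may be ignored.
    static boolean runWithStack(long stackSize) throws InterruptedException {
        AtomicBoolean ran = new AtomicBoolean(false);
        Thread t = new Thread(null, () -> ran.set(true), "small-stack", stackSize);
        t.start();
        t.join();
        return ran.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Request a 256 KB stack instead of the (typically 1 MB) default.
        System.out.println("task ran: " + runWithStack(256 * 1024));
    }
}
```

Shrinking stacks only helps for threads that don’t recurse deeply; a too-small stack risks StackOverflowError.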
The CPU argument isn’t so straightforward, but it is really valid. Only a limited number of threads can run on a CPU at the same time (usually 8 on an 8-core CPU). With more threads, it’s more likely that a thread will be suspended and another thread scheduled in its place. Switching between threads is a relatively time-consuming operation and doesn’t contribute to the computation. Therefore, having an excessive number of threads can actually decrease performance.
The last argument is that executing too many tasks in parallel costs too much memory, as each task needs to store something on the heap even while it’s waiting for an I/O operation or the CPU. As described above, more parallel tasks don’t always lead to increased performance, but they definitely lead to increased memory use. Therefore it’s better to limit how many tasks can run in parallel and queue the other tasks to be executed later.
For these reasons, it’s good to have a reasonable number of threads ready to handle new tasks immediately. It’s also necessary to limit the number of threads to a reasonable amount, and threads should be disposed of after some time if they are really not needed. With this, it’s possible to prevent the system from becoming thrashed and unusable under high load. If there are too many requests to handle, some of them simply need to wait while others can be efficiently processed. This keeps the system usable at least for some requests/users.
Sometimes a single request needs to be processed by multiple threads in parallel. This doesn’t fit the simplified thread-per-request model, but it’s also supported by Jakarta EE runtimes. Applications can use a specialized concurrency API, which allows splitting a task and executing each part in a separate thread. This again works on top of thread pools to retain all the advantages mentioned above. But more on that later in a separate post.
Published on Java Code Geeks with permission by Ondrej Mihalyi, partner at our JCG program. See the original article here: INTRODUCTION TO CONCURRENCY AND THREADS IN JAVA WEB APPS