Tutorial: Detect concurrency issues
Concurrency-related bugs are often trickier than those in a single-threaded application because of their random nature. An app may run flawlessly a thousand times and then fail unexpectedly for no obvious reason.
In this tutorial, we'll analyze a code example that demonstrates the core principles of debugging and analyzing a multithreaded app.
Problem
A common example of a concurrency-related bug is a race condition. It happens when shared data is modified by several threads at the same time without being properly synchronized. Such code may work fine as long as the reads and writes in different threads don't overlap.
The overlapping may be very rare and lead us into thinking there is no flaw in the code. However, when the thread operations do overlap, the data gets corrupted.
If we don't take this into account, there is no guarantee that the threads will not operate on the data simultaneously, especially if we deal with something more complex than just a single read and write. Luckily, Java has built-in synchronization mechanisms that ensure only one thread works with the data at a time.
Let's consider the following code:
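A minimal sketch that matches the description below. The class name ConcurrencyTest and the way the second thread is started are assumptions; addIfAbsent, the list a, the value 17, the guard condition, and the final println come from the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ConcurrencyTest {
    // Each individual call on this list is synchronized by the wrapper.
    static final List<Integer> a = Collections.synchronizedList(new ArrayList<>());

    public static void main(String[] args) throws InterruptedException {
        // One call runs in a separate thread, the other in the main thread.
        Thread t = new Thread(() -> addIfAbsent(17));
        t.start();
        addIfAbsent(17);
        t.join();
        System.out.println(a);
    }

    static void addIfAbsent(int x) {
        // Check and add are two separate calls on the list.
        if (!a.contains(x)) {
            a.add(x);
        }
    }
}
```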
The addIfAbsent method checks if a list contains a specific element, and if not, adds it. We call this method twice from different threads. Both times we pass the same integer value (17), and because of the guard condition (!a.contains(x)), only the first thread to call that method should be able to add the value. The use of SynchronizedList is supposed to protect us against race conditions. Finally, the System.out.println(a) statement prints out the contents of the list.
If we ran this code many times, we would see that it occasionally produces an unexpected result: the same value ends up in the list twice.
To find the cause, let's examine how the code operates and see if we really managed to prevent race conditions.
Reproduce the bug
Using the IntelliJ IDEA debugger, you can test the multithreaded design of your application and reproduce concurrency-related bugs by controlling the execution of individual threads rather than the entire application.
Set a breakpoint at the statement that adds elements to the list.
Configure the breakpoint to suspend only the thread in which it was hit. This ensures that both threads are suspended at the same line. To do this, right-click the breakpoint, then select Thread.
Start the debug session by clicking the Run button near the main method and selecting Debug.
When the program runs, both threads are individually suspended in the addIfAbsent method. Now you can switch between the threads in the Threads tab and control the execution of each thread.
At this point, both threads have checked that the list does not contain 17 and are ready to add the number to the list.
In the Threads tab, switch to Thread-0.
Resume the thread by pressing F9 or clicking Resume Program in the left part of the Debug tool window.
After you resume Thread-0, it proceeds with adding 17 to the list and is then terminated. After that, the debugger automatically switches back to the main thread.
Resume the main thread to let it execute the remaining statements and then terminate.
Review the program output in the Console tab.
The output [17, 17] shows that both threads were able to add the same value, getting past the guard condition despite the synchronized list. We used the debugger to reproduce the order of events, which confirmed that a race condition exists and that we need to correct our approach.
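The interleaving that we forced with the breakpoint can also be reproduced in code, which may help when a debugger is not at hand. The sketch below is an illustration only: the class ForcedRaceDemo, the run helper, and the CyclicBarrier standing in for the breakpoint are assumptions and not part of the program under test. Each thread waits at the barrier after passing the check, so neither adds the value until both have seen an empty list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class ForcedRaceDemo {
    static final List<Integer> a = Collections.synchronizedList(new ArrayList<>());
    // Stand-in for the breakpoint: both threads pause here after the check.
    static final CyclicBarrier afterCheck = new CyclicBarrier(2);

    static void addIfAbsent(int x) {
        if (!a.contains(x)) {
            try {
                // Wait until the other thread has also passed the check.
                afterCheck.await();
            } catch (InterruptedException | BrokenBarrierException e) {
                throw new IllegalStateException(e);
            }
            a.add(x);
        }
    }

    // Runs the two calls in two threads and returns a snapshot of the list.
    static List<Integer> run() {
        a.clear();
        Thread t = new Thread(() -> addIfAbsent(17));
        t.start();
        addIfAbsent(17);
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(a);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [17, 17]
    }
}
```

Because the barrier releases the threads only after both checks have completed, the duplicate appears on every run rather than once in a while.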
Fix the program
As we have just seen, the use of SynchronizedList alone was not sufficient. It made sure that only one thread modifies the list at a time. However, we should still have taken into account that the check if (!a.contains(x)) and the modification a.add(x) together were not an atomic operation. For this reason, both threads were able to evaluate the condition and enter the code block before either of them added the value.
Let's correct the code by wrapping both the condition and the add operation in a synchronized block.
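A sketch of the corrected program, under the same assumptions as before (the class name FixedConcurrencyTest and the thread setup are illustrative). It locks on the list a itself, which is the lock that the wrapper returned by Collections.synchronizedList uses for its own methods, so the check and the add now happen as one uninterrupted step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FixedConcurrencyTest {
    static final List<Integer> a = Collections.synchronizedList(new ArrayList<>());

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> addIfAbsent(17));
        t.start();
        addIfAbsent(17);
        t.join();
        System.out.println(a); // prints [17]
    }

    static void addIfAbsent(int x) {
        // Hold the list's lock across both the check and the add,
        // so no other thread can act on the list between them.
        synchronized (a) {
            if (!a.contains(x)) {
                a.add(x);
            }
        }
    }
}
```

With this version, a thread suspended at the add statement still holds the lock, so the other thread cannot get past the beginning of the synchronized block until the first one finishes.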
We can now repeat the procedure with the corrected code and make sure that the issue is no longer there.