Data Platform/Systems/Java
Thread Dump
A thread dump provides the state of all active threads within a Java process at a given point in time.
Each thread is displayed with its stack trace.
It is written in plain text and can be saved to a file for later analysis.
It is useful for diagnosing various issues (deadlocks, thread contention, performance problems...).
How to take a Java thread dump
Thread dumps from most Java processes
- Connect as the user running the Java process
- Run the command below:
jstack -l $(pgrep -a java|grep <arg_to_identify_process>|awk '{print $1}')
It is usually useful to take multiple thread dumps in a row (for example, one thread dump every 1 to 10 seconds, 5 to 30 times):
for i in {1..30}; do jstack -l $(pgrep -a java|grep presto|awk '{print $1}') > /tmp/threaddump-$(date +%s).txt; sleep 1;done
Having multiple thread dumps makes it possible to see how the threads evolve over time.
For example, it can help detect threads waiting for a lock across multiple thread dumps and identify which thread is holding that lock (and from which part of the code the lock is taken).
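As a minimal sketch, assuming the dumps were written to /tmp/threaddump-<timestamp>.txt as in the loop above, a couple of grep commands can quickly show whether blocked threads persist from one dump to the next (the patterns match the state and lock lines printed by jstack):
# Count BLOCKED threads in each dump
grep -c 'java.lang.Thread.State: BLOCKED' /tmp/threaddump-*.txt
# Show which monitors threads are waiting on and where they are held
grep -E 'waiting to lock|locked <' /tmp/threaddump-*.txt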
Thread dumps from Apache Spark
Apache Spark provides a built-in thread dump facility in its User Interface:
- Go to https://yarn.wikimedia.org/, log in, and find the Spark job you're interested in. Click on it, then click on the 'Application Master' link to proxy into the Spark job's UI.
- You can now click on the 'Executors' tab. You will be shown a table with a row for each executor and one for the driver. At the far right of each row there is a link aptly named 'Thread Dump'. Each time you click on the link, a new thread dump of that Java process is taken, and you are redirected to a UI where you can explore and analyze the dump right there.
Analyzing the thread dump
Spotify provides an interesting online thread dump analyzer.
Threads are grouped by stack trace to make the dump easier to read.
It provides a view of object locks, with the thread holding each lock and the threads waiting for it.
Another solution for analyzing thread dumps:
Locks between threads are easy to identify.
It provides a comparative view between two thread dumps to identify threads that remain stuck from one dump to the next.
GC analysis
Enabling GC logs
When the JVM is facing long pause durations, tuning the GC is needed.
If the JVM is facing OutOfMemory errors, tuning the GC won't help.
In order to know how to tune the JVM, GC logs must be enabled.
Depending on where time is spent, specific tags can be added to get more details. Those tags also depend on the garbage collector used by the Java process: https://newrelic.com/blog/best-practices/java-garbage-collection#toc-what-garbage-collectors-are-available-for-java-
Example of a basic GC logging configuration:
- Java 8
-XX:+PrintCommandLineFlags -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintAdaptiveSizePolicy -XX:+PrintTenuringDistribution -XX:+UseGCLogFileRotation -Xloggc:/tmp/gc.log -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10m
- Java 11 and later
-Xlog:gc=debug:file=/tmp/gc.log:time,uptime,level,tags:filecount=10,filesize=10m
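For illustration, here is a minimal sketch of how the Java 11+ options above would be attached when starting an application (my-app.jar is a hypothetical jar name):
# Hypothetical example: start an application with GC logging enabled
java -Xlog:gc=debug:file=/tmp/gc.log:time,uptime,level,tags:filecount=10,filesize=10m -jar my-app.jar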
Analyzing GC logs
Analyzing the GC log file requires understanding the behavior of the garbage collector used by the Java process.
Then you should be able to identify which phase is slow and what tuning could potentially be done.
Some tools, like this website, can provide good insight into the root cause of slow GC: https://gceasy.io/
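For reference, with the Java 11+ options above and the default G1 collector, a young-generation pause line in the log looks roughly like the following (illustrative values; the exact message depends on the collector):
[2024-05-03T10:15:30.123+0000][125.345s][info][gc] GC(42) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(2048M) 15.203ms
The last field is the pause duration; repeated long pauses there are usually what the tuning targets.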
OutOfMemory/HeapDump
When facing OOM errors or abnormal memory usage, it is possible to generate a heap dump in order to understand which objects are using the memory.
Enabling heap dump on OutOfMemoryError
Add the Java options below to your Java process:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.bin
Ensure there is enough free space at the HeapDumpPath location to store the dump (it can take up to the Xmx size).
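As a quick sanity check before relying on this, a sketch (assuming the path configured above and a running process with PID <pid>):
# Check free space where the dump would be written
df -h /tmp
# Check the maximum heap size the process is running with
jcmd <pid> VM.flags | grep -o 'MaxHeapSize=[0-9]*'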
Generating a heap dump manually
As the user running the Java process, run the command below:
jmap -dump:format=b,file=/tmp/heapdump.bin <pid>
Ensure there is enough free space at the chosen path to store the dump (it can take up to the Xmx size).
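A variant of the same command, reusing the process-lookup pattern from the thread dump section; the optional live flag dumps only reachable objects and triggers a full GC first:
jmap -dump:live,format=b,file=/tmp/heapdump.bin $(pgrep -a java|grep <arg_to_identify_process>|awk '{print $1}')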
Analyzing the heap dump
Analyzing a heap dump is memory intensive. Parsing it can require as much memory as the maximum heap size configured on the Java process.
MAT (Eclipse Memory Analyzer) comes with a ParseHeapDump.sh script to do the parsing from the command line, which is useful to run on a server having enough memory.
Once parsing is done, MAT can look for memory leaks or let you browse the list of objects within the heap and the links between them.
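As a sketch, assuming MAT is unpacked on a host with enough memory and the dump is at /tmp/heapdump.bin, the parsing plus the standard leak-suspects report can be generated from the command line:
# Parse the dump and produce the 'Leak Suspects' report without the UI
./ParseHeapDump.sh /tmp/heapdump.bin org.eclipse.mat.api:suspects
The generated index files can then be opened in the MAT UI, and the report is written next to the dump file.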
Jcmd
A utility to send diagnostic command requests to the JVM.
Running the command below provides a live view of heap usage (number of instances and bytes per class):
> jcmd <pid> GC.class_histogram|more
<pid>:
num #instances #bytes class name (module)
-------------------------------------------------------
1: 143966 143151864 [B (java.base@13.0.14)
2: 16292 49532104 [I (java.base@13.0.14)
3: 123093 2954232 java.lang.String (java.base@13.0.14)
4: 32714 2878832 java.lang.reflect.Method (java.base@13.0.14)
5: 38151 2168912 [Ljava.lang.Object; (java.base@13.0.14)
6: 15638 1836280 java.lang.Class (java.base@13.0.14)
7: 41803 1672120 java.util.LinkedHashMap$Entry (java.base@13.0.14)
8: 19221 1471400 [Ljava.util.HashMap$Node; (java.base@13.0.14)
9: 32284 1033088 java.util.concurrent.ConcurrentHashMap$Node (java.base@13.0.14)
10: 732 994672 [C (java.base@13.0.14)
11: 17012 952672 java.util.LinkedHashMap (java.base@13.0.14)
12: 25410 813120 java.util.HashMap$Node (java.base@13.0.14)
13: 22785 729120 javax.management.MBeanAttributeInfo (java.management@13.0.14)
14: 27168 630520 [Ljava.lang.Class; (java.base@13.0.14)
15: 22974 551376 org.glassfish.jersey.server.internal.routing.CombinedMediaType$EffectiveMediaType
16: 22583 541992 java.util.ArrayList (java.base@13.0.14)
17: 12096 483840 java.util.TreeMap$Entry (java.base@13.0.14)
18: 15491 371784 java.util.LinkedList$Node (java.base@13.0.14)
19: 15311 367464 java.util.Collections$UnmodifiableRandomAccessList (java.base@13.0.14)
20: 11543 336336 [Ljava.lang.String; (java.base@13.0.14)
21: 20831 333296 java.lang.Object (java.base@13.0.14)
22: 13733 329592 org.weakref.jmx.NestedMBeanAttribute
23: 849 317408 [Ljava.util.concurrent.ConcurrentHashMap$Node; (java.base@13.0.14)
24: 6146 295008 java.util.HashMap (java.base@13.0.14)
25: 3683 294640 java.lang.reflect.Constructor (java.base@13.0.14)
26: 4539 290496 java.net.URL (java.base@13.0.14)
27: 4001 288072 java.lang.reflect.Field (java.base@13.0.14)
28: 2236 286208 sun.nio.fs.UnixFileAttributes (java.base@13.0.14)
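A few other jcmd diagnostic commands that are often useful (jcmd <pid> help lists everything supported by a given JVM):
# Current heap occupancy summary
jcmd <pid> GC.heap_info
# Trigger a full GC (e.g. before taking a class histogram or heap dump)
jcmd <pid> GC.run
# A thread dump, equivalent to jstack output
jcmd <pid> Thread.print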
It can also help to do memory allocation profiling with a tool like async-profiler in order to map instance usage to the method calls allocating them.
Profile Analysis
When facing high CPU usage or a high memory allocation rate, performing some profiling can help.
async-profiler
https://github.com/async-profiler/async-profiler
There are multiple profiling tools, but this one provides a good set of features and is fully open source.
It is a low-overhead sampling profiler. It can be attached to any running JVM; there is no need to start the JVM with a specific Java agent.
How to use it
As of now, this profiler is not automatically installed on servers running Java-based processes.
You first have to download it locally and then scp it to the server running the Java instance to profile: https://github.com/async-profiler/async-profiler/releases/download/v2.9/async-profiler-2.9-linux-x64.tar.gz
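For example (a sketch; <target-host> is a placeholder for the server running the JVM to profile):
# On your local machine
wget https://github.com/async-profiler/async-profiler/releases/download/v2.9/async-profiler-2.9-linux-x64.tar.gz
scp async-profiler-2.9-linux-x64.tar.gz <target-host>:/tmp/
# On the target host
tar -xzf /tmp/async-profiler-2.9-linux-x64.tar.gz -C /tmp/
cd /tmp/async-profiler-2.9-linux-x64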
Then run one of the commands below to get a flamegraph:
# Profile for 30s cpu usage
sudo ./profiler.sh --fdtransfer -e cpu -o flamegraph -d 30 -f /tmp/test.html <pid>
# Profile for 30s memory allocation usage
sudo ./profiler.sh --fdtransfer -e alloc -o flamegraph -d 30 -f /tmp/test.html <pid>
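The output is a self-contained HTML file; copy it back to your machine and open it in a browser (sketch, <target-host> is a placeholder):
scp <target-host>:/tmp/test.html .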
Example of a CPU profiling flamegraph from one Presto node:
From that flamegraph you can see that 22% of the CPU usage is spent performing Kerberos authentication.