Quickly gain insights about common Java problems from Jira Support Zip files

Hello Atlassian Community.

Thiago Masutti here from the Atlassian Premier Support team.

 

Running a Java-based product such as Jira for thousands of users is inherently prone to incidents. When they happen, swiftly identifying the root cause is crucial for a fast resolution.
Java-related issues can lead to significant disruptions. As support engineers, we encounter them daily across all kinds of instances. As your instance grows in size and complexity, running into such issues becomes inevitable.
Common examples of those problems are (not an extensive list):
  • Out of Memory errors (OOM).
  • Excessive or full garbage collection (GC) activity.
  • Thread pool exhaustion, such as Tomcat HTTP threads piling up.
  • Slow or blocked database queries.

Sometimes these problems disrupt all users of a Jira instance; other times they affect only the subset of users whose requests hit an affected node in a Jira Data Center cluster.
Being able to quickly identify the cause is key to a faster resolution of the problem.
A few months ago, I shared a tool used internally by Atlassian Support to parse Support Zip files and turn their data into troubleshooting insights: Getting insights from your Jira Data Center instance from Support Zip files.
Today I'm sharing an example of how it can help identify one type of problem: the Out of Memory error (OOM).
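As a quick, generic illustration (my own sketch, not taken from Jira's code), the snippet below keeps allocating memory until the JVM heap is exhausted, producing the familiar java.lang.OutOfMemoryError: Java heap space entry you would find in the logs:

    import java.util.ArrayList;
    import java.util.List;

    // Minimal OOM sketch: hold strong references to 1 MiB blocks so the
    // garbage collector can never reclaim them, eventually exhausting the heap.
    public class OomDemo {
        public static void main(String[] args) {
            List<byte[]> hoard = new ArrayList<>();
            while (true) {
                hoard.add(new byte[1024 * 1024]);
            }
        }
    }

Running it with a small heap (for example, java -Xmx64m OomDemo) fails within seconds, preceded by intense GC activity: the same pattern we'll see in the dashboards below.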

 

Imagine an instance outage. After restarting Jira, a Support Zip file is generated and parsed with JAGS.
The landing dashboard already gives us important insights into the possible problem.
jags-jira-landing-001.png
Within the Overall Metrics section we can see some characteristics of the problem:
  • A spike in CPU utilization.
  • Full GC activity.
  • Tomcat threads piling up.
  • Database query latency spiking.

These metrics alone might not pinpoint the issue, but the General Problems section reveals:

  • 25 OOM errors in logs.
  • A heap dump file might have been generated.
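If you want to confirm those counts directly from the Support Zip, the OOM entries can be searched in the application logs. A minimal sketch, assuming the zip was extracted into the current directory (log file names and locations can vary between versions and setups):

    # Count OutOfMemoryError entries across the Jira application logs
    grep -c "java.lang.OutOfMemoryError" atlassian-jira.log*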

 

Let's first check some details from the Garbage collection, heap and memory usage dashboard.
This dashboard gives us details about the following:
  • Heap and off-heap memory utilization.
  • Garbage collection (GC) activity.

jags-heap-gc-001.png

 

In this example we can see the following characteristics:
  • High heap consumption.
  • Full GC (G1 old generation) activity.
If we extend the time window a bit to check the heap utilization, we can see it was healthy over the past days, but something has been requiring more memory recently.
jags-heap-gc-002.png
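If you need more granular GC data than the dashboard offers, GC logging can also be enabled on the JVM itself. A minimal sketch of the startup option for Java 11 and later (the path is a placeholder; on Java 8 the older -verbose:gc, -XX:+PrintGCDetails and -Xloggc flags are used instead):

    # Unified GC logging (Java 11+): timestamped GC events, rotated across 5 files of 20 MB
    -Xlog:gc*:file=/path/to/gc.log:time,uptime,level,tags:filecount=5,filesize=20M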
With that information we can move to the "Out of memory errors" dashboard.
At a glance, it gives us relevant information:
  • Whether JVM properties relevant to the troubleshooting are set (see the sketch after the screenshot below).
  • The distribution of OOM errors and of generated heap dump files per node.

jags-oom-001.png
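The exact properties JAGS inspects aren't listed here, but the JVM options most commonly relevant to OOM troubleshooting are the heap dump ones. A minimal sketch (the path is a placeholder and must be writable by the user running Jira):

    # Write a heap dump automatically whenever an OutOfMemoryError is thrown
    -XX:+HeapDumpOnOutOfMemoryError
    # Directory (or file) where the .hprof file will be written
    -XX:HeapDumpPath=/path/to/dumps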

 

Expanding the other sections, we can see more details about the OOM errors over time.
jags-oom-002.png
An important panel is "Log entries about heap dump generated".
jags-oom-003.png
The heap dump generated during the OOM is paramount for continuing the investigation and identifying the root cause. This panel shows that file, as well as its size.
In this case, the heap dump couldn't be saved because of a lack of permissions on the chosen directory.
To learn more about configuring the relevant JVM parameters to collect the heap dump, check Analyze OutofMemory errors in Jira Server and Data Center with heap dumps.
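If the memory pressure is still present, a heap dump can also be captured on demand from the running JVM with the standard JDK jmap tool, instead of waiting for the next OOM. A minimal sketch, where <jira-pid> is the process ID of the Jira JVM and the target directory must be writable by that user:

    # Dump only live (reachable) objects in binary HPROF format
    jmap -dump:live,format=b,file=/path/to/dumps/jira-heap.hprof <jira-pid>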

 

JAGS helps quickly identify OOM errors as the cause of an incident. For the actual root cause analysis, examining the heap dump is necessary.

Partner with Atlassian Support for assistance.

Learn more about Atlassian Support here.

Kind regards,

Thiago Masutti
Premier Support Engineer | Atlassian
