JVM Internals

Block Diagram: JVM Architecture


Memory Concepts

  • The JVM organizes the memory it needs to run programs into several runtime data areas.
  • Some runtime data areas are shared by all threads; others are private to a single thread.
  • Structural details of runtime data areas are left to designers of individual JVM implementations.

Block Diagram: JVM Runtime Data Access

Runtime Data Area

(A) Shared Runtime Data Area:
1. Method Area (sometimes described as part of Non-Heap Memory)
– contains Class Information
2. Heap Area
– contains Objects
The above areas are shared by all threads running inside the virtual machine
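
To make "shared by all threads" concrete, here is a minimal sketch in which two threads update one object that lives on the heap. The class and method names (SharedHeapDemo, runTwoThreads) are made up for this illustration and are not part of any JVM API.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedHeapDemo {
    // Starts two threads that increment the same heap-allocated counter and
    // returns the final value once both threads have finished.
    static int runTwoThreads(int incrementsPerThread) {
        AtomicInteger shared = new AtomicInteger(); // a single object on the shared heap
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                shared.incrementAndGet(); // both threads touch the same object
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        // Each thread adds 1000, so the shared counter ends at 2000.
        System.out.println(runTwoThreads(1000));
    }
}
```

The loop variable i, by contrast, lives in each thread's own stack frame, so the two threads never see each other's copy.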

(B) Not-Shared Runtime Data Area – Exclusive to Each Thread:
1. JVM Stack
2. PC Register
3. Native Method Stacks.

  • As each new thread comes into existence, it gets its own pc register (program counter) and Java stack.
  • If the thread is executing a Java method (not a native method), the pc register holds the address of the JVM instruction currently being executed. While a native method is executing, its value is undefined.
  • The Java stack is composed of stack frames (or frames).
  • A stack frame contains the state of one Java method invocation. When a thread invokes a method, the Java virtual machine pushes a new frame onto that thread’s Java stack. When the method completes, the virtual machine pops and discards the frame for that method.
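
The push and pop of frames can be observed from inside a program. The sketch below is only an illustration (the class FrameDepthDemo and the method depthAfter are invented names): it recurses a given number of levels and then counts the frames on the current thread's stack with Thread.getStackTrace().

```java
public class FrameDepthDemo {
    // Recurses `extra` more levels, then reports how many frames are on the
    // current thread's stack at the deepest point.
    static int depthAfter(int extra) {
        if (extra == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return depthAfter(extra - 1); // each call pushes one more frame
    }

    public static void main(String[] args) {
        int base = depthAfter(0);
        int deeper = depthAfter(5);
        // Five extra invocations mean five extra frames at the deepest point.
        System.out.println("extra frames: " + (deeper - base));
    }
}
```

By the time depthAfter returns to main, all of those frames have been popped again, which is why the two measurements can be compared.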

Sun Hotspot Implementation

Let's look at how the Sun HotSpot implementation lays out the shared runtime data areas.

PermGen (Method Area)
– Contains class definitions, i.e. it holds metadata about classes and their methods.
Young & Tenured (Heap Area)
– Contains objects and arrays.

Young

      • Eden Space: The pool from which memory is initially allocated for most objects.
      • Two Survivor Spaces: The pools containing objects that have survived garbage collection of the Eden space.

Tenured (a.k.a Old Generation)

  • Tenured Generation: The pool containing objects that have existed for some time in the survivor spaces.
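
A running JVM will tell you which pools it actually has. The sketch below uses the standard java.lang.management API to list them; the class name MemoryPoolsDemo and the method describePools are made up here. The pool names depend on the JVM version and the collector in use, so do not expect them to match the names above exactly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

public class MemoryPoolsDemo {
    // Returns one line per memory pool, tagged as heap or non-heap.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(pool.getName() + " (" + kind + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Output varies by JVM and collector; no fixed output is assumed here.
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```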

OutOfMemory

An OutOfMemoryError is thrown when the heap cannot supply the memory needed to allocate a new object.
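
A quick way to see this error is to request an array far larger than the heap can hold. This is a sketch under the assumption that the JVM cannot satisfy a request for roughly 16 GiB in a single array; the exact error message differs between JVMs. OomDemo and hugeAllocationFails are names invented for the example.

```java
public class OomDemo {
    // Tries to allocate an array of Integer.MAX_VALUE longs (about 16 GiB)
    // and reports whether the JVM refused with an OutOfMemoryError.
    static boolean hugeAllocationFails() {
        try {
            long[] huge = new long[Integer.MAX_VALUE];
            return huge.length < 0; // only reached if the allocation succeeded
        } catch (OutOfMemoryError e) {
            return true; // the memory for the new object could not be allocated
        }
    }

    public static void main(String[] args) {
        System.out.println("allocation failed: " + hugeAllocationFails());
    }
}
```

Catching OutOfMemoryError like this is fine for a demonstration, but in real applications it is rarely a safe recovery strategy.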

Garbage Collectors

A garbage collector recovers the memory used by objects that are no longer reachable (dead objects).
We will look at:
● GC Classification
● Generational Collectors
● Hotspot JVM Collectors
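
Reachability is the key idea, and a weak reference makes it visible. The sketch below (the class ReachabilityDemo and its methods are illustrative names) shows that an object with a live strong reference survives a collection request, while an object with only a weak reference may be reclaimed. System.gc() is only a request, so the second result is likely but not guaranteed.

```java
import java.lang.ref.WeakReference;

public class ReachabilityDemo {
    // The object stays reachable through `strong`, so the collector must keep it.
    static boolean survivesWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // a request, not a command
        return weak.get() == strong; // `strong` is still in use here
    }

    // No strong reference remains, so the object is eligible for collection.
    static boolean clearedAfterRelease() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        return weak.get() == null; // usually true, but the JVM makes no promise
    }

    public static void main(String[] args) {
        System.out.println("reachable object survived: " + survivesWhileReferenced());
        System.out.println("unreachable object cleared: " + clearedAfterRelease());
    }
}
```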

GC Classification

● Serial vs Parallel
● Stop-the-world vs Concurrent
● Compacting vs Non-compacting vs Copying

GC Classification: Serial vs Parallel

In serial collection, only one thing happens at a time: a single thread does all the collection work, even when multiple CPU cores are available.

In parallel collection, the task of collection is divided into subtasks and executed in parallel, possibly on multiple CPUs. This speeds up collection but is more complex and leads to potential fragmentation.

GC Classification: Stop-the-world vs Concurrent

Stop-the-world collectors suspend the entire application during collections.

Concurrent collectors run concurrently with the application (though there can still be occasional stop-the-world phases). Pauses are shorter, but the collector has to operate on objects that the running application is using at the same time. This adds overhead to the collector and requires more CPU power and a larger heap.

GC Classification: Compacting vs Non-Compacting vs Copying

Compacting collectors move all the live objects together into contiguous memory, so the remaining space can be treated as one free block. Collection is slower, but allocation is faster because the next object can simply be placed at the start of the free block.

Non-compacting collectors free dead objects in place. Collection is faster, but the free space ends up fragmented.

Copying collectors copy (rather than move in place) all the live objects to a different area of memory, after which the source area can be considered free. Collection is slower and more expensive because of the copying and the extra space it needs, but allocation performance is better.
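
The difference between compacting and copying is easier to see on a toy model. In the sketch below a "heap" is just an array of slots, and a null slot stands for a dead object or free space. This is an assumption made purely for illustration: a real collector finds live objects by tracing references, and nothing here reflects how HotSpot is implemented. ToyHeap, compact and copy are invented names.

```java
import java.util.Arrays;

public class ToyHeap {
    // Compacting: slide live objects toward index 0 inside the same array.
    // Returns the index of the first free slot after compaction.
    static int compact(String[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String live = heap[i];
                heap[i] = null;      // vacate the old slot
                heap[free++] = live; // place the object in the next free slot
            }
        }
        return free;
    }

    // Copying: copy live objects into a separate to-space, then treat the
    // whole from-space as free. Returns the number of objects copied.
    static int copy(String[] fromSpace, String[] toSpace) {
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null) {
                toSpace[next++] = obj;
            }
        }
        Arrays.fill(fromSpace, null); // the source area is now entirely free
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"a", null, "b", null, "c"};
        int firstFree = compact(heap);
        System.out.println(Arrays.toString(heap) + ", first free slot = " + firstFree);

        String[] from = {"x", null, "y"};
        String[] to = new String[3];
        int copied = copy(from, to);
        System.out.println(Arrays.toString(to) + ", copied = " + copied);
    }
}
```

In both cases the free space ends up contiguous, which is what makes the following allocations cheap. The copying variant pays for it with a second area of memory.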

Generational Collectors

Hotspot JVM

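Contains 3 main garbage collectors (the parallel compacting collector described further down is an enhanced version of the parallel collector):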
1. Serial collector
2. Parallel collector
3. Concurrent mark-sweep collector

Hotspot JVM: Serial Collector (Mark-Sweep-Compact Collector)

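It is a serial, stop-the-world collector: the young generation is collected by copying, and the old generation by mark-sweep-compact. Because it uses a single thread and suspends the application while it runs, it is not a very efficient collector on multi-CPU machines.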

Hotspot JVM: Parallel Collector (Throughput Collector)

This is very similar to the serial collector in many ways. In fact the only notable difference is that parallel collector uses multiple threads to perform the young generation collection.

By default, the number of threads used for collection matches the number of CPUs available (on machines with many CPUs, HotSpot uses only a fraction of them). The old generation collection is still carried out by a single thread in serial fashion.

This is the default collector in the Java HotSpot server JVM.

Hotspot JVM: Parallel Compacting Collector

This is an enhanced version of the parallel collector. It uses multiple threads to perform the old generation collection as well.

Hotspot JVM: Concurrent Mark-Sweep Collector (CMS Collector)

While the parallel collectors give prominence to application throughput, this collector gives prominence to low response time.

It uses the same young generation collection algorithm as the parallel collectors. But the old generation collection is performed concurrently with the application instead of going to stop-the-world mode (at least most of the time).

Live objects are first marked, and then a concurrent sweep phase reclaims the dead objects in place. CMS is not a compacting collector, so it has to track free memory with a set of free lists, which makes allocation in the old generation more expensive.

CMS is best suited to large heaps. Because collection happens concurrently, the old generation continues to grow while a collection is in progress, so the heap must be large enough to accommodate that growth.

Another issue with CMS is floating garbage: objects that were marked as live may become garbage before the collection cycle ends. They are cleaned up in the next collection cycle.
CMS also requires a lot of CPU power.

Choosing suitable Collector

Serial Collector
– Apps that have a small data set.
– Single-CPU machines.
– No low-pause requirement.

Parallel Collector
– Multiple CPU machines
– App throughput is important, slightly longer pauses acceptable.
– No Low Pause requirement

Parallel Compacting Collector
– Preferred over Parallel Collector usually.
– Has higher CPU usage.

CMS Collector
– Low pause times and short response times are required.
– Uses a lot of CPU resources, so it suits multi-CPU machines (if limited CPU resources are available, CMS can be run in incremental mode).
– Apps with large sets of long-lived data (a large old generation).
– Web servers are a good candidate.

GC Command-line Options

● Serial collector: -XX:+UseSerialGC
● Parallel collector: -XX:+UseParallelGC
● Parallel compacting collector: -XX:+UseParallelOldGC
(combine with -XX:+UseParallelGC)
● CMS collector: -XX:+UseConcMarkSweepGC
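
To check which collectors a JVM ended up using after you pass one of these flags, you can ask the standard management API. A minimal sketch (the class ActiveCollectorsDemo and the method collectorNames are names chosen for this example); the names reported depend on the JVM version and the selected collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectorsDemo {
    // Returns the names of the collectors registered in this JVM, typically
    // one for the young generation and one for the old generation.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with different -XX:+Use...GC flags and compare the output.
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```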

Additional GC Settings

Additional Parallel collector settings
● Parallel GC thread count: -XX:ParallelGCThreads=n
● Desired maximum pause length: -XX:MaxGCPauseMillis=n
● Throughput goal (the ratio of GC time to application time is 1/(1+n); the default n=99 targets 99% of time spent on the application):
-XX:GCTimeRatio=n
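
The GCTimeRatio value is easy to misread, so here is the arithmetic as a small sketch. HotSpot documents the setting as a GC-time to application-time ratio of 1/(1+n); the class GcTimeRatioDemo and the helper gcTimeFraction are invented for this example.

```java
public class GcTimeRatioDemo {
    // GC-time to application-time ratio targeted by -XX:GCTimeRatio=n.
    static double gcTimeFraction(int n) {
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        // n=99 is the default discussed above; n=19 is just another sample value.
        for (int n : new int[] {99, 19}) {
            System.out.printf("GCTimeRatio=%d -> GC goal %.1f%%%n", n, 100 * gcTimeFraction(n));
        }
    }
}
```

So a larger n asks for less time in GC, and n=19 would accept about 5%.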

Additional CMS collector settings
● Enable incremental mode: -XX:+CMSIncrementalMode
● Parallel GC thread count: -XX:ParallelGCThreads=n
● Old gen occupancy threshold that triggers collections: -XX:CMSInitiatingOccupancyFraction=n

More Settings…

Heap sizing options
● Initial size: -Xms
● Maximum size: -Xmx
● Initial size of the new generation: -XX:NewSize=n
● Maximum size of the perm gen space: -XX:MaxPermSize=n (only on JVMs that still have a PermGen, i.e. before Java 8)
● Ratio between old and new generation sizes: -XX:NewRatio=n
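
To see what the sizing flags did, a program can read the heap limits back through java.lang.Runtime. A small sketch (HeapSizeDemo and heapSizes are illustrative names): maxMemory() roughly corresponds to -Xmx, while totalMemory() is the heap currently reserved, which starts near -Xms.

```java
public class HeapSizeDemo {
    // Returns { free, total, max } heap sizes in bytes as seen by this JVM.
    static long[] heapSizes() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.freeMemory(), rt.totalMemory(), rt.maxMemory() };
    }

    public static void main(String[] args) {
        long[] s = heapSizes();
        long mib = 1024 * 1024;
        // Try running with e.g. -Xms64m -Xmx256m and compare the numbers.
        System.out.println("free  = " + s[0] / mib + " MiB");
        System.out.println("total = " + s[1] / mib + " MiB");
        System.out.println("max   = " + s[2] / mib + " MiB");
    }
}
```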

Debug options
● Print basic GC info: -XX:+PrintGC
● Print verbose GC info: -XX:+PrintGCDetails
● Add time stamps to the GC output: -XX:+PrintGCTimeStamps

Classloaders

TODO

TODO:

1. http://techfeast-hiranya.blogspot.in/2010/11/taming-
