Java Virtual Machine (JVM) Internals: How Java Code Runs?

Are you a Java enthusiast looking to learn about the Java Virtual Machine internals? Do you want to know how Java code runs? Then this article is for you!

Java is a powerful programming language known for its platform independence and scalability. This is possible because of the Java Virtual Machine (JVM), which acts as a bridge between Java code and the operating system.

So, let us learn more about the JVM internals! Ready to begin?

Summary Of The Article:

  • JVM stands for Java Virtual Machine and is responsible for the execution of Java code.
  • It has different components – ClassLoader, Data Areas, and Execution Engine.
  • Following the best practices and performance optimization strategies will help us to write better code and deal with internal issues.

What Is Java Virtual Machine (JVM)? Read Below

To learn how the Java code works, we will first have to understand what the JVM is. For this, let us start from the basics and learn what a VM – virtual machine is.

A virtual machine is a software representation of a physical computer. It runs on a physical computer called the host machine and acts as a layer between the programs running inside it and the host's operating system and hardware.

[Figure: what is the JVM]

One host machine or computer can have several virtual machines to run several applications. One such virtual machine is the Java Virtual Machine.

JVM or Java Virtual Machine provides us with an environment to run Java applications on any kind of operating system without modification. This makes Java a cross-platform programming language.

In simple words, it follows the “write once, run anywhere” (WORA) principle. The JVM acts as an abstraction layer between the Java code and the underlying operating system and hardware, which is what gives Java its platform independence.

If you’re wondering why some developers find Java challenging, check out common hurdles developers face while learning it.

Role of JVM in Java Execution 

The JVM is a part of the Java Runtime Environment (JRE). When we write Java code, it is first compiled to bytecode, and this bytecode is then executed by the JVM. Let me describe the Java code execution process step by step so that it is easy for you to understand.

  • The developer writes the Java code in a .java file.
  • The Java compiler (javac) converts it to bytecode, i.e., the .class file.
  • The JVM then loads it into memory and verifies its correctness.
  • The bytecode is interpreted first, and the frequently executed parts are compiled to machine code by the Just-In-Time (JIT) compiler.
  • Memory is managed with the help of the Garbage Collector.
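
The steps above can be tried out with a tiny program. This is a minimal sketch, assuming a hypothetical file named HelloJvm.java; the commands in the comments are the standard JDK tools, and the class name is only for illustration.

```java
// HelloJvm.java (hypothetical file name, used only for illustration)
// Compile to bytecode:  javac HelloJvm.java   -> produces HelloJvm.class
// Run on the JVM:       java HelloJvm
// Inspect the bytecode: javap -c HelloJvm
class HelloJvm {

    // Builds a short description of the JVM that is running this class.
    static String describeRuntime() {
        return System.getProperty("java.vm.name")
                + " / Java " + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println("Running on: " + describeRuntime());
    }
}
```

The same HelloJvm.class file can be copied to another operating system and run there without recompiling, as long as a JVM is installed.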

An Overview Of The JVM Architecture

Let us now have a look at the JVM architecture. This will help us by providing us with insights about the Java Virtual Machine Internals. Below are the key components of the JVM architecture.

[Figure: JVM architecture]

  • ClassLoader subsystem
  • Runtime data area
  • JVM execution engine

Let us understand each of these components in detail. Keep reading to know more!

1. ClassLoader subsystem

Before Java code can be executed, it is compiled to bytecode. The bytecode is stored in a .class file. This class file is loaded into memory with the help of the class loader subsystem.

The first application class to be loaded is the one that contains the main() method, because this is where program execution starts. The class loader subsystem is responsible for three operations.

  • Loading
  • Linking
  • Initialization

[Figure: class loader subsystem]

Let us also discuss them in detail. Read below!

Loading

This is the initial phase of the class loader, where the .class file is read and loaded into the method area along with some information about the class. This information includes the fully qualified class name, the name of its immediate parent class, details of its data members and member functions, and whether the .class file represents a class, an interface, or an enum.

Java also offers us some built-in class loaders. These are:

  • Bootstrap ClassLoader: This class loader is used to load the core Java API classes. These classes ship with the Java runtime in the JAVA_HOME/lib directory. It is also known as the Root ClassLoader.

It is written in native code, not Java. Examples of the standard Java API classes that it loads are java.lang.String, java.util.ArrayList, and java.io.File.

  • Extension ClassLoader: From Java 9 onwards, this is called the Platform ClassLoader. It lies in the middle of the delegation hierarchy, i.e., it is the child of the Bootstrap ClassLoader and the parent of the Application ClassLoader. This is a parent-child delegation relationship, not class inheritance.

In Java 8 and earlier, it loads the extension libraries from the JAVA_HOME/jre/lib/ext directory as well as any other directory specified by the java.ext.dirs system property. These libraries are not a part of the core APIs but are still needed for execution.

  • Application ClassLoader: This ClassLoader loads the classes of user-defined Java programs and external jar files. It finds them on the CLASSPATH, which lists the directories of compiled application classes and the custom jar files.

By default, the CLASSPATH is set to the current directory; however, it can also be changed by the user. This ClassLoader is implemented in Java.
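
To see this hierarchy on your own machine, you can ask a few classes which loader defined them. The following is a minimal sketch using only standard java.lang APIs; ClassLoaderDemo and loaderOf are hypothetical names, and the exact loader class names printed depend on your Java version.

```java
import java.util.ArrayList;

class ClassLoaderDemo {

    // Returns a readable name for the loader that defined the given class.
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // The bootstrap loader is native code, so Java code sees it as null.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String          -> " + loaderOf(String.class));
        System.out.println("ArrayList       -> " + loaderOf(ArrayList.class));
        System.out.println("ClassLoaderDemo -> " + loaderOf(ClassLoaderDemo.class));

        // Walk up the delegation chain, starting from the application class loader.
        ClassLoader current = ClassLoader.getSystemClassLoader();
        while (current != null) {
            System.out.println("delegation chain -> " + current.getClass().getName());
            current = current.getParent();
        }
    }
}
```

The chain ends when getParent() returns null, which is how the Bootstrap ClassLoader is represented in Java code.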

Linking

After loading, linking is the next step, and it is followed by initialization. Linking makes sure that the bytecode is structurally valid and all the dependencies are resolved. It has the three phases given below.

  • Verification: In this phase, the JVM checks whether the bytecode structure is valid and correct. This is done by verifying it against the JVM constraints. If verification fails, the JVM throws a java.lang.VerifyError.
  • Preparation: In this phase, the JVM allocates memory to the class-level variables and assigns default values to static variables. Doing this ensures that memory is reserved for the static fields before the class is fully initialized.
  • Resolution: This is the final phase of the linking process. Here, the symbolic references that are used in the bytecode are replaced with the direct memory references for execution.
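
The default values assigned during preparation can be observed from Java code. This is a minimal sketch with hypothetical names (PreparationDemo, SEEN_EARLY, counter): a static field is read through a method before its own initializer has run, so the value seen is still the default for int.

```java
class PreparationDemo {

    // Runs first in textual order, before the initializer of 'counter' below.
    // At this point 'counter' still holds the default value set during preparation.
    static final int SEEN_EARLY = readCounter();

    // This assignment only happens later, during initialization.
    static int counter = 42;

    static int readCounter() {
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("Value seen before the initializer ran: " + SEEN_EARLY); // 0
        System.out.println("Value after initialization: " + counter);              // 42
    }
}
```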

Initialization

Initialization is the last phase before the class is ready for execution. In this phase, the static variables are assigned the values given in the code, replacing the default values set during preparation.

Overall, in this phase, the static variable initializers and the static blocks (if any are present) are executed in the order in which they appear in the class, and a parent class is initialized before its child class. The JVM triggers initialization when a class is first actively used, for example, when an instance is created, a static method is called, or a non-constant static field is accessed.
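
The order of initialization can be recorded with a small experiment. This is a sketch under stated assumptions: InitOrderDemo, Parent, Child, and EVENTS are hypothetical names, and the list only exists to capture the order in which the steps run.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {

    // Records the order in which the initialization steps run.
    static final List<String> EVENTS = new ArrayList<>();

    static class Parent {
        static { EVENTS.add("Parent static block"); }
    }

    static class Child extends Parent {
        static int value = record("Child static initializer");
        static { EVENTS.add("Child static block"); }

        static int record(String event) {
            EVENTS.add(event);
            return 1;
        }
    }

    public static void main(String[] args) {
        EVENTS.add("main started");
        // First active use of Child: reading a static field triggers its initialization.
        int value = Child.value;
        EVENTS.add("read Child.value = " + value);
        EVENTS.forEach(System.out::println);
    }
}
```

Nothing from Parent or Child is recorded before "main started", because neither class is initialized until it is actually used. The parent's static block is then recorded before anything from the child.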

2. Runtime Data Areas in JVM

The next component that we are going to discuss is the Runtime Data Areas in JVM. There are five components in the runtime data areas. Let’s see them one at a time.

[Figure: runtime data areas]

Method Area

The method area stores the metadata of the class, such as its structure and method definitions. It also holds the static variables, the bytecode of the methods, and the runtime constant pool, as well as the resolved symbolic references.

Consider the Java code given below.

    public class CodingZap {

        private String coding;
        private int day;

        // Parameterized constructor: its bytecode is part of the class metadata.
        public CodingZap(String coding, int day) {
            this.coding = coding;
            this.day = day;
        }
    }

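Here, you can see that a parameterized constructor is defined for the public class CodingZap. The class metadata, i.e., the definitions of the fields coding and day and the bytecode of the constructor, is stored in the method area, whereas the values of these instance variables are stored in the heap with each object. It is also important to note that there is only one method area for each JVM.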
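
The difference between class metadata and object data can be illustrated with reflection. This is a minimal sketch, assuming a hypothetical nested class Course that mirrors the example above: the field definitions can be read without creating any object, while the field values exist only once per object.

```java
import java.lang.reflect.Field;

class MetadataDemo {

    // Hypothetical class used only for this illustration.
    static final class Course {
        private final String coding;
        private final int day;

        Course(String coding, int day) {
            this.coding = coding;
            this.day = day;
        }
    }

    // Field definitions come from the class metadata; no object is needed to read them.
    static int declaredFieldCount() {
        return Course.class.getDeclaredFields().length;
    }

    public static void main(String[] args) {
        for (Field field : Course.class.getDeclaredFields()) {
            System.out.println("Declared field: "
                    + field.getType().getSimpleName() + " " + field.getName());
        }

        // Two objects on the heap, each with its own field values...
        Course first = new Course("Java", 1);
        Course second = new Course("JVM", 2);
        System.out.println(first.coding + " (day " + first.day + "), "
                + second.coding + " (day " + second.day + ")");

        // ...but both share a single Class object that describes their structure.
        System.out.println("Same class metadata: " + (first.getClass() == second.getClass()));
    }
}
```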

Heap Area

Next, we have the heap area. It is the largest memory area in JVM. It is used to store objects and their instance variables. The heap area is managed by the Garbage Collector.

The heap area is shared across all threads. It is important to know that holding references to objects for longer than needed can lead to memory leaks. It can also lead to OutOfMemoryError if the memory is exhausted.

In modern JVMs, the heap is typically divided into a young and an old generation, and class metadata is kept in a separate space called Metaspace. These are:

  • Young generation: This area contains all the new objects that are created in Java. When this memory gets filled, a minor garbage collection is performed.
  • Old generation: This area has objects that have a long lifespan and have survived many rounds of garbage collection. When this area gets filled as well, a major garbage collection is performed.
  • Metaspace: Before Java 8, class metadata was kept in the PermGen memory area, which has been replaced by Metaspace. Metaspace lives in native memory outside the heap and, by default, grows automatically as more classes are loaded.
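
You can list the memory spaces of your own JVM with the standard java.lang.management API. This is a minimal sketch; MemoryPoolsDemo is a hypothetical name, and the pool names printed depend on the garbage collector and Java version in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.List;

class MemoryPoolsDemo {

    // All memory pools the running JVM exposes (heap and non-heap).
    static List<MemoryPoolMXBean> pools() {
        return ManagementFactory.getMemoryPoolMXBeans();
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : pools()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long usedKb = usage == null ? -1 : usage.getUsed() / 1024;
            System.out.printf("%-35s %-9s used=%d KB%n", pool.getName(), kind, usedKb);
        }

        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Maximum heap size (MB): " + maxHeapMb);
    }
}
```

On a typical HotSpot JVM, the young and old generation pools are reported as heap, while Metaspace is reported as non-heap.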

Stack Area

The stack area in JVM is used to store the call frames (stack frames), local variables, and intermediate results. Here, every thread has its own stack. A frame is automatically pushed onto the stack when a method call starts and popped when it finishes.

Just as the heap can run out of memory, the stack can overflow: a StackOverflowError occurs if the call depth exceeds the stack size. This typically happens if your recursive function is wrongly implemented, leading to infinite recursion.
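
Here is a minimal sketch of how infinite recursion fills the stack. StackDepthDemo and its methods are hypothetical names; the number of frames reached varies with the JVM, the platform, and the thread stack size.

```java
class StackDepthDemo {

    private static int depth;

    // Recursion without a base case: every call pushes a new frame onto the stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before the stack ran out.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames are popped as the error propagates, so this thread can continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```

Catching StackOverflowError is done here only to measure the depth. In real code, the fix is to add a correct base case or to rewrite the recursion as a loop.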

Program Counter (PC) Registers

The program counter registers are extremely helpful in the cases of multithreading. Each thread gets its own program counter register, which holds the address of the JVM instruction that the thread is currently executing.

The PC registers help the JVM to track the execution of each thread. They usually don’t cause any errors because they need very little space.

Native Method Stack

The purpose of this Runtime Data Area is to handle the execution of non-Java (native) methods. It supports the Java Native Interface or JNI. It is used when Java interacts with the system libraries or native APIs.

If you are using native APIs in your code, you should be careful, as buffer overflows in the native code can lead to security risks.

3. JVM Execution Engine

Next up, we have the third component of the Java Virtual Machine Internals. It is the JVM execution engine. This is responsible for the Java bytecode execution. The execution engine in Java also has some components.

[Figure: execution engine]

Each component of the JVM execution engine is extremely important for the language’s smooth functioning. Let’s see them one by one.

Interpreter

The interpreter reads and executes the bytecode one instruction at a time, similar to how the Python interpreter executes its bytecode. In Java, the interpreter is used where you need the immediate execution of the code, since it can start without waiting for compilation. However, it is slower than running code compiled by the JIT compiler.

The Java interpreter is suitable for short-lived applications and rarely executed code. The slower execution is because each instruction has to be decoded every time it is executed, which adds overhead.

JIT Compiler

JIT stands for Just-In-Time compilation. It enables faster execution of the Java code, so it works alongside the interpreter to overcome its slowness.

The JIT compiler identifies the hotspots in the code, which are essentially the frequently executed parts, and compiles them to native machine code. This code is stored in the code cache and can be reused quickly whenever that part of the program runs again.

However, the JIT compiler takes up memory and CPU time while it compiles. In return, it applies optimizations such as method inlining and loop unrolling to the hot code. We can help it by keeping frequently called methods small and simple.

In Java, you get two JIT compilers, C1 and C2. When they work together, it is called tiered compilation. Let’s get to know about them.

  • C1 compiler (client compiler): The client compiler or C1 compiler is suitable for shorter-lived applications that need a fast startup, as it compiles quickly and applies only light optimizations. From Java 8 onwards, tiered compilation is enabled by default, so code is compiled by C1 first.
  • C2 compiler (server compiler): While the C1 compiler facilitates faster compilation, the C2 compiler produces more efficient machine code as it analyses the code for a longer time. Thus, it is suitable for long-running server applications. Java 10 also introduced the Graal compiler as an experimental alternative to C2. Graal can be used for both JIT and ahead-of-time compilation.
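
To watch the JIT compiler at work, you can call a small method many times and then ask the JVM how much time it has spent compiling. This is a minimal sketch using the standard CompilationMXBean; JitWarmupDemo and sumOfSquares are hypothetical names, and the timings differ from run to run.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitWarmupDemo {

    // A small, frequently called method: a typical candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Calling the method repeatedly makes it "hot" for the JIT compiler.
        for (int round = 0; round < 20_000; round++) {
            result = sumOfSquares(1_000);
        }
        System.out.println("Result: " + result);

        // The bean is null when the JVM has no JIT compiler.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("JIT compiler: " + jit.getName());
            System.out.println("Time spent compiling (ms): " + jit.getTotalCompilationTime());
        }
    }
}
```

On HotSpot, running the same class with the -Xint option disables the JIT compiler so that everything is interpreted, and -XX:+PrintCompilation prints each method as it gets compiled.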

Garbage Collector (GC)

We have used the term Garbage Collector in the above sections of this article. Now, it is time to finally understand what it is. 

Garbage collection is the process of removing the unreferenced (unreachable) objects from the heap area. Java has a built-in garbage collector that makes it easier to manage memory. In its basic form, it has two phases. These are:

  • Mark: Identify unused or un-referenced objects in the memory.
  • Sweep: Remove the previously identified objects.

There are different kinds of Garbage collectors provided by the JVM. These are –

  • Serial Garbage Collector: Uses a single thread and is ideal for small applications.
  • Parallel Garbage Collector: Uses multiple threads and was the default GC up to Java 8. It is also called the throughput collector.
  • CMS Garbage Collector: It stands for Concurrent Mark Sweep GC. It collects the unused objects concurrently with the application and reduces the application pauses. It was deprecated in Java 9 and removed in Java 14.
  • G1 Garbage Collector: It is called the Garbage First collector and is ideal for applications with large heaps. It balances the throughput and application pause time, and it is the default GC since Java 9.
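
To check which collectors your JVM is actually using, you can query the standard GarbageCollectorMXBean interface. This is a minimal sketch; GcInfoDemo is a hypothetical name, and the collector names and counts depend on the JVM and the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInfoDemo {

    // Names of the collectors that are active in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Create short-lived garbage so that the collectors may have some work to do.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] scratch = new byte[1024];
            checksum += scratch.length;
        }
        System.out.println("Allocated bytes: " + checksum);

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

A specific collector can be selected on HotSpot with flags such as -XX:+UseSerialGC, -XX:+UseParallelGC, or -XX:+UseG1GC, and the names printed by this program change accordingly.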

Your garbage collection logs can tell you a lot about the memory behaviour of your application. You simply have to run the following command in the terminal (Java 9 or later) to get the logs.

    java -Xlog:gc* -jar javaGC.jar

This prints a log entry for every garbage collection the JVM performs while the application runs. Below is the terminal output that shows the format of these log entries.

[Figure: garbage collection logs]

JVM Performance Optimization Strategies For Efficient Execution

It is very important to understand that Java applications can suffer from various bottlenecks due to inefficient management and other factors. Thus, we should keep in mind some JVM performance optimization strategies so that our applications turn out effective.

[Figure: performance optimization strategies]

  • Tuning JVM Parameters: Heap size tuning, thread management, and garbage collection, if done properly, can lead to performance enhancement of JVM. You can set a range for heap size and use a suitable garbage collector for your applications.
  • Reduce Memory Leaks: Always make sure that you are using proper techniques to reduce and prevent memory leaks. For this, you can use weak references and proper resource cleanup. Tools like VisualVM can also help in identifying memory leaks.
  • Minimize Object Creation: By doing this, we can reduce the GC overhead and speed up execution. You can also use object pooling for frequently used objects and prefer primitives instead of wrapper classes. 
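
Before tuning anything, it helps to check which settings the JVM is actually running with. This is a minimal sketch using standard APIs; JvmSettingsDemo is a hypothetical name, and the flag values in the comment are examples only, not recommendations.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmSettingsDemo {

    // Maximum heap size the JVM will try to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Flags passed to the JVM at startup, for example:
        //   java -Xms256m -Xmx1g -XX:+UseG1GC JvmSettingsDemo
        List<String> jvmFlags = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM flags: " + jvmFlags);

        Runtime runtime = Runtime.getRuntime();
        System.out.println("Maximum heap (MB): " + maxHeapMb());
        System.out.println("Heap currently reserved (MB): "
                + runtime.totalMemory() / (1024 * 1024));
        System.out.println("Available processors: " + runtime.availableProcessors());
    }
}
```

Here, -Xms sets the initial heap size and -Xmx sets the maximum heap size, which together give the heap size range mentioned above.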

Real-World Scenarios Where JVM Helps Large-Scale Apps

You may be wondering why you should use Java for building applications when so many new programming languages are emerging. The answer is that Java is a powerful programming language whose runtime, the JVM, offers many options for performance optimization.

Let’s have a look at some practical scenarios where JVM tuning techniques were used. Read below!

  • Using G1GC for predictable pause times: Streaming platforms like Netflix use this garbage collector, which does much of its work concurrently with the application. This way, users can enjoy uninterrupted streaming. It divides the heap space into regions and collects the regions that contain the most garbage first.
  • Migration from HotSpot JVM to GraalVM: As we said above, the Graal compiler can perform both just-in-time and ahead-of-time compilation, which can reduce the overhead and result in faster execution. Thus, platforms like Twitter (X) make use of it to deal with the high volume of tweets and memory consumption.
  • In Spring Boot Applications: We use the Spring Boot framework to build cloud-native applications. These applications are time-critical and thus need a solution for faster execution and memory management. 

If you’re curious about how other platforms manage performance, check out these real-world Python projects.

Conclusion:

I hope that by now you have understood the internal workings of the JVM. Understanding the JVM internals is a good way to learn how Java code runs, and it helps developers get a better grasp of the language.

While the JVM takes care of its internal mechanism, we can also optimize our code to further enhance the performance of our Java applications. Using the resources wisely will also prevent any downtime and security risks.

Takeaways:

  • JVM is an essential component of the Java Runtime Environment.
  • The JIT compiler boosts the performance and execution of the code by overcoming the shortcomings of the Java interpreter.
  • Performing Garbage collection can help to manage the memory utilization properly.
