A high-level language like Java is first translated by a compiler into bytecode, which the JVM then interprets or compiles to native code at run time. The javac compiler generates this bytecode when it compiles the source code and writes it to a .class file.

Have you ever wondered what a .class file, which looks like nothing but hexadecimal code, actually contains, and which code serves which purpose? If so, you are in the right place!

Let's write code for a subtract operation and generate its bytecode by compiling it.

public class AddToValue {
  public static void main(String[] args) {
    long maxLong = Long.MAX_VALUE;
    long secondMinimumPositiveLong = 1;
    long secondMaxLong = maxLong - secondMinimumPositiveLong;
    System.out.println(secondMaxLong);
  }
}

A high-level language hides the complexity of how the JVM or the operating system will read this code. Suppose we declare the max long value: this is a simple instruction to us, and how the value of the maxLong variable ends up in a CPU register is not our headache. We rely on the abstraction layer.

Still, the high-level code must be converted into instructions for the Java Virtual Machine, known as bytecode.

Let's compile this program.

javac AddToValue.java

This produces the file AddToValue.class. Opened in a hex viewer, its contents look like this:

cafe babe 0000 0037 001f 0a00 0800 1107
0012 057f ffff ffff ffff ff09 0013 0014
0a00 1500 1607 0017 0700 1801 0006 3c69
6e69 743e 0100 0328 2956 0100 0443 6f64
6501 000f 4c69 6e65 4e75 6d62 6572 5461
626c 6501 0004 6d61 696e 0100 1628 5b4c
6a61 7661 2f6c 616e 672f 5374 7269 6e67
3b29 5601 000a 536f 7572 6365 4669 6c65
0100 0f41 6464 546f 5661 6c75 652e 6a61
7661 0c00 0900 0a01 000e 6a61 7661 2f6c
616e 672f 4c6f 6e67 0700 190c 001a 001b
0700 1c0c 001d 001e 0100 0a41 6464 546f
5661 6c75 6501 0010 6a61 7661 2f6c 616e
672f 4f62 6a65 6374 0100 106a 6176 612f
6c61 6e67 2f53 7973 7465 6d01 0003 6f75
7401 0015 4c6a 6176 612f 696f 2f50 7269
6e74 5374 7265 616d 3b01 0013 6a61 7661
2f69 6f2f 5072 696e 7453 7472 6561 6d01
0007 7072 696e 746c 6e01 0004 284a 2956
0021 0007 0008 0000 0000 0002 0001 0009
000a 0001 000b 0000 001d 0001 0001 0000
0005 2ab7 0001 b100 0000 0100 0c00 0000
0600 0100 0000 0100 0900 0d00 0e00 0100
0b00 0000 3c00 0400 0700 0000 1414 0003
400a 421f 2165 3705 b200 0516 05b6 0006
b100 0000 0100 0c00 0000 1600 0500 0000
0300 0400 0400 0600 0500 0b00 0600 1300
0700 0100 0f00 0000 0200 10
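
The first ten bytes of this dump can already be read by hand: a class file starts with the magic number cafe babe, followed by the minor version, the major version, and the constant pool count. The following is a minimal sketch of that reading, not a full class-file parser. The class name ClassFileHeader and its fields are made up for illustration, and the bytes are copied from the dump above. Major version 55 (hex 37) corresponds to Java 11.

```java
import java.nio.ByteBuffer;

class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Reads the first ten bytes of a class file. ByteBuffer is big-endian by
    // default, which matches the byte order of the class-file format.
    static ClassFileHeader parse(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();              // u4 magic
        int minor = buf.getShort() & 0xFFFF;   // u2 minor_version
        int major = buf.getShort() & 0xFFFF;   // u2 major_version
        int cpCount = buf.getShort() & 0xFFFF; // u2 constant_pool_count
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    public static void main(String[] args) {
        // First ten bytes of the dump: cafe babe 0000 0037 001f
        byte[] firstBytes = {
            (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe,
            0x00, 0x00, 0x00, 0x37, 0x00, 0x1f
        };
        ClassFileHeader h = parse(firstBytes);
        System.out.printf("magic=0x%08X minor=%d major=%d constant_pool_count=%d%n",
                h.magic, h.minorVersion, h.majorVersion, h.constantPoolCount);
        // prints: magic=0xCAFEBABE minor=0 major=55 constant_pool_count=31
    }
}
```

To try it on a real file, the byte array could come from Files.readAllBytes on the path of AddToValue.class instead of the hard-coded bytes.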

According to The Java Virtual Machine Instruction Set, every opcode is a single byte, so there are 256 possible opcode values, of which roughly 200 are assigned to instructions. The generated bytecode uses a handful of them, and we will find out which byte stands for which instruction further down. Unlike native machine code, this instruction set does not vary between x86 and x64: the same .class file runs on any platform. The JVM is a stack machine, so its instructions work on an operand stack and local variables rather than on named CPU registers. Mapping them to real registers is left to the JVM implementation on each platform.

Let's disassemble this class and print the instructions contained in the bytecode.

javap -c AddToValue.class

Compiled from "AddToValue.java"
public class AddToValue {
  public AddToValue();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: ldc2_w        #3                  // long 9223372036854775807l
       3: lstore_1
       4: lconst_1
       5: lstore_3
       6: lload_1
       7: lload_3
       8: lsub
       9: lstore        5
      11: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
      14: lload         5
      16: invokevirtual #6                  // Method java/io/PrintStream.println:(J)V
      19: return
}
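
The local variable numbers in this listing (1, 3, and 5) skip every other index because a long occupies two consecutive local variable slots, while the String[] parameter of the static main method takes slot 0. The sketch below reproduces that numbering for this simple case. LocalSlots and assign are hypothetical names, and the type names are plain strings chosen for illustration rather than real JVM descriptors.

```java
import java.util.Arrays;

class LocalSlots {
    // Returns the starting local-variable index of each entry, in order.
    // long and double take two consecutive slots; everything else takes one.
    static int[] assign(String... types) {
        int[] start = new int[types.length];
        int next = 0;
        for (int i = 0; i < types.length; i++) {
            start[i] = next;
            boolean wide = types[i].equals("long") || types[i].equals("double");
            next += wide ? 2 : 1;
        }
        return start;
    }

    public static void main(String[] args) {
        // args, maxLong, secondMinimumPositiveLong, secondMaxLong
        int[] slots = assign("String[]", "long", "long", "long");
        System.out.println(Arrays.toString(slots)); // prints: [0, 1, 3, 5]
    }
}
```

The next free slot after these four variables would be 7, which agrees with the max_locals value 0007 stored for main in the hex dump.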

We will map the bytecode to these printed instructions. First, we have to understand some terms.

opcode: short for operation code, the one-byte instruction code that specifies the operation to be performed. ex: 0x65 (lsub)

operand stack: a last-in, first-out stack that every method invocation gets in its frame. Instructions push values onto it, pop their operands from it, and push results back. A long or double value counts as two units of stack depth.

mnemonic: the short human-readable name of an instruction. ex: lstore_1

rule: every opcode has exactly one mnemonic, so the mnemonic is simply the readable form of the opcode (mnemonic <=> opcode)

Now we will look at the instructions for the subtract operation.

long maxLong = Long.MAX_VALUE;
long secondMinimumPositiveLong = 1;
long secondMaxLong = maxLong - secondMinimumPositiveLong;

For these three human-readable statements, we got 8 bytecode instructions:

0: ldc2_w        #3                  // long 9223372036854775807l
3: lstore_1
4: lconst_1
5: lstore_3
6: lload_1
7: lload_3
8: lsub
9: lstore        5

The hexadecimal opcode that corresponds to each of these instructions:

Mnemonic Opcode (hex)
ldc2_w 14
lstore_1 40
lconst_1 0a
lstore_3 42
lload_1 1f
lload_3 21
lsub 65
lstore 37

This series of hexadecimal codes appears inside the .class file.

Bytecode
cafe babe 0000 0037 001f 0a00 0800 1107
0012 057f ffff ffff ffff ff09 0013 0014
0a00 1500 1607 0017 0700 1801 0006 3c69
6e69 743e 0100 0328 2956 0100 0443 6f64
6501 000f 4c69 6e65 4e75 6d62 6572 5461
626c 6501 0004 6d61 696e 0100 1628 5b4c
6a61 7661 2f6c 616e 672f 5374 7269 6e67
3b29 5601 000a 536f 7572 6365 4669 6c65
0100 0f41 6464 546f 5661 6c75 652e 6a61
7661 0c00 0900 0a01 000e 6a61 7661 2f6c
616e 672f 4c6f 6e67 0700 190c 001a 001b
0700 1c0c 001d 001e 0100 0a41 6464 546f
5661 6c75 6501 0010 6a61 7661 2f6c 616e
672f 4f62 6a65 6374 0100 106a 6176 612f
6c61 6e67 2f53 7973 7465 6d01 0003 6f75
7401 0015 4c6a 6176 612f 696f 2f50 7269
6e74 5374 7265 616d 3b01 0013 6a61 7661
2f69 6f2f 5072 696e 7453 7472 6561 6d01
0007 7072 696e 746c 6e01 0004 284a 2956
0021 0007 0008 0000 0000 0002 0001 0009
000a 0001 000b 0000 001d 0001 0001 0000
0005 2ab7 0001 b100 0000 0100 0c00 0000
0600 0100 0000 0100 0900 0d00 0e00 0100
0b00 0000 3c00 0400 0700 0000 1414 0003 (last three bytes: 14 00 03 -> ldc2_w #3)
400a 421f 2165 3705 b200 0516 05b6 0006 (40 0a 42 1f 21 65 37 05 -> lstore_1 lconst_1 lstore_3 lload_1 lload_3 lsub lstore 5)
b100 0000 0100 0c00 0000 1600 0500 0000
0300 0400 0400 0600 0500 0b00 0600 1300
0700 0100 0f00 0000 0200 10
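
The same mapping can be done mechanically. The sketch below decodes only the eight instructions used here and is not a general disassembler. MiniDisassembler is a made-up name, and the input bytes are copied from the dump above. Note that ldc2_w is followed by a two-byte constant pool index and lstore by a one-byte local variable index, while the other six instructions have no operand bytes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MiniDisassembler {
    // Opcodes from the table above that carry no operand bytes.
    private static final Map<Integer, String> NO_OPERAND = Map.of(
            0x0a, "lconst_1",
            0x1f, "lload_1",
            0x21, "lload_3",
            0x40, "lstore_1",
            0x42, "lstore_3",
            0x65, "lsub");

    // Turns raw code bytes (given as values 0..255) into "offset: mnemonic" lines.
    static List<String> disassemble(int[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc];
            if (opcode == 0x14) {
                // ldc2_w: big-endian two-byte constant pool index
                int index = (code[pc + 1] << 8) | code[pc + 2];
                lines.add(pc + ": ldc2_w #" + index);
                pc += 3;
            } else if (opcode == 0x37) {
                // lstore: one-byte local variable index
                lines.add(pc + ": lstore " + code[pc + 1]);
                pc += 2;
            } else if (NO_OPERAND.containsKey(opcode)) {
                lines.add(pc + ": " + NO_OPERAND.get(opcode));
                pc += 1;
            } else {
                throw new IllegalArgumentException(
                        "opcode not covered by this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        int[] code = {0x14, 0x00, 0x03, 0x40, 0x0a, 0x42, 0x1f, 0x21, 0x65, 0x37, 0x05};
        disassemble(code).forEach(System.out::println);
    }
}
```

The offsets it prints (0, 3, 4, 5, 6, 7, 8, 9) line up with the ones in the javap listing, because each offset is simply the position of the opcode byte within the method's code.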

What each instruction does to the operand stack:

Mnemonic Description
ldc2_w push 9223372036854775807l onto the stack from the constant pool
lstore_1 pop 9223372036854775807l and store it in local variable 1
lconst_1 push 1l onto the stack
lstore_3 pop 1l and store it in local variable 3
lload_1 push 9223372036854775807l from local variable 1
lload_3 push 1l from local variable 3
lsub pop both values, subtract 9223372036854775807l - 1l, and push the result
lstore pop 9223372036854775806l and store it in local variable 5
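
These eight steps can be replayed with an ordinary stack and an array standing in for the local variables. This is a simplified sketch: OperandStackDemo is a made-up name, and each long is kept in a single array element at index 1, 3, and 5, whereas the real JVM gives a long two slots.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    // Replays the eight instructions by hand and returns local variable 5.
    static long run() {
        Deque<Long> stack = new ArrayDeque<>();
        long[] locals = new long[7];       // indices 1, 3 and 5 are used

        stack.push(Long.MAX_VALUE);        // ldc2_w #3
        locals[1] = stack.pop();           // lstore_1
        stack.push(1L);                    // lconst_1
        locals[3] = stack.pop();           // lstore_3
        stack.push(locals[1]);             // lload_1
        stack.push(locals[3]);             // lload_3
        long right = stack.pop();          // lsub pops the top value first,
        long left = stack.pop();           // then the value pushed before it,
        stack.push(left - right);          // and pushes the difference
        locals[5] = stack.pop();           // lstore 5

        return locals[5];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints: 9223372036854775806
    }
}
```

The stack is empty again after the last store, which is why the getstatic instruction that follows in the listing can start from a clean stack.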

In summary, we have seen how bytecode holds the instructions to be executed by the JVM, and how each mnemonic maps to an opcode in hexadecimal form.

ref:

  1. List of Java bytecode instructions
  2. The Java Virtual Machine Instruction Set
  3. The Java Virtual Machine
Software Engineer at Bangladesh Japan Information Technology(BJIT)

Volunteer at Java User Group Bangladesh (JUGBD)
cmabdullah21@gmail.com @cmabdullah21
