Sunday, October 15, 2017

[JVM-1] Byte code file structure

Java byte code has a structure so that the virtual machine knows where to look for which information. Usually, an assembly program has the following structure.
Data segment:
  //constants go here
  //array initialization and declaration
  //global variables
Code segment:
  //several subroutines that act on the data segment.

Byte Code File Structure:
A class file has a similar structure, and its layout follows a very specific format. Here u2 and u4 denote unsigned 2-byte and 4-byte values.
  Class_File_Format {
     u4 magic_number;

     u2 minor_version; 
     u2 major_version;

     u2 constant_pool_count; 
   
     cp_info constant_pool[constant_pool_count - 1];

     u2 access_flags;

     u2 this_class;
     u2 super_class;

     u2 interfaces_count; 
   
     u2 interfaces[interfaces_count];

     u2 fields_count; 
     field_info fields[fields_count];

     u2 methods_count;
     method_info methods[methods_count];

     u2 attributes_count; 
     attribute_info attributes[attributes_count];
  }

The constant pool is one of the main parts of the bytecode file. It holds the constants, variable names, method names, the names of the interfaces this class implements, etc.

One interesting thing is that the magic number at the beginning of any Java bytecode file is always 0xCAFEBABE. The JVM won't load the file if this magic number is not found at the beginning.
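
To make the layout concrete, here is a minimal sketch that reads only the first four fields of the structure above (magic number, minor version, major version, constant pool count). The class name ClassFileHeader and its parse method are made up for this illustration, and the ten sample bytes in main are written by hand to match the Test.class used later in this post (major version 52, 15 constant pool entries, so a count of 16); they are not dumped from a real file.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration: parses only the fixed-size start of a class file.
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] classFile) {
        // Class files are big-endian, which is also ByteBuffer's default byte order.
        ByteBuffer buf = ByteBuffer.wrap(classFile);
        int magic = buf.getInt();                      // u4 magic_number
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException(
                "bad magic number: 0x" + Integer.toHexString(magic));
        }
        int minor = buf.getShort() & 0xFFFF;           // u2 minor_version
        int major = buf.getShort() & 0xFFFF;           // u2 major_version
        int cpCount = buf.getShort() & 0xFFFF;         // u2 constant_pool_count
        return new ClassFileHeader(minor, major, cpCount);
    }

    public static void main(String[] args) {
        // Hand-written sample header (assumption, see the text above).
        byte[] sample = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic
            0, 0,                                               // minor version 0
            0, 52,                                              // major version 52
            0, 16                                               // constant_pool_count
        };
        ClassFileHeader h = parse(sample);
        System.out.println("major=" + h.majorVersion
            + " minor=" + h.minorVersion
            + " constant_pool_count=" + h.constantPoolCount);
    }
}
```

To try it on a real file, the bytes could be loaded with java.nio.file.Files.readAllBytes and passed to parse.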

Constructors:
In bytecode files, constructors are treated differently. Each constructor is compiled to an instance initialization method named <init> that takes the same parameters.

// In source packet in file init/ex8/CoffeeCup.java
class CoffeeCup {
    public CoffeeCup() {
        //...
    }
    public CoffeeCup(int amount) {
        //...
    }
    // ...
}
For this class, the compiler generates the following two instance initialization methods in the class file, one for each constructor in the source file:

// In binary form in file init/ex8/CoffeeCup.class:
public void <init>(CoffeeCup this) {...}
public void <init>(CoffeeCup this, int amount) {...}
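
One way to see the <init> name from inside a running program is to look at a stack trace taken in a constructor. The sketch below is a variation on the CoffeeCup class above; the initializerName field is added only for this illustration and is not part of the original example.

```java
// Variation of the CoffeeCup example; the initializerName field is hypothetical.
final class CoffeeCup {
    final String initializerName;

    CoffeeCup() {
        // The top stack frame is this constructor, reported under its bytecode name.
        initializerName = new Throwable().getStackTrace()[0].getMethodName();
    }

    CoffeeCup(int amount) {
        initializerName = new Throwable().getStackTrace()[0].getMethodName();
    }

    public static void main(String[] args) {
        System.out.println(new CoffeeCup().initializerName);
        System.out.println(new CoffeeCup(2).initializerName);
    }
}
```

Both constructors report the same method name, because at the bytecode level they differ only in their descriptors.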

Reading byte code
javac Test.java  ==> compiles the source and produces Test.class
xxd Test.class  ==> dumps the raw bytes of the class file in hex
javap -verbose Test.class ==> shows the constant pool along with the disassembled code
javap -c Test.class  ==> shows only the instructions

Example:
  class Test{
    public static void main(String args[]){
      int i = 0;
      int j = 1;
      for(i = 0; i < 10; i++){
        j = j + 1;
      }
    }
  }

  The output of javap -verbose Test.class is the following:
    Classfile /Users/sa050870/Desktop/blog/javacompile/Test.class
    Last modified Oct 3, 2017; size 325 bytes
    MD5 checksum e48f946daadcae98c0ce5bcc07ac8094
    Compiled from "Test.java"
  class Test
    minor version: 0
    major version: 52
    flags: ACC_SUPER
  Constant pool:
     #1 = Methodref          #3.#13         // java/lang/Object."<init>":()V
     #2 = Class              #14            // Test
     #3 = Class              #15            // java/lang/Object
     #4 = Utf8               <init>
     #5 = Utf8               ()V
     #6 = Utf8               Code
     #7 = Utf8               LineNumberTable
     #8 = Utf8               main
     #9 = Utf8               ([Ljava/lang/String;)V
    #10 = Utf8               StackMapTable
    #11 = Utf8               SourceFile
    #12 = Utf8               Test.java
    #13 = NameAndType        #4:#5          // "<init>":()V
    #14 = Utf8               Test
    #15 = Utf8               java/lang/Object
  {
    Test();
      descriptor: ()V
      flags:
      Code:
        stack=1, locals=1, args_size=1
           0: aload_0
           1: invokespecial #1                  // Method java/lang/Object."<init>":()V
           4: return
        LineNumberTable:
          line 1: 0

    public static void main(java.lang.String[]);
      descriptor: ([Ljava/lang/String;)V
      flags: ACC_PUBLIC, ACC_STATIC
      Code:
        stack=2, locals=3, args_size=1
           0: iconst_0
           1: istore_1
           2: iconst_1
           3: istore_2
           4: iconst_0
           5: istore_1
           6: iload_1
           7: bipush        10
           9: if_icmpge     22
          12: iload_2
          13: iconst_1
          14: iadd
          15: istore_2
          16: iinc          1, 1
          19: goto          6
          22: return
        LineNumberTable:
          line 3: 0
          line 4: 2
          line 5: 4
          line 6: 12
          line 5: 16
          line 8: 22
        StackMapTable: number_of_entries = 2
          frame_type = 253 /* append */
            offset_delta = 6
            locals = [ int, int ]
          frame_type = 15 /* same */
  }
  SourceFile: "Test.java"
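
To follow what the Code section of main does, here is a minimal sketch of an interpreter that understands only the opcodes appearing in that listing. The class MiniInterpreter is made up for this post and is not how a real JVM is built. The byte values in MAIN_CODE were assembled by hand from the listing using the opcode numbers in the JVM specification, not dumped from Test.class.

```java
// Toy interpreter for the handful of opcodes used by Test.main (illustration only).
final class MiniInterpreter {

    // Hand-assembled bytes for offsets 0..22 of the listing (assumption, see text).
    static final int[] MAIN_CODE = {
        0x03, 0x3c,             //  0: iconst_0, istore_1
        0x04, 0x3d,             //  2: iconst_1, istore_2
        0x03, 0x3c,             //  4: iconst_0, istore_1
        0x1b,                   //  6: iload_1
        0x10, 0x0a,             //  7: bipush 10
        0xa2, 0x00, 0x0d,       //  9: if_icmpge +13 (to 22)
        0x1c, 0x04, 0x60, 0x3d, // 12: iload_2, iconst_1, iadd, istore_2
        0x84, 0x01, 0x01,       // 16: iinc 1, 1
        0xa7, 0xff, 0xf3,       // 19: goto -13 (to 6)
        0xb1                    // 22: return
    };

    // Runs the code and returns the local variable array at the time of return.
    static int[] run(int[] code, int maxLocals, int maxStack) {
        int[] locals = new int[maxLocals];
        int[] stack = new int[maxStack];
        int sp = 0; // index of the next free operand stack slot
        int pc = 0; // offset of the current instruction
        while (true) {
            int op = code[pc];
            switch (op) {
                case 0x03: stack[sp++] = 0; pc += 1; break;                   // iconst_0
                case 0x04: stack[sp++] = 1; pc += 1; break;                   // iconst_1
                case 0x10: stack[sp++] = (byte) code[pc + 1]; pc += 2; break; // bipush
                case 0x1b: stack[sp++] = locals[1]; pc += 1; break;           // iload_1
                case 0x1c: stack[sp++] = locals[2]; pc += 1; break;           // iload_2
                case 0x3c: locals[1] = stack[--sp]; pc += 1; break;           // istore_1
                case 0x3d: locals[2] = stack[--sp]; pc += 1; break;           // istore_2
                case 0x60: {                                                  // iadd
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    pc += 1;
                    break;
                }
                case 0x84:                                                    // iinc
                    locals[code[pc + 1]] += (byte) code[pc + 2];
                    pc += 3;
                    break;
                case 0xa2: {                                                  // if_icmpge
                    int b = stack[--sp];
                    int a = stack[--sp];
                    pc += (a >= b) ? branchOffset(code, pc) : 3;
                    break;
                }
                case 0xa7: pc += branchOffset(code, pc); break;               // goto
                case 0xb1: return locals;                                     // return
                default:
                    throw new IllegalStateException(
                        "unsupported opcode 0x" + Integer.toHexString(op) + " at " + pc);
            }
        }
    }

    // Branch offsets are signed 16-bit values relative to the branch instruction.
    static int branchOffset(int[] code, int pc) {
        return (short) ((code[pc + 1] << 8) | code[pc + 2]);
    }

    public static void main(String[] args) {
        int[] locals = run(MAIN_CODE, 3, 2); // stack=2, locals=3 as in the listing
        System.out.println("i=" + locals[1] + " j=" + locals[2]);
    }
}
```

Local slot 0 holds args in the real method, so i lives in slot 1 and j in slot 2, which is why the listing only uses istore_1, istore_2, iload_1 and iload_2.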
 

References
[0] object initialization in java. https://www.javaworld.com/article/2076614/core-java/object-initialization-in-java.html
[1] hacking java byte code. https://www.acloudtree.com/hacking-java-bytecode-for-programmers-part3-yes-disassemble-with-javap-all-over-the-place/
[2] Inside the Java 2 virtual machine. Bill Venners. http://www.artima.com/insidejvm/ed2/
[3] Java class file architecture, wikipedia. https://en.wikipedia.org/wiki/Java_class_file
[4] JVM specification. https://docs.oracle.com/javase/specs/jvms/se7/html/index.html


Thursday, October 12, 2017

[Philosophy] In search of the true family

Today, while walking through the dense fog of Kansas City, Missouri, a thought suddenly struck my mind. Why do I keep running? I recalled my past. God knows, I worked as hard as hell to stand where I am standing today. Even when I was a young kid, I would wake up an hour or two before sunrise. I would wash my face with icy-cold water, I would pray to the Almighty to show me the path, to take me where I should go, and then I would start my work. I would study and stop only when it was time to go to school. If I wanted to do something, I never looked for a shortcut. I would always learn every material diligently, try to think about it and understand it, and never settle for memorizing something. I always wanted to go deep. Years later, even today, I woke up at 5:30 in the morning and started looking at MIT course material named "Hacking a Google code interview". During my undergrad days, I lost myself. I had the wrong idea that talent is something God-gifted. At this point in life, after realizing that earnest and mindful work always beats talent, I started educating myself again. I chose the topics that I felt I should be good at. So every day, I would wake up at 5 am and educate myself on the knowledge I want to have. I would go to work, work for 9 hours, come back home, take 2 hours of rest, and from 7:30 pm I would start educating myself again. I have been doing this day after day, month after month. I asked myself: why am I doing this?

Suddenly, today, my heart spoke. I have lived in several places. When I live in a place, I have to leave some parts of myself there. With the help of that place, I transform some part of myself and become a modified person. I have to shed the incompatible parts of me that would not fit that place and let new parts grow in me. Maybe all I am looking for is a family. A family whose members live under the shade of joy and respect. Their discussions are not trivial; their work is synergistic and solid. They don't take life as only merry-making. They work, think, reflect, invent, create necessities, fulfill those necessities and thus earn an honest living. I believe the members of a true family do not grow up under the same roof. One member of the true family may be from Bangladesh, another may be from Africa, another may be from China, the USA or Brazil. I have met many, many people in many different parts of the world. I keep talking with them; I keep asking them what makes them happy. Maybe I do this because someday, somewhere, some person will say something that will change my world forever. Maybe all these self-modifications are taking me closer and closer to my family, to the people I truly belong with.

Wednesday, October 4, 2017

[JVM-0] Architecture of JVM

Java and Java Platform

Java has 4 core parts
  0. Language: a programmer uses it to write programs
  1. Class file format: the Java compiler translates source code into bytecode in this format, to be executed by the JVM
  2. JVM: an application that executes bytecode
  3. API: to interact with the host machine

Java platform has 2 parts
  0. API
  1. JVM

A Java program runs on a Java platform.

Java program execution

Here are the steps to run a java program on a Java platform.

0. programmer writes java code

  class Test{
    public static void main(String args[]){
      int i = 0;
      int j = 1;
      for(i = 0; i < 10; i++){
        j = j + 1;
      }
    }
  }

1. Compile and produce byte code
              javac Test.java
Bytecode contains instructions that target a virtual architecture, the Java virtual machine.

javap -c Test.class produces the following output

class Test {
  Test();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: iconst_0
       1: istore_1
       2: iconst_1
       3: istore_2
       4: iconst_0
       5: istore_1
       6: iload_1
       7: bipush        10
       9: if_icmpge     22
      12: iload_2
      13: iconst_1
      14: iadd
      15: istore_2
      16: iinc          1, 1
      19: goto          6
      22: return
}

2. user starts jvm by the command,
                java Test
Note: a JVM instance runs only a single application. The java command takes the name of a class as an argument. That class must have a main method with the proper signature.

3. The JVM loads the bytecode, interprets it, and runs it on the host system.


Parts of JVM

0. Class loader: locates and imports byte code to jvm's memory.
    a. checks the correctness of a type.
    b. loads the bytecode into the method area and initializes class variables; creates the Class object on the heap.
    c. links the bytecode in the method area with the Class object on the heap.

There are two types of class loader
    a. bootstrap class loader: loads the Java API from the installation location. It is part of the JVM itself and is implemented in native code (probably C++).
    b. user-defined class loader: written in Java by the user. These loaders are ordinary objects on the heap.
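
The difference between the two kinds of loader is visible from Java code: getClassLoader() returns null for classes loaded by the bootstrap loader, and an ordinary object for everything else. A small sketch follows, with a made-up class name LoaderDemo.

```java
// Illustration only: reports which kind of loader defined a class.
final class LoaderDemo {

    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // A null loader means the class came from the bootstrap class loader,
        // which has no Java object representing it.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String:     " + loaderOf(String.class));
        System.out.println("LoaderDemo: " + loaderOf(LoaderDemo.class));
    }
}
```

The exact class name printed for LoaderDemo depends on the Java version and on how the program is started, so no output is shown here.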

1. Method area: the class loader loads bytecode instructions into this area. For each class, the following information is stored
    a. fully qualified name of the type
    b. super class
    c. is it class or interface?
    d. modifier (public, abstract, final)
    e. constant pool: constants used by this type
        - string, int, float,
        - other classes used by this class (initially the pool holds only a symbolic link, the fully qualified name; later, when those classes are loaded into the method area and the heap, the symbolic links are replaced by direct references to the classes)
    f. field information (field name, type, modifier)
    g. method information
      0. name
      1. return type
      2. argument info
      3. modifier
      4. bytecode
      5. number of local variables
      6. size of operand stack
      7. exception table
    h. class variables: every class that uses a static final constant gets its own copy of it. Non-final static variables are stored in the method area.
    i. class loader reference: reference to the loader that loaded this class.
    j. method table: the memory address where each method's instructions start.
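
Much of the per-class information listed above can be inspected at run time through reflection. The sketch below prints a few of those items for a given class; TypeInfo and describe are names invented for this example, and reflection is only a window onto this data, not the method area itself.

```java
import java.lang.reflect.Modifier;

// Illustration only: prints some of the per-type information listed above.
final class TypeInfo {

    static String describe(Class<?> type) {
        Class<?> superclass = type.getSuperclass(); // null for interfaces and Object
        StringBuilder sb = new StringBuilder();
        sb.append("name: ").append(type.getName()).append('\n');
        sb.append("super class: ")
          .append(superclass == null ? "none" : superclass.getName()).append('\n');
        sb.append("kind: ").append(type.isInterface() ? "interface" : "class").append('\n');
        sb.append("modifiers: ").append(Modifier.toString(type.getModifiers())).append('\n');
        sb.append("declared fields: ").append(type.getDeclaredFields().length).append('\n');
        sb.append("declared methods: ").append(type.getDeclaredMethods().length).append('\n');
        sb.append("class loader: ").append(type.getClassLoader()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(java.util.ArrayList.class));
    }
}
```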

2. Java stack: each thread has a separate Java stack. Only push and pop operations are allowed on it. It contains a stack frame for each method invocation. Each stack frame holds the following information for a method
  a. parameters
  b. local variables
  c. operand stack
  d. return value
  e. return address
  f. exception table


If an exception occurs and no catch clause is found for that instruction, the JVM makes the method return abruptly and re-throws the exception in the caller's context.
One thread cannot access another thread's Java stack.
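
Both behaviours can be observed from plain Java. In the sketch below (the class and method names are made up for illustration), depthAfter shows that each nested call pushes one more frame, and caller shows an exception leaving a method that has no catch clause and being handled in the caller's frame.

```java
// Illustration only: observing stack frames and exception propagation.
final class StackDepth {

    // Returns the number of frames on the current thread's stack after
    // `calls` additional nested invocations of this method.
    static int depthAfter(int calls) {
        if (calls == 0) {
            return new Throwable().getStackTrace().length;
        }
        return depthAfter(calls - 1);
    }

    // No catch clause here, so the method completes abruptly.
    static void failingCallee() {
        throw new IllegalStateException("thrown in callee");
    }

    // The exception re-appears in this caller's context, where it is caught.
    static String caller() {
        try {
            failingCallee();
            return "no exception";
        } catch (IllegalStateException e) {
            return "caught in caller: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("extra frames: " + (depthAfter(3) - depthAfter(0)));
        System.out.println(caller());
    }
}
```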

3. Program counter: each thread has its own program counter (pc) register. The pc keeps track of which instruction the thread executes next.

4. Heap: objects are created in this area.

5. Execution engine: fetches instructions from the method area, translates and executes them. Instructions act on the data on the Java stack and the heap. Each thread is an instance of the execution engine. Interpreted bytecode is cached and reused if necessary.
2 popular techniques
  a. just-in-time compile: translates a method's bytecode to native code the first time the method is invoked, then executes the native code.
  b. adaptive: acts like JIT, except that as soon as it finds code that is executed very often (a hot spot), it hands that code to a separate thread. The thread heavily optimizes the hot spot, and the JVM later executes the optimized instructions.


6. Native method stack: holds frames for native methods. Native methods work on their frame data and on data in the JVM's heap.

[Figure: architecture of the JVM]

[Figure: program counter and Java stack]



** Object representation on JVM


** Thread Synchronization
Threads need object locking and a wait-notify mechanism to coordinate.
A thread can perform the following operations on an object
  a. lock: a thread can acquire the lock of an object. Another thread that wants the lock has to wait until the first thread unlocks the object.
  b. wait: a thread that holds the lock calls wait on an object. The JVM puts the thread in the object's wait set and makes it sleep until another thread calls notify or notifyAll on that object.
  c. notify and notifyAll: a thread calls these methods on an object to wake up one or all of the threads in the object's wait set.
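
The three operations usually appear together. Below is a minimal sketch of a one-slot mailbox: synchronized takes the object's lock, wait parks a thread until the slot changes, and notifyAll wakes the waiting threads. The Mailbox class and its demo method are made up for this example.

```java
// Illustration only: a one-slot mailbox built on lock, wait and notifyAll.
final class Mailbox {
    private Integer value; // null means the slot is empty

    synchronized void put(int v) throws InterruptedException {
        while (value != null) {
            wait();            // slot is full: release the lock and sleep
        }
        value = v;
        notifyAll();           // wake up threads waiting in take()
    }

    synchronized int take() throws InterruptedException {
        while (value == null) {
            wait();            // slot is empty: release the lock and sleep
        }
        int v = value;
        value = null;
        notifyAll();           // wake up threads waiting in put()
        return v;
    }

    // A producer thread sends 1..5 and the calling thread adds them up.
    static int demo() {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    box.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < 5; i++) {
                sum += box.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("sum = " + demo());
    }
}
```

The wait calls sit inside while loops so that a thread re-checks the slot after waking up, since another thread may have changed it in the meantime.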

** Type of java threads
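0. non-daemon thread: the initial thread that starts a program (the one that runs main) is a non-daemon thread, and threads created from it are non-daemon unless marked otherwise.
1. daemon thread: ordinarily used by the JVM itself, e.g. the garbage collector. A running program can also mark threads it creates as daemon threads.
As long as a non-daemon thread keeps running, the JVM does not stop, unless the exit method is called.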
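
A small sketch of the difference, using a made-up helper startBackground: the flag has to be set before the thread starts, and a daemon thread that is still looping does not keep the JVM alive once main returns.

```java
// Illustration only: starting a thread as daemon or non-daemon.
final class DaemonDemo {

    static Thread startBackground(Runnable task, boolean daemon) {
        Thread t = new Thread(task);
        t.setDaemon(daemon); // must be called before start()
        t.start();
        return t;
    }

    public static void main(String[] args) {
        Runnable forever = () -> {
            while (true) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    return;
                }
            }
        };
        Thread t = startBackground(forever, true);
        System.out.println("daemon? " + t.isDaemon());
        // main returns here. Because the looping thread is a daemon,
        // the JVM exits; with daemon = false it would keep running.
    }
}
```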

Assignments to types such as int and char are atomic, which makes sure a variable receives the complete value written by one of the racing threads. If thread_1 tries to assign 0100 and thread_0 tries to assign 1011 to a variable x, it is guaranteed that x will hold either 0100 or 1011 and no other value.

** Data types of jvm
a. reference type: holds object reference
b. primitive type: holds values such as int and float. boolean false is stored as 0 and true as a non-zero value. Primitive types have the same size and properties in all JVMs; they don't depend on the host architecture.
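
The fixed sizes are exposed as constants on the wrapper classes, so they can be printed directly. PrimitiveSizes is a made-up class name for this small sketch.

```java
// Illustration only: bit widths of the primitive types, identical on every JVM.
final class PrimitiveSizes {

    static String sizes() {
        return "byte=" + Byte.SIZE
            + " short=" + Short.SIZE
            + " char=" + Character.SIZE
            + " int=" + Integer.SIZE
            + " long=" + Long.SIZE
            + " float=" + Float.SIZE
            + " double=" + Double.SIZE;
    }

    public static void main(String[] args) {
        // prints: byte=8 short=16 char=16 int=32 long=64 float=32 double=64
        System.out.println(sizes());
    }
}
```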

As the JVM starts working, its class loader loads bytecode into the JVM's method area.