
Java Code To Byte Code

2015-11-09 11:46
Original article: http://blog.jamesdbloom.com/JavaCodeToByteCode_PartOne.html#variables

This article explains how Java code is compiled into byte code and executed on the JVM. To understand the internal architecture in the JVM and different memory areas used during byte code execution see my previous article on JVM Internals.

(JVM Internals: http://blog.jamesdbloom.com/JVMInternals.html)

This article is split into three parts, with each part being subdivided into sections. It is possible to read each section in isolation, however the concepts generally build up, so it is easiest to read the sections in order. Each section covers different Java code structures and explains how these are compiled and executed as byte code, as follows:

Part 1 - Basic Programming Concepts

· variables
    · local variables
    · fields (class variables)
    · constant fields (class constants)
    · static variables
· conditionals
    · if-else
    · switch
· loops
    · while-loop
    · for-loop
    · do-while-loop

Part 2 - Object Orientation And Safety (next article)

· try-catch-finally
· synchronized
· method invocation
· new (objects and arrays)

Part 3 - Metaprogramming (future article)

· generics
· annotations
· reflection

This article includes many code examples and shows the corresponding typical byte code that is generated. The number that precedes each instruction (or opcode) in the byte code indicates the byte position. For example an instruction such as 1: iconst_1 is only one byte in length, as there is no operand, so the following byte code would be at 2. An instruction such as 1: bipush 5 would take two bytes, one byte for the opcode bipush and one byte for the operand 5. In this case the following byte code would be at 3 as the operand occupied the byte at position 2.

Variables

Local Variables

The Java Virtual Machine (JVM) has a stack based architecture. When each method is executed, including the initial main method, a frame is created on the stack which has a set of local variables. The array of local variables contains all the variables used during the execution of the method, including a reference to this, all method parameters, and other locally defined variables. For class methods (i.e. static methods) the method parameters start from zero, however, for instance methods the zero slot is reserved for this.

A local variable can be:

· boolean

· byte

· char

· long

· short

· int

· float

· double

· reference

· returnAddress


All types take a single slot in the local variable array except long and double, which both take two consecutive slots because these types are double width (64-bit instead of 32-bit).
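The slot rules above can be sketched as a small calculation. This is only an illustration, not anything the JVM or javac exposes: the class SlotLayout and the method parameterSlots are hypothetical names, and parameter types are written with the JVM descriptor letters ("I" for int, "J" for long, "D" for double).

```java
import java.util.ArrayList;
import java.util.List;

class SlotLayout {
    // Returns the first local variable slot used by each parameter.
    // "J" (long) and "D" (double) occupy two consecutive slots; every other
    // type is treated as a single slot.
    static List<Integer> parameterSlots(boolean isStatic, String... types) {
        List<Integer> slots = new ArrayList<>();
        int next = isStatic ? 0 : 1; // slot 0 is reserved for `this` in instance methods
        for (String type : types) {
            slots.add(next);
            next += (type.equals("J") || type.equals("D")) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        // Instance method taking (int, long, double, int)
        System.out.println(parameterSlots(false, "I", "J", "D", "I")); // [1, 2, 4, 6]
        // The same parameters on a static method start at slot 0
        System.out.println(parameterSlots(true, "I", "J", "D", "I"));  // [0, 1, 3, 5]
    }
}
```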

When a new variable is created the operand stack is used to store the value of the new variable. The value of the new variable is then stored into the local variables array in the correct slot. If the variable is not a primitive value then the local variable slot only stores a reference. The reference points to an object stored in the heap.

For example:

int i = 5;

Is compiled to:

0: bipush 5
2: istore_0

bipush

Is used to add a byte as an integer to the operand stack, in this case 5 was added to the operand stack.

istore_0

Is one of a group of opcodes with the format istore_<n>; they all store an integer into the local variables. The <n> refers to the location in the local variable array that is being stored to and can only be 0, 1, 2 or 3. Another opcode called istore is used for locations higher than 3; it takes an operand for the location in the local variable array.
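As a rough sketch of the rule just described, the choice between the short istore_<n> forms and the general istore form can be written as follows. StoreOpcode and its methods are hypothetical helper names used only for illustration. Slots above 255 need a wide prefix in real byte code, which this sketch deliberately rejects rather than models.

```java
class StoreOpcode {
    private static void check(int slot) {
        if (slot < 0 || slot > 255) {
            throw new IllegalArgumentException("slot not handled by this sketch: " + slot);
        }
    }

    // Mnemonic for storing an int into the given local variable slot.
    static String forIntSlot(int slot) {
        check(slot);
        return slot <= 3 ? "istore_" + slot : "istore " + slot;
    }

    // Size in bytes: istore_<n> has no operand, istore has a one-byte operand.
    static int lengthInBytes(int slot) {
        check(slot);
        return slot <= 3 ? 1 : 2;
    }

    public static void main(String[] args) {
        for (int slot : new int[] {0, 3, 4}) {
            System.out.println(forIntSlot(slot) + " takes " + lengthInBytes(slot) + " byte(s)");
        }
    }
}
```

The length matters because it determines the byte position printed in front of the next instruction.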

In memory when this is executed the following happens:



The class file also contains a local variable table for each method. If this code were included in a method you would get the following entry in the local variable table for that method in the class file.

LocalVariableTable:

Start Length Slot Name Signature

0 1 1 i I


Fields (Class Variables)

A field (or class variable) is stored on the heap as part of a class instance (or object). Information about the field is added into the field_info array in the class file as shown below.


ClassFile {

u4 magic;

u2 minor_version;

u2 major_version;

u2 constant_pool_count;

cp_info constant_pool[constant_pool_count - 1];

u2 access_flags;

u2 this_class;

u2 super_class;

u2 interfaces_count;

u2 interfaces[interfaces_count];

u2 fields_count;

field_info fields[fields_count];

u2 methods_count;

method_info methods[methods_count];

u2 attributes_count;

attribute_info attributes[attributes_count];

}

In addition if the variable is initialized the byte code to do the initialization is added into the constructor.


When the following java code is compiled:


public class SimpleClass {

public int simpleField = 100;

}

An extra section appears when you run javap demonstrating the field added to the field_info array:


public int simpleField;

Signature: I

flags: ACC_PUBLIC

The byte code for the initialization is added into the constructor (byte codes 4 to 7), as follows:

public SimpleClass();

Signature: ()V

flags: ACC_PUBLIC

Code:

stack=2, locals=1, args_size=1

0: aload_0

1: invokespecial #1 // Method java/lang/Object."<init>":()V

4: aload_0

5: bipush 100

7: putfield #2 // Field simpleField:I

10: return

aload_0

Loads an object reference from the local variable array slot onto the top of the operand stack. Although the code shown has no constructor, the initialization code for class variables (fields) is actually executed in the default constructor created by the compiler. As a result the first local variable actually points to this, therefore the aload_0 opcode loads the this reference onto the operand stack. aload_0 is one of a group of opcodes with the format aload_<n>; they all load an object reference onto the operand stack. The <n> refers to the location in the local variable array that is being accessed but can only be 0, 1, 2 or 3. There are other similar opcodes for loading values that are not an object reference: iload_<n>, lload_<n>, fload_<n> and dload_<n>, where i is for int, l is for long, f is for float and d is for double. Local variables with an index higher than 3 can be loaded using iload, lload, fload, dload and aload; these opcodes all take a single operand that specifies the index of the local variable to load.

invokespecial

The invokespecial instruction is used to invoke instance initialization methods as well as private methods and methods of a superclass of the current class. It is part of a group of opcodes that invoke methods in different ways: invokedynamic, invokeinterface, invokespecial, invokestatic and invokevirtual. The invokespecial instruction in this code is invoking the superclass constructor, i.e. the constructor of java.lang.Object.

bipush

Is used to add a byte as an integer to the operand stack, in this case 100 was added to the operand stack.

putfield

Takes a single operand that references a field in the run time constant pool, in this case the field called simpleField. The value to set the field to and the object that contains the field are both popped off the operand stack. The aload_0 previously added the object that contains the field and the bipush previously added the 100 to the operand stack. The putfield then removes (pops) both of these values from the operand stack. The final result is that the field simpleField on the this object is updated with the value 100.
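The stack traffic described for aload_0, bipush and putfield can be imitated with ordinary collections. This is a toy model under loud assumptions: the "heap object" is just a Map from field name to value, the constant pool lookup is replaced by the literal name "simpleField", and PutFieldSketch is a hypothetical class name.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class PutFieldSketch {
    static Map<String, Object> run() {
        // Stand-in for the object on the heap: field name -> value.
        Map<String, Object> thisObject = new HashMap<>();
        Object[] locals = { thisObject };           // slot 0 holds `this`
        Deque<Object> operandStack = new ArrayDeque<>();

        operandStack.push(locals[0]);               // 4: aload_0
        operandStack.push(100);                     // 5: bipush 100

        // 7: putfield pops the value first, then the object reference.
        Object value = operandStack.pop();
        @SuppressWarnings("unchecked")
        Map<String, Object> target = (Map<String, Object>) operandStack.pop();
        target.put("simpleField", value);

        if (!operandStack.isEmpty()) {
            throw new IllegalStateException("operand stack should be empty after putfield");
        }
        return thisObject;
    }

    public static void main(String[] args) {
        System.out.println(run()); // {simpleField=100}
    }
}
```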

In memory when this is executed the following happens:



The putfield opcode has a single operand that references the second position in the constant pool. The JVM maintains a per-type constant pool, a run time data structure that is similar to a symbol table although it contains more data. Byte codes in Java require data, and often this data is too large to store directly in the byte codes. Instead it is stored in the constant pool and the byte code contains a reference to the constant pool. When a class file is created it has a section for the constant pool as follows:

Constant pool:
#1 = Methodref          #4.#16         //  java/lang/Object."<init>":()V
#2 = Fieldref           #3.#17         //  SimpleClass.simpleField:I
#3 = Class              #13            //  SimpleClass
#4 = Class              #19            //  java/lang/Object
#5 = Utf8               simpleField
#6 = Utf8               I
#7 = Utf8               <init>
#8 = Utf8               ()V
#9 = Utf8               Code
#10 = Utf8               LineNumberTable
#11 = Utf8               LocalVariableTable
#12 = Utf8               this
#13 = Utf8               SimpleClass
#14 = Utf8               SourceFile
#15 = Utf8               SimpleClass.java
#16 = NameAndType        #7:#8          //  "<init>":()V
#17 = NameAndType        #5:#6          //  simpleField:I
#18 = Utf8               LSimpleClass;
#19 = Utf8               java/lang/Object


Constant Fields (Class Constants)

A constant field with the final modifier is flagged as ACC_FINAL in the class file.

For example:
public class SimpleClass {

public final int simpleField = 100;

}


The field description is augmented with ACC_FINAL:

public final int simpleField;
Signature: I
flags: ACC_PUBLIC, ACC_FINAL
ConstantValue: int 100


The initialization in the constructor is however unaffected:

4: aload_0
5: bipush        100
7: putfield      #2                  // Field simpleField:I


Static Variables

A static class
variable with the static modifier
is flagged as ACC_STATIC in
the class file
as follows:

public static int simpleField;
Signature: I
flags: ACC_PUBLIC, ACC_STATIC


The byte code for initialization of static variables is not found in the instance constructor <init>. Instead static fields are initialized as part of the class constructor <clinit> using the putstatic opcode instead of the putfield opcode.

static {};
Signature: ()V
flags: ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: bipush         100
2: putstatic      #2                  // Field simpleField:I
5: return


Conditionals

Conditional flow control, such as, if-else statements
and switch statements
work by using an instruction that compares two values and branches to another byte code.


Loops including for-loops and while-loops work in a similar way except that they typically also include a goto instruction that causes the byte code to loop. do-while-loops do not require any goto instruction because their conditional branch is at the end of the byte code. For more detail on loops see the loops section.

Some opcodes can compare two integers or two references and then perform a branch in a single instruction. Comparisons between other types such as doubles, longs or floats are a two-step process. First the comparison is performed and 1, 0 or -1 is pushed onto the operand stack. Next a branch is performed based on whether the value on the operand stack is greater than, less than or equal to zero.

First the if-else statement
will be explained as an example and then the different types of instructions used for branching will be covered
in more detail.



if-else

The following code example shows a simple if-else comparing
two integer parameters.
public int greaterThen(int intOne, int intTwo) {
if (intOne > intTwo) {
return 0;
} else {
return 1;
}
}


This method results in the following byte code:
0: iload_1
1: iload_2
2: if_icmple     7
5: iconst_0
6: ireturn
7: iconst_1
8: ireturn


First the two parameters are loaded onto the operand stack using iload_1 and iload_2. if_icmple then compares the top two values on the operand stack. This opcode branches to byte code 7 if intOne is less than or equal to intTwo. Notice this is the exact opposite of the test in the if condition in the Java code, because if the byte code test is successful execution branches to the else-block, whereas in the Java code if the test is successful the execution enters the if-block. In other words if_icmple is testing if the if condition is not true and jumping over the if-block. The body of the if-block is byte code 5 and 6, the body of the else-block is byte code 7 and 8.



The following code example shows a slightly more complex example which requires a two-step comparison.

public int greaterThen(float floatOne, float floatTwo) {
int result;
if (floatOne > floatTwo) {
result = 1;
} else {
result = 2;
}
return result;
}


This method results in the following byte code:
0: fload_1
1: fload_2
2: fcmpl
3: ifle          11
6: iconst_1
7: istore_3
8: goto          13
11: iconst_2
12: istore_3
13: iload_3
14: ireturn


In this example first the two parameter values are pushed onto the operand stack using fload_1 and fload_2. This example is different from the previous example because of the two-step comparison. fcmpl is first used to compare floatOne and floatTwo and push the result onto the operand stack as follows:

floatOne > floatTwo –> 1

floatOne = floatTwo –> 0

floatOne < floatTwo –> -1

floatOne or floatTwo = NaN –> -1

Next ifle is used to branch to byte code 11 if the result from fcmpl is <= 0.

This example is also different from the previous example in that there is only a single return statement at the end of the method. As a result a goto is required at the end of the if-block to prevent the else-block from also being executed. The goto branches to byte code 13 where iload_3 is then used to push the result stored in local variable slot 3 to the top of the operand stack so that it can be returned by the return instruction.



As well as comparing numeric values there are comparison opcodes for reference equality i.e. ==,
for comparison to null i.e. ==null and !=null and
for testing an object's type i.e. instanceof.


if_icmp<cond>
eq ne lt le gt ge

This group of opcodes is used to compare the top two integers on the operand stack and branch to a new byte code. The <cond> can be:

eq - equals

ne - not equals

lt - less than

le - less than or equal

gt - greater than

ge - greater than or equal

if_acmp<cond>
eq ne

These two opcodes are used to test whether two references are equal (eq) or not equal (ne) and branch to a new byte code location as specified by the operand.

ifnonnull ifnull

These two opcodes are used to test whether a reference is null or not null and branch to a new byte code location as specified by the operand.

lcmp

This opcode is used to compare the top two long values on the operand stack and push a value onto the operand stack as follows:

if value1 > value2 –> push 1

if value1 = value2 –> push 0

if value1 < value2 –> push -1

fcmp<cond>
l g

dcmp<cond>
l g

This group of opcodes is used to compare two float or double values and push a value onto the operand stack as follows:

if value1 > value2 –> push 1

if value1 = value2 –> push 0

if value1 < value2 –> push -1

The difference between the two types of opcode ending in l or g is how they handle NaN. The fcmpg and dcmpg instructions push the int value 1 onto the operand stack whereas fcmpl and dcmpl push -1 onto the operand stack. This ensures that when testing two values, if either of them is Not A Number (NaN) then the test will not be successful. For example if testing x > y (where x and y are doubles) then dcmpl is used so that if either value is NaN the int value -1 is pushed onto the operand stack. The next opcode will be an ifle instruction which branches if the value is less than or equal to 0. As a result if either x or y is NaN the ifle would branch over the if-block, preventing the code in the if-block from being executed.
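The NaN handling can be made concrete with a small Java model of the two float comparison opcodes. This is a sketch of the semantics described above, not JVM code: FloatCompare and its methods are hypothetical names, and the pairing of x < y with fcmpg followed by ifge is stated here as the typical javac output rather than something the article shows.

```java
class FloatCompare {
    // Models fcmpl: NaN in either operand gives -1.
    static int fcmpl(float a, float b) {
        if (a > b) return 1;
        if (a == b) return 0;
        if (a < b) return -1;
        return -1; // at least one operand is NaN
    }

    // Models fcmpg: NaN in either operand gives 1.
    static int fcmpg(float a, float b) {
        if (a > b) return 1;
        if (a == b) return 0;
        if (a < b) return -1;
        return 1; // at least one operand is NaN
    }

    // x > y as "fcmpl; ifle <else>": the if-block runs only when ifle does not branch.
    static boolean greaterThan(float x, float y) {
        return !(fcmpl(x, y) <= 0);
    }

    // x < y as "fcmpg; ifge <else>": the if-block runs only when ifge does not branch.
    static boolean lessThan(float x, float y) {
        return !(fcmpg(x, y) >= 0);
    }

    public static void main(String[] args) {
        System.out.println(greaterThan(2f, 1f));        // true
        System.out.println(greaterThan(Float.NaN, 1f)); // false
        System.out.println(lessThan(1f, Float.NaN));    // false
    }
}
```

Choosing the l variant for greater-than tests and the g variant for less-than tests means that NaN always pushes the value that makes the following branch skip the if-block.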

instanceof

This opcode pushes an int result of 1 onto the operand stack if the object at the top of the operand stack is an instance of the class specified. The operand for this opcode is used to specify the class by providing an index into the constant pool. If the object is null or not an instance of the specified class then the int result 0 is added to the operand stack.

if<cond>
eq ne lt le gt ge

All these opcodes compare the top value on the operand stack with zero and branch, to the byte code specified as an operand, if the comparison succeeds. These instructions are typically used for conditional logic that is more complex and cannot be done in a single instruction, for example when testing the result from a method call.


switch

The type of a Java switch expression must be char, byte, short, int, Character, Byte, Short, Integer, String or an enum type. To support switch statements the JVM uses two special instructions called tableswitch and lookupswitch which both only work with integer values. The use of only integer values is not a problem because char, byte, short and enum types can all be internally promoted to int. Support for String was also added in Java 7, which will be covered below. tableswitch is typically a faster opcode however it also typically takes more memory. tableswitch works by listing all potential case values between the minimum and maximum case values. The minimum and maximum values are also provided so that the JVM can immediately jump to the default-block if the switch variable is not in the range of listed case values. Values for case statements that are not provided in the Java code are also listed, but point to the default-block, to ensure all values between the minimum and maximum are provided. For example take the following switch statement:

public int simpleSwitch(int intOne) {
switch (intOne) {
case 0:
return 3;
case 1:
return 2;
case 4:
return 1;
default:
return -1;
}
}


This produces the following byte code:
0: iload_1
1: tableswitch   {
default: 42
min: 0
max: 4
0: 36
1: 38
2: 42
3: 42
4: 40
}
36: iconst_3
37: ireturn
38: iconst_2
39: ireturn
40: iconst_1
41: ireturn
42: iconst_m1
43: ireturn


The tableswitch instruction has values for 0, 1 and 4 to match the case statements provided in the code, which each point to the byte code for their respective code block. The tableswitch instruction also has values for 2 and 3; as these are not provided as case statements in the Java code they both point to the default code block. When the instruction is executed the value at the top of the operand stack is checked to see if it is between the minimum and maximum. If the value is not between the minimum and maximum, execution jumps to the default branch, which is byte code 42 in the above example. To ensure the default branch value can be found in the tableswitch instruction it is always the first value stored in the instruction (after any required padding for alignment). If the value is between the minimum and maximum it is used to index into the tableswitch and find the correct byte code to branch to, for example for value 1 above the execution would branch to byte code 38. The following diagram shows how this byte code would be executed:
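The range check followed by a direct index can be sketched in a few lines of Java. TableSwitchSketch is a hypothetical name and the int array stands in for the jump table inside the instruction; the numbers are the ones from the byte code listing above.

```java
class TableSwitchSketch {
    // Returns the byte code position to branch to for the given key.
    static int target(int key, int min, int max, int[] table, int defaultTarget) {
        if (key < min || key > max) {
            return defaultTarget;   // outside the listed range
        }
        return table[key - min];    // direct index, no searching
    }

    public static void main(String[] args) {
        // Entries for 0, 1, 2, 3, 4; the gaps 2 and 3 point at the default block.
        int[] table = {36, 38, 42, 42, 40};
        System.out.println(target(1, 0, 4, table, 42)); // 38
        System.out.println(target(3, 0, 4, table, 42)); // 42
        System.out.println(target(7, 0, 4, table, 42)); // 42
    }
}
```

The cost of the direct index is the table itself: every value between the minimum and maximum needs an entry, even when no case uses it.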



If the values in the case statements were too far apart (i.e. too sparse) this approach would not be sensible, as it would take too much memory. Instead when the cases of the switch are sparse a lookupswitch instruction is used. A lookupswitch instruction lists the byte code to branch to for each case statement but it does not list all possible values. When executing the lookupswitch the value at the top of the operand stack is compared against each value in the lookupswitch to determine the correct branch address. With a lookupswitch the JVM therefore searches (looks up) the correct match in a list of matches; this is a slower operation than for the tableswitch, where the JVM just indexes the correct value immediately. When a switch statement is compiled the compiler must trade off memory efficiency against performance to decide which opcode to use for the switch statement. For the following code the compiler produces a lookupswitch:

public int simpleSwitch(int intOne) {
switch (intOne) {
case 10:
return 1;
case 20:
return 2;
case 30:
return 3;
default:
return -1;
}
}


This produces the following byte code:
0: iload_1
1: lookupswitch  {
default: 42
count: 3
10: 36
20: 38
30: 40
}
36: iconst_1
37: ireturn
38: iconst_2
39: ireturn
40: iconst_3
41: ireturn
42: iconst_m1
43: ireturn


To allow efficient search algorithms (more efficient than linear search) the number of matches is provided and the matches are sorted. The following diagram shows how this would be executed:
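Because the match values are sorted, a binary search is one way the lookup could be performed. The sketch below assumes that strategy purely for illustration; a JVM is free to implement the search differently. LookupSwitchSketch is a hypothetical name, and the keys and targets are taken from the listing above.

```java
import java.util.Arrays;

class LookupSwitchSketch {
    // sortedKeys and targets are parallel arrays: targets[i] is the branch
    // destination for sortedKeys[i].
    static int target(int key, int[] sortedKeys, int[] targets, int defaultTarget) {
        int index = Arrays.binarySearch(sortedKeys, key);
        return index >= 0 ? targets[index] : defaultTarget;
    }

    public static void main(String[] args) {
        int[] keys = {10, 20, 30};
        int[] targets = {36, 38, 40};
        System.out.println(target(20, keys, targets, 42)); // 38
        System.out.println(target(25, keys, targets, 42)); // 42
    }
}
```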




String switch

In Java 7 the switch statement added support for the String type. Although the existing opcodes for switches only support int, no new opcodes were added. Instead a switch for the String type is done in two stages. First the hash code of the value at the top of the operand stack is compared with the hash code of each case value. This is done using either a lookupswitch or tableswitch (depending on the sparsity of the hash code values). This causes a branch to byte code that calls String.equals() to perform an exact match. A second switch instruction (a tableswitch in the example below) is then used on the result of the String.equals() checks to branch to the code for the correct case statement.

public int simpleSwitch(String stringOne) {
switch (stringOne) {
case "a":
return 0;
case "b":
return 2;
case "c":
return 3;
default:
return 4;
}
}


This String switch statement
will produce the following byte code:
0: aload_1
1: astore_2
2: iconst_m1
3: istore_3
4: aload_2
5: invokevirtual #2                  // Method java/lang/String.hashCode:()I
8: tableswitch   {
default: 75
min: 97
max: 99
97: 36
98: 50
99: 64
}
36: aload_2
37: ldc           #3                  // String a
39: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
42: ifeq          75
45: iconst_0
46: istore_3
47: goto          75
50: aload_2
51: ldc           #5                  // String b
53: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
56: ifeq          75
59: iconst_1
60: istore_3
61: goto          75
64: aload_2
65: ldc           #6                  // String c
67: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
70: ifeq          75
73: iconst_2
74: istore_3
75: iload_3
76: tableswitch   {
default: 110
min: 0
max: 2
0: 104
1: 106
2: 108
}
104: iconst_0
105: ireturn
106: iconst_2
107: ireturn
108: iconst_3
109: ireturn
110: iconst_4
111: ireturn


The class containing this byte code also contains the following constant pool values referenced by this byte code. See the section on the run time constant pool in the JVM Internals article for more detail about constant pools.

Constant pool:
#2 = Methodref          #25.#26        //  java/lang/String.hashCode:()I
#3 = String             #27            //  a
#4 = Methodref          #25.#28        //  java/lang/String.equals:(Ljava/lang/Object;)Z
#5 = String             #29            //  b
#6 = String             #30            //  c

#25 = Class              #33            //  java/lang/String
#26 = NameAndType        #34:#35        //  hashCode:()I
#27 = Utf8               a
#28 = NameAndType        #36:#37        //  equals:(Ljava/lang/Object;)Z
#29 = Utf8               b
#30 = Utf8               c

#33 = Utf8               java/lang/String
#34 = Utf8               hashCode
#35 = Utf8               ()I
#36 = Utf8               equals
#37 = Utf8               (Ljava/lang/Object;)Z


Notice the amount of byte code required to perform this switch, including two tableswitch instructions and several invokevirtual instructions used to call String.equals(). See the section on method invocation in the next article for more detail on invokevirtual. The following diagram shows how this would be executed for the input "b".







The hash code values for different cases can collide, such as for the strings "FB" and "Ea" which both have a hash code of 2236. This is handled by slightly altering the flow of equals methods as below. Notice how byte code 34: ifeq 42 goes to another invocation of String.equals() instead of jumping straight to the second switch instruction, as happened in the previous example which had no colliding hash code values.
public int simpleSwitch(String stringOne) {
switch (stringOne) {
case "FB":
return 0;
case "Ea":
return 2;
default:
return 4;
}
}


This generates the following byte code:
0: aload_1
1: astore_2
2: iconst_m1
3: istore_3
4: aload_2
5: invokevirtual #2                  // Method java/lang/String.hashCode:()I
8: lookupswitch  {
default: 53
count: 1
2236: 28
}
28: aload_2
29: ldc           #3                  // String Ea
31: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
34: ifeq          42
37: iconst_1
38: istore_3
39: goto          53
42: aload_2
43: ldc           #5                  // String FB
45: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
48: ifeq          53
51: iconst_0
52: istore_3
53: iload_3
54: lookupswitch  {
default: 84
count: 2
0: 80
1: 82
}
80: iconst_0
81: ireturn
82: iconst_2
83: ireturn
84: iconst_4
85: ireturn
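
Read back into Java, the byte code above behaves roughly like the hand-written method below. This is a reconstruction for illustration only: the names tmp and index are invented for the two synthetic local variables in slots 2 and 3, and StringSwitchSketch is a hypothetical class name.

```java
class StringSwitchSketch {
    static int simpleSwitch(String stringOne) {
        String tmp = stringOne; // aload_1, astore_2
        int index = -1;         // iconst_m1, istore_3

        // Stage one: switch on the hash code, then confirm with equals().
        switch (tmp.hashCode()) {
            case 2236: // shared by "Ea" and "FB"
                if (tmp.equals("Ea")) {
                    index = 1;
                } else if (tmp.equals("FB")) {
                    index = 0;
                }
                break;
            default:
                break;
        }

        // Stage two: switch on the case index found in stage one.
        switch (index) {
            case 0:
                return 0;
            case 1:
                return 2;
            default:
                return 4;
        }
    }

    public static void main(String[] args) {
        System.out.println("FB".hashCode() + " " + "Ea".hashCode()); // 2236 2236
        System.out.println(simpleSwitch("FB")); // 0
        System.out.println(simpleSwitch("Ea")); // 2
        System.out.println(simpleSwitch("zz")); // 4
    }
}
```

The equals() calls are what keep the switch correct when two case labels share a hash code, or when an unrelated input happens to hash to 2236.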


Loops

Conditional flow control, such as, if-else statements
and switch statements
work by using an instruction that compares two values and branches to another byte code. For more detail on conditionals see the conditionals
section.

Loops including for-loops and while-loops work in a similar way except that they typically also include a goto instruction that causes the byte code to loop. do-while-loops do not require any goto instruction because their conditional branch is at the end of the byte code.

Some opcodes can compare two integers or two references and then perform a branch in a single instruction. Comparisons between other types such as doubles, longs or floats are a two-step process. First the comparison is performed and 1, 0 or -1 is pushed onto the operand stack. Next a branch is performed based on whether the value on the operand stack is greater than, less than or equal to zero. For more detail on the different types of instructions used for branching see above.


while-loop

while-loops consist of a conditional branch instruction such as if_icmpge or if_icmplt (as described above) and a goto instruction. The conditional instruction branches the execution to the instruction immediately after the loop and therefore terminates the loop if the loop condition is not met. The final instruction in the loop is a goto that branches the byte code back to the beginning of the loop, ensuring the byte code keeps looping until the conditional branch is taken, as follows:
public void whileLoop() {
int i = 0;
while (i < 2) {
i++;
}
}


Is compiled to:
0: iconst_0
1: istore_1
2: iload_1
3: iconst_2
4: if_icmpge     13
7: iinc          1, 1
10: goto          2
13: return


The if_icmpge instruction tests if the local variable in position 1 (i.e. i) is greater than or equal to 2; if it is then the instruction jumps to byte code 13, finishing the loop. The goto instruction keeps the byte code looping until the if_icmpge condition is met, at which point the execution branches to the return instruction immediately after the end of the loop. The iinc instruction is one of the few instructions that update a local variable directly without having to load or store values in the operand stack. In this example the iinc instruction increases local variable 1 (i.e. i) by 1.
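
The control flow of the while-loop byte code can be traced with a tiny hand-written interpreter. This is a sketch only: the pc variable plays the role of the program counter, the three cases correspond to byte positions 2, 7 and 10 in the listing, and WhileLoopTrace is a hypothetical class name.

```java
class WhileLoopTrace {
    // Returns {final value of i, number of times goto was executed}.
    static int[] run() {
        int i = 0;          // 0: iconst_0, 1: istore_1
        int gotoCount = 0;
        int pc = 2;
        while (pc != 13) {  // 13: return
            switch (pc) {
                case 2:     // 2: iload_1, 3: iconst_2, 4: if_icmpge 13
                    pc = (i >= 2) ? 13 : 7;
                    break;
                case 7:     // 7: iinc 1, 1
                    i++;
                    pc = 10;
                    break;
                case 10:    // 10: goto 2
                    gotoCount++;
                    pc = 2;
                    break;
                default:
                    throw new IllegalStateException("unexpected pc " + pc);
            }
        }
        return new int[] {i, gotoCount};
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("i=" + result[0] + ", gotos=" + result[1]); // i=2, gotos=2
    }
}
```

The condition at position 2 is evaluated three times but the goto only twice, because the third evaluation branches straight to the return.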






for-loop

for-loops and while-loops use an identical pattern in byte code. This is not surprising because all while-loops can be re-written easily as an identical for-loop. The simple while-loop above could for example be re-written as a for-loop that produces exactly the same byte code, as follows:
public void forLoop() {
for(int i = 0; i < 2; i++) {

}
}


do-while-loop

do-while-loops are also very similar to for-loops and while-loops except that they do not require the goto instruction, as the conditional branch is the last instruction and is used to loop back to the beginning.
public void doWhileLoop() {
int i = 0;
do {
i++;
} while (i < 2);
}


Results in the following byte code:
0: iconst_0
1: istore_1
2: iinc          1, 1
5: iload_1
6: iconst_2
7: if_icmplt     2
10: return








More Articles

The next two articles will cover the following topics:

Part 2 - Object Orientation And Safety (next article)

· try-catch-finally
· synchronized
· method calls (and parameters)
· new (objects and arrays)

Part 3 - Metaprogramming (future article)

· generics
· annotations
· reflection

For more detail on the internal architecture in the JVM and different memory areas used during byte code execution see my previous article on JVM
Internals