
Know Your Stack - Scala to Java to Bytecode to Assembly

2015-11-28 17:58
Many times in my career so far I could have made better decisions, solved problems much faster or even prevented problems if I had a deeper understanding
of the technology stack I was using.

To begin getting a deeper understanding of the Scala stack, we will trace a simple program through its lifetime: becoming Java, then JVM bytecode and, as the final step before it's fed to the CPU monsters, assembly.

Starting with a high-level overview like this is a good foundation on which to explore each step in more detail. Because I'm nice I'll give you lots of links at the bottom of this post so you can do just that.

All code relevant to this post is up on my GitHub.


Scala

Our initial block of code is a Scala object with a main method. When the application is started this main method will be run and, as you can see below, it's going to print something stupid to the console and finish executing.

package kns

object KnowYourStack {

  // Entry point: prints a message to the console and returns
  def main(args: Array[String]): Unit = {
    val message = "Fat stacks, yo!"
    println(message)
  }
}



Compiled to Java

When Scala code is compiled with sbt, the output is already bytecode; there is no intermediate Java source. However, we can use one of the Java decompilers to show us what the equivalent Java code would have looked like.

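First I'll run sbt to compile the Scala. This turns our KnowYourStack.scala into two .class files: KnowYourStack.class and KnowYourStack$.class.

I'll now use jd-gui to decompile them back into Java. Here are the results (note that the KnowYourStack$ and Predef$ names are missing in a few places, such as in "new ();"; the javap listing further down shows them in full):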

package kns;

import scala.Predef.;

public final class KnowYourStack$
{
  public static final  MODULE$;

  static
  {
    new ();
  }

  public void main(String[] args)
  {
    String message = "Fat stacks, yo!";
    Predef..MODULE$.println(message);
  }

  private KnowYourStack$()
  {
    MODULE$ = this;
  }
}


package kns;

import scala.reflect.ScalaSignature;

@ScalaSignature(bytes="\006\001\025:Q!\001\002\t\002\025\tQb\0238pof{WO]*uC\016\\'\"A\002\002\007-t7o\001\001\021\005\0319Q\"\001\002\007\013!\021\001\022A\005\003\033-swn^-pkJ\034F/Y2l'\t9!\002\005\002\f\0355\tABC\001\016\003\025\0318-\0317b\023\tyAB\001\004B]f\024VM\032\005\006#\035!\tAE\001\007y%t\027\016\036 \025\003\025AQ\001F\004\005\002U\tA!\\1j]R\021a#\007\t\003\027]I!\001\007\007\003\tUs\027\016\036\005\0065M\001\raG\001\005CJ<7\017E\002\f9yI!!\b\007\003\013\005\023(/Y=\021\005}\021cBA\006!\023\t\tC\"\001\004Qe\026$WMZ\005\003G\021\022aa\025;sS:<'BA\021\r\001")
public final class KnowYourStack
{
  public static void main(String[] paramArrayOfString)
  {
    KnowYourStack..MODULE$.main(paramArrayOfString);
  }
}


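It looks like Scala compiles an object into two classes as a way to enforce the singleton pattern, which objects are. KnowYourStack$ holds the single instance in its static MODULE$ field, and KnowYourStack exposes a static main that forwards to that instance, which gives the JVM the static entry point it expects. The @ScalaSignature annotation has nothing to do with the entry point: it carries Scala-specific type information that a plain class file can't express, for the Scala compiler and reflection to read back later.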
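
The instance-plus-static-class arrangement can be sketched by hand in plain Java. This is only an illustration: GreeterModule and Greeter are hypothetical names standing in for KnowYourStack$ and KnowYourStack, and the sketch is not what scalac literally emits. In particular, the generated bytecode assigns MODULE$ from inside the private constructor, which Java source does not allow for a static final field, so the sketch uses a field initializer instead.

```java
// Hand-written sketch of the two classes behind a Scala `object`.
// GreeterModule and Greeter are hypothetical names standing in for
// KnowYourStack$ and KnowYourStack.

final class GreeterModule {
    // The one and only instance. scalac assigns MODULE$ inside the private
    // constructor; Java source cannot do that for a static final field,
    // so this sketch uses a field initializer instead.
    static final GreeterModule MODULE = new GreeterModule();

    // Private constructor: nobody else can create a second instance.
    private GreeterModule() {}

    String message() {
        return "Fat stacks, yo!";
    }

    // Instance method, like main on KnowYourStack$.
    void main(String[] args) {
        System.out.println(message());
    }
}

final class Greeter {
    // Static forwarder: gives the JVM a static entry point and
    // delegates to the singleton instance.
    public static void main(String[] args) {
        GreeterModule.MODULE.main(args);
    }
}
```

Compiling this and running the Greeter class goes through the same two hops as the decompiled code: static main, then the instance method on the singleton.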


Next Stop: Bytecode

A JVM such as HotSpot starts out interpreting bytecode and JIT-compiles the hot parts, the methods and loops that run frequently, to machine code at runtime. The bytecode of our two generated .class files is below using javap.

➜ javap -c -p KnowYourStack\$

Compiled from "KnowYourStack.scala"

public final class kns.KnowYourStack$ {
  public static final kns.KnowYourStack$ MODULE$;

  public static {};
    Code:
       0: new           #2                  // class kns/KnowYourStack$
       3: invokespecial #12                 // Method "<init>":()V
       6: return

  public void main(java.lang.String[]);
    Code:
       0: ldc           #16                 // String Fat stacks, yo!
       2: astore_2
       3: getstatic     #21                 // Field scala/Predef$.MODULE$:Lscala/Predef$;
       6: aload_2
       7: invokevirtual #25                 // Method scala/Predef$.println:(Ljava/lang/Object;)V
      10: return

  private kns.KnowYourStack$();
    Code:
       0: aload_0
       1: invokespecial #31                 // Method java/lang/Object."<init>":()V
       4: aload_0
       5: putstatic     #33                 // Field MODULE$:Lkns/KnowYourStack$;
       8: return
}


➜ javap -c -p KnowYourStack

Compiled from "KnowYourStack.scala"

public final class kns.KnowYourStack {
  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #16                 // Field kns/KnowYourStack$.MODULE$:Lkns/KnowYourStack$;
       3: aload_0
       4: invokevirtual #18                 // Method kns/KnowYourStack$.main:([Ljava/lang/String;)V
       7: return
}


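Interesting... where did our string go? In KnowYourStack$ we can see it being loaded with ldc #16, which pushes entry 16 of that class's constant pool, and then stored in local variable 2 with astore_2. But where is the declaration? It lives in the class file's constant pool, which javap only prints in full when you add -v. The getstatic #16 in KnowYourStack is unrelated: every class file has its own constant pool, and there entry 16 is the MODULE$ field. A lot of these instructions make a lot more sense when you've read/watched the links I've posted at the bottom of this post.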
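
One way to see that a string literal lives in the constant pool rather than in the method body: the JVM resolves identical literals, even from different classes, to the same interned String object. A minimal Java sketch, where FirstHolder, SecondHolder and LiteralDemo are hypothetical classes that are not part of the original project:

```java
// FirstHolder and SecondHolder are hypothetical classes; each class file
// gets its own constant pool entry for the same literal.
final class FirstHolder {
    static String message() {
        return "Fat stacks, yo!"; // compiled to an ldc of a constant pool entry
    }
}

final class SecondHolder {
    static String message() {
        return "Fat stacks, yo!";
    }
}

final class LiteralDemo {
    public static void main(String[] args) {
        String a = FirstHolder.message();
        String b = SecondHolder.message();

        // Identical literals resolve to one interned String object.
        System.out.println(a == b);          // true

        // A String built at runtime is a separate object until interned.
        String c = new String(a);
        System.out.println(c == a);          // false
        System.out.println(c.intern() == a); // true
    }
}
```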


Final Destination: Assembly

I've got 2 ravenous Intel CPUs inside my i5 Ivy Bridge, both of which have an insatiable appetite for x86_64 soup. Let's see what we can knock up from our base of bytecode special ingredients.

To get assembly out of the JVM you need to pass two flags. We do that like this in sbt:

name := "know your stack"

scalaVersion := "2.10.2"

fork := true

javaOptions += "-XX:+UnlockDiagnosticVMOptions"

javaOptions += "-XX:+PrintAssembly"


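Note also that the fork setting is set to true. sbt only applies javaOptions to a forked JVM; without it the program runs inside sbt's own JVM and the flags are ignored. PrintAssembly also needs the hsdis disassembler plugin (the "Loaded disassembler" line in the output below shows where mine lives), so you'll also need to listen to this smart guy.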
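
If you want to check that the forked JVM really received the flags, the standard management API can list the arguments the running JVM was started with. A minimal sketch, assuming a hypothetical FlagCheck helper that is not part of the original project:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// FlagCheck is a hypothetical helper, not part of the original project.
final class FlagCheck {
    // Arguments the running JVM was started with (for example -XX options).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // True if the exact flag string appears in the given argument list.
    static boolean hasFlag(List<String> arguments, String flag) {
        return arguments.contains(flag);
    }

    public static void main(String[] args) {
        List<String> arguments = jvmArguments();
        System.out.println("PrintAssembly requested: "
                + hasFlag(arguments, "-XX:+PrintAssembly"));
        System.out.println("All JVM arguments: " + arguments);
    }
}
```

Run without fork, the same check would report the arguments of sbt's own JVM, which is a quick way to tell the two cases apart.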

We're now ready to cook.

After running sbt run my terminal erupts with lines of text that are hard to make out. Luckily I pipe it into a file, which on later inspection happens to be full of Intel instructions. Here are a couple of snippets:

Start of the application
Running kns.KnowYourStack
OpenJDK 64-Bit Server VM warning: PrintAssembly is enabled; turning on DebugNonSafepoints to gain additional output
Loaded disassembler from /usr/lib/jvm/java-7-openjdk/jre/lib/amd64/hsdis-amd64.so
Decoding compiled method 0x00007f3419060250:
Code:
[Disassembling for mach='i386:x86-64']
[Entry Point]
[Constants]
# {method} 'readLine' '([BII)I' in 'java/util/jar/Manifest$FastInputStream'
# this:     rsi:rsi   = 'java/util/jar/Manifest$FastInputStream'
# parm0:    rdx:rdx   = '[B'
# parm1:    rcx       = int
# parm2:    r8        = int
#           [sp+0x60]  (sp of caller)
0x00007f34190603a0: mov    0x8(%rsi),%r10d
0x00007f34190603a4: shl    $0x3,%r10
0x00007f34190603a8: cmp    %r10,%rax
0x00007f34190603ab: jne    0x00007f3419037960  ;   {runtime_call}
0x00007f34190603b1: xchg   %ax,%ax
0x00007f34190603b4: nopl   0x0(%rax,%rax,1)
0x00007f34190603bc: xchg   %ax,%ax
[Verified Entry Point]
0x00007f34190603c0: mov    %eax,-0x14000(%rsp)
0x00007f34190603c7: push   %rbp
0x00007f34190603c8: sub    $0x50,%rsp         ;*synchronization entry


Method Decompilation with Annotations
Decoding compiled method 0x00007f3419062f50:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
# {method} 'indexOf' '([CII[CIII)I' in 'java/lang/String'
# parm0:    rsi:rsi   = '[C'
# parm1:    rdx       = int
# parm2:    rcx       = int
# parm3:    r8:r8     = '[C'
# parm4:    r9        = int
# parm5:    rdi       = int
# parm6:    [sp+0x50]   = int  (sp of caller)
0x00007f34190630a0: mov    %eax,-0x14000(%rsp)
0x00007f34190630a7: push   %rbp
0x00007f34190630a8: sub    $0x40,%rsp         ;*synchronization entry
; - java.lang.String::indexOf@-1 (line 1718)
0x00007f34190630ac: mov    %rsi,0x18(%rsp)
0x00007f34190630b1: mov    %edx,0x10(%rsp)
0x00007f34190630b5: mov    %r9d,(%rsp)
0x00007f34190630b9: mov    %ecx,0x8(%rsp)
0x00007f34190630bd: mov    0x50(%rsp),%ebp
0x00007f34190630c1: cmp    %ecx,%ebp
0x00007f34190630c3: jge    0x00007f3419063481  ;*if_icmplt
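
Both snippets are JDK methods (readLine and indexOf), not our own main. That fits how HotSpot works: it only compiles code that runs often, and a main that prints once and exits most likely never gets hot enough. A minimal sketch of code that should get compiled, assuming a hypothetical HotLoop class that is not part of the original project:

```java
// HotLoop is a hypothetical example, not part of the original project.
final class HotLoop {
    // Small method called many times, so the JIT has a reason to compile it.
    static long square(int x) {
        return (long) x * x;
    }

    // Sums the squares of 0 .. iterations-1.
    static long sumOfSquares(int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += square(i);
        }
        return total;
    }

    public static void main(String[] args) {
        // Printing the result keeps the loop from being optimised away.
        System.out.println(sumOfSquares(1_000_000));
    }
}
```

Run with the same two flags, sumOfSquares should show up among the decoded methods; square may well be inlined into it rather than listed on its own.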


You can find the full output on my GitHub. Just don't ask me to explain what all of those registers are used for... yet.