Know Your Stack - Scala to Java to Bytecode to Assembly
2015-11-28 17:58
Many times in my career so far I could have made better decisions, solved problems much faster or even prevented problems if I had a deeper understanding
of the technology stack I was using.
To begin getting a deeper understanding of the Scala stack, we will trace a simple program through its lifetime of becoming Java, then JVM bytecode and then the final step before it's fed to the CPU monsters: assembly.
Starting with a high-level overview like this is a good foundation on which to explore each step in more detail. Because I'm nice I'll give you lots of links at the bottom of this post so you can do just that.
All code relevant to this post is up on my GitHub.
Scala
Our initial block of code is a Scala object with a main method. When the application is started this main method will be run and, as you can see below, it's going to print something stupid to the console and finish executing.

    package kns

    object KnowYourStack {
      // Explicit ": Unit =" instead of procedure syntax; the compiled output is the same.
      def main(args: Array[String]): Unit = {
        val message = "Fat stacks, yo!"
        println(message)
      }
    }

Compiled to Java
When Scala code is compiled with sbt, the output is already bytecode; there is no intermediate Java source. However, we can use one of the Java decompilers to show us what the equivalent Java code would have looked like.
First I'll run sbt to compile the Scala. This turns our KnowYourStack.scala into two .class files: KnowYourStack.class and KnowYourStack$.class.
I'll now use jd-gui to decompile them back into Java. Here are the results:

    package kns;

    import scala.Predef.;

    public final class KnowYourStack$ {
      public static final MODULE$;

      static {
        new ();
      }

      public void main(String[] args) {
        String message = "Fat stacks, yo!";
        Predef..MODULE$.println(message);
      }

      private KnowYourStack$() {
        MODULE$ = this;
      }
    }

    package kns;

    import scala.reflect.ScalaSignature;

    @ScalaSignature(bytes="\006\001\025:Q!\001\002\t\002\025\tQb\0238pof{WO]*uC\016\\'\"A\002\002\007-t7o\001\001\021\005\0319Q\"\001\002\007\013!\021\001\022A\005\003\033-swn^-pkJ\034F/Y2l'\t9!\002\005\002\f\0355\tABC\001\016\003\025\0318-\0317b\023\tyAB\001\004B]f\024VM\032\005\006#\035!\tAE\001\007y%t\027\016\036 \025\003\025AQ\001F\004\005\002U\tA!\\1j]R\021a#\007\t\003\027]I!\001\007\007\003\tUs\027\016\036\005\0065M\001\raG\001\005CJ<7\017E\002\f9yI!!\b\007\003\013\005\023(/Y=\021\005}\021cBA\006!\023\t\tC\"\001\004Qe\026$WMZ\005\003G\021\022aa\025;sS:<'BA\021\r\001")
    public final class KnowYourStack {
      public static void main(String[] paramArrayOfString) {
        KnowYourStack..MODULE$.main(paramArrayOfString);
      }
    }

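It looks like Scala compiles the object into two classes as a way to enforce the singleton pattern, which is what objects are. KnowYourStack$ holds the one instance in its static MODULE$ field and has a private constructor, while KnowYourStack contains only a static main that forwards to that instance. That static main is the entry point to the application, since the JVM needs a static method to start from. The @ScalaSignature annotation is not related to the entry point: it stores Scala-specific type information that the Scala compiler and reflection read back, since plain class files cannot express it. jd-gui also trips over the $ in the generated names, which is why the type of MODULE$ is blank and Predef$ shows up as "Predef." in the output above.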
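
The singleton shape can also be checked without a decompiler, using a few lines of reflection. This is a minimal sketch: Stack is a hypothetical stand-in for KnowYourStack and SingletonShape is a made-up helper. It leans on the compiler's encoding of objects (a class whose name ends in $ with a static MODULE$ field), which is an implementation detail rather than a documented API.

```scala
import java.lang.reflect.Modifier

// Hypothetical stand-in for the post's KnowYourStack object.
object Stack {
  def main(args: Array[String]): Unit = println("Fat stacks, yo!")
}

object SingletonShape {
  // Each entry pairs a claim about the generated class with whether it holds.
  def checks(): List[(String, Boolean)] = {
    val moduleClass = Stack.getClass                  // runtime class of the object
    val moduleField = moduleClass.getField("MODULE$") // the public static instance field
    val instance = moduleField.get(null)              // static field, so no receiver needed
    List(
      "class name ends with $" -> moduleClass.getName.endsWith("$"),
      "MODULE$ is static" -> Modifier.isStatic(moduleField.getModifiers),
      "MODULE$ is the object itself" -> (instance eq Stack)
    )
  }

  def main(args: Array[String]): Unit =
    checks().foreach { case (claim, holds) => println(claim + ": " + holds) }
}
```

All three checks are expected to hold for an object compiled the way the decompiled output suggests.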
Next Stop: Bytecode
A JVM starts out interpreting bytecode and JIT-compiles it to machine code at runtime, concentrating on hot code, meaning code that is executed frequently. The bytecode of our two generated .class files, as printed by javap, is below.

    ➜ javap -c -p KnowYourStack\$
    Compiled from "KnowYourStack.scala"
    public final class kns.KnowYourStack$ {
      public static final kns.KnowYourStack$ MODULE$;

      public static {};
        Code:
           0: new           #2                  // class kns/KnowYourStack$
           3: invokespecial #12                 // Method "<init>":()V
           6: return

      public void main(java.lang.String[]);
        Code:
           0: ldc           #16                 // String Fat stacks, yo!
           2: astore_2
           3: getstatic     #21                 // Field scala/Predef$.MODULE$:Lscala/Predef$;
           6: aload_2
           7: invokevirtual #25                 // Method scala/Predef$.println:(Ljava/lang/Object;)V
          10: return

      private kns.KnowYourStack$();
        Code:
           0: aload_0
           1: invokespecial #31                 // Method java/lang/Object."<init>":()V
           4: aload_0
           5: putstatic     #33                 // Field MODULE$:Lkns/KnowYourStack$;
           8: return
    }

    ➜ javap -c -p KnowYourStack
    Compiled from "KnowYourStack.scala"
    public final class kns.KnowYourStack {
      public static void main(java.lang.String[]);
        Code:
           0: getstatic     #16                 // Field kns/KnowYourStack$.MODULE$:Lkns/KnowYourStack$;
           3: aload_0
           4: invokevirtual #18                 // Method kns/KnowYourStack$.main:([Ljava/lang/String;)V
           7: return
    }

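Interesting... where did our string go? We can see it being loaded as constant 16 (ldc #16) in KnowYourStack$, but where is the declaration? It lives in the class file's constant pool, which javap only prints when you add the -v flag. The getstatic #16 in KnowYourStack is a different entry altogether: every class file has its own constant pool, and in that one #16 is the MODULE$ field. A lot of these instructions make a lot of sense when you've read/watched the links I've posted at the bottom of this post.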
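
To see where a string literal ends up, the constant pool can be read straight out of a .class file. The sketch below is an illustration, not a replacement for javap -v: Greeter is a hypothetical stand-in for KnowYourStack, ConstantPoolDump is a made-up helper, and it assumes the compiled .class file can be loaded as a classpath resource (true for a normal sbt or scalac build). It only decodes the text (Utf8) entries and skips everything else.

```scala
import java.io.{DataInputStream, InputStream}
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for the post's KnowYourStack object.
object Greeter {
  def message: String = "Fat stacks, yo!"
}

object ConstantPoolDump {
  /** Returns the CONSTANT_Utf8 entries of a class file as (index, text) pairs. */
  def utf8Entries(classFile: InputStream): List[(Int, String)] = {
    val in = new DataInputStream(classFile)
    require(in.readInt() == 0xCAFEBABE, "not a class file")
    in.readUnsignedShort()             // minor version
    in.readUnsignedShort()             // major version
    val count = in.readUnsignedShort() // entries are numbered 1 until count
    def skip(n: Int): Unit = in.readFully(new Array[Byte](n))
    val found = ListBuffer.empty[(Int, String)]
    var i = 1
    while (i < count) {
      in.readUnsignedByte() match {
        case 1 => found += ((i, in.readUTF()))             // Utf8: u2 length, then modified UTF-8
        case 7 | 8 | 16 | 19 | 20 => skip(2)               // Class, String, MethodType, Module, Package
        case 15 => skip(3)                                 // MethodHandle
        case 3 | 4 | 9 | 10 | 11 | 12 | 17 | 18 => skip(4) // Integer, Float, refs, NameAndType, (Invoke)Dynamic
        case 5 | 6 => skip(8); i += 1                      // Long and Double occupy two slots
        case tag => sys.error("unknown constant pool tag " + tag)
      }
      i += 1
    }
    found.toList
  }

  /** Loads the class file behind `cls` from the classpath and lists its Utf8 entries. */
  def entriesOf(cls: Class[_]): List[(Int, String)] = {
    val stream = cls.getResourceAsStream("/" + cls.getName.replace('.', '/') + ".class")
    require(stream != null, "class file not found on the classpath")
    try utf8Entries(stream) finally stream.close()
  }

  def main(args: Array[String]): Unit =
    // Greeter.getClass is the generated Greeter$ class, which is where the literal lives.
    entriesOf(Greeter.getClass).foreach { case (index, text) =>
      println("#" + index + " = Utf8 " + text)
    }
}
```

One of the printed entries should be the message text itself; the String constant that ldc loads is a separate small entry that simply points at that Utf8 entry by index.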
Final Destination: Assembly
I've got 2 ravenous Intel CPUs inside my i5 Ivy Bridge, both of which have an insatiable appetite for x86_64 soup. Let's see what we can knock up from our base of bytecode special ingredients.
To get assembly out of the JVM you need to pass two flags. We do that like this in sbt:

    name := "know your stack"

    scalaVersion := "2.10.2"

    fork := true

    javaOptions += "-XX:+UnlockDiagnosticVMOptions"

    javaOptions += "-XX:+PrintAssembly"

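Note also that the fork setting is set to true. This is because sbt only applies javaOptions to a forked JVM; without it the program runs inside sbt's own JVM and the flags are ignored. You'll also need the hsdis disassembler plugin on your machine (it shows up as hsdis-amd64.so in the output below), so listen to this smart guy.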
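
PrintAssembly dumps every method the JIT compiles, which is a lot of text. HotSpot's CompileCommand option can narrow that down to a single method. The fragment below is a sketch of an alternative build.sbt, not the configuration used for this post; the target java/lang/String.indexOf is only an example, picked because it appears in the output further down.

```scala
// build.sbt (sketch): print assembly for one method instead of everything.
name := "know your stack"

scalaVersion := "2.10.2"

// javaOptions are only passed to a forked JVM.
fork := true

javaOptions ++= Seq(
  "-XX:+UnlockDiagnosticVMOptions",
  // Assumed target; any method that actually gets JIT-compiled will do.
  "-XX:CompileCommand=print,java/lang/String.indexOf"
)
```

The hsdis plugin is still required, since it is what turns the compiled code into readable instructions.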
We're now ready to cook.
After running sbt run my terminal erupts with lines of text that are hard to make out. Luckily I piped it into a file, which on later inspection turns out to be full of Intel instructions. Here are a couple of snippets:
Start of the application

    Running kns.KnowYourStack
    OpenJDK 64-Bit Server VM warning: PrintAssembly is enabled; turning on DebugNonSafepoints to gain additional output
    Loaded disassembler from /usr/lib/jvm/java-7-openjdk/jre/lib/amd64/hsdis-amd64.so
    Decoding compiled method 0x00007f3419060250:
    Code:
    [Disassembling for mach='i386:x86-64']
    [Entry Point]
    [Constants]
      # {method} 'readLine' '([BII)I' in 'java/util/jar/Manifest$FastInputStream'
      # this:     rsi:rsi   = 'java/util/jar/Manifest$FastInputStream'
      # parm0:    rdx:rdx   = '[B'
      # parm1:    rcx       = int
      # parm2:    r8        = int
      #           [sp+0x60]  (sp of caller)
      0x00007f34190603a0: mov    0x8(%rsi),%r10d
      0x00007f34190603a4: shl    $0x3,%r10
      0x00007f34190603a8: cmp    %r10,%rax
      0x00007f34190603ab: jne    0x00007f3419037960  ;   {runtime_call}
      0x00007f34190603b1: xchg   %ax,%ax
      0x00007f34190603b4: nopl   0x0(%rax,%rax,1)
      0x00007f34190603bc: xchg   %ax,%ax
    [Verified Entry Point]
      0x00007f34190603c0: mov    %eax,-0x14000(%rsp)
      0x00007f34190603c7: push   %rbp
      0x00007f34190603c8: sub    $0x50,%rsp         ;*synchronization entry

Method Decompilation with Annotations

    Decoding compiled method 0x00007f3419062f50:
    Code:
    [Entry Point]
    [Verified Entry Point]
    [Constants]
      # {method} 'indexOf' '([CII[CIII)I' in 'java/lang/String'
      # parm0:    rsi:rsi   = '[C'
      # parm1:    rdx       = int
      # parm2:    rcx       = int
      # parm3:    r8:r8     = '[C'
      # parm4:    r9        = int
      # parm5:    rdi       = int
      # parm6:    [sp+0x50]   = int  (sp of caller)
      0x00007f34190630a0: mov    %eax,-0x14000(%rsp)
      0x00007f34190630a7: push   %rbp
      0x00007f34190630a8: sub    $0x40,%rsp         ;*synchronization entry
                                                    ; - java.lang.String::indexOf@-1 (line 1718)
      0x00007f34190630ac: mov    %rsi,0x18(%rsp)
      0x00007f34190630b1: mov    %edx,0x10(%rsp)
      0x00007f34190630b5: mov    %r9d,(%rsp)
      0x00007f34190630b9: mov    %ecx,0x8(%rsp)
      0x00007f34190630bd: mov    0x50(%rsp),%ebp
      0x00007f34190630c1: cmp    %ecx,%ebp
      0x00007f34190630c3: jge    0x00007f3419063481  ;*if_icmplt

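
Neither snippet is our own main method: readLine and indexOf are JDK methods that ran often enough during startup to be compiled, whereas a main that runs once is unlikely ever to get hot. To watch the JIT pick up your own code, you need something that is called a lot. The sketch below is a hypothetical example (HotLoop and its methods are made-up names, not part of the post's repository). Running it with the HotSpot flag -XX:+PrintCompilation should list its methods once they cross the compile threshold, though exactly when that happens depends on the JVM and its settings.

```scala
// Hypothetical example: a tiny method called often enough to become "hot".
object HotLoop {
  def square(x: Long): Long = x * x

  // Sums i * i for i in [0, n); a plain while loop keeps the bytecode simple.
  def sumOfSquares(n: Int): Long = {
    var total = 0L
    var i = 0
    while (i < n) {
      total += square(i)
      i += 1
    }
    total
  }

  def main(args: Array[String]): Unit = {
    // A million calls is well beyond typical compile thresholds.
    println(sumOfSquares(1000000))
  }
}
```

With the sbt setup above, that means adding javaOptions += "-XX:+PrintCompilation" alongside fork := true.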
You can find the full output on my GitHub. Just don't ask me to explain what all of those registers are used for... yet.