
Java 8 VM GC Tuning Guide Chapter 6

2016-10-01 00:16
Chapter 6: The Parallel GC

The Parallel Collector

The parallel collector (also referred to here as the throughput collector) is a generational collector similar to the serial collector; the primary difference is that multiple threads are used to speed up garbage collection. The parallel collector is enabled with the command-line option -XX:+UseParallelGC. By default, with this option, both minor and major collections are executed in parallel to further reduce garbage collection overhead.

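The main difference between the parallel GC and the serial GC is that the parallel GC uses multiple threads to speed up garbage collection. It is enabled with the command-line option -XX:+UseParallelGC. By default, both minor and major collections then run with multiple threads.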
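To confirm which collector a running JVM actually uses, one option is to ask the standard management API. The sketch below is only an illustration: the class name ActiveCollectors is made up for this example, and the reported names vary by JVM. On HotSpot with -XX:+UseParallelGC they are typically "PS Scavenge" and "PS MarkSweep".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the tuning guide.
final class ActiveCollectors {
    /** Returns the names of the collectors reported by the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The exact names depend on the JVM vendor, version, and selected collector.
        System.out.println(collectorNames());
    }
}
```

Running it once with -XX:+UseParallelGC and once with -XX:+UseSerialGC should show different collector names.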

On a machine with N hardware threads where N is greater than 8, the parallel collector uses a fixed fraction of N as the number of garbage collector threads. The fraction is approximately 5/8 for large values of N. At values of N below 8, the number used is N. On selected platforms, the fraction drops to 5/16. The specific number of garbage collector threads can be adjusted with a command-line option (which is described later). On a host with one processor, the parallel collector will likely not perform as well as the serial collector because of the overhead required for parallel execution (for example, synchronization). However, when running applications with medium-sized to large-sized heaps, it generally outperforms the serial collector by a modest amount on machines with two processors, and usually performs significantly better than the serial collector when more than two processors are available.

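On a machine with N hardware threads, when N is greater than 8 the parallel GC uses a fixed fraction of N as the number of GC threads; for large N that fraction is about 5/8. When N is below 8, it uses N threads. On certain platforms the fraction drops to 5/16. The number of GC threads can also be set on the command line (covered below). On a single-processor machine the parallel GC will likely not do as well as the serial GC, because of the overhead of parallel execution. Most machines today are multi-core, though, so the parallel GC generally outperforms the serial GC.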
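The text only gives the thread count approximately. A minimal sketch, assuming the commonly cited form of the heuristic (N threads up to 8, then 5/8 of each additional hardware thread), looks like this. The class and method names are hypothetical, and the formula is an assumption rather than something stated in the guide.

```java
// Hypothetical estimator; the exact formula is an assumption, not taken from the guide.
final class GcThreadEstimate {
    /**
     * Estimates the default number of parallel GC threads:
     * N for N <= 8, otherwise 8 + 5/8 of the hardware threads beyond 8.
     */
    static int estimateGcThreads(int hardwareThreads) {
        if (hardwareThreads <= 8) {
            return hardwareThreads;
        }
        // Integer arithmetic: multiply first, then divide, so the result rounds down.
        return 8 + (hardwareThreads - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println(n + " hardware threads -> about " + estimateGcThreads(n) + " GC threads");
    }
}
```

For large N the result approaches 5/8 of N, which matches the description above. The value the JVM really chose can be checked with -XX:+PrintFlagsFinal by looking for ParallelGCThreads.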

The number of garbage collector threads can be controlled with the command-line option -XX:ParallelGCThreads=<N>. If explicit tuning of the heap is being done with command-line options, then the size of the heap needed for good performance with the parallel collector is the same as needed with the serial collector. However, enabling the parallel collector should make the collection pauses shorter. Because multiple garbage collector threads are participating in a minor collection, some fragmentation is possible due to promotions from the young generation to the tenured generation during the collection. Each garbage collection thread involved in a minor collection reserves a part of the tenured generation for promotions and the division of the available space into these "promotion buffers" can cause a fragmentation effect. Reducing the number of garbage collector threads and increasing the size of the tenured generation will reduce this fragmentation effect.

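Use the command-line option -XX:ParallelGCThreads=<N> to control the number of GC threads. When the heap is tuned explicitly on the command line, the parallel GC needs the same heap size as the serial GC for good performance, but it should make collection pauses shorter. Because several threads take part in a minor GC, promoting objects from the young generation to the tenured generation can cause some fragmentation: each GC thread reserves a part of the tenured generation as its promotion buffer. Reducing the number of GC threads and increasing the size of the tenured generation both reduce this fragmentation.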

Generations in the Parallel GC

In Chapter 3 we saw a diagram of how the heap is divided into generations:



However, that figure carried a specific caption:

Figure 3-2 Default Arrangement of Generations, Except for Parallel Collector and G1

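In other words, that default arrangement of generations does not apply to the parallel GC or to G1.

The arrangement of generations for the parallel GC is shown in the following figure: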



This looks quite different from the default arrangement.

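In this figure the young generation appears to show only one survivor region plus another region labeled "space". My reading is that this relates to the promotion buffers mentioned above, although the text above describes those buffers as reserved in the tenured generation.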

Parallel GC Ergonomics

The parallel collector is selected by default on server-class machines. In addition, the parallel collector uses a method of automatic tuning that allows you to specify specific behaviors instead of generation sizes and other low-level tuning details. You can specify maximum garbage collection pause time, throughput, and footprint (heap size).

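The parallel GC is selected by default on server-class machines. In addition, it tunes itself automatically: you specify high-level behaviors rather than generation sizes and other low-level details. You can specify a maximum garbage collection pause time, a throughput goal, and a footprint (heap size).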

Maximum Garbage Collection Pause Time: The maximum pause time goal is specified with the command-line option -XX:MaxGCPauseMillis=<N>. This is interpreted as a hint that pause times of <N> milliseconds or less are desired; by default, there is no maximum pause time goal. If a pause time goal is specified, the heap size and other parameters related to garbage collection are adjusted in an attempt to keep garbage collection pauses shorter than the specified value. These adjustments may cause the garbage collector to reduce the overall throughput of the application, and the desired pause time goal cannot always be met.

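Maximum GC pause time: set with the command-line option -XX:MaxGCPauseMillis=<N>. It is a hint that pauses of <N> milliseconds or less are desired; by default there is no maximum pause time goal. Once a goal is set, the collector adjusts the heap size and other GC-related parameters to try to keep pauses under it. These adjustments can reduce the application's overall throughput, and the goal cannot always be met.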
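When choosing a value for -XX:MaxGCPauseMillis, it helps to know how long collections currently take. The sketch below uses the standard GarbageCollectorMXBean counters to print a rough average per collector. The class name AveragePause is hypothetical, and treating the reported collection time as pause time is an assumption that is reasonable only for stop-the-world collectors such as the parallel GC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration.
final class AveragePause {
    /** Average milliseconds per collection, or 0 when no collection has been counted. */
    static double averageMillis(long totalMillis, long collections) {
        return collections > 0 ? (double) totalMillis / collections : 0.0;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both counters may be -1 if the collector does not define them.
            long count = gc.getCollectionCount();
            long time = gc.getCollectionTime();
            System.out.printf("%s: %d collections, avg %.2f ms%n",
                    gc.getName(), count, averageMillis(time, count));
        }
    }
}
```

This reports an average, not a maximum, so individual pauses can be longer than the printed value.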

Throughput: The throughput goal is measured in terms of the time spent doing garbage collection versus the time spent outside of garbage collection (referred to as application time). The goal is specified by the command-line option -XX:GCTimeRatio=<N>, which sets the ratio of garbage collection time to application time to 1 / (1 + <N>).

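Throughput: the throughput goal is measured as the time spent in GC versus the time spent outside GC (application time). It is set with the command-line option -XX:GCTimeRatio=<N>, and the goal is for GC to take at most 1 / (1 + <N>) of the total time. For example, -XX:GCTimeRatio=4 means GC should take no more than 1/5, or 20%, of the program's run time.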
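The 1 / (1 + <N>) arithmetic is easy to check for a few values. The helper below is a hypothetical illustration, not a JVM API.

```java
// Hypothetical helper that evaluates the GCTimeRatio formula from the text.
final class GcTimeRatio {
    /** Fraction of total time allowed for GC when -XX:GCTimeRatio=n is set. */
    static double gcTimeFraction(int n) {
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {4, 19, 99}) {
            System.out.printf("GCTimeRatio=%d -> GC goal of %.1f%% of total time%n",
                    n, 100.0 * gcTimeFraction(n));
        }
    }
}
```

So a ratio of 4 allows 20% of the time in GC, 19 allows 5%, and 99 allows 1%.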

Footprint: Maximum heap footprint is specified using the option -Xmx<N>. In addition, the collector has an implicit goal of minimizing the size of the heap as long as the other goals are being met.

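Footprint: the maximum heap size is set with -Xmx (and the initial size with -Xms). The collector also tries to keep the heap as small as possible as long as the other goals are met.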

Priority Order of the Parallel GC Goals

Maximum pause time

Throughput

Minimum footprint

Generation Size Adjustments

The statistics such as average pause time kept by the collector are updated at the end of each collection. The tests to determine if the goals have been met are then made and any needed adjustments to the size of a generation are made. The exception is that explicit garbage collections (for example, calls to System.gc()) are ignored in terms of keeping statistics and making adjustments to the sizes of generations.

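Statistics such as the average pause time are updated after every GC. The VM then checks whether the configured goals (maximum pause time, throughput, footprint) are met and, if not, adjusts the generation sizes. The one exception is explicit GCs (for example, System.gc()): they are not counted in the statistics and do not lead to generation size adjustments.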

Growing and shrinking the size of a generation is done by increments that are a fixed percentage of the size of the generation so that a generation steps up or down toward its desired size. Growing and shrinking are done at different rates. By default a generation grows in increments of 20% and shrinks in increments of 5%. The percentage for growing is controlled by the command-line option -XX:YoungGenerationSizeIncrement=<Y> for the young generation and -XX:TenuredGenerationSizeIncrement=<T> for the tenured generation. The percentage by which a generation shrinks is adjusted by the command-line flag -XX:AdaptiveSizeDecrementScaleFactor=<D>. If the growth increment is X percent, then the decrement for shrinking is X/D percent.

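A generation grows and shrinks in steps that are a fixed percentage of its size, and the two directions use different rates. By default a generation grows in steps of 20% and shrinks in steps of 5%. The growth percentage is set with -XX:YoungGenerationSizeIncrement=<Y> for the young generation and -XX:TenuredGenerationSizeIncrement=<T> for the tenured generation. The shrink percentage is controlled by a single option, -XX:AdaptiveSizeDecrementScaleFactor=<D>: if the growth increment is X percent, the shrink decrement is X/D percent.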
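To make the percentages concrete, here is a small sketch of one grow step and one shrink step. The class GenerationStep is hypothetical, and the scale factor D = 4 is an inference from the stated defaults (20% growth, 5% shrink), not a value quoted in the text.

```java
// Hypothetical illustration of the step arithmetic; not JVM code.
final class GenerationStep {
    /** Size after one growth step of incrementPercent. */
    static long grow(long sizeBytes, double incrementPercent) {
        return sizeBytes + Math.round(sizeBytes * incrementPercent / 100.0);
    }

    /** Size after one shrink step of (incrementPercent / decrementScaleFactor) percent. */
    static long shrink(long sizeBytes, double incrementPercent, double decrementScaleFactor) {
        double decrementPercent = incrementPercent / decrementScaleFactor;
        return sizeBytes - Math.round(sizeBytes * decrementPercent / 100.0);
    }

    public static void main(String[] args) {
        long size = 100L * 1024 * 1024; // a 100 MB generation
        System.out.println("after growing:   " + grow(size, 20.0));
        // D = 4 is assumed here because 20% / 4 gives the stated 5% default.
        System.out.println("after shrinking: " + shrink(size, 20.0, 4.0));
    }
}
```

Because growing is four times faster than shrinking under these assumptions, a generation reaches a larger target quickly but gives memory back slowly.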

If the collector decides to grow a generation at startup, then a supplemental percentage is added to the increment. This supplement decays with the number of collections and has no long-term effect. The intent of the supplement is to increase startup performance. There is no supplement to the percentage for shrinking.

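If the GC decides to grow a generation at program startup, the VM adds a supplement on top of the normal increment. The supplement fades away over a number of collections and has no long-term effect. Its purpose is better startup performance. There is no supplement when a generation shrinks.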

If the maximum pause time goal is not being met, then the size of only one generation is shrunk at a time. If the pause times of both generations are above the goal, then the size of the generation with the larger pause time is shrunk first.

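If the maximum pause time goal is not met, the VM shrinks only one generation at a time. If the pause times of both the young and the tenured generation are above the goal, the generation with the larger pause time is shrunk first.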

If the throughput goal is not being met, the sizes of both generations are increased. Each is increased in proportion to its respective contribution to the total garbage collection time. For example, if the garbage collection time of the young generation is 25% of the total collection time and if a full increment of the young generation would be by 20%, then the young generation would be increased by 5%.

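If the throughput goal is not met, both generations are grown, each in proportion to its share of the total GC time. For example, if young generation collections account for 25% of the total GC time and the full increment for the young generation is 20%, the young generation grows by 5% (an odd-looking rule at first glance, but it is simply 25% of the 20% increment).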
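The proportional rule can be written as a one-line calculation. The sketch below uses a hypothetical helper, and the 75% share for the tenured generation is simply the remainder in this example, under the assumption that both generations use the same 20% full increment.

```java
// Hypothetical helper showing the proportional growth rule from the text.
final class ThroughputGrowth {
    /**
     * Growth applied to a generation: its share of total GC time (0.0 to 1.0)
     * multiplied by the full increment in percent.
     */
    static double scaledIncrementPercent(double shareOfGcTime, double fullIncrementPercent) {
        return shareOfGcTime * fullIncrementPercent;
    }

    public static void main(String[] args) {
        // Young generation: 25% of GC time, full increment 20%.
        System.out.println("young grows by   " + scaledIncrementPercent(0.25, 20.0) + "%");
        // Tenured generation: the remaining 75% of GC time, same full increment assumed.
        System.out.println("tenured grows by " + scaledIncrementPercent(0.75, 20.0) + "%");
    }
}
```

The generation that costs more GC time receives the larger share of the growth.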

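Default Heap Size

Unless the heap sizes are given on the command line, the default heap size is calculated from the amount of memory on the machine.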

Client JVM Default Initial and Maximum Heap Sizes

The default maximum heap size is half of the physical memory up to a physical memory size of 192 megabytes (MB) and otherwise one fourth of the physical memory up to a physical memory size of 1 gigabyte (GB).

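The default maximum heap size is half of physical memory when physical memory is 192 MB or less; otherwise it is one quarter of physical memory, counting at most 1 GB of physical memory.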

For example, if your computer has 128 MB of physical memory, then the maximum heap size is 64 MB, and greater than or equal to 1 GB of physical memory results in a maximum heap size of 256 MB.

For example, with 128 MB of physical memory the maximum heap size is 64 MB; with 1 GB or more of physical memory the maximum heap size is 256 MB.

The maximum heap size is not actually used by the JVM unless your program creates enough objects to require it. A much smaller amount, called the initial heap size, is allocated during JVM initialization. This amount is at least 8 MB and otherwise 1/64th of physical memory up to a physical memory size of 1 GB.

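The JVM does not actually use the maximum heap size unless the program creates enough objects to need it. At startup it allocates the much smaller initial heap size, which is at least 8 MB and otherwise 1/64 of physical memory, counting at most 1 GB of physical memory.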

The maximum amount of space allocated to the young generation is one third of the total heap size.

The young generation can be allocated at most one third of the total heap size.
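The client JVM rules above can be sketched as plain arithmetic. This is a literal reading of the stated rules, assuming the thresholds are inclusive and that "up to 1 GB" means physical memory is capped at 1 GB before dividing. The class ClientHeapDefaults is hypothetical and a real JVM may compute slightly different values.

```java
// Hypothetical calculator for the client JVM defaults described in the text.
final class ClientHeapDefaults {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Half of physical memory up to 192 MB, otherwise a quarter of at most 1 GB. */
    static long defaultMaxHeap(long physicalBytes) {
        if (physicalBytes <= 192 * MB) {
            return physicalBytes / 2;
        }
        return Math.min(physicalBytes, GB) / 4;
    }

    /** At least 8 MB, otherwise 1/64 of physical memory capped at 1 GB. */
    static long defaultInitialHeap(long physicalBytes) {
        return Math.max(8 * MB, Math.min(physicalBytes, GB) / 64);
    }

    /** The young generation gets at most one third of the heap. */
    static long maxYoungGeneration(long heapBytes) {
        return heapBytes / 3;
    }

    public static void main(String[] args) {
        System.out.println("128 MB machine: max heap " + defaultMaxHeap(128 * MB) / MB + " MB");
        System.out.println("1 GB machine:   max heap " + defaultMaxHeap(GB) / MB + " MB");
        System.out.println("1 GB machine:   initial heap " + defaultInitialHeap(GB) / MB + " MB");
    }
}
```

The two worked examples in the text (64 MB and 256 MB) both come out of this calculation.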

Server JVM Default Initial and Maximum Heap Sizes

The default initial and maximum heap sizes work similarly on the server JVM as they do on the client JVM, except that the default values can go higher. On 32-bit JVMs, the default maximum heap size can be up to 1 GB if there is 4 GB or more of physical memory. On 64-bit JVMs, the default maximum heap size can be up to 32 GB if there is 128 GB or more of physical memory. You can always set a higher or lower initial and maximum heap by specifying those values directly; see the next section.

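The server JVM works the same way, but the defaults can go higher. On a 32-bit JVM the default maximum heap can be up to 1 GB (with 4 GB or more of physical memory). On a 64-bit JVM it can be up to 32 GB (with 128 GB or more of physical memory). Both values can be overridden.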

Specifying Initial and Maximum Heap Sizes

You can specify the initial and maximum heap sizes using the flags -Xms (initial heap size) and -Xmx (maximum heap size). If you know how much heap your application needs to work well, you can set -Xms and -Xmx to the same value. If not, the JVM will start by using the initial heap size and will then grow the Java heap until it finds a balance between heap usage and performance.

Use -Xms and -Xmx to set the initial and maximum heap sizes.

Other parameters and options can affect these defaults. To verify your default values, use the -XX:+PrintFlagsFinal option and look for MaxHeapSize in the output. For example, on Linux or Solaris, you can run the following:

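To verify the defaults, use the command-line option -XX:+PrintFlagsFinal and look for MaxHeapSize. The output contains many other settings, so filter it through a pipe, for example: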

java -XX:+PrintFlagsFinal <GC options> -version | grep MaxHeapSize
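The same values can also be read from inside a running program. The sketch below uses the standard MemoryMXBean. The class name HeapLimits is hypothetical, and the reported numbers may differ slightly from the MaxHeapSize flag because the JVM reports what is usable by the heap.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class for illustration.
final class HeapLimits {
    /** Snapshot of the heap's initial, used, committed, and maximum sizes. */
    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        // getInit() and getMax() return -1 when the value is undefined.
        System.out.println("initial heap (bytes):   " + heap.getInit());
        System.out.println("committed heap (bytes): " + heap.getCommitted());
        System.out.println("maximum heap (bytes):   " + heap.getMax());
    }
}
```

Running it with different -Xms and -Xmx settings is a quick way to see the flags take effect.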





OutOfMemoryError

The parallel collector throws an OutOfMemoryError if too much time is being spent in garbage collection (GC): If more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, then an OutOfMemoryError is thrown. This feature is designed to prevent applications from running for an extended period of time while making little or no progress because the heap is too small. If necessary, this feature can be disabled by adding the option -XX:-UseGCOverheadLimit to the command line.

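The parallel GC throws an OutOfMemoryError when too much time is being spent in GC: if more than 98% of the total time goes to GC and less than 2% of the heap is recovered, the error is thrown. This keeps an application from running for a long time while making little or no progress because the heap is too small. The check can be disabled with the option -XX:-UseGCOverheadLimit.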
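The two conditions must hold together, which is easy to get wrong when reading quickly. The predicate below is a hypothetical illustration of the rule as stated, using fractions between 0.0 and 1.0. It is not the JVM's actual implementation, which measures these values over its own time windows.

```java
// Hypothetical predicate that mirrors the rule described in the text.
final class GcOverheadLimit {
    /**
     * True when more than 98% of the time is spent in GC
     * and less than 2% of the heap is recovered.
     */
    static boolean exceedsLimit(double gcTimeFraction, double heapRecoveredFraction) {
        return gcTimeFraction > 0.98 && heapRecoveredFraction < 0.02;
    }

    public static void main(String[] args) {
        System.out.println(exceedsLimit(0.99, 0.01)); // both conditions hold
        System.out.println(exceedsLimit(0.99, 0.10)); // GC is slow but still frees memory
        System.out.println(exceedsLimit(0.50, 0.01)); // little is freed but GC time is moderate
    }
}
```

Only the first case corresponds to the situation in which the error would be thrown.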