
Comparing the JDK's built-in ThreadLocal with netty's FastThreadLocal: a summary

2017-02-05 16:20

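While analyzing a potential memory leak recently, I noticed a large number of FastThreadLocalThread instances in the jmap output, so I looked up the javadoc, which reads as follows: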

A special variant of ThreadLocal that yields higher access performance when accessed from a FastThreadLocalThread.

Internally, a FastThreadLocal uses a constant index in an array, instead of using hash code and hash table, to look for a variable. Although seemingly very subtle, it yields slight performance advantage over using a hash table, and it is useful when accessed frequently.

To take advantage of this thread-local variable, your thread must be a FastThreadLocalThread or its subtype. By default, all threads created by DefaultThreadFactory are FastThreadLocalThread due to this reason.

Note that the fast path is only possible on threads that extend FastThreadLocalThread, because it requires a special field to store the necessary state. An access by any other kind of thread falls back to a regular ThreadLocal.

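Put simply, it is a ThreadLocal implementation that is faster when accessed from inside a FastThreadLocalThread. It looks up a variable by a constant array index rather than by a hash value.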
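To make the "constant index instead of hash lookup" idea concrete, here is a minimal JDK-only sketch. It is not netty's code: `SlotThread` and `IndexedLocal` are hypothetical names I made up to stand in for FastThreadLocalThread and FastThreadLocal, and the real implementation may differ in many details. The sketch only shows the two paths the javadoc describes: a direct array access on the special thread type, and a fallback to a regular ThreadLocal on any other thread.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for FastThreadLocalThread: a thread that owns a slot array.
class SlotThread extends Thread {
    Object[] slots = new Object[8];

    SlotThread(Runnable task) {
        super(task);
    }
}

// Hypothetical stand-in for FastThreadLocal: each instance is assigned a fixed index once.
class IndexedLocal<T> {
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();

    private final int index = NEXT_INDEX.getAndIncrement();
    // Used only when the current thread is not a SlotThread.
    private final ThreadLocal<T> fallback = new ThreadLocal<>();

    @SuppressWarnings("unchecked")
    T get() {
        Thread current = Thread.currentThread();
        if (current instanceof SlotThread) {
            // Fast path: plain array access by constant index, no hashing.
            Object[] slots = ((SlotThread) current).slots;
            return index < slots.length ? (T) slots[index] : null;
        }
        return fallback.get(); // Slow path: regular ThreadLocal.
    }

    void set(T value) {
        Thread current = Thread.currentThread();
        if (current instanceof SlotThread) {
            SlotThread st = (SlotThread) current;
            if (index >= st.slots.length) {
                // Grow the slot array so that this index fits.
                st.slots = Arrays.copyOf(st.slots, Math.max(index + 1, st.slots.length * 2));
            }
            st.slots[index] = value;
        } else {
            fallback.set(value);
        }
    }
}

class IndexedLocalDemo {
    static final IndexedLocal<String> NAME = new IndexedLocal<>();

    // Sets and reads NAME on a SlotThread, then returns what that thread saw.
    static String readOnSlotThread(String value) {
        String[] seen = new String[1];
        SlotThread t = new SlotThread(() -> {
            NAME.set(value);
            seen[0] = NAME.get();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        NAME.set("main"); // plain thread, so this goes through the fallback
        System.out.println(readOnSlotThread("worker")); // value stored in the slot array
        System.out.println(NAME.get());                 // main thread's value is unaffected
    }
}
```

The point of the design is that the index is fixed when the variable is created, so a read on the special thread is a bounds check plus an array load, while the JDK ThreadLocal has to probe a per-thread hash table.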


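When the default thread pool is used, netty creates its threads through DefaultThreadFactory, which produces FastThreadLocalThread instances rather than plain Thread objects. The class lives in the io.netty.util.concurrent package.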
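The mechanism is easy to picture with a plain JDK ThreadFactory. The sketch below is only an analogue and does not reproduce netty's source: `MarkedThread` and `MarkedThreadFactory` are hypothetical names standing in for FastThreadLocalThread and DefaultThreadFactory. It shows that tasks in a pool built on such a factory run on the special thread subtype, while a pool built on the JDK default factory does not.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for FastThreadLocalThread.
class MarkedThread extends Thread {
    MarkedThread(Runnable task, String name) {
        super(task, name);
    }
}

// Hypothetical stand-in for DefaultThreadFactory: every thread it creates is a MarkedThread.
class MarkedThreadFactory implements ThreadFactory {
    private final AtomicInteger counter = new AtomicInteger();
    private final String prefix;

    MarkedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        return new MarkedThread(task, prefix + "-" + counter.incrementAndGet());
    }
}

class FactoryDemo {
    // Reports whether a task in a pool built on the given factory runs on a MarkedThread.
    static boolean runsOnMarkedThread(ThreadFactory factory) {
        ExecutorService pool = Executors.newSingleThreadExecutor(factory);
        try {
            return pool.submit(() -> Thread.currentThread() instanceof MarkedThread).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runsOnMarkedThread(new MarkedThreadFactory("worker")));
        System.out.println(runsOnMarkedThread(Executors.defaultThreadFactory()));
    }
}
```

This also explains the jmap observation: the thread type is decided by the factory, so anything running on netty's default pools shows up as a FastThreadLocalThread whether or not the application code ever touches a FastThreadLocal.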



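From what I remember of earlier tests comparing Java maps with the various C++ map and unordered_map implementations, the more entries a map holds, the wider the gap between implementations generally becomes (potential collisions increase, and the underlying structure may be a B* tree, a linked list, a linear layout, and so on).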

To get a rough idea of how large the gap is, I searched around and found a post (https://my.oschina.net/andylucc/blog/614359) that ran a test. Its results are as follows:

1000 ThreadLocals on a single thread, 1,000,000 timed reads:
ThreadLocal: 3767ms | 3636ms | 3595ms | 3610ms | 3719ms
FastThreadLocal: 15ms | 14ms | 13ms | 14ms | 14ms

1000 ThreadLocals on a single thread, 100,000 timed reads:
ThreadLocal: 384ms | 378ms | 366ms | 647ms | 372ms
FastThreadLocal: 14ms | 13ms | 13ms | 17ms | 13ms

1000 ThreadLocals on a single thread, 10,000 timed reads:
ThreadLocal: 43ms | 42ms | 42ms | 56ms | 45ms
FastThreadLocal: 15ms | 13ms | 11ms | 15ms | 11ms

100 ThreadLocals on a single thread, 10,000 timed reads:
ThreadLocal: 16ms | 21ms | 18ms | 16ms | 18ms
FastThreadLocal: 15ms | 15ms | 15ms | 17ms | 18ms

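The data above show that as the number of ThreadLocals and the frequency of reads grow, the traditional ThreadLocal degrades quickly (from roughly 16-21ms up to about 3.6-3.8s in these runs), while netty's FastThreadLocal stays fairly stable at roughly 11-18ms throughout. The scenario simulated here is not very specific, but it is enough to suggest, to some extent, that FastThreadLocal holds up better than the traditional ThreadLocal under high concurrency and heavy load.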
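For anyone who wants to try something similar, here is a rough JDK-only sketch of how the ThreadLocal side of such a measurement could be set up. It is not the harness from the linked post: the class name `ThreadLocalReadTimer`, the loop shape (each pass reads every variable once), and the checksum are my own assumptions, and FastThreadLocal is left out because it needs netty on the classpath. A naive timing loop like this is also sensitive to JIT warm-up, so treat any numbers it prints as indicative only.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadLocalReadTimer {
    // Creates `count` ThreadLocals on the current thread, then performs `passes`
    // passes that each read every variable once.
    // Returns {elapsed milliseconds, checksum of all values read}.
    static long[] run(int count, int passes) {
        List<ThreadLocal<Integer>> locals = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            ThreadLocal<Integer> local = new ThreadLocal<>();
            local.set(i);
            locals.add(local);
        }

        long checksum = 0;
        long start = System.nanoTime();
        for (int p = 0; p < passes; p++) {
            for (ThreadLocal<Integer> local : locals) {
                // Accumulate the value so the reads cannot be optimized away.
                checksum += local.get();
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Clean up so the entries do not linger on the calling thread.
        for (ThreadLocal<Integer> local : locals) {
            local.remove();
        }
        return new long[] {elapsedMs, checksum};
    }

    public static void main(String[] args) {
        int[][] cases = {{100, 10_000}, {1000, 10_000}};
        for (int[] c : cases) {
            long[] result = run(c[0], c[1]);
            System.out.println(c[0] + " ThreadLocals, " + c[1] + " passes: "
                    + result[0] + "ms (checksum " + result[1] + ")");
        }
    }
}
```

A serious comparison would use a benchmarking harness such as JMH and run the FastThreadLocal variant on a FastThreadLocalThread, since on any other thread it falls back to the regular path.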

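To sum up: from experience, I believe 99% of applications never use thousands or tens of thousands of thread-local variables. So unless the application is highly unusual, the traditional ThreadLocal is enough, and with long-term maintenance cost in mind there is no need to adopt FastThreadLocal.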

PS: I won't repeat the use cases for ThreadLocal here; see the following posts:
https://my.oschina.net/clopopo/blog/149368
http://blog.csdn.net/lufeng20/article/details/24314381
http://lavasoft.blog.51cto.com/62575/51926/
