
Use ToLower and ToUpper with Caution, or They May Drag Your System Down

2020-05-04 17:24
At some point, many programmers took to using ToLower and ToUpper for case-insensitive string equality checks. The habit was probably imported from another language; my bold guess is JS. To avoid starting an argument: by "JS" I mean the pinyin initials of jishi (技师, "technician")~

# I: Background

## 1. The story

In one of our order aggregation systems, every order is tagged with its source, such as JD, Taobao, Etao, Shopex and other channels. The UI also offers an advanced setting where users can type in a custom order source. Later a customer reported that entering xxx returned no orders. Take shopex as an example: the user searched with lowercase shopex, but the system had stored Shopex with a capital first letter, so naturally nothing matched. To fix this, one of our developers converted both sides to uppercase before comparing, as in the following code:

``` C#
var orderFrom = "shopex".ToUpper();

// Upper-cases OrderFrom for every order in memory, on every query.
customerIDList = MemoryOrders.Where(i => i.OrderFrom.ToUpper() == orderFrom)
                             .Select(i => i.CustomerId)
                             .ToList();
```

The change went live just like that. At first glance nothing looked wrong, but queries felt noticeably slower than before, by several seconds. I clicked a few more times and, sure enough, monitoring showed CPU and memory spiking and dropping erratically. He was writing bugs again. I looked at the code and asked him why he wrote it this way; he said that is how you compare strings in JS~~~

## 2. Switching to string.Compare

C# actually has a dedicated method for case-insensitive comparison that is fast and does not waste memory: `string.Compare`. Changing the code above as follows is enough.

``` C#
var orderFrom = "shopex";

// Compares in place without creating converted copies of either string.
customerIDList = MemoryOrders.Where(i => string.Compare(i.OrderFrom, orderFrom, StringComparison.OrdinalIgnoreCase) == 0)
                             .Select(i => i.CustomerId)
                             .ToList();
```

The `StringComparison.OrdinalIgnoreCase` enum value is what makes the comparison ignore case. After this went live, CPU still fluctuated a little, but everything else was fine.

# II: Why do ToLower and ToUpper have such a large impact?

For the demonstration I took a short English passage and searched it for a single word, to show why ToUpper affects CPU, memory and query performance so much. The code is as follows:

``` C#
public static void Main(string[] args)
{
    // Split the passage into words on spaces.
    var strList = "Hooray! It's snowing! It's time to make a snowman.James runs out. He makes a big pile of snow. He puts a big snowball on top. He adds a scarf and a hat. He adds an orange for the nose. He adds coal for the eyes and buttons.In the evening, James opens the door. What does he see? The snowman is moving! James invites him in. The snowman has never been inside a house. He says hello to the cat. He plays with paper towels.A moment later, the snowman takes James's hand and goes out.They go up, up, up into the air! They are flying! What a wonderful night!The next morning, James jumps out of bed. He runs to the door.He wants to thank the snowman. But he's gone.".Split(' ');

    var query = "snowman".ToUpper();

    for (int i = 0; i < strList.Length; i++)
    {
        // Upper-case each word before comparing it with the query.
        var str = strList[i].ToUpper();

        if (str == query)
            Console.WriteLine(str);
    }

    Console.ReadLine();
}
```

## 1. Investigating the memory fluctuation

Since memory fluctuates, some garbage must be getting into it. Anyone who has learned C# basics knows that string is immutable: any modification produces a new string, which means ToUpper yields a new string. To back this up with data, let's look at the heap with WinDbg.

``` C#
0:000> !dumpheap -type System.String -stat
Statistics:
              MT    Count    TotalSize Class Name
00007ff8e7a9a120        1           24 System.Collections.Generic.GenericEqualityComparer`1[[System.String, mscorlib]]
00007ff8e7a99e98        1           80 System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.Globalization.CultureData, mscorlib]]
00007ff8e7a9a378        1           96 System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[System.Globalization.CultureData, mscorlib]][]
00007ff8e7a93200       19         2264 System.String[]
00007ff8e7a959c0      429        17894 System.String
Total 451 object
```

The managed heap holds `Count=429` string objects. Where does 429 come from? The breakdown: 128 from the passage, 128 produced by ToUpper, 165 system defaults, 2 for the query string, and 6 of unknown origin, which gives `128 + 128 + 165 + 2 + 6 = 429`. Let's pick a few at random and look.

> !dumpheap -mt 00007ff8e7a959c0
> !DumpObj 000002244282a1f8

``` C#
0:000> !DumpObj /d 0000017800008010
Name:        System.String
MethodTable: 00007ff8e7a959c0
EEClass:     00007ff8e7a72ec0
Size:        38(0x26) bytes
File:        C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      HOUSE.
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e7a985a0  4000281        8         System.Int32  1 instance                6 m_stringLength
00007ff8e7a96838  4000282        c          System.Char  1 instance               48 m_firstChar
00007ff8e7a959c0  4000286       d8        System.String  0   shared           static Empty
                                 >> Domain:Value  0000017878943bb0:NotInit

!DumpObj /d 0000017800008248
Name:        System.String
MethodTable: 00007ff8e7a959c0
EEClass:     00007ff8e7a72ec0
Size:        40(0x28) bytes
File:        C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      SNOWMAN
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e7a985a0  4000281        8         System.Int32  1 instance                7 m_stringLength
00007ff8e7a96838  4000282        c          System.Char  1 instance               53 m_firstChar
00007ff8e7a959c0  4000286       d8        System.String  0   shared           static Empty
                                 >> Domain:Value  0000017878943bb0:NotInit
```
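
The immutability point can be illustrated with a small self-contained sketch. This is not the author's production code: the class name `IgnoreCaseDemo`, the two helper methods, and the sample sentence are hypothetical and exist only for this illustration. It counts matches both ways and then checks whether ToUpper handed back the original object or a different one.

``` C#
using System;
using System.Linq;

public static class IgnoreCaseDemo
{
    // Hypothetical helper: converts every word before comparing.
    // Each ToUpper call on a word containing lowercase letters returns a new string.
    public static int CountWithToUpper(string[] words, string query)
    {
        var upperQuery = query.ToUpper();
        return words.Count(w => w.ToUpper() == upperQuery);
    }

    // Hypothetical helper: compares in place with string.Compare,
    // so no converted copies of the words are created.
    public static int CountWithOrdinalIgnoreCase(string[] words, string query)
    {
        return words.Count(w => string.Compare(w, query, StringComparison.OrdinalIgnoreCase) == 0);
    }

    public static void Main()
    {
        // Sample sentence made up for this sketch.
        var words = "The snowman is moving! James invites the Snowman in".Split(' ');

        Console.WriteLine(CountWithToUpper(words, "snowman"));           // 2
        Console.WriteLine(CountWithOrdinalIgnoreCase(words, "snowman")); // 2

        // The upper-cased value is a separate object from the original string.
        var original = "snowman";
        var upper = original.ToUpper();
        Console.WriteLine(object.ReferenceEquals(original, upper));      // False
    }
}
```

Both helpers return the same count. The difference is that the first one leaves behind one extra string for each lowercase-containing word on every call, which the garbage collector later has to reclaim, while the second works on the strings that already exist.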