GBString: a GB18030-aware string class for Ruby

2007-07-08 12:50

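GBString is a Ruby string class that understands the GB18030 encoding. It overrides a number of String methods so that strings whose internal encoding is GB18030, GBK, or GB2312 can be handled conveniently.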

Project homepage
http://rubyforge.org/projects/gbstring/

License
====================
GBString, a Ruby class similar to the String class but aware of the GB18030 encoding.

Copyright (C) <2007> <Bob Yang>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Email: bob.yang.dev at gmail.com
====================

I. Quick start
====================
1. Constructing a GBString object
require "g_b_string"
gbstr = GBString.new("这是一个GB18030编码的字符串!") # Note: the source file itself must be saved in GB18030

Alternatively, use the shorthand helper:

gbstr = _c("这也是一个GB18030编码的字符串!")

2. Measuring string length in characters rather than bytes
gbstr = _c("中文串")
puts( gbstr.size ) # => 3, not 6 (the byte count)
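
For comparison, the same distinction between bytes and characters can be reproduced with the encoding support built into Ruby 1.9 and later. This is only a sketch and does not use GBString; it assumes the snippet is saved as UTF-8 and transcodes the literal to GB18030 at run time.

```ruby
# Sketch only: uses Ruby's built-in encodings (1.9+), not GBString.
utf8 = "中文串"                 # literal in a UTF-8 source file (assumption)
gb   = utf8.encode("GB18030")   # transcode to GB18030 bytes

puts gb.encoding   # GB18030
puts gb.bytesize   # 6 (each of these characters takes two bytes)
puts gb.size       # 3 (characters)
puts gb.b.size     # 6 (the same data viewed as raw bytes)
```

Ruby 1.8, current when GBString was written, counted bytes in String#size, which is the gap the class fills.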

3. Iterating over a Chinese string
_c("中文串abc").each do |char|
  puts char
end
# => 中
# => 文
# => 串
# => a
# => b
# => c
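
To illustrate how per-character iteration over GB18030 bytes can work, here is a minimal sketch of a scanner that decides each character's length from its first two bytes. The helper name gb18030_chars is hypothetical, this is not GBString's actual implementation, and it assumes well-formed input without validating it.

```ruby
# Hypothetical helper for illustration; not part of GBString.
# GB18030 byte layout relied on here:
#   1 byte : 0x00..0x7F
#   2 bytes: lead 0x81..0xFE, second 0x40..0x7E or 0x80..0xFE
#   4 bytes: lead 0x81..0xFE, second 0x30..0x39, then two more bytes
def gb18030_chars(str)
  bytes = str.bytes
  chars = []
  i = 0
  while i < bytes.length
    lead   = bytes[i]
    second = bytes[i + 1]
    len =
      if !lead.between?(0x81, 0xFE) || second.nil?
        1   # ASCII, or a stray byte at the very end
      elsif second.between?(0x30, 0x39)
        4   # four-byte form
      else
        2   # two-byte form (the GBK range)
      end
    chars << bytes[i, len].pack("C*").force_encoding("GB18030")
    i += len
  end
  chars
end

gb = "中文串abc".encode("GB18030")   # assumes a UTF-8 source file
gb18030_chars(gb).each { |ch| puts ch.encode("UTF-8") }
```

Run as written, this prints the same six lines as the GBString example above.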

II. More usage
====================
See test/tc_g_b_string.rb for further examples.

1. each, each_with_index
Each element is a single character, returned as a String.

2. split

cstr = _c("第一 第二 第三")
tokens = cstr.split(" ")
tokens[0] # => 第一
tokens[1] # => 第二
tokens[2] # => 第三

3. The index operator '[ ]'. Note that the result is a GBString.

cstr = _c("甲A")
puts( cstr[0] ) # => 甲
puts( cstr[1] ) # => A
puts( cstr[0].class ) # => GBString
puts( cstr[1].class ) # => GBString

4. Conversions: to_a and ranges
# Convert to an array with to_a
gbstr = _c("类型map")
array = gbstr.to_a
puts array.size # => 5
puts array[0] # => 类
puts array[3] # => a

# Slice with a range; the result is a GBString
gbstr = _c("This is a \"中文字符串\"!")
puts( gbstr[10..16] ) # => "中文字符串"
puts( gbstr[-8..-1] ) # => "中文字符串"!
puts( gbstr[10, 7] ) # => "中文字符串"
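
The index arithmetic above can be checked with the character-based slicing built into Ruby 1.9 and later. This sketch does not use GBString; it assumes a UTF-8 source file and transcodes the literal to GB18030 first.

```ruby
# Sketch only: built-in slicing on a GB18030 string, not GBString.
gb = "This is a \"中文字符串\"!".encode("GB18030")   # assumes a UTF-8 source file

puts gb[10..16].encode("UTF-8")   # "中文字符串"
puts gb[-8..-1].encode("UTF-8")   # "中文字符串"!
puts gb[10, 7].encode("UTF-8")    # "中文字符串"
puts gb.size                      # 18 characters
puts gb.bytesize                  # 23 bytes (13 ASCII + 5 two-byte characters)
```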

5. Use as a Hash key. A GBString is equal to a String with the same content.
hash = {_c("中国")=>1, _c("贵州省")=>2, _c("贵阳市")=>3}
puts( hash[_c("贵阳市")] ) # => 3
puts( hash["贵阳市"] ) # => 3
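
One way this kind of interchangeability can arise in Ruby is through subclassing: a String subclass inherits hash and eql?, so an instance with the same content and encoding finds the same Hash entry as a plain String. The sketch below uses a hypothetical class PlainSubclass; whether GBString itself is implemented as a String subclass is an assumption, not something this document states.

```ruby
# Sketch only. PlainSubclass is a hypothetical stand-in, not GBString.
class PlainSubclass < String; end

key   = PlainSubclass.new("贵阳市")
table = { key => 3 }

puts key == "贵阳市"              # true
puts key.eql?("贵阳市")           # true
puts key.hash == "贵阳市".hash    # true
puts table["贵阳市"]              # 3
```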

III. Running the unit tests
====================
Change into the test directory and run ruby tc_g_b_string.rb.
If everything is working, the output ends with:
7 tests, 56 assertions, 0 failures, 0 errors