Basic HBase operations from Python via Thrift

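2014-07-08 15:08
Thrift is a cross-language RPC framework with a binary protocol, developed and open-sourced by Facebook. With Thrift, programs written in different languages can call one another, so each part of a system can be written in the language best suited to it.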

The Thrift paper: http://pan.baidu.com/share/link?shareid=234128&uk=3238841275

Installing Thrift: http://thrift.apache.org/docs/install/ubuntu/

After installation, go to the HBase directory and locate Hbase.thrift. The file is under

hbase-0.94.4/src/main/resources/org/apache/hadoop/hbase/thrift

Running thrift --gen py Hbase.thrift generates a gen-py folder; rename it to hbase.

Install the Python thrift library:

sudo pip install thrift

Start the HBase Thrift service with bin/hbase-daemon.sh start thrift. The default port is 9090.
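Before running the scripts below, it can help to confirm that something is actually listening on the Thrift port. The following is a minimal sketch using only the standard library; the function name thrift_port_is_open is made up for this example, and it only checks that a TCP connection succeeds, not that the listener is really the HBase Thrift server.

```python
import socket


def thrift_port_is_open(host="localhost", port=9090, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    sock.close()
    return True


# Example use (hypothetical):
#   if not thrift_port_is_open():
#       raise SystemExit("start the service first: bin/hbase-daemon.sh start thrift")
```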

Create an HBase table:



from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import ColumnDescriptor

# Connect to the HBase Thrift server on its default port.
transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

# Create table 'test' with one column family 'cf' that keeps a single version.
contents = ColumnDescriptor(name='cf:', maxVersions=1)
client.createTable('test', [contents])

print(client.getTableNames())

transport.close()




Run the code. Once it succeeds, open the HBase shell and run the list command; the newly created test table should appear.
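Running the create script a second time fails, because createTable raises AlreadyExists when the table is already there. One way around this, sketched below, is to check getTableNames first. The helper name ensure_table is hypothetical, and the sketch assumes the names returned by getTableNames compare equal to the table_name you pass in (plain strings, as in this post). The check and the create are two separate calls, so this is not safe against another client creating the table in between.

```python
def ensure_table(client, table_name, column_families):
    """Create table_name unless getTableNames() already lists it.

    client is an Hbase.Client (or anything with the same two methods).
    Returns True if the table was created, False if it already existed.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_families)
    return True


# Example use with the client from the script above (hypothetical):
#   ensure_table(client, 'test', [ColumnDescriptor(name='cf:', maxVersions=1)])
```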



Insert data:



from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import Mutation

# Connect to the HBase Thrift server on its default port.
transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

row = 'row-key1'

# Write the value "1" to column cf:a of the row; the last argument is the
# attributes map, which is not used here.
mutations = [Mutation(column="cf:a", value="1")]
client.mutateRow('test', row, mutations, None)

transport.close()




Once the insert succeeds, you can check the result with the scan command in the HBase shell.
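When a row has several columns to write, building the Mutation list by hand gets repetitive. The sketch below turns a plain dict into such a list. The helper build_mutations is hypothetical, and the Mutation class is passed in as a parameter so the snippet does not depend on the generated hbase package; in a real script you would pass Mutation from hbase.ttypes.

```python
def build_mutations(mutation_cls, family, values):
    """Build one mutation per (qualifier, value) pair in values.

    mutation_cls is expected to accept column= and value= keyword
    arguments, like Mutation from hbase.ttypes. Column names are formed
    as "family:qualifier" and emitted in sorted qualifier order.
    """
    return [
        mutation_cls(column="%s:%s" % (family, qualifier), value=value)
        for qualifier, value in sorted(values.items())
    ]


# Example use (hypothetical):
#   mutations = build_mutations(Mutation, 'cf', {'a': '1', 'b': '2'})
#   client.mutateRow('test', 'row-key1', mutations, None)
```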



Fetch a single row:



from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase

# Connect to the HBase Thrift server on its default port.
transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

tableName = 'test'
rowKey = 'row-key1'

# getRow returns a list of row results for the given key.
result = client.getRow(tableName, rowKey, None)
print(result)
for r in result:
    print('the row is %s' % r.row)
    print('the value is %s' % r.columns.get('cf:a').value)

transport.close()




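getRow returns a list of TRowResult objects. Each one carries the row key in row and a map from column name to cell in columns, which is what the loop above prints.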
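Note that r.columns.get('cf:a').value raises AttributeError if the row has no cf:a column, since get then returns None. A small conversion to plain dicts, sketched below, makes the result easier to inspect and avoids that lookup. The helper name rows_to_dict is made up for this example; it relies only on the row and columns attributes and on each cell having a value attribute.

```python
def rows_to_dict(row_results):
    """Map each row key to a {column_name: value} dict.

    row_results is a list like the one returned by getRow: each item
    has .row (the key) and .columns (column name -> cell with .value).
    """
    return dict(
        (r.row, dict((col, cell.value) for col, cell in r.columns.items()))
        for r in row_results
    )


# Example use (hypothetical):
#   data = rows_to_dict(client.getRow('test', 'row-key1', None))
#   value = data.get('row-key1', {}).get('cf:a')  # None if absent
```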



To return multiple rows, use a scanner:



from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import TScan

# Connect to the HBase Thrift server on its default port.
transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

# An empty TScan scans the whole table.
scan = TScan()
tableName = 'test'
scanner_id = client.scannerOpenWithScan(tableName, scan, None)

# Fetch at most 10 rows in one call.
result2 = client.scannerGetList(scanner_id, 10)

print(result2)

# Release the scanner on the server, then close the connection.
client.scannerClose(scanner_id)
transport.close()




scannerGetList fetches up to 10 rows in this call, and the script then prints them.
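A single scannerGetList call stops after the requested number of rows, so reading a larger table means calling it repeatedly until it returns an empty list. The generator below is one possible way to wrap that loop; the name scan_rows is hypothetical, and it closes the scanner with scannerClose when iteration ends or fails.

```python
def scan_rows(client, table_name, scan, batch_size=10):
    """Yield every row matched by scan, fetching batch_size rows per call."""
    scanner_id = client.scannerOpenWithScan(table_name, scan, None)
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                # An empty list means the scanner is exhausted.
                break
            for row in batch:
                yield row
    finally:
        # Always release the server-side scanner.
        client.scannerClose(scanner_id)


# Example use (hypothetical):
#   for r in scan_rows(client, 'test', TScan()):
#       print(r.row)
```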



scannerGet, by contrast, fetches only one row per call:



from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import TScan

# Connect to the HBase Thrift server on its default port.
transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

scan = TScan()
tableName = 'test'
scanner_id = client.scannerOpenWithScan(tableName, scan, None)

# scannerGet returns a list holding the next row, or an empty list at the end.
result = client.scannerGet(scanner_id)
while result:
    print(result)
    result = client.scannerGet(scanner_id)

# Release the scanner on the server, then close the connection.
client.scannerClose(scanner_id)
transport.close()




The loop prints one row per iteration and stops once scannerGet returns an empty list.



Reposted from: http://www.cnblogs.com/hitandrew/archive/2013/01/21/2870419.html