Redis Mass Insertion (--pipe bulk insertion)
2015-12-06 11:12
Sometimes Redis instances need to be loaded with a large amount of preexisting or user-generated data in a short amount of time, so that millions of keys are created as quickly as possible.
This is called a mass insertion, and the goal of this document is to provide information about how to feed Redis with data as fast as possible.
Use the protocol, Luke
Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you pay the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.
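To get a rough feel for the round trip cost, the following sketch compares one command per round trip with batched pipelining. The 200 microsecond round trip time, the one million commands, and the batch size of 10,000 are assumed figures chosen for illustration, not measurements, and the helper names are hypothetical. The estimate counts only time spent waiting on the network and ignores server-side processing.

```ruby
# Back-of-the-envelope estimate of time spent waiting on the network.
# All figures passed in below are assumptions for illustration only.

# One command per round trip: every command waits for its reply.
def sequential_seconds(commands, rtt_microseconds)
  commands * rtt_microseconds / 1_000_000.0
end

# Pipelining: one round trip per batch of commands.
def pipelined_seconds(commands, rtt_microseconds, batch_size)
  batches = (commands + batch_size - 1) / batch_size # ceiling division
  batches * rtt_microseconds / 1_000_000.0
end

puts sequential_seconds(1_000_000, 200)
puts pipelined_seconds(1_000_000, 200, 10_000)
```

Under these assumptions the sequential approach spends about 200 seconds waiting for replies, while the batched one spends a small fraction of a second.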
Only a small percentage of clients support non-blocking I/O, and not all clients can parse the replies efficiently enough to maximize throughput. For all these reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.
For instance, if I need to generate a large data set with billions of keys of the form `keyN -> ValueN`, I will create a file containing the following commands in the Redis protocol format:
SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN
Once this file is created, the remaining action is to feed it to Redis as fast as possible. In the past the way to do this was to use netcat with the following command:
(cat data.txt; sleep 10) | nc localhost 6379 > /dev/null
However, this is not a very reliable way to perform a mass import, because netcat does not really know when all the data has been transferred and can't check for errors. In Redis 2.6 and later, the redis-cli utility supports a new mode called pipe mode, which was designed to perform mass insertion.
Using the pipe mode the command to run looks like the following:
cat data.txt | redis-cli --pipe
That will produce an output similar to this:
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 1000000
The redis-cli utility will also make sure to only redirect errors received from the Redis instance to the standard output.
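When the import is driven from a script, it can be handy to check the summary line automatically. The sketch below assumes the output has exactly the shape shown above; `parse_pipe_summary` and `PipeSummary` are hypothetical names introduced here, not part of redis-cli or any library.

```ruby
# Hypothetical helper: pull the counters out of the summary line that
# redis-cli --pipe prints, assuming the "errors: N, replies: M" format
# shown in the sample output above.
PipeSummary = Struct.new(:errors, :replies)

def parse_pipe_summary(output)
  match = output.match(/errors:\s*(\d+),\s*replies:\s*(\d+)/)
  return nil unless match # no summary line found

  PipeSummary.new(match[1].to_i, match[2].to_i)
end

# Sample text copied from the output shown above.
sample = <<~OUT
  All data transferred. Waiting for the last reply...
  Last reply received from server.
  errors: 0, replies: 1000000
OUT

summary = parse_pipe_summary(sample)
puts "errors=#{summary.errors} replies=#{summary.replies}"
```

A calling script could then treat a non-zero `errors` count, or a missing summary, as a failed import.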
Generating Redis Protocol
The Redis protocol is extremely simple to generate and parse, and is documented in the Redis protocol specification. However, to generate protocol for mass insertion you don't need to understand every detail of it, only that every command is represented in the following way:
*<args><cr><lf>
$<len><cr><lf>
<arg0><cr><lf>
<arg1><cr><lf>
...
<argN><cr><lf>
Where <cr> means "\r" (or ASCII character 13) and <lf> means "\n" (or ASCII character 10).
For instance the command SET key value is represented by the following protocol:
*3<cr><lf>
$3<cr><lf>
SET<cr><lf>
$3<cr><lf>
key<cr><lf>
$5<cr><lf>
value<cr><lf>
Or represented as a quoted string:
"*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n"
The file you need to generate for mass insertion is just composed of commands represented in the above way, one after the other.
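To sanity-check a generated file against the layout described above, a small decoder can read the commands back. This is a minimal sketch that handles only the command format shown here (an argument count followed by length-prefixed arguments), not the full protocol; `parse_redis_proto` is a hypothetical helper name.

```ruby
require "stringio"

# Hypothetical helper: decode a string of concatenated commands in the
# "*<args>\r\n$<len>\r\n<arg>\r\n..." layout into arrays of arguments.
def parse_redis_proto(data)
  io = StringIO.new(data)
  commands = []
  until io.eof?
    header = io.gets("\r\n")
    raise ArgumentError, "expected '*', got #{header.inspect}" unless header.start_with?("*")
    argc = Integer(header[1..-1].chomp("\r\n"))

    args = Array.new(argc) do
      len_line = io.gets("\r\n")
      raise ArgumentError, "expected '$', got #{len_line.inspect}" unless len_line && len_line.start_with?("$")
      len = Integer(len_line[1..-1].chomp("\r\n"))
      arg = len.zero? ? "" : io.read(len) # <len> counts bytes
      raise ArgumentError, "truncated argument" unless arg && arg.bytesize == len
      raise ArgumentError, "missing CRLF after argument" unless io.read(2) == "\r\n"
      arg
    end
    commands << args
  end
  commands
end

p parse_redis_proto("*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n")
```

Reading the whole string into memory is fine for spot checks on small samples; a file with millions of commands would call for streaming from the file instead.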
The following Ruby function generates valid protocol:
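def gen_redis_proto(*cmd)
    proto = ""
    # Header: "*" followed by the number of arguments.
    proto << "*"+cmd.length.to_s+"\r\n"
    cmd.each{|arg|
        # Each argument: "$" plus its length in bytes, then the argument itself.
        proto << "$"+arg.to_s.bytesize.to_s+"\r\n"
        proto << arg.to_s+"\r\n"
    }
    proto
end

puts gen_redis_proto("SET","mykey","Hello World!").inspect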
Using the above function it is possible to easily generate the key value pairs in the above example, with this program:
(0...1000).each{|n|
    STDOUT.write(gen_redis_proto("SET","Key#{n}","Value#{n}"))
}
We can pipe the program's output directly to redis-cli in order to perform our first mass import session.
$ ruby proto.rb | redis-cli --pipe
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 1000
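One detail of gen_redis_proto worth noting is that it uses bytesize rather than length: the `$<len>` prefix counts bytes, not characters, so the two differ for multibyte values. The sketch below illustrates this with a hypothetical `bulk_string` helper that mirrors the per-argument part of that function; the sample value is an arbitrary choice.

```ruby
# Hypothetical helper: encode a single argument as "$<len>\r\n<arg>\r\n",
# where <len> is the argument's size in bytes.
def bulk_string(arg)
  s = arg.to_s
  "$#{s.bytesize}\r\n#{s}\r\n"
end

value = "h\u00e9llo" # "hello" with an accented e: 5 characters, 6 bytes in UTF-8

puts value.length     # character count
puts value.bytesize   # byte count, which is what the protocol expects
puts bulk_string(value).inspect
```

Using the character count here would make the declared length shorter than the data that follows, so the server would misread the rest of the stream.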