您的位置：首页 > 运维架构

How to read video frames in hadoop?如何在Hadoop中读取视频帧？

2013-08-21 17:43 387 查看

To process specialized file formats (such as video) in Hadoop, you'd have to write a custom InputFormat and RecordReader that understands how to turn a video file into splits (the InputFormat) and then read splits into values (the RecordReader).在Hadoop要处理的专用文件格式（如视频），你就必须写一个自定义的InputFormat和RecordReader，了解如何将一个视频文件分割（InputFormat），然后读值（RecordReader）分裂成。 This is a non-trivial task and requires some intermediate knowledge of how Hadoop handles the splitting of data.这是一个不平凡的任务和Hadoop的处理分割的数据需要一些中间的知识。 I highly recommend Tom White's Hadoop the Definitive Guide book by O'Reilly as well as the videos on http://www.cloudera.com .我强烈建议汤姆白色的Hadoop权威指南书由O'Reilly和视频http://www.cloudera.com 。 (Full disclosure: I work for Cloudera.) （披露：我工作的Cloudera的。）

Keep in mind that video formats are generally compressed which gets even more complicated because InputSplits (created by an InputFormat) are simple byte offsets into the file (normally).请记住，一般都是压缩的视频格式变得更加复杂，因为InputSplits创建一个InputFormat是简单的字节偏移量到文件中（通常情况下）。 Start withhttp://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/InputFormat.html从与http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/InputFormat.html开始

To summarize: InputFormat knows how to generate a list of InputSplit objects that are (usually) between 64MB and 128MB and do NOT respect the notion of frames.总结：InputFormat知道如何生成的列表InputSplit对象（通常情况下），64MB和128MB之间，不尊重的概念框架。 The RecordReader then is used to read frames out of a InputSplit to create value objects that the map reduce job can process. “的RecordReader然后是用于读取帧一的InputSplit，以创建的Map Reduce作业可以处理的值对象。 If you want to generate video output you'll also need to write a custom OutputFormat.如果你想生成视频输出，您还需要编写一个自定义的OutputFormat。

Hope this helps.希望这会有所帮助。

内容来自用户分享和网络整理，不保证内容的准确性，如有侵权内容，可联系管理员处理

标签：

相关文章推荐

新的分享

章节导航