
Real-Time Network Audio/Video Transmission System on iOS (3) - Encoding Audio/Video Data to H264 and AAC with VideoToolbox

Server side -- encoding audio/video data to H264 and AAC

This part took a lot of time. I had no prior background in the area and consulted quite a few resources, but material on encoding with VideoToolbox and AudioToolbox is scarce: web searches turn up plenty of results, yet on closer inspection they all say much the same thing. I recommend reading the official documentation.


Downloads

GitHub:

Client: https://github.com/AmoAmoAmo/Smart_Device_Client

Server: https://github.com/AmoAmoAmo/Smart_Device_Server

There is also a macOS version of the server, which still has some issues; take a look if you are interested: https://github.com/AmoAmoAmo/Server_Mac

Encoding video data to H264 with VideoToolbox

Initialization -- creating the session

// ----- 1. Create the session -----
int width = 640, height = 480;
OSStatus status = VTCompressionSessionCreate(NULL, width, height,
                                             kCMVideoCodecType_H264, NULL, NULL, NULL,
                                             didCompressH264, (__bridge void *)(self), &EncodingSession);
NSLog(@"H264: VTCompressionSessionCreate %d", (int)status);
if (status != 0)
{
    NSLog(@"H264: failed to create session");
    return;
}

// ----- 2. Configure session properties -----
// Real-time encoded output (avoids latency)
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);

// Keyframe (GOP size) interval
int frameInterval = 10;
CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);

// Expected frame rate
int fps = 10;
CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);

// Average bit rate, in bits per second
int bitRate = width * height * 3 * 4 * 8;
CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);

// Hard data-rate limit, in bytes per second
int bitRateLimit = width * height * 3 * 4;
CFNumberRef bitRateLimitRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRateLimit);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_DataRateLimits, bitRateLimitRef);

// Tell the encoder to start encoding
VTCompressionSessionPrepareToEncodeFrames(EncodingSession);
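
One detail worth noting: Apple's documentation describes kVTCompressionPropertyKey_DataRateLimits as taking a CFArray of alternating byte-count and duration values rather than a single CFNumber. A sketch of that documented form (reusing the same byte limit as above; the one-second window is an assumption) would be:

// Sketch: passing the data-rate limit as the documented [bytes, seconds] pair.
// bytesLimit reuses the value computed above; the 1-second window is an assumed choice.
int bytesLimit = width * height * 3 * 4;   // bytes
int oneSecond  = 1;                        // seconds
CFNumberRef bytesRef  = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bytesLimit);
CFNumberRef secondRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &oneSecond);
const void *limitValues[] = { bytesRef, secondRef };
CFArrayRef limits = CFArrayCreate(kCFAllocatorDefault, limitValues, 2, &kCFTypeArrayCallBacks);
VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_DataRateLimits, limits);
CFRelease(limits);
CFRelease(bytesRef);
CFRelease(secondRef);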


The encoding-complete callback

The encoded H264 data will be delivered through this callback.

void didCompressH264(void *outputCallbackRefCon,
                     void *sourceFrameRefCon,
                     OSStatus status,
                     VTEncodeInfoFlags infoFlags,
                     CMSampleBufferRef sampleBuffer)
{
//    NSLog(@"didCompressH264 called with status %d infoFlags %d", (int)status, (int)infoFlags); // 0 1
    if (status != 0) {
        return;
    }

    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready ");
        return;
    }
//    ViewController* encoder = (__bridge ViewController*)outputCallbackRefCon;

    HJH264Encoder *encoder = (__bridge HJH264Encoder*)(outputCallbackRefCon);

    // ----- Extract SPS and PPS from keyframes ------
    // Check whether the current frame is a keyframe
    bool keyframe = !CFDictionaryContainsKey((CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), kCMSampleAttachmentKey_NotSync);
    // Get the SPS & PPS data
    if (keyframe)
    {
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        if (statusCode == noErr)
        {
            // Found sps and now check for pps
            size_t pparameterSetSize, pparameterSetCount;
            const uint8_t *pparameterSet;
            OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
            if (statusCode == noErr)
            {
                // Found pps
                NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
                if (encoder)
                {
                    [encoder gotSpsPps:sps pps:pps];  // hand the SPS & PPS data to the encoder object
                }
            }
        }
    }

    // --------- Write out the NALU data ----------
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4; // the first 4 bytes of each NALU are not a 0001 start code but the NALU length in big-endian byte order

        // Loop over the NALUs in the block buffer
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            uint32_t NALUnitLength = 0;
            // Read the NAL unit length
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);

            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);

            NSData* data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder gotEncodedData:data isKeyFrame:keyframe];

            // Move to the next NAL unit in the block buffer
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}

Passing in the frames to encode

- (void)encode:(CMSampleBufferRef)sampleBuffer
{
    CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
    // Presentation timestamp; if it is not set, the timeline ends up far too long.
    CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000); // CMTimeMake(value, timescale); value / timescale = time in seconds
    VTEncodeInfoFlags flags;
    OSStatus statusCode = VTCompressionSessionEncodeFrame(EncodingSession,
                                                          imageBuffer,
                                                          presentationTimeStamp,
                                                          kCMTimeInvalid,
                                                          NULL, NULL, &flags);
    if (statusCode != noErr) {
        NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);

        VTCompressionSessionInvalidate(EncodingSession);
        CFRelease(EncodingSession);
        EncodingSession = NULL;
        return;
    }
}
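
For context, encode: would typically be driven from the capture delegate set up in part 2 of this series. A minimal sketch follows; the videoDataOutput and h264Encoder property names are illustrative, not names from the original project:

// Hypothetical wiring: feed frames from AVCaptureVideoDataOutput into the encoder.
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    if (output == self.videoDataOutput) {
        [self.h264Encoder encode:sampleBuffer];
    }
}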

The encoded data can then be picked up in the callback above and sent to the client over the socket.

Remember to test and log at every stage; otherwise tracking down bugs later will be painful.

One way to verify the encoding is to write the encoded data to a local file and open it with VLC, as in the sketch below.
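
The gotSpsPps:pps: and gotEncodedData:isKeyFrame: methods called from the callback are not shown above; the project's own implementations are in the GitHub repository. For the VLC check, a minimal sketch that prepends the Annex-B start code 00 00 00 01 and appends everything to a file could look like this (self.fileHandle is an assumed NSFileHandle opened for writing; the real project sends the data over a socket instead):

// Minimal sketch: convert AVCC NALUs to Annex-B and append them to a local file for VLC.
- (void)gotSpsPps:(NSData *)sps pps:(NSData *)pps
{
    const char startCode[] = "\x00\x00\x00\x01";
    NSData *startCodeData = [NSData dataWithBytes:startCode length:4];
    [self.fileHandle writeData:startCodeData];
    [self.fileHandle writeData:sps];
    [self.fileHandle writeData:startCodeData];
    [self.fileHandle writeData:pps];
}

- (void)gotEncodedData:(NSData *)data isKeyFrame:(BOOL)isKeyFrame
{
    const char startCode[] = "\x00\x00\x00\x01";
    [self.fileHandle writeData:[NSData dataWithBytes:startCode length:4]];
    [self.fileHandle writeData:data];
}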

Finally, don't forget to shut down the encoder:

- (void)EndVideoToolBox
{
    VTCompressionSessionCompleteFrames(EncodingSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(EncodingSession);
    CFRelease(EncodingSession);
    EncodingSession = NULL;
}


Note: the process of encoding with VideoToolbox on macOS is covered in this post:

VideoToolbox video encoding -- notes on encoding captured video on macOS, and converting YUV422 to YUV420

Encoding audio data to AAC with AudioToolbox

Setting the encoder parameters

- (void)setupEncoderFromSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));

    AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Zero-initializing the output stream description is important.
    outAudioStreamBasicDescription.mSampleRate = inAudioStreamBasicDescription.mSampleRate; // Sample rate during normal playback; for a compressed format this is the rate after decompression. Must not be 0.
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // Output format: AAC
    outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_LC; // AAC Low Complexity profile; 0 means unspecified
    outAudioStreamBasicDescription.mBytesPerPacket = 0; // Bytes per packet; 0 for variable packet sizes. Variable-size formats use AudioStreamPacketDescription to describe each packet.
    outAudioStreamBasicDescription.mFramesPerPacket = 1024; // Frames per packet: 1 for uncompressed audio; a larger fixed number for formats like AAC (1024); 0 for variable frame counts (e.g. Ogg).
    outAudioStreamBasicDescription.mBytesPerFrame = 0; // Bytes from the start of one frame to the start of the next; 0 for compressed formats.
    outAudioStreamBasicDescription.mChannelsPerFrame = 1; // Number of channels
    outAudioStreamBasicDescription.mBitsPerChannel = 0; // 0 for compressed formats
    outAudioStreamBasicDescription.mReserved = 0; // Pads the struct to an 8-byte boundary; set to 0.
    AudioClassDescription *description = [self
                                          getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                                          fromManufacturer:kAppleSoftwareAudioCodecManufacturer]; // software encoder

    OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, description, &_audioConverter); // create the converter
    if (status != 0) {
        NSLog(@"setup converter: %d", (int)status);
    }
}
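
Optionally, the AAC output bit rate can be set on the converter after it is created. This is not part of the original snippet; a small sketch using the standard kAudioConverterEncodeBitRate property (64 kbps is an arbitrary example value) would be:

// Optional sketch: set the AAC output bit rate on the converter.
UInt32 outputBitrate = 64000;
OSStatus bitrateStatus = AudioConverterSetProperty(_audioConverter,
                                                   kAudioConverterEncodeBitRate,
                                                   sizeof(outputBitrate),
                                                   &outputBitrate);
if (bitrateStatus != noErr) {
    NSLog(@"set bitrate: %d", (int)bitrateStatus);
}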


Getting the codec description

- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                            fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;

    UInt32 encoderSpecifier = type;
    OSStatus st;

    UInt32 size;
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return nil;
    }

    unsigned int count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return nil;
    }

    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }

    return nil;
}

Passing the audio data captured from the device to the encoder

- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer completionBlock:(void (^)(NSData *encodedData, NSError *error))completionBlock {
    CFRetain(sampleBuffer);
    dispatch_async(_encoderQueue, ^{
        if (!_audioConverter) {
            [self setupEncoderFromSampleBuffer:sampleBuffer];
        }
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        CFRetain(blockBuffer);
        // --------- Get _pcmBufferSize and _pcmBuffer via CMBlockBufferGetDataPointer --------
        OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &_pcmBufferSize, &_pcmBuffer);
        NSError *error = nil;
        if (status != kCMBlockBufferNoErr) {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }
        memset(_aacBuffer, 0, _aacBufferSize);

        AudioBufferList outAudioBufferList = {0};
        outAudioBufferList.mNumberBuffers = 1;
        outAudioBufferList.mBuffers[0].mNumberChannels = 1;
        outAudioBufferList.mBuffers[0].mDataByteSize = (int)_aacBufferSize;
        outAudioBufferList.mBuffers[0].mData = _aacBuffer;
        AudioStreamPacketDescription *outPacketDescription = NULL;
        UInt32 ioOutputDataPacketSize = 1;
        // Converts data supplied by an input callback function, supporting non-interleaved and packetized formats.
        // Produces a buffer list of output data from an AudioConverter. The supplied input callback function is called whenever necessary.
        status = AudioConverterFillComplexBuffer(_audioConverter, inInputDataProc, (__bridge void *)(self), &ioOutputDataPacketSize, &outAudioBufferList, outPacketDescription);
        NSData *data = nil;
        if (status == 0) {
            NSData *rawAAC = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
            NSData *adtsHeader = [self adtsDataForPacketLength:rawAAC.length];
            NSMutableData *fullData = [NSMutableData dataWithData:adtsHeader];
            [fullData appendData:rawAAC];
            data = fullData;
        } else {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }

        if (completionBlock) {
            dispatch_async(_callbackQueue, ^{
//                printf("----- audio data len = %d ----\n",(int)[data length]);
                completionBlock(data, error);
            });
        }
        CFRelease(sampleBuffer);
        CFRelease(blockBuffer);
    });
}
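
The _encoderQueue, _callbackQueue, _aacBuffer, and _aacBufferSize ivars used above are created elsewhere in the project. A plausible init is sketched below; the queue labels and the 1024-byte output buffer size are assumptions, not values taken from the original code:

// Plausible sketch of the missing setup: serial queues plus the AAC output buffer.
- (instancetype)init
{
    if (self = [super init]) {
        _encoderQueue  = dispatch_queue_create("AACEncoder.encode",   DISPATCH_QUEUE_SERIAL);
        _callbackQueue = dispatch_queue_create("AACEncoder.callback", DISPATCH_QUEUE_SERIAL);
        _audioConverter = NULL;
        _pcmBuffer = NULL;
        _pcmBufferSize = 0;
        _aacBufferSize = 1024;
        _aacBuffer = malloc(_aacBufferSize * sizeof(uint8_t));
        memset(_aacBuffer, 0, _aacBufferSize);
    }
    return self;
}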

The input data callback

OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
    AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
    UInt32 requestedPackets = *ioNumberDataPackets;

    size_t copiedSamples = [encoder copyPCMSamplesIntoBuffer:ioData];
    if (copiedSamples < requestedPackets) {
        // The PCM buffer is not full yet
        *ioNumberDataPackets = 0;
        return -1;
    }
    *ioNumberDataPackets = 1;

    return noErr;
}

/**
 *  Copy the PCM samples into the converter's buffer
 */
- (size_t)copyPCMSamplesIntoBuffer:(AudioBufferList *)ioData {
    size_t originalBufferSize = _pcmBufferSize;
    if (!originalBufferSize) {
        return 0;
    }
    ioData->mBuffers[0].mData = _pcmBuffer;
    ioData->mBuffers[0].mDataByteSize = (int)_pcmBufferSize;
    _pcmBuffer = NULL;
    _pcmBufferSize = 0;
    return originalBufferSize;
}
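
The adtsDataForPacketLength: helper used in encodeSampleBuffer: is not shown above. A commonly used implementation builds the 7-byte ADTS header by hand; the sketch below assumes AAC-LC (profile 2), a 44.1 kHz sample rate (frequency index 4), and one channel, matching the converter configuration:

// Sketch: build a 7-byte ADTS header for an AAC packet of the given length.
- (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength
{
    const int adtsLength = 7;
    char *packet = malloc(adtsLength);
    int profile = 2;   // AAC LC
    int freqIdx = 4;   // 44.1 kHz
    int chanCfg = 1;   // mono
    NSUInteger fullLength = adtsLength + packetLength; // header + payload
    packet[0] = (char)0xFF;                                                        // syncword (high bits)
    packet[1] = (char)0xF9;                                                        // syncword (low bits), MPEG-2, no CRC
    packet[2] = (char)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
    packet[3] = (char)(((chanCfg & 3) << 6) + (fullLength >> 11));
    packet[4] = (char)((fullLength & 0x7FF) >> 3);
    packet[5] = (char)(((fullLength & 7) << 5) + 0x1F);
    packet[6] = (char)0xFC;
    return [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
}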


Finally, release the encoder where appropriate

- (void)dealloc {
    AudioConverterDispose(_audioConverter);
    free(_aacBuffer);
}


References

1.  http://www.jianshu.com/p/9febe519732a#comment-13802063

2.  http://www.jianshu.com/p/a671f5b17fc1

3.  http://blog.csdn.net/hard_man/article/details/53511026

4.  https://developer.apple.com/documentation/videotoolbox


Related articles

Real-Time Network Audio/Video Transmission System on iOS (1) - Preface

Real-Time Network Audio/Video Transmission System on iOS (2) - Capturing audio/video data

Real-Time Network Audio/Video Transmission System on iOS (3) - Encoding audio/video data to H264 and AAC with VideoToolbox

Real-Time Network Audio/Video Transmission System on iOS (4) - A custom socket protocol (TCP, UDP)

Real-Time Network Audio/Video Transmission System on iOS (5) - Hardware-decoding H264 with VideoToolbox

Real-Time Network Audio/Video Transmission System on iOS (6) - Playing audio with AudioQueue and rendering video with OpenGL