PCM Mixing
2016-01-13 12:17
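Mixing
PCM mixing works by adding two streams of samples together. The sum must stay within the range that the PCM bit width can represent. MixFrames is hard-coded to int16_t (see AudioFrame for details), so the mixing code in WebRTC does not support PCM audio in anything other than 16-bit.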
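The "add, then keep the result in range" idea can be sketched outside of WebRTC as follows. This is only a minimal illustration: MixSaturated is a hypothetical helper name, not a WebRTC function, and it assumes both inputs are 16-bit PCM with the same channel layout.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of WebRTC: mixes two 16-bit PCM buffers by
// adding samples in 32-bit arithmetic and saturating back to int16_t.
// Only the overlapping part of the two buffers is mixed.
std::vector<int16_t> MixSaturated(const std::vector<int16_t>& a,
                                  const std::vector<int16_t>& b) {
  const std::size_t n = std::min(a.size(), b.size());
  std::vector<int16_t> out(n);
  for (std::size_t i = 0; i < n; ++i) {
    // Widen before adding so the sum itself cannot overflow.
    const int32_t sum =
        static_cast<int32_t>(a[i]) + static_cast<int32_t>(b[i]);
    // Saturate to the int16_t range instead of letting the value wrap.
    const int32_t clipped =
        std::min<int32_t>(32767, std::max<int32_t>(-32768, sum));
    out[i] = static_cast<int16_t>(clipped);
  }
  return out;
}
```

Without the widening step, 30000 + 10000 would wrap around to a negative value when stored in an int16_t, which is heard as a loud click rather than as clipping.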
PCM operations, including mono to stereo, stereo to mono, muting, and volume adjustment.
Audio terminology
The mixing function in WebRTC lives in
webrtc/modules/audio_conference_mixer/source/audio_conference_mixer_impl.cc, and is the function shown below.
// Mix |frame| into |mixed_frame|, with saturation protection and upmixing.
// These effects are applied to |frame| itself prior to mixing. Assumes that
// |mixed_frame| always has at least as many channels as |frame|. Supports
// stereo at most.
//
// TODO(andrew): consider not modifying |frame| here.
void MixFrames(AudioFrame* mixed_frame, AudioFrame* frame, bool use_limiter) {
  assert(mixed_frame->num_channels_ >= frame->num_channels_);
  if (use_limiter) {
    // Divide by two to avoid saturation in the mixing.
    // This is only meaningful if the limiter will be used.
    *frame >>= 1;
  }
  if (mixed_frame->num_channels_ > frame->num_channels_) {
    // We only support mono-to-stereo.
    assert(mixed_frame->num_channels_ == 2 && frame->num_channels_ == 1);
    AudioFrameOperations::MonoToStereo(frame);
  }

  *mixed_frame += *frame;
}
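The "divide by two" step in MixFrames can be illustrated in isolation. The sketch below assumes that `*frame >>= 1` shifts every sample right by one bit; Halve and SumOfHalves are hypothetical names used only for this illustration.

```cpp
#include <cstdint>

// Hypothetical illustration of what `*frame >>= 1` does to one sample.
// Right-shifting a negative value is an arithmetic shift on mainstream
// compilers (and is guaranteed to be one since C++20).
inline int16_t Halve(int16_t sample) {
  return static_cast<int16_t>(sample >> 1);
}

// Sum of two halved samples, kept in 32 bits so the range can be inspected.
inline int32_t SumOfHalves(int16_t a, int16_t b) {
  return static_cast<int32_t>(Halve(a)) + static_cast<int32_t>(Halve(b));
}
```

With two inputs, the sum of the halved samples always fits in int16_t: two full-scale positive samples give 16383 + 16383 = 32766, and two full-scale negative samples give exactly -32768. With more than two frames accumulated into the same mixed frame the sum can still exceed the range, which is why the clamp in operator+= is still needed.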
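The last statement is where the mixing actually happens. It calls the overloaded += operator of AudioFrame, which adds the samples and keeps each sum within the int16_t range.
The operator is defined in:
webrtc/modules/interface/module_common_types.h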
inline AudioFrame& AudioFrame::operator+=(const AudioFrame& rhs) {
  ...
  if (speech_type_ != rhs.speech_type_)
    speech_type_ = kUndefined;

  if (noPrevData) {
    memcpy(data_, rhs.data_,
           sizeof(int16_t) * rhs.samples_per_channel_ * num_channels_);
  } else {
    // IMPROVEMENT this can be done very fast in assembly
    for (int i = 0; i < samples_per_channel_ * num_channels_; i++) {
      int32_t wrapGuard =
          static_cast<int32_t>(data_[i]) + static_cast<int32_t>(rhs.data_[i]);
      if (wrapGuard < -32768) {
        data_[i] = -32768;
      } else if (wrapGuard > 32767) {
        data_[i] = 32767;
      } else {
        data_[i] = (int16_t)wrapGuard;
      }
    }
  }
  energy_ = 0xffffffff;
  return *this;
}
The range check at the end can also be written with macros:
#define MIXER_MAX(x,y) ((x)>(y)? (x):(y))
#define MIXER_MIN(x,y) ((x)<(y)? (x):(y))
#define MIXER_CLIP3(a,b,x) (MIXER_MAX(a,MIXER_MIN(x,b))) /* clip x between a and b */
#define MIXER_CLIP(x) MIXER_CLIP3(-32768,32767,x)

for (int i = 0; i < samples_per_channel_ * num_channels_; i++) {
  int32_t wrapGuard =
      static_cast<int32_t>(data_[i]) + static_cast<int32_t>(rhs.data_[i]);
  data_[i] = (int16_t)MIXER_CLIP(wrapGuard);
}
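The Android source code writes the clamp as follows. Using shifts should be somewhat more efficient, but this is only my guess from theory that it beats the comparisons; I have not benchmarked the two.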
static inline int16_t clamp16(int32_t sample)
{
    if ((sample>>15) ^ (sample>>31))
        sample = 0x7FFF ^ (sample>>31);
    return sample;
}
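The shift trick works because a 32-bit value fits in int16_t exactly when bits 15 through 31 are all copies of the sign bit, in which case `sample>>15` and `sample>>31` are equal and their XOR is zero. Otherwise `0x7FFF ^ (sample>>31)` yields 32767 for a positive overflow and -32768 for a negative one. The sketch below checks this against a plain comparison-based clamp over every possible sum of two int16_t samples. ClampWithBranches and ClampsAgree are hypothetical names introduced for this check, and the code assumes arithmetic right shift of negative values.

```cpp
#include <cstdint>

// Shift-based clamp, as in the Android snippet above (with an explicit cast).
static inline int16_t clamp16(int32_t sample) {
  if ((sample >> 15) ^ (sample >> 31))
    sample = 0x7FFF ^ (sample >> 31);
  return static_cast<int16_t>(sample);
}

// Hypothetical reference implementation using ordinary comparisons.
static inline int16_t ClampWithBranches(int32_t sample) {
  if (sample < -32768) return -32768;
  if (sample > 32767) return 32767;
  return static_cast<int16_t>(sample);
}

// Returns true if both clamps agree for every value that the sum of two
// int16_t samples can take, i.e. the range [-65536, 65534].
static inline bool ClampsAgree() {
  for (int32_t s = -65536; s <= 65534; ++s) {
    if (clamp16(s) != ClampWithBranches(s)) return false;
  }
  return true;
}
```

This only confirms that the two versions produce the same results; it says nothing about which is faster, which would need a benchmark on the target compiler and CPU.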