PCM Mixing

2016-01-13 12:17

Mixing

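PCM mixing works by adding two streams together sample by sample. The sum must not exceed the range that the PCM bit width can represent. MixFrames is hard-coded to the int16_t sample type (see AudioFrame for details), so mixing inside WebRTC does not support PCM audio at any bit depth other than 16 bits.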
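As a minimal sketch of the principle described above, the following adds two 16-bit buffers and keeps each sum within the int16_t range. The name MixSaturated and the use of std::vector are assumptions made for illustration; this is not the WebRTC code, which is quoted further down.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a WebRTC API): mix two equal-length 16-bit PCM
// buffers by adding them sample by sample and saturating to int16_t.
std::vector<int16_t> MixSaturated(const std::vector<int16_t>& a,
                                  const std::vector<int16_t>& b) {
  assert(a.size() == b.size());
  std::vector<int16_t> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Widen to 32 bits so the addition itself cannot overflow.
    int32_t sum = static_cast<int32_t>(a[i]) + static_cast<int32_t>(b[i]);
    if (sum > INT16_MAX) sum = INT16_MAX;  // clip positive overflow
    if (sum < INT16_MIN) sum = INT16_MIN;  // clip negative overflow
    out[i] = static_cast<int16_t>(sum);
  }
  return out;
}
```

Without the 32-bit intermediate, 30000 + 10000 would wrap around to a negative value instead of clipping at 32767, which is heard as a loud click.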

PCM operations include mono-to-stereo conversion, stereo-to-mono conversion, muting, and volume adjustment.

Audio terminology

The mixing function in WebRTC lives in webrtc/modules/audio_conference_mixer/source/audio_conference_mixer_impl.cc. It is the function below.

// Mix |frame| into |mixed_frame|, with saturation protection and upmixing.
// These effects are applied to |frame| itself prior to mixing. Assumes that
// |mixed_frame| always has at least as many channels as |frame|. Supports
// stereo at most.
//
// TODO(andrew): consider not modifying |frame| here.
void MixFrames(AudioFrame* mixed_frame, AudioFrame* frame, bool use_limiter) {
  assert(mixed_frame->num_channels_ >= frame->num_channels_);
  if (use_limiter) {
    // Divide by two to avoid saturation in the mixing.
    // This is only meaningful if the limiter will be used.
    *frame >>= 1;
  }
  if (mixed_frame->num_channels_ > frame->num_channels_) {
    // We only support mono-to-stereo.
    assert(mixed_frame->num_channels_ == 2 &&
           frame->num_channels_ == 1);
    AudioFrameOperations::MonoToStereo(frame);
  }

  *mixed_frame += *frame;
}
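The two preparation steps that MixFrames applies to a frame before adding it (halving and mono-to-stereo upmixing) can be sketched on plain buffers as follows. All three function names are hypothetical, and the interleaved left/right sample layout is an assumption. The sketch divides by 2 where the source uses >>= 1, which differs only in how odd negative samples are rounded.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the pre-mix steps; not the WebRTC implementations.

// Halve every sample to leave headroom before the frames are summed.
void HalveSamples(std::vector<int16_t>& samples) {
  for (int16_t& s : samples) s = static_cast<int16_t>(s / 2);
}

// Duplicate each mono sample into an interleaved left/right pair
// (interleaved layout is an assumption of this sketch).
std::vector<int16_t> UpmixMonoToStereo(const std::vector<int16_t>& mono) {
  std::vector<int16_t> stereo;
  stereo.reserve(mono.size() * 2);
  for (int16_t s : mono) {
    stereo.push_back(s);  // left
    stereo.push_back(s);  // right
  }
  return stereo;
}

// Apply the same sequence of steps as MixFrames, up to (not including) the sum.
std::vector<int16_t> PrepareForMix(std::vector<int16_t> samples, int channels,
                                   int mixed_channels, bool use_limiter) {
  assert(mixed_channels >= channels);
  if (use_limiter) HalveSamples(samples);
  if (mixed_channels > channels) {
    // Only mono-to-stereo is supported, as in the original.
    assert(mixed_channels == 2 && channels == 1);
    samples = UpmixMonoToStereo(samples);
  }
  return samples;
}
```

For example, a mono frame {100, -200} prepared for a stereo mix with the limiter enabled becomes {50, 50, -100, -100}.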

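The last statement is the heart of the mixing. It calls AudioFrame's overloaded += operator, which performs the operation shown below: it adds the samples and keeps each sum within the int16_t range. The operator is defined in webrtc/modules/interface/module_common_types.h: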

inline AudioFrame& AudioFrame::operator+=(const AudioFrame& rhs) {
  ...

  if (speech_type_ != rhs.speech_type_) speech_type_ = kUndefined;

  if (noPrevData) {
    memcpy(data_, rhs.data_,
           sizeof(int16_t) * rhs.samples_per_channel_ * num_channels_);
  } else {
    // IMPROVEMENT this can be done very fast in assembly
    for (int i = 0; i < samples_per_channel_ * num_channels_; i++) {
      int32_t wrapGuard =
          static_cast<int32_t>(data_[i]) + static_cast<int32_t>(rhs.data_[i]);
      if (wrapGuard < -32768) {
        data_[i] = -32768;
      } else if (wrapGuard > 32767) {
        data_[i] = 32767;
      } else {
        data_[i] = (int16_t)wrapGuard;
      }
    }
  }
  energy_ = 0xffffffff;
  return *this;
}

The final range check can also be written with macros:

#define MIXER_MAX(x,y) ((x)>(y)? (x):(y))
#define MIXER_MIN(x,y) ((x)<(y)? (x):(y))
#define MIXER_CLIP3(a,b,x) (MIXER_MAX(a,MIXER_MIN(x,b)))  /* clip x between a and b */
#define MIXER_CLIP(x)  MIXER_CLIP3(-32768,32767,x)

for (int i = 0; i < samples_per_channel_ * num_channels_; i++) {
  int32_t wrapGuard =
      static_cast<int32_t>(data_[i]) + static_cast<int32_t>(rhs.data_[i]);
  data_[i] = (int16_t)MIXER_CLIP(wrapGuard);
}

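The Android source code writes it as follows, using shifts. I expect this to be faster than the comparisons, but that is only a guess based on theory; I have not benchmarked the two.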

static inline int16_t clamp16(int32_t sample)
{
    if ((sample>>15) ^ (sample>>31))
        sample = 0x7FFF ^ (sample>>31);
    return sample;
}
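To see why the shift version works: sample>>31 is 0 for non-negative values and -1 for negative ones, while sample>>15 equals that same value exactly when the sample fits in 16 bits. The XOR is therefore nonzero only on overflow, and 0x7FFF ^ 0 gives 32767 while 0x7FFF ^ -1 gives -32768. The sketch below checks this against a comparison-based clamp. ClampByCompare and ClampsAgree are hypothetical names, and the code assumes >> on a negative int32_t is an arithmetic shift, which is what mainstream compilers do.

```cpp
#include <cassert>
#include <cstdint>

// Shift-based clamp, following the Android snippet quoted above.
// Assumes >> on a negative value is an arithmetic (sign-extending) shift.
static inline int16_t clamp16(int32_t sample) {
  if ((sample >> 15) ^ (sample >> 31))
    sample = 0x7FFF ^ (sample >> 31);
  return static_cast<int16_t>(sample);
}

// Comparison-based clamp, used here only as a reference.
static inline int16_t ClampByCompare(int32_t sample) {
  if (sample < -32768) return -32768;
  if (sample > 32767) return 32767;
  return static_cast<int16_t>(sample);
}

// Hypothetical check: the two clamps agree for every value that the sum of
// two int16_t samples can take, that is -65536 through 65534.
bool ClampsAgree() {
  for (int32_t s = -65536; s <= 65534; ++s) {
    if (clamp16(s) != ClampByCompare(s)) return false;
  }
  return true;
}
```

This only shows that the two forms produce the same results; it says nothing about which one is faster on a given compiler and CPU.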
