
Using SWIG on CentOS 7 to expose a C++ interface to Python for speaker recognition

2017-07-28 10:40
Reference links:

http://blog.csdn.net/freewebsys/article/details/47259413

https://www.zhihu.com/question/23003213

http://www.swig.org/papers/PyTutorial98/PyTutorial98.pdf (more detailed SWIG material)

https://stackoverflow.com/questions/27149849/python-importerror-undefined-symbol-g-utf8-skip

https://stackoverflow.com/questions/9098980/swig-error-undefined-symbol

http://blog.csdn.net/stpeace/article/details/51416297



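1. Motivation

While building a speaker-recognition service, the script reloaded the classifier on every recognition request. Loading is very slow, so a single recognition took about 7 s. I saw two options:

1. Use a daemon to preload the classifier and keep everything else as scripts, since loading is the main cost. The web interface would call the daemon to run recognition and return the result. This seems to need fewer changes, but the many intermediate files it generates make it messy, so it is not well suited to production.

2. Reassemble the recognition pipeline and wrap it as an interface that Python can call, then rely on an existing web service to handle concurrent clients. The drawback is that rewriting the pipeline is a lot of work, takes time and involves many details, but it should solve the problem once and for all.

I settled on the second option for now. What follows records my tests of exposing a C++ interface to Python through SWIG, with two questions in mind: how to use it, and whether the classifier stays in memory after it is loaded, so that it does not have to be reloaded every time.

...This took nearly two weeks: one week to reassemble the recognition code and one week to turn it into a Python interface. It runs, but there is still plenty to optimize. It all feels very hard before you know how, and very simple once you do...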

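2. Environment

Linux, CentOS 7

3. Installing SWIG

Official site: http://www.swig.org/download.html

Latest version at the time of writing: 3.0.12

yum install swig

This installs swig-2.0.10-5.el7.x86_64.rpm.

To install the latest version instead, first remove the packaged one: yum remove swig

After downloading the source package: tar -zxvf swig-3.0.12.tar.gz

cd swig-3.0.12

./configure

make -j 8

make install

That completes the installation.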

4. Writing the C test source

4.1 Writing the extension by hand, without SWIG



#include <Python.h>

int great_function(int a) {
    return a + 1;
}

static PyObject * _great_function(PyObject *self, PyObject *args)
{
    int _a;
    int res;

    if (!PyArg_ParseTuple(args, "i", &_a))
        return NULL;
    res = great_function(_a);
    return PyLong_FromLong(res);
}

static PyMethodDef GreateModuleMethods[] = {
    {
        "great_function",
        _great_function,
        METH_VARARGS,
        ""
    },
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initgreat_module(void) {
    (void) Py_InitModule("great_module", GreateModuleMethods);
}


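The wrapper function _great_function. It converts the Python arguments into C arguments (PyArg_ParseTuple), calls the actual great_function, handles its return value, and returns the result to the Python environment.
The export table GreateModuleMethods. It tells Python which functions in this module can be called from Python. The table can have any name. Each entry has four fields: the first is the function name exposed to Python; the second is _great_function, the wrapper function; the third, METH_VARARGS, means the arguments arrive as a tuple of positional arguments; the fourth is a descriptive string. The table always ends with {NULL, NULL, 0, NULL}.
The export function initgreat_module. Its name is not arbitrary: it is the module name prefixed with init (the Python 2 convention). The export function ties the module name to the export table.

Compile directly (the last path is a directory, so it is passed with -L rather than -l):

gcc -fPIC -shared great_module.c -o great_module.so -I/root/anaconda2/include/python2.7/ -L/root/anaconda2/lib/python2.7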

4.2 Using SWIG

Reference: http://www.swig.org/translations/chinese/tutorial.html

swig_example.c

#include <time.h>
double My_variable = 3.0;

int fact(int n) {
    if (n <= 1) return 1;
    else return n*fact(n-1);
}

int my_mod(int x, int y) {
    return (x%y);
}

char *get_time()
{
    time_t ltime;
    time(&ltime);
    return ctime(&ltime);
}

swig_example.i (interface file)

/* example.i */
/* module name */
%module example
%{
/* Put header files here or function declarations like below */
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
%}

/* declarations to wrap */
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
Generate the Python wrapper and build the shared library:

swig -python swig_example.i

gcc -c -fPIC swig_example.c swig_example_wrap.c -I/root/anaconda2/include/python2.7/

ld -shared swig_example.o swig_example_wrap.o -o _example.so

Test: change to the directory containing the generated files and start Python in a terminal:

[root@hadoop-0 cpython]# python

Python 2.7.5 (default, Nov 20 2015, 02:00:19)

[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> import example

>>> example.fact(5)

120

>>> example.my_mod(10,30)

10

>>> example.get_time()

'Fri Jul 28 11:43:30 2017\n'

>>> example.cvar.My_variable

3.0


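Global variables have to be accessed this way; cvar is the default object that holds the C globals.

To change the name cvar, replace the command swig -python swig_example.i with swig -python -globals myvar swig_example.i

So far it looks as though extending the speaker-recognition process into a Python interface should be feasible.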
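The interactive checks above can also be collected into a small script. The following is a sketch under assumptions: `check_example` is a hypothetical helper, it only compares the wrapped functions with pure-Python references for non-negative inputs (C and Python disagree on `%` for negative operands), and it expects a module object shaped like the `example` module built above.

```python
import math


def check_example(mod):
    """Return True if the wrapped functions agree with pure-Python references."""
    # fact(n) should match the factorial for small positive n.
    ok = all(mod.fact(n) == math.factorial(n) for n in range(1, 8))
    # my_mod(x, y) should match x % y for non-negative x and positive y.
    ok = ok and all(
        mod.my_mod(x, y) == x % y for x in range(0, 20) for y in range(1, 7)
    )
    # ctime() output always ends with a newline.
    ok = ok and mod.get_time().endswith("\n")
    # The C global is reached through the cvar object.
    ok = ok and mod.cvar.My_variable == 3.0
    return ok
```

With the module built as above, running `import example` followed by `check_example(example)` in the same directory performs all four checks in one call.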

5. Writing the C++ test source

//---example.h
#include <iostream>
using namespace std;
class Example{
public:
    void say_hello();
};
//---example.cpp
#include "example.h"

void Example::say_hello(){
    cout<<"hello"<<endl;
}
//---example.i
%module example
%{
#include "example.h"
%}
%include "example.h"
#---setup.py
#!/usr/bin/env python

"""
setup.py file for SWIG C++/Python example
"""
from distutils.core import setup, Extension
example_module = Extension('_example',
                           sources=['example.cpp', 'example_wrap.cxx',],
                           )
setup (name = 'example',
       version = '0.1',
       author = "www.99fang.com",
       description = """Simple swig C++/Python example""",
       ext_modules = [example_module],
       py_modules = ["example"],
       )


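Run:

swig -c++ -python example.i

python setup.py build_ext --inplace

I got stuck here for a while. I had the default Python 2.7, anaconda2 and anaconda3 installed, and the header and shared-library paths of the default Python 2.7 were misconfigured, so using it directly failed because Python.h could not be found. I added anaconda2 and anaconda3 to the environment: PATH=$PATH:/root/anaconda2/bin:/root/anaconda3/bin, then ran python3 setup.py build_ext --inplace, which builds against the Python 3 headers and libraries.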
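When several interpreters are installed, it helps to ask the interpreter itself where its headers are before building. The snippet below is a small sketch using only the standard `sys` and `sysconfig` modules; the helper name `python_build_info` is made up for illustration.

```python
import os
import sys
import sysconfig


def python_build_info():
    """Report which interpreter is running and where its C headers live."""
    include_dir = sysconfig.get_paths()["include"]
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "include_dir": include_dir,
        # False here usually means the development headers are missing.
        "has_python_h": os.path.isfile(os.path.join(include_dir, "Python.h")),
    }


if __name__ == "__main__":
    for key, value in sorted(python_build_info().items()):
        print("%s: %s" % (key, value))
```

Running it with each candidate (python, python3, and so on) shows which one can actually build extensions, and `include_dir` is the path to pass to the compiler with -I in a manual build.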

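6. Extending the speaker-recognition interface

First rewrite the scripts as C++ files; once the C++ code runs correctly, the extension work can begin. Since this post mainly records turning the C++ interface into a Python extension, the rewriting itself is not covered in detail.

For the relevant Linux gcc compile options, see: http://blog.csdn.net/wuxianfeng1987/article/details/76528254

If you do not understand what the options mean, you will have little to go on when something breaks.

File layout: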

/* recognition.i */

%module recognition

%{
#define SWIG_FILE_WITH_INIT
#include "recognition.h"
%}

int init();
int recognation();


/* recognition.h */

#include "base/kaldi-common.h"
#include "util/common-utils.h"
#include "util/kaldi-thread.h"
#include "feat/feature-mfcc.h"
#include "feat/wave-reader.h"
#include "matrix/kaldi-matrix.h"
#include "ivector/voice-activity-detection.h"
#include "ivector/ivector-extractor.h"
#include "ivector/plda.h"
#include "gmm/full-gmm.h"
#include "gmm/diag-gmm.h"
#include "gmm/mle-full-gmm.h"
#include "gmm/am-diag-gmm.h"
#include "hmm/transition-model.h"
#include "hmm/posterior.h"
int init();
int recognation();

/* recognition.cc */

#include <iostream>
using namespace std;
using namespace kaldi;

// global var
IvectorExtractor extractor;
FullGmm fgmm;
DiagGmm gmm;
Plda plda;   ...


What follows is the whole troubleshooting process, starting from a single simple function and gradually adding functions until everything ran.

Building with distutils:

swig -c++ -python recognition.i

python3.6 setup.py build_ext --inplace

I have not studied distutils closely, and modifying its build commands might cause problems, so I tested with a manual build.

Manual build:

swig -c++ -python recognition.i

gcc -O2 -fPIC -c recognition.cc

gcc -O2 -fPIC -c recognition_wrap.cxx -I.. -I/root/anaconda3/include/python3.6m/

gcc -shared recognition.o recognition_wrap.o -o _recognition.so

error:---------------------------------------------------------------------------------

import recognition

Traceback (most recent call last):
  File "/usr/wxf/kaldi/src/featbin/recognition.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/root/anaconda3/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 648, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 560, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 922, in create_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: __gxx_personality_v0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/wxf/kaldi/src/featbin/recognition.py", line 17, in <module>
    _recognition = swig_import_helper()
  File "/usr/wxf/kaldi/src/featbin/recognition.py", line 16, in swig_import_helper
    return importlib.import_module('_recognition')
  File "/root/anaconda3/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: __gxx_personality_v0

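Analysis-------------------------------------------------------------------------

These are C++ sources, so they must be compiled and linked with g++. Linking with gcc leaves out the C++ runtime library, which is where __gxx_personality_v0 is defined. The commands become:

swig -c++ -python recognition.i

g++ -O2 -fPIC -c recognition.cc

g++ -O2 -fPIC -c recognition_wrap.cxx -I.. -I/root/anaconda3/include/python3.6m/

g++ -shared recognition.o recognition_wrap.o -o _recognition.so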

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Add a header and test: #include "base/kaldi-common.h"

error:-------------------------------------------------------------------

[root@hadoop-0 featbin]# g++ -O2 -fPIC -c recognition.cc

In file included from recognition.cc:2:0:

../base/kaldi-common.h:34:30: fatal error: base/kaldi-utils.h: No such file or directory

#include "base/kaldi-utils.h"

Analysis---------------------------------------------------------------------

Add the parent directory to the include path with -I..:

g++ -O2 -fPIC -c recognition.cc -I..

error:------------------------------------------------------------------

In file included from ../base/kaldi-error.h:32:0,

from ../base/kaldi-common.h:35,

from recognition.cc:2:

../base/kaldi-types.h:44:23: fatal error: fst/types.h: No such file or directory

#include <fst/types.h>

Add the OpenFst include directory as well:

g++ -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include

error: #error This file requires compiler and library support for the ISO C++ 2011 standard.

This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

swig -c++ -python recognition.i

g++ -std=c++11 -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include

g++ -O2 -fPIC -c recognition_wrap.cxx -I/root/anaconda3/include/python3.6m/

g++ -shared recognition.o recognition_wrap.o -o _recognition.so

Add a KALDI_LOG call and test: it runs normally for now.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Add the header #include "util/common-utils.h" and test.

error:------------------------------------------------------------

#error "You need to define (using the preprocessor) either HAVE_CLAPACK or HAVE_ATLAS or HAVE_MKL (but not more than one)"

Analysis--------------------------------------------------------------

Define HAVE_ATLAS on the command line:

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include

error:----------------------------------------------------------

cblas.h: No such file or directory

#include <cblas.h>

Analysis--------------------------------------------------------------

Add the ATLAS include directory:

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Add the remaining headers:

#include "base/kaldi-common.h"

#include "util/common-utils.h"

#include "util/kaldi-thread.h"

#include "feat/feature-mfcc.h"

#include "feat/wave-reader.h"

#include "matrix/kaldi-matrix.h"

#include "ivector/voice-activity-detection.h"

#include "ivector/ivector-extractor.h"

#include "ivector/plda.h"

#include "gmm/full-gmm.h"

#include "gmm/diag-gmm.h"

#include "gmm/mle-full-gmm.h"

#include "gmm/am-diag-gmm.h"

#include "hmm/transition-model.h"

#include "hmm/posterior.h"

#include <iostream>

using namespace std;

using namespace kaldi;

This compiles without problems.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Add a function:

int LoadIvector(std::string ivector_extractor_rxfilename, IvectorExtractor &extractor) {
    try {
        ReadKaldiObject(ivector_extractor_rxfilename, &extractor);
    } catch (const std::exception &e) {
        std::cerr << e.what();
        return -1;
    }
    return 0;
}

Test:

error:----------------------------------------------------------

Importing the module in Python (import recognition) fails:

ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: _ZN5kaldi5InputC1ERKSsPb

ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: _ZN5kaldi7DiagGmm15CopyFromFullGmmERKNS_7FullGmmE

Analysis--------------------------------------------------------------

Compilation succeeds, but the import reports undefined symbols: the libraries that define them were not linked in.
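Python only reports one missing symbol per import attempt, so it can be quicker to list every undefined symbol in the shared object at once. The following is a sketch, assuming GNU binutils is installed and that `nm -D --undefined-only` prints one `U <symbol>` line per undefined symbol; the helper names are hypothetical.

```python
import subprocess


def undefined_symbols(nm_output):
    """Extract symbol names from lines of the form '    U <symbol>'."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Undefined entries have no address column, only the type and the name.
        if len(parts) == 2 and parts[0] == "U":
            symbols.append(parts[1])
    return symbols


def undefined_symbols_in(path):
    """Run nm on a shared object (requires binutils) and parse its output."""
    output = subprocess.check_output(["nm", "-D", "--undefined-only", path])
    return undefined_symbols(output.decode("utf-8", "replace"))
```

For example, `undefined_symbols_in("_recognition.so")` lists the names that must be provided at load time. Names that start with `_ZN5kaldi` belong to Kaldi, and finding them in this list means the corresponding Kaldi library was not linked into the module.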

·> If shared libraries are used, add: -rdynamic -ldl

·> Add the static libraries:

../hmm/kaldi-hmm.a

../ivector/kaldi-ivector.a

../feat/kaldi-feat.a

../transform/kaldi-transform.a

../gmm/kaldi-gmm.a

../tree/kaldi-tree.a

../util/kaldi-util.a

../matrix/kaldi-matrix.a

../base/kaldi-base.a

Change the compile command (-c: compile only, do not link):

g++ -std=c++11 -rdynamic -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

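The problem persists.

It did not seem likely that the cause was a missing library at the compile stage, but I tried anyway, this time with -o:

g++ -std=c++11 -rdynamic -DHAVE_ATLAS -O2 -fPIC -o recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

g++: warning: ../hmm/kaldi-hmm.a: linker input file unused because linking not done

The warning shows that the libraries are not linked while the object files are being produced.

That leaves linking the static libraries in the last step, when the shared library is generated. I compared builds against several Python versions and found that only 2.7 worked.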

python3.6m-------------

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda3/include/python3.6m/

g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda3/lib -lpython3.6m -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

python3---------------

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda3/include/python3.6m/

g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda3/lib -lpython3 -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

python2.7-------------------

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda2/include/python2.7/

g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda2/lib -lpython2.7 -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

The same problem remains.

Further reading turned up the cause: static libraries that are linked into a shared library must themselves be compiled with -fPIC. In other words, the static libraries produced when I installed Kaldi were unsuitable, and Kaldi has to be configured with -fPIC. After rebuilding Kaldi that way, everything worked.
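Long link lines like the ones above are easy to get wrong by hand, so one option is to assemble the final link step as a list of arguments. This is only a sketch: the helper `link_command` is hypothetical, the library order is copied from the commands above, and it assumes the Kaldi static libraries have already been rebuilt with -fPIC.

```python
# Static libraries in the order used above: each library comes before
# the libraries it depends on, with kaldi-base.a last.
KALDI_STATIC_LIBS = [
    "../hmm/kaldi-hmm.a",
    "../ivector/kaldi-ivector.a",
    "../feat/kaldi-feat.a",
    "../transform/kaldi-transform.a",
    "../gmm/kaldi-gmm.a",
    "../tree/kaldi-tree.a",
    "../util/kaldi-util.a",
    "../matrix/kaldi-matrix.a",
    "../base/kaldi-base.a",
]


def link_command(python_lib_dir, python_lib, output="_recognition.so"):
    """Build the argument list for the final shared-library link step."""
    return (
        ["g++", "-shared", "-fPIC", "recognition.o", "recognition_wrap.o",
         "-o", output, "-L" + python_lib_dir, "-l" + python_lib, "-rdynamic"]
        + KALDI_STATIC_LIBS
        + ["-ldl"]
    )


if __name__ == "__main__":
    print(" ".join(link_command("/root/anaconda2/lib", "python2.7")))
```

The resulting list can be passed to `subprocess.check_call` unchanged, and because every option is its own list element, options cannot accidentally run together.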

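7. Changing the input parameters of the interface API

Earlier, to reduce the number of variables while debugging, the parameters were hard-coded. Now the relevant parameters are added.

It helps to first check how SWIG supports the various C++ types.

What I mainly pass in are the model paths and the model configuration parameters, using std::string, int, double and so on.

When I simply changed the interface to:

int init(std::string ivector_extractor_rxfilename,
         std::string fgmm_rxfilename,
         std::string plda_rxfilename);

and compiled, calling it from Python failed with: TypeError: in method 'init', argument 1 of type 'std::string'. The fix is to add %include "std_string.i" to the *.i file.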
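Once std::string arguments are accepted, the Python side can pass ordinary strings. Below is a sketch of a thin wrapper around the generated module. It is hypothetical in several respects: the helper name `init_models` is made up, and it assumes that `init` follows the same convention as `LoadIvector` above (0 on success, non-zero on failure), which this post does not state.

```python
def init_models(rec, ivector_extractor, fgmm, plda):
    """Call rec.init() with three model paths and raise if it reports failure.

    `rec` is the imported SWIG module (or any object with a compatible init).
    """
    status = rec.init(ivector_extractor, fgmm, plda)
    if status != 0:
        # Assumed convention: non-zero means a model failed to load.
        raise RuntimeError("init failed with status %d" % status)
    return status
```

Usage would look like `import recognition` followed by `init_models(recognition, ...)` with the three model paths, called once when the service starts so that later recognition calls reuse the loaded models.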