Python: preprocessing text to normalize whitespace inside a string
2015-10-23 12:55
#!/usr/bin/python
import re


def pre_process_msg(msgIn):
    """Trim msgIn and collapse internal whitespace runs into single spaces."""
    if msgIn == "":
        # Empty input: return the error marker string instead of raising.
        return "msgIn_Input_Error,should'nt Null, it is Strings"
    # 1. Trim leading and trailing whitespace.
    msg = msgIn.strip()
    # 2. Replace newlines, carriage returns, and tabs with a space.
    msg = re.sub(r"[\n\r\t]", " ", msg)
    # 3. Collapse runs of spaces into a single space, then trim again.
    msg = re.sub(" {2,}", " ", msg).strip()
    # Print the result in quotes so its boundaries are visible.
    print("'" + msg + "'")
    return msg
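
The three steps above (trim, replace special characters, collapse spaces) can also be done in one pass. The following is only a minimal sketch, not the original author's code: the name `normalize_whitespace` is hypothetical, and unlike `pre_process_msg` it returns an empty string for empty input instead of an error marker.

```python
import re

# Hypothetical helper, not part of the original post.
# Matches any run of spaces, tabs, carriage returns, or newlines.
_WS_RUN = re.compile(r"[ \t\r\n]+")


def normalize_whitespace(text):
    """Collapse runs of spaces, tabs, CR, and LF into one space, then trim."""
    return _WS_RUN.sub(" ", text).strip()


if __name__ == "__main__":
    sample = "  hello\t\tworld\r\nfoo   bar  "
    # The quotes from repr() make the string boundaries visible.
    print(repr(normalize_whitespace(sample)))  # 'hello world foo bar'
```

A single character class handles all four characters at once, so the separate substitutions and the intermediate `result` variable are not needed.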