Strip HTML tags using Python

2013-01-15 16:29

We often need to strip HTML tags from a string (or from HTML source). I usually do it with a simple regular expression in Python. Here is my function to strip HTML tags:

import re

def remove_html_tags(data):
    # Non-greedy match: everything from '<' up to the next '>'
    p = re.compile(r'<.*?>')
    return p.sub('', data)
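
The regular expression above is fine for simple markup, but it has limits worth knowing: by default `.` does not match a newline, so a tag that spans several lines is left in place; the text inside <script> and <style> elements is kept; and entities such as &amp; are not decoded. As a rough alternative (not the approach used in this post), here is a minimal sketch built on the standard library's html.parser module, assuming Python 3. The names TagStripper and strip_tags_with_parser are made up for illustration.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps text nodes, drops tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into text
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collect text only when we are not inside a skipped element
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags_with_parser(html):
    """Return the text content of an HTML string, without the tags."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()  # flush any text still buffered by the parser
    return "".join(stripper.parts)


print(strip_tags_with_parser('<p>Hello <b>world</b></p>'))
```

For quick clean-up of short, well-formed snippets the regular expression is usually enough; a parser is the safer choice when the input is a full page or comes from an unknown source.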

Here is another function that collapses each run of consecutive whitespace characters into a single space:

def remove_extra_spaces(data):
    # \s+ matches one or more spaces, tabs, or newlines
    p = re.compile(r'\s+')
    return p.sub(' ', data)

Note that the re module must be imported (import re) before either function can use regular expressions.
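
To show how the two functions fit together, here is a small usage sketch. The wrapper html_to_text and the sample string are assumptions added for illustration; the two functions behave the same as the ones above, written with re.sub for brevity.

```python
import re


def remove_html_tags(data):
    # Drop everything from '<' up to the next '>'
    return re.sub(r'<.*?>', '', data)


def remove_extra_spaces(data):
    # Collapse each run of whitespace into one space
    return re.sub(r'\s+', ' ', data)


def html_to_text(html):
    # Hypothetical wrapper: strip tags first, then tidy the whitespace
    return remove_extra_spaces(remove_html_tags(html)).strip()


sample = "<p>Hello,   <b>world</b>!</p>\n\n<p>Bye</p>"
print(html_to_text(sample))  # Hello, world! Bye
```

Stripping the tags first matters: removing them often leaves extra blank lines and spaces behind, which the second function then cleans up.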