A regular expression for matching HTML tag attributes

2015-06-07 21:56, 585 views
Goal:

1. I want to remove every HTML attribute except class, src, and href.

For example:

1)

<a href="http://51js.com" title="这是标题" class="a">标题</a>

After removing attributes:

<a href="http://51js.com" class="a">标题</a>

2)

<td style="color:red" class="b" rospan="3" colspan="5"> </td>

After removing attributes:

<td class="b" rospan="3" colspan="5"> </td>

I'm looking for a regular expression that does this. Thanks.

 

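OP, I just saw this. In your first example you "want to remove everything except class, src, and href", and in your second you want to remove everything except class, rospan, and colspan. Taking 1 and 2 together, what you actually mean is: remove title, style, and the like. From that I infer your real use case: most tag attributes need to be kept, and only a few need to be removed. Since there are so many possible tag attributes, a pattern listing every name to keep is bound to be very long, so why not enumerate the few to remove instead? Such as: (?=title|style)\b[^\s]+=["']?[^"']*["']?(?=\s|>). Here the lookahead (?=title|style) anchors the match at an attribute name beginning with title or style, [^\s]+= consumes the name and the equals sign, ["']?[^"']*["']? consumes a quoted or unquoted value, and (?=\s|>) requires whitespace or the end of the tag to follow. Below I build up and refine this regex step by step.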

// Sample markup; each trailing backslash continues the string literal on the next line.
var str = '\
<a href="http://51js.com" title=这是标题 class="a">标题</a>\
<td style=color:red class="b" rospan="3" colspan="5"></td>\
<td style="border-right: #d4d0c8; padding-right: 0.75pt; border-top: #d4d0c8;\
padding-left: 0.75pt; padding-bottom: 0cm; border-left: windowtext 0.5pt solid;\
width: 62pt; padding-top: 0.75pt; border-bottom: black 0.5pt solid; height: 18.75pt;\
background-color: transparent" width=83 rowspan="4">\
<a href="fdsafd" class="ddd" rowspan="fdsd">\
';
// Remove every title=... and style=... attribute, quoted or unquoted.
str = str.replace(/(?=title|style)\b[^\s]+=["']?[^"']*["']?(?=\s|>)/gi, '');
alert(str)
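
The replacement above can be wrapped in a small reusable helper. The following is a minimal sketch, not part of the original reply: the function name removeAttributes and its parameters are hypothetical, and it simply builds the same pattern from a list of attribute names. Like the original, it matches any attribute whose name merely starts with one of the listed words, and it leaves the whitespace around the removed attribute in place.

```javascript
// Hypothetical helper (not from the original thread): builds the reply's
// pattern from a list of attribute-name prefixes and strips matching attributes.
function removeAttributes(html, names) {
  // Escape regex metacharacters so each name is matched literally.
  var escaped = names.map(function (name) {
    return name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  });
  // Same shape as the reply's regex: (?=a|b)\b[^\s]+=["']?[^"']*["']?(?=\s|>)
  var pattern = new RegExp(
    '(?=' + escaped.join('|') + ')\\b[^\\s]+=["\']?[^"\']*["\']?(?=\\s|>)',
    'gi'
  );
  return html.replace(pattern, '');
}

var input = '<a href="http://51js.com" title="这是标题" class="a">标题</a>';
var output = removeAttributes(input, ['title', 'style']);
console.log(output); // <a href="http://51js.com"  class="a">标题</a>
```

Note the two spaces left between href and class in the result: only the attribute itself is removed, not the whitespace that preceded it.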

 


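Next, a real-world test of the attribute filter on a fragment of the parent page's source. To illustrate the point, the pattern above is extended to also strip the class and alt attributes.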

<textarea id="txt" style="width:500px;height:500px">
<div class="maintable"><br><div class="subtable nav" style="width:100%">
<span id="forumlist" onmouseover="showMenu(this.id)"><a href="index.php">无忧脚本</a></span>
» <a href="forumdisplay.php?fid=1">JavaScript & VBScript & DHTML 脚本技术讨论版</a> » 求以匹配获取HTML标签属性的正则 表达式</div><br></div>
<div class="maintable">
<table width="100%" cellspacing="0" cellpadding="0" align="center" style="clear: both;">
<tr><td valign="bottom">
<div style="margin-bottom: 4px">
<a href="redirect.php?fid=1&tid=88672&goto=nextoldset" style="font-weight: normal"> ‹‹ 上一主题</a> | <a href="redirect.php?fid=1&tid=88672&goto=nextnewset" style="font-weight: normal">下一主题 ››</a><br>
</div>
</td><td width="40%" align="right" valign="bottom">
<div class="right"> <a href="post.php?action=reply&fid=1&tid=88672&extra="><img src="images/default/reply.gif" border="0" alt="" /></a></div>
<div id="newspecialheader" class="right" onmouseover="showMenu(this.id)"><a 
href="post.php?action=newthread&fid=1&extra="
><img src="images/default/newtopic.gif" border="0" alt="" /></a><a href="###"><img src="images/default/newspecial.gif" border="0" alt="" /></a></div>
<div class="popupmenu_popup newspecialmenu" id="newspecialheader_menu" style="display: none">
<table cellpadding="4" cellspacing="0" border="0" width="100%">
<tr><td class="popupmenu_option"><div class="newspecial"><a href="post.php?action=newthread&fid=1&extra=&poll=yes">投票</a></div></td></tr>
<div class="maintable">
</textarea>
<script>
var str = document.getElementById("txt").value;
str = str.replace(/(?=title|style|class|alt)\b[^\s]+=["']?[^"']*["']?(?=\s|>)/gi, '');
alert(str)
</script>
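
As the list of names grows, one limitation of the pattern becomes more visible: the lookahead only checks that an attribute name starts with one of the words, and \b also holds right after a hyphen, so an attribute such as styles="..." is removed entirely and data-title="..." loses its title="..." tail. The following is a minimal sketch of a stricter variant, assuming attributes are always preceded by whitespace; the function name removeExactAttributes is hypothetical and not from the original thread. It does not guard against matches in the text between tags or inside another attribute's quoted value.

```javascript
// Hypothetical helper (not from the original thread): removes only attributes
// whose whole name is in the list, together with the whitespace before them.
function removeExactAttributes(html, names) {
  // Escape regex metacharacters so each name is matched literally.
  var escaped = names.map(function (name) {
    return name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  });
  var pattern = new RegExp(
    // Whitespace, the exact name, "=", then a double-quoted, single-quoted,
    // or unquoted value.
    '\\s+(?:' + escaped.join('|') + ')\\s*=\\s*(?:"[^"]*"|\'[^\']*\'|[^\\s"\'>]+)',
    'gi'
  );
  return html.replace(pattern, '');
}

var sample = '<div data-title="x" title="y" styles="z" style="color:red">';
var cleaned = removeExactAttributes(sample, ['title', 'style', 'class', 'alt']);
console.log(cleaned); // <div data-title="x" styles="z">
```

Because the leading whitespace is consumed along with the attribute, this variant also avoids the doubled spaces that the original replacement leaves behind.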


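Now refine the expression above so that title="..." is not deleted when it appears as text, as in <...>title="..."</...>; that is, filter only the attributes of HTML tags.

Concretely, add the guard (?![^>]*(?=<)), which excludes the inner text between a pair of tags from processing: if a < can be reached from the current position without first crossing a >, the position is in text rather than inside a tag, and the match is rejected ==> /(?![^>]*(?=<))(?=title|style)\b[^\s]+=["']?[^"']*["']?(?=\s|>)/gi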

       

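<script>
// The text between the tags repeats title=... and style=... on purpose; only the real attributes should be removed.
var str = '<a href="http://51js.com" title=这是标题 class="a">标题 title=这是标题 style=color:red</a><td style=color:red class="b" rospan="3" colspan="5">style=color:red title=这是标题</td>';
// The leading (?![^>]*(?=<)) rejects any position that lies in the text between tags.
str = str.replace(/(?![^>]*(?=<))(?=title|style)\b[^\s]+=["']?[^"']*["']?(?=\s|>)/gi, '');
alert(str)
</script>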
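
The guard above decides by lookahead whether a position lies inside a tag. Another way to leave the text between tags untouched, and one that addresses the original request (keep only a few attributes) directly, is to match each opening tag as a whole and filter its attributes in a replace callback. The following is a minimal sketch under stated assumptions, not the method from the thread: the function name keepAttributes is hypothetical, and it assumes attribute values never contain a > character.

```javascript
// Hypothetical helper (not from the original thread): keeps only the listed
// attributes on every opening tag and drops all others.
function keepAttributes(html, allowed) {
  var allowedLower = allowed.map(function (name) { return name.toLowerCase(); });

  // Group 1: tag name; group 2: raw attribute text up to the closing ">".
  // Closing tags such as </a> do not match because a letter must follow "<".
  return html.replace(/<([a-z][^\s\/>]*)([^>]*)>/gi, function (tag, tagName, attrText) {
    var kept = [];
    var lastEnd = 0;
    // One attribute per match: a name, then an optional quoted or unquoted value.
    var attrPattern = /([^\s=\/"'<>]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]+))?/g;
    var m;
    while ((m = attrPattern.exec(attrText)) !== null) {
      if (allowedLower.indexOf(m[1].toLowerCase()) !== -1) {
        kept.push(m[0]);
      }
      lastEnd = attrPattern.lastIndex;
    }
    // Preserve a trailing self-closing slash, as in <img ... />.
    var selfClosing = /\/\s*$/.test(attrText.slice(lastEnd)) ? ' /' : '';
    return '<' + tagName + (kept.length ? ' ' + kept.join(' ') : '') + selfClosing + '>';
  });
}

var first = keepAttributes(
  '<a href="http://51js.com" title="这是标题" class="a">标题</a>',
  ['class', 'src', 'href']
);
console.log(first); // <a href="http://51js.com" class="a">标题</a>
```

The allowed list is a parameter because the two examples in the question keep different sets: passing ['class', 'rospan', 'colspan'] reproduces the second example.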
