A regular expression for extracting links and titles from a web page

2005-08-19 08:57

using System;
using System.IO;
using System.Text.RegularExpressions;

// Read the saved page; Encoding.Default is the system default encoding.
StreamReader sr = new StreamReader(@"c:\sina.htm", System.Text.Encoding.Default);
string strHtml = sr.ReadToEnd();
sr.Close();
// href may be double-quoted, single-quoted, or unquoted; title is the link's plain text.
string p = @"<a\b[^>]*?\bhref\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)'|(?<url>[^>\s]+))[^>]*>(?:<[^>]*>)*(?<title>[^<>]*)(?:<[^>]*>)*?</a>";
Regex reg = new Regex(p, RegexOptions.IgnoreCase | RegexOptions.Compiled);
MatchCollection ms = reg.Matches(strHtml);

foreach (Match m in ms)
{
    Console.WriteLine("{0}\n{1}\n\n", m.Groups["title"].Value, m.Groups["url"].Value);
}
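
To try the idea without a saved file, the sketch below runs one possible cleaned-up form of the expression (same `url` and `title` groups) against an in-memory string. The sample markup and the names `sampleHtml`, `pattern`, and `links` are assumptions made for illustration; they are not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample markup covering the three href quoting styles.
string sampleHtml =
    "<p><a href=\"http://example.com/a\">First</a> " +
    "<A HREF='/b'><b>Second</b></A> " +
    "<a class=x href=c.html>Third</a></p>";

// url: double-quoted, single-quoted, or unquoted attribute value.
// title: the plain link text, optionally wrapped in inline tags such as <b>.
string pattern =
    @"<a\b[^>]*?\bhref\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)'|(?<url>[^>\s]+))" +
    @"[^>]*>(?:<[^>]*>)*(?<title>[^<>]*)(?:<[^>]*>)*?</a>";

// Collect (title, url) pairs so they can be inspected or printed.
var links = new List<KeyValuePair<string, string>>();
foreach (Match m in Regex.Matches(sampleHtml, pattern, RegexOptions.IgnoreCase))
{
    links.Add(new KeyValuePair<string, string>(
        m.Groups["title"].Value, m.Groups["url"].Value));
}

foreach (var link in links)
{
    Console.WriteLine("{0} -> {1}", link.Key, link.Value);
}
```

Using `[^>]*` rather than `.*` keeps each part of the match inside a single tag, so one match cannot run on into the next link on the same line. A link whose text is broken up by further tags (for example a `<br>` in the middle) is simply skipped by this sketch.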
