Methods for stripping HTML tags from an NSString in Objective-C
2014-07-23 17:40
// Method 1: scan with NSScanner. This comes from the well-known post linked below, which no longer loads.
// Source: http://rudis.net/content/2009/01/21/flatten-html-content-ie-strip-tags-cocoaobjective-c
- (NSString *)removeHTML:(NSString *)html {
    NSScanner *theScanner = [NSScanner scannerWithString:html];
    NSString *text = nil;
    while ([theScanner isAtEnd] == NO) {
        // find start of tag
        [theScanner scanUpToString:@"<" intoString:NULL];
        // find end of tag; reset text so a failed scan does not reuse the previous tag
        text = nil;
        if ([theScanner scanUpToString:@">" intoString:&text] && text) {
            // replace the found tag with a space
            // (you can filter multi-spaces out later if you wish)
            html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@" "];
        }
    }
    return html;
}
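The comment in the code above mentions filtering out multiple spaces afterwards. Below is a minimal sketch of one way to do that with Foundation only. The function name CollapseWhitespace is hypothetical and is not part of the original post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): collapse every run of
// whitespace/newlines into a single space and drop leading/trailing runs.
static NSString *CollapseWhitespace(NSString *text) {
    NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray<NSString *> *parts = [text componentsSeparatedByCharactersInSet:ws];
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    for (NSString *part in parts) {
        // Adjacent separators produce empty components; skip them.
        if (part.length > 0) {
            [words addObject:part];
        }
    }
    return [words componentsJoinedByString:@" "];
}
```

It could be applied to the result of removeHTML:, for example CollapseWhitespace([self removeHTML:html]). Note that it also trims the ends of the string.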
// Method 2: split the string with NSString's built-in componentsSeparatedByCharactersInSet:
- (NSString *)removeHTML2:(NSString *)html {
    NSArray *components = [html componentsSeparatedByCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@"<>"]];
    NSMutableArray *componentsToKeep = [NSMutableArray array];
    // Even indexes hold the text between tags; odd indexes hold the tag contents
    for (NSUInteger i = 0; i < [components count]; i = i + 2) {
        [componentsToKeep addObject:[components objectAtIndex:i]];
    }
    NSString *plainText = [componentsToKeep componentsJoinedByString:@""];
    return plainText;
}
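To see why keeping every other component works, here is the same logic as a standalone C function, as a sketch. The name StripTagsBySplitting is hypothetical; it only uses Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the same logic as removeHTML2:, written as a C
// function so it can be tried outside of a class.
static NSString *StripTagsBySplitting(NSString *html) {
    NSCharacterSet *brackets = [NSCharacterSet characterSetWithCharactersInString:@"<>"];
    NSArray<NSString *> *components = [html componentsSeparatedByCharactersInSet:brackets];
    NSMutableArray<NSString *> *kept = [NSMutableArray array];
    // Splitting on < and > alternates: text, tag, text, tag, ...
    // so even indexes are text and odd indexes are tag contents.
    for (NSUInteger i = 0; i < components.count; i += 2) {
        [kept addObject:components[i]];
    }
    return [kept componentsJoinedByString:@""];
}
```

For the input `<p>Hello <b>world</b></p>` the components are "", "p", "Hello ", "b", "world", "/b", "", "/p", "", and the even-indexed ones join to `Hello world`. The alternation assumes every < is paired with a >; a stray bracket in the text (such as "1 < 2") shifts the indexes and drops real text.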
Reposted from: http://bbs.9ria.com/thread-244433-1-1.html
- (NSString *)flattenHTML:(NSString *)html trimWhiteSpace:(BOOL)trim
{
    NSScanner *theScanner = [NSScanner scannerWithString:html];
    NSString *text = nil;
    while ([theScanner isAtEnd] == NO) {
        // find start of tag
        [theScanner scanUpToString:@"<" intoString:NULL];
        // find end of tag
        [theScanner scanUpToString:@">" intoString:&text];
        // remove the found tag (replace it with an empty string)
        html = [html stringByReplacingOccurrencesOfString:
                [NSString stringWithFormat:@"%@>", text]
                                               withString:@""];
    }
    return trim ? [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] : html;
}
Method 3: a third-party library solves this problem easily: https://github.com/mwaterfall/MWFeedParser
MWFeedParser — An RSS and Atom web feed parser for iOS
MWFeedParser is an Objective-C framework for downloading and parsing RSS (1.* and 2.*) and Atom web feeds. It is a very simple and clean implementation that reads the following information from a web feed:
Feed Information
Title
Link
Summary
Feed Items
Title
Link
Author name
Date (the date the item was published)
Updated date (the date the item was updated, if available)
Summary (brief description of item)
Content (detailed item content, if available)
Enclosures (i.e. podcasts, mp3, pdf, etc)
Identifier (an item's guid/id)
If you use MWFeedParser on your iPhone/iPad app then please do let me know, I'd love to check it out :)
Important: This free software is provided under the MIT licence (X11 license) with the addition of the following condition:
This Software cannot be used to archive or collect data such as (but not limited to) that of events, news, experiences and activities, for the purpose of any concept relating to diary/journal keeping.
The full licence can be found at the end of this document.
Demo / Example App
There is an example iPhone application within the project which demonstrates how to use the parser to display the title of a feed, list all of the feed items, and display an item in more detail when tapped.
Setting up the parser
Create parser:
    // Create feed parser and pass the URL of the feed
    NSURL *feedURL = [NSURL URLWithString:@"http://images.apple.com/main/rss/hotnews/hotnews.rss"];
    feedParser = [[MWFeedParser alloc] initWithFeedURL:feedURL];
Set delegate:
    // Delegate must conform to `MWFeedParserDelegate`
    feedParser.delegate = self;
Set the parsing type. Options are ParseTypeFull, ParseTypeInfoOnly and ParseTypeItemsOnly. Info refers to the information about the feed, such as its title and description. Items are the individual items or stories.
    // Parse the feed's info (title, link) and all feed items
    feedParser.feedParseType = ParseTypeFull;
Set whether the parser should connect and download the feed data synchronously or asynchronously. Note, this only affects the download of the feed data, not the parsing operation itself.
    // Connection type
    feedParser.connectionType = ConnectionTypeSynchronously;
Initiate parsing:
    // Begin parsing
    [feedParser parse];
The parser will then download and parse the feed. If at any time you wish to stop the parsing, you can call:
    // Stop feed download / parsing
    [feedParser stopParsing];
The stopParsing method will stop the downloading and parsing of the feed immediately.
Reading the feed data
Once parsing has been initiated, the delegate will receive the feed data as it is parsed.
    - (void)feedParserDidStart:(MWFeedParser *)parser; // Called when data has downloaded and parsing has begun
    - (void)feedParser:(MWFeedParser *)parser didParseFeedInfo:(MWFeedInfo *)info; // Provides info about the feed
    - (void)feedParser:(MWFeedParser *)parser didParseFeedItem:(MWFeedItem *)item; // Provides info about a feed item
    - (void)feedParserDidFinish:(MWFeedParser *)parser; // Parsing complete or stopped at any time by `stopParsing`
    - (void)feedParser:(MWFeedParser *)parser didFailWithError:(NSError *)error; // Parsing failed
MWFeedInfo and MWFeedItem contain properties (title, link, summary, etc.) that will hold the parsed data. View MWFeedInfo.h and MWFeedItem.h for more information.
Important: There are some occasions where feeds do not contain some information, such as titles, links or summaries. Before using any data, you should check to see if that data exists:
    NSString *title = item.title ? item.title : @"[No Title]";
    NSString *link = item.link ? item.link : @"[No Link]";
    NSString *summary = item.summary ? item.summary : @"[No Summary]";
The method feedParserDidFinish: will only be called when the feed has successfully parsed, or has been stopped by a call to stopParsing. To determine whether the parsing completed successfully, or was stopped, you can call isStopped.
For a usage example, please see RootViewController.m in the demo project.
Available data
Here is a list of the available properties for feed info and item objects:
MWFeedInfo
info.title (NSString)
info.link (NSString)
info.summary (NSString)
MWFeedItem
item.title (NSString)
item.link (NSString)
item.author (NSString)
item.date (NSDate)
item.updated (NSDate)
item.summary (NSString)
item.content (NSString)
item.enclosures (NSArray of NSDictionary with keys url, type and length)
item.identifier (NSString)
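As an illustration of working with the enclosures structure listed above, here is a rough sketch that collects the URLs of enclosures with a given MIME type. It uses Foundation only. The function name EnclosureURLsOfType is hypothetical, and it assumes the url and type values are NSString objects, which the list above does not state.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of MWFeedParser). `enclosures` is assumed to
// be an NSArray of NSDictionary objects with the keys url, type and length,
// and the url/type values are assumed to be strings.
static NSArray<NSString *> *EnclosureURLsOfType(NSArray<NSDictionary *> *enclosures, NSString *mimeType) {
    NSMutableArray<NSString *> *urls = [NSMutableArray array];
    for (NSDictionary *enclosure in enclosures) {
        NSString *url = enclosure[@"url"];
        NSString *type = enclosure[@"type"];
        // Feeds may omit fields, so check before using them.
        if ([url isKindOfClass:[NSString class]] && [type isEqual:mimeType]) {
            [urls addObject:url];
        }
    }
    return urls;
}
```

With a parsed item this might be called as EnclosureURLsOfType(item.enclosures, @"audio/mpeg"). Passing nil for the array simply returns an empty result.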
Using the data
All properties of MWFeedInfo and MWFeedItem return the raw data as provided by the feed. This content may or may not include HTML and encoded entities. If the content does include HTML, you could display the data within a UIWebView, or you could use the provided NSString category (NSString+HTML) which will allow you to manipulate this HTML content. The methods available for your convenience are:
    // Convert HTML to Plain Text
    // - Strips HTML tags & comments, removes extra whitespace and decodes HTML character entities.
    - (NSString *)stringByConvertingHTMLToPlainText;

    // Decode all HTML entities using GTM.
    - (NSString *)stringByDecodingHTMLEntities;

    // Encode all HTML entities using GTM.
    - (NSString *)stringByEncodingHTMLEntities;

    // Minimal unicode encoding will only cover characters from table
    // A.2.2 of http://www.w3.org/TR/xhtml1/dtds.html#a_dtd_Special_characters
    // which is what you want for a unicode encoded webpage.
    - (NSString *)stringByEncodingHTMLEntities:(BOOL)isUnicode;

    // Replace newlines with <br /> tags.
    - (NSString *)stringWithNewLinesAsBRs;

    // Remove newlines and white space from string.
    - (NSString *)stringByRemovingNewLinesAndWhitespace;

    // Wrap plain URLs in <a href="..." class="linkified">...</a>
    // - Ignores URLs inside tags (any URL beginning with =")
    // - HTTP & HTTPS schemes only
    // - Only works in iOS 4+ as we use NSRegularExpression (returns self if not supported so be careful with NSMutableStrings)
    // - Expression: (?<!=")\b((http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)
    // - Adapted from http://regexlib.com/REDetails.aspx?regexp_id=96
    - (NSString *)stringByLinkifyingURLs;
An example of this would be:
    // Display item summary which contains HTML as plain text
    NSString *plainSummary = [item.summary stringByConvertingHTMLToPlainText];
Debugging problems
If for some reason the parser doesn't seem to be working, try enabling Debug Logging in MWFeedParser.h. This will log error messages to the console and help you diagnose the problem. Error codes and their descriptions can be found at the top of MWFeedParser.h.
Other information
MWFeedParser is not currently thread-safe.
Adding to your project
Method 1: Use CocoaPods
CocoaPods is great. If you are using CocoaPods (and here's how to get started), simply add pod 'MWFeedParser' to your podfile and run pod install. You're good to go! Here's an example podfile:
    platform :ios, '7'
    pod 'MWFeedParser'
If you are just interested in using the HTML and/or InternetDateTime categories in your app, you can just specify those in your podfile with pod 'MWFeedParser/NSString+HTML' or pod 'MWFeedParser/NSDate+InternetDateTime'.
Method 2: Including Source Directly Into Your Project
Open MWFeedParser.xcodeproj.
Drag the MWFeedParser & Categories groups into your project, ensuring you check Copy items into destination group's folder.
Import MWFeedParser.h into your source as required.
Outstanding and suggested features
Demonstrate the previewing of formatted item summary/content (HTML with images, paragraphs, etc) within a UIWebView in demo app.
Provide functionality to list available feeds when given the URL to a webpage with one or more web feeds associated with it.
Support for the Media RSS extension (from Flickr, etc.)
Support for the GeoRSS extension.
Look into web feed icons.
Look into supporting/detecting images in feed items.
Feel free to get in touch and suggest/vote for other features.