
C++ Text File Processing Summary

2016-07-27 15:20
Summary

Read line by line to Vector

Function to remove the white space

Regular Expressions to remove comments

Header and namespace usage

Process Result

Reference

Summary

This post shows how to read a text file line by line into a vector of strings in C++, and then process each line with string functions and regular expressions. After processing, spaces, tabs, empty lines, and comments have been removed.

Read line by line to Vector

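Opening the file and pushing each line into a string vector is straightforward (see the references). Use std::getline to read each line and store.push_back(textLine) to append it to the vector, then iterate over the vector with a for loop and process each line, as the code below shows. The names szMsg, ConvertString and LOG_INFO_APPEND come from the surrounding application and are not defined here.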

vector<string> store;
std::string textLine;
//sPath contains the file name and path.
std::ifstream inputFile (sPath);
//make sure file is properly opened.
if(!inputFile)
{
szMsg.Format(_T("Failed to open the script file!"));
throw szMsg;
}
//read the file line by line and push each line into the vector "store"
while (std::getline(inputFile, textLine))
{
store.push_back(textLine);
}

//process the string vector line by line
for (std::vector<string>::iterator it = store.begin() ; it != store.end(); ++it)
{
//remove all spaces and tabs from the line
std::string apdu_str_tmp = reduce(*it, "");
//convert to upper case (::toupper selects the C function and avoids overload ambiguity)
transform(apdu_str_tmp.begin(), apdu_str_tmp.end(), apdu_str_tmp.begin(), ::toupper);
//skip the line if it is empty, or if it neither starts with a hex digit nor equals "RESET"
if(apdu_str_tmp.empty() || (!isxdigit(static_cast<unsigned char>(apdu_str_tmp[0])) && apdu_str_tmp != "RESET"))
{
//cout<<"not hex and not RESET, illegal line";
continue;
}
//remove the comment: "//" and everything after it
std::tr1::regex regex_apdu ("(.*)(\\/\\/.*)");
//"$1" keeps only the text captured before the "//"
apdu_str_tmp = tr1::regex_replace (apdu_str_tmp,regex_apdu,std::string("$1"));

LPWSTR new_apdu_str;
//convert to LPWSTR and display in the GUI.
new_apdu_str = ConvertString(apdu_str_tmp);
szMsg.Format(_T("%s"), new_apdu_str);
LOG_INFO_APPEND(szMsg);
//free the buffer returned by ConvertString to avoid a memory leak.
delete[] new_apdu_str;
}


Function to remove the white space

Removing or replacing the spaces and tabs in a string takes two steps (see the references). First, the trim function removes leading and trailing whitespace. Second, reduce removes or replaces each run of whitespace in the middle of the string. The two functions are:

std::string trim(const std::string& str,
const std::string& whitespace = " \t")
{
std::size_t strBegin = str.find_first_not_of(whitespace);
if (strBegin == std::string::npos)
return ""; // no content

std::size_t strEnd = str.find_last_not_of(whitespace);
std::size_t strRange = strEnd - strBegin + 1;

return str.substr(strBegin, strRange);
}

std::string reduce(const std::string& str,
const std::string& fill = " ",
const std::string& whitespace = " \t")
{
// trim first
std::string result = trim(str, whitespace);

// replace sub ranges
std::size_t beginSpace = result.find_first_of(whitespace);
while (beginSpace != std::string::npos)
{
std::size_t endSpace = result.find_first_not_of(whitespace, beginSpace);
std::size_t range = endSpace - beginSpace;

result.replace(beginSpace, range, fill);

std::size_t newStart = beginSpace + fill.length();
beginSpace = result.find_first_of(whitespace, newStart);
}

return result;
}
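
When the fill string is empty, as in the call reduce(..., "") used earlier, the net effect is that every space and tab is removed. As an alternative, that special case can be sketched with the standard erase-remove idiom. The names is_space_or_tab and strip_spaces_and_tabs are hypothetical and not part of the original code.

```cpp
#include <algorithm>
#include <string>

// True for the two characters treated as whitespace in this post.
static bool is_space_or_tab(char c)
{
    return c == ' ' || c == '\t';
}

// Return a copy of s with every space and tab removed.
// std::remove_if moves the kept characters to the front and returns
// the new logical end; erase then cuts off the leftover tail.
std::string strip_spaces_and_tabs(std::string s)
{
    s.erase(std::remove_if(s.begin(), s.end(), is_space_or_tab), s.end());
    return s;
}
```

This covers only the empty-fill case; replacing runs of whitespace with a non-empty fill string still needs reduce.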


Regular Expressions to remove comments

A regular expression is used to find and remove comments: “//” and everything after it on the line is treated as a comment. The regular expression is:

std::tr1::regex regex_apdu ("(.*)(\\/\\/.*)");


When the pattern matches, three results are available: “$0” holds the whole match, “$1” holds the text before the comment, and “$2” holds the comment itself, including the “//”. Replacing the match with “$1” therefore yields the line without its comment, as the code below shows:

apdu_str_tmp = tr1::regex_replace (apdu_str_tmp,regex_apdu,std::string("$1"));


In the TR1 version of regex_replace used above, the third parameter must be a std::string object, not a string literal, so the literal is wrapped in std::string.
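
The same replacement can be sketched with C++11 std::regex, which also accepts a string literal as the format argument. This is a sketch under assumptions, not the code above: it assumes a C++11 compiler, and the names strip_comment and strip_comment_first are hypothetical. It also shows one property of the pattern: because (.*) is greedy, a line containing “//” twice is cut at the last “//”, while a non-greedy (.*?) cuts at the first one.

```cpp
#include <regex>
#include <string>

// Same pattern as in the post, written without escaping the slashes:
// group 1 = text before the comment, group 2 = "//" and the rest.
std::string strip_comment(const std::string& line)
{
    static const std::regex comment("(.*)(//.*)");
    return std::regex_replace(line, comment, "$1");
}

// Variant with a non-greedy first group: cuts at the first "//".
std::string strip_comment_first(const std::string& line)
{
    static const std::regex comment("(.*?)(//.*)");
    return std::regex_replace(line, comment, "$1");
}
```

Lines without “//” do not match the pattern, so both functions return them unchanged.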

See the references for more examples.

Header and namespace usage

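The following headers and namespaces are used. The tr1 namespace lives inside std, so std is brought into scope first:

#include <vector>
#include <iterator>
#include <fstream>
#include <string>
#include <sstream>
#include <algorithm>
#include <cctype>
#include <regex>
using namespace std;
using namespace std::tr1;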


Process Result

Before processing, the text read from the file is as follows. Line 6 contains only spaces:

Reset
Res
00A4040007A0000000041010
//This is comments
80500000081122334455667788//this is comment1

not a hex
8482000010404142434445464748494A4B4C4D4E4F404142434445464748494A4B4C4D4E4F404142434445464748494A4B4C4D4E4F  //this is a comments2


After processing, the result is:

RESET
00A4040007A0000000041010
80500000081122334455667788
8482000010404142434445464748494A4B4C4D4E4F404142434445464748494A4B4C4D4E4F404142434445464748494A4B4C4D4E4F
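
To check the steps against this sample without the GUI code, the whole pass can be sketched as one function. This is a sketch assuming a C++11 compiler with std::regex; process_lines is a hypothetical name, and the whitespace step uses the erase-remove idiom in place of reduce.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Apply the steps from this post to each line and return the lines that survive.
std::vector<std::string> process_lines(const std::vector<std::string>& lines)
{
    static const std::regex comment("(.*)(//.*)");
    std::vector<std::string> out;
    for (std::string line : lines)  // work on a copy of each line
    {
        // 1. remove every space and tab
        line.erase(std::remove_if(line.begin(), line.end(),
                                  [](char c) { return c == ' ' || c == '\t'; }),
                   line.end());
        // 2. convert to upper case
        std::transform(line.begin(), line.end(), line.begin(),
                       [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
        // 3. keep only lines that start with a hex digit or are exactly "RESET"
        if (line.empty())
            continue;
        if (!std::isxdigit(static_cast<unsigned char>(line[0])) && line != "RESET")
            continue;
        // 4. drop the trailing "//" comment, if any
        out.push_back(std::regex_replace(line, comment, "$1"));
    }
    return out;
}
```

The filter in step 3 runs before the comment is removed, as in the loop shown earlier, so a line such as “Reset //comment” would be skipped rather than kept.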


Reference

1. John D. Cook’s blog entry: C++ TR1 regular expressions

2. Reading line from text file and putting the strings into a vector?

3. Is there a C++ iterator that can iterate over a file line by line?

4. Removing leading and trailing spaces from a string

5. cppreference.com: std::regex_match

6. http://www.cplusplus.com: std::regex_replace