C++ || Snippet – Custom “strtok” Function To Split A String Into Tokens Using Multiple Delimiters
The following is a custom “strtok” function which splits a string into individual tokens according to delimiters. For example, if you have a string full of punctuation characters and want to strip them out, this function can do that for you.
This function works by accepting a std::string containing the text to be broken up into smaller std::strings (tokens), as well as another std::string that contains the delimiter characters. After the parse is complete, it returns a vector containing all the tokens found in the string.
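The code demonstrated on this page differs from the cstring strtok function in that it is built on std::string and std::vector, so it works for C++ only. It also takes its input by const reference and leaves the original string unmodified.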
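For comparison, here is a minimal sketch of the same kind of split done with the C library's std::strtok. The helper name c_strtok_split is hypothetical and not part of the original post. std::strtok overwrites delimiters in the buffer it scans and remembers its position between calls, so the sketch works on a private, mutable copy of the text.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): tokenize with std::strtok.
// std::strtok writes '\0' over delimiters in the buffer it scans, so we hand it
// a mutable copy and leave the caller's string untouched.
std::vector<std::string> c_strtok_split(const std::string& text, const char* delimiters)
{
    // make a writable, null-terminated copy of the input
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;

    // the first call takes the buffer; later calls pass nullptr to continue
    for (char* token = std::strtok(buffer.data(), delimiters);
         token != nullptr;
         token = std::strtok(nullptr, delimiters))
    {
        tokens.emplace_back(token);
    }
    return tokens;
}
```

Calling c_strtok_split("a, b;c", " ,;") yields the three tokens a, b and c. Because std::strtok keeps its position in hidden internal state, two tokenizations cannot be interleaved with it, which is one reason a std::string based function can be more convenient in C++.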
// ============================================================================
//    Author: Kenneth Perkins
//    Date: Jun 13, 2014
//    Taken From: http://programmingnotes.org/
//    File: Strtok.cpp
//    Description: Demonstrates the use of a custom "strtok" function which
//      splits a string into individual tokens according to delimiters.
// ============================================================================
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// function prototype
vector<string> str_tok(const string& str, const string& delimiters);

int main()
{
    // declare variables
    string str = "- This, is. a sample string? 3!30$' 19: 68/, LF+, 1, 1";
    string delimiters = " ,.-':;?()+*/%$#!\"@^&";
    vector<string> tokens;

    cout << "The original string: " << str << endl;

    // split the string into tokens according to the delimiters
    tokens = str_tok(str, delimiters);

    cout << "\nThe tokens: \n";
    for(unsigned x = 0; x < tokens.size(); ++x)
    {
        cout << tokens[x] << endl;
    }

    return 0;
}// end of main

/**
 * FUNCTION: str_tok
 * USE: Splits a string into individual tokens and saves them into a vector.
 * @param str: A std::string to be broken up into smaller std::strings (tokens).
 * @param delimiters: A std::string containing the delimiter characters.
 * @return: A vector containing all the found tokens in the string.
 */
vector<string> str_tok(const string& str, const string& delimiters)
{
    std::size_t prev = 0;
    std::size_t currentPos = 0;
    vector<string> tokens;

    // loop thru string until no more delimiters are found
    while((currentPos = str.find_first_of(delimiters, prev)) != string::npos)
    {
        // only save non-empty tokens, so adjacent delimiters are skipped
        if(currentPos > prev)
        {
            tokens.push_back(str.substr(prev, currentPos - prev));
        }
        prev = currentPos + 1;
    }

    // if characters are remaining, save to vector
    if(prev < str.length())
    {
        tokens.push_back(str.substr(prev, string::npos));
    }

    return tokens;
}// http://programmingnotes.org/
QUICK NOTES:
The highlighted lines are sections of interest to look out for.
The code is heavily commented, so no further insight is necessary. If you have any questions, feel free to leave a comment below.
Once compiled and run, you should get this as your output:
The original string: - This, is. a sample string? 3!30$' 19: 68/, LF+, 1, 1
The tokens:
This
is
a
sample
string
3
30
19
68
LF
1
1
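Note that the output above contains no empty entries, even though the sample string has runs of adjacent delimiters such as ", " and "? ". That is because a token is only saved when currentPos > prev. If empty fields matter (for example, in comma separated data), a variant along the following lines might be used. This is only a sketch: the name str_tok_keep_empty is hypothetical and not part of the original code.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of str_tok (not from the original post): same search
// loop, but every delimiter closes a field, so adjacent delimiters produce
// empty strings instead of being skipped.
std::vector<std::string> str_tok_keep_empty(const std::string& str,
                                            const std::string& delimiters)
{
    std::vector<std::string> tokens;
    std::size_t prev = 0;
    std::size_t currentPos = 0;

    while ((currentPos = str.find_first_of(delimiters, prev)) != std::string::npos)
    {
        // no "currentPos > prev" check: empty fields are kept
        tokens.push_back(str.substr(prev, currentPos - prev));
        prev = currentPos + 1;
    }

    // always save the final field, even when it is empty
    tokens.push_back(str.substr(prev));
    return tokens;
}
```

With this variant, splitting "a,,b" on "," gives three fields (a, an empty string, and b), whereas the original str_tok would return only a and b.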