University of Michigan at Ann Arbor
Last Edit Date: 02/05/2023
Disclaimer and Terms of Use:
We do not guarantee the accuracy and completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file.
This personal note is adapted from Professor Amir Kamil, Andrew DeOrio, James Juett, Sofia Saleem, and Saquib Razak. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
In the original C language, strings are represented as just an array of characters, which have the type char. The following initializes a string representing the characters in the word hello:
char str[6] = { 'h', 'e', 'l', 'l', 'o', '\0' };
We can visualize it as follows:
A C-style string has a sentinel value at its end, the special null character, denoted by '\0'. This is not the same as a null pointer, which is denoted by nullptr, nor the character '0', which denotes the digit 0.
The null character signals the end of the string, and algorithms on C-style strings rely on its presence to determine where the string ends.
A character array can also be initialized with a string literal:
char str2[6] = "hello";
char str3[] = "hello";
If the size of the array is specified, it must have sufficient space for the null terminator. In the second case above, the size of the array is inferred as 6 from the string literal that is used to initialize it. A string literal implicitly contains the null terminator at its end, so both str2 and str3 are initialized to end with a null terminator.
We can also refer to a C-style string through a pointer, because an array decays to a pointer to its first element:
const char *welcomeMsg = "Welcome to EECS 280!";
Here the pointer points at the first character of a string literal. We can read the string through it, but we must not modify its contents, and the compiler enforces this through the const.
For almost any operation we would like to perform on a cstring, the basic idea is to set up a traversal-by-pointer loop that iterates until it reaches the null character.
As the pointer walks through the string, we perform whatever data processing or modifications we need by dereferencing the pointer to work with individual characters.
It's generally a good idea to wrap up this kind of work in a function that can be reused wherever we need it. Let's take a look at how this plays out in code with the following example.
The example above copies word1 to word2 and prints the results.
Because cstrings are just built on fundamental types like arrays, char, and pointers, you don't need to include any libraries to use them. However, many common operations for cstrings are available as functions in the <cstring> library, which you can #include at the top of your files if you need them. You can find documentation for these in a number of places, but online resources like http://www.cplusplus.com/reference/cstring/ are generally a good place to start.
In general, you should prefer to use C++ string where you can. It's an easier datatype to work with than a cstring and supports intuitive string operators like ==, <, +, =, etc.
It behaves predictably and avoids many of the quirks of cstrings. (By contrast, because a cstring is just an array of characters, these operators do not do what you might expect on cstrings: for example, == compares addresses rather than characters, and + does not concatenate.)
Streams are the fundamental mechanism for text-based I/O (input/output) in C++, whether it's printing messages and taking input from the user via the terminal, reading and writing to files, or a number of other applications.
The following is an example that uses I/O streams to copy a file while replacing one word:
#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main() {
  string inName = "in.txt";
  string outName = "out.txt";

  cout << "Copying from " << inName << " to " << outName << endl;

  string wordToRemove;
  cout << "What word would you like to remove? ";
  cin >> wordToRemove;

  // Open the input and output files and check that both succeeded.
  ifstream fin(inName);
  ofstream fout(outName);
  if (!fin.is_open()) {
    cout << "Unable to open " << inName << endl;
    return 1;
  }

  if (!fout.is_open()) {
    cout << "Unable to open " << outName << endl;
    return 1;
  }

  // Read one whitespace-separated word at a time until input runs out.
  string word;
  while (fin >> word) {
    if (word != wordToRemove) { fout << word << " "; }
    else { fout << "*****" << " "; }
  }

  fin.close();
  fout.close();
}
Here's another example. Note that the stoi() function converts from a string to the int value that it represents.
In this case, we want to read a sequence of numbers from the user via cin and add them together. The user may enter as many numbers as they like and then type "done" to indicate they are finished. Because we need to accommodate both numbers and a string, we use the most general type, string, and then convert to an int where appropriate using stoi.
#include <iostream>
#include <string>

using namespace std;

int main() {
  int sum = 0;
  string word;
  while (cin >> word && word != "done") {
    sum += stoi(word);
  }
  cout << "sum is " << sum << endl;
}
Command-line arguments are arguments that are passed to a program when it is invoked from a shell or terminal. As an example, consider the following command:
$ g++ -Wall -O1 -std=c++11 -pedantic test.cpp -o test
Here, g++ is the program we are invoking, and the arguments tell g++ what to do. For instance, the -Wall argument tells the g++ compiler to warn about any potential issues in the code, -O1 tells the compiler to use optimization level 1, and so on.
Command-line arguments are passed to the program through arguments to main(). The main() function may have zero parameters, in which case the command-line arguments are discarded. It can also have two parameters, so the signature has the following form:
int main(int argc, char *argv[]);
argc: The number of command-line arguments passed to the program.
argv: Contains each command-line argument as a C-style string.
As we saw last time, an array parameter is actually a pointer parameter, so the following signature is equivalent:
int main(int argc, char **argv);
Thus, the second parameter is a pointer to the first element of an array, each element of which is a pointer to the start of a C-style string, as shown below:
The command-line arguments also include the name of the program as the first argument – this is often used in printing out error messages from the program.
As an example, the following program takes an arbitrary number of integral arguments and computes their sum:
#include <iostream>
#include <cstdlib> // for atoi() function

using namespace std;

int main(int argc, char *argv[]) {
  int sum = 0;
  // Start at 1 to skip argv[0], the program name.
  for (int i = 1; i < argc; ++i) {
    sum += atoi(argv[i]);
  }
  cout << "sum is " << sum << endl;
}
Note that atoi() converts its C-style string argument to an integer (type int).
The first argument is skipped, since it is the program name. Each remaining argument is converted to an int by the atoi() function, which takes a C-style string as the argument and returns the integer that it represents. For example, atoi("123") returns the number 123 as an int.
The following is an example of running the program:
$ ./sum.exe 2 4 6 8 10
sum is 30