PEG:C++ Lesson 5 for Pascal Users

From PEGWiki

I/O is the total of all data transfer between a program's memory and the "outside world", including the screen/keyboard, disk drives, and other programs. In C++, just as in Pascal, I/O is abstracted to devices known as files. A file represents any source or destination for data. Hence, each file on your hard disk is an I/O file; your program usually starts with a file for the keyboard and two files for the screen; and special files called pipes enable one program to communicate with another.

In C++, files are further abstracted to streams. A stream, loosely speaking, is some object that allows data to be read and/or written. Most streams allow additional operations, but read and write are fundamental.

To clarify this abstraction: suppose your program prints output... the data you wish to output is sent to the stream, which is part of your program. cout is a stream. Regardless of whether cout points to the screen, or to a file, or to another program, it is still cout. The stream is associated with a file, which is part of the operating system's memory; the stream just remembers some number, such as 1, and uses this to make a request to the operating system that the data be written to the file. A file might keep track of data such as /home/brian/programs/SPOJ/106miles.out. The operating system forwards the data to an associated device. Hence, in this example, the OS will make requests to the hardware to seek to the file /home/brian/programs/SPOJ/106miles.out, and write the data.

How convenient, then, that we do not have to deal with all the nitty-gritty details when we perform I/O! The abstraction of streams makes things very easy indeed.

C++ has inherited all I/O operations from the older C language and also added its own. We will discuss the C++ system of I/O here and the C system in a later lesson. (The latter is almost always faster and sometimes more powerful, but usually more arcane and less convenient.)

All C++ programs have three standard streams, known as cin, cout, and cerr. The first, the standard input file, only allows reading and, by default, represents the keyboard (or at least appears to); it is equivalent to input or stdin in Pascal. The second, the standard output file, only allows writing and, by default, represents the screen; it is equivalent to output or stdout in Pascal. The third, the standard error file, also allows only writing and, by default, represents the screen; its Pascal equivalent depends on the compiler but is usually stderr. Standard error is useful because standard output can be redirected so that output is printed to a file instead of the terminal. If an error occurs, or if we want to print data for debugging purposes, we still want that data to end up on the screen, so we write it to standard error.

To access the standard streams, you must include the header <iostream>. Writing using namespace std; lets you refer to them as cin, cout, and cerr; without it, their full names are std::cin, std::cout, and std::cerr. Every readable stream supports the extraction operator, which allows data to be read from it. In C++, this is (confusingly) the same symbol as the shift-right operator, >>. Extractions can be chained together. Similarly, every writable stream supports the insertion operator, <<, which can be chained as well. The special object endl can be inserted into an output stream; it writes a newline and then flushes the stream. That is, sending endl to cout is like the statement writeln; in Pascal. Examine the example below (and note that entities enclosed in double quotes are strings in C++).

#include <iostream>
using namespace std;
int main()
{
     int x,y;
     cout << "Enter two integers separated by a space." << endl;
     cin >> x >> y;
     cout << "The first number you entered was " << x << endl;
     cout << "The second number you entered was " << y << endl;
     cout << "Their sum is " << x+y << endl;
     return 0;
}

At this point it is convenient to discuss some finer points regarding cin:

  1. cin remembers its position in the input stream. Normally, when you read, characters are retrieved from the current position onward, and then the new position is set just after the last character read.

  2. If you attempt to read a character, all whitespace will be ignored, so that cin >> c will never leave a space, tab, or newline in c, merely skipping them to find the first non-whitespace character.
  3. The statement cin >> x, where x is an integral type, might do the following:
    1. Skip whitespace characters.
    2. Read a character. If it is + or -, record the integer's sign and read the next character. If it is a digit, proceed to step 3. If it is neither, proceed to step 4.
    3. If the character is anything other than a digit, stop reading. (Notice that if the character is neither digit nor whitespace in Pascal, a runtime error would occur, but C++ is much more lenient.) Otherwise, read the next character and repeat this step.
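    4. If the integer obtained by reading all these characters is too large to fit in the requested type, or if no characters were read at all (this may happen if the user enters some string which does not represent a valid integer), the read fails. Older compilers leave x unchanged in this case; since C++11, the standard library stores 0 in x when no valid integer was read, and the largest or smallest representable value on overflow. Either way, no runtime error is generated. Instead, cin enters a failed state, which can be tested with cin.fail() and must be cleared with cin.clear() before further reads can succeed. Otherwise, the integer read is stored in x.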
    5. The last character read (that is, the first one that was not a digit) is "stuffed back" into the input stream and the position is moved back one character, so that it is available for a future read operation and not merely "skipped". This means that if the user enters 1234a and your program reads an int followed by a character, 1234 will be stored in the int and 'a' in the char.
  4. Real types are dealt with analogously. Reading stops when some character is encountered that cannot be part of the number, such as a second sign character, a second decimal point, or a decimal point following E. Whatever was read is stored in the variable if possible (that is, if something valid was read and no overflow or underflow occurred), and the character that stopped the read is left in the stream.
  5. The end-of-file condition arises when the position in the input stream is just past the stream's last character. As cin, cout, and cerr can be redirected to refer to files on disk (this is what happens on an online judge), it may happen when reading data with cin that the end of the file is reached. In fact, some poorly designed SPOJ problems tell you that the input file ends with the end-of-file. Thus, it might not tell you the total number of test cases, or some sentinel input value that, when read, tells your program that there is no more test data; it might be that your program must determine where the file ends. It is important, therefore, to understand how C++ deals with the end-of-file condition.

Essentially, the end-of-file behaves like whitespace in the sense that it stops the reading of an integer or real. If you attempt to read a character and the end-of-file is reached, the character variable is left unchanged. The function cin.eof() returns true once a read operation has tried to go past the last character of the input, and false otherwise. It does not look ahead: as long as unread characters remain, even if they are only whitespace, it returns false.

There is a subtlety here to be addressed, however. Suppose the problem tells you to add together numbers, two to a line, until the end of file is reached. So if the input file were:

1 1
12 34
1337 5291

then the output would be

2
46
6628

Supposing for a moment that you know how to use while loops, you might use the condition while (!cin.eof()) to check if there is more data to be read. This works fine if the final "1" is actually the last character in the input file, but what if it is followed by whitespace? Then, the end-of-file condition does not actually exist yet, so the loop will run one more time, and you will print out one more line of output than you should.

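The solution is to attempt to read first and then check whether the read succeeded, rather than testing for the end of the file beforehand. Even if the final "1" were followed by whitespace, the next attempt to read an integer would skip all the whitespace, reach the end of the file, and fail. Testing cin.fail() after the read is safer than testing cin.eof(), because successfully reading the last number can also set the end-of-file condition when nothing follows it.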