C Output and Input • Jonathan Cook

This page presents plain C I/O, not C++ I/O.

Our book presents I/O in a way that is very specific to C++. Since we are also learning C in this course, we are going to spend time learning “old-fashioned” C I/O. We will learn parts of it but certainly not all of it.

Unlike C++ I/O, which uses overloaded operators (don’t worry about what that means for now), C I/O uses plain old functions. The most commonly used ones are printf and scanf, although each has several variants.

C (and C++) has three predefined and preopened I/O channels:

stdin is the input channel, and typically reads input from the keyboard, from a file, or from another program;
stdout is the regular output channel, and typically prints to the screen, into a file, or to another program;
stderr is the error output channel, and typically prints to the screen, into a file, or to another program.

Note that where the input and output is coming from or going to is dependent on how the program was run, not on how the program was written. The program doesn’t know the difference between the different sources.
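As a rough sketch of how the two output channels are kept apart, the function below sends normal results to stdout and complaints to stderr. The name report_value is made up for this illustration; it is not part of the C library.

```c
#include <stdio.h>

/* Hypothetical helper: print a result on stdout, or a complaint on stderr.
   Returns 0 if the value was accepted, 1 otherwise. */
int report_value(int value)
{
   if (value < 0)
   {
      fprintf(stderr, "error: negative value %d\n", value);
      return 1;
   }
   fprintf(stdout, "value = %d\n", value);   /* same effect as printf(...) */
   return 0;
}
```

With a typical shell, running a program that calls this as `./prog > out.txt` would send only the "value = ..." lines into the file; the error messages would still show up on the screen. The function itself is written the same way either way.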

Output using `printf`

The printf function prototype looks like:

`int printf( const char* format_string, ... )`

What that signature means is that it returns an integer value, the first parameter is a string, and the rest of the parameters are undefined!

How can that be?

printf is known as a function that can take a variable number of arguments. It has to take at least one, the format string, but it might then take 0 more, or 100 more.

How does printf know how many arguments were actually given to it in a call?

The short answer is: it doesn’t. (That should scare you!) The long answer is, the format string tells it.

The way printf works is that it takes the format string and starts printing it to stdout. But whenever it finds a special code in the format string, it processes the code, and that code might tell it to access the next argument (and the next, and the next,…).
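To make the idea concrete, here is a toy function that works in the same spirit. It is only a sketch, not how printf is really implemented, and the name sum_percent_d is invented for this example. It walks the format string and, for every %d it finds, pulls one more int from the argument list using the standard stdarg.h macros.

```c
#include <stdarg.h>

/* Toy stand-in for printf: adds up one int argument for every "%d" found
   in fmt. Like printf, it trusts fmt to say how many arguments exist. */
int sum_percent_d(const char *fmt, ...)
{
   va_list args;
   int total = 0;

   va_start(args, fmt);
   for (; *fmt != '\0'; fmt++)
   {
      if (fmt[0] == '%' && fmt[1] == 'd')
      {
         total += va_arg(args, int);   /* fetch the next argument as an int */
         fmt++;                        /* skip over the 'd' */
      }
   }
   va_end(args);
   return total;
}
```

A call such as sum_percent_d("%d and %d", 40, 2) fetches two arguments. If the format string asked for three, the function would have no way to notice that only two were passed, which is exactly the danger with printf.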

Backslash formats

For example, here’s an example call to printf:

printf("Hello world! Please say\t\"Hello\" back!\n");

This is a call that uses only the format string and no more arguments. But it does contain some special codes, each of which begins with a ‘\’ character (a backslash). \t tells printf to print a tab character, \" tells it to print a double quote, and \n tells it to print a newline (i.e., end the current line of output and start a new one). In your program text, a tab character in a string would be hard to see, C does not allow line breaks in strings, and double quotes are used to begin and end strings, so each of these needs a different kind of representation. So the backslash character is used to tell printf (and C) to treat the next character as special.

What if you want to print a backslash? Good question: you need to use two backslashes in your string! “\\” is a string with one backslash in it.

Note that, strictly speaking, backslash escapes are a feature of the C language (and C++ as well), not the printf function. The compiler replaces each escape with the proper character, so by the time the string is sent to printf the backslash is already gone.
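One way to convince yourself that the escapes are handled by the compiler is to measure a string without ever calling printf. In this small sketch (the function name is invented for the example), the literal is typed with seven characters between the quotes but stores only four.

```c
#include <string.h>

/* Number of characters the compiler stores for a literal made of four
   escapes: a tab, a double quote, one backslash, and a newline. */
size_t escape_literal_length(void)
{
   return strlen("\t\"\\\n");
}
```

The function returns 4, because each two-character escape in the source text becomes a single character in the compiled string.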

% formats

The backslash helps us print weird characters, but it doesn’t do anything to access more arguments. The % (percent) character does that.

For example, the call to printf of:

printf("My first number is %d and my second %f\n", 42, 99.003);

will print out “My first number is 42 and my second 99.003000”, followed by a newline character (%f prints six digits after the decimal point unless you ask for a different precision). The “%d” means “take the next available argument as an integer and print it”, and “%f” means “take the next available argument as a floating point value (a double) and print it”. Every % format uses up an argument, and the next % format accesses the next argument. There is no other choice: you cannot reorder them. They must be processed left to right.
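If you want to check what a format produces without staring at the screen, one option is snprintf, a standard variant of printf that writes into a character array instead of stdout. The sketch below uses a made-up wrapper named format_pair around the same format string as above (minus the newline).

```c
#include <stdio.h>

/* Hypothetical helper: format an int and a double into buf using the same
   %d and %f codes as the printf call above. Returns the number of
   characters the full result needs (not counting the terminating '\0'). */
int format_pair(char *buf, size_t size, int first, double second)
{
   return snprintf(buf, size, "My first number is %d and my second %f",
                   first, second);
}
```

Calling format_pair(buf, sizeof(buf), 42, 99.003) with a 64-character buf leaves “My first number is 42 and my second 99.003000” in buf, because %f uses six digits after the decimal point by default.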

What if we had accidentally typed

printf("My first number is %d and my second %f\n", 99.003, 42);

What would happen? You might be tempted to think that the %d will find the next integer argument, but it doesn’t! On my computer, this prints out “My first number is 652835029 and my second 0.000000”. This is the reason that C++ invented a new I/O mechanism! In C, printf and scanf are unsafe and rely on the programmer to get everything exactly right!

On the bright side, most modern C compilers can warn you of errors in your printf format, and GCC on Linux (if I use -Wall) does indeed tell me

print.c:5: warning: int format, double arg (arg 2)
print.c:5: warning: double format, different type arg (arg 3)

Some of the more popular formats are:

Format	Meaning
d	Signed decimal integer
u	Unsigned decimal integer
x	Hexadecimal integer
f	Floating point value in decimal format
e	Float value in exponential format
g	Float value in either format
c	A single character
s	A C string
p	A pointer value
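Here is a small sketch that exercises a few of these formats in one call. The helper name describe_value is invented for the example, and it again uses snprintf so that the result lands in a string rather than on the screen.

```c
#include <stdio.h>

/* Hypothetical helper: describe one small number using the s, d, x and c
   formats. %x expects an unsigned value, so n is cast for that argument. */
int describe_value(char *buf, size_t size, int n, const char *label)
{
   return snprintf(buf, size, "%s: %d decimal, %x hex, '%c' char",
                   label, n, (unsigned int)n, n);
}
```

On a system that uses ASCII, describe_value(buf, sizeof(buf), 65, "code") fills buf with “code: 65 decimal, 41 hex, 'A' char”.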

Length and precision modifiers

Each format can take various flags and modifiers (although a precision has no meaning for ‘c’). Here we will only talk about the ‘-’ flag and the length and precision modifiers. See the man page (“man 3 printf”) for much more detail; it calls the length the “field width”.

If a flag such as ‘-’ is used, it must appear immediately after the % symbol (or another flag). Then if the length is present, it must appear next. If precision is used, it must appear after a decimal point, and after the flags and length if they are present. Finally, the original format symbol is specified. Thus, examples of valid signed decimal integer formats are: “%d”, “%-d”, “%4d”, “%4.3d”, “%.3d”, and “%-3.5d”.

The ‘-’ flag always means left justification. It is most useful with string formats. The length always means the minimum number of characters printed, with spaces being used as padding (on the left normally, or on the right with the ‘-’ flag). The precision has different meanings for different formats. For strings, it means the maximum number of characters to print, truncating the string if necessary. For integer formats, it means the minimum number of digits to print, using leading 0’s if needed. For floating point formats, it means the number of digits after the decimal point.

These modifiers let you control how your output is formatted. You can make nice reports with columns by using proper modifiers:

For a string field, like a name, it is common to want left justification and you should always specify an equal length and precision so that your column never overruns another. For example, printf("%-20.20s\n",MyStringVar) will print exactly 20 characters every time.
For integer fields, it is rare that you want leading zeros, and numbers are usually right justified, so a length modifier is all you need. However, you must make sure that it is at least as wide as the longest integer your data will have (including any minus sign), otherwise you will get a misaligned column. E.g., printf("%6d",MyInt) will work fine if the variable will never have a value of a million or more.
For float/double fields, usually you want to control the precision as well as the length. For example, printf("%10.2f",MyFloat) will print two fractional digits, which might be nice for printing out dollar amounts. (Note: in a real financial application, you should not be using floating point values to represent money anyway!)
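Putting the three column rules together, a report row might be built like the sketch below. The function name format_row and the column widths (10, 6 and 10) are assumptions made for this example, and the ‘|’ characters are only there to make the column edges visible.

```c
#include <stdio.h>

/* Hypothetical helper: build one fixed-width report row.
   %-10.10s  name, left justified, padded or truncated to exactly 10
   %6d       quantity, right justified in 6 columns
   %10.2f    price, right justified in 10 columns, 2 fractional digits */
int format_row(char *buf, size_t size, const char *name, int qty, double price)
{
   return snprintf(buf, size, "%-10.10s|%6d|%10.2f", name, qty, price);
}
```

With the arguments "Widget", 42 and 3.5 this produces “Widget    |    42|      3.50”. A much longer name would be cut off after 10 characters, so the ‘|’ separators stay lined up from row to row.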

The `scanf()` input function

Input using scanf looks very similar to printf, but instead of printing values out, it is reading values in from stdin. The format string tells scanf what to look for in the input, and what values to assign to variables.

The big difference that causes the most problems for new C programmers is that when printf sees a “%d” in its format string, it expects an integer argument that it can process. But when scanf sees a “%d” in its string, it expects an argument that is a pointer to an integer variable, where it can store the incoming integer data that it finds. All scanf arguments must be pointers to variables!

For example, scanf("val: %d",&MyIntVar) would expect that the input would contain the characters “val: ”, and then some digit characters that it could convert into an integer value, which it then assigns to MyIntVar. The & operator is a C/C++ operator that provides a pointer to the variable. We read it as “address of”.

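scanf accepts most of the same formats that printf does, though a few behave differently; for example, reading into a double needs “%lf”, while “%f” reads into a float. See “man 3 scanf” for details.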
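The sketch below tries out the “val: %d” format from the example. To keep it easy to experiment with, it uses sscanf, the standard variant of scanf that reads from a string instead of from stdin; the format string and the pointer argument work the same way. The wrapper name parse_val is invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: look for "val: <integer>" at the start of input.
   result must point to an int; sscanf stores the number through it.
   Returns the number of values assigned: 1 on success, 0 on a mismatch. */
int parse_val(const char *input, int *result)
{
   return sscanf(input, "val: %d", result);
}
```

A caller would write something like parse_val("val: 17", &MyIntVar) and then check that the return value is 1 before trusting MyIntVar. With input such as "value 17" the literal text does not match, nothing is assigned, and the return value is 0.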

File I/O

C has a library of functions, called Standard I/O, or stdio, that offer a large collection of ways to deal with input and output. printf and scanf are just two out of many.

For writing programs that need to explicitly read or write to files, here we’ll explain the basics of the most common mechanism. Here’s a simple example that we’ll use:

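#include <stdio.h>

int main()
{
   FILE *f;
   int i;
   f = fopen("datafile.txt","r");   /* "r" = open for reading */
   if (f == NULL) 
   {
      printf("Error: unable to open file\n");
      return 1;
   }
   while (!feof(f))
   {
      /* fscanf returns 1 when it has read and assigned one integer */
      if (fscanf(f,"%d",&i) == 1)
      {
         printf("data = %d\n",i);
      }
      else
      {
         break;   /* end of file or non-integer data: stop reading */
      }
   }
   fclose(f);
   return 0;
}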

Firstly, you need to define a variable of type FILE*. This is your handle that you use to refer to the file while it is open. You initialize it with an fopen call, which takes two string parameters. The first is just the filename. The second is “r” for read mode, “w” for write mode, and “a” for append mode. Warning: the “w” mode will erase any current file contents, if there are any! The append mode allows you to add to a current file (or create a new one if it doesn’t yet exist). These three modes are enough for this class, but there are others that allow you to do more, such as read and write from the same file.

You should always check the file handle for NULL-ness after you open a file! If the open failed and your program doesn’t check, it will crash. In this course, you will lose points if you do not check.
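As a small sketch of append mode together with the NULL check, the two helpers below add a line to a file and count the lines in it. The names append_line and count_lines are made up for this example. fgetc, which reads one character at a time, is a standard stdio function not otherwise covered on this page.

```c
#include <stdio.h>

/* Hypothetical helper: add one line of text to the end of a file.
   Returns 0 on success, -1 if the file could not be opened. */
int append_line(const char *path, const char *text)
{
   FILE *f = fopen(path, "a");   /* "a": keep old contents, write at the end */
   if (f == NULL)
   {
      return -1;
   }
   fprintf(f, "%s\n", text);
   fclose(f);
   return 0;
}

/* Hypothetical helper: count the newline characters in a file.
   Returns -1 if the file could not be opened (e.g., it does not exist). */
int count_lines(const char *path)
{
   FILE *f = fopen(path, "r");
   int c;
   int lines = 0;
   if (f == NULL)
   {
      return -1;
   }
   c = fgetc(f);
   while (c != EOF)
   {
      if (c == '\n')
      {
         lines++;
      }
      c = fgetc(f);
   }
   fclose(f);
   return lines;
}
```

Calling append_line twice on a file that does not exist yet creates it and leaves two lines in it. Had the file been opened with “w” instead, the second call would have wiped out the first line.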

Once a file is open, you can use it for reading or writing, whatever is appropriate for how you opened it. In the program above, we need to start reading it. The loop uses another new function, feof, which is short for “file-end-of-file”. It tests whether an earlier read on the file has already run into the end of the file; it does not look ahead, so it only becomes true after a read attempt has come up empty. As usual in C, functions often return a 0 for success, or in this case, to mean the end-of-file has not yet been reached. You should read while (!feof(f)) as “while not end of file on f”.

In this program, we use a variant of scanf to read in the data. This version, fscanf, is exactly like the first but it reads from the designated file handle, which is its first argument. There’s even another variant that reads data from a string! The function prototypes for the three look like:

int  scanf (const char *format, ...);
int fscanf (FILE *infile, const char *format, ...);
int sscanf (const char *indata, const char *format, ...);

We’re using the second, and are asking for it to read in a single integer and assign the value to our variable i. All of the scanf functions return the number of data values that they read in and assigned, or EOF (a negative value) if the input ran out before anything could be read.

When we run out of integers in our input file, fscanf won’t be able to read any more and will return EOF, which is a negative value. (A return of 0 means it found something that is not an integer.) Running into the end of the file also sets the end-of-file condition that feof tests, so our loop comes to an end. At that point we need to close the file, and we use fclose for that. You should never use a file handle after closing it! Your program will probably crash.
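Because feof only reports what an earlier read ran into, one common pattern is to loop on the return value of fscanf and call feof afterwards to find out why the loop stopped. The sketch below does that; the helper name sum_ints is invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: add up the integers in an already opened file.
   Returns 0 if the whole file was read, -1 if reading stopped early
   because of data that is not an integer. */
int sum_ints(FILE *f, long *total)
{
   int value;

   *total = 0;
   while (fscanf(f, "%d", &value) == 1)   /* 1 = one value assigned */
   {
      *total += value;
   }
   if (feof(f))
   {
      return 0;    /* the last read ran into the end of the file */
   }
   return -1;      /* stopped on bad data before the end */
}
```

For a file containing “1 2 3” this returns 0 with a total of 6. For “4 5 oops 6” it returns -1 with a total of 9, since fscanf gives back 0 when it reaches “oops”.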

Neat Stuff, Sorta

It turns out that C and its standard libraries give you a great deal of power to compact your program. The sample program above is nice and clear, but in reality we are not using all the power of C. We could do:

#include <stdio.h>

int main()
{
   FILE *f;
   int i;
   if ((f = fopen("datafile.txt","r")) == NULL)
   {
      printf("Error: unable to open file\n");
      return 1;
   }
   while (fscanf(f,"%d",&i) > 0)
      printf("data = %d\n",i);
   fclose(f);
   return 0;
}

All I did was take advantage of the C language fact that an assignment is an expression whose value is the value being assigned (and thus you can both assign the value to f and compare it to NULL, with the right set of parentheses), and the fact that fscanf reports end-of-file itself through its return value (I read the man page to learn this).

While some compacting of a program is good, overdoing it is usually a bad thing. Indeed, for 15 years most C programming books taught programmers to open up files just like the compact form above, but now it is considered bad form to place an assignment inside a conditional expression. (And I will take off points for programs that use this form!)

Safe I/O Programming

When processing text files that contain line-based formatted data, using scanf and fscanf generally works, but I prefer to use a safer method: reading the line into a string and then scanning the data out of the string rather than directly out of the file.

Why? Firstly, since the scan functions do not recognize or care about newlines, if one line is badly formatted (say, with four integers rather than the expected three), the next scan will be reading across line boundaries, and all subsequent ones will too. By reading each line into a string and scanning the string, if one line is messed up it won’t affect all subsequent lines.

Secondly, early implementations of the C library I/O functions sometimes seemed to not work well – they worked, but sometimes they might hang thinking they need data even though the data is available. This is especially true when the “file” really isn’t a file but is data piped from another program.

So, below is an example of reading line-based input that is expected to have three integers per line, in my “safe” programming style:

#include <stdio.h>

int main()
{
   FILE *fin;
   char line[256];
   int data1, data2, data3, nread;
   fin = fopen("input.dat", "r");
   if (!fin)
   {
      fprintf(stderr, "Can't open input.dat\n");
      return 1;
   }
   while (fgets(line, sizeof(line), fin))
   {
      nread = sscanf(line, "%d %d %d", &data1, &data2, &data3);
      if (nread == 3)
         printf("line read: %d %d %d\n", data1, data2, data3);
      else
         printf("line read failed\n");
   }
   fclose(fin);
   return 0;
}

Summary of useful I/O functions (use the “man” command to learn more)

Standard I/O Function	Purpose	Return Value
`int printf(const char *format, ...)`	prints to stdout according to format and args	# of characters printed
`int scanf(const char *format, ...)`	read in data from stdin (use pointers!)	# of data values assigned (or EOF)
`FILE *fopen(const char *filename, const char *mode)`	open a file (r/w/a mode)	valid file handle or NULL
`int fclose(FILE *file)`	close a file	0 on success, EOF on error
`int fprintf(FILE *outfile, const char *format, ...)`	print output to a file	# of chars printed
`int fscanf(FILE *infile, const char *format, ...)`	read in data from file (use pointers!)	# of data values assigned (or EOF)
`int feof(FILE *f)`	test file for End-Of-File	0 if end-of-file not yet reached
`int fflush(FILE *file)`	make sure data is written!	0 on success, EOF otherwise
`size_t fread(void *data, size_t size, size_t nmemb, FILE *infile)`	read nmemb items of size bytes each from a file	number of items (not # of chars!)
`size_t fwrite(const void *data, size_t size, size_t nmemb, FILE *outfile)`	write nmemb items of size bytes each to a file	number of items (not # of chars!)
`char *fgets(char *str, int size, FILE *infile)`	read a line of data (up to size-1 characters) from a file	str on success, NULL on error/EOF
`int fputs(const char *str, FILE *outfile)`	write string out to a file	0+ on success, EOF on error
`int sprintf(char *outdata, const char *format, ...)`	print output to a char string	# of chars printed
`int sscanf(const char *indata, const char *format, ...)`	read in data from a string (use pointers!)	# of data values assigned (or EOF)
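fread and fwrite are the two functions in this table that the rest of the page does not demonstrate, so here is a minimal sketch of the “number of items” return value. The helper name roundtrip_ints is made up for the example. It uses tmpfile, a standard function that opens a scratch file which disappears when it is closed, and rewind, which moves back to the start of the file.

```c
#include <stdio.h>

/* Hypothetical helper: write n ints to a temporary file, then read them
   back into dst. Returns the number of ints read back (items, not bytes),
   or 0 if the temporary file could not be created. */
size_t roundtrip_ints(const int *src, int *dst, size_t n)
{
   size_t written;
   size_t got;
   FILE *tmp = tmpfile();

   if (tmp == NULL)
   {
      return 0;
   }
   written = fwrite(src, sizeof(int), n, tmp);    /* counts items written */
   rewind(tmp);                                   /* back to the start */
   got = fread(dst, sizeof(int), written, tmp);   /* counts items read */
   fclose(tmp);
   return got;
}
```

Passing an array of three ints returns 3 on success, even though 3*sizeof(int) bytes went through the file. Unlike fprintf, these functions copy the raw bytes of the values, so the file is not readable text.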

Other resources

The CPPReference site has a good section on C; just scroll down to the bottom of the main page. The C section includes a page on file I/O.