C Output and Input
This page presents plain C I/O, not C++ I/O.
Our book presents I/O in C++ that is very specific to C++. Since we are also learning C in this course, we are going to spend time learning “old-fashioned” C I/O. We will learn parts of it but certainly not all of it.
Unlike C++ I/O, which uses overloaded operators (don’t worry about what that means for now), C I/O uses plain old functions. The most commonly used ones are `printf` and `scanf`, although each has several variants.
C (and C++) has three predefined and preopened I/O channels:
- `stdin` is the input channel, and typically reads input from the keyboard, from a file, or from another program;
- `stdout` is the regular output channel, and typically prints to the screen, into a file, or to another program;
- `stderr` is the error output channel, and typically prints to the screen, into a file, or to another program.
Note that where the input and output is coming from or going to is dependent on how the program was run, not on how the program was written. The program doesn’t know the difference between the different sources.
Output using printf
The `printf` function prototype looks like:
`int printf( const char* format_string, ... )`
What that signature means is that it returns an integer value, the first parameter is a string, and the rest of the parameters are left unspecified!
How can that be?
`printf` is known as a function that can take a variable number of arguments. It has to take at least one, the format string, but it might then take 0 more, or 100 more.
How does `printf` know how many arguments were actually given to it in a call?
The short answer is: it doesn’t. (That should scare you!) The long answer is, the format string tells it.
The way `printf` works is that it takes the format string and starts printing it to `stdout`. But whenever it finds a special code in the format string, it processes the code, and that code might tell it to access the next argument (and the next, and the next, …).
Backslash formats
For example, here’s a call to `printf`:
printf("Hello world! Please say\t\"Hello\" back!\n");
This is a call that uses only the format string and no more arguments. But it does contain some special codes, each of which begins with a '\' character (a backslash). \t tells printf to print a tab character, \" tells it to print a double quote, and \n tells it to print a newline (i.e., end the current line of output and start a new one). In your program text, a tab character in a string would be hard to see, C does not allow line breaks in strings, and double quotes are used to begin and end strings, so each of these needs a different kind of representation. So the backslash character is used to tell C to treat the next character as special.
What if you want to print a backslash? Good question: you need to use two backslashes in your string! "\\" is a string with one backslash in it.
Note that, strictly speaking, the backslash escape is a feature of the C language (and C++ as well), not of the `printf` function. By the time the string is sent to `printf`, the backslash is gone: the compiler has already replaced each escape sequence with the proper character.
% formats
The backslash helps us print weird characters, but it doesn’t do anything to access more arguments. The % (percent) character does that.
For example, the call
printf("My first number is %d and my second %f\n", 42, 99.003);
will print out “My first number is 42 and my second 99.003000”, followed by a newline character (%f prints six digits after the decimal point unless you tell it otherwise). The “%d” means “take the next available argument as an integer and print it”, and “%f” means “take the next available argument as a floating point value and print it”. Every % format uses up an argument (except %%, which just prints a percent sign), and the next % format accesses the next argument. There is no other choice: you cannot reorder them. They must be processed left to right.
What if we had accidentally typed
printf("My first number is %d and my second %f\n", 99.003, 42);
What would happen? You might be tempted to think that the %d will find the next integer argument, but it doesn’t! On my computer, this prints out “My first number is 652835029 and my second 0.000000”. (Strictly speaking the behavior is undefined, so your computer may print something else.) This is the reason that C++ invented a new I/O mechanism! In C, `printf` and `scanf` are unsafe and rely on the programmer to get everything exactly right!
On the bright side, most modern C compilers can warn you of errors in your printf format, and GCC on Linux (if I use -Wall) does indeed tell me
print.c:5: warning: int format, double arg (arg 2)
print.c:5: warning: double format, different type arg (arg 3)
Some of the more popular formats are:
Format | Meaning |
---|---|
d | Signed decimal integer |
u | Unsigned decimal integer |
x | Hexadecimal integer |
f | Floating point value in decimal format |
e | Float value in exponential format |
g | Float value in either format |
c | A single character |
s | A C string |
p | A pointer value |
Length and position modifiers
Each format can take various flags and modifiers (for ‘c’, only the ‘-’ flag and the length are meaningful). Here we will only talk about the ‘-’ flag and the length and precision modifiers; the man page calls the length the “field width”. See the man page (“man 3 printf”) for much more detail.
If a flag such as ‘-’ is used, it must appear immediately after the % symbol (or another flag). Then if the length is present, it must appear next. If precision is used, it must appear after a decimal point, and after the flags and length if they are present. Finally, the original format symbol is specified. Thus, examples of valid signed decimal integer formats are: “%d”, “%-d”, “%4d”, “%4.3d”, “%.3d”, and “%-3.5d”.
The ‘-’ flag always means left justification. It is most useful with string formats. The length always means the minimum number of characters printed, with spaces being used as padding (on the left normally, or on the right with the ‘-’ flag). The precision has different meanings for different formats. For strings, it means the maximum number of characters to print, truncating the string if necessary. For integer formats, it means the minimum number of digits to print, using leading 0’s if needed. For floating point formats, it means the number of digits after the decimal point.
These modifiers let you control how your output is formatted. You can make nice reports with columns by using proper modifiers:
- For a string field, like a name, it is common to want left justification, and you should always specify an equal length and precision so that your column never overruns another. For example, printf("%-20.20s\n",MyStringVar) will print exactly 20 characters (plus the newline) every time.
- For integer fields, it is rare that you want leading zeros, and numbers are usually right justified, so a length modifier is all you need. However, you must make sure that it is at least as wide as the largest integer your data will have, otherwise you will get a misaligned column. E.g., printf("%6d",MyInt) will work fine if the variable will never have a value of a million or more (or below -99999, since the minus sign takes up a column too).
- For float/double fields, usually you want to control the precision as well as the length. For example, printf("%10.2f",MyFloat) will print two fractional digits, which might be nice for printing out dollar amounts. (Note: in a real financial application, you should not be using floating point values to represent money anyway!)
The scanf() input function
Input using `scanf` looks very similar to `printf`, but instead of printing values out, it is reading values in from `stdin`. The format string tells `scanf` what to look for in the input, and what values to assign to variables.
The big difference that causes the most problems for new C programmers is that when `printf` sees a “%d” in its format string, it expects an integer argument that it can process. But when `scanf` sees a “%d” in its string, it expects an argument that is a pointer to an integer variable, where it can store the incoming integer data that it finds. All `scanf` arguments must be pointers to variables!
For example, `scanf("val: %d",&MyIntVar)` would expect that the input would contain the characters “val: ”, and then some digit characters that it could convert into an integer value, which it then assigns to `MyIntVar`.
The `&` operator is a C/C++ operator that provides a pointer to the variable. We read it as “address of”.
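`scanf` can take most of the same formats that `printf` does, though not always with the same meaning: for example, in `scanf` a “%f” stores into a float variable, and you need “%lf” to store into a double. See “man 3 scanf” for details.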
File I/O
C has a library of functions, called Standard I/O, or `stdio`, that offer a large collection of ways to deal with input and output. `printf` and `scanf` are just two out of many.
For writing programs that need to explicitly read or write to files, here we’ll explain the basics of the most common mechanism. Here’s a simple example that we’ll use:
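#include <stdio.h>
int main(void)
{
    FILE *f;
    int i;
    f = fopen("datafile.txt","r");
    if (f == NULL)
    {
        printf("Error: unable to open file\n");
        return 1;
    }
    while (!feof(f))
    {
        /* fscanf returns 1 only when it actually stored an integer in i */
        if (fscanf(f,"%d",&i) == 1)
        {
            printf("data = %d\n",i);
        }
        else
        {
            break; /* end of file or non-numeric input: stop reading */
        }
    }
    fclose(f);
    return 0;
}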
Firstly, you need to define a variable of type `FILE*`. This is your handle that you use to refer to the file while it is open. You initialize it with an `fopen` call, which takes two string parameters. The first is just the filename. The second is “r” for read mode, “w” for write mode, and “a” for append mode. Warning: the “w” mode will erase any current file contents, if there are any! The append mode allows you to add to a current file (or create a new one if it doesn’t yet exist). These three modes are enough for this class, but there are others that allow you to do more, such as read and write from the same file.
You should always check the file handle for NULL-ness after you open a file! If the open failed and your program doesn’t check, it will most likely crash as soon as it tries to use the handle. In this course, you will lose points if you do not check.
Once a file is open, you can use it for reading or writing, whichever is appropriate for how you opened it. In the program above, we need to start reading it. The loop uses another new function, `feof`, which is short for “file-end-of-file”. It returns 0 as long as no read on the file has run into the end of the file, and nonzero once one has. Note that it does not look ahead: it only reports what an earlier read already found, which is why the program also checks the result of `fscanf`. You should read `while (!feof(f))` as “while not end of file on f”.
In this program, we use a variant of `scanf` to read in the data. This version, `fscanf`, is exactly like the first but it reads from the designated file handle, which is its first argument. There’s even another variant that reads data from a string! The function prototypes for the three look like:
int scanf (const char *format, ...);
int fscanf (FILE *infile, const char *format, ...);
int sscanf (const char *indata, const char *format, ...);
We’re using the second, and are asking for it to read in a single integer and assign the value to our variable `i`. All of the `scanf` functions return the number of data values that they read in and assigned, or the special value EOF if the input ran out before anything could be read.
When we run out of integers in our input file, `fscanf` won’t be able to read any more: it returns EOF (a negative value) at the end of the file, or 0 if the next characters are not a number. Hitting the end of the file also makes `feof(f)` nonzero, so the loop ends. At that point we need to close the file, and we use `fclose` for that. You should never use a file handle after closing it! Your program will probably crash.
Neat Stuff, Sorta
It turns out that C and its standard libraries let you compact your program a great deal. The sample program above is nice and clear, but in reality we are not using all the power of C. We could do:
#include <stdio.h>
int main(void)
{
    FILE *f;
    int i;
    if ((f = fopen("datafile.txt","r")) == NULL)
    {
        printf("Error: unable to open file\n");
        return 1;
    }
    /* keep going only while fscanf actually assigns a value */
    while (fscanf(f,"%d",&i) > 0)
        printf("data = %d\n",i);
    fclose(f);
    return 0;
}
All I did was take advantage of the C language fact that an assignment is an expression whose value is the value being assigned (and thus you can both assign the value to f and compare it to NULL, with the right set of parentheses), and the fact that `fscanf` will check for EOF itself and report it through its return value (I read the man page to learn this).
While some compacting of a program is good, overdoing it is usually a bad thing. Indeed, for 15 years most C programming books taught programmers to open up files just like the compact form above, but now it is considered bad form to place an assignment inside a conditional expression. (And I will take off points for programs that use this form!)
Safe I/O Programming
When processing text files that contain line-based formatted data, using `scanf` and `fscanf` generally works, but I prefer to use a safer method: reading the line into a string and then scanning the data out of the string rather than directly out of the file.
Why? Firstly, since the scan functions do not recognize or care about newlines, if one line is badly formatted (say, with four integers rather than the expected three), the next scan will be reading across line boundaries, and all subsequent ones will too. By reading each line into a string and scanning the string, if one line is messed up it won’t affect all subsequent lines.
Secondly, early implementations of the C library I/O functions sometimes seemed to not work well – they worked, but sometimes they might hang thinking they need data even though the data is available. This is especially true when the “file” really isn’t a file but is data piped from another program.
So, below is an example of reading line-based input that is expected to have three integers per line, in my “safe” programming style:
#include <stdio.h>
int main(void)
{
    FILE *fin;
    char line[256];
    int data1, data2, data3, nread;
    fin = fopen("input.dat", "r");
    if (!fin)
    {
        fprintf(stderr, "Can't open input.dat\n");
        return 1;
    }
    /* fgets returns NULL at end of file, which ends the loop */
    while (fgets(line, sizeof(line), fin))
    {
        nread = sscanf(line, "%d %d %d", &data1, &data2, &data3);
        if (nread == 3)
            printf("line read: %d %d %d\n", data1, data2, data3);
        else
            printf("line read failed\n");
    }
    fclose(fin);
    return 0;
}
Summary of useful I/O functions (use the “man” command to learn more)
Standard I/O Function | Purpose | Return Value |
---|---|---|
int printf(const char *format, ...) | print to stdout according to format and args | # of characters printed |
int scanf(const char *format, ...) | read in data from stdin (use pointers!) | # of data values assigned (or EOF) |
FILE *fopen(const char *filename, const char *mode) | open a file (r/w/a mode) | valid file handle or NULL |
int fclose(FILE *file) | close a file | 0 on success, EOF on error |
int fprintf(FILE *outfile, const char *format, ...) | print output to a file | # of chars printed |
int fscanf(FILE *infile, const char *format, ...) | read in data from file (use pointers!) | # of data values assigned (or EOF) |
int feof(FILE *f) | test file for End-Of-File | 0 if no read has hit End-Of-File yet, nonzero otherwise |
int fflush(FILE *file) | make sure data is written! | 0 on success, EOF otherwise |
size_t fread(void *data, size_t size, size_t nmemb, FILE *infile) | read nmemb items of size bytes each from a file | number of items read (not # of chars!) |
size_t fwrite(const void *data, size_t size, size_t nmemb, FILE *outfile) | write nmemb items of size bytes each to a file | number of items written (not # of chars!) |
char *fgets(char *str, int size, FILE *infile) | read a line of data (at most size-1 chars) from a file | str on success, NULL on error/EOF |
int fputs(const char *str, FILE *outfile) | write string out to a file | 0+ on success, EOF on error |
int sprintf(char *outdata, const char *format, ...) | print output to a char string | # of chars printed |
int sscanf(const char *indata, const char *format, ...) | read in data from a string (use pointers!) | # of data values assigned (or EOF) |
Other resources
The CPPReference site has a good section on C; just scroll down to the bottom of the main page. The C section includes a page on file I/O.