Monday, March 4, 2013
Origination of C-Programming Language
Tuesday, March 27, 2012
Chapter 14: What's Next?
This last handout contains a brief list of the significant topics in C which we have not covered, and which you'll want to investigate further if you want to know all of C.
Types and Declarations
We have not talked about the void, short int, and long double types. void is a type with no values, used as a placeholder to indicate functions that do not return values or that accept no arguments, and in the ``generic'' pointer type void * that can point to anything. short int is an integer type that might use less space than a plain int; long double is a floating-point type that might have even more range or precision than plain double.
The char type and the various sizes of int also have ``unsigned'' versions, which are declared using the keyword unsigned. Unsigned types cannot hold negative values but have guaranteed properties on overflow. (Whether a plain char is signed or unsigned is implementation-defined; you can use the keyword signed to force a character type to contain signed characters.) Unsigned types are also useful when manipulating individual bits and bytes, when ``sign extension'' might otherwise be a problem.
Two additional type qualifiers const and volatile allow you to declare variables (or pointers to data) which you promise not to change, or which might change in unexpected ways behind the program's back.
There are user-defined structure and union types. A structure or struct is a ``record'' consisting of one or more values of one or more types concatenated together into one entity which can be manipulated as a whole. A union is a type which, at any one time, can hold a value from one of a specified set of types.
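As a minimal sketch of both kinds of type (the names point, number, and translate are invented here for illustration):

```c
/* A structure groups several values into one entity. */
struct point
{
    int x;
    int y;
};

/* A union holds one value at a time, from a fixed set of types. */
union number
{
    int i;
    double d;
};

/* Structures can be passed to and returned from functions as a whole. */
/* The caller's copy is not modified, since p is passed by value. */
struct point translate(struct point p, int dx, int dy)
{
    p.x += dx;
    p.y += dy;
    return p;
}
```

Calling translate(p, 10, 20) on a point holding 1 and 2 yields a new point holding 11 and 22, and leaves p itself untouched.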
There are user-defined enumeration types (``enum'') which are like integers but which always contain values from some fixed, predefined set, and for which the values are referred to by name instead of by number.
Pointers can point to functions as well as to data types.
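A small sketch of a function pointer in use might look like this (add, mul, and apply are hypothetical names, not library functions):

```c
/* Two ordinary functions with the same signature. */
int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* op is a pointer to a function accepting two ints and returning int. */
int apply(int (*op)(int, int), int a, int b)
{
    return op(a, b);    /* call through the pointer */
}
```

The name of a function, used without parentheses, yields a pointer to it, so apply(add, 3, 4) calls add and apply(mul, 3, 4) calls mul.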
Types can be arbitrarily complicated, when you start using multiple levels of pointers, arrays, functions, structures, and/or unions. Eventually, it's important to understand the concept of a declarator: in the declaration
int i, *ip, *fpi();
we have the base type int and three declarators i, *ip, and *fpi(). The declarator gives the name of a variable (or function) and also indicates whether it is a simple variable or a pointer, array, function, or some more elaborate combination (array of pointers, function returning pointer, etc.). In the example, i is declared to be a plain int, ip is declared to be a pointer to int, and fpi is declared to be a function returning pointer to int. (Complicated declarators may also contain parentheses for grouping, since there's a precedence hierarchy in declarators as well as expressions: [] for arrays and () for functions have higher precedence than * for pointers.)
We have not said much about pointers to pointers, or arrays of arrays (i.e. multidimensional arrays), or the ramifications of array/pointer equivalence on multidimensional arrays. (In particular, a reference to an array of arrays does not generate a pointer to a pointer; it generates a pointer to an array. You cannot pass a multidimensional array to a function which accepts pointers to pointers.)
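To make the pointer-to-array point concrete, here is a sketch of a function that accepts a two-dimensional array (NCOLS and sum2d are names made up for this example):

```c
#define NCOLS 3

/* The parameter is a pointer to an array of NCOLS ints, */
/* not a pointer to a pointer. */
int sum2d(int (*a)[NCOLS], int nrows)
{
    int i, j, sum = 0;
    for(i = 0; i < nrows; i++)
        for(j = 0; j < NCOLS; j++)
            sum += a[i][j];
    return sum;
}
```

An array declared as int grid[2][NCOLS] can be passed directly as sum2d(grid, 2). Declaring the parameter as int **a instead would not match, and a compiler should warn about the call.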
Variables can be declared with a hint that they be placed in high-speed CPU registers, for efficiency. (These hints are rarely needed or used today, because modern compilers do a good job of register allocation by themselves, without hints.)
A mechanism called typedef allows you to define user-defined aliases (i.e. new and perhaps more-convenient names) for other types.
Operators
The bitwise operators &, |, ^, and ~ operate on integers thought of as binary numbers or strings of bits. The & operator is bitwise AND, the | operator is bitwise OR, the ^ operator is bitwise exclusive-OR (XOR), and the ~ operator is a bitwise negation or complement. (&, |, and ^ are ``binary'' in that they take two operands; ~ is unary.) These operators let you work with the individual bits of a variable; one common use is to treat an integer as a set of single-bit flags. You might define the 3rd (2**2) bit as the ``verbose'' flag bit by defining
#define VERBOSE 4
Then you can ``turn the verbose bit on'' in an integer variable flags by executing
flags = flags | VERBOSE;
or
flags |= VERBOSE;
and turn it off with
flags = flags & ~VERBOSE;
or
flags &= ~VERBOSE;
and test whether it's set with
if(flags & VERBOSE)
The left-shift and right-shift operators << and >> let you shift an integer left or right by some number of bit positions; for example, value << 2 shifts value left by two bits.
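The flag operations above can be gathered into a small sketch. DEBUG and the three helper functions are hypothetical, added only to show a second flag built with a shift:

```c
#define VERBOSE 4           /* the 2**2 bit, as above */
#define DEBUG   (1 << 3)    /* hypothetical second flag: the 2**3 bit */

/* Return flags with the given bit turned on. */
unsigned int set_flag(unsigned int flags, unsigned int bit)
{
    return flags | bit;
}

/* Return flags with the given bit turned off. */
unsigned int clear_flag(unsigned int flags, unsigned int bit)
{
    return flags & ~bit;
}

/* Return 1 if the given bit is set in flags, 0 otherwise. */
int test_flag(unsigned int flags, unsigned int bit)
{
    return (flags & bit) != 0;
}
```

Unsigned types are used here because, as mentioned earlier, they avoid sign-extension surprises when manipulating bits.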
The ?: or conditional operator (also called the ``ternary operator'') essentially lets you embed an if/then statement in an expression. The assignment
a = expr ? b : c;
is roughly equivalent to
if(expr)
    a = b;
else
    a = c;
Since you can use ?: anywhere in an expression, it can do things that if/then can't, or that would be cumbersome with if/then. For example, the function call
f(a, b, c ? d : e);
is roughly equivalent to
if(c)
    f(a, b, d);
else
    f(a, b, e);
(Exercise: what would the call
g(a, b, c ? d : e, h ? i : j, k);
be equivalent to?)
The comma operator lets you put two separate expressions where one is required; the expressions are executed one after the other. The most common use for comma operators is when you want multiple variables controlling a for loop, for example:
for(i = 0, j = 10; i < j; i++, j--)
A cast operator allows you to explicitly force conversion of a value from one type to another. A cast consists of a type name in parentheses. For example, you could convert an int to a double by typing
int i = 10;
double d;
d = (double)i;
(In this case, though, the cast is redundant, since this is a conversion that C would have performed for you automatically, i.e. if you'd just said d = i .) You use explicit casts in those circumstances where C does not do a needed conversion automatically. One example is division: if you're dividing two integers and you want a floating-point result, you must explicitly force at least one of the operands to floating-point, otherwise C will perform an integer division and will discard the remainder. The code
int i = 1, j = 2;
double d = i / j;
will set d to 0, but
d = (double)i / j;
will set d to 0.5. You can also ``cast to void'' to explicitly indicate that you're ignoring a function's return value, as in
(void)fclose(fp);
or
(void)printf("Hello, world!\n");
(Usually, it's a bad idea to ignore return values, but in some cases it's essentially inevitable, and the (void) cast keeps some compilers from issuing warnings every time you ignore a value.)
There's a precise, mildly elaborate set of rules which C uses for converting values automatically, in the absence of explicit casts.
The . and -> operators let you access the members (components) of structures and unions.
Statements
The switch statement allows you to jump to one of a number of numeric case labels depending on the value of an expression; it's more convenient than a long if/else chain. (However, you can use switch only when the expression is integral and all of the case labels are compile-time constants.)
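As a rough illustration, here is a switch that would otherwise need a long if/else chain (days_in_month is a made-up function, and it deliberately ignores leap years):

```c
/* Return the number of days in a month (1..12), ignoring leap years; */
/* returns 0 for an invalid month number. */
int days_in_month(int month)
{
    switch(month)
    {
    case 2:
        return 28;
    case 4: case 6: case 9: case 11:
        return 30;
    case 1: case 3: case 5: case 7: case 8: case 10: case 12:
        return 31;
    default:
        return 0;
    }
}
```

Several case labels can share one body, and default catches every value not listed. When a case does not end in return, it normally needs a break statement, or execution falls through into the next case.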
The do/while loop is a loop that tests its controlling expression at the bottom of the loop, so that the body of the loop always executes once even if the condition is initially false. (C's do/while loop is therefore like Pascal's repeat/until loop, while C's while loop is like Pascal's while/do loop.)
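One case where the body really must run at least once is counting digits, since even 0 has one digit. A sketch, with count_digits as an invented name:

```c
/* Count the decimal digits in n, assuming n >= 0. */
int count_digits(int n)
{
    int digits = 0;
    do
    {
        digits++;
        n /= 10;
    } while(n > 0);
    return digits;
}
```

Written with an ordinary while loop, this function would return 0 for an argument of 0.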
Finally, when you really need to write ``spaghetti code,'' C does have the all-purpose goto statement, and labels to go to.
Functions
Functions can't return arrays, and it's tricky to write a function as if it returns an array (perhaps by simulating the array with a pointer) because you have to be careful about allocating the memory that the returned pointer points to.
The functions we've written have all accepted a well-defined, fixed number of arguments. printf accepts a variable number of arguments (depending on how many % signs there are in the format string) but we haven't seen how to declare and write functions that do this.
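For the curious, the mechanism lives in the standard header <stdarg.h>. The following is only a sketch; sum_ints is a hypothetical function, and it relies on the caller passing an accurate count:

```c
#include <stdarg.h>

/* Sum `count` int arguments.  The caller must pass exactly `count` */
/* ints after the first argument; nothing checks this. */
int sum_ints(int count, ...)
{
    va_list ap;
    int i, total = 0;

    va_start(ap, count);            /* begin after the last named parameter */
    for(i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument as an int */
    va_end(ap);
    return total;
}
```

printf works on the same principle, except that it deduces the number and types of its arguments from the format string rather than from a count.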
C Preprocessor
If you're careful, it's possible (and can be useful) to use #include within a header file, so that you end up with ``nested header files.''
It's possible to use #define to define ``function-like'' macros that accept arguments; the expansion of the macro can therefore depend on the arguments it's ``invoked'' with.
Two special preprocessing operators # and ## let you control the expansion of macro arguments in fancier ways.
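A brief sketch of a function-like macro and of the # and ## operators follows. All three macro names, and the variable myvar, are invented for illustration:

```c
/* Function-like macro.  The parentheses guard against precedence */
/* surprises, but the argument is still evaluated twice. */
#define SQUARE(x) ((x) * (x))

/* # turns a macro argument into a string literal. */
#define STRINGIZE(x) #x

/* ## pastes two tokens together into one. */
#define PASTE(a, b) a ## b

int PASTE(my, var) = 7;     /* declares a variable named myvar */
```

Here SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)), and STRINGIZE(hello) expands to the string literal "hello". Because the argument is evaluated twice, something like SQUARE(i++) is best avoided.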
The preprocessor directive #if lets you conditionally include (or, with #else, conditionally not include) a section of code depending on some arbitrary compile-time expression. (#if can also do the same macro-definedness tests as #ifdef and #ifndef, because the expression can use a defined() operator.)
Other preprocessing directives are #elif, #error, #line, and #pragma.
There are a few predefined preprocessor macros, some required by the C standard, others perhaps defined by particular compilation environments. These are useful for conditional compilation (#ifdef, #ifndef).
Standard Library Functions
C's standard library contains many features and functions which we haven't seen.
We've seen many of printf's formatting capabilities, but not all. Besides format specifier characters for a few types we haven't seen, you can also control the width, precision, justification (left or right) and a few other attributes of printf's format conversions. (In their full complexity, printf formats are about as elaborate and powerful as FORTRAN format statements.)
A scanf function lets you do ``formatted input'' analogous to printf's formatted output. scanf reads from the standard input; a variant fscanf reads from a specified file pointer.
The sprintf and sscanf functions let you ``print'' and ``read'' to and from in-memory strings instead of files. We've seen that atoi lets you convert a numeric string into an integer; the inverse operation can be performed with sprintf:
int i = 10;
char str[10];
sprintf(str, "%d", i);
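Going the other way, sscanf can pick values out of a string. The sketch below uses two hypothetical helpers, parse_size and format_int; the second uses snprintf, a C99 relative of sprintf which takes the buffer size so that it cannot overflow the buffer:

```c
#include <stdio.h>

/* Parse "WIDTHxHEIGHT" (e.g. "640x480") from a string. */
/* Returns 1 on success, 0 if the string does not match. */
int parse_size(const char *s, int *w, int *h)
{
    /* sscanf returns the number of items successfully converted */
    return sscanf(s, "%dx%d", w, h) == 2;
}

/* Format an int into buf, which holds bufsize bytes. */
/* Returns the number of characters the full result needs. */
int format_int(char *buf, size_t bufsize, int value)
{
    return snprintf(buf, bufsize, "%d", value);
}
```

Checking sscanf's return value is the only way to tell whether the input actually matched the format.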
We've used printf and fprintf to write formatted output, and getchar, getc, putchar, and putc to read and write characters. There are also functions gets, fgets, puts, and fputs for reading and writing lines (though we rarely need these, especially if we're using our own getline and maybe fgetline; gets in particular cannot limit how much it reads, and was removed from the language in the 2011 C standard), and also fread and fwrite for reading or writing arbitrary numbers of characters.
It's possible to ``un-read'' a character, that is, to push it back on an input stream, with ungetc. (This is useful if you accidentally read one character too far, and would prefer that some other part of your program read that character instead.)
You can use the ftell, fseek, and rewind functions to jump around in files, performing random access (as opposed to sequential) I/O.
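One common use of these functions is finding the size of a file. This is a sketch of the usual idiom rather than a guarantee: file_size is a made-up name, the stream is assumed to be open in binary mode, and strictly speaking the C standard does not promise that seeking to the end of a binary stream is meaningful on every system.

```c
#include <stdio.h>

/* Return the size in bytes of an open binary stream, or -1 on error. */
/* The current position is restored before returning. */
long file_size(FILE *fp)
{
    long here, size;

    here = ftell(fp);                   /* remember where we are */
    if(here == -1L)
        return -1L;
    if(fseek(fp, 0L, SEEK_END) != 0)    /* jump to the end */
        return -1L;
    size = ftell(fp);                   /* offset of the end is the size */
    if(fseek(fp, here, SEEK_SET) != 0)  /* jump back */
        return -1L;
    return size;
}
```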
The feof and ferror functions will tell you whether you got EOF due to an actual end-of-file condition or due to a read error of some sort. You can clear errors and end-of-file conditions with clearerr.
You can open files in ``binary'' mode, or for simultaneous reading and writing. (These options involve extra characters appended to fopen's mode string: b for binary, + for read/write.)
There are several more string functions in <string.h>.
The header file
A host of mathematical functions are defined in the header file <math.h>.
There's a random-number generator, rand, and a way to ``seed'' it, srand. rand returns integers from 0 up to RAND_MAX (where RAND_MAX is a constant #defined in <stdlib.h>). One way of getting random integers from 1 to n is to call
(int)(rand() / (RAND_MAX + 1.0) * n) + 1
Another way is
rand() / (RAND_MAX / n + 1) + 1
It seems like it would be simpler to just say
rand() % n + 1
but this method is imperfect (or rather, it's imperfect if n is a power of two and your system's implementation of rand() is imperfect, as all too many of them are).
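Wrapped up as a function, the first method might look like the sketch below (random_1_to_n is an invented name; it assumes n is at least 1):

```c
#include <stdlib.h>

/* Return a pseudo-random integer from 1 to n inclusive, */
/* using the floating-point scaling shown above. */
int random_1_to_n(int n)
{
    return (int)(rand() / (RAND_MAX + 1.0) * n) + 1;
}
```

Dividing by RAND_MAX + 1.0 gives a value that is at least 0 and strictly less than 1, so the result can never exceed n. Calling srand once at the start of the program, for example with the current time, makes the sequence differ from run to run.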
Several functions let you interact with the operating system under which your program is running. The exit function returns control to the operating system immediately, terminating your program and returning an ``exit status.'' The getenv function allows you to read your operating system's or process's ``environment variables'' (if any). The system function allows you to invoke an operating-system command (i.e. another program) from within your program.
The qsort function allows you to sort an array (of any type); you supply a comparison function (via a function pointer) which knows how to compare two array elements, and qsort does the rest. The bsearch function allows you to search for elements in sorted arrays; it, too, operates in terms of a caller-supplied comparison function.
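Here is a sketch of both functions applied to an array of int. The names compare_ints, sort_ints, and find_int are hypothetical; qsort and bsearch themselves are declared in <stdlib.h>.

```c
#include <stdlib.h>

/* Comparison function in the form qsort and bsearch expect: */
/* negative, zero, or positive as *a is less than, equal to, */
/* or greater than *b. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Sort n ints into ascending order. */
void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof a[0], compare_ints);
}

/* Return a pointer to the element equal to key, or NULL if absent. */
/* The array must already be sorted. */
int *find_int(int key, int *a, size_t n)
{
    return bsearch(&key, a, n, sizeof a[0], compare_ints);
}
```

The comparison function receives pointers to the elements, not the elements themselves, which is why it casts and dereferences its arguments.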
Several functions--time, ctime, gmtime, localtime, asctime, mktime, difftime, and strftime--allow you to determine the current date and time, print dates and times, and perform other date/time manipulations. For example, to print today's date in a program, you can write
#include <time.h>
time_t now;
now = time((time_t *)NULL);
printf("It's %.24s", ctime(&now));
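When you want a format other than the fixed one ctime produces, strftime is the usual tool. A sketch, in which format_date is an invented helper:

```c
#include <time.h>

/* Format the time t as "YYYY-MM-DD" in local time into buf. */
/* Returns the number of characters written, or 0 if buf is too small. */
size_t format_date(char *buf, size_t bufsize, time_t t)
{
    struct tm *tp = localtime(&t);  /* break t down into year, month, ... */
    if(tp == NULL)
        return 0;
    return strftime(buf, bufsize, "%Y-%m-%d", tp);
}
```

A call such as format_date(buf, sizeof buf, time(NULL)) fills buf with today's date.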
The header file
There are facilities for dealing with multibyte and ``wide'' characters and strings, for use with multinational character sets.
Chapter 13: Reading the Command Line
We've mentioned several times that a program is rarely useful if it does exactly the same thing every time you run it. Another way of giving a program some variable input to work on is by invoking it with command line arguments.
(We should probably admit that command line user interfaces are a bit old-fashioned, and currently somewhat out of favor. If you've used Unix or MS-DOS, you know what a command line is, but if your experience is confined to the Macintosh or Microsoft Windows or some other Graphical User Interface, you may never have seen a command line. In fact, if you're learning C on a Mac or under Windows, it can be tricky to give your program a command line at all. Think C for the Macintosh provides a way; I'm not sure about other compilers. If your compilation environment doesn't provide an easy way of simulating an old-fashioned command line, you may skip this chapter.)
C's model of the command line is that it consists of a sequence of words, typically separated by whitespace. Your main program can receive these words as an array of strings, one word per string. In fact, the C run-time startup code is always willing to pass you this array, and all you have to do to receive it is to declare main as accepting two parameters, like this:
int main(int argc, char *argv[])
{
...
}
When main is called, argc will be a count of the number of command-line arguments, and argv will be an array (``vector'') of the arguments themselves. Since each word is a string which is represented as a pointer-to-char, argv is an array-of-pointers-to-char. Since we are not defining the argv array, but merely declaring a parameter which references an array somewhere else (namely, in main's caller, the run-time startup code), we do not have to supply an array dimension for argv. (Actually, since functions never receive arrays as parameters in C, argv can also be thought of as a pointer-to-pointer-to-char, or char **. But multidimensional arrays and pointers to pointers can be confusing, and we haven't covered them, so we'll talk about argv as if it were an array.) (Also, there's nothing magic about the names argc and argv. You can give main's two parameters any names you like, as long as they have the appropriate types. The names argc and argv are traditional.)
The first program to write when playing with argc and argv is one which simply prints its arguments:
#include <stdio.h>
int main(int argc, char *argv[])
{
    int i;
    for(i = 0; i < argc; i++)
        printf("arg %d: %s\n", i, argv[i]);
    return 0;
}
(This program is essentially the Unix or MS-DOS echo command.)
If you run this program, you'll discover that the set of ``words'' making up the command line includes the command you typed to invoke your program (that is, the name of your program). In other words, argv[0] typically points to the name of your program, and argv[1] is the first argument.
There are no hard-and-fast rules for how a program should interpret its command line. There is one set of conventions for Unix, another for MS-DOS, another for VMS. Typically you'll loop over the arguments, perhaps treating some as option flags and others as actual arguments (input files, etc.), interpreting or acting on each one. Since each argument is a string, you'll have to use strcmp or the like to match arguments against any patterns you might be looking for. Remember that argc contains the number of words on the command line, and that argv[0] is the command name, so if argc is 1, there are no arguments to inspect. (You'll never want to look at argv[i], for i >= argc, because it will be a null or invalid pointer.)
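The argument loop described above might be sketched like this. The -v option and the function count_files are hypothetical, chosen only to show one flag being separated from the other arguments:

```c
#include <string.h>

/* Scan argv for a hypothetical "-v" option flag.  Returns the number */
/* of non-option arguments, and sets *verbose to 1 if -v was present. */
int count_files(int argc, char *argv[], int *verbose)
{
    int i, nfiles = 0;

    *verbose = 0;
    for(i = 1; i < argc; i++)       /* skip argv[0], the program name */
    {
        if(strcmp(argv[i], "-v") == 0)
            *verbose = 1;
        else
            nfiles++;
    }
    return nfiles;
}
```

A real program would do something with each non-option argument instead of merely counting it, but the shape of the loop stays the same.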
As another example, also illustrating fopen and the file I/O techniques of the previous chapter, here is a program which copies one or more input files to its standard output. Since ``standard output'' is usually the screen by default, this is therefore a useful program for displaying files. (It's analogous to the obscurely-named Unix cat command, and to the MS-DOS type command.) You might also want to compare this program to the character-copying program of section 6.2.
#include <stdio.h>
int main(int argc, char *argv[])
{
    int i;
    FILE *fp;
    int c;
    for(i = 1; i < argc; i++)
    {
        fp = fopen(argv[i], "r");
        if(fp == NULL)
        {
            fprintf(stderr, "cat: can't open %s\n", argv[i]);
            continue;
        }
        while((c = getc(fp)) != EOF)
            putchar(c);
        fclose(fp);
    }
    return 0;
}
As a historical note, the Unix cat program is so named because it can be used to concatenate two files together, like this:
cat a b > c
This illustrates why it's a good idea to print error messages to stderr, so that they don't get redirected. The ``can't open file'' message in this example also includes the name of the program as well as the name of the file.
Yet another piece of information which it's usually appropriate to include in error messages is the reason why the operation failed, if known. For operating system problems, such as inability to open a file, a code indicating the error is often stored in the global variable errno. The standard library function strerror will convert an errno value to a human-readable error message string. Therefore, an even more informative error message printout would be
fp = fopen(argv[i], "r");
if(fp == NULL)
    fprintf(stderr, "cat: can't open %s: %s\n",
            argv[i], strerror(errno));
If you use code like this, you can #include <errno.h> to get the declaration for errno, and <string.h> to get the declaration for strerror.
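Put together as a self-contained sketch, with open_or_complain as a made-up helper name. One caveat: the C standard itself does not promise that fopen sets errno on failure, though POSIX systems do, so the reason printed is only as good as the system makes it.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>

/* Try to open a file for reading.  On failure, print a message that */
/* names the program, the file, and the reason, and return NULL. */
FILE *open_or_complain(const char *progname, const char *filename)
{
    FILE *fp = fopen(filename, "r");
    if(fp == NULL)
        fprintf(stderr, "%s: can't open %s: %s\n",
                progname, filename, strerror(errno));
    return fp;
}
```

The caller still has to check for a null pointer, but the message has already been printed.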
Chapter 12: Input and Output
So far, we've been calling printf to print formatted output to the ``standard output'' (wherever that is). We've also been calling getchar to read single characters from the ``standard input,'' and putchar to write single characters to the standard output. ``Standard input'' and ``standard output'' are two predefined I/O streams which are implicitly available to us. In this chapter we'll learn how to take control of input and output by opening our own streams, perhaps connected to data files, which we can read from and write to.
12.1 File Pointers and fopen
How will we specify that we want to access a particular data file? It would theoretically be possible to mention the name of a file each time it was desired to read from or write to it. But such an approach would have a number of drawbacks. Instead, the usual approach (and the one taken in C's stdio library) is that you mention the name of the file once, at the time you open it. Thereafter, you use some little token--in this case, the file pointer--which keeps track (both for your sake and the library's) of which file you're talking about. Whenever you want to read from or write to one of the files you're working with, you identify that file by using its file pointer (that is, the file pointer you obtained when you opened the file). As we'll see, you store file pointers in variables just as you store any other data you manipulate, so it is possible to have several files open, as long as you use distinct variables to store the file pointers.
You declare a variable to store a file pointer like this:
FILE *fp;
The type FILE is predefined for you by <stdio.h>. If you were reading from two files at once, you might declare two file pointers:
FILE *fp1, *fp2;
If you were reading from one file and writing to another you might declare an input file pointer and an output file pointer:
FILE *ifp, *ofp;
Like any pointer variable, a file pointer isn't any good until it's initialized to point to something. (Actually, no variable of any type is much good until you've initialized it.) To actually open a file, and receive the ``token'' which you'll store in your file pointer variable, you call fopen. fopen accepts a file name (as a string) and a mode value indicating among other things whether you intend to read or write this file. (The mode variable is also a string.) To open the file input.dat for reading you might call
ifp = fopen("input.dat", "r");
The mode string "r" indicates reading. Mode "w" indicates writing, so we could open output.dat for output like this:
ofp = fopen("output.dat", "w");
The other values for the mode string are less frequently used. The third major mode is "a" for append. (If you use "w" to write to a file which already exists, its old contents will be discarded.) You may also add a + character to the mode string to indicate that you want to both read and write, or a b character to indicate that you want to do ``binary'' (as opposed to text) I/O.
One thing to beware of when opening files is that it's an operation which may fail. The requested file might not exist, or it might be protected against reading or writing. (These possibilities ought to be obvious, but it's easy to forget them.) fopen returns a null pointer if it can't open the requested file, and it's important to check for this case before going off and using fopen's return value as a file pointer. Every call to fopen will typically be followed with a test, like this:
ifp = fopen("input.dat", "r");
if(ifp == NULL)
{
    printf("can't open file\n");
    exit(EXIT_FAILURE);    /* or return, as appropriate */
}
If fopen returns a null pointer, and you store it in your file pointer variable and go off and try to do I/O with it, your program will typically crash.
It's common to collapse the call to fopen and the assignment in with the test:
if((ifp = fopen("input.dat", "r")) == NULL)
{
    printf("can't open file\n");
    exit(EXIT_FAILURE);    /* or return, as appropriate */
}
You don't have to write these ``collapsed'' tests if you're not comfortable with them, but you'll see them in other people's code, so you should be able to read them.
12.2 I/O with File Pointers
For each of the I/O library functions we've been using so far, there's a companion function which accepts an additional file pointer argument telling it where to read from or write to. The companion function to printf is fprintf, and the file pointer argument comes first. To print a string to the output.dat file we opened in the previous section, we might call
fprintf(ofp, "Hello, world!\n");
The companion function to getchar is getc, and the file pointer is its only argument. To read a character from the input.dat file we opened in the previous section, we might call
int c;
c = getc(ifp);
The companion function to putchar is putc, and the file pointer argument comes last. To write a character to output.dat, we could call
putc(c, ofp);
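Combining getc and putc gives a sketch of a function that copies one open stream to another (copy_stream is a hypothetical name):

```c
#include <stdio.h>

/* Copy everything from ifp to ofp, one character at a time. */
/* Returns the number of characters copied. */
long copy_stream(FILE *ifp, FILE *ofp)
{
    int c;      /* int, not char, so that EOF can be distinguished */
    long n = 0;

    while((c = getc(ifp)) != EOF)
    {
        putc(c, ofp);
        n++;
    }
    return n;
}
```

Because stdin and stdout are themselves file pointers, copy_stream(stdin, stdout) would behave like a simple character-copying program.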
Our own getline function calls getchar and so always reads the standard input. We could write a companion fgetline function which reads from an arbitrary file pointer:
#include <stdio.h>
/* Read one line from fp, */
/* copying it to line array (but no more than max chars). */
/* Does not place terminating \n in line array. */
/* Returns line length, or 0 for empty line, or EOF for end-of-file. */
int fgetline(FILE *fp, char line[], int max)
{
    int nch = 0;
    int c;
    max = max - 1;    /* leave room for '\0' */
    while((c = getc(fp)) != EOF)
    {
        if(c == '\n')
            break;
        if(nch < max)
        {
            line[nch] = c;
            nch = nch + 1;
        }
    }
    if(c == EOF && nch == 0)
        return EOF;
    line[nch] = '\0';
    return nch;
}
Now we could read one line from ifp by calling
char line[MAXLINE];
...
fgetline(ifp, line, MAXLINE);
12.3 Predefined Streams
Besides the file pointers which we explicitly open by calling fopen, there are also three predefined streams. stdin is a constant file pointer corresponding to standard input, and stdout is a constant file pointer corresponding to standard output. Both of these can be used anywhere a file pointer is called for; for example, getchar() is the same as getc(stdin) and putchar(c) is the same as putc(c, stdout). The third predefined stream is stderr. Like stdout, stderr is typically connected to the screen by default. The difference is that stderr is not redirected when the standard output is redirected. For example, under Unix or MS-DOS, when you invoke
program > filename
anything printed to stdout is redirected to the file filename, but anything printed to stderr still goes to the screen. The intent behind stderr is that it is the ``standard error output''; error messages printed to it will not disappear into an output file. For example, a more realistic way to print an error message when a file can't be opened would be
if((ifp = fopen(filename, "r")) == NULL)
{
    fprintf(stderr, "can't open file %s\n", filename);
    exit(EXIT_FAILURE);    /* or return, as appropriate */
}
where filename is a string variable indicating the file name to be opened. Not only is the error message printed to stderr, but it is also more informative in that it mentions the name of the file that couldn't be opened. (We'll see another example in the next chapter.)
12.4 Closing Files
Although you can open multiple files, there's a limit to how many you can have open at once. If your program will open many files in succession, you'll want to close each one as you're done with it; otherwise the standard I/O library could run out of the resources it uses to keep track of open files. Closing a file simply involves calling fclose with the file pointer as its argument:
fclose(fp);
Calling fclose arranges that (if the file was open for output) any last, buffered output is finally written to the file, and that those resources used by the operating system (and the C library) for this file are released. If you forget to close a file, it will be closed automatically when the program exits.
12.5 Example: Reading a Data File
Suppose you had a data file consisting of rows and columns of numbers:
1 2 34
5 6 78
9 10 112
Suppose you wanted to read these numbers into an array. (Actually, the array will be an array of arrays, or a ``multidimensional'' array; see section 4.1.2.) We can write code to do this by putting together several pieces: the fgetline function we just showed, and the getwords function from chapter 10. Assuming that the data file is named input.dat, the code would look like this:
#define MAXLINE 100
#define MAXROWS 10
#define MAXCOLS 10
int array[MAXROWS][MAXCOLS];
char *filename = "input.dat";
FILE *ifp;
char line[MAXLINE];
char *words[MAXCOLS];
int nrows = 0;
int n;
int i;
ifp = fopen(filename, "r");
if(ifp == NULL)
{
    fprintf(stderr, "can't open %s\n", filename);
    exit(EXIT_FAILURE);
}
while(fgetline(ifp, line, MAXLINE) != EOF)
{
    if(nrows >= MAXROWS)
    {
        fprintf(stderr, "too many rows\n");
        exit(EXIT_FAILURE);
    }
    n = getwords(line, words, MAXCOLS);
    for(i = 0; i < n; i++)
        array[nrows][i] = atoi(words[i]);
    nrows++;
}
Each trip through the loop reads one line from the file, using fgetline. Each line is broken up into ``words'' using getwords; each ``word'' is actually one number. The numbers are however still represented as strings, so each one is converted to an int by calling atoi before being stored in the array. The code checks for two different error conditions (failure to open the input file, and too many lines in the input file) and if one of these conditions occurs, it prints an error message, and exits. The exit function is a Standard library function which terminates your program. It is declared in <stdlib.h>.