Assignment #5
Using Unix performance tools like time, gprof and gct for Performance Analysis
CS371 Spring 2000
Date Issued 02/15/'00
Date Due 02/22/'00








Introduction
 

At first check your .cshrc file to see if it includes the standard library and the c-tools, else add the following two lines:

        source /local/config/cshrc.standard
   source /local/config/cshrc.c-tools

Performance analysis tools provide a variety of analysis levels. Analysis levels vary from simple timing of a command to a
statement-by-statement analysis of a program. This laboratory covers three of the performance analysis tools. They are the time command, the gprof command and the gct command.
The time command  is used to display the execution time of a program.
The gprof command generates a time profile for the modules in a program, and a call graph showing what modules call which other modules, and which modules are called by other modules.
The gct command generates a detailed statement by statement analysis of a C program execution.

Use maxline.c from the lab3 as the sample program for running the performance analysis tools.
 

The time command

The time  commandis useful to measure the running time of a program.

It is recommended that you add to the LINUX part of your .cshrc file the following command :

 set time = (100, '%E elapsed  %U user  %S  system')

Then you can measure the time to execute your  program  using the command :

time  my_program

The program my_file just lists out all the files in the current directory that has been edited/created in the month of september. The output after using the time command is:

singapore:[64] time my_file
    intro.ps
    my_file
    test
    my_forloop

0:01.26 elapsed  0.400 user  0.520  system
singapore:[65]

The elapsed time is the total time(including CPU time, i/o time) taken to execute your program in seconds, user time is the time taken by the CPU to execute your program measured in seconds and system time is the time taken by the system to run your program also measured in seconds.

 To measure the runtime of the maxline program you can do the following:

 gcc -o maxline maxline.c
 time maxline < maxline.c

Here, maxline takes maxline.c as an input file and prints out the longest line in this file.  The typical output should be as shown below, though the time may vary depending on the terminal you are working on.

bear:[11]  gcc -o maxline maxline.c
bear:[12]  time maxline < maxline.c
Hello! This is an output of the program maxline.c
    for (i = 0; i < lim - 1 && (c = getchar ()) != EOF && c != '\n'; ++i)
0:00.02 elapsed  0.010 user  0.020  system
bear:[13]
 

gprof

gprof   enables you to determine which parts of a program are taking most of the execution time. Profiling allows you to learn where your program spent its time and which functions called which other functions while it was executing. This information can show you which pieces of your program are slower than you expected, and might be candidates for rewriting to make your program execute faster. It can also tell you which functions are being called more or less often than you expected. This may help you spot bugs that had otherwise been unnoticed.
Since the profiler uses information collected during the actual execution of your program, it can be used on programs that are too large or too complex to analyze by reading the source. However, how your program is running will affect the information that shows up in the profile data. If you don't use some feature of your program while it is being profiled, no profile information will be generated for that feature.

Profiling has several steps:

     You must compile and link your program with profiling enabled.
     You must execute your program to generate a profile data file.
     You must run gprof to analyze the profile data.
 

The result of the analysis is a file containing two tables, the flat profile and the call graph (plus blurbs which briefly explain the contents of these tables).

The flat profile shows how much time your program spent in each function, and how many times that function was called. The call graph shows, for each function, which functions called it, which other functions it called, and how many times. There is also an estimate of how much time was spent in the subroutines of each function. This can suggest places where you might try to eliminate function calls that use a lot of time.

First, compile the program with -pg option to make the program compiled for profiling. After running this program, there will be a file named gmon.out created, then we use the gprof command to view the result of the profile.The output of gprof command is voluminous. The two major items of output are the  `flat`  profile and the full call-graph profile. To use gprof, use the following commands  in succession:

gcc -pg -o maxline maxline.c
maxline < maxline.c
gprof maxline

The output would be as following:

    bear:[13] gcc -pg -o maxline maxline.c
    bear:[14] maxline < maxline.c
    Hello! This is an output of the program maxline.c
        for (i = 0; i < lim - 1 && (c = getchar ()) != EOF && c != '\n'; ++i)
    bear:[15]
    bear:[15] gprof maxline

    Flat profile:

    Each sample counts as 0.01 seconds.
     no time accumulated

  %   cumulative         self                self       total
    time      seconds   seconds    calls    Ts/call     Ts/call     name
    0.00      0.00       0.00        63       0.00        0.00      getline
    0.00      0.00       0.00        5        0.00        0.00      copy
    0.00      0.00       0.00        1        0.00        0.00      main

  %             the percentage of the total running time of the
  time          program used by this function.

  cumulative     a running sum of the number of seconds accounted
  seconds          for by this function and those listed above it.

  self           the number of seconds accounted for by this
  seconds        function alone.  This is the major sort for this
                 listing.

  calls          the number of times this function was invoked, if
                 this function is profiled, else blank.

  self           the average number of milliseconds spent in this
  ms/call        function per call, if this function is profiled,
                 else blank.

  total          the average number of milliseconds spent in this
  ms/call        function and its descendents per call, if this
                 function is profiled, else blank.

  name          the name of the function.  This is the minor sort
                for this listing. The index shows the location of
                the function in the gprof listing. If the index is
                in parenthesis it shows where it would appear in
                the gprof listing if it were to be printed.
 

         Call graph (explanation follows)
 

    granularity: each sample hit covers 4 byte(s) no time propagated

    index % time    self     children    called       name
                    0.00     0.00        63/63        main [3]
    [1]     0.0     0.00     0.00        63           getline [1]
-----------------------------------------------
                    0.00     0.00        5/5          main [3]
    [2]     0.0     0.00     0.00        5            copy [2]
-----------------------------------------------
                    0.00     0.00        1/1          _start [42]
    [3]     0.0     0.00     0.00        1            main [3]
                    0.00     0.00        63/63        getline [1]
                    0.00     0.00        5/5          copy [2]
-----------------------------------------------

     This table describes the call tree of the program, and was sorted by
     the total amount of time spent in each function and its children.

     Each entry in this table consists of several lines.  The line with the
     index number at the left hand margin lists the current function.
     The lines above it list the functions that called this function,
     and the lines below it list the functions this one called.
     This line lists:

     index              A unique number given to each element of the table.
                        Index numbers are sorted numerically.
                        The index number is printed next to every function name so
                        it is easier to look up where the function in the table.

     % time             This is the percentage of the `total' time that was spent
                        in this function and its children.  Note that due to
                        different viewpoints, functions excluded by options, etc,
                        these numbers will NOT add up to 100%.

     self               This is the total amount of time spent in this function.

     children           This is the total amount of time propagated into this
                        function by its children.

     called             This is the number of times the function was called.
                        If the function called itself recursively, the number
                        only includes non-recursive calls, and is followed by
                        a `+' and the number of recursive calls.

     name              The name of the current function.  The index number is
                       printed after it.  If the function is a member of a
                       cycle, the cycle number is printed between the
                       function's name and the index number.
 

     For the function's parents, the fields have the following meanings:

     self              This is the amount of time that was propagated directly
                       from the function into this parent.

     children          This is the amount of time that was propagated from
                       the function's children into this parent.

     called            This is the number of times this parent called the
                       function `/' the total number of times the function
                       was called.  Recursive calls to the function are not
                       included in the number after the `/'.

     name              This is the name of the parent.  The parent's index
                       number is printed after it.  If the parent is a
                       member of a cycle, the cycle number is printed between
                       the name and the index number.

     If the parents of the function cannot be determined, the word
     `<spontaneous>' is printed in the `name' field, and all the other
     fields are blank.

     For the function's children, the fields have the following meanings:

     self              This is the amount of time that was propagated directly
                       from the child into the function.

     children          This is the amount of time that was propagated from the
                       child's children to the function.

     called            This is the number of times the function called
                       this child `/' the total number of times the child
                       was called.  Recursive calls by the child are not
                       listed in the number after the `/'.

     name              This is the name of the child.  The child's index
                       number is printed after it.  If the child is a
                       member of a cycle, the cycle number is printed
                       between the name and the index number.

 If there are any cycles (circles) in the call graph, there is an
 entry for the cycle-as-a-whole.  This entry shows who called the
 cycle (as parents) and the members of the cycle (as children.)
 The `+' recursive calls entry shows the number of function calls that
 were internal to the cycle, and the calls entry for each member shows,
 for that member, how many times it was called from other members of
 the cycle.
 

Index by function name

 [2] copy                    [1] getline                 [3] main

bear:[16]
 

 gct

 The Generic Coverage Tool (GCT) is a freeware coverage tool that measures how thoroughly tests exercise C programs. It instruments C code to  implement branch coverage, multi-condition coverage, loop coverage, weak mutation coverage etc. By instrumenting a file we imply adding code to the program to increment the condition counts. The condition counter tells how often a program has reached the given coverage condition. You can also look up on the  online guide for gct .

GCT can measure the following:

- how often branches have been taken in either direction. (branch coverage)
- how often cases in switches have been taken.
- how often nested logical conditions have evaluated to true and false (multiple condition coverage)
- how often loops have iterated zero, one, and more than one times (loop coverage)
- how often the boundaries of relational operators have been probed (relational coverage)
- how often routines have been entered
- how often call sites have been exercised
- how often routines have had multiple threads of execution in them
- weak mutation coverage (a research technique).

If you use branch coverage, you will find the combination of branch, multiple condition, loop, and relational operator coverage to be more effective at little extra cost.  Routine and call coverage are useful for system testing.  Race coverage is useful for system testing of multiprocessor systems. The tool accumulates statistics about all the program runs.

This tool has been widely used in production environments, including testing the UNIX kernel.
 

MECHANISM

The tool takes C source code, adds instrumentation, then uses the system C compiler to produce an object file.  Alternately, the instrumented source can be captured and moved to another machine for compilation.  For efficiency's sake, coverage information is stored in an internal log.  The log is written on exit (or, in the case of an unending program like the UNIX kernel, extracted from the running program).

Instrumented programs run slower than the original programs.  How much slower depends on the program, how much of it is instrumented, and what types of coverage are measured.  An instrumented GCT runs 44% slower than normal. Reporting tools produce detailed or summary reports on missing coverage. They can also report on particular routines or files.

  USAGE

1.  First select a Master Directory where all your code  should be located. If you have previously compiled the code(without gct), delete all your object files (*.o).

2. Create a control file called gct-ctrl in the master directory. The file contains the following four elements, called top level directives, arranged in any order:

   option directives
 coverage directives
 file directives
logfile directives

NOTE: The content of the gct-ctrl file should be the sample file  mentioned below.

  Here is a sample gct-ctrl file.

3. In the master directory, use the following command:

    gct-init

   This will create the files gct-map, gct-defs.h, gct-ps-defs.h, gct-ps-defs.candgct-write.c in the master directory

4. Compile your code using gct (as a compiler).

    gct -o maxline maxline.c

   If you are using make file, you can use

    make CC=gct

   This will set the macro $(CC) to gct, so you must have used CC to set your compiler in the makefile.

5.  Run the program

     maxline < maxline.c

6.  See the report by running greport. You need to supply the name of the logfile GCTLOG as an argument. This file is generated automatically when you do a gct-init in the master directory. You have to provide the name of the file where you want the output of greport to be written, for eg. I have named it as report.

    greport GCTLOG > report

   The following command will display the values for all coverage conditions.

   greport -all GCTLOG > report

   A sample of the report is given below:

"maxline.c", line 21: routine main is never entered.  [1]
"maxline.c", line 21: call of printf (in main) never made.  [1]
"maxline.c", line 23: while was taken TRUE 62, FALSE 1 times.
"maxline.c", line 23: loop zero times: 0, one time: 0, many times: 1.
"maxline.c", line 23: call of getline (in main) never made.  [63]
"maxline.c", line 23: operator > might be <. (L!=R)  [62]
"maxline.c", line 23: operator > might be >=. (L==R)  [1]
"maxline.c", line 23: operator > needs boundary L==R+1.  [11]
"maxline.c", line 24: if was taken TRUE 5, FALSE 57 times.
"maxline.c", line 24: operator > might be <. (L!=R)  [62]
"maxline.c", line 24: operator > might be >=. (L==R)  [0]
"maxline.c", line 24: operator > needs boundary L==R+1.  [1]
"maxline.c", line 27: call of copy (in main) never made.  [5]
"maxline.c", line 29: if was taken TRUE 1, FALSE 0 times.
"maxline.c", line 29: operator > might be <. (L!=R)  [1]
"maxline.c", line 29: operator > might be >=. (L==R)  [0]
"maxline.c", line 29: operator > needs boundary L==R+1.  [0]
"maxline.c", line 30: call of printf (in main) never made.  [1]
"maxline.c", line 40: routine getline is never entered.  [63]
"maxline.c", line 40: for was taken TRUE 884, FALSE 63 times.
"maxline.c", line 40: loop zero times: 12, one time: 6, many times: 45.
"maxline.c", line 40: operator < might be >. (L!=R)  [947]
"maxline.c", line 40: operator < might be <=. (L==R)  [0]
"maxline.c", line 40: operator < needs boundary L==R-1.  [0]
"maxline.c", line 40: condition 1 (i, 2) was taken TRUE 947, FALSE 0 times.
"maxline.c", line 40: condition 2 (c, 2) was taken TRUE 946, FALSE 1 times.
"maxline.c", line 40: condition 3 (i, 1) was taken TRUE 946, FALSE 1 times.
"maxline.c", line 40: condition 4 (c, 1) was taken TRUE 884, FALSE 62 times.
"maxline.c", line 42: if was taken TRUE 62, FALSE 1 times.
"maxline.c", line 59: routine copy is never entered.  [5]
"maxline.c", line 60: while was taken TRUE 248, FALSE 5 times.
"maxline.c", line 60: loop zero times: 0, one time: 0, many times: 5.

 Assignment  (10 points)

Use the performance analysis tools (time,  gprof, gct) to analyze the C program (McCabe Complexity) you had written in lab3.

1.          Call the time command several times and get the average time. You may use a UNIX command, repeat, to execute the timing command several times easily.

            repeat 50 time  mccabe < inputfile.

          You can use a simple shell script or a simple awk program to run the time command 50 times on your program and to calculate the average. Else, you can compute the average manually.

2.        Use the  gprof tool to determine what parts of your program are executed most often. Provide the output.

3.        Run the C program on different input data such that the each branch of the program is executed at least once (use the gct tool in order to check whether all statements  of the program are executed at least once). The accumulated protocol should indicate that all the branches of the program are covered at least ones.
 

Submit the printouts of the results of running all these three tools and also the shell script or awk program of the first part of the assignment, via e-mail before midnight of the due date.
 
 

CS371 Home| HW List|