Skip to Content

PL: Subprograms, Functions, Procedures

From old CS 471 notes…

Subprograms

Fundamentals

Subprogram is the generic term for a subroutine, procedure, function, method, etc.

A subprogram has a single entry point. Not necessarily a single exit point.

Control is transferred to a subprogram by a call.

The caller suspends execution until the subprogram finishes, and control is always transferred back to the caller when the subprogram finishes. (anyone know of any exceptions?)

Typical definitions of procedure and function are: a function returns a value, a procedure does not (it simply executes the body).

C/C++ procedures must be declared as void functions.

Methods are procedures or functions that operate on an object, in an OOP language. (Except static methods, which are simply procedures or functions encapsulated within a class.)

The declaration of a subprogram defines its type signature or function prototype: that is, the number, order, and types of its parameters, and the type of value it returns (if any). These along with the name of the function make up its signature.

Parameters

Subprograms can take parameters.

Formal parameters refer to the defined parameters in the definition of the subprogram.

Actual parameters refer to the arguments provided in a particular call to the subprogram.

Most programming languages use positional parameter passing; the actual parameters match up to the formal parameters based on their position in the argument list.

Python and others can or do support keyword parameter passing, where the formal parameter name is used to indicate the actual parameter that corresponds to it. Order does not matter.

Unix commands often use keyword, position, and a combination of both. How?

Parameter Passing Methods

Parameters can have one of three modes:

  • in mode: the parameter communicates data to the subprogram.
  • out mode: the parameter communicates data from the subprogram.
  • inout mode: the parameter communicates data both to and from the subprogram.

Most programming languages do not directly support out mode. Rather, the programmer simply knows that s/he should not use that parameter’s initial value.

The return value of a function can be thought of as an out mode parameter.

Pass-by-Value: Our typical call-by-value.

Pass-by-Result: Only out mode, not typically supported.

Pass-by-Value-Result: Also known as copy-in, copy-out; not typically supported in languages.

Pass-by-Reference: Our typical call-by-reference.

Pass-by-Name: In essence, textual substitution; this is used in C macros.

Parameter passing implementation

Parameters are part of the activation record of a procedure call, and so typically they are passed on the run-time call stack.

  • Pass-by-value: actual value of parameter is placed on stack; parameter can be treated as a local variable, because any value change on the stack does not affect external program; C/C++ uses this for lot’s of things, Java uses this for built-in data types.
  • Pass-by-reference: a reference to the parameter value is placed on stack; reference can be changed as a local variable, but what it’s referring to remains in external scope. C arrays are inherently passed this way, everything else must be done by the programmer using pass-by-value pointers; C++ has notation for call-by-reference; in Java all objects are pass-by-reference.
  • Pass-by-value-result: Local storage must be available as in pass-by-value, so that the variable can be modified internally, then code must copy value back out at end of procedure. This mode is very common in remote procedure call distrubuted environments.

Passing subprograms as parameters

In many functional programming languages, there is no strong distinction between code and data.

E.g., in LISP (short for LISt Processor), both data and functions are lists of things. One can pass a function to a function just as easily as passing data.

There is an important notion here that is not mentioned in the book, and that is the idea of a first class entity.

From Wikipedia:First class object:

This includes: naming it, storing it, passing it as a parameter, evaluating it, creating a new one on the fly, etc.

In C/C++/Java, functions/methods are not first class entities.

Lots of scripting languages have an “eval” command, or something similar, which let’s you create code on the fly and then evaluate (execute) it.

What is the environment that the procedure parameter runs in? From textbook:

  • Shallow binding: the environment of the call statement that enacts the passed subprogram;
  • Deep binding: the environment of the definition of the passed subprogram;
  • Ad hoc binging: the environment of the call statement that passed the subprogram as an actual parameter.

Overloaded subprograms

Allowing the same name to be used multiple times to define multiple different procedures/functions/methods.

Something must be unique about each definition to distinguish it from others.

A good choice would be the entire function prototype: that is, the types of the parameters and the return type should all be used to distinguish between overloaded subprograms. (textbook: this is used in Ada)

C++/Java/C# unfortunately do not consider the return type for purposes of overloading. The textbook says that because these languages “allow mixed-mode expressions, the return type is irrelevant”. I disagree – it is just a little harder to use, but it certainly could have been.

Generic subprograms

Also known as templates in C++.

Supports parametric polymorphism, where some type expressions are used as parameters to decide the concrete forms of the subprogram.

Read textbook for differences between, e.g., C++ and Java.

Coroutines

Coroutines are a unique form of subprogram that allows what the book calls “quasi-concurrency”.

With coroutines, control switches back and forth between the routines, rather than the strict call-then-run-to-finish model of stack-based nested procedure calls.

Each coroutine is named and other coroutines explicitly use it by name.

Coroutines transfer control between each other with a resume CR statement. This stops the current coroutine and then resumes coroutine “CR” wherever it stopped previously.

If any one coroutine ends then all coroutines end, and control is returned back to the original caller.

Stacks and activation records

All of our general imperative (including OO) languages use a runtime call stack of activation records.

This nicely supports recursion.

Activation record contains:

  • Actual parameters
  • Return address
  • Saved execution state of caller
  • Local variables

Text adds dynamic link which is a pointer to the top of the stack for the caller’s state. This allows the callee to completely restore the stack for the caller.

C/C++ typically split the duties of restoring the stack: callee is responsible for removing locals and restoring execution state; caller is responsible for removing actual parameters.

C/C++ still have a dynamic link: the saved “frame pointer”. On x86, this is the saved %ebp register, “bp” meaning “base pointer” (same as frame pointer).

Nested subprograms and variable access

The dynamic links of active procedures form a dynamic chain,

Nested subprograms in languages using dynamic scoping can use the dynamic link to resolve non-local variables.

  • deep access: store variables in activation records, search the dynamic chain for non-local variables;
  • shallow access: store variables somewhere else (e.g., in a global table), lookup every variable from table.

But still need to find the variable in the activation record.

Nested subprograms in languages that use static scoping must also provide a static link to the enclosing subprogram.

The static links form a static chain.

Still need to find variable in the activation record, but since it is static, the compiler can generate a fixed offset (assuming data types have known sizes).

Compiler can also determine the number of static links to traverse.

Passing nested subprograms

If a subprogram (or subprogram “reference”) can be passed out of the enclosing context, in what environment does it execute?

Typically, it’s “original” execution environment must be saved as long as there is potential for it to execute. This is the root idea of a closure.

It is also discussed on the Wiki funarg problem page.