PL: Functional Programming • Jonathan Cook

From old CS 471 notes…

Functional Programming Languages

Programming Paradigms

Several (many?) large-scale organizing philosophies for how to specify computations have been devised: we call these programming paradigms.

Depending on who you talk to, the list can be very different.

See Wikipedia:Programming_paradigms for an “exhaustive” list.

Imperative: Computations are specified as a series of commands or actions that operate over the state of the program.

Declarative: Computations are specified as declarations of relations that must hold; lots of sub-paradigms in declarative, but we will focus on functional and logic.

functional: Computations are specified similar to mathematical functions, that soley declare a computational relationship between the parameters to the function and the result of the function; also follows mathematical rules of function composition.

logic: Computations are specified as logical relations among sets of things; logical inference rules are used to “compute” what logical relations are true and what values in the sets must be chosen to make the relations true.

Some people add object-oriented as another basic programming paradigm, but it is in some sense orthogonal to the three paradigms above.

object-oriented: A computation is organized as a set of objects which send messages to each other; since objects are opaque to each other, internal computation of an object (i.e., what it does when it receives a message) could be specified in imperative, functional, or logic paradigms.

Languages are typically centered on one programming paradigm, but multi-paradigm languages exist, most often in the context of interpreted languages.

Functional Programming

As said above, computations are specified similar to mathematical functions, that soley declare a computational relationship between the parameters to the function and the result of the function; also follows mathematical rules of function composition.

A pure mathematical function is simply a mapping between two sets: the input set of values and the output set of values.

Example: I = {1,2}, O = {3,4}, f = (1->3, 2->4)

Or, I = {1,2}, f(i in I) = i+2

Of course, the sets might possibly be infinite, so some sort of abstract specification might be needed to declare them and the functional mapping.

I = {Positive Integers}, f(i in I) = i + 2

Pure mathematical functions always map the same input to the same output: this is called referential transparency.

Referential transparency allows all sorts of optimizations to be performed on how to compute the function.

Example, suppose we had f(i in I) = i OP 2, where OP is a very expensive computation; we could cache the result of each unique i that the function is computed for, and if that same input is seen again, simply look up the result from the cache.

function composition: in math, functions can be composed with each other.

f(x) = x + 2

g(x) = 3 * x + 4

h(x) = f . g, or f(g(x))

Computational optimization: might evaluate g(x) separately when needed for a concrete x, or might compose the function bodies and optimize them down to (3 * x + 6) before any computation.

Variables

Major point: variables in mathematical functions are simply names for constant values, they do not change value as a computation is performed.

In imperative programming, a variable is a name that represents a memory cell that holds various values over the lifetime of the computation.

In functional programming, a variable is simply a name referring to a value from some set; it does not change over time.

Another way to look at it is, in functional programming a variable is just a parameter to some computation; inside the computation the parameter is constant.

Because of this, there are no assignment statements in functional languages!

Some functional languages “look like” they have variables and assignments; e.g., you might say something like:

  function g(x,y) = 
     let 
        v = f(x) + 42 * y + 17  
     in
        sqrt(v+12) - x % 3
     end
  end

But the variable v is simply short-hand notation so that the programmer can split up a complex computation and not write it all in one expression; the expression that v is set to can be substituted in the “let” block, and v is constant within that block.

Control structures

In imperative programming, the main control features are sequencing, selection, and iteration (repetition). These are typically found in statement sequencing, if-else and switch blocks, and loops, respectively.

In functional programming, instead of sequencing of actions (as imperative languages do), there is only the notion of computational nesting, e.g., a parenthesized sub-expression must be ultimately evaluated before the full expression.

And, e.g., f(g(x)) ultimately means that g(x) must be evaluated “before” f(), even if any optimizations ultimately compose their expressions.

Selection control is more like the C conditional operator “?:” rather than the “if-else” statement: the operator actually returns a value, whereas “if-else” simply controls execution flow.

This is like disjoint functions in math; it might look like:

  function f(x) = 
     if (x < 10) then
        2 * x + 9
     else
        x + 10
  end

The equivalent C would be:

int f(int x)
{
   return 
       (x < 10) ? 
          (2 * x + 9) 
        : (x + 10)
   ;
}

In pure functional languages, the idea of loops makes no sense because there is no underlying program state that changes for each loop iteration!

In other words, you cannot have any variable that changes values so that the loop condition ultimately changes from true to false.

So, for repetition, functional languages use recursion.

E.g., to sum an array we might do:

   function array_sum(A,i) = 
      if (i >= A.length) then
         0
      else
         A[i] + array_sum(A, i+1)
   end

Note that in this example, the variable i never changes in any single invocation!

It simply represents a different constant in each invocation.

Also, we are assuming that the array is constant as well.

The recursion above is in a form known as tail recursion, where the recursive call is the last thing in the function.

Good functional language systems can make tail-recursive functions just as efficient as loops in imperative languages.

Even GCC can optimize tail recursion, but only if the recursive call is the only thing in a return statement. GCC cannot optimize this:

int fact(int n)
{
   if (n <= 1)
      return 1;
   else
      return n * fact(n-1);
}

but it can and does optimize this:

int fact(int n, int fval)
{
   if (n <= 1)
      return fval;
   else
      return fact(n-1, n * fval);
}

By carrying the factorial value as second parameter so that the recursive call is the only thing in the return statement, with “-O3” optimization, GCC will remove the recursive call and transform the body of the function into a loop.

Higher order functions and map/reduce

Not many notes here, but we talked and did examples (in ML) of functions that create functions (kinda like what Javascript can do but a bit more powerful), and the map/reduce paradigm of computation that was invented within functional programming but is now being used heavily in large scale data-intensive environments (e.g., Google heavily depends on it, as does Yahoo).