

Understanding failure

We can distinguish five types of failure, each with its own set of causes and possible remedies.

Run-time errors

Run-time errors occur if we call a built-in predicate with a wrong argument pattern. This is usually either a type mismatch, e.g. using a number where an atom is expected, or an instantiation problem, e.g. passing a variable where a ground value is expected or vice versa. In this case the ECLiPSe system prints the offending call together with an error message indicating the problem. If we are lucky, we can identify the problem immediately; otherwise we may have to look up the documentation of the built-in to understand it.

In the following example bug0, we have called the predicate open/3 with an incorrect first argument.

    :-export(bug0/0).
    bug0:-
        open(1,read,S), % wrong
        read(S,X),
        writeln(X),
        close(S).

When we run the query bug0, the ECLiPSe system prints out the message:

type error in open(1, read, S)

In general, run-time errors are quite simple to locate and to fix. However, the system does not indicate which particular call of a built-in is to blame. For example, we may need to use the tracer to find out which of dozens of calls to the predicate is/2 caused a particular problem. Several things may help to avoid tedious tracing.

If the call to the predicate contains some variable name, we may be able to locate the problem by searching for that name in the source file(s).
Well-placed logging messages may also be helpful, since they indicate which part of the program is responsible.
If we have only applied a small change to a previously working program, then it is likely that the error is located close to the change we have made. Testing the program after each change will speed up development.
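
As a minimal sketch of the logging approach, the hypothetical predicate mean/3 below (not part of the original program) prints its arguments before any arithmetic is evaluated. If a call to is/2 then raises a run-time error, the last message printed shows which call was responsible and with which arguments.

```prolog
% mean/3 is a hypothetical example predicate, used only to illustrate logging.
mean(X, Y, M) :-
    writeln(entering_mean(X, Y)),   % log the call together with its arguments
    S is X + Y,                     % raises an error here if X or Y is unbound
    M is S / 2,
    writeln(leaving_mean(M)).       % reached only if both is/2 calls worked
```

Once the faulty call has been found, the logging goals can be removed or disabled again.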

Environment limits

These errors occur when we hit some limit of the run-time system, typically exceeding available memory. An example is the run-away recursion in the program bug1 below:

    :-export(bug1/0).
    bug1:-
        lp(X), % wrong
        writeln(X).

    lp([_H|T]):-
        lp(T).
    lp([]).

After some seconds, the system returns an error message of the form:

*** Overflow of the local/control stack!
You can use the "-l kBytes" (LOCALSIZE) option to have a larger stack.
Peak sizes were: local stack 13184 kbytes, control stack 117888 kbytes

Typical sources of this error are passing a free variable to a recursive predicate whose terminating clause is not tried first, or using an infinite data structure.
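
One possible repair for bug1, assuming the intent was simply to accept any proper list, is to place the terminating clause first. A free variable is then bound to [] before the recursive clause is tried. The names lp_safe/1 and bug1_fixed/0 are illustrative and do not appear in the original program.

```prolog
% Terminating clause first: a free argument is bound to [] immediately.
lp_safe([]).
lp_safe([_H|T]) :-
    lp_safe(T).

% Same shape as bug1, but the first solution of lp_safe/1 is found at once.
bug1_fixed :-
    lp_safe(X),
    writeln(X).
```

This only changes which solution is found first. If a later goal fails and forces backtracking into lp_safe/1 with a free argument, it will still generate ever longer lists, so instantiating the argument before the call remains the more robust fix.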

Failure

Probably the most common error symptom is a failure of the application. Instead of producing an answer, the system returns 'no'. This is caused by:

Calling a user-defined predicate with the wrong calling pattern. If none of the rules of a predicate matches the calling pattern, then the predicate will fail. Of course, quite often this is intentional (tests for some condition, for example). It becomes a problem if the calling predicate does not handle the failure properly. We should know for each predicate that we define whether it can fail, and make sure that any calling predicate handles that situation.
Calling a built-in predicate with arguments that make it fail unexpectedly. This is much less likely, but some built-in predicates can fail if the wrong arguments are passed.
Wrong control structure. A common problem is to omit the alternative branch in an if-then-else construct. If the condition part fails and there is no else part, then the whole call fails. We must always add an else part, perhaps with the empty statement true.
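
The control-structure problem can be sketched with two hypothetical predicates that differ only in the else part. Both names are made up for this illustration.

```prolog
% No else part: the whole construct fails whenever the condition fails.
report_unsafe(X) :-
    ( X > 0 -> writeln(positive) ).

% Explicit else part: the call succeeds whether or not the condition holds.
report_safe(X) :-
    ( X > 0 -> writeln(positive) ; true ).
```

With these definitions, report_unsafe(-1) fails, while report_safe(-1) succeeds without printing anything.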

The best way of finding failures is by code inspection, helped by logging messages which indicate the general area of the failure. If this turns out to be too complex, we may have to use the tracer.

Wrong answer

Rarer is the situation where a "wrong" answer is returned. This means that the program computes a result which we do not accept as an answer. The cause of the problem typically lies in a mismatch between the intended behaviour of the program (what we think it is doing) and its actual behaviour.

In a constraint problem we then have to identify which constraint(s) should eliminate this answer and why that didn't happen. There are two typical scenarios.

The simpler one is caused by missing a constraint altogether, or by misinterpreting the meaning of a constraint that we did include.
The more complex scenario is one where the constraint was stated but did not trigger, and was therefore not activated at some point during program execution.

We can often distinguish these problems by re-stating the constraint after the wrong answer has been found.

If the constraint still accepts the solution, then we can assume that it simply does not exclude this solution. We will have to reconsider the declarative definition of the constraint to reject this wrong answer.

If the program fails when the constraint is re-stated, then we have a problem with the dynamic execution of the constraints in the constraint solver. That normally is a much more difficult problem to solve, and may involve the constraint engine itself.
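
The check described above can be sketched with a small helper. The name restate/1 is hypothetical, and the ordinary comparison in the usage note stands in for a real constraint. In an actual program the goal would be the constraint in question, stated again once the variables hold the values of the wrong answer.

```prolog
% restate(+Constraint): call the constraint again on an already computed
% answer and report whether it accepts or rejects that answer.
restate(Constraint) :-
    ( call(Constraint) ->
        writeln(accepted)   % the constraint as defined does not exclude the answer
    ;
        writeln(rejected)   % the constraint should have excluded it during solving
    ).
```

For instance, restate(3 < 5) prints accepted and restate(5 < 3) prints rejected. The helper itself always succeeds, so it can be added after the search without changing the result of the program.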

Missing answer

Probably the most complex problem is a missing answer, i.e. the program produces correct answers, but not all of them. This assumes that we either know the missing answer, or know how many answers there should be for a particular problem and get a different count when we try to generate them all. In the first case we can try to instantiate our problem variables with the missing answer before stating the constraints and then check where this solution is rejected.
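
One way to see where a known answer is rejected is to instantiate the variables with it and then state the constraints one at a time. The helper first_rejecting/2 below is a hypothetical sketch of that idea, and the arithmetic tests in the usage note stand in for real constraints.

```prolog
% first_rejecting(+Goals, -Culprit): Culprit is the first goal in the list
% that fails. The predicate itself fails if every goal succeeds.
first_rejecting([Goal|Goals], Culprit) :-
    ( call(Goal) ->
        first_rejecting(Goals, Culprit)
    ;
        Culprit = Goal
    ).
```

For example, the query X = 1, Y = 2, Z = 2, first_rejecting([X < Y, Y < Z], C) binds C to 2 < 2, which points at the second goal as the one that rejects this assignment.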

This type of problem often occurs when we develop our own constraints and have made a mistake in one of the propagation rules. It should not occur if we use a pre-defined constraint solver.


Warwick Harvey
2004-08-07