Skip to Content

PL: Attribute Grammars

From old CS 471 notes…

Attribute Grammars

Basic context-free grammars give us nice “simple” methods for specifying the first-level syntax of a programming language.

But they cannot encode context-sensitive rules, such as “an integer expression must be used for an array index

Such rules are on the boundary between syntax and semantics. Syntax determines what a valid program is written as, and semantics determines what the program does.

The textbook calls such rules as the array index example above static semantics.

In programming languages and software engineering, static means something that can be determined without executing the program, while dynamic means something that happens during execution of the program.

Thus, static semantics are behavioral rules that can be determined by the compiler, statically, before the program is ever run.

Note that the more you can determine statically, the safer or more reliable your program will be.

Most (by far) static semantics involve type checking, that is, ensuring that the program uses the data types properly.

Attribute Grammars

Simply put, an attibuted grammar is a CFG that has attributes associated with terminal and non-terminal grammar symbols.

Grammar rules have attribute computation functions that determine how attribute values are computed and propagated through the parsing of a particular program.

Grammar rules also have predicate functions, which are boolean functions stating a particular constraint on attribute values that must hold (be true) for the program being parsed to be correct.

The predicate functions encode the static semantic rules of the language.

Attributes

Attributes are in one of two classes: synthesized attributes or inherited attributes.

Synthesized attributes pass information up a parse tree: an attribute’s value is computed from the attributes of the child parse nodes.

Inherited attributes pass information down and across a parse tree: an attribute’s value is computed from attributes of the parent node and/or children to the left (typically, to avoid circularity among children).

Intrinsic attributes are those attributes on the leaf nodes of the parse tree that are computed outside of the attribute grammar. For example, the type of a variable (identifier) is stored in the symbol table.

Example from textbook

Ensuring that the type of an expression is compatible with the type of the variable that is being assigned to in an assignment statement.

Two attributes: expected_type and actual_type

expected_type is an inherited attribute that is passed down to the expression node in the parse tree, which is the data type that is needed for the assignment.

actual_type is a synthesized attribute that is passed up from subexpressions, which is the actual data type being produced by the (sub-)expression.

Key information transfer: the expected_type of the expression is assigned to be the actual_type of the variable on the left hand side of the assignment.

This is OK because an inherited attribute can be propagated from attributes of siblings to the left.