Home AGI Book index Flemming Nielson - Principles of Program Analysis - 1999

Flemming Nielson - Principles of Program Analysis - 1999

History / Edit / PDF / EPUB / BIB /
Created: October 27, 2016 / Updated: August 30, 2025 / Status: finished / 4 min read (~767 words)
artificial-general-intelligence

One common theme behind all approaches to program analysis is that in order to remain computable (tractable?) one can only provide approximate answers
In general, we expect the program analysis to produce a possibly larger set of possibilities than what will ever happen during execution of the program
All program analyses should be semantics based: this means that the information obtained from the analysis can be proved to be safe (or correct) with respect to a semantics of the programming language
Program analysis should not be semantics direct: this would mean that the structure of the program analysis should reflect the structure of the semantics

An assignment (called a definition in the classical literature) of the form $[x := a]^\ell$ may reach a certain program point (typically the entry or exit of an elementary block) if there is an execution of the program where $x$ was last assigned a value at $\ell$ when the program point is reached

Classical data flow analyses:
- Available expressions
- Reaching definitions
- Very busy expressions
- Live variables

$init: \bf{Stmt \rightarrow Lab}$
- Returns the initial label of a statement
$final: \bf{Stmt \rightarrow \mathcal{P}(Lab)}$
- Returns the set of final labels in a statement

$blocks: \bf{Stmt \rightarrow \mathcal{P}(Blocks)}$
- Blocks is the set of statements, or elementary blocks
$labels: \bf{Stmt \rightarrow \mathcal{P}(Lab)}$
- The set of labels occurring in a program
$init(S) \in labels(S)$ and $final(S) \subseteq labels(S)$

$flow: \bf{Stmt \rightarrow \mathcal{P}(Lab \times Lab)}$
- The set of couples representing transitions between labels
$flow^R: \bf{Stmt \rightarrow \mathcal{P}(Lab \times Lab)}$
- The reverse flow
- $flow^R(S): \{(\ell,\ell') | (\ell',\ell) \in flow(S)\}$

$S_*$: the program that we are analysing
$\bf{Lab_*}$: the labels ($labels(S_*)$) appearing in $S_*$
$\bf{Var_*}$: the variables ($FV(S_*)$) appearing in $S_*$
$\bf{Blocks_*}$: the elementary blocks ($blocks(S_*)$) occurring in $S_*$
$\bf{AExp_*}$: the set of non-trivial arithmetic subexpressions in $S_*$. An expression is trivial if it is a single variable or constant

For each program point, which expressions must have already been computed, and not later modified, on all paths to the program point
An expression is killed in a block if any of the variables used in the expression are modified in the block
A generated expression is an expression that is evaluated in the block and where none of the variables used in the expression are later modified in the block
We are interested in the largest sets satisfying the equation for $AE_{entry}$

For each program point, which assigments may have been made and not overwritten, when program execution reaches this point along some path
An assignment is destroyed if the block assigns a new value to the variable
We are interested in the smallest sets satisfying the equation for $RD_{entry}$

An expression is very busy at the exit from a label if, no matter what path is taken from the label, the expression must always be used before any of the variables occurring in it are redefined
(The aim of the Very Busy Expressions Analysis is to determine) For each program point, which expressions must be very busy at the exit from the point
We are interested in the largest sets satisfying the equation for $VB_{exit}$

A variable is live at the exit from a label if there exists a path from the label to a use of the variable that does not re-define the variable
(The Live Variables Analysis will determine) For each program point, which variables may be live at the exit from the point
This analysis might be used as the basis for Dead Code Elimination
We are interested in the smallest sets satisfying the equation for $LV_{exit}$

Use-Definition chains or ud-chains: Links that, for each use of a variable, associate all assignments that reach that use
Definition-Use chains or du-chains: Links that, for each assignment, associate all uses
One application of ud- and du-chains is for Dead Code Elimination
Another application is in Code Motion (moving code around)

Shape Analysis will allow us to statically detect errors like deferencing a nil-pointer

Nielson, Flemming, Hanne R. Nielson, and Chris Hankin. Principles of program analysis. Springer, 1999.