Data Flow Analysis
All the optimization techniques we have learned earlier depend on data flow analysis. DFA is a technique used to know about how the data is flowing in any control-flow graph.
Example:
Forglobal common sub-expression elimination, we need to find the expression that computes the same value along with any available execution path of the program.
The Data-Flow Analysis Schema
In data flow analysis, a data flow value associates with every program point represents an abstraction for a set of all possible program states for that point.
The domain of this application is a set of possible data flow values.
IN[a]: The data flow value before the statement a.
OUT[a]: The data flow valueafter the statement a.
The main aim of the data flow problem is to find a set of constraints on the IN[a]’s and OUT[a]’s for statements a.
There are two sets of constraints: Transfer function and control-flow constraints.
Transfer Function
The semantic of the statement are the constraints for the data flow values before and after a statement.
For example, if variable x has the value y before executing any statement, say p = x. Then, after the statement’s execution, the value for both x and p will be y.
The transfer function is the relationship between the data flow values before and after the assignment statement.
Transfer function comes in two ways:
- Forward propagation along with execution path
- Backward propagation up the execution path
Forward Propagation:
The following points should be considered:
- In forward propagation, the transfer function for any statement s will be represented by Fs.
- This transfer function takes the data flow values before the statement and produces the output or new data flow value after the statement.
- So the new data flow values after the statement will be OUT[s] = Fs (IN[s]).
Backward propagation:
The following points should be considered:
- Backward propagation is the converse of forward propagation.
- This transfer function converts a data flow value after the statement to a new data flow value before the statement.
- So the new data flow values will be IN[s] = Fs (OUT[s]).
Control-Flow Constraints
The second set of constraints is derived from the flow of control. If block B consists of statements S1, S2, …….., Sn, then the control flow value of Si will be equal to the control flow values into Si + 1. Which is:
IN [ Si +1 ] = OUT [ Si ], for all i = 1 , 2, …..,n – 1.
Reaching Definition
The most common and useful data flow scheme is Reaching Definition. A definition D reaches the point P along with path following D to P such that D is not killed along the path.
Live-Variable Analysis
Here we know about the value of a variable at point p is used from point p to the end. It means that the variable is live at point p; otherwise variable is dead at point p.
This is used for register allocation for basic blocks.
Available Expressions
An expression a + b is said to be available at point p if all the path from entry node p evaluates a + b. There should not be any other assignment to a or b.
We can say that block kills expression a + b if it assigns a or b and does not computes a + b.
The main use of the available expression is to detect global common sub-expression.