# Predict Directive

## Introduction

Prediction is the process of forming a linear function of the vector of fixed and random effects in the linear model to obtain an estimated or predicted value for a quantity of interest. It is primarily used for predicting tables of adjusted means. If a table is based on a subset of the explanatory variables then the other variables need to be accounted for. It is usual to form a predicted value either at specified values of the remaining variables, or averaging over them in some way.

• Underlying principles
• Prediction process
• Prediction problems

## Predict syntax

The predict statement(s) may appear immediately after the model line (before or after any tabulate statements) or after the R and G structure lines. The syntax is
predict factors [ qualifiers ]
• predict must be the first element of the predict statement, commencing in column 1 in upper or lower case,
• factors is a list of the variables defining a multiway table to be predicted; each variable may be followed by a list of specific values to be predicted,
• the qualifiers, listed below, instruct ASReml to modify the predictions in some way,
• a predict statement may be continued on subsequent lines by terminating the current line with a comma,
• several predict statements may be specified.

The first step is to specify the classify set of explanatory variables after the predict directive.

## Predict qualifiers

The qualifier descriptions below use the following notation:
f is an explanatory variable which is a factor,
t is a list of terms in the fitted model,
v is a list of explanatory variables.

#### Controlling formation of tables

!ASSOCIATE facilitates prediction when there is a hierarchical structure to the levels of some factors.
!AVERAGE f [ weights]
is used to formally include a variable in the averaging set and to explicitly set the weights for averaging. Variables that only appear in random model terms are not included in the averaging set unless specified with the !AVERAGE, !ASSOCIATE or !PRESENT qualifiers. The default for weights is equal weights.
weights can be written compactly, for example {3*1 0 2*1}/5 represents the sequence 0.2 0.2 0.2 0 0.2 0.2. The string inside the curly braces is expanded first, and the expression n*v means n occurrences of v.
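The expansion rule for weights can be illustrated outside ASReml. The following Python sketch is not ASReml code; the function name `expand_weights` and the parsing details are assumptions made for illustration. It only mimics the two steps described above: expand the n*v terms inside the braces, then apply the divisor.

```python
import re
from typing import List


def expand_weights(expr: str) -> List[float]:
    """Expand a weight expression such as '{3*1 0 2*1}/5' into a list of floats.

    Hypothetical helper: it mirrors the rule described in the text and is not
    ASReml's own parser.
    """
    match = re.fullmatch(r"\s*\{([^{}]*)\}\s*(?:/\s*([0-9.]+))?\s*", expr)
    if match is None:
        raise ValueError(f"unrecognised weight expression: {expr!r}")
    body, divisor = match.group(1), match.group(2)

    values = []
    for token in body.split():
        if "*" in token:
            # n*v means n occurrences of v
            count, value = token.split("*", 1)
            values.extend([float(value)] * int(count))
        else:
            values.append(float(token))

    # The braces are expanded first; the optional divisor is applied afterwards.
    scale = float(divisor) if divisor is not None else 1.0
    return [v / scale for v in values]


print(expand_weights("{3*1 0 2*1}/5"))
```

For the expression in the text this gives 0.2 for every level except the fourth, which receives weight 0.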

!ASAVERAGE v is used to modify averaging of ASSOCIATED factors either by supplying an explicit set of weights for the base associated factor, or by listing the associated factors to be averaged in order. Otherwise, the levels of the base associated factor have equal weight.

!AVERAGE f ' file'[, n] When there are a large number of weights, it may be convenient to construct them in a file and retrieve them. The file is read in free format. If n is specified, the values are taken only from field n of the file.

!PARALLEL [ v]
without arguments means all classify variables are expanded in parallel. Otherwise list the variables from the classify set whose levels are to be taken in parallel.

!PRESENT v
is used when averaging is to be based only on cells with data. v is a list of variables and may include variables in the classify set. v may not include variables with an explicit !AVERAGE qualifier. ASReml works out what combinations are present from the design matrix. A second !PRESENT qualifier is allowed on a predict statement (but not with !PRWTS). This is needed when there are two nested factors such as sites within regions and genotype within family. The two lists must not overlap.

!PRWTS v
is used in conjunction with the first !PRESENT list to specify the weights that ASReml will use when averaging that !PRESENT table.

#### Controlling inclusion of model terms

!EXCEPT t
causes the prediction to include all fitted model terms not in t.

!IGNORE t
causes ASReml to set up a prediction model based on the default rules and then removes the terms in t. This might be used to omit the spline lack-of-fit term (!IGNORE fac(x)) from predictions, as in
```
 yield ~ mu x variety !r spl(x) fac(x)
predict x !IGNORE fac(x)
```
which would predict points on the spline curve averaging over variety.

!ONLYUSE t
causes the prediction to include only model terms in t. It can be used for example to form a table of slopes as in

```
 HI ~ mu X variety X.variety
predict variety X 1 !onlyuse X X.variety
```
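To see why this statement yields a table of slopes, note that predicting at X = 1 with only the X and X.variety terms gives, for each variety, the common slope plus that variety's slope deviation. The Python sketch below illustrates the arithmetic only; the coefficient values and the names `beta_x` and `x_by_variety` are hypothetical and not taken from any ASReml output.

```python
# Hypothetical estimates, invented for illustration only.
beta_x = 0.8                                           # common slope (term X)
x_by_variety = {"V1": 0.0, "V2": 0.15, "V3": -0.05}    # deviations (term X.variety)


def slope_table(x_value=1.0):
    """Prediction using only the X and X.variety terms at the given X value.

    At x_value = 1 each entry equals the slope for that variety.
    """
    return {v: x_value * (beta_x + dev) for v, dev in x_by_variety.items()}


for variety, slope in slope_table().items():
    print(variety, round(slope, 4))
```

Terms such as mu and variety are left out of the linear combination, so the intercepts do not contribute to the predicted values.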
!USE t
causes ASReml to set up a prediction model based on the default rules and then adds the terms listed in t.

#### Printing

!DEC [ n]
gives the user control of the number of decimal places reported in the table of predicted values where n is 0...9. The default is 4. G15.9 format is used if n exceeds 9.

!PLOT [ x]
instructs ASReml to attempt a plot of the predicted values. This qualifier is only applicable in versions of ASReml linked with the Winteracter Graphics library. If there is no argument, ASReml produces a figure of the predicted values as best it can. The user can modify the appearance by typing ESC to expose a menu or with the plot arguments.

!PRINTALL
instructs ASReml to print the predicted value, even if it is not of an estimable function. By default, ASReml only prints predictions that are of estimable functions.

!SED
requests all standard errors of difference be printed. Normally only an average value is printed.

!TDIFF
requests t-statistics be printed for all combinations of predicted values.
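The quantities requested by !SED and !TDIFF can be related to the variance matrix of the predicted values. The sketch below uses the standard result that the variance of a difference is v_ii + v_jj - 2 v_ij; it is written in Python for illustration, the predicted values and variance matrix are hypothetical, and it makes no claim about how ASReml lays out or summarises these numbers.

```python
import math
from itertools import combinations


def pairwise_sed_and_t(labels, predictions, vcov):
    """Return {(a, b): (difference, sed, t)} for every pair of predicted values."""
    results = {}
    for i, j in combinations(range(len(labels)), 2):
        # Variance of the difference between predictions i and j
        var_diff = vcov[i][i] + vcov[j][j] - 2.0 * vcov[i][j]
        sed = math.sqrt(var_diff)
        diff = predictions[i] - predictions[j]
        results[(labels[i], labels[j])] = (diff, sed, diff / sed)
    return results


# Hypothetical predicted values and their variance matrix
labels = ["A", "B", "C"]
predictions = [25.0, 27.0, 24.0]
vcov = [
    [4.0, 1.0, 0.5],
    [1.0, 5.0, 1.0],
    [0.5, 1.0, 3.0],
]

for pair, (diff, sed, t) in pairwise_sed_and_t(labels, predictions, vcov).items():
    print(pair, round(diff, 3), round(sed, 3), round(t, 3))
```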

!TURNINGPOINTS n
requests ASReml to scan the predicted values from a fitted line for possible turning points and, if any are found, report them and save them internally in a vector which can be accessed by subsequent parts of the same job using \$TPn. This was added to facilitate location of putative QTL.

!TWOSTAGEWEIGHTS
is intended for use with variety trials which will subsequently be combined in a meta-analysis. It forms the variance matrix for the predictions, inverts it and writes the predicted variety means with the corresponding diagonal elements of the inverse to the .pvs file. These values are used in some variety testing programs in Australia for a subsequent second stage analysis across many trials. A database is used to collect the results from the individual trials and write out the combined data set. The diagonal elements are used as weights in the combined analysis.
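The weight calculation described above (invert the variance matrix of the predictions and keep the diagonal elements) can be sketched as follows. This is a Python illustration with a hypothetical variance matrix; the function names are assumptions, and the code does not reproduce the .pvs file format.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I]
    aug = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half now holds the inverse
    return [row[n:] for row in aug]


def two_stage_weights(vcov):
    """Diagonal elements of the inverse of the prediction variance matrix."""
    inverse = invert(vcov)
    return [inverse[i][i] for i in range(len(vcov))]


# Hypothetical variance matrix for two predicted variety means
print(two_stage_weights([[2.0, 1.0], [1.0, 2.0]]))
```

When the predictions are uncorrelated the weights reduce to the reciprocals of the prediction variances; correlation between predictions changes the diagonal of the inverse.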

!VPV
requests that the variance matrix of predicted values be printed to the .pvs file.

## Examples

Examples are as follows:
```
 yield ~ mu variety !r repl
predict variety
```
is used to predict variety means in the NIN field trial analysis. Random repl is ignored in the prediction.
```
 yield ~ mu x variety !r repl
predict variety
```
predicts variety means at the average of x ignoring random repl.
```
 yield ~ mu x variety repl
predict variety x 2
```
forms the hyper-table based on variety and repl at the covariate value of 2 and then averages across repl to produce variety predictions.
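The two steps in this example, forming the hyper-table and then averaging over repl, can be sketched numerically. The Python code below is an illustration only: all effect estimates are hypothetical, equal weights are assumed for the averaging, and standard errors are not computed.

```python
from itertools import product

# Hypothetical fixed-effect estimates, invented purely for illustration.
mu = 20.0
beta_x = 1.5
variety_effects = {"A": 0.0, "B": 2.0, "C": -1.0}
repl_effects = {"1": 0.0, "2": 0.6, "3": -0.3}


def predict_variety_means(x_value):
    """Predict variety means at a covariate value, averaging equally over repl."""
    # Step 1: hyper-table with one cell per variety x repl combination
    cells = {
        (v, r): mu + beta_x * x_value + variety_effects[v] + repl_effects[r]
        for v, r in product(variety_effects, repl_effects)
    }
    # Step 2: average across repl (the averaging set) with equal weights
    n_repl = len(repl_effects)
    return {
        v: sum(cells[(v, r)] for r in repl_effects) / n_repl
        for v in variety_effects
    }


for variety, mean in predict_variety_means(2.0).items():
    print(variety, round(mean, 3))
```

Supplying unequal weights in step 2 would correspond to an explicit !AVERAGE repl with a weights list.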
```
 GFW Fdiam ~ Trait Trait.Year !r Trait.Team
predict Trait Team
```
forms, for each trait, the hyper-table based on Year and Team, with the linear combination in each cell using the Team and Year effects for that trait. Team predictions are produced by averaging over years.
```
 yield ~ variety !r site.variety
predict variety
```
will ignore the site.variety term in forming the predictions while
```
predict variety !AVERAGE site
```
forms the hyper-table based on site and variety with each linear combination in each cell using variety and site.variety effects and then forms averages across sites to produce variety predictions.