Predict Directive
Context: Prediction
Prediction process
ASReml parses the predict statement before fitting the model.
If any syntax problems are encountered, these are reported
in the .pvs file after which the statement is ignored: the job is
completed as if the erroneous prediction statement did not exist.
The predictions are formed as an extra process in the final iteration
and are reported to the .pvs file.
Consequently, aborting a run by creating the ABORTASR.NOW
file will cause any predict statements to be ignored, whereas
using FINALASR.NOW
will allow any predict statements to be honoured.
By default, factors are predicted at each level, simple covariates are
predicted at their overall mean and covariates used as a basis for
splines or orthogonal polynomials are predicted at their design
points.
The model terms mv and units are always ignored.
Prediction at particular values of a covariate or particular levels
of a factor is achieved by listing
the values after the variate/factor name. Where there is a sequence of values,
use the notation a b ... n to represent the sequence of values
from a to n with step size b-a. The default step size is 1
(in which case b may be omitted).
A colon (:) may replace the ellipsis (...). An increasing
sequence is assumed. When giving particular values for factors,
the default is to use
the coded level (1:n) rather than the label (alphabetical or integer).
To use the label, precede it with a quote (").
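The sequence notation can be illustrated with a short Python sketch. It is not ASReml's own parser: the function name expand_sequence is hypothetical, and the handling of an end value that is not reached exactly (the sequence simply stops below it) is an assumption.

```python
def expand_sequence(spec):
    """Expand 'a b ... n' or 'a ... n' into an explicit list of values.

    Hypothetical helper that mimics the notation described above.
    A colon is accepted in place of the ellipsis. A plain list of
    values (no ellipsis or colon) is returned unchanged.
    """
    tokens = spec.replace(":", " ... ").split()
    if "..." not in tokens:
        return [float(t) for t in tokens]

    pos = tokens.index("...")
    head = [float(t) for t in tokens[:pos]]
    tail = tokens[pos + 1:]
    if not 1 <= len(head) <= 2 or len(tail) != 1:
        raise ValueError("expected 'a ... n' or 'a b ... n'")

    start, end = head[0], float(tail[0])
    # Step size is b - a when b is given, otherwise the default of 1.
    step = head[1] - head[0] if len(head) == 2 else 1.0
    if step <= 0 or end < start:
        raise ValueError("only increasing sequences are supported")

    # Assumption: stop at the last value that does not exceed n.
    count = int((end - start) / step + 1e-9) + 1
    return [start + k * step for k in range(count)]


print(expand_sequence("1 3 ... 9"))  # [1.0, 3.0, 5.0, 7.0, 9.0]
print(expand_sequence("2 : 5"))      # [2.0, 3.0, 4.0, 5.0]
```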
The second step is to specify the averaging set. The default averaging
set is those explanatory variables involved in fixed effect model terms that are not
in the classifying set. By default variables that only define random
model terms are ignored. The
qualifier !AVERAGE allows these variables to be added to the
default averaging set.
The third step is to select the linear model terms to
use in prediction. The default is that all model terms based entirely on variables in the
classifying and averaging sets are used. Two qualifiers allow this default
to be modified by adding (!USE) or
removing (!IGNORE) model terms.
The qualifier !ONLYUSE explicitly specifies the model terms to use, ignoring all others.
The qualifier !EXCEPT explicitly specifies the model terms not to
use, including all others. These qualifiers may
implicitly modify the averaging set by including variables defining terms in the
predicted model not in the classify set. It is sometimes easier to specify
the classify set and the model terms to use and allow ASReml to
construct the averaging set.
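The selection rules can be sketched in Python as set operations. This is only an illustration of the rules as stated here, not ASReml's implementation: the function select_terms, its argument names, the use of "." to split an interaction term into its variables, and the treatment of mu as involving no variables are all assumptions made for the example.

```python
ALWAYS_IGNORED = {"mv", "units"}  # these terms are never used in prediction


def select_terms(model_terms, classify, averaging,
                 use=(), ignore=(), only_use=None, except_terms=None):
    """Return the model terms used for prediction (illustrative rules only)."""
    candidates = [t for t in model_terms if t not in ALWAYS_IGNORED]

    if only_use is not None:       # analogue of !ONLYUSE: exactly these terms
        return [t for t in candidates if t in set(only_use)]
    if except_terms is not None:   # analogue of !EXCEPT: everything but these
        return [t for t in candidates if t not in set(except_terms)]

    available = set(classify) | set(averaging)
    selected = []
    for term in candidates:
        # Assumption: "a.b" is an interaction of a and b; mu has no variables.
        variables = set(term.split(".")) - {"mu"}
        in_default = variables <= available
        # !USE adds a term to the default; !IGNORE removes one.
        if (in_default or term in use) and term not in ignore:
            selected.append(term)
    return selected


# Hypothetical split-plot style model used purely as an example.
terms = ["mu", "variety", "nitrogen", "variety.nitrogen",
         "blocks", "blocks.wplots", "units"]
print(select_terms(terms, classify=["variety"], averaging=["nitrogen"]))
print(select_terms(terms, ["variety"], ["nitrogen"], use=["blocks"]))
```

In this sketch the random terms blocks and blocks.wplots are left out by default because their variables are in neither set, which mirrors the default described above.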
The fourth step is to choose the weights to use when averaging over
dimensions in the hyper-table. The default is to simply average over the
specified levels, but the qualifier !AVERAGE factor weights
allows other weights to be specified.
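The effect of the weights can be shown with a small Python sketch. The function average_over and the example values are hypothetical, and the sketch assumes the weights are rescaled to sum to one before averaging.

```python
def average_over(cell_values, weights=None):
    """Average cell values over one dimension of the hyper-table.

    cell_values maps each level of the averaged factor to a value.
    weights maps the same levels to non-negative weights; when it is
    omitted, every level gets equal weight (a simple average).
    """
    levels = list(cell_values)
    if weights is None:
        weights = {level: 1.0 for level in levels}
    total = sum(weights[level] for level in levels)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Assumption: weights are normalised so that they sum to one.
    return sum(weights[level] * cell_values[level] for level in levels) / total


cells = {"A": 10.0, "B": 20.0, "C": 30.0}             # illustrative values
print(average_over(cells))                            # 20.0
print(average_over(cells, {"A": 2, "B": 1, "C": 1}))  # 17.5
```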
There are often situations in which the fixed effects
design matrix X is not of full column rank. These can be
classified according to the cause of aliasing:
1. linear dependencies among the model terms due to
over-parameterisation of the model,
2. no data present for some factor combinations so that the
corresponding effects cannot be estimated,
3. linear dependencies due to other, usually unexpected,
structure in the data.
The first type of aliasing is imposed by the parameterisation chosen
and can be determined from the model. The second type of aliasing can
be detected when setting up the design matrix for parameter estimation
(which may require revision of imposed constraints). The third type
can then be detected during the absorption of the mixed model
equations. Dependencies (aliasing) can be dealt with in several ways,
and ASReml checks that predictions are of estimable functions in the sense defined by
Searle (1971, p. 160) and are invariant to the constraint method used.
ASReml does not print predictions of non-estimable
functions unless the !PRINTALL qualifier is specified.
However, using !PRINTALL is rarely a satisfactory solution.
Failure to report
predicted values normally means that the predict statement is
averaging over some cells of the hyper-table that
have no information and therefore cannot be averaged in a meaningful way.
Appropriate use of the !AVERAGE
and/or !PRESENT
qualifiers will usually resolve the problem.
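Estimability can be made concrete with a small Python sketch. It uses the standard row-space characterisation (a linear function k'b of the fixed effects is estimable when k' is a linear combination of the rows of X) and checks it by comparing ranks. This is not necessarily how ASReml performs its check, and the helper names rank and is_estimable are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_estimable(X, k):
    """True if k lies in the row space of X (appending k leaves the rank unchanged)."""
    return rank(X + [k]) == rank(X)


# Over-parameterised one-way model: columns are mu, level 1, level 2.
X = [[1, 1, 0],
     [1, 1, 0],
     [1, 0, 1]]
print(is_estimable(X, [1, 1, 0]))   # mean of level 1: True
print(is_estimable(X, [0, 1, 0]))   # level 1 effect alone: False
print(is_estimable(X, [0, 1, -1]))  # difference between levels: True
```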
The !PRESENT qualifier enables the construction of means
by averaging only the estimable cells of the hyper-table.
It is regularly used for nested factors, for example locations nested in regions.
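The idea of averaging only over cells that are present can be sketched in Python for the nested example. The table, its values and the function mean_over_present are hypothetical; None stands for a region by location cell with no data.

```python
def mean_over_present(cells):
    """Average only the cells that hold a value; None marks an empty cell."""
    present = [value for value in cells.values() if value is not None]
    if not present:
        return None  # nothing to average for this row of the table
    return sum(present) / len(present)


# Locations nested in regions: each location occurs in one region only,
# so half of the region by location cells are necessarily empty.
table = {
    "north": {"loc1": 4.0, "loc2": 6.0, "loc3": None, "loc4": None},
    "south": {"loc1": None, "loc2": None, "loc3": 3.0, "loc4": 5.0},
}
region_means = {region: mean_over_present(cells) for region, cells in table.items()}
print(region_means)  # {'north': 5.0, 'south': 4.0}
```

Averaging over all four locations within each region would involve cells that carry no information, which is the situation described above in which no predicted value is reported.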