# Variance Models

## Factor Analysis models

FAk, FACVk and XFAk are different parameterizations of the factor analytic model in which S is modelled as S = GG' + P, where G is a matrix of k loadings on the covariance scale and P is a diagonal matrix of specific variances. See Smith et al. (2001) and Thompson et al. (2003) for examples of factor analytic models in multi-environment trials.

The general limitations are
• P may not include zeros except in the XFAk formulation,
• constraints are required in G for k > 1 for identifiability; typically, one zero is placed in the second column, two zeros in the third column, etc.,
• the total number of parameters fitted (kw + w - k(k-1)/2, where w is the order of S) may not exceed w(w+1)/2.
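The parameter-count limit above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of ASReml; the function names are hypothetical. It assumes the usual zero pattern described above (j - 1 zeros in column j), which is where the k(k-1)/2 term comes from.

```python
def constrained_zeros(k: int) -> int:
    """Zeros needed for identifiability: 0 in column 1, 1 in column 2, ..., k-1 in column k."""
    return sum(range(k))  # equals k(k-1)/2


def fa_parameter_count(w: int, k: int) -> int:
    """Parameters fitted by a k-factor model of order w: kw loadings + w variances - zeros."""
    return k * w + w - constrained_zeros(k)


def unstructured_parameter_count(w: int) -> int:
    """Parameters in an unstructured w x w covariance matrix: w(w+1)/2."""
    return w * (w + 1) // 2


def max_factors(w: int) -> int:
    """Largest k for which the factor analytic count does not exceed w(w+1)/2."""
    k = 0
    while k < w and fa_parameter_count(w, k + 1) <= unstructured_parameter_count(w):
        k += 1
    return k


for w in (4, 10, 87):
    k = max_factors(w)
    print(w, k, fa_parameter_count(w, k), unstructured_parameter_count(w))
```

For example, with w = 10 a two-factor model fits 29 parameters against 55 for the unstructured matrix, so it is well within the limit.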

#### Correlation form

FAk models the variance-covariance matrix S on the correlation scale as S = DCD, where
• D is diagonal such that DD = diag(S),
• C is a correlation matrix of the form FF' + E, where F is a matrix of k loadings vectors on the correlation scale and E is diagonal and is defined by difference,
• the parameters are specified in the order: loadings for each factor (F) followed by the variances (diag(S)); when k is greater than 1, constraints on the elements of F are required.

#### Covariance scale

FACVk models (CV for covariance) are an alternative formulation of FA models in which S is modelled as S = GG' + P, where G is a matrix of k loadings on the covariance scale and P is diagonal. The parameters in FACV
• are specified in the order: loadings (G) followed by specific variances (P); when k is greater than 1, constraints on the elements of G are required,
• are related to those in FA by G = DF and P = DED.
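The relationship between the two parameterizations can be verified numerically. The sketch below is illustrative only: the function names and the example values are made up, and it assumes that "E defined by difference" means E = I - diag(FF'), so that C has a unit diagonal. It builds S once as DCD and once as GG' + P with G = DF and P = DED, and the two results agree.

```python
import math


def outer(L):
    """Return L L' for a loadings matrix stored as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in L] for ri in L]


def fa_covariance(F, variances):
    """Correlation form: S = DCD with C = FF' + E."""
    d = [math.sqrt(v) for v in variances]      # D, with DD = diag(S)
    C = outer(F)
    for i in range(len(C)):
        C[i][i] = 1.0                          # adding E (by difference) gives a unit diagonal
    w = len(d)
    return [[d[i] * C[i][j] * d[j] for j in range(w)] for i in range(w)]


def fa_to_facv(F, variances):
    """Convert (F, diag(S)) to covariance-scale (G, diag(P))."""
    d = [math.sqrt(v) for v in variances]
    e = [1.0 - sum(f * f for f in row) for row in F]        # diag(E), assumed I - diag(FF')
    G = [[di * f for f in row] for di, row in zip(d, F)]    # G = DF
    p = [di * ei * di for di, ei in zip(d, e)]              # P = DED
    return G, p


def facv_covariance(G, p):
    """Covariance form: S = GG' + P."""
    S = outer(G)
    for i, pi in enumerate(p):
        S[i][i] += pi
    return S


# Hypothetical one-factor example with three levels.
F = [[0.8], [0.6], [0.5]]
variances = [4.0, 1.0, 2.25]
G, p = fa_to_facv(F, variances)
S_fa = fa_covariance(F, variances)
S_facv = facv_covariance(G, p)
print(S_fa)
print(S_facv)
```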

#### Extended form

XFAk (X for extended) is the third form of the factor analytic model and has the same parameterisation as FACV, that is, S = GG' + P. However, XFA models
• have parameters specified in the order diag(P) and vec(G); when k is greater than 1, constraints on the elements of G are required,
• may not be used in R structures,
• are used in G structures in combination with the xfa(f,k) model term,
• return the factors as well as the effects,
• permit some elements of P to be fixed to zero,
• are computationally faster than the FACV formulation for large problems when k is much smaller than w.

Special consideration is required when using the XFAk model. The SSP must be expanded to have room to hold the k factors. This is achieved by using the xfa(f,k) model term in place of f in the model. For example,

```
y ~ site !r geno.xfa(site,2)
0 0 1
geno.xfa(site,2) 2
geno
xfa(site,2) 0 XFA2  !GP
10*0.1   # Psi (Specific variances, assuming 10 sites)
```

In ASReml 3, if no loadings are fixed (i.e. !GP), ASReml rotates the loadings to orthogonality and holds the leading loadings of lower factors fixed. These loadings are, however, updated in the orthogonalization process that occurs at the beginning of each iteration, so the final returned values have not been formally rotated.

Finding the REML solutions for multifactor Factor Analytic models can be difficult. The first problem is specifying initial values. When using !CONTINUE and progressing from XFA(k) to XFA(k+1), ASReml 3 initialises the next factor at SQRT(P*0.4) and changes the sign of the (relatively) largest loading to negative.

```
!WORK 1 !NOGRAPH !continue
Title: ALBUS2tage.
#trial,year,region,variety,yield,rep,weight,ems
#KFA02BURU,2002,NSW,KIEV-MUTANT,0.873,3,2136.562,0.0010000
trial   !A
year    !I
region  !A
variety !A
yield
rep     *
weight  !*0.025
ems
!CYCLE 11 1 2 3 4
!DOPART $I
ALBUS2tage.csv  !SKIP 1   !MAXIT 40 !AILOAD 20

!PART 11
!MAXIT 25
yield !wt=weight ~ mu trial !r  trial.variety
1 1 1
0 !S2==0.025
trial.variety 2
trial 0 CORUH .1
87*.1
variety

!PART 1 2 3 4
yield !wt=weight ~ mu trial !r xfa(trial,$I).var
1 1 1
0 !S2==0.025
xfa(trial,$I).var 2
xfa(trial 0 XFA$I     !GP
87*.01
87*.07  87*.07   87*.07  87*.07
variety
```
A previous set of analyses using these five models, with the strategies listed above applied in separate runs, gave LogL values of 2782, 2910, 3021, 3109 and 3200 for the models CORUH, XFA1, XFA2, XFA3 and XFA4 respectively. Running this job using the integrated strategy produced LogL values of 2783, 2911, 3048, 3153 and 3206. However, for models XFA3 and XFA4, the LogL drifted away again.

The XFA display reported in the .res file has been revised. The current output from a small example with 9 environments and 2 factors is
```
DISPLAY of variance partitioning for XFA structure in xfa(Env,2).Geno
1 |                                       1         |   0.3339  79.7 0.0679 0.5147 0.0335
2 |                                               1 2   0.1666 100.0 0.0000 0.4003 0.0797
3 |                            1    2               |   0.2475  67.8 0.0798 0.3805 0.1514
4 |                                            1    2   0.1475 100.0 0.0000 0.3625 0.1269
5 |                                        1        2   0.4496 100.0 0.0000 0.6104 -0.278
6 |                     1                           2   0.1210 100.0 0.0000 0.2287 0.2622
7 |                    1     2                      |   0.4106  54.4 0.1872 0.4152 -0.226
8 |    1                                            2   0.0901 100.0 0.0000 0.0922 0.2857
9 |                           1                     2   0.1422 100.0 0.0000 0.2819 0.2506
0 |----+----+----+----+----+----+----+----+-- Average   0.2343  89.1 0.0372 0.3651 0.0763
```
In the figure, 1 marks the proportion of TotalVar explained by the first loading, and 2 marks the proportion explained by the first and second loadings together (provided it plots to the right of 1). Consequently, the distance from 2 to the right margin represents PsiVar. The numeric columns to the right of the plot give TotalVar, %expl, PsiVar and the two loadings; %expl reports the percentage of TotalVar explained by all loadings. The last row contains column averages.
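The numeric columns of the display can be reproduced from the loadings and the specific variance of each environment. The following Python sketch is not ASReml output code; the function name `xfa_partition` is hypothetical. It assumes TotalVar is the sum of squared loadings plus PsiVar, which agrees with the rows shown above to the printed precision.

```python
def xfa_partition(loadings, psi_var):
    """Partition the variance of one environment under an XFA model.

    Returns TotalVar, %expl, and the cumulative proportion of TotalVar
    explained after each successive loading (the plot positions of 1, 2, ...).
    """
    explained = [g * g for g in loadings]          # variance due to each factor
    total_var = sum(explained) + psi_var           # diagonal element of GG' + P
    pct_explained = 100.0 * sum(explained) / total_var
    cumulative = []
    running = 0.0
    for part in explained:
        running += part
        cumulative.append(running / total_var)
    return total_var, pct_explained, cumulative


# Environment 1 from the display: loadings 0.5147 and 0.0335, PsiVar 0.0679.
total_var, pct_explained, cumulative = xfa_partition([0.5147, 0.0335], 0.0679)
print(f"{total_var:.4f} {pct_explained:.1f}")  # -> 0.3339 79.7
```

The distance from the last cumulative proportion to 1 is the share of TotalVar attributed to PsiVar, which is what the gap between 2 and the right margin shows in the plot.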