General info commands
The following help topics are reference information only (not
commands).
Basic usage steps
This is a quick reference list of the steps needed to start an
analysis with SIBS. Steps in *CAPS* are common to all data sets;
*lower case* steps are for specific types of data. To get more
specific usage or analysis information for particular steps type 'help
'.
1. *LOAD MARKER DATA* ('load markers ') The locus file
can be in either Linkage format or our own internal format.
2. *SET MAP* ('set map marker1 dist1 marker2 dist2 marker3...') This
step can be SKIPPED if the map was given in the loci file. If all
distances between markers are less than .5 distances will be assumed
to be recombination fractions, otherwise distances will be interpreted
as cM. Typing 'set map' with no arguments will list the current map.
3. *sex-linked status* ('sex linked ') If data is from
markers on the X chromosome, this setting must be turned on BEFORE
loading the pedigrees. All males must be homozygous for X markers.
You cannot load a mixture of X-linked and autosomal markers in the
same session. Affected males and females are both analyzed in
commands that run off affectation status (estimate, infomap, and
exclude) but in this version only phenotyped MALES are looked at in
the other commands (although females are used to set maternal phase).
4. *PREPARE PEDIGREES* ('prepare pedigrees ') This
command loads a Linkage-format pedigree file with genotypes and checks
for formatting and non-Mendelian inheritance errors. PHENOTYPE data
is also loaded as part of the 'prepare' command (for more information
on loading phenotype data and the FORMAT of the file see 'help prepare
pedigrees'). After loading data, specific pedigrees can be
removed/selected with the commands 'select pedigrees' and 'remove
pedigrees'. If you have sibships with more than two affected sibs you
can use the 'pairs used' command to select which pairs will be used
(if you decide to use multiple pairs from the same sibship, be sure to
read the help under the 'pairs used' command, as the correct weighting
for multiple sibpairs is not known).
5. *set scanning increments* ('increment distance ' or
'increment steps ') The default is to scan every 1 cM; use
this command to change the scanning increments along the map.
6. *SCAN* ('scan') This command computes the probability of each
pair of selected sibs sharing 0, 1, or 2 alleles IBD (0 or 1 alleles
IBD in the case of X-linked data), which is needed for ALL the
analysis commands. 'Scan' must be run after any settings related to
the map change, or if the pedigrees/pairs selected change.
At this point you are ready to run any of the qualitative and
quantitative analysis commands ('estimate', 'exclude', 'infomap' and
'haseman elston', 'nonparametric', 'ml variance', and 'penetrance'
respectively). All commands other than loading markers and pedigrees
can be repeated at any time during the session. If you just type
'help' you will get a list of a variety of flags ('postscript on/off',
'single point on/off') that are available to tailor the session to
your particular needs.
Other helpful notes on running the program:
*command abbreviations (e.g., 'prep ' instead of 'prepare
pedigrees ') In addition to the abbreviations listed for
each command, commands can be shortened to just a few unique letters.
*default answers to input questions are provided in [] and
are accepted just by pressing the Return or Enter key.
*commands can be safely broken out of by typing .
*the path for default filenames is the path from which the data was loaded
*the program provides a reverse-video message as to where it is in
the analysis, if this is not being properly flushed from your screen
with each update, try typing 'setenv TERM vt100' at your UNIX shell
prompt.
@journal reference
How to cite MAPMAKER/SIBS:
L. Kruglyak and E. Lander. (1995) "Complete Multipoint Sib Pair
Analysis of Qualitative and Quantitative
Traits". Am. J. Hum. Genet. 57:439-454.
@copyright
Copyright (c) 1995
Whitehead Institute for Biomedical Research. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution. Redistributions of
source code must also reproduce this information in the source code itself.
2. If the program is modified, redistributions must include a notice
(in the same places as above) indicating that the redistributed program is
not identical to the version distributed by Whitehead Institute.
3. All advertising materials mentioning features or use of this
software must display the following acknowledgment:
This product includes software developed by the
Whitehead Institute for Biomedical Research.
4. The name of the Whitehead Institute may not be used to endorse or
promote products derived from this software without specific prior written
permission.
We request that users of this software inform us by sending email to
software_registration@genome.wi.mit.edu (a form can be found by typing
'help registration' in the program).
We also request that use of this software be cited in publications as:
L. Kruglyak and E. Lander. (1995) "Complete Multipoint Sib Pair
Analysis of Qualitative and Quantitative
Traits". Am. J. Hum. Genet. 57:439-454.
THIS SOFTWARE IS PROVIDED BY THE WHITEHEAD INSTITUTE ``AS IS'' AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE WHITEHEAD INSTITUTE BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
@registration
-------------------------------------------------------------------------
SOFTWARE USAGE E-MAIL NOTIFICATION FORM
We appreciate your interest in the software developed at the MIT
Genome Center, and we would be grateful if you could drop us a note
telling us that you are using it (this also ensures that you will
receive news of any updates). Could you please take a moment to fill
out this form and mail it to software_registration@genome.wi.mit.edu?
<---------------------- CUT HERE ------------------>
SOFTWARE PACKAGE:
YOUR NAME:
INSTITUTION:
MAIL ADDRESS:
PHONE NUMBER:
FAX NUMBER:
E-MAIL ADDRESS:
WHERE DID YOU HEAR ABOUT THIS SOFTWARE?:
WHERE DID YOU OBTAIN IT FROM?:
ADDITIONAL COMMENTS WELCOME:
This program was developed at the Whitehead Institute for Biomedical
Research, 1995
@bug reports
Please send bug reports (or any comments about new features that might
be needed) to:
mpreeve@genome.wi.mit.edu
Please send a sample of the input that causes the error to appear, and put the
'photo' option on to record the commands used so that we can duplicate
the error. Thanks!
@TOPIC loading data and pedigrees
Commands to set up your data for the analysis commands.
@load markers
This command reads in the marker-locus data (allele frequencies for
each genetic marker and the map if given in the file). The file may
be in one of two formats:
1. Linkage parameter file (output from PREPLINK)
See the file sample.loc as an example
2. Our internal marker information format (which contains the same
information and also includes map distance between markers).
See the file internal.loc as an example.
This should be the first command of any SIBS session as this
information is required by every subsequent step in the mapping
process.
If all the distances in the map are < .5, the map will be interpreted
as being in recombination fractions, otherwise, centimorgans will be
assumed. For our internal file format map distances are required as
part of correct input. Distances in the Linkage format are specified
after the "<< SEX DIFFERENCE, INTERFERENCE" line near the bottom of
the file. (The current map can be checked/updated at any time by
using the 'set map' command. See 'help set map' for more
information.)
If the parents are untyped, marker allele frequencies become more
important, and misspecified frequencies can cause false positives.
(Underestimating the frequency of an allele will lead to
overestimating the degree of IBD sharing at the locus.) See 'help
allele frequencies' for more information on testing how sensitive your
results are to changes in allele frequencies.
Note -- It is usually better to enter recombinationally inseparable
markers with a cM distance of .1, because a recombination between
them in your data set would otherwise cause a likelihood = 0 error.
@sex linked
*Setting up for X-linked data
sibpair:1> sex linked on
data will be analyzed in sex-linked mode
This command controls in what inheritance mode the data will be
analyzed. If your data is sex-linked you must set this BEFORE loading
pedigree data. All males should be homozygous for X markers and this
is enforced as the pedigree file is loaded. (If you forget to toggle
it on before loading pedigree data you may get INCONSISTENT Mendelian
errors; if this occurs, just set sex-linked on and load the pedigree
file again.) Once you have toggled on the sex-linked flag, all the
sharing proportions in the program will be altered to indicate that
brother/brother pairs can share 0 or 1, brother/sister pairs can share
0 or 1, and sister/sister pairs can share 1 or 2.
*Setting sex-linked status automatically --
The first line of a Linkage loci file has (for example):
12 0 0 5 << NO. OF LOCI, RISK LOCUS, SEX-LINKED (IF 1), PROGRAM
If you set the third parameter to '1':
12 0 1 5 << NO. OF LOCI, RISK LOCUS, SEX-LINKED (IF 1), PROGRAM
sex-linked mode will automatically be turned on and the program will
correctly read past the extra liability/penetrance class lines for
X-linked data. (Reading past the extra lines can also be accomplished
by turning 'sex linked on' before loading marker-locus data.) There
is no way to automatically turn on X-linked status with our internal
marker file format.
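
The header check described above can be sketched in Python. This is only
an illustration of reading the third field of the first line; the
function name is hypothetical and this is not the code SIBS itself uses.

```python
def linkage_header_is_sex_linked(first_line: str) -> bool:
    """Return True when the third header field (SEX-LINKED) is 1."""
    # Drop the trailing "<< ..." comment before splitting into fields.
    fields = first_line.split("<<", 1)[0].split()
    if len(fields) < 3:
        raise ValueError("expected at least three header fields")
    return int(fields[2]) == 1


autosomal = "12 0 0 5 << NO. OF LOCI, RISK LOCUS, SEX-LINKED (IF 1), PROGRAM"
x_linked = "12 0 1 5 << NO. OF LOCI, RISK LOCUS, SEX-LINKED (IF 1), PROGRAM"
print(linkage_header_is_sex_linked(autosomal))
print(linkage_header_is_sex_linked(x_linked))
```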
*Important note -- True X-linked analysis is ONLY done for affectation
analysis commands ('estimate', 'infomap', and 'exclude'), for
phenotypic analysis commands, only phenotyped males are considered
(although female sib genotypes are used for determining maternal
phase/allele information and to clarify whether two male sibs share
IBD rather than just IBS).
*Reference for more on X-linked analysis:
"An Extension of the Maximum Lod Score method to X-linked loci",
Cordell, et al. Annals of Human Genetics (in press, expected
Oct. 95).
@prepare pedigrees
This command reads through a pedigree file and does some of the
initial mapping set-up. The pedigree file is the same as the
LINKAGE pedigree input format (before running MAKEPED or
doing any preprocessing!):
Each line of the file must have the structure:
3 12 8 9 1 2 1 1 2 8 3 0 0 4 6 1 3 ...
(a) (b) (c) (d) (e) (f) (g) (h ------------------------)
(a) pedigree name (can be a string)
(b) individual ID #
(c) father's ID #
(d) mother's ID #
(e) sex (1=MALE, 2=FEMALE, sex must be specified, 'unknown' is not valid)
(f) affectation status (1=UNAFFECTED, 2=AFFECTED, 0=UNKNOWN)
(g) liability class (optional) - classes specified in marker data file
(h) marker genotypes (these must be integers, 0 indicates missing data)
All ID numbers must be integers, use zeros if the parents are
missing. Both parents must be listed (if one parent is known and the
other not, just insert a dummy parent and fill the genotypes for each
marker with "0 0"). Please consult documentation for the Linkage
program if you need any further information regarding the Linkage file
formats.
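
As a rough illustration of the line layout above, the following Python
sketch splits one pedigree line into its fields. The names
PedigreeRecord and parse_pedigree_line are hypothetical, and the
has_liability switch is an assumption made here because field (g) is
optional; SIBS may decide this differently.

```python
from typing import List, NamedTuple, Optional, Tuple


class PedigreeRecord(NamedTuple):
    pedigree: str
    individual: int
    father: int
    mother: int
    sex: int                 # 1 = male, 2 = female
    affection: int           # 0 = unknown, 1 = unaffected, 2 = affected
    liability: Optional[int]
    genotypes: List[Tuple[int, int]]  # (0, 0) marks missing data


def parse_pedigree_line(line: str, has_liability: bool = True) -> PedigreeRecord:
    fields = line.split()
    n_fixed = 7 if has_liability else 6
    if len(fields) < n_fixed:
        raise ValueError("too few fields on pedigree line")
    sex = int(fields[4])
    if sex not in (1, 2):
        raise ValueError("sex must be 1 (male) or 2 (female)")
    affection = int(fields[5])
    if affection not in (0, 1, 2):
        raise ValueError("affectation status must be 0, 1 or 2")
    alleles = [int(a) for a in fields[n_fixed:]]
    if len(alleles) % 2:
        raise ValueError("each marker needs two alleles")
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    liability = int(fields[6]) if has_liability else None
    return PedigreeRecord(fields[0], int(fields[1]), int(fields[2]),
                          int(fields[3]), sex, affection, liability, genotypes)


record = parse_pedigree_line("3 12 8 9 1 2 1 1 2 8 3 0 0 4 6 1 3")
print(record)
```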
SIBS requires that all pedigrees be simple nuclear pedigrees (no
half-sibs, cousins, etc -- these pedigrees are better suited for
GENEHUNTER) with two or more affected/phenotyped sibs. If you load
more than two sibs, all sibs will be used to determine parental phase
information if the parents' alleles are missing. For the analysis
methods the default is to use just the first pair of
affected/phenotyped sibs. To use all independent pairs of sibs, or
all pairs, use the 'pairs used' command. When using 'all pairs', each
pair is considered as an independent pedigree but a weight (2/num_affecteds)
is factored in to counteract inflation of significance due to the
statistical dependence among these pairs.
As the pedigree file is loaded SIBS checks for any non-Mendelian
inheritance errors and also checks that all the alleles present have a
frequency > 0.0. You may enter as many pedigrees as you wish in a
single file. After completing the "load markers" and "prepare
pedigrees" steps, you are ready to map. If you'd like to become
familiar with the program using the sample data then you can run the
following sequence:
sibpair:3> load sample.loc
Parsing Linkage marker data file...
#the program will determine from the first line of the file
#which file format is being used
Current map (11 markers):
loc1 10.0 loc2 10.0 loc3 10.0 loc4 10.0 loc5 10.0 loc6 10.0 loc7 10.0 loc8 10.0
loc9 10.0 loc10 10.0 loc11
sibpair:4> prepare pedigrees sample.ped
Load phenotype data? y/n [n]: n
file loaded successfully
sibpair:5> scan
...scan done, affection-based analysis commands can now be run.
You can now run 'estimate', 'infomap' and 'exclude'. The sample
data shows a completely penetrant recessive trait that lies midway
between markers loc7 and loc8.
Keep in mind when creating files that there must be a one-to-one
correspondence (IN ORDER AND NUMBER) between the markers described in
the marker data file and the markers that have genotypes listed for
them in the pedigree file. The marker file must be loaded first, see
'load markers'.
You may want to begin by loading your pedigree file through the
MAPMAKER/PEDMANAGER program which will report all the inheritance/format
errors at once. MAPMAKER/PEDMANAGER also provides drawing functionality
and an allele frequency calculator.
====================================
Loading and analyzing phenotype data
====================================
Phenotype data is loaded from a separate file in the format:
....
....
An example phenotype file is:
4
ped1 3 6.0 6.0 5.0 5.67
ped1 4 6.0 6.0 5.0 5.67
ped2 3 5.0 5.0 6.0 5.33
ped2 4 6.0 6.0 - 6.00
Where the pedigree file could be:
ped1 1 0 0 1 1 2 3 4 5 2 4
ped1 2 0 0 2 1 3 3 5 5 3 3
ped1 3 1 2 2 2 2 3 5 5 2 3
ped1 4 1 2 1 2 3 3 5 5 2 3
ped2 1 0 0 1 1 2 3 4 6 2 4
ped2 2 0 0 2 1 3 3 6 6 3 3
ped2 3 1 2 2 2 2 3 6 6 2 3
ped2 4 1 2 1 2 3 3 6 6 2 3
Where the pedigree names are the same and given in the same order as
in the pedigree file. Phenotype information must be given for each
sib in a pedigree, but NOT the parents. Missing phenotype values are
indicated with a hyphen. (Negative data will be correctly
interpreted, and not mistaken for missing data.)
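
The missing-value rule can be sketched in Python as follows. The row
layout (pedigree name, individual ID, then phenotype values) is an
assumption read off the example rows above, and the function names are
hypothetical.

```python
from typing import List, Optional, Tuple


def parse_phenotype_value(token: str) -> Optional[float]:
    # A lone hyphen means "missing"; "-1.5" is an ordinary negative number.
    return None if token == "-" else float(token)


def parse_phenotype_row(line: str) -> Tuple[str, int, List[Optional[float]]]:
    pedigree, individual, *values = line.split()
    return pedigree, int(individual), [parse_phenotype_value(v) for v in values]


print(parse_phenotype_row("ped2 4 6.0 6.0 - 6.00"))
print(parse_phenotype_value("-1.5"))
```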
Note that after loading your data you must scan with regard to
phenotypes (this choice allows you to run affectation status and
phenotype data in the same session):
sibpair:6> prep sample.ped
Load phenotype data? y/n [n]: y
Enter the name of the phenotypes file: sample.pheno
file loaded successfully
sibpair:7> scan
scan pairs of phenotyped sibs or affected pairs? p/a [p]: p
...scan done, phenotype-based analysis commands can now be run.
======================================================================
Using affectation and phenotypic analysis commands in the same session
======================================================================
To use both phenotypes and affectation in the same session all you
need to do is change which scan type you have done. Scanning for
affectation status will allow you to run 'estimate', 'infomap', and
'exclude', whereas scanning pairs of phenotyped sibs will enable you
to run 'haseman-elston', 'nonparametric', 'no dom var', and 'ml
variance'.
@scan
This command must be run before any of the analysis commands. Also if
the map, pair setting, scanning increment, or pedigree selection has
changed, you must re-scan before running any of the analysis commands
(such as estimate, exclude, etc).
If you have loaded phenotype data you will be asked if you want to
scan the phenotyped pairs or the affected pairs:
sibpair:7> scan
scan pairs of phenotyped sibs or affected pairs? p/a [p]: p
...scan done, phenotype-based analysis commands can now be run.
Without phenotype data loaded, affected pairs are automatically
scanned.
'Scan' computes the full multipoint probability that two sibs share 0,
1, or 2 alleles IBD (0 or 1 for sex-linked data) with the given map
and allele frequencies. This information is then used by all the
quantitative and qualitative analysis commands. The IBD distribution
can be output as a text file using the command 'dump ibd'.
Note: 'scan' may give the error:
error: Number of positions to be calculated exceeds the max of 1000
If this occurs you need to select fewer scan points -- either reduce the
number of points between markers or scan at larger distance
increments. (The ceiling can be changed by adjusting MAX_POSITIONS in
sibpair.h and recompiling.)
@TOPIC map control commands
If any of the settings listed below are changed you must re-scan
before they will take effect. For a list of the current map and
pedigree settings type 'session summary'.
@set map
The 'set map' command is used to select the current map that the 'scan'
command will operate on. It is called in the following manner:
set map ...
If marker names are specified (as they can be using our internal
marker file format), markers may be specified using those names. If
not, then they must be specified numerically.
Distances may be specified as either recombination-fractions or
centiMorgans, with the necessary caveat that if all distances are
below 0.5 then they are assumed to be recombination-fractions;
otherwise centiMorgans are assumed.
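
The unit rule above amounts to a single test over all the distances in
the map. A minimal Python sketch, with a hypothetical function name:

```python
def map_units(distances):
    """Guess the units of a list of inter-marker distances."""
    if not distances:
        raise ValueError("no distances given")
    # Every distance must be below 0.5 for the map to be read as
    # recombination fractions; a single larger value switches to cM.
    if all(d < 0.5 for d in distances):
        return "recombination fraction"
    return "cM"


print(map_units([0.1, 0.2, 0.05]))
print(map_units([0.1, 10.0]))
```

Note that a map given in cM with every interval shorter than 0.5 cM
would be read as recombination fractions under this rule.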
When data is entered using our internal marker file format, SIBS will
automatically create a map using each marker and the distances
specified in the file. When data is entered using the Linkage format,
a map can be automatically created by specifying the recombination
distances near the bottom of the file (see the sample.loc file). Note
that it is usually better to enter recombinationally INSEPARABLE
markers with a cM distance of .1 or so, because a recombination
between them in your data set would otherwise cause a likelihood = 0 error.
If you are working with unordered markers and want to analyze them in
single-point mode type 'single-point on' at the SIBS prompt.
@off end
This command controls how far before the first marker and after the
last marker in a map scores will be calculated. For example, if this
value is set to 10.0, then subsequent scan commands will begin
calculating scores 10 cM before the first marker and continue stepping
through until 10 cM after the last marker. The default value of 'off
end' is 0.0 cM. Only off-end distances >= 0 are valid. Calling 'off
end' with no arguments causes SIBS to report the current value.
Distances may be specified as either recombination-fractions or
centiMorgans, with the necessary caveat that any distance below 0.5 is
assumed to be a recombination-fraction and any greater than or equal
to 0.5 is assumed to be in centiMorgans.
@increment
This command controls the frequency with which the 'scan' command
calculates the IBD distribution along the map.
If 'increment distance 2.0' is entered, the 'scan' command will calculate
a map every 2.0 cM throughout the genetic map selected (regardless of the
position of markers in that map) as follows (in this example the off end
distance is set to 6.0 cM):
-6.0 (6 cM before the first marker), -4.0, -2.0, 0.0 (the position
of the first marker), 2.0, 4.0, ...etc...until 6.0 cM after the last locus.
If 'increment step 5' is selected, the scan command will calculate LOD
scores at 5 equally spaced positions in each map interval. For example,
if your map has three markers separated by distance of 10 and 15 cM and the
off end distance is set to 5 cM, maps will be calculated at the following
positions:
-5.0, -4.0, -3.0, -2.0, -1.0 (equally spaced in the 5cM before the first marker)
0.0, 2.0, 4.0, 6.0, 8.0 (equally spaced in the 10 cM interval)
10.0, 13.0, 16.0, 19.0, 22.0 (equally spaced in the 15 cM interval)
25.0, 26.0, 27.0, 28.0, 29.0, 30.0 (equally spaced in the 5cM after the map)
The default value of 'increment' is 1.0 cM. Calling 'increment' with no
arguments causes SIBS to report the current value.
Note that the first ('distance') method is not guaranteed to hit every
marker position and should be considered inferior to the second
('step') method, which will compute a map at every marker position.
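
The two schemes can be reproduced with a short Python sketch. The
function names are hypothetical and the layout of the off-end points is
inferred from the worked example above, so treat this as an
illustration rather than the program's actual logic. Marker positions
are cumulative cM from the first marker.

```python
import math


def positions_by_distance(markers, increment, off_end=0.0):
    """Fixed spacing from off_end before the first marker onward."""
    start = markers[0] - off_end
    end = markers[-1] + off_end
    count = math.floor((end - start) / increment + 1e-9)
    return [start + i * increment for i in range(count + 1)]


def positions_by_step(markers, steps, off_end=0.0):
    """'steps' equally spaced points per interval, starting at each marker."""
    positions = []
    if off_end > 0:
        positions += [markers[0] - off_end + i * off_end / steps
                      for i in range(steps)]
    for left, right in zip(markers, markers[1:]):
        positions += [left + i * (right - left) / steps for i in range(steps)]
    if off_end > 0:
        positions += [markers[-1] + i * off_end / steps
                      for i in range(steps + 1)]
    else:
        positions.append(markers[-1])
    return positions


markers = [0.0, 10.0, 25.0]  # three markers separated by 10 and 15 cM
print(positions_by_step(markers, 5, off_end=5.0))
print(positions_by_distance(markers, 2.0, off_end=6.0))
```

With an increment of 2.0 cM the marker at 25 cM is skipped, which is the
weakness of the 'distance' method noted above.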
@map function
This command controls which mapping function is used to convert
centiMorgans to recombination-fractions and back again both in the
input and output of the program and in the internal calculations.
Currently only Haldane and Kosambi map functions are available. The
default 'map function' is Kosambi. If the map function is changed you
must do a new 'set map ....' and re-scan before it will take effect.
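
For reference, the standard textbook forms of the two map functions can
be written in Python as below. The function names are hypothetical, and
the source does not show how SIBS implements the conversion, so this is
a sketch of the usual formulas only.

```python
import math


def cm_to_theta(cm, function="kosambi"):
    """Convert a distance in cM to a recombination fraction."""
    d = cm / 100.0  # distance in Morgans
    if function == "haldane":
        return 0.5 * (1.0 - math.exp(-2.0 * d))
    if function == "kosambi":
        return 0.5 * math.tanh(2.0 * d)
    raise ValueError("unknown map function: " + function)


def theta_to_cm(theta, function="kosambi"):
    """Convert a recombination fraction (0 <= theta < 0.5) to cM."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must be in [0, 0.5)")
    if function == "haldane":
        return -50.0 * math.log(1.0 - 2.0 * theta)
    if function == "kosambi":
        return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))
    raise ValueError("unknown map function: " + function)


for name in ("haldane", "kosambi"):
    print(name, round(cm_to_theta(10.0, name), 4))
```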
@allele frequencies
The 'allele frequencies' command controls how the allele frequencies
supplied in the marker data file are utilized by the 'scan' command.
Argument :: Result
given The 'scan' command uses the allele-frequencies given in
the marker data file to compute probabilities.
thresholded The 'scan' command uses the given allele-frequencies
unless they are below the given value. Any frequency below
this given value is considered to have this given value.
The default case is that the 'allele frequencies' given are used. Typing
'allele frequency' with no argument causes SIBS to report the current value.
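
The 'thresholded' option amounts to raising each small frequency to the
floor value. A minimal Python sketch with a hypothetical function name;
the text above does not say whether the frequencies are renormalized
afterward, so none is done here.

```python
def threshold_frequencies(frequencies, threshold):
    """Raise every frequency below 'threshold' up to 'threshold'."""
    return [max(f, threshold) for f in frequencies]


given = [0.60, 0.35, 0.04, 0.01]
print(threshold_frequencies(given, 0.05))
```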
Marker allele frequencies are important when the parents are untyped
and in this case misspecification of allele frequencies can produce
false positive results. (Underestimating the frequency of an allele
will overestimate the degree of IBD sharing at the locus.) Ideally,
one should obtain good estimates of allele frequencies from the
appropriate population (MAPMAKER/PEDMANAGER will give a quick estimation
of allele frequencies in your set of pedigrees). In addition this
command is provided so that you can test how sensitive your results are to
changes in the allele frequencies.
@single point
If you set single point to 'on' each marker loaded from the locus file
will be analyzed separately and the results for each marker will be
output in one file. Any previously set map order will remain intact
for use again if single point is turned back 'off'. There is no
postscript output in single point mode since there is no continuous
region to plot values along. Single point runs the analysis command
for each marker in the order it was loaded regardless of what markers
are included in the map that is entered.
@postscript
If you don't want to generate postscript output files type 'postscript
off' and SIBS will not query you for the names of output files. You
can turn it back on at any point during the session. (If the map only
contains one marker or if you are running in single-point mode,
postscript output is automatically turned off.)
@TOPIC pedigree selection commands
If any of the settings listed below are changed you must re-scan
before they will take effect. For a list of the current map and
pedigree settings type 'session summary'.
@pairs used
If you have loaded more than two sibs in any of your sibships this
command allows you to include the extra sibs in the analysis commands
(all sibs are automatically included for phase information if parents
are missing). When using 'all pairs', each pair is considered as an
independent pedigree but a weight (2/num_affecteds) is factored in to
counteract inflation of significance due to the statistical dependence
among these pairs.
Simply type 'pairs used' and indicate which pair setting you would
like to use:
sibpair:1> pairs used
the current pair setting is: *first affected/phenotyped sibpair only*
Possible pair options:
1. First pair of affected/phenotyped sibs
2. All independent pairs of affected/phenotyped sibs*
3. All pairs of affected/phenotyped sibs*
Enter the index of the analysis you want to use [1]: 2
*"independent" pairs of sibs are created by taking the first sib
paired with sibs 2...n (for a three-sib sibship this will mean the
sharing for pairs 1-2 & 1-3 will be computed). Therefore, the results
can be different if you rearrange the order of the sibship. "all"
pairs are created by taking the first sib paired with sibs 2...n, the
second sib paired with 3...n, etc. For a four-sib sibship this means
the sharing for pairs 1-2, 1-3, 1-4, 2-3, 2-4 and 3-4 will be
computed. The sibs are considered as part of a whole family when
inheritance vectors are determined and then each pair is treated as
essentially a separate pedigree for the purposes of analysis.
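
The three pair settings and the 'all pairs' weight can be illustrated in
Python. The function names and the mode strings are hypothetical; sibs
are numbered from 1 in the order they appear in the sibship.

```python
from itertools import combinations


def sib_pairs(n_sibs, mode):
    """List the sib pairs analyzed under each pair setting."""
    if n_sibs < 2:
        raise ValueError("need at least two sibs")
    if mode == "first":
        return [(1, 2)]
    if mode == "independent":
        # First sib paired with sibs 2...n.
        return [(1, j) for j in range(2, n_sibs + 1)]
    if mode == "all":
        return list(combinations(range(1, n_sibs + 1), 2))
    raise ValueError("unknown mode: " + mode)


def all_pairs_weight(n_sibs):
    """Weight 2/num_affecteds applied to each pair under 'all pairs'."""
    return 2.0 / n_sibs


print(sib_pairs(3, "independent"))
print(sib_pairs(4, "all"))
print(all_pairs_weight(4))
```

For a four-sib sibship the six pairs each get weight 0.5, so the
sibship contributes a total weight of 3.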
You DO NOT need to re-scan for a change in the pair setting to take effect.
The default is to use the first pair of affected/phenotyped sibs.
@select pedigrees
The 'select pedigrees' command allows the user to select which
pedigrees will be analyzed in subsequent calls to 'scan'. If you wish
to analyze only a subset of the pedigrees in your data file (for
example, ped2 and ped6), you would enter the command:
sibpair:3> select pedigrees ped2 ped6
The following pedigrees are in use: ped2 ped6
If you wish to analyze all pedigrees in the data file (the default condition
when a new data file is prepared), you could enter:
sibpair:4> select pedigrees all
The following pedigrees are in use: ped1 ped2 ped3 ped4 ped5 ped6 ped7
Keep in mind that in the above example, the names 'ped1', 'ped2',
etc., represent the names that are listed in the original pedigree
file. Entering "select pedigrees" with no argument causes SIBS to
report the current list of pedigrees under analysis. If you wish to
remove pedigrees from the list of active pedigrees under analysis, see
the 'remove pedigrees' command.
After changing the list of active pedigrees you must run 'scan' to
recompute the IBD distribution before running any of the qualitative
or quantitative analysis commands.
@remove pedigrees
The 'remove pedigrees' command enables the user to remove specific pedigrees
from analysis by the 'scan' command. If, for example, you wished to exclude
ped5 from your analysis, the command:
sibpair:5> remove pedigrees ped5
The following pedigrees are in use: ped1 ped2 ped3 ped4 ped6 ped7
would effect the appropriate change. You can go back to using the
full set of pedigrees by saying 'select all'.
After changing the list of active pedigrees you must run 'scan' to
recompute the IBD distribution before running any of the qualitative
or quantitative analysis commands.
@TOPIC qualitative trait commands
Commands to map loci using affectation status.
@estimate
Usage -- To run the command just type 'estimate'; no arguments are
needed. SIBS will first ask you if you want to analyze your data
under the assumption of no dominance variance, or under the assumption
of dominance variance where Holman's triangle is applied:
analyze under the assumption of no dominance variance? y/n [n]
The default is to perform the analysis with the assumption of
dominance variance.
SIBS will then query you for the filenames to store the text and
postscript output respectively. You will be alerted if either of the
chosen filenames already exists.
Output -- The text file consists of the columns:
(for X-linked data there is no column for z2, and
z's are listed first for each type of pair --
brother/brother, brother/sister, and sister/sister
pairs with the total listed afterward.)
where the z-values are the calculated maximum likelihood proportions.
At the end of the text output is a time-stamped summary of the session
settings when the analysis was run. The first postscript output file
is a graph of position vs. loglike and the second is a plot of how the
maximum likelihood sharing proportions change across the region. The
marker names are given along the x-axis and the distance examined in
the analysis is given at the end of the x-axis (this may be larger than
the map distance if you have specified an off-end distance).
Background -- 'estimate' scans the selected map region and identifies
regions of significant excess allele sharing.
Note that the LOD score is never negative, because the maximum
likelihood solution for z0, z1, and z2 can never be worse than the
Mendelian segregation expectation.
@exclude
Usage -->
command line:
sibs:6> exclude
You are then given the option of inputting a set of z's or relative
risk values (the input queries will be different depending on whether
you want to analyze your data under the assumption of no dominance
variance or not). For X-linked data you will be asked to input
Zs/relative risk values for each pair type -- brother/brother,
sister/sister, and brother/sister.
Output --> The text file consists of tabbed columns in the format:
position z2-1 z2-2 z2-3 ... etc.
(You should be able to use this file as input to a plotting program if
you don't have access to a postscript printer.) At the end of the
text file is a time-stamped summary of the session settings. The
postscript file consists of multiple y-axis LOD score plots for each
relative risk value/set of z's and gives the distance examined in the
analysis at the end of the x-axis (this may be larger than the map
distance if you have specified an off-end distance). A horizontal
dashed line is drawn at the traditional exclusion criterion of Z < -2.
Background --> Exclusion mapping is used to identify and exclude
regions unlikely to have a major effect on the trait you are mapping.
SIBS does this by comparing the likelihood of the observed sharing
proportion of 0, 1 and 2 alleles between affected sibs (z0,z1,z2), to
the likelihood under the Mendelian expectation of a0=1/4, a1=1/2, and
a2=1/4. When using SIBS under the assumption of no dominance variance, the
sharing proportions are given by:
z0 = a0/Ls
z1 = a1
z2 = a2((2Ls-1)/Ls)
where Ls = lambda-sub-S, the relative risk ratio for a sib, defined
as:
prevalence of the trait in siblings of affected individuals
---------------------------------------------------------
prevalence of the trait in the population at large
Note that Ls = 1 when there is no observed difference in prevalence of
sibs vs the population (z0=a0, z1=a1, z2=a2 and LOD = 0). If Ls < 1,
it would imply that there was some protective advantage in having an
affected sib. Since neither of these cases is interesting and/or
reasonable, only Ls values > 1 are allowed. (The no dominance
variance assumption allows us to simplify the sharing proportions
above to the one variable Ls. With no dominance variance, Ls = Lo, where
Lo = relative risk ratio for an offspring, and Lm-1=2(Ls-1) where Lm
is the relative risk ratio for a monozygotic twin.)
The likelihood under Bayes' theorem is:
L(pos) = (z0*p0+z1*p1+z2*p2) / (a0*p0+a1*p1+a2*p2)
and the LOD score is calculated by summing log10(L(pos)) across
pedigrees for each position.
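
The formulas above can be put together in a short Python sketch for a
single map position. The function names are hypothetical, and p0, p1,
p2 are simply taken to be the per-pair values that appear in the L(pos)
formula; how 'scan' produces them is not shown here.

```python
import math

# Mendelian expectation for sharing 0, 1, 2 alleles IBD.
A0, A1, A2 = 0.25, 0.5, 0.25


def sharing_proportions(ls):
    """z0, z1, z2 under no dominance variance for relative risk Ls."""
    if ls <= 1.0:
        raise ValueError("only Ls values > 1 are allowed")
    return A0 / ls, A1, A2 * (2.0 * ls - 1.0) / ls


def lod_at_position(pair_probs, ls):
    """Sum log10(L(pos)) over pairs; pair_probs holds (p0, p1, p2) tuples."""
    z0, z1, z2 = sharing_proportions(ls)
    total = 0.0
    for p0, p1, p2 in pair_probs:
        ratio = (z0 * p0 + z1 * p1 + z2 * p2) / (A0 * p0 + A1 * p1 + A2 * p2)
        total += math.log10(ratio)
    return total


print(sharing_proportions(2.0))
print(lod_at_position([(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)], 2.0))
```

With Ls = 2 a pair known to share 2 alleles contributes log10(1.5) and
a pair known to share 0 alleles contributes log10(0.5), so evidence
against sharing pushes the score toward the exclusion threshold.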
The relations for z0, z1, and z2 above hold if multiple loci are
involved in the trait, provided that the loci interact
multiplicatively and the lambda values are defined as the component of
the relative risk attributable to the locus.
More details on the analytical method are present in the publication
(enter 'journal ref' for complete source information).
@infomap
Usage -- To run the command just type 'infomap'; no arguments are
needed. SIBS will query you for the filenames to store the text and
postscript output respectively. You will be alerted if either of the
chosen filenames already exists. (With X-linked data all types of
pairs are calculated together.)
Output -- The text output file consists of two columns:
where information content is given as a number between 0 and 1. (See
the note below about the possibility of negative information content).
At the end of the text output file is a time-stamped summary of the
settings and files from which the analysis was generated. A
postscript plot of the two columns is generated and gives one a quick
view of where more markers are needed. The markers are shown along
the x-axis and the distance examined in the analysis is given at the end
of the x-axis (this may be larger than the map distance if you have
specified an off-end distance).
Background -- Information content mapping gives you a way to calculate
how close you are to extracting the full IBD status across the region
with the current set of markers. The measure is based on the variance
of the IBD distribution.
@TOPIC quantitative trait commands
Commands to map loci using numerical phenotype scorings.
@haseman elston
Usage -- After typing the command you will be queried as to which
phenotype you want to analyze (if you have loaded more than one), and
then queried for files to store the text output for the traditional
haseman-elston and EM haseman-elston analyses, as well as the filename
for the postscript output.
Output -- The traditional and EM haseman-elston output files have the columns:
At the bottom of each of these text output files is a time-stamped
summary of the session variables when the command was run. This
summary will also list which phenotype was selected and, in the case
of the EM algorithm, the convergence limit that was used. The
postscript output file has a plot of both the traditional and EM
results.
Note1: The EM algorithm has been found to have very rare instabilities
in large intervals between markers; if there is a sudden peak in the
EM plot make sure a similarly shaped peak also appears in the
traditional haseman-elston results. (The nonparametric method does
not have these instabilities either and can also be used to verify
your results.)
Note2: In order to run this command you must have selected more than
two pedigrees/pairs -- which shouldn't be a problem, since a result
from any fewer would not be very significant!
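For orientation, the textbook form of the traditional Haseman-Elston
analysis regresses the squared trait difference of each sib pair on
the estimated proportion of alleles the pair shares at a position; a
negative slope suggests linkage. The Python below is a minimal sketch
of that regression under this assumption. It is not taken from SIBS,
and the function name and inputs are hypothetical.

```python
def haseman_elston_fit(pi_hat, squared_diff):
    """Least-squares fit of squared sib-pair trait difference on pi_hat.

    pi_hat       -- estimated proportion of alleles shared, one per pair
    squared_diff -- squared trait difference, one per pair
    Returns (slope, intercept). Hypothetical helper, not a SIBS routine.
    """
    n = len(pi_hat)
    if n != len(squared_diff):
        raise ValueError("inputs must have the same length")
    if n <= 2:
        # Mirrors the requirement of more than two pedigrees/pairs.
        raise ValueError("need more than two sib pairs")
    mean_x = sum(pi_hat) / n
    mean_y = sum(squared_diff) / n
    sxx = sum((x - mean_x) ** 2 for x in pi_hat)
    if sxx == 0:
        raise ValueError("all pairs have the same sharing proportion")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(pi_hat, squared_diff))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up data lying exactly on the line y = 4 - 4x.
    slope, intercept = haseman_elston_fit([0.0, 0.5, 1.0, 0.5],
                                          [4.0, 2.0, 0.0, 2.0])
    print(slope, intercept)  # -4.0 4.0
```

With only two pairs a line fits the data exactly and nothing can be
said about significance, which is one way to read the requirement in
Note2.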
@ml variance
Usage -- After typing the command you will be queried as to which
phenotype you want to analyze (if you have loaded more than one), and
then queried for files to store the text and postscript output.
Output -- The text output file has the format:
(no sigsq2 for sex-linked data)
At the bottom of the text output file is a time-stamped
summary of the session variables when the command was run. This
summary will also list which phenotype was selected and the
convergence limit that was used. The postscript output file is a plot
of position vs. LOD.
Note: This EM-based algorithm has been found to have very rare
instabilities in large intervals between markers; if there is a sudden
peak in the plot you can verify it by checking it against the results
of the nonparametric method, which is not subject to the same
instabilities.
@no dom var
Usage -- After typing the command you will be queried as to which
phenotype you want to analyze (if you have loaded more than one), and
then queried for files to store the text and postscript output.
(Note that since X-linked markers are hemizygous in males, the
no-dominance assumption isn't meaningful for sex-linked data, so this
command cannot be run on it.)
Output -- The text output file has the format:
(no sigsq2 for sex-linked data)
At the bottom of the text output file is a time-stamped
summary of the session variables when the command was run. This
summary will also list which phenotype was selected and the
convergence limit that was used. The postscript output file is a plot
of position vs. LOD.
Note: This EM-based algorithm has been found to have very rare
instabilities in large intervals between markers; if there is a sudden
peak in the plot you can verify it by checking it against the results
of the nonparametric method, which is not subject to the same
instabilities.
@nonparametric
Usage -- After typing the command you will be queried as to which
phenotype you want to analyze (if you have loaded more than one), and
then queried for files to store the text and postscript output.
Output -- The text output file has the format:
At the bottom of the text output file is a time-stamped
summary of the session variables when the command was run. This
summary will also list which phenotype was selected and the
convergence limit that was used. The postscript output file is a plot
of position vs. Z-score.
@TOPIC basic commands
Auxiliary SIBS commands.
@session summary
Prints a summary of the current map and pedigree related settings, for
example:
sibpair:12> session summary
*Pedigree file: sample.ped
*Phenotype file: (not loaded)
The following 19 pedigrees are in use:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
19 pedigrees originally loaded
*Pairs: first pair from each sibship
*Analysis is: autosomal
*Loci file: sample.loci
*Current map (8 markers):
loc1 9.0 loc2 3.0 loc3 11.0 loc4 4.0 loc5 1.0 loc6 1.0 loc7 3.0 loc8
map function: kosambi
units: centimorgans
off end distance: 0.0
scans done at constant increments of 1.0 cM
(A similar summary is printed at the bottom of each text output file.)
@dump ibd
This command allows you to output the calculated likelihood of
sharing 0, 1 or 2 alleles (0 or 1 for sex-linked data) for each
pedigree, possibly for use in another program. (You will be queried
for the filename to store it in.)
The output format is:
...
@TOPIC shell commands
There are several basic features which SIBS provides to make the
program more friendly and useful. These include on-line help ('help')
and the ability to record session output ('photo') and to accept input
from a batch file ('run').
@help
'Help' displays on-line help information for SIBS commands and
features. Typing 'help' alone produces a list of available topics and
commands. For a general description of a numbered topic, type 'help'
followed by the displayed number of the topic. For help on a more
specific command or feature, type 'help' followed by its name, for
example:
sibpair:1> help prepare pedigrees
The on-line help is an exact duplicate of the reference manual.
@photo
The "photo" command is used to save a copy of the current SIBS session
(input and output) in a text file. If you type "photo" followed by a
filename, for example,
sibpair:1> photo sample.out
all input and output from that point on will be copied into the specified
file (here, the file named "sample.out"). Typing "photo off" or quitting
SIBS terminates this process and closes the photo file. The default
extension for a transcript file is ".out". The 'photo' command will append
program output to the specified file, so output from several sessions may be
collected in the same file if desired.
@run
The "run" command instructs SIBS to take a series of commands from any
text file. This file should contain lines of commands and other input just as
they would be typed into SIBS interactively.
For example, you might want to use a 'run' file to save setup
commands for loading your data:
load markers test.loci
increment distance .1
prep pedigree test.ped
p (To indicate that phenotype data will be loaded)
test.pheno
and could be run with the command
sibpair:1> run setup.in
where 'setup.in' is the name of the file containing the 5 lines of commands
above. (No blank lines can be used to accept defaults; you must provide
input on each line of the run file.)
@system
The 'system' command is used to temporarily interrupt SIBS and start
up a new command interpreter from the operating system. Commands which are
normally typed to the operating system may then be issued. You can return to
SIBS by typing 'exit' or control-D in most operating systems. If an
argument is supplied to 'system', the argument is interpreted just as a
normal command issued to the operating system. For example:
sibpair:4> system lp results.out
would execute the printing command on your operating system and then return
control immediately to SIBS.
@change directory
The 'cd' command works essentially the same way it does under Unix. By
default, all files are read or written from the current directory unless
specified otherwise.
@time
Display the current time from the system clock.
@quit
Ensures that the program exits properly; removes any auxiliary files,
then exits the session.
@debug
***********Not meant for general use***********
Displays to the screen debugging output throughout the analysis code.
@end