The option of choice for positioning one new locus, or a couple of difficult
old loci, is all. For the loci in the inserted_loci parameter, it tests all
possible positions and combinations of positions along the current map, and
reports the LOD scores. If one of the position combinations has a LOD score at
least 2.000 better than any other, we can manually insert the loci at those
positions and augment the current map.
Which loci from the chromosome 2 dataset are good
candidates for use with
all? For "new" loci, there are the two p arm loci
that were excluded from the second
round of build runs because they showed no twopoint linkage to the
12 locus map: D2S61_2 (64) & CPSI (9). There are also two
q arm loci that show linkage to some of the recently added loci of
the current map: CRYG1-5 (0) & D2S35 (14) (data from a
twopoint run not shown). For problematic old loci, there are those
that DID enter the second round maps of Try 218 & Try 219
but were omitted from the "best" map of Try 220
because they could not be uniquely placed: D2S39 (22), D2S36 (13), &
GYPC (58).
Trying to position these seven loci somewhere in the 27 loci of the
current map with one all run will NOT work. Most computers don't have
enough memory to evaluate and store the >27 x 10^9 34-locus maps
this calculation calls for. On the other hand, trying only one locus at a time
can sacrifice some analytical power, if the insertion of loci is mutually
supporting. For example, we know from
build220.out
that loci D2S39 (22), D2S36 (13), & GYPC (58) don't have unique
positions along the current map with a LOD score stringency of 2.000 .
Trying to insert these three loci - one at a time - using all will
simply repeat what we know from build220.out . A better strategy
might be to attempt inserting these difficult loci in pairs, or in combination
with one of the "new" loci from above. Rather than try all 21 two-locus
combinations of these seven loci, however (the all option is one of the more
computationally intensive, and this would take far too long), first try each
of the four new loci from the p & q arms on its own.
If one or more of these fits uniquely, twopoint will reveal which of
the remaining loci - alone or in pairs - might next insert via all.
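The figures quoted above (more than 27 x 10^9 maps for seven loci at once, and 21 two-locus combinations) can be checked with a short calculation. The following Python sketch is not part of CRI-MAP; the function name insertion_orders is hypothetical, and it assumes that every distinct locus order produced by inserting new loci into the fixed current map counts as one map to evaluate.

```python
from math import comb, prod


def insertion_orders(map_size: int, n_new: int) -> int:
    """Count the locus orders made by inserting n_new loci into a fixed map.

    The first new locus can go into map_size + 1 slots, the second into
    map_size + 2 slots, and so on, so the count is a product of
    consecutive integers.
    """
    return prod(range(map_size + 1, map_size + n_new + 1))


# All seven candidate loci inserted into the 27-locus map in one run.
all_seven = insertion_orders(27, 7)

# One locus at a time: 28 positions for each run.
single_locus = insertion_orders(27, 1)

# Pairs: how many two-locus combinations, and how many orders per pair.
pairs = comb(7, 2)
orders_per_pair = insertion_orders(27, 2)

print(f"seven loci at once : {all_seven:,} maps")
print(f"one locus per run  : {single_locus} maps")
print(f"two-locus runs     : {pairs} pairs x {orders_per_pair} maps")
```

Under these assumptions the seven-locus run needs 28 x 29 x ... x 34 = 27,113,264,640 maps, while a single-locus run needs only 28 and a two-locus run 812, which is why small insertions are practical and the full one is not.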
With a new locus added to the current map, the remaining steps are:
The flipsn run shows that no changes in map order are
required! As before, though, there are a few locus pair
reversals that are within a LOD score of 3.000 of
this best map.
The final map of the p arm of chromosome 2 incorporates 28 loci
(36 when haplotyped loci are included), spans 254.0 cM, and has a LOD score
of -1141.012 .
    77 10 28 70 0 55 68 1 49 20 69 33 59 26 40 31 66 56 54 50 44 65 42 5 21 61 24 17

This conservative map has 20 backbone loci, plus eight others that probably go just before (if above) or just after (if below) their backbone neighbours.
Unsatisfied with reporting a conservative map? Another possibility is to go back to the dataset to see where its weaknesses lie, and to do another few experiments that will confirm or deny the current map. CRI-MAP's tool for viewing the data used for a particular map is the chrompic option.
The chrompic option takes from the dataset all the information used to produce the current map, and re-formats it for easy inspection. As with most CRI-MAP options, the chrompic output begins with a restatement of the parameters that were used. It then lists map-specific data for each family in the dataset. For example, the first family in the dataset, Family 1326, has no data relevant to the current map. Each individual from #3 to #9 is shown with two lines of 36 "dashes" beside it, and the number 0 beside each line.
    Family 1326 phase likelihood = 1.000, 2d best = 0.000
    3 ---------- ---------- ---------- ------ 0
      ---------- ---------- ---------- ------ 0
    4 ---------- ---------- ---------- ------ 0
      ---------- ---------- ---------- ------ 0

The lines represent the two copies of chromosome 2 in each person, the dashes represent the loci of the current map, and the number counts the cross-over events estimated to have occurred on that chromosome for the loci in their current order. There are 36 dashes because chrompic also reports on the loci in haplotyped systems.
The first family with relevant information is Family 1328.
    Family 1328 phase likelihood = 0.185, 2d best = 0.087
    3 o-----i--- ---------- ---i-iiii- ------ 1     1 CRYG1-5
      ------i--- ---------- ---o-oi-i- --i--i 2     7 D2S44

Looking at individual #3, both of her chromosomes have been scored for seven of the 36 loci of the current map. Her maternal chromosome (above) has data for map loci 1 7 24 26 27 28 & 29 (these are loci CRYG1-5 D2S44 D2S6 D2S46 D2S48 APOB_2 & APOB) and her paternal chromosome (below) has data for map loci 7 24 26 27 29 33 & 36 (D2S44 D2S6 D2S46 D2S48 APOB D2S47 & ACP1). Further, individual #3 has one cross-over on the maternal chromosome, between map loci 1 & 7, which places six paternal (i) alleles for D2S44 D2S6 D2S46 D2S48 APOB_2 & APOB on the otherwise maternal (o) chromosome. Two cross-overs on the paternal chromosome place maternal alleles for D2S6 and D2S46. Note the precision with which these cross-over events are placed: quite precisely between map loci 26 and 27 for the second cross-over on the paternal chromosome, and with very little precision for the first cross-over - somewhere between map loci 7 and 24. Finally, note that all data from this family is phase unknown, indicated by the use of lower case letters (i and o) to show loci as either paternally or maternally inherited.
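The reading of individual #3 given above can be reproduced mechanically. The following is a minimal Python sketch, not a CRI-MAP feature: the function names are hypothetical, it handles only the symbols seen in this excerpt (i, o and -), and it assumes a cross-over lies between any two consecutive scored loci whose parental origins differ.

```python
def informative_loci(chromosome: str) -> list[tuple[int, str]]:
    """Return (map locus number, origin symbol) for every scored locus.

    Spaces only group the symbols in tens, so they are dropped before
    numbering the positions from 1.
    """
    symbols = chromosome.replace(" ", "")
    return [(pos, s) for pos, s in enumerate(symbols, start=1) if s in "io"]


def crossover_intervals(chromosome: str) -> list[tuple[int, int]]:
    """Return the map-locus intervals in which the origin symbol switches."""
    scored = informative_loci(chromosome)
    return [
        (left_pos, right_pos)
        for (left_pos, left), (right_pos, right) in zip(scored, scored[1:])
        if left != right
    ]


# The two chromosome lines of individual #3, Family 1328, copied from above.
maternal = "o-----i--- ---------- ---i-iiii- ------"
paternal = "------i--- ---------- ---o-oi-i- --i--i"

print([pos for pos, _ in informative_loci(maternal)])  # scored map loci
print(crossover_intervals(maternal))  # one interval: 1 to 7
print(crossover_intervals(paternal))  # two intervals: 7 to 24, 26 to 27
```

The width of each returned interval is a direct measure of how precisely the cross-over is placed: (26, 27) is adjacent loci, while (7, 24) spans sixteen unscored loci.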
After the family data comes the summary of informative intervals. An interval, the gap between any two loci on the current map, is informative if one or more cross-overs are recorded within it. Thus, when we look at the "1____7" interval, we see it holds six cross-overs, one being from 1328-3-M (Family 1328, individual #3, maternal chromosome).
Finally, the chrompic output ends with the identities of the haplotyped loci in the current map, and the details of the Sex-Averaged Map.
The section of chrpc230.out that summarizes the cross-over chromosomes is the most useful for finding weaknesses in the dataset and seeing ways to improve them. One set of intervals with little support in the current map is the set among loci D2S44, Prot_C/pcr, D2S54+D2S54_2, & LCO, that is, among map loci 7 through 11. (Recall that the locus order D2S44 D2S54+D2S54_2 Prot_C/pcr LCO has only slightly less support than the current map, a LOD score decrease of 0.76.) Counting the cross-overs in each interval of this set gives a rough feeling for why.
    Locus IDs        68      77     1,3     49
    Map Loci       __7_______8______9,10____11__
    cross-over #     |---9---|---5---|---4---|
    per interval     |------11-------|
                             |-------8-------|
                     |----------10-----------|

There are almost as many cross-overs in interval "8____11" (8) as there are in total between "8__9,10" and "9,10__11" (4+5=9). Increasing the cross-over counts in intervals "8__9,10" and "9,10__11" would strengthen the support for the current map, as would decreasing the count in interval "8____11". A quick improvement could come from analysing the families with cross-overs in interval "8____11" for one of the two map loci 9 & 10 (D2S54 & D2S54_2). Similarly, support for the current map would improve if there were fewer cross-overs in the "7____9,10" interval and more between loci 7 & 8 and between 8 & 9,10 . Which families should be re-analysed, and for what loci?
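The comparison made above, between cross-overs placed in a wide interval and those resolved into the narrower intervals inside it, can be tabulated for all three wide intervals. This is a small Python sketch using only the counts from the diagram; the variable and function names are hypothetical.

```python
# Map loci 7 to 11 in their current order; 9 and 10 are a haplotyped pair.
order = ["7", "8", "9,10", "11"]

# Cross-over counts per interval, copied from the diagram above.
counts = {
    ("7", "8"): 9,
    ("8", "9,10"): 5,
    ("9,10", "11"): 4,
    ("7", "9,10"): 11,
    ("8", "11"): 8,
    ("7", "11"): 10,
}


def wide_versus_resolved(left: str, right: str) -> tuple[int, int]:
    """Return (count in the wide interval, sum over the adjacent intervals inside it)."""
    i, j = order.index(left), order.index(right)
    resolved = sum(counts[(order[k], order[k + 1])] for k in range(i, j))
    return counts[(left, right)], resolved


for left, right in [("8", "11"), ("7", "9,10"), ("7", "11")]:
    wide, resolved = wide_versus_resolved(left, right)
    print(f"{left}____{right}: {wide} unresolved vs {resolved} resolved")
```

Every cross-over moved from the left-hand column to the right-hand column, by scoring the missing locus in that individual, adds support for one specific order of loci 7 through 11.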
Below is an excerpt from chrpc230.out . There are 15 families having the 29 cross-overs in one of the three intervals "7____9,10", "7____11" & "8____11".
    7 9   1416-10-M 1416-5-M 1359-5-M 1356-11-M 1356-9-M 1345-5-P
    7 10  1463-5-P 1454-4-M 23-5-P 23-3-P 1354-6-P
    7 11  1458-8-M 1458-6-M 1454-7-P 1359-15-P 1359-10-P 1359-6-P 1359-5-P 1359-3-P 1349-7-P 13292-3-M
    8 11  1375-13-P 1375-6-P 1375-5-P 1375-4-P 1362-7-M 1340-8-M 1331-6-M 13292-3-P

Seven of these families have only one individual with a cross-over in a relevant interval, and six families have only two individuals; scoring entire families for new loci to scrutinise one or two cross-over events is unlikely to be an efficient use of resources.
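The tallies quoted above can be recomputed from the excerpt. The Python sketch below is an illustration only: it assumes each excerpt line starts with the two map loci bounding the interval, and that each entry reads family-individual-chromosome, as in 1328-3-M. The variable names are hypothetical.

```python
from collections import Counter, defaultdict

# Excerpt from chrpc230.out as quoted above, one interval per line (assumed layout).
excerpt = """\
7 9 1416-10-M 1416-5-M 1359-5-M 1356-11-M 1356-9-M 1345-5-P
7 10 1463-5-P 1454-4-M 23-5-P 23-3-P 1354-6-P
7 11 1458-8-M 1458-6-M 1454-7-P 1359-15-P 1359-10-P 1359-6-P 1359-5-P 1359-3-P 1349-7-P 13292-3-M
8 11 1375-13-P 1375-6-P 1375-5-P 1375-4-P 1362-7-M 1340-8-M 1331-6-M 13292-3-P
"""

per_interval = {}
chromosomes_per_family = Counter()
individuals_per_family = defaultdict(set)

for line in excerpt.splitlines():
    left, right, *entries = line.split()
    per_interval[(left, right)] = len(entries)
    for entry in entries:
        family, individual, _chromosome = entry.split("-")
        chromosomes_per_family[family] += 1
        individuals_per_family[family].add(individual)

total_crossovers = sum(per_interval.values())
n_families = len(chromosomes_per_family)

# How many families have 1, 2, ... cross-over chromosomes or individuals.
by_chromosome = Counter(chromosomes_per_family.values())
by_individual = Counter(len(ids) for ids in individuals_per_family.values())

print(total_crossovers, n_families)
print(sorted(by_chromosome.items()))
print(sorted(by_individual.items()))
```

Counted by cross-over chromosome, the excerpt gives the seven families with one and six with two mentioned above. Counted strictly by individual, family 13292 moves into the first group, because both of its entries belong to individual #3. Either way, only families 1359 and 1375 carry more than two of the 29 cross-overs.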
How does the partial map built in this tutorial compare? Here's our conservative map of the p arm again.
    77 10 28 70 0 55 68 1 49 20 69 33 59 26 40 31 66 56 54 50 44 65 42 5 21 61 24 17

And here's the 1992 CEPH Consortium map, reformatted for comparison.
    12 74 47 51 48 9 0 14 30 11 27 18 61 68 1 49 20 69 10 24 59 26 35 40 31 66 56 70 54 50 42 52 43 16 75 15

The maps are almost identical for the order of the loci they share. The one difference is the locus pair 56 70, already flagged in our map as having weak support. As for the loci the maps do not share, many of the extra loci in the consortium map are q arm loci we chose to ignore (e.g., 74 47 51 etc.). Other extra loci close to or on the p arm of the consortium map (18 16 75 35 15) are compensated by extra backbone loci in our conservative map (55 33 44 65 5) and by the other extra loci with weaker support (21 77 28 17).
I wish you good fortune with your use of CRI-MAP!