We can now start building maps. Every map that is built goes into a
dedicated storage structure called the ``heap'', which remembers the
best maps found during the whole map search. To put a first map in
the heap, we simply ask CarthaGene to assess the quality of the
default order specified in the mrkselset command. This is done using
the sem command. This procedure computes the true multipoint maximum
likelihood of the current order and prints the corresponding map (the
marker order used for printing may be reversed with respect to the
marker selection; see section 2.4.10).

CG> sem

Map -1 : log10-likelihood = -169.89
-------:
 Set : Marker List ...
   1 : L029 A079 A059 A036 M232 D022 M237 M030 M076 M034 T018 T035 L078 L00...

To build better maps, we can use heuristic map-building procedures. The two simple procedures nicemapl and nicemapd build reasonably good maps using, respectively, 2-point LOD and 2-point distances as a guide (trying to put markers with strong LOD or small distances close together). The two slightly more complex procedures mfmapl and mfmapd tend to provide better results. Both classes of heuristics are derived from the usual travelling salesman problem heuristics. In all cases, the true multipoint maximum likelihood of the resulting order is then computed and the map is printed.
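
To illustrate what a travelling-salesman-style heuristic guided by 2-point distances can look like, here is a minimal Python sketch of a nearest-neighbour ordering. It is only an illustration of the general idea: the marker names, the positions and the function nearest_neighbour_order are hypothetical, and nothing here should be read as the actual algorithm behind nicemapd or mfmapd.

```python
# Toy nearest-neighbour ordering guided by 2-point distances.
# Marker names and distances are hypothetical illustration data;
# this is not CarthaGene's actual implementation.

def nearest_neighbour_order(markers, dist, start):
    """Chain markers greedily: always append the closest unplaced marker."""
    order = [start]
    remaining = set(markers) - {start}
    while remaining:
        last = order[-1]
        # sorted() makes tie-breaking deterministic
        nxt = min(sorted(remaining), key=lambda m: dist[frozenset((last, m))])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical markers lying on a line at 0, 5, 9, 20 and 32 cM.
positions = {"m1": 0.0, "m2": 5.0, "m3": 9.0, "m4": 20.0, "m5": 32.0}
markers = list(positions)
dist = {
    frozenset((a, b)): abs(positions[a] - positions[b])
    for a in markers for b in markers if a != b
}

order = nearest_neighbour_order(markers, dist, start="m1")
print(order)  # ['m1', 'm2', 'm3', 'm4', 'm5']
```

Starting from "m3" instead yields m3, m2, m1, m4, m5, which is not the true order. This sensitivity to early choices is one reason why an order proposed by such a heuristic still has to be evaluated by a full multipoint likelihood computation.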

CG> nicemapl

Map -1 : log10-likelihood = -72.96
-------:
 Set : Marker List ...
   1 : L029 L010 L078 T035 D022 A059 L001 A079 M030 M232 T018 M237 M076 A03...

CG> nicemapd

Map -1 : log10-likelihood = -70.86
-------:
 Set : Marker List ...
   1 : L029 L010 L078 T035 D022 L001 A059 A079 M030 M232 T018 M237 M076 A03...

The log-likelihoods of the maps found by these two heuristics are -72.96 and -70.86 respectively, both far better than the -169.89 obtained for the default order. We can take a closer look at the maps built so far by asking for a detailed view of all the maps stored in the ``heap''. This is achieved using the heaprintd command:

CG> heaprintd

Map 0 : log10-likelihood = -169.89, log-e-likelihood = -391.19
-------:
 Data Set Number 1 :

      Markers   Distance  Cumulative    Distance   Theta    2pt
Pos   Id name    Haldane     Haldane     Kosambi  (%age)    LOD

  1   20 L029    29.9 cM     29.9 cM     24.3 cM  22.5 %    4.4
  2  284 A079     2.2 cM     32.2 cM      2.2 cM   2.2 %   18.4
  3  277 A059    21.6 cM     53.7 cM     18.3 cM  17.5 %    6.2
  4  255 A036    13.0 cM     66.8 cM     11.7 cM  11.5 %    9.0
  5  220 M232    10.1 cM     76.9 cM      9.2 cM   9.1 %    4.8
  6  239 D022    13.9 cM     90.8 cM     12.4 cM  12.2 %    3.2
  7  186 M237     8.7 cM     99.4 cM      8.0 cM   7.9 %   11.0
  8  132 M030     8.7 cM    108.1 cM      8.0 cM   7.9 %   11.0
  9   99 M076     5.9 cM    114.0 cM      5.6 cM   5.6 %   13.0
 10   94 M034    12.2 cM    126.3 cM     11.0 cM  10.9 %    8.6
 11   75 T018    19.3 cM    145.6 cM     16.6 cM  16.0 %    6.5
 12   85 T035     0.0 cM    145.6 cM      0.0 cM   0.0 %   21.4
 13   62 L078    14.5 cM    160.0 cM     12.8 cM  12.6 %    9.0
 14   42 L001    19.4 cM    179.4 cM     16.6 cM  16.1 %    6.0
 15   38 L010             ----------  ----------
                            179.4 cM    156.8 cM

      15 markers, log10-likelihood = -169.89
                  log-e-likelihood = -391.19

Map 1 : log10-likelihood = -72.96, log-e-likelihood = -167.99
-------:
 Data Set Number 1 :

      Markers   Distance  Cumulative    Distance   Theta    2pt
Pos   Id name    Haldane     Haldane     Kosambi  (%age)    LOD

  1   20 L029     0.0 cM      0.0 cM      0.0 cM   0.0 %   18.1
  2   38 L010     5.9 cM      5.9 cM      5.6 cM   5.6 %   13.1
  3   62 L078     0.0 cM      5.9 cM      0.0 cM   0.0 %   21.4
  4   85 T035     2.8 cM      8.7 cM      2.7 cM   2.7 %    9.6
  5  239 D022    12.7 cM     21.4 cM     11.4 cM  11.2 %    6.4
  6  277 A059     1.1 cM     22.5 cM      1.1 cM   1.1 %   19.9
  7   42 L001     3.4 cM     25.9 cM      3.3 cM   3.3 %   16.8
  8  284 A079     0.0 cM     25.9 cM      0.0 cM   0.0 %   21.7
  9  132 M030     3.4 cM     29.3 cM      3.3 cM   3.3 %   16.0
 10  220 M232     1.1 cM     30.4 cM      1.1 cM   1.1 %   17.8
 11   75 T018     4.7 cM     35.1 cM      4.5 cM   4.5 %   12.8
 12  186 M237     0.0 cM     35.1 cM      0.0 cM   0.0 %   19.9
 13   99 M076     5.9 cM     41.0 cM      5.6 cM   5.6 %   13.0
 14  255 A036     0.0 cM     41.0 cM      0.0 cM   0.0 %   21.4
 15   94 M034             ----------  ----------
                             41.0 cM     38.6 cM

      15 markers, log10-likelihood = -72.96
                  log-e-likelihood = -167.99

Map 2 : log10-likelihood = -70.86, log-e-likelihood = -163.17
-------:
 Data Set Number 1 :

      Markers   Distance  Cumulative    Distance   Theta    2pt
Pos   Id name    Haldane     Haldane     Kosambi  (%age)    LOD

  1   20 L029     0.0 cM      0.0 cM      0.0 cM   0.0 %   18.1
  2   38 L010     5.9 cM      5.9 cM      5.6 cM   5.6 %   13.1
  3   62 L078     0.0 cM      5.9 cM      0.0 cM   0.0 %   21.4
  4   85 T035     2.5 cM      8.5 cM      2.5 cM   2.5 %    9.6
  5  239 D022    11.5 cM     19.9 cM     10.4 cM  10.2 %    6.4
  6   42 L001     1.1 cM     21.0 cM      1.1 cM   1.1 %   19.9
  7  277 A059     2.2 cM     23.3 cM      2.2 cM   2.2 %   18.4
  8  284 A079     0.0 cM     23.3 cM      0.0 cM   0.0 %   21.7
  9  132 M030     3.4 cM     26.7 cM      3.3 cM   3.3 %   16.0
 10  220 M232     1.1 cM     27.8 cM      1.1 cM   1.1 %   17.8
 11   75 T018     4.7 cM     32.5 cM      4.5 cM   4.5 %   12.8
 12  186 M237     0.0 cM     32.5 cM      0.0 cM   0.0 %   19.9
 13   99 M076     5.9 cM     38.4 cM      5.6 cM   5.6 %   13.0
 14  255 A036     0.0 cM     38.4 cM      0.0 cM   0.0 %   21.4
 15   94 M034             ----------  ----------
                             38.4 cM     36.2 cM

      15 markers, log10-likelihood = -70.86
                  log-e-likelihood = -163.17

EM calls: Set 1 : 36 (33,0)
CPU Time (secs): 0.17
Maps within -3.0: 2

So far, the heap contains only our 3 maps. We can try to build other maps using a smarter heuristic procedure called build. This command inserts markers incrementally, always choosing the best log-likelihood and the best insertion point. Because it may be too ``greedy'' in its choices, the procedure can be run on several maps in parallel, always keeping the best ones. The command therefore takes one parameter, which specifies the number of maps built at the same time.
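
The idea of inserting markers one at a time while keeping several partial maps alive can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not CarthaGene's build: markers are inserted in a fixed order, the score is a stand-in (the negative sum of adjacent 2-point distances) rather than a multipoint likelihood, and the names score, build, positions and width are all hypothetical.

```python
# Toy sketch of incremental insertion with several partial maps kept in parallel.
# The score is a stand-in (negative sum of adjacent 2-point distances), not the
# multipoint likelihood that CarthaGene computes; all data are hypothetical.

def score(order, positions):
    """Higher is better: shorter total length between adjacent markers."""
    return -sum(abs(positions[a] - positions[b]) for a, b in zip(order, order[1:]))

def build(markers, positions, width):
    """Insert markers one by one, keeping the `width` best partial orders."""
    beam = [tuple(markers[:2])]
    for marker in markers[2:]:
        candidates = set()
        for order in beam:
            # try every insertion point in every partial order currently kept
            for i in range(len(order) + 1):
                candidates.add(order[:i] + (marker,) + order[i:])
        # best score first; the order itself breaks ties deterministically
        beam = sorted(candidates, key=lambda o: (-score(o, positions), o))[:width]
    return beam

# Hypothetical markers lying on a line, given in scrambled order.
positions = {"m1": 0.0, "m2": 5.0, "m3": 9.0, "m4": 20.0, "m5": 32.0}
markers = ["m3", "m1", "m5", "m2", "m4"]

best = build(markers, positions, width=3)
print(best[0], score(best[0], positions))
```

With width set to 1 the sketch is purely greedy. A larger width keeps alternative partial orders alive, so that an early choice that looks slightly worse is not discarded immediately.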

CG> build 10

Build(10) : |||||||||||||||

Map 5 : log10-likelihood = -70.86
-------:
 Set : Marker List ...
   1 : L029 L010 L078 T035 D022 L001 A059 A079 M030 M232 T018 M237 M076 A03...

No better map was found. We may now move on to the so-called ``improving'' methods. These methods cannot start from scratch and are dedicated to improving maps that are already available.
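
The columns of the detailed heaprintd listing shown earlier in this section can be cross-checked by hand. The sketch below assumes that the Haldane and Kosambi columns follow the standard textbook map functions and that the two likelihood values differ only by the base of the logarithm; the function names are hypothetical, and small differences from the printed values are expected because the listing shows rounded numbers.

```python
import math

def haldane_cm(theta):
    """Haldane map distance in cM for a recombination fraction theta < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * theta)

def kosambi_cm(theta):
    """Kosambi map distance in cM for a recombination fraction theta < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))

def log10_to_ln(log10_value):
    """Convert a base-10 log-likelihood to a natural-log likelihood."""
    return log10_value * math.log(10.0)

# First interval of Map 0 in the listing: theta = 22.5 %,
# printed as 29.9 cM (Haldane) and 24.3 cM (Kosambi).
theta = 0.225
print(haldane_cm(theta))     # close to the printed 29.9 cM
print(kosambi_cm(theta))     # close to the printed 24.3 cM (theta itself is rounded)

# Map 0 is reported with log10-likelihood -169.89 and log-e-likelihood -391.19.
print(log10_to_ln(-169.89))  # close to the printed -391.19
```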
Thomas Schiex 2009-10-27