prompt> programname argument1 argument2 -switch1 -switch2
This is the short description of the program that is running.
It is usually two lines long and fairly terse.
PROGRAMNAME what sequence(s) ? ge:someseq
Begin (* 1 *) ?
End (* 516 *) ?
Reverse (* No *) ?
Select one of:
A) First option
B) Second option
Please choose one (* A *): B (don't accept defaults without
knowing what you are accepting)
What should I call the output file (* someseq.pgmnm *) ?
Note that the arguments can occur before or after any switches; an argument is actually the answer to the programme's default switch "-INfile=". If the arguments are not present on the command line, the programme will prompt for them. If switches are not present on the command line, the programme will use default values and will NOT prompt for them.
To see what switches are available and optionally to set them, run the programme with the switch "-CHEck". You may abbreviate a switch by entering only the uppercase part of the switchname; the rest is optional.
prompt> programname -che
This is the short description of the program that is running.
It is usually two lines long and fairly terse.
Press <rtn> for more:
Syntax: % programname [-INfile=]GenEMBL:Humhb*
Required Parameters: None
Local Data Files: None
Optional Parameters:
-OUTfile=FileName copy file(s)-sequence(s) into one file
-DOCLines=6 copies only the first 6 lines of documentation.
-NOMONitor suppresses the screen monitor
-PROtein input sequence is protein
Add what to the command line ? -pro
PROGRAMNAME what sequence(s) ?
etc.
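The abbreviation rule and the free ordering of arguments and switches can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the E/GCG parser itself: the function names are made up for this example, and the assumption is that a typed switch must contain the whole upper-case part of the name and may continue with any prefix of the remaining letters.

```python
import re

# Switch names as they appear in the -CHEck listing above.
SWITCHES = ["-INfile", "-OUTfile", "-DOCLines", "-NOMONitor", "-PROtein", "-CHEck"]

def min_abbrev(switch):
    """Return the mandatory (leading upper-case) part of a switch, lower-cased."""
    name = switch.lstrip("-")
    match = re.match(r"[A-Z0-9]+", name)
    return (match.group(0) if match else name).lower()

def matches(typed, switch):
    """True if `typed` is an acceptable abbreviation of `switch` (assumed rule)."""
    typed_name = typed.lstrip("-").lower()
    full_name = switch.lstrip("-").lower()
    return (full_name.startswith(typed_name)
            and len(typed_name) >= len(min_abbrev(switch)))

def split_command(words, switches=SWITCHES):
    """Separate positional arguments from switches, whatever their order."""
    args, found = [], {}
    for word in words:
        if not word.startswith("-"):
            args.append(word)          # an answer to the default switch
            continue
        name, _, value = word.partition("=")
        hits = [s for s in switches if matches(name, s)]
        if len(hits) != 1:
            raise ValueError(f"unknown or ambiguous switch: {word}")
        found[hits[0]] = value or True
    return args, found

print(split_command(["-pro", "ge:someseq", "-out=someseq.pgmnm"]))
```

With these definitions "-che" and "-check" are both accepted for "-CHEck", while "-ch" is rejected because it is shorter than the upper-case part.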
One point to note about arguments for E/GCG programmes: arguments that are database entries (that is, entries from E/GCG data libraries) may be given in upper and/or lower case because E/GCG itself is "case-insensitive". E/GCG programmes run under UNIX, though, and UNIX is a "case-sensitive" operating system. Therefore, if an argument is a UNIX file whose name contains one or more upper-case letters, it must be typed with exactly those upper-case letters.
In addition to the files or data library entries you specify, map accesses a file describing a vast number of commercially available restriction enzymes to determine what sites it can seek. This extra input file is normally read in from a central, hidden part of the system. We will fetch this file, too, and modify it to reflect our enzyme freezer stock, budget, and available vector sites.
prompt> map
Map displays both strands of a DNA sequence with restriction sites shown
above the sequence and possible protein translations shown below.
(Linear) MAP of what sequence ? hsfau.ge_pr
Begin (* 1 *) ?
End (* 518 *) ?
Select the enzymes: Type nothing or "*" to get all enzymes. Type "?"
for help on which enzymes are available and how to select them.
Enzyme(* * *):
What protein translations do you want:
a) frame 1 b) frame 2 c) frame 3 d) frame 4 e) frame 5 f) frame 6
t)hree forward frames s)ix frames o)pen frames only
n)o protein translation q)uit
Please select (capitalize for 3-letter) (* t *):
What should I call the output file (* hsfau.map *) ?
prompt> more hsfau.map
(Linear) MAP of: hsfau check: 2981 from: 1 to: 518
LOCUS HSFAU 518 bp RNA PRI 23-SEP-1993
DEFINITION H.sapiens fau mRNA.
ACCESSION X65923
KEYWORDS fau gene.
SOURCE human.
ORGANISM Homo sapiens
. . .
With 209 enzymes: *
October 26, 1995 15:21 ..
[enzyme names, printed vertically above their cut sites, not reproduced here]
         TTCCTCTTTCTCGACTCCATCTTCGCGGTAGCTGGGACCGCCGTTCAGTCGCCAATATGC
       1 ---------+---------+---------+---------+---------+---------+ 60
         AAGGAGAAAGAGCTGAGGTAGAAGCGCCATCGACCCTGGCGGCAAGTCAGCGGTTATACG
a        F  L  F  L  D  S  I  F  A  V  A  G  T  A  V  Q  S  P  I  C   -
b         S  S  F  S  T  P  S  S  R  *  L  G  P  P  F  S  R  Q  Y  A  -
c          P  L  S  R  L  H  L  R  G  S  W  D  R  R  S  V  A  N  M  Q -
[several pages deleted]
Enzymes that do cut:
AceIII AciI AflII AluI ApaI AscI AvaII BanII BbsI BbvI
BccI BcefI BmgI BpmI Bpu1102I BsaJI BsaXI BscGI BsiEI BsiHKAI
BslI BsmFI BsoFI Bsp1286I BsrI BsrDI BsrFI BssHII BstEII Bsu36I
Cac8I CviJI CviRI DdeI DpnI DrdII EaeI EciI EcoO109I EcoRII
FauI FokI GdiII HaeI HaeII HaeIII HhaI Hin4I HincII HinfI
HphI MaeII MaeIII MboII MnlI MscI MseI MspI MwoI NciI
NlaIII NlaIV NspI PleI Psp1406I RsaI Sau96I Sau3AI ScrFI SfaNI
SphI TaqI TauI ThaI TseI Tsp45I Tsp509I TspRI Tth111II UbaCI
Enzymes that do not cut:
AatII AccI AflIII AhdI AlwI AlwNI ApaBI ApaLI ApoI AvaI
AvrII BaeI BamHI BanI Bce83I BcgI BcgI BclI BfaI BfiI
BglI BglII BplI Bpu10I BsaI BsaAI BsaBI BsaHI BsaWI BsbI
BseRI BsgI BsmI BsmAI BsmBI Bsp24I Bsp24I BspEI BspGI BspLU11I
BspMI BsrBI BsrGI BssSI Bst1107I BstXI BstYI CjeI CjeI CjePI
CjePI ClaI DraI DraIII DrdI DsaI EagI EarI Eco47III Eco57I
EcoNI EcoRI EcoRV FseI FspI HgaI HgiEII HindIII HpaI KpnI
MluI MmeI MslI MspA1I MunI NarI NcoI NdeI NgoAIV NheI
NotI NruI NsiI NspV PacI Pfl1108I PflMI PinAI PmeI PmlI
PshAI Psp5II PstI PvuI PvuII RcaI RleAI RsrII SacI SacII
SalI SanDI SapI ScaI SexAI SfcI SfiI SgfI SgrAI SmaI
SnaBI SpeI SrfI Sse8387I Sse8647I SspI StuI StyI SunI SwaI
TaqII TaqII TfiI Tth111I VspI XbaI XcmI XhoI XmnI
prompt>
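To make the idea of "sites it can seek" concrete, here is a rough Python sketch that looks for a few recognition sequences in the first 60 bases of hsfau shown in the map output above. It is not how map itself works internally; the function names are invented for this illustration, only simple palindromic sites are handled, and the positions reported are where the recognition sequence starts, not where the enzyme cuts.

```python
# First 60 bases of hsfau, copied from the map output above.
SEQ = "TTCCTCTTTCTCGACTCCATCTTCGCGGTAGCTGGGACCGCCGTTCAGTCGCCAATATGC"

# Well-known recognition sequences (all palindromic, so one strand suffices).
SITES = {"AluI": "AGCT", "TaqI": "TCGA", "EcoRI": "GAATTC"}

def complement(seq):
    """Bottom strand as map prints it: complemented, not reversed."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))

def find_sites(seq, site):
    """1-based start positions of every occurrence, overlapping ones included."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)
    return positions

for enzyme, site in SITES.items():
    print(enzyme, site, find_sites(SEQ, site))
print(complement(SEQ))
```

On this fragment AluI and TaqI each have one site and EcoRI has none, which agrees with the "do cut" and "do not cut" lists in the output.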
prompt> fetch data:enzyme.dat
prompt> map hsfau.ge_pr -dat=enzyme.dat -out=hsfau2.map
prompt> more hsfau2.map
prompt> map -che
prompt> map hsfau.ge_pr -dat=enzyme.dat -out=hsfau3.map -minc=2 -maxc=3
prompt> more hsfau3.map
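The effect of combining a trimmed enzyme file with "-minc" and "-maxc" can be pictured with the short Python sketch below. It assumes that these two switches set the minimum and maximum number of cuts an enzyme may make to be reported ("map -che" will confirm what they do on your system). The function name, the freezer list and the cut positions are all invented for illustration and are not taken from the hsfau output.

```python
def select_enzymes(cut_positions, in_stock, min_cuts=2, max_cuts=3):
    """Names of stocked enzymes that cut between min_cuts and max_cuts times."""
    return sorted(
        name
        for name, positions in cut_positions.items()
        if name in in_stock and min_cuts <= len(positions) <= max_cuts
    )

# Made-up cut positions, for illustration only.
EXAMPLE_CUTS = {
    "AluI": [30, 212, 401],
    "TaqI": [11],
    "HaeIII": [95, 310],
    "DdeI": [40, 150, 260, 377],
    "MspI": [120, 333],
}

# Hypothetical freezer stock.
STOCK = {"AluI", "TaqI", "HaeIII", "DdeI", "EcoRI"}

print(select_enzymes(EXAMPLE_CUTS, STOCK))
```

Here TaqI is dropped for cutting too rarely, DdeI for cutting too often, and MspI because it is not in the freezer.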
prompt> setplot
+---------------------> displaying all of 10 option(s) <---------------------+
|psf    postscript - sent to file: homedir:graf.ps                             |
|epsf   eps postscript - sent to file: homedir:graf.eps                        |
|hpg    hp laser with hpgl - sent to file: homedir:graf.hp                     |
|xcol   x windows colour graphics - for x-windows terminal                     |
|xmon   x windows monochr. graphics - for x-windows terminal                   |
|vt340  vt340 graphics - for a vt340 terminal                                  |
|vt241  vt241 graphics - for a vt241 terminal                                  |
|tek    versaterm tektronix 4105 graphics on your terminal                     |
|dec    declaser 5100 postscript/pcl/hpgl printer at biobase                   |
|qms    qms colorscript210 ps printer at biobase (14 kr./pg)                   |
|                                                                              |
|                                                                              |
+------------------------------------------------------------------------------+
enter a command. choices are:
<up-arrow> and <down-arrow> scroll the list
<return> makes GCG use the selected device
Q quits without doing anything
C creates and edits a new device (you can't delete from the site file)
V views the selection (use C to edit a copy)
prompt> mapplot hsfau.ge_pr -dat=enzyme.dat -minc=2 -maxc=3
This final output might show possibilities for sub-cloning most of hsfau with only one enzyme. Can you sub-clone a fragment that is only coding sequence? Which open reading frame(s) is (are) used? Where is this information shown in the original sequence file? (Hint!) Are "hsef2.ge_pr" or "hsht.ge_pr" better or worse prospects for sub-cloning with your reduced enzyme list?
prompt> eextractpeptide hsfau3.map -out=hsfau3.pep
prompt> more hsfau3.pep
Given that we know the coding regions for these three example sequences, let's translate them properly into proteins. For quick reference, the coding regions of these three sequences follow:
data library entry | filename | coding sequence |
---|---|---|
ge:hsef2 | hsef2.ge_pr | 1 .. 2577 |
ge:hsfau | hsfau.ge_pr | 57 .. 458 |
ge:hsht | hsht.ge_pr | 128 .. 1420 |
prompt> translate hsef2.ge_pr
TRANSLATE translates nucleotide sequences into peptide sequences.
Begin (* 1 *) ?
End (* 3075 *) ? 2577
Reverse (* No *) ?
Range begins ATGGT and ends TGTAG. Is this correct (* Yes *) ?
That is done, now would you like to:
A) Add another exon from this sequence
B) Add another exon from a new sequence
C) Translate and then add more genes from this sequence
D) Translate and then add more genes from a new sequence
W) Translate assembly and write everything into a file
Please choose one (* W *):
What should I call the output file (* hsef2.pep *) ?
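What the Begin and End answers do can be sketched in Python as follows. This is a minimal illustration using the standard genetic code, not a description of translate's own implementation; the function name and its parameters are chosen for this example, and coordinates are taken to be 1-based and inclusive, as in the table of coding regions above. The test sequence is the first 60 bases of hsfau from the map output.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(seq, begin=1, end=None):
    """Translate seq[begin..end] (1-based, inclusive); a trailing partial
    codon is ignored and unknown codons are shown as X."""
    end = len(seq) if end is None else end
    region = seq[begin - 1:end].upper()
    usable = len(region) - len(region) % 3
    return "".join(CODON_TABLE.get(region[i:i + 3], "X")
                   for i in range(0, usable, 3))

# First 60 bases of hsfau, copied from the map output.
SEQ = "TTCCTCTTTCTCGACTCCATCTTCGCGGTAGCTGGGACCGCCGTTCAGTCGCCAATATGC"

print(translate(SEQ))          # frame a of the map output
print(translate(SEQ, 57, 59))  # first codon of the hsfau coding region

# Each coding region in the table should be a whole number of codons.
for begin, end in [(1, 2577), (57, 458), (128, 1420)]:
    print(begin, end, (end - begin + 1) % 3 == 0)
```

The first call reproduces frame a of the map output, and position 57, where the table says the hsfau coding region starts, is indeed the A of an ATG codon.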