From zhuiastate.edu Tue Jan 17 18:47:13 2012
From: "Hu, Zhiliang [AN S]" <zhuiastate.edu>
To: Multiple Recipients of <crimap-usersanimalgenome.org>
Subject: Re: Update: Bash shell script for parsing chrompic files to
show only phase lines with highest recombinations
Date: Tue, 17 Jan 2012 18:47:13 -0600
Hi Jill,
Sorry, the problem was not your copy and paste: I have a line wrapper
script that "watches" for long lines and breaks them. I have manually
corrected your post on the archive and it should look fine now:
http://www.animalgenome.org/tools/share/crimap/forum/
A small trick to keep raw text format on this list is to precede each line
with a space/spaces or "bullet" symbols (eg. "o ", "- ", "# ", etc).
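
For anyone who wants to automate that, here is a minimal sketch of the
space-prefix trick. It assumes, as described above, that a leading space is
enough for the list archive to leave a line alone. The function name
indent_for_list is made up for this illustration.

```bash
#!/bin/bash
# indent_for_list: hypothetical helper name, not part of any list software.
# Reads text on stdin and writes it back with one leading space per line.
indent_for_list() {
    sed 's/^/ /'
}

# Example: indent a short snippet before pasting it into a mail.
printf 'if [ $# -ne 2 ]; then\nexit 127\nfi\n' | indent_for_list
```

To prepare a whole script, something like
"indent_for_list < chrompic_recomb_ord.sh" should do; remember to strip the
leading space again after copying the script back out of a mail.
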
I have some old perl scripts to "clean" the crimap output for easier visual
evaluations or direct copy to a table. Perhaps I should contribute them as
well :-)
Zhiliang
/on PAG @ san diego
On Jan 17, 2012, at 04:09 PM, Jill Maddox wrote:
>
> Hi again
>
> Sorry, I cut and pasted the script into my mail program and had Unix rather
> than DOS carriage returns so that the lines in the script got wrapped and
> truncated in the wrong places. For those of you not familiar with scripts,
> the \ should be at the end of a line and should be preceded by a space.
>
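
As a small illustration of the continuation rule just described: a backslash
that is the very last character on a line joins that line to the next one.
This is only a sketch with made-up input, and the function name smallest_of
is hypothetical.

```bash
#!/bin/bash
# smallest_of: hypothetical helper that prints the smallest of its arguments.
# The pipeline is split over three lines; each "\" is the last character
# on its line and is preceded by a space.
smallest_of() {
    printf '%s\n' "$@" | \
        sort -n | \
        head -n 1
}

smallest_of 3 1 2
```

If a mail program re-wraps such a script so that a "\" ends up in the middle
of a line, the shell treats it as an escape for the following character
instead of a line join, and the command breaks.
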
> Here is the corrected script
>
> ========================================================================
> #!/bin/bash
> # chrompic_recomb_ord.sh
> # select lines containing at least min_num_recomb recombinations and
> # order them from lowest to highest
> # Input: chrompic filename, minimum number of recombinations
> # Output: chrompic filename with .recomb suffix
> if [ $# -ne 2 ]; then
> echo 1>&2 Usage: $0 chrompic_filename min_num_recomb
> exit 127
> fi
> if ! [ -f "${1}" ]; then
> echo "file ${1} not found"
> exit 127
> fi
> if ! [[ "$2" =~ ^[0-9]+$ ]]; then
> echo "$2 is not a suitable number"
> exit 127
> fi
>
> egrep " [-0o1i]{10} " $1 | sed 's/ \([0-9]*\)$/\t\1/g' | \
> gawk -v var=$2 'BEGIN{FS=OFS="\t"; line1=0; minrec=var;} \
> {if (line1 == 0) {line1 = 1;num_in_line=split($1, lineinfo, " "); \
> ind = lineinfo[1]; \
> if ($2 >= minrec){for (i=2; i <= num_in_line; i++) \
> printf "%s", lineinfo[i];printf "\t%d\t%d\n", $2, ind;} \
> else next;} \
> else {line1 = 0; if ($2 >= minrec) \
> {num_in_line=split($1, lineinfo, " "); \
> for (i=2; i <= num_in_line; i++) printf "%s", lineinfo[i]; \
> printf "\t%d\t%d\n", $2, ind;}}}' | \
> sort -n -k2 -n -k3 > ${1}.recomb
> exit
>
> ========================================================================
>
>
> Regards
>
>
> Jill
>
>
> ***************************************************************
>
> Jill Maddox 16 Park Square Port Melbourne, 3207 Australia phone: 03 9646
> 0428 E-mail: jillian.maddoxalumni.unimelb.edu.au
>
> ***************************************************************
>
>