transcript_annotation

After Matt ran the analyses, he sent me a list of transcripts to annotate.

I double-checked the scaffold positions for the transcripts in multiple raw fastq files from Matt, then did the following to create a file with the cuffIDs, their scaffolds, and their start and stop positions:

Folder: /uufs/chpc.utah.edu/common/home/gompert-group1/data/lycaeides/matt_transcriptome/cufflinks/genomeref/transcript_annot

cut -f1 ../KS001.gtf > scaffold_ks001

cut -f9 ../KS001.gtf | cut -d';' -f2 | cut -d' ' -f3 > cuffids_ks001
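
The R code in these notes also reads a file called startstop_ks001, but the command that created it is not recorded here. In the GTF format, columns 4 and 5 hold the start and end coordinates, so it was presumably cut from those columns. As a rough sketch only, the same three pieces (scaffold, transcript ID, start/stop) could be pulled from one GTF line in Python as below. The function name and the example line are hypothetical and are not taken from KS001.gtf.

```python
def parse_gtf_line(line):
    """Return (transcript_id, scaffold, start, stop) for one tab-separated GTF line."""
    fields = line.rstrip("\n").split("\t")
    scaffold = fields[0]          # column 1: sequence (scaffold) name
    start = int(fields[3])        # column 4: start coordinate
    stop = int(fields[4])         # column 5: end coordinate
    transcript_id = None
    # column 9 holds 'key "value";' pairs; pick out transcript_id
    for attr in fields[8].split(";"):
        attr = attr.strip()
        if attr.startswith("transcript_id "):
            transcript_id = attr.split(" ", 1)[1].strip('"')
    return transcript_id, scaffold, start, stop


# Made-up example line in the layout Cufflinks GTF files usually have
example = (
    "Scaffold_1\tCufflinks\ttranscript\t100\t500\t1000\t+\t.\t"
    'gene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "2.5";'
)
record = parse_gtf_line(example)
print(record)
```

Looking the transcript ID up by key, rather than by its position in the attribute column as the cut commands do, keeps the sketch from breaking if the attribute order differs between files.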

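##in R

#transcripts Matt asked me to annotate (IDs are in column 1)

trlist<-read.table("transcript_list.txt", header=FALSE, stringsAsFactors=FALSE)

trids<-unique(trlist[,1])

#get the files with all the scaffolds, transcript ids and start/stop positions

sp<-read.table("scaffold_ks001", header=FALSE, stringsAsFactors=FALSE)

cid<-read.table("cuffids_ks001", header=FALSE, stringsAsFactors=FALSE)

pos<-read.table("startstop_ks001", header=FALSE)

#combine into one table: cuffID, scaffold, then the start/stop columns

final<-cbind(cid,sp, pos)

final<-as.data.frame(final)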

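#loop to crosscheck and write out the file

#append=TRUE adds to any existing scafs_transcriptids.txt, so remove an old copy before rerunning

for (i in trids){

#unique rows of final that match this transcript id

scafids<-unique(final[which(final[,1] == i),])

print(scafids)

write.table(scafids, "scafs_transcriptids.txt",append=TRUE, row.names=FALSE, col.names=FALSE, quote=FALSE)

}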

I used the following script to run the annotation: create_snp_annotations.py. This is the same script I used to annotate SNPs for the dubois data. I ran it on both the start and the stop positions of the transcripts that Matt sent me.

For each run I used only two columns from the scafs_transcriptids.txt file: scaffold and start position for one run, scaffold and stop position for the other.
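
As an illustration of that split, here is a minimal sketch that turns rows of scafs_transcriptids.txt into the two map inputs. It assumes the rows are whitespace-separated in the order cuffID, scaffold, start, stop (the order produced by the R code in these notes), and that a map file is simply "scaffold position" per line. The exact layout create_snp_annotations.py expects is not recorded here, so treat the output format, the function name, and the example rows as assumptions.

```python
def split_start_stop(rows):
    """Split 'cuffID scaffold start stop' rows into start and stop map lines."""
    start_rows, stop_rows = [], []
    for row in rows:
        parts = row.split()
        if len(parts) < 4:
            continue  # skip blank or incomplete lines
        _cuffid, scaffold, start, stop = parts[:4]
        start_rows.append(f"{scaffold} {start}")
        stop_rows.append(f"{scaffold} {stop}")
    return start_rows, stop_rows


# Made-up rows in the assumed layout of scafs_transcriptids.txt
rows = [
    "CUFF.1.1 Scaffold_1 100 500",
    "CUFF.2.1 Scaffold_7 2000 2600",
    "",
]
start_rows, stop_rows = split_start_stop(rows)
print(start_rows)
print(stop_rows)
```

Each list could then be written one entry per line to transcripts_start and transcripts_stop.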

python create_snp_annotations.py --map transcripts_start --ann genome_annotation.txt --out out_transcripts_start

python create_snp_annotations.py --map transcripts_stop --ann genome_annotation.txt --out out_transcripts_stop

On July 10th 2020: Su'ad asked me to send back annotations for her result tables. I wrote a script (transcript_annot.py) for this, created the tables (*annot.csv), and sent them back to her. In the output tables, I manually edited the column names and filled a few empty columns with 0. The files are saved in the suad_results_annot folder inside the transcript_annot folder above.
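
The manual clean-up could also be scripted so that it is repeatable. The sketch below is not part of transcript_annot.py. It only shows one way to rename header columns and put 0 into empty cells of a CSV table using the Python standard library. The column names and the example table are invented for illustration.

```python
import csv
import io


def tidy_table(text, new_names):
    """Rename header columns via new_names and replace empty cells with '0'."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    header = [new_names.get(name, name) for name in header]
    body = [[cell if cell.strip() else "0" for cell in row] for row in body]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(body)
    return out.getvalue()


# Invented example table; the real *annot.csv columns are not listed in these notes
raw = "V1,V2,V3\nCUFF.1.1,Scaffold_1,\nCUFF.2.1,Scaffold_7,12\n"
tidy = tidy_table(
    raw, {"V1": "transcript_id", "V2": "scaffold", "V3": "n_annotations"}
)
print(tidy)
```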
