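Purpose
Find restriction enzyme site(s) that can be introduced into the DNA encoding an arbitrary amino acid sequence.
When designing a DNA sequence from an arbitrary amino acid sequence, the script determines whether a desired restriction enzyme cleavage site can be inserted and, if it can, outputs candidate DNA sequences.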
How to use
"restrictionsite_maker.py" source code
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
Created on Sat Sep 12 12:47:04 2015
@author: Gembu Maryu
"""
import itertools
import random
# codons that encode each amino acid
amino = {
    'A': ['GCC', 'GCT', 'GCG', 'GCA'],
    'C': ['TGC', 'TGT'],
    'E': ['GAA', 'GAG'],
    'D': ['GAC', 'GAT'],
    'G': ['GGC', 'GGG', 'GGT', 'GGA'],
    'F': ['TTC', 'TTT'],
    'I': ['ATA', 'ATC', 'ATT'],
    'H': ['CAT', 'CAC'],
    'K': ['AAA', 'AAG'],
    'M': ['ATG'],
    'L': ['CTT', 'TTG', 'CTG', 'TTA', 'CTC', 'CTA'],
    'N': ['AAC', 'AAT'],
    'Q': ['CAA', 'CAG'],
    'P': ['CCT', 'CCC', 'CCG', 'CCA'],
    'S': ['TCT', 'TCC', 'TCG', 'TCA', 'AGC', 'AGT'],
    'R': ['AGG', 'CGG', 'CGC', 'CGA', 'AGA', 'CGT'],
    'T': ['ACC', 'ACG', 'ACT', 'ACA'],
    'W': ['TGG'],
    'V': ['GTC', 'GTT', 'GTA', 'GTG'],
    'Y': ['TAT', 'TAC'],
}
# human codon usage: relative usage value -> codon, for each amino acid
usage = {
    'A': {1.0: 'GCC', 0.6: 'GCT', 0.1: 'GCG', 0.33: 'GCA'},
    'C': {1.0: 'TGC', 0.46: 'TGT'},
    'E': {0.42: 'GAA', 1.0: 'GAG'},
    'D': {1.0: 'GAC', 0.46: 'GAT'},
    'G': {1.0: 'GGC', 0.66: 'GGG', 0.16: 'GGT', 0.41: 'GGA'},
    'F': {1.0: 'TTC', 0.46: 'TTT'},
    'I': {0.17: 'ATA', 1.0: 'ATC', 0.53: 'ATT'},
    'H': {0.42: 'CAT', 1.0: 'CAC'},
    'K': {0.43: 'AAA', 1.0: 'AAG'},
    'M': {1.0: 'ATG'},
    'L': {0.27: 'CTT', 0.4: 'TTG', 1.0: 'CTG', 0.14: 'TTA', 0.6: 'CTC', 0.06: 'CTA'},
    'N': {1.0: 'AAC', 0.47: 'AAT'},
    'Q': {0.27: 'CAA', 1.0: 'CAG'},
    'P': {0.68: 'CCT', 1.0: 'CCC', 0.11: 'CCG', 0.39: 'CCA'},
    'S': {0.54: 'TCT', 0.76: 'TCC', 0.05: 'TCG', 0.2: 'TCA', 1.0: 'AGC', 0.35: 'AGT'},
    'R': {0.79: 'AGG', 0.58: 'CGG', 0.38: 'CGC', 0.2: 'CGA', 1.0: 'AGA', 0.08: 'CGT'},
    'T': {1.0: 'ACC', 0.11: 'ACG', 0.36: 'ACT', 0.64: 'ACA'},
    'W': {1.0: 'TGG'},
    'V': {0.54: 'GTC', 0.3: 'GTT', 0.12: 'GTA', 1.0: 'GTG'},
    'Y': {1.0: 'TAT', 0.56: 'TAC'},
}
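# restriction enzyme name -> recognition sequence (lower case)
res_enz = {
    'AatII': 'gacgtc', 'Asp718': 'ggtacc', 'BamHI': 'ggatcc', 'BglII': 'agatct',
    'ClaI': 'atcgat', 'EcoRV': 'gatatc',
    'HindIII': 'aagctt',  # HindIII recognizes AAGCTT (AGGCCT is the StuI site)
    'MluI': 'acgcgt',
    'PinAI': 'accggt',  # PinAI recognizes ACCGGT (CTGCAG is the PstI site)
    'PspOMI': 'gggccc', 'PstI': 'ctgcag',
    'PvuI': 'cgatcg',  # CGATCG is the PvuI site
    'PvuII': 'cagctg', 'SpeI': 'actagt', 'XmaI': 'cccggg',
}
# recognition sequence -> restriction enzyme name
res_inv = {v: k for k, v in res_enz.items()}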
aa_seq = input('Input the amino acid sequence >> ')  # wait for user input
#aa_seq = 'HSFQKQQREKTRWDNSGRGDE' # sample amino acid sequence
aa_li = [aa_seq[f:f+3] for f in range(len(aa_seq)-2)]  # every window of three consecutive amino acids


def amino2dna(seq):
    """Convert an amino acid sequence to DNA, choosing each codon at random
    using the relative usage values in `usage`."""
    genome = ''
    for char in seq:
        rand = random.random()  # random number, 0 <= rand < 1
        cumlo = sorted(usage[char].keys())  # usage values in ascending order
        # take the codon of the first usage value that is larger than rand
        for freq in cumlo:
            if rand < freq:
                genome = genome + usage[char][freq]
                break
    return genome


# make all DNA sequence patterns from each amino acid trio
for aa_tri in aa_li:  # aa_tri: amino acid trio
    #aa_tri = aa_li[1]
    dna_seq_list = [' ']
    for aa in aa_tri:  # aa: single amino acid
        codon_list = amino[aa]
        #print(codon_list)
        # combine every partial DNA sequence with every codon of this amino acid
        conn_codon = list(itertools.product(dna_seq_list, codon_list))
        dna_seq_list = []  # reset dna_seq_list
        for tmp in conn_codon:
            dna_seq_list.append(tmp[0] + tmp[1])  # store the connected DNA sequence
    #print(dna_seq_list)
    # search for restriction enzyme recognition sites
    for r in res_inv:
        for d in dna_seq_list:
            if r in d.lower():
                print(res_inv[r] + ': ' + aa_tri + '(' + d + ' )')
                aa_split = aa_seq.split(aa_tri)
                print(aa_split)
                print(amino2dna(aa_split[0]).strip() + d.strip() + amino2dna(aa_split[1]).strip())
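The search step of the script can also be packaged as a function, which makes it easier to try on a single three-amino-acid window. What follows is a minimal sketch written for Python 3, not the script itself. The names CODONS and find_sites are hypothetical, and the codon table is built from the standard genetic code instead of being copied from the amino dictionary.

```python
import itertools

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = 'TCAG'
AMINO_ACIDS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'

# CODONS (hypothetical name): amino acid letter -> list of codons encoding it.
CODONS = {}
for bases, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(''.join(bases))


def find_sites(aa_tri, enzymes):
    """Return sorted (enzyme name, DNA) pairs for every DNA sequence that
    encodes aa_tri and contains the enzyme's recognition site."""
    hits = []
    # Enumerate every combination of codons for the amino acids in aa_tri.
    for codons in itertools.product(*(CODONS[aa] for aa in aa_tri)):
        dna = ''.join(codons)
        for name, site in enzymes.items():
            if site.upper() in dna:
                hits.append((name, dna))
    return sorted(hits)


print(find_sites('KTR', {'MluI': 'acgcgt'}))
# [('MluI', 'AAAACGCGT'), ('MluI', 'AAGACGCGT')]
```

Returning the hits instead of printing them keeps the search separate from the output formatting, so the same function can be reused for any window length.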
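Output
The output consists of the following three elements. If several candidates are displayed, choose the one that suits your purpose.
Line 1: the name of the restriction enzyme whose site is contained, the amino acid sequence, and (in parentheses) the DNA sequence after conversion
Line 2: the sequences before and after the three-amino-acid sequence above, with that sequence removed (if the amino acid sequence that can carry the restriction site occurs more than once in the input, the list has more than two elements)
Line 3: the input amino acid sequence converted to DNA (everything outside the restriction enzyme recognition site is generated based on human codon usage)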
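Line 2 is produced by aa_seq.split(aa_tri), which is why a repeated three-amino-acid sequence yields more than two pieces. The following minimal sketch, assuming Python 3, shows that behavior next to an alternative that slices by window position, so each occurrence gets exactly one prefix and one suffix. The helper name flanks is hypothetical and not part of the script.

```python
def flanks(aa_seq, start, width=3):
    """Return the parts of aa_seq before and after the window that
    begins at index start (hypothetical helper)."""
    return aa_seq[:start], aa_seq[start + width:]


seq = 'HSFQKQQREKTRWDNSGRGDE'
print(seq.split('KTR'))                # ['HSFQKQQRE', 'WDNSGRGDE']
print(flanks(seq, seq.index('KTR')))   # ('HSFQKQQRE', 'WDNSGRGDE')

repeated = 'AKTRCKTRD'
print(repeated.split('KTR'))           # ['A', 'C', 'D']: KTR occurs twice
print(flanks(repeated, 5))             # ('AKTRC', 'D'): second occurrence only
```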
Example:
Input the amino acid sequence >> HSFQKQQREKTRWDNSGRGDE
MluI: KTR( AAAACGCGT )
['HSFQKQQRE', 'WDNSGRGDE']
CACAGCTTTCAGAAACAACAGAGAGAGAAAACGCGTTGGGACAACTCAGGAAGGGGTGACGAG
MluI: KTR( AAGACGCGT )
['HSFQKQQRE', 'WDNSGRGDE']
CACTCCTTTCAGAAGCAACAGAGAGAAAAGACGCGTTGGGATAACTCCGGACGAGGAGACGAA
MluI: TRW( ACGCGTTGG )
['HSFQKQQREK', 'DNSGRGDE']
CATAGCTTTCAAAAGCAGCAGAGAGAGAAAACGCGTTGGGACAATTCTGGAAGGGGCGACGA
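A candidate can be sanity-checked by translating it back to protein and confirming that the recognition site is present. The sketch below assumes Python 3 and the standard genetic code. The names CODON_TO_AA and translate are hypothetical and not part of the script. It checks the first MluI candidate from the example.

```python
import itertools

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = 'TCAG'
AMINO_ACIDS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
CODON_TO_AA = {
    ''.join(bases): aa
    for bases, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; a trailing partial codon is ignored."""
    dna = dna.upper()
    return ''.join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


candidate = 'CACAGCTTTCAGAAACAACAGAGAGAGAAAACGCGTTGGGACAACTCAGGAAGGGGTGACGAG'
print(translate(candidate) == 'HSFQKQQREKTRWDNSGRGDE')  # True
print('ACGCGT' in candidate)                            # True (MluI site)
```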
Notes
Restriction enzymes to check can be added or removed by editing the res_enz dictionary in the source code directly.
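When adding entries, keep in mind that enzymes sharing one recognition sequence collapse to a single name once the dictionary is inverted into res_inv. A minimal sketch, assuming Python 3, of grouping the names by site instead follows. The variable sites_to_names is hypothetical, the dictionary is shortened for illustration, and SmaI and EcoRI are shown only as examples of added entries.

```python
from collections import defaultdict

res_enz = {
    'BamHI': 'ggatcc',
    'MluI': 'acgcgt',
    'XmaI': 'cccggg',
    'SmaI': 'cccggg',   # added: same recognition sequence as XmaI
    'EcoRI': 'gaattc',  # added
}

# Group enzyme names by recognition sequence so that no name is lost.
sites_to_names = defaultdict(list)
for name, site in res_enz.items():
    sites_to_names[site].append(name)

for site, names in sorted(sites_to_names.items()):
    print(site, '/'.join(sorted(names)))
```

With this grouping, a hit on cccggg can be reported under both names, whereas a plain inverted dictionary would keep only one of them.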