ex801 - nibb-gitc/gitc2025mar-rnaseq GitHub Wiki
ex801 Functional annotation using similarity-based methods
Exercise 801: Functional annotation using homology search
Here we assume a case in which a well-annotated genome is available for a closely related model organism, and we annotate our data using homology search and related methods. The data are transcriptome data for the yeast Saccharomyces eubayanus (GEO accession: GSE133146). This yeast is one of the ancestral species of S. pastorianus, the yeast used to produce lager beer. S. pastorianus is thought to have arisen as a hybrid between this yeast and S. cerevisiae, and the original study aimed to trace its origin. That study targeted a Himalayan strain of S. eubayanus, which is thought to be close to the ancestor that hybridized, and also sequenced its genome. Here, however, we assume an analysis in which reads are mapped to the published genome of a different strain and annotation is based mainly on comparison with S. cerevisiae. This exercise goes only as far as annotation; the enrichment analysis based on it is done in Exercise 803.
Data
Work on the server. The directory ~/gitc/data/IU/yeast contains the following files.
file | contents | remarks |
---|---|---|
seub_genome.fa | S. eubayanus genome sequence | input data |
stringtie_merged.gtf | assembly produced by stringtie | input data |
topTags.transcript.txt | edgeR results (transcript level) | input data; used in Exercise 803 |
topTags.gene.txt | edgeR results (gene level) | input data; used in Exercise 803 |
seub_genes.pep | S. eubayanus translated gene sequences | intermediate result |
seub_genes6.pep | S. eubayanus translated gene sequences, reduced set | intermediate result (subset) |
scer_prot.fa | S. cerevisiae translated gene sequences | obtained from public data |
blastout.tab | BLAST results, S. eubayanus vs. S. cerevisiae | intermediate result |
seub_genes6.iprscan.tsv | InterProScan results for seub_genes6 | output (subset) |
seub_genes.emapper.annotations | EggNOG-mapper results for seub_genes | output; used in Exercise 803 |
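We assume that the following has already been done: RNA-seq reads were mapped to the genome sequence, transcripts were assembled in a genome-guided manner, abundance was counted per transcript, and transcripts/genes with significant expression changes were extracted with edgeR. In other words, the first four files exist as the initial state, and we now proceed with annotation. Several intermediate result files are also provided so that steps along the way can be skipped. To avoid overwriting them, we recommend creating a separate directory, copying the files there, and working in it.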
% cd ~/gitc/data/IU/
% mkdir ex801
% cp yeast/* ex801
% cd ex801
Step 1: Creating transcript sequences (to be skipped in the lecture)
stringtie creates a GTF file recording transcript coordinates but does not create a sequence file, so the transcript sequences must first be built from this GTF file and the genome sequence. This is done with the gffread command.
% gffread stringtie_merged.gtf -g seub_genome.fa -w seub_transcripts.fa
Step 2: Predicting coding regions (CDS) and creating translated sequences (to be skipped in the lecture)
Predict CDSs from the transcript sequences just created. This is done with TransDecoder in two stages: one that extracts long ORFs, and one that predicts CDSs from them.
% TransDecoder.LongOrfs -t seub_transcripts.fa
% TransDecoder.Predict -t seub_transcripts.fa --single_best_only --cpu 8
By default, when several non-overlapping CDSs are identified on one transcript, they are all output as separate CDSs. Because this makes post-processing somewhat cumbersome, we specify here the option that outputs only a single CDS in such cases.
As output, three files are created: seub_transcripts.fa.transdecoder.pep (amino acid sequences), plus files with the suffixes .cds (nucleotide sequences) and .gff3 (coordinates of the CDSs on the transcripts). The amino acid sequence file is used in the subsequent analysis, but its sequence names differ slightly from the transcript_id values in the original GTF file: .p1 is appended to the sequence name. The 1 is the index of the CDS obtained from that transcript sequence and can be 2 or higher.
>DI49_1142.p1 GENE.DI49_1142~~DI49_1142.p1 ORF type:complete len:236 (-),score=50.10 DI49_1142:234-941(-)
MLPLIASRNRRPISLTIRKLFRTMSIVKGKPEEAKIVEARHVKDTSDCKWIGLQKIIYKD
PNGNEREWDSAVRTTRNSGGVDGIGILTILKYKDGKPDEILLQKQFRPPVEGVCIEMPAG
LIDAGEDVDTAALRELKEETGYKGKIISKSPTVFNDPGFTNTNLCLVTVEVDMSLPENQK
PVTQLEDNEFIECFSVELHKFPDEMVKLDQQGYKLDARVQNVAQGILMAKQYNIQ*
Changed sequence names cause problems later, so let us restore the original transcript_id names. (This is possible because --single_best_only was specified earlier, which gives a one-to-one correspondence.) It can be done with the string replacement command (replace) of seqkit, as follows.
% seqkit replace -p '\.p[0-9]+ ' -r ' ' seub_transcripts.fa.transdecoder.pep | sed 's/\*//' > seub_genes.pep
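-p specifies a regular expression for the part to remove, which is replaced by the single space given with -r. In addition, the original sequence file has a * at the end of each amino acid sequence to mark the stop codon. Because this causes problems with some software, it is removed at the same time with the sed command. After processing, the records look like this.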
>DI49_1142 GENE.DI49_1142~~DI49_1142 ORF type:complete len:236 (-),score=50.10 DI49_1142:234-941(-)
MLPLIASRNRRPISLTIRKLFRTMSIVKGKPEEAKIVEARHVKDTSDCKWIGLQKIIYKD
PNGNEREWDSAVRTTRNSGGVDGIGILTILKYKDGKPDEILLQKQFRPPVEGVCIEMPAG
LIDAGEDVDTAALRELKEETGYKGKIISKSPTVFNDPGFTNTNLCLVTVEVDMSLPENQK
PVTQLEDNEFIECFSVELHKFPDEMVKLDQQGYKLDARVQNVAQGILMAKQYNIQ
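To illustrate what the two substitutions do, the following minimal sketch applies equivalent edits with sed alone to a made-up two-record FASTA file. The toy records and file names are assumptions for illustration only, and the sketch is not meant to replace the seqkit command above.

```shell
# Toy TransDecoder-style protein FASTA (two made-up records).
workdir=$(mktemp -d)
cat > "$workdir/toy.pep" <<'EOF'
>T1.p1 GENE.T1~~T1.p1 ORF type:complete len:5 (+),score=10.0 T1:1-15(+)
MKLV*
>T2.p2 GENE.T2~~T2.p2 ORF type:complete len:4 (-),score=8.5 T2:3-14(-)
MAG*
EOF

# Header lines: replace every ".pN" followed by a space with a single space.
# Sequence lines: remove a trailing "*" (the stop codon marker).
sed -E -e '/^>/s/\.p[0-9]+ / /g' -e '/^>/!s/\*$//' \
    "$workdir/toy.pep" > "$workdir/toy.renamed.pep"

cat "$workdir/toy.renamed.pep"
```

Restricting the `*` removal to non-header lines is a small design choice that avoids touching a header that happens to contain an asterisk.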
Step 3: Homology search with BLAST (to be skipped in the lecture)
Using the amino acid sequences seub_genes.pep created in the previous step, run an all-against-all BLAST homology search against the amino acid sequences of the S. cerevisiae genome, scer_prot.fa. (These sequence files are in the data directory mentioned above.)
First, create a BLAST search database from scer_prot.fa.
% makeblastdb -in scer_prot.fa -dbtype prot -parse_seqids -out scer
As a result, several files whose names begin with scer are created. Use them to run the BLAST search.
% blastp -query seub_genes.pep -db scer -evalue 0.001 -outfmt "6 std stitle" -max_target_seqs 10 -num_threads 8 > blastout.tab
Step 4: Homology search with DIAMOND (this is the one carried out in the lecture)
DIAMOND is somewhat less sensitive than BLAST but far faster, and is therefore widely used for large-scale searches. Its usage closely resembles BLAST, with two differences to note: each command is invoked as a subcommand of diamond, and options are given with two hyphens. The second command is long, so scroll horizontally to see all of it. In diamond the default E-value threshold is 0.001, so the --evalue option can be omitted if that value is acceptable.
Recent versions of DIAMOND can also search BLAST indexes directly, and when a BLAST index exists it appears to take precedence. For this reason, the DIAMOND database below is created under the name scer2 so that it does not clash with the BLAST database created in Step 3.
% diamond makedb --in scer_prot.fa --db scer2
% diamond blastp --query seub_genes.pep --db scer2 --max-target-seqs 10 --outfmt 6 qseqid sseqid pident evalue bitscore stitle --threads 4 --out diamondout.tab
Step 5: Extracting best-hit relationships from the BLAST/DIAMOND results
From the BLAST/DIAMOND search results created in Steps 3 and 4, extract for each query sequence only the single hit with the highest score (the best hit). The example below is for DIAMOND. It relies on the original search results already being ordered by score within each query sequence. With the stable option (-s) and the unique option (-u) of the sort command, the original order is preserved and only the first hit for each query sequence is output.
% sort -s -k 1,1 -u diamondout.tab > diamond_top.tab
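The behaviour of this command can be checked on a small made-up table. The sketch below runs the same sort invocation on toy data and compares it with an awk one-liner that keeps the first line seen for each query. The awk variant is offered as an alternative, not as the method used in this exercise, and all values in the toy table are invented.

```shell
workdir=$(mktemp -d)
# Toy DIAMOND-style table: qseqid sseqid pident evalue bitscore stitle (made-up values),
# already ordered by score within each query.
printf 'q1\tsA\t90.0\t1e-50\t200\tprotein A\n'  > "$workdir/toy_hits.tab"
printf 'q1\tsB\t60.0\t1e-20\t100\tprotein B\n' >> "$workdir/toy_hits.tab"
printf 'q2\tsB\t85.0\t1e-40\t180\tprotein B\n' >> "$workdir/toy_hits.tab"
printf 'q2\tsA\t50.0\t1e-10\t60\tprotein A\n'  >> "$workdir/toy_hits.tab"

# Approach from the text: stable sort on column 1, one line per key.
sort -s -k 1,1 -u "$workdir/toy_hits.tab" > "$workdir/top_sort.tab"

# Alternative: keep the first line seen for each query, without sorting.
awk -F '\t' '!seen[$1]++' "$workdir/toy_hits.tab" > "$workdir/top_awk.tab"

# Show query, subject and bit score of the retained hits.
cut -f 1,2,5 "$workdir/top_awk.tab"
```

Both approaches depend on the best hit appearing first for each query. If the input order is in doubt, sort explicitly by score before taking the first line.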
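In ortholog identification based on pairwise genome comparison, accuracy can be improved by checking for bidirectional best hits rather than best hits in one direction only. To do this, first extract the best hits in the reverse direction. For each database sequence (S. cerevisiae), sort by E-value (column 4) and then by score (column 5), and extract unique sequences in the same way as in the previous command.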
% sort -k 2,2 -k 4,4g -k 5,5nr diamondout.tab | sort -s -k 2,2 -u > diamond_top_rev.tab
Then combine and sort the best hits from both directions and output the duplicated lines (uniq -d). This extracts the bidirectional best hits.
% sort diamond_top.tab diamond_top_rev.tab | uniq -d > diamond_bbh.tab
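As a worked example, the sketch below runs the three commands above on a made-up five-line table in which q1 and q3 both hit sA best, but sA hits q1 best. The toy values and file names are assumptions for illustration.

```shell
workdir=$(mktemp -d)
# Toy table: qseqid sseqid pident evalue bitscore stitle (made-up values).
printf 'q1\tsA\t90.0\t1e-50\t200\tprotein A\n'  > "$workdir/toy.tab"
printf 'q1\tsB\t60.0\t1e-20\t100\tprotein B\n' >> "$workdir/toy.tab"
printf 'q2\tsB\t85.0\t1e-40\t180\tprotein B\n' >> "$workdir/toy.tab"
printf 'q2\tsA\t50.0\t1e-10\t60\tprotein A\n'  >> "$workdir/toy.tab"
printf 'q3\tsA\t80.0\t1e-30\t150\tprotein A\n' >> "$workdir/toy.tab"

# Forward best hits: first line for each query (column 1).
sort -s -k 1,1 -u "$workdir/toy.tab" > "$workdir/top.tab"

# Reverse best hits: order by subject, then E-value (ascending), then score
# (descending), and keep the first line for each subject (column 2).
sort -k 2,2 -k 4,4g -k 5,5nr "$workdir/toy.tab" \
    | sort -s -k 2,2 -u > "$workdir/top_rev.tab"

# Lines present in both lists are the bidirectional best hits.
sort "$workdir/top.tab" "$workdir/top_rev.tab" | uniq -d > "$workdir/bbh.tab"

cut -f 1,2 "$workdir/bbh.tab"
```

In this toy case q3 has a forward best hit to sA but is not a bidirectional best hit, because sA's best hit is q1, so q3 does not appear in the result.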
(Additional exercise) Using the BLAST search results blastout.tab created in Step 3, extract the best hits and the bidirectional best hits in the same way. The output columns differ from those of the DIAMOND search, so note that the column positions are shifted.
Step 6: Motif/domain search with InterProScan (optional; to be skipped in the lecture)
Motif/domain searches are often run alongside homology searches as a way to obtain more detailed clues for function prediction. InterProScan is widely used as a tool that searches a wide range of databases at once with a single command. Because the search takes time, use the reduced set seub_genes6.pep, which contains only a small fraction of the query sequences, if you want to try it.
% interproscan.sh -i seub_genes6.pep -b seub_genes6 -goterms -pa --cpu 4
The results are written to several files whose names begin with seub_genes6. Of these, seub_genes6.tsv, which lists the annotations in tab-delimited form, is provided in the data directory mentioned above as seub_genes6.iprscan.tsv.
Step 7: Ortholog search with EggNOG mapper
EggNOG mapper identifies orthologs of the query sequences using precomputed orthologous groups and phylogenetic tree information, and assigns annotations on that basis. It uses DIAMOND as its search engine, but the search still takes time because the database is large. If you want to try it, use the reduced set seub_genes6.pep.
% emapper.py -i seub_genes6.pep -o seub_genes6 -m diamond --cpu 6
The results of running it on the full sequence set are provided as seub_genes.emapper.annotations. Use this file to extract the text descriptions and the GO lists.
% cut -f1,8 seub_genes.emapper.annotations | grep -v '^#' > seub_genes.emapper.tit
% cut -f1,13 seub_genes.emapper.annotations | grep -v '^#' > seub_genes.emapper.go
(Note: the column layout of the output differs between versions, so check carefully which column to extract. In the latest version, 2.1.12, the GO column is the 10th.)
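Because the column position depends on the version, one option is to look the column up by name in the header line instead of hard-coding a number. The sketch below assumes the header is a line starting with `#` whose tab-separated fields include a column named `GOs`. The toy file and its six-column layout are made up and do not reproduce the real eggNOG-mapper output, and `find_col` is a hypothetical helper name.

```shell
workdir=$(mktemp -d)
f="$workdir/toy.emapper.annotations"
# Toy annotation file with a made-up, shortened column layout.
printf '## toy annotation file (made up)\n' > "$f"
printf '#query\tseed_ortholog\tevalue\tscore\tDescription\tGOs\n' >> "$f"
printf 'g1\t4932.YAL001C\t1e-50\t200\tsome protein\tGO:0003674,GO:0005575\n' >> "$f"

# find_col NAME FILE: print the 1-based index of the header column called NAME.
find_col() {
    awk -F '\t' -v name="$1" '
        /^#/ {
            sub(/^#/, "", $1)    # drop the leading "#" from the first header field
            for (i = 1; i <= NF; i++) if ($i == name) { print i; exit }
        }
    ' "$2"
}

go_col=$(find_col GOs "$f")
# Extract the query name and the GO column, skipping comment lines.
grep -v '^#' "$f" | cut -f "1,$go_col"
```

If `find_col` prints nothing, the header name differs in that version, and the header line should be inspected by eye.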
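(Advanced exercise) Running large-scale searches quickly on a cluster computer
(The following is a hands-on procedure that uses the system of the Research Center for Computational Science (RCCS) at the Institute for Molecular Science. It is included for reference and is not carried out in this course.)
Search speed can be increased by splitting the queries and running them in parallel. If you would like to run InterProScan or EggNOG mapper on the full sequence set, split the sequences and run them on the RCCS cluster system. A command for splitting sequences, split_seq.pl, is provided, so run it. The -BLOCK_SIZE option specifies the total sequence length that serves as the unit for splitting.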
% perl ~/gitc/data/IU/bin/split_seq.pl -BLOCK_SIZE=100000 seub_genes.pep
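To make the idea of splitting by total sequence length concrete, here is a minimal awk sketch that starts a new chunk file once the current chunk has reached a given total length. This is not the implementation of split_seq.pl. The chunking rule, the toy sequences, and the output naming (`toy.1`, `toy.2`, ...) are assumptions chosen for illustration.

```shell
workdir=$(mktemp -d)
# Toy protein FASTA with four made-up records.
cat > "$workdir/toy.pep" <<'EOF'
>a
MKLVAAAA
>b
MAGT
>c
MKKKKKKKKK
>d
MA
EOF
mkdir "$workdir/query_toy"

awk -v block=10 -v prefix="$workdir/query_toy/toy." '
    /^>/ {
        # Open the first chunk, or a new one once the current chunk is full.
        if (n == 0 || total >= block) {
            if (n > 0) close(out)
            n++; total = 0; out = prefix n
        }
        print > out
        next
    }
    { total += length($0); print > out }   # accumulate residues in this chunk
' "$workdir/toy.pep"

ls "$workdir/query_toy"
```

Splitting only at record boundaries keeps every sequence intact, so chunk sizes are approximate rather than exact.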
The split sequence files are stored in a directory named query_seub_genes (in this case, the sequences are split into about 30 files). In addition, a file named qsub_blast.sh is created for running these sequences as queries in parallel on the cluster through the PBS job management system. Among the commands at the end of this file, comment out blastp (add # at the start of the line) and uncomment emapper (delete the leading #). (The other commands can be run in the same way by uncommenting them. For blast and diamond, an appropriate search database name, here scer, must be assigned to the DB variable set above them.)
In addition, RCCS rules require that the compute resources to be used are specified with prescribed options when running a job. These are given as -l options on the lines beginning with #PBS at the top of the file. Here we request 10 cores per job (ncpus, ompthreads) and a maximum run time of 1 hour (walltime). A module must also be loaded in order to use eggNOG-mapper.
To reflect these points, modify qsub_blast.sh as follows. (The lines marked with #<<<< are the ones modified. If you copy and paste, delete the comments at the end of the #PBS lines.)
#!/bin/sh
#PBS -J 1-30
#PBS -N blastjob
#PBS -S /bin/sh
#PBS -V
#PBS -l select=1:ncpus=10:mpiprocs=1:ompthreads=10 #<<<< edit (resource request: number of CPU cores and memory)
#PBS -l walltime=1:00:00 #<<<< edit (resource request: maximum run time)
source /apl/bio/etc/bio.sh #<<<< add (environment setup for using biology-related applications)
module load eggNOG-mapper #<<<< add (load the module)
cd $PBS_O_WORKDIR
DB=scer #<<<< change this when searching a database
SEQDIR=query_seub_genes
RESULT_OUT_DIR=output_seub_genes
QUERY_OUT_DIR=query_seub_genes
INFILE=$QUERY_OUT_DIR/seub_genes.$PBS_ARRAY_INDEX
OUTFILE=seub_genes.$PBS_ARRAY_INDEX
if [ ! -d $RESULT_OUT_DIR ]; then
mkdir $RESULT_OUT_DIR
fi
#blastp -db $DB -query $INFILE -out $RESULT_OUT_DIR/blastp_$OUTFILE -num_threads $NCPUS -outfmt 6 -evalue 0.001 #<<<< commented out
#diamond blastp --db $DB --query $INFILE --out $RESULT_OUT_DIR/diamond_$OUTFILE --threads $NCPUS
emapper.py -m diamond --no_annot --no_file_comments -i $INFILE -o $RESULT_OUT_DIR/emapper_$OUTFILE --cpu $NCPUS #<<<< uncommented
#interproscan.sh -goterms -pa -i $INFILE -b $RESULT_OUT_DIR/iprscan_$OUTFILE -cpu $NCPUS
以äžã®ã³ãã³ãã§å®è¡ããã
% jsub qsub_blast.sh
The execution status can be checked with jobinfo. Once no job information is displayed, the run has finished. If it finishes far too quickly, it has very likely failed. The standard error output of the executed commands is recorded in files named blastjob.e#### created in the current directory, so consult them to resolve the problem.
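Before merging results, it can help to confirm that every array task produced an output file. The sketch below lists the task indices whose expected file is missing or empty. `missing_outputs` is a hypothetical helper, and the file name pattern in the demonstration is an assumption modelled on the -o prefix used in the job script, so check it against the actual files.

```shell
# missing_outputs DIR PREFIX SUFFIX N:
# print each index 1..N for which DIR/PREFIX<index>SUFFIX is missing or empty.
missing_outputs() {
    dir=$1; prefix=$2; suffix=$3; n=$4
    i=1
    while [ "$i" -le "$n" ]; do
        [ -s "$dir/$prefix$i$suffix" ] || echo "$i"
        i=$((i + 1))
    done
}

# Toy demonstration with three expected outputs (made-up files).
workdir=$(mktemp -d)
echo hit > "$workdir/emapper_toy.1.emapper.seed_orthologs"   # present and non-empty
: > "$workdir/emapper_toy.2.emapper.seed_orthologs"          # present but empty
# index 3 is deliberately absent
missing_outputs "$workdir" emapper_toy. .emapper.seed_orthologs 3
```

An empty file is not necessarily an error, since a chunk may simply have no hits, but it is a useful prompt to look at the corresponding blastjob.e#### file.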
When the run finishes successfully, the search results for each split query file are stored under the output_seub_genes directory. Concatenate the files under it with the cat command into a single file in the current directory. For EggNOG mapper, only the seed ortholog search has been completed at this stage (because the --no_annot option was specified), so run emapper again on these results to assign the annotations.
% cat output_seub_genes/*.seed_orthologs > input.emapper.seed_orthologs
% emapper.py --annotate_hits_table input.emapper.seed_orthologs --no_file_comments -o seub_genes --cpu 10 -m no_search --override
The results are written to a file named seub_genes.emapper.annotations.