6 Metabolic pathway
1. BlastKOALA
BlastKOALA is an automatic annotation server for genome and metagenome sequences. It performs KO (KEGG Orthology) assignments to characterize individual gene functions and reconstructs KEGG pathways, BRITE hierarchies and KEGG modules to infer high-level functions of the organism or the ecosystem.
BlastKOALA takes as input amino acid sequences in FASTA format. We therefore use prodigal to predict genes and translate the nucleic acid sequences into the corresponding peptide sequences. By default, prodigal takes as input FASTA files with the .fna extension.
- Use perl to add the name of the sample to the beginning of every file and change the file extension to .fna.
- Replace the space between the name of your bins and the scaffold with underscores.
- Run prodigal.
- Remove all the characters after the first space in the header
- Transfer the files to your local computer and follow the instructions on the BlastKOALA website for submission.
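The preparation steps above can be sketched in shell as follows. This is a minimal sketch under stated assumptions, not the lab's actual script: the source uses perl for the renaming step, whereas this sketch uses plain shell and `sed`, and it reads "add the name of the sample to the beginning of every file" as a file-name prefix. The function names (`underscore_headers`, `trim_headers`), the sample name `sampleA` and the toy sequence are hypothetical. The `-p meta` mode of prodigal is an assumption chosen so that the short toy input is accepted; the source does not say which mode is used, and the prodigal call is skipped when the program is not installed.

```shell
#!/usr/bin/env bash
# Sketch of the BlastKOALA input preparation (hypothetical names and toy data).

# Replace spaces with underscores, in FASTA header lines only.
underscore_headers() {
    sed '/^>/ s/ /_/g'
}

# Keep only the text before the first space, in FASTA header lines only.
trim_headers() {
    sed '/^>/ s/ .*//'
}

sample="sampleA"                 # hypothetical sample name
workdir="$(mktemp -d)"           # scratch directory for the demo

# Toy bin with a space between bin name and scaffold (made-up sequence).
printf '>bin1 scaffold_1\nATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGA\n' \
    > "$workdir/bin1.fa"

# Steps 1 and 2: add the sample name, switch to .fna, fix the headers.
for f in "$workdir"/*.fa; do
    out="$workdir/${sample}_$(basename "$f" .fa).fna"
    underscore_headers < "$f" > "$out"
    rm "$f"
done

# Step 3: predict and translate genes, only if prodigal is available.
if command -v prodigal >/dev/null 2>&1; then
    for f in "$workdir"/*.fna; do
        prodigal -i "$f" -a "${f%.fna}.raw.faa" -o "${f%.fna}.gbk" -p meta -q \
            || echo "prodigal failed on $f" >&2
    done
fi

# Step 4: shorten the protein headers written by prodigal.
for f in "$workdir"/*.raw.faa; do
    [ -e "$f" ] || continue      # nothing to do if prodigal did not run
    trim_headers < "$f" > "${f%.raw.faa}.faa"
    rm "$f"
done

ls "$workdir"
```

Both helper functions read from standard input and write to standard output, so they can also be tried on a single file before looping over a whole directory, for example `trim_headers < bin.faa | head`.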
2. kofamscan
- Move or copy the generated amino acid files (.faa) to the user genomics2 and log in to titan under this user.
- Activate the environment.
- Execute the annotation using a for loop and nohup.
for i in *.faa; do exec_annotation -p /home/SCRIPT/kofamscan/profiles -k /home/SCRIPT/kofamscan/ko_list \
--cpu 40 -f detail-tsv --tmp-dir kofamscan-tmp -o ~/$i $i ; done
The resulting files are tab-separated (detail-tsv format) and can easily be parsed using R.
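Before moving to R, a quick first look at the annotation is also possible from the shell. The sketch below assumes the detail-tsv layout in which comment lines start with `#`, the first column holds `*` for hits scoring above the KO threshold, and the remaining columns are gene name, KO, threshold, score, E-value and KO definition. The function name `significant_kos` and the example rows are made up for illustration and are not real results.

```shell
#!/usr/bin/env bash
# Sketch: list gene/KO pairs for significant hits in a kofamscan detail-tsv file.
# Assumed columns: mark, gene name, KO, threshold, score, E-value, KO definition.

significant_kos() {
    # Skip comment lines, keep rows marked "*", print gene name and KO.
    awk -F'\t' '!/^#/ && $1 == "*" { print $2 "\t" $3 }' "$@"
}

# Made-up example rows (not real results), tab-separated.
demo="$(mktemp)"
printf '#\tgene name\tKO\tthrshld\tscore\tE-value\tKO definition\n' > "$demo"
printf '*\tbin1_scaffold_1_1\tK00001\t100.00\t250.5\t1e-70\texample definition A\n' >> "$demo"
printf '\tbin1_scaffold_1_1\tK00002\t300.00\t50.1\t1e-10\texample definition B\n' >> "$demo"
printf '*\tbin1_scaffold_1_2\tK00003\t80.00\t120.0\t1e-30\texample definition C\n' >> "$demo"

significant_kos "$demo"
```

The function accepts one or more file names, so it can be pointed at all output files of the loop at once; the two-column result is easy to count or to load into R afterwards.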
3. Metabolic
User genomics2
conda activate METABOLIC_v4.0
# Run this command using nohup with a script saved in the metabolic directory
perl METABOLIC-G.pl -in-gn /home/genomics2/karine/ARC_NANO_FAA/ -o /home/genomics2/karine/arc_metabolic_out
4. Other programs we aren’t using anymore
- DRAM
DRAM (Distilled and Refined Annotation of Metabolism) is a tool for annotating metagenome-assembled genomes (MAGs) and VirSorter-identified viral contigs. DRAM annotates MAGs and viral contigs using KEGG (if provided by the user), UniRef90, PFAM, dbCAN, RefSeq viral, VOGDB and the MEROPS peptidase database, as well as custom user databases.
- Activate environment
- Execute DRAM. Create a bash script called run_dram.sh with the following commands and execute the script using nohup.
Note: DRAM requires the full path (/home/genomics/…) and won’t accept a path starting from the home directory (~).
DRAM.py annotate -i '/home/genomics/yourname/samples/*.fa' -o /home/genomics/yourname/samples/annotation --threads 40
- Once annotation is done, the following command will summarize all the results from the annotation folder.
DRAM.py distill -i annotation/annotations.tsv -o summaries --trna_path annotation/trnas.tsv --rrna_path annotation/rrnas.tsv
- Metabolic
METABOLIC (METabolic And BiogeOchemistry anaLyses In miCrobes) enables the prediction of metabolic and biogeochemical functional trait profiles for any given genome dataset.
- Activate environment
- METABOLIC takes as input amino acid sequences in FASTA format. Create a bash script called run_metabolic.sh with the following commands and execute the script using nohup. Before running, update the paths to reflect where your amino acid FASTA files are located and where you want METABOLIC to store the output files.
#!/bin/bash
perl /home/genomics/METABOLIC/METABOLIC-G.pl -in /home/genomics/user/07_metabolic_pathway/ -o /home/genomics/user/07_metabolic_pathway/metabolic_outut/
- MEBS
[MEBS](https://github.com/valdeanda/mebs) (Multigenomic Entropy-Based Score) allows the user to synthesize genomic information into a single informative value. This entropy score can be used to infer the likelihood that microbial taxa perform specific metabolic-biogeochemical pathways.
For the moment, MEBS should be installed on the user’s local computer. See the manual for installation and instructions.