Assembly pipeline (Overview)

Assembly of raw reads using a genome assembler.

Software used: MIRA, which is currently the only assembler that performs hybrid assemblies combining different sequencing technologies, e.g. a mix of 454 and Illumina reads.

De novo assemblies: Reads are assembled using algorithms based on sequence quality, paired-end distances and average depth of coverage. The latter helps prevent misassembly of heavily repeated regions.

Scaffolded assemblies: Used for all genomes with a close relative. Select an appropriate scaffold, usually the closest relative, ideally using phylogenetic software.

OUTPUT: draft assembly
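
The depth-of-coverage idea from the de novo step can be sketched in a few lines of Python. This is only an illustration, not MIRA's algorithm: the function names, the example contig depths and the `factor` threshold are all assumptions. A contig whose depth is far above the typical depth may be a collapsed repeat.

```python
from statistics import median


def average_depth(read_lengths, genome_size):
    """Average depth of coverage = total sequenced bases / genome length."""
    return sum(read_lengths) / genome_size


def flag_possible_repeats(contig_depths, factor=1.8):
    """Return names of contigs whose depth is well above the median depth.

    `factor` is an arbitrary illustrative threshold, not a value taken from MIRA.
    """
    typical = median(contig_depths.values())
    return sorted(
        name for name, depth in contig_depths.items() if depth > factor * typical
    )


# Hypothetical numbers: 30,000 reads of 100 bases over a 100 kb genome.
depths = {"contig_1": 31.0, "contig_2": 29.5, "contig_3": 88.0, "contig_4": 30.2}
print(average_depth([100] * 30000, 100_000))  # 30.0
print(flag_possible_repeats(depths))          # ['contig_3']
```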

Obtain raw reads from sequencer: Raw data is essential because, unlike the automatically generated sequence data, it contains sequence quality data for each base.

454 sequencing

Flowgrams (SFF files)

Illumina sequencing

Huge files (2 GB) in FASTQ format, supplied in pairs.

Sanger sequencing

AB1 files
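
As a rough illustration of the per-base quality data and the paired FASTQ files mentioned above, the following Python sketch reads two FASTQ streams in step and decodes quality strings. It assumes four-line records, Phred+33 quality encoding and read names ending in `/1` and `/2`; older Illumina data may use a different offset or naming scheme. All function names here are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def phred_scores(qual, offset=33):
    """Decode a quality string into per-base Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in qual]


def error_probability(q):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def pair_id(name):
    """Strip an assumed '/1' or '/2' suffix so both mates share one identifier."""
    base = name.split()[0]
    return base[:-2] if base.endswith(("/1", "/2")) else base


def paired_reads(handle_1, handle_2):
    """Iterate over two FASTQ streams in step, checking that the mates match."""
    for rec_1, rec_2 in zip(read_fastq(handle_1), read_fastq(handle_2)):
        if pair_id(rec_1[0]) != pair_id(rec_2[0]):
            raise ValueError(f"unpaired reads: {rec_1[0]} vs {rec_2[0]}")
        yield rec_1, rec_2


# Tiny in-memory stand-ins for the two files of a pair.
forward = io.StringIO("@read1/1\nACGT\n+\nII5!\n")
reverse = io.StringIO("@read1/2\nTTGA\n+\nIIII\n")
pairs = list(paired_reads(forward, reverse))
print(phred_scores(pairs[0][0][2]))  # [40, 40, 20, 0]
```

Because the records are read one at a time, the same approach works on multi-gigabyte files without loading them into memory.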

Post-processing of reads

It is important to see what the genome assembly looks like. Software used: Tablet

Proof-reading: All assemblers make mistakes, so all sequences are proof-read by humans. Software used: gap5

SNP calling: Single nucleotide polymorphisms (SNPs) are important for discovering markers for detection. Software used: gigaBayes, which is usually used for mammalian data but which LFZ has altered to call bacterial SNPs.

OUTPUT: assembled, proof-read genome
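
To show what calling a SNP means in practice, here is a deliberately naive Python sketch based on a majority vote over aligned reads. It is not the method implemented in gigaBayes or LFZ's modified version; the function name, the gap-free alignment format and both thresholds are assumptions made for illustration.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Report positions where most reads agree on a base that differs from the reference.

    aligned_reads: list of (start, sequence) tuples with 0-based starts and
    gap-free alignments. Both thresholds are illustrative assumptions.
    """
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        base, n = counts.most_common(1)[0]
        if base != reference[pos] and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base, depth))
    return snps


reference = "ACGTACGTAC"
reads = [(0, "ACGTA"), (2, "GTATGT"), (3, "TATGTA"), (4, "ATGTAC")]
print(call_snps(reference, reads))  # [(5, 'C', 'T', 3)]
```

Each result is (position, reference base, called base, depth). Requiring a minimum depth is one simple way to avoid calling SNPs from single sequencing errors.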

Upload assembly data to portal

LFZ has a web page under construction for viewing, managing and analysing whole genome sequence data: http://bgph.dyndns.org/

Annotation: The assembled genome is just a sequence of G, A, T and C nucleotides. Genes must now be assigned to the genome.

The genome is automatically annotated by computer, and each annotation is curated by a human to check accuracy.

Software used: myRAST (for automated annotation), in-house software (for manual curation), and Artemis for genome viewing.

OUTPUT: Annotated genome
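
As a minimal sketch of what assigning genes to a string of G, A, T and C can involve, the Python below scans the forward strand for open reading frames (an ATG start codon followed in frame by a stop codon). This is a textbook simplification, not how myRAST or the in-house curation software works; the function name and the `min_codons` cutoff are assumptions, and the reverse strand is ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs of the form ATG ... stop.

    Coordinates are 0-based, `end` is exclusive and includes the stop codon.
    `min_codons` counts codons from the ATG up to, but not including, the stop.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None
    return sorted(orfs)


genome = "CCATGAAACCCGGGTAACC"
for start, end, frame in find_orfs(genome):
    print(start, end, frame, genome[start:end])  # 2 17 2 ATGAAACCCGGGTAA
```

Candidate regions found this way would still need the human curation step described above.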
