by John Nash
Software used: MIRA is currently the only assembler
that performs hybrid assemblies using different
sequencing technologies, e.g. a mix of 454 and Illumina.
De novo assemblies: Reads are assembled using algorithms based
upon sequence quality, paired end distances
and average depth of coverage: the latter
prevents misassembly of heavily repeated
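As a rough illustration of the depth-of-coverage idea, the sketch below flags contigs whose coverage is well above the assembly-wide average, which hints at collapsed repeats. This is not MIRA's implementation; the function name, the input layout and the 1.5x threshold are assumptions made for the example.

```python
def flag_repeat_candidates(contigs, factor=1.5):
    """Flag contigs with unusually high read coverage.

    contigs: dict mapping contig name -> (length_bp, read_bases_placed).
    factor:  hypothetical threshold; coverage above factor * average
             is treated as a possible collapsed repeat.
    Returns (average_coverage, sorted list of flagged contig names).
    """
    total_length = sum(length for length, _ in contigs.values())
    total_bases = sum(bases for _, bases in contigs.values())
    average = total_bases / total_length

    flagged = [
        name
        for name, (length, bases) in contigs.items()
        if bases / length > factor * average
    ]
    return average, sorted(flagged)


# Made-up numbers, for illustration only.
example_contigs = {
    "contig1": (1000, 20000),  # 20x
    "contig2": (1000, 20000),  # 20x
    "contig3": (500, 30000),   # 60x, likely a collapsed repeat
}
average_coverage, repeat_candidates = flag_repeat_candidates(example_contigs)
```

A contig flagged this way is only a candidate: it would still need to be checked during proofreading.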
Scaffolded assemblies: Used for all genomes with a close relative.
Select an appropriate scaffold, usually the
closest relative, ideally using phylogenetic
OUTPUT: draft assembly
Flowgrams (SFF files)
Huge files (2 GB) in FASTQ format, in pairs.
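Because the paired FASTQ files are large, they are best read as a stream rather than loaded into memory. The sketch below shows one way to walk two files in step and check that the mates line up. It assumes plain four-line FASTQ records and mate names that differ only by a trailing /1 or /2 or by a comment after a space; the function names are hypothetical.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (an empty line is treated the same way)
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], sequence, quality


def mate_id(name):
    """Strip the comment and a trailing /1 or /2 so both mates share one id."""
    token = name.split(" ")[0]
    if token.endswith(("/1", "/2")):
        token = token[:-2]
    return token


def read_pairs(handle1, handle2):
    """Yield (read1, read2) tuples, one pair at a time."""
    for r1, r2 in zip_longest(read_fastq(handle1), read_fastq(handle2)):
        if r1 is None or r2 is None:
            raise ValueError("files contain different numbers of reads")
        if mate_id(r1[0]) != mate_id(r2[0]):
            raise ValueError(f"mates out of order: {r1[0]!r} vs {r2[0]!r}")
        yield r1, r2


# Tiny in-memory stand-ins for the two files of a pair.
forward = io.StringIO("@read1/1\nACGT\n+\nIIII\n")
reverse = io.StringIO("@read1/2\nTTGA\n+\nIIII\n")
pairs = list(read_pairs(forward, reverse))
```

With real data the two `io.StringIO` objects would be replaced by file handles opened on the pair of FASTQ files.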
Proofreading: All assemblers make mistakes.
All sequences are proofread by humans.
Software used: gap5
For discovery of markers for detection, single
nucleotide polymorphisms (SNPs) are important.
Software used: gigaBayes - usually used for
mammalian data, but LFZ has altered it for
calling bacterial SNPs.
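To make the idea of SNP calling concrete, the sketch below compares the bases observed at each position against a reference and reports positions where a different base clearly dominates. It is a simple majority-vote stand-in, not the Bayesian method used by gigaBayes or the LFZ modification of it; the function name, the pileup layout and both thresholds are assumptions. A high agreement threshold is a plausible choice for a haploid bacterial genome.

```python
from collections import Counter


def call_snps(reference, pileup, min_depth=10, min_fraction=0.9):
    """Return a list of (position, reference_base, called_base) tuples.

    reference:    reference sequence as a string.
    pileup:       dict mapping 0-based position -> string of read bases
                  observed at that position.
    min_depth:    hypothetical minimum number of reads needed to call.
    min_fraction: hypothetical minimum share of reads that must agree.
    Positions in the result are 1-based.
    """
    snps = []
    for position, ref_base in enumerate(reference):
        bases = pileup.get(position, "")
        if len(bases) < min_depth:
            continue  # too few reads to make a call
        base, count = Counter(bases).most_common(1)[0]
        if base != ref_base and count / len(bases) >= min_fraction:
            snps.append((position + 1, ref_base, base))
    return snps


# Made-up data, for illustration only.
example_reference = "ACGT"
example_pileup = {
    0: "A" * 12,            # matches the reference
    1: "T" * 11 + "C",      # 11 of 12 reads disagree with the reference
    2: "G" * 3,             # depth too low to call
    3: "A" * 5 + "T" * 7,   # majority matches the reference
}
example_snps = call_snps(example_reference, example_pileup)
```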
OUTPUT: assembled, proof-read genome
LFZ has a Web page under construction to
be used to view, manage and perform
analyses on whole genome sequence data:
Annotation: The assembled genome is a
set of G, A, T and C nucleotides. Genes
must now be assigned to the genome.
The genome is automatically annotated
by computer, and each annotation is
curated by a human to check
Software used: myRAST (for
automated annotation), in-house
software (for manual curation),
and Artemis for genome viewing.
OUTPUT: Annotated genome