
Genome sequencing


somatic sequencing

mutated vs normal genome sequencing

idea, take tumor and normal samples, sequence both, have two genomes, look for tumor-specific variation, identify SNVs

"first" publication, 2008, cancer cells, Venn diagram with Venter/Watson genomes, SNV filtering, all SNVs, minus SNVs in normal tissue, keep novel (not in dbSNP), in a gene, synonymous/non-synonymous
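The filtering idea in this branch (all tumor SNVs, minus those also seen in the normal tissue, minus those already catalogued) can be sketched with plain set operations. This is only an illustration: the `SNV` record, the function name and the toy variants are hypothetical and do not come from the 2008 study or from any real variant caller.

```python
from typing import Iterable, List, NamedTuple


class SNV(NamedTuple):
    """Hypothetical minimal variant record used only for this sketch."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(tumor: Iterable[SNV],
                       normal: Iterable[SNV],
                       known: Iterable[SNV]) -> List[SNV]:
    """Keep tumor SNVs that are absent from the matched normal sample
    and absent from a set of already-known variants (e.g. a dbSNP dump)."""
    tumor_only = set(tumor) - set(normal)   # remove germline variants
    novel = tumor_only - set(known)         # remove catalogued variants
    return sorted(novel)


# Toy data, invented for the example.
tumor = [SNV("chr1", 100, "A", "G"), SNV("chr1", 250, "C", "T"), SNV("chr2", 40, "G", "A")]
normal = [SNV("chr1", 100, "A", "G")]
known = [SNV("chr2", 40, "G", "A")]

candidates = somatic_candidates(tumor, normal, known)
print(candidates)
```

The later filters in the branch (in a gene, synonymous vs non-synonymous) would need a gene annotation and are left out here.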

"counting methods"

ChIP-seq, finds the parts of the genome a protein binds to, counting experiment, peaks mark where the antibody-labelled protein binds

RNA-seq, quantification of gene expression by sequencing the RNA of a particular cell
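Both methods in this branch reduce to counting mapped reads per region. For the ChIP-seq case, a very rough sketch is to bin read start positions and report bins with unusually many reads. The bin size, the read threshold and the function name `call_peaks` are assumptions made for illustration; real peak callers also model the background signal.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def call_peaks(read_starts: Iterable[int],
               bin_size: int = 100,
               min_reads: int = 3) -> List[Tuple[int, int, int]]:
    """Count reads per fixed-size bin and return (start, end, count)
    for every bin holding at least min_reads reads."""
    counts = Counter(pos // bin_size for pos in read_starts)
    return sorted(
        (b * bin_size, (b + 1) * bin_size, n)
        for b, n in counts.items()
        if n >= min_reads
    )


# Toy mapped read start positions on one chromosome.
reads = [105, 120, 130, 199, 450, 1001, 1010, 1090]
peaks = call_peaks(reads)
print(peaks)
```

The same per-region counting, applied to gene coordinates instead of fixed bins, is the starting point for RNA-seq expression estimates.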

exome sequencing

why?, 1/6 of costs, 1/15 of data, of whole genome sequencing

de novo assembly

Genomes sequenced without a reference

Close previous knowledge gaps


goal, study, genetic variation

how, analysing many genomes from same/similar species


progress as

automating/refining dideoxy sequencing, popular machine, AB3770x



1., Sanger/dideoxy, examples, rice, Arabidopsis, human, read length, 650-1000 bp, idea, DNA cut with restriction enzymes, chain termination

2., Next-gen, idea, DNA sheared, add adapters to the ends of the DNA, no cloning step, perform emulsion PCR, deposit DNA beads into PicoTiterPlate wells, add one base at a time, or add all bases with color tags, sequencing by synthesis, true single-molecule sequencing, no amplification, single-molecule real-time sequencing, Ion Torrent, computer-based, idea, add T, where T binds an H+ ion is released


in detail

Chain termination methods, Dye terminator method

Pyrosequencing, incorporation of bases, light emission
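Because the light emitted in each nucleotide flow scales with the number of bases incorporated, a read can be reconstructed by rounding each signal to a whole number of bases. The sketch below assumes an idealised, already-normalised signal and a hypothetical flow order `TACG`; it is not the vendor's base caller.

```python
from typing import Sequence


def decode_flowgram(flow_order: str, intensities: Sequence[float]) -> str:
    """Turn per-flow light intensities into a base sequence.

    Each flow adds one nucleotide type; an intensity near k means that
    k identical bases were incorporated in that flow (0 means none).
    """
    bases = []
    for i, signal in enumerate(intensities):
        base = flow_order[i % len(flow_order)]   # flows cycle through the order
        bases.append(base * round(signal))       # homopolymer length = rounded signal
    return "".join(bases)


read = decode_flowgram("TACG", [1.02, 0.05, 0.97, 2.1, 0.0, 1.1, 0.1, 0.9])
print(read)
```

Rounding is also where long homopolymer runs become hard to call: a signal of 10.4 and one of 10.6 decode to 10 and 11 bases respectively, although the two measurements are nearly identical.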

Library/template preparation methods, taking DNA, shearing into smaller chunks (hundreds of bases), add adapters, select for adapters, attach to surface, creating large-insert paired-end libraries, goal, bring two pieces of DNA close together for sequencing, how, EcoP15I adapter at both ends, plus biotin, indexing libraries

High-throughput sequencing, companies

Roche 454, pyrosequencing machine, article 2005, based on pyrosequencing, works as, create library as in library preparation, anneal ssDNA to beads that carry complementary DNA, add reagents for PCR, perform emulsion "PCR", break the emulsion, deposit each bead into a PicoTiterPlate well, add A, C, G, T in turn, measure light intensity, problematic for homopolymers (have I seen 10 or 11 G's?), goal, one bead with one DNA molecule in one water droplet, final, multiple copies of the DNA attached to various sites on the bead, hundreds of thousands of reads per run, as compared to 96-well plates before

Roche 454, output, applications, de novo sequencing (longer read lengths), repeats, metagenomics, variation detection, gene expression

Roche 454, summary, runtime, 8 hrs, read length, 300-400 bp, most mature new-generation technique

Roche 454, works as (video), fragmentation of DNA, ligation of adaptors, emulsion of DNA for clonal amplification, denaturation, deposition of DNA beads followed by packing beads, addition of free nucleotides and light emission on release of the pyrophosphate group, sequencing reads are collected and analyzed in high-throughput fashion

Illumina (Solexa), Genome Analyzer, works as, workflow, sequencing by synthesis, add sequencing reagents, incorporate first base (modified to extend by one base only, color-labelled), remove other reagents, detect signal, base calling, start over again

Illumina (Solexa), advantages, a single nucleotide at a time but millions of fragments simultaneously, homopolymers

Illumina (Solexa), limitations, speed of imaging, short reads, 36 bp (long enough for mapping to a genome), up to 100 bp

Illumina (Solexa), HiSeq 2000, same chemistry, improved imaging, two flow cells at the same time, enough to sequence one whole genome if a reference genome is given

Life Technologies (Applied Biosystems), SOLiD, has a mini Linux cluster beneath it, workflow, uses two-base encoding, several nucleotide pairs map to the same color, use map to decode, decoding depends on correct detection of the first base, ligation-based extension, overlaps at known positions, two bases, skip three bases, two bases, next round, start at n-1, overlapping two bases, align reads, difference between SNPs and errors

Ion Torrent, semiconductor sequencing chip (like a digital camera), semiconductor that can read DNA, Ion OneTouch

Polonator, open source, buy hardware, adjust chemistry, small-scale, short reads

Helicos, HeliScope, single-molecule sequencing

Pacific Biosciences, SMRT technology, advantages, polymerase as sequencing engine, high speed, long read length, two keys, phospho-linked nucleotides (phosphate-labeled rather than base-labeled, polymerase cleaves away the label, leaving a natural DNA strand behind), zero-mode waveguide (visualization chamber, detects cleaved nucleotides)

Intelligent Bio Systems

High-throughput sequencing, techniques

Polony sequencing

Roche 454 pyrosequencing, length, up to ~1000 bp, error rate, 10-5

Illumina (Solexa) sequencing, relies on the generation of a single-stranded DNA library by random fragmentation of a DNA sample, shipped software, GAPipeline, Bustard (base calling), native approach, applies correction for cross-talk, phasing, pre-phasing, alignment to reference

SOLiD sequencing

Ion semiconductor sequencing

DNA nanoball sequencing

Lynx Therapeutics' massively parallel signature sequencing (MPSS)

High-throughput sequencing, rely on, interplay of chemistry, hardware, software (e.g., base calling)

High-throughput sequencing, problem, different platforms, different algorithms

High-throughput sequencing, accuracy, depends on, coverage, base calling, optical sensors

High-throughput sequencing, comparison
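The SOLiD two-base encoding noted above (several nucleotide pairs share one color, and decoding depends on a correct first base) can be illustrated with a small sketch. It assumes the commonly described scheme in which bases are numbered A=0, C=1, G=2, T=3 and the color of a dinucleotide is the XOR of the two numbers; the function names are made up for the example.

```python
from typing import List

BASES = "ACGT"  # index of each base doubles as its 2-bit code


def encode(seq: str) -> List[int]:
    """Color-space encoding: one color (0-3) per overlapping base pair."""
    return [BASES.index(a) ^ BASES.index(b) for a, b in zip(seq, seq[1:])]


def decode(first_base: str, colors: List[int]) -> str:
    """Recover the bases from the known first base and the color calls."""
    seq = [first_base]
    for color in colors:
        seq.append(BASES[BASES.index(seq[-1]) ^ color])
    return "".join(seq)


reference = "ATGGC"
ref_colors = encode(reference)                  # [3, 1, 0, 3]

# A true SNP (G -> C at index 2) changes two adjacent colors ...
snp_colors = encode("ATCGC")                    # [3, 2, 3, 3]
changed = [i for i, (a, b) in enumerate(zip(ref_colors, snp_colors)) if a != b]

# ... whereas a single miscalled color corrupts every base after it.
error_read = decode("A", [3, 2, 0, 3])
print(ref_colors, snp_colors, changed, error_read)
```

This is why aligning in color space helps separate SNPs from measurement errors: an isolated color mismatch is more likely an error, while two adjacent mismatches consistent with a single base change suggest a real variant.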



next-gen sequencing overview

How a genome is sequenced


1000 human genomes

human hapmap



whole-genome sequencing

30x coverage

mapping to reference genome

not assembled (as the first human genome was)
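The 30x figure above is an average depth: total sequenced bases divided by genome size. A quick sketch of the arithmetic follows; the 3 Gb genome size is a rounded assumption, and the 100 bp read length is taken from the Illumina notes earlier in this map.

```python
import math


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def reads_needed(coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target average depth."""
    return math.ceil(coverage * genome_size / read_length)


GENOME_SIZE = 3_000_000_000   # assumed, rounded human genome size in bp
READ_LENGTH = 100             # bp

n = reads_needed(30, READ_LENGTH, GENOME_SIZE)
print(n, mean_coverage(n, READ_LENGTH, GENOME_SIZE))
```

Under these assumptions about 900 million reads are needed. Actual depth varies along the genome, so some regions will fall well below the average.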


Paired-end reads: Reads that are sequenced from both ends of the same DNA fragment. These can be produced by a variety of sequencing protocols, and paired-end preparation is specific to a given sequencing technology. Some recent sequencing vendors use the terms ‘paired end’ and ‘mate pair’ to refer to different protocols, but these terms are generally synonymous.

aka, mate pairs


Multi-read: A DNA sequence fragment (a ‘read’) that aligns to multiple positions in the reference genome and, consequently, creates ambiguity as to which location was the true source of the read

20-30% with human genome

Copy number variation

long stretches

single nucleotide polymorphism

most common type

missing terms

Amplified DNA fragments


repetitive DNA

occurrence, all kingdoms, plants, Arabidopsis thaliana, whole-genome duplications, maize, 80%, half of human genome

function, some non-functional, played part in human evolution

types, interspersed repeats, SINEs, short interspersed nuclear elements, best example, Alu repeats, 11% of human genome, LINEs, long interspersed nuclear elements, tandem repeats, short tandem repeats, e.g., microsatellites, nested repeats

computational problems, create ambiguities, can lead to false inference of, SNP, CNV, solution, best hit, all hits, hits above threshold, definition, > 100 bp, > 2-3 times in genome, > 97% identity
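The three ways of handling reads that hit repeats (best hit, all hits, hits above threshold) can be compared on a toy example. The hit format `(location, score)`, the function name and the tie-breaking rule (first best hit wins) are assumptions made for this sketch and do not reflect any particular aligner.

```python
from typing import List, Tuple

Hit = Tuple[str, int]  # (location, alignment score), hypothetical format


def place_read(hits: List[Hit], strategy: str = "best", min_score: int = 0) -> List[Hit]:
    """Decide which alignments of a multi-read to report."""
    if not hits:
        return []
    if strategy == "best":
        # One location only; ties go to the first hit listed (an assumption).
        return [max(hits, key=lambda h: h[1])]
    if strategy == "all":
        return list(hits)
    if strategy == "threshold":
        return [h for h in hits if h[1] >= min_score]
    raise ValueError(f"unknown strategy: {strategy}")


# A read from a repeat: two equally good placements and one weaker one.
hits = [("chr1:100", 50), ("chr5:2000", 50), ("chr9:77", 42)]

best = place_read(hits, "best")
everything = place_read(hits, "all")
above = place_read(hits, "threshold", min_score=45)
print(best, everything, above)
```

The trade-off is visible even here: "best" silently picks one of two equally good locations, which is how repeats can produce false SNP or CNV calls, while "all" and "threshold" keep the ambiguity for downstream tools to resolve.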

Computational analysis

existing tools

SV/CNV detection

SNP detection, GATK, MAQ, SamTools, SOAPsnp, VarScan

Short-read alignment, tools, Bowtie, TopHat, Cufflinks

De novo assembly, Memory-efficient, Data structures, Algorithms, Assembler, String Graph Assembler


mapping reads, reference genome

call SNPs
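Once reads are mapped, the simplest way to call SNPs is to look at the bases piled up over each reference position and report positions where a non-reference base dominates. The sketch below is a naive majority-vote caller with assumed thresholds; the tools listed under SNP detection use statistical models and base qualities instead.

```python
from collections import Counter
from typing import Dict, List, Tuple


def call_snps(reference: str,
              pileup: Dict[int, str],
              min_depth: int = 3,
              min_fraction: float = 0.8) -> List[Tuple[int, str, str]]:
    """Return (position, ref_base, alt_base) where the most common read
    base differs from the reference and is well supported.

    pileup maps a 0-based reference position to the read bases covering it.
    """
    snps = []
    for pos, bases in sorted(pileup.items()):
        if len(bases) < min_depth:
            continue                              # too few reads to decide
        base, count = Counter(bases).most_common(1)[0]
        if base != reference[pos] and count / len(bases) >= min_fraction:
            snps.append((pos, reference[pos], base))
    return snps


reference = "ACGTACGT"
pileup = {2: "TTTT", 5: "CCCA", 6: "AA"}   # toy data
snps = call_snps(reference, pileup)
print(snps)
```

Position 5 is not called because the majority base matches the reference, and position 6 is skipped because only two reads cover it.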

Genome annotation

identifying elements/structure in genome

Gene prediction

Attaching biological information to these

Structural annotation


gene structure

coding regions

location of regulatory motifs

Functional annotation

biochemical function

biological function

regulation and interactions involved