Assignment 2 : Nucleic Acids Research (NAR) Databases

1. Importance of NAR

1.1. Researchers need bioinformatics tools and databases to support their work.

1.2. The NAR database collection provides an easy-to-use guide for navigating this huge volume of data and helps place related data next to each other.

1.3. Flexible approach to classify protein structure.

1.4. Useful benchmark for protein documentation.

1.5. Each source is monitored and maintained by its respective database coordinator.

2. Organization of the NAR databases and their major groupings

2.1. Nucleotide sequence database

2.1.1. International Nucleotide Sequence Database Collaboration

2.1.2. Coding and non-coding DNA

2.1.3. Gene structure, introns and exons, splice sites

2.1.4. Transcriptional regulator sites and transcription factors

2.2. RNA sequence database

2.3. Protein sequence database

2.3.1. General sequence databases

2.3.2. Protein properties

HHMD - Human Histone Modification Database

TopFIND - Protein N- and C-termini and protease processing

2.3.3. Protein localization and targeting

TMPad - Helix-packing folds in transmembrane proteins

Secreted Protein Database - Secreted proteins from human, mouse and rat

2.3.4. Protein sequence motifs and active sites

Minimotif Miner - Search tools for short functional motifs involved in posttranslational modifications, binding to other proteins, nucleic acids, or small molecules

PolyQ - Polyglutamine repeats in proteins

2.3.5. Protein domain databases; protein classification

BAliBASE - Benchmark database for comparison of multiple sequence alignments

OMA - Orthologous MAtrix: orthology inference among 1000 complete genomes

2.3.6. Databases of individual protein families

CLIPZ - Experimentally-determined binding sites of RNA-binding proteins

Heme types, protein structures, axial ligands and Em values

2.4. Structure database

2.4.1. BARD

2.4.2. Small molecules

PubChem - Structures and biological activities of small organic molecules

MMsINC - Database of commercially-available compounds for virtual screening and chemoinformatics

2.4.3. Carbohydrates

Glycan - Carbohydrate database, part of the KEGG system

CSS (Carbohydrate Structure Suite) - Carbohydrate 3D structures derived from the PDB

2.4.4. Nucleic acid structure

Greglist - G-quadruplex motifs and potentially G-quadruplex regulated genes

Voronoia4RNA - Packing of RNA molecules and complexes

2.4.5. Protein structure

fPOP - Footprinting protein functional surfaces by comparative patterns

Genome3D - Domain structure predictions and 3D models for proteins from model genomes

2.5. Genomic database

2.5.1. MGD - Mouse Genome Database

2.5.2. The Gene Indices

2.5.4. Non-vertebrates

2.5.4.1. Genome annotation terms, ontologies and nomenclature

Example : BioGPS - Gene annotation portal and a resource on gene and protein function

Gene Ontology-based functional similarity values for proteins and protein families

2.5.4.2. Taxonomy and identification

Example : SuperCAT - A database for multilocus sequence typing analysis of the Bacillus cereus group of bacteria

Example : MetaRef - Metagenomic catalog of clade-specific microbial genes

2.5.4.3. General genomic databases

Example : BacMap - Picture atlas of annotated bacterial genomes

Example : CoGenT++ - Complete Genome Tracking database

2.5.4.4. Viral genome databases

Example : ViTa - microRNA targets of the influenza virus

Example : phiSITE - Gene regulation in bacteriophages

2.5.4.5. Prokaryotic genome databases

Example : DOOR - Predicted operons in bacterial and archaeal genomes

Example : SporeWeb - Regulatory pathways during the sporulation cycle of Bacillus subtilis

2.5.4.6. Unicellular eukaryotes genome databases

Example : pico-PLAZA - Genome database of microbial photosynthetic eukaryotes

Example : metaTIGER - Metabolic evolution resource and its application to Plasmodium evolution

2.5.4.7. Fungal genome databases

Example : Génolevures - A comparison of S. cerevisiae and 14 other yeast species

Example : PhenoM - Morphological database of essential yeast genes

2.5.4.8. Invertebrate genome databases

Example : FLIGHT - Drosophila phenotypes, gene expression and protein interaction data

Example : Sebida - Sex bias in insect gene expression database

2.6. Metabolic and signaling pathways

2.6.1. Prokaryotic genome databases

Example : ProPortal - Prochlorococcus marinus and its phages

Example : Roundup - Orthologs and corresponding evolutionary distances

2.6.2. Enzymes and enzyme nomenclature

Example : ExplorEnz - Reference database of the IUBMB Enzyme Nomenclature

Example : eQuilibrator - Thermodynamics calculator for biochemical reactions

2.6.3. Metabolic pathways

Example : Bionemo - Curated information about biodegradation-related genes and proteins

Example : Rhea - EBI's biochemical reaction database

2.6.4. Protein-protein interactions

Example : Human Protein Interaction database (HPID) - Protein interaction predictions by several different methods

Example : PepCyber:P~Pep - Human protein interactions mediated by phosphoprotein-binding domains

2.6.5. Signalling pathways

Example : PhosPhAt - Arabidopsis Protein Phosphorylation Site Database

Example : KBDOCK - Protein domain interactions and interfaces

2.7. Human and other vertebrate genomes

2.7.1. Model organisms, comparative genomics

Example : cBARBEL - Catfish genome database

Example : Manteia - Embryonic development of the mouse, chicken, zebrafish and human

2.7.2. Human genome databases, maps and viewers

Example : Locus Reference Genomic sequences - Each LRG is a stable genomic DNA sequence for a region of the human genome

Example : X:MAP - Annotation and visualization of genome structure for Affymetrix exon array analysis

2.7.3. Human ORFs

Example : Evola - Human genes and their vertebrate orthologs

Example : Hoppsigen - Human and mouse homologous processed pseudogenes

2.8. Human genes and diseases

2.8.1. General human genetics databases

Example : MutDB - Predicted biochemical effects of human genetic variation: mapping of SNPs on protein sequence and structure

Example : GenAtlas - Human genes, markers, and phenotypes

2.8.2. General polymorphism databases

Example : Patrocles - Polymorphic miRNA-mediated gene regulation in vertebrates

Example : YH database - A database for the first Asian diploid genome

2.8.3. Cancer gene databases

Example : DriverDB - Cancer driver genes/mutations deduced from cancer exome-seq results

Example : PubMeth - Links between DNA methylation levels and cancer

2.8.4. Gene-, system- or disease-specific databases

Example : AutDB - Catalog of genes linked to Autism Spectrum Disorders

Example : HOX-PRO - Clustering of homeobox genes

2.9. Microarray data and other gene expression databases

2.9.1. Example : 4DExpress - Database for cross species expression pattern comparisons

2.9.2. Example : LOLA - List of lists annotated: a comparison of gene sets identified in different microarray experiments

2.10. Proteomic resources

2.10.1. Example : PeptideAtlas - Mass-spectrometry-based proteomics data for human, yeast, E. coli and Mycobacterium

2.10.2. Example : Plasma Proteome Database - Qualitative and quantitative information on proteins in human plasma and serum

2.11. Organelle databases

2.11.1. Chloroplast Genome Database

2.11.2. FUGOID

2.11.3. GOBASE

2.11.4. Organelle DB

2.11.5. Organelle genomes

2.11.6. PeroxisomeDB

2.11.7. Plant Organelles Database

2.11.8. PLprot

2.11.9. Mitochondrial genes and proteins

2.12. Plants databases

2.12.1. Chloroplast Genome Database

2.12.2. General plant databases

2.12.3. Arabidopsis thaliana

2.12.4. Rice

2.12.5. Other plants

2.13. Immunological databases

2.13.1. HaptenDB

2.13.2. Protegen

2.14. Cell biology

2.14.1. CloneDB

2.14.2. ExoCarta

2.14.3. Example : LUCApedia - Predicted genome, proteome, and reactome of LUCA

2.14.4. Example : DNAtraffic - DNA dynamics during the cell cycle

2.15. Other molecular databases

2.15.1. Example : CellFinder - Gene and protein expression, phenotype and images mapped to the cell types

2.15.2. Example : MetaRouter - Compounds and pathways related to bioremediation

2.15.3. Drugs and drug design

Example : BioDrugScreen - A resource for computational drug design and discovery

Example : Transformer - Biotransformation of xenobiotics (drugs and food ingredients) by human enzymes

2.15.4. Molecular probes and primers

Example : Primer Studio - PCR primers for eukaryotic and prokaryotic genes

Example : OligoArrayDb - Pangenomic sets of microarray probes for organisms with fully sequenced genomes

2.16. Immunological databases

2.16.1. Example : HPTAA - Database of potential tumor-associated antigens that uses expression data from various expression platforms, including carefully chosen publicly available microarray expression data, GEO SAGE data and Unigene expression data

2.16.2. Example : IEDB-3D - Structural data within the Immune Epitope Database

2.17. Plants databases

2.17.1. General plant databases

Example : Phytozome - JGI's platform for green plant genomics

Example : GoMapMan - Unified plant-specific gene ontology

2.18. Organelle databases

2.18.1. Example : PLprot - Arabidopsis thaliana chloroplast protein database

2.18.2. Example : Plant Organelles Database - Images of plant organelles and protocols for plant organelle research

2.18.3. Mitochondrial genes and proteins

Example : MitoGenesisDB - Expression data to explore spatio-temporal dynamics of mitochondrial biogenesis

2.18.4. Arabidopsis thaliana

Example : CATdb - Arabidopsis transcriptome data

Example : SeedGenes - Genes essential for Arabidopsis development

2.18.5. Rice

Example : RiceXPro - Rice transcriptome under natural conditions

Example : OryGenesDB - Rice genes, T-DNA and Ds flanking sequence tags

2.18.6. Other plants

Example : MedicCyc - Biochemical pathways in Medicago truncatula

Example : Genome database for Rosaceae - Genetics and genomics data on apple, cherry, peach, pear, raspberry, rose and strawberry