Databases
[Bio-Mirror of
biosequence
& bioinformatic data] [Banche
Dati at University of Bologna]
***Nucleic
Acids Research, January 1, 2007: Current
status
of databases (see also: 2006,
2005,
2004,
2003,
2002,
2001)
***Nucleic
Acids Research, July 1, 2005: Web Resources
(see also: 2003)
***NCBI***Database
Resources at NCBI
***The
Molecular Biology Database Collection - 2006 - Databases
listed by name or category
- (2003, 2005
updates)
***DBcat:
a catalog of 500 biological databases | Databases
at the Weizmann Institute of Science
***Database
resources of the National Center for Biotechnology Information (Wheeler
et al., 2005) (2004)
***Bioinformatics:
Data's future shock - Nature, Apr 15, 2004
Integrated sequence databases
- Entrez search
(Maglott
et al., 2005): integration of data about:
- Genes: Nucleotide
sequence database (Genbank), Genomes
(complete genome assemblies), PopSet
(population study data sets), ProbeSet
(gene expression and microarray datasets), UniSTS
(markers and mapping data), SNP
(single nucleotide polymorphisms);
- Proteins: Protein
(Protein sequence database), CDD
(conserved domains), 3D
Structure (three-dimensional macromolecular structures), 3D
Domains (domains from Entrez Structure).
- Documentation: PubMed
(biomedical literature) - Online
Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes
and genetic disorders (Hamosh
et al., 2005) - Online Books
- Taxonomy
(organisms in GenBank) - Online
Mendelian Inheritance in Animals (OMIA)
- Biological
SOAP
servers
and web services provided by the public sequence data bank (Sugawara
& Miyazaki, 2003)
- LocusLink
-
a single query interface to curated sequence and information about
genetic
loci.
- XREF:
cross-referencing
the Genetics of Model Organisms with Mammalian Phenotypes
- Expression data: dbEST; UniGene
- TEMBLOR
project at EBI (biological data integration)
- dbGaP - database of
genotype and
phenotype
- The format of Genbank
flat files - Sample
- Feature
Table - Parser
Available for GenBank Flat File
- SRS
(Sequence Retrieval System; integration among ~90 molecular biology
databases): SRS
at EMBL - EBI | SRS
at HGMP-RC | SRS
7.1
- EnsMart -
retrieving
customised data sets from annotated genomes (at Ensembl) (Kasprzyk
et al., 2004)
- ARB - a software
environment
for
sequence data (Ludwig
et al., 2004)
- DNA variation: Bi-allelic
sequence
data | SNP - Home Page
DNA/RNA sequences
There are three major, comprehensive databases of DNA and RNA sequences;
data are automatically and continuously exchanged among them:
- European
Molecular Biology Laboratory (EMBL) Nucleotide
Sequence Database (Kulikova
et al., 2004; Kanz
et al., 2005),
now at European Bioinformatics
Institute
(EBI), near Cambridge, UK (Brooksbank
et al., 2005; Cochrane
et al., 2006);
- ***Genbank
at National Center for Biotechnology Information (NCBI),
Bethesda, USA (Benson et al., 2004,
2005
and 2006);
- DNA data bank of Japan (DDBJ)
(search),
Shizuoka, Japan (Miyazaki
et al., 2004; Tateno
et al., 2005; Okubo
et al., 2006)
- Some specialized databases for small sequences
include:
- CUTG (Codon Usage
Tabulated
from Genbank) at Japan University
- Mitomap - mitochondrial
DNA
(mtDNA)
- rRNA at
Gent
University,
Belgium (Wuyts
et al., 2004) | rRNA at ARB
- 5S rRNA database
at Poznań, Poland
- The Ribosomal
Database
Project
(RDP-II) - sequences and tools for high-throughput rRNA analysis (Cole
et al., 2005)
- GtRDB: The
Genomic tRNA
Database | tRNAscan-SE
at Eddy Lab
- tRNA
sequences and
sequences of tRNA genes - August 2003 (Sprinzl
and Vassilenko, 2005)
- dbSTS
-
A public
database of Sequence Tagged Sites (short genomic landmarks)
- Official REBASE
Homepage - Restriction enzyme data and recognition sites (Roberts
et al., 2005)
NEBcutter
- a program to cleave DNA with restriction enzymes (Vincze
et al., 2003)
- UTR:
Resources
for analysis
of 5'-UTR and 3'-UTR of eukaryotic mRNAs (Mignone
et al., 2005)
- Transterm
-
RNA sequences
and associated motifs (Jacob
et al., 2006)
- SEGE - Single
Exonic
Genes in Eukaryotes
- Repbase (Repeated Sequences) at Genetic Information
Research Institute
- L1Base - from
functional
annotation
to prediction of active LINE-1 elements (Penzkofer
et al., 2005)
- V
Base
- human germline variable region sequences
- The ImMunoGeneTics database (IMGT/GENE-DB)
(Baum
et al., 2004, Giudicelli
et al., 2005, Lefranc
et al., 2005)
- VBASE2 - an
integrative
V gene
database (Retter
et al., 2005)
- IPD--the Immuno Polymorphism Database (Robinson
et al., 2005)
- NPRD: Nucleosome Positioning Region Database (Victor
et al., 2005), at SRS
Gene information
- Human Genes Nomenclature
Home Page
- TIGR Human Gene Index
- GeneCards:
encyclopedia
of genes, proteins and diseases at Weizmann Institute
- Allgenes.org -
integrated
database of every known and predicted human and mouse gene
- Pseudogene.org -
Comprehensive database of identified pseudogenes
- Hoppsigen:
a database of human and mouse processed pseudogenes (Adel
et al., 2005)
MUTATIONS
- Human Gene Mutation
Database (HGMD) at Cardiff
- Locus Specific
Mutation
Databases
- COSMIC - Catalogue Of Somatic
Mutations In Cancer - Sanger
Institute
- MutDB - automated SNP
analysis,
functionally important mutations DB, tools (Mooney
and Altman, 2003)
- The TP53 Web Site -
Other databases: p53
at IARC/France
- Database
of Gene Knockout at Frontiers in Bioscience
- DG-CST (Disease Gene
Conserved
Sequence Tags) - human-mouse conserved elements associated with
disease
genes (Boccia
et al., 2005)
- T1DBase -
a
community
web-based resource for type 1 diabetes research (Smink
et al., 2005)
- NCRI Informatics
Initiative
at National Cancer Research Institute (UK)
ONTOLOGY
What
is an Ontology? - Ontology Working
Group - Gene
Ontology Consortium
(2004
update in NAR; 2006
update)
GO Annotation
(GOA) at EBI (Camon
et al., 2004)
GoFigure - query your DNA or protein sequence against the GO annotated
sequences
The Cancer Genome
Anatomy Project GeneOntology Browser - Human Gene
Ontology Assignments
at TIGR
- GeneMerge
- functional data for a given set of genes, scores for
over-representation
of particular functions
- GOblet -
Automated Gene
Ontology annotation for anonymous sequence data (Hennig
et al., 2003)
- Onto-Tools
- Onto-Express, Onto-Compare, Onto-Design and Onto-Translate (Draghici
et al., 2003)
- OntoBlast
function
- from sequence similarities to potential functional annotations (Zehetner,
2003)
- GOAL (Gene
Ontology
Automated
Lexicon) - functional analysis of cDNA microarrays - University of
Ferrara
- PA-GOSUB
- model
organism protein sequences with their predicted Gene Ontology (Lu
et al., 2005)
- ChipInfo
- organizing information from online databases into easily
interpretable
tabular format outputs
Gene expression data
- The Expressed
Gene Anatomy
Database (EGAD)
- dbEST
- "Single-pass" cDNA sequences or Expressed Sequence Tags at NCBI
- Mouse
Genome Informatics - Gene Expression Data
Vectors
- VECTORS
database
Eukaryotic Promoters
- TRANSFAC
- The Transcription Factor Database (Matys
et al., 2006)
- WWW
Promoter
Scan
- The Eukaryotic Promoter Database (EPD)
(search;
search
by SRS
under "Nucleotide related databases") (Schmid et al., 2004,
2006)
- WWW
SIGNAL SCAN Databases - Web
Signal Scan Service
- Promoter Analysis and
Recognition
at Genomatix.de
- Effects of promoter
sequence elements on mRNA transcription (Lapidot
& Pilpel, 2003)
- PromH
- promoters identification using orthologous genomic sequences (Solovyev
& Shahmuradov, 2003)
- SiteSeer -
visualisation
and analysis of transcription factor binding sites in nucleotide
sequences
(Boardman
et al., 2003)
- MATCH™
- a
tool for searching transcription factor binding sites in DNA sequences
(Kel
et al., 2003)
- Gibbs
Recursive
Sampler - finding transcription factor binding sites (Thompson
et al., 2003)
- YMF -
discovery
of novel transcription factor binding sites by statistical
overrepresentation
(Sinha
& Tompa, 2003)
- Target
Explorer
- identification of new target genes for a specified
set
of transcription factors (Sosinsky
et al., 2003)
- RSAT - Regulatory
Sequence Analysis
Tools (van
Helden, 2003)
- TRED
- a Transcriptional Regulatory Element Database for in silico studies (Zhao
et al., 2005)
- The MAPPER database -
a
multi-genome
catalog of putative transcription factor binding sites (Marinescu
et al., 2005)
- Dragon ERE Finder version 2 - detection and analysis of estrogen
response
elements in vertebrate genomes (Bajic
et al., 2003)
- FIE2 -
extraction of
genomic
DNA sequences around the start and translation initiation
site of human genes (Chong
et al., 2003)
- PromoSer - a
large-scale
mammalian promoter and transcription start site
identification
service (Halees
et al., 2003)
- Dragon Gene Start Finder - approximate locations of the 5'
ends of
genes
(Bajic
& Seah, 2003)
- DoOP - Databases of
Orthologous
Promoters
- upstream sequences (Barta
et al., 2005)
- JASPAR:
an open-access database for eukaryotic transcription factor binding
profiles
(Sandelin
et al., 2004)
- TrSDB:
a proteome database of transcription factors (Hermoso
et al., 2004)
RNA/DNA structure
- Biological Structure
Resource
- WEB-THERMODYN
- sequence analysis for profiling DNA helical stability (Huang
& Kowalski, 2003)
- Image library of
biological
macromolecules (RNA)
at Jena University, Germany.
- The RNA World Website
- RNA
secondary structure - at Sequence Analysis with Distributed
Resources
- Thermodynamics,
software and databases for RNA structure
- RNA-related
tools
on the Bielefeld Bioinformatics Server (Sczyrba
et al., 2003)
- RNA
and DNA folding and
hybridization prediction server mfold
by M. Zuker (Zuker,
2003) - mFold
Input
- RNAsoft - a suite of RNA
secondary
structure prediction and design software tools (Andronescu
et al., 2003)
- Pfold -
RNA
secondary
structure prediction using stochastic context-free grammars (Knudsen
& Hein, 2003)
- Structurelab
- RNA structure analysis and comparison (including Stem Trace)
- Vienna RNA secondary
structure server
- Vienna RNA Package
(Hofacker,
2003)
- PSEUDOVIEWER2
- visualization
of RNA pseudoknots of any type (Han
& Byun, 2003)
- GPRM
- a
genetic
programming approach to finding common RNA secondary structure elements
(Hu,
2003)
- Tools for the automatic identification
and classification of RNA base pairs (Yang
et al., 2003)
- A software tool-box for analysis of regulatory
RNA elements (Bengert
& Dandekar, 2003)
- RNA
Secondary Structure as a Reusable Interface to Biological Information
Resources -
Felciano, Chen & Altman
- SVMPredict
- An antisense oligonucleotide prediction tool using Support Vector
Machines
(Camps-Valls,
2004)
Software for download - RNAdraw
PROTEIN sequences
- UniProt (Universal
Protein
Resource) - (Apweiler
et al., 2004; Bairoch
et al., 2005)
- Protein Information
Resource
Home
Page at NBRF (National Biomedical Research Foundation, USA)
- PRF Protein Research
Foundation
(Osaka, Japan).
- OWL
(non-redundant
protein sequences database) at University of Manchester, UK
PROTEIN structures
- Protein Data Bank
(3D
structures
of proteins)
- Human Protein Reference Database
(HPRD)
(Mishra
et al., 2006)
- The RCSB
Protein
Data
Bank - query system and relational database (mmCIF schema) (Deshpande
et al., 2005)
- PDBsum
more
- new summaries and analyses of the known 3D structures of proteins and
nucleic acids (Laskowski
et al., 2005)
- PDB-Ligand - a
ligand
database based on PDB for classification of ligand-binding structures (Shin
and Cho, 2005)
- E-MSD (EBI Macromolecular
Structure
Database) - an integrated data resource for bioinformatics (Velankar
et al., 2005)
- STING
Report - graphic and tabular presentations of protein sequence,
structure
and function (Neshich
et al., 2005)
- SWISS-3DIMAGE
at Geneva University, Switzerland - Database of annotated 3D protein
images
- Protein
Structure
Analysis at Department of Biochemistry and Molecular Biology,
University
College London
- SCOP search and
retrieval
at Cambridge, UK - Structural Classification of Proteins
- NRL-3D
- sequence and 3D structures of proteins by PIR
- SWISS-MODEL
- Molecular Modeling on-line
- STRUCTURE at
NCBI
Last updated November 30, 2007