[Bio-Mirror of biosequence & bioinformatic data] [Banche Dati at University of Bologna]
Databases
- ***Nucleic Acids Research, January 2024: Database Issue (see also: 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001)
- ***Databases listed by name or category
- ***Nucleic Acids Research, July 2024: Web Server Issue (see also: 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003)
- ***NCBI ***NCBI Education
- ***Bioinformatics: Data's future shock - Nature, Apr 15, 2004
- AIBG - Database and Bioinformatic Tools
Integrated Sequence Databases
- Entrez search (Maglott et al., 2005): integration of data about:
- Genes: Nucleotide sequence database (GenBank), Genomes (complete genome assemblies), PopSet (population study data sets), ProbeSet (gene expression and microarray datasets), UniSTS (markers and mapping data), SNP (single nucleotide polymorphisms);
- Proteins: Protein (protein sequence database), CDD (conserved domains), 3D Structure (three-dimensional macromolecular structures), 3D Domains (domains from Entrez Structure).
- Documentation: PubMed (biomedical literature) - Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders (Hamosh et al., 2005) - Online Books
- Taxonomy (organisms in GenBank) - Online Mendelian Inheritance in Animals (OMIA)
- RefSeq - comprehensive, integrated, non-redundant, well-annotated set of sequences - Status Key
- Biological SOAP servers and web services provided by the public sequence data bank (Sugawara & Miyazaki, 2003)
- LocusLink - a single query interface to curated sequence and information about genetic loci.
- XREF: cross-referencing the Genetics of Model Organisms with Mammalian Phenotypes
- Expression data: dbEST; UniGene
- TEMBLOR project at EBI (biological data integration)
- dbGaP - database of genotype and phenotype
- The format of GenBank flat files - Sample - Feature Table - Parser Available for GenBank Flat File
- The Information Engineering Branch (IEB) - Building NCBI's software and databases - ASN.1
- NCBI Data in XML
- SRS (Sequence Retrieval System - Service Retirement)
- EnsMart - retrieving customised data sets from annotated genomes (at Ensembl) (Kasprzyk et al., 2004)
- ARB - a software environment for sequence data (Ludwig et al., 2004)
- DNA variation: Bi-allelic sequence data | SNP - Home Page
DNA/RNA Sequences
There are three major, comprehensive databases of DNA and RNA sequences; data are automatically and continuously exchanged among them:
- European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database (Kulikova et al., 2004; Kanz et al., 2005), now at European Bioinformatics Institute (EBI), near Cambridge, UK (Brooksbank et al., 2005; Cochrane et al., 2006);
- ***GenBank at National Center for Biotechnology Information (NCBI), Bethesda, USA (Benson et al., 2004, 2005 and 2006);
- DNA Data Bank of Japan (DDBJ) (search), Shizuoka, Japan (Miyazaki et al., 2004; Tateno et al., 2005; Okubo et al., 2006)
- Some specialized databases for small sequences include:
- CUTG (Codon Usage Tabulated from GenBank) at Japan University
- Mitomap - mitochondrial DNA (mtDNA)
- rRNA at Gent University, Belgium (Wuyts et al., 2004) | rRNA at ARB
- 5S rRNA database at Poznan, Poland
- The Ribosomal Database Project (RDP-II) - sequences and tools for high-throughput rRNA analysis (Cole et al., 2005)
- GtRDB: The Genomic tRNA Database | tRNAscan-SE at Eddy Lab
- tRNA sequences and sequences of tRNA genes - August 2003 (Sprinzl and Vassilenko, 2005)
- dbSTS - A public database of Sequence Tagged Sites (short genomic landmarks)
- Official REBASE Homepage - Restriction enzyme data and recognition sites (Roberts et al., 2005)
NEBcutter - a program to cleave DNA with restriction enzymes (Vincze et al., 2003)
- UTR: Resources for analysis of 5'-UTR and 3'-UTR of eukaryotic mRNAs (Mignone et al., 2005)
- Transterm - RNA sequences and associated motifs (Jacob et al., 2006)
- SEGE - Single Exonic Genes in Eukaryotes
- Repbase (Repeated Sequences) at Genetic Information Research Institute
- L1Base - from functional annotation to prediction of active LINE-1 elements (Penzkofer et al., 2005)
- V Base - human germline variable region sequences
- The ImMunoGeneTics database (IMGT/GENE-DB) (Baum et al., 2004; Giudicelli et al., 2005; Lefranc et al., 2005)
- VBASE2 - an integrative V gene database (Retter et al., 2005)
- IPD - the Immuno Polymorphism Database (Robinson et al., 2005)
- NPRD: Nucleosome Positioning Region Database (Victor et al., 2005), at SRS
Genes Information
- HGNC - Human Genes Nomenclature Home Page
- TIGR Human Gene Index
- GeneCards: encyclopedia of genes, proteins and diseases at Weizmann Institute
- Allgenes.org - integrated database of every known and predicted human and mouse gene
- Pseudogene.org - Comprehensive database of identified pseudogenes
- Hoppsigen: a database of human and mouse processed pseudogenes (Adel et al., 2005)
MUTATIONS - DISEASES
- Human Gene Mutation Database (HGMD) at Cardiff
- Locus Specific Mutation Databases
- COSMIC - Catalogue Of Somatic Mutations In Cancer
- Sanger Institute
- MutDB - automated SNP analysis, functionally important mutations DB, tools (Mooney and Altman, 2003)
- The TP53 Web Site - Other databases: p53 at IARC/France
- Database of Gene Knockout at Frontiers in Bioscience
- DG-CST (Disease Gene Conserved Sequence Tags) - human-mouse conserved elements associated with disease genes (Boccia et al., 2005)
- T1DBase - a community web-based resource for type 1 diabetes research (Smink et al., 2005)
- NCRI Informatics Initiative at National Cancer Research Institute (UK)
- ROCK - Online Breast Cancer Knowledgebase
ONTOLOGY
What is an Ontology? - Ontology Working Group
- Gene Ontology Consortium (2004 update in NAR; 2006 update)
GO Annotation (GOA) at EBI (Camon et al., 2004)
GoFigure - query your DNA or protein sequence against the GO annotated sequences
The Cancer Genome Anatomy Project GeneOntology Browser
- Human Gene Ontology Assignments at TIGR
- GeneMerge - functional data for a given set of genes, scores for over-representation of particular functions
- GOblet - Automated Gene Ontology annotation for anonymous sequence data (Hennig et al., 2003)
- Onto-Tools - Onto-Express, Onto-Compare, Onto-Design and Onto-Translate (Draghici et al., 2003)
- OntoBlast function - from sequence similarities to potential functional annotations (Zehetner, 2003)
- GOAL (Gene Ontology Automated Lexicon) - functional analysis of cDNA microarrays - University of Ferrara
- PA-GOSUB - model organism protein sequences with their predicted Gene Ontology (Lu et al., 2005)
- ChipInfo - organizing information from online databases into easily interpretable tabular format outputs
Genes Expression data
Vectors
- VECTORS
database
Eukaryotic Promoters
- TRANSFAC - The Transcription Factor Database (Matys et al., 2006)
- WWW Promoter Scan
- The Eukaryotic Promoter Database (EPD) (search; search by SRS under "Nucleotide related databases") (Schmid et al., 2004, 2006)
- WWW SIGNAL SCAN Databases - Web Signal Scan Service
- Promoter Analysis and Recognition at Genomatix.de
- Effects of promoter sequence elements on mRNA transcription (Lapidot & Pilpel, 2003)
- PromH - promoter identification using orthologous genomic sequences (Solovyev & Shahmuradov, 2003)
- SiteSeer - visualisation and analysis of transcription factor binding sites in nucleotide sequences (Boardman et al., 2003)
- MATCH™ - a tool for searching transcription factor binding sites in DNA sequences (Kel et al., 2003)
- Gibbs Recursive Sampler - finding transcription factor binding sites (Thompson et al., 2003)
- YMF - discovery of novel transcription factor binding sites by statistical overrepresentation (Sinha & Tompa, 2003)
- Target Explorer - identification of new target genes for a specified set of transcription factors (Sosinsky et al., 2003)
- RSAT - Regulatory Sequence Analysis Tools (van Helden, 2003)
- TRED - a Transcriptional Regulatory Element Database for in silico studies (Zhao et al., 2005)
- The MAPPER database - a multi-genome catalog of putative transcription factor binding sites (Marinescu et al., 2005)
- Dragon ERE Finder version 2 - detection and analysis of estrogen response elements in vertebrate genomes (Bajic et al., 2003)
- FIE2 - extraction of genomic DNA sequences around the start and translation initiation site of human genes (Chong et al., 2003)
- PromoSer - a large-scale mammalian promoter and transcription start site identification service (Halees et al., 2003)
- Dragon Gene Start Finder - approximate locations of the 5' ends of genes (Bajic & Seah, 2003)
- DoOP - Databases of Orthologous Promoters - upstream sequences (Barta et al., 2005)
- JASPAR: an open-access database for eukaryotic transcription factor binding profiles (Sandelin et al., 2004)
- TrSDB: a proteome database of transcription factors (Hermoso et al., 2004)
RNA/DNA Structure
- Biological Structure Resource
- WEB-THERMODYN - sequence analysis for profiling DNA helical stability (Huang & Kowalski, 2003)
- Image library of biological macromolecules (RNA) at Jena University, Germany.
- The RNA World Website
- RNA secondary structure - at Sequence Analysis with Distributed Resources
- Thermodynamics, software and databases for RNA structure
- RNA-related tools on the Bielefeld Bioinformatics Server (Sczyrba et al., 2003)
- RNA and DNA folding and hybridization prediction server mfold by M. Zuker (Zuker, 2003) - RNAfold at Vienna
- RNAsoft - a suite of RNA secondary structure prediction and design software tools (Andronescu et al., 2003)
- Pfold - RNA secondary structure prediction using stochastic context-free grammars (Knudsen & Hein, 2003)
- Structurelab - RNA structure analysis and comparison (including Stem Trace)
- Vienna RNA secondary structure server - Vienna RNA Package (Hofacker, 2003)
- PSEUDOVIEWER2 - visualization of RNA pseudoknots of any type (Han & Byun, 2003)
- GPRM - a genetic programming approach to finding common RNA secondary structure elements (Hu, 2003)
- Tools for the automatic identification and classification of RNA base pairs (Yang et al., 2003)
- A software tool-box for analysis of regulatory RNA elements (Bengert & Dandekar, 2003)
- RNA Secondary Structure as a Reusable Interface to Biological Information Resources - Felciano, Chen & Altman
- SVMPredict - An antisense oligonucleotide prediction tool using Support Vector Machines (Camps-Valls, 2004)
Software
- RNAdraw
PROTEIN Sequences
- UniProt (Universal Protein Resource) (Apweiler et al., 2004; Bairoch et al., 2005)
- Protein Information Resource Home Page at NBRF (National Biomedical Research Foundation, USA)
- PRF Protein Research Foundation (Osaka, Japan).
- OWL (non-redundant protein sequences database) at University of Manchester, UK
PROTEIN Structures
- Protein Data Bank (3D structures of proteins)
- Human Protein Reference Database (HPRD) (Mishra et al., 2006)
- The RCSB Protein Data Bank - query system and relational database (mmCIF schema) (Deshpande et al., 2005)
- PDBsum more - new summaries and analyses of the known 3D structures of proteins and nucleic acids (Laskowski et al., 2005)
- PDB-Ligand - a ligand database based on PDB for classification of ligand-binding structures (Shin and Cho, 2005)
- E-MSD (EBI Macromolecular Structure Database) - an integrated data resource for bioinformatics (Velankar et al., 2005)
- STING Report - graphic and tabular presentations of protein sequence, structure and function (Neshich et al., 2005)
- SWISS-3DIMAGE at Geneva University, Switzerland - Database of annotated 3D protein images
- Protein Structure Analysis at Department of Biochemistry and Molecular Biology, University College London
- SCOP (Structural Classification of Proteins) search and retrieval at Cambridge, UK
- NRL-3D - sequence and 3D structures of proteins by PIR
- SWISS-MODEL - Molecular Modeling on-line
- STRUCTURE at NCBI
Last updated January 6, 2023 (work in progress)