[Bio-Mirror of biosequence & bioinformatic data] [Banche Dati at University of Bologna]
    Databases
    
      - ***Nucleic Acids Research, January, 2025:
        Database Issue
        (see also: 2024, 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016,
        2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005,
        2004, 2003, 2002, 2001)
- ***Databases
        listed by name
        or category
        
 
- ***Nucleic Acids Research, July, 2025:
        Web Server Issue
        (see also: 2024, 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016,
        2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005,
        2004, 2003)
 
- ***NCBI
        ***NCBI Education/Learn - NCBI News 1994-2017 - NCBI Insights 2013-
 
- ***Bioinformatics: Data's future shock - Nature, Apr
        15, 2004
-       AIBG -
          Database and Bioinformatic Tools
Integrated Sequence Databases
    
      -  Entrez
            search (Maglott
et
          al., 2005): integration of data about:
-  Genes: Nucleotide
        sequence database (GenBank), Genomes
        (complete genome assemblies), PopSet
        (population study data sets), ProbeSet
        (gene expression and microarray datasets), UniSTS
        (markers and mapping data), SNP
        (single nucleotide polymorphisms);
-  Proteins: Protein
        (protein sequence database), CDD
        (conserved domains), 3D
          Structure (three-dimensional macromolecular structures), 3D
          Domains (domains from Entrez Structure).
-  Documentation: PubMed
        (biomedical literature) - Online
Mendelian
          Inheritance in Man (OMIM), a knowledgebase of human genes
        and genetic disorders (Hamosh
et
          al., 2005) - Online Books
        - Taxonomy
        (organisms in GenBank) - Online Mendelian Inheritance in Animals
        (OMIA)
- RefSeq
        - comprehensive, integrated, non-redundant, well-annotated set
        of sequences - Status
          Key
-  Biological
            SOAP servers and web services provided by the public
        sequence data bank (Sugawara
&
          Miyazaki, 2003)
-  LocusLink
        - a single query interface to curated sequence and information
        about genetic loci.
-  XREF:
        cross-referencing the Genetics of Model Organisms with Mammalian
        Phenotypes
- Expression data:  dbEST; UniGene
-  TEMBLOR
        project at EBI (biological data integration)
- dbGaP
        - database of genotype and phenotype
 
- The format of GenBank flat
            files - Sample
        - Feature
            Table - Parser
          available for GenBank Flat File
- The Information
          Engineering Branch (IEB) - Building NCBI's software and
        databases - ASN.1
        - NCBI
          Data in XML
      -  SRS
        (Sequence Retrieval System - Service Retirement)
-  EnsMart
        - retrieving customised data sets from annotated genomes (at
        Ensembl) (Kasprzyk
et
          al., 2004)
-  ARB - a software
        environment for sequence data (Ludwig
et
          al., 2004)
-  DNA variation: Bi-allelic
          sequence data | SNP - Home Page
 DNA/RNA Sequences
      There are three major, comprehensive databases of DNA and RNA
      sequences; data are automatically and continuously exchanged among
      them:
      - European Molecular Biology Laboratory (EMBL) Nucleotide
          Sequence Database (Kulikova et al., 2004; Kanz et al., 2005),
        now at European Bioinformatics
        Institute (EBI), near Cambridge, UK (Brooksbank et al., 2005;
        Cochrane et al., 2006);
- ***GenBank at
        National Center for Biotechnology Information (NCBI),
        Bethesda, USA (Benson et al., 2004, 2005 and 2006);
- DNA data bank of Japan (DDBJ)
        (search),
        Shizuoka, Japan (Miyazaki
et
          al., 2004; Tateno
et
          al., 2005; Okubo
et
          al., 2006)
      - Some specialized databases for small sequences
        include:
-  CUTG (Codon
        Usage Tabulated from GenBank) at Japan University
-  Mitomap -
        mitochondrial DNA (mtDNA)
-  rRNA
        at Gent University, Belgium (Wuyts
et
          al., 2004)   |    rRNA at ARB
- 5S rRNA
          database at Poznan, Poland
 
-  The Ribosomal
          Database Project (RDP-II) - sequences and tools for
        high-throughput rRNA analysis (Cole
et
          al., 2005)
-  GtRDB: The
        Genomic tRNA Database   |   tRNAscan-SE at Eddy
        Lab
-  tRNA
        sequences and sequences of tRNA genes - August 2003 (Sprinzl
and
          Vassilenko, 2005)
-  dbSTS
        - A public database of Sequence Tagged Sites (short genomic
        landmarks)
-  Official
          REBASE Homepage - Restriction enzyme data and
        recognition sites (Roberts
et
          al., 2005)
      NEBcutter
      - a program to cleave DNA with restriction enzymes (Vincze
et
        al., 2003)
      - UTR:
        Resources for analysis of 5'-UTR and 3'-UTR of eukaryotic mRNAs (Mignone
et
          al., 2005)
-  Transterm
        - RNA sequences and associated motifs (Jacob
et
          al., 2006)
-  SEGE -
        Single Exonic Genes in Eukaryotes
- Repbase (Repeated Sequences) at Genetic Information
          Research Institute
-  L1Base - from
        functional annotation to prediction of active LINE-1 elements (Penzkofer
et
          al., 2005)
-  V
          Base - human germline variable region sequences
- The ImMunoGeneTics database (IMGT/GENE-DB) (Baum
et
          al., 2004, Giudicelli
et
          al., 2005, Lefranc
et
          al., 2005)
-  VBASE2 - an
        integrative V gene database (Retter
et
          al., 2005)
- IPD - the Immuno Polymorphism Database (Robinson
et
          al., 2005)
- NPRD: Nucleosome Positioning Region Database (Victor
et
          al., 2005), at SRS
 Genes Information
      - HGNC - Human Gene Nomenclature Home Page
-  TIGR Human Gene
          Index
-  GeneCards:
          encyclopedia of genes, proteins and diseases at Weizmann Institute
-  Allgenes.org -
        integrated database of every known and predicted human and
        mouse gene
- Pseudogene.org -
        Comprehensive database of identified pseudogenes
 
-  Hoppsigen:
        a database of human and mouse processed pseudogenes (Adel
et
          al., 2005)
      MUTATIONS - DISEASES
      - Human Gene Mutation Database (HGMD) at Cardiff
- Locus
            Specific Mutation Databases
- COSMIC - Catalogue
        Of Somatic Mutations In Cancer
        - Sanger Institute
 
-  MutDB - automated SNP
        analysis, functionally important mutations DB, tools (Mooney
and
          Altman, 2003)
- The TP53 Web Site -
        Other databases: p53 at
        IARC/France 
- Database
of
          Gene Knockout at Frontiers in Bioscience
-  DG-CST (Disease Gene
          Conserved Sequence Tags) - human-mouse conserved elements
        associated to disease genes (Boccia
et
          al., 2005)
-  T1DBase
        - a community web-based resource for type 1 diabetes research (Smink
et
          al., 2005)
-  NCRI
          Informatics Initiative at National Cancer Research
        Institute (UK)
- ROCK - Online Breast
        Cancer Knowledgebase
 
 ONTOLOGY
      What
          is an Ontology? - Ontology
        Working Group
      - Gene Ontology
            Consortium (2004 update in NAR; 2006 update)
      GO Annotation 
      (GOA) at EBI  (Camon
et
        al., 2004) 
      GoFigure - query your DNA or protein sequence against the GO
      annotated sequences 
      The Cancer
        Genome Anatomy Project GeneOntology Browser
      - Human Gene
          Ontology Assignments  at TIGR
-  GeneMerge
        - functional data for a given set of genes, scores for
        over-representation of particular functions
-  GOblet -
        Automated Gene Ontology annotation for anonymous sequence data (Hennig
et
          al., 2003)
-  Onto-Tools
        - Onto-Express, Onto-Compare, Onto-Design and Onto-Translate (Draghici
et
          al., 2003)
-  OntoBlast
        function - from sequence similarities to potential functional
        annotations (Zehetner,
          2003)
-  GOAL (Gene
        Ontology Automated Lexicon) - functional analysis of cDNA
        microarrays - University of Ferrara
-  PA-GOSUB
        - model organism protein sequences with their predicted Gene
        Ontology (Lu
          et al., 2005)
-  ChipInfo -
        organizing information from online databases into easily
        interpretable tabular format outputs
Genes Expression data
    
       Vectors
      -  VECTORS
          database
 Eukaryotic Promoters
      -  TRANSFAC
          - The Transcription Factor Database (Matys
et
          al., 2006)
-  WWW
          Promoter Scan
- The Eukaryotic Promoter Database (EPD)
        (search;
        search by SRS
        under "Nucleotide related databases") (Schmid et al., 2004,
        2006)
-  WWW SIGNAL
          SCAN Databases - Web Signal
          Scan Service
-  Promoter Analysis and
          Recognition at Genomatix.de
- Effects of promoter
sequence
          elements on mRNA transcription (Lapidot
&
          Pilpel, 2003)
-  PromH
        - promoters identification using orthologous genomic sequences (Solovyev
&
          Shahmuradov, 2003)
-  SiteSeer
        - visualisation and analysis of transcription factor binding
        sites in nucleotide sequences (Boardman
et
          al., 2003)
-  MATCH™
        - a tool for searching transcription factor binding sites in DNA
        sequences (Kel
et
          al., 2003)
-  Gibbs
          Recursive Sampler - finding transcription factor binding
        sites (Thompson
et
          al., 2003)
-  YMF
        - discovery of novel transcription factor binding sites by
        statistical overrepresentation (Sinha
&
          Tompa, 2003)
-  Target
          Explorer - identification of new target genes
        for a specified set of transcription factors (Sosinsky
et
          al., 2003)
-  RSAT - Regulatory
        Sequence Analysis Tools (van
Helden,
          2003)
-  TRED
        - a Transcriptional Regulatory Element Database for in silico
        studies (Zhao
et
          al., 2005)
-  The MAPPER database
        - a multi-genome catalog of putative transcription factor
        binding sites (Marinescu
et
          al., 2005)
- Dragon ERE Finder version 2 - detection and analysis of
        estrogen response elements in vertebrate genomes (Bajic
et
          al., 2003)
       
      - FIE2 - extraction
        of genomic DNA sequences around the start and translation
          initiation site of human genes (Chong
et
          al., 2003)
-  PromoSer
        - a large-scale mammalian promoter and transcription
          start site identification service (Halees
et
          al., 2003)
- Dragon Gene Start Finder - approximate locations of the 5'
          ends of genes (Bajic
&
          Seah, 2003)
-  DoOP - Databases of
        Orthologous Promoters - upstream sequences (Barta
et
          al., 2005)
       
      -  JASPAR:
        an open-access database for eukaryotic transcription factor
        binding profiles (Sandelin
et
          al., 2004)
-  TrSDB:
        a proteome database of transcription factors (Hermoso
et
          al., 2004)
RNA/DNA Structure
      -  Biological
          Structure Resource
-  WEB-THERMODYN
        - sequence analysis for profiling DNA helical stability
        (Huang
&
          Kowalski, 2003)
-  Image library of
          biological macromolecules (RNA) at
        Jena University, Germany.
-  The RNA World
          Website
-  RNA
secondary
          structure - at Sequence Analysis with Distributed
        Resources
-  Thermodynamics,
software
          and databases for RNA structure
-  RNA-related
          tools on the Bielefeld Bioinformatics Server (Sczyrba
et
          al., 2003)
       
      - RNA
        and DNA
        folding and hybridization prediction server mfold
        by M. Zuker (Zuker,
          2003) - RNAfold
        at Vienna
-  RNAsoft - a suite of
        RNA secondary structure prediction and design software tools (Andronescu
et
          al., 2003)
-  Pfold
        - RNA secondary structure prediction using stochastic
        context-free grammars (Knudsen
&
          Hein, 2003)
-  Structurelab
        - RNA structure analysis and comparison (including Stem Trace)
-  Vienna RNA secondary
          structure server - Vienna RNA
          Package (Hofacker,
          2003)
-  PSEUDOVIEWER2
        - visualization of RNA pseudoknots of any type (Han
&
          Byun, 2003)
-  GPRM
        - a genetic programming approach to finding common RNA secondary
        structure elements (Hu,
          2003)
- Tools for the automatic identification
          and classification of RNA base pairs (Yang
et
          al., 2003)
- A software tool-box for analysis of regulatory
          RNA elements (Bengert
&
          Dandekar, 2003)
-  RNA
Secondary
          Structure as a Reusable Interface to Biological Information
          Resources - Felciano, Chen & Altman
-  SVMPredict
        - An antisense oligonucleotide prediction tool using Support
        Vector Machines (Camps-Valls,
          2004)
      Software
      - RNAdraw
 PROTEIN Sequences
      -  UniProt
            (Universal Protein Resource) - (Apweiler
et
          al., 2004; Bairoch
et
          al., 2005)
-  Protein Information
          Resource Home Page at NBRF (National Biomedical Research
        Foundation, USA)
-  PRF Protein Research
        Foundation (Osaka, Japan).
-  OWL
        (non-redundant protein sequences database) at University of
        Manchester, UK
 PROTEIN Structures
      -  Protein Data Bank
        (3D structures of proteins)
-  Human Protein Reference
          Database (HPRD) (Mishra
et
          al., 2006)
-  The RCSB
          Protein Data Bank - query system and relational database
        (mmCIF schema) (Deshpande
et
          al., 2005)
-  PDBsum
          more - new summaries and analyses of the known 3D
        structures of proteins and nucleic acids (Laskowski
et
          al., 2005)
-  PDB-Ligand
        - a ligand database based on PDB for classification of
        ligand-binding structures (Shin
and
          Cho, 2005)
-  E-MSD (EBI
          Macromolecular Structure Database) - an integrated data
        resource for bioinformatics (Velankar et al., 2005)
-  STING
          Report - graphic and tabular presentations of protein
        sequence, structure and function (Neshich
et
          al., 2005)
-  SWISS-3DIMAGE
          at Geneva University, Switzerland - Database of annotated
        3D protein images
-  Protein
          Structure Analysis at Department of Biochemistry and
        Molecular Biology, University College London
-  SCOP search
          and retrieval at Cambridge, UK - Structural Classification of
        Proteins
-  NRL-3D
        - sequence and 3D structures of proteins by PIR
-  SWISS-MODEL
        - Molecular Modeling on-line
-  STRUCTURE
          at NCBI
      
 
      Last updated January 6, 2023 (work in progress)