TRAM (Transcriptome Mapper) 1.3


User Guide - December, 2017
(TRAM 2017)
Revised May, 2018


Mac OS X and Windows versions

INDEX

INTRODUCTION

INSTALLATION


SET UP
    1. Importing data about chromosomes and genes of an organism
     1.1 Inserting data about the chromosome number and length (bp)
         of an organism
     1.2 Importing localization data for known genes
     1.3 Importing localization data for UniGene EST clusters
    2. Importing gene identifiers conversion data tables
     2.1 Conversion of Sequence accession numbers to Gene Symbols
     2.2 Importing Custom identifiers conversion table
     2.3 Importing gene probe identifiers for a Platform

USE
    3. Importing the expression data files
    4. Analyzing data
     4.1 Creating and analyzing maps of the transcriptome
     4.2 Searching for clusters of neighbouring over/under-expressed genes
     4.3 Use of TRAM as "TRAM Results Viewer" (TRV)

GENERAL DEFINITIONS
    5.1 File
    5.2 Table
    5.3 Record
    5.4 Field
    5.5 Layout
    5.6 Browse Mode
    5.7 Find Mode
    5.8 Preview Mode

MENU AND COMMANDS

    6.1 TRAM
    6.2 File
    6.3 Edit
    6.4 View
    6.5 Records
    6.6 Scripts
    6.7 Help

TROUBLESHOOTING

TECHNICAL NOTES
    7.1 Known software limits
    7.2 Bug reports

ACKNOWLEDGEMENTS


 
INTRODUCTION                                                               


TRAM generates and analyzes transcriptome maps. It is able to import and integrate any gene expression data source in tabulated text format and to map expression values to the relevant genomic region, providing statistical analysis of over- or under-expressed regions compared to the whole genome or to the relative chromosome.

This guide is designed for detailed documentation of TRAM 1.3 software.
It shows how to install the software and how to import expression data to create and analyze transcriptome maps.

Download TRAM 1.3 for Mac OS (Operating System) X or for Windows from the following address:
http://apollo11.isto.unibo.it/software/

The minimum software requirements are:
Mac OS X 10.6, OS X Lion 10.7, OS X Mountain Lion 10.8;
Windows XP Professional, Home Edition (Service Pack 3);
Windows Vista Ultimate, Business, Home Premium (Service Pack 2);
Windows 7 Ultimate, Professional, Home Premium;
Windows 8 Standard and Pro editions.


A connection to the Internet is required to display the software Guide and to download data for set up, but not to run the tool.

If you are working on human gene expression values, download the file:
TRAM_1.3_HUMAN_2017.zip         (Macintosh)
TRAM_1.3_HUMAN_2017_Win.zip     (Windows)

For all other cases download the file:
TRAM_1.3.zip                    (Macintosh)
TRAM_1.3_Win.zip                (Windows)

The downloaded file should be automatically decompressed, generating a "TRAM" folder.
Failing this, double click on the file to activate the default decompression utility of your system.

The TRAM Folder contains:
"TRAM" (Macintosh) or "TRAM.exe" (Windows) file
       (the runtime application);
"TRAM.TMA" (database file);
"Batch_Import_A" folder;
"Batch_Import_B" folder;
"Platform" folder;
"Results" folder;
"FMP Acknowledgments.pdf" file;
"Extensions" folder, containing a "Dictionaries" folder,
     with the dictionary file for supported languages;
     (and an "English" folder with 3 files, for Windows);
40 ".dll" files (for Windows).

TRAM 1.3 is based on the FileMaker Pro 12 (FileMaker Pro, Inc.) database management software (http://www.filemaker.com/).
It is released as a FileMaker Pro 12 template, together with a runtime application that runs the "FileMaker Pro" engine at the core of the software.
The runtime is freely distributed in compliance with the license of the "FileMaker Pro 12 Advanced" developer package that was used to create the program.

Standard database commands (Find, Sort, Export records) are available within each layout of TRAM (see "GENERAL DEFINITIONS" and "MENU AND COMMANDS" sections in this Guide).


INSTALLATION                                                       

Once decompressed, TRAM is ready to be used.
Macintosh: open the "TRAM" application ("TRAM" or "TRAM.app" file) contained in the "TRAM" folder.
Windows: open the "TRAM" application ("TRAM" Runtime file) contained in the "TRAM" folder.

Do not rename any of the files or folders of the TRAM software.

You may download multiple copies of TRAM and run them simultaneously, provided that each "TRAM" folder is located in a different directory.
Do not move the "TRAM" folder while the software is open.
Run the "TRAM" software from a local hard disk.
Do not run the software from a network drive.

If a TRAM analysis aborts unexpectedly, it is advisable to restart it in a fresh TRAM copy.

Use the buttons to navigate between the different sections of the software. The "Back" button takes the user to the last visited layout only (it does not step back through all previously visited layouts). The "Home" button takes the user to the main software screen, from which any layout may be reached.

The TRAM file in the TRAM_HUMAN (or TRAM_MOUSE, TRAM_BRARE if available) species-specific versions is pre-loaded with the latest human (or mouse, or zebrafish, respectively) data for genes, chromosomes, UniGene cluster IDs and all related GenBank Accession Numbers, Expressed Sequence Tags (ESTs).
In addition, the gene identifiers for common commercially available array Platforms, as deposited in Gene Expression Omnibus (GEO), are also available in these pre-setup versions (see section 2.3 for details). If you use any other gene identifier type, please read the "Set up" chapter, section 2.3.

The number of the current National Center for Biotechnology Information (NCBI) Genome Build may be obtained from the site:
http://www.ncbi.nlm.nih.gov/mapview/
by clicking on the organism of interest.
The corresponding genome assembly version used by University of California, Santa Cruz (UCSC) Genome Browser to produce EST localization data may be chosen from the "assembly" menu in the "Table Browser" web page:
http://genome.ucsc.edu/cgi-bin/hgTables?

TRAM_1.3_HUMAN_2017 is the TRAM 1.3 version that is provided pre-loaded, following a complete Set Up process, with 2017 data for H. sapiens. It replaces any previous version of TRAM_HUMAN.
It includes a pre-loaded "gene_aliases.txt" file.
The 38 gene aliases were manually curated.
The 22,454 clone names were extracted from the "Clone Names" section of each "NCBI Gene" record using an awk script followed by a FileMaker Pro script (the parsing procedure is described in section 1.2 c) of this Guide).
The 230,702 GenBank Accessions related to a Gene Symbol were extracted from the "Related sequences" section of each "NCBI Gene" record using an awk script followed by a FileMaker Pro script (the parsing procedure is described in section 1.2 c) of this Guide).
The 185,647 RefSeq Accessions related to a Gene Symbol were extracted from the "NCBI Reference Sequences (RefSeq)" section of each "NCBI Gene" record using an awk script followed by a FileMaker Pro script (the parsing procedure is described in section 1.2 c) of this Guide).
Data for TRAM_HUMAN 2017 are derived from:

              NCBI       NCBI Genome    UniGene Clusters    UCSC EST Localization
              Gene       Build          Release             (NCBI Build)

HUMAN 2017    2017/11    GRCh38         #236                Dec. 2013 (GRCh38)
                         2013/12/24     2013/03             Download 2017/11

A set up process is required every time your experimental model organism is different from human, for which pre-setup versions may be provided. The set up process is described in the following section.

You will find some useful software and documents
in the directory "TRAM_Utilities" at:
http://apollo11.isto.unibo.it/software/TRAM/

- GEO_GSM_Download, a useful tool to automatically download data matching a list of NCBI Gene Expression Omnibus (GEO) samples (GSM) from the GEO database;
- GEO_GPL_Download, a useful tool to automatically download data matching a list of GEO platforms (GPL) from the GEO database;
- a "Protocol" with practical advice on running a meta-analysis with TRAM.


SET UP                                                                      

While the TRAM_HUMAN.zip file contains a pre-setup version ready to analyze human expression data, you may also download an empty TRAM template that can be prepared for the analysis of data from any organism.

Pre-setup versions may be used directly to import and analyze expression data without performing the "Set up" process. However, you may need to perform "Set up" section 2.3 to load additional Platform schemes, if these are necessary to interpret the gene identifiers listed in your expression data file (see below).
The empty TRAM template must instead always be prepared by performing the "Set up" process from the beginning.

Download the "TRAM_1.3.zip" (or "TRAM_1.3_Win.zip") file from:
http://apollo11.isto.unibo.it/software/TRAM/
Following decompression of the file, open the "TRAM" file contained in the "TRAM" folder.

In the "Main" window click "Set Up"; this opens the "Set Up" layout, which contains the first main choices.

Set up is composed of two main parts.
1) Organism-specific Genomic Data
Guided import of data about the chromosomes and genes of the genome of your interest.
2) Gene Identifiers conversion tables
Guided import of conversion tables; these allow the conversion of each gene identifier used in the expression data file to the corresponding gene name.

Note: TRAM, like all FileMaker databases, automatically saves any change, so you will not find any "save" option at the end of the import processes.
For the same reason, any manual data change made after an import replaces the originally imported data, which are then lost.

Set Up - Step definition

Note: if you re-execute a step previously executed, you need to re-execute all subsequent steps maintaining the execution order below.

Each step is described by: type and execution order [time to be completed]; what it is needed for; what happens if it is skipped.

First part - GENES AND GENOME DATA
Collect chromosomes and genes data

I. (Section 1.1) Importing chromosome data
    Type and order:  Needed, 1st [minutes]
    Needed to:       Define number and length of chromosomes
    If skipped:      The software does not work

II. (Section 1.2) Importing gene data
    Type and order:  Needed, 2nd [hours]
    Needed to:       Define genomic coordinates for known genes
    If skipped:      The software does not work

IV. (Section 1.3) Importing EST localization data (if available in UCSC Browser)
    Type and order:  Optional, after III. (UniGene identifiers, required) [hours]
    Needed to:       Define genomic coordinates for unknown genes, i.e. EST (Expressed Sequence Tag) Clusters
    If skipped:      Results can be based only on "known" genes

Second part - GENES IDENTIFIERS
Assign your expression data to chromosomes and genes
Skip this if you use Official Gene Symbols as identifiers

III. (Section 2.1) Importing GenBank/UniGene identifiers conversion table
    Type and order:  Optional, before IV and/or V if they are executed [hours, once the UniGene Tabulator process (0.5 days) has been executed]
    Needed to:       Analyze expression data labeled by any GenBank RNA sequence accession number or UniGene ClusterID
    If skipped:      The expression data cannot be assigned to the corresponding genes via GenBank/UniGene sequence identifiers

V. (Section 2.2) Importing custom identifier conversion table
    Type and order:  Optional, before VI
    Needed to:       Analyze expression data labeled by your custom identifiers
    If skipped:      The expression data can be assigned to the corresponding genes only via standard identifiers

VI. (Section 2.3) Importing Platform conversion table
    Type and order:  Optional, last step
    Needed to:       Analyze expression data labeled by any Platform gene identifiers
    If skipped:      The expression data cannot be assigned to the corresponding genes via Platform identifiers

At the end of Set Up, the user may proceed with expression data file import.

1 Importing data about chromosomes and genes of your organism

TRAM software is designed to create a chromosome set and to assign the gene expression values to the right position within each of them.

The software is optimized to parse "NCBI Gene" data to obtain the necessary localization information, because this database is updated at short intervals and reports accurate gene positions. You may use other sources of data, provided that they are in the format described below (column number and order, file name), so that TRAM works correctly.

TRAM cannot analyse non-chromosomal elements (such as plasmids). Since version 1.2 it is able to map mitochondrial chromosome genes, although mitochondrial genes have not been considered in the TRAM_HUMAN pre-loaded versions.

1.1 Importing data about the chromosome number and length (bp) of a selected organism                           

Note: the maximum number of chromosomes accepted by the software is 25 (autosomes and sex chromosomes included) for the purpose of "horizontal" viewing only; the number is unlimited for all other purposes.

When different types of deposited sequences (e.g., with NC_, AC_ or NT_ code type) are available for the studied organism, choose the NC_ (RefSeq) sequence as the default for each chromosome; in the absence of an NC_ sequence choose the AC_ sequence, if available; select the NT_ sequence only as a last choice.
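
The preference order above (NC_, then AC_, then NT_) can be sketched in a few lines of Python. This is only an illustration of the rule, not part of TRAM; the function name "pick_accession" and the accession strings used to try it are hypothetical.

```python
# Preference order stated in the Guide: NC_ (RefSeq) first, then AC_, then NT_.
PREFIX_PRIORITY = ("NC_", "AC_", "NT_")


def pick_accession(accessions):
    """Return the preferred accession for one chromosome.

    `accessions` is a list of accession strings available for that
    chromosome. Returns None when no accession has a known prefix.
    """
    for prefix in PREFIX_PRIORITY:
        for accession in accessions:
            if accession.startswith(prefix):
                return accession
    return None
```

For example, pick_accession(["NT_000001", "NC_000001"]) returns the NC_ entry, whatever the order of the list.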

The following instructions are also available as a guided procedure within the software in the "Set Up" area.
From the TRAM Home, click on the "Set Up" button - then on "Genomic" - then on "Chromosomes".


a) Prepare the table containing data for each chromosome

For example, you may obtain from "NCBI Genome" the data for each chromosome.
Click on the "Open "NCBI Genome" web site" button.

If you have not already set the "Organism" field in the "Setting Segment" or "Setting Cluster" window, the software will ask you for the "Organism" to search for; please enter its Latin name only (e.g. Human = Homo sapiens).

If appropriate, use the complete species/strain name given in square brackets by the "Entrez Gene" online database (e.g., Saccharomyces cerevisiae S288c).


In the "NCBI Genome" organism-specific displayed page, locate the "Representative" genome information, then click on individual (RefSeq) chromosome entries at the bottom of the page.

Write the resulting data in a standard tabulated text file (.txt), separating each column by a "tab", in the following format. Do not include Column Headers; use the "Name" reported in the Reference genome as the identifier for each chromosome.

[Chromosome]   [Length]       [Organism]    [RefSeq/Genbank#]
1              248,956,422    Homo sapiens  NC_000001
2              242,193,529    Homo sapiens  NC_000002
...

Move the obtained file, named genome.txt, into the "TRAM" folder.
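
If you prefer to generate "genome.txt" with a script rather than by hand, a minimal Python sketch could look like the following. It assumes the four-column, tab-separated, header-less layout shown above. The function name "write_genome_file" and the "thousands_sep" option are hypothetical; the Guide's example shows lengths with thousands separators, so the sketch writes them that way by default, but whether TRAM also accepts plain digits is not stated here.

```python
def write_genome_file(rows, path="genome.txt", thousands_sep=True):
    """Write chromosome rows as a tab-separated text file without headers.

    `rows` is an iterable of (name, length_bp, organism, accession) tuples,
    with length_bp given as an integer.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        for name, length_bp, organism, accession in rows:
            # Format 248956422 as "248,956,422" to mirror the Guide's example.
            length_text = f"{length_bp:,}" if thousands_sep else str(length_bp)
            handle.write("\t".join([str(name), length_text, organism, accession]) + "\n")
```

The two rows of the example above would be passed as [("1", 248956422, "Homo sapiens", "NC_000001"), ("2", 242193529, "Homo sapiens", "NC_000002")], with "path" pointing inside the "TRAM" folder.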

b) Import the file
Click on the "Import genome.txt" button, to automatically import and parse the obtained chromosome data.

At the end, data fields in the table "Chromosomes" will appear as follows:

[Chromosome]    [Length]       [Organism]      [Chr_ID]
chr1            248,956,422    Homo sapiens    1
...

where Chr_ID is a unique progressive number assigned by TRAM
to each chromosome.

Following chromosome data import it is useful to check the "Chromosome" Table (click on the "Chromosomes" link in TRAM page from which you have launched the import of chromosome data, or on the "Chr." button in the TRAM Home).
You may manually edit the chromosome records if necessary, using the "Record" Menu and typing into the appropriate fields.

Note: for organisms with only one chromosome (e.g., prokaryotes), insert the chromosome data manually, as follows:
from TRAM Home, click on the "Chr." button,
then, in the "Chromosome" layout that appears,
create a new record by selecting "New record" from the "Record" menu and type these data into the corresponding fields:

Chromosome:  Chromosome (exactly this word: chromosome).
Length:      Chromosomal length in bp (it can be derived from the corresponding GenBank entry; e.g., 5,498,450 for NC_002695).
Organism:    Organism Latin name (e.g., Escherichia coli). If appropriate, use the complete species/strain name given in square brackets by the "Entrez Gene" online database (e.g., Escherichia coli O157:H7 str. Sakai).
Chr_ID:      1 (exactly this digit: 1).
GenBank #:   GenBank Accession Number (it can be derived from the "Entrez Gene" entries relative to the investigated organism, e.g., NC_002695).

For organisms with only one chromosome do not use "Special" functions to perform TRAM set up.


1.2 Importing localization data for known genes        

The following instructions are also available as a guided procedure within the software in the "Set Up" area.
From the TRAM Home, click on the "Set Up" button - then on "Genomic" - then on "Genes".


a) Download the data for each gene from "NCBI Gene"
                                        
Note: All previously imported data will be deleted.

Note: if the software asks you for the name of the "Organism", please insert this only as Latin name (e.g. Human = Homo sapiens).
If appropriate, use complete species/strain name given in  square brackets by "Entrez Gene" online database (e.g., Saccharomyces cerevisiae S288c).

Click on the "Open "NCBI Gene" web site" button.

The "Entrez Gene" web page will be opened showing all gene data
needed for the specified organism.

In the "NCBI Gene" displayed page (in the web browser),
check that "Current" is selected as "Status" on the left column.


Save the resulting data as follows:
click on the "Send to" link on the right,
choose "File"
(and select "Tabular (text)" format, default order)
then click on the "Create file" button
(do not change the suggested file name).

Move the obtained "gene_result.txt" into the "TRAM" folder.


b) Import the file
Click on the "Import the "gene_result.txt" file" button
to automatically import and parse the downloaded gene data.


IMPORTANT - Do not import the same text file more than once into the TRAM database; download or decompress the file again if you need to repeat the import.

Gene entries without genomic coordinates,
or with the word "Pseudogene" in the "Description" field
(except when in the context "readthrough transcribed pseudogenes" or "gene/pseudogene") will be deleted.
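
The deletion rule just described can be summarized as a small Python predicate. This is a sketch of the rule as worded in this Guide, not TRAM's actual FileMaker script; the function name "keep_gene" is hypothetical, and matching the word "pseudogene" regardless of letter case is an assumption.

```python
# Contexts in which a description containing "pseudogene" is still kept,
# as listed in the Guide. The singular form also matches the plural.
KEPT_CONTEXTS = ("readthrough transcribed pseudogene", "gene/pseudogene")


def keep_gene(description, has_coordinates):
    """Return True if a gene entry would survive the import filter."""
    if not has_coordinates:
        return False  # entries without genomic coordinates are deleted
    text = description.lower()  # assumption: case-insensitive match
    if "pseudogene" not in text:
        return True
    return any(context in text for context in KEPT_CONTEXTS)
```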

At the end, data fields in the table for an RNA transcript will appear as follows:

[Chromosome]    [start site]    [end site]    [Gene symbol]
chr1            67,278,568      67,390,570    WDR78


You may check and freely edit the data in the TRAM table "Genes".

c) Optional - Recommended - Resolve Clone names and RNA Accessions

"NCBI Gene" entries include information about Clone identifiers as well as GenBank Accession Numbers for RNA sequences that are related to a specific locus.
Including this information in TRAM is very useful to resolve Gene Identifiers using this type of information. This section must be executed before step III (Section 2.1).

1. Download the data for each gene from "NCBI Gene" in asn.1 format
                                        
Note: if the software asks you for the name of the "Organism", please insert this only as Latin name (e.g. Human = Homo sapiens).
If appropriate, use complete species/strain name given in  square brackets by "Entrez Gene" online database (e.g., Saccharomyces cerevisiae S288c).

Click on the "Open "NCBI Gene" web site" button.

The "Entrez Gene" web page will be opened showing all gene data
needed for the specified organism.

In the "Entrez Gene" displayed page (in the web browser),
click on the "Current Only" link on the right.


Save the resulting data as follows:
click on the "Send to" link on the right,
choose "File"
(and select "ASN.1" format)
then click on the "Create file" button
(do not change the suggested file name).

Rename the obtained "gene_result.txt" file as
"gene_result.asn1.txt".

Move the obtained "gene_result.asn1.txt" file into the "TRAM" folder.

2. Process the "gene_result.asn1.txt" file using the awk command available in UNIX systems:
- in UNIX, use a shell;
- in Mac OS X, use the "Terminal" application;
- in Windows, use a UNIX emulator.

Seek advice from a UNIX user if needed. There is currently no alternative way to effectively process asn.1 files which include the complete information about Clone names and GenBank Accession Numbers.

Change directory in the UNIX system to reach the directory in which the
"gene_result.asn1.txt" file is located.
Copy the text of the following command (copy exactly, including spaces), then press "Enter":

awk '/  gene {/{getline;print}/      heading "Clone Names",/,/    },/{print}/              accession "NM_|              accession "NR_|              accession "XM_|              accession "XR_/{print}/          heading "mRNA",/{getline;print}' gene_result.asn1.txt >gene_result.Clones.RNAs.txt

NOTE: in some lower organisms, RefSeq gene entries have "NC_" prefix. If you see in the "NCBI Reference Sequences" section of the "NCBI Gene" entries for your organism that RefSeq entries have "NC_" codes, please use this alternative command:

awk '/  gene {/{getline;print}/      heading "Clone Names",/,/    },/{print}/              accession "NC_/{print}/          heading "mRNA",/{getline;print}' gene_result.asn1.txt >gene_result.Clones.RNAs.txt

In either case, the resulting "gene_result.Clones.RNAs.txt" file will be in the "TRAM" folder.
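
To make clearer what the awk command selects, the following Python sketch mirrors its four rules line by line. It is an illustration only: it is not part of TRAM, it has not been validated against TRAM's import, and it is not meant to replace the awk step. The indentation widths (2, 6, 4, 14 and 10 spaces) are as read from the first command above and should be checked against it; "select_lines" is a hypothetical name.

```python
GENE_OPEN = " " * 2 + "gene {"
CLONES_START = " " * 6 + 'heading "Clone Names",'
CLONES_END = " " * 4 + "},"
ACCESSION_TAGS = tuple(
    " " * 14 + 'accession "' + prefix + "_" for prefix in ("NM", "NR", "XM", "XR")
)
MRNA_HEADING = " " * 10 + 'heading "mRNA",'


def select_lines(lines):
    """Return the lines the awk program would print, in order."""
    selected = []
    stream = iter(lines)
    in_clone_block = False
    for line in stream:
        # Rule 1: on a "  gene {" line, move to the next line and print it.
        if GENE_OPEN in line:
            line = next(stream, line)  # at end of input awk keeps the same line
            selected.append(line)
        # Rule 2: print every line from the "Clone Names" heading up to and
        # including the first following line that contains "    },".
        if not in_clone_block and CLONES_START in line:
            in_clone_block = True
        if in_clone_block:
            selected.append(line)
            if CLONES_END in line:
                in_clone_block = False
        # Rule 3: print RefSeq RNA accession lines (NM_, NR_, XM_, XR_).
        if any(tag in line for tag in ACCESSION_TAGS):
            selected.append(line)
        # Rule 4: on an "mRNA" heading line, move to the next line and print it.
        if MRNA_HEADING in line:
            line = next(stream, line)
            selected.append(line)
    return selected
```

As in awk, the patterns are matched anywhere in the line, and a line that satisfies more than one rule is printed more than once.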


3. Import the files

Click on the "Import the "gene_result.Clones.RNAs.txt" file" button to automatically import and parse the downloaded and awk-processed Gene data.

IMPORTANT - Do not import the same text file more than once into the TRAM database; download or decompress the file again if you need to repeat the import.

You may check and freely edit the data in the TRAM table "Gene Aliases", listing all Clone Names and/or GenBank Accession Numbers related to a specific locus according to "NCBI Gene" entries.

If the same Gene Alias (Gene_Alias) turns out to be assigned to different Gene Symbols (Gene_Symbol), the field "Discrepancy Alias vs Symbols" will display "Yes" in all Records with that Gene Alias.
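
The discrepancy check can be pictured with the short Python sketch below. It only illustrates the rule stated above and is not TRAM's own implementation; "flag_discrepancies" is a hypothetical name, and showing an empty string when there is no discrepancy is an assumption (the Guide only says what is displayed when there is one).

```python
from collections import defaultdict


def flag_discrepancies(records):
    """Add a discrepancy flag to (gene_alias, gene_symbol) records.

    Returns a list of (gene_alias, gene_symbol, flag) tuples, where flag is
    "Yes" for every record whose alias is linked to more than one symbol.
    """
    symbols_by_alias = defaultdict(set)
    for alias, symbol in records:
        symbols_by_alias[alias].add(symbol)
    return [
        (alias, symbol, "Yes" if len(symbols_by_alias[alias]) > 1 else "")
        for alias, symbol in records
    ]
```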

1.3  Importing localization data for EST Clusters, if these data are available in "UCSC Genome Browser"

Note: this step is necessary if you wish to analyze the expression data not only for known genes but also for genes so far identified only as a UniGene Cluster (cluster of ESTs, Expressed Sequence Tags).
The genomic coordinates for UniGene Cluster are available for several organisms in the "UCSC Genome Browser" (University of California, Santa Cruz).

Assembly (build) version for the investigated genome in UCSC and NCBI must be the same, in order to use the same reference genome coordinates and successfully integrate localization data from known genes and from ESTs.

The number of the current NCBI Genome Build may be obtained from the site:
https://www.ncbi.nlm.nih.gov/genome/gdv/
by clicking on the organism of interest.
The corresponding genome assembly version used by UCSC Genome Browser to produce EST localization data may be chosen from the "assembly" menu in the "Table Browser" web page:
http://genome.ucsc.edu/cgi-bin/hgTables?


Note: all previously imported EST Clusters data will be deleted.
Note: this step must be performed after the previous "Set up Genes" process (section 1.2) and the UniGene identifiers conversion table import (section 2.1).

a) Download the EST localization data from UCSC "Genome Browser"

The following instructions are also available as a guided procedure within the software in the "Set Up" area.
From the TRAM Home, click on the "Set Up" button - then on "Genomic" - then on "EST Clusters".


Click on the "Open Genome Browser site" button.

Then in the web browser page select:
clade:     your investigated clade (e.g., Mammal)
genome:    your investigated genome (e.g., Human)
group:     "mRNA and EST Tracks"
track:     "ESTs" (if available, otherwise current set up
                   is not possible)
table:     "all_est"
region:    "genome"

output format:         "selected fields from primary
                       and related tables"
output file:           EST.txt
file type returned:    gzip compressed

Click on the "get output" button and select the following fields in the appearing table:
qName
tName
tStart
tEnd

Click on the "get output" button at the bottom of the page.
Once the download of the file "EST.txt.gz" is complete, decompress it and put the resulting "EST.txt" file into the "TRAM" folder.
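
If no decompression utility is at hand, the "EST.txt.gz" file can also be unpacked with a few lines of Python, as sketched below. The function name "decompress_est" and its arguments are hypothetical; only the file names "EST.txt.gz" and "EST.txt" come from this Guide.

```python
import gzip
import shutil
from pathlib import Path


def decompress_est(gz_path, tram_folder):
    """Decompress a gzip file into `tram_folder` as "EST.txt".

    Returns the path of the decompressed file.
    """
    target = Path(tram_folder) / "EST.txt"
    with gzip.open(gz_path, "rb") as source, open(target, "wb") as destination:
        # Stream the data so that large downloads do not have to fit in memory.
        shutil.copyfileobj(source, destination)
    return target
```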

b) Import the file
Click on the "Import "EST.txt" file" button,
to automatically import and parse the obtained UniGene clusters location data file.

At the end, data fields in the table for an RNA transcript will appear as follows:

[Chromosome]    [start site]    [end site]    [ClusterID]
chr1            67,278,568      67,390,570    Hs.49421

You may check the processed data in the TRAM table "EST_Clusters" (from TRAM Home, click on the "ESTs" button, then on the "EST_Clusters" orange button).

EST entries are parsed via their relationship with the "UniGene_ID" table: ESTs belonging to UniGene Clusters are imported into the "EST_Clusters" table, where the localization of each cluster is calculated as the span between the minimum start coordinate and the maximum end coordinate available for that cluster.

To omit incongruent results, the parsing process subsequently imports into the "Genes" table only the unambiguously mapped UniGene clusters. Entries with a chromosome name that does not match any of the chromosome names in the "Chromosomes" table are not considered, nor are clusters whose ESTs map to very distant positions on the same chromosome.

For the latter, TRAM applies a conservative limit of 250,000 bp. In Entrez Gene, for a set of 28,355 human genes (the largest known genes), the mean size was 43,698 bp and the standard deviation 102,616 bp, so the limit is equivalent to considering a size range within the mean plus or minus 2 SD (43,698 + 2 x 102,616 = 248,930 bp as the upper bound; approximately 95% of values in a Gaussian distribution). This correction effectively removes approximately 3,000 transcripts erroneously mapped to regions of several Mb or tens of Mb.

The user retains the possibility to inspect the list of EST clusters with a genomic extension >250 kb that are present in a given chromosome segment, even if they are not considered in the creation of the transcriptome map. For this purpose, click "Go" under the title "Genes Table" in the "Map" result layouts, then click "EST Clusters - Go".
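
The cluster localization and the 250,000 bp filter can be sketched in Python as follows. This is a simplified reading of the procedure described above, not TRAM's actual parsing script. The function and argument names are hypothetical, and treating a cluster whose ESTs fall on more than one known chromosome as ambiguous is an assumption that the Guide does not spell out.

```python
MAX_EXTENT_BP = 250_000  # conservative limit stated in the Guide


def cluster_extents(est_rows, cluster_of, chromosomes):
    """Compute one genomic span per UniGene cluster and filter wide spans.

    est_rows:    iterable of (est_accession, chromosome, start, end)
    cluster_of:  dict mapping EST accession -> UniGene Cluster ID
    chromosomes: set of chromosome names present in the "Chromosomes" table
    Returns a dict: cluster ID -> (chromosome, min_start, max_end).
    """
    spans = {}  # cluster ID -> {chromosome: (min_start, max_end)}
    for accession, chromosome, start, end in est_rows:
        cluster = cluster_of.get(accession)
        if cluster is None or chromosome not in chromosomes:
            continue  # EST outside any cluster, or unknown chromosome name
        per_chromosome = spans.setdefault(cluster, {})
        low, high = per_chromosome.get(chromosome, (start, end))
        per_chromosome[chromosome] = (min(low, start), max(high, end))

    kept = {}
    for cluster, per_chromosome in spans.items():
        if len(per_chromosome) != 1:
            continue  # assumption: multi-chromosome clusters are ambiguous
        chromosome, (low, high) = next(iter(per_chromosome.items()))
        if high - low <= MAX_EXTENT_BP:
            kept[cluster] = (chromosome, low, high)
    return kept
```

Clusters wider than the limit are simply left out of the returned dictionary, in the same way as TRAM leaves them out of the transcriptome map.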

2 Importing gene identifiers conversion data tables

TRAM software is designed to collect expression data files where genes are identified via specific symbols.

The default Gene Identifier used by TRAM is the Gene Symbol (Official or not) found in the "NCBI Gene" database (or, in its absence, the "Gene" abbreviation in the entry header), e.g.:

[Column Headers are not required]

[Gene]     [Expression value]
HBB        160.03
FLJ39609   132.50   

If you have a list of symbols of this type, with the corresponding expression values, you can directly go to "Home" and start to import expression data.

"Gene Name" in TRAM is the best name available for a gene (represented by, in decreasing order: Official Gene Symbol, or the symbol in the "Gene" entry header, or the UniGene Cluster ID).
If the expression data are labeled with gene identifiers/symbols different from Official Gene Symbols or from the names in the "Gene" entry header, TRAM tries to convert any user-provided gene identifier into a Gene Symbol/Gene name. For this purpose, the user has to import the two-column conversion tables listing a gene identifier and the corresponding Gene Symbol.

It is possible to import more than one Identifier Conversion Table. TRAM has an original, powerful system to integrate multiple alternative conversions of gene identifiers.

The following instructions are also available as a guided procedure within the software in the "Set Up" area.
From the TRAM Home, click on the "Set Up" button - then on "Gene ID".

NOTE: TRAM will try to convert the Gene identifiers present in the user expression data files to Gene Symbols/Gene names, following this priority order until a positive match is found:

1) if you set up the "Custom" table as described in section 2.2 of the chapter "Set up", the "Custom" table will be searched first to match Gene identifiers in your data to the corresponding Gene Symbols/Gene names, overriding all other conversions;

2) if no match has been found, then the "Genes" table (mandatorily set up as described in section 1.2 of the chapter "Set up") will be searched to directly interpret Gene identifiers in your data as Gene Symbols/Gene names;

3) if you write a Platform ID code (e.g., GPL... for a GEO Platform) in (at least) the first line of the third column of your data (formatted as described in section 3 of this Guide), a corresponding list of gene identifiers (often a series of progressive numbers) is expected in your data and each will be converted into the corresponding Gene Symbol/Gene name, provided that you previously set up the table for the relevant platform as described in section 2.3 of the chapter "Set up";
for example:

1007_s_at  6.38      GPL96
1053_at    6.65
117_at     6.48
...        ...

[Note - If the first expression value is not in the first row due to the presence of some header lines, please use the very first row anyway in your file to indicate the Platform code, making sure that you are writing it in the third column. If you have only one column in the first row, please press the tabulator key twice then write the Platform code.
Do not insert blank spaces or other characters at the end of the text in a column].

4) if you write the word GeneID in the first line of the third column of your data (formatted as described in the section 3 of this Guide), an "Entrez Gene" Identifier (a progressive number) is expected in your data and it will be converted in the corresponding Gene Symbol/Gene name searching in the "Genes" Table (this has been mandatorily setup as described in section 1.2 of the chapter "Set up");
for example:

780        6.38      GeneID
5982       6.65
3310       6.48
...        ...

[Note - If the first expression value is not in the first row due to the presence of some header lines, please use the very first row anyway in your file to indicate the "GeneID" option, making sure that you are writing it in the third column. If you have only one column in the first row, please press the tabulator key twice then write the "GeneID" word.
Do not insert blank spaces or other characters at the end of the text in a column].


5) if no match has still been found, the "Unigene" Table will be searched to directly interpret Gene identifiers in your data as GenBank Accession Numbers (if you set up this table as described in section 2.1 of the chapter "Set up");

6) if no match has still been found, the "Unigene" Table will then be searched to directly interpret Gene identifiers in your data as UniGene Cluster identifiers (if you set up this table as described in section 2.1 of the chapter "Set up").

When a match is found, the software stops searching the remaining tables.
We suggest using recently released data for each table to be imported into the TRAM software.
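
The priority order listed above can be pictured as a chain of look-ups that stops at the first match. The Python sketch below is only a reading aid: the table names, the dictionary layout and both function names are hypothetical and do not reflect TRAM's internal FileMaker structure.

```python
def read_marker(first_line):
    """Return the text in the third tab-separated column of the first row.

    This is where the Guide asks for a Platform ID code (e.g. a GPL code)
    or the word GeneID. Returns None if the column is absent or empty.
    """
    columns = first_line.rstrip("\r\n").split("\t")
    if len(columns) >= 3 and columns[2]:
        return columns[2]
    return None


def resolve_identifier(identifier, tables, marker=None):
    """Convert one identifier to a Gene Symbol/Gene name, or return None.

    `tables` is a dict holding the (hypothetical) look-up tables:
      "custom", "gene_ids", "genbank", "unigene_clusters": dict id -> name
      "gene_symbols": set of known Gene Symbols/Gene names
      "platforms": dict platform code -> (dict probe id -> name)
    """
    # 1) The "Custom" table overrides all other conversions.
    custom = tables.get("custom", {})
    if identifier in custom:
        return custom[identifier]
    # 2) The identifier may already be a Gene Symbol/Gene name.
    if identifier in tables.get("gene_symbols", ()):
        return identifier
    # 3) and 4) Platform probe identifiers or Entrez GeneIDs, depending on
    # what was written in the third column of the first row.
    if marker == "GeneID":
        name = tables.get("gene_ids", {}).get(identifier)
        if name:
            return name
    elif marker is not None:
        name = tables.get("platforms", {}).get(marker, {}).get(identifier)
        if name:
            return name
    # 5) GenBank Accession Numbers, then 6) UniGene Cluster identifiers.
    for table_name in ("genbank", "unigene_clusters"):
        name = tables.get(table_name, {}).get(identifier)
        if name:
            return name
    return None
```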

2.1 Conversion of Sequence accession numbers to Gene Symbols            

If you have labeled your expression data values with sequence identifiers, you will have to generate and import the complete UniGene identifiers data table for your organism. This table matches any GenBank Accession Number for a transcript (RNA, EST) to the known Gene Symbol, when available, or, as a second choice, to the corresponding UniGene Cluster ID, if one exists.

Note: this process has already been performed (update: Dec. 2017) for the provided pre-setup Homo sapiens version of TRAM.

Note: In order to keep the data updated, all data previously imported in this table will be deleted during a new import.
This step must be done before the import of EST localization data (section 1.3) and/or Platform data (section 2.3), if one of these import processes is performed.

a) Prepare a table containing four columns, separated by a tabulator, relating each GenBank Accession Number to the respective UniGene Cluster ID, Gene Symbol (when available) and GenBank Identifier (GI) (if desired), e.g.:

[Column Headers are not required]

[GenBank      [UniGene      [Gene Symbol]     [GenBank GI
 Accession]    Cluster ID]                     Identifier] 
 
 AF117710      Hs.523443     HBB               4378803

To do this, we suggest importing the default output file of the "UniGene Tabulator" (version 1.1 or later) software, a tool able to parse the whole UniGene database for an organism.

The following instructions are also available as a guided procedure within the software in the "Set Up" area.
From the TRAM Home, click on "Set Up" button - then on "Gene ID" - then on "Sequence IDs".


Click on the "Open "UniGene Tabulator" site" button:
http://apollo11.isto.unibo.it/software/UniGene_Tabulator/
your default internet browser will show the software download page.
Download the current version of the software for your OS.
Please, follow the instructions in the UniGene Tabulator User Tutorial to automatically parse UniGene data for the organism of your interest. Please, note that for TRAM purpose, it is not necessary to import the UniGene library data file into UniGene Tabulator.
At the end of the process, the file "UniGene.tab" will be automatically created in the "UniGene Tabulator" folder. A message will alert the user about the availability of the "UniGene.tab" file at the end of the process. This file contains a useful code conversion among: GenBank Accession Number, UniGene cluster ID and Gene Symbol.

The parsing process may take several hours to complete, depending on the amount of data available for the selected organism.

At the end of the process the file "UniGene.tab" will appear on your desktop.

b) How to import the UniGene tabulated data file in TRAM

Move the "UniGene.tab" file into the TRAM folder.
Click on the "Import the "UniGene.tab" file" button to import the data into the respective "UniGene_ID" database table.


Note: All previously imported data will be deleted.


This step is necessary to use either GenBank Accession Numbers or UniGene Cluster IDs as gene identifiers.


The GenBank Accession Number must lack the version of the sequence, which, if present, is separated from the main number by a full stop mark (e.g., use AK125137 and not AK125137.1).
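
As an illustration of the rule above, the following is a minimal Python sketch that removes a trailing version suffix from an accession number before the data are handed to TRAM. The function name is hypothetical and is not part of TRAM or UniGene Tabulator; it only drops the part after the last full stop when that part is purely numeric.

```python
def strip_accession_version(accession: str) -> str:
    """Return the accession without a trailing '.<version>' suffix."""
    text = accession.strip()
    base, dot, version = text.rpartition(".")
    # Drop the suffix only when it is purely numeric, e.g. "AK125137.1".
    if dot and version.isdigit():
        return base
    return text


print(strip_accession_version("AK125137.1"))
print(strip_accession_version("AK125137"))
```

Identifiers without a numeric version suffix are returned unchanged.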

NOTE: the field "Is_in_NCBI Gene" is filled with the "Gene Symbol" itself if the "Gene_Symbol" for that UniGene Cluster is found in the "NCBI Gene" database. By default this information is not displayed.
You can display it by first simply clicking into the "Gene_Symbol" field of any one of the records of the TRAM "UniGene" table, then executing these two commands in succession from the "Records" Menu:
- Show All Records
- Relookup Field Contents...

The execution of the Relookup could take some time.

The complete list of the "Gene Symbols" imported from the "NCBI Gene" database during the "Set Up - Genes" process may be accessed by going to the TRAM "Genes" table (e.g., clicking on the "Genes" button in the TRAM Home), then clicking on the "All Symbols" button.

IMPORTANT: If you perform this step after importing the gene identifiers for a Platform (section 2.3), you have to run the import and analysis of Platform data as well as of sample expression data again, because the conversion of the identifiers to the matching gene symbols may have changed.

Quality control. All imported records should have a value in the fields UniGene_ID (UniGene cluster identifier) and GenBank_AN (GenBank Accession Number). At the end of the import of UniGene.tab file into TRAM, you may search for records with empty "UniGene_ID" or "GenBank_AN" field [to do this, go to the "UniGene" table of TRAM, press "Find" on the window top bar and then type "=" (without quotes) in the "UniGene_ID" or "GenBank_AN" field]. If you find one or more records without a UniGene_ID or a GenBank_AN, you may manually fill in the missing values, after obtaining them by searching for the GenBank Accession Number with an empty UniGene_ID (or for the UniGene_ID with an empty GenBank Accession Number, respectively) at the address:
http://www.ncbi.nlm.nih.gov/sites/entrez?db=unigene
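
A similar check may also be run on the tabulated file itself before the import. The sketch below is only an illustration, not a TRAM function: it assumes the column order shown in step a) (GenBank Accession Number first, UniGene Cluster ID second), and the function name is hypothetical.

```python
import csv


def rows_missing_ids(lines):
    """Yield (row_number, row) for rows lacking an accession or a cluster ID."""
    reader = csv.reader(lines, delimiter="\t")
    for number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip completely empty lines
        row = row + [""] * (4 - len(row))  # pad short rows to four columns
        accession, cluster_id = row[0].strip(), row[1].strip()
        if not accession or not cluster_id:
            yield number, row


sample = [
    "AF117710\tHs.523443\tHBB\t4378803",
    "AK000001\t\t\t",
]
for number, row in rows_missing_ids(sample):
    print(number, row)
```

To check a real file, pass an open file handle (for example `open("UniGene.tab", newline="")`) instead of the `sample` list.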

2.2 Importing Custom identifiers conversion table
(Back to Index)

For expression data values related to personal "custom" gene identifiers, with the correspondence between gene/probe identifiers and gene symbols established by the user, the user has to import the Custom identifiers data table(s).

The following instructions are also available as a guided procedure within the software in the "Set Up" area.
From the TRAM Home, click on the "Set Up" button - then on "Gene ID" - then on "Custom IDs".

a) Prepare a table containing 2 columns, separated by a tab,
   with one row for each of your custom identifiers, e.g.:

[Column Headers are not required]

[ID]  [Official Gene Symbol/Gene name]   
My_1  HBB   
My_2  FLJ39609

Save the data in text format.

b) Import the table

Click on the "Import the custom file" button to import the custom table into the "Custom_ID" TRAM database table.
It is possible to subsequently import additional custom tables for conversion of other identifiers. The conversion specified in the "Custom_ID" TRAM database table will override any other conversion.

NOTE: the "Custom" table will be used with maximum priority to resolve both gene identifiers listed in expression data files (section 3) and gene identifiers listed in Platform data files (section 2.3).

2.3 Importing gene probe identifiers for a Platform 
(Back to Index)

This step is necessary to use gene probe IDs as gene identifiers for a particular array Platform registered in the GEO (Gene Expression Omnibus) online database or otherwise available.

In order to relate the expression data values to Platform identifiers, the corresponding identifiers data table(s) must be imported.

The following instructions are also available as a guided procedure within the software in the "Set Up" area.
From the TRAM Home, click on the "Set Up" button - then on "Gene ID" - then on "Platform IDs".


a) Alternative option: if the expression data you are going to analyze are derived from the GEO database, locate the GEO Platform data for the platforms of your interest by searching for a Platform (e.g., GPL96) in the "accession" field of the web page "Accession Display":
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi

Details about the GEO database can be found at:
https://www.ncbi.nlm.nih.gov/geo/
please read the GEO "Overview" section at:
https://www.ncbi.nlm.nih.gov/geo/info/overview.html

On the bottom of the resulting Platform description Web page,
click on the "Download full table..." button
and save the file,
or click on the "View full table" button
and save the resulting Web page as a text file.

If neither of these two options is available, please click on the link "SOFT formatted family file(s)" to download the Platform description file in .soft format.
Manually change the file extension ".soft" into ".txt".

You may also obtain platform (.adf) text files from ArrayExpress database:
https://www.ebi.ac.uk/arrayexpress/

GEO_GPL_Download is a useful tool to automatically download data matching a list of GEO platforms (GPL) from the GEO database (Gene Expression Omnibus).
It is distributed along with TRAM in the directory "TRAM_Utilities" at:
http://apollo11.isto.unibo.it/software/TRAM/.
Requirements: any operating system (Linux, Mac OS X, Windows, ...) with Python 2 or Python 3 and IDLE.


b) Alternatively, you may use a platform data file from any source, provided that you have at least two columns of data (in tabulated text format):
- the list of gene identifiers (ID), describing
  the genes included in that experimental platform;
  this column must have the header (first row):
  "ID" (without quotes);
- the corresponding GenBank Accession Number or Gene Symbol.

For example:

ID              [GB_ACC]  [Gene Symbol]   
1007_s_at       U48705    DDR1   

[Column Headers are not required, except for the ID header]

Further data in the column, e.g. Web addresses, will be automatically ignored by TRAM.

c) Import the Platform data file (tabulated text) in TRAM.
   From the TRAM Home, click on the "Set Up" button -
   then on "Gene ID" - then on "Platform IDs".

   Click on the "Import the Platform data file" button.
  
You will then be guided to locate these columns:
ID          (the Platform ID for the probe)
GB_ACC      (the GenBank Accession Number for the probe
                sequence, when available, or alternatively
                the GenBank GI code, or as a last option
                the RefSeq (NM_) code)
Gene symbol (the official Gene Symbol, when available)

Further data in the column, e.g. Web addresses or additional GenBank Accession Numbers following the first one, will be automatically ignored by TRAM.

NOTE - If the column type is not clear by simple inspection of the column content within the first rows, please scroll down the window to evaluate further records (rows) that could clarify if that column contains sequence accession numbers and/or gene symbol identifiers.

Platform data will be imported into the "Platform_ID" TRAM database table.
You will be requested to assign a unique code to each Platform after its import.
At the end of the import, you may delete the original Platform data file.

In addition, a text file with the processed Platform data is automatically created in the "Platform" folder within the "TRAM" folder. This file is automatically named: "GPL...", where (...) is the code you assigned to the platform. The prefix "GPL" is used regardless of whether the platform originates from GEO. These files could be useful if you subsequently need to execute a batch platform import (section "Special" from the TRAM Home) in another copy of the TRAM software, provided that you rename them as GPL1.txt, GPL2.txt and so on.

TRAM will first try to use the GenBank Accession to relate the sequence to the corresponding updated Gene Symbol (when available); alternatively, the "Gene Symbol", as provided in the data file, will be used.

If the same GenBank Accession Number (GenBank_AN) is eventually assigned to different Gene Names, the field "Discrepancy GenBank vs Gene_Names" will display "Yes" in all Records with that GenBank Accession Number.

In particular, TRAM 1.3 will try to assign the Gene Name to each platform gene identifier using various sources in this priority order (the particular source used for each Probe ID may be found in the "Gene_Name Source" field of the "Platform Identifiers conversion table", briefly "Platform"):

01. Gene symbol or name obtained from Custom Table via the "ID" Probe Identifier (all data provided by the user and loaded in the "Custom" TRAM table);

02. Gene symbol or name obtained from "NCBI Gene" via the GenBank Accession (GenBank_AN field) originally provided by the Platform scheme available online for the gene probe and assigned by "NCBI Gene" (whose data are processed and imported during TRAM Set Up process in the TRAM "Gene Aliases and RNA data table") to a specific locus;

03. Gene symbol or name obtained from "gene_aliases.txt" file (a file manually created by the user with two columns linking an alias to a Gene Name and imported in TRAM during the Set Up process) via the Gene Symbol (Gene_Symbol field) originally provided by the Platform scheme available online for the gene probe;

04. Gene symbol or name obtained from "NCBI Gene" Aliases (whose data are processed and imported during TRAM Set Up process in the TRAM "Gene Aliases and RNA data table") via the Gene Symbol (Gene_Symbol field) originally provided for the gene probe;

05. Gene symbol or name obtained from "NCBI Gene" Clone Names (whose data are processed and imported during TRAM Set Up process in the TRAM "Gene Aliases and RNA data table") via the Gene Symbol (Gene_Symbol field) originally provided by the Platform scheme available online for the gene probe;

06. Gene symbol as provided by the Platform scheme available online, as long as it is present in "NCBI Gene";

07. UniGene Identifier obtained from "NCBI UniGene" (whose data are processed and imported during TRAM Set Up process in the TRAM "UniGene Identifiers conversion table", in brief: "UniGene" table) via the GenBank Accession (GenBank_AN field) originally provided by the Platform scheme available online for the gene probe;

08. UniGene Identifier obtained from "NCBI UniGene" (whose data are processed and imported during TRAM Set Up process in the TRAM "UniGene Identifiers conversion table", in brief: "UniGene" table) via the GenBank Identifier (GenBank_GI field) originally provided by the Platform scheme available online for the gene probe;

09. Gene symbol just as provided by the Platform scheme available online;

10. UniGene Identifier related to the GenBank Accession, if it is not matched to a locus;

11. GenBank Accession Number (or GenBank GI) as provided by the Platform scheme available online.

In brief, sources for Gene Name are:
                                           
01. Custom Table,
    via Probe Identifier                       
02. "NCBI Gene",
    via GenBank Accession in "Aliases and RNA Table"           
03. Custom file "gene_aliases.txt",
    via Platform Gene Symbol                      
04. "NCBI Gene" Aliases,
    via Platform Gene Symbol in "Aliases and RNA Table"   
05. "NCBI Gene" Clone Names,
    via Platform Gene Symbol in "Aliases and RNA Table"
06. Gene Symbol provided by Platform
    and present in "NCBI Gene"           
07. UniGene,
    via GenBank Accession                           
08. UniGene,
    via GenBank Identifier                           
09. Gene Symbol as provided by Platform                       
10. UniGene Identifier used as Symbol                          
11. GenBank Accession used as Symbol                       
12. [None]                                       

If the first 11 steps give negative results, no name will be assigned and the gene will not be analyzed further.
Please note that the priority order in TRAM 1.3 has changed compared with previous versions of TRAM: UniGene for Homo sapiens, the previous priority source, is no longer updated by NCBI, so "NCBI Gene" is now used as a reliable and updated source of Gene Names related to a GenBank Accession Number.
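
The "first match wins" logic of the priority list above can be illustrated with a short Python sketch. This is not TRAM's internal code (TRAM is a FileMaker-based database): the function name, the dictionaries and the three example sources are hypothetical stand-ins, shown only to make the ordering explicit.

```python
def resolve_gene_name(probe, sources):
    """Try each (label, lookup) pair in order; the first non-empty result wins."""
    for label, lookup in sources:
        name = lookup(probe)
        if name:
            return name, label
    return None, "[None]"


# Hypothetical in-memory stand-ins for three of the sources listed above.
custom_table = {"My_1": "HBB"}
ncbi_gene_by_accession = {"U48705": "DDR1"}

sources = [
    ("Custom Table", lambda p: custom_table.get(p["id"])),
    ("NCBI Gene via GenBank Accession",
     lambda p: ncbi_gene_by_accession.get(p["gb_acc"])),
    ("Gene Symbol as provided by Platform", lambda p: p["symbol"]),
]

probe = {"id": "1007_s_at", "gb_acc": "U48705", "symbol": "DDR1"}
print(resolve_gene_name(probe, sources))
```

The label returned together with the name plays the role of the "Gene_Name Source" field: it records which source produced the assignment.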

From the TRAM Home, click on the "Genes" button - then on the "Alias" button. This will lead to the "Gene Aliases" Table.
Since version 1.1 (2013), TRAM is able to resolve any Gene Alias resulting from the above described process, converting each alternative gene symbol (alias), if included in the section "Other Aliases" of the "NCBI Gene" (formerly "Entrez Gene") record for that gene, to the corresponding "Gene Symbol" (for eukaryotes only).
The user can also place a file named "gene_aliases.txt" in the TRAM folder listing additional aliases to be resolved, not included in the "NCBI Gene" record, in tabulated text format:
first column, gene symbol alias; second column, gene symbol.
If this file is found in the TRAM folder during the execution of the "Set Up" - "Genes" process, this file will be automatically imported and processed.
Clone Names
The user can place a file named "gene_clone_names.txt" in the TRAM folder listing Clone Names to be resolved, in tabulated text format:
first column, clone name; second column, gene symbol.

In TRAM 1.3, this file is automatically generated by parsing "NCBI Gene" entries during the "Set Up - Genes" process (see section 1.2 c in this Guide).

In any case, if this file is found in the TRAM folder during the execution of the "Set Up" - "Genes" process, it will be automatically imported and processed.

Repeat the Platform import process for any desired Platform.

From the "Platform" TRAM table, you may click on the "Platforms Summary" button, which will take you to a summary table of the data about each Platform. The button "Show Identifiers" associated to each Platform record will show all Identifiers of the relative Platform.

A file with formatted platform data ready to be imported in TRAM will also be created at the end of each guided platform import. This is useful for any subsequent possible use of the "Special" batch unsupervised platform import function described below.

Clicking on "Special" in the TRAM "Home" window will allow the user to start a batch import of large pools of Platform data without user intervention.
To this aim, prepare all the files with Platforms data as described, name them GPL1.txt, GPL2.txt, ... and put them within the "Platform" folder of the main directory of TRAM.
In this case, a fourth column must be added at least in the first row, with the code identifying the Platform whose data are present in the file (e.g., GPL96):

[Column Headers are not required]

[ID]            [GB_ACC]  [Gene Symbol]    [Platform]  
1007_s_at       U48705    DDR1             GPL96
1053_at         M87338    RFC2
...             ...       ...
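
A file in this layout can be produced with a few lines of Python. The sketch below is a hedged example, not a TRAM utility: the function name is hypothetical, and the choice of UTF-8 encoding and of "\n" line endings is an assumption that may need to be adapted to your system.

```python
import csv


def write_batch_platform_file(path, platform_code, rows):
    """Write (probe_id, gb_acc, gene_symbol) rows as tab-separated text.

    The platform code is added as a fourth column in the first row only.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for index, (probe_id, gb_acc, symbol) in enumerate(rows):
            row = [probe_id, gb_acc, symbol]
            if index == 0:
                row.append(platform_code)
            writer.writerow(row)


write_batch_platform_file(
    "GPL1.txt",
    "GPL96",
    [("1007_s_at", "U48705", "DDR1"), ("1053_at", "M87338", "RFC2")],
)
```

The resulting "GPL1.txt" file would then be placed in the "Platform" folder as described above.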

You will be asked to choose whether to delete or not the previously imported Platform data.
   
---------
IMPORTANT - To interpret the identifiers in your gene expression data file as Platform IDs for the corresponding Platform that you have set up, remember to write the Platform code (e.g., GPL... for a GEO Platform) in the third column of your expression data file, at least in the first row, so that your expression data file will contain three columns separated by one tabulator, in this format:

Expression data file

[Column Headers are not required]

[Gene ID]  [Value]   [Platform]
1007_s_at  6.38      GPL96  
1053_at    6.65
117_at     6.48

...        ...

The Platform code in the third column will allow TRAM to link the gene identifiers to the corresponding Platform.
---------

The platforms pre-loaded in TRAM_HUMAN are listed in the Appendix at the bottom of this file.


USE                                                                            
(Back to Index)

A protocol for the execution of meta-analysis by the TRAM software is available along with the TRAM 1.3 version ("TRAM_Meta_Analysis_Protocol_2017" file, located in the "TRAM Utilities" directory of the TRAM web site).

While TRAM_HUMAN.zip file contains a pre-setup version ready to analyze expression data from human organism, you may also download an empty TRAM template that may be prepared for the analysis of data from any organism.
 
Pre-setup versions may be directly used to import and analyze expression data without performing the "Set up" process. However, the user might need to perform the "Set up" section 2.3 to load additional Platform schemes if necessary to interpret the gene identifiers listed in their expression data file (see below).
Conversely, the empty TRAM template must always be prepared by performing the "Set up" process from the beginning (section 1).

Note - Data saving
TRAM, as any FileMaker-based database, automatically saves any changes, so you will not find any save options at the end of the import processes.
After the import processes, avoid any manual data change that may cause the loss of the original imported data.


Note - Advanced use
You may open the program files using your copy of FileMaker 12 or later, thus becoming fully able to make any modification to the software.
In this case, do not open the program using the "TRAM" file, but open, within FileMaker Pro, the file "TRAM.TMA" instead.
Following modifications, the correct functioning of the program requires its re-launch by "TRAM" runtime, due to data pathway structure stored in the "TRAM" Scripts.

To cancel a TRAM operation before it is completed (not recommended):
Press Command-period keys (Mac OS X)
or Esc (Windows).

It is possible to compare two different biological conditions, importing one as the A sample (or sample pool), and the other as the B sample (or sample pool) to be compared to A.

Switching between TRAM database tables may be done by clicking on the relevant buttons present in each layout.

3. Importing the expression data files
(Back to Index)

The user is responsible for the homogeneity or comparability of the data to be imported in terms of: biological sample, microarray platform (although inter-sample normalization methods are provided), and spot quality filtering/data pre-processing.
The software will map the imported values along the chromosomes, but it can't check the validity of the experimental design.

A protocol for the execution of meta-analysis by the TRAM software is available along with the TRAM 1.3 version ("TRAM Meta-Analysis Protocol 2017" file).

Each series of data related to a "Sample" is defined as a "distinct biological sample". For example, in a two-channel experiment a sample should be a single channel, with the data of each channel imported as a distinct data file.

Be sure that your system default format uses
"." (full stop mark)
as a decimal separator (English standard).

See below how to check and change the setting if necessary.

IMPORTANT. The expression data file must be a tabulated (tab-delimited) text file containing two columns separated by a TAB character (tabulator key, ASCII9).

First ("left") column: Gene probe identifier:
   Official Gene Symbols/"NCBI Gene" names (default);
   or, if the relevant conversions have been set up:
   Custom identifiers, or
   Platform Identifiers or
   GenBank Accession Numbers.

Second column: numerical expression value.
Use "." as a decimal separator
(and do not use a thousand separator).
Be sure that your system default format uses
"." (full stop mark)
as a decimal separator (English standard).
If this is not the case, you must change the system setting.

Mac OS X:
in "System Preferences" (from the "Apple" Menu),
click on "International", then on "Formats",
then choose as "Region" a country with the English standard format for numbers (full stop mark as a decimal separator).
System restart or user logout is not required to make the change effective.
Windows: in "Control Panel" (from the "Start" Menu),
click on "International options" then modify the format of numbers choosing a country with the English standard format for numbers (full stop mark as a decimal separator).
System restart or user logout is not required to make the change effective.

The expression value is usually the pre-processed intensity value, i.e. the value assigned to the spot as it has been processed by the software of the specific experimental platform used (for instance following background subtraction for a microarray spot).
Scientific notation is supported in the format, for example, 20E-2.
TRAM considers the expression values as linear data, and not logarithm-transformed data. If necessary, log-transformed data should be back-transformed before importing them in TRAM. TRAM can back-transform log-transformed values (in base 2, 10 or e) if the user prepares data using the "Help with data" utility (see below).
Ratio values (e.g., ratio between two microarray channels) are not admitted in TRAM.
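
For users who prefer to back-transform log-transformed values themselves before the import, the arithmetic is simply the base raised to the stored value. The following Python sketch is an illustration under that assumption; the function name is hypothetical and does not reproduce the "Help with data" utility.

```python
import math


def back_transform(value, base):
    """Return the linear value for a log-transformed value.

    base must be 2, 10 or the string "e" (natural logarithm).
    """
    if base == "e":
        return math.exp(value)
    if base in (2, 10):
        return float(base) ** value
    raise ValueError("base must be 2, 10 or 'e'")


print(back_transform(3, 2))   # log2 value 3 corresponds to linear value 8.0
print(back_transform(2, 10))  # log10 value 2 corresponds to linear value 100.0
```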

GEO_GSM_Download is a useful tool to automatically download data matching a list of GEO samples (GSM) from the GEO database (Gene Expression Omnibus).
It is distributed along with TRAM in the directory "TRAM_Utilities" at:
http://apollo11.isto.unibo.it/software/TRAM/.
Requirements: any operating system (Linux, Mac OS X, Windows, ...) with Python 2 or Python 3 and IDLE.

When the pre-processed expression values are not available, the user may consider the background (BKD) median as the median of the pixel intensities in the area surrounding the spot, and the feature (spot) median as the median of the pixel intensities in the area inside the spot. The spot intensity may be then calculated by subtracting the background median value from the feature median value and used as the expression value for the corresponding gene.
Clicking on the "Help with data" button in the TRAM "Home" window will allow the user to be interactively assisted in the preparation of text files of the required format, including calculation of the spot intensity by subtracting the background value from the spot value (see below for details).
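
The spot intensity calculation described above (feature median minus background median) can be sketched in Python as follows. This is only an illustration of the arithmetic; the function name and the pixel lists are hypothetical, and the sketch does not reproduce the "Help with data" utility.

```python
from statistics import median


def spot_intensity(feature_pixels, background_pixels):
    """Return the feature (spot) median minus the background (BKD) median."""
    return median(feature_pixels) - median(background_pixels)


# Hypothetical pixel intensities inside and around one spot.
print(spot_intensity([120, 130, 150], [20, 30, 25]))
```

When the background median exceeds the feature median, the result is zero or negative; such values fall under the rule for values ≤0 described in this section.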

Third column [optional]: Platform code (e.g., GPL96),
it is needed only in the first row.

IMPORTANT - To interpret the identifiers in your gene expression data file as ID for the relative Platform, you must previously have set up the corresponding Platform as explained in section 2.3 of this Guide. Some Platforms are pre-setup as described in the same section.

Example:

[Column Headers are not required]

[Probe ID] [Value]   [Platform]     
1007_s_at  6.38      GPL96  
1053_at    6.65
117_at     6.48

...        ...

Note: If the first expression value is not in the first row due to the presence of some header lines, please use the very first row anyway in your file to indicate the Platform code, making sure that you are writing it in the third column. If you have only one column in the first row, please press the tabulator key twice then write the Platform code.
Do not insert blank spaces or other characters at the end of the text in a column.
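
Before the import, the format rules listed in this section can be checked with a small script. The Python sketch below is a hedged example and not part of TRAM: the function name is hypothetical, it assumes that the file has no header lines, and it treats an empty value as acceptable because absent values are simply not considered by TRAM.

```python
def check_expression_lines(lines):
    """Return a list of (line_number, problem) for rows that break the format."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            problems.append((number, "fewer than two tab-separated columns"))
            continue
        if any(field != field.strip() for field in fields):
            problems.append((number, "leading or trailing blank in a column"))
        value = fields[1].strip()
        if value == "":
            continue  # absent value: allowed, the probe is just not used
        try:
            float(value)  # accepts 6.38 and 20E-2, rejects 6,38
        except ValueError:
            problems.append((number, "value is not a number with '.' as decimal separator"))
    return problems


sample = ["1007_s_at\t6.38\tGPL96", "1053_at\t6,65", "117_at \t6.48"]
for number, problem in check_expression_lines(sample):
    print(number, problem)
```

To check a real file, pass an open file handle instead of the `sample` list.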

If you use the GenBank Accession Numbers as identifiers, please do not append the version of the sequence to the GenBank identifier, i.e. use AB123456 and not AB123456.1.

Management of absent/negative/zero values

Probes whose expression value is absent (i.e. empty, not available) will not be further considered by TRAM for the construction and analysis of the maps, assuming that an expression level has not been measured.

Sample expression values equal to or lower than "0" (≤0) will be thresholded to 95% of the minimum positive value present in that sample, in order to obtain meaningful numbers when dividing "Samples Pool A" values by "Sample Pool B" values.
Assuming that in these cases the expression level is too low to be detected under the experimental conditions used, this transformation still makes it possible to obtain a ratio between values in pool A and values in pool B, which is useful to highlight differential gene expression.
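
The thresholding rule can be illustrated with a short Python sketch. It is not TRAM's internal code: the function name is hypothetical, absent values are represented here as None, and raising an error for a sample without any positive value is an assumption made for this example.

```python
def threshold_non_positive(values, fraction=0.95):
    """Replace values <= 0 with `fraction` times the minimum positive value.

    Absent values (None) are left untouched.
    """
    positives = [v for v in values if v is not None and v > 0]
    if not positives:
        raise ValueError("the sample has no positive value to derive a floor from")
    floor = fraction * min(positives)
    return [v if v is None or v > 0 else floor for v in values]


# The minimum positive value is 2.0, so 0.0 and -1.5 are replaced by 95% of 2.0.
print(threshold_non_positive([4.0, 0.0, -1.5, None, 2.0]))
```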

Expression values assigned to unmapped genes (without known genome coordinates) will be normalized and it will be possible to browse through them in the "Values_A_B_All" layout, but they will not be used in the construction and analysis of the maps.
From the "Values_A_B" layout, the "A/B (unmapped)" button leads to the "Values_A_B_All" layout.

Import utilities

The user must provide TRAM with one or more expression data files with at least two columns: Gene/Probe ID and its corresponding numerical expression value. To prepare the files in this format, you may use any word processor or spreadsheet program and save the file in tabulated text format.

To simplify the extraction of the relevant columns from any available tabulated text file providing expression data, generated by the user's experimental platform or publicly available from any online source, the TRAM internal utility "Help with data" can be used by pressing the relevant button in the TRAM Home.

IMPORTANT - To interpret the identifiers in your gene expression data file as ID for the relative Platform, you must have previously set up the corresponding Platform as explained in section 2.3 of this Guide. Some Platforms are pre-setup as described in the same section.

Clicking on the "Help with data" button in the TRAM "Home" window will allow the user to be interactively assisted in the preparation of text files of the required format. The user will be guided to import their data file and to select the two columns containing gene identifiers and expression values. A Platform code must be indicated if the gene/probe identifiers are not the standard gene symbols and they need to be converted into gene symbols using Platform data loaded in TRAM (see section 2.3).
Finally, the software asks the user to save the data, generating a text file suitable to be imported in TRAM. The user may choose the desired file name.
If the user plans to import expression data files using "Batch Import" mode of feeding the database, the text files must be saved with a name of the type A1.txt, A2.txt ... (in the TRAM folder "Batch_Import_A") or B1.txt, B2.txt ... (in the TRAM folder "Batch_Import_B").
Batch processing of a sample series: it is possible to prepare in batch mode a series of sample data files related to the same work, obtained with the same Platform and formatted in an identical way.
Put all the files to be processed in the "Series" folder located in the "TRAM" folder, naming them S1.txt, S2.txt and so on.
From the TRAM "Home", click on the "Help with data" button and then on the "Data file batch processing" button.
Locate the "ID" and "Value" columns when requested for the first sample.
Insert the name of the Platform when requested.
TRAM will then automatically process all the files located in the "Series" folder using the same criteria, generating a series of uniformly processed data files with names such as P1.txt, P2.txt and so on. These files may be moved to the "Batch_Import_A" or "Batch_Import_B" folders to be automatically imported by TRAM using the "Batch mode" import buttons in the TRAM "Home", after renaming them with names such as A1.txt, A2.txt ... or B1.txt, B2.txt ..., respectively.
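
The renaming step described above may be automated. The Python sketch below is a hedged example, not a TRAM utility: the function name is hypothetical, and the folder containing the P1.txt, P2.txt ... files must be supplied by the user, because its location depends on the local TRAM installation. Files are copied rather than moved, so the processed originals are preserved.

```python
import re
import shutil
from pathlib import Path


def copy_processed_files(source_dir, target_dir, pool_letter):
    """Copy P<n>.txt files to target_dir as <pool_letter><n>.txt (e.g. A1.txt)."""
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    numbered = []
    for path in source.glob("P*.txt"):
        match = re.fullmatch(r"P(\d+)\.txt", path.name)
        if match:  # ignore other files that merely start with "P"
            numbered.append((int(match.group(1)), path))
    copied = []
    for number, path in sorted(numbered):
        new_path = target / f"{pool_letter}{number}.txt"
        shutil.copyfile(path, new_path)
        copied.append(new_path.name)
    return copied
```

For example, `copy_processed_files("path/to/processed", "path/to/TRAM/Batch_Import_A", "A")` would create A1.txt, A2.txt and so on; both paths are placeholders to be replaced with the actual folders.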


Clicking on the "Special" button in the TRAM "Home" window will allow the user to automatically perform batch data import of large pools of samples for both A and B Pools in succession, provided that the expression data files have been prepared in the required format (possibly using the "Help with data" utility) and have been saved in the TRAM folder "Batch_Import_A" (with names such as A1.txt, A2.txt ...) and in the TRAM folder "Batch_Import_B" (with names such as B1.txt, B2.txt).

Clicking on the "Export" button in the TRAM "Home" window will assist the user in the export of the (raw or normalized) imported data.

The following instructions are also available as a guided procedure within the software in the appropriate "Set Up" area ("Set up - Part 2 - Gene Identifiers conversion tables").

NOTE: TRAM will try to convert the Gene identifiers present in your expression data files to Gene Symbols/Gene names until a positive match is found, with the following priority order:

1) if you set up the "Custom" table as described in section 2.2 of the chapter "Set up", the "Custom" Table will be first searched to match Gene identifiers in your data to the corresponding Gene Symbols/Gene names, overriding all other conversions;

2) if no match has been found, then the "Genes" table (whose setup is mandatory, as described in section 1.2 of the chapter "Set up") will be searched to directly interpret Gene identifiers in your data as Gene Symbols/Gene names;

3) if you write a Platform ID code (e.g., GPL... for a GEO Platform) in (at least) the first line of the third column of your data (formatted as described in section 3 of this Guide), a corresponding list of gene Identifiers (often a series of progressive numbers) is expected in your data and each will be converted into the corresponding Gene Symbol/Gene name, if you previously set up the table for the relevant platform as described in section 2.3 of the chapter "Set up";
for example:

1007_s_at  6.38      GPL96  
1053_at    6.65
117_at     6.48

...        ...

[Note - If the first expression value is not in the first row due to the presence of some header lines, please use the very first row anyway in your file to indicate the Platform code, making sure that you are writing it in the third column. If you have only one column in the first row, please press the tabulator key twice then write the Platform code.
Do not insert blank spaces or other characters at the end of the text in a column].


4) if you write the word GeneID in the first line of the third column of your data (formatted as described in section 3 of this Guide), an "Entrez Gene" Identifier (a progressive number) is expected in your data, and it will be converted into the corresponding Gene Symbol/Gene name by searching the "Genes" Table (which must have been set up as described in section 1.2 of the chapter "Set up");

for example:

780        6.38      GeneID
5982       6.65
3310       6.48

...        ...

[Note - If the first expression value is not in the first row due to the presence of some header lines, please use the very first row anyway in your file to indicate the "GeneID" option, making sure that you are writing it in the third column. If you have only one column in the first row, please press the tabulator key twice then write the "GeneID" word.
Do not insert blank spaces or other characters at the end of the text in a column].
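
The file layout described in the two notes above can be sketched in a few lines of Python. This is only an illustration, not part of TRAM: the function name write_expression_file and its parameters are hypothetical. It writes a tab-delimited file with the option (a Platform code or the word GeneID) in the third column of the very first row, and no trailing spaces.

```python
import csv
import io

def write_expression_file(handle, rows, option=None):
    """Write (identifier, value) pairs as tab-delimited text.

    `option` (hypothetical parameter) is a Platform code such as "GPL96"
    or the word "GeneID"; it is written only in the third column of the
    first row, as the guide requires.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for index, (identifier, value) in enumerate(rows):
        row = [identifier, value]
        if index == 0 and option:
            row.append(option)
        writer.writerow(row)

# Example using the identifiers shown in the guide.
buffer = io.StringIO()
write_expression_file(
    buffer,
    [("780", "6.38"), ("5982", "6.65"), ("3310", "6.48")],
    option="GeneID",
)
first_line = buffer.getvalue().splitlines()[0]
```

If the first row of a file holds a header with a single column, the same rule applies: two tab characters must precede the option so that it still falls in the third column.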


5) if still no match has been found, the "Unigene" Table will be searched to directly interpret Gene identifiers in your data as GenBank Accession Numbers (if you set up this table as described in section 2.1 of the chapter "Set up");

6) if still no match has been found, the "Unigene" Table will then be searched to directly interpret Gene identifiers in your data as UniGene Cluster identifiers (if you set up this table as described in section 2.1 of the chapter "Set up").

When a match is found, this will prevent the software from searching for a symbol in the next tables.
We suggest using recently released data for each table imported into the TRAM software.
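
The priority order listed above can be summarized as a first-match lookup. The following Python sketch is an illustration only and does not reproduce TRAM's internal FileMaker scripts; the function name, the dictionary keys and the toy table contents are all assumptions made for the example.

```python
def convert_identifier(identifier, option, tables):
    """Return the first Gene Symbol found, following the six steps above.

    `option` is the text in the third column of the first row of the data
    file (a Platform code, the word "GeneID", or None). `tables` is a dict
    of plain Python lookups standing in for the TRAM tables (hypothetical).
    """
    # 1) The "Custom" table overrides all other conversions.
    symbol = tables["custom"].get(identifier)
    if symbol:
        return symbol
    # 2) The identifier may already be a Gene Symbol listed in "Genes".
    if identifier in tables["genes_by_symbol"]:
        return identifier
    # 3) Platform probe table, only when a Platform code was given.
    if option and option != "GeneID":
        symbol = tables["platforms"].get(option, {}).get(identifier)
        if symbol:
            return symbol
    # 4) Entrez Gene identifiers, only when "GeneID" was given.
    if option == "GeneID":
        symbol = tables["genes_by_entrez_id"].get(identifier)
        if symbol:
            return symbol
    # 5) GenBank Accession Numbers in the "Unigene" table.
    symbol = tables["unigene_by_accession"].get(identifier)
    if symbol:
        return symbol
    # 6) UniGene Cluster identifiers in the "Unigene" table.
    return tables["unigene_by_cluster"].get(identifier)

# Toy tables for illustration only.
TOY_TABLES = {
    "custom": {},
    "genes_by_symbol": {"DDR1", "RFC2"},
    "platforms": {"GPL96": {"1007_s_at": "DDR1"}},
    "genes_by_entrez_id": {"780": "DDR1"},
    "unigene_by_accession": {},
    "unigene_by_cluster": {},
}
```

Because the function returns at the first match, later tables are never consulted once a symbol has been found, which mirrors the rule stated above.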


If you have a list of Gene Symbols as probe identifiers, with the corresponding expression values, you can directly go to "Home" and start to Import expression data, otherwise go to the "Set Up" chapter, Part 2.

Import start

In the "Main" ("Home") window there are two button series designed to rapidly begin the import processes.

Do not worry if the progress bars seem to advance too slowly; they sometimes do not reflect the actual progress of the task.

The first import button series ("Import A" and "Import B") imports one expression data file into the "Values_A" or the "Values_B" TRAM database table, respectively.

At the start of the import process, the user must choose whether to retain or delete all previously imported data. Clicking on "No" in the first dialog box will let the user add to the previously imported data one or more other datasets. The user may subsequently select any sample subset which must be subjected to analysis.

The second dialog box asks for the selection of the file containing the data table.


All data imported from a file will be labeled by the software with a progressive order number (Sample_ID), making it easy to track all data belonging to a specific set (or to delete them from the analyzed set with the "Remove Sample" function).
In addition, the "Samples_A" and "Samples_B" tables allow the user to view and annotate the list of imported samples and to view summary data for each sample.
The "Go" buttons open a window in your default browser displaying the entry for the Platform, Series, Sample, Dataset and PubMed record, if you annotated (at any time) the corresponding fields with codes for GPL, GSE, GSM, GDS and PMID, respectively.
At the start of an analysis, the user can also select which samples are to be excluded from or included in (default) the current analysis, without removing them from the TRAM database; alternatively, the user may remove any sample from the database.
Please note that changing the set of samples to be analyzed restarts the normalization (see below), which may take several minutes or hours, depending on the number of loaded samples.

The software will ask for the import of another set at the end of the process.

As a final step, the user can check the results of the import process.

When requested by the software, click on the blue "Continue" button at the top right of the program window, to ensure correct functioning of the software.

The second import button series ("Batch mode" buttons) works in the same way, but it is optimized to perform a batch import without user supervision.
By clicking on "Batch mode" (A or B), all files (formatted as just described for the manual import) contained in the "Batch_Import_A" folder or in the "Batch_Import_B" folder, respectively, will be imported.
In these folders the files must be named A1.txt, A2.txt, ... and B1.txt, B2.txt, ..., respectively (without interruption in the series of progressive numbers).

If you would like to perform a batch import while keeping the previously imported datasets, the first file must be numbered with the first unused Sample_ID number (e.g. if the last imported set has Sample_ID = 5, the first file must be A6), and that number will correspond to the Sample_ID of that dataset. The software will alert you about this. You may check the currently used Sample_TRAM_IDs by clicking on the "Samples A" and "Samples B" buttons, respectively, in the TRAM Home.
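
Before starting a batch import, it may be convenient to verify the file names outside TRAM. The Python sketch below is not a TRAM utility; check_batch_names and its parameters are hypothetical. It checks that the names form an uninterrupted series (A1.txt, A2.txt, ...) starting from the expected first number.

```python
import re

def check_batch_names(file_names, pool="A", first_id=1):
    """Return the sorted file numbers, or raise ValueError on a bad series.

    `first_id` is the first unused Sample_ID (1 for an empty database,
    6 if the last imported sample has Sample_ID = 5, and so on).
    """
    pattern = re.compile(rf"^{re.escape(pool)}(\d+)\.txt$")
    numbers = []
    for name in file_names:
        match = pattern.match(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        numbers.append(int(match.group(1)))
    numbers.sort()
    expected = list(range(first_id, first_id + len(numbers)))
    if numbers != expected:
        raise ValueError(f"expected numbers {expected}, found {numbers}")
    return numbers
```

For a real folder, the list of names could be obtained with os.listdir on the "Batch_Import_A" or "Batch_Import_B" folder.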

Clicking on the "Special" button in the TRAM "Home" window will allow the user to automatically perform batch expression data import of large pools of samples in succession for both A and B Pools. Batch import may be followed automatically by data analysis using the "Batch Import + Analysis" button in the "Special" section.

After the import process, expression data are shown in the "Values_A" and "Values_B" tables, which can be displayed by clicking on the buttons A and B, respectively, from TRAM "Home" (opening window).
These are the data fields for the "Values" tables:

Identifier      (the original probe identifier in your data)
Intensity value (the original numerical expression value)
Sample_ID       (A1, A2... or B1, B2...)
Platform        (filled if you indicated a Platform code "GPL..."
                   in your expression data file)
Exclude         (state of inclusion/exclusion of the data for the analysis)
Gene_name       (Gene Symbol/Gene name following conversion of Identifiers)
Chr             (chromosome name)
txStart         (start position of the gene transcript on the chromosome)
txEnd           (end position of the gene transcript on the chromosome)

IMPORTANT - The conversion of gene or probe identifiers to Gene Symbols/Gene names is performed during expression data import. To keep the database indexed and fast, changes to the software set up are not dynamically reflected in the assignment of genes to probe identifiers. Therefore, any change to a table related to the "Set Up" chapter ("Chromosomes", "Genes", "EST_Clusters", "UniGene_ID", "Platforms ID", "Custom ID") should be followed by reimport and reanalysis of the expression data to make the changes effective. An exception to this rule is the set up of new Platforms or new Custom ID sets that only have to be applied to new, subsequently loaded samples and not to previously imported samples. In this case reimport of all samples is not needed.
Clicking on the "Special" button in the TRAM "Home" window will allow the user to automatically perform batch data import of large pools of samples.

Interpretation and Normalization of the imported data

The user provides TRAM with an "Intensity value" for each spot, which is intended to be the pre-processed intensity value, i.e. the numerical value assigned to the spot after processing by the software of the specific experimental platform used (e.g. following background subtraction for a microarray spot).
To allow comparison of gene expression data obtained from different biological samples and/or different experimental platforms, TRAM provides several data normalization methods.

The normalization type may be changed by a pop-up Menu from the "Values" or "Samples" data tables.

Intra-sample (intra-array) normalization works within each distinct sample, while inter-sample (inter-array) normalization is simultaneously applied to the desired sample sets.
You may select different combinations between these types of normalization.

Please note that the normalization process may require several hours for databases in which tens of arrays were imported.
Clicking on the "Special" button in the TRAM "Home" window will allow the user to automatically perform normalization changes of large pools of samples.

The normalization may also be set when starting an analysis, so that normalization and analysis are performed in sequence without the user's intervention.

Intra-sample normalization

These methods rescale values within each data set using a standard internal reference for each sample.

None
No Intra-sample normalization is performed.

Mean
[DEFAULT AFTER INSTALLATION]
Each value is expressed as the percentage of the corresponding sample mean value. This is equivalent to the classic "global normalization" in the microarray data analysis.

Median

Each value is expressed as the percentage of the corresponding sample median value. This is equivalent to the classic "global normalization" in the microarray data analysis.

Max

Each value is expressed as the percentage of the corresponding sample maximum value. This is equivalent to the classic "scale normalization" in the microarray data analysis.
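
The intra-sample options above all amount to expressing each value as a percentage of a per-sample reference. A minimal Python sketch follows, assuming a sample is simply a list of numbers; the function name intra_sample_normalize is hypothetical and the code is not taken from TRAM.

```python
from statistics import mean, median

# Reference statistic used by each option described above.
REFERENCES = {"Mean": mean, "Median": median, "Max": max}

def intra_sample_normalize(values, method="Mean"):
    """Express each value as a percentage of the sample reference."""
    if method == "None":
        return list(values)
    reference = REFERENCES[method](values)
    return [100.0 * value / reference for value in values]

sample = [1.0, 2.0, 4.0]
by_median = intra_sample_normalize(sample, "Median")  # reference is 2.0
by_max = intra_sample_normalize(sample, "Max")        # reference is 4.0
```

With the "Max" option the largest value in every sample becomes 100, whereas with "Mean" or "Median" values above the reference exceed 100.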

Inter-sample normalization

These methods rescale values within each sample set.

None
No Inter-sample normalization is performed.

Quantile

In the implementation within the database structure at the core of TRAM, each intra-sample normalized value is given a rank following sample data sorting in ascending order, then the mean of all the values with the same rank across all samples is calculated. This mean value is assigned as the expression value to each gene with the same rank in each sample (Bolstad et al., 2003). An original variant of this method implemented in TRAM is described below.
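
The rank-and-average procedure just described can be sketched in Python as follows. This is a simplified illustration and not TRAM's database implementation: it assumes each sample is a dictionary mapping gene names to values, it breaks ties by input order, and when samples differ in length a rank is averaged only over the samples that reach it. All names are hypothetical.

```python
def quantile_normalize(samples):
    """samples: dict of sample_id -> dict of gene -> value."""
    # Genes of each sample sorted by ascending value; the position is the rank.
    ranked = {sid: sorted(vals, key=vals.get) for sid, vals in samples.items()}
    longest = max(len(genes) for genes in ranked.values())
    rank_means = []
    for rank in range(longest):
        at_rank = [
            samples[sid][genes[rank]]
            for sid, genes in ranked.items()
            if len(genes) > rank
        ]
        rank_means.append(sum(at_rank) / len(at_rank))
    # Every gene receives the mean of the values sharing its rank.
    return {
        sid: {gene: rank_means[rank] for rank, gene in enumerate(genes)}
        for sid, genes in ranked.items()
    }

toy = {
    "A1": {"g1": 5.0, "g2": 2.0, "g3": 3.0},
    "A2": {"g1": 4.0, "g2": 1.0, "g3": 6.0},
}
normalized = quantile_normalize(toy)
```

After normalization every sample contains the same set of values (here 1.5, 3.5 and 5.5); only their assignment to genes differs between samples.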

Scaled_Q (Scaled Quantile) [DEFAULT AFTER INSTALLATION]

Derived from the Quantile method, except that the rank for each array is rescaled according to the array with the maximum number of probes. This original method compensates for differences when comparing arrays with very different numbers of probes, because the highest values for arrays with a low number of probes are given ranks comparable to those assigned to arrays with a high number of probes (see the article).
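
The exact rescaling used by TRAM is described in the article and is not reproduced here. Purely as an illustration of the idea, the Python sketch below assumes a proportional rescaling, in which the ranks of a small array are stretched so that its top rank equals the top rank of the largest array. The function name and the proportional formula are assumptions.

```python
def rescale_ranks(n_probes, max_probes):
    """Stretch ranks 1..n_probes onto the scale of the largest array.

    Assumption: proportional rescaling by max_probes / n_probes.
    """
    factor = max_probes / n_probes
    return [rank * factor for rank in range(1, n_probes + 1)]

small_array = rescale_ranks(2, 4)   # array with 2 probes, largest has 4
large_array = rescale_ranks(4, 4)   # the largest array keeps its ranks
```

Under this assumption the highest value of the small array and the highest value of the large array end up with the same rescaled rank, which is the behaviour the paragraph above describes.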

DATA SUMMARY - Values_A_B Layout

The summary of gene expression values, under the current mode of normalization, may be viewed in the "Values_A_B" layout.
This is an indexed database table summarizing all data points available in the sample pool for each gene.
Along with the Mean value and the Standard Deviation (SD) value, the SD value is also shown as a percentage of the expression value.
The "Mean" value of the data points available for each locus is considered the expression value for the respective gene and it is used in the subsequent analysis.
The number of "Data Points" from which the summary data are obtained is also displayed.
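
The per-gene summary described above (mean, SD, SD as a percentage of the mean, number of data points) can be sketched in Python as follows. The function name is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the guide does not state which formula TRAM applies.

```python
from statistics import mean, stdev

def summarize(data_points):
    """data_points: iterable of (gene, value) pairs from a sample pool.

    Returns gene -> (mean, sd, sd_percent, number_of_data_points).
    """
    by_gene = {}
    for gene, value in data_points:
        by_gene.setdefault(gene, []).append(value)
    summary = {}
    for gene, values in by_gene.items():
        m = mean(values)
        sd = stdev(values) if len(values) > 1 else 0.0  # assumption: sample SD
        sd_percent = 100.0 * sd / m if m else 0.0
        summary[gene] = (m, sd, sd_percent, len(values))
    return summary

pool = [("DDR1", 90.0), ("DDR1", 110.0), ("RFC2", 50.0)]
table = summarize(pool)
```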

The yellow button "A/B (unmapped)" opens the layout "Values_A_B_All", which also includes unmapped loci that are not listed in the "Values_A_B" table used for the creation and analysis of the transcriptome maps.

By clicking on the "Export" button, the data for the genes listed in the "Values_A_B" table may be exported as a tabulated text file.
The file contains by default the following columns, from left to right:
01) Gene_name
02) Chromosome name
03) Chromosome Identifier (progressive number)
04) Gene mean expression value for pool A samples.
05) Gene mean expression value for pool B samples.
06) Ratio between gene mean expression value from pool A
    samples and from pool B samples (A/B ratio).



4 Analyzing data                                  (Back to Index)

Different TRAM databases may be obtained by duplicating the fresh "TRAM" folder and starting a new analysis session.

Please do not change the name of any file and folder of the TRAM software.


You may download multiple copies of TRAM and run them simultaneously, provided that each "TRAM" folder is located in a different directory, so you may maintain the original names of TRAM folder and files.
Do not move the "TRAM" folder while the software is open.
Run the "TRAM" software from a local hard disk.
Do not run the software from a network drive.


Do not worry if the progress bars seem to advance too slowly; they sometimes do not reflect the actual progress of the task.

If a TRAM analysis aborts unexpectedly, it is advisable to restart it in a fresh TRAM copy.


For the analysis of a pool of expression data arrays, the expression value for each gene symbol will be the mean expression value among all its corresponding identifiers available in that sample pool.

Basically, TRAM software performs two types of analysis: creation of transcriptome maps ("Map" mode) or search for clusters of over- or under-expressed neighbouring/contiguous genes ("Cluster" mode).

Clicking on the "Special" button in the TRAM "Home" window will allow the user to automatically perform all available analyses in sequence, after an initial choice of the settings required for the analysis.
Note: if Analysis is preceded by automated, batch "Import A+B" of the expression data, the setting "Sample Selection" will be ineffective, and all Samples in Pools A and B will be imported and then will all be used for the analysis.

You may start the analysis by clicking on one of the red "Analysis" buttons in the "Home" layout. You will then be asked to enter the analysis settings of your choice.

The two settings common to both types of analysis are:

Pool: choice of A, B or A vs. B (to compare two series of samples with each other using the A/B ratio);

Statistics: calculations may be performed with respect to all genome segments (or genes) or to the set of segments (or genes) located in the same chromosome.
This applies to both descriptive statistics (calculation of percentile thresholds to select over/under-expressed genes) and statistical analysis (parameters for calculation of the hypergeometric distribution in order to determine the significance of the identified over/under-expressed segments or clusters).

4.1 Creating and analyzing maps of the transcriptome          
(Back to Index)

Click on the "Chromosomal Segments" button in the TRAM "Home" (TRAM main window).

The software will generate a graphical map of the transcriptome showing a vertical line representing each chromosome. An expression value is associated to each segment of the line, whose size is determined by a window (in bp) set by the user. This value is the mean for all available expression data related to the genes included in each segment.

Information about "Location" is derived from "NCBI Gene" imported data and in the "Map" mode is obtained for the first gene listed in each chromosomal segment.

In the "Map" mode, results are always generated calculating both types of analysis (the one based on all genes in the genome and the one based on the genes located in the same chromosome the segment belongs to). You are required to select one type of analysis ("genome" or "chromosome") in order to be directed, at the end of process, to the results layout you selected, but the results for the other layout are also available.
This is because TRAM spends much of the time during "Map" analysis in creating chromosomal segments, so it is convenient to calculate both statistics when segments are created.

SETTINGS

The available settings for this analysis are:

Window: defines the length for a segment.
If the coordinates of a gene span the window boundaries, the gene is included in each window in which a part of it lies.
Each segment on the map shows only those genes having an available expression value in the corresponding sample or pool of samples.
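
The rule that a gene spanning a window boundary belongs to every window it touches is an interval-overlap test. A minimal Python sketch, assuming inclusive (start, end) coordinates in bp; the function name is hypothetical and the code is not taken from TRAM.

```python
def windows_for_gene(tx_start, tx_end, windows):
    """Return the indexes of the windows in which part of the gene lies.

    `windows` is a list of (start, end) pairs with inclusive coordinates.
    """
    return [
        index
        for index, (w_start, w_end) in enumerate(windows)
        if tx_start <= w_end and tx_end >= w_start
    ]

two_windows = [(1, 1_000_000), (1_000_001, 2_000_000)]
spanning = windows_for_gene(950_000, 1_050_000, two_windows)
inside = windows_for_gene(10, 20, two_windows)
```

A gene crossing the boundary between the two windows is reported in both, so its expression value contributes to the mean of each segment.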

Sliding window shift: defines the offset (in bp) between the start of a segment and the start of the next one, and therefore how much consecutive segments overlap.
A shift equal to zero results in non-overlapping segments.
For example, if the window is 1,000,000 bp and the shift equals 200,000 bp, the successive segments will be created with coordinates:
      1 - 1,000,000 bp
200,000 - 1,200,000 bp
400,000 - 1,400,000 bp, and so on.

This function could be useful to increase the sensitivity of the search for over/under-expressed segments.
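
The way "Window" and "Sliding window shift" combine can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and the boundary convention (each window starts one base after a multiple of the shift, so the second window starts at 200,001 rather than 200,000) is an assumption that may differ from TRAM by one base.

```python
def sliding_windows(chrom_length, window, shift):
    """Return (start, end) pairs, inclusive, covering a chromosome.

    A shift of zero gives adjacent, non-overlapping windows.
    """
    step = shift if shift > 0 else window
    windows = []
    start = 1
    while start <= chrom_length:
        end = min(start + window - 1, chrom_length)
        windows.append((start, end))
        start += step
    return windows

overlapping = sliding_windows(3_000_000, 1_000_000, 200_000)
adjacent = sliding_windows(2_500_000, 1_000_000, 0)
```

With a shift smaller than the window, each gene falls into several segments, which is why the option can increase the sensitivity of the search at the cost of a larger number of segments to analyze.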

Percent (segment): defines the threshold required to consider a segment as "Over- (or Under-) expressed" (i.e. to be marked in red or blue in the expression bar).

A segment whose mean expression value (calculated as the mean of all known genes included in it) lies within the highest (n) percent of Values or within the lowest (n) percent of Values, where n=Percent (segment), will be highlighted (in red or blue colour, respectively), thus displaying genomic regions that are globally over- or under-expressed with respect to the desired threshold.

Percent (gene): defines the threshold expression value to consider a gene as "Over- (or Under-) expressed" (i.e. to be marked in red or blue in the segment gene list).

The gene which shows mean expression value within the highest (n) percent of Values or within the lowest (n) percent of Values, where n=Percent (gene), will be highlighted, being listed in red (over-expressed) or blue (under-expressed) colour font, respectively.

The number of over/under-expressed genes in the segment is calculated with respect to the Percent (gene).
Using two different parameters for segments and genes allows the user to perform a more refined analysis.
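
The "Percent" thresholds above select the two tails of the distribution of values. A minimal Python sketch follows; the function name is hypothetical, and rounding the tail size down to a whole number of items is an assumption, since the guide does not specify how TRAM handles fractional counts.

```python
def classify_by_percent(values, percent):
    """values: dict of name -> expression value (genes or segments).

    Returns name -> "Over", "Under" or None, marking the highest and
    lowest `percent` of the values. Assumes percent <= 50.
    """
    ordered = sorted(values, key=values.get)
    tail = int(len(ordered) * percent / 100)  # assumption: rounded down
    labels = dict.fromkeys(ordered)
    for name in ordered[:tail]:
        labels[name] = "Under"
    for name in ordered[len(ordered) - tail:]:
        labels[name] = "Over"
    return labels

toy_values = {f"g{i}": float(i) for i in range(1, 11)}
toy_labels = classify_by_percent(toy_values, 20)
```

The same function can be applied once to segment values with Percent (segment) and once to gene values with Percent (gene), which is how the two parameters remain independent.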

Number of genes in the window: defines the minimum number of over/under-expressed genes required to mark the segment with the tag "Over" (or "Under").

An Over-Expressed segment listing a number of Over-Expressed genes equal to or greater than the "Number of genes in the window" will be marked as "Over" in the "Map" layouts.

An Under-Expressed segment listing a number of Under-Expressed genes equal to or greater than the "Number of genes in the window" will be marked as "Under" in the "Map" layouts.
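
The two conditions for tagging a segment (the segment itself passes the Percent (segment) threshold, and it lists enough genes passing the Percent (gene) threshold in the same direction) can be combined as in this Python sketch. The function and parameter names are hypothetical.

```python
def tag_segment(segment_label, gene_labels, min_genes):
    """Return "Over", "Under" or None for one segment.

    segment_label: "Over", "Under" or None, from the Percent (segment) test.
    gene_labels: labels of the genes in the segment, from Percent (gene).
    min_genes: the "Number of genes in the window" setting.
    """
    if segment_label is None:
        return None
    count = sum(1 for label in gene_labels if label == segment_label)
    return segment_label if count >= min_genes else None

tagged = tag_segment("Over", ["Over", None, "Over", "Under"], min_genes=2)
untagged = tag_segment("Over", ["Over", None, None], min_genes=2)
```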

SAMPLE NUMBER: defines the minimum number of Samples for which an expression value must be available for a gene in order to include that gene in the analysis.

RESULTS

The results of the analysis are displayed within 30-90 minutes, depending on the number of arrays analyzed. Changing the data normalization type during the analysis requires additional time for the task to be completed.

The results of the analysis are displayed in the "Chromosomal Segments" layouts (i.e., "Map" layouts).
Each chromosomal segment is actually a record of the database. You can find and sort segments using desired criteria.

The "P" field displays the p-value resulting from the hypergeometric distribution calculation for the "Over/Under"-expressed segments. This is the statistical significance, i.e. the probability that the result (presence of n over/under-expressed genes within the same segment) could have been obtained by chance.
Due to the high number of segments in a genome, the "P" value needs to be corrected for multiple testing in order to control the False Discovery Rate (FDR).
The "Q" field displays the p-value corrected for FDR.
"P" and "Q" values are displayed only for the segments fulfilling criteria to be tagged as over/under-expressed. If Q≤0.05, the over/under-expression is considered to be statistically significant.
For details and references about the statistical analysis, see the article describing "TRAM".
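
As a rough illustration of the two statistics mentioned above, the Python sketch below computes an upper-tail hypergeometric probability and then adjusts a list of p-values. The exact parameters used by TRAM are given in the article; here it is assumed that the population is the set of genes considered (genome or chromosome), the successes are the over-expressed genes in it, and the draws are the genes in the segment. The Benjamini-Hochberg procedure is also an assumption, since the guide only states that "Q" is corrected for FDR.

```python
from math import comb

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total

def benjamini_hochberg(p_values):
    """Return FDR-adjusted values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # from the largest p-value down
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

p_toy = hypergeom_tail(population=10, successes=3, draws=3, observed=3)
q_toy = benjamini_hochberg([0.01, 0.04, 0.03])
```

In the toy call, all three genes drawn are over-expressed out of three over-expressed genes in a population of ten, so the probability equals 1/120.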

The user may also produce a graphical output showing the series of chromosome transcriptome maps aligned horizontally and may choose to select representation of specific chromosomes or set of chromosomes.
In addition, specific buttons help retrieve online database entries for the desired genes.

In the "Map" layouts based on all gene values, segments that are significantly over/under-expressed only in this type of analysis, but not in the corresponding one based on pertinent chromosome values, will be marked by a "G" and the intensity bar will be highlighted in yellow. The button "Show only"->"Genome Specific" will retrieve only these "G" segments.
In the "Map" layouts based on chromosome-specific values, segments that are significantly over/under-expressed only in this type of analysis, but not in the corresponding one based on all gene values, will be marked by a "C" and the intensity bar will be highlighted in yellow. The button "Show only"->"Chromos. Specific" will retrieve only these "C" segments.

Clicking on the "Export Results Data" button allows the export of the results as a tabulated text file that will be saved in the "Results" folder present in the main "TRAM" directory.
The file contains the following columns, from left to right:
01) Chromosome name
02) Chromosomal location
03) Segment Start genomic position
04) Segment End genomic position
05) Segment expression value
06) Label of segment Over/Under-expression ("Over", "Under")
07) P value
08) Q value
09) List of genes (symbols) included in the segment
10) Number of Over-expressed genes in the segment
11) Number of Under-expressed genes in the segment
12) Total number of genes in the segment
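
The exported file can be processed outside TRAM with any tool that reads tab-delimited text. The Python sketch below selects the segments with Q not greater than 0.05, using the column order listed above. It is an illustration under several assumptions: the short column names are invented for the example, rows whose Q column is not a number (empty values or a possible header line) are skipped, and a dot is assumed as decimal separator.

```python
import csv
import io

# Hypothetical short names for the 12 columns listed above, in order.
MAP_COLUMNS = [
    "chromosome", "location", "start", "end", "expression", "label",
    "p", "q", "genes", "n_over", "n_under", "n_total",
]

def significant_segments(handle, q_max=0.05):
    """Return the rows (as dicts) whose Q value is <= q_max."""
    hits = []
    for fields in csv.reader(handle, delimiter="\t"):
        row = dict(zip(MAP_COLUMNS, fields))
        try:
            q_value = float(row.get("q", ""))
        except ValueError:
            continue  # no Q value: segment not tagged, or a header line
        if q_value <= q_max:
            hits.append(row)
    return hits

# Toy content standing in for an exported file (values are invented).
toy_export = (
    "chr1\t1p36\t1\t500000\t210.5\tOver\t0.0001\t0.003\tGENE1 GENE2\t2\t0\t5\n"
    "chr2\t2q11\t1\t500000\t95.0\t\t\t\tGENE3\t0\t0\t1\n"
)
hits = significant_segments(io.StringIO(toy_export))
```

For a real export, the file in the "Results" folder would be opened with open(path, newline="") and passed to the same function.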

A second file with the label "Set" in the file name is generated, containing the summary of the analysis settings, which are also displayed at the top in all TRAM results layout.

The user can also export result data in different formats (e.g., Excel) using the "Export Records..." command from the "File" Menu.

4.2 Searching for clusters of neighbouring over/under-expressed genes
(Back to Index)

In the "Cluster" mode, the software will search for sets of contiguous/neighbouring genes all expressed beyond a defined "n" threshold, i.e. with expression values higher than the (100 - "n") percentile or lower than the "n" percentile.

In this mode, results are centered on individual differentially expressed loci and they are complementary and more sensitive compared to the "Map" mode of analysis, which requires the definition of an arbitrary window length within which genes must be comprised.


SETTINGS

Click on the "Gene Clusters" button in the TRAM "Home" (TRAM main window).

The available settings for this analysis are:

Percent (gene): defines the thresholds required to consider a gene as "Over- (or Under-) expressed" (i.e. to be marked in red or blue in the expression bar).

The genes showing mean expression value within the highest (n) percent of Values or within the lowest (n) percent of Values, where n=Percent (gene), will be highlighted, being listed in red (over-expressed) or blue (under-expressed) colour font, respectively:

Over-Expressed gene (marked as "CLUST-O" in the results layout)

Under-Expressed gene (marked as "CLUST-U" in the results layout)

Gap: defines the maximum number of non "Over" or "Under"-expressed genes allowed to be localized between two "Over" or "Under"-expressed genes in a cluster.
Setting a gap equal to 1 means that two over- (under-) expressed genes will be included in the "Cluster" even when they are separated along the chromosome by a gene not fulfilling the conditions to be considered over/under-expressed. For example, a cluster composed of the over-expressed genes A, B, and C could contain no more than 2 non-over-expressed genes: one between genes A and B and the other between genes B and C.
Genes with this feature will be marked as "GAP" in the results layouts.
If Gap=0, only contiguous genes will be considered to be in a cluster.
Genes with no expression data in the analyzed sample set will not be considered as "GAP" and will instead be marked "EMPTY" in the results layouts. They are displayed, but they are ignored by the cluster search process.
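
The role of the "Gap" setting can be illustrated with a short Python sketch that scans the genes of one chromosome in order. It is not TRAM's implementation: the function name, the label values and the min_genes parameter (the minimum number of over/under-expressed genes needed to report a cluster) are assumptions made for the example.

```python
def find_clusters(labels, target="Over", gap=0, min_genes=2):
    """Return (first_index, last_index) of each cluster of `target` genes.

    labels: one label per gene in chromosomal order: "Over", "Under",
    None (expressed but not over/under) or "EMPTY" (no expression data).
    """
    clusters = []
    current = []  # indexes of the target genes in the open cluster
    misses = 0    # non-target genes seen since the last target gene
    for index, label in enumerate(labels):
        if label == "EMPTY":
            continue  # shown in the results, but ignored by the search
        if label == target:
            current.append(index)
            misses = 0
        elif current:
            misses += 1
            if misses > gap:  # too many intervening genes: close cluster
                if len(current) >= min_genes:
                    clusters.append((current[0], current[-1]))
                current = []
                misses = 0
    if len(current) >= min_genes:
        clusters.append((current[0], current[-1]))
    return clusters

chrom = ["Over", None, "Over", "EMPTY", "Over", None, None, "Over"]
with_gap = find_clusters(chrom, gap=1)
contiguous = find_clusters(chrom, gap=0)
```

With Gap=1 the first three over-expressed genes form one cluster despite the intervening gene, whereas with Gap=0 only the two genes separated by an "EMPTY" gene are joined.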

Gene Type: the software will construct a scheme of the linear succession of genes present in the table "Genes", filled during the set up process.
The user can set TRAM to use either of the following while constructing the linear map of genes:

1) Gene symbols only
   (genes with a "NCBI" Gene symbol/identifier assigned); or

2) All symbols and UniGene clusters
   (genes as at point 1, plus sequences having a UniGene (EST) cluster identifier).


SAMPLE NUMBER: defines the minimum number of Samples for which an expression value must be available for a gene in order to include that gene in the analysis.

RESULTS

The Results of the "Cluster" analysis are typically displayed within a few minutes.

Changing the data normalization type during the analysis takes additional time for the task to be completed.

The results of the analysis are displayed in the "Cluster" layouts.
Each gene is actually a record (row) of the database. You can find and sort genes using desired criteria.

The "P" field displays the p-value resulting from the hypergeometric distribution calculation for the "Over/Under"-expressed Clusters. This is the statistical significance, i.e. the probability that the result (presence of n over/under-expressed clusters within the transcriptome) could have been obtained by chance.
Due to the high number of genes in a genome, the "P" value needs to be corrected for multiple testing in order to control the False Discovery Rate (FDR).
The "Q" field displays the p-value corrected for FDR.
"P" and "Q" values are displayed only for the clusters fulfilling criteria to be tagged as over/under-expressed. If Q≤0.05, the over/under-expression is considered to be statistically significant.
For details and references about the statistical analysis, see the article describing "TRAM".

The number (#) of over/under-expressed genes in the cluster, the Length (in bp) of the chromosomal region covered by the cluster, and the number of individual Data Points (e.g., array spots) from which summary data for each gene are obtained are also displayed.

Specific buttons help retrieve online database entries for the desired genes.

In the "Cluster" layouts based on all gene values, genes that are significantly over/under-expressed only in this type of analysis, but not in the corresponding one based on pertinent chromosome values, will be marked by a "G" and the intensity bar will be highlighted in yellow. The button "Show only"->"Genome Specific" will retrieve only these "G" genes.
In the "Cluster" layouts based on chromosome-specific values, genes that are significantly over/under-expressed only in this type of analysis, but not in the corresponding one based on all gene values, will be marked by a "C" and the intensity bar will be highlighted in yellow. The button "Show only"->"Chromos. Specific" will retrieve only these "C" genes.

Clicking on the "Export Results Data" button will export the results as a tabulated text file that will be saved in the "Results" folder present in the main "TRAM" directory.
The file contains the following columns, from left to right:
01) Cluster ID (a unique number used as cluster identifier)
02) Type of Cluster (CLUST-O: over-, CLUST-U: under-expressed)
03) Count of over/under-expressed genes in the Cluster
04) Length (bp) of the region covered by the cluster
05) Chromosome name
06) Chromosomal location
07) Gene symbol/name
08) Gene Start genomic position
09) Gene End genomic position
10) Gene expression value (mean among all pool samples)
11) Sample Count (number of Samples with an Expression value
    for that Gene; two values in the A/B analysis, for A and B,
    respectively)
12) Cluster Mean Expression (mean expression value of the genes
    in the cluster)
13) Label of gene Over/Under-expression ("Over", "Under")
14) Number of individual Data Points processed for each gene
15) P value
16) Q value
17) Gene description


A second file with the label "Set" in the file name is generated containing the summary of the analysis settings, which are also displayed at the top in all TRAM results layout.

The user can also export result data in different formats (e.g., Excel) using the "Export Records..." command from the "File" Menu.

4.3 Use of TRAM as "TRAM Results Viewer" (TRV)
(Back to Index)

Since version 1.2, an empty copy of TRAM itself may be used as a "TRAM Results Viewer" (TRV) in order to regenerate a graphical view of the results obtained in a copy of TRAM filled with species-specific data Tables and with the Results generated by its analyses.
1. Choose "! Export Main Tables" from the "Scripts" Menu of the copy of TRAM where the analyses were executed. This will export all main Tables (all fields) from TRAM in .fmp12 format into the "Results" folder of TRAM ("Settings", "Chromosomes", "Genes", "Samples", "Values_A_B", "Map" and "Cluster" Results Tables). Platforms, UniGene/ESTs and sample "Values" Tables data will not be exported.
2. Copy the resulting 16 Tables into the "Results" folder of a distinct, empty copy of TRAM 1.2 itself (TRV).
3. Choose "! Import Main Tables" from the "Scripts" Menu of TRV. This will import the Tables into TRV. The source "Table" files can then be deleted from the TRV "Results" folder.
TRV is intended to allow the distribution of a set of data and results from a particular TRAM analysis without distributing the original whole file (which can reach a size of ~25 GB when approximately 700 samples are analyzed).
Due to the lack of Platform data, as well as of individual sample "Values" data, when TRAM is used as a TRV some types of analysis cannot be run (adding / deleting / including / excluding Samples) and some functions and buttons do not work (inspecting individual sample Values, inspecting Platform / UniGene / ESTs tables).



GENERAL DEFINITIONS                                        
(Back to Index)

5.1 File
(Back to Index)
A set of database tables.

5.2 Table
(Back to Index)
A set of records referring to the same subject type (e.g., the "Genes" table).

5.3 Record
(Back to Index)
One set of fields which represent one entry (i.e. containing all requested data for a subject, e.g. a gene probe).
The record browser is a small book icon at the top left of the window. You may also browse the records faster using the cursor at the right of the small book icon.

5.4 Field
(Back to Index)
The database unit containing a specific data type (e.g., "Gene_name").

5.5 Layout
(Back to Index)
A particular graphical organization of the fields of a table.
A table can be visualized in more than one layout.
A layout may display fields from a table or its related fields from other tables.
A file may show data within different layouts.
Visualization of a field is independent from the storage of the contained data.

Browsing among the layouts can be made by clicking on the "Layout:" pop-up Menu at the upper left corner.

You may browse the database by clicking on the small book pages at the top left of the window, by using the cursor at the right of the small book icon, or by entering a record number and pressing the "Return" key.
The following information is constantly displayed in the top bar of the window (if not, select "Status Toolbar" from the "View" Menu):
Records: total number of Records in the table.
Found: total number of the subset of Records currently selected. Clicking on the green circular button will retrieve the complementary subset of currently omitted records.
Sorted: sorting status of the Records (Sorted/Unsorted).

The FileMaker Pro-based database may be used basically in these "modes":
"Browse", "Find", and "Preview".
Switching among different modes can be done from the "View" Menu or from the pop-up Menu bar at the bottom left of the window.

5.6 Browse Mode
(Back to Index)
One way to use the database.

It allows entry, view, browse, sort, and manipulation of data.
It may be selected from:
the "View" menu or
the mode pop-up Menu bar at the bottom left of the window.

In the "Browse" mode, the record sets can be browsed by clicking on the small book icon (with the arrows to move "back" and "forward") in the upper left corner.

Browsing among the tables can be done by clicking on the "Layout" pop-up Menu at the upper left corner.

5.7 Find Mode
(Back to Index)
An alternative mode to use the database.

It allows searching for specific content in the database fields, using any combination of criteria
(see the "Search mode" section below for more details).
It may be selected from:
the "View" menu or
the mode pop-up Menu bar at the bottom left of the window.

The user can fill in a blank form to search in specific fields.

In the "Find" mode, the small book icon in the upper left corner represents different "requests" that are made for searching the database.

In FileMaker Pro "Find" mode, the "AND" - "OR" - "NOT" operators may be implemented in this way:

"AND" by filling criteria in different fields
      located in the same "Request",


"OR"  by generating additional requests
      (from "Requests" Menu) in the same query,

"NOT" by generating additional requests
      (from "Requests" Menu) and clicking on the "Omit"
      button (located in the window top bar).
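
The way requests combine can be illustrated with a short Python sketch. This is only a simplified model of the logic described above, not the FileMaker Pro implementation: the "Request" class, the "perform_find" function, and the "Chromosome" field are hypothetical names introduced for illustration, and the sketch assumes that requests are applied in the order in which they were created.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical model of one Find-mode request."""
    criteria: dict = field(default_factory=dict)
    omit: bool = False  # True when the "Omit" button was clicked

    def matches(self, record):
        # "AND": every criterion filled in within the same request must hold.
        return all(record.get(name) == value
                   for name, value in self.criteria.items())


def perform_find(records, requests):
    """Apply the requests in order (an assumption of this sketch)."""
    found = []
    for request in requests:
        if request.omit:
            # "NOT": an Omit request removes its matches from the found set.
            found = [r for r in found if not request.matches(r)]
        else:
            # "OR": each additional request adds its matches to the found set.
            found += [r for r in records
                      if request.matches(r) and r not in found]
    return found


records = [
    {"Gene_name": "A", "Chromosome": "21"},
    {"Gene_name": "B", "Chromosome": "21"},
    {"Gene_name": "C", "Chromosome": "X"},
]

# (Chromosome = 21) OR (Chromosome = X), NOT (Gene_name = B)
requests = [
    Request({"Chromosome": "21"}),
    Request({"Chromosome": "X"}),
    Request({"Gene_name": "B"}, omit=True),
]

result = perform_find(records, requests)
print([r["Gene_name"] for r in result])  # prints ['A', 'C']
```

In this model, the found set is the subset of records returned by "perform_find"; all other records are the omitted ones.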


The "Operators" pop-up Menu appears by clicking on a field while pressing the "ctrl" key, allowing query of:
exact matches, duplicate values, ranges, wild cards and more.

Click on the "Perform Find" button at the top of the window to start the query.


The result of the search is the subset of the entries matching the set search criteria.

5.8 Preview Mode
(Back to Index)
An alternative way to use the database.

It visualizes a print preview of the found records.
It may be selected from:
the "View" menu
or the pop-up Menu bar at the bottom left of the window.

In the "Preview" mode, the user can obtain a print preview of the data in the current table.
Browsing among the tables can be done by clicking on the "Layout:" pop-up Menu at the upper left corner.


MENU AND COMMANDS                                        
(Back to Index)

6.1 "TRAM" Menu
(Back to Index)                              

About FileMaker Pro Runtime...
Information about FileMaker Pro Runtime at the core of the software.

Preferences...
Standard preferences panel; cache memory size can be set up to 256 MB.

Hide TRAM
Hiding all TRAM windows.

Quit TRAM
Closing the program.

6.2 "File" Menu                                    
(Back to Index)

File Options...
It is possible to set only the "Spelling" options.

Change Password...
There is no default password set.

Page setup...
Standard page setup command.

Print...
Standard print command.
The appearance will match the layout currently displayed on the screen.

Import Records
This is the general "Import" function of FileMaker Pro.

Export Records...
Export command for the found records set in a given table.
Records are exported in their current sorting mode.
The user can select the fields to be exported, their relative order,
and the separation character.
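
An exported file can then be processed outside TRAM. The following Python sketch shows one possible way to read it, assuming that a tab was chosen as the separation character and that the file contains no header row; the function name "read_export" and the field names used here are hypothetical and must match the fields and order selected at export time.

```python
import csv
import io


def read_export(handle, field_names, separator="\t"):
    """Yield one dict per exported record, keyed by the given field names."""
    reader = csv.reader(handle, delimiter=separator)
    for row in reader:
        yield dict(zip(field_names, row))


# Stand-in for an exported file; in practice, open the exported file instead,
# e.g. open("export.tab", newline="") (hypothetical file name).
sample = io.StringIO("GENE1\tchr21\nGENE2\tchrX\n")

rows = list(read_export(sample, ["Gene_name", "Chromosome"]))
print(rows[0]["Gene_name"])  # prints GENE1
```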

Save a Copy as...
Save a copy of the database, complete, compressed or as a clone (database structure with no record present).

6.3 "Edit" Menu                                    
(Back to Index)

Undo
Standard "Undo" command.

Cut
Standard "Cut" text command.

Copy
Standard "Copy" text command.

Paste
Standard "Paste" text command.

Select all
Selection of all text present within a selected field
(to select a field, click into the field).

Find/Replace
Utility for searching/replacing text strings within fields.
Note: Use "Find" mode (from "View" Menu)
      for full search and selection of a record set.

Spelling
Utility to check spelling of text strings within fields.

Export Field Contents...
Utility to export the contents of the selected field to a file.

6.4 "View" Menu                                     
(Back to Index)

Browse Mode
Switch to the "Browse Mode" (see "General Definitions" above).

Find Mode
Switch to the "Find Mode" (see "General Definitions" above).

Preview Mode
Switch to the "Preview Mode" (see "General Definitions" above).

Go to layout
A possible way to switch between different layouts.

View as Form
A possible way to individually display the current record of a found set of records.

View as List
A possible way to display all the records of a found set in the form of a list.

View as Table
A possible way to display all the records of a found set in the form of a spreadsheet-like table.

Toolbars
To switch on/off the toolbars of the application: "Standard"
and "Text Formatting".

Status Area
To switch on/off the "Status Area", the toolbar located at the top of the program window.

Text Ruler
To switch on/off the text ruler of the application.

Zoom in
Used to increase layout dimensions.

Zoom out
Used to decrease layout dimensions.

6.5 "Records" Menu                                 
(Back to Index)

New Record
Create a new empty record in the database.
The new Record will be the latest of the current record set.

Duplicate Record
Duplicate the current record in the database.
The new Record will be the latest of the current record set.

Delete Record...
Delete the current record in the database.

Delete Found Records...
Delete all currently found records in the database.

Go to Record
Move to the selected record by number, previous or next.

Show All Records
Show all the records in the database.

Show Omitted Only
Show all the records in the database not included in the current "found" set.

Omit Record
Remove the selected record out of the current found set, without deleting it.

Omit Multiple...
Remove more than one record, selected by number, from the current found set, without deleting them.

Modify Last Find
Return to the last performed search in order to edit it.

Saved Finds
Save a set of search criteria.

Sort Records...
Sort the current record set according to desired criteria.

Unsort
Display the current record set according to the order of creation of each record.

Replace Field Contents
Replace the value of a field in all Found Records with the value specified in the current record, or by calculation.

Relookup Field Contents...
This command performs a relookup of the value of a field by reading the matched value in a related table (the relationship has been established during database development using a "key" field).

Revert Record...
Restore the value of a field, discarding any change, before clicking out of that field.

6.6 "Scripts" Menu                                  
(Back to Index)

About
This opens the "About" window containing information about the TRAM software.

Guide
The page with the user Guide of the TRAM software (this Guide).

6.7 "Help" Menu                                    
(Back to Index)

Search
Search a "Help" system for the general commands.


TROUBLESHOOTING
(Back to Index)

Sometimes, power failure, hardware problems, or other factors can damage a FileMaker Pro database file.
When the runtime application discovers a damaged file, a dialog box appears, prompting the user to contact the creator.
Even if the dialog box does not appear, files can exhibit erratic behaviour.
If you have FileMaker Pro or FileMaker Pro Advanced installed, you can recover the damaged file using the "Recover" command.
Otherwise, to recover a damaged file:
- On Mac OS X machines, press Command + Option (cmd-alt) while double-clicking the runtime application icon. Hold the keys down until you see the "Open Damaged File" dialog box.
- On Windows machines, press Ctrl+Shift while double-clicking the runtime application icon. Hold the keys down until you see the "Open Damaged File" dialog box.
During the recovery process, the runtime application:
1. Creates a new file;
2. Renames any damaged files by adding "Old" to the end of the
   file name;
3. Gives the repaired file the original name.


TECHNICAL NOTES                                                
(Back to Index)

The minimum software requirements are:
Mac OS X 10.6, OS X Lion 10.7, OS X Mountain Lion 10.8;
Windows XP Professional, Home Edition (Service Pack 3);
Windows Vista Ultimate, Business, Home Premium (Service Pack 2);
Windows 7 Ultimate, Professional, Home Premium;
Windows 8 Standard and Pro edition.

Other specifications may be found here.

A connection to the Internet is required to display the software Guide and to download data for set up, but not to run the tool.

Please do not change the names of any files or folders of the TRAM software.

You may download multiple copies of TRAM and run them simultaneously, provided that each "TRAM" folder is located in a different directory.
Do not move the "TRAM" folder while the software is open.
Run the "TRAM" software from a local hard disk.
Do not run the software from a network drive.

If a TRAM analysis aborts unexpectedly, it is advisable to restart it in a fresh TRAM copy.

The scripts at the core of TRAM software are "FileMaker Pro" scripts.

TRAM 1.3 is composed of a 228 MB database engine ("TRAM.app") and a template ("TRAM.TMA") with 43 data tables, 134 relationships among them and 489 script definitions.
Following a set up that includes NCBI UniGene and UCSC EST localization data, the size of the human "TRAM.TMA" file becomes about 5 GB.

Time required to import and process a typical microarray data file is about 10 minutes.
Typical execution time is 1-2 hours for a "Map" analysis and 5-10 minutes for a "Cluster" analysis, depending on the number of analyzed samples. The number of samples also heavily affects the time required to refresh data when the type of data normalization is changed.

Large file size and relative slowness of data processing are mainly due to systematic indexing of all data contained in TRAM, with the advantage of very fast data browsing, navigation and search at the end of data import and processing, which may be run in batch mode.

We encourage any creative use, modification and non-commercial redistribution of TRAM, as long as the original paper is cited and, if the program has been modified, a statement to that effect is provided.

7.1 Known software limits
(Back to Index)

Due to FileMaker Pro limits:
maximum TRAM file size is 8 terabytes (8,192 gigabytes);
text field can contain up to 1 billion characters;
number field can contain values from 10^-400 up to 10^400.

At present, TRAM requires an unambiguous mapping. Genes common to X and Y chromosomes (e.g., CSF2RA) are now mapped, but only on chromosome X. The double X-Y location remains indicated in the "Location" field of the "Genes" Table.

The limit of 25 chromosomes for a genome is declared only for the possibility to display synthetic maps with all chromosomes shown horizontally aligned; however, it does not apply to the data import, standard visualization mode and all data analysis.


7.2 Bug reports
(Back to Index)
Please report any suggestions, bugs or problems to:
Pierluigi Strippoli
pierluigi.strippoli@unibo.it


ACKNOWLEDGEMENTS                                        
(Back to Index)

Thanks to NCBI for the "Entrez" databases and to UCSC Genome Bioinformatics for the "UCSC Genome Browser".
Thanks to FMPexperts List and FMForum for suggestions and tips about FileMaker Pro.

Portions of this software are Copyright 1984-2012 by FileMaker, Inc. All Rights Reserved. 
http://www.filemaker.com

----------------------------------------------------------------
APPENDIX - PLATFORMS PRE-LOADED IN TRAM_HUMAN

----------------------------------------------------------------

----------------------------------------------------------------
TRAM_HUMAN 2017
----------------------------------------------------------------

The following 34 Platforms (commercially available) that have been used for the analysis of at least 2,000 GEO samples are already loaded as default in the pre set-up versions for human, 2017 (the number of samples available for each Platform has been updated up to November 08, 2017):

HUMAN         (genome-wide expression arrays,
              Platforms with > 2,000 Samples in GEO,
              excluding exon arrays)

01) GPL570    [HG-U133_Plus_2]              (134,861 Samples)
              Affymetrix Human Genome U133 Plus 2.0 Array


02) GPL10558  Illumina HumanHT-12            (67,677 Samples)
              V4.0 expression beadchip

03) GPL96     [HG-U133A]                     (40,122 Samples)
              Affymetrix Human Genome U133A Array

04) GPL6244   [HuGene-1_0-st]                (32,630 Samples)
              Affymetrix Human Gene 1.0 ST Array
              [transcript (gene) version]


05) GPL6947   Illumina HumanHT-12            (23,497 Samples)
              V3.0 expression beadchip


06) GPL6480   Agilent-014850                 (20,184 Samples)
              Whole Human Genome Microarray 4x44K G4112F
              (Probe Name version)


07) GPL571    [HG-U133A_2]                   (15,184 Samples)
              Affymetrix Human Genome U133A 2.0 Array

08) GPL4133   Agilent-014850                 (14,011 Samples)
              Whole Human Genome Microarray 4x44K G4112F
              (Feature Number version)


09) GPL6884   Illumina                        (8,250 Samples)
              HumanWG-6 v3.0 expression beadchip

10) GPL13667  [HG-U219]                       (8,159 Samples)
              Affymetrix Human Genome U219 Array


11) GPL13158  [HT_HG-U133_Plus_PM]            (8,029 Samples)
              Affymetrix HT HG-U133+ PM Array Plate


12) GPL97     [HG-U133B]                      (7,877 Samples)
              Affymetrix Human Genome U133B Array

13) GPL17586  [HTA-2_0]                       (7,279 Samples)
              Affymetrix Human Transcriptome Array 2.0
              [transcript (gene) version]
                           
14) GPL6883   Illumina humanRef-8             (6,879 Samples)
              v3.0 expression beadchip


15) GPL8300   [HG_U95Av2]                     (6,423 Samples)
              Affymetrix Human Genome U95 Version 2 Array

16) GPL3921   [HT_HG-U133A]                   (6,409 Samples)
              Affymetrix HT Human Genome U133A Array


17) GPL14550  Agilent-028004                  (6,160 Samples)
              SurePrint G3 Human GE 8x60K Microarray           
              (Feature Name version)


18) GPL6104   Illumina humanRef-8             (5,885 Samples)
              v2.0 expression beadchip

 

19) GPL11532  [HuGene-1_1-st]                 (5,524 Samples)
              Affymetrix Human Gene 1.1 ST Array
              [transcript (gene) version]


20) GPL16686  [HuGene-2_0-st]                 (4,594 Samples)
              Affymetrix Human Gene 2.0 ST Array
              [transcript (gene) version]


21) GPL4372   Rosetta/Merck                   (4,579 Samples)
              Human 44k 1.1 microarray 16

22) GPL6102   Illumina human-6                (4,290 Samples)
              v2.0 expression beadchip

23) GPL17077  Agilent-039494                  (4,270 Samples)
              SurePrint G3 Human GE v2 8x60K Microarray
              039381 (Probe Name version)


24) GPL15207  [PrimeView]                     (3,983 Samples)
              Affymetrix Human Gene Expression Array         


25) GPL201    [HG-Focus]                      (3,860 Samples)
              Affymetrix Human HG-Focus Target Array

26) GPL10904  Illumina HumanHT-12             (3,672 Samples)
              V4.0 expression beadchip (gene symbol)


27) GPL8432   Illumina HumanRef-8             (3,631 Samples)
              WG-DASL v3.0


28) GPL13497  Agilent-026652                  (3,509 Samples)
              Whole Human Genome Microarray 4x44K v2
              (Probe Name version)


29) GPL14951  Illumina HumanHT-12             (3,503 Samples)
              WG-DASL V4.0 R2 expression beadchip


30) GPL1708   Agilent-012391                  (3,292 Samples)
              Whole Human Genome Oligo Microarray G4112A
              (Feature Number version)
 
              

31) GPL10379  Rosetta/Merck                   (2,859 Samples)
              Human RSTA Custom Affymetrix 2.0 microarray


32) GPL13607  Agilent-028004                  (2,436 Samples)
              SurePrint G3 Human GE 8x60K Microarray
              (Feature Number version)


33) GPL3991   Rosetta/Merck                   (2,396 Samples)
              Human 3.0 A1

34) GPL887    Agilent-012097                  (2,170 Samples)
              Human 1A Microarray (V2) G4110B
              (Feature Number version)


The following further 27 Platforms (Total=61) are also already loaded as default in the pre set-up versions for human, 2017 (the number of samples available for each Platform has been updated up to November 08, 2017). Each of them has been used in at least one of the TRAM analyses published to date (May 2018) and they are listed below in numerical/alphabetical order:

35) GPL80     [Hu6800]                        (1,490 Samples)
              Affymetrix Human Full Length HuGeneFL Array

36) GPL91     [HG_U95A]                       (1,095 Samples)
              Affymetrix Human Genome U95A Array

37) GPL92     [HG_U95B]                       (  513 Samples)
              Affymetrix Human Genome U95B Array

38) GPL93     [HG_U95C]                       (  505 Samples)
              Affymetrix Human Genome U95C Array

39) GPL94     [HG_U95D]                       (  194 Samples)
              Affymetrix Human Genome U95D Array

40) GPL95     [HG_U95E]                       (  193 Samples)
              Affymetrix Human Genome U95E Array

41) GPL1074   GNF1H                             (158 Samples)
              In situ oligonucleotide - non-commercial
              (Platform in atypical format, not parsable by
              TRAM; the annotated file gnf1h.annot2007.tsv
              downloaded from http://biogps.org/downloads/
              has been used instead).


42) GPL1291   Hitachisoft AceGene             (1,089 Samples)
              Human Oligo Chip 30K
              (Chip Version)


43) GPL1352   [U133_X3P]                      (  745 Samples)
              Affymetrix Human X3P Array

44) GPL1426   ABI Human Genome Survey           (146 Samples)
              Microarray Version 1

45) GPL1449   GE Codelink Human Uniset           (92 Samples)
              I, II, and 20K

46) GPL1823   SHBW                               (35 Samples)
              Spotted DNA/cDNA - non commercial

47) GPL1824   SHCN                               (39 Samples)
              Spotted DNA/cDNA - non commercial


48) GPL1825   SHBA                               (30 Samples)
              Spotted DNA/cDNA - non commercial

49) GPL1826   SHDP                               (11 Samples)
              Spotted DNA/cDNA - non commercial

50) GPL1827   SHCE                                (8 Samples)
              Spotted DNA/cDNA - non commercial


51) GPL2006   Human 19K oligo array              (73 Samples)


52) GPL2895   GE Healthcare/Amersham Bio.     (1,484 Samples)
              Expression BeadChip

         
53) GPL2986   ABI                             (1,761 Samples)
              Human Genome Survey Microarray Version 2

         
54) GPL3121   LC-30                              (77 Samples)

55) GPL4685   [U133AAofAv2]                     (865 Samples)
              Affymetrix GeneChip HT-HG_U133A
              Early Access Array

56) GPL4811   Human 17K cDNA-GeneTrack          (123 Samples)

57) GPL6254   Phalanx Human OneArray            (590 Samples)
              Spotted oligonucleotide - commercial


58) GPL6255   Illumina humanRef-8             (1,060 Samples)
              v2.0 expression beadchip


59) GPL6370   Illumina human-6                  (156 Samples)
              v2.0 expression beadchip (extended)
              (Platform in atypical format, only Gene Symbols
              present in "NCBI Gene" have been left in the
              TRAM "Platform" Table).
  


60) GPL7091   Agilent Human oligo 22k A          (16 Samples)
              (Platform in atypical format, PT_ACC and GB_ACC  
              data fields have been manually merged to
              maximize probe annotation).


61) GPL10665  SMD Print_607                      (31 Samples)
              (Platform in atypical format, Gene Symbol and
              Clusterid / UniGene data fields have been
              manually merged to maximize probe annotation).

A-AFFY-33    = GPL96
A-AFFY-37    = GPL571
A-AFFY-44    = GPL570
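
The three equivalences above can be applied programmatically when preparing data. The following Python sketch is only an illustration: it assumes that the "A-AFFY" identifiers are alternative accessions for the same arrays, and the names "ALTERNATIVE_TO_GEO" and "to_geo_platform" are hypothetical.

```python
# Equivalences taken from the list above.
ALTERNATIVE_TO_GEO = {
    "A-AFFY-33": "GPL96",
    "A-AFFY-37": "GPL571",
    "A-AFFY-44": "GPL570",
}


def to_geo_platform(identifier):
    """Return the GEO Platform ID for a listed identifier.

    Identifiers that are not in the list are returned unchanged.
    """
    return ALTERNATIVE_TO_GEO.get(identifier.strip().upper(), identifier)


print(to_geo_platform("A-AFFY-44"))  # prints GPL570
```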

End of File