UniGene Tabulator 1.1.6 Tutorial (Mac OS X version)
Citation:
Lenzi L, Frabetti F, Facchin F, Casadei R, Vitale L, Canaider S, Carinci P, Zannotti M, Strippoli P.
UniGene Tabulator: a full parser for the UniGene format.
Bioinformatics. 2006 Oct 15;22(20):2570-1. Epub 2006 Aug 7.
INTRODUCTION
This online Guide is designed for detailed documentation of UniGene Tabulator 1.0 software.
It teaches how to install the software and how to import the desired UniGene entries into the database.
Please refer to the user Guide for more detailed information.
Download UniGene_Tabulator 1.1.6 for Mac OS X from:
http://apollo11.isto.unibo.it/software/
Choose the file: UniGene_Tabulator_Mac.zip
The downloaded file should be decompressed automatically, generating a "UniGene Tabulator" folder.
If it is not, double-click the file to start the default decompression utility of your system.
The "UniGene Tabulator" folder contains:
the "UniGene Tabulator file" (runtime application);
the "UniGene.UGT" file (database file);
the "FMP Acknowledgements.pdf" file;
the "Extensions" folder, containing a "Dictionaries" folder with the dictionary files for supported languages;
the "MacOS_Tutorial" and "MacOS_Guide" folders, containing a copy of the on-line documentation for local (off-line) use.
Please do not rename any of the files or folders of the "UniGene Tabulator" software.
You may download multiple
copies
of "UniGene Tabulator"
and run them simultaneously,
provided that each "UniGene Tabulator" folder
is located
in a different directory.
UniGene Tabulator is based on the FileMaker Pro 12 (FileMaker Pro, Inc.) database management software (http://www.filemaker.com/).
It is released as a FileMaker Pro 12 template, along with a free runtime application able to run the "FileMaker Pro" engine at the core of the software.
INSTALLATION
Once decompressed,
UniGene Tabulator can
readily be used.
GENERAL
DEFINITIONS
File
A set of database tables.
Table
A set of records pertaining to the same subject.
Record
One set of fields which
constitute one
entry.
The record browser is a small book icon at the top left of the window.
You may browse the database by clicking on the book pages, or by entering a record number and pressing the "Return" key.
The following information is always displayed:
Records: total number of records in the table
Found: total number of records currently selected
Sorted: sorting status of the records (Sorted/Unsorted)
Field
One area of the record
containing a specific
data type
Browse
Mode
One way to use the database.
It allows data entry,
viewing,
browsing,
sorting, manipulation.
It may be selected from:
the "View"
menu, or
the mode
pop-up Menu bar, at the bottom left of the window.
Find
Mode
An alternative mode to use
the
database.
It allows searching for specific content in the database fields, using any combination of criteria
(see the "Search mode" section below for details about searching).
It may be selected from:
the
"View" menu, or
the
mode pop-up Menu bar, at the bottom left
of the window.
Preview
Mode
An alternative way to use
the
database.
It visualizes a print
preview of
the found
records.
It may be selected from:
the
"View" menu,
or
the pop-up Menu
bar, at the bottom left of the window.
Layout
A particular graphical organization of the fields of a table.
A file may show data within
different layouts.
A layout may display fields
from
a table or
its
related fields from other tables.
Visualization of a field is
independent from
the storage of the contained data.
USE
1. Download UniGene flat files
Download the UniGene file with the ".data.gz" extension for the desired organism via FTP at:
ftp://ftp.ncbi.nih.gov/repository/UniGene/
(decompress the files when appropriate).
Download the corresponding
library information file with the format
".lib.info.gz".
At the end of this step, the user should have two text files, one containing cluster data and one containing library information, both ready to be imported into the database.
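If the downloaded ".gz" files are not decompressed automatically, they can also be decompressed with a short script. The following is a minimal Python sketch, not part of UniGene Tabulator; the function name decompress_gz and the file names in the usage note are illustrative assumptions.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(gz_path):
    """Decompress a .gz file next to the original and return the new path.

    Hypothetical helper for illustration; it is not shipped with the software.
    """
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got: {gz_path.name}")
    # Strip only the trailing ".gz", e.g. "Xx.data.gz" becomes "Xx.data"
    out_path = gz_path.with_suffix("")
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        # Copy in chunks so that large cluster files do not need to fit in memory
        shutil.copyfileobj(src, dst)
    return out_path
```

For example, calling decompress_gz("Xx.data.gz") and decompress_gz("Xx.lib.info.gz"), where "Xx" stands for the organism prefix of the files you downloaded, would produce the two text files expected by the import step.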
The UniGene page containing the FTP "UniGene" download link may also be reached from within the software using the "Download UniGene data" button.
This opens the default browser on a page containing the “Download UniGene” link in the blue bar on the left side.
Should you be asked for a user “Name” and “Password”, type “anonymous” and your e-mail address, respectively.
2.
Import UniGene clusters and/or
library information
Different
UniGene Tabulator databases may be obtained by duplicating
the fresh
"UniGene Tabulator" folder and starting new import sessions.
Records from different database tables may then be exchanged
among
different .UGT databases.
IMPORTANT.
Do not import the same text file more than once into a UniGene Tabulator database; download or decompress the files again if you need to repeat the import.
The ".tab" text files provided along with the distribution are only illustrative outputs from the program, and are not intended to be reimported into UniGene Tabulator, which is designed to import and parse the original UniGene format data files.
Open the "UniGene tabulator" file in the "UniGene Tabulator" folder.
Click on the "Import
UniGene"
button.
This starts both importing and parsing of the data.
Select options from the dialog boxes when required.
First, you may choose whether you want to import Library information too.
If you choose “Yes”, you will be asked to select the “.lib.info” file and then the “.data” file;
if you choose “No”, you will be asked to select only the “.data” file.
You can import library information later by clicking on the “Import Library” button.
The time required to obtain
a
completely parsed UniGene
database mainly
depends on the total cluster number and on the total number of
GenBank sequences composing the clusters. Complete parsing for
large
data files
may require up to several days of
calculation.
Precomputed databases for Homo sapiens and Danio rerio are
provided at:
http://apollo11.isto.unibo.it/software/UniGene_Tabulator
Since UniGene Tabulator version 1.1: after parsing the UniGene data files, the software creates the UniGene.tab file within the 'UniGene Tabulator' folder.
Layout appearance may be adjusted using the "Zoom +"/"Zoom -" buttons, or by clicking on the small resizing buttons at the bottom left corner of any window.
Each field
in each table corresponds to a data type
typical of the UniGene Format.
3. Use UniGene Tabulator as a database
The FileMaker Pro-based database may be used in three "modes": "Browse", "Find" and "Preview".
You can switch among modes from the "View" Menu or from the pop-up Menu bar at the bottom left of the window.
BROWSE
MODE
(“NAVIGATION”)
In the "Browse" mode,
one can browse among
the
record sets by
clicking on the small book icon in the upper
left
corner:
Alternatively, you can move
up
and down among the entries
using buttons
at the top left of the UniGene layout:
You can browse among the tables by clicking on the “Table” button in the desired section (Sequence, Protein similarity, STS, Transcript Map).
Alternatively, you can move up and down among the tables by clicking on the "Layout" pop-up Menu at the upper left corner.
SEARCH MODE
(“FIND”)
In the "Find" mode,
the
small
book
icon in the upper left corner
represents different
"requests"
that
are made for searching the database.
In the "Find" mode, the user can fill in a blank form to search in specific fields.
When searching in the master "UniGene" table, if one entry contains several occurrences of a feature, all related records of that feature are displayed.
In FileMaker Pro "Find" mode, the "AND" - "OR" - "NOT" operators may be used in this way:
"AND" by filling in different fields located in the same "Request",
"OR" by generating additional requests (from the "Requests" Menu) in the same query,
"NOT" by generating additional requests (from the "Requests" Menu) and checking the "Omit" box.
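The way requests combine can be illustrated with a simplified model. The Python sketch below is not FileMaker Pro's implementation: it assumes case-insensitive substring matching and represents each request as a (criteria, omit) pair, both of which are simplifications chosen for illustration. The field names GENE and CHRO are taken from the "SEQUENCE" table; the record values are invented.

```python
def matches(record, criteria):
    """AND: every field filled in within one request must match."""
    return all(
        str(value).lower() in str(record.get(field, "")).lower()
        for field, value in criteria.items()
    )


def find(records, requests):
    """Apply a list of (criteria, omit) requests in order.

    Requests without "Omit" add their matches to the found set (OR);
    requests with "Omit" remove their matches from it (NOT).
    """
    found = []
    for criteria, omit in requests:
        if omit:
            found = [r for r in found if not matches(r, criteria)]
        else:
            found += [r for r in records
                      if matches(r, criteria) and r not in found]
    return found


# Invented sample records, for illustration only
records = [
    {"GENE": "TP53", "CHRO": "17"},
    {"GENE": "BRCA1", "CHRO": "17"},
    {"GENE": "MYC", "CHRO": "8"},
]

# (CHRO contains 17) OR (GENE contains MYC), NOT (GENE contains BRCA1)
query = [({"CHRO": "17"}, False), ({"GENE": "MYC"}, False), ({"GENE": "BRCA1"}, True)]
found_genes = [r["GENE"] for r in find(records, query)]
```

In this model a single request with two fields filled in, such as ({"GENE": "TP53", "CHRO": "17"}, False), behaves as an "AND" of the two criteria.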
The "Symbols" pop-up Menu in
the
"Find" mode allows query of
exact matches, ranges,
duplicates,
wildcards and more.
The search results are the subsets of entries matching the desired criteria.
PREVIEW
MODE
(“PRINT”)
In the "Preview" mode, one can obtain a print preview of the data in the current table.
Browsing among
the tables can be done by clicking on the “Table” buttons
in the desired
section
(Sequence,
Protein similarity, STS, Transcript
Map).
Alternatively, you can move up and down among
the
tables by clicking
on the "Layout" pop-up Menu at the
upper left corner.
“UniGene
Tabulator”
FUNCTIONS AND MENU COMMANDS
UniGene Tabulator
MENU
About
FileMaker Pro RUNTIMES
Shows information about the software in a new window.
Preferences
Standard preferences panel; memory can be set to up to 256 MB.
Quit
UniGene Tabulator
Close the program (same as clicking on the red button at the upper left corner of the UniGene window).
FILE MENU
File
Options
In this application it
is possible to set only the "Spelling" options.
Change Password
There is no default password.
Page
setup
Standard page set up
command.
Print
Standard print command; you
can
choose to
print:
all records in the "Found"
set, or
only the current record, or
a "blank" mask of the record
fields.
The appearance will be that
of
the layout
currently selected from the
layout
Menu.
Import
Records
This is the general "Import" function of FileMaker Pro.
For correct UniGene file import, use only the "Import UniGene" function, either from the "Actions" Menu or by clicking on the "Import UniGene" button in the "UniGene tabulator" file.
Export
records
Export command for the found
records set in a given table.
Records are exported in their current sorting.
Users can select fields to
be
exported, their relative order,
and the separation character.
The option “ALL” will
export all
fields (from
all tables)
into a Unicode UTF-16 file (default parameters).
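Because the exported file is encoded as UTF-16, programs that assume UTF-8 or ASCII may show it as garbled text. The following minimal Python sketch reads such an export; it assumes that the tab character was chosen as the separation character and that the file starts with a byte order mark. The function name read_export is hypothetical.

```python
import csv


def read_export(path, delimiter="\t"):
    """Read a UTF-16 export into a list of rows (each row is a list of strings).

    Assumptions for illustration: the separation character is a tab and the
    file begins with a byte order mark, so that Python can detect endianness.
    """
    # newline="" lets the csv module handle "\r", "\n" and "\r\n" line endings
    with open(path, encoding="utf-16", newline="") as handle:
        return list(csv.reader(handle, delimiter=delimiter))
```

If a different separation character was selected in the export dialog, pass it as the delimiter argument.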
Save/Send
Record
as
An alternative export function. It exports data from the current record, or from the found set of records, into an "Excel" file (.xls).
Send Mail
To send data from each record in
the found set by single e-mail.
Save
a Copy as
Save a copy of the database,
complete,
compressed or
as clone (database structure
with
no record
present).
EDIT
MENU
Undo
Standard "Undo" command.
Cut
Standard "Cut" text command.
Copy
Standard "Copy" text
command.
Paste
Standard "Paste" text
command.
Clear
Deletion of selected text.
Select
all
Selection of all the text
within
a selected
field
(to select a field, click
into
the field).
Find/Replace
Utility for search/replace
text
strings within
fields.
Note: Use "Find" mode (from
"View" Menu)
for full search and selection of a
record set.
Spelling
Utility for checking the spelling of text strings within fields.
Export
Field Contents
Utility to export the
contents of
the selected field to a file.
VIEW
MENU
Browse
Mode
Switch to the "Browse Mode"
(see
"General Definitions" above).
Find
Mode
Switch to the "Find Mode"
(see
"General Definitions" above).
Preview
Mode
Switch to the "Preview Mode"
(see
"General Definitions" above).
View as Form
A possible way to display
individually the current record of a
found set of records.
Go to Layout
A possible way to switch between different layouts:
UniGene, SEQUENCES, STS, TXMAP, PROTSIM.
View
as
List
A possible way to display
all the
records of a found set as list.
View
as Table
A possible way to display
all the
records of a found set as
spreadsheet-like table.
Toolbars
To switch on/off the
toolbars of
the application: "Standard"
and "Text Formatting".
Status
Area
To switch on/off the "Status
Area", the left column toolbar.
Text
Ruler
To switch on/off the text
ruler
of the application.
Zoom
in
To increase layout
dimensions,
same as "Zoom +" button.
Zoom
out
To decrease layout
dimensions,
same as "Zoom -" button.
RECORDS
MENU
New
Record
Create a new empty record in
the
database.
The new record will be the last of the current record set.
Duplicate
Record
Duplicate the current record
in
the database.
The new record will be the last of the current record set.
Delete
Record
Delete the current record in
the
database.
Delete
All Records
Delete all currently found
records in the
database.
Go
to Record
To move to the selected
record by
number, previous or next.
Show
All Records
Show all the records in the
database.
Show
Omitted Only
Show records in the
database outside of the current found set.
Omit
Record
Remove the selected record
out of
the current found set,
without deleting it.
Omit
Multiple
Remove more than one record, selected by number, from the current found set, without deleting them.
Modify
Last Find
Return to the last performed
search to edit it.
Sort
Records
Sort the current records set
according to
desired criteria.
Unsort
Restore the current record set to the order in which the records were inserted.
Replace
Field Contents
Replace the value of a field in all records of the found set with the value specified in the current record, or with the result of a calculation.
Relookup Field Contents
Look up the value of a field again, by matching on a selected key field.
Revert Record
Restore the value of a field, discarding any changes, before clicking out of that field.
ACTIONS
MENU
Import
UniGene
data
Import data from a file in
UniGene
flat file format (.data).
(equivalent to the
"Import UniGene" button in the software
main
window).
Since UniGene Tabulator version 1.1: after parsing the UniGene data files, the software creates the UniGene.tab file within the 'UniGene Tabulator' folder.
This file is a tab-delimited text file and contains four columns:
NACC [GenBank Accession number]
CLUSTER [UniGene Cluster ID]
GENE [Gene Symbol]
NUID [GenBank GI]
See also:
http://www.ncbi.nlm.nih.gov/Sitemap/sequenceIDs.html
This file can be used to readily convert GenBank Accession numbers into Gene Symbols, for meta-analysis purposes.
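The conversion can be sketched in a few lines of Python. This is a minimal example under stated assumptions, not a tool shipped with the software: it assumes the four columns appear in the order listed above, separated by tabs; it skips a header row if one is present (whether the file has one is not stated here); and the function name load_accession_to_gene is hypothetical. The text encoding of UniGene.tab is also not stated here, so the function takes an already opened text handle and leaves that choice to the caller.

```python
import csv

# Column order as described for UniGene.tab
COLUMNS = ("NACC", "CLUSTER", "GENE", "NUID")


def load_accession_to_gene(handle):
    """Build a {GenBank accession: gene symbol} dictionary from an open handle."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(COLUMNS):
            continue  # skip blank or incomplete lines
        record = dict(zip(COLUMNS, row))
        if record["NACC"] == "NACC":
            continue  # skip a header row, if the file has one
        if record["GENE"]:
            # Sequences in clusters without a gene symbol are left out
            mapping[record["NACC"]] = record["GENE"]
    return mapping
```

A list of accession numbers can then be translated with mapping.get(accession), which returns None for accessions that are absent from the file or have no gene symbol.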
Import
Library info
Import
data
from a file in “.lib.info”
flat file format.
(equivalent to the
"Import Library"
button in the software main window).
Export
data from SEQUENCE
This command is also
available as a button named "Export Sequence"
on the
main layout:
This action will export
data clustered by ACCESSION
NUMBER information,
from the current set of found sequence records
(from all sequence records if no record subset is currently
found).
This action offers two options:
1. All – Each GENBANK ACCESSION NUMBER will be exported along with all the related information in tabulated form
(i.e. all fields present in the "SEQUENCE" table, in this order:
CLUSTER, TITLE, GENE, CYTOBAND, GeneID_LID, HOMOL, EXPRESS, RESTR_EXPR, POLY_A, CHRO, SCOUNT, NACC, CLON, END, NUID, LIBR, PUID, MGC, SEQTYPE, TRACE, PERIPHERAL, TISSUE, DEV_STAGE, CANCER_SOURCE, VERBATIM_TISSUE, VERBATIM_DEVELOPMENTAL_STAGE);
2. Custom – The user can choose the fields to be exported
(i.e. only selected fields among those described above).
The user must choose the name and location of the output file.
The same action starts when clicking on the "Export Sequence" button in the main layout.
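When post-processing a file exported with the "All" option, it can help to label each column with its field name. The Python sketch below does this using the field order listed above; it assumes the rows have already been read into lists of strings, and the helper names label_rows and select_fields are hypothetical.

```python
# Field order of the "All" export, as listed in this tutorial
SEQUENCE_FIELDS = [
    "CLUSTER", "TITLE", "GENE", "CYTOBAND", "GeneID_LID", "HOMOL",
    "EXPRESS", "RESTR_EXPR", "POLY_A", "CHRO", "SCOUNT", "NACC", "CLON",
    "END", "NUID", "LIBR", "PUID", "MGC", "SEQTYPE", "TRACE", "PERIPHERAL",
    "TISSUE", "DEV_STAGE", "CANCER_SOURCE", "VERBATIM_TISSUE",
    "VERBATIM_DEVELOPMENTAL_STAGE",
]


def label_rows(rows):
    """Turn each exported row (a list of strings) into a {field: value} dict."""
    labelled = []
    for row in rows:
        # Pad short rows so that trailing empty fields are still present
        padded = list(row) + [""] * (len(SEQUENCE_FIELDS) - len(row))
        labelled.append(dict(zip(SEQUENCE_FIELDS, padded)))
    return labelled


def select_fields(records, wanted):
    """Keep only the chosen fields, similar in spirit to the "Custom" option."""
    unknown = [name for name in wanted if name not in SEQUENCE_FIELDS]
    if unknown:
        raise KeyError(f"unknown field(s): {unknown}")
    return [{name: record[name] for name in wanted} for record in records]
```

For instance, select_fields(label_rows(rows), ["NACC", "GENE", "TISSUE"]) would reduce each record to its accession number, gene symbol and tissue.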
Export from other data tables may be easily performed by choosing the layout of interest, then using the general "Export Records..." command in the "File" Menu.
Erase
Data
Two possibilities are shown:
1 "Delete raw data": delete
only
original raw
data about library information.
It may be useful to "clean" the database following
parsing.
Use this
option to reduce the file size.
2 "Delete ALL data": delete
all
data in the database tables,
including original flat file raw data and parsed
data.
HELP MENU
Info about
UniGene Tabulator
This command shows
information
about the
software in a
new window.
UniGene
Tabulator
Help
This command
shows
this tutorial
about
the software in a new
window.
OTHER FUNCTIONS
IN THE MAIN LAYOUT
The mouse pointer is shown as a hand over the buttons.
Clicking on the word “Cluster” in the title will open the actual UniGene record for the current cluster, in the default browser.
Clicking on the arrow to the right of the “PROTGI” field in the "Protein similarity" section will open the corresponding record of the Entrez “Protein” database, in the default browser.
Clicking on the tag of the “GeneID/LID” field will open the corresponding record of the Entrez “Gene” database, in the default browser.
---
PROBLEMS
Sometimes, power failure,
hardware problems, or
other factors can damage a
FileMaker database file.
When the runtime application
discovers a
damaged file, a dialog box appears,
telling the users to contact the
creator.
Even
if the dialog box does not appear, files can exhibit erratic
behavior.
If you have FileMaker Pro or Developer installed, you can recover the file using the “Recover” command.
Otherwise, on Mac OS X machines, press Option + Command while double-clicking the runtime application icon.
Hold the keys down until you see the Open Damaged File dialog box.
During the recovery process,
the
runtime
application:
1. creates a new file;
2. renames any damaged file
by
adding “Old” to
the end of the filename;
3. gives the repaired file
the
original name.
--
Technical notes
The scripts at the core of
UniGene Tabulator
software are "FileMaker Pro" scripts,
which in part also invoke
"AppleScript" language commands.
Bug reports
Please report any suggestion, bug or problem to:
pierluigi.strippoli@unibo.it