
Spectrum Research, LLC.
CONTRAST
Connectivity Tracing
Assignment Tools for Automated Assignment of Protein NMR Data
User Guide
Version 2.0
Copyright Notice
Copyright © 1996 through 2001 Spectrum Research,
LLC. All rights reserved.
No part of this document may be reproduced,
transmitted, transcribed, stored in a retrieval system, or translated into any
language in any form by any means without the written permission of Spectrum
Research, LLC. Spectrum Research, LLC.
reserves the right to change the information in this document without prior
notice.
Trademarks
Contrast is a trademark of Spectrum Research, LLC.
Acknowledgments
The Contrast software program was developed by Drs. John Markley and John Olson at the
National Magnetic Resonance Facility located at the University of
Wisconsin-Madison. All rights, title, and interest in Contrast are owned by
the Wisconsin Alumni Research Foundation ("WARF"). The commercial version of
Contrast has been exclusively licensed to Spectrum Research, LLC by WARF.
Credits
If the results (figures and/or data) obtained with the Contrast™ application are
used for publication purposes, please refer to them in the following manner or
any other equivalent form:
"Contrast™ software, developed by
Spectrum Research, LLC., was used to compute the results in this
publication."
Chapter 1
CONTRAST is a non-graphical software tool for
automating NMR peak assignment. The program works with
NMR data in the form of ASCII lists of peak coordinates and intensities. The
program provides the user with several versatile tools for manipulating peak
lists in order to design a custom strategy. The program can itself generate
customizable procedures for automatic assignment of NMR data. It should be
possible to use CONTRAST and the strategies it was designed to employ for
working with any type of multidimensional NMR spectral data set (although not
all combinations of NMR spectra are likely to yield complete assignments).
The CONTRAST program was designed to be an in-house
research tool and not a commercial package. We have successfully applied the
program to many real and synthesized NMR data sets, but we are always careful
to check all results. We provide no warranty or guarantee of its performance.
Use the program at your own risk.
Software Licensing and Installation
2.1 How to Obtain the Program
The CONTRAST executable can be downloaded from the
Spectrum Research website (www.specres.com/download.asp) or a demo CD can be
requested from Spectrum Research.
2.2 Installation
The CONTRAST executable, contrast.exe, needs no
special installation. We recommend that the executable and help files (or
corresponding symbolic links) be placed in the directory that contains the
spectral data to be assigned.
If you have obtained source code for CONTRAST, the
file "contrast.c" contains all of the functions and header
information necessary to compile CONTRAST. The program was written on a Silicon
Graphics Indigo workstation, but since all but a few minor functions are
implemented using ANSI C, the program can be ported easily to other platforms
by changing the system calls that are specific for the Silicon Graphics
platform. To compile the program, copy contrast.c to the target directory and
type:
cc -o contrast -g contrast.c -lm
at the operating system prompt. The ASCII text file,
contrast.hlp, is a crude manual for the CONTRAST program. The manual is
designed so that it can be easily searched while running CONTRAST with the
CONTRAST "page" function, which is called by typing
"ctrl-h" at a prompt or "h" at the command line. The
contrast.hlp file should be located in the same directory as the CONTRAST
executable in order to use this feature.
Getting Started
This section introduces loading spectrum files,
searching spectra, displaying the results of a search, writing the results of a
search to a file, and quitting the CONTRAST program. A simple example is given
to illustrate each point, and the use of both the command line interface and
macro files is described. The following CONTRAST commands will be described.
lf cosy.con
scan cosy (d1 <.5> 8.0 && d2 > 4.0) |results
d
btf |results > search.cosy.con
q
To run CONTRAST simply type the name of the CONTRAST
executable at the system prompt (e.g. contrast.exe). The computer's display
will be cleared, and after several lines of copyright information you will be
asked for the name of the log (starting macro) file that you wish to run. If
you want to run a session macro, then type its file name at the prompt. If your
log file name is "usr.log" (the standard session log file name),
simply press Return at the prompt. The text that appears in the angle brackets in
a CONTRAST prompt is always the default value for the prompt. If you do not
already have a session macro, type a new file name at the prompt. It is
customary to use the suffix ".log" for session macros and ".mac"
for subroutine or branching macros. After the name of the log file is typed in,
the user is prompted by a '>' symbol for the next command.
The LoadFile command (abbreviated lf) is used to
load peak list files into CONTRAST. CONTRAST peak list files are typically
named after the experiment with the '.con' suffix appended, but
they can have any name. They must, however, adhere to the format outlined in
Section @@. The LoadFile command can
also be used to load the sequence of the protein, since the formats of the
files are similar. The following line loads the file cosy.con into the program:
> lf cosy.con
The Scan command (abbreviated sc) is used to search peak lists. It is an extremely
versatile command and will be described in more detail in section @@. In order
to search for peaks in the COSY spectrum read into the program, the user could
type a command similar to the following:
> sc cosy (d1 <.5> 8.0 && d2 > 4.0) |results
In this example the COSY peak list is searched for
peaks in which the first dimension of each peak (d1) is within a tolerance of 0.5
units (<.5>) from 8.0 and (&&) the second dimension of each peak
(d2) is greater than (>) 4.0. The results of the search are placed in a
buffer called |results. The units of the tolerances and peak coordinates are
dependent on the units used in the input files. Since the coordinates are
typically expressed in terms of parts per million (PPM), we will assume that
input files use PPM in the rest of the manual.
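The search expression above can be read as an ordinary Boolean filter over a peak list. The following Python sketch models only the logic of that one expression; it is not CONTRAST code. The Peak class, the function names, and the sample peak values are hypothetical, and treating the tolerance bounds as inclusive is an assumption, since the text does not say how the edges are handled.

```python
from dataclasses import dataclass

@dataclass
class Peak:
    coords: tuple      # (d1, d2, ...) in ppm
    intensity: float

def within(value, tolerance, target):
    """Model of the test `value <tolerance> target`.

    Assumption: the bounds are inclusive.
    """
    return abs(value - target) <= tolerance

def scan_example(peaks):
    """Model of the Boolean (d1 <.5> 8.0 && d2 > 4.0)."""
    return [p for p in peaks
            if within(p.coords[0], 0.5, 8.0) and p.coords[1] > 4.0]

# Hypothetical COSY peaks, used only to exercise the filter.
cosy = [
    Peak((8.2, 4.5), 1.0e5),   # passes both tests
    Peak((8.7, 4.5), 2.0e5),   # d1 is 0.7 ppm from 8.0: outside the tolerance
    Peak((7.9, 3.9), 1.5e5),   # d2 is not greater than 4.0
    Peak((7.6, 4.1), 0.8e5),   # passes both tests
]
results = scan_example(cosy)   # plays the role of the |results buffer
```

Here the first and last peaks are kept, because both tests must hold for a peak to be placed in the result.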
The display command (abbreviated 'd') is used to
examine the contents of CONTRAST buffers. When a search is performed using the Scan command or one of several other
related commands, the results of the search are placed in a named buffer which
is added to the end of a master list of buffers. The buffers persist until the
user deletes them or quits the program. Associated with each buffer is a number
and the search Boolean that was used to create the buffer. Upon typing 'd' at
the CONTRAST command line, the program enters a crude 'display' mode that has a
unique set of subcommands for changing the way the buffers are displayed. These
subcommands are executed as each character is typed. To exit display mode type
'q' at the display command line prompt. Section @@ gives more information on
the different subcommands available within the display mode.
The buffertofile command (abbreviated 'btf') is used
to write the contents of a particular buffer to a file. In the following
example:
> btf |results >search.cosy.con
the |results buffer is written to the file,
search.cosy.con.
There are two pathways for exiting CONTRAST. The
quit command (abbreviated 'q') can be used to exit CONTRAST from the command
line. If CONTRAST is not at the command line, the program can be exited by
typing Ctrl-C to interrupt the action of the program followed by 'x' at the new
prompt. Typing 'q' at this new prompt causes the program to resume the action
that was interrupted by the Ctrl-C command.
Most of the commands that can be executed at the
CONTRAST command line can also be executed from a CONTRAST macro. For our
purposes a macro is an ASCII file that contains CONTRAST commands. When a macro
is executed, CONTRAST interprets each non-whitespace line as if it were typed
at the CONTRAST command line. Each line is executed serially until a quit
command is reached, until the macro branches to another macro, or until the end
of the file is reached. If the end of the file is reached the program returns
to the CONTRAST command line and waits for user input. All text in a macro
between two consecutive asterisks (**) and the next end-of-line marker is
considered to be a comment and is ignored by the program.
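The macro rules just described (comments removed, whitespace-only lines skipped, execution stopping at a quit command) can be sketched in Python as follows. This is an illustration of the described behavior, not the CONTRAST implementation. The function name is hypothetical, the file noesy.con is a made-up example, branching to another macro is not modeled, and recognizing the quit command as a line whose first word is 'q' or 'quit' is an assumption.

```python
def executable_lines(macro_text):
    """Yield the commands a macro would run, in order.

    Assumptions: '**' starts a comment wherever it appears on a line,
    and the quit command is the word 'q' or 'quit' at the start of a line.
    """
    for raw in macro_text.splitlines():
        line = raw.split("**", 1)[0].strip()   # drop the comment, if any
        if not line:
            continue                           # whitespace-only line
        yield line
        if line.split()[0] in ("q", "quit"):
            break                              # nothing after quit is run

macro = """\
** search the COSY peak list
lf cosy.con

scan cosy (d1 <.5> 8.0 && d2 > 4.0) |results   ** amide region
btf |results > search.cosy.con
q
lf noesy.con   ** never reached: follows the quit command
"""
commands = list(executable_lines(macro))
```

With this input, commands holds the lf, scan, btf, and q lines with their comments removed, and the final lf line is never reached.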
The five commands just described can be typed into a
file using a text editor and run as a CONTRAST macro. CONTRAST macros can be
run in many different ways. Macro files can be specified at the UNIX command
line when the program is started using the '<' sign to redirect input into
the program as follows:
CONTRAST <user.macro
Alternatively, the name of the macro can be specified
at the initial prompt by typing the name of the macro file and pressing Enter.
Macros can be launched from within other macros or from the CONTRAST command
line using the execute command (abbreviated exe).
> exe user.macro
In this case control is transferred to user.macro
until the end of the file is reached at which time control will be returned to
the calling macro or initial command line. If the macro is terminated with a
quit command, however, the CONTRAST program will be exited without returning to
the calling procedure. The branch command can be used instead of the exe
command in order to fully transfer control to the called macro.
> branch user.macro
Input File Formats
CONTRAST input files use a free format in which
blank lines are ignored and white space (any number and combination of spaces
and/or tabs) is used to delimit fields. Comments can be inserted anywhere in an
input file by prefacing the comment with double asterisks (**). All text
following the double asterisks (up to the end of the line on which they appear)
is considered to be part of the comment and is effectively ignored by CONTRAST.
Most CONTRAST input files are either a form of a spectrum file or a macro file.
In the next release of CONTRAST the user will be given the option of reading in
spectrum files in a macro format, but an understanding of the spectrum file
format is currently essential to using CONTRAST effectively.
A CONTRAST spectrum file consists of a header
followed by a peak list. The header of a spectrum file should contain
information about the spectrum. Since most of this information is the same for
all instances of a particular type of spectrum, it is usually safer to copy and
modify an existing header from a similar spectrum than to write a header from
scratch. When copying a header from the spectrum file of the same kind of
experiment it is usually only necessary to modify the number of peaks, the
tolerances, and the comments. The fields in a spectrum file must appear in the
given order. Although comments and blank lines can appear anywhere in a
spectrum file it is a good practice to settle upon and stick to a style in
order to maximize readability and to minimize the possibility of making
mistakes. As long as fields appear in the correct order, it does not matter whether
they are arranged on different lines, all placed on the same
line, or some combination of the two arrangements. As all combinations have not
been rigorously tested, however, we recommend that a format similar to the one
shown below be used. Bold print is used to show essential information which
must be included in a spectrum file, normal print is used to show optional
information, and italics is used to show those elements of optional fields that
are even more optional. The following is the file format for an n-dimensional
spectrum (with as many as C correlations) that contains i peaks.
4.2 Spectrum File Format
name
n i (qual)
comment = numCom
d1lab d1atm d1tol d1cor1 (prob1) d1cor2 (prob2) d1corC (probC)
d2lab d2atm d2tol d2cor1 (prob1) d2cor2 (prob2) d2corC (probC)
dnlab dnatm dntol dncor1 (prob1) dncor2 (prob2) dncorC (probC)
** comments
** comments
p1coord1 p1coord2 p1coord3 p1ntens
* p1comment
p2coord1 p2coord2 p2coord3 p2ntens
* p2comment
picoord1 picoord2 picoord3 pintens
* picomment
name    The name of the spectrum. The name of a CONTRAST spectrum file is generally the
spectrum name with the '.con' suffix appended to it.
n    The dimensionality of the spectrum.
i    The number of peaks in the spectrum.
(qual)    An estimate of the quality of the spectrum, expressed as a probability. A
qual factor of 1.0 indicates that 100% of the expected peaks will be present in the
spectrum and that very little noise (false peaks) is present. A qual factor of 0.9
indicates that 90% of the expected peaks are present.
comment =    Text that indicates that the next field (numCom) is the number of characters the
program should allocate for the comment associated with each peak. 'ment =' is
italicized to indicate that only 'com' is needed to signal that the next field is
numCom.
numCom    The number of characters that the program should allocate for the comment
associated with each peak.
d#lab    The label of the #'th dimension of the peaks in the spectrum.
d#atm    The resonance code (also called atom code) describing all of the atoms of the #'th
dimension of the peaks in the spectrum. Since some dimensions of a spectrum
often detect several different resonances, wild cards are frequently used in this
field. A description of resonance codes is found in section @.@.
d#tol    The default tolerance of the #'th dimension of the peaks in a spectrum. A tolerance
is one-half of the resolution of that dimension.
d#cor##    The resonance code (also called atom code) of the #'th dimension of the ##'th
correlation in the spectrum. Correlations describe the types of peaks that one
would expect to see in a spectrum. An HNCA spectrum, for example, contains an
Hni,Nai,Cai correlation (amide proton, amide nitrogen, alpha carbon) and an
Hni,Nai,Ca- correlation (amide proton, amide nitrogen, alpha carbon from
previous residue). The last resonance code for a given dimension will be repeated
if previous or subsequent dimensions contain more resonance codes. A description
of resonance codes is found in section @.@.
(prob##)    The estimated probability of seeing the previous correlation in the spectrum.
Note that only the last probability listed in a vertical column will be used to describe the
##'th correlation. Other probabilities are used only to make the file more readable.
**    Comment markers. Comment markers indicate that the text that follows on that
line is a comment and should be ignored by the program. Users are encouraged to
use comments to document the origin of the spectrum files and each modification
that the files undergo. Most CONTRAST functions that modify a spectrum or
spectrum file will append a comment to the file that tells what was done to the file
and the date it was done.
comments    Any text that the user wants to include in the file.
p##coord#    The #'th coordinate (frequency dimension) of the ##'th peak in the spectrum
(usually in ppm units).
p##ntens    The intensity of the ##'th peak in the spectrum.
*    A special peak comment marker that causes the program to read in the comment
and associate it with the peak that the comment follows. The 'comment = numCom'
line described above is used to specify the maximum number of
characters that can be stored in each peak comment.
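As an illustration of how the peak list portion of a spectrum file might be read, the Python sketch below handles only the lines that follow the header. It is not the CONTRAST parser. The function name and the sample values are hypothetical. It assumes that a peak comment runs from the '*' marker to the end of its line, that a comment longer than numCom characters is truncated, and that '**' comments are removed before anything else is examined.

```python
def read_peaks(text, n_dims, num_com):
    """Read peaks from the part of a spectrum file that follows the header.

    Each peak is n_dims coordinates followed by one intensity. Fields are
    whitespace delimited and may be spread over any number of lines.
    Assumption: peak comments longer than num_com characters are truncated.
    """
    peaks = []      # each peak: {"coords": [...], "intensity": x, "comment": s}
    pending = []    # numeric fields not yet assembled into a peak
    for raw in text.splitlines():
        body = raw.split("**", 1)[0]                  # drop '**' comments
        fields, star, comment = body.partition("*")   # '*' marks a peak comment
        pending.extend(float(tok) for tok in fields.split())
        while len(pending) >= n_dims + 1:
            values = pending[:n_dims + 1]
            pending = pending[n_dims + 1:]
            peaks.append({"coords": values[:-1],
                          "intensity": values[-1],
                          "comment": ""})
        if star and peaks:
            # Attach the comment to the peak it follows.
            peaks[-1]["comment"] = comment.strip()[:num_com]
    return peaks

# Hypothetical three-dimensional peak list with numCom = 12.
sample = """\
** peaks picked by hand
8.21 120.3 54.1 1.5e5
* A12 intra
8.21 120.3 52.7 0.6e5   ** weak
* A12 sequential peak
"""
peaks = read_peaks(sample, n_dims=3, num_com=12)
```

In this sample the second peak comment is longer than 12 characters, so under the truncation assumption only its first 12 characters are stored.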