
Spectrum Research, LLC.
NMR-SAMS User’s Guide
An expert system for computer-assisted structure elucidation of organic and natural product compounds based on multidimensional spectroscopy
NMR-SAMS User’s Guide, Version 2.4
This manual describes release 2.4 of the Windows 95/98/2000/NT4.x version of NMR-SAMS™.
Copyright Notice
Copyright © 1996 through 2001, Spectrum Research, LLC. All rights reserved.
No part of this document may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into any language in any form by any means without the written permission of Spectrum Research, LLC.
All possible care has been taken in the preparation of this document but Spectrum Research accepts no liability for any errors/omissions that may be found.
Spectrum Research, LLC. reserves the right to change the information in this document without prior notice.
Trademarks
SpecMan™ and NMR-SAMS™ are trademarks of Spectrum Research, LLC.
Acknowledgments
NMR-SAMS™ (originally known as CISOC-SES) was developed by Dr. Shengang Yuan, Dr. Chen Peng and Prof. Chongzhi Zheng at the Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, P.R. China, from 1988 to 1994. It was further improved by Dr. Chen Peng in the group of Dr. Geoffrey Bodenhausen at the National High Magnetic Field Laboratory in 1995-1996. Portions of NMR-SAMS™ are copyright © 1988 through 1995, Shanghai Institute of Organic Chemistry and Florida State University, and are exclusively licensed to Spectrum Research, LLC. Title and full ownership rights to the converted/modified NMR-SAMS™ will remain solely with Spectrum Research, LLC, and NMR-SAMS™ is asserted to be Spectrum Research’s proprietary information and trade secret.
Credits
If the results (figures and/or data) obtained by NMR-SAMS™ are used for publication purposes, please refer to NMR-SAMS™ in the following manner or any other equivalent form:
"NMR-SAMS™ software, developed by Spectrum Research, LLC., was used to compute the results in this publication."
1.6 A Note on Operating Systems
2.1 Installation of the Program
2.2 Spectrum Research Licensing
2.4 Brief Introduction to Microsoft Windows
2.5 Description of the Main Menus
3.2 General Procedure of Structure Elucidation with NMR-SAMS
3.3 What Spectral Data Does NMR-SAMS Use?
3.4 Use of 2D NMR Connectivities: Bond Constraints
3.5 Use of Chemical Shifts And Peak Multiplicities
4.2 Open An Existing Working Data Set
4.3 Opening A New Working Data Set
4.6 Save A Working Data Set as Different Name
5.2 Conversion of SpecMan 1H Peak List
5.3 Conversion of SpecMan 13C Peak List
5.4 Conversion of SpecMan DQF-COSY Peaks Table
5.5 Conversion of SpecMan HMQC/HETCOR Peaks Table
5.6 Conversion of SpecMan HMBC/COLOC Peaks Table
5.7 Conversion of SpecMan NOESY Peaks Table
5.8 Conversion of SpecMan INADEQUATE Data
6.2 Interpretation of MF, 1H, 13C and HMQC Data as Building Blocks
6.2.1. Interpretation of Molecular Formula
6.2.2. Interpretation of 1D 1H Data
6.2.3. Interpretation of 1D 13C Data
6.2.4. Interpretation of HMQC/HETCOR Connectivities
6.2.5. Generation of Building Blocks
6.3 User-Defined Building Blocks
6.4 Interpretation of 2D Spectral Data as Bond Constraints
6.4.1. Interpretation of COSY Connectivities
6.4.2. Interpretation of HMBC/COLOC Connectivities
6.4.3. Interpretation of NOESY Connectivities
6.4.4. Interpretation of INADEQUATE Connectivities
6.4.5. Transformation of Bond Constraints
6.4.6. Setting up Atom-Atom Connection Matrix (ACMX)
7.2 User-Defined Bond Constraints
7.2.1. Interactive Structure Generation
7.3 User-Defined Atom Environment Constraints
8.2 Input of the Target Structure
8.2.1. Building a Target Structure in NMR-SAMS
8.2.2. Importing a Target Structure
8.2.3. Setting up the Assignment Matrix
8.3 User-Defined Resonance Assignment
9.2 MF-Based Structure Generation of Virtual Compounds
9.3 Quick Structure Elucidation
10.2 Display of Structural Building Blocks
10.3 Display of Target Structure
10.4 Display of Generated Structures/Assignments
10.7 Editing the Display of Generated Structures
11.2 Exporting NMR Spectral Data
11.3 Exporting Resonance Assignment
11.4 Exporting Candidate or Target Structures.
CCSS-13C Chemical Shift Range Correlation Table.
Parameters for Spectral Interpretation
Parameters for Setting up ACMX
δ13C 13C chemical shift.
δ1H 1H chemical shift.
1D One-dimensional.
2D Two-dimensional.
ACMX Atom-atom Connection MatriX, which summarizes the bond-formation probabilities between the constituent atoms of an unknown.
BB Structural Building Blocks for structure generation, e.g., CH3-, CH2-, and -OH.
BC Bond Constraint derived from 2D NMR spectral data, which defines the number of intervening bonds between the correlated spins.
CCSS Carbon-Centered Single-spherical Substructure.
COLOC COrrelation via Long-range Coupling, a kind of 2D spectrum that provides 2-to-3-bond 13C-1H connectivities.
COSY COrrelated SpectroscopY, a kind of 2D spectrum that provides 1H-1H through-bond connectivities.
CPU Central Processing Unit.
DEPT Distortionless Enhancement by Polarization Transfer, a kind of 1D spectrum that provides information concerning the number of attached protons on each carbon atom.
EC Environment Constraint, a user-specified limitation on the types of neighboring atoms attached to a central atom.
HETCOR HETeronuclear Correlation, also called C-H COSY, a kind of 2D spectrum that provides one-bond 13C-1H connectivity information.
HMBC Heteronuclear Multi-Bond Connectivity, a kind of 2D spectrum that provides 2-to-3-bond 13C-1H connectivity information.
HMQC Heteronuclear Multiple Quantum Coherence, a kind of 2D spectrum that provides one-bond 13C-1H connectivity information.
INADEQUATE Incredible Natural Abundance Double Quantum Transfer Experiment, a kind of 2D spectrum that provides one-bond 13C-13C connectivity information.
MDF The Master Data File produced while using NMR-SAMS for structure elucidation. This file stores the intermediate and final results produced during the execution of NMR-SAMS.
MF Molecular formula or empirical formula of a molecule, which is usually derived from mass spectral data.
NMR Nuclear Magnetic Resonance
NOESY Nuclear Overhauser enhancement and Exchange SpectroscopY, a kind of 2D spectrum that provides 1H-1H through-space connectivity information.
NSBC Number of “Sub-bond constraint(s)”, or pair(s) of relevant atoms, that must satisfy a bond constraint in the generated structure.
PSE Partial Structure Elucidation. Structure elucidation based on the information available from a portion of the spectral data, which is usually the well-resolved part.
Chapter 1
NMR-SAMS (NMR Spectral Assignment Made Simple) is an expert system for computer-assisted structure elucidation of unknown organic or natural product compounds from multidimensional spectroscopy (e.g., MS, NMR, IR and UV), which provides complementary information about chemical compounds. In particular, NMR-SAMS uses information from routine 1D and 2D NMR spectroscopy. Together with SpecMan, it serves as a chemist’s workbench for de novo structure elucidation of small molecules such as organic compounds, natural products, peptides, and other small biomolecules. NMR-SAMS is also used for automated resonance assignment of known compounds.
The basic strategy of structure elucidation using NMR-SAMS is illustrated in Fig. 1.1. When dealing with an unknown compound, the molecular formula (MF) must first be determined by mass spectroscopy or another approach. Next, the 1D and 2D NMR chemical shifts, multiplicities, J-couplings and intensities are extracted from the processed 1D and 2D spectra (transformed through conventional FFT or Non-FFT techniques) using SpecMan. The 1D and 2D spectral data extracted as peak lists using SpecMan are imported into NMR-SAMS, and interpreted as structural building blocks and bond constraints based on one-bond, two-bond and other long-range connectivities. Finally, the building blocks, NMR-derived bond constraints, and other user-defined bond constraints are used to generate plausible candidate structures with resonance assignments. If the structure is already known, the user can specify the proposed structure and let NMR-SAMS complete the resonance assignments directly.

Figure 1.1. Data flow diagram of NMR-SAMS representing the different phases of spectral interpretation, structure generation and resonance assignment. Gray boxes represent optional input data. PSE means partial structure elucidation based on incomplete spectral data. A bond constraint is represented as n intervening bonds, (B)n, between the correlated atoms.
NMR-SAMS has the following main features:
· Input of peak tables with chemical shifts, multiplicities, J-couplings and intensities, from a variety of 1D and 2D NMR experiments.
· Automated interpretation, bookkeeping, and crosschecking of spectral data with respect to the molecular formula.
· Novel representation of 2D NMR correlation information based on the concept of chromatic graph.
· Structure determination and identification of unknown compounds based on complete utilization of 2D NMR correlation information and complementary spectral information from MS, UV and IR spectral data.
· Partial structure elucidation of compounds based on incomplete spectral data.
· Graphical tools for interactive building and editing of molecular fragments, and for defining bond constraints and atom environment constraints. Graphical tools to display and browse through candidate structures and sub-structures. Graphical interaction between structures and bond constraints.
· Background information-independent structure elucidation, which minimizes the potential human bias introduced into the structure elucidation process.
· Fast structure generation of complex molecules when sufficient constraints are available.
· Fast resonance assignment and structure verification of large complex molecules based on proposed structures.
· Automated resonance assignment based on assigned resonances of compounds.
· Flexible format for report generation of the results of spectral and structural analysis.
The current version of NMR-SAMS can only handle molecules that have fewer than 128 non-hydrogen atoms. The total number of free bonds (unsatisfied valences) of the structural building blocks before structure generation, which determines the complexity of the structure generation problem, must not exceed 220. (The total number of free bonds is equal to the sum of the valences of the heavy atoms, less the number of protons and twice the number of known bonds.) The maximum number of peaks is limited to 200 in a 1D spectrum and to 1000 in a 2D spectrum. The maximum number of bond constraints is limited to 1000.
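The free-bond rule above can be checked with a few lines of code. The following is a minimal sketch and not part of NMR-SAMS: the function names and the small valence table are illustrative assumptions, and a known double bond is assumed to count as two bonds.

```python
# Size limits quoted in the text; everything else is an illustrative assumption.
MAX_HEAVY_ATOMS = 127   # molecules must have fewer than 128 non-hydrogen atoms
MAX_FREE_BONDS = 220

# Usual valences of a few common elements (assumed for this sketch).
VALENCE = {"C": 4, "N": 3, "O": 2}


def free_bonds(heavy_atoms, n_protons, n_known_bonds=0):
    """Sum of heavy-atom valences, less protons and twice the known bonds.

    heavy_atoms maps an element symbol to its count, e.g. {"C": 2, "O": 1}.
    Assumption: a known double bond is counted as two bonds.
    """
    valence_sum = sum(VALENCE[element] * count
                      for element, count in heavy_atoms.items())
    return valence_sum - n_protons - 2 * n_known_bonds


def within_limits(heavy_atoms, n_protons, n_known_bonds=0):
    """True if the problem fits the two size limits stated above."""
    n_heavy = sum(heavy_atoms.values())
    n_free = free_bonds(heavy_atoms, n_protons, n_known_bonds)
    return n_heavy <= MAX_HEAVY_ATOMS and n_free <= MAX_FREE_BONDS


# Ethanol, C2H6O, as building blocks CH3-, -CH2- and -OH with no known bonds:
# 2*4 + 2 = 10 valences, less 6 protons, leaves 4 free bonds.
print(free_bonds({"C": 2, "O": 1}, n_protons=6))
```

Supplying known substructures raises the number of known bonds, which lowers the free-bond count and therefore the size of the structure generation problem.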
Most of the previously proposed CASE (computer-assisted structure elucidation) systems use either a chemical shift-substructure correlation database or a more concise chemical shift-substructure correlation model, and rely to a large extent on the knowledge of a human expert. Such systems have been limited to very simple and small molecules. NMR-SAMS has demonstrated the impact of using 2D NMR correlation information on improving the efficiency of CASE systems when dealing with real-world complex molecules. For efficient structure elucidation of unknown compounds, NMR-SAMS requires the molecular formula, together with 1D 1H, 13C, DEPT (or APT), and 2D DQF-COSY, HMQC (or HETCOR), HMBC (or COLOC, FLOCK), and INADEQUATE spectral data. The molecular formula may or may not be known accurately from MS or other methods; if it is unknown, NMR-SAMS uses the number of observed carbon and proton peaks, along with any available heteroatom information, to estimate it. It is not mandatory to have all of these experimental NMR data sets available, because NMR-SAMS can also solve structure elucidation problems with different possible combinations of experimental data (for details refer to Section 3.3). Structure elucidation based on 1D 13C chemical shifts is only possible for very simple molecules, and is not practical for complex molecules. NMR-SAMS cannot elucidate unknown structures based solely on 1D 1H chemical shifts.
Although most spectra used by NMR-SAMS, e.g., 1D 1H, 2D DQF-COSY and HMBC, are allowed to have peak degeneracy, the 1D 13C spectrum and HMQC (or HETCOR) must be completely resolved for complete structure elucidation. If severe overlap prevents resolving all of the 13C peaks, NMR-SAMS will use only the well-resolved spectral data to generate the plausible substructures. This is called partial structure elucidation (PSE). Some limitations on PSE are described in Section 7.1.
In the current version, NMR-SAMS does not consider molecular symmetry, so only partial structure elucidation is performed for a molecule with global symmetry. For a molecule with local symmetry where the 13C signals corresponding to symmetric carbons can be identified, complete structure elucidation by NMR-SAMS is possible.
Most of the steps in NMR-SAMS, such as interpretation of 1D and 2D data into bond constraints and generation of the building block sets, are usually performed very quickly. Structure generation, on the other hand, is more time-consuming because of its combinatorial nature. The efficiency of structure generation (which is a function of the computation time, the quality of the structures generated, and the number of structures generated) depends on the size of the molecule and the quality and quantity of the spectral data. When the unknown molecule is big (e.g., with more than 40 heavy atoms) and the correlation information derived from the spectral data is not sufficient, the structure generation could take a very long time to finish. In such cases, the user is advised to input as many known substructures as possible to accelerate the structure generation process. In addition, the user can also take advantage of some of NMR-SAMS' other tools, such as resonance assignment for verification of proposed structures, and flexible graphics tools for interactive building of structures, to solve this problem.
Although the spectral interpretation routines of NMR-SAMS are general-purpose, the structure generator of NMR-SAMS cannot deal with molecules containing ionic atoms, or tautomeric or coordinate bonds. It recognizes only single, double and triple bonds. Aromatic bonds are represented as alternating single and double bonds. Sometimes this might cause redundancy in the structure generation of aromatic compounds.
In the current version of NMR-SAMS, if the structure is already known, then target structure-based resonance assignment is possible, provided the NMR data set is complete.
Although NMR-SAMS can recognize all chemical elements, the current substructure/δ13C knowledge base (see Appendix III) contains only the substructures consisting of commonly occurring elements, i.e., C, H, O, and N. The user can customize this knowledge base. The user will be informed about the undefined substructures when other elements exist in the molecule, and this could reduce the efficiency of structure generation.
NMR-SAMS can be viewed as an expert assistant helping spectroscopists and chemists to solve structure elucidation problems, and is by no means expected to replace the human expert. NMR-SAMS is designed for flexible human intervention, and efficiently uses the additional user knowledge and judgment to control and enhance the structure elucidation process.
The IRIX version of NMR-SAMS runs on SGI systems with R4000 or higher processors, the IRIX 6.x or higher operating system, at least 128 MB of RAM, and 8-bit graphics. R8000 or higher processors and 128 MB or more of RAM are recommended.
The Solaris version of NMR-SAMS runs on Sun systems running Solaris 2.x (SunOS 5.x) with SPARC processors, at least 128 MB of RAM, and 8-bit graphics. X/Motif 1.2.3 libraries are required. These are usually supplied with the Sun Common Desktop Environment (CDE).
The Microsoft Windows version of NMR-SAMS runs on Pentium or higher processors (or 100% compatibles) with at least 32 MB of RAM, Windows 95/98/2000 or Windows NT 4.0 or later, and a VGA or better monitor. A Pentium II or higher processor with 64 MB or more of RAM is recommended.
NMR-SAMS requires from 2 MB to 55 MB of hard disk space, depending on the sample data that is installed. The sample data with original spectra requires 40 MB of hard disk space. The swap drive space (i.e., virtual memory) required is proportional to the complexity of the data being analyzed.
NMR-SAMS provides online help information for many of its dialog boxes. Clicking the Help button displays the relevant help message.
Unless otherwise noted in the text, the User’s Guide of NMR-SAMS uses the typographical conventions described below:
· A command to select is represented in boldface type by the menu name, the option, and the pull-right option (if any). For example, the command:
Display/Display Options/Chemical Shifts
means: first click the Display menu on the menu bar, then click Display Options in the opened menu, and then click Chemical Shifts in the pull-right options.
· A transcript of a computer file or display is printed in Courier New letters with the keywords shown in bold, and the annotations (if any) in italic Times letters. (Such annotations do not appear in the file or display itself.) For example:
ATOM~~ATOM:
For each correlation, listed are the IDs of the correlated atom pair, the range of intervening bonds, and the bond type (0: meaningless or unknown)
(1-23: 1~1 2)
(6-22: 1~1 3)
.
.
.
· Filenames and parameters are printed in Courier New letters. For example:
Files phasefile and procpar are used for peak picking with SpecMan.
Parameter GEN_FLAG controls the search criteria of the structure generation.
· Terms introduced for the first time are presented in boldface type.
· Words in italic represent variables. For example:
There are n intervening bonds between the correlated atoms.
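The ATOM~~ATOM transcript shown in the conventions above lists each correlation as an atom pair, a range of intervening bonds, and a bond type. As a hedged illustration of how such entries could be read outside NMR-SAMS, the sketch below parses them with a regular expression. It assumes that every entry follows the layout of the two sample lines; the class and function names are hypothetical and not part of the program.

```python
import re
from typing import NamedTuple


class BondConstraint(NamedTuple):
    atom_a: int
    atom_b: int
    min_bonds: int
    max_bonds: int
    bond_type: int   # 0 means meaningless or unknown, per the annotation


# Matches entries such as "(1-23: 1~1 2)"; the layout is inferred from the
# two sample lines only.
ENTRY = re.compile(
    r"\(\s*(\d+)\s*-\s*(\d+)\s*:\s*(\d+)\s*~\s*(\d+)\s+(\d+)\s*\)"
)


def parse_constraints(text):
    """Return every constraint entry found in a transcript string."""
    return [BondConstraint(*map(int, match.groups()))
            for match in ENTRY.finditer(text)]


sample = "ATOM~~ATOM:\n(1-23: 1~1 2)\n(6-22: 1~1 3)\n"
for constraint in parse_constraints(sample):
    print(constraint)
```

The keyword line is skipped automatically because it does not match the entry pattern.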
Spectrum Research has attempted to make its products as similar as possible across the various operating systems. However, there are some unavoidable differences that cannot be worked around. As the highest priority, data files have been kept consistent between UNIX and MS Windows machines.
It is recommended that the user refer to the online help provided by individual PC vendors for more information on the basics of operating systems. NMR-SAMS follows the interface of the operating system that it is running on, and therefore it is important to become acquainted with the operating system before attempting to learn NMR-SAMS. See Section 2.4 for information on the basics of the NMR-SAMS interface.
Chapter 2
For instructions on NMR-SAMS installation, please refer to ‘The Release Notes’ or ‘nmrsamsPC.readme’ file supplied with the program.
NMR-SAMS is copy protected by the Spectrum Research Licensing System. This licensing system allows NMR-SAMS to run only on the computer for which it was sold. A license.dat file is included with the installation files, and this plain text file will be placed into the NMR-SAMS directory (C:\Spectrum2001\NMR-SAMS).
If a license file is not located with the NMR-SAMS installation files, please contact Spectrum Research. To create a license file, send the Windows Serial Number (Product ID) to Spectrum Research. Under Windows 95/98/2000 and Windows NT4.x, right click on the “My Computer” icon on the Windows Desktop. Choose “Properties” from the menu that pops up, and the Product ID will be listed in the “Registered To:” section (For example: 02658-OEM-2564589-12458).
When the trial licensing time period is nearing expiration, NMR-SAMS will display a dialog box with the remaining number of days listed on it. Please contact Spectrum Research for a renewal at this time.
To launch the NMR-SAMS program, click on the nmrsams.exe icon from the File Manager or Windows Explorer (by default, NMR-SAMS is installed into C:\Spectrum2001\NMR-SAMS). The program starts with a Main Graphics Window that has a menu bar and status bar.
By default, a Status Window is also opened, which displays text messages to indicate the current status of the structure elucidation, and also prompts the user with the “what to do next” steps. The main graphics window is shown below:

When NMR-SAMS is started, it reads the following three files from the directory where the user launched NMR-SAMS:
nmrsams.ini: defines some of the initial settings of the program, such as window sizes, background colors, atom colors, bond colors, etc. If this file is not found, default settings will be used.
periodic_tab.def: defines some properties of the chemical elements. If this file is not found or is not properly read, NMR-SAMS will not be able to recognize any element symbols or perform the related functions.
chemical_shifts.def: defines the knowledge base of 13C chemical shift dispersion ranges for some common carbon-centered single spherical substructures (CCSS) (see Appendix III). If this file is not found or is not correctly read, structure generation will not be possible (see Section 3.5).
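Because missing definition files only show up later as disabled functions, it can be convenient to check the launch directory beforehand. The following is a minimal sketch of such a check, not a tool supplied with NMR-SAMS. The file names come from the list above; the function name is hypothetical, and treating the two .def files as "required" is a simplification of the behaviour described above.

```python
from pathlib import Path

# File names and roles are taken from the text; the required/optional flag
# is this sketch's own simplification.
STARTUP_FILES = {
    "nmrsams.ini": False,          # optional: default settings are used
    "periodic_tab.def": True,      # needed to recognize element symbols
    "chemical_shifts.def": True,   # needed for structure generation
}


def missing_startup_files(directory):
    """Return (missing_required, missing_optional) lists of file names."""
    folder = Path(directory)
    missing_required, missing_optional = [], []
    for name, required in STARTUP_FILES.items():
        if not (folder / name).is_file():
            target = missing_required if required else missing_optional
            target.append(name)
    return missing_required, missing_optional


required, optional = missing_startup_files(".")
if required:
    print("Missing definition files:", ", ".join(required))
if optional:
    print("Missing optional files (defaults will be used):", ", ".join(optional))
```

The check only tests that the files exist; it does not validate their contents, whose format is not described here.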
If the user is new to Microsoft Windows or windowing systems in general, please read this section before using NMR-SAMS. It will help the user become acquainted with the NMR-SAMS interface.
First, it is a good idea to become acquainted with the online help system provided by Microsoft Windows. The online help system is called from within NMR-SAMS when the user clicks on a "Help" button in any dialog box, and it brings up context-sensitive help in a window. There is also a Help Contents facility (also known as an Index). This consists of a list of the topics in the online help. The user can click on one of these items to bring up its corresponding information. The "Contents" option is available via the NMR-SAMS Help menu and from the Online Help Viewer window by clicking on the “Contents” button.
When NMR-SAMS is first started, a window will appear with "NMR-SAMS, version 2.4, (C) Spectrum Research, LLC." at the top. The area where this text appears is referred to as the "Title Bar." The user can press the left mouse button while the arrow pointer (which is called the "Cursor") is on the title bar and then move the mouse to move the window. Release the mouse button to stop moving the window. That combination of events (pressing a mouse button, moving the mouse, and then releasing) is known as "Dragging". Position the mouse pointer so that it is over the word "File", located immediately below the title bar. Now press and then immediately release the left mouse button. This procedure (pressing a mouse button and then releasing without moving the mouse) is known as "Clicking". The item that was clicked on is part of the "Menu Bar". The menu bar consists of several "Menus" ("File", “Edit”, "Display", "Analysis", and "Help"). When the File menu is clicked on, a "Pulldown" appears. This pulldown consists of "Menu Items" ("Open...", "New...", etc.). If the user clicks on one of these menu items, the corresponding action will occur. Menu items are the primary way that the user can convey commands to NMR-SAMS.
Some items on menus are not menu items, however. The line that appears above the "Quit" menu item is known as a "Separator". Its purpose is solely to make the menu easier to read. Click on the "File" menu and notice that the "Create NMR Data File" menu item has a right-pointing triangle after its text. This type of menu item is known as a "Pullright". Click the mouse on the "Create NMR Data File" menu item and another group of menu items will appear to the right of it. The pullright feature is used to group related menu items together, reducing the size of the main pulldowns. Click on the "Display" menu and the menu item "Status Window", which is known as a "Toggle", will appear. Toggles have two states: "Off" (also known as "Deselected" or "Deactivated"), and "On" (also known as "Selected" or "Activated"). If the status window is on, turn off the "Status Window" toggle by clicking on it and the status window will disappear. Click on the "Display" menu and turn on the “Status Window” toggle by clicking on it again, and the status window will pop up again.
Position the mouse cursor over the frame that surrounds the entire NMR-SAMS window. Drag the mouse to change the size of the NMR-SAMS window. All sides of the NMR-SAMS window can be moved to size the window. The field below the NMR-SAMS Toolbar is known as the "Main Graphics Window". This is where information about chemical structures is displayed. At the bottom of the Main Graphics Window is the "Status Bar", which prints out information about what is going on in NMR-SAMS. It will notify the user if the user has asked NMR-SAMS to perform a function that it is not prepared to do, in addition to giving the user hints about using NMR-SAMS.
Click on the "Open..." menu item from the "File" menu, and a window will appear with the title "Open". This type of window is known as a dialog box. While a dialog box is displayed, the user must interact with it before continuing with other areas of NMR-SAMS. Dialog boxes also have a "Help" button that, when clicked, will bring up online help about the dialog box. The dialog box that is currently displayed is referred to as the "File Browse Dialog", and it is used to specify a file. The user can move to a certain directory by using the “Directory” combo box to find the proper parent directory, and the user can descend the directory structure by double clicking on a directory name from the list (a “Double Click” is two clicks in rapid succession). After the user has changed to the appropriate directory, a list of "Files" with the extension “.mdf” will appear. Click on one of the filenames to select it and then select the "OK" button at the bottom of the dialog box to accept the selected file. Click the "Cancel" button to close the dialog box without performing an action.
When multiple candidate structures are generated, the first structure will be displayed along with a window titled Structure Browser. This window is known as a "Palette." Palettes are similar to dialog boxes; however, the user is able to interact with them and with the main NMR-SAMS window at the same time. The "Structure Browser" palette is used to control the display of the candidate structures. In the "Structure Browser" palette, there is a "Slider", and the user can drag the slider bar to the left or right to raise or lower its value, which determines the sequential number of the structure to be displayed. Some palettes also have text fields where the user can enter numbers or text.
The user should now have enough information to start exploring NMR-SAMS. Note that NMR-SAMS grays out menu items that are not available during specific stages of the structure elucidation process. For example, if the user has not prepared the NMR data file, the menu item Analysis/Interpret NMR Data will remain grayed out until the data has been prepared.
The menu bar appears at the top of the main graphics window and contains the names of the five NMR-SAMS menus. All tasks in NMR-SAMS can be performed by selecting from these five menus. The five menus are described briefly on the following pages and in greater detail in the other chapters of this book.
The File menu: The File menu lists options related primarily to reading data into and out of NMR-SAMS, as displayed below:

The Edit menu: The Edit menu lists options related to editing of the working data set files and the generated structures, as displayed below:

The Display menu: The Display menu lists options related to the graphical display of intermediate and final results of NMR-SAMS, as displayed below:

The Analysis menu: The Analysis menu lists the options related to structure elucidation, as displayed below:

The Help menu: The Help menu lists the options related to the online help of NMR-SAMS, as displayed below:

The NMR-SAMS toolbar contains icons (pictures) that represent commonly used menu items. If the user clicks on one of the icons, the same action occurs as for the corresponding menu bar item.
The following menu items have associated toolbar icons:
File/New
File/Open
File/Save
Display/Building Blocks & Fixed Bonds
Display/Target Structure
Display/Generated Structures or Assignments
Display/Status Window
Display/Display Options/Balls
Display/Display Options/Carbon Symbols
Display/Display Options/Numbers
Display/Display Options/Chemical Shifts
Display/Display Options/Protons
Display/Display Options/Molecular Formula
Display/Display Options/Connection Table
Display/Display Options/Refine
Help/Contents
Chapter 3
This chapter introduces the basic procedure of structure elucidation, with a brief description of the concepts and principles of NMR-SAMS, and concludes with a high-level discussion of the typical flow of activity through NMR-SAMS.
The process of structure elucidation of an unknown compound through NMR spectroscopy consists of the following steps:
1. Determination of the molecular formula (MF) by MS. Determination of some functional groups in the unknown compound through IR and UV spectroscopy. The MF is optional for NMR-SAMS.
2. Data acquisition of 1D and 2D NMR spectra. See Section 3.3 for the spectral data used by NMR-SAMS.
3. Extraction of peak tables with chemical shifts, intensities, J-couplings and multiplicities. Peak picking of 1D and 2D NMR spectral data is performed with SpecMan using automatic and semi-automatic procedures (see SpecMan’s User Guide). The peak tables are then converted to the NMR-SAMS representation of connectivity information (see Chapter 5).
4. Setup of the parameters that control the spectral interpretation and structure generation. In most cases, the default values of these parameters can be used (see Appendix IV).
5. Interpretation of the molecular formula (if known), along with 1H, 13C, and HMQC spectral data, to obtain the structural building blocks. If the MF is unknown, the user can interactively add heteroatoms into the building block sets (see Chapter 6).
6. Interpretation of additional 2D NMR spectral data to obtain the bond constraints (see Chapter 6).
7. Generation of candidate structures that are consistent with the experimental data for unknown compounds (see Chapter 7), or verification of the proposed structure and completion of 1H and 13C resonance assignments (see Chapter 8) for known compounds. Interactive structure generation and resonance assignment is also possible (see Section 7.2.1).
8. Export of the results of structure generation and resonance assignments (see Chapter 11).
Structure elucidation is usually an interactive approach, so this process may need to be repeated several times until the user obtains satisfactory results. NMR-SAMS assists the user in identifying and correcting the inconsistencies in the input data. When sufficient input data is not available, NMR-SAMS generates only partial structures with resonance assignments. NMR-SAMS also warns the user about some common pitfalls that could lead to incomplete or incorrect structure generation, and provides clues for further refinement.
The possible combinations of 1D and 2D spectral data used by NMR-SAMS for structure elucidation are listed in Table 3.1. The fifth combination (routine 1D and 2D spectra, along with complementary information from other spectral data such as MS, UV and IR) is the recommended choice for structure elucidation of real-world complex molecules. Other spectral sources such as MS, IR, and UV are not directly interpreted by NMR-SAMS, but they can be conveniently used as user-defined bond/environment constraints.
Table 3.1. Possible combinations of 1D and 2D NMR spectral data used by NMR-SAMS a
| | 1D | 2D | Comments |
| 1 | None | None | Pure isomer enumeration from MF |
| 2 | 13C (and DEPT b) | None | Very low efficiency except for simple molecules. |
| 3 | 13C, DEPT b | INADEQUATE | Very high efficiency, if data available. |
| 4 | 13C, DEPT b, 1H | DQF-COSY c, HMQC d | Low efficiency except for H-rich molecules. |
| 5 | 13C, DEPT b, 1H | DQF-COSY c, HMQC d, HMBC e (NOESY f) | Most practical way for de novo structure elucidation of complex molecules. |
| 6 g | 1H | DQF-COSY c, HMQC d, HMBC e (NOESY f) | Practical when the amount of sample does not allow for carbon-detecting experiments. |
a TOCSY is not used directly by NMR-SAMS, but can be used by SpecMan to assist the peak picking of DQF-COSY.
b INEPT or APT can also be used.
c Various types of COSY experiments can be used, as long as they provide geminal and vicinal H-H through-bond connectivity.
d HSQC, HETCOR, or other types of spectra can also be used, as long as they provide one-bond C-H connectivity.
e COLOC, FLOCK, or other types of spectra can also be used, as long as they provide long-range C-H connectivity.
f NOESY or ROESY is optional.
g HMBC and HMQC must be clean enough to allow extraction of 13C chemical shifts and multiplicity information. 13C chemical shifts can be automatically extracted from HMBC using SpecMan. 13C multiplicities must be identified manually from the HMQC spectrum.
3.4 Use of 2D NMR Connectivities: Bond Constraints
NMR-SAMS uses mainly 2D NMR-derived through-bond spin-spin connectivity information for
structure elucidation, because it is reliable and provides comprehensive
structural information for de novo
structure elucidation.
In NMR-SAMS, the coordinates of 2D cross peaks are first converted into connectivities between the relevant 1D
peaks, and then interpreted as bond
constraints on the relevant atoms.
A bond constraint (BC) is a requirement of a certain number (or a range)
of intervening chemical bonds between correlated spins. For an asymmetric molecule, such spin-spin
BC’s are directly used as atom-atom bond
constraints. In addition to its
efficient utilization of BC’s involving ambiguous bond separation (e.g., 2 or 3
bonds between two HMBC-correlated spins), NMR-SAMS also copes with BC’s
concerning ambiguous atoms. Such
ambiguity typically arises from peak degeneracy or low digital resolution.
In NMR-SAMS, a BC is represented in the following general format:
(Atom_y ... - Atom_x ... : minBond ~ maxBond; BondType; minNSBC ~ maxNSBC)Source
where
Atom_y ... is the correlated atom(s)
along the Y dimension (13C domain for an HMQC spectrum). It could be
more than one atom in the case of ambiguity.
Atom_x ... is the correlated atom(s)
along the X dimension (1H domain for an HMQC spectrum). It could be more than one atom in the case
of ambiguity.
minBond and maxBond are the minimum and maximum
bond separations between the relevant atoms.
BondType is the type of the intervening bond between the
atoms. Valid choices are: 0, 1, 2, or 3
for unknown, single, double, and triple, respectively.
minNSBC and maxNSBC are the
minimum and maximum numbers of relevant atom pair(s) that must satisfy this BC
in the generated structure.
Source encodes the connectivity (or other source) from
which the BC was derived. A
connectivity is represented by its spectral type and its ID number. The
following codes are used to represent the different spectral types:
“C” for COSY, “Q” for HMQC (or HETCOR), “B” for HMBC
(or COLOC), “N” for NOESY, “I” for INADEQUATE.
Note: The ID of a connectivity is different from, though
related to, the peak ID(s) in the SpecMan peak
tables. For more details see Fig. 6.4
in Chapter 6.
The following codes are used to represent other
kinds of source:
“S” for a pseudo BC added by the program, “U” for a
user-defined BC, and “G” for a previously generated bond (when using a
generated substructure as the starting point for the next structure generation
cycle).
For example, an HMBC-derived bond constraint is represented as:
(10 - 17 18: 2 ~ 3; 0; 1 ~ 2)B10
In the above example, the first set of numbers “10 - 17 18:” denotes the atoms that are correlated. In this case, since the chemical shifts of H-17 and H-18 are very close, it is difficult to resolve which of them is actually correlated to C-10. Therefore, both protons are retained to represent the possibility of a correlation between C-10 and H-17, between C-10 and H-18, or both. The next set of numbers “2~3” indicates that there could be two or three intervening bonds between the correlated C-H pair(s). The next number “0” gives the bond type of the intervening bonds, which in this case is treated as unknown. The next set of numbers “1~2” indicates that either one or both pairs of atoms involved in the bond constraint must satisfy it in the computed structure (i.e., C-10 and H-17, or C-10 and H-18, or both pairs). Finally, the character string “B10” means that this bond constraint was derived from HMBC connectivity #10. From the comment of this connectivity in the .nmr file, the ID of the actual cross peak (in the SpecMan peaks table) can be found (see Fig. 6.4 in Chapter 6).
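The BC notation described above can be read mechanically. The following Python fragment is a minimal sketch of such a reader and is not part of NMR-SAMS. The names BondConstraint and parse_bc are hypothetical, and the sketch assumes that atoms are written as integer IDs separated by spaces and that the ID number after the source code may be absent.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Source codes as listed in the text of this section.
SOURCE_CODES = {
    "C": "COSY", "Q": "HMQC (or HETCOR)", "B": "HMBC (or COLOC)",
    "N": "NOESY", "I": "INADEQUATE",
    "S": "pseudo BC added by the program", "U": "user-defined BC",
    "G": "previously generated bond",
}

@dataclass
class BondConstraint:          # hypothetical container, for illustration only
    atoms_y: List[int]
    atoms_x: List[int]
    min_bond: int
    max_bond: int
    bond_type: int             # 0 unknown, 1 single, 2 double, 3 triple
    min_nsbc: int
    max_nsbc: int
    source_code: str
    source_id: Optional[int]   # assumption: the ID may be missing

_BC = re.compile(
    r"\(\s*(?P<y>\d+(?:\s+\d+)*)\s*-\s*(?P<x>\d+(?:\s+\d+)*)\s*:"
    r"\s*(?P<minb>\d+)\s*~\s*(?P<maxb>\d+)\s*;"
    r"\s*(?P<btype>[0-3])\s*;"
    r"\s*(?P<minn>\d+)\s*~\s*(?P<maxn>\d+)\s*\)"
    r"\s*(?P<src>[CQBNISUG])(?P<sid>\d*)"
)

def parse_bc(text: str) -> BondConstraint:
    """Parse one BC written in the general format shown above."""
    m = _BC.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a bond constraint: {text!r}")
    return BondConstraint(
        atoms_y=[int(a) for a in m["y"].split()],
        atoms_x=[int(a) for a in m["x"].split()],
        min_bond=int(m["minb"]),
        max_bond=int(m["maxb"]),
        bond_type=int(m["btype"]),
        min_nsbc=int(m["minn"]),
        max_nsbc=int(m["maxn"]),
        source_code=m["src"],
        source_id=int(m["sid"]) if m["sid"] else None,
    )

bc = parse_bc("(10 - 17 18: 2 ~ 3; 0; 1 ~ 2)B10")
print(bc.atoms_y, bc.atoms_x, SOURCE_CODES[bc.source_code], bc.source_id)
```

Applied to the HMBC example, the sketch yields one Y atom (10), two candidate X atoms (17 and 18), a bond range of 2 to 3, and the source HMBC connectivity #10.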
By default, NMR-SAMS treats unambiguous BC’s as fixed bonds. An unambiguous BC has exactly two correlated atoms, a one-bond separation, and minNSBC = maxNSBC = 1, which means the BC must be satisfied in a generated structure. The rest, which have an ambiguous bond separation, an ambiguous number of correlated atoms, or both, are treated as ambiguous BC’s. The ambiguous BC’s are used as the major constraints for structure generation. During structure generation, NMR-SAMS computes the number of BC violations for the current substructure/structure. If the actual number of violations is less than the upper limit of allowed violations, the substructure/structure is retained; otherwise it is rejected. The BC’s are also used by some advanced heuristic methods to accelerate the structure generation process (see Section 7.4).
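The distinction between fixed bonds and ambiguous BC’s, and the retain/reject test on violations, can be illustrated with a short Python sketch. This is not the NMR-SAMS algorithm. It assumes a complete structure given as an adjacency dictionary, measures bond separation as the shortest path between two atoms, and counts a BC as satisfied when the number of atom pairs within the bond range lies between minNSBC and maxNSBC. All names (BC, is_fixed_bond, keep_structure, max_violations) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class BC:                      # hypothetical, reduced to the fields used here
    atoms_y: tuple
    atoms_x: tuple
    min_bond: int
    max_bond: int
    min_nsbc: int
    max_nsbc: int

def is_fixed_bond(bc):
    """Unambiguous BC: two atoms, one-bond separation, minNSBC = maxNSBC = 1."""
    return (len(bc.atoms_y) == 1 and len(bc.atoms_x) == 1
            and bc.min_bond == bc.max_bond == 1
            and bc.min_nsbc == bc.max_nsbc == 1)

def bond_separation(adjacency, start, goal):
    """Shortest path length in bonds (breadth-first search); None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def is_satisfied(bc, adjacency):
    """Assumed rule: count atom pairs whose separation falls in the bond range."""
    n_ok = 0
    for y in bc.atoms_y:
        for x in bc.atoms_x:
            d = bond_separation(adjacency, y, x)
            if d is not None and bc.min_bond <= d <= bc.max_bond:
                n_ok += 1
    return bc.min_nsbc <= n_ok <= bc.max_nsbc

def count_violations(bcs, adjacency):
    return sum(not is_satisfied(bc, adjacency) for bc in bcs)

def keep_structure(bcs, adjacency, max_violations):
    # Follows the wording of the text: retained if violations < upper limit.
    return count_violations(bcs, adjacency) < max_violations

# A four-atom chain 1-2-3-4 and two constraints, one of which is violated.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
constraints = [BC((1,), (3,), 2, 3, 1, 1), BC((1,), (2,), 2, 3, 1, 1)]
print(count_violations(constraints, chain), keep_structure(constraints, chain, 2))
```

In the chain example, atoms 1 and 3 are two bonds apart and satisfy a 2~3 constraint, while atoms 1 and 2 are directly bonded and violate it.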
3.5 Use of Chemical Shifts And Peak Multiplicities
NMR-SAMS uses chemical shifts as the labels of heavy atoms, so that 2D NMR-derived
correlation information can be used as bond constraints on specific atoms. This is also the reason why a generated
structure always has unequivocal 1H and 13C resonance
assignments.
13C chemical shifts are also used to evaluate the
intermediate structures/substructures produced during the structure generation
process. A knowledge base consisting of
a correlation table of substructures and their 13C chemical shift (δ) ranges is used for predicting 13C
chemical shift ranges. Each of the
substructures consists of the central carbon atom (which is being considered),
its attached bonds, and the first layer of its neighboring atoms (the outwards
bonds of these atoms are not considered).
This is referred to as a carbon-centered
single-spherical substructure (CCSS). Currently, this table
consists of the 13C chemical shift ranges of around 93 CCSSs
composed of C, N, O, and other common elements that have been adapted from
literature. The correlation table is
stored as an ASCII file, chemical_shifts.def
(see Appendix III), with the
code for each CCSS and its expected minimum and maximum 13C chemical
shift. This file can be customized by
the user, and is read when NMR-SAMS is started.
During structure generation, whenever a carbon atom
has a complete CCSS (i.e., its immediate neighbors are known), then its
expected chemical shift range is derived from the knowledge base and compared
with the observed 13C chemical shift of the central carbon. If the observed shift satisfies this range,
then it is accepted, otherwise the substructure is discarded. If the CCSS is not defined in the knowledge
base table, the test is assumed to have been passed, and the undefined CCSS's
are reported after the structure generation has been completed. As the CCSS's cover only very limited
structural features, their chemical shift ranges are very broad. Thus in NMR-SAMS, 13C chemical
shifts act as a much looser constraint on the structure generation than the 2D
NMR connectivities. Hence it is very
important to include as much correlation information as possible for efficient
structure generation. Sometimes the correct structure could be overlooked if the molecule has carbons that show unusual chemical shifts. In such cases, it is recommended that the user broaden the predicted chemical shift ranges by specifying an extra tolerance (for details, refer to the description of parameter ADD_C13_RNG in Appendix IV).
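The 13C range test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the program's implementation: the CCSS codes ("ccss_A", "ccss_B") and the range values are placeholders rather than entries from chemical_shifts.def, and the extra tolerance is assumed to widen the range equally at both ends.

```python
def passes_c13_test(observed_ppm, ccss_code, shift_table,
                    extra_tolerance=0.0, undefined=None):
    """Return True if the observed 13C shift is acceptable for this CCSS.

    shift_table maps a CCSS code to (min_ppm, max_ppm). An undefined CCSS
    passes the test and is recorded in `undefined` for later reporting.
    """
    rng = shift_table.get(ccss_code)
    if rng is None:
        if undefined is not None:
            undefined.add(ccss_code)
        return True
    low, high = rng
    # Assumption: the extra tolerance broadens both ends of the range.
    return low - extra_tolerance <= observed_ppm <= high + extra_tolerance

# Placeholder table; real codes and ranges come from chemical_shifts.def.
table = {"ccss_A": (160.0, 185.0)}
not_defined = set()

print(passes_c13_test(178.8, "ccss_A", table))                        # inside the range
print(passes_c13_test(190.0, "ccss_A", table))                        # outside the range
print(passes_c13_test(190.0, "ccss_A", table, extra_tolerance=5.0))   # broadened range
print(passes_c13_test(50.0, "ccss_B", table, undefined=not_defined), not_defined)
```

The third call shows how broadening the range lets a carbon with an unusual shift survive the test, at the cost of a looser constraint.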
13C peak multiplicities play
an important role in determining the number of attached protons of heavy atoms
(i.e., the building blocks). So it is
recommended to use DEPT (or INEPT, APT) spectra to
obtain complete 13C multiplicity information.
In the current version, 1H chemical
shifts are not used to evaluate substructures.
1H peak multiplicities are used to limit the neighboring
atoms of the concerned atom. (For details refer to the description about H1MULT_FLAG in Appendix IV.)
During structure generation NMR-SAMS searches all possible ways to assemble the structural building blocks into complete structures. Within some allowance for the violation of constraints, the generated structures are consistent with all of the available spectral data and chemical constraints.
The efficiency of structure generation is a function of the computation time, the quality of the structures generated, and the number of structures generated. Because it is a combinatorial problem, structure generation is usually the most time-consuming step. “Combinatorial explosion” was the major bottleneck in early attempts at automated structure elucidation. NMR-SAMS provides novel heuristic search algorithms that reorder the solution space based on bond constraints, and search only the most probable portion of this space for candidate structures. These methods exponentially reduce the CPU time for structure generation and hence make it practical for complex molecules. Moreover, the user has full control over the use of these methods to optimize structure generation. For example, by modifying a few parameters, the user can extend the search space for a more complete search, or simply turn off the heuristic search methods to perform an exhaustive search. On the other hand, the user can limit the search space for faster structure generation (see Section 7.4 and Appendix IV for the parameters GEN_FLAG, SAT_BC_RATE and N_FBX_STEP).
For relatively small molecules (e.g. < 30 heavy atoms) with reasonably clean and sufficient spectral data, this process is usually completed in seconds or minutes. In most cases the correct structure is generated either uniquely or along with a few alternatives. For more complex problems (bigger molecules and insufficient spectral constraints), structure generation can be completed in a reasonable computation time if adequate user-defined constraints are included.
The candidate structures generated by NMR-SAMS include complete structures and, optionally, substructures. A complete structure is defined as one having no unsatisfied free bonds. In the case of partial structure elucidation (see Section 7.1 for details), the chemically incomplete structure obtained is still referred to as a complete structure, because all of the free bonds are satisfied either by real bonds or dummy bonds. During structure generation, the program enables the user to save the largest intermediate substructures. The substructures are useful when the generation of complete structures is not possible due to errors in spectral data or other reasons, and they provide clues for improving the input spectral data and completing the structure elucidation successfully.
NMR-SAMS was developed to streamline and automate the structure elucidation process with less user-intervention. However, when the molecular size of the unknown is big (e.g., number of non-hydrogen atoms is greater than 40), or insufficient connectivity information is available, user-intervention is absolutely necessary to improve the efficiency of structure generation. Currently the user can interact with the structure elucidation procedure in the following ways:
1. Modify the control parameters for NMR interpretation and structure generation. For example, the user can decide whether or not to use the “negative information” of DQF-COSY based on the spectral quality, and the user can also limit ring sizes to either 5- or 6-membered rings in the generated structure and discard structures containing other ring sizes.
2. Modify the intermediate results in the MDF by using Edit/Master Data File.
3. Supply structural building blocks by using Analysis/Edit Building Blocks if the MF is unknown.
4. Supply known structural information as user-defined bond constraints. This is especially important for heteroatoms that are either not observed or have sparse connectivity information in 2D NMR experiments. Also, other spectral data, such as IR and UV, normally provide positive evidence of some known functional groups. Using Analysis/User-defined Bond Constraints, the user can add as many known bonds as possible between the constituent atoms (see Section 7.2). Using this feature, the user can also manually assemble the building blocks as a complete structure, or use a selected substructure (which was previously generated) as the starting point for the next structure generation.
5. Supply known structural information as atom environment constraints (EC). An EC defines the number of occurrences of a certain type of atom(s) as the immediate neighbor(s) of an atom under consideration (see Section 7.3).
6. Propose a possible structure for the unknown and perform resonance assignment. This way the user can verify user-proposed structures and complete the structure elucidation.
7. Modify the results of resonance assignment of a target structure using Analysis/User-Defined Assignment.
The parameter file (.par file) stores the parameters for controlling spectral interpretation, for setting up ACMX, and for structure generation. All of the parameters can be modified by selecting Edit/Parameters/NMR Interpretation, Edit/Parameters/Set up ACMX or Edit/Parameters/2D Structure Generation. Default values are assigned to the parameters according to the nmrsams.ini file when a new working data set is opened. The default values can be customized by editing the nmrsams.ini and nmrsamspersonal.ini files. In most cases, the default parameters should be a good starting point for structure elucidation. In the following chapters, a parameter is referred to by its name (e.g., GEN_FLAG); the corresponding titles in the dialog boxes and details about the usage of the parameters are described in Appendix IV.
Chapter 4
This
chapter describes the operations related to the data files used by
NMR-SAMS. During each session of
structure elucidation, NMR-SAMS works with a working data set, which consists of five
text files with the same root name but
different extensions. For example, if the root name is Q-2-test, then the working data set consists of the
following files:
· A master data file (MDF), Q-2-test.mdf, where all of the intermediate and final results are stored. The user can view and edit this file by using Edit/Master Data File (see Appendix II).
· A parameter file, Q-2-test.par, where the control parameters used for the data interpretation and structure generation are stored. The user can access the parameters by using the commands in the pull-right menu of Edit/Parameters (see Appendix IV).
· An NMR data file, Q-2-test.nmr, where the NMR data converted from the SpecMan peaks table are stored. The user can view and edit this file by using Edit/NMR Data File (see Appendix I).
· A log file, Q-2-test.log, where most of the information, warning, and error messages produced during the analysis are stored. The user can view the log file by using Edit/Log File.
· A structure file, Q-2-test.str, where the atom-atom connection table of the generated structures and their resonance assignments are stored. The user can display the structures by using Display/Generated Structures (see Chapter 10).
· A lock file, Q-2-test.lock, which is used to prevent two users opening the same data set simultaneously.
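For scripts that archive or copy a working data set outside of NMR-SAMS, the file names listed above can be derived from the root name. The following Python fragment is only a sketch of that naming convention; the function name working_data_set is hypothetical.

```python
from pathlib import Path

# Extensions and roles as listed in this chapter.
EXTENSIONS = {
    "mdf": "master data file",
    "par": "parameter file",
    "nmr": "NMR data file",
    "log": "log file",
    "str": "structure file",
    "lock": "lock file",
}

def working_data_set(directory, root_name):
    """Map each extension to the path expected for the given root name."""
    base = Path(directory)
    # The extension is appended rather than substituted, so root names
    # that contain dots or hyphens are kept intact.
    return {ext: base / f"{root_name}.{ext}" for ext in EXTENSIONS}

files = working_data_set(".", "Q-2-test")
for ext, path in files.items():
    print(f"{EXTENSIONS[ext]:18s} {path.name}")
```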
Command: File/Open.
Description: This procedure is used to open an existing working data set. An existing working data set stores the data
and results of the last session of structure elucidation with NMR-SAMS. Opening an existing working data set allows
the user to continue from where the dataset had last been saved. After selecting File/Open, a file browser is displayed, listing the master data
files in the current directory. If
necessary, the user can switch to the desired directory, and then click the
desired master data file name. The
selected file name appears in the Open MDF field. Next click OK, and the working data set is then opened for use.

After a working data file has been opened, the following message will appear:

The message prompts the user to confirm removal of old log messages from the previous session. To remove the old log messages, select ‘Yes’ or to retain the old log messages, select ‘No.’
The status window displays the current state of structure elucidation. It lists the NMR data files that are being used. It also lists the steps that have been completed, and provides tips to the user as to what steps need to be done next. The structural results, such as building blocks or candidate structures, are displayed in the main graphics window (see Chapter 10).
Note: If another working data set is opened before the current modified
working data set has been saved, NMR-SAMS will prompt the user to save the
changes.
If the user wants to discard the changes that have been made to the current working data set without exiting the program, re-open the dataset and click ‘Yes’ to the following message:

Then it is possible to start from the point at which the working data set was last saved. Note that if a data set that is locked by another user is selected, the following warning message will appear:

Click 'Yes' to open the data file anyway, or click 'No' to cancel. Note that if 'Yes' is selected, problems may arise.
Command: File/New.
Description: This procedure is used to
create a new working data set. When dealing with a new structure problem, the
user must open a new working data set.
The user can open a totally new working data set, or open one starting
from an existing NMR data file that has already been prepared.
To open a totally new working data set, choose File/New. In the displayed file browser, make sure to select the
file type as 'Completely New Dataset (*.mdf).'
Switch to the desired directory if necessary, and type a root name for
the new working data set. The extension
*.mdf will be automatically added.

After clicking 'Open', NMR-SAMS creates the *.mdf, *.par, *.nmr, *.log and *.str files. All files, except for the parameter file (*.par), will be empty.
Next, NMR-SAMS prompts the user to input the molecular formula (MF) of the sample as shown below:

Input the molecular formula into the dialog box (see Section 4.4 for more information about inputting the molecular formula).
To open a new working data set starting with an existing NMR file, select the file type as 'Existing NMR File (*.nmr)' in the file browser. Switch to the desired directory if necessary, and click the desired .nmr file. Next, click 'OK' and a new working set is created with the selected .nmr file.
Note: If the user selects the
filename of an existing data set, NMR-SAMS will warn the user about existing
files with the same root name, as shown below:

Click 'Yes' and the program will overwrite the existing files (except the .nmr file if starting from an existing NMR data file).
If the user wants to use the existing .nmr file, but doesn't want to overwrite the existing files, click 'No' to cancel this dialog box. Then, make a copy of the .nmr file with a new root name and reopen the newly named .nmr file.
Command: File/Input Molecular Formula.
Description: This procedure is used to define the molecular formula of the
sample. Normally this command is used
when the user wants to change the MF, since NMR-SAMS always prompts the user to
enter the MF when a new working data set is first opened (see Section 4.3), as
shown below:

Note that the element symbol must be typed with the first letter in upper case and the second one, if any, in lower case. The user can specify the valence of an atom in parentheses following the element symbol (e.g., C10H12N(V)N2S(VI)O8). If the valence is not specified, the most common chemical valence is adopted for any element with multiple valences (e.g., valences of 3 and 2 are adopted for N and S, respectively). The user can also change the valences later by selecting Analysis/User-Defined Building Blocks.
If the exact MF is unknown, enter the closest possible formula or type 'UNKNOWN'. In any case, the user can modify the elemental composition of the molecule later by using Analysis/User-defined Building Blocks (see Section 6.3).
Once a molecular formula has been entered, it is interpreted and a dialog box appears displaying the standardized MF, the molecular weight, and the double bond equivalence (DBE), as shown below:

Two records are written into the MDF. The first record starts with the keyword “MF:” and contains the standardized MF:
MF: C30H48O3
The second record starts with the keyword “ATOMS:”, followed on the same line by the molecular weight and the degree of unsaturation (or double bond equivalence). The second line is a brief description of the entries in each of the remaining lines. Each remaining line describes one constituent heavy atom and consists of the ID, the element symbol, the chemical valence, the minimum and maximum numbers of attached protons, the minimum and maximum numbers of attached double bonds, and the minimum and maximum numbers of attached triple bonds, respectively. The constituent heavy atoms are listed with carbon first, and the remaining elements in alphabetical order of their element symbols.
ATOMS: (MW = 456.7074, DBE = 7.0)
#Atom; Element; Valence; Min. & max. attached H; Min. & max. double bonds; Min. & max. triple bonds
# 1. C 4 0 3 0 2 0 1
# 2. C 4 0 3 0 2 0 1
# 3. C 4 0 3 0 2 0 1
.
.
.
#30. C 4 0 3 0 2 0 1
#31. O 2 0 1 0 1 0 0
#32. O 2 0 1 0 1 0 0
#33. O 2 0 1 0 1 0 0
Note: When an atom has multiple valences, the most common valence will be adopted, by default. For example, the valence 3 is always adopted for N. However, the user can specify an uncommon valence while inputting the MF. If there is a -NO2 group in the molecule, input the MF containing a “N(V)” (e.g., C6H5N(V)O2). Modifying the valence manually in the .mdf file is not recommended, because whenever Analysis/Building Blocks is selected, the MF will be re-interpreted and the previous changes will be overwritten.
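The MF notation with optional valences, and the DBE value shown in the ATOMS record, can be reproduced with a small Python sketch. It is not the NMR-SAMS parser. It uses the standard relation DBE = 1 + Σ n_i (v_i - 2) / 2, where n_i is the number of atoms of valence v_i, and it assumes a small table of default valences (C = 4, H = 1, N = 3, O = 2, S = 2, in line with the defaults mentioned in this section). The names parse_formula and double_bond_equivalence are hypothetical.

```python
import re

ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6, "VII": 7}
# Assumed defaults for this sketch only; other elements would need entries.
DEFAULT_VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2, "S": 2}

# Element symbol, optional valence in parentheses, optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(?:\(([IVX]+)\))?(\d*)")

def parse_formula(mf):
    """Return a list of (element, valence, count) groups in input order."""
    groups = []
    pos = 0
    while pos < len(mf):
        m = _TOKEN.match(mf, pos)
        if m is None:
            raise ValueError(f"cannot parse {mf!r} at position {pos}")
        element, roman, count = m.groups()
        valence = ROMAN[roman] if roman else DEFAULT_VALENCE[element]
        groups.append((element, valence, int(count) if count else 1))
        pos = m.end()
    return groups

def double_bond_equivalence(mf):
    """DBE = 1 + sum(n_i * (v_i - 2)) / 2 over all atom groups."""
    return 1 + sum(n * (v - 2) for _, v, n in parse_formula(mf)) / 2

print(double_bond_equivalence("C30H48O3"))    # 7.0, as in the ATOMS record above
print(double_bond_equivalence("C6H5N(V)O2"))  # 6.0 with pentavalent nitrogen
print(parse_formula("C10H12N(V)N2S(VI)O8"))
```

With the default valence of 3 for nitrogen, the nitro compound in the note would give a DBE of 5.0 instead of 6.0, which is why the valence must be supplied with the MF.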
Command: File/Save.
Description: This command allows NMR-SAMS to update the working data set with the current state of structure elucidation. The user will be prompted to save changes before exiting the program or opening another working data set.
Command: File/Save As.
Description: This command allows NMR-SAMS to save the current state of
structure elucidation in a working data set with a different root name. After selecting File/Save As, the following file browser is displayed. Switch to the desired directory (if
necessary), type the new root name, and then click OK.

Command: File/Exit.
Description: This command allows the user to exit NMR-SAMS. If changes have been made to any of the
three data files (*.nmr, *.mdf, or *.par), and those changes have
not been saved, NMR-SAMS will prompt the user to save them before exiting the
program:

If 'Yes' is clicked, the changes will be saved before exiting the program. However, if 'No' is clicked, the changes will be discarded and the program will exit. If 'Cancel' is selected, the Exit command will be ignored.
Chapter 5
It is important to generate a clean and reliable set of peak lists from different
NMR experiments before using them in NMR-SAMS.
SpecMan provides several advanced and intelligent peak-picking tools to
perform fast and reliable peak picking.
For details regarding peak picking, refer to the SpecMan User's Guide. Since
SpecMan can independently perform peak picking and peaks table conversion, the
user can either perform both steps in SpecMan, or perform peak picking in SpecMan and then peaks table conversion in NMR-SAMS. Either way, the ability to perform
consistency checking during the conversion process will help the user to find
potential errors in the peak picking results.
This chapter describes how to prepare 1D and 2D NMR spectral data as input for NMR-SAMS (for details about the NMR Data File format, see Appendix I). It is assumed that the peak picking has already been performed in SpecMan. The peak tables from SpecMan are then converted into the NMR-SAMS format by selecting from the following pull-right options of 'Create NMR Data File' from the File menu as shown below:

Command: File/Create NMR Data File/H1.
Description: In this procedure, the SpecMan 1H peaks table is converted into NMR-SAMS format. First, the following dialog box is displayed, which prompts the user to enter the filename of the 1H peaks table from SpecMan.

Click 'Browse' to locate the peaks table file, and then click OK. An information dialog box displays the number of 1H peaks that have been converted:

In the current version of SpecMan, all 1H peak multiplicities are marked as unknown (u) by default. Therefore, NMR-SAMS will prompt the user to supply the 1H multiplicity for the peaks (referring to their splitting patterns). As shown in Fig. 5.1, if the multiplicities of all or some of the 1H peaks are known, select Edit/NMR Data File to open the NMR data file and replace the unknown multiplicity (represented as “u”) with one of the following symbols recognized by NMR-SAMS:
s: singlet, d: doublet, t: triplet, q: quartet, m: other multiplet. If the multiplicity is unknown, leave it as unknown (u).
NMR-SAMS uses 1H multiplicity information to eliminate inappropriate bonds while setting up ACMX. For additional details, refer to the usage of parameter H1_MULT_FLAG (in Appendix IV).

Figure 5.1. Running NMR-SAMS and SpecMan side-by-side provides a convenient way to verify and edit the 1D peaks converted from SpecMan peaks table. Left (NMR-SAMS): select Edit/NMR Data File to open the .nmr file. Right (SpecMan): Open the 1D spectrum and load the 1D peaks table. From the comment field of a converted peak, the ID (#32) of the original peak is found. By clicking the corresponding entry in the peaks table, the 1D peak (#32, shown in cyan) is highlighted in the spectrum so that the user can see and recognize the multiplicity of this peak before modifying the .nmr file.
Possible Errors: Generally NMR-SAMS crosschecks the converted 1H peak list against the MF (if known) and alerts the user to any potential conflicts. The following situations will be reported when there is a conflict:
· If the multiplicity information is unknown for more than three fourths of the peaks, a warning message prompts the user to supply this information if possible.
· If the number of 1H peaks exceeds the number of constituent protons, an error message prompts the user to correct either the peak picking result or the MF.
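The two checks listed above can be expressed as a short Python sketch, which may be useful for screening a peak list before conversion. It is not the code used by NMR-SAMS; the function name check_h1_peaks and the message texts are hypothetical.

```python
def check_h1_peaks(multiplicities, n_protons=None):
    """Return (level, message) tuples for a list of 1H multiplicity symbols.

    n_protons is the number of protons in the MF, or None if the MF is unknown.
    """
    messages = []
    n_peaks = len(multiplicities)
    n_unknown = sum(m == "u" for m in multiplicities)

    # Warning: more than three fourths of the peaks have unknown multiplicity.
    if 4 * n_unknown > 3 * n_peaks:
        messages.append(("warning", "multiplicity unknown for most 1H peaks"))

    # Error: more 1H peaks than constituent protons.
    if n_protons is not None and n_peaks > n_protons:
        messages.append(("error", "more 1H peaks than protons in the MF"))

    return messages

print(check_h1_peaks(["u", "u", "u", "u"], n_protons=10))
print(check_h1_peaks(["s", "s", "d", "t", "q"], n_protons=4))
```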
Results: After conversion, the .nmr file is updated with
information regarding proton peaks starting with the keyword “H1:”. Following is a transcript of the converted 1H peaks:
H1: C:\Spectrum2001\Data\NMR-SAMS\Q-2-test/h1.pks
#1. 4.930 s ;1
#2. 4.755 s ;2
#3. 3.509 u ;3
.
.
.
#32. 0.818 s ;32
#33. 0.811 u ;33
The first line beginning with the keyword “H1:” indicates the start of 1H peak list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in the rest of the lines represent the following attributes of each 1H peak:
· Peak ID, a serial number that uniquely identifies this peak.
· Chemical shift of the peak in ppm values.
· Multiplicity, designated as s (singlet), d (doublet), t (triplet), q (quartet), m (other multiplet) or u (unknown). By default it is assigned as unknown.
· Comments, which are optional. The number in the comment field corresponds to the ID of the 1H peak in the SpecMan peaks table.
One or more spaces are used as a delimiter for all items except comments that are separated by a semicolon (;). Items marked as optional can be omitted unless an item following them is included. In such a case, the user must include default values for ignored items even if they don’t get used. Comments can always be included as long as they follow a semicolon (;). The peak list intensities and comments of the 1H peak list are not currently used by NMR-SAMS.
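The line format described above (space-delimited items, followed by an optional comment after a semicolon) can be read with a few lines of Python. The following is a sketch for external scripts and not the NMR-SAMS reader. The names Peak and parse_peak_line are hypothetical, and the sketch assumes that a missing multiplicity defaults to unknown (u).

```python
from dataclasses import dataclass

VALID_MULTIPLICITIES = {"s", "d", "t", "q", "m", "u"}

@dataclass
class Peak:                      # hypothetical container, for illustration only
    peak_id: int
    shift_ppm: float
    multiplicity: str = "u"
    comment: str = ""

def parse_peak_line(line):
    """Parse one peak entry such as '#32. 0.818 s ;32'."""
    body, _, comment = line.partition(";")   # everything after ';' is a comment
    fields = body.split()                    # one or more spaces as delimiter
    if len(fields) < 2 or not fields[0].startswith("#"):
        raise ValueError(f"not a peak entry: {line!r}")
    peak_id = int(fields[0].strip("#."))
    shift = float(fields[1])
    # Assumption: an omitted multiplicity is treated as unknown.
    multiplicity = fields[2] if len(fields) > 2 else "u"
    if multiplicity not in VALID_MULTIPLICITIES:
        raise ValueError(f"unknown multiplicity symbol: {multiplicity!r}")
    return Peak(peak_id, shift, multiplicity, comment.strip())

print(parse_peak_line("#32. 0.818 s ;32"))
print(parse_peak_line("#3. 3.509 u"))
```

The comment field of the first entry ("32") is the ID of the original peak in the SpecMan peaks table, as explained above.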
Note: Whenever the user repeats a 1H peaks table conversion or modifies a converted peak list (using Edit/NMR Data File), the dependent 2D spectral data must also be reconverted. For example, if a 1H peak is added to the converted 1H peak list, the user must reconvert the COSY, HMQC, HMBC, and NOESY data again (if they had been converted already). Otherwise the added 1H peak will not be reflected in the 2D data.
Command: File/Create NMR Data File/C13 and DEPT.
Description: In this procedure the
SpecMan 13C and DEPT/APT peak tables are converted into
a peak list of 13C chemical shifts and multiplicities. NMR-SAMS requires 13C
multiplicity information for reliable structure elucidation, and in order to
get the complete 13C multiplicity information, the user needs 13C,
DEPT-90/APT-90 and DEPT-135/APT-135 experimental data. However, NMR-SAMS provides a flexible way to
derive the 13C multiplicity information from any combination of
available experiments as described below:
1. 13C Only. In the dialog box that appears, select ‘None’ for Peak Multiplicity Experiments and then click ‘Browse’ to find and select the SpecMan-created 13C Peaks Table, as shown below:

After clicking ‘OK’ NMR-SAMS updates the .nmr file with a list of 13C chemical shifts
having unknown multiplicities as shown in the Results section below. If
the multiplicities of some peaks are known, the user can manually edit the .nmr file to supply this information.
2. 13C and DEPT. In the dialog box that appears, click ‘Browse’ to enter the SpecMan-created 13C Peaks Table. Then select ‘DEPT’ for Peak Multiplicity Experiments, and enter the peaks table filenames for DEPT-45 (optional), DEPT-90, and DEPT-135 experiments. As mentioned before, all of the DEPT experiments are optional, so turn off the corresponding toggle if certain DEPT data has not been obtained. Note that ignoring some DEPT experiments (except for DEPT-45) could leave some peaks with unknown multiplicities.

Also enter a matching tolerance (in ppm) to match
the 13C and DEPT peaks. Upon
clicking ‘OK’, NMR-SAMS will update the .nmr file with a list of 13C
chemical shifts and derived multiplicities as shown in the Results section below.
3. 13C and APT. In the dialog box that appears, click ‘Browse’ to enter the SpecMan-created 13C Peaks Table. Select ‘APT’ for Peak Multiplicity Experiments and then enter the peaks table filenames for APT-45, APT-90, and APT-135 experiments. As mentioned before, all of the APT experiments are optional, so turn off the corresponding toggle if certain APT data has not been obtained. Note that ignoring some APT experiments (except for APT-45) could leave some peaks with unknown multiplicities.

Also enter a matching tolerance to match the 13C
and APT peaks. Upon clicking ‘OK’,
NMR-SAMS will update the .nmr file with a list of 13C
chemical shifts and derived multiplicities as shown in the Results section below.
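How multiplicities can be derived from matched DEPT peaks is sketched below in Python. This is not the procedure implemented in NMR-SAMS, whose details are not given here. The sketch relies on the usual DEPT conventions (DEPT-90 shows CH only; DEPT-135 shows CH and CH3 positive and CH2 negative; carbons without attached protons are absent) and assumes that each DEPT peak is available as a (shift, signed intensity) pair. The names find_match and derive_multiplicity and the default tolerance are hypothetical.

```python
def find_match(shift, peaks, tol):
    """Return the closest (shift, intensity) peak within tol ppm, or None."""
    best = None
    for p_shift, intensity in peaks:
        delta = abs(p_shift - shift)
        if delta <= tol and (best is None or delta < best[0]):
            best = (delta, (p_shift, intensity))
    return best[1] if best else None

def derive_multiplicity(shift, dept90, dept135, tol=0.05):
    """Return s, d, t, q or u for one 13C peak.

    dept90 and dept135 are lists of (shift, signed intensity) pairs,
    or None when the experiment was not acquired.
    """
    in90 = find_match(shift, dept90, tol) if dept90 is not None else None
    in135 = find_match(shift, dept135, tol) if dept135 is not None else None
    if in90 is not None:
        return "d"                      # CH appears in DEPT-90
    if dept135 is None:
        return "u"                      # cannot decide without DEPT-135
    if in135 is None:
        return "s"                      # no attached protons
    if in135[1] < 0:
        return "t"                      # CH2 is negative in DEPT-135
    # Positive in DEPT-135: CH3 if DEPT-90 rules out CH, otherwise ambiguous.
    return "q" if dept90 is not None else "u"

dept90 = [(77.00, 1.0)]
dept135 = [(77.00, 1.0), (109.93, -1.0), (16.34, 1.0)]
for c13 in (178.822, 109.931, 77.01, 16.340):
    print(c13, derive_multiplicity(c13, dept90, dept135))
```

The last branch illustrates the remark above: when an experiment is left out, some peaks can only be reported with unknown multiplicity.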
Possible Errors: During the conversion
NMR-SAMS crosschecks the 13C peak list with the MF, and alerts the
user of potential inconsistencies. In
such cases, the following general messages will be reported:
·
If
there are more 13C peaks than the constituent carbon atoms, an error
message will prompt the user to remove peak artifacts or correct the MF.
· If there are fewer 13C peaks than the constituent carbon atoms, a warning message will prompt the user to resolve 13C peak overlap. To do so, choose Edit/NMR Data File and define the overlapping peaks as individual peaks with slightly different chemical shifts. It is usually possible to resolve such ambiguities by looking at the peak intensity and the HMQC spectrum, or by acquiring the spectrum under different conditions. If the user is unable to resolve overlapping peaks (for example, in the case of a symmetric molecule, or due to severe overlap in a spectrum), then partial structure elucidation will be performed (see Section 7.1).
· If the multiplicity of one or more 13C peaks is unknown, a warning message will prompt the user to supply this information, if possible. Lack of this information may result in multiple building block sets (see Section 6.2).
· The number of carbon-attached protons (n_CH) is calculated based on the 13C multiplicities. If n_CH is greater than the number of constituent protons, an error message will prompt the user to correct either the multiplicity information or the MF.
· When the number of 13C peaks is equal to that of the carbon atoms and all 13C multiplicities are known, the maximum number of heteroatom-attached protons (max_XH) is calculated based on the valence of the constituent heteroatoms. If (n_CH + max_XH) is smaller than the number of constituent protons, an error message will prompt the user to correct either the multiplicity information or the MF.
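The checks listed above can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the NMR-SAMS code: the function name, the message texts, and the return format are hypothetical, and max_XH is passed in by the caller because this guide does not spell out how it is derived from the heteroatom valences.

```python
# Protons carried by a carbon of each known multiplicity (s=C, d=CH, t=CH2, q=CH3).
H_PER_MULTIPLICITY = {"s": 0, "d": 1, "t": 2, "q": 3}


def crosscheck(multiplicities, n_carbons, n_protons, max_xh=None):
    """Compare a 13C peak list with the MF; return (severity, message) tuples.

    multiplicities: list of 's', 'd', 't', 'q' or 'u', one entry per 13C peak.
    n_carbons, n_protons: atom counts taken from the MF.
    max_xh: maximum number of heteroatom-attached protons, if known (assumed input).
    """
    issues = []
    n_peaks = len(multiplicities)
    if n_peaks > n_carbons:
        issues.append(("error", "more 13C peaks than carbons: remove artifacts or correct the MF"))
    elif n_peaks < n_carbons:
        issues.append(("warning", "fewer 13C peaks than carbons: resolve peak overlap"))

    n_unknown = multiplicities.count("u")
    if n_unknown:
        issues.append(("warning", "%d peak(s) with unknown multiplicity" % n_unknown))

    # Unknown peaks contribute 0 here, so n_ch is a lower bound in that case.
    n_ch = sum(H_PER_MULTIPLICITY.get(m, 0) for m in multiplicities)
    if n_ch > n_protons:
        issues.append(("error", "n_CH exceeds the protons in the MF"))

    complete = n_peaks == n_carbons and n_unknown == 0
    if complete and max_xh is not None and n_ch + max_xh < n_protons:
        issues.append(("error", "n_CH + max_XH is smaller than the protons in the MF"))
    return issues


print(crosscheck(["q", "q", "u"], n_carbons=4, n_protons=5))
```

An empty list means that none of the conditions above was triggered for the given peak list and MF.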
Results: After conversion, the .nmr file is updated with the 13C peak information, which starts at the keyword “C13:”. The following is a transcript of a converted 13C peak list (note that if DEPT or APT data is not used, the multiplicity will be unknown “u” for all peaks):
C13: C:\Spectrum2001\Data\NMR-SAMS\Q-2-test\c13.pks
#1. 178.822 s ;1
#2. 151.323 s ;2
#3. 109.931 t ;3
.
.
.
#28. 16.340 q ;28
#29. 14.929 q ;29
The first line, beginning with the keyword “C13:”, indicates the start of the 13C peak list. Following the keyword and a blank space, comments may be added up to 80 characters in length. Each of the remaining lines lists the following attributes of one 13C peak:
· Peak ID, a serial number that uniquely identifies this peak.
· Chemical shift of the peak in ppm values.
· Multiplicity, designated as s (singlet, C), d (doublet, CH), t (triplet, CH2), q (quartet, CH3), or u (unknown).
· Comments, which are optional. The number in the comment field corresponds to the ID of the 13C peak in the SpecMan peaks table.
One or more spaces are used as a delimiter for all items except comments, which are separated by a semicolon (;). Items marked as optional can be omitted unless an item following them is included. In such a case, the user must supply default values for the omitted items even if those values are not used. Comments can always be included as long as they follow a semicolon (;). The peak list intensities and comments of the 13C peak list are not currently used by NMR-SAMS.
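For users who post-process .nmr files with their own scripts, the line layout described above can be read with a few lines of Python. This is a minimal sketch based only on the four attributes listed here; the function name and the returned dictionary keys are hypothetical and not part of NMR-SAMS.

```python
VALID_MULTIPLICITIES = {"s", "d", "t", "q", "u"}


def parse_c13_line(line):
    """Parse one peak line such as '#3. 109.931 t ;3' into a dictionary."""
    # Everything after the first semicolon is a free-form comment.
    body, _, comment = line.partition(";")
    fields = body.split()          # one or more spaces delimit the items
    if len(fields) < 3 or not fields[0].startswith("#"):
        raise ValueError("not a 13C peak line: %r" % line)
    multiplicity = fields[2]
    if multiplicity not in VALID_MULTIPLICITIES:
        raise ValueError("unknown multiplicity code: %r" % multiplicity)
    return {
        "id": int(fields[0].lstrip("#").rstrip(".")),   # '#3.' -> 3
        "shift": float(fields[1]),                      # chemical shift in ppm
        "multiplicity": multiplicity,
        "comment": comment.strip() or None,
    }


print(parse_c13_line("#3. 109.931 t ;3"))
```

The header line that starts with “C13:” is not a peak line and should be skipped before calling the function.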
Note: Whenever the user repeats a 13C peaks table conversion or modifies a converted peak list (using Edit/NMR Data File), the dependent 2D spectral data must also be reconverted. For example, if a 13C peak is added to the converted 13C peak list, the user must reconvert the HMQC, HMBC, and INADEQUATE data (if they had been converted already). Otherwise, the added 13C peak will not be reflected in the 2D data.
As shown in Fig. 5.1, NMR-SAMS and SpecMan can be used side-by-side to verify the peak picking results of peaks mentioned in warning or error dialog boxes.
5.4 Conversion of SpecMan DQF-COSY Peaks Table
Command: File/Create NMR Data File/COSY.
Description: In this procedure, NMR-SAMS converts the DQF-COSY cross peak coordinates into connectivities between 1D 1H peaks. As illustrated in Fig. 5.2, the coordinates of the peak center (shown as a cross) are matched to the 1D chemical shifts (shown as dotted lines). The 1D peaks that match the peak center within the tolerances (±D2 and ±D1 in the F2 and F1 dimensions, respectively) are taken as the correlated 1D peaks. If more than one 1D peak (such as 1H peaks a and b in Fig. 5.2) matches the cross peak center in a certain dimension, then all are treated as possible correlated 1D peaks in that dimension. Such a connectivity is called an ambiguous connectivity, and NMR-SAMS will internally consider all possible correlations for it (for more details about ambiguous connectivity, see the example in Section 3.4).

Figure 5.2. Conversion of COSY cross peak coordinates into a correlation between the 1D 1H peaks. The cross (+) denotes the cross peak center. The dotted lines denote the chemical shifts of the three 1D 1H peaks, a, b, and c, respectively. D1 and D2 are the matching tolerances along F1 and F2, respectively. All three peaks, which match the cross peak center within the tolerances, are taken as correlated 1D peaks.
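The matching rule described in this section can be sketched in Python as follows. The sketch only illustrates the idea of tolerance matching and ambiguous connectivity; the function names and the dictionary of 1D peaks are assumptions, and the chemical shifts in the example are invented for demonstration.

```python
from itertools import product


def match_cross_peak(f2, f1, h1_shifts, tol_f2, tol_f1):
    """Return the IDs of the 1D 1H peaks matching a cross peak in F2 and in F1.

    f2, f1: coordinates of the cross peak center in ppm.
    h1_shifts: dict mapping a 1D peak ID to its chemical shift in ppm.
    tol_f2, tol_f1: matching tolerances (D2 and D1 in Fig. 5.2) in ppm.
    """
    f2_ids = [pid for pid, s in h1_shifts.items() if abs(s - f2) <= tol_f2]
    f1_ids = [pid for pid, s in h1_shifts.items() if abs(s - f1) <= tol_f1]
    return f2_ids, f1_ids


def possible_correlations(f2_ids, f1_ids):
    """Enumerate every candidate pair; more than one pair means an ambiguous connectivity."""
    return [(a, b) for a, b in product(f2_ids, f1_ids) if a != b]


# Invented example: peaks a and b lie close together, c is well separated.
h1 = {"a": 2.310, "b": 2.313, "c": 4.120}
f2_ids, f1_ids = match_cross_peak(2.312, 4.121, h1, tol_f2=0.005, tol_f1=0.005)
pairs = possible_correlations(f2_ids, f1_ids)
print(pairs, "ambiguous" if len(pairs) > 1 else "unambiguous")
```

With the tolerance along F2 reduced to 0.0015 ppm in this example, only peak b matches and the connectivity becomes unambiguous, which illustrates why the choice of tolerance matters.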
Upon selecting File/Create NMR Data File/COSY, NMR-SAMS opens a dialog box that prompts the user to enter the filename of the COSY peaks table. The user is also prompted to input matching tolerances along the X (i.e., F2) and Y (i.e., F1) dimensions.

The default value for the matching tolerance is 0.005 ppm for both dimensions. It is important to select an appropriate tolerance: too large a value could introduce undesired ambiguity, and too small a value could cause some real peaks to be ignored. To choose a suitable tolerance, the following factors must be considered:
· Accuracy of the peak picking. The grid-intelligence-based peak picking of SpecMan provides a very convenient way to verify the accuracy of peak picking by comparing the expected locations of the cross peaks with the picked peaks (see SpecMan's User’s Guide). If a peak list was carefully verified with this method, it is acceptable to start with a small tolerance.
· Alignment between 1D 1H and the COSY spectra. SpecMan provides convenient tools to correct frequency offset between the 1D and 2D spectra. Sometimes different experimental conditions introduce small chemical shift differences between 1D and 2D resonances. To further correct the differences due to sample conditions, the user can utilize the grid-intelligence-based peak picking method of SpecMan. If these corrections have been applied, it is acceptable to start with a small tolerance.
Possible Errors: