User’s Guide of

NMR-SAMS

 

An expert system for computer-assisted structure elucidation of organic and natural product compounds based on multidimensional spectroscopy

NMR-SAMS™ User's Guide, April 1998.

This manual describes release 2.0 of the Windows 95/NT version of the NMR-SAMS™ software.

Copyright Notice

Copyright © 1996 through 2001 Spectrum Research, LLC.  All rights reserved.

No part of this document may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into any language in any form by any means without the written permission of Spectrum Research, LLC.

All possible care has been taken in the preparation of this document but Spectrum Research accepts no liability for any errors/omissions that may be found.

Spectrum Research, LLC. reserves the right to change the information in this document without prior notice.

Trademarks

SpecMan™ and NMR-SAMS™ are trademarks of Spectrum Research, LLC.

Acknowledgments

NMR-SAMS™ (originally known as CISOC-SES) was developed by Dr. Shengang Yuan, Dr. Chen Peng and Prof. Chongzhi Zheng at the Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, between 1988 and 1994. It was further improved by Dr. Chen Peng in the group of Dr. Geoffrey Bodenhausen at the National High Magnetic Field Laboratory in 1995-1996. Portions of NMR-SAMS™ are copyright © 1988 through 1995, Shanghai Institute of Organic Chemistry and Florida State University, and are exclusively licensed to Spectrum Research, LLC. Title and full ownership rights to the converted/modified NMR-SAMS™ will remain solely with Spectrum Research, LLC, and NMR-SAMS™ is asserted to be Spectrum’s proprietary information and trade secret.

Credits

If results (figures and/or data) obtained with the NMR-SAMS™ application are used in a publication, please refer to the software in the following manner or in an equivalent form:

"NMR-SAMS™ software, developed by Spectrum Research, LLC., was used to compute the results in this publication".

 

 


Contents

Abbreviations And Acronyms

Introduction

1.1 General

1.2 Application Limitations

1.3 System Requirement

1.4 Help Facility

1.5 Typographical Conventions

Getting Started with NMR-SAMS

2.1 Installation of the Program

2.2 Spectrum Research Licensing

2.3 Starting NMR-SAMS

2.4 Brief Introduction to Microsoft Windows

2.5 Description of the Main Menus

2.6 The NMR-SAMS Toolbar

Understanding NMR-SAMS

3.1 Overview

3.2 General Procedure of Structure Elucidation with NMR-SAMS

3.3 What Spectral Data Does NMR-SAMS Use?

3.4 Use of 2D NMR Connectivities: Bond Constraints

3.5 Use of Chemical Shifts And Peak Multiplicities

3.6 Structure Generation

3.7 User Intervention

3.8 Control Parameters

Working Data Set

4.1 Overview

4.2 Opening An Existing Working Data Set

4.3 Opening A New Working Data Set

4.4 Input Molecular Formula

4.5 Save A Working Data Set

4.6 Save A Working Data Set as Different Name

4.7 Exiting NMR-SAMS

Input of NMR Spectral Data

5.1 Overview

5.2 Conversion of SpecMan 1H Peak List

5.3 Conversion of SpecMan 13C Peak List

5.4 Conversion of SpecMan DQF-COSY Peaks Table

5.5 Conversion of SpecMan HMQC/HETCOR Peaks Table

5.6 Conversion of SpecMan HMBC/COLOC Peaks Table

5.7 Conversion of SpecMan NOESY Peaks Table

5.8 Conversion of SpecMan INADEQUATE Data

5.9 Manual Peak Picking

Spectral Interpretation

6.1 Overview

6.2 Interpretation of MF, 1H, 13C and HMQC Data as Building Blocks

6.2.1 Interpretation of Molecular Formula

6.2.2 Interpretation of 1D 1H Data

6.2.3 Interpretation of 1D 13C Data

6.2.4 Interpretation of HMQC/HETCOR Connectivities

6.2.5 Generation of Building Blocks

6.3 User-Defined Building Blocks

6.4 Interpretation of 2D Spectral Data as Bond Constraints

6.4.1 Interpretation of COSY Connectivities

6.4.2 Interpretation of HMBC/COLOC Connectivities

6.4.3 Interpretation of NOESY Connectivities

6.4.4 Interpretation of INADEQUATE Connectivities

6.4.5 Transformation of Bond Constraints

6.4.6 Setting up Atom-Atom Connection Matrix (ACMX)

2D Structure Generation

7.1 Overview

7.2 User-Defined Bond Constraints

7.2.1 Interactive Structure Generation

7.3 User-Defined Atom Environment Constraints

7.4 Structure Generation

Resonance Assignment

8.1 Overview

8.2 Input of the Target Structure

8.2.1 Inputting the Target Structure Interactively

8.2.2 Inputting the Target Structure via MDL File

8.2.3 Setting up the Assignment Matrix

8.3 User-Defined Resonance Assignment

8.4 Resonance Assignment

Isomer Enumeration/Quick Elucidation

9.1 Overview

9.2 MF-based Isomer Enumeration

9.3 Quick Structure Elucidation

Graphical Display of Results

10.1 Overview

10.2 Display of Structural Building Blocks

10.3 Display of Target Structure

10.4 Display of Generated Structures/Assignments

10.5 Status Window

10.6 Display Options

10.7 Editing the Display of Generated Structures

Exporting Results

11.1 Overview

11.2 Exporting NMR Spectral Data

11.3 Exporting Resonance Assignment

11.4 Exporting Candidate or Target Structures

NMR Data File

1D Spectral Data

2D Spectral Data

Master Data File

CCSS-13C Chemical Shift Range Correlation Table

Control Parameters

Parameters for Spectral Interpretation

Parameters for Setting up ACMX

Parameters for Structure Generation

References

Index


Abbreviations And Acronyms

δ13C          13C chemical shift.

δ1H           1H chemical shift.

1D            One-dimensional.

2D            Two-dimensional.

ACMX          Atom-atom Connection MatriX, which summarizes the bond-formation probabilities between the constituent atoms of an unknown.

BB            Structural Building Blocks for structure generation, e.g., CH3-, CH2<, and -OH.

BC            Bond Constraint derived from 2D NMR spectral data, which defines the number of intervening bonds between the correlated spins.

CCSS          Carbon-Centered Single-spherical Substructure.

COLOC         COrrelation via LOng-range Coupling, a kind of 2D spectrum that provides 2-to-3-bond 13C-1H connectivities.

COSY          COrrelated SpectroscopY, a kind of 2D spectrum that provides 1H-1H through-bond connectivities.

CPU           Central Processing Unit.

DEPT          Distortionless Enhancement by Polarization Transfer, a kind of 1D spectrum that provides information concerning the number of attached protons on each carbon atom.

EC            Environment Constraint, a user-specified limitation on the types of neighboring atoms attached to a central atom.

HETCOR        HETeronuclear CORrelation, also called C-H COSY, a kind of 2D spectrum that provides one-bond 13C-1H connectivity information.

HMBC          Heteronuclear Multi-Bond Connectivity, a kind of 2D spectrum that provides 2-to-3-bond 13C-1H connectivity information.

HMQC          Heteronuclear Multiple Quantum Coherence, a kind of 2D spectrum that provides one-bond 13C-1H connectivity information.

INADEQUATE    Incredible Natural Abundance Double Quantum Transfer Experiment, a kind of 2D spectrum that provides one-bond 13C-13C connectivity information.

MDF           The Master Data File produced while using NMR-SAMS for structure elucidation. This file stores the intermediate and final results produced during the execution of NMR-SAMS.

MF            Molecular formula or empirical formula of a molecule, which is usually derived from mass spectral data.

NMR           Nuclear Magnetic Resonance.

NOESY         Nuclear Overhauser enhancement and Exchange SpectroscopY, a kind of 2D spectrum that provides 1H-1H through-space connectivity information.

NSBC          Number of “Sub-bond constraint(s)”, or pair(s) of relevant atoms, that must satisfy a bond constraint in the generated structure.

PSE           Partial Structure Elucidation: structure elucidation based on the information available from a portion of the spectral data, usually the well-resolved part.


Chapter 1

Introduction

1.1 General

NMR-SAMS (NMR Spectral Assignment Made Simple) is an expert system for computer-assisted structure elucidation of unknown organic or natural product compounds from multidimensional spectroscopy, e.g., MS, NMR, IR and UV, which provide complementary information about chemical compounds. In particular, NMR-SAMS uses information from routine 1D and 2D NMR spectroscopy. Together with SpecMan, it serves as a chemist’s workbench for de novo structure elucidation of small molecules such as organic compounds, natural products, peptides, and other small biomolecules. NMR-SAMS is also used for automated resonance assignment of known compounds.

The basic strategy of structure elucidation using NMR-SAMS is illustrated in Fig. 1.1. When dealing with an unknown compound, the molecular formula (MF) must first be determined by mass spectrometry or other approaches. Next, the 1D and 2D NMR chemical shifts, multiplicities, J-couplings and intensities are extracted from the processed 1D and 2D spectra (transformed through conventional FFT or non-FFT techniques) using the SpecMan software. The spectral data extracted as peak lists with SpecMan are imported into NMR-SAMS and interpreted as structural building blocks and bond constraints based on the one-bond, two-bond and other long-range connectivities. Finally, the building blocks, NMR-derived bond constraints, and other user-defined bond constraints are used to generate the plausible candidate structures with resonance assignments. If the structure is already known, you can specify the proposed structure and let NMR-SAMS complete the resonance assignments directly.

Figure 1.1. Data flow diagram of NMR-SAMS representing the different phases of spectral interpretation, structure generation and resonance assignment. Gray boxes represent optional input data. PSE means partial structure elucidation based on incomplete spectral data. A bond constraint is represented as n intervening bonds, (B)n, between the correlated atoms.

NMR-SAMS has the following main features:

·        Input of peak tables with chemical shifts, multiplicities, J-couplings and intensities from a variety of 1D and 2D NMR experiments.

·        Automated interpretation, bookkeeping, and cross-checking of spectral data with respect to the molecular formula.

·        Novel representation of 2D NMR correlation information based on the concept of a chromatic graph.

·        Structure determination and identification of unknown compounds based on full use of 2D NMR correlation information, and complementary spectral information from MS, UV and IR spectral data. 

·        Partial structure elucidation of compounds based on incomplete spectral data.  

·        Graphical tools for interactive building and editing of molecular fragments, and defining bond constraints and atom environment constraints.

·        Graphical tools to display and browse through candidate structures and sub-structures.  Graphical interaction between structures and bond constraints.

·        Background information-independent structure elucidation, which minimizes the potential human bias introduced into the structure elucidation process.

·        Fast structure generation of complex molecules when sufficient constraints are available. 

·        Fast resonance assignment and structure verification of large complex molecules based on proposed structures.  

·        Automated resonance assignment based on assigned resonances of compounds.

·        Flexible format for report generation of the results of spectral and structural analysis. 

1.2 Application Limitations

The current version of NMR-SAMS can only handle molecules that have fewer than 128 non-hydrogen atoms. The total number of free bonds (unsatisfied valences) of the structural building blocks before structure generation, which determines the complexity of the structure generation problem, must not exceed 220. (The total number of free bonds equals the sum of the valences of the heavy atoms, less the number of protons and less twice the number of known bonds.) The maximum number of peaks is 200 in a 1D spectrum and 1000 in a 2D spectrum. The maximum number of bond constraints is 1000.
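The free-bond rule above can be checked by hand before a data set is prepared. The following Python fragment is a minimal sketch of that arithmetic and is not part of NMR-SAMS. The valence table, the function names and the constant names are assumptions made for illustration, and the guide does not state how multiple bonds are counted, so the number of known bonds is simply passed in by the caller.

```python
# Illustrative sketch only; not NMR-SAMS code.
# Assumption: standard valences for a few common elements.
STANDARD_VALENCE = {"C": 4, "N": 3, "O": 2}

# Limits quoted in Section 1.2 of this guide.
MAX_HEAVY_ATOMS = 127   # "fewer than 128 non-hydrogen atoms"
MAX_FREE_BONDS = 220


def free_bonds(heavy_atoms, n_protons, n_known_bonds=0):
    """Sum of heavy-atom valences, less protons, less twice the known bonds.

    heavy_atoms maps an element symbol to its count, e.g. {"C": 2, "O": 1}.
    """
    valence_sum = sum(STANDARD_VALENCE[element] * count
                      for element, count in heavy_atoms.items())
    return valence_sum - n_protons - 2 * n_known_bonds


def within_limits(heavy_atoms, n_protons, n_known_bonds=0):
    """Check the heavy-atom and free-bond limits quoted in Section 1.2."""
    n_heavy = sum(heavy_atoms.values())
    n_free = free_bonds(heavy_atoms, n_protons, n_known_bonds)
    return n_heavy <= MAX_HEAVY_ATOMS and n_free <= MAX_FREE_BONDS


# Ethanol, C2H6O, with no bonds known in advance: (4 + 4 + 2) - 6 - 0 = 4.
print(free_bonds({"C": 2, "O": 1}, n_protons=6))
```

For ethanol the result, 4, matches the free bonds of the building blocks CH3-, CH2< and -OH (1 + 2 + 1). Supplying one known bond lowers the count by two, which is why entering known substructures reduces the size of the generation problem.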

Most of the previously proposed CASE (computer-assisted structure elucidation) systems either use a chemical shift-substructure correlation database or a more concise chemical shift-substructure correlation model, and rely to a large extent on the knowledge of a human expert. Such systems have been limited to very simple and small molecules. NMR-SAMS has demonstrated the impact of using 2D NMR correlation information on improving the efficiency of CASE systems when dealing with real-world complex molecules. For efficient structure elucidation of unknown compounds, NMR-SAMS requires the molecular formula, 1D 1H, 13C, DEPT (or APT), and 2D DQF-COSY, HMQC (or HETCOR), HMBC (or COLOC, FLOCK), and INADEQUATE spectral data. The molecular formula may or may not be known accurately from MS or other methods; if it is unknown, NMR-SAMS estimates it from the number of observed carbon and proton peaks along with any available heteroatom information. It is not mandatory to have all of these experimental NMR data sets available, because NMR-SAMS can also solve structure elucidation problems with different combinations of experimental data (for details refer to Section 3.3). Structure elucidation based on 1D 13C chemical shifts alone is only possible for very simple molecules, and is not practical for complex molecules. NMR-SAMS cannot elucidate unknown structures based on 1D 1H chemical shifts alone.

Although most spectra used by NMR-SAMS, e.g., 1D 1H, 2D DQF-COSY and HMBC, are allowed to have peak degeneracy, the 1D 13C spectrum and HMQC (or HETCOR) must be completely resolved for complete structure elucidation.  If severe overlap prevents resolving all 13C peaks, NMR-SAMS will use only the well-resolved spectral data to generate the plausible substructures. This is called partial structure elucidation (PSE).  Some limitations on PSE are described in Section 7.1.

In the current version, NMR-SAMS does not consider molecular symmetry, so only partial structure elucidation is performed for a molecule with global symmetry. For a molecule with local symmetry, where the 13C signals corresponding to symmetric carbons can be identified, complete structure elucidation by NMR-SAMS is possible.

Most of the steps in NMR-SAMS, such as interpretation of 1D and 2D data into bond constraints and generation of the building block sets, usually run very quickly. Structure generation, on the other hand, is more time-consuming because of its combinatorial nature. The efficiency of structure generation (which reflects the computation time, the quality of the structures generated, and the number of structures generated) depends on the size of the molecule and the quality and quantity of the spectral data. When the unknown molecule is big (e.g., with more than 40 heavy atoms) and the correlation information derived from the spectral data is not sufficient, structure generation could take a very long time to finish. In such cases you are advised to input as many known substructures as possible to accelerate the structure generation process. You can also take advantage of the other tools of NMR-SAMS to tackle the structure, such as the resonance assignment function to verify a proposed structure, and the flexible graphics tools to build the structure interactively.

Although the spectral interpretation routines of NMR-SAMS are general-purpose, the structure generator of NMR-SAMS cannot deal with molecules containing ionic atoms or tautomeric or coordinate bonds. It recognizes only single, double and triple bonds. Aromatic bonds are represented as alternating single and double bonds. Sometimes this might cause redundancy in the structure generation of aromatic compounds.

In the current version of NMR-SAMS, if the structure is already known, then target structure based resonance assignment is possible, provided the NMR data set is complete.

Although NMR-SAMS can recognize all the chemical elements, the current substructure/δ13C knowledge base (see Appendix III) contains only substructures consisting of the commonly occurring elements C, H, O, and N. You can customize this knowledge base. When other elements exist in the molecule, you will be informed about the undefined substructures, and this could reduce the efficiency of structure generation.

NMR-SAMS can be viewed as an expert assistant helping spectroscopists and chemists to solve structure elucidation problems, and is by no means expected to replace the human expert.  NMR-SAMS is designed for flexible human intervention, and efficiently uses the additional user knowledge and judgment to control and enhance the structure elucidation.  

1.3 System Requirement

The IRIX version of NMR-SAMS runs on SGI systems running IRIX 5.3 or higher, or 6.x, with R4000 or higher processors, at least 32 MB of RAM, and 8-bit graphics. R8000 or higher processors and 64 MB or more of RAM are recommended. A faster and smaller 64-bit version of NMR-SAMS can be supplied to users running R8000 or higher systems with IRIX 6.x.

The Solaris version of NMR-SAMS runs on Sun systems running Solaris 2.x (SunOS 5.x) with SPARC processors and at least 32 MB of RAM and 8-bit graphics. 64 MB or more RAM is recommended.  X/Motif 1.2.3 libraries are required.  These are usually supplied with the Sun Common Desktop Environment (CDE).

The Microsoft Windows version of NMR-SAMS runs under Windows 95 or Windows NT 3.51 or later on Intel 386 or higher processors (or 100% compatibles) with at least 32 MB of RAM and a VGA or better monitor. A Pentium or higher processor with 32 MB or more of RAM is recommended.

NMR-SAMS requires from 2 MB to 55 MB of hard disk space, depending on the sample data that is installed. The sample data with original spectra requires 40 MB of hard disk space. The swap space (i.e., virtual memory) required is proportional to the complexity of the data being analyzed.

1.4 Help Facility

NMR-SAMS provides on-line help. Most of the dialog boxes have a Help button that can be clicked to get a help message about the dialog box.

1.5 Typographical Conventions

Unless otherwise noted in the text, the User’s Guide of NMR-SAMS uses the typographical conventions described below:

·        A command to select is represented in bold type face by the menu name, the option, and the pull-right option (if any). For example, the command:

Display/Display Options/Chemical Shifts    

means: first click the Display menu on the menu bar, then click Display Options in the opened menu, and then click Chemical Shifts in the pull-right options.

·        Transcript of a computer file or display is printed in Courier New letters with the keywords shown in bold, and the annotations (if any) in italic Times letters. (Such annotations do not appear in the file or display itself).

ATOM~~ATOM:

For each correlation, listed are the IDs of the correlated atom pair, the range of intervening bonds, and the bond type (0: meaningless or unknown)

(1-23: 1~1 2)
(6-22: 1~1 3)

     .

     .

     .

·        Filenames and parameters are printed in Courier New letters. For example:

Files phasefile and procpar are used for peak picking with SpecMan. 

Parameter GEN_FLAG controls the search criteria of the structure generation.

·        Terms introduced for the first time are presented in boldface type.

·        Words in italic represent variables. For example:

There are n intervening bonds between the correlated atoms.
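The ATOM~~ATOM transcript shown in the conventions above has a regular shape: two atom IDs, a range of intervening bonds written as min~max, and a bond type. The Python fragment below is a minimal sketch of how such entries could be read outside NMR-SAMS. It is based only on the two sample entries printed in this section; the pattern, the Correlation record and the parse_correlations function are hypothetical names introduced for illustration, and the actual file layout may contain other fields.

```python
import re
from typing import List, NamedTuple


class Correlation(NamedTuple):
    """One entry of the form (a-b: min~max type); a hypothetical record type."""
    atom_a: int
    atom_b: int
    min_bonds: int
    max_bonds: int
    bond_type: int   # 0 is described in the guide as meaningless or unknown


# Assumed pattern, inferred from the sample entries "(1-23: 1~1 2)" and "(6-22: 1~1 3)".
ENTRY = re.compile(r"\(\s*(\d+)-(\d+):\s*(\d+)~(\d+)\s+(\d+)\s*\)")


def parse_correlations(text: str) -> List[Correlation]:
    """Return every entry found in the text, in order of appearance."""
    return [Correlation(*map(int, match.groups()))
            for match in ENTRY.finditer(text)]


sample = "(1-23: 1~1 2)\n(6-22: 1~1 3)"
for entry in parse_correlations(sample):
    print(entry)
```

Text that does not match the assumed pattern is skipped silently, so a stricter reader would be needed to validate a complete file.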


Chapter 2

Getting Started with NMR-SAMS

2.1 Installation of the Program

To install NMR-SAMS, please refer to the Release Notes of Spectrum Research Products.

2.2 Spectrum Research Licensing

NMR-SAMS is copy protected by the Spectrum Research Licensing System.  This licensing system allows NMR-SAMS to run only on the computer for which it was sold.  You should have received a license.dat file along with your installation.  This plain text file should be placed in the main NMR-SAMS directory (C:\Spectrum\ Nmr-sams by default). 

If you did not receive a license file with your NMR-SAMS installation, please contact Spectrum Research.  To create a license file for you, we need your Windows Serial Number (Product ID) or UNIX System ID.  Under Windows 95 and Windows NT 4.0, you can find this by clicking the right mouse button on the “My Computer” icon on your Windows Desktop and choosing “Properties” from the menu that pops up.  Your Windows Serial Number is printed last in the “Registered To:” section and is of the form XXXXX-XXX-XXXXXXX-XXXXX, where the X characters are replaced by numbers and letters.  Under Windows NT 3.51, choose “About Program Manager” from the “Help” menu of Program Manager.  The Product ID is listed on the dialog that appears; Windows NT 3.51 serial numbers are of the form XXXXX-XXX-XXXXXXX.  On SGI systems, type /etc/sysinfo at a UNIX prompt to get your System ID.  SGI System IDs are hexadecimal numbers.  The first 8 digits (4 groups of 2 digits) are the ones needed by the Spectrum Research License Manager.  On Sun systems, type ‘hostid’ (usually in the /usr/bsd/ directory).  The 8 digits that are given are the identifier needed for the license.
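
Before sending an identifier to Spectrum Research, it can be helpful to check that it has one of the forms described above.  The following Python sketch does this under stated assumptions: the function name classify_system_id is hypothetical, the patterns are taken only from the forms quoted in this section, and the 8-digit SGI/Sun identifier is assumed to be hexadecimal with optional spaces between digit groups.  It does not reproduce any check performed by the License Manager itself.

```python
import re

# X may be a digit or a letter, as stated in the text.
X = "[0-9A-Za-z]"
WIN95_NT4 = re.compile(rf"{X}{{5}}-{X}{{3}}-{X}{{7}}-{X}{{5}}")
WIN_NT351 = re.compile(rf"{X}{{5}}-{X}{{3}}-{X}{{7}}")
HEX8 = re.compile(r"[0-9A-Fa-f]{8}")


def classify_system_id(text):
    """Return a label for the form of identifier that `text` matches."""
    text = text.strip()
    if WIN95_NT4.fullmatch(text):
        return "Windows 95 / NT 4.0 Product ID"
    if WIN_NT351.fullmatch(text):
        return "Windows NT 3.51 Product ID"
    # Assumption: digit groups may be separated by spaces in the typed input.
    if HEX8.fullmatch(text.replace(" ", "")):
        return "8-digit hexadecimal ID (SGI or Sun)"
    return "unrecognized"


print(classify_system_id("12345-OEM-1234567-12345"))
print(classify_system_id("69 0a 1b 2c"))
```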

When your licensing time period is nearing expiration, NMR-SAMS will warn you with a dialog box that tells you the number of days remaining.  Please contact Spectrum Research for a renewal at this time.


2.3 Starting NMR-SAMS

From the Program Manager or the Start Menu, click the NMR-SAMS icon in the Spectrum Research group to launch the NMR-SAMS program.  The program starts with a Main Graphics Window that has a menu bar and status bar. By default, a Status Window is also opened, which displays text messages to indicate the current status of the structure elucidation, and also prompts you with the “what to do next” steps.  The main graphics window is shown below:

When NMR-SAMS is started, it reads the following three files from the directory where you launched NMR-SAMS.  If any of these files is not found there, NMR-SAMS tries to read it from the NMR-SAMS installation directory.  If a file is still not found, NMR-SAMS warns you that it is missing, except in the case of nmrsams.ini.

nmrsams.ini, which defines some of the initial settings of the program, such as the window sizes, the colors of the background, atom, and bonds,  and the preferred editor etc.  If this file is not found, default settings are used.

periodic_tab.def, which defines some properties of the chemical elements.  If this file is not found or is not read properly, NMR-SAMS will not be able to recognize any element symbols or perform the related functions.

chemical_shifts.def, which defines the knowledge base of 13C chemical shift dispersion ranges for some common carbon-centered single-spherical substructures (CCSS) (see Appendix III).  If this file is not found or is not read correctly, structure generation will not be possible (see Section 3.5).
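
The lookup order described above (launch directory first, then installation directory, with a warning for every missing file except nmrsams.ini) can be sketched in Python as follows.  This is only an illustration of the documented behavior; the function name locate_startup_files and the wording of the warnings are assumptions, not the actual NMR-SAMS implementation.

```python
from pathlib import Path

STARTUP_FILES = ["nmrsams.ini", "periodic_tab.def", "chemical_shifts.def"]
# Built-in defaults are used when this file is absent, so no warning is issued.
NO_WARNING = {"nmrsams.ini"}


def locate_startup_files(launch_dir, install_dir):
    """Return (found, warnings) for the three startup files.

    `found` maps each file name to the first path at which it exists.
    """
    found, warnings = {}, []
    for name in STARTUP_FILES:
        for directory in (Path(launch_dir), Path(install_dir)):
            candidate = directory / name
            if candidate.is_file():
                found[name] = candidate
                break
        else:
            # Reached only when the file is in neither directory.
            if name not in NO_WARNING:
                warnings.append(f"{name} is missing")
    return found, warnings


found, warnings = locate_startup_files(".", "no_such_install_dir")
for message in warnings:
    print("Warning:", message)
```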

2.4 Brief Introduction to Microsoft Windows

If you are new to Microsoft Windows or Windowing systems in general, please read this section before using NMR-SAMS.  It will help you to become acquainted with the NMR-SAMS interface.

First, it is a good idea to become acquainted with the online help system provided by Microsoft Windows.  The online help system is called from within NMR-SAMS when you click on a "Help" button.  It brings up context-sensitive help in a window.  There is also a Help Contents facility (also known as an Index), which consists of a list of the topics in the online help.  You can click on one of these items to bring up its corresponding information.  The Contents is available via NMR-SAMS's Help menu and from the Online Help Viewer window by clicking on the “Contents” button.

When you first start NMR-SAMS, a window will appear with "NMR-SAMS, version 2.0, (C) Spectrum Research, LLC." on the top.  The area where this text appears is referred to as the "Title Bar."  You can press the left mouse button while the arrow pointer (which is called the "Cursor") is on the title bar and then move the mouse to move the window.  Release the mouse button to stop moving the window.  That combination of events (pressing a mouse button, moving the mouse, and then releasing) is known as "Dragging".  Position the mouse pointer so that it is over the word "File", located immediately below the title bar.  Now press and then immediately release the left mouse button.  This procedure (pressing a mouse button and then releasing without moving the mouse) is known as "Clicking".  The item that you clicked on was the "Menu Bar".  The menu bar consists of several "Menus" ("File", “Edit”, "Display", "Analysis", and "Help").  After you clicked on the File menu, a "Pulldown" appeared.  This pulldown consists of "Menu Items" ("Open...", "New...", etc.).  If you click on one of these menu items, something will occur.  Menu items are the primary way that you, as a "User" of NMR-SAMS, communicate your wishes to NMR-SAMS. 

Some items on menus are not menu items, however.  The line that appears above the "Quit" menu item is known as a "Separator".  Its purpose is solely to make the menu easier to read. Click on the "Display" menu.  Notice that the "Create NMR Data File" menu item has a right-pointing triangle after its text.  This type of menu item is known as a "Pullright".  Click the mouse on the "Create NMR Data File" menu item.  You will see another group of menu items appear to the right of it.  The pullright feature is used to group related menu items together, reducing the size of the main pulldowns.  Click on the "Display" menu and you’ll see the menu item "Status Window", which is known as a "Toggle".  Toggles have two states:  "Off" (also known as "Deselected" or "Deactivated"), and "On" (also known as "Selected" or "Activated").  If the status window is on, turn off the "Status Window" toggle by clicking on it.  You will notice that the status window disappears.  Click on the "Display" menu and turn on the “Status Window” toggle by clicking on it again.  You will notice that the status window pops up again.

Position the mouse cursor over the frame that surrounds the entire NMR-SAMS window.  Drag the mouse to change the size of the NMR-SAMS window.  All sides of the NMR-SAMS window can be moved to size the window. The field below the NMR-SAMS Toolbar is known as the "Main Graphics Window".  This is where information about chemical structures is displayed.  At the bottom of the Main Graphics Window is the "Status Bar".  The status bar prints out information about what is going on in NMR-SAMS.  It will notify you if you do something that NMR-SAMS isn't prepared to do.  Also, it will give you hints about using NMR-SAMS. 

Click on the "Open..." menu item from the "File" menu.  A window will appear with the title "Open".  This type of window is known as a dialog box.  While a dialog box is displayed, you must interact with it before continuing with other areas of NMR-SAMS. Dialog boxes also have a "Help" button, which brings up online help about the dialog box when clicked.  The dialog box that is currently displayed is referred to as the "File Browse Dialog".  It is used to specify a file. To get to a certain directory, use the “Directory” combo box to find the proper parent directory.  You can descend the directory structure by double clicking on a directory name from the list.  (A “Double Click” is two clicks in rapid succession.)  After you have changed to the proper directory, you will see a list of "Files" that have an extension of “.mdf”.  Click on one of the filenames to select it. The "OK" button on the bottom of the dialog box is used to accept the input that you have selected.  Click the "Cancel" button to close the dialog box without performing an action.

When multiple candidate structures are generated, the first structure is displayed along with a window titled Structure Browser.  This window is known as a "Palette."  Palettes are similar to dialog boxes; however, you can interact with them and with the main NMR-SAMS window at the same time.  The "Structure Browser" palette is used to control the display of the candidate structures. In the "Structure Browser" palette, you will notice a "Slider".  You can drag the slider bar to the left or right to change its value, which determines the sequential number of the structure to be displayed.  Some palettes also have text fields where you can type in numbers or text.

You should now have enough information to start exploring NMR-SAMS.  Note that NMR-SAMS grays out menu items that you cannot select, depending on the current progress of your structure elucidation process. For example, if you have not prepared the NMR data file, the menu item Analysis/Interpret NMR Data remains grayed out.  This guides you step by step through the structure elucidation process.

 

2.5 Description of the Main Menus

The menu bar appears at the top of the main graphics window and contains the names of the five NMR-SAMS menus:

You perform all tasks in NMR-SAMS by selecting options from these five menus. The five menus are described briefly on the following pages and in greater detail in the other chapters of this book.


The File menu     
The File menu lists options related primarily to reading data into and out of NMR-SAMS. The following figure illustrates the File menu:

 

 

The Edit menu     
The Edit menu lists options related to editing of the working data set files and the generated structures.  The following figure illustrates the Edit menu:

 

The Display menu                               
The Display menu lists options related to the graphical display of intermediate and final results of NMR-SAMS. The following figure illustrates the Display menu:

 


The Analysis menu                             
The Analysis menu lists the options related to structure elucidation. The following figure illustrates the Analysis menu:

The Help menu
The Help menu lists the options related to the on-line help of NMR-SAMS. The following figure illustrates the Help menu:

2.6 The NMR-SAMS Toolbar

The toolbar appears between the menu bar and the Main Graphics Window.  It contains icons (pictures) that represent commonly used menu items.  Clicking one of the icons performs the same action as selecting the corresponding menu item.

The following menu items have associated toolbar icons:

    File/New

    File/Open

    File/Save

    Display/Building Blocks & Fixed Bonds

    Display/Target Structure

    Display/Generated Structures or Assignments

    Display/Status Window

    Display/Display Options/Balls

    Display/Display Options/Carbon Symbols

    Display/Display Options/Numbers

    Display/Display Options/Chemical Shifts

    Display/Display Options/Protons

    Display/Display Options/Molecular Formula

    Display/Display Options/Connection Table

    Display/Display Options/Refine

    Help/Contents


Chapter 3

Understanding NMR-SAMS

3.1 Overview

This chapter introduces the basic procedure of structure elucidation, briefly describes the concepts and principles of NMR-SAMS, and concludes with a high-level discussion of the typical flow of activity through NMR-SAMS.

3.2 General Procedure of Structure Elucidation with NMR-SAMS

The process of structure elucidation of an unknown compound through NMR spectroscopy consists of the following steps:  

1.      Determination of the molecular formula (MF) by MS.  Determination of some functional groups in the unknown compound through IR and UV spectroscopy.  MF is optional to NMR-SAMS v2.0.

2.      Data acquisition of 1D and 2D NMR data.  See Section 3.3 for the spectral data used by NMR-SAMS.

3.      Extraction of peak tables with chemical shifts, intensities, J-couplings, and multiplicities.  Peak picking of 1D and 2D NMR spectral data is performed with SpecMan using automatic and semi-automatic procedures (for details see the User’s Guide of SpecMan).  The peak tables are converted to the NMR-SAMS representation of connectivity information (see Chapter 5).

4.      Setup of the parameters to control the spectral interpretation and structure generation.  In most cases, the default values of these parameters can be used.  (see Appendix IV)

5.      Interpretation of molecular formula, if any, along with 1D 1H, 13C, and HMQC spectral data to obtain the structural building blocks.  If the MF is unknown, you can interactively add heteroatoms into the building block sets (see Chapter 6).

6.      Interpretation of other 2D NMR spectral data to obtain the bond constraints (see Chapter 6).

7.      Generation of candidate structures that are consistent with the experimental data for unknown compounds (see Chapter 7), or verification of the proposed structure and completion of 1H and 13C  resonance assignments (see Chapter 8) for known compounds.  Interactive structure generation and resonance assignment is also possible (see Section 7.2.1).

8.      Exporting results of structure generation and resonance assignments (see Chapter 11).

Structure elucidation is usually an iterative approach, so this process may need to be repeated several times until you get satisfactory results.  NMR-SAMS assists you in identifying and correcting the inconsistencies in the input data.  When sufficient input data is not available, NMR-SAMS generates only partial structures with resonance assignments.   NMR-SAMS also warns you about some common pitfalls that could lead to incomplete or incorrect structure generation, and provides clues for further refinement.

3.3  What Spectral Data Does NMR-SAMS Use?

The possible combinations of 1D and 2D spectral data used by NMR-SAMS for structure elucidation are listed in Table 3.1. The fifth combination, which uses routine 1D and 2D NMR spectra along with complementary information from other spectral data (MS, UV, and IR), is the recommended one for structure elucidation of real-world complex molecules.  Other spectral sources such as MS, IR, and UV are not directly interpreted by NMR-SAMS, but they can be conveniently used as user-defined bond/environment constraints.

Table 3.1. Possible combinations of 1D and 2D NMR spectral data used by NMR-SAMS a

No.    1D                   2D                                       Comments

1      None                 None                                     Pure isomer enumeration from MF

2      13C (and DEPT b)     None                                     Very low efficiency except for simple molecules.

3      13C, DEPT b          INADEQUATE                               Very high efficiency, if data available.

4      13C, DEPT b, 1H      DQF-COSY c, HMQC d                       Low efficiency except for H-rich molecules.

5      13C, DEPT b, 1H      DQF-COSY c, HMQC d, HMBC e (NOESY f)     Most practical way for de novo structure elucidation of complex molecules.

6 g    1H                   DQF-COSY c, HMQC d, HMBC e (NOESY f)     Practical when the amount of sample does not allow carbon-detecting experiments.

a TOCSY is not used directly by NMR-SAMS but can be used by SpecMan  to assist the peak picking of DQF-COSY.

b INEPT, or APT can also be used.

c Other types of COSY experiments can also be used, as long as they provide geminal and vicinal H-H through-bond connectivity.

d HSQC, HETCOR, or other types of spectra can also be used, as long as they provide one-bond C-H connectivity.

e COLOC, FLOCK, or other types of spectra can also be used, as long as they provide long-range C-H connectivity.

f NOESY or ROESY is optional.

g HMBC and HMQC must be clean enough to allow extraction of 13C chemical shifts and multiplicity information. 13C chemical shifts can be automatically extracted from HMBC using SpecMan.  13C multiplicities must be identified manually from the HMQC spectrum.

3.4 Use of 2D NMR Connectivities: Bond Constraints

NMR-SAMS mainly uses 2D NMR-derived through-bond spin-spin connectivity information for structure elucidation, because it is reliable and provides comprehensive structural information for de novo structure elucidation.

In NMR-SAMS, the coordinates of 2D cross peaks are first converted into connectivities between the relevant 1D peaks, and  then interpreted as bond constraints on the relevant atoms. A bond constraint (BC) is a requirement of a certain number (or a range) of intervening chemical bonds between the correlated spins. For an asymmetric molecule, such spin-spin BCs are directly used as atom-atom bond constraints.  In addition to its efficient utilization of BCs involving ambiguous bond separation (e.g., 2 or 3 bonds between two HMBC-correlated spins), NMR-SAMS can also cope with BCs concerning ambiguous atoms. Such ambiguity typically arises from peak degeneracy or low digital resolution.

In NMR-SAMS, a BC is represented in the following general format:

(Atom_y ... - Atom_x ... : minBond ~ maxBond; BondType; minNSBC ~ maxNSBC)Source

where

Atom_y ... is the correlated atom(s) along the Y dimension (13C domain for a heteronuclear spectrum). It could be more than one in the case of ambiguity.

Atom_x ... is the correlated atom(s) along the X dimension (1H domain for a heteronuclear spectrum).  It could be more than one in the case of ambiguity.

minBond and maxBond are the minimum and maximum bond separations between the relevant atoms.

BondType is the type of the intervening bond between the atoms. Valid choices are: 0, 1, 2, or 3 for unknown, single, double, and triple, respectively.

minNSBC and maxNSBC are the minimum and maximum numbers of relevant atom pair(s) that must satisfy this BC in the generated structure. 

Source encodes the connectivity (or other source) from which the BC was derived. A connectivity is represented by its spectral type and its ID number. The following codes are used to represent the different spectral types:

“C” for COSY, “Q” for HMQC (or HETCOR), “B” for HMBC (or COLOC), “N” for NOESY, “I” for INADEQUATE.

Note: The ID of a connectivity is different from, though related to, the peak ID(s) in the SpecMan peak tables (For more details see Fig. 6.4 in Chapter 6).

The following codes are used to represent other kinds of source:

“S” for a pseudo BC added by the program, “U” for a user-defined BC, and “G” for a previously generated bond (when using a generated substructure as the starting point for the next structure generation cycle).

 

For example, an HMBC-derived bond constraint is represented as:

(10 - 17 18: 2 ~ 3; 0; 1 ~ 2)B10

In the above example, the first set of numbers “10 - 17 18:” denotes the atoms that are correlated.  In this case, since the chemical shifts of H-17 and H-18 are very close, it is hard to resolve which one of them is really correlated to C-10.  So both protons are retained to represent the possibilities that there could be a correlation between C-10 and H-17, or C-10 and H-18, or both.  The next set of numbers “2~3” indicates that there could be two or three intervening bonds between the correlated C-H pair(s).  The next number “0” represents the bond type of the intervening bonds, which in this case is treated as unknown.  The next set of numbers “1~2” indicates that either one or both pairs of the atoms involved in the bond constraint must satisfy this bond constraint in the computed structure (i.e., C-10 and H-17, or C-10 and H-18, or both pairs).  Finally, the character string “B10” means that this bond constraint was derived from the HMBC connectivity #10.  From the comment of this connectivity in the .nmr file, the ID of the actual cross peak (in the SpecMan peaks table) can be found (see Fig. 6.4 in Chapter 6).
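
To make the general BC format concrete, the following Python sketch parses a bond constraint written in that format into named fields.  It is a minimal illustration only: the names BondConstraint and parse_bc are hypothetical, atom IDs are assumed to be integers separated by spaces, and the ID number after the source code is assumed to be optional.  It is not the parser used by NMR-SAMS.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# (Atom_y ... - Atom_x ... : minBond ~ maxBond; BondType; minNSBC ~ maxNSBC)Source
BC_PATTERN = re.compile(
    r"\(\s*(?P<y>[\d\s]+?)\s*-\s*(?P<x>[\d\s]+?)\s*:\s*"
    r"(?P<min_bond>\d+)\s*~\s*(?P<max_bond>\d+)\s*;\s*"
    r"(?P<bond_type>[0-3])\s*;\s*"
    r"(?P<min_nsbc>\d+)\s*~\s*(?P<max_nsbc>\d+)\s*\)\s*"
    r"(?P<source>[A-Z])(?P<source_id>\d*)"
)


@dataclass(frozen=True)
class BondConstraint:
    y_atoms: Tuple[int, ...]   # correlated atom(s) along the Y dimension
    x_atoms: Tuple[int, ...]   # correlated atom(s) along the X dimension
    min_bond: int
    max_bond: int
    bond_type: int             # 0 unknown, 1 single, 2 double, 3 triple
    min_nsbc: int
    max_nsbc: int
    source: str                # e.g. "B" for HMBC, "U" for user-defined
    source_id: Optional[int]


def parse_bc(text):
    """Parse one bond constraint string; raise ValueError if it does not match."""
    match = BC_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a bond constraint: {text!r}")
    return BondConstraint(
        y_atoms=tuple(int(tok) for tok in match["y"].split()),
        x_atoms=tuple(int(tok) for tok in match["x"].split()),
        min_bond=int(match["min_bond"]),
        max_bond=int(match["max_bond"]),
        bond_type=int(match["bond_type"]),
        min_nsbc=int(match["min_nsbc"]),
        max_nsbc=int(match["max_nsbc"]),
        source=match["source"],
        source_id=int(match["source_id"]) if match["source_id"] else None,
    )


print(parse_bc("(10 - 17 18: 2 ~ 3; 0; 1 ~ 2)B10"))
```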

By default, NMR-SAMS treats the unambiguous BCs, which have exactly two correlated atoms, one-bond separation, and minNSBC = maxNSBC = 1 (which means the BC must be satisfied in a generated structure), as fixed bonds. The rest, which have ambiguous bond separation, an ambiguous number of correlated atoms, or both, are treated as ambiguous BCs.  The ambiguous BCs are used as the major constraints for structure generation.  During structure generation, NMR-SAMS computes the number of violations of BCs for the current substructure/structure.  If the actual number of violations of a substructure/structure is less than the upper limit on the allowed number of violations, the substructure/structure is retained; otherwise it is rejected.  The BCs are also used by some advanced heuristic methods for acceleration of the structure generation process (see Section 7.4).
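
The distinction between fixed bonds and ambiguous BCs, and the idea of counting BC violations, can be sketched in Python as follows.  This is an illustration under several assumptions and not the algorithm used by NMR-SAMS: BCs are plain dictionaries with hypothetical keys, every atom named in a BC (including protons) is its own node in the bond graph, atom labels such as "C10" and "H17" are invented for the example, and the check is applied to a complete structure only.  The comparison of the count with the allowed limit is left to the caller.

```python
from collections import deque


def is_fixed_bond(bc):
    """Unambiguous BC: two atoms, one-bond separation, must be satisfied."""
    return (len(bc["y"]) == 1 and len(bc["x"]) == 1
            and bc["min_bond"] == bc["max_bond"] == 1
            and bc["min_nsbc"] == bc["max_nsbc"] == 1)


def build_adjacency(bonds):
    """Turn a list of (atom, atom) bonds into a neighbor map."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def bond_distance(adjacency, start, goal):
    """Number of bonds on the shortest path, or None if not connected."""
    if start == goal:
        return 0
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def is_violated(bc, adjacency):
    """A BC is violated when the number of satisfying pairs is out of range."""
    satisfied = 0
    for y in bc["y"]:
        for x in bc["x"]:
            dist = bond_distance(adjacency, y, x)
            if dist is not None and bc["min_bond"] <= dist <= bc["max_bond"]:
                satisfied += 1
    return not (bc["min_nsbc"] <= satisfied <= bc["max_nsbc"])


def count_violations(bcs, adjacency):
    return sum(1 for bc in bcs if is_violated(bc, adjacency))


# Invented fragment: H17-C17-C10-C11-C18-H18
adjacency = build_adjacency([("C10", "C17"), ("C17", "H17"), ("C10", "C11"),
                             ("C11", "C18"), ("C18", "H18")])
hmbc = {"y": ["C10"], "x": ["H17", "H18"],
        "min_bond": 2, "max_bond": 3, "min_nsbc": 1, "max_nsbc": 2}
hmqc = {"y": ["C17"], "x": ["H17"],
        "min_bond": 1, "max_bond": 1, "min_nsbc": 1, "max_nsbc": 1}
print(is_fixed_bond(hmqc), is_fixed_bond(hmbc))
print(count_violations([hmbc, hmqc], adjacency))
```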

3.5 Use of Chemical Shifts And Peak Multiplicities

NMR-SAMS uses chemical shifts as the labels of carbon atoms, so that 2D NMR-derived correlation information can be used as bond constraints on specific atoms. This is also the reason why a generated structure always has unequivocal 1H and 13C resonance assignments.

13C chemical shifts are also used to evaluate the intermediate structures/substructures produced during the structure generation process.  A knowledge base consisting of a correlation table of substructures and 13C chemical shift (δ) ranges is used for predicting 13C chemical shift ranges.  Each of the substructures consists of the central carbon atom (which is being considered), its attached bonds, and the first layer of its neighboring atoms (the outward bonds of these atoms are not considered).  This is referred to as a carbon-centered single-spherical substructure (CCSS).  Currently, this table consists of the 13C chemical shift ranges of around 93 CCSSs composed of C, N, O, and other common elements, which have been adapted from the literature.  The correlation table is stored as an ASCII file, chemical_shifts.def (see Appendix III), with the code for each CCSS and its expected minimum and maximum 13C chemical shift.  You can customize this file.  The file is read when NMR-SAMS is started.

During structure generation, whenever a carbon atom has a complete CCSS (i.e., its immediate neighbors are known), its expected chemical shift range is derived from the knowledge base and compared with the observed 13C chemical shift of the central carbon. If the observed shift falls within this range, the substructure is accepted; otherwise it is discarded.  If the CCSS is not defined in the knowledge base table, the test is assumed to be passed, and the undefined CCSSs are reported after the structure generation has been completed.  As the CCSSs cover only very limited structural features, their chemical shift ranges are very broad.  Thus in NMR-SAMS, 13C chemical shifts act as a much looser constraint on the structure generation than the 2D NMR connectivities.  Hence it is very important to include as much correlation information as possible for efficient structure generation.  Sometimes the correct structure could be overlooked if the molecule has carbons that show unusual chemical shifts.  In such cases, it is recommended that you broaden the predicted chemical shift ranges by specifying an extra tolerance (for details refer to the description of the parameter ADD_C13_RNG in Appendix IV).
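
The chemical shift test described above can be sketched in Python as follows.  This is an illustration only: the CCSS codes and ppm ranges in the table are invented placeholders and do not come from chemical_shifts.def, the function name check_c13_shift is hypothetical, and the extra tolerance is assumed to widen the range by the same amount on both sides.

```python
# Hypothetical CCSS codes and ranges (ppm), for illustration only.
CCSS_RANGES = {
    "CH3-(C)": (5.0, 35.0),
    "C(=O)(C)(O)": (155.0, 185.0),
}


def check_c13_shift(ccss_code, observed_ppm, ranges,
                    extra_tolerance=0.0, undefined=None):
    """Return True when the observed shift passes the CCSS range test.

    An undefined CCSS passes the test and is recorded in `undefined`
    (a set) so that it can be reported after structure generation.
    """
    if ccss_code not in ranges:
        if undefined is not None:
            undefined.add(ccss_code)
        return True
    low, high = ranges[ccss_code]
    return low - extra_tolerance <= observed_ppm <= high + extra_tolerance


undefined = set()
print(check_c13_shift("C(=O)(C)(O)", 172.3, CCSS_RANGES))
print(check_c13_shift("C(=O)(C)(O)", 190.0, CCSS_RANGES))
print(check_c13_shift("C(=O)(C)(O)", 190.0, CCSS_RANGES, extra_tolerance=5.0))
print(check_c13_shift("C(S)(S)", 40.0, CCSS_RANGES, undefined=undefined), undefined)
```

In this sketch the extra tolerance plays the role that the text assigns to ADD_C13_RNG; the exact semantics of that parameter are described in Appendix IV.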

13C peak multiplicities play an important role in determining the number of attached protons of heavy atoms (i.e., the building blocks). It is therefore recommended that you use DEPT (or INEPT, APT) spectra to obtain complete 13C multiplicity information.

In the current version, 1H chemical shifts are not used to evaluate substructures. 1H peak multiplicities are used to limit the neighboring atoms of the concerned atom. (For details refer to the description about H1MULT_FLAG in Appendix IV.)

3.6  Structure Generation 

During structure generation NMR-SAMS searches all possible ways to assemble the structural building blocks into complete structures.  Within some allowance for the violation of constraints, the generated structures are consistent with all of the available spectral data and chemical constraints. 

The efficiency of structure generation is judged by the computation time, the quality of the structures generated, and the number of structures generated. Because it is a combinatorial problem, structure generation is usually the most time-consuming step.  “Combinatorial explosion” has been the major bottleneck in early attempts at automated structure elucidation.  NMR-SAMS provides novel heuristic search algorithms that reorder the solution space based on bond constraints, and search only the most probable portion of this space for candidate structures.  These methods exponentially reduce the CPU time for structure generation and hence make it practical for complex molecules.  Moreover, as a user you have full control over the usage of these methods to perform optimized structure generation. For example, by modifying a few parameters, you can extend the search space to a more complete search, or simply turn off the heuristic search methods to perform an exhaustive search. On the other hand, you can limit the search space for faster structure generation (see Section 7.4 and Appendix IV about the parameters GEN_FLAG, SAT_BC_RATE, and N_FBX_STEP).
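
To give a feeling for the “combinatorial explosion” mentioned above, the following Python sketch computes a crude upper bound on the number of ways to place single bonds among n labeled heavy atoms, namely the number of subsets of the n(n-1)/2 atom pairs.  The bound ignores valence, bond order, and all spectral constraints, so it greatly overestimates the space that any real generator visits; it is not a measure used by NMR-SAMS, and the function name is hypothetical.

```python
from math import comb


def upper_bound_bond_sets(n_heavy_atoms):
    """Number of subsets of atom pairs for n labeled atoms: 2 ** C(n, 2)."""
    return 2 ** comb(n_heavy_atoms, 2)


for n in (5, 10, 20, 30):
    print(n, comb(n, 2), upper_bound_bond_sets(n))
```

Each additional constraint that rejects a partial structure early removes all of the completions of that partial structure from the search, which is why bond constraints are so effective at reducing the computation time.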

For relatively small molecules (e.g. < 30 heavy atoms) with reasonably clean and sufficient spectral data, this process is usually completed in seconds or minutes. In most cases the correct structure is generated either uniquely or along with a few alternatives.  For more complex problems (bigger molecules and insufficient spectral constraints), structure generation can be completed in a reasonable computation time if adequate user-defined constraints are included.   

The candidate structures generated by NMR-SAMS include complete structures and, optionally, substructures.  A complete structure is defined as one having no unsatisfied free bonds.  In the case of partial structure elucidation (see Section 7.1 for details), the chemically incomplete structures obtained are still referred to as complete structures, because all of the free bonds are satisfied either by real bonds or dummy bonds.  During structure generation, the program can save the largest intermediate substructures.  These substructures are useful when the generation of complete structures is not possible due to errors in the spectral data or other reasons; they provide clues and hints for improving the input spectral data and completing the structure elucidation successfully.

3.7 User Intervention 

NMR-SAMS was developed to streamline and automate the structure elucidation process with minimal user intervention.  But when the unknown is large (e.g., more than 40 non-hydrogen atoms), or insufficient connectivity information is available, user intervention is necessary to improve the efficiency of structure generation.  Currently you can interact with the structure elucidation procedure in the following ways:

1.      Change the control parameters for NMR interpretation and structure generation. For example, you can decide whether or not to use the “negative information” of DQF-COSY based on the spectral quality.  You can also limit the ring sizes to either 5 or 6-membered rings in the generated structure and discard structures containing other ring sizes.

2.      Modify the intermediate results in the MDF by using Edit/Master Data File.

3.      Supply structural building blocks by using Analysis/Edit Building Blocks if the MF is unknown.

4.      Supply known structural information as user-defined bond constraints. This is especially important for heteroatoms that are either not observed or have sparse connectivity information in 2D NMR experiments. Also, other spectral data, such as IR and UV, normally provide positive evidence of some known functional groups.  Using Analysis/User-defined Bond Constraints, you can add as many known bonds as possible between the constituent atoms (see Section 7.2).  Using this feature, you can also manually assemble the building blocks into a complete structure, or use a selected substructure (which was previously generated) as the starting point for the next structure generation.

5.      Supply known structural information as atom environment constraints (EC).  An EC defines the number of occurrences of a certain type of atom(s) as the immediate neighbor(s) of an atom under consideration (see Section 7.3).

6.      Propose a possible structure for the unknown and perform resonance assignment.  This way you can verify user-proposed structures and complete the structure elucidation.

7.      Modify the results of resonance assignment of a target structure using Analysis/User-Defined Assignment.
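
As an illustration of item 5 above, the following Python sketch checks an atom environment constraint against a structure.  The representation is hypothetical (the actual EC format is described in Section 7.3): an EC is assumed here to give a minimum and maximum count of neighbors of one element type, and the structure is assumed to be a neighbor map plus an element lookup.

```python
def satisfies_ec(atom, neighbor_element, min_count, max_count,
                 adjacency, elements):
    """True if `atom` has between min_count and max_count immediate
    neighbors whose element symbol is `neighbor_element`."""
    count = sum(1 for neighbor in adjacency.get(atom, ())
                if elements[neighbor] == neighbor_element)
    return min_count <= count <= max_count


# Invented example: atom 1 is a carbon bonded to two oxygens and one carbon.
elements = {1: "C", 2: "O", 3: "O", 4: "C"}
adjacency = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}

print(satisfies_ec(1, "O", 2, 2, adjacency, elements))  # exactly two O neighbors
print(satisfies_ec(1, "N", 1, 1, adjacency, elements))  # exactly one N neighbor
```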

3.8 Control Parameters

The parameter file (.par file) stores the parameters for controlling the spectral interpretation, setting up ACMX, and structure generation.  All of the parameters can be changed through the dialog boxes after choosing Edit/Parameters/NMR Interpretation, Edit/Parameters/Setup ACMX, or Edit/Parameters/2D Structure Generation.  Default values are assigned to the parameters according to the .ini file when a new working data set is opened.  The default values can be customized by editing the .ini file before starting the program. In most cases the default parameters in the .ini file supplied by Spectrum Research should be a good starting point for structure elucidation.

In the following chapters, the name of the parameter, e.g., GEN_FLAG, is used to refer to a parameter. The corresponding titles in the dialog boxes and details about the usage of the parameters are described in Appendix IV.


Chapter 4

Working Data Set

4.1 Overview

This chapter describes the operations related to the data files used by NMR-SAMS.  During each session of structure elucidation, NMR-SAMS works with a working data set, which consists of five text files with the same root name but different extensions, plus a lock file.  Suppose the root name is Q-2-test; then the working data set consists of the following files:

·        A master data file (MDF), Q-2-test.mdf, where all of the intermediate and final results are stored. You can view and edit this file by using Edit/Master Data File (See Appendix II).

·        A parameter file, Q-2-test.par, where the control parameters used for the data interpretation and structure generation are stored. You can access the parameters by using the commands in the pull-right menu of Edit/Parameters (see Appendix IV).

·        An NMR data file, Q-2-test.nmr, where the NMR data converted from the SpecMan peaks table are stored.  You can view and edit this file by using Edit/NMR Data File (see Appendix I).

·        A log file, Q-2-test.log, where most of the information, warning, and error messages produced during the analysis are stored.  You can view the log file by using Edit/Log File.

·        A structure file, Q-2-test.str, where the atom-atom connection table of the generated structures and their resonance assignments are stored.  You can display the structures by using Display/Generated Structures (see Chapter 10).

·        A lock file, Q-2-test.lock, which is used to prevent two users from opening the same data set simultaneously.
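
The naming scheme of the working data set can be summarized with a short Python sketch that builds the file names from a root name.  The function name working_set_files and the role labels are hypothetical; only the extensions are taken from the list above.

```python
from pathlib import Path

EXTENSIONS = {
    "master data file": ".mdf",
    "parameter file": ".par",
    "NMR data file": ".nmr",
    "log file": ".log",
    "structure file": ".str",
    "lock file": ".lock",
}


def working_set_files(directory, root_name):
    """Map each file role to its path for the given root name."""
    # The extension is appended rather than substituted, because a root
    # name may itself contain dots.
    return {role: Path(directory) / (root_name + extension)
            for role, extension in EXTENSIONS.items()}


for role, path in working_set_files(".", "Q-2-test").items():
    print(f"{role}: {path.name}")
```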

The operations related to the working data set can be found in the File menu shown below:

4.2 Opening An Existing Working Data Set

Command: File/Open.

Description:  This procedure is used to open an existing working data set.  An existing working data set stores the data and results of the last session of structure elucidation with NMR-SAMS.  Opening an existing working data set allows you to continue your work from where it was saved.  After selecting File/Open, a file browser is displayed, listing the master data files in the current directory.  If necessary, you can switch to the desired directory, and then click the desired master data file name.  The selected file name appears in the Open MDF field.  Next click OK, and the working data set is then opened for use.

After a working data set is opened, a message (shown below) prompts you to confirm the removal of the old log messages from the previous session.  Click No to retain them, or click Yes to overwrite them with the new log messages.

The status window shows the current state of structure elucidation.  It lists the NMR data files that are being used.  It also lists the steps that have been completed, and provides tips to you about what needs to be done next.    The structural results, such as building blocks or candidate structures, are displayed in the main graphics window (see Chapter 10).

Note:  If you choose to open another working data set before saving the current modified working data set, you will be prompted to save the changes. 

If you want to discard the changes you have made to the current working data set without exiting the program, open it again, and click Yes to the following message. Then you can start from the point you last saved the working data set.

If you select a data set that is being locked by another user of NMR-SAMS, you will be warned by the following message:

Click Yes to open the data file anyway, or click No to cancel.  Note that if you click Yes, it may cause problems.

4.3 Opening A New Working Data Set

Command: File/New.

Description: This procedure is used to create a new working data set. When dealing with a new structure problem, you must open a new working data set.  You can open a totally new working data set, or open one starting from an existing NMR data file that has already been prepared.

To open a totally new working data set, choose File/New. In the displayed file browser, make sure the option Starting with Existing NMR File is turned off.  Switch to the desired directory if necessary, and type a root name for the new working data set.  The extension will be automatically added for each file so you do not need to type it.

After clicking OK, NMR-SAMS creates the five new files described in Section 4.1.  All files are empty except the parameter file, which stores the default parameters.

Next, NMR-SAMS prompts you to input the molecular formula (MF) of the unknown.

Type the molecular formula in the dialog box. See Section 4.4 for more about inputting molecular formula.

To open a new working data set starting with an existing NMR file, check the option Start with Existing NMR File in the file browser.  Then the existing .nmr files in the current directory are listed.  Switch to the desired directory if necessary, and click the desired .nmr file.  Next click OK,  and a new working set is created with the selected .nmr file. 

Note: If you select a filename of an existing data set (with or without selecting the option Existing NMR file), NMR-SAMS warns you (as shown in the dialog box below) about existing files with the same root name.  You can select Yes, and the program will overwrite the existing files, except the .nmr file if you are starting from an existing NMR data file.

If you don’t want to overwrite the existing files, but you still would like to use the existing .nmr file, then first click No to cancel this dialog box.  Next use a command window or file manager to make a copy of the .nmr file with a new root name.  After that, repeat the operations described above for opening a new working data set using an existing .nmr file.

4.4 Input Molecular Formula

Command: File/Molecular Formula.

Description:  This procedure is used to define the molecular formula of the unknown.  Normally this command is used when you want to change the MF, since you are always prompted to enter the MF when you open a new working data set (see Section 4.3).  Note that the element symbol must be typed with the first letter in upper case and the second one, if any, in lower case.  For example:

You can specify the valence of an atom in a pair of parentheses following the element symbol. For example, C10H12N(V)N2S(VI)O8.  If you do not specify the valence, the most common chemical valence is adopted for an element with multiple valences.  In the above example, if they were not explicitly specified, valences 3 and 2 would be adopted for N and S, respectively.  You can also change the valences later using Analysis/User-Defined Building Blocks.

If you do not know the exact MF, try to enter the closest possible formula, or type “unknown”.  In any case, you can modify the elemental composition of the molecule by using Analysis/User-defined Building Blocks later (see Section 6.3).

The MF is interpreted if it is known.  A dialog box reports the standardized MF, the molecular weight, and the double bond equivalence (DBE).  For example:

Two records are written into the MDF. The first record starts with the keyword “MF:” and contains the standardized MF:

MF: C30H48O3

The second record starts with the keyword “ATOMS:”.  Following this, on the same line, are the molecular weight and the degree of unsaturation (or double bond equivalence).  The second line is a brief description of the entries in each of the remaining lines.  Each remaining line consists of the ID, the element symbol, the chemical valence, the minimum and maximum number of attached protons, the minimum and maximum number of attached double bonds, and the minimum and maximum number of attached triple bonds of a constituent heavy atom, respectively.  The constituent heavy atoms are listed with carbon first, and the remaining elements in alphabetical order of their element symbols.

ATOMS:  (MW = 456.7074, DBE = 7.0)                    

#Atom; Element; Valence; Min. & max. attached H; Min. & max. double bonds; Min. & max. triple bonds

# 1.  C 4   0 3   0 2  0 1

# 2.  C 4   0 3   0 2  0 1

# 3.  C 4   0 3   0 2  0 1

      .

      .

      .

#30.   C 4   0 3   0 2  0 1

#31.   O 2   0 1   0 1  0 0

#32.   O 2   0 1   0 1  0 0

#33.   O 2   0 1   0 1  0 0

Note: You can specify an uncommon valence while inputting the MF.  Otherwise, if an atom has multiple valences, the most common valence is adopted by default.  Modifying the valence manually in the .mdf file is not recommended, because whenever you choose Analysis/Building Blocks the MF will be re-interpreted and the previous changes will be overwritten.

For example, valence 3 is always adopted for N by default.   If you know that there is a -NO2 group in the molecule, input the MF containing an “N(V)”  (e.g.,  C6H5N(V)O2).
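The interpretation of the MF described in this section can be sketched in a few lines of Python. This is only an illustration, not the code used by NMR-SAMS: the function names parse_mf and dbe, the table of default valences, and the Roman-numeral table are assumptions made for this example. The DBE is computed with the general valence formula DBE = 1 + sum of n(v - 2)/2 over all atoms, which reproduces the value 7.0 reported above for C30H48O3.

```python
import re

# Illustrative defaults only (an assumption of this sketch):
# the "most common" valence adopted when none is given.
DEFAULT_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "S": 2, "P": 3,
                   "F": 1, "Cl": 1, "Br": 1, "I": 1}
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6, "VII": 7}

# Element symbol, optional valence in parentheses, optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(?:\(([IVX]+)\))?(\d*)")

def parse_mf(mf):
    """Return (element, valence, count) tuples for a formula such as C6H5N(V)O2."""
    atoms = []
    pos = 0
    while pos < len(mf):
        match = TOKEN.match(mf, pos)
        if match is None:
            raise ValueError("cannot read formula at position %d" % pos)
        symbol, roman, count = match.groups()
        if symbol not in DEFAULT_VALENCE or (roman and roman not in ROMAN):
            raise ValueError("unsupported element or valence: %s" % match.group(0))
        valence = ROMAN[roman] if roman else DEFAULT_VALENCE[symbol]
        atoms.append((symbol, valence, int(count) if count else 1))
        pos = match.end()
    return atoms

def dbe(atoms):
    """Double bond equivalence from the valences of all atoms."""
    return 1 + sum(count * (valence - 2) for _, valence, count in atoms) / 2

print(dbe(parse_mf("C30H48O3")))
print(dbe(parse_mf("C6H5N(V)O2")), dbe(parse_mf("C6H5NO2")))
```

The last line shows why the valence matters: the same composition gives a different DBE when N is entered as N(V) rather than left at its default valence.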

4.5 Save A Working Data Set

Command: File/Save.

Description:  This command allows NMR-SAMS to update the working data set with the current state of structure elucidation.  This operation is not absolutely necessary because you will be prompted to save changes before exiting the program or opening another working data set.

 

4.6 Save A Working Data Set Under A Different Name

Command: File/Save As.

Description:  This command allows NMR-SAMS to save the current state of structure elucidation in a working data set with a different root name.  After selecting File/Save As, the following file browser is displayed.  Switch to the desired directory if necessary, and type the new root name, then click OK.

4.7 Exiting NMR-SAMS

Command: File/Exit.

Description:  This command allows you to exit NMR-SAMS.  If some changes have been made to at least one of the three data files, namely, the .nmr, .mdf, and .par files, and have not been saved, NMR-SAMS prompts you to save them before exiting the program:

If you click Yes, the changes are saved before the program exits. If you click No, the changes are discarded and the program exits.  If you click Cancel, the Exit command is canceled.


Chapter 5

Input of NMR Spectral Data

5.1 Overview

It is important to generate a clean and reliable set of peak lists from the different NMR experiments before using them in NMR-SAMS. SpecMan provides several advanced and intelligent peak picking tools to perform fast and reliable peak picking.  For details regarding peak picking, refer to the SpecMan User’s Guide.  Although peak picking can be performed independently in SpecMan, we recommend that you perform the two steps (i.e., peak picking using SpecMan and peak table conversion by NMR-SAMS) in tandem for each spectrum, because the consistency checking during the conversion process helps you find potential errors in the peak picking result.

This chapter describes how to prepare 1D and 2D NMR spectral data as input to NMR-SAMS. (For details about the NMR Data File format see Appendix I).  It is assumed that the peak picking has already been performed by SpecMan.  The peak tables from SpecMan are then converted into the NMR-SAMS format.  The conversions are done with the pull-right options of Create NMR Data File in the File menu as shown below:

5.2 Conversion of SpecMan 1H Peak List

Command: File/Create NMR Data File/H1.

Descriptions: In this procedure, the SpecMan 1H peaks table is converted into NMR-SAMS format.  First, the following dialog box is displayed, which prompts you to enter the filename of the 1H peaks table from SpecMan.

Click Browse to locate the peaks table file, then click OK.  An information dialog box displays the number of 1H peaks that have been converted.

In the current version of SpecMan all 1H peak multiplicities are marked as unknown (u) by default.  That’s why NMR-SAMS prompts you to supply the 1H multiplicity for the peaks (referring to their splitting patterns). As shown in Fig. 5.1, if you know the multiplicities of all or some of the 1H peaks, select Edit/NMR Data File to open the NMR data file and replace the unknown multiplicity (represented as “u”) by one of the following symbols recognizable to NMR-SAMS:

s: singlet, d: doublet, t: triplet, q: quartet, m: other multiplet.

If the multiplicity is unknown, leave it as unknown (u).

NMR-SAMS uses 1H multiplicity information to eliminate inappropriate bonds while setting up ACMX. For details refer to the usage of parameter H1_MULT_FLAG (in Appendix IV).

 

Figure. 5.1. Running NMR-SAMS and SpecMan side-by-side provides a convenient way to verify and edit the 1D peaks converted from SpecMan peaks table. Left (NMR-SAMS): select Edit/NMR Data File to open the .nmr file.  Right (SpecMan): Open the 1D spectrum and load the 1D peaks table. From the comment field of a converted peak, the ID (#32) of the original peak is found. By clicking the corresponding entry in the peaks table, the 1D peak (#32, shown in cyan) is highlighted in the spectrum for you to see and recognize the multiplicity of this peak before modifying the .nmr file.

Possible Errors: Generally NMR-SAMS cross-checks the converted 1H peak list against the MF (if known) and alerts you to any potential conflicts.  The following situations will be reported when there is a conflict:

·        If the multiplicity information is unknown for more than three fourths of the peaks, a warning message prompts you to supply this information if possible.

·        If the number of 1H peaks exceeds the number of constituent protons, an error message prompts you to correct either the peak picking result or the MF.

Results:  After the conversion, the .nmr file is updated with information regarding proton peaks starting  with the keyword “H1:”.  Following is a transcript of the converted 1H peaks:

H1: /usr/people/peng/NMR-SAMS/ndat/Q-2-test/h1p.pks

 #1. 4.930 s   ;1

 #2. 4.755 s   ;2

 #3. 3.509 u   ;3

      .

      .

      .

 #32. 0.818 s   ;32

 #33. 0.811 u   ;33

The first line which begins with the keyword “H1:” indicates the start of 1H peak list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in the rest of the lines represent the following attributes of a 1H peak:

·         Peak ID, a serial number that uniquely identifies this peak.

·         Chemical shift of the peak in ppm.

·        Multiplicity, which is designated as s (singlet), d (doublet), t (triplet), q (quartet), m (other multiplet) or u (unknown).  By default it is assigned as unknown. 

·         Comments, which are optional. The number in the comment field corresponds to the ID of the 1H peak in the SpecMan peaks table.

One or more space(s) is used as a delimiter for all items except comments, which are separated by “;”.   Items marked as optional can be omitted unless an item following them is included.  In such a case, you must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a “;”.  For the 1H peak list, the peak intensities and comments are not currently used by NMR-SAMS.
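If you want to inspect a converted 1H peak list outside NMR-SAMS, the line format described above is simple to read with a short script. The following Python sketch is an illustration only; the names parse_peak_line and too_many_unknown are hypothetical and are not part of NMR-SAMS. It splits a line into ID, chemical shift, multiplicity, and comment, and reproduces the "more than three fourths unknown" condition listed under Possible Errors.

```python
import re

def parse_peak_line(line):
    """Split one peak-list line such as ' #3. 3.509 u   ;3' into its items."""
    body, _, comment = line.partition(";")      # comments follow the ";"
    match = re.match(r"\s*#\s*(\d+)\.\s*(.*)", body)
    if match is None:
        raise ValueError("not a peak line: %r" % line)
    fields = match.group(2).split()             # space-delimited items
    return {
        "id": int(match.group(1)),
        "shift": float(fields[0]),
        # multiplicity defaults to unknown when it is omitted
        "mult": fields[1] if len(fields) > 1 else "u",
        "comment": comment.strip(),
    }

def too_many_unknown(peaks):
    """True if more than three fourths of the peaks have unknown multiplicity."""
    unknown = sum(1 for peak in peaks if peak["mult"] == "u")
    return 4 * unknown > 3 * len(peaks)

lines = [" #1. 4.930 s   ;1", " #2. 4.755 s   ;2", " #3. 3.509 u   ;3"]
peaks = [parse_peak_line(line) for line in lines]
print(peaks[2])
print(too_many_unknown(peaks))
```

The same function also reads the 13C peak list, since both lists use the same items in the same order.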

Note: Whenever you repeat a 1H peaks table conversion, or modify the converted peak list (using Edit/NMR Data File), you must make sure to convert the dependent 2D spectra again.  For example, if you add a 1H peak in the converted 1H peak list, you must convert the COSY, HMQC, HMBC, and NOESY data again, if they have been converted before. Otherwise the added 1H peak will not be reflected in the 2D data.

5.3 Conversion of SpecMan 13C Peak List

Command: File/Create NMR Data File/C13 and DEPT.

Descriptions: In this procedure the SpecMan 13C and DEPT/APT peak tables are converted into a peak list of 13C chemical shifts and multiplicities. NMR-SAMS requires 13C multiplicity information  for reliable structure elucidation. In order to get the complete 13C multiplicity information, you need 13C, DEPT-90/APT-90 and DEPT-135/APT-135 experimental data.  However, NMR-SAMS provides a flexible way to derive the 13C multiplicity information from any combination of available experiments as described below:  

1.      13C Only. In the dialog box that appears, select None for Peak Multiplicity Experiments.  Click Browse to enter the SpecMan C-13 Peaks Table. 

After clicking OK, NMR-SAMS updates the .nmr file with a list of 13C chemical shifts having unknown multiplicities as shown in the Results section below.  If you know the multiplicities of some peaks, you can manually edit the .nmr file to supply this information.

2.      13C and DEPT.  In the dialog box that appears, click Browse to enter the SpecMan C-13 Peaks Table.  Next select DEPT for Peak Multiplicity Experiments.  Enter the peaks table filenames for the DEPT-45, DEPT-90, and DEPT-135 experiments. All of the DEPT experiments are optional as mentioned previously, so if you do not have data for a certain DEPT experiment, turn off the corresponding toggle.  Note that, except for DEPT-45, ignoring some DEPT experiments could leave some peaks with unknown multiplicities.

Also you need to enter a matching tolerance (in ppm) to match 13C and DEPT peaks. After clicking OK, NMR-SAMS updates the .nmr file with a list of 13C chemical shifts and derived multiplicities as shown in the Results section below. 

3.      13C and APT. In the dialog box that appears, click Browse to enter the SpecMan C-13 Peaks Table.  Select APT for Peak Multiplicity Experiments.  Enter the peaks table filenames for the APT-45, APT-90, and APT-135 experiments. All of the APT experiments are optional as previously described, so if you do not have data for a certain APT experiment, turn off the corresponding toggle.  Note that, except for APT-45, ignoring some APT experiments could leave some peaks with unknown multiplicities.

Also you need to enter a matching tolerance (in ppm) to match 13C and APT peaks.  After clicking OK, NMR-SAMS updates the .nmr file with a list of 13C chemical shifts and derived multiplicities as shown in the Results section below.
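To see why omitting an experiment can leave multiplicities unknown, consider the following Python sketch of how multiplicities can be derived from DEPT data. It uses the standard DEPT editing rules (DEPT-90 shows CH only; DEPT-135 shows CH and CH3 with one sign and CH2 with the opposite sign), not necessarily the procedure implemented in NMR-SAMS. The function names, the representation of a DEPT-135 peak as a (shift, sign) pair, and the 0.05 ppm default tolerance are all assumptions made for this example.

```python
def near(shift, shifts, tol):
    """True if any value in shifts lies within tol (ppm) of shift."""
    return any(abs(shift - s) <= tol for s in shifts)

def derive_multiplicities(c13, dept90=None, dept135=None, tol=0.05):
    """Assign s/d/t/q/u to each 13C shift.

    dept90  : list of shifts, or None if the experiment is not available
    dept135 : list of (shift, sign) pairs with sign +1 or -1, or None
    """
    result = []
    for shift in c13:
        in90 = None if dept90 is None else near(shift, dept90, tol)
        sign = None
        if dept135 is not None:
            sign = 0                      # 0: no DEPT-135 peak at this shift
            for s, sgn in dept135:
                if abs(shift - s) <= tol:
                    sign = sgn
                    break
        if sign is not None and sign < 0:
            mult = "t"                    # negative in DEPT-135: CH2
        elif in90:
            mult = "d"                    # present in DEPT-90: CH
        elif sign is not None and sign > 0 and in90 is False:
            mult = "q"                    # positive in DEPT-135, absent in DEPT-90: CH3
        elif sign == 0:
            mult = "s"                    # absent in DEPT-135: no attached proton
        else:
            mult = "u"                    # not enough information
        result.append(mult)
    return result

# Hypothetical shifts; 55.0 ppm is an invented CH peak.
c13 = [178.822, 109.931, 55.0, 16.340]
d135 = [(109.93, -1), (55.01, +1), (16.34, +1)]
print(derive_multiplicities(c13, dept90=[55.01], dept135=d135))
print(derive_multiplicities(c13, dept135=d135))
```

In the second call DEPT-90 is missing, so the two peaks that are positive in DEPT-135 cannot be classified as CH or CH3 and remain unknown.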

Possible Errors: During the conversion NMR-SAMS cross-checks the 13C peak list with the MF, and alerts you of potential inconsistencies.  In such cases, the following general messages will be reported:

·        If there are more 13C peaks than the constituent carbon atoms, an error message will prompt you to remove peak artifacts or correct the MF.

·        If there are fewer 13C peaks than the constituent carbon atoms, a warning message will prompt you to resolve 13C peak overlap.  Define the overlapping peaks as individual peaks with slightly different chemical shifts by choosing Edit/NMR Data File and editing the NMR data file.  (It is usually possible to resolve such ambiguities by looking at the peak intensity and the HMQC spectrum, or by acquiring the spectrum under different conditions.)  If you are unable to resolve overlapping peaks (for example, in the case of a symmetric molecule, or due to severe overlap in the spectrum), then partial structure elucidation will be performed (see Section 7.1).

·        If the multiplicity of one or more 13C peaks is unknown, a warning message will prompt you to supply this information, if possible.  Lack of this information may result in multiple building block sets (see Section 6.2).

·        The number of carbon-attached protons (n_CH) is calculated based on the 13C multiplicities. If n_CH is greater than the number of constituent protons, an error message will prompt you to correct either the multiplicity information or the MF.

·        When the number of 13C peaks is equal to that of the carbon atoms, and all 13C multiplicities are known, the maximum number of heteroatom-attached protons (max_XH) is calculated based on the valence of the constituent heteroatoms. If (n_CH + max_XH) is smaller than the number of constituent protons, an error message will prompt you to correct either the multiplicity information or the MF.
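The consistency checks listed above amount to simple arithmetic on the peak list and the MF. The Python sketch below restates them for illustration; the function name check_c13 and the message texts are hypothetical. It assumes that a heteroatom of valence v can carry at most v - 1 protons, which agrees with the ATOMS record shown in Section 4.4 (O, valence 2, at most 1 attached H) but is not stated explicitly in this guide.

```python
# Protons attached to a carbon for each known 13C multiplicity.
H_PER_MULT = {"s": 0, "d": 1, "t": 2, "q": 3}

def check_c13(mults, n_carbon, n_hydrogen, hetero_valences):
    """Return a list of (severity, text) messages for a 13C peak list.

    mults           : multiplicity symbol of each 13C peak (s, d, t, q or u)
    n_carbon        : number of C atoms in the MF
    n_hydrogen      : number of H atoms in the MF
    hetero_valences : valence of each constituent heteroatom
    """
    messages = []
    if len(mults) > n_carbon:
        messages.append(("error", "more 13C peaks than carbon atoms"))
    elif len(mults) < n_carbon:
        messages.append(("warning", "fewer 13C peaks than carbon atoms"))
    known = [m for m in mults if m != "u"]
    if len(known) < len(mults):
        messages.append(("warning", "unknown 13C multiplicity"))
    n_ch = sum(H_PER_MULT[m] for m in known)
    if n_ch > n_hydrogen:
        messages.append(("error", "n_CH exceeds the constituent protons"))
    if len(mults) == n_carbon and len(known) == len(mults):
        # Assumption: at most (valence - 1) protons on each heteroatom.
        max_xh = sum(v - 1 for v in hetero_valences)
        if n_ch + max_xh < n_hydrogen:
            messages.append(("error", "n_CH + max_XH below the constituent protons"))
    return messages

# Ethanol, C2H6O, as a small invented example: CH3 (q) and CH2 (t).
print(check_c13(["q", "t"], 2, 6, [2]))
print(check_c13(["q", "d"], 2, 6, [2]))
```

In the second call one proton cannot be placed anywhere (n_CH = 4, max_XH = 1, but the MF has 6 protons), so the last check reports an error.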

Results: After the conversion, the .nmr file is updated with information regarding the 13C peaks starting with the keyword “C13:” in the .nmr file.  The following is a transcript of a converted 13C peak list (Note that if DEPT or APT is not used, the multiplicities will be unknown “u” for all peaks.):

C13: /usr/people/peng/NMR-SAMS/ndat/Q-2-test/c13.pks

 #1. 178.822 s ;1

 #2. 151.323 s ;2

 #3. 109.931 t ;3

      .

      .

      .

 #28. 16.340 q ;28

 #29. 14.929 q ;29

The first line which begins with the keyword “C13:” indicates the start of the 13C  peak list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of the 13C peak:

·         Peak ID, a serial number that uniquely identifies this peak.

·         Chemical shift of the peak in ppm.

·        Multiplicity, which is designated as s (singlet, C), d (doublet, CH), t (triplet, CH2), q (quartet, CH3), or u (unknown).

·         Comments, which are optional. The number in the comment field corresponds to the ID of the 13C peak in the SpecMan peaks table.

One or more space(s) is used as a delimiter for all items except comments, which are separated by “;”.  Items marked as optional can be omitted unless an item following them is included.  In such a case, you must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a “;”.  For the 13C peak list, the peak intensities and comments are not currently used by NMR-SAMS.

Note: Whenever you repeat a 13C peaks table conversion, or modify the converted peak list (using Edit/NMR Data File), you must make sure to convert the dependent 2D spectra again.  For example, if you add a 13C peak in the converted 13C peak list, you must convert the HMQC, HMBC, and INADEQUATE data again, if they have been converted before. Otherwise the added 13C peak will not be reflected in the 2D data. 

As shown in Fig. 5.1, you can run NMR-SAMS and SpecMan side-by-side, to verify the peak picking results of peaks mentioned in the warning or error dialog boxes.

5.4 Conversion of SpecMan DQF-COSY Peaks Table

Command: File/Create NMR Data File/COSY.

Descriptions:  In this procedure NMR-SAMS converts the DQF-COSY cross peak coordinates into connectivities between 1D 1H peaks.  As illustrated in Fig. 5.2, the coordinates of the peak center (shown as a cross) are matched to the 1D chemical shifts (shown as dotted lines).  The 1D peaks that match the peak center within the tolerances (±D2 and ±D1 in F2 and F1 dimensions respectively) are taken as the correlated 1D peaks.  If, in a certain dimension, more than one 1D peak (such as 1H peaks a and b in Fig. 5.2) match the cross peak center, then all are treated as possible correlated 1D peaks in that dimension. Such a connectivity is called an ambiguous connectivity.  Internally, NMR-SAMS will consider all possible correlations for an ambiguous connectivity.  (For details about ambiguous connectivity, see the example in Section 3.4).

Figure. 5.2. Conversion of COSY cross peak coordinates into a correlation between 1D 1H peaks.  The cross (+) denotes the cross peak center.  The dotted lines denote the chemical shifts of the three 1D 1H peaks, a, b, and c, respectively.  D1 and D2 are the matching tolerances along F1 and F2, respectively.  All three peaks, which match the cross peak center within the tolerances, are taken as correlated 1D peaks.

Upon selecting the command File/Create NMR Data File/COSY, NMR-SAMS opens a dialog box that prompts you to enter the filename of the COSY peaks table.  You are also prompted to input the matching tolerances along the X (i.e., F2) and Y (i.e., F1) dimensions, respectively.

The default value of the matching tolerance is 0.005 ppm for both dimensions.  It is important to select an appropriate tolerance, because too big a tolerance could result in undesired ambiguity, and too small a tolerance could cause some real peaks to be discarded.  To choose a suitable tolerance you must consider at least the following factors:

·        Accuracy of the peak picking.  The grid-intelligence-based peak picking of SpecMan provides you a very convenient way to verify the accuracy of peak picking by comparing the expected locations of the cross peaks with the picked peaks (See User’s Guide of SpecMan). If a peak list was carefully verified with this method, it is OK to start with a small tolerance.

·        Alignment between 1D 1H and the COSY spectra.  SpecMan provides convenient tools for you to correct frequency offset between the 1D and 2D spectra. Sometimes different experimental conditions introduce small chemical shift differences between 1D and 2D resonances. To further correct the differences due to sample conditions, use the grid-intelligence-based peak picking method of SpecMan.  If these corrections have been applied, it is OK to start with a small tolerance.

Possible Errors: During the peak table conversion, depending on the situation, NMR-SAMS may display the following error/warning messages:

·        If the X or Y coordinate of a cross peak does not match any 1D 1H peak within the matching tolerance, the cross peak will be discarded.  When this message appears, you should verify this peak and check whether it is an artifact.  If it is not an artifact, then either its center has not been picked accurately, or the tolerance used is too small.  Click Cancel to stop the conversion process, and try refining the peak picking results or repeating the conversion with a bigger matching tolerance.

·        If the X or Y coordinate of a cross peak matches more than one 1D 1H peak within the matching tolerance, then an ambiguous correlation is obtained.  You can either click Cancel to stop this process, and then try a smaller tolerance to reduce ambiguities; or you can click OK to All to let it finish the conversion, then choose Edit/NMR Data File to manually remove the undesired ambiguities in the .nmr file.  Note that although NMR-SAMS can use ambiguous correlation information, too many ambiguous correlations will undermine the efficiency of the subsequent structure generation.

·        If the X or Y coordinate of a cross peak matches more than six 1D 1H peaks within the matching tolerance, the peak will be discarded.  In such a case, you can either click Yes (or Yes to All) to go on without that peak, or click No to define a reduced matching tolerance and repeat this process.  You can also click Cancel to stop this process, and then merge the very close 1D 1H peaks as a degenerate peak in the SpecMan 1H peaks table before converting it again (see Section 5.2). After that, convert the DQF-COSY peaks table again.
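The matching of a cross peak to the 1D peaks, together with the three situations listed above, can be summarized in the following Python sketch. It is an illustration under stated assumptions, not the NMR-SAMS implementation: the names match_axis and convert_cross_peak are hypothetical, and a peak lying exactly on the tolerance boundary is assumed to match.

```python
MAX_CANDIDATES = 6   # a coordinate matching more 1D peaks than this is discarded

def match_axis(coord, peaks, tol):
    """IDs of the 1D peaks whose shift lies within tol (ppm) of coord."""
    return [pid for pid, shift in peaks if abs(shift - coord) <= tol]

def convert_cross_peak(x, y, h1_peaks, tol_x=0.005, tol_y=0.005):
    """Classify one COSY cross peak at (x, y) against the 1D 1H peaks.

    Returns (status, ids_x, ids_y), where status is "unique",
    "ambiguous", or "discarded".
    """
    ids_x = match_axis(x, h1_peaks, tol_x)
    ids_y = match_axis(y, h1_peaks, tol_y)
    if not ids_x or not ids_y:
        status = "discarded"      # no 1D peak within the tolerance
    elif len(ids_x) > MAX_CANDIDATES or len(ids_y) > MAX_CANDIDATES:
        status = "discarded"      # too many candidates in one dimension
    elif len(ids_x) > 1 or len(ids_y) > 1:
        status = "ambiguous"      # all candidates are kept
    else:
        status = "unique"
    return status, ids_x, ids_y

# (ID, shift) pairs taken from the 1H transcript in Section 5.2.
h1 = [(1, 4.930), (2, 4.755), (32, 0.818), (33, 0.811)]
print(convert_cross_peak(4.931, 0.815, h1))   # 0.815 lies between peaks 32 and 33
print(convert_cross_peak(2.000, 4.755, h1))   # nothing near 2.000 ppm
```

With the default tolerance of 0.005 ppm the first cross peak is ambiguous between peaks 32 and 33; reducing tol_y to 0.0035 would leave only peak 32.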

Tips: As shown in Fig. 5.3, you can run NMR-SAMS and SpecMan side-by-side to verify the original peak picking results of peaks mentioned in the warning or error dialog boxes.  This is also useful when you edit the .nmr file using Edit/NMR Data File.

Figure. 5.3. Running NMR-SAMS and SpecMan side-by-side provides a convenient way to verify and edit the 2D peaks during peaks table conversion. Left (NMR-SAMS): a dialog box indicates that cross peak #33 is discarded by NMR-SAMS.  Right (SpecMan): Open the DQF-COSY spectrum and load the 2D peaks table. By clicking the corresponding entry in the peaks table, cross peak #33 is highlighted in the spectrum.  This peak was discarded because it is located too far away from the grid center.  If necessary, you can correct this peak by moving it closer to the grid intersection. After correcting such peaks, save the refined peaks table and repeat the peaks table conversion.  This method can also be used when editing the .nmr file to remove undesired ambiguities and to mark long-range coupled peaks.

For COSY and other homonuclear spectra, NMR-SAMS discards the diagonal peaks and merges symmetric peaks.  This is not done when ambiguous correlation is involved.  For example, the following connectivities are retained:

(10 - 10 11) 3   0.00   0.60

(8 - 9 10)   3   0.00   0.60

(8 - 9)      3   0.00   0.60

The first connectivity may arise from either a diagonal peak or a near-diagonal peak. The latter two, converted from two symmetric peaks, do not have exactly the same correlated 1H peaks so they are not merged.
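The handling of diagonal and symmetric peaks can be sketched as follows. This Python example is only an illustration of the rules described in this section; the function name merge_cosy and the data layout are assumptions. It uses the reliability values given later in this section (0.60 for a connectivity converted from a single peak, 0.84 for one merged from two symmetric peaks).

```python
def merge_cosy(connectivities):
    """Drop diagonal peaks and merge symmetric pairs.

    Each connectivity is a pair (ids_1, ids_2) of tuples of 1H peak IDs.
    Ambiguous connectivities (more than one ID on either side) are kept
    unchanged. Returns a list of (ids_1, ids_2, reliability) tuples.
    """
    result = []     # [ids_1, ids_2, reliability]
    index = {}      # unambiguous pair -> position in result
    for ids_1, ids_2 in connectivities:
        if len(ids_1) == 1 and len(ids_2) == 1:
            a, b = ids_1[0], ids_2[0]
            if a == b:
                continue                        # diagonal peak: discard
            pair = (min(a, b), max(a, b))
            if pair in index:
                result[index[pair]][2] = 0.84   # symmetric partner found
                continue
            index[pair] = len(result)
            result.append([(pair[0],), (pair[1],), 0.60])
        else:
            result.append([tuple(ids_1), tuple(ids_2), 0.60])
    return [tuple(entry) for entry in result]

raw = [((10,), (10, 11)), ((8,), (9, 10)), ((9,), (8,)),
       ((3,), (3,)), ((1,), (2,)), ((2,), (1,))]
for entry in merge_cosy(raw):
    print(entry)
```

The first three input entries reproduce the example above and are all retained, the diagonal peak (3 - 3) is dropped, and the symmetric pair (1 - 2) and (2 - 1) is merged into one connectivity. Note that 0.84 equals 1 - (1 - 0.60)^2, which would be consistent with combining two independent observations, although this guide does not state how the value is derived.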

For each converted COSY connectivity, the intensity level is assigned 3 (i.e., strong). The J-coupling constant is assigned 0.0 (i.e., unknown).  The reliability of the peak is assigned 0.60 if it is converted from a single peak, or 0.84 if from two symmetric ones.  Since the intensity level of a COSY peak is related to its structural interpretation, NMR-SAMS always prompts you to mark the connectivities that may be due to long-range couplings after the conversion is finished, as shown in the dialog box below:

Peaks showing very low intensity or involving sp2-C could be long-range coupled.  If you suspect some peaks to be due to long-range coupling, select Edit/NMR Data File to edit the .nmr file.  Modify the intensity levels of such connectivities from “3” (i.e., strong) to “1” (i.e., weak), and save the changes.  As described in Fig. 5.3, you can edit the .nmr file while looking at the original COSY cross peaks.

Note: A short-range coupling COSY connectivity is normally interpreted as 2 or 3 intervening bonds between the correlated protons. If a long-range coupling is mistakenly interpreted as a short-range one, NMR-SAMS will probably miss the correct structure.  A COSY connectivity marked as long-range coupling is usually interpreted as 3-5 intervening bonds between the correlated protons, which also covers the possibility of vicinal coupling.  It is safe to treat a short-range coupling peak as long-range coupling, but it may decrease the efficiency of structure generation. The geminal coupling is always automatically detected by the program.  (For details see Section 6.4).

Results: After the conversion, the .nmr file is updated with information regarding the converted COSY connectivities starting with the keyword “COSY:”.  The following is a transcript of a converted COSY connectivity list:

COSY:

 #1. (1 - 2)      1     0.0   ;1+4

 #2. (1 - 12)     1     0.0   ;2+31

 #3. (2 - 12)     1     0.0   ;3+32

 #4. (3 - 7 8)    3     0.0   ;6+18

 #5. (3 - 13)     3     0.0   ;7+33

 #6. (3 - 18)     3     0.0   ;5+49

.

.

.

The first line which begins with the keyword “COSY:” indicates the start of COSY connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of a connectivity:

·         Connectivity ID, a serial number that uniquely identifies this connectivity.

·         IDs of the correlated 1D 1H peaks (shown in parentheses).  For ambiguous correlations, the IDs of all possible 1D 1H peaks are included.

·         Peak intensity level, which is classified into four types: strong, medium, weak, and unknown, denoted as 3, 2, 1, and 0, respectively.  The default value is 3.  For a short-range coupled DQF-COSY connectivity, the intensity level should be either 3 or 2.  For a long-range one, the intensity level should be 1.  If an intensity level of 0 is used, NMR-SAMS will expect an actual J-coupling value in the J-coupling field.

·         J-coupling.  0.0 is assigned by default, representing unknown.  This item is optional if the peak intensity level is greater than 0.

·         Comments, which are optional and have a maximum length of 80 characters.  The numbers in the comment field correspond to the IDs of the corresponding peaks in the SpecMan peaks table.  For merged peaks these numbers are joined with a + sign.  Comments are ignored by NMR-SAMS.

One or more space(s) is used as a delimiter for all items except comments which are separated by “;”.   Items marked as optional can be omitted unless an item following them is included.  In such a case, you must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a “;”.
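A connectivity line in this format can be read with a short script, for example when you want to list the ambiguous connectivities before editing the .nmr file. The Python sketch below is an illustration only; parse_connectivity is a hypothetical name, and the defaults it fills in for omitted items (intensity level 3, J-coupling 0.0) are the default values given in the list above.

```python
import re

# "#ID. (IDs - IDs)  optional items  ;optional comment"
CONNECTIVITY = re.compile(r"\s*#\s*(\d+)\.\s*\(([^)]*)\)\s*([^;]*)(?:;(.*))?$")

def parse_connectivity(line):
    """Split one connectivity line such as ' #4. (3 - 7 8)  3  0.0  ;6+18'."""
    match = CONNECTIVITY.match(line)
    if match is None:
        raise ValueError("not a connectivity line: %r" % line)
    conn_id, inside, rest, comment = match.groups()
    first, _, second = inside.partition("-")
    fields = rest.split()
    return {
        "id": int(conn_id),
        "first": [int(token) for token in first.split()],
        "second": [int(token) for token in second.split()],
        "intensity": int(fields[0]) if fields else 3,      # default level
        "j": float(fields[1]) if len(fields) > 1 else 0.0,  # 0.0 = unknown
        "comment": (comment or "").strip(),
    }

conn = parse_connectivity(" #4. (3 - 7 8)    3     0.0   ;6+18")
print(conn)
print("ambiguous" if len(conn["first"]) > 1 or len(conn["second"]) > 1 else "unique")
```

Because the optional items are simply absent in lines such as " #1. (3 - 1)      ;2", the same function can also read connectivity lists that carry only the peak IDs and a comment.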

Note: The conversion of COSY peaks table is dependent on the converted 1H peak list. If you convert the 1H peaks table again, or modify the converted 1H peak list, you must convert the COSY peaks table again. 

5.5 Conversion of SpecMan HMQC/HETCOR Peaks Table

Command: File/Create NMR Data File/HMQC (or HETCOR).

Descriptions:  In this procedure NMR-SAMS converts the HMQC or HETCOR cross peak coordinates into connectivities between 1D 13C and 1H peaks.  In  principle the conversion process is very similar to what was described earlier in Section 5.4. 

Other things that you need to be aware of are as follows:

The correlated 13C peak(s) is always placed ahead of the correlated 1H peak(s) in a converted connectivity, and this applies to both HMQC and HETCOR.

Unlike the other 2D spectral data, ambiguity is not allowed for an HMQC connectivity.  NMR-SAMS first matches each 13C peak against the HMQC peaks by comparing the 13C coordinate within the specified tolerance.  Next, each HMQC peak identified in the previous step is matched against all 1H peaks by its 1H chemical shift within the specified tolerance, and the 1H peak with the best match is taken as the correlated 1H peak.  This process is repeated until each HMQC connectivity has exactly one correlated 13C-1H pair.
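The "best match" selection can be illustrated with the following Python sketch. It is a simplified restatement, not the NMR-SAMS procedure: it loops over the cross peaks rather than over the 13C peaks, and it assumes that "best match" means the smallest chemical shift difference in both dimensions (this guide states it only for 1H). The function names, the tolerances, and the 13C shift of peak 6 are invented for the example.

```python
def best_match(coord, peaks, tol):
    """ID of the 1D peak closest to coord, or None if none lies within tol."""
    best_id, best_diff = None, None
    for pid, shift in peaks:
        diff = abs(shift - coord)
        if diff <= tol and (best_diff is None or diff < best_diff):
            best_id, best_diff = pid, diff
    return best_id

def convert_hmqc(cross_peaks, c13_peaks, h1_peaks, tol_c, tol_h):
    """Return one (13C ID, 1H ID) pair per cross peak that can be matched."""
    connectivities = []
    for c_coord, h_coord in cross_peaks:
        c_id = best_match(c_coord, c13_peaks, tol_c)
        h_id = best_match(h_coord, h1_peaks, tol_h)
        if c_id is not None and h_id is not None:
            connectivities.append((c_id, h_id))   # 13C always placed first
    return connectivities

c13 = [(3, 109.931), (6, 50.000)]               # shift of peak 6 is invented
h1 = [(1, 4.930), (2, 4.755), (32, 0.818), (33, 0.811)]
cross = [(109.93, 4.93), (109.94, 4.76), (50.01, 0.812)]
print(convert_hmqc(cross, c13, h1, tol_c=0.1, tol_h=0.02))
```

The third cross peak lies within the tolerance of both 1H peaks 32 and 33; instead of producing an ambiguous connectivity, only the closer peak (33) is kept.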

Possible Errors: After the conversion, the resulting HMQC peak list is cross-checked against the 13C multiplicity information. NMR-SAMS may prompt the following error/warning messages:

·        If the number of correlated HMQC peaks of a certain 13C peak is fewer than expected (1 for CH and CH3, 2 for CH2), it warns you to check for missing HMQC peaks, or the 1H integral to verify if a CH2 shows degenerate 1H peaks.

·        If the number of correlated HMQC peaks of a certain 13C peak is more than expected (1 for CH and CH3, 2 for CH2), it prompts you to check for possible errors due to degenerate 13C peaks, wrong assignment, or artifacts.
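The two checks above compare, for each protonated 13C peak, the number of HMQC connectivities found with the number expected from its multiplicity. A minimal Python sketch of this comparison is given below; the function name and the message words are hypothetical, and peaks with multiplicity s or u are simply skipped because this guide does not state an expected count for them.

```python
# Expected number of HMQC peaks: 1 for CH and CH3, 2 for CH2.
EXPECTED = {"d": 1, "t": 2, "q": 1}

def check_hmqc_counts(c13_mults, hmqc):
    """Compare HMQC connectivities per 13C peak with the expected number.

    c13_mults : dict mapping 13C peak ID to its multiplicity symbol
    hmqc      : list of (13C ID, 1H ID) connectivities
    Returns a list of (13C ID, "fewer" or "more") findings.
    """
    counts = {}
    for c_id, _ in hmqc:
        counts[c_id] = counts.get(c_id, 0) + 1
    findings = []
    for c_id, mult in sorted(c13_mults.items()):
        expected = EXPECTED.get(mult)
        if expected is None:
            continue                     # s or u: nothing to compare
        found = counts.get(c_id, 0)
        if found < expected:
            findings.append((c_id, "fewer"))
        elif found > expected:
            findings.append((c_id, "more"))
    return findings

mults = {3: "t", 4: "d", 6: "q", 7: "t"}   # multiplicities partly invented
hmqc = [(3, 1), (3, 2), (4, 4), (6, 33), (6, 32), (7, 9)]
print(check_hmqc_counts(mults, hmqc))
```

Here peak 6 (CH3) has two HMQC connectivities where one is expected, and peak 7 (CH2) has only one, which could mean a missing HMQC peak or two degenerate 1H peaks.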

NMR-SAMS automatically discriminates HMQC from HETCOR and does not consider diagonal peaks or symmetric peaks. Strong intensity level (represented as “3”) and the actual peak intensity (from SpecMan peaks table) are assigned to each peak.  The peak intensities are not used by NMR-SAMS so it is not necessary to take care of them (see Section 6.2.4).

Results: After the conversion, the .nmr file is updated with information regarding the converted HMQC connectivities starting with the keyword “HMQC:”. The following is a transcript of a converted HMQC connectivity list:

HMQC:

 #1. (3 - 1)      ;2

 #2. (3 - 2)      ;1

 #3. (4 - 4)      ;3

 #4. (6 - 33)     ;4

      .

      .

      .

The first line which begins with the keyword “HMQC:” indicates the start of  HMQC connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of a connectivity:

·         Connectivity ID, a serial number that uniquely identifies this connectivity.

·         IDs of the correlated 1D 13C and 1H peaks (shown in parentheses), which define the correlated 13C and 1H peaks respectively.

·         Comments, which are optional and have a maximum length of 80 characters. The numbers in the comment field correspond to the ID of the corresponding peak in the SpecMan peaks table.

One or more space(s) is used as a delimiter for all items except comments which are separated by “;”.   Items marked as optional can be omitted unless an item following them is included.  In such a case, you must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a “;”.

Note: The conversion of HMQC/HETCOR peaks table is dependent on the converted 1H and 13C peak lists.  If you convert the 1H/13C peaks table again, or manually modify the converted 1H/13C peak list, you must convert the HMQC/HETCOR peaks table again. 

5.6 Conversion of SpecMan HMBC/COLOC Peaks Table

Command: File/Create NMR Data File/HMBC (or COLOC).

Descriptions:  In this procedure NMR-SAMS converts the HMBC or COLOC cross peak coordinates into connectivities between 1D 13C and 1H peaks.  In  principle the conversion process is very similar to what was described earlier in Section 5.4. 

Other things that you need to be aware of are as follows:

The correlated 13C peaks are always placed ahead of 1H in a converted connectivity, and this applies to both HMBC and COLOC. 

NMR-SAMS automatically discriminates HMBC from COLOC and does not consider diagonal peaks or symmetric peaks. Strong intensity level (represented as “3”) and the actual peak intensity (from SpecMan peaks table) are assigned to each peak. The peak intensity levels are useful if you want to interpret some weak peaks as connectivities longer than 3 bonds (see Section 6.4.2).

Results: After the conversion, the .nmr file is updated with information regarding the converted HMBC connectivities starting with the keyword “HMBC:”. The following is a transcript of a converted HMBC connectivity list:

HMBC:

 #1.   (1 - 6)    3     ;3

 #2.   (1 - 7 8) 3     ;4

 #3.   (1 - 13)   3     ;5

         .

         .

         .

 #128. (29 - 10) 3     ;133

 #129. (29 - 24) 3     ;131

The first line, which begins with the keyword “HMBC:”, indicates the start of the HMBC connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the remaining lines represent the following attributes of a connectivity:

·         Connectivity ID, a serial number that uniquely identifies this connectivity.

·         IDs of the correlated 1D 13C and 1H peaks (shown in parentheses). For ambiguous correlations the IDs of all possible 1D 13C and 1H peaks are included.

·         Peak intensity level, which is classified into four levels: strong, medium, weak, and unknown, denoted as 3, 2, 1, and 0 respectively.  This is optional and the default value is 3.

·         Comments, which are optional and have a maximum length of 80 characters. The numbers in the comment field correspond to the ID of the corresponding peak in the SpecMan peaks table.

One or more spaces are used as a delimiter for all items except comments, which are separated by “;”.  Items marked as optional can be omitted unless an item following them is included.  In such a case, please include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a “;”.
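To make the line format concrete, here is a small Python sketch that reads one connectivity line of the kind shown in the transcript above. It is not part of NMR-SAMS; the names `Connectivity` and `parse_connectivity` are hypothetical, and the parser only handles the items described in this section (ID, correlated peak IDs, optional intensity level, optional comment).

```python
import re
from typing import List, NamedTuple


class Connectivity(NamedTuple):
    conn_id: int
    c13_ids: List[int]   # peak IDs before the "-" inside the parentheses
    h1_ids: List[int]    # peak IDs after the "-" inside the parentheses
    intensity: int       # 3 strong, 2 medium, 1 weak, 0 unknown
    comment: str


# "#ID." "(ids - ids)" optional space-delimited items, optional ";comment"
LINE_RE = re.compile(r"^\s*#(\d+)\.?\s*\(([^)]*)\)\s*([^;]*?)\s*(?:;(.*))?$")


def parse_connectivity(line: str) -> Connectivity:
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"not a connectivity line: {line!r}")
    left, _, right = m.group(2).partition("-")
    fields = m.group(3).split()
    # The intensity level is optional; the documented default is 3 (strong).
    intensity = int(fields[0]) if fields else 3
    return Connectivity(
        conn_id=int(m.group(1)),
        c13_ids=[int(x) for x in left.split()],
        h1_ids=[int(x) for x in right.split()],
        intensity=intensity,
        comment=(m.group(4) or "").strip(),
    )


ambiguous = parse_connectivity(" #2.   (1 - 7 8) 3     ;4")
no_intensity = parse_connectivity(" #1. (3 - 1)      ;2")
```

The second call shows a line without an intensity item (as in the HMQC list), where the default level is applied.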

Note: The conversion of HMBC/COLOC peaks table is dependent on the converted 1H and 13C peak lists.  If you convert the 1H/13C peaks table again, or modify the converted 1H/13C peak list, you must convert the HMBC/COLOC peaks table again. 

5.7 Conversion of SpecMan NOESY Peaks Table

Command: File/Create NMR Data File/NOESY (or ROESY).

Descriptions:  In this procedure NMR-SAMS converts the NOESY (or ROESY) cross peak coordinates into connectivities between 1D 1H peaks in exactly the same way as described for COSY in Section 5.4. Strong intensity level (represented as “3”) and the actual peak intensity (from SpecMan peaks table) are assigned to the corresponding entries of each peak.  NMR-SAMS uses NOESY information in a very limited fashion so normally you do not need to take care of the peak intensity for 2D structure determination (see parameters IDEAL_COSY and NOESY_DIST in Appendix IV).

5.8 Conversion of SpecMan INADEQUATE Data

Command: File/Create NMR Data File/INADEQUATE.

Descriptions:  In this procedure NMR-SAMS converts the 2D INADEQUATE cross peak coordinates into connectivities between 1D 13C peaks.  In the following dialog box, you are prompted to define a matching tolerance.  This tolerance is used to match chemical shifts of 13C peak and the F2 coordinates of the INADEQUATE peaks.  This tolerance is also used to match the F1 coordinates to search for coupled INADEQUATE peaks.  Similar to the conversion process of DQF-COSY (Section 5.4), ambiguous connectivities are considered. 
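As a rough illustration of this conversion, the Python sketch below pairs INADEQUATE peaks whose F1 coordinates agree within the tolerance and assigns each peak to every 13C peak whose shift matches its F2 coordinate within the same tolerance. The function names, the tuple layout (peak number, F1, F2), and all numeric values are hypothetical; the actual NMR-SAMS procedure may differ in detail.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def match_c13(f2: float, c13: Dict[int, float], tol: float) -> List[int]:
    """All 13C peak IDs whose shift matches the F2 coordinate within `tol`."""
    return sorted(cid for cid, shift in c13.items() if abs(shift - f2) <= tol)


def pair_inadequate(
    peaks: List[Tuple[int, float, float]], c13: Dict[int, float], tol: float
) -> List[Tuple[List[int], List[int], str]]:
    """Pair peaks that share an F1 coordinate; keep ambiguous 13C candidates."""
    connectivities = []
    for (n_a, f1_a, f2_a), (n_b, f1_b, f2_b) in combinations(peaks, 2):
        if abs(f1_a - f1_b) > tol:
            continue  # not a coupled pair
        ids_a, ids_b = match_c13(f2_a, c13, tol), match_c13(f2_b, c13, tol)
        if ids_a and ids_b and ids_a != ids_b:
            # The comment records the two source peaks, e.g. "3+4".
            connectivities.append((ids_a, ids_b, f"{n_a}+{n_b}"))
    return connectivities


# Hypothetical data: peaks 4 and 5 of the 13C list are too close to resolve.
c13_peaks = {1: 196.0, 2: 145.5, 3: 56.3, 4: 30.10, 5: 30.15}
inad_peaks = [(1, 252.3, 196.0), (2, 252.3, 56.3), (3, 175.6, 145.5), (4, 175.6, 30.12)]
conns = pair_inadequate(inad_peaks, c13_peaks, tol=0.05)
```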

Results: After the conversion, the .nmr file is updated with information regarding the converted INADEQUATE connectivities starting with the keyword “INAD:”. The following is a transcript of a converted INADEQUATE connectivity list:

INAD: 

#1. (1 - 3)       ;1+2

#2. (2 - 4 5)     ;3+4

      .

      .

      .

The first line which begins with the keyword “INAD:” indicates the start of  the INADEQUATE connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of a connectivity:

·        Connectivity ID, a serial number that uniquely identifies this connectivity.

·        IDs of the correlated 1D 13C peaks (shown in parentheses).  For ambiguous correlations the IDs of all possible 1D 13C peaks are included.

·        Comments, which are optional and have a maximum length of 80 characters. The numbers in the comment field correspond to the IDs of the corresponding INADEQUATE peaks in the SpecMan peaks table.

5.9 Manual Peak Picking 

If you do not have SpecMan (contact Spectrum Research), you can use the following procedure to manually prepare the NMR data file required by NMR-SAMS.

First number the 1D 1H and 13C peaks, preferably from down-field to up-field (see Fig. 5.4). HMQC can be used to group multiplets and resolve overlapping peaks in the 1H spectrum. If two (or more) 1H peaks overlap completely, treat them as one degenerate peak. The 1D 13C peaks must be resolved (i.e., no peak degeneracy is allowed).  If necessary, split a degenerate 13C peak into two peaks with slightly different chemical shifts.  In the worst case, where parts of the spectra cannot be resolved due to multiple atoms with very similar chemical environments (e.g. multiple phenyl groups or a long methylene chain), the unresolved 13C (and 1H) peaks can be discarded. NMR-SAMS will then perform partial structure elucidation (PSE) based on the incomplete spectral data.

Figure. 5.4. Schematic illustration of the manual preparation of NMR data input to NMR-SAMS from the original spectral plots.  The 1D 1H and 13C peaks are numbered and 2D cross peaks are picked as pairs of correlated 1D peaks.  Two COSY peaks, #2 and #3, which are suspected to be due to long-range coupling, are marked as weak by an intensity level of 1.  HMBC peak #2, which is suspected to be an artifact, is marked with a reliability of 0.4. The grid lines in the 2D spectra illustrate the intra- and inter-spectral alignments of the 1D resonances. For clarity, only COSY and HMBC are shown. See Section 5.4 for details about the format.

Picking of the 2D cross peaks is based on the numbered 1D peaks. The 2D cross peaks are located and assigned to their corresponding 1D peaks in each dimension. A cross peak that cannot be resolved can be assigned to more than two 1D peaks. If it is hard to tell whether a cross peak is an artifact or noise, use a probability smaller than 0.5 to designate it as an unreliable peak.  The interpretation of a COSY peak depends on its intensity level (i.e., J-coupling constant), so a potential long-range coupling must be marked with a “weak” intensity level (represented as 1).  Finally, the picked peaks can be listed in a text file in the format described in Appendix I.


Chapter 6

Spectral Interpretation

6.1 Overview

This chapter describes the steps involved in the interpretation of the molecular formula (MF), 1D and 2D NMR spectral data, and unification of bond constraints derived from NMR data.  First the possible set(s) of structural building blocks are determined from MF, 1H, 13C and HMQC spectral data.  Next the remaining 2D spectral data are interpreted as bond constraints between the building blocks.  In the same step, the various bond constraints are unified as a homogeneous set of bond constraints, and an atom-atom connection matrix (ACMX) is set up to summarize the possibilities of bond formation between the building blocks.

The scheme for deriving bond constraints from different 2D NMR spectral data is illustrated in Fig. 6.1. The general definition of a bond constraint (BC) has been provided in Section 3.4.

Figure 6.1.  Derivation of bond constraints from conventional 2D NMR experiments.  An INADEQUATE connectivity is interpreted as a C-C bond constraint (BC) of one bond, a COSY connectivity as an H-H BC of 2 to 5 bonds, an HMQC connectivity as a C-H BC of one bond, and an HMBC connectivity as a C-H BC of 2 or 3 bonds.  The various BCs are transformed into a unified set of C-C BCs based on the HMQC connectivities. 

The spectral interpretation-related steps correspond to the first three options in the Analysis menu shown below:

6.2 Interpretation of MF, 1H, 13C and HMQC Data as Building Blocks

Command: Analysis /Building Blocks.

Description:  This procedure interprets the MF, 1H, 13C, and HMQC data, and generates all possible sets of building blocks for structure generation.

You are prompted to enter the MF when a new working data set is opened.  If you want to enter a different MF, choose File/Input Molecular Formula to enter a new one.  The MF can be unknown.  See Section 4.4 for details.

1H, 13C, and HMQC data are read from the .nmr file.  If MF is unknown, you must at least have 13C spectral data.  If the MF is known, and you have no NMR data, you can perform isomer enumeration. 

Parameters: None.

Results:  The results of interpretation of MF, 1H, 13C, and HMQC data are written into the .mdf file. The first set of the generated building blocks are displayed on the screen. In the next few sections the results of this procedure are described in detail.

6.2.1 Interpretation of Molecular Formula

See section 4.4 for description of the interpretation of MF. 

6.2.2 Interpretation of 1D 1H Data

The 1H peak list in the NMR data file is interpreted and written into the MDF as a record starting with the keyword “1DH1:”.  Following the keyword are the number of 1H peaks, and the minimum and maximum number of heteroatom-attached protons.  The latter is currently not used so it is always set as 0 - 0.  The second line is a brief description of the entries in the rest of the lines.  Each of the subsequent lines includes the peak ID, the chemical shift, the minimum and maximum numbers of the corresponding protons, and the multiplicity of the 1H peak.  The minimum and maximum numbers of the corresponding protons are currently not used so they are always kept as zeros.  Following is a transcript of such a record:

1DH1: num.peaks = 33, num.hete.Hs = 0-0

#Peak. Chem.shift (min. protons ~ Max. protons multiplicity )

# 1. 4.930(0~0 1)

# 2. 4.755(0~0 1)

# 3. 3.509(0~0 0)

# 4. 3.435(0~0 0)

# 5. 2.725(0~0 0)

# 6. 2.611(0~0 0)

# 7. 2.235(0~0 0)

      .

      .

      .

6.2.3 Interpretation of 1D 13C Data

The 13C peak list in the NMR data file is interpreted and written into the MDF as a record starting with the keyword “1DC13:”.  Following the keyword is the number of 13C peaks.  The second line is a brief description of the entries in the rest of the lines. Each of the subsequent lines includes the peak ID, the chemical shift, and the minimum and maximum numbers of the attached protons of a 13C peak.  If the multiplicity of a peak is unknown, a range of attached protons (i.e., 0 to 3) will be assigned to the carbon.

Another record, starting with the keyword “SYMMETRY:”, describes molecular symmetry of the unknown molecule.  Currently this entry is either “No”, when the number of 13C peaks equals that of carbon atoms, or “PSE” for partial structure elucidation.

Following is a transcript of such records:

1DC13: num.peaks = 21

#Peak, Chem.shift, (Rng.of att.H, i.e., mult.-1)

# 1. 196.06(0~0)

# 2. 145.56(0~0)

# 3. 144.65(0~0)

# 4. 140.75(1~1)

# 5. 123.40(0~0)

# 6. 121.57(0~0)

# 7. 56.28(1~1)

# 8. 53.85(0~0)

      .

      .

      .

 

SYMMETRY: No     

6.2.4 Interpretation of HMQC/HETCOR Connectivities

Each HMQC/HETCOR connectivity in the NMR data file is interpreted as a C-H BC according to the following rules:  

1.      All connectivities are interpreted as a C-H BC of exactly one bond. 

2.      If a 1H peak is found to have no HMQC peak, you will be prompted (as shown below in the dialog box) to supply the type of heteroatom attached to it.  The program then automatically assigns a heteroatom to the proton and adds an X-H BC (X is the heteroatom) to the list of HMQC-derived C-H ones.  The program first lists all of the 1H peaks without HMQC connectivities, together with the recommended assignment of heteroatoms.  For example: 

If you agree with the H-X assignment, click Yes. Otherwise click No, and you will be prompted to assign heteroatoms to each of the 1H peaks. For example:

The current heteroatoms with attached 1H peaks are numbered and listed in the dialog box.  This is useful when you want to attach more than one 1H peak to the same heteroatom.  In such a case, you can type a heteroatom followed by a number in the list so that the current 1H will be attached to it.  

If you are not sure which kind of heteroatom should be connected to the 1H peak, leave the text field empty or type ‘unknown’.  NMR-SAMS will not attach this proton to any heteroatom.  In such a case, any connectivity information relevant to this proton will be ignored during the subsequent analysis.

The results of interpretation of HMQC connectivities are written into the MDF as a record starting with the keyword “HMQC:”. Following the keyword is a comment, denoting the sequence of the correlated atoms in each bond constraint. Each of the rest of the lines is a C-H bond constraint. Following is a transcript of the record:

 

HMQC: (Node sequence: C-13, H-1)

(3 - 1: 1 ~ 1; 0)Q1

(3 - 2: 1 ~ 1; 0)Q2

(4 - 4: 1 ~ 1; 0)Q3

(6 - 33: 1 ~ 1; 0)Q4

      .

      .

      .

 

6.2.5  Generation of Building Blocks

If the MF is known, this procedure allocates the constituent protons to the heavy atoms based on the 13C multiplicities and chemical valences of the heavy atoms.  The generated building block sets must comply with the 13C multiplicities and the number of 1H peaks attached to the heteroatoms. Each heavy atom, with its attached protons and unsatisfied valence, is called a building block.  The unsatisfied valence is represented as free bonds. 

If the MF is unknown, carbon building blocks are derived directly from the 13C peaks, with a certain or uncertain number of attached protons depending on whether the 13C multiplicity is known or unknown.  If some 1H peaks are attached to heteroatoms, heteroatom building blocks are also derived.   You can use the Analysis/User-Defined Building Blocks function to edit the building blocks.

The free bonds of different building blocks can be connected to form bonds, as illustrated in Fig. 6.2:

Figure. 6.2 Examples of structural building blocks and bond formation between them. 

 

The resulting building blocks are written in the MDF as a record starting with the keyword “FRAG_SET:”.  The following is a transcript of such a record:

FRAG_SET:

#1:   C  C  CH2 CH1 C  CH1 CH1 CH1 CH1 C 

      C  C  CH2 CH1 CH2 C  CH2 CH2 CH2 CH2

      CH3 CH2 CH2 CH2 CH3 CH2 CH3 CH3 CH3 CH3

      O  OH1 OH1
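As an illustration of how such a record can be read, the following Python sketch parses the transcript above, computes the free bonds of each building block as valence minus attached protons, and totals the atoms. It is only a sketch: the helper names are hypothetical, the default valences (C 4, N 3, O 2) are an assumption, and blocks with an unknown proton count such as ‘CH?’ are not handled.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

DEFAULT_VALENCE = {"C": 4, "N": 3, "O": 2}  # assumed default valences

# Element symbol optionally followed by "H" and the attached proton count.
TOKEN_RE = re.compile(r"^([A-Z][a-z]?)(?:H(\d+))?$")

FRAG_SET = """
#1:   C  C  CH2 CH1 C  CH1 CH1 CH1 CH1 C
      C  C  CH2 CH1 CH2 C  CH2 CH2 CH2 CH2
      CH3 CH2 CH2 CH2 CH3 CH2 CH3 CH3 CH3 CH3
      O  OH1 OH1
"""


def parse_block(token: str) -> Tuple[str, int, int]:
    """Return (element, attached protons, free bonds) for one building block."""
    m = TOKEN_RE.match(token)
    if m is None:
        raise ValueError(f"unsupported building block: {token!r}")
    element, protons = m.group(1), int(m.group(2) or 0)
    return element, protons, DEFAULT_VALENCE[element] - protons


def parse_frag_set(text: str) -> List[Tuple[str, int, int]]:
    # Skip the set label ("#1:") and parse every remaining token.
    return [parse_block(t) for t in text.split() if not t.startswith("#")]


def formula(blocks: List[Tuple[str, int, int]]) -> Dict[str, int]:
    counts: Counter = Counter()
    for element, protons, _ in blocks:
        counts[element] += 1
        counts["H"] += protons
    return dict(counts)


blocks = parse_frag_set(FRAG_SET)
atom_counts = formula(blocks)
total_free_bonds = sum(free for _, _, free in blocks)
```

For this transcript the atom counts correspond to C30H48O3, and the total number of free bonds is even, as it must be if every free bond is to be paired with another one during bond formation.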

After the building blocks are generated, the first set of building blocks is displayed. If there are multiple building block sets, a Building Block Browser is displayed (as shown below) which allows you to browse through each building block set by moving the slider. 

Multiple sets of building blocks are generated when either of these conditions holds: some or all of the 13C multiplicities are unknown, or there are different kinds of heteroatoms with attached protons.  NMR-SAMS can use multiple sets of building blocks for structure generation, but it only uses the first one for target structure-based resonance assignment.  So wherever possible you are advised to delete the undesired ones.

To remove the building block set which is being displayed, click Delete in the Building Block Browser.  To select the displayed building block set as the only one for structure generation, click Select in the Building Block Browser, and the rest of the building block sets will be removed.

Note: In the case of a 13C peak with unknown multiplicity, NMR-SAMS will try to enumerate all possible numbers of attached protons for its corresponding building block if MF and 13C spectral data are used.  If this is not possible (e.g. when MF is unknown, or there are fewer 13C peaks than carbon atoms), NMR-SAMS will generate a building block with an unknown number of attached protons, such as ‘CH?’.  Such a building block will be ignored during the subsequent structure generation.

Possible Errors:

·        If no valid building block set is generated, you have to check the MF, 13C multiplicities, and the valence of the atoms.

·        The maximum number of building block sets is set to 500.  If it exceeds this number, the remaining ones are ignored.  In such a case, use 13C multiplicities to constrain the generation of building blocks.

6.3 User-Defined Building Blocks

Command: Analysis/User-Defined Building Blocks.

Description: Whether the MF is known or unknown, this option allows you to add, delete, or modify the building blocks.

To add a building block, select Add, and type the element symbol after Element. Select Ignored Atom if you want to ignore it in structure generation (see Section 7.1 for details regarding Ignored Atoms).  After Proton Count, select the correct number of attached protons.  If unknown, select Unknown.  The default valence will appear after Valence, although you can select a different one.  If you type “C” after Element, you will be able to check Assigned C-13 Shift and type a 13C chemical shift for it.  If the Proton Count is bigger than zero, you will be able to check Assigned H-1 Shift, and type one or two 1H chemical shifts for the protons.  When entering multiple proton shifts use a blank space as a delimiter. Then click on an empty place in the main graphics window, and a building block with the defined attributes will be added.

You can copy the attributes from an existing building block by clicking on that building block while keeping the Ctrl key pressed.

Note:  There are some limitations on the use of Add.  Newly added carbon building blocks will be ignored (i.e., not used for bond formation during the structure generation). Any building block that has an unknown number of attached protons will be ignored.  Finally, the chemical shifts of the added building blocks are only cosmetic, i.e., they will not be evaluated during the subsequent analysis although they are always displayed.

To modify a building block, check Modify in the palette if it is not checked.  Next copy the attributes from that building block by clicking on it while pressing the Ctrl key.  Then change the corresponding attributes in the palette. Finally click on the building block again (without pressing the Ctrl key) and the building block will be modified accordingly.

Tip:  To change a non-ignored building block into an ignored one (or vice versa), you do not need to copy all attributes before modifying it.  Just set the option Ignored Atom as required, and click on that building block.  The first click only toggles the ‘Ignored Atom’ state, if the required value is different from the current state of the building block.  If you also want to change other attributes, click it again and all other attributes will be modified according to those specified.

Note that for a carbon building block derived from 13C data, you cannot modify any attribute other than Ignored Atom and Proton Count.

To delete a building block, check Delete in the palette if it is not checked.  Then click on the building block you want to delete.

Note that you can not delete a carbon building block that was derived from a 13C peak. 

Results:  The modified building blocks are written in the MDF as a record starting with the keyword “FRAG_SET:”.  The original record is overwritten. 

 

6.4 Interpretation of 2D Spectral Data as Bond Constraints

Command: Analysis/Bond Constraints.

Description:  This procedure interprets the COSY, HMBC, NOESY, and INADEQUATE spectral data in the .nmr data file to define bond constraints.  Then the various bond constraints are unified, and an atom-atom connection matrix is set up for subsequent structure generation or resonance assignment.

Parameters: The relevant parameters for interpreting the 2D spectral data can be accessed from the dialog box shown below by choosing Edit/Parameters/NMR Interpretation. For explanation of the parameters, see Section Parameters for Spectral Interpretation in Appendix IV.

The relevant parameters for setting up the ACMX can be accessed from the dialog box shown below by choosing Edit/Parameters/Setting up ACMX. For explanation of the parameters, see Section Parameters for Setting Up ACMX in Appendix IV.

Results:  In the next few sections the results of this procedure are described for each type of spectral data. 

6.4.1 Interpretation of COSY Connectivities

The results of COSY interpretation are written into the MDF as a record starting with the keyword “COSY:”, which can be edited by choosing Edit/Master Data File.  Each COSY connectivity in the NMR data file is first classified as due to either potential long-range coupling or short-range coupling.  Based on that, a H-H BC is assigned to it.  The rules for this step are described below:  

1.      If the intensity level is weak (represented as “1”), it is treated as due to potential long-range coupling.

2.      If the intensity level is medium, strong (represented as “2” or “3”, respectively), or blank, it is treated as due to short-range coupling.

3.      If the intensity level is unknown (represented as “0”), then the J-coupling constant is used to classify short-range and long-range couplings.  If the J-coupling constant is also unknown (represented as 0.0), then an error message will be displayed and the interpretation is aborted.  If the J-coupling constant is defined as J Hz, it is compared with the parameter COSY_J_CATEG (which is set as 3.0 by default).  All connectivities that have J ≤ COSY_J_CATEG are treated as due to potential long-range coupling, and the rest as short-range coupling. 

4.      When a connectivity is classified as due to short-range coupling, and has a correlated singlet 1H peak, then NMR-SAMS prompts you to confirm whether it is due to long-range coupling.  If you click Yes, it is classified as a long-range coupling, otherwise (for selection No) it remains as a short-range coupling. 

5.      If you like, you can activate a check of possible long-range coupling based on 1H chemical shift.  To do this, select Edit/Parameters/NMR Data Interpretation, and add a proper value (e.g. 4.5) after Minimum H-1 Shift for Checking Long-Range H-H Coupling.  This checking is turned off by default (i.e., value set as 0).

6.      By default all connectivities due to short-range coupling are interpreted as H-H BCs with 2 to 3 intervening bonds.  By default all connectivities due to long-range coupling are interpreted as H-H BCs with 3 to 5 intervening bonds.  The number of intervening bonds is controlled by the parameter COSY_BC.

7.      The bond types of the intervening bonds are always set as unknown (0).  The number of sub-bond constraints (NSBC) that must satisfy a BC, minNSBC  and  maxNSBC, are determined as follows:

minNSBC = 1 if  P ≥ RELIAB_PEAK_PROB, or

minNSBC = 0 if  P < RELIAB_PEAK_PROB, and

maxNSBC  = n1 × n2 ,

where P is the reliability of the connectivity, and n1 and n2 are the number of correlated 1D peaks in each dimension, respectively.  The default value of the parameter, RELIAB_PEAK_PROB is set as 0.50.  For example, the following connectivity is due to an “unreliable” DQF-COSY peak since the reliability is 0.4: 

#8 (2 - 5 6) 3 0.00 0.4 ;unreliable, may be an artifact

So this connectivity is interpreted as the following H-H BC:

(2 - 5 6: 2 ~ 3; 0; 0 ~ 2)C8

which means that this BC is flexible enough to be considered as satisfied if none, one, or both of the proton pairs (i.e. H2-H5 and H2-H6) have a bond separation of two or three bonds in the generated structure.  

8.      If two 1H peaks are very close and no COSY peak is observed between them, you are alerted to check if any near-diagonal peak has been neglected between them.  If you are not sure about this, the program allows you to add a "pseudo bond constraint" for this proton pair.  The tolerance for checking near-diagonal COSY peaks is controlled by a parameter called COSY_DIAG_RESO, and its default value is 0.02 ppm.  You can change this by selecting Edit/Parameters/NMR Interpretation. The pseudo BC here is used to prevent two atoms from being forbidden to connect while setting up the ACMX.
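The non-interactive part of the classification rules above (intensity level first, then the J-coupling constant) can be sketched in Python as follows. The function names are hypothetical, the interactive checks for singlets and for 1H chemical shift are left out, and reading the four values of COSY_BC as (long-range min, long-range max, short-range min, short-range max) is an assumption based on the defaults stated in the rules and on the value "3 5 2 3" shown in the MDF record.

```python
from typing import Optional, Tuple


def classify_cosy(
    intensity: Optional[int], j_hz: float = 0.0, cosy_j_categ: float = 3.0
) -> str:
    """Classify a COSY connectivity as 'long' or 'short' range coupling."""
    if intensity == 1:                      # weak
        return "long"
    if intensity is None or intensity in (2, 3):   # blank, medium, or strong
        return "short"
    if intensity == 0:                      # unknown: fall back to J
        if j_hz == 0.0:
            raise ValueError("intensity level and J-coupling constant both unknown")
        return "long" if j_hz <= cosy_j_categ else "short"
    raise ValueError(f"invalid intensity level: {intensity}")


def cosy_bond_range(
    kind: str, cosy_bc: Tuple[int, int, int, int] = (3, 5, 2, 3)
) -> Tuple[int, int]:
    """Return (minBond, maxBond) for the H-H BC; the COSY_BC layout is assumed."""
    long_min, long_max, short_min, short_max = cosy_bc
    return (long_min, long_max) if kind == "long" else (short_min, short_max)


weak_range = cosy_bond_range(classify_cosy(1))
strong_range = cosy_bond_range(classify_cosy(3))
```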

The results of COSY interpretation are written into the MDF as a record starting with the keyword “COSY:”. Following the keyword is a comment, denoting the parameters used for the interpretation. Each line thereafter is a H-H bond constraint. Following is a transcript of the record:

COSY: (COSY_BC = 3 5 2 3; COSY_DIAG_RESO = 0.020)

(1 - 2: 3 ~ 5; 0; 1 ~ 1)C1

(1 - 12: 3 ~ 5; 0; 1 ~ 1)C2

(2 - 12: 3 ~ 5; 0; 1 ~ 1)C3

(3 - 7 8: 2 ~ 3; 0; 1 ~ 2)C4

      .

      .
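For readers who post-process the MDF, the bond constraint lines shown above can be read with a small parser such as the Python sketch below. The names `BondConstraint` and `parse_bc` are hypothetical. The field meanings follow the text: correlated peak IDs, minimum and maximum bond separation, bond type, minNSBC and maxNSBC (absent in HMQC records), and a source label.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class BondConstraint(NamedTuple):
    left_ids: List[int]
    right_ids: List[int]
    bonds: Tuple[int, int]            # (minBond, maxBond)
    bond_type: int                    # 0 = unknown
    nsbc: Optional[Tuple[int, int]]   # (minNSBC, maxNSBC), None if absent
    source: str                       # e.g. "C4", "B3", "Q1"


BC_RE = re.compile(
    r"^\(\s*([\d ]+?)\s*-\s*([\d ]+?)\s*:\s*(\d+)\s*~\s*(\d+)\s*;\s*(\d+)\s*"
    r"(?:;\s*(\d+)\s*~\s*(\d+)\s*)?\)(\w+)$"
)


def parse_bc(line: str) -> BondConstraint:
    m = BC_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a bond constraint line: {line!r}")
    left, right, b_min, b_max, b_type, n_min, n_max, source = m.groups()
    nsbc = (int(n_min), int(n_max)) if n_min is not None else None
    return BondConstraint(
        [int(x) for x in left.split()],
        [int(x) for x in right.split()],
        (int(b_min), int(b_max)),
        int(b_type),
        nsbc,
        source,
    )


cosy_bc = parse_bc("(3 - 7 8: 2 ~ 3; 0; 1 ~ 2)C4")
hmqc_bc = parse_bc("(3 - 1: 1 ~ 1; 0)Q1")
```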

6.4.2 Interpretation of HMBC/COLOC Connectivities

Each HMBC/COLOC connectivity list in the NMR data file is interpreted as a C-H BC according to the following rules:  

1.      Each connectivity is interpreted as a C-H BC of a certain range of intervening bonds based on the intensity level of the peak and the relevant parameters.  

2.      The bond types of the intervening bonds are always set as unknown (0).  The number of sub-bond constraints (NSBC) that must satisfy a BC, minNSBC  and  maxNSBC, are determined as follows:

minNSBC = 1 if  P ≥ RELIAB_PEAK_PROB, or

minNSBC = 0 if  P < RELIAB_PEAK_PROB, and

maxNSBC  = n1 × n2 ,

where P is the reliability of the connectivity, and n1 and n2 are the number of correlated 1D peaks in each dimension, respectively.   The default value of the parameter, RELIAB_PEAK_PROB is set as 0.50.  For example, the following connectivity is due to an “unreliable” HMBC peak because its reliability is 0.4:  

#3 (10 - 8) 3 0.00 0.4 ;very weak, may be an artifact

So this connectivity is interpreted as the following C-H BC:

(10 - 8: 2 ~ 3; 0; 0 ~ 1)B3

The last two numbers, 0 and 1, mean that bond separation between C10 and H8, can either satisfy or violate this BC in the generated structure. 
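The NSBC rule is simple enough to state as code. The Python sketch below computes minNSBC and maxNSBC from the reliability and the numbers of correlated 1D peaks, and formats the result in the bond constraint notation used in this chapter; the function names are hypothetical.

```python
from typing import List, Tuple


def nsbc_range(
    reliability: float, n1: int, n2: int, reliab_peak_prob: float = 0.50
) -> Tuple[int, int]:
    """Return (minNSBC, maxNSBC) for a connectivity with n1 x n2 candidate pairs."""
    min_nsbc = 1 if reliability >= reliab_peak_prob else 0
    return min_nsbc, n1 * n2


def format_bc(
    left_ids: List[int],
    right_ids: List[int],
    min_bond: int,
    max_bond: int,
    bond_type: int,
    nsbc: Tuple[int, int],
    source: str,
) -> str:
    left = " ".join(str(i) for i in left_ids)
    right = " ".join(str(i) for i in right_ids)
    return (
        f"({left} - {right}: {min_bond} ~ {max_bond}; "
        f"{bond_type}; {nsbc[0]} ~ {nsbc[1]}){source}"
    )


# The unreliable HMBC peak #3 (reliability 0.4, one 13C and one 1H peak).
hmbc_line = format_bc([10], [8], 2, 3, 0, nsbc_range(0.4, 1, 1), "B3")
# The unreliable, ambiguous DQF-COSY peak #8 from Section 6.4.1.
cosy_line = format_bc([2], [5, 6], 2, 3, 0, nsbc_range(0.4, 1, 2), "C8")
```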

The results of interpretation of HMBC connectivities are written into the MDF as a record starting with the keyword “HMBC:”. Following the keyword is a comment, denoting the parameters used for interpretation and the sequence of the correlated atoms in each bond constraint. Each line thereafter is a C-H bond constraint. Following is a transcript of the record:

HMBC: (HMBC_BC = 2 3, Node sequence: C-13, H-1)

(1 - 6: 2 ~ 3; 0; 1 ~ 1)B1

(1 - 7 8: 2 ~ 3; 0; 1 ~ 2)B2

(1 - 13: 2 ~ 3; 0; 1 ~ 1)B3

(1 - 15: 2 ~ 3; 0; 1 ~ 1)B4

      .

      .

      .

6.4.3 Interpretation of NOESY Connectivities

A NOESY connectivity in the NMR data file is always interpreted as an H-H BC of 2 to 6 bonds. NOESY is useful to NMR-SAMS only when you opt to use the negative information of COSY together with NOESY.  For example, if neither a COSY nor a NOESY peak is observed between the protons attached to two carbon atoms, then this pair is forbidden to connect (see the usage of parameter IDEAL_COSY in Appendix IV).  In the current version of NMR-SAMS, the through-space NOESY correlations are not used as bond constraints during structure elucidation. 

The results of interpretation of NOESY  connectivities are written into the MDF as a record starting with the keyword “NOESY:”. Following the keyword is a comment, denoting the parameters used for interpretation and sequence of the correlated atoms in each bond constraint. Each of the rest of the lines is a H-H bond constraint. Following is a transcript of the record:

NOESY: (NOESY_BC = 2 6 0, Node sequence: H-1, H-1)

(1 - 2: 2 ~ 6; 0; 1 ~ 1)N1

(1 - 3: 2 ~ 6; 0; 1 ~ 1)N2

(1 - 12: 2 ~ 6; 0; 1 ~ 1)N3

(2 - 12: 2 ~ 6; 0; 1 ~ 1)N4

(3 - 7 8: 2 ~ 6; 0; 1 ~ 2)N5

      .

      .

      .

6.4.4  Interpretation of INADEQUATE Connectivities

Each INADEQUATE connectivity in the NMR data file is interpreted as a C-C BC according to the following rules:  

1.      Each connectivity is interpreted as a C-C BC of one intervening bond by default.  The number of intervening bonds is controlled by the first two values of the parameter INAD_BC. 

2.      The bond type is controlled by the third value of the parameter INAD_BC, and by default is defined as unspecified (i.e., unknown).  This can be changed to single, double, or triple.  For example, if an INADEQUATE experiment is optimized to manifest only single C-C bond, you can set the third value of INAD_BC as 1, so that all of the connectivities are interpreted as C-C single bonds.  This will improve the efficiency of the structure generation since NMR-SAMS will not consider the other possibilities of these bonds.

3.      The number of sub-bond constraints (NSBC) that must satisfy a BC, minNSBC  and  maxNSBC, are determined as follows:

minNSBC = 1 if  P ≥ RELIAB_PEAK_PROB, or

minNSBC = 0 if  P < RELIAB_PEAK_PROB, and

maxNSBC  = n1 × n2 ,

where P is the reliability of the connectivity, and n1 and n2 are the number of correlated 1D peaks in each dimension, respectively.   The default value of the parameter RELIAB_PEAK_PROB is set as 0.50.  For example, the following connectivity is due to an “unreliable” INADEQUATE peak since its reliability is set as 0.4: 

#18 (9 10 - 28) 3 0.0 0.4 ;C9 and C10 too close to resolve

This connectivity is interpreted as the following C-C BC:

(9 10 - 28: 1 ~ 1; 0; 0 ~ 2)I18

which means that this BC is flexible enough to be considered as satisfied if either none, one, or both of carbon pairs (i.e. C9-C28 and C10-C28) have a bond separation of one bond in the generated structure.     

The results are written into the MDF as a record starting with the keyword “INADEQUATE:”. Following the keyword is a comment, denoting the parameters used for interpretation. Each line thereafter is a C-C bond constraint. Following is a transcript of the record:

INADEQUATE: (INAD_BC = 1 1 0)

(2 - 1: 1 ~ 1; 0; 1 ~ 1)I1

(4 - 3: 1 ~ 1; 0; 1 ~ 1)I2

(5 - 4: 1 ~ 1; 0; 1 ~ 1)I3

(6 - 5: 1 ~ 1; 0; 1 ~ 1)I4

      .

      .

      .

6.4.5 Transformation of Bond Constraints

After interpreting the various 2D spectral data as bond constraints, this procedure transforms the various kinds of BCs into a homogeneous set of C-C (or heteroatom) BCs based on the HMQC-derived C-H BCs. The following rules are used:

1.      An INADEQUATE-derived C-C BC remains unchanged.

2.      The correlated 1H peaks in a DQF-COSY-derived H-H BC are replaced by their correlated 13C peaks in HMQC, and the bond separation is reduced by 2.

3.      The correlated 1H peak(s) in an HMBC-derived C-H BC are replaced by their correlated 13C peaks in HMQC, and the bond separation is reduced by 1.

4.      The correlated 1H peaks in a NOESY-derived H-H BC are replaced by their correlated 13C peaks in HMQC, and the bond separation is reduced by 2.

5.      If a degenerate 1H peak has multiple correlated 13C peaks, pseudo C-C BCs are added between these 13C peaks.  The pseudo BC is used to prevent the two atoms from being forbidden to connect while setting up the ACMX.

Note: A degenerate 1H peak has multiple correlated 13C peaks in HMQC unless they arise from geminal protons. If a BC involves such a 1H peak, all of its correlated 13C peaks are included in the resulting C-C BC, which introduces additional ambiguity.  NMR-SAMS can still use such ambiguous BCs for structure generation.

6.      The sources of the relevant BCs are included as comments in the resulting C-C BC so that you can keep track of the various connectivities from which a C-C BC is derived.

Fig. 6.3 illustrates the transformation of an ambiguous COSY BC into C-C BC.  The ambiguity arises from the overlapping peaks of H8 and H9.

Figure 6.3 Illustration of the transformation of a DQF-COSY-derived H-H BC into a C-C BC based on the relevant HMQC connectivities.  The two protons in the circle cannot be resolved in the DQF-COSY spectrum, thus introducing ambiguity in the resultant C-C BC.  For details about the format of the bond constraints, please refer to Section 3.4.
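Rules 2 to 4 can be illustrated with a small Python sketch. It is not the program's code: the function names, the dictionary layout of the HMQC table, and the peak IDs are hypothetical and were chosen only to mirror the kind of ambiguity shown in Fig. 6.3.

```python
def carbons_for(h_peaks, hmqc):
    """Collect the 13C peaks correlated (via HMQC) with the given 1H peaks."""
    carbons = []
    for h in h_peaks:
        for c in hmqc[h]:
            if c not in carbons:
                carbons.append(c)
    return sorted(carbons)


def hh_to_cc(h_side_a, h_side_b, min_bond, max_bond, hmqc):
    """COSY or NOESY H-H BC -> C-C BC (bond separation reduced by 2)."""
    return (carbons_for(h_side_a, hmqc), carbons_for(h_side_b, hmqc),
            min_bond - 2, max_bond - 2)


def ch_to_cc(c_side, h_side, min_bond, max_bond, hmqc):
    """HMBC C-H BC -> C-C BC (bond separation reduced by 1)."""
    return (sorted(c_side), carbons_for(h_side, hmqc),
            min_bond - 1, max_bond - 1)


# Hypothetical HMQC table: 1H peak ID -> correlated 13C peak IDs.
# 1H peak 11 is degenerate (two overlapping protons on carbons 15 and 19).
hmqc = {7: [9], 11: [15, 19]}

# A three-bond COSY coupling between 1H peaks 7 and 11
print(hh_to_cc([7], [11], 3, 3, hmqc))  # ([9], [15, 19], 1, 1)
# A two- or three-bond HMBC correlation between 13C peak 3 and 1H peak 7
print(ch_to_cc([3], [7], 2, 3, hmqc))   # ([3], [9], 1, 2)
```

In the first call the degenerate 1H peak brings both of its carbons into the resulting C-C BC, which is how the ambiguity described in the note above arises.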

All resultant C-C BCs are cross-checked for mutual consistency.  If two BCs have the same relevant nodes, they are merged according to the following rules:

·        If all entries are identical except the source, their sources are merged.

·        If the ranges of bond separation, minBond and maxBond, are different and an intersection is possible, then the intersection of the two ranges is adopted. Otherwise NMR-SAMS prompts you to supply a valid minBond and maxBond.  For example, if one BC requires a bond separation of 1 to 3 bonds and the other requires 1 to 1 bond, then the intersection, 1 to 1 bond (i.e., exactly one bond), is adopted for the merged BC.  On the other hand, if one BC requires a bond separation of 2 to 3 bonds and the other requires 1 to 1 bond, no intersection is possible between the two BCs, and a message prompts you to enter the proper bond separation.

In this example, type “1 1” if  you are sure  it is a vicinal coupling, or “1 3” if you are not.

·        Similar to bond separation, if the ranges of NSBC, minNSBC and maxNSBC, are different, the intersection of the two ranges is adopted whenever an intersection is possible. Otherwise you will be prompted with a similar message as above to supply a valid range for minNSBC and maxNSBC.

·        If the bond types are different, then NMR-SAMS adopts the higher bond order (the order of priority is: triple, double, single, and unknown).

Note:   Most of the BCs can be combined with other BCs (e.g., a COSY BC with an HMBC one) except  NOESY BCs,  which are treated differently.  NOESY BCs can be combined only with other NOESY BCs concerning the same 13C signals.
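The merging rules for ranges and bond types amount to a range intersection and a maximum. The Python sketch below shows this under the assumption that bond types are coded 0, 1, 2, and 3 for unknown, single, double, and triple, as in the MDF examples; the function names are hypothetical and the sketch is not the program's own code.

```python
def intersect(range_a, range_b):
    """Intersect two (min, max) ranges; return None if they do not overlap."""
    lo = max(range_a[0], range_b[0])
    hi = min(range_a[1], range_b[1])
    return (lo, hi) if lo <= hi else None


def merge_bond_type(type_a, type_b):
    """Higher bond order wins: triple (3) > double (2) > single (1) > unknown (0)."""
    return max(type_a, type_b)


# The two examples from the text
print(intersect((1, 3), (1, 1)))  # (1, 1)
print(intersect((2, 3), (1, 1)))  # None -> the user must supply a range
print(merge_bond_type(0, 2))      # 2
```

A result of None corresponds to the case where NMR-SAMS asks you to enter a valid range. The same intersection applies to the NSBC ranges.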

Results: The results are written into the MDF as a record starting with the keyword “C13~~C13:”. Following the keyword are some comments which are internally used by the program (Note: you must not change these comments). Every line thereafter represents a C-C bond constraint.  For details regarding the format of bond constraints, see Section 3.4.  Following is a transcript of the record:

 

C13~~C13: COSY-Y, NOESY-Y, HMBC-Y, INAD-N (Node sequence: C-13, C-13)

(3 - 25: 1 ~ 2; 0; 1 ~ 1)C2Q1Q27C3Q2Q27B13Q27B114Q1B115Q2

(9 - 15 19: 1 ~ 1; 0; 1 ~ 2)C4Q7Q11Q17B46Q11Q17

(9 - 8: 1 ~ 1; 0; 1 ~ 1)C5Q7Q6B39Q7B48Q6

      .

      .

      .

 

Tips: Running NMR-SAMS and SpecMan side-by-side provides a convenient way to inspect the original cross peaks when a bond constraint is mentioned in a dialog box, or when you are editing the bond constraints in the MDF.  Fig. 6.4 illustrates how to keep track of the cross peaks from which a bond constraint is derived.

Figure 6.4  Schematic showing how to keep track of the cross peaks from which a bond constraint (BC) was derived.  Run NMR-SAMS and SpecMan side-by-side. From the comment field of the BC that you are verifying, find the codes of the connectivities from which the BC was derived (“C3+66”, “Q18”, and “Q28” in this example). These codes mean that the BC was derived from COSY peaks #3 and #66, and HMQC peaks #18 and #28. With SpecMan, load the COSY peaks table and then click the ID of one of these cross peaks.  SpecMan then displays the cross peak in the 2D spectral window.
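The source codes in the comment field can also be split programmatically. The following Python sketch assumes that each code is a single letter followed by one or more peak numbers joined by “+”, as in the caption above. Only the letters C (COSY) and Q (HMQC) are explained in that caption, so other letters are passed through unchanged. The function name split_sources is hypothetical.

```python
import re

# Only these two letters are explained in the caption of Fig. 6.4.
EXPERIMENT = {"C": "COSY", "Q": "HMQC"}


def split_sources(comment):
    """Split a source comment such as 'C3+66Q18Q28' into (experiment, peak IDs)."""
    result = []
    for letter, ids in re.findall(r"([A-Za-z])(\d+(?:\+\d+)*)", comment):
        name = EXPERIMENT.get(letter, letter)  # unknown letters are kept as-is
        result.append((name, [int(i) for i in ids.split("+")]))
    return result


print(split_sources("C3+66Q18Q28"))
# [('COSY', [3, 66]), ('HMQC', [18]), ('HMQC', [28])]
```

The resulting peak IDs are the ones to look up in the corresponding SpecMan peak table.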

 

6.4.6 Setting up Atom-Atom Connection Matrix (ACMX)

After you select Analysis/Bond Constraints, Analysis/User-Defined Bond Constraints, or Analysis/User-Defined Environment Constraints, NMR-SAMS tries to generate an ACMX for each building block set based on the available building blocks, bond constraints, and environment constraints. NMR-SAMS uses an atom-atom connection matrix (ACMX, also known as a free bond connection matrix) to represent the bonding possibilities between the constituent heavy atoms of the unknown molecule.  By default, the unambiguous bond constraints (which define one bond between exactly two atoms) are treated as fixed bonds, and the rest are used as constraints during the subsequent structure generation.

If there is only one set of building blocks, NMR-SAMS automatically forms some common functional groups based on 13C chemical shifts and elemental composition while setting up the ACMX.  These functional groups include >C=O, -COO-, -COOH, -COON<, -COONH-, -NO2, -OSO3Hn (n = 0 or 1), and -OPO3Hn (n = 0, 1, or 2).  Sometimes these automatically added functional groups are not reliable, so you are advised to check and modify them if necessary (see Section 7.2).

Results: For each building block set, a record starting with the keyword “ACMX: #x:” (where x is the sequential number of the ACMX) is written in the MDF.  The following is a transcript of such a record:

 

ACMX: #1:

(HETCON_FLAG = 0, CCBOND_FLAG = 1 1 1,  BC_WEIGHT = 48,

IDEAL_COSY = 1, H1MULT_FLAG = 1, MAX_GEN_ANBC = 3, FIX_BOND_FLAG = 1)

# 1. 6 0  0  1  1    1  0 2 1  0 1 0    3 31 31 32      0

# 2. 6 0  0  2  2    4  0 2 0  0 1 0    0               0

# 3. 6 2  0  3  3    2  0 2 0  0 1 0    0               0

       .

       .

       .

 

After setting up ACMX, the first building block set is displayed along with the fixed bonds, if any.  If there are multiple ACMXs, a Building Block Browser is displayed.  This browser enables browsing through the building block sets.  By default, atoms with satisfied valences are displayed in gray, and the ones with free bonds are displayed in blue and marked by an asterisk ( “*”).  Bonds of unspecified type are displayed as dashed lines. You can select Display/Display Options/Show Disconnectivities to highlight the atoms that can not be connected to a certain atom when you click it. You can also select Display/Display Options/Connection Table to display a Connection Table.  The Connection Table lists building blocks, their associated chemical shifts, and the current bond constraints and environment constraints (see Chapter 10).

The ACMXs are not displayed but can be viewed in the MDF by using Edit/Master Data File.

Possible Errors: Depending on the situation, the following potential error messages appear during the setup of ACMX:

·        Too many fixed bonds for a certain atom.  This means that either a long-range coupled COSY peak was mistakenly interpreted as a vicinal one, or the valence of this atom was set incorrectly.  In the former case, mark the long-range COSY connectivities in the .nmr file (see Section 6.4.1) and choose Analysis/Bond Constraints again.  In the latter case, modify the valence of this atom according to Section 4.4.

·        Too many double bonds for a certain atom.  The minimum and maximum number of attached double bonds of each atom are determined during the interpretation of the MF (see Section 4.4). If this happens, you can modify the corresponding entries and repeat this step.

·        Too many triple bonds for a certain atom. The minimum and maximum number of attached triple bonds of each atom are determined during the interpretation of the MF (see Section 4.4).  If this happens, you can modify the corresponding entries and repeat this step.

·        Too many free bonds.  The number of free bonds, n_free_bond, can be calculated as follows:

n_free_bond = Σvalence - ΣH - 2 × Σfixed_bond

where Σvalence, ΣH, and Σfixed_bond are, respectively, the sums of the valences of the heavy atoms, the constituent protons, and the fixed bonds (with double and triple bonds multiplied by 2 and 3, respectively).  n_free_bond is one of the major factors that determine the complexity of the structure generation problem.  The current upper limit on the number of free bonds is 220.  If n_free_bond exceeds this limit, you can manually add some known bonds in a record starting with the keyword “ATOM~~ATOM:” in the MDF to reduce the number of free bonds (see Section 7.2).
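The free-bond count can be checked by hand or with a few lines of Python. The sketch below follows the formula given above; the function name and the ethanol example are illustrative assumptions, not output of NMR-SAMS.

```python
MAX_FREE_BONDS = 220  # current upper limit quoted in the text


def n_free_bond(valences, n_protons, fixed_bond_orders):
    """Sum of heavy-atom valences, minus protons, minus 2 x sum of fixed bond orders.

    valences          -- valence of each heavy atom
    n_protons         -- total number of constituent protons
    fixed_bond_orders -- order (1, 2, or 3) of each fixed bond
    """
    return sum(valences) - n_protons - 2 * sum(fixed_bond_orders)


# Hypothetical example: ethanol, C2H6O (heavy atoms C, C, O)
valences = [4, 4, 2]
print(n_free_bond(valences, 6, []))   # 4: no bond fixed yet
print(n_free_bond(valences, 6, [1]))  # 2: after fixing the C-C single bond
print(n_free_bond(valences, 6, []) > MAX_FREE_BONDS)  # False: within the limit
```

Each fixed bond removes free valences from both of its atoms, which is why adding known bonds under “ATOM~~ATOM:” lowers the count.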

 


Chapter 7

2D Structure Generation

7.1 Overview

This chapter describes the 2D structure generation of NMR-SAMS.  The structure generation of NMR-SAMS starts from an ACMX described in the previous chapter.  Usually, before structure generation, you add some known bonds, edit the fixed bonds derived by the program, add some environment constraints, and check the parameters for structure generation.  Next the structure generator of NMR-SAMS assembles the building blocks into complete structures that are compatible with all available spectral and chemical constraints. 

The structure generation is based on heteroatoms and the carbon atoms labeled by 13C chemical shifts. Depending on the number of observed 13C peaks, you can either perform complete structure elucidation or partial structure elucidation.  In some cases, such as a symmetric molecule or when the 13C spectrum shows severe overlap, partial structure elucidation is performed based on the limited carbon atoms labeled by the well-resolved 13C chemical shifts, as well as the constituent heteroatoms.  The remaining carbon atoms, called ignored atoms, are excluded during structure generation.  The resulting structure is usually a partial structure, with some dummy bonds which are supposedly linked to the ignored moieties.  Fig. 7.1 shows an example of partial structure elucidation:

Figure 7.1.  Illustrates partial structure elucidation of paclitaxel using NMR-SAMS.  Both the 1H and 13C resonances of the three phenyl groups are difficult to resolve and are thus ignored.  Using only the well-resolved portions of the 1D and 2D spectra, NMR-SAMS generates the core structure, with three dummy bonds (represented as the bold arrows), supposedly linked to the ignored phenyl groups.

Note that, compared to complete structure elucidation, partial structure elucidation has the following limitations:

·        An ignored moiety is assumed to be linked to the core structure by a single bond, i.e., a dummy bond is of single bond type. Only one dummy bond is automatically added on each atom.  In the case an ignored atom is connected to the core structure by a multiple bond, you are advised to add the remaining dummy bonds as user-defined bond constraints prior to structure generation (see Section 7.2).

·        You must  provide the number (or a range) of the dummy bonds to be fixed in a generated structure, before performing the structure generation (see Section 7.4).

·        For efficient structure elucidation, you must provide as many user-defined bond/environment constraints as possible, to reduce the search space, and thereby speed up the convergence of structure generation (see Sections 7.2 and 7.3).

The structure generation-related steps correspond to the second group of options in the Analysis menu as shown below:

7.2  User-Defined Bond Constraints

Command: Analysis/User-Defined Bond Constraints.

Description: This procedure is used to define known structural fragments as user-defined bond constraints between the building blocks.

Figure 7.1.  Screen snapshot showing the process of defining user-defined bond constraints. The building blocks and fixed bonds are displayed in the main graphics window. The User-Defined Bond Constraints palette provides tools to add or remove bonds between the building blocks.  See text for details.

Upon selecting Analysis/User-Defined Bond Constraints, a User-Defined Bond Constraint Editor palette is displayed (Fig. 7.2). To add a bond, select Add in the User-Defined Bond Constraints palette.  Then select the type of the bond to add (Single, Double, Triple, or Unknown).  If you are not sure about the bond type, choose Unknown.  Click the two building blocks between which you want to add the bond.  If the first building block is clicked by mistake, click it again to de-select it.  If the second one is clicked by mistake and the bond is accepted, delete the bond as described in the next paragraph.  The added bond is checked against the available constraints.  If any inconsistency is detected, the bond is rejected; otherwise it is accepted.

To delete a bond, choose Delete in the User-Defined Bond Constraints palette.  Next click the two atoms defining the bond to delete. If the first atom is clicked erroneously, click it again to de-select it.  If the second one is clicked erroneously and the wrong bond is removed, add the bond again as described in the previous paragraph.  Whenever you delete an NMR-derived bond, NMR-SAMS prompts you to confirm whether you want to prevent it from being added again:

If you are not sure, click No.  Otherwise if you want to forbid it from being added again, click Yes, and NMR-SAMS will add a pseudo bond constraint to prevent the atoms from being connected. This pseudo bond constraint can be removed by choosing Delete and then clicking the two relevant atoms.

To modify a bond type, select Add and the desired bond type in the User-Defined Bond Constraints palette. Next click the two atoms defining that bond, and NMR-SAMS will prompt you with a dialog box.  Click Yes and the previous bond will be replaced by the one you just entered.

For Partial Structure Elucidation, a dummy bond can be added on an atom that is known to be connected to a certain ignored moiety. To do this, select Add and Dummy Bond in the User-Defined Bond Constraints palette, and then click on the desired atom. The added dummy bond is denoted by a tilde (“~”).  To delete a dummy bond, select Delete and Dummy Bond, and then click on the desired atom.  The tilde (“~”) will disappear.  Note that a dummy bond is of single bond type.  If you add two dummy bonds on the same atom, they may represent a double bond or two single bonds.

After you have finished adding or removing bonds, click OK in the User-Defined Bond Constraints palette.  NMR-SAMS will cross-check all of the bond constraints, including the user-supplied ones and the NMR-derived ones.  Then the ACMX will be automatically set up again (see Section 6.4.6).

Results: The user-defined bond constraints are saved as a record starting with the keyword “ATOM~~ATOM:” in the MDF.  The previous ACMX(s) are overwritten by the updated one(s).  Each updated ACMX is saved as a record starting with the keyword “ACMX: #x:” where x is the sequential number of the ACMX.  If a complete structure is obtained, it is saved in a record starting with the keyword “RESULTS:”.  Following is a transcript of such a record of user-defined bond constraints:

ATOM~~ATOM:

(9 - 8: 1 ~ 1; 0)G

(14 - 8: 1 ~ 1; 0)G

(19 - 9: 1 ~ 1; 0)G

      .

      .

      .

Limitation:  Currently the interactive input of user-defined bond constraints is limited to bonds between two assigned atoms.  A general bond constraint, which may have ambiguous atoms or bond separation, must be manually appended to the bond constraints under the keyword “ACMX:”.  If there are multiple ACMXs, you have to do this for each of them.

7.2.1. Interactive Structure Generation

Besides adding known fragments, you can also use this procedure to interactively complete the structure generation.

You can start from either a building block set, or a previously generated substructure if you have performed structure generation. To start from a building block set, display the building blocks by selecting Display/Building Blocks & Fixed Bonds and select the appropriate one if there are multiple sets of building blocks.  To start from a substructure, display the previously generated substructure by selecting Display/Generated Structures, and then select one that seems to be a good starting point.  Next select Analysis/User-Defined Bond Constraints to add/delete/modify bonds between the building blocks until you get a complete structure.

This is analogous to manually assembling a structure on paper, but has the advantage of checking the consistency with the spectral and chemical constraints on the fly.  Moreover, the interaction between the displayed atoms and the bond constraints (Fig. 7.2) helps you in identifying potential bonds that can be added to the selected atom. 

Figure 7.2. Illustration of the interaction between the building blocks and the Connection Table.  By clicking an atom in the main graphics window, its associated chemical shifts and relevant bond constraints are highlighted in the Connection Table.  Alternatively, by clicking an entry in the Connection Table, the relevant atom(s) in the main graphics window can be highlighted.

Note:  While adding user-defined bond constraints, you need to double-click an atom to highlight its relevant bond constraints in the Connection Table; with a single click, a bond will be added between this atom and the next one you click.

Once a complete structure is obtained, NMR-SAMS congratulates you with the  following message:

Click OK in this dialog box, and then click OK on the User-Defined Bond Constraints palette, and NMR-SAMS will prompt you to save the completed structure.

7.3  User-Defined Atom Environment Constraints

Command: Analysis/Atom Environment Constraints. 

Description:  This procedure is used to define known structural information as atom environment constraints (ECs).  An EC defines the number of occurrences of a certain type of atom (with a specific or unspecified bond type) among the immediate neighbors of the atom under consideration (called the focus atom).  For an EC, you do not need to know the numbering of the neighboring atom.  For example, it is often difficult to distinguish between the two situations illustrated in Fig. 7.3, so you cannot enter them as user-defined bond constraints. However, you can enter them as two ECs requiring that both C-1 and C-2 have exactly one oxygen as a neighbor.

Figure 7.3.  A situation where it is difficult to predict whether C-1 and C-2 are connected to the same oxygen atom (a) or to different oxygen atoms (b).  This can be defined as two environment constraints: (1 - O: 1 ~ 1; 1) and (2 - O: 1 ~ 1; 1). 

In the MDF, an EC is represented as a line in the following format:

(focusAtom - neighborElement: minOccurrence ~ maxOccurrence; bondType)

where

focusAtom is the ID of the focus atom,

neighborElement is the element symbol of the neighboring atom(s) under consideration,

minOccurrence and maxOccurrence are the minimum and maximum occurrences of the neighboring atom under consideration, and

bondType is the type of bond between the focus atom and the neighboring atom under consideration. bondType can be 0, 1, 2, or 3 for unspecified, single, double, or triple, respectively.  If the bond is unspecified, it will be treated as all types of bonds.

Relevant Operations:

The following Edit Atom Environment Constraint dialog box appears after selecting Atom Environment Constraints from the Analysis menu. The current ECs will be displayed in the dialog box.

 

To add an EC, type the ID of the focus atom as Focus Atom ID. Type the element symbol of the neighboring atom under consideration as the Neighboring Element.  Select a Bond Type. Note that Unspecified covers all types of bonds.  Next type the Minimum and Maximum Occurrences of such neighboring atom(s).  Then click Add and the newly defined EC will be listed in the Atom Environment Constraint table.

To modify an EC, first click the EC from the listed ones. The corresponding entries will be updated according to it.  Then type your new values for Range of Occurrence, and click Add.  The EC will be updated.  Note that if you change Focus Atom ID, Neighboring Element, or Bond Type, then a new EC will be added.

To delete an EC, first click the EC from the listed ones. Next click Delete and it will be removed from the table.

After completing the EC additions,  click OK and NMR-SAMS will set up the ACMX again by including the  updated ECs.  

Examples:

(1 - N: 0 ~ 1; 0)      requires atom #1 to be linked to no more than one N atom.

(2 - N: 1 ~ 1; 3)  requires atom #2 to be linked to exactly one N with a triple bond. Nitrogen atoms with other types of bonds are not limited by this EC.

(3 - C: 1 ~ 1; 2)  requires atom #3 to be linked to exactly one C with a double bond. Carbon atoms with other types of bonds are not limited by this EC.
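To make the meaning of these examples concrete, the following Python sketch parses an EC line in the format given above and tests it against the neighbors of a focus atom. It is an illustration only: the function names and the representation of neighbors as (element, bond order) pairs are hypothetical and do not reflect how NMR-SAMS stores structures internally.

```python
import re

EC_PATTERN = re.compile(
    r"\(\s*(\d+)\s*-\s*([A-Za-z]+)\s*:\s*(\d+)\s*~\s*(\d+)\s*;\s*(\d+)\s*\)"
)


def parse_ec(line):
    """Return (focusAtom, neighborElement, minOccurrence, maxOccurrence, bondType)."""
    m = EC_PATTERN.match(line.strip())
    if m is None:
        raise ValueError(f"not an environment constraint: {line!r}")
    focus, element, lo, hi, btype = m.groups()
    return int(focus), element, int(lo), int(hi), int(btype)


def ec_satisfied(ec, neighbors):
    """Check an EC against the focus atom's immediate neighbors.

    neighbors -- list of (element, bond_order) pairs; bondType 0 in the EC
                 matches every bond order.
    """
    _, element, lo, hi, btype = ec
    count = sum(
        1 for el, order in neighbors
        if el == element and (btype == 0 or order == btype)
    )
    return lo <= count <= hi


ec = parse_ec("(2 - N: 1 ~ 1; 3)")
print(ec_satisfied(ec, [("N", 3), ("C", 1)]))  # True: one triple-bonded N
print(ec_satisfied(ec, [("N", 1), ("C", 1)]))  # False: the N is single-bonded
```

As in the second example above, neighbors bonded with a different bond type are simply not counted when bondType is specified.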

Results: The user-defined environment constraints are saved as a record starting with the keyword “ENVIRONMENT:” in the MDF.  The previous ACMX(s) are overwritten by the updated one(s).  Each updated ACMX is saved as a record starting with the keyword “ACMX: #x:” where x is the sequential number of the ACMX.  The following is a transcript of a record of environment constraints:

ENVIRONMENT:

(4 - O: 1 ~ 1; 0)

      .

      .

      .

Note: Currently NMR-SAMS does not cross-check the ECs for consistency with the current structural state or the bond constraints, so it may accept an EC that conflicts with them.  NMR-SAMS does not cross-check the ECs for mutual consistency, either.  No EC can be violated in a generated structure; therefore a wrong EC could cause a correct structure to be missed.  Hence, use ECs with caution.  Also, for Partial Structure Elucidation you cannot add an EC on an ignored atom.

7.4  Structure Generation

Command: Analysis/Generate 2D Structures.

Description: In this step NMR-SAMS searches all possible ways to assemble the structural building blocks into complete structures.  The resulting structures or substructures should be compatible with all available spectral and chemical constraints, as long as the number of violated constraints is within the user-defined limits.

If there are multiple ACMXs, structure generation is performed using every one of them, one at a time.  The resulting structures are saved in the structure file (.str file).  By changing the control parameters, you can opt to save intermediate substructures along with the complete ones, and limit the maximum number of structures.

For Partial Structure Elucidation Only: During partial structure elucidation (PSE), the structure generator tries to generate the largest substructure that is consistent with the available data. In some instances a dummy bond is used to satisfy a free bond by assuming that it is connected to one of the ignored atoms. After choosing Analysis/Generate 2D Structures, you are prompted to define a range for the number of dummy bonds to be fixed in a generated structure.  Note that this does not include the dummy bonds you have added as user-defined bond constraints (see Section 7.2).  For example, the paclitaxel molecule contains three phenyl groups that are not included in the structure elucidation process.  Hence, you type "3 3" to add exactly three dummy bonds in each generated structure (see Fig. 7.1).  If you are not sure about the number of dummy bonds, you can type a range (e.g., “0 3”).  In that case many more candidate structures will be generated.

During structure generation, a Structure Generation in Process dialog box is displayed (Fig. 7.4). This dialog box shows the initial state and the current results of the structure generation. The dialog box is updated at a frequency based on the parameter DISP_CMPLT_DELAY (the default value is 0.1 minute).

Figure 7.4. The Structure Generation in Process dialog box of NMR-SAMS. In the dialog box, the first line indicates the current ACMX being used.  If there are multiple ACMXs, every one of them will be used for structure generation.  Listed under the Initial Problem State are the MF, the number of free bonds, and the unsatisfied bond constraints.  These values define the complexity of the structure generation problem.  The greater the number of constituent atoms and free bonds, and the fewer the bond constraints, the more complex the structure generation will be. Listed under Results are the current number of generated structures (and the number of chemically unique structures in parentheses), the number of retained substructures, and the elapsed computation time in minutes.

Depending on the complexity, the computational time required by the structure generation may be in seconds or hours.  You can abort the structure generation at any time by clicking the Stop button on the dialog box.  Then the current generated structures/substructures will be saved. 

Relevant Parameters:

As described in Section 3.6, structure generation is a complex and time-consuming problem.  A set of parameters is provided so that you can control the speed and completeness of structure generation. Initially you can try the default values of these parameters, which are optimized for heuristic search and have proven effective for many structure elucidation problems.  Depending on the results, you can try different combinations of parameters to accelerate the structure generation, or to make the structure generation more complete or exhaustive.

The commonly used parameters relevant to structure generation can be modified in the Edit Parameters for Structure Generation dialog box. The dialog box is displayed after selecting Edit/Parameters/2D Structure Generation.  This dialog box is illustrated in Fig. 7.5, with the relevant parameters being checked.  You can select the options or enter different values.  Also you can click Default With NMR to restore the default setting for NMR-based structure generation.  After clicking OK, all of the changes made will be applied.  By clicking Cancel, the changes  made will be ignored and the program will revert the values to the original setting.  

Figure 7.5. The Edit Parameters for Structure Generation dialog box, with the parameters relevant to structure generation checked.  See Appendix IV for usage of these parameters.

Tips:  After completing structure generation, the parameters used for the calculations are written into the log file.  You can view the log file by choosing Edit/Log File.

Results:  The results of structure generation are complete structures and substructures, with assignment of 1H and 13C chemical shifts. A redundant structure usually implies alternative assignment of 13C chemical shifts.  In the MDF, the results of structure generation are summarized in a record starting with the keyword “RESULTS:”.  Following is a transcript of such a record:

RESULTS: 

For ACMX #1, 12 structures were generated and 12 of them are chemically unique.

  38 substructures were retained.

 Actually Used SAT_BC_RATE = 1.000.

 

SUMMARY:  50 (sub)structures saved in file

/usr/people/peng/NMR-SAMS/ndat/taxol-grid/paclitaxel-test.str.

N_STR = 12, N_PRO_STR = 12, N_SS = 38, N_PRO_SS = 219274, MIN_GEN_BOND = 31

Time for Structure generation: 1447.90/1502s.

where N_STR is the number of chemically unique complete structures generated, N_PRO_STR is the total number of complete structures generated, N_SS is the number of retained largest substructures, N_PRO_SS is the total number of generated substructures, and MIN_GEN_BOND is the minimum number of generated bonds in the retained substructures. The CPU and elapsed times for structure generation are reported in seconds.
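If you want to collect these counters from many runs, the summary line can be read with a short script. The Python sketch below is an illustration only; the function name parse_summary is hypothetical, and the line is copied from the transcript above.

```python
import re


def parse_summary(line):
    """Extract 'NAME = integer' pairs from a RESULTS summary line."""
    return {key: int(value)
            for key, value in re.findall(r"([A-Z_]+)\s*=\s*(\d+)", line)}


line = ("N_STR = 12, N_PRO_STR = 12, N_SS = 38, "
        "N_PRO_SS = 219274, MIN_GEN_BOND = 31")
summary = parse_summary(line)
print(summary["N_STR"], summary["N_SS"])   # 12 38
print(summary["N_STR"] + summary["N_SS"])  # 50
```

In this transcript, N_STR plus N_SS equals the 50 (sub)structures reported as saved in the .str file.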

The generated structures/substructures are stored in the structure file (.str file) as connection tables.

For the graphical display of the structures/substructures, see Chapter 10.

 

Possible Errors:

If the structure generation has completed without generating any structures, or if the candidate structures do not look correct,  then you need to consider the following suggestions and repeat the structure generation process:

·        If the program exceeds the upper limit of allowed structures, this could result in losing some potential structures. Increase MAX_REC_STR and repeat the structure generation. See parameter MAX_REC_STR in Appendix IV. 

·        Check the peak picking results and look for errors related to long-range coupled DQF-COSY peaks, 1H multiplicity, and the usage of negative information of DQF-COSY data.  Such errors could cause the failure of structure generation. (See the usage of parameters FIX_BOND_FLAG, H1_MULT_FLAG and IDEAL_COSY in Appendix IV).

·        Increase the Maximum Limit for Bond Constraint Violation to allow some of the bond constraints to be violated during structure generation. See parameter MAX_ERR_BC in Appendix IV.

·        Increase the Additional Tolerance for using 13C chemical shifts. See parameter ADD_C13_RNG in Appendix IV.

·        Allow the program to search a larger solution space by decreasing the Ending Value of Rate of Bond Constraint Satisfaction, and/or increasing the value of Average Number of Possibilities to Search for Each C-C Bond. See parameters SAT_BC_RATE and N_FBX_STEP in Appendix IV.  You can also select different Search Criteria for Structure Generation.  By selecting Basic or Exhaustive as the Search Criteria for Structure Generation, the search space is automatically increased.

·        Check the information in the log file to find potential problems.  Also, in verbose mode more instructions and warning messages will be displayed as dialog boxes while running the program. If you used non-verbose mode for the whole analysis procedure, choose Edit/Parameters/NMR Interpretation to open the Edit Parameters for NMR Data Interpretation dialog box. Turn on the toggle Verbose Mode.  Next, repeat the spectral interpretation and structure generation process.

·        For partial structure elucidation, make sure to define the correct number or range for the dummy bonds to be fixed in a generated structure.  Pay close attention to the limitations of partial structure elucidation described in Section 7.1.

If the structure generation seems to be endless, and no complete structure was generated before you interrupted the process, consider the following suggestions and repeat the structure generation:

·        Make sure default parameters for structure generation are used.  You can set the default values by selecting Edit/Parameters/2D Structure Generation, then click “Default with NMR” in the dialog box.

·        If the molecule is big (e.g. > 40 heavy atoms), input as many known fragments as possible by selecting Analysis/User-Defined Bond Constraints.  It is especially important to input bond constraints concerning heteroatoms to improve the efficiency of structure generation. You can also add environment constraints by selecting Analysis/Atom Environment Constraints.

·        Limit the size of rings to whatever is appropriate (e.g. 5 and 6-membered rings). See parameters MIN_RING_SIZE and MAX_RING_SIZE in Appendix IV.

·        Use an intermediate structure as the starting point for structure generation.  To do this, make sure to record the intermediate substructures (see parameter REC_SS_FLAG in Appendix IV), interrupt the structure generation process after, say, 10 minutes, and save the retained substructures.  Use your judgment to select the substructure that seems to have converged furthest in the right direction, and select Analysis/User-Defined Bond Constraints to modify it.  Next repeat structure generation starting from this substructure.

·        Propose a target structure, and let NMR-SAMS do the resonance assignment.  Resonance assignment is usually much faster than structure generation. If full assignment is not possible, increase MAX_ERR_BC by 1 or 2 to allow more bond constraints to be violated.  If full assignment is still not possible, check the largest partial assignment.  By comparing the partial assignment with your proposed structure, it may be possible to identify the inconsistency between the proposed structure and the spectral data. See Chapter 8 for details.

Structure generation is a combinatorial problem, and it is normal to see long computation times for complex molecules, especially when the spectral constraints are not sufficient to converge the structure generation process rapidly.   If the above suggestions do not help, try interactive building of the structure  (see Section 7.2.1).   During the interactive building of the molecule, NMR-SAMS checks each bond for consistency with the spectral data as they get added.  If an inconsistency is found, the error message will help you in tracing the potential error in the peak picking results, or in the structure, or in the parameter setting.

 


Chapter 8

Resonance Assignment

8.1 Overview

This chapter describes target structure-based resonance assignment using NMR-SAMS.  As described in Chapter 7, each structure generated during structure generation carries its own 13C and 1H assignments.  If you have a priori knowledge about the structure and can provide one or more proposed structures, it is worthwhile to skip the structure generation step and use only the resonance assignment function of NMR-SAMS to verify the user-proposed structures.

Unlike other methods of resonance assignment which are based on predicting 13C or 1H chemical shifts from large spectral databases, NMR-SAMS uses mainly 2D NMR-derived connectivity information for resonance assignment.  During the assignment process, NMR-SAMS first predicts a coarse 13C chemical shift range for each carbon atom (see Section 3.5) and obtains tentative assignments. Next the 2D NMR-derived connectivity information is used to improve these tentative assignments to final resonance assignments.  In this manner, the final assignments of NMR-SAMS are usually much more reliable than the assignments based solely on predicted chemical shifts.

For the same molecule, resonance assignment is much faster than de novo structure generation, even if you allow some bond constraints to be violated.  If structure generation is very slow, try proposing a few candidate structures (based on a priori knowledge of the system) and running resonance assignment on them. NMR-SAMS will then assist you in identifying possible inconsistencies between the structure and the spectral data. After correcting the errors in the spectral data, you can repeat structure generation to generate all possible structures that are consistent with the spectral data.  This should prevent missing any potential structure.

The resonance assignment-related steps correspond to the following group of options on the Analysis menu:

8.2  Input of the Target Structure

Command: Analysis/Input Target Structure.

Description:  In this step you can input your proposed structure as the target structure for resonance assignment.  If the proposed structure has already been built with third-party software, first save it in MDL format, then select Analysis/Input Target Structure/Import MDL to import the structure.  Otherwise you can build the structure by selecting Analysis/Input Target Structure/Build Molecule.  After accepting the target structure, NMR-SAMS automatically sets up an assignment matrix, which will be used for the subsequent resonance assignment.

8.2.1. Inputting the Target Structure Interactively

To input the target structure interactively, choose Analysis/Input Target Structure/Build Molecule. This will bring up a molecular editor palette for interactive sketching of the molecule.

After the Molecule Builder is displayed, you may need to click Clear to remove the currently displayed structure, or you can build a structure starting from the currently displayed structure. To sketch the target structure, first select Add, Atom, and Continuous Mode.  Leave Element as “C”, and Ambiguous unchecked (this toggle is reserved for defining a substructure, and is currently not used).  Then click in the main graphics window at the position where the first atom is supposed to appear.  An atom will be drawn at that location, and when you click the next location, another atom will be drawn with a bond connecting it to the previous atom. In this manner atoms are added where you click, and bonds (single by default) are added automatically between the current and the last atom.  To temporarily stop adding bonds, click the middle (or right) mouse button; no bond will then be added between the current and the next atom.  You can also turn off the Continuous Mode to add isolated atoms.

After the skeleton is sketched, you can modify the structure.  To modify an atom, select Modify and Atom, type the desired element symbol after Element.  You can also move the slider to change the valence of the atom, and choose a valence different from the default one.  Then click on the atom you want to change.  To modify a bond, select Modify and Bond, then select the desired bond type. If you are not sure about the connectivity or attached protons of an atom, select Ambiguous. Next click on the two associated atoms of the bond to modify, and the bond will be modified.  If Continuous Mode is on, use the middle (or right) mouse button to pause the continuous mode temporarily. 

To delete an atom, select Delete and Atom, then click on the atom to delete.  To delete a bond, select Delete and Bond, then click on the two associated atoms of the bond to delete. 

On the upper left corner of the main graphics window, a molecular formula is displayed to show the elemental composition of the built molecule.  If it is not on, select Display/Display Options/Molecular Formula to turn it on.  After building the target structure, click OK to accept it. 

The elemental composition of the target structure must be identical to that of the unknown structure.  If a target structure built with the molecule builder is accepted, the following dialog box prompts you to save the target structure in an MDL file.  To export the target structure, make sure that the target structure is displayed, then select Structure (MDL) from the Export pull-right in the File menu.  The target structure is then exported into an MDL file xxx000.mdl, where xxx is the root name of the working data set.  

Results: In the MDF, the results are saved in a record starting with the keyword “TSS:”.  Following the keyword, the number of heavy atoms in the target structure is listed.  The second, third and fourth lines are annotations.  In all other lines, the following entries are specified for a heavy atom in the target structure:  ID, element symbol, valence, number of free valences, connectivity (i.e., the number of neighboring heavy atoms, the neighboring atoms and bonds), and the predicted 13C chemical shift ranges.  For details about the prediction of 13C chemical shift, see Section 3.5.

TSS: n_atom = 33

------------------------------ Connection table ---------------------

 #At. Symb. Val. Ambi?  Conn. Neighbors and bonds     Pred. C13 range

----------------------------------------------------------------------

 # 1.   C    4   0    3  (32:1) (31:2) ( 5:1)            151.0 - 187.0

 # 2.   C    4   0    3  ( 9:1) ( 3:2) (25:1)            100.0 - 170.0

 # 3.   C    4   0    1  ( 2:2)                           80.0 - 159.0

 # 4.   C    4   0    3  (22:1) (12:1) (33:1)             42.0 - 109.0

 # 5.   C    4   0    4  (18:1) ( 8:1) (15:1) ( 1:1)      27.0 - 100.0

 # 6.   C    4   0    3  (26:1) (12:1) (16:1)             18.0 -  75.0

 # 7.   C    4   0    3  (11:1) (24:1) (16:1)             18.0 -  75.0

 # 8.   C    4   0    3  ( 9:1) (14:1) ( 5:1)             18.0 -  75.0

      .

      .

Note:  In addition to element composition, the valences of the atoms in the target structure must be identical to those of the atoms in the unknown structure (refer to Section 4.4 for description of the valences of the atoms in the unknown structure).
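If you need to post-process the connection table outside NMR-SAMS, the rows of the “TSS:” record can be read with a short script.  The following is a minimal sketch in Python, assuming the row layout shown in the sample listing above; the names TssAtom and parse_tss_rows are hypothetical and are not part of NMR-SAMS, and the fourth column is simply stored under the name used in the listing header (“Ambi?”).

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# One connection-table row, e.g.
#  "# 1.   C    4   0    3  (32:1) (31:2) ( 5:1)   151.0 - 187.0"
ROW = re.compile(
    r"#\s*(\d+)\.\s+([A-Za-z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*"
    r"((?:\(\s*\d+\s*:\s*\d+\s*\)\s*)*)"
    r"(?:(-?\d+(?:\.\d+)?)\s*-\s*(-?\d+(?:\.\d+)?))?"
)
NEIGHBOR = re.compile(r"\(\s*(\d+)\s*:\s*(\d+)\s*\)")


@dataclass
class TssAtom:
    atom_id: int
    symbol: str
    valence: int
    ambi: int                               # column headed "Ambi?" in the listing
    conn: int                               # number of neighboring heavy atoms
    neighbors: List[Tuple[int, int]]        # (neighbor ID, bond order)
    c13_range: Optional[Tuple[float, float]]  # predicted 13C range in ppm, if listed


def parse_tss_rows(text: str) -> List[TssAtom]:
    """Collect the atom rows; headers, rules and '.' lines are skipped."""
    atoms = []
    for line in text.splitlines():
        m = ROW.search(line)
        if m is None:
            continue
        neighbors = [(int(a), int(b)) for a, b in NEIGHBOR.findall(m.group(6))]
        c13 = (float(m.group(7)), float(m.group(8))) if m.group(7) else None
        atoms.append(TssAtom(int(m.group(1)), m.group(2), int(m.group(3)),
                             int(m.group(4)), int(m.group(5)), neighbors, c13))
    return atoms


SAMPLE = """\
TSS: n_atom = 33
 #At. Symb. Val. Ambi?  Conn. Neighbors and bonds     Pred. C13 range
 # 1.   C    4   0    3  (32:1) (31:2) ( 5:1)            151.0 - 187.0
 # 3.   C    4   0    1  ( 2:2)                           80.0 - 159.0
"""

atoms = parse_tss_rows(SAMPLE)
for atom in atoms:
    # Sanity check: the Conn. column should equal the number of listed neighbors.
    print(atom.atom_id, atom.symbol, atom.conn == len(atom.neighbors), atom.c13_range)
```

Comparing the Conn. column with the number of parsed neighbors is a quick way to spot a row that was truncated or edited by hand.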

8.2.2. Inputting the Target Structure via MDL File

To input the target structure via an MDL file, choose Analysis/Input Target Structure/Import MDL File.  A file browser will be displayed so that you can select the .mdl file.  The target structure will be displayed in the main graphics window.  If there are any previous assignment results in the .mdl file, the following dialog box will be displayed, asking whether to use the ‘analog-based assignment’:

If you select the option Analog-Based Assignment, NMR-SAMS compares the previously assigned 13C chemical shifts of the analog molecule with the 13C chemical shifts of the current molecule, using the matching tolerance (default value: 3.0 ppm), to complete the first level of assignments for the carbon atoms.  Note that only the 13C chemical shift and multiplicity are considered; 1H chemical shifts and 2D connectivities are not.  This function must therefore be used with caution.  You can further edit the tentative assignments using the Analysis/User-Defined Resonance Assignment option.
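To make the idea of tolerance-based matching concrete, here is a minimal sketch in Python.  It is not the NMR-SAMS implementation: the function name, the data layout, the one-letter multiplicity codes and the greedy closest-pair strategy are all assumptions made for illustration.  Only the two criteria (13C chemical shift within the tolerance, identical multiplicity) and the 3.0 ppm default are taken from the description above.

```python
def analog_based_assignment(analog, peaks, tolerance=3.0):
    """Tentatively map analog atoms onto current 13C peaks.

    analog: {atom_id: (shift_ppm, multiplicity)} from the assigned analog
    peaks:  {peak_id: (shift_ppm, multiplicity)} of the current molecule
    Returns {atom_id: peak_id}; atoms without a match are left out.
    """
    candidates = []
    for atom_id, (a_shift, a_mult) in analog.items():
        for peak_id, (p_shift, p_mult) in peaks.items():
            diff = abs(a_shift - p_shift)
            if a_mult == p_mult and diff <= tolerance:
                candidates.append((diff, atom_id, peak_id))

    # Assumed strategy: accept the closest pairs first, one peak per atom.
    candidates.sort()
    assigned, used_peaks = {}, set()
    for diff, atom_id, peak_id in candidates:
        if atom_id in assigned or peak_id in used_peaks:
            continue
        assigned[atom_id] = peak_id
        used_peaks.add(peak_id)
    return assigned


# Hypothetical data: s/d/t/q = carbon with 0/1/2/3 attached protons.
analog = {1: (170.2, "s"), 2: (128.5, "d"), 3: (21.0, "q")}
peaks = {1: (169.0, "s"), 2: (130.1, "d"), 3: (29.5, "q"), 4: (127.9, "d")}
print(analog_based_assignment(analog, peaks))
```

In this example atom 3 stays unassigned because the only methyl peak lies 8.5 ppm away, outside the tolerance.  Atom 2 has two candidate peaks within 3.0 ppm, which illustrates why such tentative assignments should be checked against the 2D connectivities.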

8.2.3. Setting up the Assignment Matrix

Upon entering the target structure, either interactively or via an MDL file, NMR-SAMS internally sets up a matrix summarizing a preliminary assignment of the building blocks to the constituent heavy atoms in the target structure.

A building block (see Section 6.2) is assigned to a constituent atom in the target structure if the element type, valence, attached protons, and δ13C (for carbons) of the former match those of the constituent atom.  If an atom in the target structure does not have any matching building block, then NMR-SAMS points out that the complete assignment is not possible for this target structure.

Any initial assignments (either from the ‘analog-based assignment’, or manually entered using Analysis/User-Defined Resonance Assignment) will be set as fixed assignments in the matrix.

Relevant Parameter: ADD_C13_RNG is used to increase the predicted 13C chemical shift ranges (See Appendix IV).  This parameter is useful when some odd 13C chemical shifts are expected for the proposed structure. 

Results: In the MDF, the results are saved in a record starting with the keyword “AEMX:”.  Following the keyword, the number of heavy atoms in the target structure, as well as that in the unknown molecule, are listed.  The rest of the lines list the elements in the assignment matrix.  An element a[i, j] is 1 if constituent atom i (in target structure) can be assigned to building block j.  Otherwise it is 0.

AEMX: 23 23

# 1. 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 2. 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 3. 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 4. 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 5. 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 6. 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

      .

      .

      .
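The matching rule behind the assignment matrix can be sketched in a few lines of Python.  This is an illustration under stated assumptions, not the NMR-SAMS code: the class and function names are hypothetical, and widening both ends of the predicted range by the same amount is only one plausible reading of how ADD_C13_RNG is applied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Block:                      # building block from the spectral data
    element: str
    valence: int
    n_h: int                      # attached protons
    c13: Optional[float] = None   # observed 13C shift (carbons only)


@dataclass
class TargetAtom:                 # constituent atom of the target structure
    element: str
    valence: int
    n_h: int
    c13_range: Optional[Tuple[float, float]] = None  # predicted range in ppm


def matches(atom: TargetAtom, block: Block, add_c13_rng: float = 0.0) -> bool:
    """Element, valence and attached protons must agree; for carbons the
    observed shift must fall inside the (optionally widened) predicted range."""
    if (atom.element, atom.valence, atom.n_h) != (block.element, block.valence, block.n_h):
        return False
    if atom.element == "C" and atom.c13_range is not None and block.c13 is not None:
        lo, hi = atom.c13_range
        return lo - add_c13_rng <= block.c13 <= hi + add_c13_rng
    return True


def assignment_matrix(atoms, blocks, add_c13_rng=0.0) -> List[List[int]]:
    """a[i][j] is 1 if target atom i can be assigned to building block j."""
    return [[1 if matches(a, b, add_c13_rng) else 0 for b in blocks] for a in atoms]


def unassignable(matrix) -> List[int]:
    """1-based IDs of target atoms with no matching building block."""
    return [i + 1 for i, row in enumerate(matrix) if not any(row)]


# Hypothetical three-atom target and four building blocks.
atoms = [TargetAtom("C", 4, 0, (151.0, 187.0)),
         TargetAtom("C", 4, 2, (18.0, 75.0)),
         TargetAtom("O", 2, 1)]
blocks = [Block("C", 4, 0, 170.3), Block("C", 4, 2, 35.1),
          Block("O", 2, 1), Block("C", 4, 2, 80.0)]

for i, row in enumerate(assignment_matrix(atoms, blocks), start=1):
    print(f"# {i}.", *row)
```

A row consisting only of zeros corresponds to the case described above in which complete assignment is not possible for the target structure; widening the ranges (here with add_c13_rng) is one way to admit an unusual chemical shift.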

8.3  User-Defined Resonance Assignment

Command: Analysis/User-Defined Assignment.

Description:  After importing the target structure, or after performing automated resonance assignment, you can use this command to further edit the current assignments.  In the User-Defined Resonance Assignment palette shown below, the current assignments of the 13C and 1H chemical shifts are listed.  To assign a chemical shift to a carbon, select Add, then click on an unassigned peak (i.e., an entry in which the Assigned Atom # is defined as ‘none’), then click on the carbon atom to which you want to assign it. To remove the assignment of a chemical shift, check Delete, and click the peak in the palette, or click the corresponding atom in the graphics window. Either way, the assignment is removed from the atom.

Upon clicking OK, the current assignment will be used and a new assignment matrix will be set up so that you can perform automatic assignment based on the user-defined ones.

Note: when adding an assignment the program checks only the 13C multiplicity and 13C chemical shifts.  The 2D NMR connectivities are not verified.

8.4  Resonance Assignment

Command: Analysis/Assign Spectra.

Description:  In this step NMR-SAMS assigns the building blocks to the constituent heavy atoms in the target structure.  The assignment process is actually a structure generation process based on the   additional constraints from the assignment matrix.  The results can be either complete assignments or partial assignments.

If complete assignment is not possible, NMR-SAMS tries to generate the largest partial assignments. By comparing the partial structures with the target structure, it may be possible to identify the inconsistency between the target structure and the spectral data.

The resonance assignment starts from a selected atom and, if a complete assignment is obtained during the first attempt, NMR-SAMS stops after searching all the possible mappings.  On the other hand, if a complete assignment is not obtained during the first attempt, NMR-SAMS loops through different starting atoms and repeats the assignment process in order to get the largest possible partial assignments. The Resonance Assignment in Progress dialog box (shown below) displays the number of starting atoms that have been tried.  Since it takes a long time to loop through every atom in the structure as the starting atom, you can use the Stop button to abort the process after a few starting atoms have been tried.
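The search described above can be pictured as a backtracking procedure.  The following Python sketch is a simplified illustration, not the NMR-SAMS algorithm: it assumes a connected target structure, represents the spectral constraints as a plain set of building-block pairs that are allowed to be bonded (a hypothetical stand-in for the real bond constraints), stops at the first complete mapping instead of collecting all of them, and does not model MAX_ERR_BC.  All function names are hypothetical.

```python
from collections import deque


def assign(matrix, bonds, connectable, start=0):
    """Map target atoms onto building blocks, growing outward from `start`.

    matrix:      assignment matrix, matrix[atom][block] is 1 or 0
    bonds:       list of (atom_i, atom_j) pairs of the target structure
    connectable: set of frozenset({block_p, block_q}) allowed to be bonded
    Returns (complete, largest mapping found) with mapping = {atom: block}.
    """
    adj = {i: [] for i in range(len(matrix))}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)

    # Visit atoms breadth-first so each new atom touches an assigned one.
    order, seen, queue = [], {start}, deque([start])
    while queue:
        i = queue.popleft()
        order.append(i)
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)

    best, mapping, used = {}, {}, set()

    def consistent(atom, block):
        # Every bond to an already assigned neighbor must be allowed.
        return all(frozenset((block, mapping[nb])) in connectable
                   for nb in adj[atom] if nb in mapping)

    def extend(k):
        nonlocal best
        if len(mapping) > len(best):
            best = dict(mapping)          # remember the largest partial result
        if k == len(order):
            return True
        atom = order[k]
        for block, flag in enumerate(matrix[atom]):
            if flag and block not in used and consistent(atom, block):
                mapping[atom] = block
                used.add(block)
                if extend(k + 1):
                    return True
                del mapping[atom]
                used.discard(block)
        return False

    return extend(0), best


def largest_partial(matrix, bonds, connectable):
    """Try every atom as the starting atom; keep the largest assignment."""
    best = {}
    for start in range(len(matrix)):
        complete, found = assign(matrix, bonds, connectable, start)
        if complete:
            return True, found
        if len(found) > len(best):
            best = found
    return False, best


# Hypothetical three-atom chain 0-1-2 and its assignment matrix.
matrix = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
bonds = [(0, 1), (1, 2)]
print(largest_partial(matrix, bonds, {frozenset((0, 1)), frozenset((1, 2))}))
print(largest_partial(matrix, bonds, {frozenset((0, 1))}))
```

In the second call no data support a bond to block 2, so the search ends with a two-atom partial assignment.  The atom that could not be reached is the place to look for a conflict between the proposed structure and the spectral data.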

Relevant Parameters:  As the resonance assignment process is very similar to structure generation, most of the parameters relevant to structure generation (See Section Parameters for Structure Generation in Appendix IV) are also relevant to resonance assignment.  The following parameters are exceptions, since the heuristic methods for structure generation are not used during resonance assignment:

GEN_FLAG,

SAT_BC_RATE, and

N_FBX_STEP.

Results: The resulting structures/substructures, representing complete/partial assignments, are saved in the structure file (.str file). In the MDF, the number of partial assignments, and some other information are summarized in a record starting with the keyword “RESULTS:”. 

The results of resonance assignment are displayed as the target structure with assigned chemical shift on it.  Since the resonance assignment is actually a structure generation process, you can also display the generated structures by choosing Display/Generated Structures.  See Section 10.6 for details regarding the display options.

For a partial assignment, the peripheral assigned atoms usually mark the part of the proposed structure that conflicts with the spectral data (Fig. 8.1).  This gives you clues for either improving the proposed structure or correcting errors in the spectral data. After that you can repeat the resonance assignment procedure to assign the improved target structure.

Figure 8.1. Example showing the verification of a proposed structure by resonance assignment.  The displayed partial assignment indicates that it is not possible to assign peaks beyond C6 and C16.  Highlighted in red is the incorrect portion of the target structure, where C27 should have been connected to C16 instead of C13.  By comparing the partial structure with the proposed one, it is easy to identify the parts of the proposed structure that need to be revised.

Possible Errors:

If complete assignment is not obtained for the proposed target structure, it usually implies that the proposed structure is incompatible with the spectral data.  In such cases, NMR-SAMS provides the following suggestions:

·        The stored partial assignments are the largest possible assignments.  You can view these partial assignments using the Browser shown below:

By studying the partial assignments you can probably determine the inconsistencies between the target structure and the spectral data.  Repeat the assignment process after fixing the inconsistencies.

·        Review the suggestions for structure generation (see Section 7.4 for details).  Note that resonance assignment shares most of the parameters used for structure generation, except GEN_FLAG, SAT_BC_RATE and N_FBX_STEP.  After adjusting the parameters, repeat the assignment process.


 

Chapter 9

Isomer Enumeration/Quick Elucidation

9.1 Overview

The current version of NMR-SAMS provides some basic tools for isomer enumeration.  For more advanced features, we recommend another Spectrum Research product called MS-SAMS.  For simple molecules, it is possible to enumerate exhaustively the different isomers based solely on the molecular formula (MF).  This can be done step by step by going through all the menu items under the Analysis menu except those related to NMR data interpretation.  Alternatively, you can do a quick enumeration by selecting Analysis/Quick Enumeration or Elucidation.

Similarly, for quick structure elucidation of a simple molecule with a clean NMR data set, select Analysis/Quick Enumeration or Elucidation.  This streamlines all of the analysis steps which are normally done stepwise in an interactive mode.

9.2 MF-based Isomer Enumeration

To enumerate the constitutional isomers starting from a MF, do the following steps (All of the commands have been described in the previous chapters):

1.      Click File/New to open a new working data set. Type in the MF in the Input Molecular Formula dialog box.

2.      Select Edit/Parameters/Setup ACMX.  Click Default in the Edit Parameters for Setting Up ACMX dialog box to set the default values.  Next, select Enabled for All under Bond Formation Between Heteroatoms.  This will allow heteroatoms to be connectable.

3.      Select Edit/Parameters/2D Structure Generation.  Select Exhaustive under Search Criteria for Structure Generation.  Turn off Exclude Structures with Chemically Unstable Moeties.  Set 0 as Maximum Candidate Structures to Store (i.e., no limitation).  Turn off Store Partially Completed Structures.

4.      Select Analysis/Building Blocks to generate all the possible structural building blocks.   Note that the maximum number of building block sets is 500.

5.      Select Analysis/Bond Constraints to set up the atom-atom connection matrices based on each of the building block sets.

6.      Optionally, select Analysis/User-Defined Bond Constraints to add any known molecular fragments.  For exhaustive isomer enumeration, skip this step.

7.      Optionally, select Analysis/Atom Environment Constraints to add any known environment constraints.  For exhaustive isomer enumeration, skip this step.

8.      Select Analysis/2D Structure Generation to generate structures.

Another simple way to perform exhaustive isomer enumeration consists of the following three steps:

1.      Select File/New to open a new working data set. Type in the MF in the Input Molecular Formula dialog box.

2.      Set parameters as described in steps 2 and 3 above.

3.      Select Analysis/Quick Enumeration or Elucidation.  This will go through all the steps listed above with the default parameters.  Note that in this mode you cannot input any user-defined bond constraints or environment constraints.

Note: Exhaustive isomer enumeration is not practical for large molecules.  A more practical way of isomer enumeration, using fragmentation information from mass spectrometry, is to use MS-SAMS (MS Spectral Analysis Made Simple) of Spectrum Research LLC.
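To see why a single molecular formula can lead to several building block sets in step 4 above, consider how the hydrogens of the MF can be distributed over the heavy atoms.  The Python sketch below is only an illustration under simplifying assumptions that are not taken from NMR-SAMS: fixed valences (C = 4, N = 3, O = 2), a connected molecule, and two simple feasibility checks on the free valences.  It enumerates hydrogen distributions only; it does not generate the isomers themselves.  The function name is hypothetical.

```python
from itertools import combinations_with_replacement, product

# Assumed default valences; other elements would need to be added.
VALENCE = {"C": 4, "N": 3, "O": 2}


def building_block_sets(heavy, n_h):
    """Enumerate ways to distribute n_h hydrogens over the heavy atoms.

    heavy: {element: count}, e.g. {"C": 2, "O": 1}
    Returns a list of {element: (H count per atom, ...)} dictionaries.
    """
    elements = sorted(heavy)
    n_heavy = sum(heavy.values())
    # In a connected molecule with several heavy atoms, every heavy atom
    # must keep at least one valence free for a bond to another heavy atom.
    min_free = 1 if n_heavy > 1 else 0

    per_element = []
    for el in elements:
        max_h = VALENCE[el] - min_free
        # Atoms of one element are interchangeable, so use multisets.
        per_element.append(
            list(combinations_with_replacement(range(max_h, -1, -1), heavy[el])))

    total_valence = sum(VALENCE[el] * heavy[el] for el in elements)
    results = []
    for combo in product(*per_element):
        total_h = sum(sum(c) for c in combo)
        if total_h != n_h:
            continue
        free = total_valence - total_h
        # Free valences pair up into bonds, and a connected skeleton of
        # n_heavy atoms needs at least n_heavy - 1 bonds.
        if free % 2 == 0 and free >= 2 * (n_heavy - 1):
            results.append(dict(zip(elements, combo)))
    return results


for block_set in building_block_sets({"C": 2, "O": 1}, 6):
    print(block_set)
```

For C2H6O the sketch yields two sets: two CH3 groups plus an oxygen without protons, and CH3 + CH2 + OH.  These correspond to dimethyl ether and ethanol.  For larger formulas the number of sets grows quickly, which is consistent with the need for an upper limit on the number of building block sets.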

9.3 Quick Structure Elucidation

If you are dealing with a clean spectral data set of a relatively simple molecule (e.g. < 20 heavy atoms), you can use the following steps to do a quick structure elucidation.

1.      Select File/New to open a new working data set. Depending on whether the NMR Data File is ready or not, you can either choose to start with an existing NMR data file, or start from scratch without NMR data file.  Type in the MF in the Input Molecular Formula dialog box. 

2.      If the NMR data file is not ready, select File/Create NMR Data File to convert the SpecMan peaks table into an NMR-SAMS NMR data file.

3.      Select Edit/Parameters/NMR Interpretation to verify the parameters for NMR data interpretation.  Click Default in the Edit Parameters for NMR Interpretation dialog box to select the default parameters.  If you prefer, you can also change some of the parameters.

4.      Select Edit/Parameters/Setup ACMX to verify the parameters for setting up ACMX.  Click Default in the Edit Parameters for Setting Up ACMX dialog box to select the default parameters.  If you prefer, you can also change some of the parameters.

5.      Select Edit/Parameters/2D Structure Generation to verify the parameters for structure generation.  Click Default in the Edit Parameters for Structure Generation dialog box to select the default parameters.  If you prefer, you can also change some of the parameters.

6.      Select Analysis/Quick Enumeration or Elucidation.  This will go through all of the steps related to spectral analysis and structure generation (see Chapters 6 and 7) using the parameters you set. Note that in this mode you cannot input any user-defined bond constraints or environment constraints.  Also, the non-verbose mode is automatically selected, so most of the information and warning dialog boxes will be suppressed.  You can still check these messages in the log file using Edit/Log File.


Chapter 10

Graphical Display of Results

10.1 Overview

This chapter describes the operations related to the display of some of the intermediate and final results of NMR-SAMS.  The structure-related intermediate and final results of NMR-SAMS can be graphically displayed in the main graphics window.  After each step, the intermediate results, if any, are automatically displayed in the following order of priority:

1.      Candidate structures/substructures (results of structure generation or resonance assignment)

2.      Target structure for resonance assignment

3.      Building blocks. 

If none of these results exist, the main graphics window remains blank. 

To display any of these results, or to change the display features, use the following options in the Display menu:

10.2 Display of Structural Building Blocks

Command: Display/Building Blocks & Fixed Bonds.

Description:  The building blocks are displayed in the main graphics window.  If there are multiple sets of building blocks, a Building Block Browser is displayed.  This permits browsing and selecting building blocks for display.  Each building block is displayed as the heavy atom with attached protons, if any.  A star “*” denotes an atom having unsatisfied valences (free bonds).  By default an atom without free bonds is displayed in gray, and one with free bonds in blue.  For partial structure elucidation, the ignored atoms are displayed in red.  You can customize these colors by modifying the color attributes in the nmrsams.ini file (see Section 2.3). The fixed bonds, if any, are displayed as solid lines or dashed lines.  A dashed line represents a bond having unspecified bond type (i.e., this bond may be single, double, or triple).  For partial structure elucidation, a dummy bond is displayed as a tilde (“~”).

You can click and drag an atom to move it around, or click and drag the bond to move the fragment around. (Moving a fragment is not possible while adding user-defined bond constraints or building a molecule).  To refine the display, select Display/Display Options/Refine.

You can change the displayed features. For example, you can select to display the associated 13C and 1H chemical shift of the building blocks.  The bond constraints, and atom environment constraints, if any, can be listed along with the building blocks in a Connection Table.  See Section 10.6 for descriptions of the display options.

Interaction between building blocks and the connection table: If the Connection Table is not displayed, check Display/Display Options/Connection Table to display it.  By clicking an atom in the displayed building block, the relevant entries (atom connectivity, assigned chemical shifts, bond constraints, and environment constraints, if any) are highlighted in the Connection Table.  By clicking an entry in the Connection Table, the relevant atom(s) in the displayed building blocks are highlighted.

Note: After selecting Analysis/Building Blocks, if there are multiple building block sets, you can click the Delete or Select button in the Building Blocks Browser palette to delete or select the currently displayed building block set (see Section 6.2).  If you set up multiple ACMXs based on multiple building block sets, the multiple building block sets can still be displayed, but you cannot delete or select a particular building block set at this stage (see Section 6.4.6).  To do that, you have to generate the building block sets again (using Analysis/Building Blocks).

10.3 Display of Target Structure

Command: Display/Target Structure & Assignments

Description:  Display the target structure for resonance assignment (see Section 8.2) in the main graphics window.  Chemical shifts will be displayed on the atoms if assignment has been done. If there is more than one possible assignment, an Assignment Browser appears so that you can browse through all the possible assignments.

See Section 10.6 for descriptions of the display options. 

Note:  The numbering of the atoms represents the order in which the atoms are added when building the molecule.  This is different from a generated structure where an atom ID usually corresponds to the ID of its assigned 13C peaks.

10.4 Display of Generated Structures/Assignments

Command: Display/Generated Structures.

Description:  Display the candidate structures/substructures in the main graphics window. If there is more than one structure/substructure, a Structure Browser is displayed so you can choose any one of them to display. In a substructure, a star “*” denotes an atom having unsatisfied valences (free bonds).  By default an atom without free bonds is displayed in gray, and one with free bonds in blue.  For partial structure elucidation, the ignored atoms are displayed in red. You can customize these colors by modifying the color attributes in the nmrsams.ini file (see Section 2.3). For partial structure elucidation, a dummy bond is displayed as a tilde (“~”).

By default, the results of target structure-based resonance assignment are displayed as the target structure with assigned chemical shifts (see Section 10.3).  However, you can also choose this command to display them as generated candidate structures.  In this way a complete/partial assignment is displayed as a complete structure/substructure.  If they are not already shown, the assigned 13C and 1H chemical shifts of each atom can be displayed by selecting Display/Display Options/Chemical Shifts.

Interaction between structure and connection table:  If the Connection Table is not displayed, check Display/Display Options/Connection Table to display it.  By clicking an atom in the displayed structure, the relevant entries (atom connectivity, assigned chemical shifts, bond constraints, and environment constraints, if any) are highlighted in the Connection Table.  By clicking an entry in the connection table, the relevant atom(s) in the displayed structure are highlighted. This is a convenient tool for verifying the structure and constraints.

10.5  Status Window

Command: Select Display/Status Window to turn on or turn off the status window.  The status window can also be closed by clicking the Close button on it.

Description:  The status window displays text messages to indicate the current status of the structure elucidation, and also prompts you with the “what to do next” steps.  Note that these “what to do next” prompts are only for general guidance, and you can repeat any of the already finished steps when necessary.  Certain steps remain grayed out in the menu until the previous one has been done.

10.6 Display Options

Command: Select Display/Display Options, then select one of the toggles or options in the pull-right menu:

Description: NMR-SAMS provides a variety of display options that you can select to toggle on or off, depending on your preferences. The display options are also available as icons on the Tool Bar (see Section 2.6).

Balls: Display circles representing the atoms.  By default, normal atoms are gray, ambiguous atoms are blue, and ignored atoms are red.

Element Symbols: Display element symbols for atoms.  By default, the symbols are yellow. 

Numbers: Display atom numbers.  The connection table refers to these numbers.  By default, they are green.

Chemical Shifts: Display chemical shifts for carbon atoms with assigned 13C chemical shifts.  If protons are displayed, the 1H chemical shifts are displayed in parentheses after the 13C chemical shifts.  Chemical shifts are displayed in the same color as the Numbers.

Molecular Formula: Display the molecular formula of the current displayed structure or substructure. Note that for partial structure elucidation, the displayed molecular formula may contain fewer protons than the actual protons because the attached protons of the ignored atoms are not displayed.

Molecular Weight: Display the molecular weight of the current displayed structure or substructure. Note that for partial structure elucidation, the displayed molecular weight may be lower than the actual value because the attached protons of the ignored atoms are not included.

Show Disconnectivity: Highlight the atoms that cannot be connected to the currently highlighted atom.  This option is effective only after the bond constraints have been generated, i.e., the connectivities between the atoms are known.

Protons: Display attached protons (if any).  The protons that are attached to each atom are displayed as "H" or "H#" where '#' is the number of protons attached to that atom. Protons are displayed in the same color as the Element Symbols.

Connection Table: Display a table listing the atom connectivity information, the bond constraints, and the atom environment constraints if any.

Refine: Moves the current molecule's atoms, attempting to place them in the way deemed most appropriate for the molecule by optimizing its internal geometry.

Tips: Default colors (and other display options) can be changed in the initialization file, nmrsams.ini.  See this file for more information.
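The effect of ignored atoms on the Molecular Formula and Molecular Weight options can be illustrated with a small calculation.  The Python sketch below is an assumption-laden example rather than the NMR-SAMS routine: it uses average atomic weights for a handful of elements, writes the formula with C and H first, and simply leaves out the protons of atoms whose proton count is unknown.  The function name is hypothetical.

```python
from collections import Counter

# Average atomic weights (assumed; the values used by the program may differ).
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def displayed_formula_and_weight(atoms):
    """atoms: list of (element, attached protons or None if unknown).

    Protons of atoms with an unknown count (e.g. ignored atoms in partial
    structure elucidation) are not included in the result.
    """
    counts = Counter()
    for element, n_h in atoms:
        counts[element] += 1
        if n_h is not None:
            counts["H"] += n_h

    # Carbon and hydrogen first, remaining elements alphabetically.
    order = [e for e in ("C", "H") if counts[e]]
    order += sorted(e for e in counts if e not in ("C", "H") and counts[e])
    formula = "".join(f"{e}{counts[e] if counts[e] > 1 else ''}" for e in order)
    weight = sum(ATOMIC_WEIGHT[e] * n for e, n in counts.items())
    return formula, round(weight, 3)


ethanol = [("C", 3), ("C", 2), ("O", 1)]
print(displayed_formula_and_weight(ethanol))

# Same skeleton, but the oxygen is treated as an ignored atom.
partial = [("C", 3), ("C", 2), ("O", None)]
print(displayed_formula_and_weight(partial))
```

With the oxygen treated as an ignored atom, one proton is missing from the formula and the weight drops by the mass of one hydrogen.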

10.7  Editing the Display of Generated Structures

NMR-SAMS also provides an option for editing the displayed structure.  This option is in the Edit menu as shown below:

Command: Edit/Generated Structures 

Description: This option is used to edit the generated structure/substructures by adding, modifying or deleting atoms/bonds. This option is especially useful for partial structure elucidation, where you can manually link the ignored atoms to the dummy bonds so as to complete the full structure.

To edit a structure, first choose Display/Generated Structures to display the candidate structures.  Choose to display the structure you want to modify.  Next choose Edit /Generated Structures.  The following Molecular Editor palette appears:

Usage of the Molecular Editor palette for adding/removing/modifying atoms and bonds is mostly the same as described in Section 8.2.  The minor differences are described as follows:

·        Normally if you used a correct MF, you do not need to add or remove atoms.  To add/remove/modify bonds, it is more convenient to turn off the Continuous Mode toggle. 

·        For partial structure elucidation, the number of attached protons of each ignored atom (displayed in red by default) is treated as unknown. When adding bonds between ignored atoms, the atoms are treated as ambiguous ones (displayed in blue by default) with unknown number of attached protons.  To display a specific number of attached protons for such an atom, choose Modify, Atom, desired Element symbol and Valence.  Turn off Ambiguous.  Next click on an atom with unknown number of attached protons. The number of attached protons for that atom will be calculated and displayed based on its valence and connectivity.

·        Upon completing the modification, click the OK button.  Then immediately choose File/Export/Structure (MDL) to export the modified structure to an MDL file.  See Section 11.4 for details.  If you want to continue editing the structure, choose Edit/Generated Structures again.

Note:  Any changes made to the displayed structure can only be saved in an MDL file, not in the .str file.  The changes will not be retained once you switch to display another structure or something else.


Chapter 11

Exporting Results

11.1 Overview

This chapter describes the options for generating reports of the results of NMR-SAMS, which include the NMR spectral data, the resonance assignment, and the candidate structures.  The exported files can be readily reformatted for presentation and publication.  The relevant options are the pull-right options of File/Export as shown below:

11.2  Exporting NMR Spectral Data

Command: File/Export/Chemical Shift Correlations

Description:  NMR-SAMS provides a tool to export NMR peak lists (in the form of chemical shift correlations).  To create a chemical shift correlation table, select File/Export/Chemical Shift Correlations in the menu. The chemical shift correlations are written to a file xxx.spc (xxx is the root name of the current working data set).

 

Sample Output:

***** NMR Spectral Data of Q-2-demo, Created by NMR-SAMS V2.0 *****

 

---------------------------------------------------------------------------------------

#H-1  Shift  Multi. Integral          COSY                      NOESY

---------------------------------------------------------------------------------------

  1.    4.930    s    5.3e-002   4.755(w) 1.778(w)           4.755(s) 3.509(s) 1.778(s)

  2.    4.755    s    5.2e-002   1.778(w)                    1.778(s)

  3.    3.509    u    1.7e-002   2.235* 1.752 1.513          2.235*(s)

       .

       .

       .

 

---------------------------------------------------------------------------------------

#C-13 Shift  Multi. Integral   HMQC                      HMBC

---------------------------------------------------------------------------------------

  1.  178.822   s    2.7e-002                     2.611 2.235* 1.752 1.565 1.544*

  2.  151.323   s    3.3e-002                     4.930 4.755 3.509 2.232 1.778 1.513

  3.  109.931   t    2.4e-002    4.930  4.755     3.509 1.778

       .

       .

       .

Note:

Long-range coupled COSY peaks are marked by “(w)”.  NOESY peaks are marked by “(s)”, “(m)”, and “(w)” for strong, medium and weak, respectively.

For an H-H or C-H correlation that involves ambiguous correlated 1D peaks as follows:

a1  a2 ... an - b1  b2  ...  bm ,

which stands for all of the combinations

a1 - b1, a1 - b2, …, an - bm,

the combinations are represented as n lines, ai - b1 (1 ≤ i ≤ n), in which b1 is marked by a “*” to represent b1, b2, … or bm:

a1 - b1* 

a2 - b1*

.

.

.

an - b1*.
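The notation above can be illustrated with a short Python sketch. It is only an illustration, not part of NMR-SAMS: the function names are hypothetical, peaks are represented by their chemical shifts as plain strings, and the sketch assumes that the “*” is added only when more than one b peak is possible. The shifts are taken from the sample outputs in this chapter.

```python
from itertools import product


def expand_ambiguous(a_peaks, b_peaks):
    """Return every (a_i, b_j) pair implied by an ambiguous correlation."""
    return list(product(a_peaks, b_peaks))


def exported_lines(a_peaks, b_peaks):
    """Mimic the exported form: one line per a_i, showing only b_1.

    Assumption: the "*" is added only when more than one b peak is possible.
    """
    star = "*" if len(b_peaks) > 1 else ""
    return [f"{a} - {b_peaks[0]}{star}" for a in a_peaks]


# The 3.509 ppm proton correlates with 2.235 or 2.232 ppm (too close to resolve).
pairs = expand_ambiguous(["3.509"], ["2.235", "2.232"])
lines = exported_lines(["3.509"], ["2.235", "2.232"])
print(pairs)
print(lines)
```

Here the single exported line stands for both pairs returned by expand_ambiguous.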

 

11.3  Exporting Resonance Assignment

Command:  File/Export/Assignment

Description: NMR-SAMS provides a tool to export the resonance assignment of a candidate structure. To do this, first display the desired candidate structure/substructure. Next select File/Export/Assignment. The resonance assignments of the displayed candidate structure or substructure are written to a text file “xxx00n.rst”, where xxx is the root name of the current working data set and n is the sequential number of the substructure. This file contains the 13C and 1H assignments of all atoms in the molecule.  If NOESY peaks are available, the assignment of the NOESY peaks is included, along with the distance constraints and the actual bond separation between the relevant protons.  This information helps to resolve ambiguous NOE peaks and to identify through-space NOE connectivities.

Relevant Parameters:

NOESY_DIST: (default values: 1.90  5.00   1.90 3.00   1.90 2.50) The six values of this parameter are used to define the minimum and maximum distance bounds between the correlated protons, when the NOESY peak intensity level is weak, medium, or strong respectively.
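The way the six values of NOESY_DIST are read can be sketched in Python as follows. This is a minimal illustration under the ordering stated above (weak, medium, strong, each with a minimum and a maximum); the function name noe_bounds and the use of the letters w, m and s as keys are assumptions made for the example.

```python
# Default values of NOESY_DIST as listed in this manual (Angstroms).
NOESY_DIST = [1.90, 5.00, 1.90, 3.00, 1.90, 2.50]


def noe_bounds(level, noesy_dist=NOESY_DIST):
    """Return (minimum, maximum) H-H distance for level 'w', 'm' or 's'."""
    offset = {"w": 0, "m": 2, "s": 4}[level]  # position of the pair in the list
    return noesy_dist[offset], noesy_dist[offset + 1]


print(noe_bounds("s"))  # bounds applied to a strong NOESY peak
```

With the default values, a strong peak receives the bounds 1.9 to 2.5, which is the distance constraint shown for the peaks marked “s” in the sample output below.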

Sample Output:

***** Resonance Assignment by NMR-SAMS V2.0*****

 

STRUCTURE #1(Unique #1, generated from ACMX #1 at CPU: 0.21s.)

 

--------------- Assignments of C-13 and H-1 resonances: --------------

 #Atom           Assigned C-13            Assigned H-1

----------------------------------------------------------------------

  C-1            178.822( 1)        

  C-2            151.323( 2)        

  C-3            109.931( 3)            4.755( 2)   4.930( 1)

  C-4             78.147( 4)            3.435( 4)

  C-5             56.647( 5)        

  C-6             55.956( 6)            0.811(33)

       .

       .

       .

-------------------------Assignment of NOE Cross Peaks--------------------------------

 #NOE #H1(ppm,#C13) - #H1(ppm,#C13)  Intensity   Distance constraint  Bond separation

--------------------------------------------------------------------------------------

 1    1(4.930, 3)  -   2(4.755, 3)    0.000 s       1.9  -  2.5         2

 2    1(4.930, 3)  -   3(3.509, 9)    0.000 s       1.9  -  2.5         4

 3    1(4.930, 3)  -  12(1.778,25)    0.000 s       1.9  -  2.5         4

 4    2(4.755, 3)  -  12(1.778,25)    0.000 s       1.9  -  2.5         4

 5*   3(3.509, 9)  -   7(2.235,15)    0.000 s       1.9  -  2.5         4

 5*   3(3.509, 9)  -   8(2.232,19)    0.000 s       1.9  -  2.5         3

      .

      .

      .

Note:

If a NOESY peak involves ambiguous correlation of 1H peaks, all the relevant proton pairs are listed and marked by a “*”.  Such ambiguity can only be resolved by using molecular modeling methods.

11.4  Exporting Candidate or Target Structures

Command:  File/Export/Structure (MDL)

Description: NMR-SAMS provides a tool to export 2D structures to third-party molecular drawing programs such as ChemDraw, ChemSketch, etc. The structure is exported with coordinates in MDL format. To export a structure, select File/Export/Structure (MDL); the currently displayed structure is exported to a file “xxx00n.mdl”, where xxx is the root name of the current working data set and n is the sequential number of the structure (n = 0 when the target structure without assignment result is exported).

If a target structure is displayed with chemical shift assignments, the resonance assignments are also listed at the end of the .mdl file.  This is useful for ‘analog-based assignments’.  See Section 8.2.2 for details.

 


Appendix I

NMR Data File

NMR-SAMS accepts NMR spectral data in the form of an ASCII file, with a novel flexible format designed to cope with practical problems that commonly exist in real-world spectral data. The data file can either be prepared manually using spectral information from other third party vendors, or  automatically by converting the peak tables generated by SpecMan with the conversion procedures described in Sections 5.2-5.8.

1D Spectral Data

In the spectral data file, the 1D peaks are listed first. The keywords “H1:” and “C13:” designate the start of the entries of 1H and 13C spectral data, respectively. Following the keyword, each line specifies the data of a peak, and the section ends with an empty line. The following is a transcript of a sample 1H peak list converted from a SpecMan peaks table:

H1: /usr/people/peng/NMR-SAMS/ndat/Q-2-test/h1.pks

 #1. 4.930 s 5.331e-02 ;1

 #2. 4.755 s 5.185e-02 ;2

 #3. 3.509 u 1.656e-02 ;3

    .

    .

    .

The first line, which begins with the keyword “H1:”, indicates the start of the 1H peak list. After the keyword “H1:” and a blank space, a comment of up to 80 characters may be added. The entries in the remaining lines represent the peak ID, chemical shift (in ppm), multiplicity, intensity (optional), and comments (optional) for each 1H peak, respectively. If the peaks are converted from a SpecMan peaks table, the comment contains the ID of the original peak in the SpecMan peaks table (the two IDs may differ). One or more spaces are used as the delimiter for all items except comments, which are introduced by “;”.

The peak IDs are frequently used in other places to refer to these 1D peaks. The multiplicity of a 1D peak is represented as u, s, d, t, q, or m for unknown, singlet, doublet, triplet, quartet, or other general multiplets, respectively.  Detailed description of the use of 1D peak multiplicity information can be found in the usage of parameter H1_MULT_FLAG (Appendix IV).  The peak intensity and comments are optional.  The comments are useful to keep track of the original peaks in 1D spectrum.
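As an illustration of the 1D peak entry format described above, the following Python sketch reads one entry. It is a hypothetical helper written for this guide, not code from NMR-SAMS, and it assumes that the multiplicity is always present while the intensity and the comment may be missing.

```python
def parse_1d_peak(line):
    """Parse one 1D peak entry, e.g. ' #1. 4.930 s 5.331e-02 ;1'."""
    body, _, comment = line.partition(";")       # comments follow ";"
    fields = body.split()                        # items are space-delimited
    peak_id = int(fields[0].lstrip("#").rstrip("."))
    shift = float(fields[1])                     # chemical shift in ppm
    mult = fields[2]
    if mult not in ("u", "s", "d", "t", "q", "m"):
        raise ValueError(f"unknown multiplicity: {mult!r}")
    intensity = float(fields[3]) if len(fields) > 3 else None
    return {
        "id": peak_id,
        "shift": shift,
        "mult": mult,
        "intensity": intensity,
        "comment": comment.strip() or None,
    }


peak = parse_1d_peak(" #3. 3.509 u 1.656e-02 ;3")
print(peak)
```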

 

2D Spectral Data

Instead of cross peak coordinates, NMR-SAMS uses the 2D NMR data in the form of correlations between 1D peaks, which are referred to as connectivities in this manual. In the data file, the keywords “COSY:”, “HMQC:”, “HMBC:”, “NOESY:” and “INAD:” designate the start of the entries of DQF-COSY, HMQC/HETCOR, HMBC/COLOC, NOESY, and 2D INADEQUATE connectivities, respectively.  Following the keyword, each line specifies the data of a connectivity, and the section ends with a blank line.  Within a connectivity line, one or more spaces are used as the delimiter for all items except comments, which are introduced by “;”.   The following is a transcript of a sample DQF-COSY connectivity list:

COSY: DQF-COSY data of Q-2

 #1. (1 - 2)   1  0.0  0.84 ;1+4

 #2. (1 - 12)  1  0.0  0.84 ;2+31

 #3. (2 - 12)  1  0.0  0.84 ;3+32

 #4. (3 - 7 8) 3  0.0  0.84 ;6+18

 #5. (3 - 13)  3  0.0  0.84 ;7+33

 #6. (3 - 18)  3  0.0  0.84 ;5+49

    .

    .

    .

The first line, which begins with the keyword “COSY:”, indicates the start of the COSY connectivity list. After the keyword and a blank space, a comment of up to 80 characters may be added. The entries in the remaining lines represent, for each COSY connectivity: the connectivity ID; the IDs of the correlated 1D 1H peaks, shown in parentheses; the peak intensity level (strong, medium, weak, or unknown, denoted as 3, 2, 1 and 0, respectively); the J-coupling constant (optional, 0.0 for unknown); the reliability (optional, the probability that the peak is a real peak); and comments (optional, with a maximum size of 80 characters).

Again, the ID of a connectivity is used in other places to refer to this connectivity (such as in the bond constraints, see Section 3.4).  For each connectivity converted from a SpecMan peaks table, the comment contains the ID(s) of the cross peak(s) from which the connectivity is derived, which offers a way to keep track of them (see Fig. 6.4 in Chapter 6).

For ambiguous connectivities the IDs of all possible 1D peaks are listed as correlated nodes. The intensity level is used only for DQF-COSY (and NOESY if it is used). The intensity level of a short-range coupling DQF-COSY peak must be assigned 3 (strong) or 2 (medium), while that of a possible long-range coupling peak must be assigned 1 (weak).  The J-coupling constant entry is used only for DQF-COSY.  See Sections 5.4 and 6.4.1 for details regarding usage of this information.

Items marked as optional can be omitted unless an item that follows them is included.  In that case, you must supply default values for the skipped items even if they are not used.  Comments can always be included as long as they follow a “;”. The following examples show some valid representations of connectivities:

#2 (1 - 2)                                         A strong peak between spin 1 and 2

#3 (1 - 10 11) ;10 and 11 too close to resolve     An ambiguous peak

#4 (8 - 10) 1 0.0 0.4                        An unreliable weak peak

Note: The same keywords and formats are used for both 13C- and 1H-detected 2D heteronuclear spectra (e.g. HMQC and HETCOR). For example, keyword “HMQC:” is used also for HETCOR data and the associated 13C peak(s) always appear before 1H peaks in the representation of a connectivity.  Refer to the examples shown in Sections 5.5 and 5.6.
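The connectivity entry format described in this appendix can be summarized with a small Python sketch. It is a hypothetical reader written for illustration only. It assumes that the period after the ID is optional (both forms appear in the examples), it returns None for omitted optional items rather than guessing the defaults used by NMR-SAMS, and it expects the entry itself without the explanatory text printed to the right of the examples above.

```python
import re

# "#<id>" with an optional ".", the correlated peaks in parentheses, then the rest.
CONN_RE = re.compile(r"#(\d+)\.?\s*\(([^)]*)\)\s*(.*)")


def parse_connectivity(line):
    """Parse one connectivity entry such as '#4. (3 - 7 8) 3 0.0 0.84 ;6+18'."""
    body, _, comment = line.partition(";")
    match = CONN_RE.match(body.strip())
    if match is None:
        raise ValueError(f"not a connectivity entry: {line!r}")
    left, dash, right = match.group(2).partition("-")
    if not dash:
        raise ValueError(f"missing '-' between correlated peaks: {line!r}")
    extras = match.group(3).split()
    return {
        "id": int(match.group(1)),
        "left": [int(x) for x in left.split()],
        "right": [int(x) for x in right.split()],   # several IDs = ambiguous
        "level": int(extras[0]) if len(extras) > 0 else None,
        "j": float(extras[1]) if len(extras) > 1 else None,
        "reliability": float(extras[2]) if len(extras) > 2 else None,
        "comment": comment.strip() or None,
    }


conn = parse_connectivity(" #4. (3 - 7 8) 3  0.0  0.84 ;6+18")
print(conn)
```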


Appendix II

Master Data File

The master data file (MDF) stores all of the intermediate and final results, except the connection tables of the candidate structures (which are saved in the structure file).  The results are saved as records, each of which starts with a keyword, such as “ATOMS:”, and ends with a blank line or the end of the file.  The intermediate results of one analysis step are used as the input for the next dependent step(s).  NMR-SAMS saves only one copy of each record in the MDF, so if a certain analysis step is repeated, the relevant records, as well as those produced by the dependent steps, if any, are overwritten.  For example, the command Analysis/Bond Constraints uses the results of Analysis/Building Blocks. If you repeat the latter step after completing Analysis/Bond Constraints, a message warns you that the previous results, as well as the dependent ones, will be overwritten.

The MDF can be inspected and edited by selecting Edit/Master Data File.  By default, the “vi” editor is used, though you can choose to use “jot” or another editor (by modifying the choice of Editor defined in the nmrsams.ini file).  By changing some of the intermediate results, you can control the flow of the structure elucidation process.  Note that the keywords and the formats must not be modified; otherwise the program will not be able to find the record or read the data properly.  Moreover, once you modify a certain record, you must repeat the dependent analysis steps (if they have been done before) so that the modified data is used.  Table A3.1 lists the data records that are produced in each of the analysis steps. The steps (or commands) are arranged in the general order in which they are used for structure elucidation. For each record, the table indicates whether it can be modified.
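The record layout described above (a keyword at the start of a record, a blank line or the end of the file at its end) can be sketched in Python as follows. This is an illustrative reader only: the manual does not specify the content of the record bodies, so the sample text is made up, and the assumption that the keyword is the first whitespace-delimited token of the record is ours.

```python
def split_records(text):
    """Split MDF-like text into {keyword: [lines]}.

    Assumption: the keyword is the first token of the first line of a record.
    A repeated keyword replaces the earlier record, since only one copy of
    each record is kept.
    """
    records = {}
    current = None
    for line in text.splitlines():
        if not line.strip():            # a blank line ends the current record
            current = None
        elif current is None:           # first line of a new record
            keyword, _, rest = line.strip().partition(" ")
            current = records[keyword] = []
            if rest.strip():
                current.append(rest.strip())
        else:
            current.append(line.rstrip())
    return records


# Made-up placeholder content; real record bodies are not documented here.
sample = "MF: placeholder\n\nFRAG_SET:\nrow one\nrow two\n"
records = split_records(sample)
print(records)
```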

Table A3.1. Data Records in the Master Data File of NMR-SAMS


Command                                 Keywords                Content of the record                                                                 Modify?
--------------------------------------  ----------------------  ------------------------------------------------------------------------------------  -------
                                        MF:                     Input molecular formula of the unknown                                                No
                                        ATOMS:                  Elemental composition of the unknown and some properties of the atoms                 No
                                        1DH1:                   Results of the analysis of the 1D 1H NMR spectrum                                     No
Analysis/Building Blocks                1DC13:                  Results of the analysis of the 1D 13C NMR spectrum                                    No
                                        SYMMETRY:               Whether the unknown is symmetric, or whether to pursue partial structure elucidation  No
                                        HMQC:                   The C-H BCs derived from HMQC correlations                                            No
                                        FRAG_SET:               The building blocks for structure generation                                          Yes
                                        COSY:                   The H-H BCs derived from COSY correlations                                            No
Analysis/Bond Constraints               HMBC:                   The C-H BCs derived from HMBC correlations                                            No
                                        INADEQUATE:             The C-C BCs derived from INADEQUATE correlations                                      No
                                        C13~~C13:               The unified set of C-C BCs                                                            Yes
                                        ACMX:#x: (x = 1, 2, …)  Atom-atom connection matrix (matrices)                                                Yes
Analysis/User-Defined Bond Constraints  ATOM~~ATOM:             User-supplied BCs                                                                     No
Analysis/Atom Environment Constraints   ENVIRONMENT:            User-supplied environment constraints                                                 No
Analysis/2D Structure Generation        RESULTS:                Summary of the results of structure generation                                        No
or Analysis/Assign Spectra              UNRECOG_CCSS:           The undefined substructures (CCSS) encountered during the structure generation        No
Analysis/Input Target Structure         TSS:                    Connection table of the target structure for resonance assignment                     No
                                        AEMX:                   The assignment matrix for resonance assignment                                        Yes

 


Appendix III

CCSS-13C Chemical Shift Range Correlation Table

This file, chemical_shifts.def, serves as the knowledge base of NMR-SAMS for 13C chemical shift prediction (see Section 3.5).  It stores the 13C NMR chemical shift dispersion ranges of some common carbon-centered single-spherical substructures (CCSS).  Several rare CCSSs (whose chemical shift ranges cannot be found in the references) are assigned a range of -99 to -999, which, in effect, prohibits the formation of such CCSSs in structure generation. You can modify and expand the knowledge base by adding entries in the same format.

Format: A line starting with a "!" in the first column is taken as a comment. A CCSS is coded as the focus (always C) followed by the neighboring atoms and the bonds (single: default; double: =; triple: #). Aromatic bonds are decomposed into alternating single and double bonds. The order of the neighboring atoms is of no consequence.  Following the code of each CCSS are the lower and upper limits of the 13C chemical shift dispersion of the focus carbon.

References:

1. Pretsch, Ernö et al., Tabellen zur Strukturaufklärung organischer Verbindungen mit spektroskopischen Methoden, 2nd ed., Berlin, Springer-Verlag, 1981

2. Bremser, W., Chemical Shift Ranges in Carbon-13 NMR Spectroscopy, Weinheim, Verlag Chemie, 1982

 

C(=S)(N)(N)       165   185

C(C)              0     32

C(S)             6     20

C(C)(C)           10    70

C(S)(C)           16    60

C(C)(C)(C)        18    75

C(S)(C)(C)              22    73

C(N)              27    46

C(C)(C)(C)(C)     23    100

C(N)(C)           35    90

C(S)(C)(C)(C)     35    90

C(Cl)(C)          37    56

C(N)(C)(C)              40    90

C(O)              49    62

C(N)(C)(C)(C)     50    99

C(O)(C)           46    109

C(O)(C)(C)              42    109

C(O)(C)(C)(C)     52    110

C(Cl)(C)(C)(C)    65    110

C(O)(O)(C)(C)     86    120  

C(O)(O)(C)              86    110

C(O)(O)          86    110

C(O)(O)(O)        107   118

C(O)(O)(O)(C)     77    125

C(O)(N)(C)(C)     71    114

C(O)(N)           60    89

C(N)(O)(C)        70    111

C(N)(N)(C)        41    99

C(=C)             80    159

C(=C)(C)          80    160

C(=C)(Cl)(C)            90    160

C(=C)(O)(O)       141   176

C(=C)(O)(C)       90    161

C(=C)(N)(C)       90    160

C(=C)(C)(C)       100   170  

C(=C)(N)          120   170

C(=C)(N)(C)       120   170

C(=C)(O)          115   189

C(=C)(=C)         118   220

C(=C)(N)(N)       121   180

C(=C)(O)(N)       -99   -999

C(=O)(O)(C)       151   187

C(=O)(N)(C)       158   185

C(=O)(=N)         120   131

C(=O)(Cl)(C)            158   180

C(=O)(C)          185   204

C(=O)(C)(C)       164   226

C(=O)(C)          197   204

C(=O)(N)(N)       150   163

C(=O)(O)          158   167

C(=O)(N)          160   183

C(=O)(O)(O)       150   160

C(=O)(=C)         200   206

C(=S)(O)(C)       188   211

C(=S)(N)(C)       188   211

C(=S)(C)(C)       219   240

C(=S)(N)(N)       165   185

C(=N)(=S)         120   140

C(=N)(O)          151   156

C(=N)(C)(C)       144   170

C(=N)(C)          144   170

C(=N)                   127   156

C(=N)(=C)         -99   -999

C(=N)(O)(C)       149   195

C(#C)             20    100

C(#C)(C)          20    100

C(#C)(O)          88    89

C(#C)(N)          79    84

C(#C)(S)          71    72

C(#C)(P)          71    107

C(#N)(S)          110   120

C(#N)(C)          115   125

C(#N)(O)          107   110

C(N)(N)(C)(C)     56    107

C(S)(N)(C)(C)     85    100

C(=S)(=C)         230   270

C(=N)(S)(C)       155   170

C(=C)(S)(N)       125   182

C(F)(F)(F)(C)     104   129

C(F)(F)(F)(N)     116   122

C(F)(F)(F)(O)     118   121

C(F)(F)(C)(C)     88    135

C(O)(N)(N)(C)     83    121

C(O)(O)(N)(C)     102   134

C(F)(F)(O)(C)     114   120

C(O)(O)(N)        101   131

C(O)(N)(N)        105   106

C(O)(O)(O)(O)     115   136

C(Cl)(N)(C)(C)    73    97

C(Cl)(O)(C)(C)    72    107

C(=C)(Cl)(C)      87    167

C(Cl)(C)(C)       45    92

C(Cl)(N)(C)       62    93

C(Cl)(O)(C)       74    97
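The format of the entries listed above can be illustrated with a short Python sketch. It is not the code used by NMR-SAMS: the function names are hypothetical, the additional tolerance controlled by ADD_C13_RNG is not modeled, and a CCSS that does not appear in the table is simply reported as None because the manual does not describe a lookup procedure here. Where a CCSS code is listed more than once, this sketch keeps the last range read. Sorting the neighbors reflects the statement that their order is of no consequence.

```python
import re


def canonical(code):
    """Normalize a CCSS code so that the order of the neighbors does not matter."""
    neighbors = re.findall(r"\(([^)]*)\)", code)
    return code[0] + "".join(f"({n})" for n in sorted(neighbors))


def load_ranges(text):
    """Read 'CODE lower upper' entries; lines starting with '!' are comments."""
    table = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("!"):
            continue
        code, low, high = line.split()
        table[canonical(code)] = (float(low), float(high))
    return table


def shift_allowed(table, code, shift):
    """True/False if the CCSS is listed, None if it is not in the table."""
    limits = table.get(canonical(code))
    if limits is None:
        return None
    low, high = limits
    return low <= shift <= high  # never true for the range -99 to -999


SAMPLE = "\n".join([
    "! two entries copied from the listing above",
    "C(O)(C)(C)        42    109",
    "C(=C)(O)(N)       -99   -999",
])
table = load_ranges(SAMPLE)
print(shift_allowed(table, "C(C)(O)(C)", 78.147))
print(shift_allowed(table, "C(=C)(N)(O)", 150.0))
```

Because the lower limit of a prohibited entry is greater than its upper limit, no chemical shift can satisfy it, which is how such an entry excludes the CCSS.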


Appendix IV

Control Parameters

All control parameters of NMR-SAMS can be accessed through the pull-right options of Edit/Parameters.  Editing the parameter file (.par file) directly is not recommended. Choose Edit/Parameters/NMR Interpretation to open the Edit Parameters for NMR Interpretation dialog box, Edit/Parameters/Setup ACMX to open the Edit Parameters for Setting up ACMX dialog box, or Edit/Parameters/2D Structure Generation to open the Edit Parameters for 2D Structure Generation dialog box.

This appendix explains the usage of the control parameters of NMR-SAMS. The parameters are listed in the following three groups:

1.      Parameters for spectral interpretation.  The names and titles of these parameters are listed in Table A4.1. The actual operations related to spectral interpretation are described in Section 6.4.

2.      Parameters for setting up ACMX. The names and titles of these parameters are listed in Table A4.2. The actual operations related to setting up ACMX are described in Section 6.4.

3.      Parameters for structure generation.  The names and titles of these parameters are listed in Table A4.3. The actual operations related to structure generation are described in Section 7.4.

The usage of these parameters is described in the following sections. In each section, the parameters are arranged in the order in which they appear in the dialog boxes.  Both their titles in the dialog box and their names (used in the parameter file, as well as in this manual) are used as their identifiers.

The default value of each parameter is listed.  Whenever a new working data set is opened, the default values are assigned to all parameters. You can also assign the default values to any of the groups of parameters by clicking the Default button in its dialog box.


Table A.4.1 Parameters for Spectral Interpretation

Parameter Name    Title in Edit Parameters for Spectral Interpretation Dialog Box
----------------  ------------------------------------------------------------------------
COSY_J_CATEG      J(HH) Cutoff for Long-range Coupling COSY peaks
COSY_BC[4]        H-H Bond Separation for a Long-range COSY-type Peak, Minimum and Maximum
                  H-H Bond Separation for a Short-range COSY-type Peak, Minimum and Maximum
COSY_DIAG_RESO    Tolerance for Near-diagonal COSY Peak Checking
MIN_MB_H1         Minimum 1H Chemical Shift for Checking Long-range H-H Coupling
HMBC_BC[6]        C-H Bond Separation for a Weak HMBC-Type Peak, Minimum and Maximum
                  C-H Bond Separation for a Medium HMBC-Type Peak, Minimum and Maximum
                  C-H Bond Separation for a Strong HMBC-Type Peak, Minimum and Maximum
INAD_BC[3]        C-C Bond Separation for an INADEQUATE Peak, Minimum and Maximum
                  Type of INADEQUATE-derived C-C Bond
RELIAB_PEAK_PROB  Minimum Probability for All Reliable Cross Peaks
NOESY_DIST[6]     H-H Distance for a Weak NOESY-type Peak, Minimum and Maximum
                  H-H Distance for a Medium NOESY-type Peak, Minimum and Maximum
                  H-H Distance for a Strong NOESY-type Peak, Minimum and Maximum
PRO_LEVEL         Verbose Mode

 

 

Table A.4.2 Parameters for Setting up ACMX

Parameter Name    Title in Edit Parameters for Setting up ACMX Dialog Box
----------------  --------------------------------------------------------
IDEAL_COSY        Use of COSY Negative Information
H1_MULT_FLAG      Use of 1H Multiplicities to Suppress Inappropriate Bonds
FIX_BOND_FLAG     Extract Unambiguous 1-Bond Constraints as Fixed Bonds
HETCON_FLAG       Bond Formation Between Heteroatoms
CCBOND_FLAG       Allowed Carbon-Carbon Bond Types

 

Table A.4.3 Parameters for Structure Generation

Parameter Name    Title in Edit Parameters for Structure Generation Dialog Box
----------------  ----------------------------------------------------------------------
GEN_FLAG          Search Criteria for Structure Generation
SAT_BC_RATE[3]    Rate of Bond Constraint Satisfaction, Starting, Ending and Step Values
N_FBX_STEP        Average Number of Possibility for Each C-C Bond Formation
MAX_ERR_BC        Maximum Limit for Bond Constraint Violation
MIN_RING_SIZE     Ring Size, Minimum
MAX_RING_SIZE     Ring Size, Maximum
ADD_C13_RNG       Addition Tolerance for Using C-13 Chemical Shifts
MIN_MB_C13        Minimum C-13 Shift for Multi-Bond Carbon
BAD_SS_FLAG       Exclude Structures with Chemically Unstable Moieties
MAX_REC_STR       Maximum Candidate Structures to Store
REC_SS_FLAG       Store the Partially Completed Structures
DISP_CMPLT_DELAY  Interval for Updating Structure Generation Dialog Box

 

Parameters for Spectral Interpretation

J(HH) Cutoff for Long-range Coupling COSY Peaks:

COSY_J_CATEG:     3.0  

The value of this parameter is used by NMR-SAMS to automatically classify the intensity level of a DQF-COSY peak based on its J-coupling constant when the intensity level of that peak is unknown (i.e., equal to 0).  If a COSY peak has a J-coupling constant less than or equal to the value of this parameter, it is classified as a long-range coupled (or weak) peak.  Otherwise, it is classified as a short-range coupled (or strong) peak.

H-H Bond Separation for a Long-range COSY-type Peak, Minimum and Maximum:

H-H Bond Separation for a Short-range COSY-type Peak, Minimum and Maximum:

COSY_BC:    4 5  2 3

The values of this parameter are used by NMR-SAMS during the interpretation of COSY peaks as bond constraints.  If a peak is classified as long-range coupled, NMR-SAMS requires the number of intervening bonds in the structure to be greater than or equal to COSY_BC[1] and less than or equal to COSY_BC[2]. If a peak is classified as short-range coupled, NMR-SAMS requires the number of intervening bonds in the structure to be greater than or equal to COSY_BC[3] and less than or equal to COSY_BC[4].
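The interplay of COSY_J_CATEG and COSY_BC can be sketched in Python as follows. This is a minimal illustration using the default values listed in this appendix; the function names are hypothetical. Note that the manual numbers the values from 1 while Python lists start at 0, and that the case in which both the intensity level and the J-coupling constant are unknown is not described in the manual, so the sketch simply applies the J comparison there.

```python
COSY_J_CATEG = 3.0          # default J(HH) cutoff
COSY_BC = [4, 5, 2, 3]      # long-range min/max, short-range min/max


def classify_cosy(level, j):
    """Return 'long' or 'short' for a COSY peak.

    level: 0 unknown, 1 weak, 2 medium, 3 strong (as in the NMR data file).
    j: J-coupling constant, consulted only when the level is unknown.
    """
    if level == 0:
        return "long" if j <= COSY_J_CATEG else "short"
    return "long" if level == 1 else "short"


def separation_ok(kind, n_bonds, cosy_bc=COSY_BC):
    """Check the number of intervening bonds against the COSY_BC range."""
    low, high = cosy_bc[0:2] if kind == "long" else cosy_bc[2:4]
    return low <= n_bonds <= high


kind = classify_cosy(0, 7.2)        # level unknown, J above the cutoff
print(kind, separation_ok(kind, 3))
```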

Tolerance for Near-diagonal COSY Peak Checking:

COSY_DIAG_RESO:         0.02

The value of this parameter is used by NMR-SAMS to discriminate the COSY diagonal peaks from the near diagonal cross peaks.   If a near-diagonal COSY peak is not observed and the 1H chemical shift difference between two protons is less than or equal to this value, you are notified.  You can then allow NMR-SAMS to add a pseudo bond constraint to this peak and prevent losing a correct structure when real peaks are omitted.

Minimum 1H Chemical Shift for Checking Long-range H-H Coupling:

MIN_MB_H1:     0

If the value of this parameter is greater than 0, it is used for checking the presence of long-range coupled COSY peaks.  If a COSY peak is interpreted as a geminal or vicinal coupling, and one of the protons has a 1H chemical shift greater than this value, you are warned of the potential long-range H-H coupling.  If a long-range coupling is not correctly identified, the correct structure could be lost.  Therefore, you are advised to extend the number of intervening bonds in the bond constraint to cover a long-range coupling.

The default value of this parameter is 0, i.e., the checking is turned off.

C-H Bond Separation for a Weak HMBC-Type Peak, Minimum and Maximum:

C-H Bond Separation for a Medium HMBC-Type Peak, Minimum and Maximum:

C-H Bond Separation for a Strong HMBC-Type Peak, Minimum and Maximum:

HMBC_BC:    2 5   2 3   2 3  

The values of this parameter are used for interpreting the HMBC peaks. NMR-SAMS uses these values to determine the number of intervening bonds that each peak represents. 

A weak HMBC peak must result in a carbon-to-proton separation within the specified range (i.e., greater than or equal to HMBC_BC[1] and less than or equal to HMBC_BC[2]).

A medium HMBC peak must result in a carbon-to-proton separation within the specified range (i.e., greater than or equal to HMBC_BC[3] and less than or equal to HMBC_BC[4]).

A strong HMBC peak must result in a carbon-to-proton separation within the specified range (i.e., greater than or equal to HMBC_BC[5] and less than or equal to HMBC_BC[6]).

C-C Bond Separation for an INADEQUATE Peak, Minimum and Maximum:

Type of INADEQUATE-derived C-C Bond (the last value):

INAD_BC:    1 1 0

The values of this parameter are used for interpreting an INADEQUATE peak.   NMR-SAMS uses the first two values to determine the number of intervening bonds that each peak represents.  Each INADEQUATE peak must result in a carbon to carbon separation within the specified range (i.e. greater than or equal to INAD_BC[1] and less than or equal to INAD_BC[2]).

The last value is used to determine the type of bond that the peak represents.  For the "Unspecified" type (i.e., INAD_BC[3] = 0), NMR-SAMS lets the bonds be of any type (e.g. single, double or triple).  For the other types, NMR-SAMS forces the bonds to be of the specified type.

 

Minimum Probability for All Reliable Cross Peaks:

RELIAB_PEAK_PROB: 0.50

The value of this parameter is used for interpreting COSY, HMBC, and INADEQUATE peaks as bond constraints.   It is used as the minimum probability for a reliable peak.  A peak with a probability greater than or equal to this value is taken as a reliable one, otherwise unreliable.

 

H-H Distance for a Weak NOESY-type Peak, Minimum and Maximum:

H-H Distance for a Medium NOESY-type Peak, Minimum and Maximum:

H-H Distance for a Strong NOESY-type Peak, Minimum and Maximum:

NOESY_DIST:       1.90 5.00   1.90 3.00   1.90 2.50

The values of this parameter are used for exporting the NOESY peaks. When exporting the resonance assignment results, NMR-SAMS uses these values to derive the proton-proton distance bounds (H-H geometric distance) from the NOE intensity levels. The values represent the minimum and maximum H-H distances in Angstroms for weak, medium, and strong NOE peaks, respectively.

Verbose Mode:

PRO_LEVEL: 0

If verbose mode is on (i.e., PRO_LEVEL = 0), NMR-SAMS shows more messages.  This is useful for users who are just beginning to use NMR-SAMS, or for users who want to be notified of any unusual occurrences.  If verbose mode is off (i.e., PRO_LEVEL = 1), NMR-SAMS notifies you only when an error occurs or user input is required.  This mode is useful for advanced users. Note that this parameter does not affect the messages stored in the log file.

 

Parameters for Setting up ACMX

Use of COSY Negative Information: 

IDEAL_COSY:   1

If the first button, Treat as Ideal Spectrum, is selected (this corresponds to IDEAL_COSY = 1, the default setting), then NMR-SAMS treats COSY as an ideal spectrum, namely, two proton-bearing carbon atoms are forbidden to connect if no COSY peak is observed between them. Although this is usually true, and this reduces the time taken to generate structures, it could also lead to losing a correct structure if some 3JH,H couplings were not observed for reasons such as H-H configuration or chemical environments.

If the second button, Use with NOESY Data, is selected (this corresponds to IDEAL_COSY = 2), NMR-SAMS uses the negative information in conjunction with NOESY data.  In this case, two proton-bearing carbon atoms are forbidden to connect if neither a COSY nor a NOESY peak was observed between them.  This may be safer than the previous choice and is recommended if NOESY data is available.

If the last button, Do Not Use, is selected (this corresponds to IDEAL_COSY = 0), then the negative information is not used. In this case, all proton-bearing atoms will be allowed to connect even if no COSY peaks were observed between them. Though this is a safe option, this could significantly reduce the efficiency of structure generation.
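The three settings of IDEAL_COSY can be summarized as a small decision function. The following Python sketch is an illustration only; the function name and the two boolean arguments (whether a COSY or a NOESY peak was observed between the protons of the two carbons) are assumptions made for the example.

```python
def connection_forbidden(ideal_cosy, cosy_seen, noesy_seen):
    """Decide whether two proton-bearing carbons are forbidden to connect."""
    if ideal_cosy == 1:      # Treat as Ideal Spectrum
        return not cosy_seen
    if ideal_cosy == 2:      # Use with NOESY Data
        return not cosy_seen and not noesy_seen
    if ideal_cosy == 0:      # Do Not Use
        return False
    raise ValueError(f"unexpected IDEAL_COSY value: {ideal_cosy}")


# No COSY peak, but a NOESY peak was observed between the two groups.
for setting in (1, 2, 0):
    print(setting, connection_forbidden(setting, cosy_seen=False, noesy_seen=True))
```

In this case only the default setting (1) forbids the connection, which shows why setting 2 is the safer choice when NOESY data is available.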

Use of 1H Multiplicities to Suppress Inappropriate Bonds:

H1_MULT_FLAG:    1

If this option is selected (this corresponds to H1_MULT_FLAG = 1, the default setting), the following rules are used to exclude some carbon atoms from bonding during the structure generation:

1.      Only CHx-CHy (x > 0,  y ≥ 0) are considered;

2.      CH3 with a multiplicity M = 1(s), 2(d), 3(t), or 4(q) is forbidden to bond to CHy if y ≠ M - 1;

3.      CH3 with other multiplicities M > 4 is forbidden to bond to CHy if y = 0;

4.      CH with a multiplicity M = 1 is forbidden to bond to CHy if y = 2 or 3.
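The four rules can be written as a single test, sketched here in Python. This is an illustration under stated assumptions rather than the implementation in NMR-SAMS: the comparison in rule 2 is read as "y is not equal to M - 1", the multiplicity "m" is taken to stand for M > 4, and a peak of unknown multiplicity ("u") never forbids a bond. The function name is hypothetical.

```python
MULT = {"s": 1, "d": 2, "t": 3, "q": 4}   # multiplicity letters with M <= 4


def bond_forbidden(x, mult, y):
    """Apply the multiplicity rules to a CHx-CHy bond.

    x: number of protons on the carbon whose 1H peak has multiplicity `mult`
    mult: one of 'u', 's', 'd', 't', 'q', 'm'
    y: number of protons on the candidate neighboring carbon
    """
    if x <= 0 or y < 0 or mult == "u":
        return False                       # rule 1, or multiplicity not used
    if x == 3:
        if mult in MULT:
            return y != MULT[mult] - 1     # rule 2
        return y == 0                      # rule 3 (M > 4)
    if x == 1 and mult == "s":
        return y in (2, 3)                 # rule 4
    return False


print(bond_forbidden(3, "d", 1))   # methyl doublet next to a CH
print(bond_forbidden(3, "d", 2))   # methyl doublet next to a CH2
```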

If this option is not selected (this corresponds to H1_MULT_FLAG = 0), the 1H multiplicities are not used at all.  In such a case the structure generation may take longer and produce more candidate structures.

1H multiplicities must be used carefully in order not to lose the correct structure.  If you find that the multiplicity of a certain 1H peak is not reliable, or does not fit these rules, input it as unknown multiplicity (represented as “u”, see Section 5.2), so that its multiplicity will not be used.  If you do not want to use any 1H multiplicities, turn off this option so that all 1H multiplicities are ignored by NMR-SAMS.

Extract Unambiguous 1-Bond Constraints as Fixed Bonds:

FIX_BOND_FLAG: 1

This flag defines whether or not to use NMR-derived unambiguous bond constraints (e.g., those from well-resolved COSY peaks) as fixed bonds prior to structure generation. Once a fixed bond is defined, it cannot be broken, though its bond type can be changed in the subsequent structure generation.  While this enhances the efficiency of structure generation, the correct structure may be lost if one of the fixed bonds is incorrect (e.g., a long-range coupled DQF-COSY peak was mistakenly interpreted as a vicinal coupling).  The default is to use unambiguous bond constraints as fixed bonds (this corresponds to FIX_BOND_FLAG = 1).

If you choose not to use the unambiguous bond constraints as fixed bonds (this corresponds to FIX_BOND_FLAG = 0), all NMR-derived bond constraints will be used during structure generation. In that case the bond constraints can be violated when MAX_ERR_BC > 0. But this may significantly reduce the efficiency of structure generation.

Note that NMR-SAMS always treats user-supplied unambiguous bond constraints as fixed bonds.

Bond Formation Between Heteroatoms:

HETCON_FLAG:  0

If the first option, Disabled for All, is selected (corresponding to HETCON_FLAG = 0, the default setting), no bond may be formed between any two heteroatoms during structure generation.  If the second option, Disabled for Same Type, is selected (corresponding to HETCON_FLAG = 1), no bond may be formed between two heteroatoms of the same type. If the third option, Enabled for All, is selected (corresponding to HETCON_FLAG = 2), there is no limitation on bond formation between heteroatoms.

Since the default setting is Disabled for All, you must be cautious when functional groups such as -NO2 or -O-O- exist in the molecule.  However, if you define such groups as user-defined bond constraints (see Section 7.2), then you can still select the “Disabled for All” option to enhance the efficiency of structure generation. 
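The three settings can be summarized by the following Python sketch.  It is an illustration, not the program's code: the function name hetero_bond_allowed is hypothetical, and treating every element other than C and H as a heteroatom is an assumption of the example.

```python
def hetero_bond_allowed(elem_a, elem_b, hetcon_flag=0):
    """Return True if a bond between two atoms is permitted by HETCON_FLAG.

    hetcon_flag: 0 = Disabled for All (default), 1 = Disabled for Same Type,
                 2 = Enabled for All.
    """
    def is_hetero(elem):
        # Assumption for this sketch: anything except C and H is a heteroatom.
        return elem not in ("C", "H")

    # The flag only restricts heteroatom-heteroatom bonds.
    if not (is_hetero(elem_a) and is_hetero(elem_b)):
        return True
    if hetcon_flag == 0:
        return False
    if hetcon_flag == 1:
        return elem_a != elem_b
    return True
```

With the default setting, an N-O bond such as the one in a nitro group is rejected by this check, which is why such groups need to be supplied as user-defined bond constraints.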

Allowed Carbon-Carbon Bond Types:

CCBOND_FLAG: 1 1 1

If one or more of the three C-C bond types are not checked (corresponding to CCBOND_FLAG[i] = 0, where i = 1, 2 or 3 for single, double, and triple bonds, respectively), then C-C bonds of the corresponding type will not be formed during the subsequent structure generation.

This is useful when it is known that a certain type of C-C bond does not exist in the molecule, or that all bonds of that type have already been extracted as fixed bonds.  For example, INADEQUATE data usually provide all single C-C bond correlations, so you can turn the fixed bonds flag on (FIX_BOND_FLAG = 1) and let NMR-SAMS extract all C-C single bonds.  You can then force NMR-SAMS to generate only C-C multiple bonds during structure generation by unchecking Single and checking Double and Triple for this parameter.  By default all three are checked (i.e., CCBOND_FLAG[i] = 1, where i = 1, 2 and 3).
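The following Python sketch shows one way to read a CCBOND_FLAG line in the "KEYWORD: values" form used in this appendix and turn it into the list of permitted C-C bond orders.  The function name allowed_cc_bond_orders is hypothetical, and the parsing shown is an assumption about the line layout, not a description of how NMR-SAMS reads its parameter file.

```python
def allowed_cc_bond_orders(line):
    """Parse e.g. 'CCBOND_FLAG: 0 1 1' and return the allowed bond orders."""
    keyword, _, values = line.partition(":")
    if keyword.strip() != "CCBOND_FLAG":
        raise ValueError("not a CCBOND_FLAG line: %r" % line)
    flags = [int(v) for v in values.split()]
    if len(flags) != 3:
        raise ValueError("CCBOND_FLAG needs three values (single, double, triple)")
    # Position i = 1, 2, 3 corresponds to single, double and triple bonds.
    return [order for order, flag in zip((1, 2, 3), flags) if flag == 1]
```

In the INADEQUATE scenario described above, the line "CCBOND_FLAG: 0 1 1" would leave only double and triple C-C bonds to be formed during structure generation.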

 

Parameters for Structure Generation

Search Criteria for Structure Generation: 

GEN_FLAG: 1

If Advanced (i.e., GEN_FLAG = 1, the default value) is selected, the advanced heuristic search method is used to accelerate structure generation.  This method takes advantage of the bond constraints and 13C chemical shifts and reorders the solution space so that only the most probable portion of it is searched for candidate structures.  This search criterion usually gives the fastest structure generation with reliable results when sufficient spectral data are used.

If Basic (i.e., GEN_FLAG = 2) is selected, more relaxed parameters (SAT_BC_RATE and N_FBX_STEP) are used for the penalty function so that a wider solution space is searched during structure generation.

If Exhaustive (i.e., GEN_FLAG = 0) is selected, the brute-force exhaustive search method is used for structure generation.  This is usually very slow so it is useful only when the molecule is very small, or when the heuristic methods mentioned above fail to give the correct structure. This option is recommended for exhaustive isomer enumeration based solely on the MF. 

Note: SAT_BC_RATE and N_FBX_STEP are two important parameters which control the completeness of the search when GEN_FLAG is set to 1 or 2.  By modifying their values, you can balance the speed and the completeness of structure generation.

Rate of Bond Constraint Satisfaction, Starting, Ending, and Step Values: 

SAT_BC_RATE:  1.2  0.6  0.1

This parameter is one of the important parameters related to heuristic structure generation, and is used only when GEN_FLAG = 1 or 2.  The three values of this parameter determine the use of a penalty function for evaluating the substructures based on the “rate of BC-satisfaction”, K:

·        SAT_BC_RATE[1] is the required starting value of K, Ks.  The default value is 1.2; 

·        SAT_BC_RATE[2] is the required ending value of K, Ke.  The default value is 0.6;

·        SAT_BC_RATE[3] is the step value, ΔK, for automatic adjustment of K. The default value is 0.1.

For the first run of structure generation, K = Ks.  If no complete structure is obtained (which usually means that K is too big) and K > Ke, the structure generation is automatically repeated using K = K - ΔK.  This iteration ends when at least one structure is generated, or when K ≤ Ke or K ≤ 0.  If ΔK = 0, or Ks ≤ Ke, the structure generation is not repeated; that is, only one structure generation is performed, with K = Ks.
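The automatic adjustment of K can be pictured with the following Python sketch.  It assumes that each repeat lowers K by the step value SAT_BC_RATE[3]; the function name run_with_k_schedule and the generate callback (which stands in for one complete structure generation run and returns a list of structures) are hypothetical and exist only for this example.

```python
def run_with_k_schedule(ks, ke, dk, generate):
    """Repeat structure generation with decreasing K until structures appear.

    ks, ke, dk : starting value, ending value and step of K.
    generate   : callable taking K and returning a list of complete structures.
    Returns (structures, list of K values tried).
    """
    k = ks
    tried = []
    while True:
        tried.append(k)
        structures = generate(k)
        if structures:
            return structures, tried
        # No repetition if the step is zero or the start is not above the end;
        # otherwise stop once K has reached the ending value (or zero).
        if dk <= 0 or ks <= ke or k <= ke or k <= 0:
            return [], tried
        # Rounding keeps the K values free of floating-point drift.
        k = round(k - dk, 10)
```

With the default values (1.2, 0.6, 0.1) and a run that never yields a structure, this sketch tries K = 1.2, 1.1, and so on down to 0.6, for seven runs in total.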

Appropriate usage of SAT_BC_RATE limits the search scope of the structure generation to the most probable portion, hence speeds up this process without losing the correct structure.  A bigger value of K makes the search less complete and the computation time shorter, and vice versa.  If Ks = 0.0, the penalty function is ignored so that the substructures are not evaluated based on the rate of BC-satisfaction.  This is the most exhaustive search but can be very slow.

Tip: The variation of the K values can be seen in the log file.  Also the K value that was used during the last iteration of structure generation can be found under the keyword “RESULTS:” in the MDF.  For details regarding evaluation of substructures based on the rate of BC-satisfaction, please refer to References 1-3.

Average Number of Possibilities for Each C-C Bond Formation: 

N_FBX_STEP: 3.0

Analogous to SAT_BC_RATE, this is another important parameter related to heuristic structure generation, and is used only when GEN_FLAG = 1 or 2.  The value of this parameter defines the average number of free bonds to be tried while forming a bond on a certain atom. This limits the search scope of structure generation to the most plausible portion of the solution space.  3.0 is the default value.  A larger value of N_FBX_STEP makes the search more complete and the computation time longer, and vice versa.  If N_FBX_STEP = 0.0, all free bonds will be tried.  This is the most exhaustive search but can be very slow.

Note: In contrast to SAT_BC_RATE, N_FBX_STEP is not automatically adjusted based on the results.  For details regarding the scope of search based on N_FBX_STEP, please refer to Reference 3.

Maximum Limit for Bond Constraint Violation: 

MAX_ERR_BC: 1

Sometimes it is necessary to allow a few BCs to be violated during structure generation.  For example, 4-bond C-H correlations are occasionally observed in HMBC.  As all HMBC-derived BCs are interpreted as 2- or 3-bond separations by default (see parameter HMBC_BC), the correct structure can then be generated only when a certain number of BCs are allowed to be violated.  This is a trade-off: allowing some BCs to be violated significantly reduces the efficiency of structure generation, since more incorrect substructures need to be considered.

Minimum Ring Size:

MIN_RING_SIZE:  0

This parameter defines the minimum ring size for the rings in the generated structures.  If a value smaller than 3 is defined, there will be no limit on the ring sizes (the default value is 0, i.e., no limitation).

Maximum Ring Size: 

MAX_RING_SIZE: 0

This parameter defines the maximum ring size for the rings in the generated structure.  If a value smaller than 3 is defined, there will be no limit on the ring sizes (the default value is 0, i.e., no limitation).

If the structure generation is very slow, you can try to limit the maximum ring size, e.g., by setting MAX_RING_SIZE as 6 if appropriate.
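The way the two ring-size limits combine can be sketched in Python as follows.  The function name ring_size_allowed is hypothetical; the only behavior taken from this guide is that a limit smaller than 3 means "no limit".

```python
def ring_size_allowed(size, min_ring_size=0, max_ring_size=0):
    """Return True if a ring of `size` atoms passes both ring-size limits."""
    # A limit below 3 is treated as "no limit" (0 is the default for both).
    if min_ring_size >= 3 and size < min_ring_size:
        return False
    if max_ring_size >= 3 and size > max_ring_size:
        return False
    return True
```

For instance, with MAX_RING_SIZE set to 6 a seven-membered ring is rejected, whereas with the default value of 0 it is accepted.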

Additional Tolerance for Using C-13 Chemical Shifts: 

ADD_C13_RNG: 0.0

This parameter (0.0 by default) is the tolerated violation of 13C chemical shifts when substructures are evaluated based on 13C chemical shifts.  If ADD_C13_RNG = t, the predicted δ13C range of a carbon is δ1 to δ2, and the observed δ13C is δ*, then the substructure containing this CCSS is regarded as bad if δ* < δ1 - t or δ* > δ2 + t.  This parameter is useful when some unusual 13C chemical shifts are expected in the molecule (see Section 3.5).

This parameter is also used for setting up an assignment matrix during resonance assignment (see Section 8.2).
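The range test described for this parameter amounts to the following check, shown here as a Python sketch.  The function and argument names are hypothetical; the shifts in the usage note are made-up numbers chosen only to show the effect of the tolerance.

```python
def shift_is_bad(observed, predicted_low, predicted_high, add_c13_rng=0.0):
    """Return True if an observed 13C shift falls outside the widened range.

    The predicted range [predicted_low, predicted_high] is extended by
    add_c13_rng (the tolerance t) on both sides before the comparison.
    """
    return (observed < predicted_low - add_c13_rng
            or observed > predicted_high + add_c13_rng)
```

With a predicted range of 60.0 to 70.0 ppm, an observed shift of 72.5 ppm is rejected at the default tolerance of 0.0 but accepted when ADD_C13_RNG is set to 3.0.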

Minimum C-13 Shift for Multi-Bond Carbon: 

MIN_MB_C13:  60

The value of this parameter determines the lowest possible 13C chemical shift for an sp2 or sp carbon. No multiple bond may be attached to a carbon atom with δ13C < MIN_MB_C13. The default value is 60, but 100 can be used if no triple bond is expected in the unknown structure.

Exclude Structures with Chemically Unstable Moieties: 

BAD_SS_FLAG: 1

If this option is checked (i.e., BAD_SS_FLAG = 1, the default value), some simple chemically unstable structural moieties will be rejected during structure generation.  Such structural moieties include:

1.      =C=,  i.e., multiple double bonds on a carbon atom,

2.      =X1=X2=,  where X1 and X2 are any heteroatoms, and =C=,

3.      C(X1)(X2)(X3), where X1, X2 and X3 are any heteroatoms connected to the same carbon atom, except -CF3 group,

4.      Three-membered rings, except an epoxide ring with no double bond attached to either carbon atom.

If this option is not checked (i.e. BAD_SS_FLAG = 0), such moieties are not excluded.

Maximum Candidate Structures to Store:

MAX_REC_STR: 50

This parameter defines the maximum number of generated structures to store.  The default value of MAX_REC_STR is 50. When the number of candidate structures reaches MAX_REC_STR, structure generation is terminated.  Note that redundant structures are not counted toward MAX_REC_STR.  For example, if a structure is generated twice (with alternate 13C assignments), it is counted once when checking against MAX_REC_STR.  Consequently, if redundant structures are generated, the total number of candidate structures can exceed MAX_REC_STR.  If you choose to record intermediate substructures (see REC_SS_FLAG), the number of retained substructures, N_SS, is determined as follows:

N_SS = MAX_REC_STR - N_unique_str,

where N_unique_str is the number of chemically unique complete structures. 

If MAX_REC_STR = 0, an unlimited number of candidate structures will be generated.  In that case, substructures cannot be recorded (see REC_SS_FLAG).

Store the Largest Substructures in Addition to Complete Structures:

REC_SS_FLAG: 1

If this option is checked (i.e., REC_SS_FLAG = 1, the default value), the intermediate substructures generated during the structure generation process are recorded.  Such substructures are useful if complete structures cannot be generated (due to errors in the spectral data or the use of inappropriate parameters). 

Since the number of substructures can potentially be very large, they are allowed to be stored only when you define an upper limit for the total number of structures (i.e., MAX_REC_STR > 0). The number of retained substructures, N_SS, is determined as follows:

N_SS = MAX_REC_STR - N_unique_str,

where N_unique_str is the number of chemically unique complete structures.  If the number of generated substructures exceeds N_SS, only the largest ones are retained.
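The bookkeeping behind N_SS can be sketched in Python as follows.  This is an illustration under stated assumptions: the function name retained_substructures is hypothetical, each substructure is represented simply as a list of atom numbers, and "largest" is taken to mean "most atoms".

```python
def retained_substructures(substructures, max_rec_str, n_unique_str):
    """Return the substructures that would be kept alongside complete structures.

    substructures : list of substructures, each a list of atom numbers.
    max_rec_str   : MAX_REC_STR (0 means unlimited, so nothing is recorded).
    n_unique_str  : number of chemically unique complete structures.
    """
    if max_rec_str <= 0:
        return []
    # N_SS = MAX_REC_STR - N_unique_str, never negative.
    n_ss = max(0, max_rec_str - n_unique_str)
    # Keep only the largest substructures when there are more than N_SS.
    largest_first = sorted(substructures, key=len, reverse=True)
    return largest_first[:n_ss]
```

For example, with MAX_REC_STR = 50 and 48 unique complete structures, only the two largest substructures would be retained.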

After the structure generation is completed (or interrupted), NMR-SAMS prompts you to save the substructures in the structure file along with the completed ones. If you click Yes, then the substructures are saved and can be displayed along with the completed ones.  If you click No, all substructures will be discarded.

Interval for Updating Structure Generation Dialog Box: 

DISP_CMPLT_DELAY: 0.10

This parameter defines the interval (in minutes) for updating the Structure Generation in Process dialog box during structure generation or the Resonance Assignment in Process dialog box during resonance assignment.



References

1.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Yongzheng Hui, "Efficient Application of 2D NMR Correlation Information in Computer-Assisted Structure Elucidation of Complex Natural Products," J. Chem. Inf. Comput. Sci., 1994, 34, 805-813.

2.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Yongzheng Hui, Houming Wu, Kan Ma, "Application of Expert System NMR-SAMS to the Structure Elucidation of Complex Natural Products," J. Chem. Inf. Comput. Sci., 1994, 34, 814-819.

3.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Zhengshuang Shi, Houming Wu, "Toward Practical Computer-Assisted Structure Elucidation for Complex Natural Products: Efficient Use of Ambiguous 2D NMR Correlation Information," J. Chem. Inf. Comput. Sci., 1995, 35, 539-546.

4.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Lingran Chen, "From Spectra to Structure by Computer: Dreams and Reality," Computers and Applied Chemistry, Computer Chemistry Monograph Series 4, Beijing: Science Press, 1995: 26-33.

5.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Yongzheng Hui, "Graph-theory-based Computer Representation of Two-Dimensional NMR Correlation Information for Automated Analysis," Computers and Applied Chemistry, Computer Chemistry Monograph Series 4, Beijing: Science Press, 1995: 34-38.

6.      Chen Peng, Geoffrey Bodenhausen, Shengxiang Qiu, Harry H. S. Fong, Norman R. Farnsworth, Shengang Yuan, Chongzhi Zheng, "Computer-Assisted Structure Elucidation: Application of CISOC-SES to the Resonance Assignment and Structure Generation of Betulinic Acid," Magnetic Resonance in Chemistry, 1998, 36, 267-278.



 
