Zap 1.00 Manual: Chapter 6S

LESSON 3: Using Zap with SYBYL 3D QSAR


This lesson demonstrates the use of Zap with the SYBYL QSAR module. The lesson works best if no molecules or backgrounds from previous lessons are currently active in SYBYL. If you are starting the tutorial at this point, follow the instructions in Step 1 of Lesson 1.

  1. Open a SYBYL Molecular Spreadsheet and Database

    From the File pulldown on the menubar select Molecular Spreadsheet and New.... The Data Source will be a Database. In the DATABASE_FILE dialog box, choose ymartinD2.mdb for this lesson and press Open.

  2. Add Activity Values to the Spreadsheet

    From the File pulldown menu on the Molecular Spreadsheet choose Import. The Format will be Tripos; choose D2activity.tripos as the File by pushing the ... button or typing it in. Press Import.

  3. Fill a Spreadsheet column with the Zap Potential field

    From the spreadsheet menubar select the AutoFill button (and choose a new Column). Select ZAPCOMFA as the New column type. If ZAPCOMFA is not listed as a New column type, Cancel, enter mss!reset_eslc at the Sybyl> prompt, and try again. The Add Column (ZapCoMFA) dialog box contains options to tailor the Zap field that will be entered into the QSAR table. For this first run we will use mostly the default settings: Map Type = Molecule Potential, Inside Mol Cut Off = on, Low Limit = 30, High Limit = 30. Set the Region from Calculate Automatically..., which opens the Calculate CoMFA Region Automatically dialog box; all Spacings should be 2 Angstroms and all Margins should be 4 Angstroms. Use ymartin.rgn as the CoMFA Region File name. Press OK to calculate the region, then press OK in the Add Column (ZapCoMFA) dialog box and accept zap2 as the Column name. This AutoFill operation will take about 2 minutes. It is normal for this column to appear to be filled with zeroes; see the Zap FAQs.
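
    To make the Spacing and Margin settings concrete, the following is a minimal Python sketch of how a rectangular lattice can be laid around a set of atoms. It is an illustration only, not SYBYL's region algorithm: the function names, the rounding rule, and the atom coordinates are all assumptions.

```python
import math

def axis_points(lo, hi, margin=4.0, spacing=2.0):
    """Lattice coordinates along one axis, covering [lo - margin, hi + margin]."""
    start = lo - margin
    # Round up so the last lattice point reaches or passes hi + margin.
    n_steps = math.ceil((hi + margin - start) / spacing)
    return [start + i * spacing for i in range(n_steps + 1)]

def region_lattice(coords, margin=4.0, spacing=2.0):
    """Per-axis lattice coordinates for a box around all atom coordinates."""
    return [
        axis_points(min(c[k] for c in coords), max(c[k] for c in coords),
                    margin, spacing)
        for k in range(3)
    ]

# Hypothetical extents (NOT taken from ymartinD2.mdb): 6 x 12 x 12 Angstroms.
atoms = [(0.0, 0.0, 0.0), (6.0, 12.0, 12.0)]
axes = region_lattice(atoms)
n_points = len(axes[0]) * len(axes[1]) * len(axes[2])
print([len(a) for a in axes], n_points)  # [8, 11, 11] 968
```

    The hypothetical extents were chosen so that the lattice has 968 points, the number of variables per field reported in the PLS output of Step 6. The manual does not state the actual lattice dimensions, so 8 x 11 x 11 is only one factorization that fits.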

  4. Fill Spreadsheet columns with the standard CoMFA steric and electrostatic fields

    Normally the standard CoMFA steric and electrostatic fields are saved in one column of the MSS as a field couplet. For this exercise we will place them in separate columns so that we can compare the Zap field with the electrostatic field directly, and in combination with the steric field. Select the AutoFill button (and choose a new Column). Select COMFA as the New column type. Note that in the Add New CoMFA Column dialog box the CoMFA Field Class: is Tripos Standard. In the Field Types frame select Type(s): Electrostatic, Dielectric = Distance, Smoothing = None, and Drop Electrostatics = Within Steric Cutoff for Each Row. The Electrostatic Cutoff should be 30 kcal/mol and the Transition: should be Smooth. Important: Choose Use Existing Region (ymartin.rgn), the region file created in Step 3, so that all fields share the same lattice. Press Add Column. To keep track of these fields, enter "Electro" as the Column name. Now repeat the process for the Steric field, which should go in column 4.
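
    The Dielectric = Distance and Electrostatic Cutoff settings can be illustrated with a textbook Coulomb sum. The sketch below is not SYBYL's implementation: the conversion factor, the +1 probe charge, the hard clamp (the Smooth transition is not modeled), and the function name are assumptions made for illustration.

```python
COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2); approximate conversion factor

def electrostatic_energy(point, atoms, probe_charge=1.0, cutoff=30.0):
    """Coulomb energy (kcal/mol) of a probe at `point`, with dielectric = r.

    `atoms` is a list of (x, y, z, partial_charge). The sum is hard-clamped
    to +/- cutoff; a smooth transition is NOT modeled here.
    """
    total = 0.0
    for x, y, z, charge in atoms:
        r2 = (point[0] - x) ** 2 + (point[1] - y) ** 2 + (point[2] - z) ** 2
        r2 = max(r2, 1e-6)  # avoid division by zero directly on top of an atom
        # With dielectric = r, q1*q2/(eps*r) becomes q1*q2/r^2.
        total += COULOMB_CONSTANT * charge * probe_charge / r2
    return max(-cutoff, min(cutoff, total))

# Hypothetical single atom carrying a +1 charge at the origin.
atoms = [(0.0, 0.0, 0.0, 1.0)]
far = electrostatic_energy((10.0, 0.0, 0.0), atoms)   # about 3.32 kcal/mol
near = electrostatic_energy((1.0, 0.0, 0.0), atoms)   # clamped to the cutoff
```

    Lattice points close to an atom would otherwise carry very large energies; the cutoff keeps them from dominating the PLS analysis.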

  5. Run PLS analyses on these fields

    We will make 5 PLS runs on this data set to ascertain the best model. First we will examine each of the fields (Zap [col 2], Electro [col 3], Steric [col 4]) individually, and then look at the two potentially useful field combinations Zap/Steric and Electro/Steric. We will use the SAMPLS protocol for rapid creation and evaluation of field models. Follow the directions below for each field or field combination:
    Choose the columns for the PLS study: use Select Cols, enter 1 and the column number(s) for the field(s), separated by commas, in the Expression text field (for example, 1,2,4 for Zap + Steric), then press Add and Done. From the QSAR pulldown, select Partial Least Squares... to call the Partial Least Squares Analysis dialog box. The Dependent Column is (always) 1. Select 10 Components and make sure Use SAMPLS is on. Press Do PLS. Note the reported q2 as a function of the number of components. Do not automatically accept the "recommended" optimum: it is often better practice to sacrifice 1-2 % in q2 for a lower number of components. To repeat for another field model, press the ... button next to the Columns to Use: text field.
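
    For readers unfamiliar with the two statistics reported for each run, the sketch below computes a crossvalidated R2 (q2) and a standard error of prediction from observed and crossvalidation-predicted activities. These are the usual textbook definitions; whether SYBYL applies exactly this degrees-of-freedom correction is an assumption, and the activity values are invented for illustration.

```python
import math

def crossvalidated_stats(observed, predicted, n_components):
    """Return (q2, SEP) from observed values and crossvalidated predictions."""
    n = len(observed)
    mean_y = sum(observed) / n
    # PRESS: sum of squared prediction errors from the crossvalidation runs.
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares of the observed activities about their mean.
    ss_total = sum((o - mean_y) ** 2 for o in observed)
    q2 = 1.0 - press / ss_total
    # Assumed degrees of freedom: n - components - 1.
    sep = math.sqrt(press / (n - n_components - 1))
    return q2, sep

# Invented example data (not from D2activity.tripos).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
q2, sep = crossvalidated_stats(observed, predicted, n_components=1)
```

    A q2 at or below zero means the model predicts no better than using the mean activity for every compound.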

    For the Zap field:
    Standard Error of Prediction for 10 components:
    1.368 1.376 1.291 1.278 1.248 1.261 1.262 1.279 1.325 1.374
    Crossvalidated R2 for 10 components:
    0.026 0.056 0.204 0.256 0.325 0.345 0.378 0.397 0.391 0.386
    -- optimum is 0.397 at 8 components
    
    For the Electro field:
    Standard Error of Prediction for 10 components:
    1.439 1.656 1.662 1.700 1.699 1.778 1.861 1.977 2.037 2.082
    Crossvalidated R2 for 10 components:
    -0.078 -0.367 -0.318 -0.317 -0.253 -0.303 -0.352 -0.442 -0.441 -0.410
    -- optimum is -0.078 at 1 components
    
    For the Steric Field:
    Standard Error of Prediction for 10 components:
    1.282 1.026 1.077 0.971 0.958 0.987 1.084 1.178 1.290 1.392
    Crossvalidated R2 for 10 components:
    0.144 0.475 0.447 0.570 0.602 0.599 0.541 0.488 0.422 0.370
    -- optimum is 0.602 at 5 components
    
    For the Zap + Steric Field:
    Standard Error of Prediction for 10 components:
    1.324 1.213 1.191 1.122 1.111 1.099 1.084 1.155 1.193 1.312
    Crossvalidated R2 for 10 components:
    0.087 0.267 0.323 0.427 0.465 0.502 0.541 0.508 0.506 0.440
    -- optimum is 0.541 at 7 components
    
    For the Electro + Steric Field:
    Standard Error of Prediction for 10 components:
    1.369 1.383 1.288 1.234 1.226 1.240 1.273 1.327 1.402 1.446
    Crossvalidated R2 for 10 components:
    0.024 0.045 0.208 0.307 0.348 0.366 0.367 0.351 0.318 0.320
    -- optimum is 0.367 at 7 components
    
    From these results, the activity of these molecules correlates most strongly with the CoMFA steric field (q2 = 0.602). There is no correlation at all with the CoMFA electrostatic field (q2 is negative at every number of components) and a fair correlation with the Zap Potential field (q2 = 0.397). Adding the electrostatic field to the steric field seriously degrades the model (q2 falls by ca. 0.24, from 0.602 to 0.367), while adding the Zap field to the steric field hurts it somewhat (q2 falls by ca. 0.06, to 0.541). There could be valid situations where you need the charge-based descriptor fields in your CoMFA model, even if the statistical metrics of the QSAR model are degraded. As an aside, the combination of the steric field and the HINTCOMFA field yields a model with q2 = 0.782 with 7 components.
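
    The advice to trade a little q2 for fewer components can be applied to the tables above with a few lines of Python. This is a sketch of one possible rule, not part of Zap or SYBYL: the function name and the tolerance of 0.02 (reading "1-2 %" as 0.01-0.02 in q2 units) are assumptions.

```python
def pick_components(q2_values, tolerance=0.02):
    """Smallest number of components whose q2 is within `tolerance` of the best."""
    best = max(q2_values)
    for n_components, q2 in enumerate(q2_values, start=1):
        if q2 >= best - tolerance:
            return n_components

# Crossvalidated R2 values copied from the PLS runs in this step.
Q2 = {
    "Zap": [0.026, 0.056, 0.204, 0.256, 0.325, 0.345, 0.378, 0.397, 0.391, 0.386],
    "Steric": [0.144, 0.475, 0.447, 0.570, 0.602, 0.599, 0.541, 0.488, 0.422, 0.370],
    "Zap + Steric": [0.087, 0.267, 0.323, 0.427, 0.465, 0.502, 0.541, 0.508, 0.506, 0.440],
    "Electro + Steric": [0.024, 0.045, 0.208, 0.307, 0.348, 0.366, 0.367, 0.351, 0.318, 0.320],
}

choices = {name: pick_components(values) for name, values in Q2.items()}

# Loss in best q2 relative to the steric-only model.
drop_electro = round(max(Q2["Steric"]) - max(Q2["Electro + Steric"]), 3)
drop_zap = round(max(Q2["Steric"]) - max(Q2["Zap + Steric"]), 3)
```

    With this tolerance the rule keeps 5 components for the steric field and 7 for Zap + Steric, but prefers 7 rather than 8 components for the Zap field and 5 rather than 7 for Electro + Steric.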

  6. Graphing the CoMFA results

    As in standard CoMFA, producing CoMFA coefficient contour plots requires a non-crossvalidated analysis run with the optimum number of components found in the cross-validated (SAMPLS) partial least squares runs. Choose the ZAP and STERIC columns (along with the PK dependent column), set the Validation to No Validation, Components to 7, and Column Filtering to kcal/mol. Press Do PLS. Save this PLS Analysis as YMartinD2_ZapSteric_nocv.pls. The following output should appear in the Sybyl text window:

    Relative Contributions
    #                   Norm.Coeff. Fraction 
    -                   ----------- ---------
    1 ZAP2 (968 vars)         2.191     0.515
    2 COMFA3 (968 vars)       2.066     0.485
    
    Summary output
    Standard Error of Estimate           0.305
    R squared                         0.964
    F values     ( n1= 7, n2=18 )    68.192
    Prob.of R2=0 ( n1= 7, n2=18 )     0.000
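
    As a reading aid for this output, the sketch below recomputes the Fraction column from the normalized coefficients and an F value from R squared and the two degrees of freedom. The formulas are the standard ones and are assumed, not confirmed, to be what SYBYL uses; the function names are illustrative.

```python
def contribution_fractions(norm_coeffs):
    """Each field's share of the summed normalized coefficients."""
    total = sum(norm_coeffs.values())
    return {name: value / total for name, value in norm_coeffs.items()}

def f_statistic(r_squared, n1, n2):
    """F value for a regression with n1 model and n2 residual degrees of freedom."""
    return (r_squared / n1) / ((1.0 - r_squared) / n2)

# Values copied from the output above.
fractions = contribution_fractions({"ZAP2": 2.191, "COMFA3": 2.066})
f_value = f_statistic(0.964, n1=7, n2=18)
```

    The fractions round to 0.515 and 0.485, matching the output. The F value computed from the rounded R squared is about 68.86, close to the reported 68.192; the small difference is presumably due to R squared being printed to only three decimals. Note also that n2 = 18 with 7 components implies 26 compounds if n2 = n - components - 1.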
    

    Press End in the Partial Least Squares Analysis dialog. Next, display the D2 molecule set by manually selecting all of the rows of the MSS and selecting File, Put Rows into Molecular Areas....

    The Zap software provides a graphical command to aid in graphing multifield CoMFA results. From the eslc pulldown on the main SYBYL menubar select the Zap, Zap QSAR, Graph ZapQSAR... command. This brings up the Retrieve ZapQSAR dialog box, which guides you through a series of Sybyl dialog boxes for retrieving and graphing the CoMFA field contours. Choose the field types you wish to graph and their Columns.

    First, the steric field: Retrieve: Steric and Column 4 (the Steric column from Step 4). Choose Display area D1 in the next dialog, Level selection USER_SPECIFIED, and Specification criterion PERCENT_OF_RANGE. Enter Contour level 30 and choose Color CYAN, then Contour level 70 and choose Color MAGENTA. Press End when the REAL dialog asks for a third value and End to the Field Option dialog.

    Now for the Zap field: Retrieve: ZapPotential and Column 2 (the Zap column from Step 3). Choose Display area D2 in the next dialog, Level selection USER_SPECIFIED, and Specification criterion PERCENT_OF_RANGE. Enter Contour level 30 and choose Color GREEN, then Contour level 70 and choose Color RED. Press End when the REAL dialog asks for a third value and End to the Field Option dialog.

    The Sybyl text window will have scrolled some useful interpretation information. In short, MAGENTA indicates where steric bulk can be tolerated, and CYAN indicates where it cannot. GREEN indicates where higher activity, i.e., larger PK, would be obtained by making that region of space have a more negative potential, and RED indicates where the potential should be more positive.

    CoMFA coefficients map of Steric and ZapPotential fields for D2 agonists data set.

    Important: This Retrieve ZapQSAR dialog and procedure does not work when there is only one CoMFA type column in the analysis.