LESSON 2: The Multi-Conformer Protocol for Antithrombin Inhibitors



  1. Invoke SYBYL
  2. If you haven't done this already, perform step 1 of Lesson 1.

  3. Create a SYBYL Molecule Database and Molecular SpreadSheet suitable for the MCP.
  4. This is the most tedious part of the MCP in terms of "operator" time. The basic idea is that we need a single database containing the ligands and their conformers. However, the operational constraints of SYBYL, SPL and the HASL/MCP code impose some caveats on this database. The main one is that the names of the "molecules" within the component "MOL2" files of the database must follow a rigid format: LIGAND_CONFORMER, where "LIGAND" is the name of the ligand and "CONFORMER" is a unique code for the conformer sequence of that ligand, for example MTX_001. The 4.10S release of HASL was designed with the idea of FlexX generating the conformer data for each ligand, but that is only one of many ways conformer data may be generated for the MCP. If you provide us with examples of other input paradigms, we will be happy to see if there is a way to automate creation of the MCP spreadsheet, but in practice it may be necessary to hand-edit the molecule files in the database to achieve this goal.
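    Because the naming rule is strict, it can help to check molecule names before building the database. The following Python sketch is not part of HASL or SYBYL. It assumes the ligand part is alphanumeric and the conformer code is all digits (the text above only fixes the LIGAND_CONFORMER shape), and the function names are hypothetical.

```python
import re

# Assumed pattern: alphanumeric ligand name, one underscore, numeric conformer
# code. The tutorial only specifies LIGAND_CONFORMER (e.g. MTX_001); the
# allowed characters are an assumption made for this sketch.
NAME_PATTERN = re.compile(r"^(?P<ligand>[A-Za-z0-9]+)_(?P<conformer>\d+)$")


def split_mcp_name(name):
    """Return (ligand, conformer) or raise ValueError for a malformed name."""
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not in LIGAND_CONFORMER form: {name!r}")
    return match.group("ligand"), match.group("conformer")


def find_bad_names(names):
    """Collect names that are malformed or that repeat an earlier name."""
    seen, bad = set(), []
    for name in names:
        try:
            key = split_mcp_name(name)
        except ValueError:
            bad.append(name)
            continue
        if key in seen:
            bad.append(name)
        seen.add(key)
    return bad


# Hypothetical molecule names: one uses a hyphen, one is a duplicate.
problems = find_bad_names(["MTX_001", "MTX_002", "MTX-003", "MTX_001"])
```

    Any name reported in problems would need to be hand-edited in the molecule file before it is added to the MCP database.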

    The database for this tutorial has already been created, but we will review the steps for its creation as a reference for further work.

    From the menubar, choose the eslc, HASL, MCP pulldown and select Create MCP Database/MSS. This activates the Create HASL MCP Database/MSS Dialog Box. The New (MCP) Database Name is the name of the database we are preparing for the Multi-Conformer Protocol. The database creation procedure can be interrupted and restarted in subsequent sessions, so this doesn't actually have to be a "new" database. In fact, you can select an existing database with the ... button. If it is a new database, you would press the Create button. For this tutorial, press the ... button and select thrombin28.mdb from the database select dialog.

    [If you were adding more ligands to the database you would then enter the name of the "old" database into Existing Conformers Database: Database Name (or choose it with the associated ... button). Then you could either add all conformers from that database by choosing Select All Conformers or choose specific conformers from that database by choosing Select Conformers Using MSS and picking rows off a Molecular SpreadSheet. Then you would press the Add Selected Conformers to the New (MCP) Database button to append the selected conformers to the MCP database. This procedure is then repeated for additional "old" databases, until you are "done".]

    Press the Done/Create MSS button, which replaces the OK on this dialog. This executes the commands to create a Molecular SpreadSheet from the MCP database, adds two columns (ligand name and conformer sequence number) that the MCP uses in processing the molecular data, and prompts the user for the biological activity data for each ligand in the table. The activity entered for a ligand is automatically inserted for all of the conformers of that ligand. The activity data for the antithrombin ligand set is as follows:

    Ligand Activity
    LIG01 11.08
    LIG02 10.49
    LIG03 8.04
    LIG04 9.29
    LIG05 10.55
    LIG06 9.92
    LIG07 6.92
    LIG08 8.79
    LIG09 8.40
    LIG10 11.28
    LIG11 11.58
    LIG12 8.56
    LIG13 13.59
    LIG14 9.51
    LIG15 10.76
    LIG16 11.58
    LIG17 9.45
    LIG18 9.72
    LIG19 10.05
    LIG20 9.35
    LIG21 10.22
    LIG22 10.71
    LIG23 9.02
    LIG24 11.69
    LIG25 10.30
    LIG26 11.17
    LIG27 6.59
    LIG28 7.12
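    The automatic insertion of one activity value for every conformer of a ligand can be pictured with a short Python sketch. This is only an illustration of the bookkeeping, not the SPL code behind the dialog. The conformer names are hypothetical and the sketch assumes ligand names contain no underscore.

```python
def assign_activities(conformer_names, ligand_activity):
    """Map each LIGAND_CONFORMER name to the activity of its ligand."""
    table = {}
    for name in conformer_names:
        # Assumes the ligand name itself contains no underscore.
        ligand, _, _conformer = name.partition("_")
        if ligand not in ligand_activity:
            raise KeyError(f"no activity entered for ligand {ligand!r}")
        table[name] = ligand_activity[ligand]
    return table


# First three ligands from the table above; conformer names are hypothetical.
activity = {"LIG01": 11.08, "LIG02": 10.49, "LIG03": 8.04}
rows = ["LIG01_001", "LIG01_002", "LIG02_001", "LIG03_001"]
per_conformer = assign_activities(rows, activity)
```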

  5. Add a "Basic" HASL column to the Molecular Spreadsheet for each conformer.
  6. This is the same as step 3 of Lesson 1:

    Press the AutoFill button on the MSS and select Column from the Option menu. Select HASLMODEL as the New column type and press OK. The Add Column (HASL) dialog contains options for optimizing and customizing the individual molecular lattices that will be later merged into the HASL model. In the present case we will use the simplest case, setting HASL Source: to be Use Basic H-Val Parameters.

    The HASL Region: is the dimensions of the lattice over which all molecules in the set will be calculated. Use Pre-Calculate as Union... This calls the Calculate SYBYL Region Automatically dialog. Set the Spacings for X, Y and Z to be 1.5 A and leave the Margins for X, Y and Z at 4 A. Enter thrombin28.rgn as the SYBYL Region File and press OK. After a brief calculation, the SYBYL Region File text box on the main HASL dialog should be filled with thrombin28.rgn. The last item to deal with is HASL Model File. Enter thrombin28.hsl and press OK to AutoFill the HASL column in the MSS. The suggested HASL Column heading of HASL4 is fine.
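    To give a feel for what a "union" region with margins and spacings amounts to, here is a minimal Python sketch. It assumes the region is simply the bounding box of all atoms in all molecules, padded by the margin on every side, with lattice points placed at the given spacing. The actual SYBYL calculation may differ in detail, and the coordinates below are made up.

```python
import math


def union_region(molecules, spacing=1.5, margin=4.0):
    """Bounding box of all atoms, padded by margin, with lattice point counts.

    molecules: iterable of lists of (x, y, z) coordinates in angstroms.
    Returns (lower_corner, upper_corner, points_per_axis).
    """
    coords = [atom for mol in molecules for atom in mol]
    if not coords:
        raise ValueError("no atoms supplied")
    lower = tuple(min(c[i] for c in coords) - margin for i in range(3))
    upper = tuple(max(c[i] for c in coords) + margin for i in range(3))
    # Enough lattice points to span each axis at the given spacing.
    counts = tuple(
        math.ceil((upper[i] - lower[i]) / spacing) + 1 for i in range(3)
    )
    return lower, upper, counts


# Two hypothetical molecules with two atoms each.
mol_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
mol_b = [(0.0, 3.0, -1.0), (2.0, 1.0, 2.0)]
lower, upper, counts = union_region([mol_a, mol_b])
```

    With the tutorial settings (1.5 A spacing, 4 A margin) the point count grows quickly with molecule size, which is one reason the region is calculated once for the whole set rather than per conformer.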

  7. Set Up the Multi-Conformer Protocol Run
  8. From the menubar, choose the eslc, HASL, MCP pulldown and select Run Multi-Conformer Protocol. This activates the HASL MCP Dialog Box. First verify that the proper values are in the text boxes for Ligand Column, Activity Data Col and HASL Column. The Iterations of 100 and Error Limit of 0.01 are both fine. The HASL Model File should be thrombin28.hsl.

    The next two parameters control the size and optimization of the random MCP models constructed from the entire ligand pool. Number of Ligands in MCP Cross-Validation Group is the size of each random model. We recommend that this be kept between 5 and 10. For this tutorial enter 8 for this parameter. After each model is constructed, the MCP begins a ligand optimization phase: starting with the poorest predicted ligand, all conformers of that ligand are tried in the model and the conformer that gives the best cross-validated r2 is retained. Then the MCP proceeds to optimize what is now the poorest predicted ligand in the same manner. The Number of Ligands in Group to Conformer-Optimize controls how many times this procedure will be applied to each model. This, of course, has to be less than or equal to the Number of Ligands in MCP Cross-Validation Group. For this tutorial set the Number of Ligands in Group to Conformer-Optimize to 5. The presumption is that the best predicted ligands would not benefit significantly from conformer optimization.
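    The build-then-optimize cycle described above can be sketched in Python as follows. This is an outline of the idea only, not the HASL/MCP code. The score callable stands in for the HASL cross-validation, the toy_score function and its made-up errors exist only so that the sketch runs, and the rule that a ligand is optimized at most once per model is an assumption.

```python
import random


def optimize_group(conformers, score, group_size=8, n_optimize=5, rng=None):
    """Build one random model, then conformer-optimize its worst ligands.

    conformers: dict mapping ligand -> list of conformer ids.
    score: callable taking {ligand: conformer} and returning
           (cv_r2, {ligand: absolute prediction error}).
    """
    if n_optimize > group_size:
        raise ValueError("n_optimize must not exceed group_size")
    rng = rng or random.Random()
    ligands = rng.sample(sorted(conformers), group_size)
    selection = {lig: rng.choice(conformers[lig]) for lig in ligands}
    done = set()
    for _ in range(n_optimize):
        _, errors = score(selection)
        # Poorest predicted ligand not yet optimized (assumed rule).
        worst = max((l for l in ligands if l not in done), key=errors.get)
        # Try every conformer of that ligand and keep the best-scoring one.
        selection[worst] = max(
            conformers[worst],
            key=lambda conf: score({**selection, worst: conf})[0],
        )
        done.add(worst)
    return selection, score(selection)[0]


def toy_score(selection):
    """Toy stand-in for HASL cross-validation, using made-up errors."""
    errors = {lig: (conf % 7) / 10 for lig, conf in selection.items()}
    return 1 - sum(errors.values()) / len(errors), errors


# Hypothetical pool of 12 ligands with 10 conformer ids each.
pool = {f"LIG{i:02d}": list(range(i * 10, i * 10 + 10)) for i in range(1, 13)}
selection, r2 = optimize_group(pool, toy_score, rng=random.Random(0))
```

    In the real protocol each call to score corresponds to a full cross-validated HASL calculation, so the number of conformers per ligand and the number of ligands optimized directly drive the run time.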

    The Filter Activity parameters allow you to block ligands out of the MCP that fall outside a specified activity range. This may be useful in some experiments. The Force Ligands & Freeze Conformers... button brings up a dialog (MCP Force Ligands & Freeze Conformers) that allows you to control the composition of each MCP model by "forcing" certain ligands to be in all models, or to "freeze" the conformation of some of these ligands by requiring that a specified conformer is always used. The easiest way to use this dialog is to select the Forced Ligand and/or Frozen Conformer checkboxes, select a row on the MSS and press the MSS Select... button on the dialog. Currently you are only able to choose three Forced Ligands/Frozen Conformers for each MCP experiment.

    Finally, you need to choose under what conditions the MCP ends. Three MCP Finish Criteria are provided in this dialog: Total Unique MCP Models, Each Ligand in >= Models, and All-Ligand HASL C-V r2>=. A fourth option is also available: pressing Ctrl-C will end the MCP and save all completed models. For this tutorial we recommend using Each Ligand in >= Models with 20 models.
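    As a rough illustration of the three finish criteria, the following Python sketch tests them against running totals. The function and parameter names are hypothetical, and treating the run as finished when any enabled criterion is met is an assumption, not documented HASL behavior.

```python
def mcp_finished(models, ligand_counts, best_all_ligand_r2,
                 max_models=None, min_models_per_ligand=None, target_r2=None):
    """Return True when any enabled finish criterion is met (assumed logic)."""
    # Total Unique MCP Models
    if max_models is not None and models >= max_models:
        return True
    # Each Ligand in >= Models
    if (min_models_per_ligand is not None
            and min(ligand_counts.values()) >= min_models_per_ligand):
        return True
    # All-Ligand HASL C-V r2 >=
    if target_r2 is not None and best_all_ligand_r2 >= target_r2:
        return True
    return False


# Hypothetical appearance counts for two ligands.
counts = {"LIG01": 20, "LIG02": 19}
still_running = not mcp_finished(24, counts, 0.10, min_models_per_ligand=20)
```

    With the tutorial setting of 20 models per ligand, the run continues until the least-sampled ligand has reached 20 appearances.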

    When this is all set, press OK to initiate the Multi-Conformer Protocol Run.

  9. What to Watch and How to Monitor the Progress of the MCP.
  10. The MCP echoes its progress to the SYBYL text window. This includes the model number being calculated, the ligand(s) being optimized, the cross-validated r2 of complete models, and, when all-ligand HASL models are being calculated, those models and their results. A complete record of each model, including its optimization, is stored in the ".mcp" file, which is automatically created by the protocol and named with the same filename root as the .hsl file. For the current case this would be thrombin28.mcp. More useful for monitoring the MCP is the file $HASL_SCRATCH/hasl_mcp_status.dat, which is updated at the completion of each full model. This file can be examined using standard Unix tools such as cat, more, etc. in a new Unix shell. Following is the status file at the completion of the first full model:

    Complete MCP Models: 1
    LIG01 0 1(0) 2(0) 3(0) 4(0) 5(0) 6(0) 7(0) 8(0) 9(0) 10(0) 11(0)
    LIG02 1 12(0) 13(0) 14(0) 15(0) 16(0) 17(0) 18(0) 19(1) 20(0) 21(0) 22(0)
    LIG03 1 23(0) 24(1) 25(0) 26(0) 27(0) 28(0) 29(0) 30(0) 31(0) 32(0) 33(0)
    LIG04 0 34(0) 35(0) 36(0) 37(0) 38(0) 39(0) 40(0) 41(0) 42(0) 43(0) 44(0)
    LIG05 0 45(0) 46(0) 47(0) 48(0) 49(0) 50(0) 51(0) 52(0) 53(0) 54(0) 55(0)
    LIG06 1 56(0) 57(0) 58(0) 59(0) 60(0) 61(0) 62(0) 63(0) 64(1) 65(0) 66(0)
    LIG07 0 67(0) 68(0) 69(0) 70(0) 71(0) 72(0) 73(0) 74(0) 75(0)
    LIG08 1 76(1) 77(0) 78(0) 79(0) 80(0) 81(0) 82(0) 83(0) 84(0) 85(0) 86(0)
    LIG09 0 87(0) 88(0) 89(0) 90(0) 91(0) 92(0) 93(0) 94(0) 95(0) 96(0) 97(0)
    LIG10 0 98(0) 99(0) 100(0) 101(0) 102(0) 103(0) 104(0) 105(0) 106(0) 107(0) 108(0)
    LIG11 0 109(0) 110(0) 111(0) 112(0) 113(0) 114(0) 115(0) 116(0) 117(0) 118(0) 119(0)
    LIG12 0 120(0) 121(0) 122(0) 123(0) 124(0) 125(0) 126(0) 127(0) 128(0) 129(0)
    LIG13 0 130(0) 131(0) 132(0) 133(0) 134(0) 135(0) 136(0) 137(0) 138(0) 139(0)
    LIG14 0 140(0) 141(0) 142(0) 143(0) 144(0) 145(0) 146(0) 147(0) 148(0) 149(0)
    LIG15 0 150(0) 151(0) 152(0) 153(0) 154(0) 155(0) 156(0) 157(0) 158(0) 159(0)
    LIG16 0 160(0) 161(0) 162(0) 163(0) 164(0) 165(0) 166(0) 167(0) 168(0) 169(0)
    LIG17 0 170(0) 171(0) 172(0) 173(0) 174(0) 175(0) 176(0) 177(0) 178(0) 179(0)
    LIG18 0 180(0) 181(0) 182(0) 183(0) 184(0) 185(0) 186(0) 187(0) 188(0) 189(0)
    LIG19 0 190(0) 191(0) 192(0) 193(0) 194(0) 195(0) 196(0) 197(0) 198(0) 199(0)
    LIG20 0 200(0) 201(0) 202(0) 203(0) 204(0) 205(0) 206(0) 207(0) 208(0) 209(0)
    LIG21 1 210(0) 211(0) 212(0) 213(0) 214(0) 215(0) 216(0) 217(0) 218(0) 219(1)
    LIG22 1 220(0) 221(0) 222(0) 223(1) 224(0) 225(0) 226(0) 227(0) 228(0) 229(0)
    LIG23 0 230(0) 231(0) 232(0) 233(0) 234(0) 235(0) 236(0) 237(0) 238(0) 239(0)
    LIG24 0 240(0) 241(0) 242(0) 243(0) 244(0) 245(0) 246(0) 247(0) 248(0) 249(0)
    LIG25 0 250(0) 251(0) 252(0) 253(0) 254(0) 255(0) 256(0) 257(0) 258(0) 259(0)
    LIG26 1 260(0) 261(0) 262(0) 263(0) 264(0) 265(0) 266(0) 267(0) 268(1) 269(0)
    LIG27 0 270(0) 271(0) 272(0) 273(0) 274(0) 275(0) 276(0) 277(0) 278(0) 279(0)
    LIG28 1 280(0) 281(0) 282(0) 283(0) 284(0) 285(0) 286(0) 287(1) 288(0) 289(0)
    

    The header reports the number of complete models. In each following row, the first column is the ligand name and the second column is the number of times that ligand has appeared in MCP models. Since we are using a group size of eight for this demo, eight ligands have each appeared once. The remaining columns are the MSS row numbers of that ligand's conformers, each followed in parentheses by the number of times that conformer has been used in complete models. Because this count is taken after conformer optimization, it indicates the conformer's success in creating a valid 3D QSAR model.
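    If you prefer to summarize the status file with a script rather than read it by eye, a small Python parser along these lines may be useful. It is a sketch based only on the layout shown above, the function name is hypothetical, and the sample text is abbreviated to a few conformers per ligand.

```python
import re

# A conformer entry looks like 76(1): MSS row number, then uses in parentheses.
ENTRY = re.compile(r"(\d+)\((\d+)\)")


def parse_status(text):
    """Parse status text into (n_models, {ligand: (count, {row: uses})})."""
    n_models, ligands = None, {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Complete MCP Models:"):
            n_models = int(line.split(":")[1])
        elif ENTRY.search(line):
            name, count, *_ = line.split()
            uses = {int(row): int(n) for row, n in ENTRY.findall(line)}
            ligands[name] = (int(count), uses)
    return n_models, ligands


# Abbreviated sample in the layout of hasl_mcp_status.dat.
sample = """Complete MCP Models: 1
LIG07 0 67(0) 68(0) 69(0)
LIG08 1 76(1) 77(0) 78(0)
"""
n_models, ligands = parse_status(sample)
```

    In the listing above, the second column matches the sum of the per-conformer counts (for example, LIG02 has appeared in one model and its conformer in row 19 has been used once), which gives a quick consistency check on a parsed file.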

    This MCP run will actually take 24-48 hours to complete. You can stop it with Ctrl-C or let it complete. A copy of the results file at the completion of the MCP has been included in the demo directory, so you don't need to let the MCP finish before proceeding to the next step. In the event you let it run for a while, following is the status file at the completion of 24 models (and 2 all-ligand HASL models):

    Complete MCP Models: 24
    LIG01 5 1(0) 2(0) 3(2) 4(1) 5(0) 6(0) 7(0) 8(2) 9(0) 10(0) 11(0)
    LIG02 6 12(0) 13(2) 14(0) 15(0) 16(1) 17(1) 18(0) 19(2) 20(0) 21(0) 22(0)
    LIG03 10 23(0) 24(3) 25(1) 26(0) 27(0) 28(1) 29(2) 30(0) 31(2) 32(1) 33(0)
    LIG04 11 34(0) 35(1) 36(2) 37(1) 38(0) 39(0) 40(2) 41(1) 42(1) 43(0) 44(3)
    LIG05 7 45(2) 46(1) 47(1) 48(0) 49(0) 50(1) 51(0) 52(0) 53(0) 54(0) 55(2)
    LIG06 8 56(0) 57(1) 58(1) 59(1) 60(0) 61(2) 62(0) 63(0) 64(1) 65(2) 66(0)
    LIG07 2 67(0) 68(0) 69(0) 70(0) 71(0) 72(0) 73(2) 74(0) 75(0)
    LIG08 9 76(2) 77(0) 78(0) 79(1) 80(0) 81(1) 82(1) 83(1) 84(1) 85(1) 86(1)
    LIG09 4 87(0) 88(0) 89(0) 90(2) 91(0) 92(0) 93(0) 94(0) 95(0) 96(2) 97(0)
    LIG10 12 98(1) 99(2) 100(4) 101(2) 102(1) 103(1) 104(0) 105(0) 106(0) 107(0) 108(1)
    LIG11 9 109(1) 110(1) 111(1) 112(0) 113(2) 114(1) 115(0) 116(1) 117(1) 118(1) 119(0)
    LIG12 6 120(1) 121(0) 122(1) 123(1) 124(0) 125(0) 126(1) 127(2) 128(0) 129(0)
    LIG13 7 130(3) 131(0) 132(0) 133(0) 134(0) 135(0) 136(0) 137(0) 138(1) 139(3)
    LIG14 7 140(0) 141(2) 142(1) 143(0) 144(0) 145(0) 146(2) 147(0) 148(0) 149(2)
    LIG15 6 150(1) 151(0) 152(0) 153(2) 154(1) 155(0) 156(0) 157(2) 158(0) 159(0)
    LIG16 7 160(1) 161(1) 162(0) 163(0) 164(2) 165(0) 166(1) 167(0) 168(0) 169(2)
    LIG17 6 170(2) 171(0) 172(0) 173(1) 174(1) 175(0) 176(1) 177(0) 178(1) 179(0)
    LIG18 8 180(0) 181(0) 182(0) 183(0) 184(1) 185(2) 186(1) 187(1) 188(0) 189(3)
    LIG19 6 190(0) 191(0) 192(0) 193(1) 194(1) 195(0) 196(2) 197(0) 198(2) 199(0)
    LIG20 6 200(2) 201(1) 202(1) 203(1) 204(0) 205(0) 206(0) 207(0) 208(0) 209(1)
    LIG21 9 210(0) 211(1) 212(2) 213(2) 214(0) 215(0) 216(1) 217(0) 218(1) 219(2)
    LIG22 5 220(0) 221(1) 222(0) 223(1) 224(0) 225(0) 226(0) 227(2) 228(1) 229(0)
    LIG23 7 230(1) 231(1) 232(2) 233(0) 234(0) 235(0) 236(1) 237(0) 238(0) 239(2)
    LIG24 6 240(2) 241(2) 242(0) 243(0) 244(0) 245(0) 246(1) 247(0) 248(1) 249(0)
    LIG25 9 250(1) 251(1) 252(0) 253(0) 254(0) 255(1) 256(2) 257(1) 258(3) 259(0)
    LIG26 7 260(1) 261(0) 262(0) 263(0) 264(1) 265(1) 266(0) 267(0) 268(2) 269(2)
    LIG27 2 270(0) 271(0) 272(0) 273(0) 274(0) 275(0) 276(1) 277(0) 278(1) 279(0)
    LIG28 5 280(0) 281(0) 282(1) 283(0) 284(0) 285(0) 286(0) 287(1) 288(3) 289(0)
    
    Now completed 2 All-Ligand HASL C-V Models
    Model:  1  C-V r2:  0.102805
    Model:  2  C-V r2:  0.104778
    

  11. Analyze the Complete MCP Results File to Identify Optimum Conformer Overlap
  12. From the menubar, choose the eslc, HASL, MCP pulldown and select Analyze MCP Results. The HASL MCP Analyze Dialog Box offers a variety of metrics that can be added to the Molecular SpreadSheet as columns to aid in the interpretation of the MCP results. First, verify that the column numbers for the ligand names, conformer id, activity and HASL are correct. The HASL MCP Results File is the name of the output file from the Multi-Conformer Protocol. If you have actually run the entire protocol, leave this value as thrombin28.mcp. If you interrupted the run (as recommended above), enter demo_thrombin28.mcp here. The next option determines whether you want an ASCII formatted file of the results. For the demo, uncheck the Write MCP Analysis option. The next block of options selects which metrics are to be added to the MSS. This calculation is quite fast, so it is easy to experiment with these. For now, check only the Conformation AMD Composite Score and Conformation Rank: AMD Score. Press OK for the analysis to proceed.

    When the analysis calculation is complete, the Molecular SpreadSheet will be updated with the two new selected columns. Now you can re-sort the MSS and key on the Column with Conformation Rank: AMD Score. To do this pull down the View, Sort... command on the MSS Menubar. Select as the Primary key the Conformation Rank: AMD Score column and as the Secondary key the Ligand column. When the table is re-sorted all of the highest scoring (with the AMD score) conformers will be at the top of the table.
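    The two-key sort can be mimicked outside SYBYL with a few lines of Python, for instance on an exported table. The rows and rank values below are hypothetical, and the sketch assumes that rank 1 denotes the best AMD score so that an ascending sort puts the top conformers first.

```python
# Hypothetical rows: ligand name, conformer code and AMD-score rank.
rows = [
    {"ligand": "LIG02", "conformer": "003", "rank": 2},
    {"ligand": "LIG02", "conformer": "008", "rank": 1},
    {"ligand": "LIG01", "conformer": "004", "rank": 2},
    {"ligand": "LIG01", "conformer": "003", "rank": 1},
]

# Primary key: rank (assumed 1 = best); secondary key: ligand name.
sorted_rows = sorted(rows, key=lambda row: (row["rank"], row["ligand"]))
top_conformers = [
    (row["ligand"], row["conformer"]) for row in sorted_rows if row["rank"] == 1
]
```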

    It is now possible to go on and calculate a HASL analysis as described in Lesson 1 or perform other computational experiments on this overlapped set of ligand conformers.

    [Figure: Antithrombin Ligand Conformer Overlap Suggested by the MCP]