Appendix A. HASL Creation and Testing.

An example HASL creation and testing session is presented herein to illustrate the steps in detail. This example uses 15 glutamine synthetase inhibitor structures (see Theory; Phosphorus and Sulfur, 45, 183, 1989) in the Alchemy *.MOL format. The files containing these structures are located in the subdirectory, "DATA," on the HASL program diskette. These structures were superposed in a prior step to maximize steric overlap of the CH2CH2CH(NH2)COOH backbone, and segregated into a 10-member learning set and a 5-member test set:

Learning Set

Molecule	Structure			pKi

GS3	(HO)2POCH2PO(OH)CH2CH2CH(NH2)COOH	3.60
GS4	CH3PO(OH)CH2CH2CH(NH2)COOH		4.14
GS6	(HO)2P(O)CH2COCH2CH2CH(NH2)COOH	3.23
GS7	CH3CH2SO(NH)CH2CH2CH(NH2)COOH		1.66
GS9	HOPH(O)CH2CH2CH(NH2)COOH		2.68
GS10	HOOCCH2PO(OH)CH2CH2CH(NH2)COOH	2.14
GS12	3,4-Cl2PhCH2PO(OH)CH2CH2CH(NH2)COOH	1.81
GS13	3,5-(CH3)2PhCH2PO(OH)CH2CH2CH(NH2)COOH	1.81
GS17	HONHCH2CH2CH(NH2)COOH			4.68
GS19	Tetrazolyl-CH2CH2CH(NH2)COOH		2.68

Test Set

Molecule	Structure			pKi

GS5	CH3SO(NH)CH2CH2CH(NH2)COOH		3.79
GS11	PhCH2PO(OH)CH2CH2CH(NH2)COOH		2.14
GS14	4-BrPhCH2PO(OH)CH2CH2CH(NH2)COOH	1.81
GS20	HOOCC(=CH2)CH2CH(NH2)COOH		2.66
GS27	Tabtoxinine-b-lactam			3.00

The selection of the learning and test set members does not necessarily reflect the best possible modeling paradigm; the sets were chosen at random and serve simply to illustrate HASL model building and testing.

Using the HASL Options Menu

The parameter file, PARAMS.DAT, was edited using the Parameter screen to select the default atom types (as listed), a lattice resolution of 2.5 Angstroms, rotational increment (default value), fitting method (eqn 1), file format (Alchemy MOLFILE), fitting method (fixed), and H-value definition (normal). The options then selected were:

Create HASL and Test Molecule(s)		Both operations in one session

10						10-member learning set

file name (e.g. data\gs3.mol)			File name/activity values are entered
activity (e.g. 3.60)				for all 10 molecules in the learning set
			
new History file				Starting a new HASL model

300 iterations					Arbitrary choice

0.001 error limit				Even if the error limit is not achieved, the 					
						iteration process minimizes the error

5						5-member test set

file name & activity (as above)			Test set data entered for all 5 molecules

At this point the HASL program creates a batch file (SERIES.BAT) to carry out the above instructions, shells out to DOS, launches the batch file, and upon its completion, returns the user to the Options Menu. The commands listed below are those found in the SERIES.BAT batch file:


create<mol1.fit		the first molecule becomes a lattice of points, a HASL
fit<mol2.fit		the second molecule is "fitted" onto the lattice
merge			the first and second molecular lattices are merged into one HASL
fit<mol3.fit		the third molecule is "fitted" onto the merged lattice
merge			the HASL incorporates new points from the third molecule
fit<mol4.fit		...and so on ...
merge
fit<mol5.fit
merge
fit<mol6.fit
merge
fit<mol7.fit
merge
fit<mol8.fit
merge
fit<mol9.fit
merge
fit<mol10.fit
merge
copy history.000 history		after all ten molecular lattices have been merged, the 
fit<mol1.fit			history file is nulled and molecules 1-10 are re-fitted 
fit<mol2.fit			onto the 10-molecule HASL; in this way, the history
fit<mol3.fit			file accurately records the points in the HASL which
fit<mol4.fit			belong to each molecule
fit<mol5.fit
fit<mol6.fit
fit<mol7.fit
fit<mol8.fit
fit<mol9.fit
fit<mol10.fit
iterate				iterations are run according to the itlim file contents &
				the best receptor description is written to RECEPTOR
copy history.000 history		new history file to record test molecule fitting		
fit<mol1.tst			each test molecule is fitted onto the iterated HASL
fit<mol2.tst 			
fit<mol3.tst
fit<mol4.tst
fit<mol5.tst

All the molecular coordinate file names and activity values are stored in either MOL*.FIT (learning set members) or MOL*.TST (test set members) files. Both file sets have the same format, for example, in this case MOL1.FIT contains the following two lines:

data\gs3.mol
3.60000
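The two-line layout shown above can also be produced by a short script instead of by hand. The following is a minimal sketch in Python, not part of the HASL distribution: the function name write_input_file is hypothetical, and writing the activity with five decimal places simply mirrors the "3.60000" in the example (whether the HASL programs require that exact width is an assumption).

```python
def write_input_file(path, mol_file, activity):
    """Write a two-line input file: coordinate file name, then activity."""
    # Five decimals mirrors the "3.60000" shown in the manual's example;
    # that HASL needs exactly this width is an assumption.
    text = f"{mol_file}\n{activity:.5f}\n"
    with open(path, "w") as handle:
        handle.write(text)
    return text


if __name__ == "__main__":
    # First learning set member (GS3, pKi 3.60) written as MOL1.FIT.
    write_input_file("mol1.fit", "data\\gs3.mol", 3.60)
```

The same function serves for the MOL*.TST files, since both file sets share one format.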

When CREATE.EXE is run in batch mode with the command line

create<mol1.fit

under DOS, the command starts CREATE.EXE, which in turn seeks two inputs: a file name and an activity value. The "<" symbol tells CREATE.EXE to read its input from the file mol1.fit. The same logic applies to all the command lines in the batch files created by HASL.
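The effect of the "<" redirection can be illustrated with a small stand-in program. This Python sketch is only an analogy and not the code of CREATE.EXE: the function read_inputs is hypothetical, and it assumes the two inputs arrive one per line, as in the MOL1.FIT example.

```python
import io


def read_inputs(stream):
    """Read a file name and an activity value, one per line, from a stream."""
    mol_file = stream.readline().strip()
    activity = float(stream.readline())
    return mol_file, activity


if __name__ == "__main__":
    # With "program<mol1.fit", the contents of mol1.fit arrive on standard
    # input; an in-memory stream stands in for that redirected input here.
    redirected = io.StringIO("data\\gs3.mol\n3.60000\n")
    print(read_inputs(redirected))
```

The program itself never opens mol1.fit; DOS supplies the file's contents in place of keyboard input.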

As stated earlier, the information in ITERAT.DAT (list of actual and predicted learning set activity values) makes up the Learning Set Report (see Reports), while the HISTORY file contains the information necessary to construct the Test Set Report. Please note that these reports are necessarily volatile (i.e. subject to change depending on any further user-initiated building and testing of the HASL model).

In the present case, if the user wishes to add the 5-member test set to the HASL model (thus creating a 15-molecule HASL model), and no other operations have been performed in the meantime, the user can choose "Create HASL" from the Options Menu, elect to "Add to the Current HASL," and follow the ensuing prompts.

Batch File Operations

As can be readily deduced by examining the nature of the batch files produced by the HASL modeling program, it is possible to avoid using the HASL.EXE module as a "front end," create the appropriate batch files using a text editor, and launch them under DOS. The user can experiment with such an approach keeping in mind the following points:

The HASL suite of programs, namely CREATE, FIT and MERGE, was designed to be called in a logical sequence to give the user the greatest degree of flexibility in designing a modeling paradigm. Thus, the use of batch file mode to explore alternative ways of creating a HASL model is encouraged.
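As a starting point for such experiments, the command sequence listed earlier for SERIES.BAT can be generated by a script rather than typed in a text editor. The Python sketch below is hypothetical (the function build_series is not part of HASL) and reproduces only the commands shown in that listing; it assumes the MOL*.FIT and MOL*.TST input files, and any other files the programs need, such as the itlim file, already exist.

```python
def build_series(n_learn, n_test=0):
    """Return the command lines of a SERIES.BAT-style batch file."""
    if n_learn < 1:
        raise ValueError("at least one learning set molecule is required")
    # The first molecule becomes the initial lattice.
    lines = ["create<mol1.fit"]
    # Each further molecule is fitted onto the lattice and then merged in.
    for i in range(2, n_learn + 1):
        lines += [f"fit<mol{i}.fit", "merge"]
    # Reset the history file and re-fit the whole learning set.
    lines.append("copy history.000 history")
    lines += [f"fit<mol{i}.fit" for i in range(1, n_learn + 1)]
    lines.append("iterate")
    if n_test:
        # Fresh history file to record the fitting of the test molecules.
        lines.append("copy history.000 history")
        lines += [f"fit<mol{i}.tst" for i in range(1, n_test + 1)]
    return lines


if __name__ == "__main__":
    # 10-member learning set and 5-member test set, as in this appendix.
    print("\n".join(build_series(10, 5)))
```

Changing the order or number of molecules passed through the fit and merge steps is then a matter of editing the loop, which makes it easy to try alternative model-building sequences.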