There are three options for the structure input format, as follows:
MDL Information Systems, Inc. SDFile format
SMILES code
Tripos Sybyl MOL2 format
SMILES code was developed by David Weininger (D. Weininger, J. Chem. Inf. Comput. Sci., 28, 31-36, 1988) to provide a string code for the input of molecular structure. The user is referred to this reference and subsequent papers for a description of the SMILES code and of techniques for creating SMILES codes for molecular structures. Essentially, the chemical graph is reduced to a tree (acyclic) graph by removing one bond from each ring; the two atoms that the removed bond joined are labeled with the same number. Branches are enclosed in parentheses. A short description of the SMILES rules is given below. For more information on SMILES see Chapter 3 of the Daylight Theory Manual.
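The ring-breaking rule described above can be illustrated with a small sketch. It is not part of HintLogP, and the function names are hypothetical. It assumes that every ring-closure label is either a single digit or a two-digit %nn label, and that digits inside square brackets (isotopes, charges, hydrogen counts) are not ring labels. Because each broken ring bond leaves the same label on two atoms, half the number of labels gives the number of bonds that were removed.

```python
def ring_closure_labels(smiles):
    """Return the ring-closure labels found outside bracket atoms."""
    labels = []
    in_bracket = False
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif not in_bracket:
            if ch == "%":
                # Two-digit ring label, e.g. %10
                labels.append(smiles[i + 1:i + 3])
                i += 2
            elif ch.isdigit():
                labels.append(ch)
        i += 1
    return labels


def count_broken_ring_bonds(smiles):
    """Each broken ring bond contributes one label on each of its two atoms."""
    return len(ring_closure_labels(smiles)) // 2


if __name__ == "__main__":
    # Cyclopentylcyclohexane, one of the demo2.smi records: two rings.
    print(count_broken_ring_bonds("C1(CCCC1)C2CCCCC2"))
```

This only counts labels; it does not validate the SMILES string or check that labels are correctly paired.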
The SGI, SUN, and LINUX versions of HintLogP can also read and decode SMILES files with the Daylight Toolkit SMILES interpreter instead of the OELIB SMILES interpreter. This optional feature requires a run-time SMILES Toolkit license from Daylight Chemical Information Systems, Inc. The file format is the same for both interpreters. Each record of a SMILES file, which is generally named with the .smi extension, is simply a SMILES string followed by the molecule name, separated by a space. There is no file termination code. This file format matches what the Daylight database software supports, so it will be a useful option for sites that already have large databases encoded in this way. The other potential advantage is that the Daylight Toolkit is the de facto standard for interpreting SMILES codes, which could be a consideration for those who plan to work with large, complex SMILES libraries on UNIX computers.
EduSoft includes two demo SMILES files: demo1.smi contains a single molecule, benzene, and demo2.smi is a database file that contains 100 structures. Note that the molecule name/identifier, in this case the CAS Registry Number, is at the end of each line.
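As an illustration of the record layout described above (a SMILES string, a space, then the molecule name), here is a minimal sketch of how such a file could be read. It is not HintLogP code: the function names are hypothetical, and it assumes that the first whitespace-delimited token is the SMILES string and that everything after it is the name.

```python
def parse_smi_record(line):
    """Split one .smi record into (smiles, name); return None for a blank line."""
    parts = line.strip().split(None, 1)
    if not parts:
        return None
    smiles = parts[0]
    name = parts[1].strip() if len(parts) > 1 else ""
    return smiles, name


def parse_smi_text(text):
    """Parse every non-blank record; there is no file termination code to look for."""
    records = []
    for line in text.splitlines():
        record = parse_smi_record(line)
        if record is not None:
            records.append(record)
    return records


if __name__ == "__main__":
    sample = "C1(CCCC1)C2CCCCC2 1606-08-2\nC1(C(CCC1)C)C2CCCCC2 5405-90-3\n"
    for smiles, name in parse_smi_text(sample):
        print(name, smiles)
```

Splitting only on the first run of whitespace keeps names that themselves contain spaces intact.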
The File demo2.smi as Supplied With All Versions of HintLogP Software:
OC4(C3(C(C2C(C1(CCC(=O)C=C1CC2)C)C(C3)O)CC4)C)C(=O)COC(=O)C 50-03-3
BrC43C(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC3Br)(CCC(C4)Cl)C 5337-45-1
O(C5C(C4C(C3C(C1(C(C2C(CC1)(CCC2C(=C)C)C)CC3)C)(CC4)C)(CC5)C)(C)C)C(=O)c6ccccc6 1617-69-2
OC1C(C4C(CC1)(C3=C(C2(C(C(CC2)C(CCCC(C)C)C)(CC3=O)C)C)C(=O)C4)C)(C)C 5346-40-7
OC4CC3C(C2C(C1C(C(CC1)C(O)C)(CC2)C)CC3)(CC4)C 80-92-2
OC4C(C3C(C2C(C1(C(C(CC1)C(CCC(=O)O)C)(CC2=O)C)C)CC3)(CC4)C)(C)C 5346-42-9
OC4C(C3C(C2C(C1(C(C(CC1)C(CCC(=O)O)C)(CC2=O)C)C)C(=O)C3)(CC4)C)(C)C 5399-41-7
O2C1(OCC(CC1)C)C(C3C2CC5C3(CCC6C4(C(CC(CC4)O)CCC65)C)C)C 470-03-1
O2C1(OCC(CC1)C)C(C3C2CC5C3(CCC6C4(C(CC(CC4)O)CCC65)C)C)C 470-01-9
O2C1(OCC(CC1)C)C(C3C2CC5C3(CCC6C4(C(CC(CC4)O)CCC65)C)C)C 126-19-2
O2C1(OCC(CC1)C)C(C3C2CC5C3(CCC6C4(C(CC(CC4)O)CCC65)C)C)C 126-18-1
O2C1(OCC(CC1)C)C(C3C2CC5C3(CCC6C4(C(CC(CC4)O)CCC65)C)C)C 77-60-1
OC4CCC3(C2C(C1C(C(CC1)C(=O)C)(CC2)C)CC=C3C4)C 145-13-1
OC4CCC3(C2C(C1C(C(CC1)C(=O)C)(CC2)C)CC=C3C4)C 566-63-2
ClC4CCC3(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC=C3C4)C 910-31-6
O4C(=O)C3C51C(CCC(C1)OC(=O)C)(C2=CCC6(C(C2(C3C4=O)C=C5)CCC6C(=O)C)C)C 25495-42-5
OC2C3C(C1C(C(CC1)C(=O)C)(C2)C)CCC4=CC(=O)CCC43C 600-57-7
OC2C3C(C1C(C(CC1)C(=O)C)(C2)C)CCC4=CC(=O)CCC43C 80-75-1
OC4C3(C(C2C(C1CCC(=O)C=C1CC2)CC3)CC4)C 434-22-0
O(C)C(=O)C=C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)C(=O)C3)CC4)C 1474-15-3
OC1C3C(C2C(C1)(C(=CCOC(=O)C)CC2)C)CCC4=CC(=O)CCC43C 5327-59-3
O(C4C1(C(C3C(CC1)c2c(cc(cc2)O)CC3)CC4)C)C(=O)CCC5CCCC5 313-06-4
OC(=O)C(C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C 5327-60-6
OC4(C3(C(C2CCC1=CC(=O)CCC1(C2=CC3)C)CC4)C)C 1039-17-4
OC4(C3(C(C2C(C1(CCC(=O)C=C1CC2)C)C(C3)O)CC4)C)C 1807-02-9
OC4(C3(C(C2C(C1(CCC(=O)C=C1CC2)C)C(C3)O)CC4)C)C 1043-10-3
O1C32C1CC5(C(C2CCC4=CC(=O)CCC43C)CCC5(O)C)C 1042-33-7
O1C(C(C(C(C1CO)O)O)O)OC2C(OC(CC2OC)OC7CC6(C(C5C(C3(C(C(CC3)C4=CC(=O)OC4)(CC5)C)O)CC6)(CC7)C=O)O)C 560-53-2
OC5C(C4C(C3C(C1(C(C2C(CC1)(CCC2C(=C)C)CO)CC3)C)(CC4)C)(CC5)C)(C)C 473-98-3
BrC(=C(C)C)CCC(C1C4(C(CC1)(C3=C(C2(C(C(C(CC2)O)(C)C)CC3)C)CC4)C)C)C 50719-45-4
OC4CC3C(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC3)(CC4)C 17608-41-2
OC4CC3C(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC3)(CC4)C 516-92-7
OC4CC3C(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC3)(CC4)C 80-97-7
OC4CC3C(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC3)(CC4)C 360-68-9
C1(CCCC1)C2CCCCC2 1606-08-2
C43(C(C2C(C1(CCC=CC1=CC2)C)CC3)CCC4C(CCCC(C)C)C)C 747-90-0
C1(C(CCC1)C)C2CCCCC2 5405-90-3
S(=O)(=O)(NC(=O)CCC(C4C3(C(C2C(C1(C(CC(CC1)O)CC2)C)CC3O)CC4)C)C)c5ccc(cc5)N 5407-24-9
OC3C4(C(C1C(C2(C(CC1O)CC(CC2)O)C)C3)CCC4C(CCC(=O)O)C)C 81-25-4
N(C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C(=O)C 1865-62-9
OC4(C3(C(C2C(C1(C(=CC(=O)C=C1)CC2)C)C(=O)C3)CC4C)C)C(=O)CO 1247-42-3
[N+](=O)([O-])c1c(ccc(c1)[N+](=O)[O-])NN=C5CCC4(C3C(C2C(C(CC2)O)(CC3)C)CCC4=C5)C 2347-93-5
N6C1C(OC5(C1C)CCC4C3C(C2(CCC(CC2=CC3)O)C)C(=O)C4=C5C)CC(C6)C 469-59-0
O1C(C(C(C(C1C)O)O)O)OC6CCC5(C4C(C2(C(C(CC2)C3=COC(=O)C=C3)(CC4)C)O)CCC5=C6)C 466-06-8
O1C(CC(C(C1C)O)OC)OC6CC5(C(C4C(C2(C(C(CC2)C3=CC(=O)OC3)(CC4)C)O)CC5)(CC6)C=O)O 508-77-0
O1C(C(C(C(C1CO)O)O)O)OC6CCC5(C4C(C2(C(C(CC2)C3=COC(=O)C=C3)(CC4)C)O)(CC(C5=C6)OC(=O)C)O)C 507-60-8
N81C(C(C7(C(C1)C6(C(C5C2(OC3(C(C2(CCC3OC(=O)c4cc(c(cc4)OC)OC)C)CC5)O)C6)(CC7O)O)O)O)(O)C)CCC(C8)C 71-62-5
O1C(C(C(C(C1CO)O)O)O)OC2C(C(OC(C2O)C)OC7CCC6(C5C(C3(C(C(CC3)C4=COC(=O)C=C4)(CC5)C)O)CCC6=C7)C)O 124-99-2
N71C(C(C6C(C1)C5C(C4C2(OC3(C(C2(CCC3OC(=O)C(O)(CC)C)C)C(C4OC(=O)C)OC(=O)C)O)C5)(C(C6O)OC(=O)C(CC)C)O)(O)C)CCC(C7)C 143-57-7
N71C(C(C6C(C1)C5C(C4C2(OC3(C(C2(CCC3OC(=O)C(O)(C(O)C)C)C)C(C4OC(=O)C)OC(=O)C)O)C5)(C(C6O)OC(=O)C(CC)C)O)(O)C)CCC(C7)C 124-97-0
O1C(CC(C(C1C)OC2OC(C(C(C2)O)O)C)O)OC3C(OC(CC3O)OC8CC7C(C6C(C4(C(C(CC4)C5=CC(=O)OC5)(CC6)C)O)CC7)(CC8)C)C 71-63-6
O1C(C(C(C(C1COC2OC(C(C(C2O)O)O)CO)O)O)O)OC3C(OC(CC3OC)OC8CC7(C(C6C(C4(C(C(CC4)C5=CC(=O)OC5)(CC6)C)O)CC7)(CC8)C=O)O)C 33279-57-1
O1C(C(C(C(C1CO)O)O)O)OC2C(OC(CC2OC(=O)C)OC3C(OC(CC3O)OC4C(OC(CC4O)OC9CC8C(C7C(C5(C(C(CC5)C6=CC(=O)OC6)(CC7)C)O)CC8)(CC9)C)C)C)C 17575-20-1
O1C(C(C(C(C1CO)O)O)O)OC2C(OC(CC2OC(=O)C)OC3C(OC(CC3O)OC4C(OC(CC4O)OC9CC8C(C5C(C6(C(C(C5)O)(C(CC6)C7=CC(=O)OC7)C)O)CC8)(CC9)C)C)C)C 17575-22-3
O1C=C(C=CC1=O)C5C4(C(C3(C(C2(CCC(C=C2C(C3)OC(=O)C)O)C)CC4)O)(CC5)O)C 507-59-5
O1C(C(C(C(C1CO)O)O)O)OC2C(OC(CC2OC(=O)C)OC3C(OC(CC3O)OC4C(OC(CC4O)OC9CC8C(C7C(C5(C(C(C(C5)O)C6=CC(=O)OC6)(CC7)C)O)CC8)(CC9)C)C)C)C 17575-21-2
OC4CCC3(C2C(C1C(C(CC1)C(C)C=CC(C(C)C)CC)(CC2)C)CC=C3C4)C 83-48-7
OC4CCC3(C2C(C1C(C(CC1)C(CCC(C(C)C)CC)C)(CC2)C)CC=C3C4)C 83-46-5
OC4(C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C=C 1235-98-9
O(C4CC3C(C2C(C1C(C(CC1)(O)C(=O)C)(CC2)C)CC3)(CC4)C)C(=O)C 5456-44-0
O(C4CCC3(C2C(C1C(C(CC1)C(C)C=CC(C(C)C)CC)(CC2)C)CC=C3C4)C)C(=O)C 4651-48-3
O(C4CC3C(C2C(C1C(C(CC1)C(C)C=CC(C(C)C)CC)(CC2)C)CC3)(CC4)C)C(=O)C 13010-52-1
O(C6CC5C(C4C(C1C(C(CC1)C(CC=C(c2ccccc2)c3ccccc3)C)(CC4)C)CC5)(CC6)C)C(=O)C 4144-29-0
OC4CC3C(C2C(C1C(C(CC1)C(CCC(=O)OC)C)(CC2)C)CC3)(CC4)C 15074-01-8
OC4CC3C(C2C(C1C(C(CC1)C(CCC(=O)OC)C)(CC2)C)CC3)(CC4)C 1249-75-8
OC4(C3(C(C1C(C2(C(CC1)CC(=O)CC2)C)C(=O)C3)CC4)C)C(=O)COC(=O)C 3751-02-8
OC4(C3(C(C1C(C2(C(CC1)CC(=O)CC2)C)C(=O)C3)CC4)C)C(=O)COC(=O)C 1499-59-8
OC4(C3(C(C2C(C1(CCC(=O)C=C1CC2)C)C(=O)C3)CC4)C)C(=O)COC(=O)CCCCCCCCCCCCCCCCC 5432-63-3
OC(=O)CCC(C4C3(C(C1C(C2(C(CC1=O)CC(=O)CC2)C)CC3=O)CC4)C)C 81-23-2
OC3C4(C(C2C(C1(C(CC(CC1)O)CC2)C)C3)CCC4C(CCC(=O)O)C)C 30635-00-8
OC3C4(C(C2C(C1(C(CC(CC1)O)CC2)C)C3)CCC4C(CCC(=O)O)C)C 83-44-3
OC4CCC3(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC=C3C4)C 57-88-5
O(C4CCC3(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC=C3C4)C)C(=O)C 604-35-3
ClC43C2C5C1OC1C(C2C(C3(Cl)Cl)(C(=C4Cl)Cl)Cl)C5 128-10-9
ClC43C2C5C1OC1C(C2C(C3(Cl)Cl)(C(=C4Cl)Cl)Cl)C5 60-57-1
ClC43C2C5C1OC1C(C2C(C3(Cl)Cl)(C(=C4Cl)Cl)Cl)C5 72-20-8
OC4(C1(C(C3C(C(C1)O)C2(C(=CC(=O)C=C2)CC3)C)CC4)C)C(=O)CO 50-24-8
OC1C3C(C2C(C1)(C(=CCO)CC2)C)CCC4=CC(=O)CCC43C 3103-13-7
BrC4CC3(C1C(C2C(CC1=O)(C(=CC(=O)OC)CC2)C)CCC3=CC4=O)C 5415-46-3
BrC4CC3(C1C(C2C(CC1O)(C(=CC(=O)OC)CC2)C)CCC3=CC4=O)C 5415-47-4
O=C4CCC3(C2C(C1C(C(CC1)C(C)C=O)(CC2)C)CCC3=C4)C 66289-21-2
O=C4CCC3(C2C(C1C(C(CC1)C(C)C=O)(CC2)C)CCC3=C4)C 3986-89-8
S1C5(NC(C1)C(=O)O)CC4C(C3C(C2C(C(CC2)C(=O)C)(CC3=O)C)CC4)(CC5)C 6293-78-3
OC4(C1(C(C3C(C(C1)O)C2(C(=CC(=O)C=C2)CC3)C)CC4)C)C(=O)COC(=O)CCC(=O)O 1715-33-9
OC4(C3(C(C2C(C1(CCC(=O)C=C1CC2)C)C(C3)O)CC4)C)C(=O)COC(=O)CCC(=O)O 125-04-2
O(C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C(=O)CCC5CCCC5 58-20-8
N1(CCCCC1)C=C(C5C4(C(C3C(C2(CCC(=O)C=C2CC3)C)CC4)CC5)C)C 24377-48-8
O(C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C(=O)CC 58769-88-3
O(C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C(=O)CC 57-85-2
S=P(OC4CCC3(C2C(C1C(C(CC1)C(CCCC(C)C)C)(CC2)C)CC=C3C4)C)(OCC)OCC 24352-66-7
N%10C9(OC8C(C7(C(C6C(C1(C(CC(CC1)OC2OC(C(C(C2O)O)OC3OC(C(C(C3OC4OC(C(C(C4O)O)O)CO)OC5OCC(C(C5O)O)O)O)CO)CO)CC6)C)CC7)C8)C)C9C)CCC(C%10)C 17406-45-0
O=C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C 18485-76-2
O=C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C 63-05-8
OC4(C3(C(C2C(C1CCC(=O)C=C1CC2)CC3)CC4)C)C#C 68-22-4
OC4(C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C#C 434-03-7
OC5C1(C(C4C(CC1)c2c(cc(cc2)OC(=O)c3ccccc3)CC4)CC5)C 50-50-0
O(CC(=O)C4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C)C(=O)C 56-47-3
Oc1cc4c(cc1)C2C(C3C(CC2)(C(=O)CC3)C)CC4 53-16-7
OC4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C 571-41-5
OC4C3(C(C2C(C1(CCC(=O)C=C1CC2)C)CC3)CC4)C 86335-11-7
EduSoft includes a demo MOL2 database file that contains 10 structures.
The First Two Structures in the demo4.mol2 File Supplied With HintLogP Software:
# Name: B_ESTRADIOL
# Creating user name: gkellogg
# Creation time: Tue Nov 9 13:17:55 1993
# Modifying user name: gkellogg
# Modification time: Tue Nov 9 13:27:21 1993

@<TRIPOS>MOLECULE
B_ESTRADIOL
44 47 1 0 0
SMALL
USER_CHARGES
INVALID_CHARGES

@<TRIPOS>ATOM
 1 C0    3.5269   1.2282  -0.5910 C.3  1 MOL1 -0.3000
 2 C1    2.0219   0.8438  -0.1562 C.3  1 MOL1  0.0000
 3 C2    1.0537   1.8981  -0.7458 C.3  1 MOL1 -0.2000
 4 C3    0.0299   1.4727  -1.7954 C.3  1 MOL1 -0.2000
 5 C4   -0.6674   0.2085  -1.3476 C.3  1 MOL1 -0.1000
 6 C5    0.3651  -0.9761  -1.1904 C.3  1 MOL1 -0.1000
 7 C6   -0.3887  -2.2712  -0.7075 C.3  1 MOL1 -0.2000
 8 C7   -1.9470  -2.2001  -0.8458 C.3  1 MOL1 -0.2000
 9 C8   -2.5481  -0.9510  -0.2037 C.ar 1 MOL1  0.0000
10 C9   -3.6511  -1.0182   0.6047 C.ar 1 MOL1 -0.1000
11 C10  -4.1448   0.1316   1.2039 C.ar 1 MOL1  0.0300
12 O11  -5.2368   0.0644   2.0110 O.3  1 MOL1 -0.3800
13 C12  -3.5084   1.3467   0.9895 C.ar 1 MOL1 -0.1000
14 C13  -2.3852   1.4139   0.1732 C.ar 1 MOL1 -0.1000
15 C14  -1.8960   0.2944  -0.4257 C.ar 1 MOL1  0.0000
16 C15   1.7503  -0.6620  -0.6221 C.3  1 MOL1 -0.1000
17 C16   2.3448  -1.4886   0.6115 C.3  1 MOL1 -0.2000
18 C17   1.8447  -0.7429   1.8521 C.3  1 MOL1 -0.2000
19 C18   1.9854   0.7486   1.4492 C.3  1 MOL1 -0.0700
20 O19   3.0303   1.3907   2.2055 O.3  1 MOL1 -0.3800
21 H21   3.7606   2.2203  -0.2175 H    1 MOL1  0.1000
22 H22   3.6066   1.2177  -1.6741 H    1 MOL1  0.1000
23 H23   4.2684   0.5450  -0.1910 H    1 MOL1  0.1000
24 H24   0.4635   2.4607   0.0315 H    1 MOL1  0.1000
25 H25   1.5259   2.7971  -1.2309 H    1 MOL1  0.1000
26 H26   0.4426   1.4496  -2.8310 H    1 MOL1  0.1000
27 H27  -0.6977   2.3024  -1.9616 H    1 MOL1  0.1000
28 H28  -1.2830   0.0097  -2.2984 H    1 MOL1  0.1000
29 H29   0.5285  -1.2724  -2.2903 H    1 MOL1  0.1000
30 H30  -0.0612  -3.1901  -1.1806 H    1 MOL1  0.1000
31 H31  -0.2539  -2.4288   0.3800 H    1 MOL1  0.1000
32 H32  -2.3687  -3.0871  -0.3978 H    1 MOL1  0.1000
33 H33  -2.1839  -2.1834  -1.9121 H    1 MOL1  0.1000
34 H34  -4.1287  -1.9704   0.7788 H    1 MOL1  0.1000
35 H35  -5.5691   0.8589   2.4352 H    1 MOL1  0.3500
36 H36  -3.8869   2.2397   1.4650 H    1 MOL1  0.1000
37 H37  -1.9092   2.3662   0.0343 H    1 MOL1  0.1000
38 H38   2.5108  -1.0804  -1.3610 H    1 MOL1  0.1000
39 H39   2.0677  -2.5309   0.5916 H    1 MOL1  0.1000
40 H40   3.4400  -1.5052   0.6368 H    1 MOL1  0.1000
41 H41   0.8164  -1.0696   2.0500 H    1 MOL1  0.1000
42 H42   2.4339  -1.0074   2.7284 H    1 MOL1  0.1000
43 H43   1.0919   1.2266   1.8949 H    1 MOL1  0.1000
44 H44   3.8454   0.8990   2.0823 H    1 MOL1  0.3500
@<TRIPOS>BOND
 1  1  2 1
 2  1 21 1
 3  1 22 1
 4  1 23 1
 5  2  3 1
 6  2 16 1
 7  2 19 1
 8  3  4 1
 9  3 24 1
10  3 25 1
11  4  5 1
12  4 26 1
13  4 27 1
14  5  6 1
15  5 15 1
16  5 28 1
17  6  7 1
18  6 16 1
19  6 29 1
20  7  8 1
21  7 30 1
22  7 31 1
23  8  9 1
24  8 32 1
25  8 33 1
26  9 10 ar
27  9 15 ar
28 10 11 ar
29 10 34 1
30 11 12 1
31 11 13 ar
32 12 35 1
33 13 14 ar
34 13 36 1
35 14 15 ar
36 14 37 1
37 16 17 1
38 16 38 1
39 17 18 1
40 17 39 1
41 17 40 1
42 18 19 1
43 18 41 1
44 18 42 1
45 19 20 1
46 19 43 1
47 20 44 1
@<TRIPOS>SUBSTRUCTURE
1 MOL1 1 TEMP 0 **** **** 0 ROOT

# Name: ACETYLCHOLINE
# Creating user name: gkellogg
# Creation time: Tue Nov 9 13:28:28 1993
# Modifying user name: gkellogg
# Modification time: Tue Nov 9 13:30:31 1993

@<TRIPOS>MOLECULE
ACETYLCHOLINE
26 25 1 0 0
SMALL
USER_CHARGES
INVALID_CHARGES

@<TRIPOS>ATOM
 1 C0   -4.3973  -0.2463   0.5026 C.3  1 MOL1 -0.3000
 2 C1   -2.9663  -0.5262   0.1201 C.2  1 MOL1  0.4100
 3 O2   -2.4730  -1.6120   0.3866 O.2  1 MOL1 -0.3800
 4 O3   -2.2053   0.4016  -0.5122 O.3  1 MOL1 -0.1800
 5 C4   -0.8689   0.1813  -0.8888 C.3  1 MOL1 -0.0500
 6 C5    0.1357   0.3156   0.2722 C.3  1 MOL1  0.2200
 7 N6    1.5948   0.1395   0.0157 N.4  1 MOL1 -0.6800
 8 C7    2.3386   0.3304   1.3574 C.3  1 MOL1  0.1200
 9 C8    2.1496   1.1730  -0.9461 C.3  1 MOL1  0.1200
10 C9    1.9383  -1.2514  -0.4837 C.3  1 MOL1  0.1200
11 H11  -4.6967   0.7455   0.1659 H    1 MOL1  0.1000
12 H12  -5.0472  -0.9927   0.0417 H    1 MOL1  0.1000
13 H13  -4.5015  -0.3041   1.5875 H    1 MOL1  0.1000
14 H14  -0.5718   0.9172  -1.6399 H    1 MOL1  0.1000
15 H15  -0.7202  -0.8022  -1.3248 H    1 MOL1  0.1000
16 H16  -0.2368  -0.3946   1.0400 H    1 MOL1  0.1000
17 H17  -0.0872   1.3068   0.7218 H    1 MOL1  0.1000
18 H18   2.1823   1.3190   1.8019 H    1 MOL1  0.1000
19 H19   3.4141   0.2178   1.2571 H    1 MOL1  0.1000
20 H20   2.0340  -0.3870   2.1268 H    1 MOL1  0.1000
21 H21   1.7384   1.0660  -1.9552 H    1 MOL1  0.1000
22 H22   1.9316   2.2021  -0.6474 H    1 MOL1  0.1000
23 H23   3.2344   1.1254  -1.0896 H    1 MOL1  0.1000
24 H24   1.5616  -2.0481   0.1641 H    1 MOL1  0.1000
25 H25   3.0109  -1.4412  -0.5993 H    1 MOL1  0.1000
26 H26   1.5188  -1.4541  -1.4741 H    1 MOL1  0.1000
@<TRIPOS>BOND
 1  1  2 1
 2  1 11 1
 3  1 12 1
 4  1 13 1
 5  2  3 2
 6  2  4 1
 7  4  5 1
 8  5  6 1
 9  5 14 1
10  5 15 1
11  6  7 1
12  6 16 1
13  6 17 1
14  7  8 1
15  7  9 1
16  7 10 1
17  8 18 1
18  8 19 1
19  8 20 1
20  9 21 1
21  9 22 1
22  9 23 1
23 10 24 1
24 10 25 1
25 10 26 1
@<TRIPOS>SUBSTRUCTURE
1 MOL1 1 TEMP 0 **** **** 0 ROOT
Sybyl is developed and distributed by Tripos, Inc., St. Louis, MO.
The Structure Data file (SDFile) format is described in detail in A. Dalby, J. G. Nourse, et al., J. Chem. Inf. Comput. Sci., 32, 244-255 (1992), or online at www.mdl.com/downloads/ctfile/ctfile_subs.html.
The SDFile format produced by MDL software is easy to use. The user first produces the desired molecule files (Molfiles) with MDL software in the usual manner. These Molfiles are incorporated into the input file, each followed by whatever data lines the user wants. The last record of each Molfile, which separates it from the data records, contains 'M  END'. See the example below and the reference given above. The information for each molecule is terminated by a blank record followed by a record containing $$$$. The whole SDFile is terminated with a blank record.
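The record layout just described can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not the reader used by HintLogP, and the function names are hypothetical. It assumes that a line consisting only of $$$$ ends each molecule, and that the Molfile part ends at the line whose whitespace-separated tokens are M and END.

```python
def split_sdfile(text):
    """Split SDFile text into per-molecule lists of lines, using $$$$ as the delimiter."""
    records = []
    current = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    # Whatever follows the last $$$$ (the terminating blank record) is ignored.
    return records


def split_molfile_and_data(record):
    """Return (molfile_lines, data_lines); the Molfile part ends at the 'M  END' line."""
    for index, line in enumerate(record):
        if line.split() == ["M", "END"]:
            return record[:index + 1], record[index + 1:]
    # No 'M  END' line found: treat the whole record as the Molfile.
    return record, []


if __name__ == "__main__":
    sample = (
        "mol A\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
        "> 25\n249.2\n\n$$$$\n\n"
    )
    for record in split_sdfile(sample):
        molfile, data = split_molfile_and_data(record)
        print(molfile[0], len(molfile), data)
```

Comparing tokens rather than the raw string tolerates files in which the two spaces of 'M  END' have been collapsed to one.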
For example, HintLogP includes two SDF files: demo3.sdf, which contains 50 anonymous structures, and demo5.sdf, which contains 12 common structures.
The First Two Structures in the demo5.sdf file Supplied with HintLogP Software:
Benzoic Acid
  ChemDraw02260010222D

  9  9  0  0  0  0  0  0  0  0999 V2000
   -2.4700    1.3000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7200   -0.0025    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7800   -1.0625    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2200   -0.0025    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.5300   -1.3000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0300   -1.3000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.7825   -0.0025    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0300    1.2975    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.5300    1.2975    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  2  4  1  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  7  1  0  0  0  0
  7  8  2  0  0  0  0
  8  9  1  0  0  0  0
  4  9  2  0  0  0  0
M  END
> 25
249.2

> 25
122.4

> 25
Benzenecarboxylic acid

> 25
3-04-00

$$$$
m-methylbenzoic acid
  ChemDraw02250014002D

 10 10  0  0  0  0  0  0  0  0999 V2000
   -2.4700    1.9500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7200    0.6475    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7800   -0.4125    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2200    0.6475    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.5300   -0.6500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0300   -0.6500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.7825    0.6475    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0300    1.9475    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.5300    1.9475    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.7825   -1.9525    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  2  4  1  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  7  1  0  0  0  0
  7  8  2  0  0  0  0
  8  9  1  0  0  0  0
  4  9  2  0  0  0  0
  6 10  1  0  0  0  0
M  END
> 25
263

> 25
111-113

> 25
m-Toluic acid

> 25
3-04-00

$$$$
(Note the blank record that terminates the file.)
MDL Information Systems, Inc., San Leandro, CA