University of Tuebingen Lehrstuhl Kognitive Systeme, Prof Dr. Zell

Chapter 3. Molecule operation methods and classes

Molecule

Molecule data entries

Data entries.

Table 3-1. Predefined data types/names

Data type                            Data name         Allowed occurrence
JOEDataType.JOE_UNDEFINED_DATA       Undefined         multiple
JOEDataType.JOE_VIRTUAL_BOND_DATA    VirtualBondData   multiple
JOEDataType.JOE_ROTAMER_LIST         RotamerList       multiple
JOEDataType.JOE_EXTERNAL_BOND_DATA   ExternalBondData  multiple
JOEDataType.JOE_COMPRESS_DATA        CompressData      multiple
JOEDataType.JOE_COMMENT_DATA         Comment           multiple
JOEDataType.JOE_ENERGY_DATA          EnergyData        multiple
JOEDataType.JOE_PAIR_DATA            PairData          single attribute name

Molecule descriptors

Set and get descriptor data entries

The typical data type to store descriptors is the JOEDataType.JOE_PAIR_DATA.

Every descriptor can be accessed by its name. To make this access efficient, the descriptor data entries are stored in a dictionary. Therefore a descriptor can occur only once in a molecule.
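The one-entry-per-name restriction follows directly from the dictionary storage. A minimal sketch of the idea, using a plain java.util.Map instead of JOELib's own classes: DescriptorStore, its methods and the descriptor name "LogP" are hypothetical, and the sketch assumes that a later entry replaces an earlier one with the same name, which the text above does not specify.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the per-molecule descriptor dictionary;
// this is not a JOELib class.
class DescriptorStore {
    private final Map<String, String> entries = new LinkedHashMap<>();

    // Assumption: adding a name that already exists replaces the old value.
    void add(String name, String value) {
        entries.put(name, value);
    }

    // Returns the value stored under the name, or null if it is absent.
    String get(String name) {
        return entries.get(name);
    }

    int size() {
        return entries.size();
    }
}

class DescriptorStoreDemo {
    public static void main(String[] args) {
        DescriptorStore store = new DescriptorStore();
        store.add("LogP", "2.13");
        store.add("LogP", "2.40"); // same name: the store still holds one entry
        System.out.println(store.size() + " entry, LogP = " + store.get("LogP"));
    }
}
```

Lookup by name is a single map access, which is the efficiency argument made above.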

Example 3-1. Getting descriptor data entries

// get an iterator over all data elements,
// including SSSR information and other entries
GenericDataIterator  gdit  = mol.genericDataIterator();
// variables reused in the loop (types assumed from the calls below)
JOEGenericData       genericData;
JOEPairData          pairData;
while ( gdit.hasNext() )
{
  // get the next data element
  genericData = gdit.nextGenericData();
  // use only the data elements which contain descriptors
  // or user-defined data
  if ( genericData.getDataType() == JOEDataType.JOE_PAIR_DATA )
  {
    // write this descriptor data as a typical data block
    // to an SD file
    ps.printf( ">  <%s>", genericData.getAttribute() );
    pairData = ( JOEPairData ) genericData;
    // write the data in SD format: lines are no longer than
    // 80 characters, and empty lines in data entries are replaced
    // with ? or a character of your choice
    ps.println( pairData.toString( IOTypeHolder.instance().getIOType( "SDF" ) ) );
  }
}

Example 3-2. Setting descriptor data entries

// add a user-defined data entry to the molecule
// (attribute and dataEntry are assumed to be defined by the caller)
JOEPairData   dp = new JOEPairData();
// the data entry has the name 'attribute'
dp.setAttribute( attribute );
// and a typical String value;
// custom value types must provide fromString and toString methods
dp.setValue( dataEntry.toString() );
mol.addData( dp );

Using externally calculated descriptors

A big advantage is that you can use descriptors calculated by other programs. If JOELib has no calculation routine for them, all unknown descriptors (e.g. additional data elements in SDF files) are handled as Strings. If you know the data type, you can simply define your own data parser/writer. All known descriptors can be defined in joelib/data/plain/knownResults.txt. If you access data elements with mol.getData("DataName"), the data element is parsed automatically if its data type is known (e.g. atom or bond properties, matrices, and so on).

You can suppress data parsing by using mol.getData("DataName", false). This is useful if you do not want to modify all data elements, and it should also be faster.
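The difference between parsed and raw access can be illustrated with a small stand-alone sketch. It does not use JOELib: LazyDataStore, registerParser and addRaw are hypothetical names, and the caching of the parsed value is an assumption about how parse-on-access might be implemented, not a description of JOELib's internals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of parse-on-access; not a JOELib class.
class LazyDataStore {
    // entries start out as raw Strings
    private final Map<String, Object> data = new HashMap<>();
    // parsers for the data names whose type is known
    private final Map<String, Function<String, Object>> parsers = new HashMap<>();

    void registerParser(String name, Function<String, Object> parser) {
        parsers.put(name, parser);
    }

    void addRaw(String name, String text) {
        data.put(name, text);
    }

    // parse == true: convert a raw String if a parser is known for the name.
    // parse == false: return whatever is currently stored, untouched.
    Object getData(String name, boolean parse) {
        Object value = data.get(name);
        Function<String, Object> parser = parsers.get(name);
        if (parse && value instanceof String && parser != null) {
            value = parser.apply((String) value);
            data.put(name, value); // assumption: the parsed value is cached
        }
        return value;
    }

    Object getData(String name) {
        return getData(name, true);
    }
}

class LazyDataStoreDemo {
    public static void main(String[] args) {
        LazyDataStore store = new LazyDataStore();
        store.registerParser("Weight", Double::valueOf);
        store.addRaw("Weight", "18.015");
        store.addRaw("Note", "free text");
        System.out.println(store.getData("Weight", false).getClass().getSimpleName());
        System.out.println(store.getData("Weight").getClass().getSimpleName());
        System.out.println(store.getData("Note").getClass().getSimpleName());
    }
}
```

In this sketch an entry without a registered parser stays a String, which mirrors how unknown descriptors are described above.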

If you have special atom or bond properties, you should always implement joelib.molecule.types.AtomProperties or joelib.molecule.types.BondProperties. This guarantees that the data elements can be accessed by the atom index or bond index used in JOELib. All implemented result classes are available in joelib/desc/result; they cover simple types like int or double as well as complex types like double arrays or int matrices. If you want to use these data types in different file formats, add the required handling to the fromString(IOType ioType, String sValue) and toString(IOType ioType) methods.
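A format-dependent fromString/toString pair for a result type might look like the following stand-alone sketch. It deliberately avoids JOELib's IOType class: the TextLayout enum, the DoubleArrayResult class and both text layouts are hypothetical and only illustrate the idea of one result object that can be read and written in more than one representation.

```java
// Hypothetical layout selector; JOELib uses its own IOType class instead.
enum TextLayout { ONE_PER_LINE, SEMICOLON_SEPARATED }

// Hypothetical result class holding a double array; not a JOELib class.
class DoubleArrayResult {
    private double[] values = new double[0];

    // Parse the text according to the requested layout.
    void fromString(TextLayout layout, String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            values = new double[0];
            return;
        }
        // "\\R" matches any line break; ";" splits the single-line layout.
        String regex = (layout == TextLayout.ONE_PER_LINE) ? "\\R" : ";";
        String[] tokens = trimmed.split(regex);
        double[] parsed = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            parsed[i] = Double.parseDouble(tokens[i].trim());
        }
        values = parsed;
    }

    // Write the values in the requested layout.
    String toString(TextLayout layout) {
        String separator = (layout == TextLayout.ONE_PER_LINE) ? "\n" : ";";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    double[] getValues() {
        return values.clone();
    }
}

class DoubleArrayResultDemo {
    public static void main(String[] args) {
        DoubleArrayResult result = new DoubleArrayResult();
        result.fromString(TextLayout.SEMICOLON_SEPARATED, "1.5; 2.0;3.25");
        // convert the same data to the other layout
        System.out.println(result.toString(TextLayout.ONE_PER_LINE));
    }
}
```

Keeping both directions in the result class means that a value read from one file format can be written to another without any separate conversion code.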

Available descriptors

Descriptor calculation example: joelib.test.TestDescriptor

Writing your own descriptor and result classes

All new descriptors should implement the joelib.desc.Descriptor interface and be defined in the joelib.properties file.

A simple example is the Kier descriptor joelib.desc.types.KierShape1. If you have a group of similar descriptors which use the same initialization and result class, you can write a wrapper class like joelib.desc.SMARTSCounter. It makes it very easy to create many SMARTS pattern count descriptors, e.g. joelib.desc.types.HBD1, which counts the number of hydrogen donors in a molecule.
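The wrapper idea, one base class that holds the shared initialization and result type while each concrete descriptor only supplies its pattern, can be sketched without JOELib as follows. Everything here is hypothetical: TokenCounter, ChlorineCount and BromineCount are illustration names, and literal substring counting on a SMILES string is only a crude stand-in for real SMARTS matching.

```java
// Hypothetical shared base class: common initialization and an int result.
// Substring counting is a stand-in for SMARTS matching, not a replacement.
abstract class TokenCounter {
    private final String name;
    private final String token;

    protected TokenCounter(String name, String token) {
        this.name = name;
        this.token = token;
    }

    String getName() {
        return name;
    }

    // Count non-overlapping occurrences of the token in the SMILES string.
    int calculate(String smiles) {
        int count = 0;
        int from = 0;
        while ((from = smiles.indexOf(token, from)) >= 0) {
            count++;
            from += token.length();
        }
        return count;
    }
}

// Each concrete descriptor only supplies its name and pattern.
class ChlorineCount extends TokenCounter {
    ChlorineCount() {
        super("ChlorineCount", "Cl");
    }
}

class BromineCount extends TokenCounter {
    BromineCount() {
        super("BromineCount", "Br");
    }
}

class TokenCounterDemo {
    public static void main(String[] args) {
        TokenCounter[] descriptors = { new ChlorineCount(), new BromineCount() };
        for (TokenCounter d : descriptors) {
            System.out.println(d.getName() + " = " + d.calculate("ClC(Br)Cl"));
        }
    }
}
```

Adding another count descriptor in this sketch takes a three-line subclass, which is the benefit the wrapper approach is meant to provide.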

To remain user and developer friendly you should always produce a simple set of documentation files (XML, HTML, RTF) in the docs-directory:

The easiest way is to create an XML DocBook documentation file in the docbook/descriptors directory. These files can easily be transformed to HTML, RTF and PDF files. If you want to use formatting in these descriptor documentation files, you must use <sect1>...</sect1> or <sect2>...</sect2>, because the <chapter> elements are already used by the tutorial book. Furthermore, you can use listitems, tables or analogous elements. All these single descriptor documentation files are generated by the Ant makefile mechanism (calling ant tutorial) and are then available as HTML and RTF files in the docs/tutorial/descriptors/documentation directory.


Last changes: 08.12.2010, 16:49 CET (UTC/GMT +1 hour) wegner.
http://www.ra.cs.uni-tuebingen.de/software/joelib/tutorial/functionalities/molecule.html
2003 University of Tübingen, Germany