Welcome to the homepage of the Symbolic Modeler (SyMod) method and open-source software developed and distributed by the Computational Genetics Laboratory (CGL) at Dartmouth Medical School with funding from NIH R01 AI59694 and R01 LM009012.

An alpha version of SyMod is available for testing by request from:


The goal of SyMod is to use machine learning to develop symbolic models of the relationship between one or more discrete and/or continuous attributes (i.e. independent variables) and a discrete or continuous endpoint (i.e. dependent variable). SyMod was developed as a data mining alternative to parametric statistical methods such as linear regression and logistic regression that are based on the generalized linear model. Unlike linear models, SyMod allows the user to specify a set of mathematical functions and operators that are then used to build predictive models using stochastic search algorithms such as genetic programming.
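
The idea of searching over a user-specified function set can be illustrated with a short Python sketch. This is not SyMod's code: the function set, the tuple-based tree representation, and every name below (FUNCTIONS, random_tree, mutate, search) are assumptions made for illustration. The search is a simplified, mutation-only evolutionary loop, not full genetic programming (there is no crossover and no control of tree growth).

```python
import operator
import random

# Hypothetical user-specified function set: name -> (callable, arity).
FUNCTIONS = {
    "add": (operator.add, 2),
    "sub": (operator.sub, 2),
    "mul": (operator.mul, 2),
}

def random_tree(n_attrs, depth, rng):
    """Grow a random expression tree over attributes x0..x(n_attrs-1)."""
    if depth == 0 or rng.random() < 0.3:
        return ("x", rng.randrange(n_attrs))  # leaf: one attribute
    name = rng.choice(sorted(FUNCTIONS))
    arity = FUNCTIONS[name][1]
    children = tuple(random_tree(n_attrs, depth - 1, rng) for _ in range(arity))
    return (name,) + children

def evaluate(tree, row):
    """Evaluate an expression tree on one row of attribute values."""
    if tree[0] == "x":
        return row[tree[1]]
    func, _ = FUNCTIONS[tree[0]]
    return func(*(evaluate(child, row) for child in tree[1:]))

def mutate(tree, n_attrs, rng):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if tree[0] == "x" or rng.random() < 0.3:
        return random_tree(n_attrs, 2, rng)
    children = list(tree[1:])
    i = rng.randrange(len(children))
    children[i] = mutate(children[i], n_attrs, rng)
    return (tree[0],) + tuple(children)

def mse(tree, rows, targets):
    """Mean squared error of the tree's predictions (lower is better)."""
    return sum((evaluate(tree, r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def search(rows, targets, generations=50, pop_size=50, seed=0):
    """Keep the better half of the population, refill it with mutants."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])
    population = [random_tree(n_attrs, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: mse(t, rows, targets))
        survivors = population[: pop_size // 2]
        mutants = [mutate(rng.choice(survivors), n_attrs, rng)
                   for _ in range(pop_size - len(survivors))]
        population = survivors + mutants
    return min(population, key=lambda t: mse(t, rows, targets))

# Toy continuous endpoint generated from two attributes: y = x0 + x1 * x1.
rows = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
targets = [a + b * b for a, b in rows]
best = search(rows, targets)
print(best, mse(best, rows, targets))
```

Because the better half of the population is carried over unchanged, the best error never gets worse from one generation to the next. A stochastic search of this kind is not guaranteed to recover the generating expression.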

The current version of SyMod uses Symbolic Discriminant Analysis (SDA) to model discrete (i.e. binary) endpoints and Symbolic Regression to model continuous endpoints. See the recent paper by Moore et al. in Human Heredity for an up-to-date description of SDA. More information about Symbolic Regression can be found in the books by Dr. John Koza.
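
For a binary endpoint, a symbolic expression can be used as a discriminant score that is turned into a class call by a cut point. The Python sketch below shows one way to score a fixed candidate expression by its misclassification rate. The thresholding rule (try the midpoints between neighboring scores and keep the one with the fewest errors) and the names best_threshold, interaction and single are assumptions for illustration, not necessarily the procedure used by SDA or SyMod.

```python
def best_threshold(scores, labels):
    """Return (threshold, error_rate) with the fewest misclassifications
    when class 1 is predicted for score > threshold."""
    distinct = sorted(set(scores))
    # Candidate cut points: below every score, between neighbors, above every score.
    candidates = (
        [distinct[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
        + [distinct[-1] + 1.0]
    )
    best = None
    for t in candidates:
        errors = sum((s > t) != bool(y) for s, y in zip(scores, labels))
        rate = errors / len(scores)
        if best is None or rate < best[1]:
            best = (t, rate)
    return best

# Toy data: two attributes coded -1/+1; the endpoint is 1 only when they agree.
rows = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [1, 0, 0, 1]

# Two hypothetical candidate expressions: a product of both attributes,
# and the first attribute on its own.
interaction = [x0 * x1 for x0, x1 in rows]
single = [x0 for x0, _ in rows]

print(best_threshold(interaction, labels))
print(best_threshold(single, labels))
```

On this toy data the product expression separates the two classes with an error rate of 0.0, while the single attribute cannot do better than 0.5.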

A major strength of SyMod is the ability to use domain-specific knowledge to help guide the search for an optimal model. This is particularly important when the individual attributes or features do not have independent effects on the endpoint.
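
One way such knowledge could enter a stochastic search is by biasing which attributes are drawn when candidate models are built. The Python sketch below assumes the knowledge is expressed as a non-negative weight per attribute. The weights, the attribute names, and the function pick_attribute are hypothetical, and the page does not say that SyMod works this way.

```python
import random

def pick_attribute(weights, rng):
    """Draw one attribute name with probability proportional to its weight."""
    names = sorted(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical prior weights, e.g. from earlier biological evidence.
# A weight of 0.0 excludes the attribute from the search entirely.
prior = {"snp1": 5.0, "snp2": 5.0, "snp3": 1.0, "snp4": 0.0}

rng = random.Random(1)
draws = [pick_attribute(prior, rng) for _ in range(1000)]
counts = {name: draws.count(name) for name in sorted(prior)}
print(counts)
```

Attributes with larger weights appear in more candidate models, so combinations of them are explored more often even when no single attribute shows an effect on its own.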

For more information about SyMod, please see the Computational Genetics Laboratory website at www.epistasis.org and visit the Epistasis Blog at compgen.blogspot.com.

This page was last modified by JHM on February 25, 2007.