In this paper we discuss the problem of predicting the toxicity of chemical compounds to fish and show how it can be approached with a computational intelligence method. There are two views on assessing toxicity: one holds that such properties can be derived from the whole molecular structure, the other that specific functional substructures, called Structural Alerts (SAs), can explain the toxicity. In this work, we propose a new Structure-Activity Relationship (SAR) approach that mines molecular fragments acting as SAs for the biological activity. We apply our data mining method, called SARpy, to a dataset of LC50 values for the fathead minnow and build a multiclass classifier over the categories defined by legislation. We test the model on an external test set of trout toxicity data. The new model shows notable predictive performance and, more interestingly, is based on mined structural alerts. Discovering new knowledge about substructures that are statistically strongly associated with toxicity opens the way to other future in silico methods. The model is freely available in the VEGA hub (https://www.vegahub.eu), among other models for aquatic toxicity.