Barry Husowitz and Reinaldo Sanchez-Arias
A support vector classification wrapper feature elimination approach was used to find the most relevant pairs of molecular features that can adequately and accurately predict acute aquatic toxicity. These pairs were then used to derive chemical thresholds, or boundaries between the chemical properties of toxic and nontoxic organic chemicals, that can serve as a “rule of thumb” for designing less toxic chemicals. The most relevant pairs were determined to be: the energy of the Lowest Unoccupied Molecular Orbital (LUMO) and aqueous solubility (QPlogS); the difference between the LUMO and HOMO energies (dE) and the octanol-water partition coefficient (QPlogo.w); and dE and the van der Waals surface area of polar nitrogen and oxygen atoms (PSA). Projected hyperplanes were constructed for each pair, and the following thresholds were found: for LUMO and QPlogS they roughly correspond to QPlogS > -1 and LUMO > 1, and for QPlogo.w vs. dE they roughly correspond to QPlogo.w < 1 and dE > 9. This study shows how a statistical approach such as support vector machines can be applied to the rational design of chemicals with reduced toxicity.
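The reported thresholds can be read as a simple screening rule. The following is a minimal sketch of that reading, not the authors' code: it assumes that the stated inequalities describe the lower-toxicity side of each boundary and that both inequalities of a pair must hold together. The class `Descriptors`, the function names, and the example values are hypothetical and serve only as illustration; the thresholds themselves are the approximate ones given above.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for the descriptors named in the abstract."""
    lumo: float      # LUMO energy
    de: float        # LUMO-HOMO difference (dE)
    qplogs: float    # predicted aqueous solubility (QPlogS)
    qplogow: float   # predicted octanol-water partition coefficient (QPlogo.w)


def passes_lumo_solubility_rule(d: Descriptors) -> bool:
    # Approximate thresholds from the abstract: QPlogS > -1 and LUMO > 1.
    # Assumption: both conditions must hold at the same time.
    return d.qplogs > -1 and d.lumo > 1


def passes_gap_partition_rule(d: Descriptors) -> bool:
    # Approximate thresholds from the abstract: QPlogo.w < 1 and dE > 9.
    # Assumption: both conditions must hold at the same time.
    return d.qplogow < 1 and d.de > 9


def screen(d: Descriptors) -> dict:
    """Report which rule-of-thumb regions a chemical falls into."""
    return {
        "LUMO/QPlogS": passes_lumo_solubility_rule(d),
        "dE/QPlogo.w": passes_gap_partition_rule(d),
    }


if __name__ == "__main__":
    # Made-up descriptor values, for illustration only.
    candidate = Descriptors(lumo=1.5, de=9.8, qplogs=-0.5, qplogow=0.2)
    print(screen(candidate))
```

Because the thresholds are described as rough projections of the separating hyperplanes, a screen of this kind is best treated as a first-pass filter rather than as a replacement for the fitted classifier.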
Journal of Biometrics & Biostatistics received 3496 citations as per Google Scholar report