Machine learning classification can reduce false positives in structure-based virtual screening (Paper)

2020, Proceedings of the National Academy of Sciences. Citations: 220
Cell Image Analysis Techniques; Genetics, Bioinformatics, and Biomedical Research; Computational Drug Discovery Methods

Abstract

Significance: Many potential drug targets have been identified, but development of chemical probes to validate these targets has lagged behind. Computational screening holds promise for providing chemical tools to do so but has long been plagued by high false-positive rates: Many compounds ranked highly against a given target protein do not actually show activity. Machine learning approaches have not solved this problem, which we hypothesize is because models were not trained on sufficiently compelling “decoys.” By addressing this through a unique training strategy, we show that more effective virtual screening is attainable. We expect this insight to enable improved performance across diverse virtual screening pipelines, thus helping to provide chemical probes for new potential drug targets as they are discovered.