Graph-based molecular Pareto optimisation

dc.date.accessioned	2023-03-16T16:11:11Z
dc.date.available	2023-03-16T16:11:11Z
dc.date.created	2022-06-10T13:41:47Z
dc.date.issued	2022
dc.identifier.citation	Verhellen, Jonas. Graph-based molecular Pareto optimisation. Chemical Science. 2022, 13(25), 7526-7535
dc.identifier.uri	http://hdl.handle.net/10852/101521
dc.description.abstract	Computer-assisted design of small molecules has experienced a resurgence in academic and industrial interest due to the widespread use of data-driven techniques such as deep generative models. While the ability to generate molecules that fulfil required chemical properties is encouraging, the use of deep learning models requires significant, if not prohibitive, amounts of data and computational power. At the same time, open-sourcing of more traditional techniques such as graph-based genetic algorithms for molecular optimisation [Jensen, Chem. Sci., 2019, 12, 3567–3572] has shown that simple and training-free algorithms can be efficient and robust alternatives. Further research alleviated the common genetic algorithm issue of evolutionary stagnation by enforcing molecular diversity during optimisation [Van den Abeele, Chem. Sci., 2020, 42, 11485–11491]. The crucial lesson distilled from the simultaneous development of deep generative models and advanced genetic algorithms has been the importance of chemical space exploration [Aspuru-Guzik, Chem. Sci., 2021, 12, 7079–7090]. For single-objective optimisation problems, chemical space exploration had to be discovered as a useable resource, but in multi-objective optimisation problems, an exploration of trade-offs between conflicting objectives is inherently present. In this paper we provide state-of-the-art and open-source implementations of two generations of graph-based non-dominated sorting genetic algorithms (NSGA-II, NSGA-III) for molecular multi-objective optimisation. We provide the results of a series of benchmarks for the inverse design of small molecule drugs for both the NSGA-II and NSGA-III algorithms. In addition, we introduce the dominated hypervolume and extended fingerprint-based internal similarity as novel metrics for these benchmarks. By design, NSGA-II and NSGA-III outperform a single optimisation method baseline in terms of dominated hypervolume, but, remarkably, our results show they do so without relying on a greater internal chemical diversity.
dc.language	EN
dc.rights	Attribution 3.0 Unported
dc.rights.uri	https://creativecommons.org/licenses/by/3.0/
dc.title	Graph-based molecular Pareto optimisation
dc.title.alternative	Graph-based molecular Pareto optimisation
dc.type	Journal article
dc.creator.author	Verhellen, Jonas
cristin.unitcode	185,15,29,0
cristin.unitname	Institutt for biovitenskap
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	2
dc.identifier.cristin	2030838
dc.identifier.bibliographiccitation	info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Chemical Science&rft.volume=13&rft.spage=7526&rft.date=2022
dc.identifier.jtitle	Chemical Science
dc.identifier.volume	13
dc.identifier.issue	25
dc.identifier.startpage	7526
dc.identifier.endpage	7535
dc.identifier.doi	https://doi.org/10.1039/d2sc00821a
dc.type.document	Journal article
dc.type.peerreviewed	Peer reviewed
dc.source.issn	2041-6520
dc.type.version	PublishedVersion