The design of new molecules has countless applications across industrial sectors, including pharmaceuticals and materials. However, finding molecules with the desired properties is difficult, because it requires locating specific elements within a vast and structurally complex chemical space.
Mathematically, this task can be likened to a combinatorial optimization problem, often stochastic and multi-objective, with black-box objective functions and constraints. To find approximate solutions to this problem, two interrelated steps are necessary:
Developing predictive models that can forecast the properties of interest from the chemical structure of the molecules.
Creating algorithms for the automatic generation of molecules (de novo generation) that satisfy specific structural constraints and optimize the properties predicted in the first step.
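As a rough sketch, the two steps above can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the course material: \mathcal{M} denotes the chemical space, f_1, ..., f_k the black-box properties of interest, \hat{f}_i the predictive models from the first step, \mathcal{D} the available dataset, \ell a loss function, and \mathcal{C} the set of molecules that satisfy the structural constraints.

```latex
% Step 1 (assumed notation): fit a surrogate \hat{f}_i for each property f_i
% from a dataset \mathcal{D} = \{(m_n, y_n)\}_{n=1}^{N} of molecules and measurements.
\hat{f}_i \in \arg\min_{f \in \mathcal{F}} \; \sum_{n=1}^{N} \ell\bigl(f(m_n),\, y_{n,i}\bigr),
\qquad i = 1, \dots, k

% Step 2 (assumed notation): search the feasible part of chemical space for
% molecules that optimize the predicted properties, in the Pareto sense when k > 1.
\max_{m \in \mathcal{C} \subseteq \mathcal{M}} \; \bigl(\hat{f}_1(m), \dots, \hat{f}_k(m)\bigr)
```

Because \mathcal{M} is discrete and very large, and the true f_i can only be evaluated point by point, the second problem is generally solved approximately rather than exactly.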
In this course, we will explore various machine learning strategies that can be utilized to effectively navigate the chemical space.
During the first session, we will cover how to fit predictive models that forecast the properties of molecules from their structure. We will give special consideration to the small data regime, which is a major obstacle in this field.
The session will be structured as follows:
In the second session, building on the concepts presented previously, we will introduce models that perform de novo molecular design. We will use pre-existing data to design novel molecules that differ from those present in our database. In most cases, this design phase aims to produce new compounds that optimize one or more desired target properties. We will use the different molecular representations throughout the session wherever necessary.
The session will be structured as follows: