Hello,
Hi, welcome to the REINVENT community and many thanks for your interest in the software. MatchingSubstructure is implemented as a penalty (currently the only one), which means that the final score is multiplied by 1.0 if the generated compound matches the SMARTS pattern (a single string) or by 0.5 if it does not. So it seems that this is what you are looking for. Many thanks, Hannes.
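To make the multiplication concrete, here is a minimal sketch of the penalty behaviour described above. It is not REINVENT code: `apply_substructure_penalty` and `contains_carboxylic_acid` are hypothetical names, and the matcher is a naive string check standing in for real SMARTS matching.

```python
from typing import Callable


def apply_substructure_penalty(
    score: float,
    smiles: str,
    matches: Callable[[str], bool],
    penalty: float = 0.5,
) -> float:
    """Multiply the final score by 1.0 on a match, otherwise by `penalty`.

    `matches` is a hypothetical callable that decides whether the molecule
    contains the desired pattern; it is an assumption of this sketch.
    """
    factor = 1.0 if matches(smiles) else penalty
    return score * factor


def contains_carboxylic_acid(smiles: str) -> bool:
    # Toy stand-in only: a plain substring test, NOT real SMARTS matching.
    return "C(=O)O" in smiles


matched = apply_substructure_penalty(0.8, "CC(=O)O", contains_carboxylic_acid)
unmatched = apply_substructure_penalty(0.8, "CCO", contains_carboxylic_acid)
print(matched, unmatched)
```

In practice the matcher would be backed by a cheminformatics toolkit, for example RDKit's `HasSubstructMatch` on a pattern built with `Chem.MolFromSmarts`.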
Yes, penalties are applied last in the current implementation. What you are really asking for is a filter that runs first and ensures that further scoring only happens for molecules that pass it.

There are several things to consider here. It takes a while until the agent learns to generate molecules with the preferred pattern, and compounds not matching the SMARTS pattern will still be scored, with their score multiplied by 0.5. So there will be some sampling inefficiency in the RL run. You can easily check that by running only the penalty (probably together with some other scoring function to create sensible molecules). Also, a matching substructure alone will not guarantee a sensible docked structure.

A major reason why we have staged learning (a.k.a. curriculum learning) is to start out with scoring components that are cheap to compute and then phase in computationally more demanding components at a later stage. The idea is to invoke e.g. docking only once the agent is expected to generate molecules with the preferred properties with high probability.

What you really seem to be asking for is a fixed sub-pattern in the molecule. It may therefore be possible to use our constrained (conditioned) priors Linkinvent and/or Libinvent. The more "natural" one would seem to be Linkinvent, as the idea there is to link two fragments with a common scaffold/linker (think PROTACs). But the current implementation requires exactly two fragments; we will support, I believe, 1-4 fragments only with a new prior to be published later. The idea of Libinvent is to decorate a scaffold with R-groups. Here we already support 1-4 attachments. I am a bit doubtful whether this can work for you because the training data for Libinvent may not support the type of molecules you are after. But it may be worth a try to see if you can push an RL agent into the chemical space you are looking for.
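The difference between a penalty applied last and a filter applied first can be sketched as follows. This is an illustration under stated assumptions, not the REINVENT implementation: all names are hypothetical, the "expensive" component is a toy function standing in for something like docking, the matcher is a plain substring test rather than SMARTS matching, and giving filtered-out molecules a score of 0.0 is an assumption of this sketch.

```python
from typing import Callable, List


class CallCounter:
    """Wrap a scoring function and count how often it is invoked."""

    def __init__(self, fn: Callable[[str], float]) -> None:
        self.fn = fn
        self.calls = 0

    def __call__(self, smiles: str) -> float:
        self.calls += 1
        return self.fn(smiles)


def toy_expensive_score(smiles: str) -> float:
    # Hypothetical stand-in for a costly component such as docking.
    return min(len(smiles) / 10, 1.0)


def matches_pattern(smiles: str) -> bool:
    # Toy substring check, NOT real SMARTS matching.
    return "C(=O)O" in smiles


def score_with_penalty(batch: List[str], expensive, penalty: float = 0.5) -> List[float]:
    # Penalty applied last: every molecule is scored, then multiplied.
    return [expensive(s) * (1.0 if matches_pattern(s) else penalty) for s in batch]


def score_with_filter(batch: List[str], expensive) -> List[float]:
    # Filter applied first: only matching molecules reach the expensive step.
    # Assumption: molecules that fail the filter simply get 0.0.
    return [expensive(s) if matches_pattern(s) else 0.0 for s in batch]


batch = ["CC(=O)O", "CCO", "c1ccccc1C(=O)O", "CCN"]

penalty_scorer = CallCounter(toy_expensive_score)
penalty_scores = score_with_penalty(batch, penalty_scorer)

filter_scorer = CallCounter(toy_expensive_score)
filter_scores = score_with_filter(batch, filter_scorer)

print("penalty:", penalty_scores, "expensive calls:", penalty_scorer.calls)
print("filter: ", filter_scores, "expensive calls:", filter_scorer.calls)
```

With the penalty, the expensive component runs on all four molecules; with the filter, it runs only on the two that match. That gap is the sampling inefficiency mentioned above, and it shrinks as the agent learns to produce the pattern.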