Handling duplicated SMILES with Libinvent #166

Answered by halx
marco-chimfarm asked this question in Q&A
The idea behind zero-scoring duplicate SMILES is to promote a level of diversity. We have run internal tests with this switched off and found that it did not seem to bring any benefit to the learning rate, but it did lower diversity. You also need to keep in mind that the final aggregated total score is a single float value, calculated as the average of the individual SMILES scores, so the effect may be rather minimal in practice. If you start sampling an excessive number of duplicates, you are running out of chemical space anyway and should probably stop RL.
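To make the arithmetic concrete, here is a minimal sketch of how zero-scoring duplicates feeds into an averaged batch score. It is not REINVENT's actual code: the function name `zero_duplicate_scores` is hypothetical, duplicates are detected by plain string equality (a real pipeline would compare canonical SMILES), and keeping the score of the first occurrence is an assumption.

```python
from statistics import mean


def zero_duplicate_scores(smiles, scores):
    """Hypothetical helper: keep the score of the first occurrence of each
    SMILES string and set the score of every later repeat to 0.0.

    Assumption: duplicates are matched by exact string equality; a real
    workflow would canonicalize the SMILES first.
    """
    seen = set()
    adjusted = []
    for smi, score in zip(smiles, scores):
        if smi in seen:
            adjusted.append(0.0)  # repeat within the batch: zero-scored
        else:
            seen.add(smi)
            adjusted.append(score)  # first occurrence: score kept
    return adjusted


# Illustrative batch with made-up scores; "CCO" is sampled twice.
batch = ["CCO", "c1ccccc1", "CCO", "CCN"]
raw_scores = [0.8, 0.6, 0.8, 0.4]

adjusted_scores = zero_duplicate_scores(batch, raw_scores)

# The aggregated total score is the average over the individual scores.
print("mean without zeroing:", mean(raw_scores))
print("mean with zeroing:   ", mean(adjusted_scores))
```

With one duplicate in a batch of four the average shifts noticeably, but in a realistically sized batch with only a few repeats the change to the mean would be much smaller, which is the point made above.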

Replies: 4 comments 6 replies

halx Feb 24, 2025
Maintainer

Answer selected by halx
halx Feb 27, 2025
Maintainer

Category
Q&A
2 participants