
Best practices for constructing, preparing, and evaluating protein-ligand binding affinity benchmarks

Abstract

Free energy calculations are rapidly becoming indispensable in structure-enabled drug discovery programs. As new methods, force fields, and implementations are developed, assessing their expected accuracy on real-world systems (benchmarking) becomes critical to provide users with an assessment of the accuracy expected when these methods are applied within their domain of applicability, and developers with a way to assess the expected impact of new methodologies. These assessments require construction of a benchmark—a set of well-prepared, high quality systems with corresponding experimental measurements designed to ensure the resulting calculations provide a realistic assessment of expected performance when these methods are deployed within their domains of applicability. To date, the community has not yet adopted a common standardized benchmark, and existing benchmark reports suffer from a myriad of issues, including poor data quality, limited statistical power, and statistically deficient analyses, all of which can conspire to produce benchmarks that are poorly predictive of real-world performance. Here, we address these issues by presenting guidelines for

  1. curating experimental data to develop meaningful benchmark sets,
  2. preparing benchmark inputs according to best practices to facilitate widespread adoption, and
  3. analyzing the resulting predictions to enable statistically meaningful comparisons among methods and force fields.

We highlight challenges and open questions that remain to be solved in these areas, as well as recommendations for the collection of new datasets that might optimally serve to measure progress as methods become systematically more reliable. Finally, we provide a curated, versioned, open, standardized benchmark set adherent to these standards (protein-ligand-benchmark) and an open source toolkit for implementing standardized best practices assessments (arsenic) for the community to use as a standardized assessment tool. While our main focus is free energy methods based on molecular simulations, these guidelines should prove useful for assessment of the rapidly growing field of machine learning methods for affinity prediction as well.
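As a rough illustration of what the third guideline (statistically meaningful analysis of predictions) involves, the sketch below computes bootstrapped RMSE and Kendall tau with 95% confidence intervals for a hypothetical set of predicted versus experimental binding free energies. This is a minimal example using only NumPy and SciPy; it is not the arsenic API, and all names and numbers in it are illustrative assumptions.

```python
# Minimal illustrative sketch (not the arsenic API): bootstrapped error
# statistics for predicted vs. experimental binding free energies.
import numpy as np
from scipy.stats import kendalltau


def bootstrap_metrics(dg_exp, dg_calc, n_boot=1000, seed=0):
    """Return RMSE and Kendall tau with bootstrapped 95% confidence intervals."""
    rng = np.random.default_rng(seed)
    dg_exp, dg_calc = np.asarray(dg_exp), np.asarray(dg_calc)
    n = len(dg_exp)
    rmse_samples, tau_samples = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample ligands with replacement
        rmse_samples.append(np.sqrt(np.mean((dg_calc[idx] - dg_exp[idx]) ** 2)))
        tau, _ = kendalltau(dg_exp[idx], dg_calc[idx])
        tau_samples.append(tau)

    def summarize(samples, point_estimate):
        # nanpercentile guards against degenerate resamples where tau is undefined
        low, high = np.nanpercentile(samples, [2.5, 97.5])
        return {"value": point_estimate, "ci_95": (low, high)}

    tau_point, _ = kendalltau(dg_exp, dg_calc)
    return {
        "RMSE": summarize(rmse_samples, np.sqrt(np.mean((dg_calc - dg_exp) ** 2))),
        "Kendall tau": summarize(tau_samples, tau_point),
    }


# Hypothetical experimental and calculated binding free energies (kcal/mol).
dg_exp = [-9.1, -8.4, -10.2, -7.9, -8.8, -9.6]
dg_calc = [-8.7, -8.9, -9.8, -7.5, -9.2, -9.1]
print(bootstrap_metrics(dg_exp, dg_calc))
```

Reporting confidence intervals alongside point estimates, as sketched here, is what allows comparisons among methods and force fields to be statistically meaningful rather than anecdotal.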

List of Authors

  • DH: David Hahn
  • CIB: Christopher I. Bayly
  • HBM: Hannah Bruce Macdonald
  • JDC: John D. Chodera
  • VG: Vytautas Gapsys
  • ASJSM: Antonia S. J. S. Mey
  • DLM: David L. Mobley
  • LPB: Laura Perez Benito
  • CS: Christina E.M. Schindler
  • GT: Gary Tresadern
  • GW: Gregory L. Warren

List of Contributors

(non-author list of people who contributed to the document)

Paper Writing as Code Development

This paper is being developed as a living document, open to changes from the community. You can read more about the concept of writing a paper in the same way one would write software code in the essay "Paper writing as code development". If you have comments or suggestions, we welcome them! Please submit them as issues to this GitHub repository so they can be recorded and credited as contributions. Specific changes can be proposed via pull requests.

Online Resources

Original brainstorming document: https://docs.google.com/document/d/1lCGcol6jYLQmcfqrUv9h_FsWygTZzqYxqgjOLCyMoL4/edit

Design Notes

  • Figure labels are in capital boldface.
  • The standard font for all figures should be Helvetica. The font size for labels should be at least the size of the document text.

List of Released Versions

  • Revision for version 1.0
