This repository has been archived by the owner on Jan 26, 2024. It is now read-only.

how to handle missing values #16

Open
cbaakman opened this issue Nov 29, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@cbaakman
Collaborator

Currently reported types of missing values:

  • atom types that are not in the forcefield, in which case there are no charge values or van der Waals parameters
  • conservation scores that are NaN because the protein wasn't aligned at that particular position

Possible ways to fix this:

  • solution 1: skip PDB entries with missing values
  • solution 2: remove the residues/atoms with missing values from the PDB file after it has been loaded
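
To make the two options concrete, here is a minimal sketch of both. It is not the project's actual code: the `Atom` and `Residue` classes, their fields, and the function names are hypothetical stand-ins for whatever the loader produces. It assumes a missing forcefield parameter shows up as `None` and a missing conservation score as NaN.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Atom:
    name: str
    charge: Optional[float]       # None if the atom type is not in the forcefield
    vanderwaals: Optional[float]  # None if the atom type is not in the forcefield


@dataclass
class Residue:
    number: int
    atoms: List[Atom]
    conservation: float  # NaN if the protein wasn't aligned at this position


def residue_has_missing(residue: Residue) -> bool:
    """True if the residue lacks a conservation score or any atom parameter."""
    if math.isnan(residue.conservation):
        return True
    return any(a.charge is None or a.vanderwaals is None for a in residue.atoms)


def entry_is_complete(residues: List[Residue]) -> bool:
    """Solution 1: the caller skips the whole PDB entry when this is False."""
    return not any(residue_has_missing(r) for r in residues)


def drop_incomplete_residues(residues: List[Residue]) -> List[Residue]:
    """Solution 2: keep only the residues without missing values."""
    return [r for r in residues if not residue_has_missing(r)]
```

Solution 1 loses whole entries but keeps every remaining structure intact. Solution 2 keeps more entries but leaves gaps in the structure, which may matter for features that depend on neighbouring residues.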

@cbaakman cbaakman added the enhancement New feature or request label Nov 29, 2021
@cbaakman cbaakman self-assigned this Nov 29, 2021
@cbaakman
Collaborator Author

cbaakman commented Dec 13, 2021

If a variant is mapped to multiple PDBs, we can choose one with no NaNs. However, we don't know which PDBs contain NaNs until the script starts preprocessing them. So we need a more flexible script that can cancel the preprocessing of a PDB partway through the run.
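
One possible shape for such a script is sketched below, under stated assumptions. `MissingValueError`, `preprocess`, `first_complete_pdb` and the `load_scores` callback are all hypothetical names, not part of the existing code. Preprocessing raises an exception as soon as it meets a NaN, and the caller moves on to the next PDB mapped to the same variant.

```python
import math
from typing import Callable, Iterable, List, Optional


class MissingValueError(Exception):
    """Raised to cancel the preprocessing of a single PDB entry."""


def preprocess(pdb_id: str, load_scores: Callable[[str], List[float]]) -> List[float]:
    """Preprocess one PDB, aborting at the first NaN conservation score."""
    scores = load_scores(pdb_id)
    for position, score in enumerate(scores):
        if math.isnan(score):
            raise MissingValueError(f"{pdb_id}: NaN at position {position}")
    return scores


def first_complete_pdb(
    pdb_ids: Iterable[str],
    load_scores: Callable[[str], List[float]],
) -> Optional[str]:
    """Return the first PDB that preprocesses without missing values, or None."""
    for pdb_id in pdb_ids:
        try:
            preprocess(pdb_id, load_scores)
        except MissingValueError:
            continue  # cancelled: try the next PDB mapped to this variant
        return pdb_id
    return None
```

If `first_complete_pdb` returns `None`, every mapped PDB has missing values, and the variant has to be handled by one of the two solutions from the first comment.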

@rgayatri rgayatri reopened this Apr 15, 2022

2 participants