A bug on using protein family #10

Open
chao1224 opened this issue Aug 1, 2022 · 1 comment

chao1224 commented Aug 1, 2022

Hi,

I'm testing dataset curation by protein family, e.g., with configs/curators/sbap_core_ki_protein_family.py, and I get the following exception:

Traceback (most recent call last):
  File "xxx/drugood/apis/curate.py", line 14, in curate_data
    data = curator.data_splitting(data)
  File "xxx/drugood/curators/curator.py", line 206, in data_splitting
    domain_value = domain_func(value_for_generating_domain)
  File "xxx/drugood/curators/get_domain_info.py", line 75, in protein_family
    class_id = self.protein_family_getter(protein_seq)
  File "xxx/drugood/curators/chembl/protein_family.py", line 48, in __call__
    target_level_class_id = self.get_target_level_class_id(class_id)
  File "xxx/drugood/curators/chembl/protein_family.py", line 37, in get_target_level_class_id
    class_id_cur_level = self.dict_id_to_parent_level[class_id_cur_level][0]
KeyError: None

It turns out that protein_family_level is None.


A quick update: this line fails to pass in the protein_family_level.
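To illustrate how a missing level can surface as `KeyError: None`, here is a minimal sketch. It is not the DrugOOD implementation: the function `walk_up`, the `parents` table, and its `(parent_id, level)` layout are all hypothetical. It assumes the lookup climbs a class hierarchy until it reaches the requested level, so a level of None never matches and the walk runs past the root.

```python
# Hypothetical hierarchy: class id -> (parent id, level). The root has no parent.
parents = {
    3: (2, 2),
    2: (1, 1),
    1: (None, 0),
}


def walk_up(class_id, parents, target_level):
    """Climb from class_id toward the root until target_level is reached."""
    cur, level = class_id, parents[class_id][1]
    while level != target_level:
        # Once cur is None (past the root), this lookup raises KeyError(None).
        cur = parents[cur][0]
        level -= 1
    return cur


ok = walk_up(3, parents, 1)  # stops at the ancestor on level 1

try:
    walk_up(3, parents, None)  # a None level never matches any real level
    error_key = "no error"
except KeyError as exc:
    error_key = exc.args[0]
```

In this sketch, `ok` is the level-1 ancestor, while the second call fails with None as the missing key, which matches the shape of the traceback above.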

@chao1224 chao1224 changed the title Exception on using protein family A bug on using protein family Aug 1, 2022
@Tigerrr07

I ran into the same problem, so I debugged it. The problem comes from the following lines:

domain_cfg = self.cfg.domain
domain_info_funcs_set = DomainInfo(self.cfg, self.sql)

They should be changed to:

domain_cfg = self.cfg.domain
domain_info_funcs_set = DomainInfo(domain_cfg, self.sql)

This is because DomainInfo expects domain_cfg rather than the full cfg. With this change, I can generate the dataset curated by protein family successfully.
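A minimal sketch of why the wrong config object leads to a None level. `DomainInfoSketch` is a hypothetical stand-in, not the real DomainInfo (it omits the sql argument), and the nested config layout is an assumption: it supposes that protein_family_level lives under cfg.domain and that the class reads it directly off whatever config it receives.

```python
from types import SimpleNamespace


class DomainInfoSketch:
    """Hypothetical stand-in: reads the level straight off the config it is given."""

    def __init__(self, cfg):
        # Falls back to None when the attribute is absent on this config object.
        self.protein_family_level = getattr(cfg, "protein_family_level", None)


# Assumed layout: the level is nested under the `domain` section.
cfg = SimpleNamespace(domain=SimpleNamespace(protein_family_level=1))
domain_cfg = cfg.domain

buggy = DomainInfoSketch(cfg)         # top-level cfg has no protein_family_level
fixed = DomainInfoSketch(domain_cfg)  # the domain section carries the setting
```

Under these assumptions, `buggy.protein_family_level` is None while `fixed.protein_family_level` holds the configured value, which is consistent with the fix described above.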
