I encountered an issue where the program throws an UnboundLocalError for the variable rank_ids when none of the distances between query samples and reference samples meet the defined threshold criteria.
Error Message
Traceback (most recent call last):
  File "/usr/local/bin/gas", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/genomic_address_service/main.py", line 43, in main
    exec('genomic_address_service.' + task + '.run()')
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/genomic_address_service/call.py", line 109, in run
    obj = assign(dist_file,membership_file,threshold_map,linkage_method)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/genomic_address_service/classes/assign.py", line 86, in __init__
    self.assign()
  File "/usr/local/lib/python3.11/site-packages/genomic_address_service/classes/assign.py", line 254, in assign
    a[i] = self.nomenclature_cluster_tracker[rank_ids[i]]
                                             ^^^^^^^^
UnboundLocalError: cannot access local variable 'rank_ids' where it is not associated with a value
Steps to Reproduce:
Run the gas call command with a --dists file where the distances between the query and reference samples are significantly larger than the defined thresholds (e.g., distances in the thousands, thresholds at 10, 5, and 0).
The error occurs when all distance values exceed the set thresholds, causing rank_ids to not be assigned a value.
Analysis
It appears that when none of the distances meet the threshold criteria, the variable rank_ids is not properly initialized or assigned, causing the error.
This issue is avoided when at least one pair of samples has a distance that falls within the threshold, resulting in cluster addresses being assigned to all query samples.
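The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the code in assign.py, which is not reproduced in this report; the function name `assign_labels` and its loop structure are assumptions, chosen only to show how a variable bound inside a conditional branch stays unbound when no distance passes the threshold.

```python
def assign_labels(distances, thresholds):
    """Hypothetical stand-in for the assignment loop; not the real assign.py."""
    for dist in distances:
        if dist <= max(thresholds):
            # rank_ids is only bound when at least one distance qualifies
            rank_ids = [t for t in thresholds if dist <= t]
    # If the branch above never ran, this read raises UnboundLocalError
    return rank_ids


# One distance falls within the largest threshold: the call returns normally
print(assign_labels([3346, 8], [10, 5, 0]))

# Every distance exceeds every threshold: rank_ids was never bound
try:
    assign_labels([3346, 3350, 3359], [10, 5, 0])
except UnboundLocalError as err:
    print(type(err).__name__, err)
```

If the real code follows a similar pattern, this would explain why a single in-threshold pair is enough to avoid the crash.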
Suggested Fix
It would be helpful to add error handling or checks to prevent the unbound error by ensuring that rank_ids is properly initialized, even when no sample comparisons fall within the thresholds.
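One possible shape for such a check is sketched below. It is a hypothetical illustration rather than a patch against assign.py: the function name `assign_labels_safe`, the `None` sentinel, and the empty-list return value are all assumptions. What the tool should actually do for an unmatched query (for example, report it as unassigned) is a decision for the maintainers.

```python
def assign_labels_safe(distances, thresholds):
    """Hypothetical guarded variant; names and return shape are assumptions."""
    rank_ids = None  # bound up front so later reads are always defined
    for dist in distances:
        if dist <= max(thresholds):
            rank_ids = [t for t in thresholds if dist <= t]
    if rank_ids is None:
        # No comparison fell within any threshold; signal that explicitly
        return []
    return rank_ids


# All distances exceed the thresholds: an empty result instead of a crash
print(assign_labels_safe([3346, 3350, 3359], [10, 5, 0]))
```

Initializing the variable before the loop keeps the "no match" case an ordinary, handled outcome instead of an exception raised deep inside the assignment step.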
Additional Context
This error was encountered while processing Salmonella enterica samples (sourced from NCBI) through mikrokondo, followed by running a subset through gasclustering to assign cluster addresses using gas mcluster. Two remaining samples (SH01, SH02) lacked assigned cluster addresses, leading to the execution of gasnomenclature, where the gas call command was used.
Notably, increasing the thresholds (--gm_thresholds "3500,1000,500") allowed the query sample to be successfully assigned. Alternatively, rerunning the samples through gasclustering and gasnomenclature with adjusted parameters (--pd_distm scaled and --gm_threshold "50,20,0") also resulted in a successful assignment.
Input files used to reproduce the error

reference_clusters.txt

id    address  level_1  level_2  level_3
SE01  3.3.5    3        3        5
SE02  3.3.4    3        3        4
SE03  2.2.3    2        2        3
SE04  1.1.1    1        1        1
SE04  1.1.1    1        1        2

results.txt (from profile_dists)

query_id  ref_id  dist
SH01      SH01    0
SH01      SE04    3346
SH01      SE05    3346
SH01      SE03    3350
SH01      SE02    3359
SH01      SE01    3360
SH01      SH02    3369
SH02      SH02    0
SH02      SE02    22
SH02      SE01    23
SH02      SE03    43
SH02      SE04    45
SH02      SE05    45
SH02      SH01    3369
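As a quick way to see which query samples would trigger the condition, the sketch below scans the distances from results.txt above and lists every query whose nearest reference lies beyond the largest threshold. The helper names are hypothetical, the `<=` comparison is an assumption about how thresholds are applied, and rows whose ref_id is itself a query sample (including self-comparisons) are skipped on the assumption that only reference samples count.

```python
# Distances copied from results.txt above
RESULTS = """\
query_id ref_id dist
SH01 SH01 0
SH01 SE04 3346
SH01 SE05 3346
SH01 SE03 3350
SH01 SE02 3359
SH01 SE01 3360
SH01 SH02 3369
SH02 SH02 0
SH02 SE02 22
SH02 SE01 23
SH02 SE03 43
SH02 SE04 45
SH02 SE05 45
SH02 SH01 3369
"""


def parse_rows(text):
    """Parse whitespace-separated rows, skipping the header line."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        query, ref, dist = line.split()
        rows.append((query, ref, float(dist)))
    return rows


def queries_without_match(text, thresholds):
    """Return queries whose closest reference exceeds the largest threshold."""
    rows = parse_rows(text)
    queries = {query for query, _, _ in rows}
    nearest = {}
    for query, ref, dist in rows:
        if ref in queries:
            continue  # skip query-vs-query rows, including self-comparisons
        if query not in nearest or dist < nearest[query]:
            nearest[query] = dist
    limit = max(thresholds)
    return sorted(q for q, d in nearest.items() if d > limit)


print(queries_without_match(RESULTS, [10, 5, 0]))         # ['SH01', 'SH02']
print(queries_without_match(RESULTS, [3500, 1000, 500]))  # []
```

With thresholds of 10, 5, and 0, neither query has a reference within range, which matches the condition under which the error was reported; with the larger thresholds both queries have at least one reference within range.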