[APPAI-1776] Fixed sentry error due to invalid version #737

dgpatelgit · 2021-03-08T09:23:17Z

Fixes an issue with version parsing during the CA POST call. A version-string cleanup that is required only for the maven ecosystem was being applied generically to all ecosystems. Also reduced the log level from error to warning, since the CA response is still served with return status 202 and includes the other packages that have valid versions. A package with an invalid version is returned under unknown_packages in the response.

Fixes jira issue :: https://issues.redhat.com/browse/APPAI-1776

Reduced vulnerabilities from 22 to 1 and resolved 1 exploit by upgrading dependent package versions in the manifest file.

Addresses the Sentry errors below:
Stage :: https://sentry.stage.devshift.net/sentry/fabric8-analytics-stage/issues/202486/
Prod :: https://sentry.devshift.net/sentry/fabric8-analytics-production/issues/250686/

yzainee-zz · 2021-03-08T10:35:31Z

bayesian/utils.py

+        if ecosystem == "maven":
+            """Needed for maven version like 1.5.2.RELEASE to be converted to
+            1.5.2 - RELEASE for semantic version to work."""
+            version = version.replace('.', '-', 3)


What value will be sent for ingestion in case the pkg is unknown? Also, is the version in the graph going to be stored with . or - ? While fetching details from the graph, will it be expected in . or - form?

This function is used to format the version we report in the recommended version field of the CA POST call. It is not used for any graph query or ingestion purpose. I also don't know why maven versions are converted from '.' to '-' form; this generic conversion was definitely creating an issue for golang, so I moved the maven-specific code under an ecosystem check.
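
To illustrate why the ungated conversion is a problem, here is a minimal sketch. The helper name `format_version` is hypothetical (it is not the actual function in bayesian/utils.py), and the golang pseudo-version string is only an illustrative value; the `replace` call itself is the one from the diff above.

```python
def format_version(version: str, ecosystem: str) -> str:
    """Apply the maven-only cleanup; leave every other ecosystem untouched.

    Hypothetical helper for illustration, not the real API server function.
    """
    if ecosystem == "maven":
        # Same call as in the diff: replaces the first three dots with dashes.
        return version.replace('.', '-', 3)
    return version


maven_version = format_version("1.5.2.RELEASE", "maven")

# An illustrative golang pseudo-version; it must pass through unchanged.
pseudo = "v0.0.0-20201203092726-db298f0b3f2b"
golang_version = format_version(pseudo, "golang")

# What the old, ungated code would have produced for the same golang input.
ungated = pseudo.replace('.', '-', 3)
```

With the ecosystem check in place the golang string is returned as-is, whereas the ungated call rewrites its dotted prefix.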

ok. Can you ask the original coder to check why we need this in the first place?

@deepak1725 Can you add someone from the old team who can give us the reasoning behind this strange version transformation for the Maven ecosystem?

This code was written 3 years ago. You will hardly find the author of that.

dgpatelgit

@yzainee You suggested checking version_comparator so that this custom version transformation could be removed from the API server. I did integrate version_comparator, but we cannot use this library directly for the following reasons.

Version comparator is meant for version comparison only; it cannot transform a version into a tuple of major, minor, patch and build. This tuple is referred to further along in the complete flow of the CA response builder.
Invalid version string validation is missing in version comparator. This is done by the existing function in the API server, which is a valid use case.
With golang, we have to support both semver and pseudo-version formats, which the existing function takes care of.

yzainee-zz · 2021-03-11T09:27:45Z

@dgpatelgit
1- Why do you need major, minor, patch etc.? All you should care about is the version. How does it matter where it fits?
2- Please specify what type of validations you are referring to.
3- FYI, I'm asking you to use version comparator just to find which is the latest etc. Where else do you see a use of the tuple?

dgpatelgit · 2021-03-11T09:42:53Z

@dgpatelgit
1- Why do you need major, minor, patch etc.? All you should care about is the version. How does it matter where it fits?
2- Please specify what type of validations you are referring to.
3- FYI, I'm asking you to use version comparator just to find which is the latest etc. Where else do you see a use of the tuple?

We need to validate the version string to catch invalid versions like 'upstream/1.0.0' or 'kubernetes.1.43.0' that we get in the request. Such validation is not provided by version comparator. I ran the current test cases against version comparator, and not everything can be handled. One example: given 'upstream/1.0.1' (an invalid version) and an empty version (the default HIGH version), the current code returns the empty version because the other version is invalid, but the same code with version comparator returns 'upstream/1.0.1' as the high version. This is not correct.
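
A rough sketch of the kind of pre-comparison validation described here. Everything in it is an assumption made for illustration: the regular expression is a simplified pattern, not the API server's actual rule, and `is_valid_version` / `candidate_or_default` are hypothetical names.

```python
import re

# Simplified, hypothetical pattern: optional "v", a dotted numeric core,
# then an optional suffix such as "-RELEASE" or a golang pseudo-version tail.
_VERSION_RE = re.compile(r"^v?\d+(\.\d+)*([-+.][0-9A-Za-z.+-]+)?$")


def is_valid_version(version: str) -> bool:
    """Return True when the string looks like a version under the pattern above."""
    return bool(_VERSION_RE.match(version))


def candidate_or_default(candidate: str, default: str = "") -> str:
    """Keep the default when the candidate is not a valid version string.

    This mirrors the behaviour described in the comment: an invalid
    candidate must never win over the empty default.
    """
    return candidate if is_valid_version(candidate) else default


checks = {
    v: is_valid_version(v)
    for v in ("upstream/1.0.0", "kubernetes.1.43.0", "1.0.1", "1.5.2.RELEASE")
}
```

Running the validity check before handing anything to a comparator keeps the two concerns separate: the comparator only ever sees strings that passed validation.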

yzainee-zz · 2021-03-11T10:59:06Z

Version comparator, as the name suggests, is used to compare versions. It's not a version validator. If any validation is needed, it can be done in code outside its purview. The comment is just to stop using the tuple so that comparison can be done easily. How is the tuple helping you check a version's validity anyway?

dgpatelgit · 2021-03-11T11:59:34Z

@yzainee The issue is not in version comparison; this bug (the Sentry error) was due to version validation only. So, following the investigation, this PR fixes only the version parsing / validation part. It does not modify any part of the version comparison logic. If required, we can have a separate task for that.

dgpatelgit · 2021-03-17T09:01:00Z

@yzainee Finally your wish came true: I have removed the custom function and now use version comparator to get the highest non-cve version for recommendation.

yzainee-zz · 2021-03-17T11:08:00Z

bayesian/utility/v2/ca_response_builder.py

-                    highest_version = version
+        try:
+            input_version_comparable = ComparableVersion(self.version)
+            for version in latest_non_cve_versions:


The latest non-cve version will never be returned as a list/set. It's always a single value. Why are we using a loop here?

In the existing flow this is a function variable and it comes in as a list. Below is the code that populates this parameter:
latest_non_cve_versions = result_data[0].get('package', {}).get('latest_non_cve_version', [])

latest_non_cve_version = mgmt.makePropertyKey('latest_non_cve_version').dataType(String.class).make();
This is the schema for this value. It will not allow a list to be stored.

We use gremlin to fetch the data; below is the batch gremlin query we use in the CA POST flow.

ca_batch_query = """
epv = [];packages.each {g.V().has('pecosystem', ecosystem).has('pname', it.name)
.has('version', it.version).as('version', 'cve').select('version').in('has_version')
.dedup().as('package').select('package', 'version', 'cve')
.by(valueMap()).by(valueMap()).by(out('has_snyk_cve')
.valueMap().fold()).fill(epv);};epv;
"""

The output of this query provides the package's non-cve version as a list. It always has one value, but technically it's a list.

In fact, most of the properties are lists, as shown in the snippet of the response below:

"package": {
    "ecosystem": ["npm"],
    "gh_subscribers_count": [5],
    "gh_contributors_count": [-1],
    "latest_version_last_updated": ["20191117"],
    "vertex_label": ["Package"],
    "libio_dependents_repos": ["634788"],
    "latest_non_cve_version": ["7.0.3"],
    "gh_issues_last_year_opened": [-1],
    "gh_issues_last_month_closed": [-1],
    "gh_open_issues_count": [5],
    "libio_dependents_projects": ["290"],
    "latest_version": ["7.0.3"],
    "tokens": ["cliui"],
    "package_relative_used": ["not used"],
    "gh_stargazers": [210],
    "gh_forks": [16],
    "package_dependents_count": [-1],
    "gh_prs_last_month_opened": [-1],
    "gh_issues_last_year_closed": [-1],
    "last_updated": [1.6046478431252089E9],

in that case you can pick [0]
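
A small sketch of picking `[0]` defensively from a list-valued property map like the one in the response snippet. The helper name `first_value` and the trimmed sample dict are assumptions made for illustration; they are not part of the existing code.

```python
from typing import Any, Dict, Optional


def first_value(properties: Dict[str, Any], key: str,
                default: Optional[Any] = None) -> Any:
    """Return the first entry of a list-valued property, or the default.

    Hypothetical helper: guards against a missing key and an empty list,
    either of which would make a bare [0] lookup fail.
    """
    values = properties.get(key) or []
    return values[0] if values else default


# Trimmed copy of the response snippet quoted in this thread.
package = {
    "ecosystem": ["npm"],
    "latest_non_cve_version": ["7.0.3"],
    "latest_version": ["7.0.3"],
}

latest_non_cve = first_value(package, "latest_non_cve_version")
missing = first_value(package, "no_such_property", default="")
```

Indexing through a helper like this keeps the single-value assumption in one place instead of repeating `[0]` at each call site.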

@yzainee I would not have touched the existing logical flow; however, as per the above discussion, I have replaced the latest non-cve loop with a simple comparison check. PTAL

yzainee-zz

LGTM

arajkumar · 2021-03-18T07:48:00Z

bayesian/utility/v2/ca_response_builder.py

+            if latest_non_cve_version_comparable > input_version_comparable:
+                highest_version = latest_non_cve_versions[0]
+
+        except Exception as e:


Don't catch a generic exception; catch the specific one that ComparableVersion throws.

arajkumar · 2021-03-18T07:50:05Z

bayesian/utility/v2/ca_response_builder.py

+                highest_version = latest_non_cve_versions[0]
+
+        except Exception as e:
+            logger.error(f"Package {self.package} @ {self.version} raised exception {e}")


It is not recommended to use a preformatted string in logging. It would be expanded regardless of the log level, which is not good.

Suggested change

-            logger.error(f"Package {self.package} @ {self.version} raised exception {e}")
+            logger.exception("Unable to parse version %s@%s", self.package, self.version)

arajkumar · 2021-03-18T07:50:12Z

bayesian/utility/v2/ca_response_builder.py

+        except Exception as e:
+            logger.error(f"Package {self.package} @ {self.version} raised exception {e}")
+
+        logger.info("Highest non-cve version for "


codecov-io · 2021-03-18T09:14:25Z

Codecov Report

Merging #737 (8bb605a) into master (f4edec1) will increase coverage by 0.50%.
The diff coverage is 77.77%.

@@            Coverage Diff             @@
##           master     #737      +/-   ##
==========================================
+ Coverage   84.09%   84.59%   +0.50%     
==========================================
  Files          21       21              
  Lines        1603     1584      -19     
==========================================
- Hits         1348     1340       -8     
+ Misses        255      244      -11

Impacted Files	Coverage Δ
bayesian/utils.py	`71.68% <ø> (+6.29%)`	⬆️
bayesian/utility/v2/ca_response_builder.py	`97.43% <77.77%> (-0.87%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

dgpatelgit · 2021-03-19T05:03:38Z

Edit: ✌️ Image Build Successful @dgpatelgit, Available at: ghcr.io/fabric8-analytics/fabric8-analytics-server/bayesian-bayesian-api:SNAPSHOT-PR-737

arajkumar · 2021-03-19T05:26:43Z

bayesian/utility/v2/ca_response_builder.py

+        try:
+            input_version_comparable = ComparableVersion(self.version)
+
+            # latest non-cve is list with only one entry.
+            latest_non_cve_version_comparable = ComparableVersion(latest_non_cve_versions[0])
+        except TypeError:
+            logger.error("Package %s@%s raised a TypeError", self.package, self.version)
+
+        if latest_non_cve_version_comparable > input_version_comparable:
+            highest_version = latest_non_cve_versions[0]
+
+        logger.info("Highest non-cve version for %s@%s is %s", self.package, self.version,
+                    highest_version)


As I stated earlier, the comparison must go into the else block of the try/except; the code as written is incorrect.

Suggested change

-        try:
-            input_version_comparable = ComparableVersion(self.version)
-
-            # latest non-cve is list with only one entry.
-            latest_non_cve_version_comparable = ComparableVersion(latest_non_cve_versions[0])
-        except TypeError:
-            logger.error("Package %s@%s raised a TypeError", self.package, self.version)
-
-        if latest_non_cve_version_comparable > input_version_comparable:
-            highest_version = latest_non_cve_versions[0]
-
-        logger.info("Highest non-cve version for %s@%s is %s", self.package, self.version,
-                    highest_version)
+        try:
+            input_version_comparable = ComparableVersion(self.version)
+
+            # latest non-cve is list with only one entry.
+            latest_non_cve_version_comparable = ComparableVersion(latest_non_cve_versions[0])
+        except TypeError:
+            logger.error("Package %s@%s raised a TypeError", self.package, self.version)
+        else:
+            if latest_non_cve_version_comparable > input_version_comparable:
+                highest_version = latest_non_cve_versions[0]
+
+        logger.info("Highest non-cve version for %s@%s is %s", self.package, self.version,
+                    highest_version)
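
Why the `else` block matters can be shown with a self-contained sketch. The `parse` function is a hypothetical stand-in for ComparableVersion (it only handles dotted integers), and the empty-string default for `highest_version` is an assumption; neither is taken from the real code.

```python
import logging

logger = logging.getLogger("try_else_demo")


def parse(version):
    """Hypothetical stand-in for ComparableVersion.

    Returns a tuple of ints and raises TypeError for non-string input.
    """
    if not isinstance(version, str):
        raise TypeError(f"expected str, got {type(version).__name__}")
    return tuple(int(part) for part in version.split("."))


def highest_non_cve(input_version, latest_non_cve_versions):
    highest_version = ""  # assumed default when no recommendation applies
    try:
        current = parse(input_version)
        latest = parse(latest_non_cve_versions[0])
    except TypeError:
        logger.error("Version %s raised a TypeError", input_version)
    else:
        # Runs only when both parses succeeded, so both names are bound.
        if latest > current:
            highest_version = latest_non_cve_versions[0]
    return highest_version


recommended = highest_non_cve("1.0.0", ["7.0.3"])
after_error = highest_non_cve(None, ["7.0.3"])
```

If the comparison sat after the `try` statement instead of in `else`, the `None` input would log the TypeError and then fail again on the unbound `latest` and `current` names.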

arajkumar · 2021-03-19T05:37:48Z

tests/test_utils.py

+    @patch('bayesian.utils.init_celery', return_value=None)
+    @patch('bayesian.utils.run_flow', return_value=1234)
+    def test_server_run_flow(self, _rf_mock, _cf_mock):
+        """Test basic run flow function."""
+        assert server_run_flow('RUN_FLOW_NAME', {}) == 1234
+
+    @patch('bayesian.utils.init_celery', return_value=None)
+    @patch('bayesian.utils.run_flow', return_value=1234)
+    def test_create_component_bookkeeping(self, _rf_mock, _cf_mock):
+        """Verify create componenet book keeping utility function."""
+        assert create_component_bookkeeping('pypi', ['pkg1', 'pkg2'], {}, {}) == 1234
+
+    @patch('bayesian.utils.init_celery', return_value=None)
+    @patch('bayesian.utils.run_flow', return_value=1234)
+    def test_server_create_analysis(self, _rf_mock, _cf_mock):
+        """Verify various combinations of create analysis function."""
+        assert server_create_analysis('pypi', 'pkg1', '1.5.4', None, False, False, False) == 1234
+
+        assert server_create_analysis('pypi', 'pkg1', '1.5.4', None, True, False, False) == 1234
+
+        pkg = 'debug:artifact:0.4.3'
+        assert server_create_analysis('maven', pkg, '0.4.3', None, False, False, False) == 1234


What is the purpose of these tests? Why do they only assert mocked function return values, which will always be the same?

The reason was Codecov: after removing the existing util functions, Codecov was failing due to a reduction in coverage percentage (it was as low as 18% for the utility file). You can see the Codecov error a few commits back. To improve unit test coverage I added some of the missing test cases for the run-flow utility functions.

That is fine as the number of LOCs is reduced. But these tests don't make any sense to me.
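
One way to make such a test check behaviour rather than the mock's own return value is to assert on how the mock was called. The sketch below is only an illustration: this `server_run_flow` is a hypothetical stand-in that takes the flow runner as a parameter, and its signature is not the real one in bayesian/utils.py.

```python
from unittest.mock import Mock


def server_run_flow(flow_name, flow_args, run_flow):
    """Hypothetical stand-in: forwards its arguments to an injected runner."""
    return run_flow(flow_name, flow_args)


run_flow = Mock(return_value=1234)
result = server_run_flow("RUN_FLOW_NAME", {"ecosystem": "pypi"}, run_flow)

# Verifies the wiring: the runner received exactly the arguments passed in.
run_flow.assert_called_once_with("RUN_FLOW_NAME", {"ecosystem": "pypi"})
```

The return-value assertion alone would pass for any implementation that calls the mock; the call-argument assertion fails if the function drops or rewrites its inputs.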

@arajkumar what do you suggest? Shall we reduce the Codecov level from B to C? Currently we fail Codecov if any file has a level less than 'B', which is 19% coverage.

arajkumar

lgtm

dgpatelgit added 2 commits March 8, 2021 14:49

[APPAI-1776] Fixed sentry error due to invalid version

52eddb0

Merge branch 'master' into APPAI-1776

94b4e6a

yzainee-zz reviewed Mar 8, 2021

dgpatelgit commented Mar 11, 2021

dgpatelgit added 2 commits March 17, 2021 14:28

Used version comparator utility instead of custom code

3ecf15d

Merge branch 'master' into APPAI-1776

fd1b484

dgpatelgit requested review from prashbnair and arajkumar March 17, 2021 09:05

dgpatelgit added 3 commits March 17, 2021 16:14

Moved version comparator to latest code

7fb71d1

Merge branch 'APPAI-1776' of https://github.com/dgpatelgit/fabric8-an…

a90bed7

…alytics-server into APPAI-1776

Revert "Moved version comparator to latest code"

fd31384

This reverts commit 7fb71d1.

yzainee-zz requested changes Mar 17, 2021

dgpatelgit added 2 commits March 18, 2021 09:49

Removed latest non-cve version loop

7704a58

Merge branch 'master' into APPAI-1776

a5b4e51

yzainee-zz approved these changes Mar 18, 2021

arajkumar requested changes Mar 18, 2021

dgpatelgit added 2 commits March 18, 2021 14:23

Merge branch 'master' into APPAI-1776

083ea9f

Reduced security issues and resolve 1 exploits, add required deps

9b646b3

dgpatelgit added 3 commits March 18, 2021 16:03

Added more test cases to improve codecov

38411cb

Merge branch 'master' into APPAI-1776

40fcea4

Moved conditions outside exception as per review comments

edf9e0d

arajkumar requested changes Mar 19, 2021

Moved comparison to else block as per review comments

8bb605a

Removed test case as per comments

d50efdb

arajkumar approved these changes Mar 22, 2021

dgpatelgit merged commit c07d525 into fabric8-analytics:master Mar 22, 2021
