-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Validation Framework #2725
Add Validation Framework #2725
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2725 +/- ##
==========================================
- Coverage 94.10% 93.04% -1.07%
==========================================
Files 156 165 +9
Lines 4208 4514 +306
==========================================
+ Hits 3960 4200 +240
- Misses 248 314 +66
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Hi @Divyaasm , Please check the approach posted here: #2524 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please have proper PR title like"Add validation framework".
Also link the Github issue in the description to track.
Need to add tests too for this.
def is_url_valid(url: str) -> int: | ||
response = requests.head(url) | ||
if response.status_code == 200: | ||
return 'content-length' in response.headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we returning the 'content-length' here?
src/validation_workflow/yum.py
Outdated
@classmethod | ||
def download_urls(cls, projects: list, version: str, platform: str) -> None: | ||
for project in projects: | ||
url = f"https://artifacts.opensearch.org/releases/bundle/{project}/2.x/{project}-2.x.repo" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is won't be true everytime. For example, for 1.x version this static link will point to 1.x
. Calculate it using the user entered version
value. For example, for 3.0.0 version the url will be 3.x
cc: @peterzhuamazon
src/validation_workflow/yum.py
Outdated
for project in projects: | ||
url = f"https://artifacts.opensearch.org/releases/bundle/{project}/2.x/{project}-2.x.repo" | ||
if Yum.is_url_valid(url): | ||
print(f"Valid URL - {url}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we should just log the statements and return true instead of printing them out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will work on the changes requested Sayali.Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@staticmethod | ||
def download(url: str) -> None: | ||
response = requests.get(url, stream=True) | ||
open(url.split("/")[-1], "wb").write(response.content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this part doing here?
Could you write some comments addressing your implementation?
Thanks.
src/validation_workflow/rpm.py
Outdated
def download_urls(cls, projects: list, version: str, platform: str) -> None: | ||
for project in projects: | ||
for package_type in cls.get_architectures(): | ||
url = f"https://artifacts.opensearch.org/releases/bundle/{project}/{version}/{project}-{version}-{platform}-{package_type}.rpm" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of directly hardcording this part in rpm.py.
I suggest bringing this part into a lib type of file which sourcing all these information.
So you dont need to hardcode this line every single file.
src/validation_workflow/rpm.py
Outdated
for package_type in cls.get_architectures(): | ||
url = f"https://artifacts.opensearch.org/releases/bundle/{project}/{version}/{project}-{version}-{platform}-{package_type}.rpm" | ||
if Rpm.is_url_valid(url): | ||
print(f"Valid URL - {url}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using printf you should look at logging
lib and see how other files utilizing it.
src/validation_workflow/tar.py
Outdated
def get_architectures(cls) -> list: | ||
return ["x64", "arm64"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this has been returned every .py which does not seems right.
If this is the same for every child class this should exist in the parent class.
Also this does not seems to belongs to a classmethod but more like a property.
SUPPORTED_DISTRIBUTIONS = ["tar", "yum", "rpm"] | ||
SUPPORTED_PLATFORMS = ["linux"] | ||
DISTRIBUTION_MAP = {"tar": Tar, "yum": Yum, "rpm": Rpm} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better practice is to write them into multiple lines instead of one line, for better visual or better maintenance.
from .rpm import Rpm | ||
from .tar import Tar | ||
from .yum import Yum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of writing relative path it is better to write the full path of these libs.
Also it is better to give a name with full contaxt of these classes.
Example: ValidationRpm
, ValidationTar
.
A good start @Divyaasm but I think we can improve it even better 😄 Thanks. |
src/validation_workflow/tar.py
Outdated
for package_type in cls.get_architectures(): | ||
url = f"https://artifacts.opensearch.org/releases/bundle/{project}/{version}/{project}-{version}-{platform}-{package_type}.tar.gz" | ||
if Tar.is_url_valid(url): | ||
print(f" Valid URL - {url}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making the return type as boolean? True if successful, else false along with throwing an exception? We want the workflow to fail if the validation fails instead of throwing false positive and just logging.
Hey @Divyaasm some info on tests with existing workflows: Tests to assert the Tests per class which is used my |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work overall. Need some minor improvements.
src/run_validation.py
Outdated
|
||
args = ValidationArgs() | ||
|
||
console.configure(level=logging.DEBUG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the logging level set to debug
?
|
||
@classmethod | ||
@abstractmethod | ||
def download_urls(self, projects: list, version: str, platform: str, architectures: list) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommended to change the method name to just download
or download_artifacts
as this can be used by other distributions like docker to pull the images.
"--distribution", | ||
type=str, | ||
choices=self.SUPPORTED_DISTRIBUTIONS, | ||
help="Distribution to build.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: Distribution to validate
parser.add_argument( | ||
"version", | ||
type=str, | ||
help="Enter the Version" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to Product version to validate
for architecture in architectures: | ||
url = f"{self.url}{project}/{version}/{project}-{version}-{platform}-{architecture}.rpm" | ||
if ValidationRpm.is_url_valid(url) and ValidationRpm.download(url): | ||
logging.debug(f" Valid URL - {url} and Download Successful !") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the logging as info
as default is info
for project in projects: | ||
url = f"{self.url}{project}/{version[0:1]}.x/{project}-{version[0:1]}.x.repo" | ||
if ValidationYum.is_url_valid(url) and ValidationYum.download(url): | ||
logging.debug(f" Valid URL - {url} and Download Successful !") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Change to info
tests/tests_run_validation.py
Outdated
self.assertEqual(ValidationArgs().distribution, "rpm") | ||
self.assertNotEqual(ValidationArgs().distribution, "yum") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rpm and yum are same thing. I believe you want to assert that its not equal to tar
which is default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestions Sayali, I will work on the changes.
|
||
def __init__(self) -> None: | ||
parser = argparse.ArgumentParser(description="Download Artifacts.") | ||
parser.add_argument("version", help="Enter Version.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we have another argument parser in this class? Should all be handled in the ValidationArgs()
and passing that args
assigned in run_validation
within all classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't called the base class in my workflow by creating an object,So I have left it to default version. I will pass all the arguments from ValidationArgs() to init if the base class object is utilized in further development of the code or can set it to default without passing and arguments and without declaring any new variables.
Hey @Divyaasm thanks for PR/changes, I would suggest to include TemporaryDirectory, in your logic to download the files, else they would be on the user working directory and would never be deleted, example https://github.com/opensearch-project/opensearch-build/blob/main/src/release_notes_workflow/release_notes.py#L40, with this it would download all the artifacts in a temporary directly and will be removed once the execution is done. Can you also add a small README file? |
Sure Prudhvi, Thanks for the info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some nits.
return 0 | ||
|
||
@staticmethod | ||
def download(url: str, tmp_dir: TemporaryDirectory) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename to download_artifacts
Just download is bit confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a different function in downloadutils.py which is called by the download_artifacts method in tar, rpm and yum.
response = requests.get(url, stream=True) # get() method sends a GET request to the url | ||
path = tmp_dir.name + "/" + url.split("/")[-1] | ||
val = bool(open(path, "wb").write(response.content)) # writes the contents from the response object into temporary directory file name fetched from the end of the url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these comments required? If yes please move it above the usage or remove it entirely. Also recommending to use better variables names instead of val
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, Sayali.
|
||
|
||
class Validation(ABC): | ||
url = "https://artifacts.opensearch.org/releases/bundle/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: base_url sounds more appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the change.
version: str | ||
|
||
def __init__(self) -> None: | ||
parser = argparse.ArgumentParser(description="Download Artifacts.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make the description more generic as the same framework is going to be used for multiple other platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried changing the description to Validation Framework for Tar,Rpm and Yum artifacts
as it deals with the three distributions.
logging.info(f" Valid URL - {url} and Download Successful !") | ||
else: | ||
logging.info(f"Invalid URL - {url}") | ||
raise Exception("Invalid url - check version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not everytime the issue might be in version, sometimes the artifact won't exist at all.
Just invalid url
with url provided works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the changes
logging.info(f" Valid URL - {url} and Download Successful !") | ||
else: | ||
logging.info(f"Invalid URL - {url}") | ||
raise Exception("Invalid url - check version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, change all for all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, Thanks.
Signed-off-by: Divya Madala <[email protected]> DownloadUrl Verification Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Validate DownloadUrl's Signed-off-by: Divya Madala <[email protected]> Verify Download Url's Signed-off-by: Divya Madala <[email protected]> Verify Download Url's Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]> Test Signed-off-by: Divya Madala <[email protected]>
Signed-off-by: Divya Madala <[email protected]> Add Validation WorkFlow Signed-off-by: Divya Madala <[email protected]> Add Validation Framework Signed-off-by: Divya Madala <[email protected]> Changes to Validation Workflow Signed-off-by: Divya Madala <[email protected]> Changes to Validation Workflow Signed-off-by: Divya Madala <[email protected]>
Signed-off-by: Divya Madala <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Divyaasm good start but there are some parts that I question the implementation.
Could you please take a look at these comments I raised.
Thanks.
return 1 | ||
else: | ||
return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a benefit using 1/0 instead of True/False?
This is also opposite compares to normal programming pattern where 1 is false and 0 is true normally.
Plus, the else
block can be completely removed in this case.
def download(url: str, tmp_dir: TemporaryDirectory) -> bool: | ||
# This method writes the contents from the response object into temporary directory file name fetched from the end of the url. | ||
response = requests.get(url, stream=True) | ||
path = tmp_dir.name + "/" + url.split("/")[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a good implementation and I would rather use os.join because /
is only working on nix* and would cause trouble on windows (even tho windows now can work with /
it is still not ideal here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if you want to get the file name, use something like os.basename or similar, do not try to split the url and get the last item.
self.version = version | ||
self.distribution = distribution | ||
self.platform = platform | ||
self. projects = projects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo, which would cause the code not running at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also means you are missing test cases because the test pass but this code is invalid.
Or it means there are variables you did not use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't used the init method in base class for implementation.So tried removing the init as I don't need any class properties in particular.
self.platform = platform | ||
self. projects = projects | ||
|
||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a classmethod after all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing class method threw me an error Missing positional argument "self" in call to "download_artifacts" of "Validation"
so I let it stay Peter.
SUPPORTED_DISTRIBUTIONS = [ | ||
"tar", | ||
"yum", | ||
"rpm" | ||
] | ||
SUPPORTED_PLATFORMS = ["linux"] | ||
SUPPORTED_ARCHITECTURES = [ | ||
"x64", | ||
"arm64", | ||
] | ||
DISTRIBUTION_MAP = { | ||
"tar": ValidationTar, | ||
"yum": ValidationYum, | ||
"rpm": ValidationRpm | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not need one list and one map just to rewrite all the dist again.
The design can be using 1 map max:
Example:
DISTRIBUTION_MAP = {
"tar": ValidationTar,
"yum": ValidationYum,
"rpm": ValidationRpm
}
Then the platform/arch information can be distributed directly in the validation<> class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure Peter, I will use a single dictionary for storing distributions and mapping the distributions to respective classes. Could you please explain what platform/arch information can be distributed directly in the validation<> class.
in your comment means.Do you recommend that I declare these lists where they are actually implemented without declaring them in validationargs.py so I need not pass them as arguments to the function.
for architecture in architectures: | ||
url = f"{self.base_url}{project}/{version}/{project}-{version}-{platform}-{architecture}.rpm" | ||
if ValidationRpm.is_url_valid(url) and ValidationRpm.download(url, self.tmp_dir): | ||
logging.info(f" Valid URL - {url} and Download Successful !") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there a leading space?
for architecture in architectures: | ||
url = f"{self.base_url}{project}/{version}/{project}-{version}-{platform}-{architecture}.tar.gz" | ||
if ValidationTar.is_url_valid(url) and ValidationTar.download(url, self.tmp_dir): | ||
logging.info(f" Valid URL - {url} and Download Successful !") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
for project in projects: | ||
url = f"{self.base_url}{project}/{version[0:1]}.x/{project}-{version[0:1]}.x.repo" | ||
if ValidationYum.is_url_valid(url) and ValidationYum.download(url, self.tmp_dir): | ||
logging.info(f" Valid URL - {url} and Download Successful !") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
|
||
|
||
class ValidationRpm(Validation, DownloadUtils): | ||
tmp_dir = TemporaryDirectory() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a temp dir that all the test requires, why not just put it back to the parent class?
for project in projects: | ||
for architecture in architectures: | ||
url = f"{self.base_url}{project}/{version}/{project}-{version}-{platform}-{architecture}.tar.gz" | ||
if ValidationTar.is_url_valid(url) and ValidationTar.download(url, self.tmp_dir): | ||
logging.info(f" Valid URL - {url} and Download Successful !") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these implementation are extremely similar, means the core code should exist in parent, and child should just pass the different information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially suggested to have only abstract methods in the base class so I have implemented it this way. I can actually change the core code to base class and pass the required information from the child class. Can I know what you would recommend!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm expecting to leave the method abstract in base class as Docker also extends the same method which has a different implementation all together.So, had to implement it independently for each distribution even though the code is reused.
|
||
@classmethod | ||
@abstractmethod | ||
def download_artifacts(self, projects: list, version: str, platform: str, architectures: list) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greetings, @Divyaasm , I am working on a dependant issue/PR to validation docker artifact and want to update you that I will need only the 'version' argument to Pull and Spin-up containers by docker-compose; hence, if you can make other arguments in your script as optional, that will help! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I believe we can make platform, architectures optional
Signed-off-by: Divya Madala <[email protected]>
Signed-off-by: Divya Madala <[email protected]>
@classmethod | ||
def download_artifacts(self, projects: list, version: str) -> bool: | ||
for project in projects: | ||
for architecture in ["x64", "arm64"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit: define []
as an actual map instead of calling it on the fly.
architecture_map=[]
for architecture in architecture_map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Divyaasm, one nitpik from above.
Signed-off-by: Divya Madala <[email protected]>
Description
Validation of Download Url's for OpenSearch.
GitHub issue :#2712
Issues Resolved
Automated the downloads of OpenSearch artifacts related to Linux platform.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.