Basic Release Pipeline Proposal #92

Open: wants to merge 1 commit into base: master

jmkeil
Contributor

@jmkeil jmkeil commented Sep 29, 2023

This PR is a proposal for a basic release pipeline to solve #90 and is based on #91. It removes dc:date and owl:versionInfo from om-2.0.ttl om-2.0.rdf and adds a pipeline that automatically adds dc:date and owl:versionInfo on releases and generates an RDF/XML serialization. In the pipeline, ROBOT (based on the OWL API) is used to perform these actions.

To trigger the pipeline and add release information, one needs to create a release with a new tag of the style v*.*.*. The pipeline picks the version number from the tag name. The RDF/XML and TTL files with date and version number are automatically added to the release.
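For illustration only (this is not the code of the pipeline, which runs ROBOT inside a GitHub workflow): a minimal Python sketch of how a version number could be picked from a tag name of the style v*.*.*. The function name `version_from_tag` and the strict pattern of three dot-separated numbers are assumptions made for this example.

```python
import re
from typing import Optional

# Assumption: a release tag is a leading "v" followed by exactly three
# dot-separated numbers, e.g. "v2.0.50". The real pipeline may be looser.
_TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def version_from_tag(tag: str) -> Optional[str]:
    """Return the version number without the leading "v".

    Returns None if the tag does not look like a release tag.
    """
    match = _TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None
    return ".".join(match.groups())


if __name__ == "__main__":
    print(version_from_tag("v2.0.50"))    # a release tag
    print(version_from_tag("releaseCI"))  # not a release tag
```

With such a rule, only the git tag carries the leading v; the version number written into the ontology file stays in the plain number format.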

In addition, the pipeline will be triggered by pushes and pull requests, to automatically generate the serialization variants (and potentially run generation scripts and checks later on), but without adding release information.

Example from the fork repository:

@HajoRijgersberg
Owner

Hey Jan Martin, thanx this looks great! The only thing is, we need to keep om-2.0.rdf as the source file, in its present form. Could you adapt your pipeline such that it runs every time a new version of om-2.0.rdf is published, and use that file as input for your pipeline? Please see also my other comments in related issues and PRs.

@HajoRijgersberg
Owner

Also, at this stage, I think we should not change the format of the version numbers. So, could you keep the format *.*.*? I have to dive into what kind of change (major, minor, patch) the change of a version number format in itself is. And I would not want different version formats for the different versions of OM files that could be generated using the pipelines.
Hope this (and the above and earlier comments in several issues and PRs) are no problem to you. Really appreciate all your effort!

@jmkeil
Contributor Author

jmkeil commented Sep 29, 2023

Could you adapt your pipeline such that it runs every time a new version of om-2.0.rdf is published, and use that file as input for your pipeline?

It is easy to adapt the pipeline to use om-2.0.rdf. But it is not trivial to automatically generate a release each time om-2.0.rdf is changed, as one would need to automatically determine the version number.

I think we should not change the format of the version numbers.

The version number format in the RDF files was not changed. It is only the git tag that has the v at the beginning, as this format is common practice on GitHub. However, I removed the language tag from the version info literal and I changed the formatting of the date, as I switched the datatype from xsd:string to xsd:date.
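To make the date change concrete, here is a small Python sketch, given purely as an illustration and not taken from the PR. It converts a slash-separated date string such as 2023/09/28 into the lexical form that xsd:date requires (YYYY-MM-DD). The helper names `to_xsd_date` and `typed_literal` are made up for this example.

```python
from datetime import datetime


def to_xsd_date(slash_date: str) -> str:
    """Convert a 'YYYY/MM/DD' string to the xsd:date lexical form 'YYYY-MM-DD'.

    Raises ValueError if the input is not a valid date in that layout.
    """
    return datetime.strptime(slash_date, "%Y/%m/%d").date().isoformat()


def typed_literal(slash_date: str) -> str:
    """Render the date as a Turtle typed literal (illustrative helper)."""
    return f'"{to_xsd_date(slash_date)}"^^xsd:date'


if __name__ == "__main__":
    print(typed_literal("2023/09/28"))
```

A slash-separated value is fine as a plain string, but it is not a valid xsd:date, which is why the formatting has to change together with the datatype.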

@HajoRijgersberg
Owner

Thanx again so much for your response, Jan Martin. You know how much I appreciate all your effort.

It is easy to adapt the pipeline to use om-2.0.rdf.

That is great! :)

But it is not trivial to automatically generate a release each time om-2.0.rdf is changed, as one would need to automatically determine the version number.

I understand, but I'll manage the dates and version numbers. It's not ideal I know, but it is less important than the transparency of the quality of the contents of OM. To put it simply.

The version number format in the RDF files was not changed. It is only the git tag that has the v at the beginning, as this format is common practice on GitHub.

Clear, thanx!

However, I removed the language tag from the version info literal

Ah, shall I do so accordingly in om-2.0.rdf? For my understanding: why should it be removed?

and I changed the formatting of the date, as I switched the datatype from xsd:string to xsd:date.

Sounds good, but can you perhaps explain why in <dc:date>2023/09/28</dc:date> the date is a string? Maybe a stupid question but I don't know.

@jmkeil force-pushed the releaseCI branch 4 times, most recently from 413f19b to c68867d (October 17, 2023 15:07)
@jmkeil
Contributor Author

jmkeil commented Oct 17, 2023

I updated the pull request to use om-2.0.rdf.

@HajoRijgersberg
Owner

I updated the pull request to use om-2.0.rdf.

That is so fantastic, Jan Martin, many thanx! Please allow me to ask some questions, just for my understanding:

  1. So you use om-2.0.rdf as the basis for generating other files like om-2.0.ttl? You do not change om-2.0.rdf? I ask this because in the yml file I see: --output om-2.0.rdf.
  2. And I see you have removed the owl:versionInfo and dc:date from om-2.0.rdf, or am I wrong?
  3. Or are you working in a copy of om-2.0.rdf? The original file should keep its owl:versionInfo and dc:date of course.

Looking forward to your response. Maybe my questions are silly. Hope you can help me and answer these questions. Many thanx in advance! :)

@jmkeil
Contributor Author

jmkeil commented Nov 2, 2023

  1. So you use om-2.0.rdf as the basis for generating other files like om-2.0.ttl? You do not change om-2.0.rdf? I ask this because in the yml file I see: --output om-2.0.rdf.
  • om-2.0.rdf is used as starting point
  • if the pipeline is running for a release tag, version number and date are added
  • in each case, om-2.0.rdf is re-serialized with the OWL API serializer, to have comparable pipeline output for release and non-release runs
  2. And I see you have removed the owl:versionInfo and dc:date from om-2.0.rdf, or am I wrong?

You are right. But it gets added by the pipeline when running for a release tag. As a result, intermediate (non-release) versions do not have a version number. This is an advantage because:

  • one can be sure to have a copy of an officially released version of OM if the file contains a version number (i.e. the relation version number -> commit becomes functional)
  • no need to deal with version numbers during merges
  • no need to manually add version number and date for a release

The automatic adding of the version number would become even more important, as soon as some statements get automatically added/removed by an extended pipeline. Then the file in the repository is only the (incomplete) "source" (which should not be used in production), but the pipeline output is the (complete) "build" (which is intended for use in production).

  3. Or are you working in a copy of om-2.0.rdf? The original file should keep its owl:versionInfo and dc:date of course.

The file in the repository does not get changed by the pipeline. But there will be a modified om-2.0.rdf file in the pipeline output (called pipeline artifacts).

@HajoRijgersberg
Owner

Hey Jan Martin,
Just saw your message come in, and coincidentally also had the opportunity to respond.
Clear answers, thanx. However (unfortunately there is a 'but' here), the original om-2.0.rdf must not be altered... It should really remain as it is, with date and version number, in its present order/structure, for reasons given earlier.
Would you perhaps see a chance to adapt the pipeline one more time such that om-2.0.rdf can remain as it is, with its date and version number? The great benefit would really be in the generation of derived versions of OM, in DL, etc.
Hope you don't mind my words! If so, my sincere apologies (you're doing such great jobs for OM!). And many thanx of course in advance for your attention and - hopefully - the adaptation of the pipeline.
All the best and good luck, Hajo

@jmkeil
Contributor Author

jmkeil commented Nov 28, 2023

I updated the PR to not change om-2.0.rdf anymore. The pipeline now generates RDF/XML and TTL serializations (using OWL API) and, in case of a release tag, adds them to the release.

@HajoRijgersberg
Owner

Hey Jan Martin, thanx but I meant that also the pipeline should not alter om-2.0.rdf. Could you perhaps adapt that accordingly, i.e., that the pipeline will not affect om-2.0.rdf in any way? Hope it's no problem for you. But many thanx in advance! My apologies for any inconvenience.

@jmkeil
Contributor Author

jmkeil commented Jan 12, 2024

Hi Hajo. Just to rule out a misunderstanding: The pipeline does not make any change in the repository. It only takes files from it, uses and maybe changes them, and finally stores the resulting files as artifacts (see e.g. https://github.com/jmkeil/OM/actions/runs/6549193162) of the pipeline execution (job).

@HajoRijgersberg
Owner

Hi Jan Martin, and again my apologies for my late response and thanx again for yours! :)
Indeed I thought files from this repository were changed.
I'll study it soon again, intendedly within a few weeks - maybe longer since I expect that I need some time. I'll study it with this new knowledge that will probably help a lot.
Hope to get back to you soon! :)

@HajoRijgersberg
Owner

HajoRijgersberg commented May 12, 2024

Hi Jan Martin,

Sorry that it took so long. There are so many things for me to dive into (I mean other than only this issue of course). I'm sure/convinced you understand.
I took a look at the yml file, and I always want to understand everything. How can I see that the files are stored elsewhere, not in this repository? Is that perhaps indicated here:

runs-on: ubuntu-latest
container: obolibrary/robot:v1.9.5

Could you explain to me, help me?

A more general question that I have, since I update the version number and the date of OM manually, e.g. by changing someone's committed file (as you have helped me by pointing out that that is possible), of course in your repository you are fully free what to do, but why would you then want to generate version number and date automatically? And would these then not deviate from the version number and date that I update?

Last question I have: many if not all of the things that we discussed above were from the perspective that I thought it was about this Github. So probably I have given you (a lot of?) wrong advice. How do you see that?

Apologies for all my questions, and thanx so much for your answers in advance!

Best, Hajo

@jmkeil
Contributor Author

jmkeil commented May 15, 2024

Sorry that it took so long. There are so many things for me to dive into (I mean other than only this issue of course). I'm sure/convinced you understand. I took a look at the yml file, and I always want to understand everything. How can I see that the files are stored elsewhere, not in this repository?

It is not directly visible in the yml file itself, but it follows from the general way GitHub handles files. In our case these four types of files are relevant:

  • repository files: the files pushed and pulled via git
  • temporary workflow files: the files in the file systems of the docker containers executing the workflow, which vanish with the containers after the workflow completion
  • workflow artifacts: the archived result files of a workflow/pipeline
  • release assets: the files attached to a release

A workflow/pipeline typically pulls the repository files (e.g. the source code of a program), processes them (e.g. compiling, testing), maybe stores some artifacts (e.g. unit test results), and under some conditions (e.g. release branch, no failed tests) generates a release and attaches release assets (e.g. executable binaries of a program) to it.

These types of files exist in parallel without affecting each other, unless the workflow explicitly specifies otherwise. Of course, a workflow could also push to the repository, but that would need to be scripted explicitly in the workflow.

since I update the version number and the date of OM manually, e.g. by changing someone's committed file (as you have helped me by pointing out that that is possible), of course in your repository you are fully free what to do, but why would you then want to generate version number and date automatically? And would these then not deviate from the version number and date that I update?

There are a number of reasons to automate this:

  • it is a repetitive task, automating it would
    • save time for interesting tasks
    • avoid mistakes like typos in the release date
  • it would ease the parallel development of multiple features, each spanning multiple commits, as
    • it would avoid the existence of several versions of the ontology with the same version number, e.g. commit 58799a2 and commit 5cdb4d4 both have version 2.0.50
    • it avoids handling of version numbers during branching and merging, including pull requests
  • it would allow further automated post-processing of the ontology in the workflow before releases, like automated generation of additional statements, without having different files with the same version number in the repository files and the release assets

So, the idea is to not have a version number in the git repository at all, but only in the artifacts and assets.
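The point about several commits sharing one version number can be checked mechanically. The following Python sketch is only an illustration: the function name `ambiguous_versions` is made up, and the input is limited to the two commits mentioned above. It reports every version number that maps to more than one commit, i.e. where the relation version number -> commit is not functional.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def ambiguous_versions(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each version number used by more than one commit to those commits.

    `pairs` is an iterable of (commit, version) tuples. An empty result means
    the relation version number -> commit is functional.
    """
    commits_by_version = defaultdict(set)
    for commit, version in pairs:
        commits_by_version[version].add(commit)
    return {
        version: sorted(commits)
        for version, commits in commits_by_version.items()
        if len(commits) > 1
    }


# The two commits named in this comment, both carrying version 2.0.50.
observed = [("58799a2", "2.0.50"), ("5cdb4d4", "2.0.50")]

if __name__ == "__main__":
    print(ambiguous_versions(observed))
```

If version numbers only exist in artifacts and release assets, each version number is created exactly once, for the tagged commit, so a check like this would always return an empty result.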

many if not all of the things that we discussed above were from the perspective that I thought it was about this Github. So probably I have given you (a lot of?) wrong advice. How do you see that?

Yes, I think the above comments were based on a misconception about the separation between repository files, artifacts and assets.

@HajoRijgersberg
Owner

It is not directly visible in the yml file itself, but it follows from the general way GitHub handles files. (...)
Of course, a workflow could also push to the repository, but that would need to be scripted explicitly in the workflow.

But what should that look like then? It already looks as if the repository file is presently affected.

There are a number of reasons to automate this: (...)
So, the idea is to not have a version number in the git repository at all, but only in the artifacts and assets.

Certainly, I know all that, but for other reasons - as discussed before - we don't do that (yet?) in this OM Github, at least not for the original om-2.0.rdf.
But my question was more like: would a version number and date that you create automatically in your OM Github not deviate from the version number and date that I create here (on this OM Github)? Of course, you can do that, but I thought: is that then - under the described circumstances (that I create those manually) - a sensible thing to do? This question just for my understanding.

Hope you can answer my questions again, Jan Martin. Very much appreciated, all your help, patience, answers, etc.! :)

@jmkeil
Contributor Author

jmkeil commented Jul 24, 2024

It is not directly visible in the yml file itself, but it follows from the general way GitHub handles files. (...)
Of course, a workflow could also push to the repository, but that would need to be scripted explicitly in the workflow.

But what should that look like then? It already looks as if the repository file is presently affected.

You would see a git commit and a git push command in the workflow code. During the workflow, the repository is cloned into the Docker container, just as if you cloned the repository to your local machine: changes to the files in the cloned repository will not affect the remote repository as long as you do not push them to the remote.

There are a number of reasons to automate this: (...)
So, the idea is to not have a version number in the git repository at all, but only in the artifacts and assets.

Certainly, I know all that, but for other reasons - as discussed before - we don't do that (yet?) in this OM Github, at least not for the original om-2.0.rdf.
But my question was more like: would a version number and date that you create automatically in your OM Github not deviate from the version number and date that I create here (on this OM Github)? Of course, you can do that, but I thought: is that then - under the described circumstances (that I create those manually) - a sensible thing to do? This question just for my understanding.

Of course, they could conflict. That is the reason why I proposed to remove them from the repository and only add them (automatically) during the release process. That way, they are certain not to conflict.

But the version number would not get out of your control: The script would determine it based on a version tag on the commit, which must be added manually. The difference is that a tag neither changes the content of the commit nor requires an additional commit. The commits stay as they are, but from time to time a version tag is added, causing a release. See for example this demo release triggered by the version tag v0.0.0: The file in the tagged commit does not contain version information, but the files attached to the release do.

@jmkeil force-pushed the releaseCI branch 4 times, most recently from 9f77118 to 8b4f243 (August 12, 2024 15:23)
@jmkeil
Contributor Author

jmkeil commented Aug 12, 2024

I just restored the initial workflow (but with a few updates of pipeline dependencies and a rebase on the latest commit in om/master to avoid merge conflicts):

Now the PR (again):

  • removes version information from om-2.0.rdf in repository
  • adds a pipeline that automatically:
    1. if the commit has a release tag: adds version information to om-2.0.rdf (not affecting the file in the repository)
    2. generates OWL API formatted om-2.0.rdf and om-2.0.ttl files (not affecting the files in the repository)
    3. stores the generated om-2.0.rdf and om-2.0.ttl as job artifacts (not affecting the files in the repository); example of what a job artifact would look like
    4. if the commit has a release tag: creates a release on GitHub containing the files om-2.0.rdf and om-2.0.ttl (not affecting the files in the repository); example of what a release on GitHub would look like

With that, doing a release would only require you to create a tag on a commit. It would cause and require zero changes to the commit, to the files in the repository, and to the commit history, whether by you or by the pipeline.
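The decision logic of the pipeline steps listed above can be sketched as follows. This is a simplified Python illustration, not the workflow from the PR (which is a yml file using ROBOT). It assumes the git ref arrives as a string of the form refs/tags/<tag> for tag builds, as in the GITHUB_REF variable of GitHub Actions; the names `release_version` and `plan_steps` are made up for this example.

```python
from typing import List, Optional

# Assumption: release tags start with "v", so tag builds arrive as
# "refs/tags/v<version>"; anything else is treated as a non-release run.
TAG_PREFIX = "refs/tags/v"


def release_version(git_ref: str) -> Optional[str]:
    """Return the version for a release tag ref, or None for any other ref."""
    if git_ref.startswith(TAG_PREFIX):
        return git_ref[len(TAG_PREFIX):]
    return None


def plan_steps(git_ref: str) -> List[str]:
    """List the steps the pipeline would run for the given ref.

    Every step works on the cloned working copy only; nothing is pushed back.
    """
    version = release_version(git_ref)
    steps = []
    if version is not None:
        steps.append(f"add version information {version} to the working copy of om-2.0.rdf")
    steps.append("generate OWL API formatted om-2.0.rdf and om-2.0.ttl")
    steps.append("store the generated files as job artifacts")
    if version is not None:
        steps.append("create a GitHub release containing the generated files")
    return steps


if __name__ == "__main__":
    for ref in ("refs/heads/master", "refs/tags/v0.0.0"):
        print(ref)
        for step in plan_steps(ref):
            print("  -", step)
```

A push to a branch only produces artifacts, while a tag such as v0.0.0 additionally adds the version information and creates the release.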

Merging this would make it possible to work on further PRs like: automatic OWL profile checking, OWL profile variants generation, numeric datatype variants generation, automatic deployment on the OM website, ...

Would you consider merging this?
