
Documentation PR #151

Closed
wants to merge 21 commits into from

Conversation

zain-sohail
Member

Let's all commit to this branch to create a comprehensive documentation for SED

@rettigl
Member

rettigl commented Oct 6, 2023

This solution looks great; however, in its current form it's difficult for me to test its output, because it is not actually published anywhere, and I cannot easily generate it locally. Also, how do you picture this being run in the future?
I would suggest moving the actual code into a separate .py file and running it from the docs pipeline that you already defined for the main branch (the one that currently updates the dependencies).
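
Such a setup could look roughly like the following, a minimal sketch assuming the generation code is extracted into a hypothetical docs/build_docs.py script and added as steps to the existing docs workflow (file names, paths, and the sphinx-build invocation are illustrative, not taken from the repository):

    # hypothetical steps in the existing docs workflow
    - name: Generate documentation sources
      run: python docs/build_docs.py                    # script extracted from the current inline code; name is illustrative
    - name: Build HTML documentation
      run: sphinx-build -b html docs docs/_build/html   # assumes Sphinx is the docs builder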

@zain-sohail
Member Author

This solution looks great; however, in its current form it's difficult for me to test its output, because it is not actually published anywhere, and I cannot easily generate it locally. Also, how do you picture this being run in the future?

Indeed. My plan is to add this branch to the Read the Docs options so we can check the output. My script is likely not working right now, since the produced files first need to be committed to the branch and then the documentation built. So I still have to figure that out.

I would suggest moving the actual code into a separate .py file,

Yes, I think moving it to another file is best. I am not sure whether a bash script would be the better option, as that wouldn't require any specific environment.

and running it from the docs pipeline that you already defined for the main branch (the one that currently updates the dependencies).

Also a good idea; I will combine the two (that one is not yet committing to the branch either).

@coveralls
Collaborator

coveralls commented Oct 10, 2023

Pull Request Test Coverage Report for Build 6510731696

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 8 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+9.1%) to 99.666%

Files with Coverage Reduction    New Missed Lines    Coverage
sed/loader/flash/loader.py       1                   97.97%
sed/loader/base/loader.py        7                   88.52%

Totals Coverage Status
Change from base Build 6483984354: 9.1%
Covered Lines: 4771
Relevant Lines: 4787

💛 - Coveralls

@zain-sohail
Member Author

Updating the workflows to use a cache, for much faster access. I still need to set up coveralls with the testing.

Pylint has yet to be done.
Regarding combining the two documentation workflows: it's not possible, since they rely on different criteria, one on the toml file changing and the other on the ipynb files changing (see the sketch below).
The requirements-generating workflow works perfectly now. The bot even commits the output directly to the repo.
Still figuring out the tutorials, since it's not trivial to run the notebooks and nbconvert.
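
For reference, the two triggers differ roughly like this, a sketch assuming standard GitHub Actions path filters (the exact patterns are illustrative, not copied from the workflows):

    # workflow A, dependency/requirements update: triggered when the project definition changes
    on:
      push:
        paths:
          - pyproject.toml

    # workflow B, tutorials/documentation: triggered when notebooks change
    on:
      push:
        paths:
          - '**.ipynb'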

But anyway, all branch outputs can be checked at https://sed.readthedocs.io/en/documentation/ (so far, no changes would be visible).

@rettigl
Member

rettigl commented Oct 12, 2023

Updating the workflows to use a cache, for much faster access. I still need to set up coveralls with the testing.

As far as I understand, this then uses poetry and the lock file, no? I would find it important to always test against the most recent package versions that our dependency limits allow, to see potential problems with updated packages as they come. That's why I don't like having the lock file in the repo: it will essentially always be outdated. Right now, it's not being used by the workflows, so I don't care. If you intend to use it, we at least have to include a poetry update and a poetry lock, and then I doubt it's faster...
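
A sketch of the extra step that would then be needed, so that a lock-file-based workflow still picks up the newest versions pyproject.toml allows (hedged; this is not part of the current workflow):

    - name: Refresh lock file
      run: poetry lock          # re-resolves dependencies to the newest versions the constraints allow
    - name: Install environment
      run: poetry install       # installs exactly what the refreshed lock file pins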

@zain-sohail
Member Author

The cache doesn't have to use it; it can work with any package management tool.
But I think it's a good time to start using poetry, since the package is entirely based on it.
The entire point of the lock file is that users who clone the repo can reproduce the exact environment we have in three commands. I don't see any point in updating the lock file for every minor release.
But even if you want an updated set of packages at all times, then it's all the more important that it is the lock file that gets updated. Otherwise we're testing against a system that neither users nor we developers might ever see.
The speedup will still be there, because at least the Python environment can be cached, and for runs where there's nothing to update in the lock file.
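
A minimal sketch of the kind of caching meant here, assuming actions/cache and poetry's default virtualenv location (paths and the cache key are illustrative):

    - uses: actions/cache@v3
      with:
        path: ~/.cache/pypoetry/virtualenvs                            # poetry's default environment location
        key: venv-${{ runner.os }}-${{ hashFiles('poetry.lock') }}     # reused until the lock file changes
    - name: Install dependencies
      run: poetry install                                              # hits the cache when the key matches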

@rettigl
Member

rettigl commented Oct 12, 2023

Let's test it. I would not say the package is built on poetry; pyproject.toml != Poetry. I don't use poetry by default at all, only if I have to regenerate a poetry.lock file for whatever reason. For me this typically takes more than a minute to resolve dependencies, while installing the environment takes <2 minutes. So I am not so sure about the speedup.
The goal is to have this as a package anyway, and then there will be no poetry and no lock file, just the pyproject and the dependencies. So we need to make sure that at any point in time, starting from those, we get a working installation. That's my point of view and that's what we should test for.

@zain-sohail
Member Author

Let's test it. I would not say the package is built on poetry; pyproject.toml != Poetry. I don't use poetry by default at all, only if I have to regenerate a poetry.lock file for whatever reason. For me this typically takes more than a minute to resolve dependencies, while installing the environment takes <2 minutes. So I am not so sure about the speedup. The goal is to have this as a package anyway, and then there will be no poetry and no lock file, just the pyproject and the dependencies. So we need to make sure that at any point in time, starting from those, we get a working installation. That's my point of view and that's what we should test for.

I initialized sed using poetry. Of course, poetry and setuptools are compatible with each other, but the purpose of poetry is to manage dependencies and also to aid in packaging. If you have any experience building with setuptools, you'll know how much of a hassle it can be; poetry makes it easier. I'd encourage you to adopt it as well.
Since we build using poetry, it only makes sense to test in the same environment too.

Regarding the speedup, you can check here:
https://github.com/OpenCOMPES/sed/actions/runs/6477102441
Of course, here I am not updating the lock file, which we could, but I don't think that's necessary unless there is a change in pyproject.toml.

@rettigl
Member

rettigl commented Oct 12, 2023

Since we build using poetry, it only makes sense to test in the same environment too.

That does not matter. What we want is a package that we can install with pip install sed-processor.
It will only have a list of dependencies and no poetry.lock file, and it will resolve these dependencies and install the most recent package versions they allow at the point in time a user installs the package.
With poetry.lock, on the other hand, you install an environment locked to the package versions defined at the point in time you created the poetry.lock file, which will be outdated immediately afterwards. That's the problem I want to point out.
In order to make sure that the package works for the user who does pip install sed-processor, we need to test with the most recent packages our dependencies allow. And those package versions change independently of our pyproject.toml file!

@zain-sohail
Member Author

That does not matter. What we want is a package that we can install with pip install sed-processor. It will only have a list of dependencies and no poetry.lock file, and it will resolve these dependencies and install the most recent package versions they allow at the point in time a user installs the package. With poetry.lock, on the other hand, you install an environment locked to the package versions defined at the point in time you created the poetry.lock file, which will be outdated immediately afterwards. That's the problem I want to point out. In order to make sure that the package works for the user who does pip install sed-processor, we need to test with the most recent packages our dependencies allow. And those package versions change independently of our pyproject.toml file!

Then wouldn't it make sense to update poetry and run those tests only when we publish a new version to PyPI?
But even if you don't want to use poetry for testing, the cache can still be used for these commands:

        git lfs pull                                      # fetch files tracked with Git LFS
        python -m pip install --upgrade pip               # make sure pip itself is current
        python -m pip install pytest coverage coveralls   # test and coverage tooling
        python -m pip install .                           # install the package itself

In this case, we can use the cache functionality they added to the Python setup action: https://github.com/actions/setup-python#caching-packages-dependencies
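
A sketch of what that could look like in the workflow, assuming the caching option described at that link (version tags are illustrative):

    - uses: actions/setup-python@v4
      with:
        python-version: '3.9'
        cache: 'pip'                # caches pip's package cache between runs
    - name: Install package
      run: |
        python -m pip install --upgrade pip
        python -m pip install .     # subsequent runs reuse the cached wheels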

@rettigl
Member

rettigl commented Oct 12, 2023

This probably makes a lot of sense. I am not at all against using caches, and even less against speedups; the only thing I wanted to point out is this potential caveat of using a static environment for testing. And no, it does not depend on us pushing a new version of the package, but on others publishing new versions of packages we depend on. So even an old version of our package might break if dependencies update. We need to test for that regularly and react if we detect such a problem. That's the burden of maintaining a complex Python project...

@zain-sohail
Member Author

FileNotFoundError: [Errno 2] No such file or directory: 'sed_config.yaml'
https://github.com/OpenCOMPES/sed/actions/runs/6496531345/job/17643665308#step:4:343

I'm getting this error with parallel testing, but only sometimes: 3.9 passed but 3.8 did not. Any ideas?

@rettigl
Member

rettigl commented Oct 12, 2023

A classical race condition. These tests are actually written to be run sequentially, because they all write to the same sed_config.yaml file in the same folder and delete it afterwards. If one test does this before the other, the deletion does not work, and the tests will also randomly fail for other reasons.
We can work around this by specifying a distinct name for the local file in each of these tests, as is already done for at least one of the other tests.
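
As an illustration of that kind of fix (a sketch, not the actual test code; the file writing stands in for the package's config-saving call), each test uses its own distinctly named file so parallel workers cannot collide:

    # sketch: a per-test file name instead of a shared sed_config.yaml
    import os
    import yaml

    def test_save_config_to_file():
        filename = "sed_config_test_save.yaml"   # unique to this test; name is hypothetical
        with open(filename, "w") as f:
            yaml.dump({"key": "value"}, f)       # stands in for the real config-saving call
        with open(filename) as f:
            assert yaml.safe_load(f) == {"key": "value"}
        os.remove(filename)                      # safe to delete: no other test uses this name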

@rettigl
Member

rettigl commented Oct 12, 2023

I'll provide a fix for it

@rettigl
Member

rettigl commented Oct 12, 2023

That should fix the issue you had; however, while doing local testing I encountered a similar problem with the flash loader, where the test files are also deleted. Here, I cannot change the file names, so I don't know how to resolve it.

@rettigl
Member

rettigl commented Oct 12, 2023

The io tests are also predestined for race conditions, as multiple tests access the same files there as well.

@zain-sohail
Member Author

That should fix the issue you had; however, while doing local testing I encountered a similar problem with the flash loader, where the test files are also deleted. Here, I cannot change the file names, so I don't know how to resolve it.

In essence, they don't need to be deleted, at least in a workflow, because the runner deletes everything after it finishes running. For the local case, I can understand your point.

The io tests are also predestined for race conditions, as multiple tests access the same files there as well.

That's true. Do we have many other such tests?

Also, a question: I noticed that our linting workflow also runs tests. Is that there for a purpose?

@rettigl
Member

rettigl commented Oct 12, 2023

In essence, they don't need to be deleted, at least in a workflow, because the runner deletes everything after it finishes running. For the local case, I can understand your point.

They are deleted because otherwise the conversion from h5 to parquet is not tested for the different cases; instead, the parquet files are just read after the first test has run. But this again depends on the order of the tests, which is undefined in a parallel setup, so again -> race condition. It comes from the somewhat different behavior of the flash loader compared to the others.

That's true. Do we have many other such tests?

I don't think so. The i/o tests can also be made safe by changing the file names depending on the test fixture.

Also, a question: I noticed that our linting workflow also runs tests. Is that there for a purpose?

The idea was to have this as the single workflow run at every push that verifies the integrity of the pushed code, which includes linting and tests. Maybe the name is not the best. The other test workflow was added to verify compatibility with different Python versions.

@rettigl
Member

rettigl commented Oct 12, 2023

This should remove all remaining file conflicts/race conditions. I tested with up to 60 workers several times and did not get any failures. ~10 workers seems to be the optimum, with ~50 sec testing time. Of course, that many won't be available on the free runners, I suppose.
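
For reference, a sketch of how such parallel runs can be invoked, assuming pytest-xdist provides the worker option:

    python -m pip install pytest-xdist   # provides the -n flag for parallel workers
    pytest -n 10 tests/                  # ~10 workers was the local sweet spot here
    pytest -n auto tests/                # or let xdist match the number of available cores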

@zain-sohail
Member Author

This should remove all remaining file conflicts/race conditions. I tested with up to 60 workers several times and did not get any failures. ~10 workers seems to be the optimum, with ~50 sec testing time. Of course, that many won't be available on the free runners, I suppose.

The usual is 2 cores for Linux systems, it seems. Though GitHub has a feature to host your own runners:
https://github.com/organizations/OpenCOMPES/settings/actions/runners/new
Setup seems quite simple. If there's a server available, we could use it. But maybe it's too much hassle.
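
If such a runner were registered, pointing a workflow at it would roughly mean changing the runs-on key (a sketch with illustrative labels):

    jobs:
      test:
        runs-on: self-hosted                   # instead of ubuntu-latest; uses the registered machine
        # runs-on: [self-hosted, linux, x64]   # labels can narrow down which runner is picked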

@rettigl
Member

rettigl commented Oct 12, 2023

This should remove all remaining file conflicts/race conditions. I tested with up to 60 workers several times and did not get any failures. ~10 workers seems to be the optimum, with ~50 sec testing time. Of course, that many won't be available on the free runners, I suppose.

The usual is 2 cores for Linux systems, it seems. Though GitHub has a feature to host your own runners: https://github.com/organizations/OpenCOMPES/settings/actions/runners/new Setup seems quite simple. If there's a server available, we could use it. But maybe it's too much hassle.

Honestly, it's not that important how long the online runs take if you run the tests locally beforehand, so I would not bother. Speaking of which, it looks like I forgot to check linting...

@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@github-advanced-security bot left a comment

pylint found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@zain-sohail
Member Author

The PR has deviated too much from its original intention. Hence I will close it and open two new branches: one for CI/workflows and another for documentation.
