Separate out documentation building and publishing per provider #11423
I worked on this ticket yesterday and today and managed to build documentation for the provider packages. There are two more serious issues that need to be discussed.
CC: @ryw @potiuk @iadi7ya @francescomucio @jward-bw @jhtimmins @kaxil @paolaperaza @pcandoalmeida @xinbinhuang
Related issue: apache/airflow-site#301
Will take a look shortly :)
Hi @mik-laj - I reviewed this and chatted w/ @kaxil today; it looks good structurally for v1. In the short term, is the idea to build this into the Airflow website next to the other docs? I'm trying to think what is simplest to get v1 out there. Do we want to provide versioned docs for each provider? I don't think so - just "latest". We could have sublinks across the top, "Airflow" and "Airflow Providers", as a way to navigate to the providers docs. Happy to jump on a call to brainstorm.
I think eventually we might need docs per provider version. We can fully automate it - once we automate it for "latest", it will be almost no effort to automate it for "per-provider-version". It would also be rather confusing for people to read the latest version of a provider's docs while using a different version. The fact that we agreed to SemVer, and agreed that we might have breaking changes, pretty much implies that we need "per-version" documentation. Imagine we have version 1.0.0 of the Google provider and then we release 2.0.0, which introduces breaking changes (for example, after we migrate to the Google 2.0 Python APIs). We would need to provide docs for both versions for quite some time. It's even likely that we will release a 1.0.1 Google provider with bugfixes for 1.0.0 (though this still waits for #11425 to be completed).
I think we have no choice but to implement all of it, including the possibility of choosing a version per provider - IMHO this was pretty much sealed when we agreed to allow breaking changes in each provider. And it's not even difficult - we can (and should) fully automate it. It does not have to be there for "Day 0": when we release 2.0.0 and the set of 1.0.0 providers, the docs can be "no version", but very soon afterwards we will have to support versions. Our tooling has to be prepared for that (and have it automated), because keeping it updated manually will be impossible.
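To make the SemVer argument above concrete, here is a minimal sketch of how tooling could decide which provider versions need their own documentation. It assumes plain `MAJOR.MINOR.PATCH` version strings; the helper names (`parse_version`, `is_breaking`, `latest_per_major`) are hypothetical and not part of any Airflow tooling.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string (pre-release tags are not handled)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking(old: str, new: str) -> bool:
    """Under SemVer, a higher major version signals possible breaking changes."""
    return parse_version(new)[0] > parse_version(old)[0]


def latest_per_major(versions: List[str]) -> Dict[int, str]:
    """Return the newest release within each major line."""
    by_major = defaultdict(list)
    for version in versions:
        by_major[parse_version(version)[0]].append(version)
    return {
        major: max(group, key=parse_version)
        for major, group in sorted(by_major.items())
    }
```

With the Google provider example from the comment, `latest_per_major(["1.0.0", "1.0.1", "2.0.0"])` keeps one entry per major line, which is roughly the set of docs that would need to stay prominent while both lines are in use.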
Agree - we should ship v1 as "universal docs" since it's the first release for all the providers, but we'll have to address the problem pretty soon as providers start to independently update + release.
We update providers very often, so I think it's worth splitting these docs as soon as possible. If we are going to publish these documents, we must also make it possible to look at archived versions of the documentation, mainly so that users can check whether a given operator is available in a given version or whether they need to update to the latest version.
I would like us to have an index (at the address: https://airflow.apache.org/docs/ ) that describes all the products we release. For now, my focus is only on Airflow core and the provider packages, but in the future we may add documentation for the rest of the products we release. When users select a product, they get a view similar to:
I think we should have the documentation ready for "Day 0". Otherwise we will have mixed content for different products and versions in one set of documentation. Such documentation would also be hard to update later, e.g. links would still point to out-of-date documents. If we do not split the documentation, we will also have problems publishing some documents on "Day 0", e.g. the changelog for provider packages.
During the split, I would also like to introduce one additional change - migrate the development version of the documentation to the official template. Currently, contributors build the documentation with a different template than the published one, and as a result the final documentation is sometimes buggy. If everyone used one template, the bugs would be fixed faster.
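One way to picture the proposed index is a mapping from each product to the URLs of its documented versions. The sketch below is only an illustration: the `<base>/<package>/<version>/` layout, the `latest` alias, and the package names in the usage note are assumptions, not a layout agreed on in this thread.

```python
BASE_URL = "https://airflow.apache.org/docs"  # index address from the discussion


def docs_url(package, version="latest"):
    """Build the (assumed) URL for one product at one version."""
    return f"{BASE_URL}/{package}/{version}/"


def build_index(releases):
    """Map each product to the URLs of its documented versions plus 'latest'."""
    index = {}
    for package, versions in releases.items():
        index[package] = {
            version: docs_url(package, version)
            for version in [*versions, "latest"]
        }
    return index
```

For example, `build_index({"apache-airflow": ["2.0.0"], "apache-airflow-providers-google": ["1.0.0"]})` yields one entry per product, each listing its archived versions next to `latest`, so links to an old version keep pointing at that version's documents.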
This is cool! And yeah, if we can have it split from Day 0, I'd really love that!
I already have the first successful build of full documentation on S3:
Hello. Today I would like to discuss the next step - the Sphinx theme for our documentation. This theme is currently being developed in the [...]. The production and development documentation look completely different. This means that if there is an error in the theme, we only find out about it after publishing the documentation, and any change is then much more difficult. It usually means that we have to edit each HTML file individually. I would like to improve this now by installing the theme in Breeze and also providing a way to install the theme if you want to build the documentation locally. I would rather not publish this package on PyPI, so as not to clutter the public repository with packages that will not be used by other projects. I think the easiest way is to build the theme with GitHub Actions in airflow-site and then publish the theme to S3. Then we will be able to install the theme with the command:
This looks like a simple task if we use https://github.com/novemberfiveco/s3pypi.
Unfortunately, this won't work, as this theme has a complex build process. We must first build the website to generate the artifacts needed to build the theme package.
Sounds great! Another option might be to publish it as a Release on GitHub, and then we could install it as
(To test this I uploaded the artifacts to 2.0.0b3 on Airflow: https://github.com/apache/airflow/releases/tag/2.0.0b3) The advantage of using GitHub is that Actions already has credentials to create releases (I think?), so we don't need to manage keys for S3. The disadvantage is that we could only point at specific releases, and couldn't do
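The install command itself did not survive in this thread, so the following is only a sketch of how such an install could look. It relies on two things that hold in general: GitHub serves release assets under `/releases/download/<tag>/<file>`, and pip can install a wheel from a direct URL. The wheel filename in the usage note is hypothetical.

```python
def release_asset_url(owner, repo, tag, filename):
    """URL of a file attached to a GitHub release with the given tag."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{filename}"


def pip_install_args(url):
    """Argument list for installing a wheel from a direct URL with pip."""
    return ["python", "-m", "pip", "install", url]
```

For example, `pip_install_args(release_asset_url("apache", "airflow", "2.0.0b3", "example_theme-0.0.1-py3-none-any.whl"))` produces a command pinned to one release tag, which matches the limitation noted above: each install points at a specific release.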
Maybe we can do better than that. Why don't we create a separate repository, "apache/airflow-doc-theme", and put the whole theme there? Then we can develop it separately and point to tags/versions of the code (without even releasing it), the same way as we do with Airflow now:
This will run setup.py locally to build the theme. But maybe this is not that complex and can be done? The benefit is that if we decide to move it to PyPI, we can publish pre-built binary themes there, similarly to NumPy's prebuilt packages (PyPI accepts different variants of releases).
And we could combine both - keeping the theme in a separate repo and making it available as a release as well. Also, releasing it to PyPI is super easy. I think we should also consider simply releasing it via PyPI. Once we have the right set of artifacts, it is as easy as running "twine upload". I am not sure why we excluded that so easily. Is there any problem with that, @mik-laj, since this is the standard way of distributing packages? To be honest, I do not think there is any "clutter" of any kind there if it makes our life easier.
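As a rough illustration of pointing at a tag in a repository rather than at a released package, the sketch below builds a pip VCS requirement string. The `git+<url>@<ref>` form and the `name @ <url>` direct-reference form are standard pip syntax; the repository name comes from the proposal above, while the tag and distribution name in the usage note are hypothetical.

```python
def vcs_requirement(repo_url, ref, name=None):
    """Build a pip VCS requirement that pins a git repository to a tag or commit."""
    spec = f"git+{repo_url}@{ref}"
    # With a name, emit the "name @ url" direct-reference form.
    return f"{name} @ {spec}" if name else spec
```

For example, `vcs_requirement("https://github.com/apache/airflow-doc-theme.git", "v0.1.0")` gives a string that could be passed to `pip install`; pip would then clone the repository at that tag and build the package locally.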
@potiuk The website and the theme share some files; more specifically, you must have the site build output files to be able to build the theme. For this reason, moving the theme to a separate repository could be problematic. We can think about using PyPI, but if this is actually going to be for internal use only and we don't expect users to install this theme, I don't think we should make it easy to find, as long as publishing to a private package repository is not a big problem for us. Now I even think that publishing on GitHub Releases might be easier for us, as we won't have to provide credentials.
I am also wondering whether publishing on PyPI would mean that we have to meet some release requirements. Any user would be able to easily find this theme and install it. If we use a private repository, the files will be available only to the developers of this project. Ideally, we would also be able to configure full CI/CD so that the package becomes available without any intervention from us. Even a small extra manual step to publish the artifact would be a pain for us.
I see. Fine for me.
Yep - that's much better. Just remember this will only work from a merge to master or from workflow_run (the PR token is read-only).
In my head, I had us only publishing a new release on tags - but we could probably have a "latest" release too, for which we overwrite the release blob. And if you want to test out the theme built from a PR, we can use the upload-artifact action.
I have prepared a PR that publishes the theme package using GitHub Actions.
LGTM
I guess this is done, @mik-laj? Can we close it?
Last PR: #12892. It would be nice to have this merged before the RC1 release, but it doesn't affect end users, so we can also merge it later and do one more release manually.