Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation and other guidelines proposal #1

Open
automainint opened this issue Sep 26, 2024 · 3 comments
Open

Documentation and other guidelines proposal #1

automainint opened this issue Sep 26, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

@automainint
Copy link

automainint commented Sep 26, 2024

This is my thoughts and suggestions about general meta documentation and documentation in specific projects.

Docs writing guidelines

I see the main purpose of a documentation in resolving any confusion that a user or a developer can have when working with code. So, to write a documentation, one should understand what confusions may arise. Which requires deep understanding of the code and how it works in practice. Which is why, for writing a documentation, I propose to have some draft beforehand, written by people how worked with the code directly. Then this draft can be completed by someone who didn't work with the code.

Specifically, I can write doc drafts for repos I worked on:

  • meritrank-psql-connector
  • meritrank-service-rust
  • meritrank-rust (but about this @ichorid probably knows more)
  • riblt-rust-py
  • riblt-rust

This proposal itself may also be completed into a guideline for writing docs.

CI

Currently suggested workflow "OneFlow with Forks" is unfit for CI and doesn't reflect the way repositories are organized. The reason for this is that different repositories are dependent on each other (meritrank-psql-connector -> meritrank-service-rust -> meritrank-rust; and riblt-rust-py -> riblt-rust). To run integration tests I need to know the relevant branch name, which would be different for each feature in "OneFlow". Instead, I just use the same branch name dev, so I don't have to change CI scripts each time I make changes in the code (and then change it back to main). This is called "Trunk-based workflow", where all the changes are committed into one development branch, and then if everything works, merged into main. Different developers may have their own forks or branches, but they have to merge into dev instead of main.

The other way to not have the problem with branch names would be to have a single monorepo, and therefore no "integration tests" at all. Different developers can work on different projects in the same repo, just in different subfolders, so there will be no merge-conflicts (or, they will be quite easy to resolve). I would prefer this instead of multiple repos, but so far I just follow the way projects were already organized.

Also, "OneFlow" may be better in a situation where several developers work on the same set of source files at the same time. I consider this a bad practice in general (and we don't have so many developers anyway). It's better to organize tasks in a way so each developer works on separate source files.

Workflow guidelines

I suggest to also add some basic rules to general guidelines:

  • There's no need for PR if only one developer works on a project, just merge directly;
  • Have descriptive names for PRs;
  • Have some motivation for changes in PR descriptions.
@ichorid
Copy link
Contributor

ichorid commented Sep 26, 2024

@automainint - thanks for an insightful opinion. Let me address the main points of it:

About the docs

There are two types of documentation: external and internal. The external documentation is targeted at the user of a library and aims to facilitate adoption, while the internal focuses on making it easier for the external contributor to make meaningful improvements in the project's code. Both types are crucial for a project, but I'd say that the external one is more important at earlier stages: people rarely come directly to contribute to something, but rather start contributing after becoming users.

So, I do not see the internal documentation as a priority. I see more of a priority in the following...

Monorepo

I agree that tightly coupled projects are much easier to handle from a single repo. So, I'm pro-merging the
meritrank-psql-connector, meritrank-service-rust, meritrank-rust repos into a single one. My only concern is the ability to handle multiple projects/crates from a single repo properly. I'm not a specialist in Rust Cargo standards, so if you (@automainint) volunteer to experiment with handling multiple crates/apps from a single repo - go for it, there is value in it.
Merging the repos would also solve the problem with "OneFlow vs Trunk"...

OneFlow vs Trunk

I've read the link about the "Trunk-based workflow". In my understanding, it describes the OneFlow development style exactly, so "Trunk" is just another name for "OneFlow". The only difference is that in OneFlow you create specific release branches/tags out of main to stabilize it. So, I don't see any value in first merging to dev and then -> main: if people actively put their half-baked code to dev it will never stabilize. That's why in OneFlow they just merge everything to main and create stable releases from it. And again, the rule of "your PR is not going to be reviewed if it breaks CI/CD` makes OneFlow with Trunk effectively the same.
So, for our case, I don't care how we call it: OneFlow or Trunk, the idea is very simple:

  1. Merge your PRs into main
  2. Your PRs should never break CI/CD, all the checks must be green 🟢 . If you have half-baked code, use feature flags to circumvent it

Everything else (branches, tags, releases, etc.) - is a gimmick.

Of guidelines and private dirs

Fortunately, we typically work on different parts of the project, but we still sometimes have to make slight adjustments to each other's code. So, all the proposed rules ("merge directly," etc.) make sense as long as we don't doctor the CI/CD to push our stuff. That kind of change must be communicated.

@automainint
Copy link
Author

automainint commented Sep 26, 2024

@ichorid

My only concern is the ability to handle multiple projects/crates from a single repo properly.

I don't think there's any difference in managing crates. However, my preference is to not use package managers at all.

I don't see any value in first merging to dev and then -> main: if people actively put their half-baked code to dev it will never stabilize. That's why in OneFlow they just merge everything to main and create stable releases from it.

Currently, there is a CI script that publishes a docker image for the main branch automatically. I can change this to publish for tags or manually.

As I said, projects are dependent on each other. If a feature is integral, I can't just have all green CI in one repo. I will have to merge untested changes in one repo in order to test the other repo (or change CI scripts constantly back and forth, which is worse); and this testing may fail, so I will end up with red changes merged into the branch. (Also, I don't think feature flags can help with this, unless we introduce a lot of extra complications in the development process.)

I can eliminate dev branches, and then main inevitably will have broken code with red CI sometimes.

However, all this will not be a problem for monorepo.

if people actively put their half-baked code to dev it will never stabilize

I don't think this is relevant for the workflow assessment. In any workflow programmers can put bad code in main branch, or write inconsistent code in feature branches and then have merge hell, etc. But I think this is actually important to specify in guidelines. Red CI is an indicator that should be used consciously. Fixing inconsistencies in the code should have higher priority than implementing new features.

In conclusion

I see two possible solutions

  • Make a monorepo;
  • Use one branch and multiple repos; accept that main branch will have red CI sometimes.

@ichorid
Copy link
Contributor

ichorid commented Sep 27, 2024

@automainint - while I don't share your PoV regarding avoiding package managers, I agree switching to monorepo pattern will save us a lot of effort in this particular case. So, I propose taking the following actions:

  1. Combine meritrank-psql-connector, meritrank-service-rust, meritrank-rust into a monorepo (with packages).
  2. Adopt the practice of merging into main directly by default, without requiring PR reviews, if all the checks are green.
  3. Change CI/CD to produce Docker images when a tag with a certain format (e.g. release-vX.Y.Z) is pushed into any branch in the repo. This way, we could branch off the main to have "temporary stabilization branches" as necessary, without affecting main

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

2 participants