Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Development Workflow #1

Closed
germonprez opened this issue Jan 20, 2017 · 18 comments
Closed

Development Workflow #1

germonprez opened this issue Jan 20, 2017 · 18 comments
Assignees
Labels
feature-request Request for a new feature in Augur

Comments

@germonprez
Copy link
Contributor

Hi all,

I strongly encourage us to create a formalized workflow around the development of this system. This should include things like:

  1. Contributor agreements
  2. Defined processes for accepting pull requests
  3. Defined processes for merging branches (we should have master and development)

matt

@howderek howderek self-assigned this Jan 20, 2017
@howderek howderek added the feature-request Request for a new feature in Augur label Jan 25, 2017
@sgoggins
Copy link
Member

sgoggins commented Feb 2, 2017

We are a small team. For now lets use the slack channel to coordinate.

@GeorgLink
Copy link
Member

During our interviews, we have seen a lot of interest in our work and we might see people drop in to check on the status occasionally. Slack is a closed system that others cannot follow our progress.

I believe that coordinating the development work through GitHub gives us the advantage of gaining a sense for what the metrics that we calculate originate from. We might even be able to determine the health of our micro-community :-)

@howderek
Copy link
Contributor

howderek commented Feb 2, 2017

@GeorgLink communication in Slack would be limited to dividing development work between us. This is a common practice, IRC and development have gone hand in hand for ages.

Examples:

It makes sense to use Slack/Branches for core development and things like issues and shared documents for big picture public discussion. Outsiders are able to follow development through the commit log. When we have a working version of ghdata we can focus on creating a more sustainable, open project. It's hard to attract contributors without fundamental functionality anyway.

@howderek howderek added this to the 1.0.0 milestone Feb 2, 2017
@GeorgLink
Copy link
Member

Indeed, Slack, IRC, Skype etc. are great for quick communication and keeping us on the same page.
I am merely making a case for using issues and pull requests for building up a public knowledge base and history. It is not my place to tell anyone how to work on this project. I believe that defined processes help us collaborate with others and building habits early are advantageous for later. :-)

@germonprez
Copy link
Contributor Author

I'm fine with Slack. Not sure what the more general OSS world's use of Slack is. IRC is generally the way to go on this one. Can we make Slack a public discussion where anyone can join? I would prefer this.

Also, I chatted with Derek and Sean yesterday. We agreed that no one can accept their own pull request -- I would further like to note that I think all changes should go through pull requests. Just work on your local clone and issue the request.

@howderek
Copy link
Contributor

howderek commented Feb 3, 2017

@germonprez We should use branches and then merge the branches after discussing them with our colleagues. Pull requests are generally for contributors outside the core team, and branches are the most common way to internally collaborate in git. If you are concerned that our current workflow is not descriptive enough, we can communicate through issues and use the feature-branch workflow.

@GeorgLink
Copy link
Member

@howderek just to clarify. Are you suggesting that all of us have an individual branch in the main repository and then issues pull requests to the main development branch so that we can review each other's changes?

@howderek
Copy link
Contributor

howderek commented Feb 3, 2017

@GeorgLink I'm suggesting that we create a new branch for every issue and when that issue is resolved, we merge it into dev, and then when it's tested, we merge dev into master

@GeorgLink
Copy link
Member

GeorgLink commented Feb 3, 2017 via email

@germonprez
Copy link
Contributor Author

@howderek how is a merge not seen as a pull request in this case?

I was thinking that

  1. Each of us has a clone of ghdata (all branches) locally. This is were we work
  2. We commit to our local clone
  3. We push to our fork of ghdata (whatever branch is appropriate -- dev or exploratory)
  4. We issue a pull request on ghdata (main) to merge our updated/changed branch from our fork

Discussion can then happen on that pull request (if needed). If it's not needed then have someone else on the team merge. Perhaps there is a smoother way to tackle this.

@howderek
Copy link
Contributor

howderek commented Feb 3, 2017

@germonprez we have to be able to make changes without review or development will be very slow. The object of forks -> pull requests is so the maintainers can make sure someone without commit rights changes are safe before merging them in.

We should each make a branch for the individual issues. If they are non-breaking changes, we should commit them to dev. Dev is okay to break, but we should do our due diligence to test before pushing anything. We should always talk and make sure nothing is broken before merging dev -> master.

If there are significant changes that need peer review, we can make pull requests to merge a branch into another branch. (So yes @GeorgLink we can! :) ). We should not have our own forks. We are the core team and have commit rights, creating our own forks unnecessarily complicates things.

Also @GeorgLink regarding branches "Would we clutter up the repository with numerous open and closed branches", that's why you don't publish your branches. You rebase and then publish to dev.

Here is how the workflow would usually look, assuming you're in the dev branch and your local branch is in sync with master:

Doing work locally (small change, you're sure it won't interfere with other people's work):

git checkout -b your_feature
vim new_feature.py
git add new_feature.py
git commit -m "blahblahblah for my own reference"
vim new_feature.py
git commit -m "more things I need to know"
git checkout dev
git merge --squash your_feature
git commit -m "Message for everyone else"
git push

Large work (Probable conflicts):

git checkout -b your_feature
vim shared_file.py
git commit -m "Message for everyone about this"
vim another_shared_file.py
git commit -m "More things you need to know"

# It's been a few days
git stash
git checkout dev
git pull
git checkout your_feature

# Time to make your changes
git rebase --interactive dev
# Edit the file that appears so that your commits will make sense to others
# Manually fix merge conflicts
git rebase --continue
git push

Here is a great guide: https://sandofsky.com/blog/git-workflow.html

@abuhman
Copy link
Contributor

abuhman commented Feb 3, 2017

That was an interesting link, Derek. I didn't know there was a way to 'hide' transitional commits that cause problems (generally I've labeled them with something like "this commit will break things, don't use it" or something)

Here's what teams I have been on have done in the past as far as git workflow:

There is a dev branch and a master branch.

From the dev branch, feature branches/ bug fix branches are created.

Ideally, there is only one new thing/one fix per branch, or at least only one thing changed per commit. This means that if a bug emerges later that didn't reveal itself during testing, there is the option of simply removing the offending commit(s) without breaking everything else.

Feature branches are merged into the dev branch by each developer once they have been tested and are working locally. The code on the dev branch is intended to be working (though less vetted than master), but feature branches may in intermediate stages contain code that breaks things.

The dev branch has pull requests into master (which are approved after further testing and review of the code)

One problem I have seen with the above is that if dev is not merged into master regularly, then new changes build up and the testing and review process becomes more and more difficult. If we use this option, I would say dev --> master pull requests would need to be somewhat frequent.

Another issue (which has been mentioned above) is that sometimes the pull request into master process slows things down considerably, especially if another developer is needed to approve it who is not available right now. However, since our project isn't live anywhere and to my understanding the deadline is Dec 2017, changes to master are not urgent and likely will not cause problems if delayed briefly.

For non-git workflow:

For code files that cannot be easily obtained through git such as with 'git pull' (for example, if we were to have an environment file whose format had changed) there needs to be a way to notify developers of how to obtain, change, and/or utilize it. If developers don't know how to get set up, or if newly pulled-down develop branch code breaks because there was also an environment file change, this causes delays as developers try to figure out what went wrong. This would need to be something we did any time such a file was updated.

We should also have instructions on how the code is intended to be run/utilized, as well as any initial setup that needs to happen before it can be run.

@abuhman
Copy link
Contributor

abuhman commented Feb 7, 2017

Ideas that were brought up over the weekend regarding this:

If someone steps on your toes or changes something you needed, talk to them about it.

Possibility of intent statement for files or repositories if they aren't meant to be part of the general workflow (example: "This is Bobby Bobberson's experimental file. Please don't edit")

@germonprez
Copy link
Contributor Author

Thanks for the great comments. It sounds like you all are settling into a development pattern for the project. I do defer to @howderek and @abuhman for getting this flow down. Thanks all.

@howderek
Copy link
Contributor

howderek commented Feb 13, 2017

I was talking with @ChristianCme, and he wants to use a kanban board to divide/plan the work. I agree that an asynchronous and straightforward method such as a kanban board is useful for visualizing who's doing what and what needs to be done.

If we do so, and I think we should, coordination will look like this:

Large Features/Roadmap - Wiki
 |
 |____ Tasks - Kanban Board, also Issues for more complicated tasks
        |
        |___Division of those tasks -  Slack/Chat

What do you all think?

@GeorgLink
Copy link
Member

I think this might work for us.
Would we use the Projects in GitHub for the Kanban board?

@howderek
Copy link
Contributor

@GeorgLink Yes! I already made one for Metrics that we will put the metrics you wrote in CCCIndicators

@GeorgLink
Copy link
Member

I like it!
In light of the discussion for maintaining one repository or two, how easy is it to move the project to another repository?

@germonprez germonprez mentioned this issue Apr 8, 2017
@sgoggins sgoggins modified the milestones: 0.5.0: Festivus Down, 0.4.0: Red Pumpkin Dec 17, 2017
sgoggins added a commit that referenced this issue Jun 6, 2019
Fix indentation bug in jobs.py
sgoggins pushed a commit that referenced this issue Dec 28, 2020
Updating my dev branch from the main dev branch at chaoss/augur
sgoggins pushed a commit that referenced this issue Dec 28, 2020
sgoggins pushed a commit that referenced this issue Jan 4, 2021
sgoggins pushed a commit that referenced this issue Jul 13, 2021
sgoggins pushed a commit that referenced this issue Dec 13, 2021
Added notes and requirements docs for Sprint 1
sgoggins pushed a commit that referenced this issue Apr 14, 2022
Updated a typo error on line 196.
sgoggins added a commit that referenced this issue Apr 26, 2022
```
2022-04-14 17:10:28,581,581ms [PID: 984587] workers.pull_request_worker.50471 [DEBUG] Extracting data from source in pr model. ERROR: invalid literal for int() with base 10: '909941810.0'
Traceback (most recent call last):
  File "/bigdisk/augur-vmware/workers/pull_request_worker/pull_request_worker.py", line 554, in pk_source_increment_insert
    prs_insert = [
  File "/bigdisk/augur-vmware/workers/pull_request_worker/pull_request_worker.py", line 558, in <listcomp>
    'pr_src_id': int(str(pr['id']).encode(encoding='UTF-8').decode(encoding='UTF-8')),#1-22-2022 inconsistent casting; sometimes int, sometimes float in bulk_insert
ValueError: invalid literal for int() with base 10: '909941810.0'
2022-04-14 17:10:28,581,581ms [PID: 984587] workers.pull_request_worker.50471 [INFO] 0 insertions are needed and 0 updates are needed for pull_requests
2022-04-14 17:10:28,581,581ms [PID: 984587] workers.pull_request_worker.50471 [INFO] Preparing to enrich data.

2022-04-14 19:07:07,129,129ms [PID: 984587] workers.pull_request_worker.50471 [DEBUG] Extracting data from source in pr model. ERROR: 'cntrb_id'
Traceback (most recent call last):
  File "/bigdisk/augur-vmware/workers/pull_request_worker/pull_request_worker.py", line 554, in pk_source_increment_insert
    prs_insert = [
  File "/bigdisk/augur-vmware/workers/pull_request_worker/pull_request_worker.py", line 572, in <listcomp>
    int(pr['cntrb_id']) ## cast as an int because of an otherwise inexplicable error.
KeyError: 'cntrb_id'
2022-04-14 19:07:07,130,130ms [PID: 984587] workers.pull_request_worker.50471 [INFO] 0 insertions are needed and 0 updates are needed for pull_requests
2022-04-14 19:07:07,130,130ms [PID: 984587] workers.pull_request_worker.50471 [INFO] Preparing to enrich data.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request Request for a new feature in Augur
Projects
None yet
Development

No branches or pull requests

5 participants