Improve code authorship attribution #944
Comments
Some additional implementation details:
|
So in this algorithm, given the flow:
The full credit in this case will go to A or B? |
For this case, A will be given partial credit. After this change, the author who last modified the line will still get credit as before, but this algorithm will determine if he/she gets partial or full credit. That means for the line described above, B will still not receive any credit. When tracing the "ancestry" of a line, the algorithm terminates and gives partial credit once a different author is found. So for the line described, once we find out that B authored the previous version, we will give A partial credit and analysis will stop there. This is less rigorous, but saves the time of having to trace the ancestry of the line all the way back to the start. Hope this clarifies, do let me know if you think there are areas which could be improved. Thanks! |
Just to ask and clarify:
|
Thanks for clarifying! To answer your questions:
Yup correct.
Yup correct, strictly speaking >= 0.8.
Yup correct. |
One more question, how far do we go in terms of tracing the ancestry of a line? |
Currently there is no limit to how far back we trace the ancestry. The current way it works is that the tracing will stop if:
We continue tracing as long as the ancestor line has a similarity value >= 80% and was written by the same author (the author who last modified the line). In the worst case, tracing stops when there are no more ancestor lines, i.e. at the commit where the file was first added. |
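The tracing rule described in this thread can be sketched roughly as below. This is only an illustration, not the actual implementation: `LineVersion`, `similarity`, and `trace_credit` are hypothetical names, `difflib.SequenceMatcher` is a stand-in for whatever similarity measure is really used, and giving full credit when the ancestor is not similar enough is an assumption (the thread only states when partial credit is given).

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

SIMILARITY_THRESHOLD = 0.8  # the ">= 0.8" cutoff mentioned in the thread


@dataclass
class LineVersion:
    """Hypothetical record for one version of a line."""
    author: str
    text: str
    parent: Optional["LineVersion"] = None  # previous version; None at file creation


def similarity(a: str, b: str) -> float:
    # Stand-in metric (assumption): ratio in [0, 1], 1.0 for identical strings.
    return SequenceMatcher(None, a, b).ratio()


def trace_credit(line: LineVersion) -> str:
    """Return "full" or "partial" credit for the author who last modified the line."""
    current = line
    while current.parent is not None:
        ancestor = current.parent
        if similarity(current.text, ancestor.text) < SIMILARITY_THRESHOLD:
            # Assumption: a dissimilar ancestor means the line counts as new work.
            return "full"
        if ancestor.author != line.author:
            # A different author wrote a similar earlier version: stop early.
            return "partial"
        current = ancestor  # same author, similar text: keep tracing
    return "full"  # reached the commit where the file was first added
```

For the A/B case discussed above, a line last touched by A whose similar previous version was written by B would come out as `"partial"` for A, and tracing would stop at that point.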
Would it be faster if we just find the original author and the last modifier of the line? |
If we just find the original author, we would still need to trace the ancestry of the line but we save the time needed to check the author of each of the ancestor lines during tracing. On the other hand, the current method checks the author of each of the ancestor lines but is able to terminate early once a different author is found. The author of a line is obtained from the results of the |
@SkyBlaise99 it's been a while since someone looked at this PR. Perhaps you can send a cleaned up PR and we can get the current active developers to take a fresh look? |
Sure prof @damithc, I'm planning to make a new PR since the old one has too many merge conflicts to resolve. |
AnnotatorAnalyzer only overwrites the author but not the credit information when an author tag is found. If the analyze authorship flag is enabled, credit information based on the blame author will be wrongly inherited by the annotated author. Let's assign partial credit if the annotated author is not the same as the blame author, and keep the analyzed credit information if the two are the same.
* [#2027] Fix date range bug (#2034) Currently, users are unable to select a zoom range that includes the until date. This results in misleading data being presented to users. * [#2039] Update cypress minimum requirement to 12.15.0 (#2041) Chrome bug is causing cypress to fail to open a browser on Github Actions, causing frontend tests and CI to fail. Upgrading cypress to greater than 12.15.0 will fix this issue. Let's upgrade cypress to fix the failing CI. * [#1936] Migrate c-segment.vue to typescript (#2035) Currently, there is still some JavaScript code which remains unmigrated. This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate the rest of the JavaScript code to TypeScript code to facilitate future changes to the code. * [#1936] Migrate load-font-awesome-icons.js to typescript (#2040) Currently, there is still some JavaScript code which remains unmigrated. This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate the rest of the JavaScript code to TypeScript code to facilitate future changes to the code. * [#2045] Fix cypress zoom feature test (#2047) Currently, Cypress zoom feature tests are failing due to a recent change in behavior caused by a bug fix. With the tests failing, we are unable to detect any future regressions. Let's update the Cypress tests to test for the new intended behavior. * [#1936] Migrate random-color-gen.js to typescript (#2043) Currently, there is still some JavaScript code which remains unmigrated. This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate random-color-generator.js JavaScript code to TypeScript code to facilitate future changes to the code. * [#1936] Migrate c-segment-collection.vue to typescript (#2036) Currently, there is still some JavaScript code which remains unmigrated. 
This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate the rest of the JavaScript code to TypeScript code to facilitate future changes to the code. * [#1936] Migrate c-resizer.vue to typescript (#2038) Currently, there is still some JavaScript code which remains unmigrated. This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate the rest of the JavaScript code to TypeScript code to facilitate future changes to the code. * Bump zod from 3.20.6 to 3.22.3 in /frontend (#2048) Bumps [zod](https://github.com/colinhacks/zod) from 3.20.6 to 3.22.3. - [Release notes](https://github.com/colinhacks/zod/releases) - [Changelog](https://github.com/colinhacks/zod/blob/master/CHANGELOG.md) - [Commits](colinhacks/zod@v3.20.6...v3.22.3) --- updated-dependencies: - dependency-name: zod dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @cypress/request and cypress in /frontend/cypress (#2042) Bumps [@cypress/request](https://github.com/cypress-io/request) to 3.0.1 and updates ancestor dependency [cypress](https://github.com/cypress-io/cypress). These dependencies need to be updated together. Updates `@cypress/request` from 2.88.12 to 3.0.1 - [Release notes](https://github.com/cypress-io/request/releases) - [Changelog](https://github.com/cypress-io/request/blob/master/CHANGELOG.md) - [Commits](cypress-io/request@v2.88.12...v3.0.1) Updates `cypress` from 12.17.4 to 13.3.0 - [Release notes](https://github.com/cypress-io/cypress/releases) - [Changelog](https://github.com/cypress-io/cypress/blob/develop/CHANGELOG.md) - [Commits](cypress-io/cypress@v12.17.4...v13.3.0) --- updated-dependencies: - dependency-name: "@cypress/request" dependency-type: indirect - dependency-name: cypress dependency-type: direct:development ... 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [#1936] Migrate c-ramp.vue to typescript (#2037) Currently, there is still some JavaScript code which remains unmigrated. This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate the rest of the JavaScript code to TypeScript code to facilitate future changes to the code. * Give partial credit if annotated author is not the same as the blame author * [#2054] Fix zoom view bug (#2055) Currently, when granularity is set to day or week, clicking on a ramp will open up a zoom view where commit messages are not being displayed and sorting by insertions does not result in any sorting. Let's fix the unintended behaviour of the zoom view. * [#1936] Migrate repo-sorter.js to typescript (#2052) Currently, there is still some JavaScript code which remains unmigrated. This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate repo-sorter.js to TypeScript code to facilitate future changes to the code. * [#1936] Migrate safari_date.js to typescript (#2053) Currently, there is still some JavaScript code which remains unmigrated. This allows for type unsafe code to be written, potentially resulting in unintended behavior. Let's migrate safari_date.js to TypeScript code to facilitate future changes to the code. * Remove frontend JS lint (#2063) Currently, frontend linter is failing due to lint scripts checking javascript files, the last of which has been removed in PR #2053. Lets update the lint command to exclude javascript files front the check. * use full and partial credit color * [#1929] Add dynamic positioning support for tooltips (#2056) Currently, most tooltips are shown above buttons and text. When these tooltips appear at the top of the viewport, part of the tooltips will not be rendered. 
Let's implement changes such that these tooltips appear below the text or button, when appearing at the top of the viewport. * Add test cases for annotated author overriding last author's credit * revert merge from master * revert merge from master 58b7002 * [#1928] Fix tooltip zIndex such that it doesn't occlude next file title (#2057) Currently, if one hovers over a tooltip of the pinned title of a file whose content is scrolled almost completely, such that the title of the next file is just below the pinned title, the tooltip is not displayed appropriately, as the title of the next file obstructs it. Let's fix this issue. * [#1726] Update GitHub-specific references in codebase and docs (#2050) There are still leftover references specific to GitHub on parts of the codebase and docs that have been generalized to accept other remote git hosts. Let's update these GitHub references to use more general language. * Trigger workflow * Revert "Merge branch 'master' into 944-analyze-authorship" This reverts commit 950c912, reversing changes made to 4bd05a7. * fix frontend test failing --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: jq1836 <[email protected]> Co-authored-by: Chan Jun Da <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Pratham Jain <[email protected]>
* switch to originality score and threshold * update originality threshold * revert frontend code changes
Currently, when viewing individual contributions, full and partial credit are differentiated with dark and light green colors. However, this distinction is not applied when the group is merged, potentially causing confusion for users. Let's introduce a clear differentiation between full and partial credit when viewing a merged group.
The existing setup employs a static originality threshold of 0.51. However, this threshold is tailored for code such as Java or Markdown, and might not be suitable for other languages. Additionally, it doesn't offer flexibility for users who want a stricter threshold and are willing to endure longer processing times, or for those who prefer a more lenient threshold in exchange for faster analysis. Let's enable users to input their preferred originality threshold.
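As a rough illustration of what a user-supplied threshold could look like, here is a minimal sketch. The names `originality_score` and `credit_for` are hypothetical, and the score definition (one minus a `difflib` similarity ratio) is an assumption made for this example; the text above gives only the default value of 0.51, not the formula.

```python
from difflib import SequenceMatcher

DEFAULT_ORIGINALITY_THRESHOLD = 0.51  # the static value quoted above


def originality_score(line: str, ancestor: str) -> float:
    """Hypothetical score in [0, 1]: 0.0 means identical to the ancestor line."""
    return 1.0 - SequenceMatcher(None, line, ancestor).ratio()


def credit_for(line: str, ancestor: str,
               threshold: float = DEFAULT_ORIGINALITY_THRESHOLD) -> str:
    """Return "full" if the line is original enough relative to its ancestor."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("originality threshold must be between 0 and 1")
    score = originality_score(line, ancestor)
    return "full" if score >= threshold else "partial"
```

Under this sketch, a lower threshold hands out full credit more readily and a higher one is stricter; how the threshold affects analysis time depends on the real algorithm and is not modelled here.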
Currently, the authorship credit analysis process exceeds an hour in duration, which is significantly longer than the mere 5 minutes required when the feature is deactivated. Let's speed up the performance by implementing a caching mechanism and refining the dynamic programming algorithm utilized for computing the Levenshtein distance.
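To make the two ideas concrete, the sketch below combines a memoising cache with a two-row version of the Levenshtein dynamic programme. It is an illustration under assumptions, not the project's code: the function name, the use of `functools.lru_cache`, and the cache size are all choices made for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=100_000)  # cache size is an arbitrary illustrative choice
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows of the DP table."""
    if a == b:
        return 0
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string so each row stays short
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

Keeping two rows instead of the full table reduces memory from O(len(a) * len(b)) to O(min(len(a), len(b))), and the cache avoids recomputing the distance for pairs of lines that recur across commits.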
A line is credited to the author who last modified it. Another author might have written the line initially and the current author only modified it slightly. In such a case, the current author gets credited for work that is not entirely done by him/her. Let's analyze how similar a line is as compared to its ancestor lines (previous versions of the line) and give full or partial credit to the last author based on the analysis.
Currently, in the code panel, an author is credited for the lines of code that they last modified. However, another author might have written the line initially and the current author (the author who last modified the line) might have only made a slight modification to it. In such a case, the current author gets credited for work that is not entirely done by them.
Proposed solution:
Assign partial or full credit for a line to the current author based on whether they introduced the line themselves, or someone else introduced it and they only made a slight modification. This is done by checking how similar the line is to its ancestor line (the previous version of the line). Users enable this feature with a command-line flag; it is turned off by default.
For a given line l and author a who last modified the line, we can assign credit to author a using the following steps:
After we assign partial or full credit to the author, we can display the results in the code panel by using a darker shade of green to highlight code with full credit.