Refactored data clumps with the help of LLMs (research project) #3867

Merged 2 commits on Jul 2, 2024

compf commented Jun 21, 2024

Overview

Hello maintainers,

I am working on a master's thesis project focused on improving code quality through automated refactoring of data clumps, assisted by Large Language Models (LLMs).

Data clump definition

A data clump exists if

  1. two methods (in the same or in different classes) have at least three parameters in common and neither method overrides the other, or
  2. at least three fields in a class match the parameters of a method (in the same or in a different class), or
  3. two different classes have at least three fields in common.

See also the following UML diagram as an example:
[Image: Example data clump]
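
To make the first rule of the definition concrete, here is a minimal sketch in Java. All names (`ShapesBefore`, `Point3`, `ShapesAfter`) are hypothetical and do not come from the JUnit codebase; the sketch only illustrates how three parameters shared by two methods might be extracted into one small class.

```java
// Hypothetical example; none of these names exist in the JUnit codebase.

// Before: both methods take the same three parameters (x, y, z).
// Under rule 1 of the definition, this is a data clump.
class ShapesBefore {
    static String describe(int x, int y, int z) {
        return "(" + x + ", " + y + ", " + z + ")";
    }

    static int sum(int x, int y, int z) {
        return x + y + z;
    }
}

// After: the three values that always travel together get their own class.
final class Point3 {
    final int x;
    final int y;
    final int z;

    Point3(int x, int y, int z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

// The methods now take a single parameter instead of the clump.
class ShapesAfter {
    static String describe(Point3 p) {
        return "(" + p.x + ", " + p.y + ", " + p.z + ")";
    }

    static int sum(Point3 p) {
        return p.x + p.y + p.z;
    }
}
```

The behavior is unchanged; only the signatures become shorter, and a later change to the group of values (for example, adding a fourth component) would touch one class rather than every method signature.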

I believe these refactorings can contribute to the project by reducing complexity and enhancing the readability of your source code.

Pursuant to the EU AI Act, I fully disclose the use of LLMs in generating these refactorings, emphasizing that all changes have undergone human review for quality assurance.

Even if you decide not to integrate my changes into your codebase (which is perfectly fine), I would appreciate it if you filled out a feedback survey, which will be scientifically evaluated to determine the acceptance of AI-supported refactorings. You can find the survey at https://campus.lamapoll.de/Data-clump-refactoring/en

Thank you for considering my contribution. I look forward to your feedback. If you have any other questions or comments, feel free to write a comment or email me at [email protected].

Best regards,
Timo Schoemaker
Department of Computer Science
University of Osnabrück

I hereby agree to the terms of the JUnit Contributor License Agreement.


Definition of Done

mpkorstanje left a comment

I'm not part of the JUnit Team, but I thought this PR looked interesting, so I left some suggestions. I hope you find them useful.

Overall I think this refactoring has the right idea. Extracting the counter was the right action and does improve things. But this refactoring also unlocked some benefits that are not yet utilized.

}

@Test
void classLevelExceptionHandlersRethrowException() {
    LauncherDiscoveryRequest request = request().selectors(selectClass(RethrowingTestCase.class)).build();
    EngineExecutionResults executionResults = executeTests(request);

    assertEquals(1, RethrowExceptionHandler.beforeAllCalls, "Exception should handled in @BeforeAll");
    assertEquals(1, RethrowExceptionHandler.afterAllCalls, "Exception should handled in @AfterAll");
    assertEquals(1, RethrowExceptionHandler.callCounter.getBeforeAllCalls(),

With the introduction of the HandlerCallCounter there is no need to store the counter as a field in the handler. You could now declare a HandlerCallCounter rethrowCounter as a static field on LifecycleMethodExecutionExceptionHandlerTests and reference that everywhere. For example:

assertEquals(1, rethrowCounter.getBeforeAllCalls(), "Exception should handled in @BeforeAll");

With some thoughtfully chosen method and field names this would reduce the verbosity of the code a bit and keep the whole assertion on a single line.
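
A rough sketch of what this suggestion could look like follows. It is deliberately simplified and runs without JUnit: `LifecycleHandlerTestsSketch`, `handleBeforeAll`, and `simulateClassLevelFailure` are hypothetical stand-ins, the real handler implements a JUnit extension interface with a different method signature, and only two of the four counters are shown.

```java
// Simplified stand-in for the extracted counter; only two counters are shown.
class HandlerCallCounter {
    private int beforeAllCalls;
    private int afterAllCalls;

    void incrementBeforeAllCalls() { beforeAllCalls++; }
    void incrementAfterAllCalls() { afterAllCalls++; }
    int getBeforeAllCalls() { return beforeAllCalls; }
    int getAfterAllCalls() { return afterAllCalls; }
}

// Hypothetical stand-in for the test class named in the comment above.
class LifecycleHandlerTestsSketch {

    // The counter lives on the test class, not inside the handler.
    static final HandlerCallCounter rethrowCounter = new HandlerCallCounter();

    // Hypothetical handler: it records the call on the shared counter, then rethrows.
    static class RethrowExceptionHandler {
        void handleBeforeAll(RuntimeException failure) {
            rethrowCounter.incrementBeforeAllCalls();
            throw failure;
        }
    }

    // Stands in for running a test class whose @BeforeAll method fails.
    static void simulateClassLevelFailure() {
        try {
            new RethrowExceptionHandler().handleBeforeAll(new RuntimeException("boom"));
        } catch (RuntimeException expected) {
            // The handler rethrows on purpose; nothing else to do here.
        }
    }
}
```

With this layout, an assertion inside the test class can refer to `rethrowCounter.getBeforeAllCalls()` directly, without going through the handler class.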

Comment on lines 30 to 60
public void incrementBeforeAllCalls() {
    beforeAllCalls++;
}

public void incrementBeforeEachCalls() {
    beforeEachCalls++;
}

public void incrementAfterEachCalls() {
    afterEachCalls++;
}

public void incrementAfterAllCalls() {
    afterAllCalls++;
}

public int getBeforeAllCalls() {
    return beforeAllCalls;
}

public int getBeforeEachCalls() {
    return beforeEachCalls;
}

public int getAfterEachCalls() {
    return afterEachCalls;
}

public int getAfterAllCalls() {
    return afterAllCalls;
}

Since this class is only a local test helper, I don't think we need to encapsulate the fields. Therefore, I think these methods should be inlined.
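
For illustration, the inlined version might look roughly like the sketch below. The class names `InlinedCallCounter` and `InlinedCallCounterDemo` are hypothetical, and this is an assumption about the shape of the result, not the code that was merged.

```java
// Hypothetical sketch of the helper after inlining: plain package-private
// fields, no increment methods, no getters.
class InlinedCallCounter {
    int beforeAllCalls;
    int beforeEachCalls;
    int afterEachCalls;
    int afterAllCalls;
}

// Hypothetical caller showing how handlers and assertions would use the fields.
class InlinedCallCounterDemo {
    static InlinedCallCounter recordOneClassLifecycle() {
        InlinedCallCounter counter = new InlinedCallCounter();
        counter.beforeAllCalls++; // was: counter.incrementBeforeAllCalls()
        counter.afterAllCalls++;  // was: counter.incrementAfterAllCalls()
        return counter;
    }
}
```

Reads then become plain field accesses such as `counter.beforeAllCalls`, which seems acceptable for a helper that is only visible to the tests.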

compf commented Jul 1, 2024

Thank you very much for the feedback. @marcphilipp I have just applied the suggested changes.

@mpkorstanje While I like your reasoning, I think this should be part of a larger redesign of the test classes. I haven't implemented your proposal for now, but I would be happy to do so if others concur.

compf force-pushed the refactor_data_clumps branch from eb0cf99 to 3f75c2f on July 1, 2024 18:19
marcphilipp added this to the 5.11 M3 milestone on Jul 2, 2024
marcphilipp merged commit 50b2856 into junit-team:main on Jul 2, 2024
16 checks passed
marcphilipp commented

@compf Thanks! 👍

3 participants