Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Test Failure] Test failure in 32bit defragmentation #1344

Closed
madolson opened this issue Nov 24, 2024 · 9 comments · Fixed by #1370
Closed

[Test Failure] Test failure in 32bit defragmentation #1344

madolson opened this issue Nov 24, 2024 · 9 comments · Fixed by #1370
Assignees
Labels
test-failure An issue indicating a test failure

Comments

@madolson
Copy link
Member

Ever since we merged this commit devendoring jemalloc, we are seeing a consistent test failure. It is not failing on any other the other configurations.

See 32 bit test failures:

  1. https://github.com/valkey-io/valkey/actions/runs/11963892902/job/33355221669
  2. https://github.com/valkey-io/valkey/actions/runs/11982115030/job/33409589244
  3. https://github.com/valkey-io/valkey/actions/runs/11991491528/job/33430151931
@madolson madolson added the test-failure An issue indicating a test failure label Nov 24, 2024
@madolson
Copy link
Member Author

@zvi-code Can you take a look?

@zvi-code
Copy link
Contributor

yes, looking

@zvi-code
Copy link
Contributor

zvi-code commented Nov 24, 2024

@madolson building on ubuntu x86 with 32bit flag, as instructed here, does not reproduce the issue and tests pass successfully. Any advise on how to reproduce?

@zvi-code
Copy link
Contributor

Looking at the failures

*** [err]: Active defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 1.43 >= 1.7 (context: type eval line 40 cmd {assert {$frag >= $expected_frag}} proc ::test)
Cleanup: may take some time... OK
*** [err]: Active defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 1.62 >= 1.7 (context: type eval line 40 cmd {assert {$frag >= $expected_frag}} proc ::test)
Cleanup: may take some time... OK
*** [err]: Active defrag big list: standalone in tests/unit/memefficiency.tcl
Expected 1.30 >= 1.7 (context: type eval line 40 cmd {assert {$frag >= $expected_frag}} proc ::test)
Cleanup: may take some time... OK

This failures mean we didn't reach the desired fragmentation, it's not strictly related to defrag mechanism, it's the preconditioning phase of the defrag that is not met. My PR showed slight improvement in defrag time, so maybe it also affected timing somehow

@zuiderkwast
Copy link
Contributor

I think it's a bit random, so run 10 or 100 times to reproduce it.

The test case should work harder and create more fragmentation to be sure to meet the pre-condition?

@zvi-code
Copy link
Contributor

This is my assumption, maybe wait a little longer, because we see the fragmentation achieved is not identical every time it runs (we do not validate value is in some range, only check it's at least some value), this indicates the test is not deterministic

@zvi-code
Copy link
Contributor

zvi-code commented Nov 27, 2024

I was able to reproduce it (only using the daily.yml). After several attempts, I figured the default config for lazyfree-lazy-user-del is yes, so when we delete an object it depends on the bio lazy free thread to when it will be actually freed.
When disabling lazyfree for the specific test that fails the 32bit passes successfully.

I will run more tests, but assuming it's validated, I would suggest running defrag without lazyfree at least for the pre-conditioning stage, WDYT? @madolson , @zuiderkwast, @ranshid

@zuiderkwast
Copy link
Contributor

Good findings! @enjoy-binbin changed to lazy del by default in #913

@zuiderkwast
Copy link
Contributor

Let's disable lazy del for this test case, yes. That was done in other test cases when we changed the default but we must have missed this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test-failure An issue indicating a test failure
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants