-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HitResults
are not calculated properly at the hit-window edges compared to osu!stable.
#11311
Comments
Given the following, I'm not surprised that lazer is giving a hit window of 60ms: osu/osu.Game.Rulesets.Osu/Scoring/OsuHitWindows.cs Lines 10 to 16 in b448fd1
osu/osu.Game/Beatmaps/BeatmapDifficulty.cs Lines 45 to 53 in b448fd1
Not sure where the 59.5ms figure cited above is coming from. |
Took the value from https://osu.ppy.sh/wiki/en/Beatmap_Editor/Song_Setup#overall-difficulty . When hovering over the beatmap difficulty stats in stable you can also see the hit window for |
this may be an intentional fix, for what it's worth. will need further discussion. |
The Of note: In osu-stable, osu!mania uniformly uses This could be fixed by subtracting 1 from these values, however I propose we keep them as-is and touch upon this in the future. I've labelled this issue as such. |
The compatibility is one thing, but the replay playback errors are another. osu/osu.Game/Scoring/Legacy/LegacyScoreEncoder.cs Lines 126 to 127 in 46be6bc
This means that, for instance, with a timing window of 60ms, if you hit a note at +59.6ms, it will be a 300 in the original play but then will change to +60ms in the replay and thus become a 100. In that case maybe frame times should be truncated/floored when being written out to replay? |
Probably not floored, because it works both ways - |
Probably not floored, because it works both ways - Edit: I guess it doesn't matter since we're not dealing with relative time values... But that means I don't think just flooring is enough to avoid replay playback errors. You'd want to |
relevant wiki page https://github.com/ppy/osu/wiki/Gameplay-differences-from-osu!stable#hit-window-edge-calculations-dont-match-stable (based on a quick glance of the discussion here, it might be missing some details for specific hit windows?) |
In the above wiki pr I've done some investigation about how stable does hit window comparisons. In summary:
A few examples
Some theoreticals
Neither of these two give correct effective hit windows (max error) for any case. The comparisons simply aren't the same. Of course, true hit timing aside, if we only compare replays between stable and lazer where we only see the ints, "ceiled&floored" would match stable in the case of osu!mania, but not for osu! (as in, you can be off by a whole millisecond in lazer but not in stable). Same for "floored&ceiled". In short:
I haven't been able to verify this in-game though as I don't have the tooling or time to do so. Maybe there's a relevant test scene in lazer, or if not maybe one of those could be cooked up (comparing against stable rounding/truncation)?
Rounding the hit time for replays should be correct, but I'm not entirely sure what that implies. Are these rounded hit times used for replay playback of plays set on lazer/stable or only for .osr exports?
I haven't been able to find anything of the sort in the stable reference code. In stable, taiko misses are given here, the comparison for which (<=) can be seen here and here (variable name for the "miss window" is a bit misleading). Greats and Ok's are compared with <. There's no difference between regular and large taiko notes from what I can tell.
Both stable and lazer use absolute hit error, so it's always comparing two positive values. There's no special case to consider here, especially since hit times are rounded in stable anyway. osu/osu.Game/Rulesets/Scoring/HitWindows.cs Lines 127 to 138 in d38316b
Where |
There are two issues presented here: one is the hit error as you're playing the game, and the other is the timing error of replays. Legacy replays only store integer time deltas whereas lazer deals with floating-point time, so lazer rounds the time values to write into the replay. Suppose you have a hitobject at 500ms. The user hits a circle at -59.6ms delta lazer-time (i.e. 440.4ms). Should this value be rounded up or down for storage in the replay. Conversely, suppose the user hits at +59.6ms, should this value be rounded up or down?
I believe your assessment here, and for the other hit results, is correct. |
Rounding the value emulates the rounded-integer temporal granularity of stable. With that, the only difference for replays in lazer (with integer time values) is that they're still compared against doubles, non-floored hit windows (as well as the This affects the lazer part of the examples I presented I suppose. I've updated them with lazer replay timings and also made it all into a table.
Well I thought through the implications of this... (see updated examples section) but I can't convince myself that this would solve anything.
|
Reading through this again, I see two options:
This one is probably too technical for community discussion so we'll probably have to come to a conclusion internally. @ppy/team-client anyone wants to sound off on this, or is this going to be a matter of "try both and see which works best"? |
is there a test scene for this? it would be invaluable for both outlined options |
I don't think this is something that is a blocker for the path-to-ranked project since it mostly affects (very few) replays and is otherwise unnoticeable in gameplay. We can fix this at any point later on, with likewise minimal effect to gameplay. |
Adding another sample, just in case (same settings as #25973) Beatmap
As high-level osu!mania gameplay is inherently reliant on tight timing windows, I think this is probably bound to happen more often than not for osu!mania replays. I don't think it's a blocker for ranked play though, as long as the gameplay score (not the replay one) makes sense. |
This does affect both gameplay and replays FWIW |
Bumping this to p0 until i or someone else investigates. |
As noted, currently lazer hit windows are too lenient for standard and taiko, and too strict for mania, compared to stable hit windows. For OD 10 in standard, the OD hit window formula gives 20 ms for the 300 hit window. In reality, true hit error in stable, as measured by the game's clock, can be in the interval (-19.5 ms, 19.5 ms) to get a 300, as described by the stable comparison formula. In lazer, a new, more intuitive formula is used, which results in the interval of [-20 ms, 20 ms], a 0.5 ms difference, a non-trivial reduction in difficulty. The effect is even more pronounced for fractional ODs, as stable floors the hit window, while lazer doesn't, resulting in up to 1.3-1.5 ms of difference. The disparity disadvantages existing stable scores in standard and taiko, and advantages existing stable scores in mania. This means that if an obsoleting of stable scores isn't desired, then these stable's hit windows must be matched in lazer, and thus the intuitive formula sadly cannot be followed as it is. This also means that 20 ms that is obtained from the OD hit window formula is no longer the "true" hit window - it's actually 19.5 ms. If the player asks for the actual hit window (eg. by hovering over the OD value in song select), it would therefore be best to return 19.5 ms, and not 20 ms. As a side note, it is the "true" hit window value that gets divided by speed rate, not the result of the OD hit window formula (except for mania, of course). There is a disadvantage to rounding player inputs to integers - it introduces a small amount of additional variance not due to the player's input variance. This play, set on stable, has the UR of 28.84. Had it been played on lazer, it would be expected to have about 28.55 UR. This effect is less pronounced the higher the UR. Thus, it is preferable to use This also means that downgrading An idea to resolve the issue would be to find a formula to use as the judgement formula which is equivalent to stable's, and works the same regardless of whether input is rounded (or transformed in another way, as we can modify ProofFor x not a half-integer, and N integer, Proofround() matches Math.round() everywhere except possibly half-integers. thus here they're equivalent.useful Math.Round() properties for x not half-integer:
Hence, for x not half-integer, the stable judgement formula for standard and taiko is equivalent to What about the hit_error half-integer case? Stable uses Math.Round(), which uses a to-even tiebreaker method. Of course, the half-integers that aren't close to the "true" hit window edge work as expected for both stable's and the proposed formula. However, in stable, half-integers at the edge can be either judged as in or out, depending on the OD. The proposed formula simply considers them all out, regardless of OD. It's not impossible to support to-even half-integer rounding in the judgement check method, but it's pointlessly complex just to not make hitting accurately harder by less than a nanosecond, and the new formula is more consistent. I would say half-integer hit errors not resulting in the same hit result in gameplay as in stable is a non-issue. It's only meaningful in the context of legacy score encoding (and even then, it's a very rare issue). Not sure if I'm understanding legacy score encoding correctly, but from looking at the linked code it seems to me that it's currently not working properly. For example, on osu!mania with OD 10, a hit that is 34.4 ms late would be a Good on live lazer, but rounded, it would be a hit error of 34 ms, which, when replayed, would be a Great. This would be fixed with the new proposed formula (by rounding, and with away-from-zero (in terms of hit error) tiebreaker to handle the half-integer case). ProofNew judgement formula (for standard and taiko): LHS: abs(hit error) < floor(hit window) - 0.5Thesis: new lazer replay formula that gives the same results as the above formula is:
The mania and taiko miss hit window proof is the same, with an extra +1 on the RHS. Unfortunately, from what I can see, After this issue is resolved, previously set lazer scores might have higher accuracy than they should, if there were hits at the edge (for example, hits in the [19.5 ms, 20 ms] range on OD 10 in standard). Unfortunately, it's either them or all the historical scores from stable that will be inaccurate, so the choice seems clear. This is why I would say this is actually an important issue to get right. Correcting ruleset rules after scores had been set on the leaderboards and pp awarded could be difficult. I've submitted a PR that aims to resolve the issue using this approach here. |
Out of curiosity and concern, how are you checking against stable when the code is not published? Usually we'd have people ask for the relevant extract of code before making statements like this. |
I claimed that based on Walavouchey's investigation with stable reference code, the fact that |
Your comment made me think more about the rounding, and its relation to legacy score encoding. The part of the formula in Walavouchey's table responsible for the hit error reads simply:
Reading Walavouchey's table, I assumed (I) was the used formula, and my remarks about legacy score encoding in the previous comment were based on that assumption. But in fact, the used formula could also be (II), and if that's the case, then the legacy score encoding can be rounded in the correct direction with just the replay frame value, as from that formula it can be seen that rounding the time value is correct to downgrade to int in legacy score encoding, as it basically repeats the same operation. If However, this is the correct way to downgrade only if the judgement formula is equivalent to stable's, and my '34.4 ms late hit' point still stands. For example, in mania, a 16.4 ms off (either way) hit would be Perfect in stable, and in lazer with my PR, but would be a Great in live lazer (if Perfect hit window was 16 ms), while being rounded to 16 ms during legacy score encoding, which would be a Perfect. Moreover, in lazer, the object time is a double. The two formulas are not equivalent for fractional object times. For |
I'm a bit confused how it's possible that a replayed osu!mania lazer play doesn't have the exact same result counts? #25973 was closed with a comment pointing to |
Reading through If anyone was wondering, here's how the results are counted differently in Paturages's mania replays: Maboroshi: the map is OD 8.3. live lazer's Perfect hit window is 15.77 ms, and Great hit window is 39.1 ms. In stable, they're 16.5 ms and 39.5 ms. In gameplay, hits that are off by x in [15.5, 15.77] ms are Perfect, but in the replay they get rounded to 16 ms and are thus Great. This is why the replay's Perfect count is lower. On stable, the Perfect count would be even higher than in lazer gameplay, and so would the sum of Perfect and Great counts. Neither lazer gameplay's nor replay's result counts are correct from stable's point of view. Counts in lazer with my PR are 2117/418/23/1/0/2, which looks legit. The Weekend: the map is OD 8; live lazer's Perfect hit window is 16.1 ms, stable's is 16.5 ms. The rest of the hit windows are all 0.5 ms wider in stable because of rounding. In gameplay, hits that are off by x in (16.1, 16.5] are Great, but in the replay they get rounded to 16 ms and are thus Perfect. Similarly, all the other results have 0.5 ms more lenient hit windows in replay and stable, and this is why the replay has higher accuracy than the gameplay across the board. The hit result counts in the replay are actually the correct result counts from stable's point of view, as the judgement formulas are equivalent, and legacy score encoding rounds the same way as stable. I get the same hit results counts in lazer with my PR. |
I'm deprioritising this from p0. #26452 exists but it hasn't been reviewed by anyone but me despite being open for months and judging from references to it from elsewhere we might not be interested in doing anything about it at all. Unless I'm wrong @ppy/team-client? |
Describe the bug:
For the
OsuRuleset
(did not research/test for other rulesets) I have found a bug as described in the title.Example:
Given OD=10, which means the hit-window for
Meh
(50) is supposed to be +-59.5ms:In Stable:
+59ms
=>Ok
,+60ms
=>Meh
In Lazer:
+59ms
=>Ok
,+60ms
=>Ok
which is incorrect.This is demonstrated in one score of mine:
Side by side comparison: https://www.youtube.com/watch?v=IQGsfzOEyJY
calculated correctly in Stable but not in Lazer.
I used gosumemory to calculate the hit offsets of this replay and at the
suspicious note (=6th last note of the map) I hit the note with exactly +60ms
offset. (https://gist.github.com/abstrakt-osu/46997fa9d203564ed1f1b1eee09a18ba#file-gosu-json-L894).
Screenshots or videos showing encountered issue:
Side by side comparison: https://www.youtube.com/watch?v=IQGsfzOEyJY
Beatmap: https://osu.ppy.sh/beatmapsets/1010865#osu/2117132
Score (+Replay): https://osu.ppy.sh/scores/osu/3321687146
Stable replay: https://youtu.be/FDhGuZh3P_U
Lazer replay: https://youtu.be/Ositc3vu_bg
osu!lazer version: 2020.1225.0
Logs:
The text was updated successfully, but these errors were encountered: