New update is much faster (everything fits easier into VRAM), but causes artifacts and incorrect depth when going too high #141

gituser123456789000 · 2024-06-20T14:24:09Z

gituser123456789000
Jun 20, 2024

I was lucky to find my card's maximum just before the update. Mine maxes out at around 3.7 million total pixels for the depth resolution (the exact height and width depend on aspect ratio, but the total pixel count is about 3.7 million).

This maximum ran at a snail's pace before the update, but now it all stays in VRAM and runs fast: about 1 second instead of 30 seconds to over a minute previously.

So my max was something like 1904x1904 for a square image using AnyL v2. Now if I use anything over that, I still get a result, but it has problems: artifacts are introduced and the depth becomes incorrect.

We need some way to know what our limits were prior to this recent update. I had luckily just tested mine, but others have no idea. They will push to very high resolutions because it's easy now, and get worse results than they would at a lower resolution.

Maybe what I found is universal for all cards and that 3.6-3.7 million limit is a limitation of these new models. If that's the case, it needs to be verified, and then we (aka nagadomi) can try to code in a pixel limit for the Any v2 models. Otherwise the limit may just come from each person's hardware.

gituser123456789000 · 2024-06-20T14:26:35Z

gituser123456789000
Jun 20, 2024
Author

Here's an example showing the map produced at my previous limit (right at the edge of getting a CUDA out of memory error), which was 1904 resolution in my case for a square image, compared to 3840 with this new update. We can go higher in numbers, but the quality goes downhill.

0 replies

gituser123456789000 · 2024-06-20T14:33:48Z

gituser123456789000
Jun 20, 2024
Author

You might look at the face and think it's more detailed, but pay attention to all the details: her right earring is now darker (pushed into the background instead of being closer to the viewer as it should be), there are background holes in her arms, parts of her shoes are turning into background, the grating or whatever it is at her knee level looks much worse, there's an artifact where the light is shining to the right of her, the ground is no longer a smooth gradient, etc.

I know my setup's pixel limit (or maybe it's the limit of these models). We need to figure out whether this is a limitation of the models and, if it is, code in a total pixel limit.

It sounds like it may be universal. nagadomi, you mentioned things appearing broken over 2048 or something along those lines, and it was likely breaking slightly before that too.

0 replies

gituser123456789000 · 2024-06-20T14:45:58Z

gituser123456789000
Jun 20, 2024
Author

it's a total pixel limit.. something between 3,631,096 and 3,701,852 pixels

for example 1904x1904 is 3,625,216 pixels, which is under this limit, so it produces an amazing result, as detailed and artifact free as possible

other aspect ratios max out at a lower resolution number.. like the 16:9 limit is likely 1428 resolution, which would produce a map of around 1428x2539 = 3,625,692 pixels

another one of my tests in a different aspect ratio maxed out at 1652x2198... 3,631,096 total pixels
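
The totals above can be reproduced with a few lines of Python. This is only a sketch: the two bounds are the figures quoted above, the labels are made up for illustration, and `classify` is a hypothetical helper, not something that exists in the tool.

```python
# Bounds taken from the totals quoted above; they may only hold for one setup.
LOWER_BOUND = 3_631_096  # largest tested total that still looked clean (1652x2198)
UPPER_BOUND = 3_701_852  # the limit appears to sit at or below this value


def total_pixels(width: int, height: int) -> int:
    """Total pixel count of a depth map."""
    return width * height


def classify(width: int, height: int) -> str:
    """Place a resolution relative to the two observed bounds (illustrative labels)."""
    n = total_pixels(width, height)
    if n <= LOWER_BOUND:
        return "within tested range"
    if n <= UPPER_BOUND:
        return "uncertain"
    return "above reported limit"


for w, h in [(1904, 1904), (2539, 1428), (2198, 1652), (3840, 3840)]:
    print(f"{w}x{h}: {total_pixels(w, h):,} px -> {classify(w, h)}")
```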

0 replies

gituser123456789000 · 2024-06-20T15:01:09Z

gituser123456789000
Jun 20, 2024
Author

@nagadomi ... here's how I calculate the limit (at least this is how it works for my setup, but I'm thinking it may be universal / a limit of these Any v2 models)...

"original image res is 1920x804...
so 804/1920 = 0.41875....
0.41875 * the pixel limit of around 3,701,852 = 1,550,150
take the square root of that... which is 1245
now find the highest multiple of 14 that fits... 1245 / 14 = 88.9285etc
so max res for this image is 88x14 or maybe 89x14...
1232 or maybe 1246"

final result 2940x1232 = 3,622,080 pixels

that's a calculation I did for the attached example
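
The same steps can be written as a small function. This is a sketch under assumptions: `max_depth_resolution` is a hypothetical name, the default `pixel_limit` is the approximate figure used above and may only apply to one setup, the short side is rounded down to a multiple of 14, and the long side is rounded to the nearest multiple of 14 (which is what reproduces the 2940 in the result above).

```python
import math

PATCH = 14  # resolutions in this thread are rounded to multiples of 14


def max_depth_resolution(width, height, pixel_limit=3_701_852, patch=PATCH):
    """Largest (width, height) whose pixel count stays under pixel_limit.

    With ratio = short / long and short * long = limit, the short side is
    sqrt(ratio * limit); it is then floored to a multiple of `patch`.
    """
    short, long_ = sorted((width, height))
    ratio = short / long_                       # 804 / 1920 = 0.41875
    short_max = math.sqrt(ratio * pixel_limit)  # about 1245 for the example
    short_res = int(short_max // patch) * patch
    # Assumption: the long side is rounded to the nearest multiple of patch.
    long_res = round(short_res / ratio / patch) * patch
    # Return the sides in the same orientation as the input image.
    return (long_res, short_res) if width >= height else (short_res, long_res)


w, h = max_depth_resolution(1920, 804)
print(f"{w}x{h} = {w * h:,} pixels")
```

For a 1920x804 source this gives 2940x1232, matching the hand calculation. Other aspect ratios may land slightly differently from the hand-tested values, since the exact limit is only known to lie within a range.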

0 replies

nagadomi · 2024-06-20T15:26:04Z

nagadomi
Jun 20, 2024
Maintainer

It is a known problem that quality goes down when the input resolution is too high.
This problem is related to the model's capacity and the image resolution used in training.
DepthAnything is trained at 518 resolution, so my guess is that 518 is the safest resolution.

BoostingMonocularDepth discusses this problem.
https://github.com/compphoto/BoostingMonocularDepth?tab=readme-ov-file#observations
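
For a sense of scale, the gap between the training resolution and the resolutions tested earlier in the thread can be worked out directly. This is a rough sketch: it assumes square inputs and assumes the multiple of 14 mentioned earlier corresponds to the model's patch size; `patches_per_side` and `pixel_ratio` are hypothetical helper names.

```python
PATCH = 14       # assumption: the multiple-of-14 rule reflects the patch size
TRAIN_RES = 518  # training resolution mentioned above


def patches_per_side(res: int, patch: int = PATCH) -> int:
    """Number of patches along one side of a square input."""
    if res % patch != 0:
        raise ValueError(f"{res} is not a multiple of {patch}")
    return res // patch


def pixel_ratio(res: int, base: int = TRAIN_RES) -> float:
    """How many times more pixels a res x res input has than base x base."""
    return (res * res) / (base * base)


for res in (518, 1904):
    print(f"{res}: {patches_per_side(res)} patches per side, "
          f"{pixel_ratio(res):.1f}x the training pixel count")
```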

7 replies

nagadomi Jun 20, 2024
Maintainer

DA-2K does not seem to be what I expected.
I will try to write benchmark code for the Hypersim dataset.

gituser123456789000 Jun 20, 2024
Author

"More detailed thresholds should be found by building an evaluation benchmark environment and measuring the scores for different resolutions. Depth-Anything has released evaluation benchmark data called DA-2K, but there seems to be no code to run it. I may try it as a test."

good you'll try it, because I don't understand half of what you said and wouldn't know where to start lol

but yeah, if you get some results that show, say, that at 16:9 you should be using 1442 resolution, that would be helpful.. like how Stable Diffusion XL has recommended resolutions based around 1024x1024.. I believe 896x1152 is one, for example

nagadomi Jun 20, 2024
Maintainer

Hypersim was 1024x768 so it did not work for this purpose.
There may be no dataset available for evaluation.

For now, we have to trust our own eyeballs.

Edit:
I have a few ideas on how to do this and will try them out.

kaelsonofkrypto Jun 20, 2024

i just want to say thanks bro, each of these updates you do are getting better, i really appreciate it.

gituser123456789000 Jun 22, 2024
Author

I'm closing this discussion now. It's been tested and verified that a 518 depth map resolution should give the most accurate results, because that's what the model was trained on. Sometimes you may get lucky going higher, but in general the accuracy starts to degrade as you go up. Higher resolution depth maps may look sharper or more detailed, but the depth itself is often incorrect, and the issues become noticeable once you convert the result and compare back and forth.

loawizard · 2024-06-20T19:56:41Z

loawizard
Jun 20, 2024

my speed went up! It's awesome!

0 replies
