Improved Image Handling #719
Conversation
Here's an example of some of the clutter this is dealing with:

```python
# before
depth_image = PIL.ImageOps.flip(PIL.Image.fromarray(np.uint8(depth * 255)).convert('L')).resize(rounded_size) if depth is not None else None
init_image = None if image is None else (PIL.Image.open(image) if isinstance(image, str) else PIL.Image.fromarray(image.astype(np.uint8))).convert('RGB').resize(rounded_size)

# after
depth = image_to_np(depth, mode="L", size=rounded_size)
image = image_to_np(image, mode="RGB", size=rounded_size)
```

So much easier to read through.
@carson-katri I'd like to standardize how images are shared between the render engine nodes. I propose keeping the images flipped like Blender naturally has them and switching the color space to linear so that image operations match how they occur in other node editors. Also, do you think the number of channels should be standardized to 4, or should anything from 1 to 4 be allowed?
@NullSenseStudio Sorry for the delayed response. Other node editors use Color sockets for 4-channel images and Float sockets for 1 channel, and I don't think there are any options for 2 or 3 channels. So I'd say our nodes should always have 4 channels for images, and 1 channel for other 2D arrays (like depth operations).
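For reference, a minimal numpy sketch of what standardizing everything to 4 channels could look like; `to_rgba` is a hypothetical helper, not part of this PR:

```python
import numpy as np

def to_rgba(image: np.ndarray) -> np.ndarray:
    """Hypothetical helper: pad a 1- to 4-channel float image to RGBA."""
    if image.ndim == 2:
        # grayscale without a channel dimension
        image = image[..., np.newaxis]
    if image.shape[-1] == 2:
        # gray + alpha: replicate gray into RGB, keep the alpha channel
        gray, alpha = image[..., :1], image[..., 1:]
        image = np.concatenate([np.repeat(gray, 3, axis=-1), alpha], axis=-1)
    elif image.shape[-1] == 1:
        # grayscale: replicate into RGB
        image = np.repeat(image, 3, axis=-1)
    if image.shape[-1] == 3:
        # add an opaque alpha channel
        image = np.concatenate([image, np.ones_like(image[..., :1])], axis=-1)
    return image
```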
With the render engine now outputting images in linear color space, you won't have to change the color management display device to None for accurate viewing (an option that has since been removed in Blender 4.0). The resize and image file nodes can still be used in earlier Blender versions, but without as good resize sampling or file compatibility. Dynamic sockets are fixed for Blender 4.0: getting sockets by their string name is no longer supported when they are disabled: https://projects.blender.org/blender/blender/commit/e4ad58114b9d56fe838396a97fe09aff32c79c6a
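For illustration, a minimal sketch of a name lookup that keeps working, assuming iteration over `node.inputs` still yields disabled sockets (the socket name here is hypothetical):

```python
import bpy

def find_input(node: bpy.types.Node, name: str):
    # node.inputs["Depth Map"] no longer finds disabled sockets in
    # Blender 4.0, so scan the collection instead of subscripting by name.
    for socket in node.inputs:
        if socket.name == name:
            return socket
    return None
```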
LGTM 👍
Summary
Image handling on the backend is currently quite a mess: images have to be flipped before returning, frequently converted between PIL and numpy (despite diffusers being able to input and output ndarrays), and depth maps need to be flipped before use and arrive in float32 rather than uint8, unlike any other image. This PR aims to simplify all of this with the new `image_utils` module.

Details
Images received by the backend will be in float32 RGBA format and won't require any flipping on receipt or output (unless some library, like Blender, requires them flipped). Depth maps for use in depth to image (not the depth ControlNet) will be in float32 grayscale without a channel dimension. This keeps all images close enough to what diffusers can handle, with minimal preprocessing on Dream Textures' side: usually just removing the alpha channel, extracting alpha as an inpaint mask, or resizing to certain dimensions. For custom backends that may require PIL images there's an extra `image_utils.np_to_pil()` function that handles the conversion without all the code clutter. The diffusers backend now primarily uses `image_utils.image_to_np()` for most of its needs; it acts as an all-in-one function that accepts various image types or file paths and calls on other `image_utils` functions as determined by its kwargs.

Returned images won't have to follow as rigid a requirement. The dtype can be any floating point or integer type, as long as it uses its proper type range (`int(0) = float(0)`, `int.max = float(1)`). Channels don't matter: images can be grayscale or RGB, with or without alpha.
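As a hedged illustration of that range convention, a helper like the following (the name `to_float32` is mine, not from the PR) could normalize any accepted dtype into [0, 1] float32:

```python
import numpy as np

def to_float32(image: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: map any integer dtype onto [0, 1] float32.

    Follows the convention above: int 0 -> 0.0, the dtype's max -> 1.0.
    """
    if np.issubdtype(image.dtype, np.integer):
        max_value = np.iinfo(image.dtype).max
        return image.astype(np.float32) / max_value
    # floating point input is assumed to already be in [0, 1]
    return image.astype(np.float32)
```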
The frontend's code is simplified with `image_utils.bpy_to_np()` and `image_utils.np_to_bpy()`. Both functions flip the image and handle color space conversion. `image_utils.np_to_bpy(..., float_buffer=True)`, while currently unused, would allow saving higher color precision and support potential future HDRI models.
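A hedged usage sketch of the round trip; the import path, the positional argument to `np_to_bpy`, and the image name are assumptions, and only `float_buffer=True` comes from the description above:

```python
import bpy
from dream_textures import image_utils  # import path is an assumption

# Blender image datablock -> flipped, color-managed float ndarray
pixels = image_utils.bpy_to_np(bpy.data.images["Render Result"])

# ndarray -> Blender image datablock; float_buffer=True would preserve
# full float precision for potential HDRI workflows
result = image_utils.np_to_bpy(pixels, float_buffer=True)
```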
Drawbacks

Using numpy directly instead of converting to PIL can cause issues that are hard to trace.
I've noticed that values even slightly below 0 or above 1 have a very bad effect on image-to-image saturation. Certain resizing methods and color transforms can shift values slightly outside this range, which isn't a problem for PIL due to its limited precision. Also, not removing the alpha channel before handing the image to diffusers leads to the 4-channel array being used directly as latents instead of going through encoding first. That normally causes an out-of-memory error, though I'm sure with enough memory it would produce strange results instead.
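A hedged sketch of the kinds of guards that sidestep both problems; the helper name and the mask convention are my assumptions:

```python
import numpy as np

def prepare_for_diffusers(image: np.ndarray):
    """Clamp out-of-range values and split off alpha before diffusers sees it."""
    # Resampling and color transforms can push values slightly outside
    # [0, 1], which noticeably distorts image-to-image saturation.
    image = np.clip(image, 0.0, 1.0)
    mask = None
    if image.ndim == 3 and image.shape[-1] == 4:
        # Passing RGBA straight to diffusers makes it treat the 4 channels
        # as latents; drop alpha and reuse it as an inpaint mask instead
        # (the exact mask convention here is an assumption).
        mask = image[..., 3]
        image = image[..., :3]
    return image, mask
```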