Add function to apply mask to RawImage. #1020

Open: 4 commits to be merged into base branch main

BritishWerewolf (Contributor)

I'm working on an app to remove the background from an image.
This PR adds the ability to very easily apply the result of the model as a mask to a RawImage.
The image alpha channel will be replaced with the values of the mask.
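
As a rough illustration of what "replacing the alpha channel with the mask" means at the pixel level, here is a minimal sketch on plain typed arrays. The helper name `putAlpha` and the flat RGBA layout are assumptions made for this example; this is not the code in the PR.

```javascript
// Hypothetical helper (not the PR's implementation): overwrite the alpha
// channel of a flat RGBA buffer with the values of a single-channel mask.
function putAlpha(rgba, mask) {
    if (rgba.length !== mask.length * 4) {
        throw new Error('Mask must have exactly one value per RGBA pixel');
    }
    const out = new Uint8ClampedArray(rgba); // copy, leave the input untouched
    for (let i = 0; i < mask.length; ++i) {
        out[4 * i + 3] = mask[i]; // index 3 of each pixel is the alpha value
    }
    return out;
}

// Two pixels: opaque red and opaque green.
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 255, 0, 255]);
const alphaMask = new Uint8ClampedArray([0, 128]);
console.log(Array.from(putAlpha(pixels, alphaMask)));
// [255, 0, 0, 0, 0, 255, 0, 128]
```

The colour values are left as they are; only the fourth value of each pixel changes.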

Here is an example of the sort of thing I am doing.

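import { AutoModel, AutoProcessor, RawImage } from '@huggingface/transformers';

const image = await RawImage.fromURL('https://picsum.photos/600/400');
// `pad` returns a promise, so await it before passing the result on.
const padded = await image.clone().pad([0, 0, 100, 100]);
const model = await AutoModel.from_pretrained(modelName);
// Assumption: `processor` is loaded from the same checkpoint as the model.
const processor = await AutoProcessor.from_pretrained(modelName);
const processed = await processor(padded);
const output = await model({ 'input': processed.pixel_values });

// Resize the mask to match the image since many models have pretrained sizes.
// `output` is a square so resize and centre crop to remove padding.
const imageSize = Math.max(image.width, image.height);
const mask = await RawImage.fromTensor(output)
    .resize(imageSize, imageSize)
    // Crop to the original image's dimensions, not the resized mask's own.
    .then(resized => resized.center_crop(image.width, image.height));

// Finally, apply the mask and save it.
await image.applyMask(mask)
    .then(masked => masked.save('masked_image.png'));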
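
The "resize, then centre crop to remove padding" step can be sketched on a plain single-channel buffer as follows. The function `centerCrop` and its rounding are assumptions for illustration only; the library's own `center_crop` may round offsets differently.

```javascript
// Hypothetical sketch of a centre crop on a flat, single-channel mask.
// `data` holds `width * height` values in row-major order.
function centerCrop(data, width, height, cropWidth, cropHeight) {
    if (cropWidth > width || cropHeight > height) {
        throw new Error('Crop size must not exceed the source size');
    }
    const left = Math.floor((width - cropWidth) / 2);
    const top = Math.floor((height - cropHeight) / 2);
    const out = new Uint8ClampedArray(cropWidth * cropHeight);
    for (let y = 0; y < cropHeight; ++y) {
        const start = (top + y) * width + left;
        // Copy one row of the crop window into the output.
        out.set(data.subarray(start, start + cropWidth), y * cropWidth);
    }
    return out;
}

// A 4x4 mask whose rows are filled with their own row index.
const square = Uint8ClampedArray.from({ length: 16 }, (_, i) => Math.floor(i / 4));
// Cropping to 4x2 drops the top and bottom rows (the "padding").
console.log(Array.from(centerCrop(square, 4, 4, 4, 2)));
// [1, 1, 1, 1, 2, 2, 2, 2]
```

In the example above the image was padded vertically to make it square, so cropping the square mask back to the original width and height removes exactly the padded rows.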

src/utils/image.js (outdated review thread, resolved)
xenova (Collaborator) commented Nov 16, 2024

Useful PR! 🔥 Could you reference the similar function / usage / inspiration? e.g., how PIL does mask application?

BritishWerewolf (Contributor Author) commented Nov 17, 2024

> Useful PR! 🔥 Could you reference the similar function / usage / inspiration? e.g., how PIL does mask application?

@xenova, I am not completely familiar with the Python Imaging Library, but it looks like a similar result is achieved with the following snippet, assuming I understand the docs correctly 😅

from PIL import Image

background = Image.open('background.png')
foreground = Image.open('foreground.png')
mask = Image.open('mask.png').convert('L')

result = Image.composite(background, foreground, mask)
result.save('masked_image.png')

The mask is converted to greyscale (docs here).

The foreground image would map to what RawImage would represent, and the background image is used to fill in the gaps that are created in accordance with mask (docs here).
My implementation uses a single image and makes the masked-out pixels transparent rather than filling them with another image.
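
To make the contrast concrete, here is a rough JavaScript sketch of compositing two images through a mask. In Pillow, `Image.composite(image1, image2, mask)` takes pixels from `image1` where the mask is 255 and from `image2` where it is 0. The function below is a hypothetical approximation on flat arrays, and its rounding of intermediate mask values is not guaranteed to match Pillow's.

```javascript
// Hypothetical sketch of mask-based compositing on flat pixel buffers.
// mask[p] === 255 selects image1, mask[p] === 0 selects image2, and
// values in between blend the two linearly.
function composite(image1, image2, mask, channels) {
    if (image1.length !== image2.length || image1.length !== mask.length * channels) {
        throw new Error('Images and mask must describe the same number of pixels');
    }
    const out = new Uint8ClampedArray(image1.length);
    for (let p = 0; p < mask.length; ++p) {
        const weight = mask[p] / 255;
        for (let c = 0; c < channels; ++c) {
            const i = p * channels + c;
            out[i] = image1[i] * weight + image2[i] * (1 - weight);
        }
    }
    return out;
}

// Two RGB pixels per image; the mask keeps image1 for the first pixel only.
const first = new Uint8ClampedArray([200, 200, 200, 200, 200, 200]);
const second = new Uint8ClampedArray([50, 50, 50, 50, 50, 50]);
console.log(Array.from(composite(first, second, new Uint8ClampedArray([255, 0]), 3)));
// [200, 200, 200, 50, 50, 50]
```

Compositing needs a second image to fill the gaps, whereas replacing the alpha channel needs only the mask.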

This was a great resource for learning how that library and function works:
https://note.nkmk.me/en/python-pillow-composite/

xenova (Collaborator) commented Nov 19, 2024

Thanks! I followed that resource you provided and found https://note.nkmk.me/en/python-pillow-putalpha/ - perhaps it's more applicable? If so, I say we rename applyMask to putalpha :) WDYT?

Comment on lines +374 to +376
if (this.channels !== 4) {
    throw new Error('Image must have 4 channels');
}
Collaborator

Would be useful to support both 3- and 4-channel images as the input image.

Contributor Author

Would you consider something like this?
It automatically converts to 4 channels if 3 are provided; anything else throws an error.

Suggested change
- if (this.channels !== 4) {
-     throw new Error('Image must have 4 channels');
- }
+ if (this.channels === 3) {
+     this.convert(4);
+ } else if (![3, 4].includes(this.channels)) {
+     throw new Error('Image must have 3 or 4 channels');
+ }

Collaborator

To prevent unnecessary conversions, I think we can set the step size accordingly (3 for RGB, 4 for RGBA) and write the image data accordingly.
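
One possible reading of the step-size suggestion, sketched on plain typed arrays: read the source with a stride equal to its channel count and write an RGBA result directly, so an RGB image never needs a separate conversion pass. The name `putAlphaWithStep` and the flat buffer layout are assumptions for this example, not the code that ended up in the PR.

```javascript
// Hypothetical sketch: accept 3- or 4-channel input, read it with a step
// equal to its channel count, and always write 4-channel (RGBA) output.
function putAlphaWithStep(data, channels, mask) {
    if (channels !== 3 && channels !== 4) {
        throw new Error('Image must have 3 or 4 channels');
    }
    const numPixels = data.length / channels;
    if (mask.length !== numPixels) {
        throw new Error('Mask must have exactly one value per pixel');
    }
    const out = new Uint8ClampedArray(numPixels * 4);
    for (let p = 0, src = 0, dst = 0; p < numPixels; ++p, src += channels, dst += 4) {
        out[dst] = data[src];         // R
        out[dst + 1] = data[src + 1]; // G
        out[dst + 2] = data[src + 2]; // B
        out[dst + 3] = mask[p];       // A comes from the mask in both cases
    }
    return out;
}

// RGB input: the result gains an alpha channel taken from the mask.
const rgb = new Uint8ClampedArray([10, 20, 30, 40, 50, 60]);
console.log(Array.from(putAlphaWithStep(rgb, 3, new Uint8ClampedArray([255, 0]))));
// [10, 20, 30, 255, 40, 50, 60, 0]
```

For RGBA input the existing alpha values are simply skipped and replaced by the mask.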

Contributor Author

Hey, sorry I’ll get to this soon.
I’m currently poorly and bedridden at the minute, but will work on a fix ASAP.

Collaborator

No rush! Feel better soon! 🤗

src/utils/image.js (outdated review thread, resolved)
BritishWerewolf (Contributor Author)

> Thanks! I followed that resource you provided and found https://note.nkmk.me/en/python-pillow-putalpha/ - perhaps it's more applicable? If so, I say we rename applyMask to putalpha :) WDYT?

Yes - that's a much more apt name!

xenova changed the base branch from v3 to main on November 25, 2024, 23:07