how to load local model for b2c.stardist function #28

Open
dfgao opened this issue Nov 25, 2024 · 13 comments

@dfgao

dfgao commented Nov 25, 2024

hi bin2cell team,

My server can't access GitHub to download python_2D_versatile_he.zip. I downloaded the file on another computer and loaded it in Python, but b2c.stardist still tried to download it from GitHub. Here's my code.

import os
model_dir = "/root/.keras/models/StarDist2D/2D_versatile_he"
files = os.listdir(model_dir)
print(files)
['thresholds.json', 'weights_best.h5', 'config.json']

from stardist.models import StarDist2D
from pathlib import Path
model_dir = Path("/root/.keras/models/StarDist2D/2D_versatile_he")
if not model_dir.exists():
    raise FileNotFoundError(f"Model path does not exist: {model_dir}")
model = StarDist2D(None, name=str(model_dir))

Loading network weights from 'weights_best.h5'.
Loading thresholds from 'thresholds.json'.
Using default values: prob_thresh=0.692478, nms_thresh=0.3.

b2c.stardist(image_path="stardist/he.jpg", labels_npz_path="stardist/he.npz", stardist_model="2D_versatile_he", prob_thresh=0.01 )

Exception: URL fetch failure on https://github.com/stardist/stardist-models/releases/download/v0.1/python_2D_versatile_he.zip: None -- [Errno 101] Network is unreachable

Do you have any better suggestions?

@ktpolanski
Contributor

ktpolanski commented Nov 25, 2024

Interesting, you're not the first person to run into this. The other one came to me via email so there's no track record of this on GitHub.

Here's a workaround you could try using, which accepts the model as a function argument. As a result, since we no longer have the name of the model, you need to pass model_axes to go with it - for H&E that's "YXC", for IF/GEX that's "YX".

import numpy as np
import scipy.sparse
import bin2cell as b2c

def stardist_custom(image_path, labels_npz_path, model, model_axes, block_size=4096, min_overlap=128, context=128, **kwargs):
    #using stardist models requires tensorflow, avoid global import
    from stardist.models import StarDist2D
    #load and percentile normalize image, following stardist demo protocol
    #turn it to np.float16 pre normalisation to keep RAM footprint minimal
    #determine whether to greyscale or not based on the length of model_axes
    #YX is length 2, YXC is length 3
    img = b2c.load_image(image_path, gray=(len(model_axes) == 2), dtype=np.float16)
    img = b2c.normalize(img)
    #we passed a custom model as model already
    #we also passed model_axes to go with it so we're good
    #so there is no StarDist2D.from_pretrained() call here
    #run predict_instances_big() to perform automated tiling of the input
    #this is less parameterised than predict_instances, needed to pass axes too
    #pass any other **kwargs to the thing, passing them on internally
    #in practice this is going to be prob_thresh
    labels, _ = model.predict_instances_big(img, axes=model_axes, 
                                            block_size=block_size, 
                                            min_overlap=min_overlap, 
                                            context=context, 
                                            **kwargs
                                           )
    #store resulting labels as sparse matrix NPZ - super efficient space wise
    labels_sparse = scipy.sparse.csr_matrix(labels)
    scipy.sparse.save_npz(labels_npz_path, labels_sparse)
    print("Found "+str(len(np.unique(labels_sparse.data)))+" objects")

Let me know how you get on with it.

@dfgao
Author

dfgao commented Nov 25, 2024

Wow, it works. Here's my code:

Add the following code to the bin2cell.py file and comment out the def stardist function:

def stardist_custom(image_path, labels_npz_path, model, model_axes, block_size=4096, min_overlap=128, context=128, **kwargs):
    #using stardist models requires tensorflow, avoid global import
    from stardist.models import StarDist2D
    #load and percentile normalize image, following stardist demo protocol
    #turn it to np.float16 pre normalisation to keep RAM footprint minimal
    #determine whether to greyscale or not based on the length of model_axes
    #YX is length 2, YXC is length 3
    img = load_image(image_path, gray=(len(model_axes) == 2), dtype=np.float16)
    img = normalize(img)
    #we passed a custom model as model already
    #we also passed model_axes to go with it so we're good
    #model = StarDist2D.from_pretrained(stardist_model)
    #run predict_instances_big() to perform automated tiling of the input
    #this is less parameterised than predict_instances, needed to pass axes too
    #pass any other **kwargs to the thing, passing them on internally
    #in practice this is going to be prob_thresh
    labels, _ = model.predict_instances_big(img, axes=model_axes, 
                                            block_size=block_size, 
                                            min_overlap=min_overlap, 
                                            context=context, 
                                            **kwargs
                                           )
    #store resulting labels as sparse matrix NPZ - super efficient space wise
    labels_sparse = scipy.sparse.csr_matrix(labels)
    scipy.sparse.save_npz(labels_npz_path, labels_sparse)
    print("Found "+str(len(np.unique(labels_sparse.data)))+" objects")

Then, run the following commands in the Jupyter terminal:

from stardist.models import StarDist2D
from pathlib import Path

model_dir = Path("/root/.keras/models/StarDist2D/2D_versatile_he") # unzip the downloaded python_2D_versatile_he.zip into this folder

if not model_dir.exists():
    raise FileNotFoundError(f"Model path does not exist: {model_dir}")

model = StarDist2D(None, name=str(model_dir))
print("Model loaded successfully!")

b2c.stardist_custom(
    image_path="stardist/he.jpg", 
    labels_npz_path="stardist/he.npz",
    model=model,  
    model_axes="YXC",
    prob_thresh=0.01
)

output:

effective: block_size=(4096, 4096, 3), min_overlap=(128, 128, 0), context=(128, 128, 0)
functional.py (225): The structure of `inputs` doesn't match the expected structure: ['input']. Received: the structure of inputs=*
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [02:28<00:00,  9.30s/it]

Thank you for your help 😀

@ktpolanski
Contributor

Awesome. I'll keep this issue open to remind myself to figure out an elegant way to support custom StarDist models, as this sorts out corner cases like yours and technically grants users extra possibilities as well.

@dfgao
Author

dfgao commented Nov 25, 2024

Thank you so much! I have another question: My area contains two non-overlapping tissues, one on the left and one on the right. Do you have any suggestions on how to separately generate cdata for each tissue?

@ktpolanski
Contributor

This shouldn't be a problem in bin2cell itself, as non-overlapping implies that the segmented objects should be quite far apart.

Do a simple scatterplot of cdata.obsm["spatial"] and see if you can define a simple threshold to split them in two. I'd think that's the most likely approach to work.
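
A minimal sketch of the thresholding idea, assuming the two tissues are separated along the x axis. It uses a toy NumPy array in place of cdata.obsm["spatial"], and the widest-gap rule for picking the cut is only one possible choice, not something bin2cell provides.

```python
import numpy as np

# Toy stand-in for cdata.obsm["spatial"]: an (n_cells, 2) array of x, y positions.
rng = np.random.default_rng(0)
left = rng.uniform([0, 0], [1000, 2000], size=(50, 2))
right = rng.uniform([3000, 0], [4000, 2000], size=(30, 2))
spatial = np.vstack([left, right])

# Sort the x coordinates and find the widest empty gap between neighbours.
x_sorted = np.sort(spatial[:, 0])
gaps = np.diff(x_sorted)
widest = int(np.argmax(gaps))

# Place the threshold in the middle of that gap.
threshold = (x_sorted[widest] + x_sorted[widest + 1]) / 2

# Boolean mask: True for cells in the left tissue.
is_left = spatial[:, 0] < threshold
print(f"threshold={threshold:.1f}, left={is_left.sum()}, right={(~is_left).sum()}")
```

With a real object, the same mask could be used to subset, e.g. cdata[is_left].copy() and cdata[~is_left].copy(). If the tissues are separated vertically instead, use column 1 of the coordinates.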

@dfgao
Author

dfgao commented Nov 25, 2024

Thank you again for your suggestion. I will try it out soon. 👍

@ktpolanski ktpolanski mentioned this issue Nov 26, 2024
@dfgao
Author

dfgao commented Dec 16, 2024

Hi ktpolanski,

I have a question and was hoping you could give me some advice. I obtained a set of irregular region coordinates and barcodes using LoupeR (I selected them by hand). Which barcodes in the single-cell matrix generated by B2C correspond to these coordinates and barcodes? How can I extract those cells from the B2C objects?

Thank you very much for any advice you may have.

@ktpolanski
Contributor

ktpolanski commented Dec 16, 2024

Assuming you worked on the 2um bins with that thing, your output should hopefully use the same nomenclature as what the 2um bins are called in the loaded/b2c-processed Visium HD object. Then it becomes a simple case of extracting the .obs and doing some simple data frame manipulation to add a new column and then do grouping on the labels.

I think Nadav did something similar at some point, maybe he can unearth pertinent code and share it.
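
A rough sketch of the data frame manipulation described above, using toy data. The column name "labels_joint", the bin barcodes and the region names are all assumptions for illustration; substitute whatever label column and selection export you actually have.

```python
import pandas as pd

# Toy stand-in for adata.obs of the 2um object: index = bin barcodes,
# "labels_joint" = cell label per bin (0 = unassigned). Column name is an assumption.
obs = pd.DataFrame(
    {"labels_joint": [1, 1, 2, 2, 2, 0]},
    index=["bin_a", "bin_b", "bin_c", "bin_d", "bin_e", "bin_f"],
)

# Hypothetical export of a manual selection: bin barcode -> region name.
selection = pd.Series(
    {
        "bin_a": "region1",
        "bin_b": "region1",
        "bin_c": "region2",
        "bin_d": "region2",
        "bin_e": "region1",
    },
    name="region",
)

# Add the annotation as a new column, aligned on the barcode index.
obs["region"] = selection.reindex(obs.index)

# For every labelled cell, take the most common region among its bins.
assigned = obs[(obs["labels_joint"] > 0) & obs["region"].notna()]
cell_region = assigned.groupby("labels_joint")["region"].agg(lambda s: s.mode().iloc[0])
print(cell_region)
```

The resulting series is indexed by cell label, so it can be mapped onto the cell-level object as long as the cell names there correspond to those labels.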

@dfgao
Author

dfgao commented Dec 17, 2024

In fact, Spaceranger does not generate 2um cloupe files. I took a rather roundabout approach: I imported the h5ad file generated by b2c into Seurat and changed the barcode name, then used loupeR to convert it to a cloupe file, which allowed me to manually select regions again. This way, I obtained the b2c regions corresponding to the HD regions, especially those that could not be distinguished by clustering or spatial distance.

Thank you again for your advice!

@nadavyayon

Ah, that's a cool solution! Thanks for sharing.

I'm working on a tutorial for migrating 8um annotations from Loupe to bin2cell objects directly, and will hopefully share it this week.

@ktpolanski
Contributor

Ah yeah, the normal Loupe is 8um bins only. It's possible to turn a 2um bin to its corresponding 8um bin based on its array coordinates (Nadav's notebook will have this), and then you do a couple mapping operations in pandas and you're good. Sorry, forgot about this detail, there's quite a bit going on over here.
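
A minimal sketch of that coordinate mapping, not taken from Nadav's notebook. It assumes each 8um bin is a 4x4 block of 2um bins sharing the same grid origin, so integer division of the 2um array coordinates by 4 gives the parent 8um bin. The data frames, barcodes and region names are toy placeholders.

```python
import pandas as pd

# Toy 2um bins with array coordinates (stand-in for adata.obs of the 2um object).
bins_2um = pd.DataFrame(
    {"array_row": [0, 3, 4, 7, 8], "array_col": [0, 3, 4, 5, 1]},
    index=["b0", "b1", "b2", "b3", "b4"],
)

# Hypothetical 8um annotations keyed by 8um array coordinates.
annot_8um = pd.DataFrame(
    {"array_row": [0, 1], "array_col": [0, 1], "region": ["tumour", "stroma"]}
)

# Assumption: integer division by 4 maps a 2um bin to its parent 8um bin.
bins_2um["row_8um"] = bins_2um["array_row"] // 4
bins_2um["col_8um"] = bins_2um["array_col"] // 4

# Left-merge so 2um bins without an annotated parent keep a missing region.
merged = (
    bins_2um.reset_index()
    .merge(
        annot_8um.rename(columns={"array_row": "row_8um", "array_col": "col_8um"}),
        on=["row_8um", "col_8um"],
        how="left",
    )
    .set_index("index")
)
print(merged[["row_8um", "col_8um", "region"]])
```

It is worth checking the 4x4 assumption against a few known bins in your own data before relying on it.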

@nadavyayon

nadavyayon commented Dec 19, 2024

Hey @dfgao,

I just put together a notebook which you are free to try out in order to migrate Loupe annotations to b2c objects.
I also tried to perform it the other way around, as you described (h5ad -> Seurat -> LoupeR), but as far as I can tell LoupeR doesn't support spatial objects yet.

If you found a way around this, it would be cool to know!

@dfgao
Author

dfgao commented Dec 20, 2024

Hi nadavyayon,

Thank you very much for sharing your code.

LoupeR does not support exporting spatial objects. In fact, when you import a b2c h5ad file into Seurat, the images are not retained, only the spatial coordinates inferred by b2c, which are stored in the reductions-spatial assay. My approach is to convert it into a cloupe file, align the b2c-inferred cells based on the coordinates in the spatial assay, and then manually segment the regions overlapping with the 8um cloupe file based on marker genes or experience. This gives me the corresponding b2c cell barcodes. However, this is a rather clumsy method, and I believe the method you developed is much more advanced.
