
inconsistent input preprocessing in PyTorch demo #18

Open
function2-llx opened this issue Dec 21, 2023 · 2 comments

@function2-llx

Dear author,

Thank you for contributing this work. I'm trying to use the pre-trained network as a feature extractor. To make the best use of the pre-trained weights, I need to know exactly how input images were pre-processed during pre-training and follow the same procedure. However, I found two different pre-processing approaches in this repository.

The first one is found in the TensorFlow training code. The data is first rescaled to [0, 1]. Then, in the preprocess_input function, since the default mode is "caffe", the channels are reordered from RGB to BGR and the ImageNet mean is subtracted (doc).

train_data_generator = ImageDataGenerator(
    rescale=1./255,
    preprocessing_function=preprocess_input,
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest')

The second one (in the PyTorch demo) rescales the input to [-1, 1]. This differs from the first approach, so it produces a different input distribution and therefore different outputs from the pre-trained network.

class createDataset(Dataset):
    def __init__(self, dataframe, transform=None):
        self.dataframe = dataframe
        self.transform = transforms.Compose([transforms.ToTensor()])

    def __len__(self):
        return self.dataframe.shape[0]
        
    def __getitem__(self, index):
        image = self.dataframe.iloc[index]["img_dir"]
        image = cv2.imread(image)
        image = (image-127.5)*2 / 255
        image = cv2.resize(image,(224,224))
        #image = np.transpose(image,(2,0,1))   
        if self.transform is not None:
            image = self.transform(image)
        label = self.dataframe.iloc[index]["label"]
        return {"image": image , "label": torch.tensor(label, dtype=torch.long)}

It would be much appreciated if you could clarify this. Thanks!
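For reference, here is a minimal sketch of the value ranges the two pipelines produce, assuming the standard caffe-mode ImageNet BGR means used by Keras preprocess_input and that ImageDataGenerator applies preprocessing_function before rescale (the constants and ordering are my assumptions, not taken from this repo):

```python
import numpy as np

# Caffe-mode ImageNet channel means (BGR order), as used by Keras preprocess_input.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68])

rgb = np.random.randint(0, 256, size=(8, 8, 3)).astype(np.float64)

# Pipeline 1 (TensorFlow training code): RGB -> BGR, subtract ImageNet mean,
# then ImageDataGenerator applies rescale=1./255 on top.
tf_style = (rgb[..., ::-1] - IMAGENET_MEAN_BGR) / 255.0
# Roughly zero-centered: values fall in about [-0.49, 0.60] for inputs in [0, 255].

# Pipeline 2 (PyTorch demo): cv2 loads BGR, then rescale to exactly [-1, 1].
bgr = rgb[..., ::-1]
pt_style = (bgr - 127.5) * 2 / 255

print("TF-style range: ", tf_style.min(), tf_style.max())
print("PT-style range: ", pt_style.min(), pt_style.max())
```

The ranges alone show the two pipelines feed the network noticeably different distributions.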

@Can-Zhao

Can-Zhao commented Mar 1, 2024

I'm also confused by this. I checked the ImageDataGenerator code: https://github.com/keras-team/keras/blob/601488fd4c1468ae7872e132e0f1c9843df54182/keras/preprocessing/image.py#L1849-L1852. Internally, it first applies preprocess_input to zero-center the images, then applies rescale=1./255. So during training the images are normalized as img = (img - mean) / 255. For an original image in [0, 255] with a mean of 127.5, this gives roughly [-0.5, 0.5]. If so, it seems the demo code needs to be changed.

@jooho7lee

Are there any updates regarding this matter?
