I have the following code. The data set can be downloaded here or here. The data set contains images categorized as cat or dog.
The code trains a classifier on the cat and dog images so that, given a picture, it can tell whether it shows a cat or a dog.
It is motivated by this page. Below is the full, runnable code:
library(keras)
library(tidyverse)
# Organize dataset --------------------------------------------------------
options(warn = -1)
# The input
original_dataset_dir <- "data/kaggle_cats_dogs/original/"
# Create new organized dataset directory ----------------------------------
base_dir <- "data/kaggle_cats_dogs_small/"
dir.create(base_dir)
model_dir <- paste0(base_dir, "model/")
dir.create(model_dir)
train_dir <- file.path(base_dir, "train")
dir.create(train_dir)
validation_dir <- file.path(base_dir, "validation")
dir.create(validation_dir)
test_dir <- file.path(base_dir, "test")
dir.create(test_dir)
train_cats_dir <- file.path(train_dir, "cats")
dir.create(train_cats_dir)
train_dogs_dir <- file.path(train_dir, "dogs")
dir.create(train_dogs_dir)
validation_cats_dir <- file.path(validation_dir, "cats")
dir.create(validation_cats_dir)
validation_dogs_dir <- file.path(validation_dir, "dogs")
dir.create(validation_dogs_dir)
test_cats_dir <- file.path(test_dir, "cats")
dir.create(test_cats_dir)
test_dogs_dir <- file.path(test_dir, "dogs")
dir.create(test_dogs_dir)
# Copying files from original dataset to newly created directory
fnames <- paste0("cat.", 1:1000, ".jpg")
dum <- file.copy(
file.path(original_dataset_dir, fnames),
file.path(train_cats_dir)
)
fnames <- paste0("cat.", 1001:1500, ".jpg")
dum <- file.copy(
file.path(original_dataset_dir, fnames),
file.path(validation_cats_dir)
)
fnames <- paste0("cat.", 1501:2000, ".jpg")
dum <- file.copy(
file.path(original_dataset_dir, fnames),
file.path(test_cats_dir)
)
fnames <- paste0("dog.", 1:1000, ".jpg")
dum <- file.copy(
file.path(original_dataset_dir, fnames),
file.path(train_dogs_dir)
)
fnames <- paste0("dog.", 1001:1500, ".jpg")
dum <- file.copy(
file.path(original_dataset_dir, fnames),
file.path(validation_dogs_dir)
)
fnames <- paste0("dog.", 1501:2000, ".jpg")
dum <- file.copy(
file.path(original_dataset_dir, fnames),
file.path(test_dogs_dir)
)
options(warn = 0)
# Making model ------------------------------------------------------------
conv_base <- application_vgg16(
weights = "imagenet",
include_top = FALSE,
input_shape = c(150, 150, 3)
)
model <- keras_model_sequential() %>%
conv_base() %>%
layer_flatten() %>%
layer_dense(units = 256, activation = "relu") %>%
layer_dense(units = 1, activation = "sigmoid")
summary(model)
length(model$trainable_weights)
freeze_weights(conv_base)
length(model$trainable_weights)
# Train model -------------------------------------------------------------
train_datagen <- image_data_generator(
rescale = 1 / 255,
rotation_range = 40,
width_shift_range = 0.2,
height_shift_range = 0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = TRUE,
fill_mode = "nearest"
)
# Note that the validation data shouldn't be augmented!
test_datagen <- image_data_generator(rescale = 1 / 255)
train_generator <- flow_images_from_directory(
train_dir, # Target directory
train_datagen, # Data generator
target_size = c(150, 150), # Resizes all images to 150 × 150
shuffle = FALSE,
batch_size = 20,
class_mode = "binary" # binary_crossentropy loss for binary labels
)
test_generator <- flow_images_from_directory(
test_dir, # Target directory
train_datagen, # Data generator
target_size = c(150, 150), # Resizes all images to 150 × 150
shuffle = FALSE,
batch_size = 20,
class_mode = "binary" # binary_crossentropy loss for binary labels
)
validation_generator <- flow_images_from_directory(
validation_dir,
test_datagen,
target_size = c(150, 150),
shuffle = FALSE,
batch_size = 20,
class_mode = "binary"
)
# Fine tuning -------------------------------------------------------------
unfreeze_weights(conv_base, from = "block3_conv1")
# Compile model -----------------------------------------------------------
model %>% compile(
loss = "binary_crossentropy",
optimizer = optimizer_rmsprop(lr = 2e-5),
metrics = c("accuracy")
)
# Evaluate by epochs ---------------------------------------------------------------
# This plots accuracy across epochs (slow)
history <- model %>% fit_generator(
train_generator,
steps_per_epoch = 100,
epochs = 50, # was 50
validation_data = validation_generator,
validation_steps = 50
)
There appear to be problems preserving the order of the inputs when using predict_generator() (I think this is an issue in the core Keras library). See #149. In the meantime, I would just use standard predict() or predict_on_batch().
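
As a rough illustration of the predict_on_batch() suggestion, the sketch below pulls batches from a generator one at a time and keeps each batch's labels next to its predictions, so the pairing does not depend on the order in which predict_generator() returns results. The helper name collect_predictions and its next_batch / predict_batch arguments are hypothetical; the defaults assume the keras R functions generator_next() and predict_on_batch(), and that each batch is a list of images followed by labels.

```r
# Hypothetical helper: predict batch by batch so that every prediction
# stays paired with the label that arrived in the same batch.
collect_predictions <- function(model, generator, n_batches,
                                next_batch = keras::generator_next,
                                predict_batch = keras::predict_on_batch) {
  probs <- numeric(0)
  labels <- numeric(0)
  for (i in seq_len(n_batches)) {
    batch <- next_batch(generator)  # assumed shape: list(images, labels)
    probs <- c(probs, as.numeric(predict_batch(model, batch[[1]])))
    labels <- c(labels, as.numeric(batch[[2]]))
  }
  data.frame(prob = probs, label = labels)
}

# Possible usage with the objects from the question (not run here):
# res <- collect_predictions(model, test_generator, n_batches = 50)
# mean((res$prob > 0.5) == res$label)
```

The n_batches = 50 value in the usage comment assumes 1000 test images at a batch size of 20; adjust it to the actual number of files found by the generator.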
Evaluation gives the following great result:
But then I tried to check the prediction accuracy "manually" in the following way:
Proportion of predicted labels:
Number of predictions correctly classified as dog or cat:
This says that out of 803 predictions, only 432 are correctly predicted as dogs (around 54% accuracy). Why is that? Where did I go wrong?
Note that evaluate_generator() gives around 92% accuracy. What's the correct interpretation? How can I resolve the difference?
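
For reference, a manual accuracy check of this kind could be sketched as follows, assuming a vector of predicted probabilities and a vector of true labels that are in the same order. The function name manual_accuracy and the toy vectors are made up for illustration and are not the code used in the question. With flow_images_from_directory(), class indices are assigned alphabetically by subdirectory name, so here cats would be 0 and dogs 1.

```r
# Hypothetical helper: threshold sigmoid outputs and compare with labels.
manual_accuracy <- function(prob, label, threshold = 0.5) {
  pred <- as.integer(prob > threshold)  # 1 = dogs, 0 = cats (assumed mapping)
  list(
    accuracy = mean(pred == label),
    confusion = table(predicted = pred, actual = label)
  )
}

# Toy data, for illustration only
prob <- c(0.1, 0.8, 0.6, 0.3)
label <- c(0, 1, 0, 1)
res <- manual_accuracy(prob, label)
res$accuracy
res$confusion
```

A check like this only agrees with evaluate_generator() if prob and label really refer to the same images in the same order, which is the assumption that appears to break when the predictions come from predict_generator().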