Efficientdet #235

Merged 6 commits on Nov 16, 2022
100 changes: 58 additions & 42 deletions examples/efficientdet/anchors.py
@@ -25,25 +25,27 @@ def compute_feature_sizes(image_size, max_level):

def generate_configurations(feature_sizes, min_level, max_level,
num_scales, aspect_ratios, anchor_scale):
"""Generates configurations, i.e. the different combinations
of strides, octave scales, aspect ratios and anchor scales, for
each level of the EfficientNet layers that feed the BiFPN network.

# Arguments:
feature_sizes: Numpy array representing the feature sizes of
inputs to EfficientNet layers.
min_level: Int, Number representing the index of the earliest
EfficientNet layer that feeds the BiFPN layers.
max_level: Int, Number representing the index of the last
EfficientNet layer that feeds the BiFPN layers.
num_scales: Int, specifying the number of scales in the
anchor boxes.
aspect_ratios: List, specifying the aspect ratios of the
anchor boxes.
anchor_scale: Numpy array representing the scales of the
anchor box.

# Returns:
Tuple: Containing configurations of strides, octave scales,
aspect ratios and anchor scales.
"""
num_levels = max_level + 1 - min_level
scale_aspect_ratio_combinations = (
@@ -61,13 +63,16 @@ def generate_configurations(feature_sizes, min_level, max_level,
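As a concrete check of the arithmetic in `generate_configurations`, here is a worked example with values assumed for illustration (they match common EfficientDet defaults, but are not taken from this PR):

```python
# Assumed illustrative defaults: min_level=3, max_level=7,
# num_scales=3, aspect_ratios=[1.0, 2.0, 0.5].
min_level, max_level = 3, 7
num_scales, aspect_ratios = 3, [1.0, 2.0, 0.5]

num_levels = max_level + 1 - min_level                             # 5 levels feed the BiFPN
scale_aspect_ratio_combinations = num_scales * len(aspect_ratios)  # 9 anchors per cell
```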


def build_strides(feature_sizes, features_H, features_W, num_levels):
"""Generates strides for each EfficientNet layer that feeds the
BiFPN network.

# Arguments:
feature_sizes: Numpy array representing the feature sizes of
inputs to EfficientNet layers.
features_H: Numpy array representing the height of the input
features.
features_W: Numpy array representing the width of the input
features.
num_levels: Int, representing the number of feature levels.

# Returns:
@@ -83,16 +88,16 @@ def build_strides(feature_sizes, features_H, features_W, num_levels):
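A minimal sketch of what `build_strides` computes, assuming the stride at each level is the ratio of the input image size to that level's feature-map size (hypothetical helper, not the PR's exact code):

```python
import numpy as np

def build_strides_sketch(image_size, features_H, features_W):
    # Assumption: stride = image size / feature-map size, per level.
    H, W = image_size
    strides_y = H / np.asarray(features_H, dtype=float)
    strides_x = W / np.asarray(features_W, dtype=float)
    return strides_y, strides_x
```

For example, a 512x512 input with 64x64 and 32x32 feature maps gives strides of 8 and 16.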

def build_features(feature_sizes, min_level, max_level,
scale_aspect_ratio_combinations):
"""Calculates feature height and width for each EfficientNet layer
that feeds the BiFPN network.

# Arguments:
feature_sizes: Numpy array representing the feature sizes of
inputs to EfficientNet layers.
min_level: Int, Number representing the index of the earliest
EfficientNet layer that feeds the BiFPN layers.
max_level: Int, Number representing the index of the last
EfficientNet layer that feeds the BiFPN layers.
scale_aspect_ratio_combinations: Int, representing the number of
possible combinations of scales and aspect ratios.

@@ -113,7 +118,8 @@ def build_octaves(num_scales, aspect_ratios, num_levels):
# Arguments:
num_scales: Int, specifying the number of scales in the
anchor boxes.
aspect_ratios: List, specifying the aspect ratios of the anchor
boxes.
num_levels: Int, representing the number of feature levels.

# Returns:
@@ -130,8 +136,10 @@ def build_aspects(aspect_ratios, num_scales, num_levels):
BiFPN network.

# Arguments:
aspect_ratios: List, specifying the aspect ratios of the anchor
boxes.
num_scales: Int, specifying the number of scales in the anchor
boxes.
num_levels: Int, representing the number of feature levels.

# Returns:
@@ -147,7 +155,8 @@ def build_scales(anchor_scale, scale_aspect_ratio_combinations, num_levels):
BiFPN network.

# Arguments:
anchor_scale: Numpy array representing the scales of the anchor
box.
scale_aspect_ratio_combinations: Int, representing the number of
possible combinations of scales and aspect ratios.
num_levels: Int, representing the number of feature levels.
@@ -180,8 +189,10 @@ def compute_box_coordinates(stride_y, stride_x, octave_scale, aspect,
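The `build_octaves`, `build_aspects` and `build_scales` helpers documented above tile per-anchor parameters across feature levels. A sketch of the octave tiling, under the assumption that octave scales are the fractions `i / num_scales` (hypothetical helper, for illustration only):

```python
import numpy as np

def build_octaves_sketch(num_scales, aspect_ratios, num_levels):
    # Assumed octave scales: 0/num_scales, 1/num_scales, ...
    octaves = np.arange(num_scales) / num_scales
    # Repeat once per aspect ratio, then tile across feature levels.
    octaves = np.repeat(octaves, len(aspect_ratios))
    return np.tile(octaves, (num_levels, 1))
```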
"""Calculates the coordinates of the anchor box in centre form.

# Arguments:
stride_y: Numpy array representing the stride value in y
direction.
stride_x: Numpy array representing the stride value in x
direction.
octave_scale: Numpy array representing the octave scale of the
anchor box.
aspect: Numpy array representing the aspect value.
@@ -212,19 +223,22 @@ def generate_level_boxes(strides_y, strides_x, octave_scales, aspects,
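A sketch of the centre-form box computation that `compute_box_coordinates` documents, assuming the base anchor side is `anchor_scale * stride * 2**octave_scale` and that the aspect ratio splits into `sqrt(aspect)` factors (both assumptions; the PR's exact formula is not fully shown here):

```python
import numpy as np

def compute_box_coordinates_sketch(image_size, stride_y, stride_x,
                                   octave_scale, aspect, anchor_scale):
    # Assumed base anchor size, scaled by a fractional power of two.
    H, W = image_size
    base_y = anchor_scale * stride_y * (2 ** octave_scale)
    base_x = anchor_scale * stride_x * (2 ** octave_scale)
    # Split the aspect ratio into x/y scaling factors.
    aspect_x, aspect_y = np.sqrt(aspect), 1.0 / np.sqrt(aspect)
    half_W = base_x * aspect_x / 2.0 / W    # normalized half-width
    half_H = base_y * aspect_y / 2.0 / H    # normalized half-height
    # Anchor centres sit on a stride-spaced grid, offset by half a stride.
    centre_y = np.arange(stride_y / 2.0, H, stride_y) / H
    centre_x = np.arange(stride_x / 2.0, W, stride_x) / W
    centre_x, centre_y = np.meshgrid(centre_x, centre_y)
    return centre_x.flatten(), centre_y.flatten(), half_W, half_H
```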
"""Generates anchor box in centre form for every feature level.

# Arguments:
strides_y: Numpy array representing the stride value in y
direction.
strides_x: Numpy array representing the stride value in x
direction.
octave_scales: Numpy array representing the octave scale of the
anchor box.
aspects: Numpy array representing the aspect value.
anchor_scales: Numpy array representing the scale of the anchor
box.
image_size: Tuple, representing the size of input image.
scale_aspect_ratio_combinations: Int, representing the number of
combinations of scale and aspect ratio.

# Returns:
boxes_level: List containing anchor boxes in centre form for
every feature level.
"""
boxes_level = []
for combination in range(scale_aspect_ratio_combinations):
@@ -248,11 +262,12 @@ def generate_anchors(feature_sizes, min_level, max_level, num_scales,
feature_sizes: Numpy array of shape ``(8, 2)``.
min_level: Int, Number representing the index of the earliest
EfficientNet layer that feeds the BiFPN layers.
max_level: Int, Number representing the index of the last
EfficientNet layer that feeds the BiFPN layers.
num_scales: Int, specifying the number of scales in the
anchor boxes.
aspect_ratios: List, specifying the aspect ratios of the anchor
boxes.
image_size: Tuple, representing the size of input image.
anchor_scales: Numpy array representing the scale of the anchor box.

@@ -283,14 +298,15 @@ def build_prior_boxes(min_level, max_level, num_scales,
# Arguments
min_level: Int, Number representing the index of the earliest
EfficientNet layer that feeds the BiFPN layers.
max_level: Int, Number representing the index of the last
EfficientNet layer that feeds the BiFPN layers.
num_scales: Int, specifying the number of scales in the
anchor boxes.
aspect_ratios: List, specifying the aspect ratios of the anchor
boxes.
anchor_scale: Float representing the scale of the base anchor
size relative to the feature stride 2^level, or a list with
one value per layer.
image_size: Tuple, representing the size of input image.

# Returns
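As a consistency check on `generate_anchors`, the total number of prior boxes is the sum over levels of `H_l * W_l * num_scales * len(aspect_ratios)`. A worked example for an assumed 512x512 input with levels 3 to 7 and 9 anchors per cell (values chosen for illustration, not taken from this PR):

```python
# Assumed: 512 input, levels 3..7 give feature maps of 64, 32, 16, 8, 4.
feature_sides = [512 // (2 ** level) for level in range(3, 8)]
anchors_per_cell = 3 * 3    # num_scales * len(aspect_ratios)
total_boxes = sum(side * side * anchors_per_cell for side in feature_sides)
```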
14 changes: 8 additions & 6 deletions examples/efficientdet/detection.py
@@ -13,8 +13,8 @@

class PreprocessImage(SequentialProcessor):
"""Preprocess RGB image by resizing it to the given ``shape``. If a
``mean`` is given it is subtracted from the image; otherwise the
image gets normalized.

# Arguments
shape: List of two Ints.
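A rough numpy-only sketch of the preprocessing described above, where a nearest-neighbour resize stands in for the real resizer (helper name and resize strategy are assumptions for illustration):

```python
import numpy as np

def preprocess_image_sketch(image, shape, mean=None):
    # Nearest-neighbour resize as a stand-in for the real resizer.
    H, W = shape
    rows = np.arange(H) * image.shape[0] // H
    cols = np.arange(W) * image.shape[1] // W
    resized = image[rows][:, cols].astype('float32')
    # Subtract the per-channel mean if given, otherwise normalize to [0, 1].
    if mean is not None:
        return resized - np.asarray(mean, dtype='float32')
    return resized / 255.0
```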
@@ -36,8 +36,8 @@ class AugmentDetection(SequentialProcessor):
prior_boxes: Numpy array of shape ``[num_boxes, 4]`` containing
prior/default bounding boxes.
split: Flag from ``paz.processors.TRAIN``, ``paz.processors.VAL``
or ``paz.processors.TEST``. Certain transformations would
take place depending on the flag.
num_classes: Int.
size: Int. Image size.
mean: List of three elements indicating the per channel mean.
@@ -81,7 +81,8 @@ class DetectSingleShot_EfficientDet(Processor):
score_thresh: Float between [0, 1].
nms_thresh: Float between [0, 1].
mean: List of three elements indicating the per channel mean.
draw: Boolean. If ``True`` predictions are drawn on the returned
image.
"""
def __init__(self, model, class_names, score_thresh, nms_thresh):
self.model = model
@@ -117,7 +118,8 @@ class DetectSingleShot(DetectSingleShot):
nms_thresh: Float between [0, 1].
mean: List of three elements indicating the per channel mean.
variances: List containing the variances of the encoded boxes.
draw: Boolean. If ``True`` predictions are drawn on the returned
image.
"""
def __init__(
self, model, class_names, score_thresh, nms_thresh,
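The `score_thresh` and `nms_thresh` arguments above gate the detector's postprocessing. A minimal sketch of the score-threshold step (hypothetical helper; non-maximum suppression is omitted):

```python
import numpy as np

def filter_by_score(boxes, scores, score_thresh):
    # Keep only detections whose confidence exceeds the threshold.
    keep = scores > score_thresh
    return boxes[keep], scores[keep]
```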
31 changes: 20 additions & 11 deletions examples/efficientdet/draw.py
@@ -10,10 +10,12 @@ def put_text(image, text, point, scale, color, thickness):
# Arguments
image: Numpy array.
text: String. Text to be drawn.
point: Tuple of coordinates indicating the top corner of the
text.
scale: Float. Scale of text.
color: Tuple of integers. RGB color coordinates.
thickness: Integer. Thickness of the lines used for drawing
text.

# Returns
Numpy array with shape ``[H, W, 3]``. Image with text.
@@ -29,11 +31,13 @@ def get_text_size(text, scale, FONT_THICKNESS, FONT=FONT):
# Arguments
text: String. Text whose width and height is to be calculated.
scale: Float. Scale of text.
FONT_THICKNESS: Integer. Thickness of the lines used for drawing
text.
FONT: Integer. Style of the text font.

# Returns
Tuple with shape ``((text_W, text_H), baseline)``. The width and
height of the text.
"""
text_size = cv2.getTextSize(text, FONT, scale, FONT_THICKNESS)
return text_size
@@ -44,8 +48,10 @@ def add_box_border(image, corner_A, corner_B, color, thickness):

# Arguments
image: Numpy array of shape ``[H, W, 3]``.
corner_A: List of length two indicating ``(y, x)`` openCV
coordinates.
corner_B: List of length two indicating ``(y, x)`` openCV
coordinates.
color: List of length three indicating RGB color of point.
thickness: Integer/openCV flag. Thickness of the rectangle line,
or ``cv2.FILLED`` for a filled rectangle.
@@ -63,8 +69,10 @@ def draw_opaque_box(image, corner_A, corner_B, color, thickness=-1):

# Arguments
image: Numpy array of shape ``[H, W, 3]``.
corner_A: List of length two indicating ``(y, x)`` openCV
coordinates.
corner_B: List of length two indicating ``(y, x)`` openCV
coordinates.
color: List of length three indicating RGB color of point.
thickness: Integer/openCV flag. Thickness of the rectangle line,
or ``cv2.FILLED`` for a filled rectangle.
@@ -77,8 +85,9 @@ def draw_opaque_box(image, corner_A, corner_B, color, thickness=-1):
return image
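A numpy-only sketch of what `draw_opaque_box` does in the filled case, taking the docstring's statement that corners are ``(y, x)`` pairs at face value (hypothetical helper; the PR's implementation is not fully shown here):

```python
import numpy as np

def draw_opaque_box_sketch(image, corner_A, corner_B, color):
    # Fill the rectangle spanned by the two (y, x) corners with a solid color.
    y0, x0 = corner_A
    y1, x1 = corner_B
    image[y0:y1, x0:x1] = color
    return image
```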


def make_box_transparent(raw_image, image, alpha=0.25):
"""Blends the raw image with the bounding box image to add
transparency.

# Arguments
raw_image: Numpy array of shape ``[H, W, 3]``.
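The alpha blend in `make_box_transparent` can be sketched as a weighted sum of the two images (a stand-in for cv2.addWeighted-style blending; the helper name is an assumption):

```python
import numpy as np

def make_box_transparent_sketch(raw_image, image, alpha=0.25):
    # Weighted blend: alpha of the box overlay plus (1 - alpha) of the raw image.
    blended = (alpha * image.astype('float32')
               + (1.0 - alpha) * raw_image.astype('float32'))
    return blended.astype('uint8')
```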