flowvision.layers¶
Plug-and-play modules and functions specific to computer vision tasks
flowvision.layers.blocks module¶
class flowvision.layers.blocks.FeaturePyramidNetwork(in_channels_list: List[int], out_channels: int, extra_blocks: Optional[flowvision.layers.blocks.feature_pyramid_network.ExtraFPNBlock] = None)[source]¶
Module that adds an FPN on top of a set of feature maps. This is based on “Feature Pyramid Networks for Object Detection”.
The feature maps are currently supposed to be in increasing depth order.
The input to the model is expected to be an OrderedDict[Tensor], containing the feature maps on top of which the FPN will be added.
- Parameters
in_channels_list (list[int]) – number of channels for each feature map that is passed to the module
out_channels (int) – number of channels of the FPN representation
extra_blocks (ExtraFPNBlock or None) – if provided, extra operations will be performed. It is expected to take the fpn features, the original features and the names of the original features as input, and returns a new list of feature maps and their corresponding names
forward(x: Dict[str, oneflow.Tensor]) → Dict[str, oneflow.Tensor][source]¶
Computes the FPN for a set of feature maps.
- Parameters
x (OrderedDict[Tensor]) – feature maps for each feature level.
- Returns
feature maps after FPN layers, ordered from highest resolution first
- Return type
results (OrderedDict[Tensor])
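The top-down pathway can be sketched in plain Python by treating each feature map as a single scalar (the real module also applies 1x1 lateral convolutions, nearest-neighbor upsampling, and 3x3 output convolutions; `fpn_merge` is a hypothetical helper for illustration, not part of flowvision):

```python
def fpn_merge(laterals):
    """Sketch of the FPN top-down pathway on scalar 'feature maps'.

    `laterals` holds the lateral projections of the backbone features,
    ordered from highest resolution (first) to coarsest (last).
    """
    # Start from the coarsest level; no top-down signal is added to it.
    results = [laterals[-1]]
    for lat in reversed(laterals[:-1]):
        # In the real FPN the coarser result is upsampled to the lateral's
        # spatial size before the addition; with scalars this is the identity.
        results.insert(0, lat + results[0])
    # Outputs keep the input ordering: highest resolution first.
    return results

print(fpn_merge([1, 2, 3]))  # each finer level accumulates the coarser ones
```

Note how each finer level is the sum of its lateral input and everything above it in the pyramid, which is exactly the additive merge the paper describes.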
class flowvision.layers.blocks.MultiScaleRoIAlign(featmap_names: List[str], output_size: Union[int, Tuple[int], List[int]], sampling_ratio: int, *, canonical_scale: int = 224, canonical_level: int = 4)[source]¶
Multi-scale RoIAlign pooling, which is useful for detection with or without FPN.
It infers the scale of the pooling via the heuristics specified in eq. 1 of the Feature Pyramid Network paper. The keyword-only parameters canonical_scale and canonical_level correspond respectively to 224 and k0=4 in eq. 1 and have the following meaning: canonical_level is the target level of the pyramid from which to pool a region of interest with w x h = canonical_scale x canonical_scale.
- Parameters
featmap_names (List[str]) – the names of the feature maps that will be used for the pooling.
output_size (List[Tuple[int, int]] or List[int]) – output size for the pooled region
sampling_ratio (int) – sampling ratio for ROIAlign
canonical_scale (int, optional) – canonical_scale for LevelMapper
canonical_level (int, optional) – canonical_level for LevelMapper
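The level-assignment heuristic of eq. 1 is easy to write out directly. The sketch below (`assign_level` is a hypothetical helper, and the `k_min`/`k_max` clamp bounds are assumptions, not flowvision defaults) mirrors what a LevelMapper does with `canonical_scale=224` and `canonical_level=4`:

```python
import math

def assign_level(w, h, canonical_scale=224, canonical_level=4,
                 k_min=2, k_max=5):
    """Map an RoI of size w x h to a pyramid level via eq. 1 of the FPN paper:
    k = floor(k0 + log2(sqrt(w * h) / s0)), clamped to [k_min, k_max]."""
    k = math.floor(canonical_level
                   + math.log2(math.sqrt(w * h) / canonical_scale))
    return max(k_min, min(k_max, k))

# A 224x224 RoI pools from the canonical level; halving each side drops
# one level, so the region is pooled from a finer feature map.
print(assign_level(224, 224), assign_level(112, 112))  # 4 3
```

Large regions are thus routed to coarse (strided) feature maps and small regions to fine ones, keeping the pooled patch at a roughly constant scale relative to its source level.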
forward(x: Dict[str, oneflow.Tensor], boxes: List[oneflow.Tensor], image_shapes: List[Tuple[int, int]]) → oneflow.Tensor[source]¶
- Parameters
x (OrderedDict[Tensor]) – feature maps for each level. They are assumed to all have the same number of channels, but they can have different sizes.
boxes (List[Tensor[N, 4]]) – boxes to be used to perform the pooling operation, in (x1, y1, x2, y2) format and in the image reference size, not the feature map reference. The coordinates must satisfy 0 <= x1 < x2 and 0 <= y1 < y2.
image_shapes (List[Tuple[height, width]]) – the sizes of each image before they were fed to a CNN to obtain feature maps. This allows us to infer the scale factor for each one of the levels to be pooled.
- Returns
result (Tensor)
flowvision.layers.blocks.batched_nms(boxes: oneflow.Tensor, scores: oneflow.Tensor, idxs: oneflow.Tensor, iou_threshold: float) → oneflow.Tensor[source]¶
Performs non-maximum suppression in a batched fashion.
Each index value corresponds to a category, and NMS will not be applied between elements of different categories.
- Parameters
boxes (Tensor[N, 4]) – boxes where NMS will be performed. They are expected to be in (x1, y1, x2, y2) format with 0 <= x1 < x2 and 0 <= y1 < y2.
scores (Tensor[N]) – scores for each one of the boxes
idxs (Tensor[N]) – indices of the categories for each one of the boxes
iou_threshold (float) – discards all overlapping boxes with IoU > iou_threshold
- Returns
int64 tensor with the indices of the elements that have been kept by NMS, sorted in decreasing order of scores
- Return type
Tensor
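A common way to implement this (used, for example, by torchvision; the snippet below is an illustrative sketch under that assumption, not flowvision's actual code) is to offset every box by its category index times a value larger than any coordinate, so boxes from different categories can never overlap and a single class-agnostic NMS pass suffices:

```python
def offset_boxes(boxes, idxs):
    """Shift each (x1, y1, x2, y2) box by idx * (max_coord + 1) so that
    boxes with different category indices become spatially disjoint."""
    max_coord = max(c for box in boxes for c in box)
    offset = max_coord + 1
    return [[c + i * offset for c in box] for box, i in zip(boxes, idxs)]

boxes = [[0, 0, 10, 10], [1, 1, 11, 11]]   # heavily overlapping boxes...
shifted = offset_boxes(boxes, [0, 1])       # ...but in different categories
# After shifting, the category-1 box lies entirely beyond the category-0
# box's extent, so class-agnostic NMS keeps both.
print(shifted)
```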
flowvision.layers.blocks.box_iou(boxes1: oneflow.Tensor, boxes2: oneflow.Tensor) → oneflow.Tensor[source]¶
Return intersection-over-union (Jaccard index) between two sets of boxes.
Both sets of boxes are expected to be in (x1, y1, x2, y2) format with 0 <= x1 < x2 and 0 <= y1 < y2.
- Parameters
boxes1 (Tensor[N, 4]) – first set of boxes
boxes2 (Tensor[M, 4]) – second set of boxes
- Returns
the NxM matrix containing the pairwise IoU values for every element in boxes1 and boxes2
- Return type
Tensor[N, M]
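The underlying computation can be sketched in plain Python for the same (x1, y1, x2, y2) convention (`pairwise_iou` is an illustrative re-implementation, not flowvision's vectorized tensor code):

```python
def pairwise_iou(boxes1, boxes2):
    """Return the N x M matrix of IoU values between two lists of boxes."""
    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    result = []
    for a in boxes1:
        row = []
        for b in boxes2:
            # Intersection rectangle; width/height clamp to 0 when disjoint.
            iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
            ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
            inter = iw * ih
            # IoU = intersection / union.
            row.append(inter / (area(a) + area(b) - inter))
        result.append(row)
    return result

print(pairwise_iou([[0, 0, 2, 2]], [[0, 0, 2, 2], [1, 1, 3, 3]]))
```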
flowvision.layers.blocks.nms(boxes: oneflow.Tensor, scores: oneflow.Tensor, iou_threshold: float) → oneflow.Tensor[source]¶
Performs non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU).
NMS iteratively removes lower scoring boxes which have an IoU greater than iou_threshold with another (higher scoring) box.
- Parameters
boxes (Tensor[N, 4]) – boxes to perform NMS on. They are expected to be in (x1, y1, x2, y2) format with 0 <= x1 < x2 and 0 <= y1 < y2.
scores (Tensor[N]) – scores for each one of the boxes
iou_threshold (float) – discards all overlapping boxes with IoU > iou_threshold
- Returns
int64 tensor with the indices of the elements that have been kept by NMS, sorted in decreasing order of scores
- Return type
Tensor
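The greedy procedure described above can be sketched in plain Python under the same box conventions (`nms_sketch` is an illustrative re-implementation, not flowvision's tensor code):

```python
def nms_sketch(boxes, scores, iou_threshold):
    """Greedy NMS: repeatedly keep the highest-scoring remaining box and
    drop every other box whose IoU with it exceeds iou_threshold."""
    def iou(a, b):
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    # Candidate indices sorted by decreasing score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress everything that overlaps the kept box too much.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
print(nms_sketch(boxes, scores, 0.5))  # [0, 2]
```

Box 1 is suppressed because its IoU with the higher-scoring box 0 is about 0.68, above the 0.5 threshold, while the disjoint box 2 survives.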
flowvision.layers.attention module¶
class flowvision.layers.attention.SEModule(channels: int, reduction: int = 16, rd_channels: Optional[int] = None, act_layer: Optional[oneflow.nn.modules.activation.ReLU] = <class 'oneflow.nn.modules.activation.ReLU'>, gate_layer: Optional[oneflow.nn.modules.activation.Sigmoid] = <class 'oneflow.nn.modules.activation.Sigmoid'>, mlp_bias=True)[source]¶
The “Squeeze-and-Excitation” block adaptively recalibrates channel-wise feature responses. This is based on “Squeeze-and-Excitation Networks”. The unit is designed to improve the representational capacity of a network by enabling it to perform dynamic channel-wise feature recalibration.
- Parameters
channels (int) – The input channel size
reduction (int) – Ratio that allows us to vary the capacity and computational cost of the SE module. Default: 16
rd_channels (int or None) – Number of reduced channels. If None, it is calculated from reduction
act_layer (Optional[ReLU]) – An activation layer used after the first FC layer. Default: flow.nn.ReLU
gate_layer (Optional[Sigmoid]) – An activation layer used after the second FC layer. Default: flow.nn.Sigmoid
mlp_bias (bool) – If True, adds a learnable bias to the linear layers. Default: True
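The squeeze-excite-scale pipeline can be sketched in plain Python, with channels represented as flat lists of spatial values and the MLP weights passed in explicitly instead of being learned (`se_recalibrate` is a hypothetical helper for illustration, not the flowvision module):

```python
import math

def se_recalibrate(x, w1, b1, w2, b2):
    """Squeeze-and-Excitation on a list of C channels.

    Squeeze: global average pool per channel.
    Excite:  FC -> ReLU -> FC -> Sigmoid, producing one gate per channel.
    Scale:   multiply every value in a channel by that channel's gate.
    """
    z = [sum(ch) / len(ch) for ch in x]                        # squeeze
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z)) + b)
              for row, b in zip(w1, b1)]                       # FC + ReLU
    gates = [1.0 / (1.0 + math.exp(
                 -(sum(w * v for w, v in zip(row, hidden)) + b)))
             for row, b in zip(w2, b2)]                        # FC + Sigmoid
    return [[g * v for v in ch] for g, ch in zip(gates, x)]    # scale

# Two channels; zero second-layer weights make both gates sigmoid(0) = 0.5,
# so every channel is simply halved.
out = se_recalibrate([[1.0, 1.0], [3.0, 3.0]],
                     w1=[[1.0, 0.0]], b1=[0.0],
                     w2=[[0.0], [0.0]], b2=[0.0, 0.0])
print(out)  # [[0.5, 0.5], [1.5, 1.5]]
```

With learned weights, the gates depend on the pooled channel statistics, which is what lets the block emphasize informative channels and suppress the rest.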