Heads#

Linear#

class eztorch.models.heads.LinearHead(affine=True, bias=True, dropout=0.0, dropout_inplace=False, input_dim=2048, norm_layer=None, output_dim=1000, init_normal=True, init_mean=0.0, init_std=0.01, zero_bias=True)[source]#

Build a Linear head with optional dropout and normalization.

Parameters:
  • affine (optional) – If True, use affine parameters in the normalization layer.
    Default: True

  • bias (optional) – If True, use a bias in the linear layer. If norm_layer is specified, bias is set to False.
    Default: True

  • dropout (optional) – Dropout probability; if \(0\), no dropout layer is added.
    Default: 0.0

  • dropout_inplace (optional) – If True, use the inplace operation in dropout.
    Default: False

  • input_dim (optional) – Input dimension of the linear head.
    Default: 2048

  • norm_layer (optional) – Normalization layer applied after the linear layer; if a str, the module is looked up in the _BN_LAYERS dictionary.
    Default: None

  • output_dim (optional) – Output dimension of the linear head.
    Default: 1000

  • init_normal (optional) – If True, initialize the linear layer weights from a normal distribution.
    Default: True

  • init_mean (optional) – Mean of the normal initialization.
    Default: 0.0

  • init_std (optional) – Standard deviation of the normal initialization.
    Default: 0.01

  • zero_bias (optional) – If True, initialize the bias to zeros.
    Default: True

Raises:

NotImplementedError – If norm_layer is not supported.
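
A minimal usage sketch (assuming eztorch and torch are installed; the batch size, dropout value, and shapes are illustrative):

    import torch
    from eztorch.models.heads import LinearHead

    # Linear classifier head on top of 2048-d backbone features.
    head = LinearHead(input_dim=2048, output_dim=1000, dropout=0.5)

    features = torch.randn(8, 2048)   # batch of 8 feature vectors
    logits = head(features)           # expected shape: (8, 1000)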

Linear 3D#

class eztorch.models.heads.Linear3DHead(input_dim, pool=None, dropout=None, bn=None, proj=None, norm=False, init_std=0.01, view=True)[source]#

Linear 3D head. This layer performs an optional pooling operation, followed by an optional normalization, an optional dropout, and a fully-connected projection.

   Pool3d
      ↓
Normalization
      ↓
   Dropout
      ↓
  Projection
Parameters:
  • input_dim – Input channel size of the head.

  • pool (optional) – Pooling module.
    Default: None

  • dropout (optional) – Dropout module.
    Default: None

  • bn (optional) – Batch normalization module.
    Default: None

  • proj (optional) – Projection module.
    Default: None

  • norm (optional) – If True, normalize features along the first dimension.
    Default: False

  • init_std (optional) – Standard deviation for weight initialization, following pytorchvideo.
    Default: 0.01

  • view (optional) – If True, reshape the output to \((-1, num\_features)\).
    Default: True
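
A construction sketch that passes the optional modules explicitly (the module choices and sizes are illustrative, not a reference configuration; in practice the builder create_linear3d_head() documented below assembles these modules):

    import torch.nn as nn
    from eztorch.models.heads import Linear3DHead

    head = Linear3DHead(
        input_dim=2048,
        pool=nn.AvgPool3d(kernel_size=(1, 7, 7), stride=(1, 1, 1)),
        dropout=nn.Dropout(0.5),
        proj=nn.Linear(2048, 400),  # fully-connected projection to 400 classes
    )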

eztorch.models.heads.create_linear3d_head(*, in_features, num_classes=400, bn=None, norm=False, pool=<class 'torch.nn.modules.pooling.AvgPool3d'>, pool_kernel_size=(1, 7, 7), pool_stride=(1, 1, 1), pool_padding=(0, 0, 0), output_size=(1, 1, 1), dropout_rate=0.5, view=True)[source]#

Creates a linear 3D head. This layer performs an optional pooling operation, followed by an optional normalization, an optional dropout, and a fully-connected projection.

   Pool3d
      ↓
Normalization
      ↓
   Dropout
      ↓
  Projection

Pool3d examples include: AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None.

Parameters:
  • in_features – Input channel size of the head.

  • num_classes (optional) – Output channel size of the head.
    Default: 400

  • bn (optional) – A callable that constructs a batch normalization layer.
    Default: None

  • norm (optional) – If True, normalize features along the first dimension.
    Default: False

  • pool (optional) – A callable that constructs the head pooling layer; examples include nn.AvgPool3d, nn.MaxPool3d, nn.AdaptiveAvgPool3d, and None (no pooling).
    Default: <class 'torch.nn.modules.pooling.AvgPool3d'>

  • pool_kernel_size (optional) – Pooling kernel size(s) when not using adaptive pooling.
    Default: (1, 7, 7)

  • pool_stride (optional) – Pooling stride size(s) when not using adaptive pooling.
    Default: (1, 1, 1)

  • pool_padding (optional) – Pooling padding size(s) when not using adaptive pooling.
    Default: (0, 0, 0)

  • output_size (optional) – Spatiotemporal output size when using adaptive pooling.
    Default: (1, 1, 1)

  • dropout_rate (optional) – Dropout rate.
    Default: 0.5

  • view (optional) – If True, reshape the output to \((-1, num\_features)\).
    Default: True

Return type:

Module
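
A usage sketch (all arguments are keyword-only; the clip feature shape is illustrative):

    import torch
    from eztorch.models.heads import create_linear3d_head

    # Head for 2048-channel backbone features over 7x7 spatial maps.
    head = create_linear3d_head(
        in_features=2048,
        num_classes=400,
        pool_kernel_size=(1, 7, 7),
        dropout_rate=0.5,
    )

    clip_features = torch.randn(2, 2048, 4, 7, 7)  # (batch, channels, time, height, width)
    out = head(clip_features)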

ResNet 3D head#

class eztorch.models.heads.VideoResNetHead(pool=None, dropout=None, proj=None, activation=None, output_pool=None, init_std=0.01)[source]#

ResNet basic head. This layer performs an optional pooling operation, followed by an optional dropout, a fully-connected projection, an optional activation layer, and global spatiotemporal averaging.

 Pool3d
    ↓
 Dropout
    ↓
Projection
    ↓
Activation
    ↓
Averaging

The builder can be found in create_video_resnet_head().

Parameters:
  • pool (optional) – Pooling module.
    Default: None

  • dropout (optional) – Dropout module.
    Default: None

  • proj (optional) – Projection module.
    Default: None

  • activation (optional) – Activation module.
    Default: None

  • output_pool (optional) – Pooling module for the output.
    Default: None

  • init_std (optional) – Standard deviation for weight initialization, following pytorchvideo.
    Default: 0.01
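
A construction sketch that passes the optional modules explicitly (the module choices and sizes are illustrative; the builder create_video_resnet_head() documented below is the usual way to assemble this head):

    import torch.nn as nn
    from eztorch.models.heads import VideoResNetHead

    head = VideoResNetHead(
        pool=nn.AvgPool3d(kernel_size=(1, 7, 7), stride=(1, 1, 1)),
        dropout=nn.Dropout(0.5),
        proj=nn.Linear(2048, 400),            # fully-connected projection
        activation=nn.Softmax(dim=1),         # optional activation over the class dimension
        output_pool=nn.AdaptiveAvgPool3d(1),  # global spatiotemporal averaging
    )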

eztorch.models.heads.create_video_resnet_head(*, in_features, num_classes=400, pool=<class 'torch.nn.modules.pooling.AvgPool3d'>, output_size=(1, 1, 1), pool_kernel_size=(1, 7, 7), pool_stride=(1, 1, 1), pool_padding=(0, 0, 0), dropout_rate=0.5, activation=None, output_with_global_average=True)[source]#

Creates a ResNet basic head. This layer performs an optional pooling operation, followed by an optional dropout, a fully-connected projection, an optional activation layer, and global spatiotemporal averaging.

 Pooling
    ↓
 Dropout
    ↓
Projection
    ↓
Activation
    ↓
Averaging

Activation examples include: ReLU, Softmax, Sigmoid, and None. Pool3d examples include: AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None.

Parameters:
  • in_features – Input channel size of the resnet head.

  • num_classes (optional) – Output channel size of the resnet head.
    Default: 400

  • pool (optional) – A callable that constructs the resnet head pooling layer; examples include AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None (no pooling).
    Default: <class 'torch.nn.modules.pooling.AvgPool3d'>

  • pool_kernel_size (optional) – Pooling kernel size(s) when not using adaptive pooling.
    Default: (1, 7, 7)

  • pool_stride (optional) – Pooling stride size(s) when not using adaptive pooling.
    Default: (1, 1, 1)

  • pool_padding (optional) – Pooling padding size(s) when not using adaptive pooling.
    Default: (0, 0, 0)

  • output_size (optional) – Spatiotemporal output size when using adaptive pooling.
    Default: (1, 1, 1)

  • activation (optional) – A callable that constructs the resnet head activation layer; examples include ReLU, Softmax, Sigmoid, and None (no activation).
    Default: None

  • dropout_rate (optional) – Dropout rate.
    Default: 0.5

  • output_with_global_average (optional) – If True, perform global averaging over the temporal and spatial dimensions and reshape the output to \(batch\_size \times out\_features\).
    Default: True

Return type:

Module
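
A usage sketch (all arguments are keyword-only; the input shape is illustrative and the output shape assumes the defaults above):

    import torch
    from eztorch.models.heads import create_video_resnet_head

    head = create_video_resnet_head(
        in_features=2048,
        num_classes=400,
        dropout_rate=0.5,
    )

    clip_features = torch.randn(2, 2048, 4, 7, 7)  # (batch, channels, time, height, width)
    scores = head(clip_features)                   # expected shape: (2, 400) with global averaging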

MLP#

class eztorch.models.heads.MLPHead(activation_inplace=True, activation_layer=<class 'torch.nn.modules.activation.ReLU'>, affine=True, bias=True, dropout=0.0, dropout_inplace=False, hidden_dims=2048, input_dim=2048, norm_layer=None, num_layers=2, last_bias=True, last_norm=False, last_affine=False, output_dim=128, last_init_normal=False, init_mean=0.0, init_std=0.01, zero_bias=True)[source]#

Build an MLP head with optional dropout and normalization.

Parameters:
  • activation_inplace (optional) – If True, use the inplace operation in activation layers.
    Default: True

  • activation_layer (optional) – Activation layer; if a str, the module is looked up in the _ACTIVATION_LAYERS dictionary.
    Default: <class 'torch.nn.modules.activation.ReLU'>

  • affine (optional) – If True, use affine parameters in the normalization layers.
    Default: True

  • bias (optional) – If True, use a bias in the linear layers. If norm_layer is specified, bias is set to False.
    Default: True

  • dropout (optional) – Dropout probability; if \(0\), no dropout layer is added.
    Default: 0.0

  • dropout_inplace (optional) – If True, use the inplace operation in dropout.
    Default: False

  • hidden_dims (optional) – Dimension(s) of the \((num\_layers - 1)\) hidden layers; if an int, the same dimension is used for all hidden layers.
    Default: 2048

  • input_dim (optional) – Input dimension of the MLP head.
    Default: 2048

  • norm_layer (optional) – Normalization layer applied after the linear layers; if a str, the module is looked up in the _BN_LAYERS dictionary.
    Default: None

  • num_layers (optional) – Number of layers \((number\ of\ hidden\ layers + 1)\).
    Default: 2

  • last_bias (optional) – If True, use a bias in the output layer. If last_norm and norm_layer are specified, bias is set to False.
    Default: True

  • last_norm (optional) – If True, apply normalization to the last layer if norm_layer is specified.
    Default: False

  • last_affine (optional) – If True, use affine parameters in the output normalization layer.
    Default: False

  • output_dim (optional) – Output dimension of the MLP head.
    Default: 128

  • last_init_normal (optional) – If True, initialize the last layer weights from a normal distribution.
    Default: False

  • init_mean (optional) – Mean of the last-layer initialization.
    Default: 0.0

  • init_std (optional) – Standard deviation of the last-layer initialization.
    Default: 0.01

  • zero_bias (optional) – If True, initialize the bias to zeros in the last-layer initialization.
    Default: True

Raises:

NotImplementedError – If norm_layer is not supported.
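
A minimal usage sketch, e.g. as a projection head on top of backbone features (the dimensions are illustrative):

    import torch
    from eztorch.models.heads import MLPHead

    # 2-layer MLP: 2048 -> 2048 hidden -> 128 output.
    head = MLPHead(input_dim=2048, hidden_dims=2048, output_dim=128, num_layers=2)

    features = torch.randn(8, 2048)
    embeddings = head(features)       # expected shape: (8, 128)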