Heads

Linear

class eztorch.models.heads.LinearHead(affine=True, bias=True, dropout=0.0, dropout_inplace=False, input_dim=2048, norm_layer=None, output_dim=1000, init_normal=True, init_mean=0.0, init_std=0.01, zero_bias=True)[source]

Build a Linear head with optional dropout and normalization.

Parameters:
  • affine (bool, optional) – Use affine in normalization layer.

    Default: True

  • bias (bool, optional) – Use bias in the linear layer. If norm_layer is provided, bias is set to False.

    Default: True

  • dropout (float, optional) – Dropout probability; if \(0\), no dropout layer is added.

    Default: 0.0

  • dropout_inplace (bool, optional) – Use inplace operation in dropout.

    Default: False

  • input_dim (int, optional) – Input dimension for the linear head.

    Default: 2048

  • norm_layer (Union[str, Module, None], optional) – Normalization layer after the linear layer, if str lookup for the module in _BN_LAYERS dictionary.

    Default: None

  • output_dim (int, optional) – Output dimension for the linear head.

    Default: 1000

  • init_normal (bool, optional) – If True, initialize the linear layer weights from a normal distribution.

    Default: True

  • init_mean (float, optional) – Mean for the initialization.

    Default: 0.0

  • init_std (float, optional) – Standard deviation for the initialization.

    Default: 0.01

  • zero_bias (bool, optional) – If True, initialize the bias to zeros.

    Default: True

Raises:

NotImplementedError – If norm_layer is not supported.
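As a rough illustration, the structure described above (optional dropout, a linear layer with normal initialization, and an optional normalization layer that disables the linear bias) can be sketched in plain PyTorch. `make_linear_head` is a hypothetical helper for illustration, not part of eztorch:

```python
import torch
from torch import nn

def make_linear_head(input_dim=2048, output_dim=1000, dropout=0.0,
                     norm_layer=None, affine=True, init_std=0.01):
    layers = []
    if dropout > 0:
        layers.append(nn.Dropout(dropout))
    bias = norm_layer is None  # bias is redundant when a norm layer follows
    linear = nn.Linear(input_dim, output_dim, bias=bias)
    nn.init.normal_(linear.weight, mean=0.0, std=init_std)
    if bias:
        nn.init.zeros_(linear.bias)
    layers.append(linear)
    if norm_layer is not None:
        layers.append(norm_layer(output_dim, affine=affine))
    return nn.Sequential(*layers)

head = make_linear_head(input_dim=512, output_dim=10, norm_layer=nn.BatchNorm1d)
out = head(torch.randn(4, 512))
print(out.shape)  # torch.Size([4, 10])
```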

Linear 3D

class eztorch.models.heads.Linear3DHead(input_dim, pool=None, dropout=None, bn=None, proj=None, norm=False, init_std=0.01, view=True)[source]

Linear 3D head. This layer performs an optional pooling operation, followed by an optional normalization, an optional dropout, and a fully-connected projection.

   Pool3d
      ↓
Normalization
      ↓
   Dropout
      ↓
  Projection
Parameters:
  • input_dim (int) – Input channel size of the head.

  • pool (Module, optional) – Pooling module.

    Default: None

  • dropout (Module, optional) – Dropout module.

    Default: None

  • bn (Module, optional) – Batch normalization module.

    Default: None

  • proj (Module, optional) – Projection module.

    Default: None

  • norm (bool, optional) – If True, normalize features along first dimension.

    Default: False

  • init_std (float, optional) – Standard deviation for the weight initialization (as in pytorchvideo).

    Default: 0.01

  • view (bool, optional) – If True, apply reshape view to \((-1, num\ features)\).

    Default: True

eztorch.models.heads.create_linear3d_head(*, in_features, num_classes=400, bn=None, norm=False, pool=<class 'torch.nn.modules.pooling.AvgPool3d'>, pool_kernel_size=(1, 7, 7), pool_stride=(1, 1, 1), pool_padding=(0, 0, 0), output_size=(1, 1, 1), dropout_rate=0.5, view=True)[source]

Creates a linear 3D head. This layer performs an optional pooling operation, followed by an optional normalization, an optional dropout, and a fully-connected projection.

   Pool3d
      ↓
Normalization
      ↓
   Dropout
      ↓
  Projection

Pool3d examples include: AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None.

Parameters:
  • in_features (int) – Input channel size of the resnet head.

  • num_classes (int, optional) – Output channel size of the resnet head.

    Default: 400

  • bn (Union[str, Callable], optional) – A callable that constructs a batch norm layer.

    Default: None

  • norm (bool, optional) – If True, normalize features along first dimension.

    Default: False

  • pool (Union[str, Callable], optional) – A callable that constructs resnet head pooling layer, examples include: nn.AvgPool3d, nn.MaxPool3d, nn.AdaptiveAvgPool3d, and None (not applying pooling).

    Default: <class 'torch.nn.modules.pooling.AvgPool3d'>

  • pool_kernel_size (Tuple[int], optional) – Pooling kernel size(s) when not using adaptive pooling.

    Default: (1, 7, 7)

  • pool_stride (Tuple[int], optional) – Pooling stride size(s) when not using adaptive pooling.

    Default: (1, 1, 1)

  • pool_padding (Tuple[int], optional) – Pooling padding size(s) when not using adaptive pooling.

    Default: (0, 0, 0)

  • output_size (Tuple[int], optional) – Spatiotemporal output size when using adaptive pooling.

    Default: (1, 1, 1)

  • dropout_rate (float, optional) – Dropout rate.

    Default: 0.5

  • view (bool, optional) – Whether to apply reshape view to \((-1, num\ features)\).

    Default: True

Return type:

Module
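The Pool3d → Dropout → Projection flow above (with the documented defaults and bn=None, so no normalization step) can be sketched in plain PyTorch. This is an illustrative re-creation of the data flow, not eztorch's actual implementation:

```python
import torch
from torch import nn

pool = nn.AvgPool3d(kernel_size=(1, 7, 7), stride=(1, 1, 1))  # default pool
dropout = nn.Dropout(0.5)                                     # default dropout_rate
proj = nn.Linear(2048, 400)                                   # in_features -> num_classes

x = torch.randn(2, 2048, 4, 7, 7)   # (batch, channels, frames, height, width)
x = pool(x)                          # -> (2, 2048, 4, 1, 1)
x = dropout(x)
x = proj(x.permute(0, 2, 3, 4, 1))   # channels last for the linear projection
x = x.reshape(-1, 400)               # view=True: reshape to (-1, num_features)
print(x.shape)  # torch.Size([8, 400])
```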

ResNet 3D head

class eztorch.models.heads.VideoResNetHead(pool=None, dropout=None, proj=None, activation=None, output_pool=None, init_std=0.01)[source]

ResNet basic head. This layer performs an optional pooling operation followed by an optional dropout, a fully-connected projection, an optional activation layer and a global spatiotemporal averaging.

 Pool3d
    ↓
 Dropout
    ↓
Projection
    ↓
Activation
    ↓
Averaging

The builder can be found in create_video_resnet_head().

Parameters:
  • pool (Module, optional) – Pooling module.

    Default: None

  • dropout (Module, optional) – Dropout module.

    Default: None

  • proj (Module, optional) – Projection module.

    Default: None

  • activation (Module, optional) – Activation module.

    Default: None

  • output_pool (Module, optional) – Pooling module for the output.

    Default: None

  • init_std (float, optional) – Standard deviation for the weight initialization (as in pytorchvideo).

    Default: 0.01

eztorch.models.heads.create_video_resnet_head(*, in_features, num_classes=400, pool=<class 'torch.nn.modules.pooling.AvgPool3d'>, output_size=(1, 1, 1), pool_kernel_size=(1, 7, 7), pool_stride=(1, 1, 1), pool_padding=(0, 0, 0), dropout_rate=0.5, activation=None, output_with_global_average=True)[source]

Creates a ResNet basic head. This layer performs an optional pooling operation followed by an optional dropout, a fully-connected projection, an optional activation layer and a global spatiotemporal averaging.

 Pooling
    ↓
 Dropout
    ↓
Projection
    ↓
Activation
    ↓
Averaging

Activation examples include: ReLU, Softmax, Sigmoid, and None. Pool3d examples include: AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None.

Parameters:
  • in_features (int) – Input channel size of the resnet head.

  • num_classes (int, optional) – Output channel size of the resnet head.

    Default: 400

  • pool (Union[str, Callable], optional) – A callable that constructs resnet head pooling layer, examples include: AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None (not applying pooling).

    Default: <class 'torch.nn.modules.pooling.AvgPool3d'>

  • pool_kernel_size (Tuple[int], optional) – Pooling kernel size(s) when not using adaptive pooling.

    Default: (1, 7, 7)

  • pool_stride (Tuple[int], optional) – Pooling stride size(s) when not using adaptive pooling.

    Default: (1, 1, 1)

  • pool_padding (Tuple[int], optional) – Pooling padding size(s) when not using adaptive pooling.

    Default: (0, 0, 0)

  • output_size (Tuple[int], optional) – Spatiotemporal output size when using adaptive pooling.

    Default: (1, 1, 1)

  • activation (Union[str, Callable, None], optional) – A callable that constructs resnet head activation layer, examples include: ReLU, Softmax, Sigmoid, and None (not applying activation).

    Default: None

  • dropout_rate (float, optional) – Dropout rate.

    Default: 0.5

  • output_with_global_average (bool, optional) – If True, perform global averaging on temporal and spatial dimensions and reshape output to \(batch\_size \times out\_features\).

    Default: True

Return type:

Module
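The full Pooling → Dropout → Projection → Activation → Averaging flow can be sketched in plain PyTorch with the documented defaults; Softmax is picked here only as one of the listed activation examples. This is an illustrative sketch, not eztorch's actual implementation:

```python
import torch
from torch import nn

pool = nn.AvgPool3d(kernel_size=(1, 7, 7))  # stride defaults to kernel size
dropout = nn.Dropout(0.5)
proj = nn.Linear(2048, 400)
activation = nn.Softmax(dim=-1)             # one possible activation choice

x = torch.randn(2, 2048, 4, 7, 7)           # (batch, channels, frames, height, width)
x = dropout(pool(x))                        # -> (2, 2048, 4, 1, 1)
x = proj(x.permute(0, 2, 3, 4, 1))          # -> (2, 4, 1, 1, 400)
x = activation(x)
x = x.mean(dim=(1, 2, 3))                   # global average -> (batch, out_features)
print(x.shape)  # torch.Size([2, 400])
```

Averaging softmax outputs keeps each row a valid probability distribution, so every output row still sums to 1.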

MLP

class eztorch.models.heads.MLPHead(activation_inplace=True, activation_layer=<class 'torch.nn.modules.activation.ReLU'>, affine=True, bias=True, dropout=0.0, dropout_inplace=False, hidden_dims=2048, input_dim=2048, norm_layer=None, num_layers=2, last_bias=True, last_norm=False, last_affine=False, output_dim=128, last_init_normal=False, init_mean=0.0, init_std=0.01, zero_bias=True)[source]

Build an MLP head with optional dropout and normalization.

Parameters:
  • activation_inplace (bool, optional) – Inplace operation for activation layers.

    Default: True

  • activation_layer (Union[str, Module], optional) – Activation layer, if str lookup for the module in _ACTIVATION_LAYERS dictionary.

    Default: <class 'torch.nn.modules.activation.ReLU'>

  • affine (bool, optional) – If True, use affine in normalization layer.

    Default: True

  • bias (bool, optional) – If True, use bias in the linear layers. If norm_layer is provided, bias is set to False.

    Default: True

  • dropout (Union[float, Iterable[float]], optional) – Dropout probability; if \(0\), no dropout layer is added.

    Default: 0.0

  • dropout_inplace (bool, optional) – If True, use inplace operation in dropout.

    Default: False

  • hidden_dims (Union[int, Iterable[int]], optional) – Dimension of the hidden layers \((num\_layers - 1)\). If int, used for all hidden layers.

    Default: 2048

  • input_dim (int, optional) – Input dimension for the MLP head.

    Default: 2048

  • norm_layer (Union[str, Module, None], optional) – Normalization layer after the linear layer, if str lookup for the module in _BN_LAYERS dictionary.

    Default: None

  • num_layers (int, optional) – Number of layers \((number\ of\ hidden\ layers + 1)\).

    Default: 2

  • last_bias (bool, optional) – If True, use bias in the output layer. If last_norm and norm_layer are provided, bias is set to False.

    Default: True

  • last_norm (bool, optional) – If True, apply normalization to the last layer (requires norm_layer).

    Default: False

  • last_affine (bool, optional) – If True, use affine in output normalization layer.

    Default: False

  • output_dim (int, optional) – Output dimension for the MLP head.

    Default: 128

  • last_init_normal (bool, optional) – If True, initialize the last layer weights from a normal distribution.

    Default: False

  • init_mean (float, optional) – Mean for the last initialization.

    Default: 0.0

  • init_std (float, optional) – Standard deviation for the last initialization.

    Default: 0.01

  • zero_bias (bool, optional) – If True, initialize the last layer bias to zeros.

    Default: True

Raises:

NotImplementedError – If norm_layer is not supported.
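The documented MLP structure — \((num\_layers - 1)\) hidden blocks of linear, optional normalization, and activation, followed by an output linear layer with no activation — can be sketched in plain PyTorch. `make_mlp_head` is a hypothetical helper for illustration, not part of eztorch:

```python
import torch
from torch import nn

def make_mlp_head(input_dim=2048, hidden_dim=2048, output_dim=128,
                  num_layers=2, norm_layer=None):
    layers, dim = [], input_dim
    for _ in range(num_layers - 1):
        # Linear -> [Norm] -> Activation hidden block
        layers.append(nn.Linear(dim, hidden_dim, bias=norm_layer is None))
        if norm_layer is not None:
            layers.append(norm_layer(hidden_dim))
        layers.append(nn.ReLU(inplace=True))
        dim = hidden_dim
    # Output layer: no activation, and no norm by default (last_norm=False)
    layers.append(nn.Linear(dim, output_dim))
    return nn.Sequential(*layers)

head = make_mlp_head(input_dim=512, hidden_dim=256, output_dim=128,
                     num_layers=2, norm_layer=nn.BatchNorm1d)
out = head(torch.randn(4, 512))
print(out.shape)  # torch.Size([4, 128])
```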