Heads#

Linear#

class eztorch.models.heads.LinearHead(affine=True, bias=True, dropout=0.0, dropout_inplace=False, input_dim=2048, norm_layer=None, output_dim=1000, init_normal=True, init_mean=0.0, init_std=0.01, zero_bias=True)[source]#

Build a Linear head with optional dropout and normalization.

Parameters:
  • affine (optional) – If True, use affine parameters in the normalization layer.
    Default: True

  • bias (optional) – If True, use a bias in the linear layer. If norm_layer is specified, bias is set to False.
    Default: True

  • dropout (optional) – Dropout probability; if \(0\), no dropout layer is added.
    Default: 0.0

  • dropout_inplace (optional) – If True, use the inplace operation in dropout.
    Default: False

  • input_dim (optional) – Input dimension of the linear head.
    Default: 2048

  • norm_layer (optional) – Normalization layer applied after the linear layer; if a str, the module is looked up in the _BN_LAYERS dictionary.
    Default: None

  • output_dim (optional) – Output dimension of the linear head.
    Default: 1000

  • init_normal (optional) – If True, initialize the linear layer weights from a normal distribution.
    Default: True

  • init_mean (optional) – Mean of the normal initialization.
    Default: 0.0

  • init_std (optional) – Standard deviation of the normal initialization.
    Default: 0.01

  • zero_bias (optional) – If True, initialize the bias to zeros.
    Default: True

Raises:

NotImplementedError – If norm_layer is not supported.
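
A minimal usage sketch (assuming eztorch and torch are installed; the batch size, dropout value, and shapes are illustrative):

    import torch
    from eztorch.models.heads import LinearHead

    # Linear classifier head on top of 2048-d backbone features.
    head = LinearHead(input_dim=2048, output_dim=1000, dropout=0.5)

    features = torch.randn(8, 2048)   # batch of 8 feature vectors
    logits = head(features)           # expected shape: (8, 1000)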

Linear 3D#

class eztorch.models.heads.Linear3DHead(input_dim, pool=None, dropout=None, bn=None, proj=None, norm=False, init_std=0.01, view=True)[source]#

Linear 3D head. This layer performs an optional pooling operation, followed by an optional normalization, an optional dropout, and a fully-connected projection.

   Pool3d
      ↓
Normalization
      ↓
   Dropout
      ↓
  Projection
Parameters:
  • input_dim – Input channel size of the head.

  • pool (optional) – Pooling module.
    Default: None

  • dropout (optional) – Dropout module.
    Default: None

  • bn (optional) – Batch normalization module.
    Default: None

  • proj (optional) – Projection module.
    Default: None

  • norm (optional) – If True, normalize features along the first dimension.
    Default: False

  • init_std (optional) – Standard deviation for weight initialization, following pytorchvideo.
    Default: 0.01

  • view (optional) – If True, reshape the output to \((-1, num\_features)\).
    Default: True
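
A construction sketch that passes the optional modules explicitly (the module choices and sizes are illustrative, not a reference configuration; in practice the builder create_linear3d_head() documented below assembles these modules):

    import torch.nn as nn
    from eztorch.models.heads import Linear3DHead

    head = Linear3DHead(
        input_dim=2048,
        pool=nn.AvgPool3d(kernel_size=(1, 7, 7), stride=(1, 1, 1)),
        dropout=nn.Dropout(0.5),
        proj=nn.Linear(2048, 400),  # fully-connected projection to 400 classes
    )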

eztorch.models.heads.create_linear3d_head(*, in_features, num_classes=400, bn=None, norm=False, pool=<class 'torch.nn.modules.pooling.AvgPool3d'>, pool_kernel_size=(1, 7, 7), pool_stride=(1, 1, 1), pool_padding=(0, 0, 0), output_size=(1, 1, 1), dropout_rate=0.5, view=True)[source]#

Creates a linear 3D head. This layer performs an optional pooling operation, followed by an optional normalization, an optional dropout, and a fully-connected projection.

   Pool3d
      ↓
Normalization
      ↓
   Dropout
      ↓
  Projection

Pool3d examples include: AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None.

Parameters:
  • in_features – Input channel size of the head.

  • num_classes (optional) – Output channel size of the head.
    Default: 400

  • bn (optional) – A callable that constructs a batch normalization layer.
    Default: None

  • norm (optional) – If True, normalize features along the first dimension.
    Default: False

  • pool (optional) – A callable that constructs the head pooling layer; examples include nn.AvgPool3d, nn.MaxPool3d, nn.AdaptiveAvgPool3d, and None (no pooling).
    Default: <class 'torch.nn.modules.pooling.AvgPool3d'>

  • pool_kernel_size (optional) – Pooling kernel size(s) when not using adaptive pooling.
    Default: (1, 7, 7)

  • pool_stride (optional) – Pooling stride size(s) when not using adaptive pooling.
    Default: (1, 1, 1)

  • pool_padding (optional) – Pooling padding size(s) when not using adaptive pooling.
    Default: (0, 0, 0)

  • output_size (optional) – Spatiotemporal output size when using adaptive pooling.
    Default: (1, 1, 1)

  • dropout_rate (optional) – Dropout rate.
    Default: 0.5

  • view (optional) – If True, reshape the output to \((-1, num\_features)\).
    Default: True

Return type:

Module
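
A usage sketch (all arguments are keyword-only; the clip feature shape is illustrative):

    import torch
    from eztorch.models.heads import create_linear3d_head

    # Head for 2048-channel backbone features over 7x7 spatial maps.
    head = create_linear3d_head(
        in_features=2048,
        num_classes=400,
        pool_kernel_size=(1, 7, 7),
        dropout_rate=0.5,
    )

    clip_features = torch.randn(2, 2048, 4, 7, 7)  # (batch, channels, time, height, width)
    out = head(clip_features)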

ResNet 3D head#

class eztorch.models.heads.VideoResNetHead(pool=None, dropout=None, proj=None, activation=None, output_pool=None, init_std=0.01)[source]#

ResNet basic head. This layer performs an optional pooling operation, followed by an optional dropout, a fully-connected projection, an optional activation layer, and global spatiotemporal averaging.

 Pool3d
    ↓
 Dropout
    ↓
Projection
    ↓
Activation
    ↓
Averaging

The builder can be found in create_video_resnet_head().

Parameters:
  • pool (optional) – Pooling module.
    Default: None

  • dropout (optional) – Dropout module.
    Default: None

  • proj (optional) – Projection module.
    Default: None

  • activation (optional) – Activation module.
    Default: None

  • output_pool (optional) – Pooling module for the output.
    Default: None

  • init_std (optional) – Standard deviation for weight initialization, following pytorchvideo.
    Default: 0.01
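
A construction sketch that passes the optional modules explicitly (the module choices and sizes are illustrative; the builder create_video_resnet_head() documented below is the usual way to assemble this head):

    import torch.nn as nn
    from eztorch.models.heads import VideoResNetHead

    head = VideoResNetHead(
        pool=nn.AvgPool3d(kernel_size=(1, 7, 7), stride=(1, 1, 1)),
        dropout=nn.Dropout(0.5),
        proj=nn.Linear(2048, 400),            # fully-connected projection
        activation=nn.Softmax(dim=1),         # optional activation over the class dimension
        output_pool=nn.AdaptiveAvgPool3d(1),  # global spatiotemporal averaging
    )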

eztorch.models.heads.create_video_resnet_head(*, in_features, num_classes=400, pool=<class 'torch.nn.modules.pooling.AvgPool3d'>, output_size=(1, 1, 1), pool_kernel_size=(1, 7, 7), pool_stride=(1, 1, 1), pool_padding=(0, 0, 0), dropout_rate=0.5, activation=None, output_with_global_average=True)[source]#

Creates a ResNet basic head. This layer performs an optional pooling operation, followed by an optional dropout, a fully-connected projection, an optional activation layer, and global spatiotemporal averaging.

 Pooling
    ↓
 Dropout
    ↓
Projection
    ↓
Activation
    ↓
Averaging

Activation examples include: ReLU, Softmax, Sigmoid, and None. Pool3d examples include: AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None.

Parameters:
  • in_features – Input channel size of the resnet head.

  • num_classes (optional) – Output channel size of the resnet head.
    Default: 400

  • pool (optional) – A callable that constructs the resnet head pooling layer; examples include AvgPool3d, MaxPool3d, AdaptiveAvgPool3d, and None (no pooling).
    Default: <class 'torch.nn.modules.pooling.AvgPool3d'>

  • pool_kernel_size (optional) – Pooling kernel size(s) when not using adaptive pooling.
    Default: (1, 7, 7)

  • pool_stride (optional) – Pooling stride size(s) when not using adaptive pooling.
    Default: (1, 1, 1)

  • pool_padding (optional) – Pooling padding size(s) when not using adaptive pooling.
    Default: (0, 0, 0)

  • output_size (optional) – Spatiotemporal output size when using adaptive pooling.
    Default: (1, 1, 1)

  • activation (optional) – A callable that constructs the resnet head activation layer; examples include ReLU, Softmax, Sigmoid, and None (no activation).
    Default: None

  • dropout_rate (optional) – Dropout rate.
    Default: 0.5

  • output_with_global_average (optional) – If True, perform global averaging over the temporal and spatial dimensions and reshape the output to \(batch\_size \times out\_features\).
    Default: True

Return type:

Module
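
A usage sketch (all arguments are keyword-only; the input shape is illustrative and the output shape assumes the defaults above):

    import torch
    from eztorch.models.heads import create_video_resnet_head

    head = create_video_resnet_head(
        in_features=2048,
        num_classes=400,
        dropout_rate=0.5,
    )

    clip_features = torch.randn(2, 2048, 4, 7, 7)  # (batch, channels, time, height, width)
    scores = head(clip_features)                   # expected shape: (2, 400) with global averaging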

MLP#

class eztorch.models.heads.MLPHead(activation_inplace=True, activation_layer=<class 'torch.nn.modules.activation.ReLU'>, affine=True, bias=True, dropout=0.0, dropout_inplace=False, hidden_dims=2048, input_dim=2048, norm_layer=None, num_layers=2, last_bias=True, last_norm=False, last_affine=False, output_dim=128, last_init_normal=False, init_mean=0.0, init_std=0.01, zero_bias=True)[source]#

Build an MLP head with optional dropout and normalization.

Parameters:
  • activation_inplace (optional) – If True, use the inplace operation in activation layers.
    Default: True

  • activation_layer (optional) – Activation layer; if a str, the module is looked up in the _ACTIVATION_LAYERS dictionary.
    Default: <class 'torch.nn.modules.activation.ReLU'>

  • affine (optional) – If True, use affine parameters in the normalization layers.
    Default: True

  • bias (optional) – If True, use a bias in the linear layers. If norm_layer is specified, bias is set to False.
    Default: True

  • dropout (optional) – Dropout probability; if \(0\), no dropout layer is added.
    Default: 0.0

  • dropout_inplace (optional) – If True, use the inplace operation in dropout.
    Default: False

  • hidden_dims (optional) – Dimension(s) of the \((num\_layers - 1)\) hidden layers; if an int, the same dimension is used for all hidden layers.
    Default: 2048

  • input_dim (optional) – Input dimension of the MLP head.
    Default: 2048

  • norm_layer (optional) – Normalization layer applied after the linear layers; if a str, the module is looked up in the _BN_LAYERS dictionary.
    Default: None

  • num_layers (optional) – Number of layers \((number\ of\ hidden\ layers + 1)\).
    Default: 2

  • last_bias (optional) – If True, use a bias in the output layer. If last_norm and norm_layer are specified, bias is set to False.
    Default: True

  • last_norm (optional) – If True, apply normalization to the last layer if norm_layer is specified.
    Default: False

  • last_affine (optional) – If True, use affine parameters in the output normalization layer.
    Default: False

  • output_dim (optional) – Output dimension of the MLP head.
    Default: 128

  • last_init_normal (optional) – If True, initialize the last layer weights from a normal distribution.
    Default: False

  • init_mean (optional) – Mean of the last-layer initialization.
    Default: 0.0

  • init_std (optional) – Standard deviation of the last-layer initialization.
    Default: 0.01

  • zero_bias (optional) – If True, initialize the bias to zeros in the last-layer initialization.
    Default: True

Raises:

NotImplementedError – If norm_layer is not supported.
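
A minimal usage sketch, e.g. as a projection head on top of backbone features (the dimensions are illustrative):

    import torch
    from eztorch.models.heads import MLPHead

    # 2-layer MLP: 2048 -> 2048 hidden -> 128 output.
    head = MLPHead(input_dim=2048, hidden_dims=2048, output_dim=128, num_layers=2)

    features = torch.randn(8, 2048)
    embeddings = head(features)       # expected shape: (8, 128)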