Siamese

Base classes

class eztorch.models.siamese.SiameseBaseModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False)

Abstract class to represent siamese models.

Subclasses should implement the training_step method.

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per crop resolution.
Default: False
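The crop ordering described for num_global_crops and num_local_crops can be illustrated with a small sketch. It assumes a batch is a flat sequence of crops with the global crops first and the local crops last; the helper name split_crops is hypothetical and is not part of eztorch.

```python
from typing import List, Sequence, Tuple


def split_crops(
    crops: Sequence[str], num_global_crops: int = 2, num_local_crops: int = 0
) -> Tuple[List[str], List[str]]:
    """Split a flat sequence of crops into (global, local) groups.

    Hypothetical helper: assumes global crops come first and local crops last.
    """
    expected = num_global_crops + num_local_crops
    if len(crops) != expected:
        raise ValueError(f"expected {expected} crops, got {len(crops)}")
    global_crops = list(crops[:num_global_crops])
    local_crops = list(crops[num_global_crops:])
    return global_crops, local_crops


global_crops, local_crops = split_crops(
    ["g0", "g1", "l0", "l1", "l2"], num_global_crops=2, num_local_crops=3
)
print(global_crops)  # ['g0', 'g1']
print(local_crops)   # ['l0', 'l1', 'l2']
```

With the defaults (two global crops, no local crops), the second list is empty.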

class eztorch.models.siamese.MomentumSiameseBaseModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, initial_momentum=0.996, scheduler_momentum='cosine')

Abstract class to represent siamese models with a momentum branch.

Subclasses should implement the training_step method.

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.996
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'cosine'
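To make initial_momentum and scheduler_momentum more concrete, here is a minimal sketch of a momentum (exponential moving average) update with a cosine schedule. It assumes the BYOL-style rule in which the momentum rises from initial_momentum to 1.0 over training; whether eztorch uses exactly this formula is an assumption, and the function names cosine_momentum and ema_update are hypothetical.

```python
import math
from typing import List


def cosine_momentum(
    step: int, total_steps: int, initial_momentum: float = 0.996
) -> float:
    """Assumed BYOL-style cosine schedule: initial_momentum at step 0, 1.0 at the end."""
    progress = step / max(total_steps, 1)
    return 1.0 - (1.0 - initial_momentum) * (math.cos(math.pi * progress) + 1.0) / 2.0


def ema_update(
    momentum_params: List[float], online_params: List[float], m: float
) -> List[float]:
    """Exponential moving average: new = m * momentum + (1 - m) * online."""
    return [m * pm + (1.0 - m) * po for pm, po in zip(momentum_params, online_params)]


m_start = cosine_momentum(0, 100)    # close to 0.996
m_end = cosine_momentum(100, 100)    # 1.0, so the momentum branch stops moving
updated = ema_update([1.0, 2.0], [0.0, 0.0], m=0.5)
print(updated)  # [0.5, 1.0]
```

A 'constant' schedule would simply return initial_momentum at every step.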

class eztorch.models.siamese.ShuffleMomentumSiameseBaseModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, initial_momentum=0.996, scheduler_momentum='cosine', shuffle_bn=True, num_devices=1, simulate_n_devices=8)

Abstract class to represent siamese models with a momentum branch and the option to shuffle input elements in the momentum branch, applying the normalization shuffling trick from MoCo.

Subclasses should implement the training_step method.

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.996
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'cosine'
shuffle_bn (bool, optional) – If True, apply shuffle normalization trick from MoCo.
Default: True
num_devices (int, optional) – Number of devices used to train the model in each node.
Default: 1
simulate_n_devices (int, optional) – Number of devices to simulate to apply shuffle trick. Requires shuffle_bn to be True and num_devices to be 1.
Default: 8
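The shuffle trick behind shuffle_bn can be sketched as a permute-then-restore round trip: the momentum branch sees the samples in a random order, and its outputs are put back in the original order before being compared with the online branch. The sketch below uses plain Python lists instead of tensors, and the helper names shuffle_batch and unshuffle_batch are hypothetical; it illustrates the idea from MoCo rather than the eztorch implementation.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def shuffle_batch(batch: Sequence[T], rng: random.Random) -> Tuple[List[T], List[int]]:
    """Return the shuffled batch and the permutation used to shuffle it."""
    perm = list(range(len(batch)))
    rng.shuffle(perm)
    return [batch[i] for i in perm], perm


def unshuffle_batch(shuffled: Sequence[T], perm: Sequence[int]) -> List[T]:
    """Invert shuffle_batch so outputs line up with the original order."""
    # inverse[j] is the shuffled position that holds original element j.
    inverse = sorted(range(len(perm)), key=lambda pos: perm[pos])
    return [shuffled[pos] for pos in inverse]


rng = random.Random(0)
batch = ["x0", "x1", "x2", "x3", "x4"]
shuffled, perm = shuffle_batch(batch, rng)
# A real momentum branch would run its forward pass on `shuffled` here.
restored = unshuffle_batch(shuffled, perm)
assert restored == batch
```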

class eztorch.models.siamese.ShuffleMomentumQueueBaseModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, initial_momentum=0.999, scheduler_momentum='constant', shuffle_bn=False, num_devices=1, simulate_n_devices=8, queue=None, sym=False, use_keys=False)

Abstract class to represent siamese models with a momentum branch, optional input shuffling, and a queue.

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.999
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'constant'
shuffle_bn (bool, optional) – If True, apply shuffle normalization trick from MoCo.
Default: False
num_devices (int, optional) – Number of devices used to train the model in each node.
Default: 1
simulate_n_devices (int, optional) – Number of devices to simulate to apply shuffle trick. Requires shuffle_bn to be True and num_devices to be 1.
Default: 8
queue (Optional[DictConfig], optional) – Config to build a queue.
Default: None
sym (bool, optional) – If True, symmetrize the loss.
Default: False
use_keys (bool, optional) – If True, add keys to negatives.
Default: False
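The role of the queue parameter can be pictured as a fixed-size, first-in first-out store of key features that serve as negatives, as in MoCo. The sketch below is a plain-Python illustration under that assumption; the class FeatureQueue and its methods are hypothetical and do not reflect the eztorch queue config or implementation.

```python
from collections import deque
from typing import Deque, List


class FeatureQueue:
    """Hypothetical fixed-size FIFO queue of key features (MoCo-style)."""

    def __init__(self, size: int) -> None:
        # A deque with maxlen drops the oldest entries automatically when full.
        self._items: Deque[List[float]] = deque(maxlen=size)

    def enqueue(self, keys: List[List[float]]) -> None:
        """Append the newest keys at the end of the queue."""
        self._items.extend(keys)

    def negatives(self) -> List[List[float]]:
        """Return the stored keys, oldest first."""
        return list(self._items)


queue = FeatureQueue(size=3)
queue.enqueue([[1.0], [2.0]])
queue.enqueue([[3.0], [4.0]])
print(queue.negatives())  # [[2.0], [3.0], [4.0]]
```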

Self-Supervised Models

SimCLR

class eztorch.models.siamese.SimCLRModel(trunk, optimizer, projector=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, temp=0.1)

SimCLR model that can be configured as version 1 or 2.

References

SimCLR: https://arxiv.org/abs/2002.05709
SimCLRv2: https://arxiv.org/abs/2006.10029

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per crop resolution.
Default: False
temp (float, optional) – Temperature parameter to scale the online similarities.
Default: 0.1
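The effect of temp can be seen in a minimal InfoNCE sketch for a single query. It assumes L2-normalized feature vectors (as with normalize_outputs=True) and an explicit list of negatives; SimCLR itself draws negatives from the other samples in the batch, so this is an illustration of the temperature scaling rather than the eztorch loss. The helper names dot and info_nce are hypothetical.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce(
    query: List[float],
    positive: List[float],
    negatives: List[List[float]],
    temp: float = 0.1,
) -> float:
    """InfoNCE for one query: negative log-softmax of the positive logit."""
    logits = [dot(query, positive) / temp]
    logits += [dot(query, negative) / temp for negative in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return log_sum_exp - logits[0]


query = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0]]
sharp = info_nce(query, positive, negatives, temp=0.1)
soft = info_nce(query, positive, negatives, temp=1.0)
assert sharp < soft  # a lower temperature sharpens the softmax
```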

MoCo

class eztorch.models.siamese.MoCoModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, initial_momentum=0.999, scheduler_momentum='constant', shuffle_bn=True, num_devices=1, simulate_n_devices=8, queue=None, sym=False, use_keys=False, temp=0.2)

MoCo model that can be configured as version 1, 2, 2+, or 3.

References

MoCo: https://arxiv.org/abs/1911.05722
MoCov2: https://arxiv.org/abs/2003.04297
MoCov2+: https://arxiv.org/abs/2011.10566
MoCov3: https://arxiv.org/abs/2104.02057

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.999
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'constant'
shuffle_bn (bool, optional) – If True, apply shuffle normalization trick from MoCo.
Default: True
num_devices (int, optional) – Number of devices used to train the model in each node.
Default: 1
simulate_n_devices (int, optional) – Number of devices to simulate to apply shuffle trick. Requires shuffle_bn to be True and num_devices to be 1.
Default: 8
queue (Optional[DictConfig], optional) – Config to build a queue.
Default: None
sym (bool, optional) – If True, symmetrize the loss.
Default: False
use_keys (bool, optional) – If True, add keys to negatives.
Default: False
temp (float, optional) – Temperature parameter to scale the online similarities.
Default: 0.2

MoCo v3

class eztorch.models.siamese.MoCov3Model(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, initial_momentum=0.99, scheduler_momentum='cosine', temp=1.0)

MoCov3 model that can be configured as in the paper.

References

MoCov3: https://arxiv.org/abs/2104.02057

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.99
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'cosine'
temp (float, optional) – Temperature parameter to scale the online similarities.
Default: 1.0

ReSSL

class eztorch.models.siamese.ReSSLModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, initial_momentum=0.999, scheduler_momentum='constant', shuffle_bn=True, num_devices=1, simulate_n_devices=8, queue=None, sym=False, use_keys=False, temp=0.1, temp_m=0.04, initial_temp_m=0.04)

ReSSL model.

References

ReSSL: https://proceedings.neurips.cc/paper/2021/file/14c4f36143b4b09cbc320d7c95a50ee7-Paper.pdf

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.999
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'constant'
shuffle_bn (bool, optional) – If True, apply shuffle normalization trick from MoCo.
Default: True
num_devices (int, optional) – Number of devices used to train the model in each node.
Default: 1
simulate_n_devices (int, optional) – Number of devices to simulate to apply shuffle trick. Requires shuffle_bn to be True and num_devices to be 1.
Default: 8
queue (Optional[DictConfig], optional) – Config to build a queue.
Default: None
sym (bool, optional) – If True, symmetrize the loss.
Default: False
use_keys (bool, optional) – If True, add keys to negatives.
Default: False
temp (float, optional) – Temperature parameter to scale the online similarities.
Default: 0.1
temp_m (float, optional) – Temperature parameter to scale the target similarities. Initial value if warmup applied.
Default: 0.04
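The interplay of temp and temp_m can be sketched as a cross-entropy between two similarity distributions: the target one, sharpened by the lower temp_m, and the online one, scaled by temp. The sketch assumes the similarities to the same set of stored features are already computed as plain lists; the function names log_softmax and relational_loss are hypothetical, and this is an illustration of the idea rather than the eztorch loss.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return [v - log_norm for v in logits]


def relational_loss(
    online_sims: List[float],
    target_sims: List[float],
    temp: float = 0.1,
    temp_m: float = 0.04,
) -> float:
    """Cross-entropy between the target and online similarity distributions."""
    log_online = log_softmax([s / temp for s in online_sims])
    target = [math.exp(v) for v in log_softmax([s / temp_m for s in target_sims])]
    return -sum(t * lo for t, lo in zip(target, log_online))


sims = [0.9, 0.1, -0.3]
matched = relational_loss(sims, sims, temp=0.1, temp_m=0.1)
mismatched = relational_loss(list(reversed(sims)), sims, temp=0.1, temp_m=0.1)
assert matched < mismatched  # agreeing with the target gives a lower loss
```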

SCE

class eztorch.models.siamese.SCEModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, num_splits=0, num_splits_per_combination=2, mutual_pass=False, initial_momentum=0.999, scheduler_momentum='constant', shuffle_bn=True, num_devices=1, simulate_n_devices=8, queue=None, sym=False, use_keys=False, temp=0.1, temp_m=0.05, start_warmup_temp_m=0.05, warmup_epoch_temp_m=0, warmup_scheduler_temp_m='cosine', coeff=0.5, warmup_scheduler_coeff='linear', warmup_epoch_coeff=0, start_warmup_coeff=1.0, scheduler_coeff=None, final_scheduler_coeff=0.0)

SCE model.

References

SCE: https://arxiv.org/pdf/2111.14585.pdf

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
num_splits (int, optional) – Number of splits to apply to each crop.
Default: 0
num_splits_per_combination (int, optional) – Number of splits used for combinations of features of each split.
Default: 2
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.999
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'constant'
shuffle_bn (bool, optional) – If True, apply shuffle normalization trick from MoCo.
Default: True
num_devices (int, optional) – Number of devices used to train the model in each node.
Default: 1
simulate_n_devices (int, optional) – Number of devices to simulate to apply shuffle trick. Requires shuffle_bn to be True and num_devices to be 1.
Default: 8
queue (Optional[DictConfig], optional) – Config to build a queue.
Default: None
sym (bool, optional) – If True, symmetrize the loss.
Default: False
use_keys (bool, optional) – If True, add keys to negatives.
Default: False
temp (float, optional) – Temperature parameter to scale the online similarities.
Default: 0.1
temp_m (float, optional) – Temperature parameter to scale the target similarities. Initial value if warmup applied.
Default: 0.05
start_warmup_temp_m (float, optional) – Initial temperature parameter to scale the target similarities in case of warmup.
Default: 0.05
warmup_epoch_temp_m (int, optional) – Number of warmup epochs for the target temperature.
Default: 0
warmup_scheduler_temp_m (Optional[str], optional) – Type of scheduler for warming up the target temperature. Options are: 'linear', 'cosine'.
Default: 'cosine'
coeff (float, optional) – Coefficient balancing the InfoNCE and relational aspects.
Default: 0.5
warmup_scheduler_coeff (Optional[str], optional) – Type of scheduler for warming up the coefficient. Options are: 'linear', 'cosine'.
Default: 'linear'
warmup_epoch_coeff (int, optional) – Number of warmup epochs for coefficient.
Default: 0
start_warmup_coeff (float, optional) – Starting value of coefficient for warmup.
Default: 1.0
scheduler_coeff (Optional[str], optional) – Type of scheduler for coefficient after warmup. Options are: 'linear', 'cosine'.
Default: None
final_scheduler_coeff (float, optional) – Final value of scheduler coefficient.
Default: 0.0
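A minimal sketch of how coeff, temp_m, and the coefficient warmup might fit together, following the formulation in the referenced SCE paper: the target distribution mixes a one-hot on the positive (weight coeff) with a softmax over similarities to the negatives (weight 1 - coeff). The function names and the exact linear warmup rule are assumptions for illustration, not the eztorch implementation.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    max_logit = max(logits)
    exps = [math.exp(v - max_logit) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sce_target(
    negative_sims: List[float], coeff: float = 0.5, temp_m: float = 0.05
) -> List[float]:
    """Target distribution: index 0 is the positive, the rest are negatives."""
    relational = softmax([s / temp_m for s in negative_sims])
    return [coeff] + [(1.0 - coeff) * r for r in relational]


def linear_warmup_coeff(
    epoch: int,
    coeff: float = 0.5,
    start_warmup_coeff: float = 1.0,
    warmup_epoch_coeff: int = 0,
) -> float:
    """Assumed linear warmup from start_warmup_coeff to coeff."""
    if warmup_epoch_coeff <= 0 or epoch >= warmup_epoch_coeff:
        return coeff
    progress = epoch / warmup_epoch_coeff
    return start_warmup_coeff + (coeff - start_warmup_coeff) * progress


target = sce_target([0.2, -0.1, 0.4], coeff=0.5, temp_m=0.05)
print(target[0])  # 0.5
print(linear_warmup_coeff(5, warmup_epoch_coeff=10))  # 0.75
```

With coeff=1.0 the target reduces to the one-hot InfoNCE target, which matches the default start_warmup_coeff=1.0 when a warmup is used.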

SCE Tokens

class eztorch.models.siamese.SCETokensModel(trunk, optimizer, projector=None, predictor=None, train_transform=None, val_transform=None, test_transform=None, normalize_outputs=True, num_global_crops=2, num_local_crops=0, mutual_pass=False, initial_momentum=0.999, scheduler_momentum='constant', queue={}, sym=False, use_keys=True, use_all_keys=False, num_prefix_tokens=0, num_out_tokens=32, positive_radius=0, keep_aligned_positive=True, temp=0.1, temp_m=0.05, start_warmup_temp_m=0.05, warmup_epoch_temp_m=0, warmup_scheduler_temp_m='cosine', coeff=0.5, normalize_positive_coeff=False, warmup_scheduler_coeff='linear', warmup_epoch_coeff=0, start_warmup_coeff=1, scheduler_coeff=None, final_scheduler_coeff=0)

SCE model for token outputs.

References

SCE: https://arxiv.org/pdf/2111.14585.pdf

Parameters:

trunk (DictConfig) – Config to build a trunk.
optimizer (DictConfig) – Config to build optimizers and schedulers.
projector (Optional[DictConfig], optional) – Config to build a projector.
Default: None
predictor (Optional[DictConfig], optional) – Config to build a predictor.
Default: None
train_transform (Optional[DictConfig], optional) – Config to perform transformation on train input.
Default: None
val_transform (Optional[DictConfig], optional) – Config to perform transformation on val input.
Default: None
test_transform (Optional[DictConfig], optional) – Config to perform transformation on test input.
Default: None
normalize_outputs (bool, optional) – If True, normalize outputs.
Default: True
num_global_crops (int, optional) – Number of global crops which are the first elements of each batch.
Default: 2
num_local_crops (int, optional) – Number of local crops which are the last elements of each batch.
Default: 0
mutual_pass (bool, optional) – If True, perform one pass per branch per crop resolution.
Default: False
initial_momentum (float, optional) – Initial value for the momentum update.
Default: 0.999
scheduler_momentum (str, optional) – Rule to update the momentum value.
Default: 'constant'
queue (DictConfig, optional) – Config to build a queue.
Default: {}
sym (bool, optional) – If True, symmetrize the loss.
Default: False
use_keys (bool, optional) – If True, add aligned keys to negatives.
Default: True
use_all_keys (bool, optional) – If True, add all keys to negatives.
Default: False
num_out_tokens (int, optional) – Number of expected output tokens.
Default: 32
positive_radius (int, optional) – Number of adjacent tokens to consider as positives.
Default: 0
keep_aligned_positive (bool, optional) – Whether to keep the aligned token as positive.
Default: True
temp (float, optional) – Temperature parameter to scale the online similarities.
Default: 0.1
temp_m (float, optional) – Temperature parameter to scale the target similarities. Initial value if warmup applied.
Default: 0.05
start_warmup_temp_m (float, optional) – Initial temperature parameter to scale the target similarities in case of warmup.
Default: 0.05
warmup_epoch_temp_m (int, optional) – Number of warmup epochs for the target temperature.
Default: 0
warmup_scheduler_temp_m (Optional[str], optional) – Type of scheduler for warming up the target temperature. Options are: 'linear', 'cosine'.
Default: 'cosine'
coeff (float, optional) – Coefficient balancing the InfoNCE and relational aspects.
Default: 0.5
normalize_positive_coeff (bool, optional) – If True, multiply coeff by a mask normalized over the number of positives instead of using coeff directly.
Default: False
warmup_scheduler_coeff (Optional[str], optional) – Type of scheduler for warming up the coefficient. Options are: 'linear', 'cosine'.
Default: 'linear'
warmup_epoch_coeff (int, optional) – Number of warmup epochs for coefficient.
Default: 0
start_warmup_coeff (float, optional) – Starting value of coefficient for warmup.
Default: 1
scheduler_coeff (Optional[str], optional) – Type of scheduler for coefficient after warmup. Options are: 'linear', 'cosine'.
Default: None
final_scheduler_coeff (float, optional) – Final value of scheduler coefficient.
Default: 0
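One plausible reading of positive_radius and keep_aligned_positive is a boolean mask over token positions, where token i of one view treats token j of the other view as a positive when j lies within positive_radius of i. The sketch below builds such a mask with plain lists; the function name positive_mask and this interpretation are assumptions, not the eztorch implementation.

```python
from typing import List


def positive_mask(
    num_out_tokens: int,
    positive_radius: int = 0,
    keep_aligned_positive: bool = True,
) -> List[List[bool]]:
    """mask[i][j] is True when token j counts as a positive for token i."""
    mask: List[List[bool]] = []
    for i in range(num_out_tokens):
        row: List[bool] = []
        for j in range(num_out_tokens):
            is_aligned = i == j
            is_neighbor = 0 < abs(i - j) <= positive_radius
            row.append(is_neighbor or (is_aligned and keep_aligned_positive))
        mask.append(row)
    return mask


# Defaults: only the aligned token is a positive (identity mask).
print(positive_mask(3))
# [[True, False, False], [False, True, False], [False, False, True]]

# Radius 1 without the aligned token: only the direct neighbors are positives.
print(positive_mask(3, positive_radius=1, keep_aligned_positive=False)[1])
# [True, False, True]
```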