Datasets

A dataset iterates over data from a source. In Eztorch, to properly use datasets for training, validation, or evaluation, one should go through datamodules.

Eztorch defines datasets and various tools to make proper use of datasets from Torchvision as well as custom ones.

Dataset Wrapper

General

class eztorch.datasets.DictDataset(dataset)[source]

Wrapper around a Dataset to have a dictionary as input for models.

Parameters:

dataset (Dataset) – dataset to wrap around.
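The wrapping pattern can be sketched in plain Python. The dictionary keys below ("input", "label", "idx") are hypothetical, chosen for illustration; the actual output format is defined by Eztorch:

```python
# Illustrative sketch of the dict-wrapping pattern, not Eztorch's
# actual DictDataset: each sample from the wrapped dataset is
# repackaged as a dictionary before being fed to a model.
class DictWrapper:
    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        # Assume the wrapped dataset yields (input, label) tuples.
        sample, label = self.dataset[idx]
        return {"input": sample, "label": label, "idx": idx}

base = [(0.5, 1), (0.7, 0)]  # stands in for any (sample, label) dataset
wrapped = DictWrapper(base)
print(wrapped[0])  # {'input': 0.5, 'label': 1, 'idx': 0}
```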

class eztorch.datasets.DatasetFolder(root, loader, extensions=None, transform=None, target_transform=None, is_valid_file=None, class_ratio=1.0, sample_ratio=1.0, class_list=None, sample_list_path=None, seed=None)[source]

A generic data loader for samples arranged in class subdirectories.

The default directory structure can be customized by overriding the find_classes() method.

Parameters:
  • root (str) – Root directory path.

  • loader (Callable[[str], Any]) – A function to load a sample given its path.

  • extensions (Optional[Tuple[str, ...]], optional) – A list of allowed extensions. extensions and is_valid_file should not both be passed.

    Default: None

  • transform (Optional[Callable], optional) – A function/transform that takes in a sample and returns a transformed version. E.g, transforms.RandomCrop for images.

    Default: None

  • target_transform (Optional[Callable], optional) – A function/transform that takes in the target and transforms it.

    Default: None

  • is_valid_file (Optional[Callable[[str], bool]], optional) – A function that takes the path of a file and checks whether it is a valid file (used to filter out corrupt files). extensions and is_valid_file should not both be passed.

    Default: None

  • class_ratio (float, optional) – Ratio of classes to use if class_list is None.

    Default: 1.0

  • sample_ratio (float, optional) – Ratio of samples to use.

    Default: 1.0

  • class_list (Optional[Iterable[str]], optional) – If not None, list of classes to use.

    Default: None

  • sample_list_path (Optional[str], optional) – If not None, path to a file listing the samples to use.

    Default: None

  • seed (Optional[int], optional) – If not None, seed used to randomly choose class and samples.

    Default: None
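How class_ratio and seed might interact can be illustrated with a small sketch — an assumption about the behavior, shown for intuition only: keep a reproducible random fraction of the discovered classes.

```python
import random

# Toy sketch: keep a random fraction of classes, deterministically
# when a seed is given (illustrative, not the actual implementation).
def subsample_classes(classes, class_ratio, seed=None):
    rng = random.Random(seed)
    n_keep = max(1, round(len(classes) * class_ratio))
    kept = rng.sample(classes, n_keep)
    return sorted(kept)

classes = ["cat", "dog", "fish", "frog"]
half = subsample_classes(classes, 0.5, seed=0)
print(half)  # two classes, deterministic for a fixed seed
```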

class eztorch.datasets.DumbDataset(shape, len_dataset)[source]

Dumb dataset that always provides random data. Useful for testing models or pipelines.

Parameters:
  • shape (List[int]) – shape of data to generate.

  • len_dataset (int) – length of the dataset. Used by dataloaders.
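A minimal stand-in for such a dataset, using nested Python lists in place of tensors, might look like this (an illustrative sketch, not the actual implementation):

```python
import random

# Toy "dumb" dataset serving random data of a fixed shape -- handy
# for smoke-testing a training pipeline without real data.
class RandomDataset:
    def __init__(self, shape, len_dataset):
        self.shape = shape
        self.len_dataset = len_dataset

    def __len__(self):
        return self.len_dataset

    def _rand(self, shape):
        # Recursively build a nested list of random floats.
        if not shape:
            return random.random()
        return [self._rand(shape[1:]) for _ in range(shape[0])]

    def __getitem__(self, idx):
        return self._rand(self.shape)

ds = RandomDataset([3, 32, 32], len_dataset=10)
sample = ds[0]
print(len(ds), len(sample), len(sample[0]))  # 10 3 32
```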

Image

class eztorch.datasets.DictCIFAR10(root, train=True, transform=None, target_transform=None, download=False)[source]

CIFAR10 dict dataset.

Parameters:
  • root (str) – Root directory of dataset where directory cifar-10-batches-py exists or will be saved to if download is set to True.

  • train (bool, optional) – If True, creates dataset from training set, otherwise creates from test set.

    Default: True

  • transform (Optional[Callable], optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

    Default: None

  • target_transform (Optional[Callable], optional) – A function/transform that takes in the target and transforms it.

    Default: None

  • download (bool, optional) – If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.

    Default: False

class eztorch.datasets.DictCIFAR100(root, train=True, transform=None, target_transform=None, download=False)[source]

CIFAR100 dict dataset.

Parameters:
  • root (str) – Root directory of dataset where directory cifar-100-python exists or will be saved to if download is set to True.

  • train (bool, optional) – If True, creates dataset from training set, otherwise creates from test set.

    Default: True

  • transform (Optional[Callable], optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

    Default: None

  • target_transform (Optional[Callable], optional) – A function/transform that takes in the target and transforms it.

    Default: None

  • download (bool, optional) – If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.

    Default: False

Video

class eztorch.datasets.LabeledVideoDataset(labeled_video_paths, clip_sampler, transform=None, decode_audio=True, decoder='pyav', decoder_args={})[source]

LabeledVideoDataset handles the storage, loading, decoding and clip sampling for a video dataset. It assumes each video is stored as either an encoded video (e.g. mp4, avi) or a frame video (e.g. a folder of jpg or png files).

Parameters:
  • labeled_video_paths (list[tuple[str, dict | None]]) – List containing video file paths and associated labels. If a video path is a folder, it is interpreted as a frame video; otherwise it must be an encoded video.

  • clip_sampler (ClipSampler) – Defines how clips should be sampled from each video.

  • transform (Optional[Callable[[dict], Any]], optional) – This callable is evaluated on the clip output before the clip is returned. It can be used for user defined preprocessing and augmentations on the clips.

    Default: None

  • decode_audio (bool, optional) – If True, also decode audio from video.

    Default: True

  • decoder (str, optional) – Defines which decoder is used to decode the video.

    Default: 'pyav'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

class eztorch.datasets.Hmdb51(data_path, clip_sampler, transform=None, video_path_prefix='', split_id=1, split_type='train', decode_audio=True, decoder='pyav', decoder_args={})[source]

A helper function to create a LabeledVideoDataset object for the HMDB51 dataset.

Parameters:
  • data_path (pathlib.Path) –

    Path to the data. The path type defines how the data should be read:

    • For a file path, the file is read and each line is parsed into a video path and label.

    • For a directory, the directory structure defines the classes (i.e. each subdirectory is a class).

  • clip_sampler (ClipSampler) – Defines how clips should be sampled from each video. See the clip sampling documentation for more information.

  • video_sampler – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.

  • transform (Optional[Callable[[dict], Any]], optional) – This callable is evaluated on the clip output before the clip is returned. It can be used for user defined preprocessing and augmentations to the clips. See the LabeledVideoDataset class for clip output format.

    Default: None

  • video_path_prefix (str, optional) – Path to root directory with the videos that are loaded in LabeledVideoDataset. All the video paths before loading are prefixed with this path.

    Default: ''

  • split_id (int, optional) – Fold id to be loaded. Options are 1, 2 or 3.

    Default: 1

  • split_type (str, optional) – Split/Fold type to be loaded. Options are: 'train', 'test' or 'unused'.

    Default: 'train'

  • decoder (str, optional) – Defines which backend should be used to decode videos.

    Default: 'pyav'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

Return type:

LabeledVideoDataset

Returns:

The dataset instantiated.
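The file-path convention described above (each line parsed into a video path and label) can be sketched with a toy parser. The helper name and the exact "path label" line format are assumptions for illustration, not Eztorch's actual loader:

```python
import os

# Toy sketch of the file-path convention: each line holds a video
# path and an integer label, optionally prefixed with a root path.
def parse_annotation_lines(lines, video_path_prefix=""):
    labeled_paths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, label = line.rsplit(" ", 1)
        full_path = os.path.join(video_path_prefix, path)
        labeled_paths.append((full_path, {"label": int(label)}))
    return labeled_paths

lines = ["brush_hair/clip1.avi 0", "cartwheel/clip7.avi 1"]
print(parse_annotation_lines(lines, video_path_prefix="/data/hmdb51"))
```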

class eztorch.datasets.Kinetics(data_path, clip_sampler, transform=None, video_path_prefix='', decode_audio=True, decoder='pyav', decoder_args={})[source]

A helper function to create a LabeledVideoDataset object for the Kinetics dataset.

Parameters:
  • data_path (str) –

    Path to the data. The path type defines how the data should be read:

    • For a file path, the file is read and each line is parsed into a video path and label.

    • For a directory, the directory structure defines the classes (i.e. each subdirectory is a class).

  • clip_sampler (ClipSampler) – Defines how clips should be sampled from each video. See the clip sampling documentation for more information.

  • transform (Optional[Callable[[Dict[str, Any]], Dict[str, Any]]], optional) – This callable is evaluated on the clip output before the clip is returned. It can be used for user defined preprocessing and augmentations to the clips. See the LabeledVideoDataset class for clip output format.

    Default: None

  • video_path_prefix (str, optional) – Path to root directory with the videos that are loaded in LabeledVideoDataset. All the video paths before loading are prefixed with this path.

    Default: ''

  • decode_audio (bool, optional) – If True, also decode audio from video.

    Default: True

  • decoder (str, optional) – Defines which decoder is used to decode the video.

    Default: 'pyav'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

Return type:

LabeledVideoDataset

Returns:

The dataset instantiated.

eztorch.datasets.soccernet_dataset(data_path, transform=None, video_path_prefix='', decoder='frame', decoder_args={}, label_args=None, features_args=None, task=SoccerNetTask.ACTION)[source]

A helper function to create a SoccerNet object.

Parameters:
  • data_path (str) – Path to the data.

  • transform (Optional[Callable[[dict[str, Any]], dict[str, Any]]], optional) – This callable is evaluated on the clip output before the clip is returned.

    Default: None

  • video_path_prefix (str, optional) – Path to root directory with the videos that are loaded in SoccerNet. All the video paths before loading are prefixed with this path.

    Default: ''

  • decoder (str, optional) – Defines which decoder is used to decode the video.

    Default: 'frame'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

  • label_args (DictConfig | None, optional) – Arguments to configure the labels.

    Default: None

  • features_args (DictConfig | None, optional) – Arguments to configure the extracted features.

    Default: None

  • task (SoccerNetTask, optional) – The action spotting task; action and ball are supported.

    Default: SoccerNetTask.ACTION

Return type:

SoccerNet

Returns:

The dataset instantiated.

class eztorch.datasets.SoccerNet(annotated_videos, transform=None, decoder='frame', decoder_args={}, label_args=None, features_args=None, task=SoccerNetTask.ACTION)[source]

SoccerNet handles the storage, loading, decoding and clip sampling for a SoccerNet dataset. It assumes each video is stored as either an encoded video (e.g. mp4, avi) or a frame video (e.g. a folder of jpg or png files).

Parameters:
  • annotated_videos (SoccerNetPaths) – List containing video annotations.

  • transform (Optional[Callable[[dict], Any]], optional) – This callable is evaluated on the clip output before the clip is returned. It can be used for user defined preprocessing and augmentations on the clips.

    Default: None

  • decoder (str, optional) – Defines which decoder is used to decode the video.

    Default: 'frame'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

eztorch.datasets.spot_dataset(data_path, transform=None, video_path_prefix='', decoder='frame', decoder_args={}, label_args=None, features_args=None, dataset=SpotDatasets.TENNIS)[source]

A helper function to create a Spot object.

Parameters:
  • data_path (str) – Path to the data.

  • transform (Optional[Callable[[dict[str, Any]], dict[str, Any]]], optional) – This callable is evaluated on the clip output before the clip is returned.

    Default: None

  • video_path_prefix (str, optional) – Path to root directory with the videos that are loaded. All the video paths before loading are prefixed with this path.

    Default: ''

  • decoder (str, optional) – Defines which decoder is used to decode the video.

    Default: 'frame'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

  • label_args (DictConfig | None, optional) – Arguments to configure the labels.

    Default: None

  • features_args (DictConfig | None, optional) – Arguments to configure the extracted features.

    Default: None

  • dataset (SpotDatasets, optional) – The spotting dataset.

    Default: SpotDatasets.TENNIS

Return type:

Spot

Returns:

The dataset instantiated.

class eztorch.datasets.Spot(annotated_videos, transform=None, decoder='frame', decoder_args={}, label_args=None, features_args=None, dataset=SpotDatasets.TENNIS)[source]

Spot handles the storage, loading, decoding and clip sampling for a Spot dataset. It assumes each video is stored as a frame video (e.g. a folder of jpg or png files).

Parameters:
  • annotated_videos (SpotPaths) – List containing video annotations.

  • transform (Optional[Callable[[dict], Any]], optional) – This callable is evaluated on the clip output before the clip is returned. It can be used for user defined preprocessing and augmentations on the clips.

    Default: None

  • decoder (str, optional) – Defines which decoder is used to decode the video.

    Default: 'frame'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

class eztorch.datasets.Ucf101(data_path, clip_sampler, transform=None, video_path_prefix='', split_id=1, split_type='train', decode_audio=True, decoder='pyav', decoder_args={})[source]

A helper function to create a LabeledVideoDataset object for the UCF101 dataset.

Parameters:
  • data_path (str) –

    Path to the data. The path type defines how the data should be read:

    • For a file path, the file is read and each line is parsed into a video path and label.

    • For a directory, the directory structure defines the classes (i.e. each subdirectory is a class).

  • clip_sampler (ClipSampler) – Defines how clips should be sampled from each video. See the clip sampling documentation for more information.

  • video_sampler – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.

  • transform (Optional[Callable[[dict[str, Any]], dict[str, Any]]], optional) – This callable is evaluated on the clip output before the clip is returned. It can be used for user defined preprocessing and augmentations to the clips. See the LabeledVideoDataset class for clip output format.

    Default: None

  • video_path_prefix (str, optional) – Path to root directory with the videos that are loaded in LabeledVideoDataset. All the video paths before loading are prefixed with this path.

    Default: ''

  • split_id (int, optional) – Fold id to be loaded. Options are: 1, 2 or 3.

    Default: 1

  • split_type (str, optional) – Split/Fold type to be loaded. Options are: 'train' or 'test'.

    Default: 'train'

  • decode_audio (bool, optional) – If True, also decode audio from video.

    Default: True

  • decoder (str, optional) – Defines which decoder is used to decode the video.

    Default: 'pyav'

  • decoder_args (DictConfig, optional) – Arguments to configure the decoder.

    Default: {}

Return type:

LabeledVideoDataset

Returns:

The dataset instantiated.

Clip samplers

Clip samplers are used in video datasets to sample clips according to a rule when iterating over videos.
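As a toy illustration of the idea (not Eztorch's ClipSampler API), a random clip sampler picks a uniformly random start time so that the clip fits inside the video:

```python
import random

# Toy random clip sampler: given a video duration and a clip duration
# in seconds, return a (start, end) window that fits inside the video.
def random_clip(video_duration, clip_duration, rng=random):
    max_start = max(0.0, video_duration - clip_duration)
    start = rng.uniform(0.0, max_start)
    return start, start + clip_duration

start, end = random_clip(video_duration=10.0, clip_duration=2.0)
print(0.0 <= start and end <= 10.0)  # True
```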

Decoders

Decoders are used in video datasets to decode the clips.
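For instance, a 'frame' decoder conceptually maps a requested time window to the frame files that fall inside it, given a fixed frame rate. The sketch below is a toy illustration with hypothetical file names; a real decoder returns decoded image tensors:

```python
# Toy sketch of a "frame" decoder's selection step: pick the frame
# files inside [start_sec, end_sec] for a video stored as numbered
# image files at a fixed fps (illustrative file-name scheme).
def frames_in_window(num_frames, fps, start_sec, end_sec):
    first = max(0, int(start_sec * fps))
    last = min(num_frames - 1, int(end_sec * fps))
    return [f"frame_{i:06d}.jpg" for i in range(first, last + 1)]

clip = frames_in_window(num_frames=250, fps=25, start_sec=1.0, end_sec=2.0)
print(clip[0], clip[-1], len(clip))  # frame_000025.jpg frame_000050.jpg 26
```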

Collate functions

Collate functions are used to collate samples into batches for the dataloader.
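As a toy illustration (not Eztorch's actual collate functions), collating a list of dictionary samples produces one dictionary of batched values; plain Python lists stand in here for stacked tensors:

```python
# Toy collate: merge a list of dict samples into one dict of lists,
# keyed like the individual samples.
def collate_dicts(samples):
    keys = samples[0].keys()
    return {k: [s[k] for s in samples] for k in keys}

batch = collate_dicts([
    {"input": [0.1, 0.2], "label": 0},
    {"input": [0.3, 0.4], "label": 1},
])
print(batch)  # {'input': [[0.1, 0.2], [0.3, 0.4]], 'label': [0, 1]}
```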