src.data package¶

Submodules¶

src.data.balanced_image_data_reader module¶

This file implements an image data reader which balances data.

class src.data.balanced_image_data_reader.BalancedImageDataReader(folder: Optional[str] = None)¶

Bases: ImageDataReader

Class that reads images from folders in a balanced way: every class should contribute approximately the same number of images. As a result, some images from underrepresented classes might appear twice and some images from overrepresented classes might not appear at all. Note: Has higher memory requirements than other data readers.
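The balancing idea can be illustrated on plain label lists. The following is a minimal sketch under stated assumptions, not the reader's actual implementation: the function name balanced_indices and the choice of the mean class size as the target count are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def balanced_indices(labels: Sequence[int], seed: int = 0) -> List[int]:
    """Return sample indices so that every class appears equally often."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Assumption: the target count per class is the mean class size.
    target = max(1, len(labels) // len(by_class))

    chosen: List[int] = []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) >= target:
            # Overrepresented class: some samples are left out.
            chosen.extend(rng.sample(indices, target))
        else:
            # Underrepresented class: some samples are repeated.
            chosen.extend(indices)
            chosen.extend(rng.choices(indices, k=target - len(indices)))
    rng.shuffle(chosen)
    return chosen


labels = [0] * 8 + [1] * 3 + [2] * 1
picked = balanced_indices(labels)
counts = {c: sum(1 for i in picked if labels[i] == c) for c in set(labels)}
print(counts)
```

Keeping the index lists for all classes in memory at once is one plausible reason for the higher memory requirements noted above.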

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the image dataset that is specified in an array

Parameters:

which_set – Train, val or test set
parameters – Parameter dictionary

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the images into a dataset.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the image folders into a dataset and also converts the emotion labels to the three emotion space.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

src.data.balanced_plant_exp_reader module¶

This data reader reads the PlantSpikerBox data from the experiments.

class src.data.balanced_plant_exp_reader.BalancedPlantExperimentDataReader(folder: str = 'data/plant', default_label_mode: str = 'expected')¶

Bases: ExperimentDataReader

This data reader reads the plant spiker box files from the experiments and balances the classes exactly.

cleanup(parameters: Optional[Dict] = None) → None¶

Function that cleans up the big data arrays for memory optimization.

Parameters: parameters – Parameter dictionary

get_input_shape(parameters: Dict) → Tuple[int]¶

Returns the shape of a preprocessed sample.

Parameters: parameters – Parameter dictionary
Returns: Tuple that is the shape of the sample.

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

This function returns labels for the dataset

Parameters:

which_set – Which set to get the labels for.
parameters – Additional parameters.

Returns:

Label numpy array

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the plant data into a dataset.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the plant data into a dataset and also converts the emotion labels to the three emotion space.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

src.data.balanced_watch_exp_reader module¶

This data reader reads the watch data from the experiments.

class src.data.balanced_watch_exp_reader.BalancedWatchExperimentDataReader(folder: str = 'data/watch', default_label_mode: str = 'expected')¶

Bases: ExperimentDataReader

This data reader reads the watch data files from the experiments and balances the classes exactly.

get_input_shape(parameters: Dict) → tuple¶

Returns the shape of a preprocessed sample.

Parameters: parameters – Parameter dictionary
Returns: Tuple that is the shape of the sample.

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

This function returns labels for the dataset

Parameters:

which_set – Which set to get the labels for.
parameters – Additional parameters.

Returns:

Label numpy array

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the watch data into a dataset.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the watch data into a dataset and also converts the emotion labels to the three emotion space.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

src.data.classwise_speech_data_reader module¶

This file implements classwise data reading for speech data.

class src.data.classwise_speech_data_reader.ClasswiseSpeechDataReader(name: str = 'classwise_speech', folder: Optional[str] = None)¶

Bases: DataReader

Class that reads the speech datasets per class. This means that the data extraction methods return one array per class. This is required for HMM and GMM classifiers which need all data for one class at the same time and do not support batching like NNs.

get_crema_samples(crema_d: DatasetV2, class_name: str) → ndarray¶

Gets the samples from a specified class from the crema dataset

Parameters:

crema_d – The entire crema dataset instance
class_name – The class to extract from crema_d

Returns:

A numpy array with the extracted data

get_file_samples(emotion_class: str, data_dir: str) → ndarray¶

Extract the data from a specific class from disk

Parameters:

emotion_class – The class to load from disk
data_dir – The directory on disk that contains the data

Returns:

Numpy array with the data

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the speech dataset that is specified in an array

Parameters:

which_set – Train, val or test set
parameters – Parameter dictionary

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → Generator[Tuple[ndarray, str], None, None]¶

Main data reading function which reads the audio files and then returns them one class at a time.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional parameters

Returns:

Generator that yields (array, class name)
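The class-wise access pattern, where a consumer such as an HMM or GMM receives all samples of one class at once, can be sketched as follows. This is an illustration on plain lists under stated assumptions: the helper iterate_classwise and the class names are hypothetical and not part of the package.

```python
from typing import Dict, Generator, List, Sequence, Tuple


def iterate_classwise(
    samples: Sequence[List[float]], class_names: Sequence[str]
) -> Generator[Tuple[List[List[float]], str], None, None]:
    """Yield (all samples of one class, class name), one class at a time."""
    grouped: Dict[str, List[List[float]]] = {}
    for sample, name in zip(samples, class_names):
        grouped.setdefault(name, []).append(sample)
    # Classes are yielded in alphabetical order to keep the output stable.
    for name in sorted(grouped):
        yield grouped[name], name


samples = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
names = ["anger", "joy", "anger"]
result = list(iterate_classwise(samples, names))
for class_samples, class_name in result:
    print(class_name, len(class_samples))
```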

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → Generator[Tuple[ndarray, str], None, None]¶

Main data reading function which reads the audio data from disk.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

Generator that yields (array, class name)

static get_waveform_and_label(file_path: bytes) → Tuple[Tensor, Tensor]¶

Preprocessing function for the audio files that are read from the data folder. Files are read, decoded and padded or truncated.

Parameters: file_path – The path of one audio file to read.
Returns: Audio tensor and label tensor in a tuple
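The "padded or truncated" step gives every waveform the same length. A minimal sketch on plain Python lists is shown below; the function name pad_or_truncate, zero padding at the end, and the target length are assumptions made for illustration, and the real method works on tensors.

```python
from typing import List


def pad_or_truncate(waveform: List[float], target_length: int) -> List[float]:
    """Return a copy of the waveform with exactly target_length samples."""
    if len(waveform) >= target_length:
        # Too long: keep only the first target_length samples.
        return waveform[:target_length]
    # Too short: append zeros (silence) until the length matches.
    return waveform + [0.0] * (target_length - len(waveform))


print(pad_or_truncate([1.0, 2.0, 3.0], 2))
print(pad_or_truncate([1.0], 3))
```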

static map_emotions(data: ndarray, labels: ndarray)¶

Conversion function that is applied when three emotion labels are required.

Parameters:

data – The emotions data.
labels – The labels that are to be converted to three emotions.

static process_crema(x: ndarray, y: int) → Tuple[Tensor, Tensor]¶

Preprocessing function for the crema dataset read from tensorflow_datasets package.

Parameters:

x – The audio data
y – The label data

Returns:

Processed audio and label data

src.data.comparison_image_data_reader module¶

This file implements the data reading functionality for the image data from the comparison dataset.

class src.data.comparison_image_data_reader.ComparisonImageDataReader(name: str = 'comparison_image', folder: Optional[str] = None)¶

Bases: DataReader

Class that reads the comparison dataset image data

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the image dataset in an array

Parameters:

which_set – Train, val or test set - only test allowed here
parameters – Parameter dictionary

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the images into a dataset

Parameters:

which_set – Which dataset to use - only test is allowed here
batch_size – The batch size for the resulting dataset
parameters – Additional parameters

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the image folders into a dataset and also converts the emotion labels to the three emotion space.

Parameters:

which_set – Which dataset to use - test only
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

src.data.comparison_speech_data_reader module¶

This file implements the data reading functionality for the speech data from the comparison dataset.

class src.data.comparison_speech_data_reader.ComparisonSpeechDataReader(name: str = 'comparison_speech', folder: Optional[str] = None)¶

Bases: DataReader

Class that reads the comparison speech dataset

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the speech dataset that is specified in an array

Parameters:

which_set – Train, val or test set
parameters – Parameter dictionary

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the audio files into a dataset

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional parameters

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the audio data from disk.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

static get_waveform_and_label(file_path: bytes) → Tuple[Tensor, Tensor]¶

Preprocessing function for the audio files that are read from the data folder. Files are read, decoded and padded or truncated.

Parameters: file_path – The path of one audio file to read.
Returns: Audio tensor and label tensor in a tuple

static map_emotions(data: ndarray, labels: ndarray)¶

Conversion function that is applied when three emotion labels are required.

Parameters:

data – The emotional data.
labels – The labels that need to be converted to three emotions.

static set_tensor_shapes(x: Tensor, y: Tensor) → Tuple[Tensor, Tensor]¶

Function that sets the tensor shapes in the dataset manually. This fixes an issue where using Dataset.map and numpy_function causes the tensor shape to be unknown. See the issue here: https://github.com/tensorflow/tensorflow/issues/47032

Parameters:

x – The speech tensor
y – The labels tensor

Returns:

Tuple with speech and labels tensor

src.data.comparison_text_data_reader module¶

This file implements the data reading functionality for text data from the comparison dataset.

class src.data.comparison_text_data_reader.ComparisonTextDataReader(folder: Optional[str] = None)¶

Bases: DataReader

Class that reads the CSV datasets from the data/train/text folder

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the text dataset that is specified in an array

Parameters:

which_set – Train, val or test set
parameters – Parameter dict (unused)

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the CSV file into a dataset

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional parameters

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the CSV file into a dataset and also converts the emotion labels to the three emotion space.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

src.data.data_factory module¶

This file implements a factory for easy access to data readers and datasets

class src.data.data_factory.DataFactory¶

Bases: object

The Data Factory returning data readers or data sets

static get_data_reader(data_type: str, data_folder=None) → DataReader¶

This factory method returns a data reader instance

Parameters:

data_type – The type of data to return the reader for
data_folder – Override data folder for the data reader

Raises:

ValueError – If the data_type does not exist

Returns:

A DataReader for the specified data type
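The lookup behaviour described above, including the ValueError for unknown types, can be sketched with a small registry. Everything in this example is hypothetical: StubReader stands in for the real reader classes, and the keys "text" and "image" are assumed; the data_type strings accepted by the real factory are not listed in this reference.

```python
from typing import Callable, Dict, Optional


class StubReader:
    """Hypothetical stand-in for a DataReader subclass."""

    def __init__(self, name: str, folder: Optional[str] = None) -> None:
        self.name = name
        self.folder = folder


# Hypothetical registry; the keys of the real factory may differ.
READERS: Dict[str, Callable[[Optional[str]], StubReader]] = {
    "text": lambda folder: StubReader("text", folder),
    "image": lambda folder: StubReader("image", folder),
}


def get_data_reader(data_type: str, data_folder: Optional[str] = None) -> StubReader:
    """Return a reader for data_type or raise ValueError if it is unknown."""
    if data_type not in READERS:
        raise ValueError(f"The data type {data_type!r} does not exist.")
    return READERS[data_type](data_folder)


reader = get_data_reader("image", "tmp/images")
print(reader.name, reader.folder)
```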

static get_dataset(data_type: str, which_set: Set, emotions: str = 'neutral_ekman', batch_size: int = 64, data_folder: Optional[str] = None, parameters: Optional[Dict] = None) → DatasetV2¶

Get a specific dataset from a data reader

Parameters:

data_type – The data type to consider
which_set – Which dataset to return: train, val or test
emotions – Which emotion set to use: neutral_ekman or three
batch_size – The batch size for the returned dataset
data_folder – The folder where data is stored
parameters – Additional parameters for creating data

Raises:

ValueError – If the emotion type is not available

Returns:

Dataset instance that was requested

src.data.data_reader module¶

This file implements the basic functions for data reading

class src.data.data_reader.DataReader(name: str, folder: str)¶

Bases: ABC

The DataReader class is responsible for creating a tensorflow Dataset which is used for training and evaluating the emotion detection models.

cleanup(parameters: Optional[Dict] = None) → None¶

Optional cleanup method that deletes unnecessary memory elements.

Parameters: parameters – Parameters that might be required

static convert_to_numpy(dataset: DatasetV2) → Tuple[ndarray, ndarray]¶

Converts a given tensorflow dataset into numpy arrays

Parameters: dataset – The dataset to convert to numpy
Returns: Tuple containing two arrays: a numpy array with the data from all batches and a numpy array with the labels from all batches

static convert_to_three_emotions(labels: ndarray) → ndarray¶

Convert the NeutralEkmanEmotion labels to the ThreeEmotionSet

Parameters: labels – The integer labels from 0-6 in NeutralEkman format
Returns: The integer labels from 0-2 in ThreeEmotion format

static convert_to_three_emotions_onehot(labels: ndarray) → ndarray¶

Convert one-hot encoded NeutralEkmanEmotion labels to the one-hot encoded ThreeEmotionSet

Parameters: labels – The labels for classes 0-6 in one-hot encoding, shape (n, 7)
Returns: The labels for classes 0-2 in ThreeEmotion format in one-hot encoding, shape (n, 3)
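Both conversions amount to a lookup from seven classes to three. The sketch below shows the idea for integer and one-hot labels on plain lists. The lookup table SEVEN_TO_THREE is purely hypothetical: this reference does not document the class order or the grouping used by the package, so the values here are placeholders.

```python
from typing import List

# Hypothetical lookup table: position = seven-class label, value = three-class
# label. The real class order and grouping in src.data may differ.
SEVEN_TO_THREE: List[int] = [0, 2, 2, 2, 1, 2, 1]


def to_three_emotions(labels: List[int]) -> List[int]:
    """Map integer labels in 0-6 to integer labels in 0-2."""
    return [SEVEN_TO_THREE[label] for label in labels]


def to_three_emotions_onehot(onehot_labels: List[List[int]]) -> List[List[int]]:
    """Map one-hot rows of length 7 to one-hot rows of length 3."""
    converted = []
    for row in onehot_labels:
        seven_label = row.index(max(row))  # position of the 1 in the row
        three_label = SEVEN_TO_THREE[seven_label]
        converted.append([1 if position == three_label else 0 for position in range(3)])
    return converted


print(to_three_emotions([0, 1, 4, 6]))
print(to_three_emotions_onehot([[0, 0, 0, 0, 1, 0, 0]]))
```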

get_emotion_data(emotions: str = 'neutral_ekman', which_set: Set = Set.TRAIN, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Method that returns a dataset depending on the emotion set.

Parameters:

emotions – The emotion set to use: neutral_ekman or three
which_set – train, test or val set
batch_size – The batch size for the dataset
parameters – Additional arguments

Returns:

The obtained dataset
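The dispatch on the emotions argument can be sketched as follows. This is an assumption-based illustration: DispatchingReader is a hypothetical stand-in whose two data methods return tuples instead of datasets, which_set is a plain string here instead of a Set member, and raising ValueError for an unknown emotion set is assumed by analogy with DataFactory.get_dataset.

```python
from typing import Dict, Optional, Tuple


class DispatchingReader:
    """Hypothetical reader that only demonstrates the dispatch logic."""

    def get_seven_emotion_data(
        self, which_set: str, batch_size: int = 64, parameters: Optional[Dict] = None
    ) -> Tuple[str, str, int]:
        return ("seven", which_set, batch_size)

    def get_three_emotion_data(
        self, which_set: str, batch_size: int = 64, parameters: Optional[Dict] = None
    ) -> Tuple[str, str, int]:
        return ("three", which_set, batch_size)

    def get_emotion_data(
        self,
        emotions: str = "neutral_ekman",
        which_set: str = "train",
        batch_size: int = 64,
        parameters: Optional[Dict] = None,
    ) -> Tuple[str, str, int]:
        # Select the reading method based on the requested emotion set.
        if emotions == "neutral_ekman":
            return self.get_seven_emotion_data(which_set, batch_size, parameters)
        if emotions == "three":
            return self.get_three_emotion_data(which_set, batch_size, parameters)
        raise ValueError(f"The emotion set {emotions!r} is not available.")


reader = DispatchingReader()
print(reader.get_emotion_data())
print(reader.get_emotion_data("three", "val", 32))
```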

abstract get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Method that gets only the labels for the dataset that is specified

Parameters:

which_set – Which set to use, train, val or test
parameters – Parameter dictionary

Returns:

An array of labels in shape (num_samples,)

abstract get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main method which loads the data from disk into a Dataset instance

Parameters:

which_set – Which set to use, can be either train, val or test
batch_size – The batch size for the requested dataset
parameters – Additional parameters

Returns:

The Dataset instance to use in the emotion classifiers

abstract get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Method that loads the dataset from disk and stores the labels in the ThreeEmotionSet instead of the NeutralEkmanEmotionSet

Parameters:

which_set – train, val or test set distinguisher
batch_size – the batch size for the dataset
parameters – Additional arguments

Returns:

The Dataset that contains data and labels

static map_emotions(data, labels)¶

Conversion function that is applied when three emotion labels are required.

Parameters:

data – The emotional data.
labels – The labels that need to be converted to three emotions.

class src.data.data_reader.Set(value)¶

Bases: IntEnum

Define the different set types that are available

ALL = 3¶

TEST = 2¶

TRAIN = 0¶

VAL = 1¶
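The member values listed above correspond to an IntEnum along the following lines. This is a sketch reconstructed from the documented values, not a copy of the original source.

```python
from enum import IntEnum


class Set(IntEnum):
    """The different set types, with the values documented above."""

    TRAIN = 0
    VAL = 1
    TEST = 2
    ALL = 3


# Members can be looked up by value or by name and compare equal to ints.
print(Set(2).name)
print(Set["VAL"].value)
print(Set.TRAIN == 0)
```

Because the class derives from IntEnum, a member can be used wherever a plain integer index is expected.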

src.data.experiment_data_reader module¶

This file contains a base class for data readers that read experiment related data and implements common functionality.

class src.data.experiment_data_reader.ExperimentDataReader(name: str, folder: str)¶

Bases: DataReader

This is the base class for all experiment related data readers.

static get_complete_data_indices() → List[int]¶

Static method that returns all experiment indices that have complete data and are supposed to be used in the evaluation.

Returns: List of experiment indices.

get_emotion_times() → Dict[str, Dict[str, float]]¶

This function returns start and end times for every emotion in the experiments.

Returns: The start and end time for every emotion.

abstract get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Return the labels for the unsorted data in the dataset.

Parameters:

which_set – Which set to get labels for
parameters – Additional parameters

Returns:

Numpy array of labels.

abstract get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

The abstract method for getting the dataset to train on.

Parameters:

which_set – Training, Validation or Test Set
batch_size – Batch Size for the dataset
parameters – Additional parameters.

Returns:

A tensorflow Dataset instance.

abstract get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

The abstract method for getting the dataset to train on. This method should return only three emotions.

Parameters:

which_set – Training, Validation or Test Set
batch_size – Batch Size for the dataset
parameters – Additional parameters.

Returns:

A tensorflow Dataset instance.

src.data.fusion_data_reader module¶

This data reader reads the fusion data from the experiments.

class src.data.fusion_data_reader.FusionProbDataReader(folder: Optional[str] = None)¶

Bases: ExperimentDataReader

This data reader reads fusion data from the experiments

get_data_generator(which_set: Set, parameters: Dict) → Generator[Tuple[ndarray, ndarray], None, None]¶

Generator that generates the data

Parameters:

which_set – Train, val or test set
parameters – Additional parameters, including window: the length of the window to use in seconds

Returns:

Generator that yields data and label.

get_input_shape(parameters: Dict) → Tuple[int]¶

Returns the shape of a concatenated input sample.

Parameters: parameters – Parameter dictionary
Returns: Tuple that is the shape of the sample.

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

This function returns labels for the dataset

Parameters:

which_set – Which set to get the labels for.
parameters – Additional parameters.

Returns:

Label numpy array

get_raw_data(parameters: Dict) → tuple[numpy.ndarray, numpy.ndarray]¶

Function that reads all experiment emotion probabilities from the data/continuous folder.

Parameters: parameters – Parameters for the data reading process
Returns: Tuple with samples, labels

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Method that returns a dataset of fusion probabilities.

Parameters:

which_set – Which set to use.
batch_size – Batch size for the dataset.
parameters – Additional parameters.

Returns:

Dataset instance.

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Create a dataset that uses only three emotions.

Parameters:

which_set – Which set: Train, val or test
batch_size – Batch size
parameters – Additional parameters

Returns:

Dataset with three emotion labels.

split_set(all_data: ndarray, all_labels: ndarray, which_set: Set) → tuple[numpy.ndarray, numpy.ndarray]¶

Split all data and labels into train, val and test sets and return the requested one.

Parameters:

all_data – All data array shape (n_exp * 613, n_modalities * 7)
all_labels – All corresponding labels (n_exp * 613,)
which_set – Train, Val or Test set

Returns:

Training, validation or test set as specified

src.data.image_data_reader module¶

This file implements the data reading functionality for image data.

class src.data.image_data_reader.ImageDataReader(name: str = 'image', folder: Optional[str] = None)¶

Bases: DataReader

Class that reads the image dataset from the data/train/image folder

add_augmentations(dataset: DatasetV2, use_augmentations: bool = True)¶

Function that adds augmentation to the dataset. This helps reduce overfitting of the model.

Parameters:

dataset – The dataset containing images
use_augmentations – Boolean flag to enable augmentation

Returns:

The dataset with augmented images

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the image dataset that is specified in an array

Parameters:

which_set – Train, val or test set
parameters – Parameter dictionary

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the images into a dataset

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional parameters

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the image folders into a dataset and also converts the emotion labels to the three emotion space.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

src.data.plant_exp_reader module¶

This data reader reads the PlantSpikerBox data from the experiments.

class src.data.plant_exp_reader.PlantExperimentDataReader(folder: str = 'data/plant', default_label_mode: str = 'expected')¶

Bases: ExperimentDataReader

This data reader reads the plant spiker box files from the experiments

cleanup(parameters: Optional[Dict] = None) → None¶

Cleanup method that frees RAM which, due to a bug in garbage collection, is not released automatically.

Parameters: parameters – Parameters.

get_cross_validation_indices(which_set: Set, parameters: Dict) → List[int]¶

Generate a list of indices according to CrossValidation.

Parameters:

which_set – Which set to use.
parameters – Additional parameters, including cv_portions (the number of cross-validation splits) and cv_index (which split to use)

Returns:

List of indices for the requested set in the selected cross-validation split.
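How cv_portions and cv_index could select indices is sketched below. The fold layout is an assumption: this example builds strided folds and uses a boolean held_out flag instead of a Set member, and the real method may assign experiments to folds differently.

```python
from typing import List


def cross_validation_indices(
    all_indices: List[int], cv_portions: int, cv_index: int, held_out: bool
) -> List[int]:
    """Return the held-out fold or the remaining indices for one split."""
    if not 0 <= cv_index < cv_portions:
        raise ValueError("cv_index must be in the range [0, cv_portions).")
    # Assumption: fold k contains every cv_portions-th index, starting at k.
    fold = all_indices[cv_index::cv_portions]
    if held_out:
        return fold
    return [index for index in all_indices if index not in fold]


experiments = list(range(10))
print(cross_validation_indices(experiments, cv_portions=5, cv_index=1, held_out=True))
print(cross_validation_indices(experiments, cv_portions=5, cv_index=1, held_out=False))
```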

get_data_generator(which_set: Set, parameters: Dict) → Generator[Tuple[ndarray, ndarray], None, None]¶

Generator that generates the data

Parameters:

which_set – Train, val or test set
parameters – Additional parameters, including window: the length of the window to use in seconds

Returns:

Generator that yields data and label.
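A windowed generator of this kind can be sketched as follows. The details are assumptions: the helper window_generator is hypothetical, each window is taken to end at the second whose label it receives, and the sample rate is passed in explicitly (the shape documented for preprocess_sample suggests 10000 values per second for the plant data).

```python
from typing import Generator, List, Tuple


def window_generator(
    signal: List[float], labels: List[int], window: int, sample_rate: int
) -> Generator[Tuple[List[float], int], None, None]:
    """Yield (window of samples, label) for every second with enough history."""
    for second in range(window, len(labels) + 1):
        start = (second - window) * sample_rate
        end = second * sample_rate
        # The label of the last second covered by the window is used.
        yield signal[start:end], labels[second - 1]


signal = [float(value) for value in range(8)]  # 4 seconds at 2 samples/second
labels = [0, 0, 1, 1]                          # one label per second
windows = list(window_generator(signal, labels, window=2, sample_rate=2))
for data, label in windows:
    print(data, label)
```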

get_input_shape(parameters: Dict) → Tuple[int]¶

Returns the shape of a preprocessed sample.

Parameters: parameters – Parameter dictionary
Returns: Tuple that is the shape of the sample.

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

This function returns labels for the dataset

Parameters:

which_set – Which set to get the labels for.
parameters – Additional parameters.

Returns:

Label numpy array

get_raw_data(parameters: Dict) → None¶

Load the raw plant data from the wave files and split it into windows according to the parameters.

Parameters: parameters – Additional parameters

get_raw_expected_labels() → ndarray¶

Load the raw emotions from the expected emotions during the video. The expected emotion means that while the participant is watching a happy video, we expect them to be happy, thus the label is happy.

Returns: Labels that are expected from the user.
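Deriving expected labels from the video schedule can be sketched with a dictionary shaped like the return value of get_emotion_times. The keys "start" and "end", the emotion names, and the default label for seconds without a video are assumptions made for this illustration.

```python
from typing import Dict, List


def labels_per_second(
    emotion_times: Dict[str, Dict[str, float]],
    total_seconds: int,
    default: str = "neutral",
) -> List[str]:
    """Return one expected label per second of the experiment."""
    labels = [default] * total_seconds
    for emotion, times in emotion_times.items():
        start = int(times["start"])
        end = min(int(times["end"]), total_seconds)
        # Every second during which the video plays gets the video's emotion.
        for second in range(start, end):
            labels[second] = emotion
    return labels


times = {
    "joy": {"start": 2.0, "end": 4.0},
    "sadness": {"start": 5.0, "end": 7.0},
}
expected = labels_per_second(times, total_seconds=8)
print(expected)
```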

get_raw_faceapi_labels() → ndarray¶

Load the raw labels from the faceapi output files.

Returns: Labels that are collected from the user’s facial expression.

get_raw_labels(label_mode: str) → ndarray¶

Get the raw labels per experiment and time. Populates the raw_labels member of this class. The two axes are [experiment_index, time_in_seconds]

Parameters: label_mode – Whether to use expected or faceapi labels
Returns: Array of all labels in shape (file, second)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Method that returns a dataset of plant data.

Parameters:

which_set – Which set to use.
batch_size – Batch size for the dataset.
parameters – Additional parameters.

Returns:

Dataset instance.

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Create a dataset that uses only three emotions.

Parameters:

which_set – Which set: Train, val or test
batch_size – Batch size
parameters – Additional parameters

Returns:

Dataset with three emotion labels.

static prepare_faceapi_labels() → None¶

This function prepares the faceapi labels if they are not computed yet.

static preprocess_sample(sample: ndarray, parameters: Optional[Dict] = None) → ndarray¶

Gets a sample with shape (window_size * 10000,) and then preprocesses it before using it in the classifier.

Parameters:

sample – The data sample to preprocess.
parameters – Additional parameters for preprocessing.

Returns:

The preprocessed sample.

src.data.speech_data_reader module¶

This file implements the data reading functionality for speech data.

class src.data.speech_data_reader.SpeechDataReader(name: str = 'speech', folder: Optional[str] = None)¶

Bases: DataReader

Class that reads the speech datasets

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the speech dataset that is specified in an array

Parameters:

which_set – Train, val or test set
parameters – Parameter dictionary

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the audio files into a dataset

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional parameters

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the audio data from disk.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

static get_waveform_and_label(file_path: bytes) → Tuple[Tensor, Tensor]¶

Preprocessing function for the audio files that are read from the data folder. Files are read, decoded and padded or truncated.

Parameters: file_path – The path of one audio file to read.
Returns: Audio tensor and label tensor in a tuple

static map_emotions(data: ndarray, labels: ndarray)¶

Conversion function that is applied when three emotion labels are required.

Parameters:

data – The emotional data.
labels – The labels that need to be converted to three emotions.

static process_crema(x: ndarray, y: int) → Tuple[Tensor, Tensor]¶

Preprocessing function for the crema dataset read from tensorflow_datasets package.

Parameters:

x – The audio data
y – The label data

Returns:

Processed audio and label data

static set_tensor_shapes(x: Tensor, y: Tensor) → Tuple[Tensor, Tensor]¶

Function that sets the tensor shapes in the dataset manually. This fixes an issue where using Dataset.map and numpy_function causes the tensor shape to be unknown. See the issue here: https://github.com/tensorflow/tensorflow/issues/47032

Parameters:

x – The speech tensor
y – The labels tensor

Returns:

Tuple with speech and labels tensor

src.data.text_data_reader module¶

This file implements the data reading functionality for text data.

class src.data.text_data_reader.TextDataReader(folder: str = 'data/train/text')¶

Bases: DataReader

Class that reads the CSV datasets from the data/train/text folder

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

Get the labels for the text dataset that is specified in an array

Parameters:

which_set – Train, val or test set
parameters – Parameter dict (unused)

Returns:

The labels in an array of shape (num_samples,)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the CSV file into a dataset

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional parameters

Returns:

The tensorflow Dataset instance

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Main data reading function which reads the CSV file into a dataset and also converts the emotion labels to the three emotion space.

Parameters:

which_set – Which dataset to use - train, val or test
batch_size – The batch size for the resulting dataset
parameters – Additional arguments

Returns:

The tensorflow Dataset instance

src.data.watch_exp_reader module¶

This data reader reads the Happimeter data from the experiments.

class src.data.watch_exp_reader.WatchExperimentDataReader(folder: str = 'data/watch', default_label_mode: str = 'expected')¶

Bases: ExperimentDataReader

This data reader reads the watch csv files from the experiments

get_cross_validation_indices(which_set: Set, parameters: Dict) → List[int]¶

Generate a list of indices according to CrossValidation.

Parameters:

which_set – Which set to use.
parameters – Additional parameters, including cv_portions (the number of cross-validation splits) and cv_index (which split to use)

Returns:

List of indices for the requested set in the selected cross-validation split.

get_data_generator(which_set: Set, parameters: Dict) → Generator[Tuple[ndarray, ndarray], None, None]¶

Generator that generates the data

Parameters:

which_set – Train, val or test set
parameters – Additional parameters, including window: the length of the window to use in seconds

Returns:

Generator that yields data and label.

static get_input_shape(parameters: Dict) → tuple¶

Returns the shape of a preprocessed sample.

Parameters: parameters – Parameter dictionary
Returns: Tuple that is the shape of the sample.

get_labels(which_set: Set = Set.TRAIN, parameters: Optional[Dict] = None) → ndarray¶

This function returns labels for the dataset

Parameters:

which_set – Which set to get the labels for.
parameters – Additional parameters.

Returns:

Label numpy array

get_raw_data(parameters: Dict) → None¶

Load the raw watch data from the csv files and split it into windows according to the parameters.

Parameters: parameters – Additional parameters

get_raw_expected_labels() → ndarray¶

Load the raw emotions from the expected emotions during the video. The expected emotion means that while the participant is watching a happy video, we expect them to be happy, thus the label is happy.

Returns: Labels that are expected from the user.

get_raw_faceapi_labels() → ndarray¶

Load the raw labels from the faceapi output files.

Returns: Labels that are collected from the user’s facial expression.

get_raw_labels(label_mode: str) → ndarray¶

Get the raw labels per experiment and time. Populates the raw_labels member of this class. The two axes are [experiment_index, time_in_seconds]

Parameters: label_mode – Whether to use expected or faceapi labels
Returns: Array of all labels in shape (file, second)

get_seven_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Method that returns a dataset of watch data.

Parameters:

which_set – Which set to use.
batch_size – Batch size for the dataset.
parameters – Additional parameters.

Returns:

Dataset instance.

get_three_emotion_data(which_set: Set, batch_size: int = 64, parameters: Optional[Dict] = None) → DatasetV2¶

Create a dataset that uses only three emotions.

Parameters:

which_set – Which set: Train, val or test
batch_size – Batch size
parameters – Additional parameters

Returns:

Dataset with three emotion labels.

static prepare_faceapi_labels() → None¶

This function prepares the faceapi labels if they are not computed yet.

Module contents¶

Package responsible for data reading and processing
