How to Contribute
This document describes how to contribute new models, feature extraction methods, and datasets to EEGEmoLib.
New Models
Model Input
For most models, the input shape is [B, 1, C, F], where B denotes the batch size, C the number of channels, and F the feature length.
When constructing a model, we provide two arguments: num_classes and input_shape, which represent the number of target classes and the input shape, respectively. If your model requires additional parameters, please assign them reasonable default values.
Ideally, your model should be able to adaptively construct its architecture based on the provided input_shape. If your model requires a fixed feature length, you should still make an effort to support a variable number of channels. If the provided input_shape is incompatible with your model, you should raise a clear error indicating the expected input format.
Model Output
We recommend that your model outputs logits with shape [B, num_classes]. Tuple outputs are also supported, allowing you to return additional information. By default, EEGEmoLib uses the last element of the tuple as the classification output.
If your model outputs a 3D tensor of shape [B, T, num_classes], EEGEmoLib will automatically apply mean(dim=1).
Model Integration
Add a new file under
src/eegemolib/models/, e.g.,MyNet.py.Define the model class (the class name must match
model.namein the configuration).
Example: A Minimal Model MyNet
# src/eegemolib/models/2d_analysis/MyNet.py
import torch
import torch.nn as nn
class MyNet(nn.Module):
def __init__(self, n_classes=3, input_shape=None, feature_dim=1050, hidden=128):
super().__init__()
self.net = nn.Sequential(
nn.Flatten(),
nn.Linear(input_shape[2] * feature_dim, hidden),
nn.ReLU(),
nn.Linear(hidden, n_classes),
)
def forward(self, x): # x: [B, 1, C, F]
return self.net(x)
New Feature Extraction Methods
Input and Output
Typically, the input to a feature extraction method is a single-trial EEG signal with shape [channels, timesteps]. In addition, an args dictionary is provided, which is configured via feature.args in the YAML file.
Your feature extraction method should output a 3D tensor of shape [channels, windows, bands]. If your method is band-independent, you may set the third dimension (bands) to 1.
Compatibility Requirements
If you want your method to support adaptive_band_weight, the output must contain an interpretable bands dimension. Other than this, there are no strict compatibility requirements.
Integration
Add your function to src/eegemolib/preprocess/features.py.
Example: Adding an rms Feature
def rms(data, args):
window_length = args['win_length']
sample_rate = args['down_sample']
n_channels, n_samples = data.shape
points = int(sample_rate * window_length)
n_windows = n_samples // points
out = np.zeros((n_channels, n_windows, 1), dtype=np.float32)
for w in range(n_windows):
seg = data[:, w * points:(w + 1) * points]
out[:, w, 0] = np.sqrt(np.mean(seg ** 2, axis=1))
return out
New Datasets
In principle, we do not recommend manually adding new datasets to EEGEmoLib. If you would like to include a new dataset, please open an issue and provide the dataset link. We will integrate it into EEGEmoLib accordingly.
If you still wish to add a dataset yourself, please follow the instructions below.
Typically, adding a new dataset requires implementing two classes:
Preprocess class (handles conversion from raw data to cached data)
Dataset class (loads cached data and returns batches during training)
Required Methods
Preprocess Class: inherit from BasePreprocess
Suggested path: src/eegemolib/preprocess/yourset_preprocess.py
Must implement:
_load_data(self, path)_load_feature(self, path, feature_list)_preprocess(self, data)
Optional overrides:
_split(...)_feature_compute(...),_feature_selection(...)
Dataset Class: inherit from BaseDataset
Suggested path: src/eegemolib/data/yourset_dataset.py
Must implement:
__init__(self, args, test=False)__getitem__(self, index)(returns{'data': ..., 'label': ...})__len__(self)__str__(self)
Example: Minimal Dataset Skeleton
# src/eegemolib/preprocess/myset_preprocess.py
from .base_preprocess import BasePreprocess
class MysetPreprocess(BasePreprocess):
def _load_data(self, path):
# return data_list, label_list
pass
def _load_feature(self, path, feature_list):
pass
def _preprocess(self, data):
return data
# src/eegemolib/data/myset_dataset.py
from data.base_dataset import BaseDataset
class MysetDataset(BaseDataset):
def __init__(self, args, test=False):
super().__init__(args)
...
def __getitem__(self, index):
return {'data': data_tensor, 'label': label_tensor}
def __len__(self):
return self.length
def __str__(self):
return 'MYSET'