Data Operation

The Data Operation module in TSS bridges the gap between unstructured tabular data and the standardized signal formats that TSS projects require. Unlike images, time-series data comes from a wide range of sources and exists in many forms. Data from ad hoc sources, such as lab equipment and legacy systems, often lacks consistent formatting, which makes it difficult to import into TSS for machine learning tasks. This tool lets you preprocess, transform, and validate heterogeneous time-series data to produce compliant input files for TSS workflows.

Dataset

The Dataset section enables you to import tabular data files (in TXT or CSV format) for further processing. You can load single or multiple files, with validation rules ensuring data consistency.

To select files from the local system, click the Import Files button. Multiple files can be imported simultaneously.

To configure file parsing settings:

Select Ignore the first label line to skip the first line if the table contains column headers
Select the appropriate Delimiter to reload the files with that delimiter
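To illustrate what these two parsing settings do, here is a minimal Python sketch that parses delimited text with an optional header skip. It does not reflect how TSS is implemented; the function name `load_table` and its parameters are hypothetical.

```python
import csv
import io


def load_table(text, delimiter=",", ignore_first_line=False):
    """Parse delimited text into a list of rows (lists of strings)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = [row for row in reader if row]  # drop empty lines
    # Mirrors the "Ignore the first label line" option: skip the header row.
    return rows[1:] if ignore_first_line else rows


sample = "time;ax;ay\n0;1.0;2.0\n1;1.5;2.5\n"
rows = load_table(sample, delimiter=";", ignore_first_line=True)
```

If the wrong delimiter is chosen, each line is parsed as a single column, which is usually easy to spot in the preview.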

Operation

The Operation section allows users to apply various data transformations to the imported dataset. Most operations require parameter configuration to achieve the desired results.

Remove Lines

Remove lines that are unnecessary.

Steps:

Input the Line(s) to remove according to the specified format
Click the Run button
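The effect of this operation can be sketched in Python as follows. The index syntax used here (`"1,3-5"`, 1-based, inclusive ranges) is an assumption for illustration only; use the format that the TSS interface specifies.

```python
def parse_index_spec(spec):
    """Expand a spec such as "1,3-5" into a set of 1-based indices.

    The spec syntax is hypothetical, chosen only for this sketch.
    """
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            indices.update(range(start, end + 1))
        else:
            indices.add(int(part))
    return indices


def remove_lines(rows, spec):
    """Return the rows whose 1-based position is not listed in spec."""
    drop = parse_index_spec(spec)
    return [row for i, row in enumerate(rows, start=1) if i not in drop]


rows = [[10], [20], [30], [40], [50], [60]]
kept = remove_lines(rows, "1,3-5")
```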

Remove Columns

Remove columns that are unnecessary.

Steps:

Input the Column(s) to remove according to the specified format
Click the Run button
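As a rough Python equivalent, the sketch below drops columns by position from every row. The 1-based numbering and the function name `remove_columns` are assumptions made for this example.

```python
def remove_columns(rows, columns):
    """Drop the given 1-based column positions from every row."""
    drop = set(columns)
    return [
        [value for i, value in enumerate(row, start=1) if i not in drop]
        for row in rows
    ]


# Remove a timestamp column (1) and a trailing comment column (4).
rows = [["t0", "1.0", "2.0", "x"], ["t1", "1.5", "2.5", "y"]]
trimmed = remove_columns(rows, [1, 4])
```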

Remove Channels

Remove channels that are unnecessary.

Note: This operation is available only for multichannel data. For recommendations on which channels to remove, analyze the data with Data Intelligence. Its Channel Correlation and Channel Importance indices can help identify redundant channels.

Steps:

Set the Number of Channels
Select the Channel(s) to remove
Click the Run button
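The following Python sketch shows one way channel removal could work on a single row. It assumes that channel values are interleaved per time step (c1_t1, c2_t1, ..., cN_t1, c1_t2, ...); this layout is an assumption for illustration and should be checked against your own data.

```python
def remove_channels(row, num_channels, channels_to_remove):
    """Drop channels from one row of interleaved multichannel values.

    Assumes the layout c1_t1, c2_t1, ..., cN_t1, c1_t2, ... (an assumption
    made for this sketch). Channel numbers are 1-based.
    """
    if len(row) % num_channels != 0:
        raise ValueError("row length is not a multiple of num_channels")
    drop = set(channels_to_remove)
    return [
        value
        for i, value in enumerate(row)
        if (i % num_channels) + 1 not in drop
    ]


# Three channels (x, y, z) over two time steps; remove channel 2 (y).
row = [1, 10, 100, 2, 20, 200]
result = remove_channels(row, num_channels=3, channels_to_remove=[2])
```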

Separate Data by Columns

Rearrange the data into the specified number of columns.

Steps:

Set the Number of Columns
Click the Run button
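One plausible reading of this operation is a reshape: all values are read in row-major order and regrouped into rows of the requested width. The Python sketch below implements that reading; whether TSS handles leftover values or ordering the same way is an assumption.

```python
def separate_by_columns(rows, num_columns):
    """Regroup all values, read row by row, into rows of num_columns values."""
    flat = [value for row in rows for value in row]
    if len(flat) % num_columns != 0:
        raise ValueError("total value count is not a multiple of num_columns")
    return [flat[i:i + num_columns] for i in range(0, len(flat), num_columns)]


# One long line of six values becomes three lines of two values.
rows = [[1, 2, 3, 4, 5, 6]]
reshaped = separate_by_columns(rows, 2)
```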

Transpose Data

Transpose the dataset so that rows become columns and columns become rows.

Simply click the Run button.
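For reference, transposing a rectangular table takes only a few lines of Python. This is a minimal sketch, not the TSS implementation.

```python
def transpose(rows):
    """Swap rows and columns of a rectangular table."""
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all rows must have the same number of columns")
    return [list(column) for column in zip(*rows)]


rows = [[1, 2, 3], [4, 5, 6]]
transposed = transpose(rows)
```

This is useful when a source file stores one signal per column but the target format expects one signal per line.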

Add Targets

Add target values to classification datasets so that they can be converted into regression datasets.

Steps:

Set the Number of targets
Input the target values for each file
Click the Run button

Shuffle Data

Shuffle the dataset by lines.

Simply click the Run button.
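A line-wise shuffle can be sketched in Python as shown below. The optional `seed` parameter is an addition for reproducibility in this example; the source does not say whether TSS exposes one.

```python
import random


def shuffle_lines(rows, seed=None):
    """Return a shuffled copy of rows; the input list is left unchanged."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled


rows = [[i, i * 10] for i in range(8)]
shuffled = shuffle_lines(rows, seed=42)
```

Each line moves as a whole, so the values within a line keep their order.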

Wash Data

Remove unclean lines from the dataset.

Note: “Unclean” means that the line contains non-numeric elements, or the number of columns in the line is inconsistent with other lines.

Simply click the Run button.
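The definition of "unclean" in the note above can be expressed as a small Python filter. As an assumption for this sketch, the expected column count is taken to be the most common one in the file; TSS may determine it differently.

```python
from collections import Counter


def is_numeric(value):
    """Return True if value can be parsed as a float."""
    try:
        float(value)
    except (TypeError, ValueError):
        return False
    return True


def wash(rows):
    """Keep rows that are fully numeric and have the expected column count."""
    if not rows:
        return []
    # Assumption: the most common row length is the expected column count.
    expected = Counter(len(row) for row in rows).most_common(1)[0][0]
    return [
        row
        for row in rows
        if len(row) == expected and all(is_numeric(v) for v in row)
    ]


rows = [["1", "2", "3"], ["4", "abc", "6"], ["7", "8"], ["9", "10", "11"]]
clean = wash(rows)
```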

Generate Samples

Create segmented datasets from continuous data for importing into TSS machine learning projects.

Note: You can use Data Intelligence to perform smart analysis on continuous data in advance and obtain optimal segmentation parameters.

Steps:

Set the Number of Channels

Important: For continuous data, the number of channels must match the number of columns.
Select the Target Columns

Note: Use this option when a channel's output serves as the prediction target for a regression task. It is not required for classification tasks.
Set the Window Size
Set the Sampling Frequency (the frequency division factor of the original sampling frequency)
Set the Stride and the Overlap Ratio
Click the Run button
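The parameters above describe a sliding-window segmentation, which can be sketched in Python as follows. Several details are assumptions for illustration: the relation stride = window size × (1 − overlap ratio), the interleaved output layout, and the parameter name `division_factor`. Target column handling is omitted.

```python
def generate_samples(rows, window_size, stride, division_factor=1):
    """Cut continuous multichannel data into fixed-size windows.

    rows: one time step per row, one channel per column.
    division_factor: keep every k-th time step before windowing.
    Each sample is flattened as c1_t1, c2_t1, ..., c1_t2, ... (assumed layout).
    """
    decimated = rows[::division_factor]
    samples = []
    for start in range(0, len(decimated) - window_size + 1, stride):
        window = decimated[start:start + window_size]
        samples.append([value for step in window for value in step])
    return samples


def stride_from_overlap(window_size, overlap_ratio):
    """Stride implied by an overlap ratio in [0, 1) (assumed relation)."""
    return max(1, round(window_size * (1 - overlap_ratio)))


# Two channels, ten time steps, 50% overlap between consecutive windows.
rows = [[t, t * 10] for t in range(10)]
stride = stride_from_overlap(window_size=4, overlap_ratio=0.5)
samples = generate_samples(rows, window_size=4, stride=stride)
```

In this example each sample holds 4 time steps × 2 channels = 8 values, and consecutive samples share two time steps.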

Down Sampling

Downsample the segmented dataset.

Note: Each line of a segmented dataset holds one fixed-size window, so downsampling reduces the window size (the number of time steps per sample).

Steps:

Set the Number of Channels
Set the Sampling Frequency
Click the Run button
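As an illustration, the Python sketch below downsamples one segmented sample by keeping every `factor`-th time step. It assumes an interleaved channel layout and treats the Sampling Frequency setting as an integer division factor; both are assumptions made for this example.

```python
def downsample_sample(row, num_channels, factor):
    """Keep every factor-th time step of one interleaved sample.

    Assumes the layout c1_t1, c2_t1, ..., c1_t2, ... (an assumption).
    """
    if len(row) % num_channels != 0:
        raise ValueError("row length is not a multiple of num_channels")
    steps = [row[i:i + num_channels] for i in range(0, len(row), num_channels)]
    return [value for step in steps[::factor] for value in step]


# Two channels, window size 4; a factor of 2 halves the window size.
row = [0, 0, 1, 10, 2, 20, 3, 30]
reduced = downsample_sample(row, num_channels=2, factor=2)
```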

Split Dataset

Split the dataset into training and test sets by lines.

Steps:

Select the Train/Test Ratio
Click the Run button
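A line-wise train/test split can be sketched in Python as shown below. Whether TSS shuffles the lines before splitting, and how it rounds the cut point, is not stated in the source; this sketch splits in file order and rounds to the nearest line.

```python
def split_dataset(rows, train_ratio):
    """Split rows into (train, test) by line, preserving the original order."""
    if not 0 < train_ratio < 1:
        raise ValueError("train_ratio must be between 0 and 1")
    cut = round(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]


rows = [[i] for i in range(10)]
train, test = split_dataset(rows, 0.8)
```

If the lines are ordered by class or by time, consider running Shuffle Data first so that both sets are representative.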

Augment Dataset

Augment the dataset by applying transformations that increase data volume and diversity to improve model robustness.

Steps:

Set the Number of Channels
Select the Augmentation Types from the available transformations:
- Add Noise: Adds random noise to the data to simulate real-world variations
- Convolve: Applies convolution operations to the data
- Crop: Randomly crops segments of the data
- Drift: Introduces gradual drift in the signal values
- Dropout: Randomly masks the values of some time steps
- Pool: Applies pooling operations to reduce data dimensionality
- Quantize: Reduces the precision of data values through quantization
- Reverse: Reverses the order of time steps in the data
- Time Warp: Applies time warping transformations to the data
Set the Data Ratio to control the augmented data file size
Enable Keep Integer to preserve integer data types (if the original time series data is integer type)
Click the Run button
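To give a feel for what a few of these transformations do, the Python sketch below implements simplified versions of Add Noise, Reverse, and Dropout for a single-channel series. The parameters `sigma`, `rate`, and `fill` are hypothetical; TSS does not necessarily expose them or use the same defaults.

```python
import random


def add_noise(series, sigma, rng, keep_integer=False):
    """Add Gaussian noise; optionally round back to integers."""
    noisy = [value + rng.gauss(0, sigma) for value in series]
    return [round(v) for v in noisy] if keep_integer else noisy


def reverse(series):
    """Reverse the order of time steps."""
    return series[::-1]


def dropout(series, rate, rng, fill=0):
    """Replace each time step with fill with probability rate."""
    return [fill if rng.random() < rate else value for value in series]


rng = random.Random(0)  # fixed seed so the example is repeatable
series = [10, 12, 15, 11, 9, 14]
noisy = add_noise(series, sigma=0.5, rng=rng, keep_integer=True)
reversed_series = reverse(series)
masked = dropout(series, rate=0.3, rng=rng)
```

The `keep_integer` flag mirrors the Keep Integer option: without it, noise turns integer sensor readings into floating-point values.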

Concatenate Files

Merge multiple files vertically (row-wise) or horizontally (column-wise).

Steps:

Choose concatenation Direction
Click the Run button
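The two directions can be sketched in Python as follows. The requirement that horizontal concatenation needs equal row counts is an assumption of this sketch; the source does not say how TSS handles files of different lengths.

```python
def concatenate(tables, direction="vertical"):
    """Merge tables row-wise ("vertical") or column-wise ("horizontal")."""
    if direction == "vertical":
        return [row for table in tables for row in table]
    if direction == "horizontal":
        if len({len(table) for table in tables}) > 1:
            raise ValueError("horizontal concatenation needs equal row counts")
        return [
            [value for row in rows for value in row]
            for rows in zip(*tables)
        ]
    raise ValueError(f"unknown direction: {direction!r}")


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
stacked = concatenate([a, b], "vertical")
side_by_side = concatenate([a, b], "horizontal")
```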

Extract Classes by Label

Extract specific classes from the dataset based on label values.

Note: Some tabular data might contain a label column that identifies different classes or categories.

Steps:

Set the Index of Label Column to specify which column contains the class labels
Click the Run button
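Conceptually, this operation groups lines by the value in the label column. The Python sketch below shows the idea; the 0-based `label_index`, the `drop_label` option, and the dictionary output are assumptions for illustration, and the way TSS names or stores the resulting files may differ.

```python
from collections import defaultdict


def extract_classes(rows, label_index, drop_label=True):
    """Group rows by the value in the label column (0-based index)."""
    groups = defaultdict(list)
    for row in rows:
        label = row[label_index]
        if drop_label:
            features = row[:label_index] + row[label_index + 1:]
        else:
            features = row
        groups[label].append(features)
    return dict(groups)


rows = [
    ["0.1", "0.2", "idle"],
    ["0.9", "0.8", "run"],
    ["0.2", "0.1", "idle"],
]
classes = extract_classes(rows, label_index=2)
```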

Result

The Result section allows you to save the processed files or perform new operations on them.

For individual files:

Click Run New Operation to import the file to the Dataset section
Click Save As to save the processed file to the local system

For multiple files:

Click Run New Operation to import all files to the Dataset section
Click Save All to package the processed files into a zip file and save it

Conclusion

The Data Operation module provides a streamlined workflow for preprocessing and transforming raw tabular data into TSS-compatible signal files. The interface is divided into three key sections:

Dataset: Enables flexible file (TXT/CSV) importing with configurable parsing settings (delimiters, headers)
Operation: Provides a set of simple transformations that cover different types of tabular data
Result: Enables you to choose whether to run new operations or save files after processing

The design of this tool helps both novices and experienced analysts quickly prepare time-series datasets for their projects.