queens.data_processors package#

Data Processor.

Modules for extracting and processing data from simulation output files.

Submodules#

queens.data_processors.csv_file module#

Data processor class for csv data extraction.

class CsvFile(file_name_identifier=None, file_options_dict=None, files_to_be_deleted_regex_lst=None)[source]#

Bases: DataProcessor

Class for extracting data from csv files.

use_cols_lst#

List of column numbers to read in.

Type:: lst

filter_type#

Filter type to use.

Type:: str

header_row#

Integer that determines which csv row contains the column labels/headers. Default is ‘None’, meaning no header is used.

Type:: int

skip_rows#

Number of rows to skip when reading the csv file.

Type:: int

index_column#

Column to use as the row labels of the DataFrame, either given as string name or column index.

Note: index_column=False can be used to force pandas to not use the first column as the index. index_column is used for filtering the remaining columns.

Type:: int, str

use_rows_lst#

If this option is used, the list contains the indices of the rows in the csv file that should be used as data.

Type:: lst

filter_range#

After data is selected by use_cols_lst and a filter column is specified by index_column, this option selects the data range to keep, given as a minimum and maximum value pair in list format.

Type:: lst

filter_target_values#

Target values to filter.

Type:: list

filter_tol#

Tolerance for the filter range.

Type:: float

returned_filter_format#

Returned data format after filtering.

Type:: str

classmethod check_valid_filter_options(filter_options_dict)[source]#

Check valid filter input options.

Parameters:: filter_options_dict (dict) – dictionary with filter options

expected_filter_by_range = {'range': [1.0, 2.0], 'tolerance': 0.0, 'type': 'by_range'}#

expected_filter_by_row_index = {'rows': [1, 2], 'type': 'by_row_index'}#

expected_filter_by_target_values = {'target_values': [1.0, 2.0, 3.0], 'tolerance': 0.0, 'type': 'by_target_values'}#

expected_filter_entire_file = {'type': 'entire_file'}#
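The four expected_filter_* templates above suggest what a valid filter_options_dict looks like for each filter type. The sketch below is not the QUEENS implementation: has_expected_keys is a hypothetical helper, and treating validity as exact key equality with the template is an assumption made only for illustration.

```python
# Templates copied from the class attributes listed above.
EXPECTED_FILTERS = {
    "entire_file": {"type": "entire_file"},
    "by_range": {"type": "by_range", "range": [1.0, 2.0], "tolerance": 0.0},
    "by_row_index": {"type": "by_row_index", "rows": [1, 2]},
    "by_target_values": {
        "type": "by_target_values",
        "target_values": [1.0, 2.0, 3.0],
        "tolerance": 0.0,
    },
}


def has_expected_keys(filter_options_dict):
    """Hypothetical helper: True if the keys match the template of the given type.

    Assumption: strict key equality. The real check may be more lenient.
    """
    template = EXPECTED_FILTERS.get(filter_options_dict.get("type"))
    if template is None:
        # Unknown or missing filter type.
        return False
    return set(filter_options_dict) == set(template)


valid = has_expected_keys({"type": "by_range", "range": [0.0, 1.0], "tolerance": 0.1})
incomplete = has_expected_keys({"type": "by_range"})
```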

filter_and_manipulate_raw_data(raw_data)[source]#

Filter the pandas data-frame based on filter type.

Parameters:: raw_data (DataFrame) – Raw data from file.
Returns:: processed_data (np.array) – Cleaned, filtered or manipulated data_processor data.
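As a rough illustration of what the filter types could do to a DataFrame, the following sketch applies range, target-value, and row-index selection with plain pandas and NumPy. The way the tolerance widens the range or matches target values is an assumption, and none of the variable names come from QUEENS.

```python
import numpy as np
import pandas as pd

# Small frame whose index plays the role of the filter column (index_column).
raw_data = pd.DataFrame(
    {"value": [10.0, 20.0, 30.0, 40.0]},
    index=pd.Index([0.0, 1.0, 2.0, 3.0], name="time"),
)

# 'by_range': keep rows whose index lies in [minimum, maximum],
# widened on both sides by the tolerance (assumed semantics).
minimum, maximum, tolerance = 1.0, 2.0, 0.0
in_range = (raw_data.index >= minimum - tolerance) & (raw_data.index <= maximum + tolerance)
by_range = raw_data[in_range].to_numpy()

# 'by_target_values': keep rows whose index is within the tolerance
# of any target value (assumed semantics).
targets = np.array([0.0, 3.0])
index_values = raw_data.index.to_numpy()
near_target = np.isclose(
    index_values[:, None], targets[None, :], atol=0.01, rtol=0.0
).any(axis=1)
by_target_values = raw_data[near_target].to_numpy()

# 'by_row_index': select rows by their position in the frame.
by_row_index = raw_data.iloc[[1, 2]].to_numpy()
```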

get_raw_data_from_file(file_path)[source]#

Get the raw data from the files of interest.

This method loads the desired parts of the csv file as a pandas dataframe.

Parameters:: file_path (str) – Actual path to the file of interest.
Returns:: raw_data (DataFrame) – Raw data from file.
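A minimal sketch of loading parts of a csv file with pandas is given below. It assumes that use_cols_lst, header_row, skip_rows and index_column correspond to the usecols, header, skiprows and index_col arguments of pandas.read_csv. That mapping is a guess based on the attribute descriptions, not something this page confirms.

```python
import io

import pandas as pd

# In-memory stand-in for a simulation output file with one comment line on top.
csv_text = (
    "# produced by a solver\n"
    "time,pressure,velocity\n"
    "0.0,1.0,5.0\n"
    "0.5,1.5,6.0\n"
    "1.0,2.0,7.0\n"
)

raw_data = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=1,      # skip the comment line (cf. skip_rows)
    header=0,        # first remaining row holds the column labels (cf. header_row)
    usecols=[0, 1],  # read only the first two columns (cf. use_cols_lst)
    index_col=0,     # use 'time' as row labels (cf. index_column)
)
```

Here raw_data has the 'time' column as its index and a single data column 'pressure'.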

queens.data_processors.numpy_file module#

Data processor class for numpy data extraction.

class NumpyFile(file_name_identifier=None, file_options_dict=None, files_to_be_deleted_regex_lst=None)[source]#

Bases: DataProcessor

Class for extracting data from numpy binaries.

get_raw_data_from_file(file_path)[source]#

Get the raw data from the files of interest.

This method loads the numpy binary data from the file.

Parameters:: file_path (str) – Actual path to the file of interest.
Returns:: raw_data (np.array) – Raw data from file.
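For orientation, reading a numpy binary typically comes down to a call to numpy.load. The round trip below is a self-contained sketch. The file name output.npy is made up, and whether NumpyFile passes extra options to numpy.load is not stated here.

```python
import os
import tempfile

import numpy as np

data = np.array([[1.0, 2.0], [3.0, 4.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this example.
    file_path = os.path.join(tmp_dir, "output.npy")
    np.save(file_path, data)
    # Load the binary back into memory as an array.
    raw_data = np.load(file_path)
```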

queens.data_processors.pvd_file module#

Data processor class for pvd data extraction.

class PvdFile(field_name, file_name_identifier=None, file_options_dict=None, files_to_be_deleted_regex_lst=None, time_steps=None, block=0, point_data=True)[source]#

Bases: DataProcessor

Class for extracting data from pvd files.

field_name#

Name of the field to extract data from.

Type:: str

time_steps#

Considered time steps (last time step by default).

Type:: lst

block#

Considered block of MultiBlock data set (first block by default).

Type:: int

data_attribute#

‘point_data’ or ‘cell_data’

Type:: str

filter_and_manipulate_raw_data(raw_data)[source]#

Filter and manipulate the raw data.

Parameters:: raw_data (pv.PVDReader) – PVDReader object.
Returns:: processed_data (np.array) – Cleaned, filtered or manipulated data_processor data.

get_raw_data_from_file(file_path)[source]#

Get the raw data from the files of interest.

Parameters:: file_path (str) – Actual path to the file of interest.
Returns:: raw_data (pv.PVDReader) – PVDReader object.

queens.data_processors.txt_file module#

Data processor class for txt data extraction.

class TxtFile(file_name_identifier=None, file_options_dict=None, files_to_be_deleted_regex_lst=None, remove_logger_prefix_from_raw_data=True, logger_prefix='\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3} - queens\\.drivers\\.driver_\\d* - INFO -', max_file_size_in_mega_byte=200)[source]#

Bases: DataProcessor

Class for extracting data from txt files.

Provides basic functionality for extracting data from txt files. The implementation of the filter_and_manipulate_raw_data method is left to the user.

Throws:: MemoryError: A conservative MemoryError is raised if the txt file is larger than max_file_size_in_mega_byte (200 MB by default). This is due to the current design, which loads the entire content of the .txt file into memory.
Potential Improvement:: Use a generator to read the content of the file in chunks. This, however, requires more advanced logic that allows function calls to be nested in a flexible way.
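The generator idea mentioned under Potential Improvement can be sketched as follows. This is not part of TxtFile. The names iter_lines and count_matching are hypothetical, and the example only shows how a lazy line reader can be nested into a downstream function without holding the whole file in memory.

```python
import io


def iter_lines(file_obj):
    """Yield one line at a time, so only the current line is held in memory."""
    for line in file_obj:
        yield line.rstrip("\n")


def count_matching(lines, keyword):
    """Consume any iterable of lines lazily and count those containing keyword."""
    return sum(1 for line in lines if keyword in line)


# In-memory stand-in for a large text file.
sample = io.StringIO("step 1 converged\nstep 2 failed\nstep 3 converged\n")
n_converged = count_matching(iter_lines(sample), "converged")
```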

get_raw_data_from_file(file_path)[source]#

Load the text file into memory.

Parameters:: file_path (str) – Actual path to the file of interest.
Returns:: raw_data (lst) – A list of strings read in from file_path.
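The sketch below illustrates one way the size check and the logger-prefix removal could fit together. load_text_lines is a hypothetical stand-alone function, not the method above. It assumes that 1 MB equals 10^6 bytes, that the prefix is stripped only at the start of a line, and that whitespace following the prefix is dropped as well.

```python
import os
import re
import tempfile

# Default pattern taken from the TxtFile signature above.
LOGGER_PREFIX = (
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - queens\.drivers\.driver_\d* - INFO -"
)


def load_text_lines(file_path, logger_prefix=LOGGER_PREFIX, max_file_size_in_mega_byte=200):
    """Hypothetical helper: read a text file into a list of prefix-free lines."""
    size_in_mega_byte = os.path.getsize(file_path) / 1e6  # assumption: 1 MB = 10^6 bytes
    if size_in_mega_byte > max_file_size_in_mega_byte:
        raise MemoryError(f"File exceeds {max_file_size_in_mega_byte} MB: {file_path}")
    with open(file_path, encoding="utf-8") as file:
        lines = file.read().splitlines()
    # Strip the prefix (and any whitespace after it) from the start of each line.
    pattern = re.compile(r"^" + logger_prefix + r"\s*")
    return [pattern.sub("", line) for line in lines]


with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "run.txt")  # made-up file name
    with open(example_path, "w", encoding="utf-8") as file:
        file.write("2024-01-02 03:04:05,678 - queens.drivers.driver_12 - INFO - result: 1.5\n")
        file.write("plain line\n")
    raw_data = load_text_lines(example_path)
    try:
        load_text_lines(example_path, max_file_size_in_mega_byte=0)
        size_check_triggered = False
    except MemoryError:
        size_check_triggered = True
```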