gridgene.get_arrays module

gridgene.get_arrays.transform_df_to_array(df, target_dict, array_shape)

Transforms a DataFrame into a 3D NumPy array, using the given target dictionary and array shape.

Parameters:
  • df (pd.DataFrame) – The input DataFrame containing ‘X’, ‘Y’, and ‘target’ columns.

  • target_dict (dict) – A dictionary mapping target values to unique indices.

  • array_shape (tuple) – The shape of the output array (max(X)+1, max(Y)+1, number of targets).

Returns:

A 3D numpy array with dimensions specified by array_shape, where each position [x, y, target_index] is set to 1 if there is an entry in the DataFrame with coordinates (x, y) and the corresponding target.

Return type:

np.ndarray
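
The behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name `sketch_transform_df_to_array`, the `int8` dtype, and the example data are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def sketch_transform_df_to_array(df, target_dict, array_shape):
    """Hypothetical re-creation of the documented behaviour (not gridgene's code)."""
    # Assumption: an integer dtype; the documentation does not state the dtype.
    array = np.zeros(array_shape, dtype=np.int8)
    x = df["X"].to_numpy()
    y = df["Y"].to_numpy()
    # Translate each target value into its index along the third axis.
    t = df["target"].map(target_dict).to_numpy()
    # Mark every (x, y, target_index) position that occurs in the DataFrame.
    array[x, y, t] = 1
    return array


# Example data (made up for this sketch).
df = pd.DataFrame(
    {"X": [0, 1, 2, 2], "Y": [0, 1, 0, 0], "target": ["A", "B", "A", "B"]}
)
target_dict = {"A": 0, "B": 1}
array_shape = (int(df["X"].max()) + 1, int(df["Y"].max()) + 1, len(target_dict))
arr = sketch_transform_df_to_array(df, target_dict, array_shape)
```

Here position (2, 0) holds both targets, so `arr[2, 0, 0]` and `arr[2, 0, 1]` are both set.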

gridgene.get_arrays.get_subset_arrays_V1(df_total, target_list, target_col='target', col_x='X', col_y='Y')

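Note: this version is probably less efficient than get_subset_arrays, because it builds the 3D array from the filtered DataFrame instead of slicing an existing array.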

Filters the DataFrame based on target_list, then creates and returns a subset DataFrame, a dictionary of target mappings, a 3D array representing the data, and a 2D summed array along the third axis.

Parameters:
  • df_total (pd.DataFrame) – The input DataFrame containing the data.

  • target_list (list) – List of target values to filter the DataFrame.

  • target_col (str, optional) – Column name in the DataFrame containing target values, by default ‘target’.

  • col_x (str, optional) – Column name in the DataFrame representing the X-coordinate, by default ‘X’.

  • col_y (str, optional) – Column name in the DataFrame representing the Y-coordinate, by default ‘Y’.

Returns:

A tuple containing:
  • df_subset (pd.DataFrame) – The filtered DataFrame.

  • target_dict_subset (dict) – A dictionary mapping each target to a unique index.

  • array_subset (np.ndarray) – A 3D numpy array of shape (max(X)+1, max(Y)+1, len(target_list)), filled based on the filtered DataFrame.

  • array_subset_2d (np.ndarray) – A 2D numpy array obtained by summing array_subset along the third axis.

Return type:

tuple
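
The filter-then-build steps described above might look roughly like the following sketch. It is not the library's code: the name `sketch_subset_arrays_v1` is hypothetical, and taking the X and Y maxima from the full DataFrame (rather than the filtered one) is an assumption, since the documentation does not say which is used.

```python
import numpy as np
import pandas as pd


def sketch_subset_arrays_v1(df_total, target_list, target_col="target",
                            col_x="X", col_y="Y"):
    """Hypothetical illustration of the documented steps (not gridgene's code)."""
    # 1. Keep only the rows whose target is in target_list.
    df_subset = df_total[df_total[target_col].isin(target_list)]
    # 2. Give each requested target a unique index along the third axis.
    target_dict_subset = {target: i for i, target in enumerate(target_list)}
    # 3. Build the 3D array. Assumption: grid size comes from df_total.
    shape = (
        int(df_total[col_x].max()) + 1,
        int(df_total[col_y].max()) + 1,
        len(target_list),
    )
    array_subset = np.zeros(shape, dtype=np.int8)
    t = df_subset[target_col].map(target_dict_subset).to_numpy()
    array_subset[df_subset[col_x].to_numpy(), df_subset[col_y].to_numpy(), t] = 1
    # 4. Collapse the target axis to get a 2D count of targets per position.
    array_subset_2d = array_subset.sum(axis=2)
    return df_subset, target_dict_subset, array_subset, array_subset_2d


# Example data (made up for this sketch).
df_total = pd.DataFrame(
    {"X": [0, 1, 2, 2], "Y": [0, 1, 0, 1], "target": ["A", "B", "C", "A"]}
)
df_subset, target_dict_subset, array_subset, array_subset_2d = (
    sketch_subset_arrays_v1(df_total, ["A", "C"])
)
```

Because the array is rebuilt from the DataFrame on every call, the cost grows with the number of rows, which is consistent with this version being the slower one.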

gridgene.get_arrays.get_subset_arrays(df_total, array_total, target_dict_total, target_list, target_col='target')

Get a subset of the DataFrame, the corresponding slices from the total array, and the subset target dictionary.

Parameters:
  • df_total (pd.DataFrame) – The input DataFrame containing the data.

  • array_total (np.ndarray) – The 3D array representing the entire dataset.

  • target_dict_total (dict) – A dictionary mapping each target in the total dataset to its index.

  • target_list (list) – List of target values to filter the DataFrame and array.

  • target_col (str, optional) – Column name in the DataFrame containing target values, by default ‘target’.

Returns:

A tuple containing:
  • df_subset (pd.DataFrame) – The filtered DataFrame.

  • array_subset (np.ndarray) – The subset of the array corresponding to the target_list.

  • target_dict_subset (dict) – The subset dictionary mapping the filtered targets to indices.

Return type:

tuple
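
A minimal sketch of the slicing approach described above follows. It is an illustration under stated assumptions, not the library's implementation: the name `sketch_subset_arrays` is hypothetical, and re-indexing the subset dictionary from zero (so that it matches the third axis of the sliced array) is an assumption that the documentation does not confirm.

```python
import numpy as np
import pandas as pd


def sketch_subset_arrays(df_total, array_total, target_dict_total,
                         target_list, target_col="target"):
    """Hypothetical illustration of the documented behaviour (not gridgene's code)."""
    # Filter the DataFrame rows by target.
    df_subset = df_total[df_total[target_col].isin(target_list)]
    # Look up where each requested target lives in the full array.
    indices = [target_dict_total[target] for target in target_list]
    # Slice the existing array along the target axis instead of rebuilding it.
    array_subset = array_total[:, :, indices]
    # Assumption: indices are renumbered to match the sliced array.
    target_dict_subset = {target: i for i, target in enumerate(target_list)}
    return df_subset, array_subset, target_dict_subset


# Example data (made up for this sketch).
df_total = pd.DataFrame(
    {"X": [0, 1, 2, 2], "Y": [0, 1, 0, 1], "target": ["A", "B", "C", "A"]}
)
target_dict_total = {"A": 0, "B": 1, "C": 2}
array_total = np.zeros((3, 2, 3), dtype=np.int8)
array_total[
    df_total["X"].to_numpy(),
    df_total["Y"].to_numpy(),
    df_total["target"].map(target_dict_total).to_numpy(),
] = 1

df_subset, array_subset, target_dict_subset = sketch_subset_arrays(
    df_total, array_total, target_dict_total, ["A", "C"]
)
```

Indexing with a list of integers returns a copy in NumPy, so modifying `array_subset` in this sketch would not change `array_total`.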