icat.data.DataManager

class icat.data.DataManager(data=None, text_col=None, model=None, width=700, height=800, default_sample_size=100, **params)

Bases: Viewer

A model’s container and viewer component for labelling individual data points.

This manages the current set of sample points and holds the currently explored dataset. It provides the Panel component for interacting with this dataset, including filtering and searching through it and providing multiple different “view” sets.

Parameters:
  • data (pd.DataFrame) – The initial dataset to use.

  • text_col (str) – The column of the data containing the text to explore.

  • model (Model) – The parent model instance.

  • width (int) – The width of the data manager viewer table.

  • height (int) – The height of the data manager viewer table.

  • default_sample_size (int) – The number of points to randomly sample.

Methods

__init__([data, text_col, model, width, ...])

apply_label(index, label)

Provide the label(s) for the specified index/indices.

fire_on_data_changed()

fire_on_data_labeled(index, label)

fire_on_row_selected(index)

fire_on_sample_changed()

on_data_changed(callback)

Register a callback function for the "data changed" event, when the active_data dataframe is switched out.

on_data_labeled(callback)

Register a callback function for the "data label changed" event.

on_row_selected(callback)

Register a callback function for the "row clicked" event.

on_sample_changed(callback)

Register a callback function for the "sample changed" event.

servable([title, location, area, target])

Serves the object or adds it to the configured pn.state.template if in a panel serve context, writes to the DOM if in a pyodide context and returns the Panel object to allow it to display itself in a notebook context.

set_data(data)

Replace the current active data with the data specified.

set_random_sample()

Randomly choose 100 indices to use for the anchorviz sample.

show([title, port, address, ...])

Starts a Bokeh server and displays the Viewable in a new tab.

Attributes

current_data_tab

name

param

pred_max

pred_min

sample_indices

The row indices from the active_data in the current sample set.

search_value

selected_indices

The row indices from the active_data lasso-selected by the user.

update_trigger

active_data

The current active dataset the user is exploring with the model.

filtered_df

The data currently displayed after the relevant filters are applied.

active_data

The current active dataset the user is exploring with the model.

apply_label(index, label)

Provide the label(s) for the specified index/indices.

Parameters:
  • index (int | list[int]) – Either a single index, or a list of indices.

  • label (int | list[int]) – Either the single label to apply or a list of corresponding labels for the provided indices. 1 is “interesting”, 0 is “uninteresting”. A label of -1 resets (“unlabels”) the point, removing it from the container model’s training set.
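The label semantics described above can be illustrated with a small stand-alone sketch. It does not call icat: `training_labels` and `apply_label_sketch` are hypothetical names, and applying a single label to every index in a list is an assumption about how a scalar label is handled.

```python
# Hypothetical stand-in for the model's training set: row index -> label.
training_labels: dict[int, int] = {}


def apply_label_sketch(index, label):
    """Mimic the documented label semantics (not icat's implementation)."""
    # Accept either a single index/label or parallel lists of them.
    indices = index if isinstance(index, list) else [index]
    # Assumption: a single label is applied to every provided index.
    labels = label if isinstance(label, list) else [label] * len(indices)
    for i, lab in zip(indices, labels):
        if lab == -1:
            # -1 "unlabels" the point: drop it from the training set.
            training_labels.pop(i, None)
        elif lab in (0, 1):
            # 1 is "interesting", 0 is "uninteresting".
            training_labels[i] = lab
        else:
            raise ValueError(f"unsupported label: {lab!r}")


apply_label_sketch(3, 1)            # label one point as interesting
apply_label_sketch([4, 5], [0, 1])  # label two points at once
apply_label_sketch(3, -1)           # unlabel the first point again
```

After these three calls only indices 4 and 5 remain in the hypothetical training set.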

current_data_tab = 'Sample'

filtered_df

The data currently displayed after the relevant filters are applied.

fire_on_data_changed()

fire_on_data_labeled(index, label)
Parameters:
  • index (int | list[int]) – The index or indices of the labeled point(s).

  • label (int | list[int]) – The label or labels that were applied.

fire_on_row_selected(index)
Parameters:

index (int) – The index of the selected row.

fire_on_sample_changed()

name = 'DataManager'

on_data_changed(callback)

Register a callback function for the “data changed” event, when the active_data dataframe is switched out.

Callbacks for this event should take no parameters.

Parameters:

callback (Callable) – The function to call when the event fires.
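The pairing of an `on_*` registration method with a `fire_on_*` method can be sketched as follows. This is a minimal illustration of the general pattern, assuming callbacks are kept in a list and invoked in registration order; `EventSketch` is a hypothetical class and is not taken from icat's source.

```python
from typing import Callable


class EventSketch:
    """Hypothetical minimal register/fire pair, not icat's code."""

    def __init__(self) -> None:
        self._data_changed_callbacks: list[Callable[[], None]] = []

    def on_data_changed(self, callback: Callable[[], None]) -> None:
        # Registration only stores the callable; nothing runs yet.
        self._data_changed_callbacks.append(callback)

    def fire_on_data_changed(self) -> None:
        # Firing calls every registered callback with no arguments,
        # matching the "should take no parameters" contract.
        for callback in self._data_changed_callbacks:
            callback()


calls = []
events = EventSketch()
events.on_data_changed(lambda: calls.append("data changed"))
events.fire_on_data_changed()
```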

on_data_labeled(callback)

Register a callback function for the “data label changed” event.

Note that depending on how it’s fired, this can either apply to a single datapoint being labeled, or a set of points.

Callbacks for this event should take two parameters:
  • index (int | list[int])

  • label (int | list[int])

If index is a list, that means multiple points are being labeled simultaneously.

Parameters:

callback (Callable) – The function to call when the event fires.
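Because the callback may receive either a single index and label or lists of both, it can help to normalize the arguments first. The sketch below shows one way to do that; `normalize_labeled_event`, `handle_labeled`, and `manager` are hypothetical names, and broadcasting a single label across a list of indices is an assumption.

```python
def normalize_labeled_event(index, label):
    """Return a list of (index, label) pairs for either calling style."""
    if isinstance(index, list):
        # Assumption: a scalar label applies to every index in the list.
        labels = label if isinstance(label, list) else [label] * len(index)
        return list(zip(index, labels))
    return [(index, label)]


seen = []


def handle_labeled(index, label):
    # Record every labeled point as an (index, label) pair.
    seen.extend(normalize_labeled_event(index, label))


# With a real DataManager instance (here called `manager`) the callback
# would be registered as:
#     manager.on_data_labeled(handle_labeled)
# Below it is called directly to show both calling styles.
handle_labeled(7, 1)
handle_labeled([8, 9], [0, 1])
```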

on_row_selected(callback)

Register a callback function for the “row clicked” event.

Callbacks for this event should take the index as a parameter.

Parameters:

callback (Callable) – The function to call when the event fires.

on_sample_changed(callback)

Register a callback function for the “sample changed” event.

Callbacks for this event should take a single parameter, which is the new list of sample indices.

Parameters:

callback (Callable) – The function to call when the event fires.

pred_max = 1.0

pred_min = 0.0

sample_indices = []

The row indices from the active_data in the current sample set.

search_value = ''

selected_indices = []

The row indices from the active_data lasso-selected by the user.

set_data(data)

Replace the current active data with the data specified.

Note that this won’t wipe out the existing training data for the model: model.training_data is a separate data frame that’s built up as labels are applied to various datasets.

Parameters:

data (pd.DataFrame) – The dataset to use as the current active data.
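The separation between the active data and the accumulated training data can be pictured with a simplified sketch. It uses plain lists instead of data frames and folds both collections into one hypothetical class, `ActiveDataSketch`; in icat the training data lives on the model, so this only illustrates the behavior described above.

```python
class ActiveDataSketch:
    """Hypothetical stand-in showing that swapping data keeps labels."""

    def __init__(self, rows):
        self.active_data = list(rows)  # what the user is currently exploring
        self.training_data = []        # accumulates (text, label) pairs

    def apply_label(self, index, label):
        # Copy the labeled row into the separate training collection.
        self.training_data.append((self.active_data[index], label))

    def set_data(self, rows):
        # Only the active data is replaced; training_data is untouched.
        self.active_data = list(rows)


sketch = ActiveDataSketch(["first text", "second text"])
sketch.apply_label(0, 1)
sketch.set_data(["third text"])
```

After the swap, the label applied to the earlier dataset is still present in the sketch's training collection.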

set_random_sample()

Randomly choose 100 indices to use for the anchorviz sample.
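Drawing a random sample of row indices can be sketched with the standard library. The helper below is hypothetical and not icat's implementation; capping the sample at the dataset size and exposing a `seed` argument are assumptions made so the example is safe and reproducible.

```python
import random


def choose_sample_indices(all_indices, sample_size=100, seed=None):
    """Pick up to `sample_size` distinct indices at random."""
    rng = random.Random(seed)
    population = list(all_indices)
    # Assumption: never request more points than the dataset contains.
    k = min(sample_size, len(population))
    return rng.sample(population, k)


# A dataset of 250 rows yields a 100-point sample.
sample = choose_sample_indices(range(250), sample_size=100, seed=0)
# A dataset smaller than the sample size yields every row once.
small = choose_sample_indices(range(10), sample_size=100, seed=0)
```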

update_trigger = False