Visualization

Helper functions for constructing visualizations.

Functions:

contrasting_text_color(hex_str)

Get a contrasting foreground text color for the specified background hex color.

gen_wordcloud(texts)

Creates and returns a wordcloud image that can be rendered with plt.imshow.

get_nice_html_label(text, color[, …])

Get a nice looking colored background label.

plot_big_wordcloud(index, clusters)

Render the word cloud that the currently selected point is in.

plot_clusters(clusters, cluster_values)

Plot highest word values for each cluster.

plot_clusters_stacked(clusters, …)

Plot highest word values for each cluster, colored according to entry classification.

plot_confusion_matrix(pred_y, target_y, …)

Get the confusion matrix for given predictions.

plot_metrics(pred_y, target_y, encodings)

Get colored dataframes with macro and micro scores for the given predictions on an aggregate level and per class.

plot_passed_wordcloud(cloud, name)

Render the given word cloud.

plot_wordclouds(dashboard)

Render the grid of all wordclouds.

prepare_wordclouds(clusters, test_texts)

Pre-render the wordcloud for each cluster, which makes switching the main wordcloud figure faster.

render_html_text(text, transformer_wrapper)

Get a text-salience highlighted HTML paragraph.

tx2.visualization.contrasting_text_color(hex_str: str) → str

Get a contrasting foreground text color for the specified background hex color.

Parameters

hex_str – A hex string color (‘#XXXXXX’) for which to determine a black-or-white foreground color.

Returns

‘#FFF’ or ‘#000’.
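
As an illustration of how a black-or-white foreground can be chosen from a background color, here is a minimal sketch. It is not the tx2 implementation: the function name pick_text_color, the BT.601 luma weights, and the midpoint threshold are all assumptions made for the example.

```python
def pick_text_color(hex_str: str) -> str:
    """Return '#000' for light backgrounds and '#FFF' for dark ones.

    Hypothetical helper for illustration; not part of tx2.
    """
    digits = hex_str.lstrip("#")
    # Split 'RRGGBB' into three integer channels in the range 0-255.
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    # Approximate perceived brightness with the ITU-R BT.601 luma weights.
    luma = 0.299 * r + 0.587 * g + 0.114 * b
    # Assumed threshold: the midpoint of the 0-255 range.
    return "#000" if luma > 127.5 else "#FFF"
```

A light background such as ‘#FFFF00’ gets black text under this rule, while a dark one such as ‘#0000FF’ gets white text.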

tx2.visualization.gen_wordcloud(texts: Union[numpy.ndarray, pandas.core.series.Series])

Creates and returns a wordcloud image that can be rendered with plt.imshow.

Parameters

texts – Collection of strings to get text statistics from.

Returns

The generated wordcloud image.

tx2.visualization.get_nice_html_label(text: str, color: str, foreground_color: Optional[str] = None) → str

Get a nice looking colored background label.

Parameters
  • text – The text to display in the label.

  • color – The background color as a hex string (‘#XXXXXX’) for the label.

  • foreground_color – Leave as None for an automatic black/white foreground color determination from contrasting_text_color().

Returns

An HTML string for the label.
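
The following sketch shows one way such a label could be assembled. It is only an assumption about the general shape of the output: the name make_label, the inline CSS properties, and the brightness rule used for the automatic foreground are hypothetical and not taken from tx2.

```python
import html
from typing import Optional


def make_label(text: str, color: str, foreground_color: Optional[str] = None) -> str:
    """Build an inline HTML label with a colored background (hypothetical helper)."""
    if foreground_color is None:
        # Assumed fallback: pick black or white from the background brightness.
        digits = color.lstrip("#")
        r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
        luma = 0.299 * r + 0.587 * g + 0.114 * b
        foreground_color = "#000" if luma > 127.5 else "#FFF"
    style = (
        f"background-color: {color}; color: {foreground_color}; "
        "padding: 2px 6px; border-radius: 4px;"
    )
    # Escape the text so characters such as '<' are shown rather than parsed.
    return f'<span style="{style}">{html.escape(text)}</span>'
```

Passing an explicit foreground_color skips the automatic choice, which mirrors the behavior described for the None default.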

tx2.visualization.plot_big_wordcloud(index: int, clusters: Dict[str, List[int]])

Render the word cloud that the currently selected point is in.

Parameters
  • index – The index of the point to find the cluster of.

  • clusters – The dictionary of clusters where the values are the lists of indices of entries in that cluster.
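
Finding the cluster that contains a given point comes down to a membership search over the clusters dictionary. A minimal sketch, assuming the dictionary shape described above (find_cluster is a hypothetical name, not a tx2 function):

```python
from typing import Dict, List, Optional


def find_cluster(index: int, clusters: Dict[str, List[int]]) -> Optional[str]:
    """Return the name of the cluster whose index list contains `index`."""
    for name, members in clusters.items():
        if index in members:
            return name
    # The point does not belong to any cluster.
    return None
```

For many lookups over large clusters, building a reverse index-to-cluster mapping once would avoid the repeated linear scans.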

tx2.visualization.plot_clusters(clusters, cluster_values)

Plot highest word values for each cluster.

tx2.visualization.plot_clusters_stacked(clusters, cluster_words_classified, encodings, colors)

Plot highest word values for each cluster, colored according to entry classification.

tx2.visualization.plot_confusion_matrix(pred_y: List[int], target_y: List[int], encodings: Dict[str, int], figsize=(8, 8))

Get the confusion matrix for given predictions.

Parameters
  • pred_y – Predicted labels.

  • target_y – Actual labels.

  • encodings – Dictionary of string label -> numeric label.

  • figsize – The size with which to render the figure.
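
To make the inputs concrete, here is a sketch of how the counts behind a confusion matrix can be tallied from pred_y, target_y, and encodings. The name confusion_counts is hypothetical, and the layout (rows for actual labels, columns for predicted labels, ordered by numeric label) is an assumption rather than a documented property of the tx2 plot.

```python
from typing import Dict, List


def confusion_counts(
    pred_y: List[int], target_y: List[int], encodings: Dict[str, int]
) -> List[List[int]]:
    """Count (actual, predicted) pairs; rows are actual, columns are predicted."""
    # Map each numeric label to a row/column position, ordered by label value.
    order = sorted(encodings.values())
    position = {label: i for i, label in enumerate(order)}
    n = len(order)
    matrix = [[0] * n for _ in range(n)]
    for pred, target in zip(pred_y, target_y):
        matrix[position[target]][position[pred]] += 1
    return matrix
```

The diagonal holds the correct predictions, and every off-diagonal cell counts one kind of mistake.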

tx2.visualization.plot_metrics(pred_y: List[int], target_y: List[int], encodings: Dict[str, int])

Get colored dataframes with macro and micro scores for the given predictions on an aggregate level and per class.

Parameters
  • pred_y – Predicted labels.

  • target_y – Actual labels.

  • encodings – Dictionary of string label -> numeric label.

Returns

The per-class metrics dataframe and the aggregate metrics dataframe.
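
The difference between macro and micro scores can be illustrated with precision. The source does not say which scores plot_metrics reports, so the choice of precision, and the name precision_scores, are assumptions made for this sketch only.

```python
from typing import Dict, List, Tuple


def precision_scores(
    pred_y: List[int], target_y: List[int], labels: List[int]
) -> Tuple[Dict[int, float], float, float]:
    """Return per-class precision plus the macro and micro averages."""
    per_class: Dict[int, float] = {}
    total_tp = 0
    total_predicted = 0
    for label in labels:
        # True positives: predicted this label and the target agrees.
        tp = sum(1 for p, t in zip(pred_y, target_y) if p == label and t == label)
        predicted = sum(1 for p in pred_y if p == label)
        per_class[label] = tp / predicted if predicted else 0.0
        total_tp += tp
        total_predicted += predicted
    # Macro: unweighted mean of the per-class scores.
    macro = sum(per_class.values()) / len(per_class)
    # Micro: pool the counts over all classes, then compute one score.
    micro = total_tp / total_predicted if total_predicted else 0.0
    return per_class, macro, micro
```

Macro averaging gives every class equal weight, so a rare class with poor precision pulls it down more than it pulls down the micro average.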

tx2.visualization.plot_passed_wordcloud(cloud, name)

Render the given word cloud.

Parameters
  • cloud – The word cloud to render.

  • name – The title to render with the word cloud.

tx2.visualization.plot_wordclouds(dashboard)

Render the grid of all wordclouds.

Parameters

dashboard – The current dashboard, needed in order to grab the cluster data.

tx2.visualization.prepare_wordclouds(clusters: Dict[str, List[int]], test_texts: Union[numpy.ndarray, pandas.core.series.Series])

Pre-render the wordcloud for each cluster, which makes switching the main wordcloud figure faster.

Parameters
  • clusters – Dictionary of clusters where the values are the lists of dataframe indices for the entries in each cluster.

  • test_texts – The full test corpus.
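
The idea of doing the expensive per-cluster work once and caching it can be sketched without any plotting library. In this simplified example, word counts stand in for rendered wordclouds, a plain list with positional indices stands in for the test corpus, and cluster_word_counts is a hypothetical name; none of this is the tx2 implementation.

```python
from collections import Counter
from typing import Dict, List, Sequence


def cluster_word_counts(
    clusters: Dict[str, List[int]], test_texts: Sequence[str]
) -> Dict[str, Counter]:
    """Compute word counts for each cluster once so later lookups are cheap."""
    cache: Dict[str, Counter] = {}
    for name, indices in clusters.items():
        counts: Counter = Counter()
        for i in indices:
            # Naive whitespace tokenization, for illustration only.
            counts.update(test_texts[i].lower().split())
        cache[name] = counts
    return cache
```

Switching the displayed cluster then becomes a dictionary lookup on the cache instead of a recomputation over the corpus.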

tx2.visualization.render_html_text(text, transformer_wrapper: tx2.wrapper.Wrapper) → str

Get a text-salience highlighted HTML paragraph.

Parameters
  • text – The text to run salience on and render.

  • transformer_wrapper – The tx2.wrapper.Wrapper instance.

Returns

An HTML string with span-styled-highlights on each relevant word.
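
To show what span-styled highlighting can look like, the sketch below turns per-word scores into an HTML paragraph. It assumes the salience scores are already available as numbers between 0 and 1; how tx2 computes them through the wrapper is not covered here, and highlight_words and the red rgba styling are hypothetical.

```python
import html
from typing import List


def highlight_words(words: List[str], scores: List[float]) -> str:
    """Wrap each word in a span whose background opacity tracks its score."""
    spans = []
    for word, score in zip(words, scores):
        # Clamp to [0, 1] so the value is a valid CSS alpha.
        alpha = max(0.0, min(1.0, score))
        spans.append(
            f'<span style="background-color: rgba(255, 0, 0, {alpha:.2f})">'
            f"{html.escape(word)}</span>"
        )
    return "<p>" + " ".join(spans) + "</p>"
```

Words with higher scores get a more opaque background, so the most salient words stand out when the string is displayed in a notebook.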