Make the print method a bit more informative about the dictionary #20

mattsq · 2020-11-08T02:44:31Z

Currently we print the first 5 words by column position, which results in `word, word, word, word, NA` when there are fewer than five words. That isn't great, and the first five columns aren't necessarily informative anyway. Something more like the top 5 by class might be better:

"The top 5 words by class are:
Class %class1: aaa, aba, bba
Class %class2: bba, bbb, aaa
..."

This'll help eyeball whether the representation actually achieves good class separation or not.
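
For illustration only, here is a minimal sketch of how such a by-class summary could be computed. It is written in Python as a language-neutral example and is not the package's implementation. The function names `top_words_by_class` and `format_summary`, and the assumption that each sample is available as a `{word: count}` mapping with a parallel list of class labels, are hypothetical and do not come from this thread.

```python
from collections import Counter, defaultdict


def top_words_by_class(word_counts, labels, n=5):
    """Return the n most frequent words for each class label.

    word_counts: one {word: count} dict per sample (hypothetical layout).
    labels: the class label of each sample, in the same order.
    """
    totals = defaultdict(Counter)
    for counts, label in zip(word_counts, labels):
        # Add this sample's word counts to the running total for its class.
        totals[label].update(counts)
    # most_common(n) returns at most n entries, so a class with fewer than
    # n words yields a shorter list instead of padded missing values.
    return {
        label: [word for word, _ in counter.most_common(n)]
        for label, counter in totals.items()
    }


def format_summary(top_words, n=5):
    """Build the text that a print method could emit."""
    lines = [f"The top {n} words by class are:"]
    for label in sorted(top_words):
        lines.append(f"Class {label}: {', '.join(top_words[label])}")
    return "\n".join(lines)


# Toy data: two samples of class1, one sample of class2.
word_counts = [
    {"aaa": 3, "aba": 2, "bba": 1},
    {"aaa": 2, "aba": 2},
    {"bba": 4, "bbb": 3, "aaa": 1},
]
labels = ["class1", "class1", "class2"]
print(format_summary(top_words_by_class(word_counts, labels)))
```

Ranking by summed counts is just one possible choice here. A score that rewards words frequent in one class but rare in the others might show class separation more directly.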

mattsq · 2020-11-21T03:38:05Z

Added a few commits to at least fix the NA problem and to return the words sorted. Not sure we actually want the by-class version in the print method (maybe for a plot method?), but I'll leave this open for the moment.
