Make the print method a bit more informative about the dictionary #20

Open
mattsq opened this issue Nov 8, 2020 · 1 comment

mattsq commented Nov 8, 2020

Currently we print the first 5 words by column position. When there are fewer than five words, that produces output like word, word, word, word, NA, which isn't great. It's also not necessarily informative. Something more like the top 5 words by class might be better:

"The top 5 words by class are:
Class %class1: aaa, aba, bba
Class %class2: bba, bbb, aaa
..."

This'll help eyeball whether the representation actually achieves good class separation or not.
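
A rough sketch of what that summary could look like, written in Python purely for illustration. It assumes the dictionary can be flattened into parallel sequences of words and class labels; `top_words_by_class` and `format_summary` are hypothetical names, not functions from this package. Words are ranked by how often they occur within each class, with ties broken alphabetically, and a class with fewer than `n` distinct words just gets a shorter list rather than NA padding.

```python
from collections import Counter


def top_words_by_class(words, labels, n=5):
    """Return {class_label: up to n most frequent words in that class}.

    Hypothetical helper: `words` and `labels` are assumed to be
    parallel sequences (one class label per word occurrence).
    """
    counts = {}
    for word, label in zip(words, labels):
        counts.setdefault(label, Counter())[word] += 1

    top = {}
    for label, counter in counts.items():
        # Sort by descending count, then alphabetically so ties are stable.
        ranked = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
        # Slicing never pads: fewer than n words yields a shorter list.
        top[label] = [word for word, _ in ranked[:n]]
    return top


def format_summary(top, n=5):
    """Build the printable summary, one line per class."""
    lines = [f"The top {n} words by class are:"]
    for label, class_words in top.items():
        lines.append(f"Class {label}: {', '.join(class_words)}")
    return "\n".join(lines)


words = ["aaa", "aaa", "aba", "bba", "bbb", "bba"]
labels = ["class1", "class1", "class1", "class2", "class2", "class2"]
summary = format_summary(top_words_by_class(words, labels))
print(summary)
```

Whether raw frequency is the right ranking is an open question; a measure of how strongly a word discriminates between classes might say more about separation, but that is beyond this sketch.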

mattsq commented Nov 21, 2020

Added a few commits to at least fix the NA problem and to return the words sorted. Not sure we actually want to do the by-class one here (maybe for a plot method?), but I'll leave this here for the moment.
