##########################################################################################################################
#
# PROBLEMS
#
##########################################################################################################################
#
# - Build a corpus from the documents in the "earn" directory
#
# - Load the document classes from "earn-topics.txt"
#
# - Construct the data set and split it into training (70% of instances) and test (30% of instances) subsets
#
# - Train several models (KNN, SVM, ANN, ...) and evaluate how accurately they classify the test documents
#
##########################################################################################################################
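#
# A minimal sketch of the split-and-evaluate steps listed above, assuming Python and
# only its standard library. The exercise does not prescribe a language or a library,
# so every name here (split_train_test, bag_of_words, knn_predict, accuracy) is
# hypothetical. Instances are assumed to be (text, label) pairs; reading the "earn"
# directory and "earn-topics.txt" is left out because their layout is not given.
# KNN with cosine similarity stands in for the model list; SVM and ANN are not shown.
#

```python
import math
import random
from collections import Counter


def split_train_test(instances, train_fraction=0.7, seed=0):
    """Shuffle a copy of the instances and cut it into train and test lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def bag_of_words(text):
    """Lowercase, whitespace-tokenized term counts (deliberately naive)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, text, k=3):
    """Majority label among the k training documents most similar to text."""
    query = bag_of_words(text)
    ranked = sorted(
        train,
        key=lambda item: cosine(bag_of_words(item[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test documents whose predicted label matches the true one."""
    if not test:
        return 0.0
    hits = sum(knn_predict(train, text, k) == label for text, label in test)
    return hits / len(test)
```

# With instances loaded as (text, label) pairs, a run would look like
# train, test = split_train_test(instances) followed by accuracy(train, test, k=3).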


#######################################
#
# - Visualize a low-dimensional representation of the corpus with respect to the target classes.
# What do you observe?
#
#######################################

#######################################
#
# - Generate word clouds using different pre-processing techniques. Explain how and why the result changes.
#
#######################################
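#
# Word size in a word cloud is driven by term frequency, so the effect of each
# pre-processing step can be seen by comparing frequency tables before any plotting.
# The sketch below assumes Python and its standard library; the function name, its
# flags, and the tiny stop-word list are hypothetical and not part of the exercise.
# Drawing the cloud itself is left to whichever plotting package the course uses.
#

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "said"}


def term_frequencies(texts, lowercase=True, strip_punctuation=True,
                     remove_stopwords=True):
    """Count terms across texts with each pre-processing step switchable."""
    counts = Counter()
    for text in texts:
        if lowercase:
            text = text.lower()
        if strip_punctuation:
            # Replace anything that is neither a word character nor whitespace.
            text = re.sub(r"[^\w\s]", " ", text)
        tokens = text.split()
        if remove_stopwords:
            tokens = [t for t in tokens if t.lower() not in STOPWORDS]
        counts.update(tokens)
    return counts
```

# Comparing term_frequencies(texts, False, False, False).most_common(10) with
# term_frequencies(texts).most_common(10) shows which terms would dominate each cloud.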