###
# Problem: using the mushroom.txt data set, compare how ensemble classifiers
# perform against simple decision tree models. How big is the difference in
# performance? What do you observe in terms of learning times? Predict the
# gill.size attribute.
# Visualize the tuples (model, time spent) as a barplot.
# (HINT: you can use Sys.time() to create checkpoints and measure time.)
# (HINT: some models have problems with redundant columns. Consider the
# following command (example on the vehicle dataset):
# vehicle <- vehicle[, -which(lengths(lapply(vehicle, unique)) == 1)])
# A hedged sketch for this problem follows the problem statements below.
###

###
# Problem: create a k-fold cross-validation function that can be used without
# importing external libraries.
# The function should work in the following steps:
# - Split the dataset (provided as a function parameter) into k wholly
#   distinct subsets.
# - Train and test a model k times. Each time, use one subset as the testing
#   set and the combination of all other subsets as the training set.
# - Each subset should be used as a testing set exactly once.
# - Save the classification accuracy of each model trained in this way and
#   return the average as the result of the function.
#
# A similar approach can be used to manually implement bagging:
# Training:
# - Instead of splitting the dataset, create M datasets by sampling (with
#   replacement) from the original dataset (use the sample function with
#   replace = TRUE).
# - The size and number of the resampled datasets should be determined by
#   function parameters.
# - Train and save a model on each of these samples and return the models as
#   the function result.
# Prediction:
# - Use each of the saved models to obtain a prediction on the test dataset
#   (using the predict function).
# - Use voting to combine the predictions into a final prediction.
# Sketches for both the cross-validation and bagging functions follow below.
###

###
# Problem: create various ensembles of logistic regression classifiers and
# evaluate their performance on the vehicles data set. Determine the best
# ensemble strategy by using the F1 measure. A sketch follows below.
###
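
# --- Sketch for Problem 1 ---
# A minimal, hedged sketch of the comparison. It assumes mushroom.txt is a
# comma-separated file with a header and a gill.size column, and that the
# rpart and randomForest packages are installed; adjust the read call and the
# set of models (e.g. add boosting via adabag) to match the actual setup.
library(rpart)
library(randomForest)

mushroom <- read.table("mushroom.txt", header = TRUE, sep = ",",
                       stringsAsFactors = TRUE)
# Drop constant columns (see the hint above); the guard avoids accidentally
# selecting zero columns when no constant column exists
const_cols <- which(lengths(lapply(mushroom, unique)) == 1)
if (length(const_cols) > 0) mushroom <- mushroom[, -const_cols]

set.seed(42)
idx   <- sample(nrow(mushroom), floor(0.7 * nrow(mushroom)))
train <- mushroom[idx, ]
test  <- mushroom[-idx, ]

# Train a model, measure elapsed learning time, and report test accuracy
evaluate <- function(name, fit_fun, predict_fun) {
  t0 <- Sys.time()                              # checkpoint before training
  model   <- fit_fun()
  elapsed <- as.numeric(difftime(Sys.time(), t0, units = "secs"))
  acc <- mean(predict_fun(model) == test$gill.size)
  list(name = name, time = elapsed, accuracy = acc)
}

results <- list(
  evaluate("decision tree",
           function() rpart(gill.size ~ ., data = train),
           function(m) predict(m, test, type = "class")),
  evaluate("random forest",
           function() randomForest(gill.size ~ ., data = train),
           function(m) predict(m, test))
)

# Barplot of the (model, time spent) tuples
times        <- sapply(results, `[[`, "time")
names(times) <- sapply(results, `[[`, "name")
barplot(times, ylab = "learning time [s]", main = "Learning times per model")
sapply(results, `[[`, "accuracy")               # compare accuracies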
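
# --- Sketch for Problem 2: k-fold cross-validation (base R only) ---
# A minimal sketch. The train_fun and predict_fun arguments are hypothetical
# hooks so the function stays model-agnostic; target names the class column.
cross_validate <- function(data, target, k = 10, train_fun, predict_fun) {
  # Shuffle the row indices, then cut them into k wholly distinct folds
  folds <- split(sample(nrow(data)), rep_len(1:k, nrow(data)))
  accuracies <- sapply(folds, function(test_idx) {
    train <- data[-test_idx, ]                  # all other folds for training
    test  <- data[test_idx, ]                   # this fold for testing
    model <- train_fun(train)
    mean(predict_fun(model, test) == test[[target]])
  })
  mean(accuracies)                              # average accuracy over k runs
}

# Example usage (rpart assumed installed):
# cross_validate(mushroom, "gill.size", k = 10,
#                train_fun   = function(tr) rpart(gill.size ~ ., data = tr),
#                predict_fun = function(m, te) predict(m, te, type = "class"))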
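
# --- Sketch for Problem 2: manual bagging ---
# A minimal sketch following the steps above; m (number of resampled datasets)
# and size (sample size) are parameter names chosen here for illustration.
bag_train <- function(data, train_fun, m = 10, size = nrow(data)) {
  lapply(1:m, function(i) {
    boot <- data[sample(nrow(data), size, replace = TRUE), ]  # bootstrap sample
    train_fun(boot)
  })
}

bag_predict <- function(models, test, predict_fun) {
  # One column of class predictions per model (rows correspond to test cases)
  votes <- sapply(models, function(m) as.character(predict_fun(m, test)))
  # Majority vote per test case decides the final prediction
  apply(votes, 1, function(v) names(which.max(table(v))))
}

# Example usage (rpart assumed installed):
# models <- bag_train(train, function(d) rpart(gill.size ~ ., data = d), m = 25)
# preds  <- bag_predict(models, test,
#                       function(m, te) predict(m, te, type = "class"))
# mean(preds == test$gill.size)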
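
# --- Sketch for Problem 3: logistic regression ensembles scored with F1 ---
# A hedged sketch of one possible strategy (bagging with probability
# averaging). It assumes a data frame vehicle with a binary factor column
# Class, e.g. the original class attribute binarized into "van" vs. "other";
# plain glm only handles two classes, so for the full multi-class problem one
# could swap in nnet::multinom and macro-average the per-class F1 scores.
f1_score <- function(pred, truth, positive) {
  tp <- sum(pred == positive & truth == positive)
  fp <- sum(pred == positive & truth != positive)
  fn <- sum(pred != positive & truth == positive)
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

logreg_bagging <- function(train, test, m = 15) {
  probs <- sapply(1:m, function(i) {
    boot  <- train[sample(nrow(train), replace = TRUE), ]
    model <- glm(Class ~ ., data = boot, family = binomial)
    predict(model, test, type = "response")     # P(second factor level)
  })
  avg <- rowMeans(probs)                        # soft voting: average probabilities
  factor(ifelse(avg > 0.5, levels(train$Class)[2], levels(train$Class)[1]),
         levels = levels(train$Class))
}

# Compare strategies by their F1 score, e.g.:
# pred <- logreg_bagging(train, test)
# f1_score(pred, test$Class, positive = "van")
# Other strategies to try with the same skeleton: hard majority voting on
# predicted classes, or training each model on a random subset of columns.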