
CTVsuggestTrain carries out the model building and creates the data.frame of classification probabilities that the CTVsuggest package outputs.
These R packages are based on follow-up work to my 4th year university dissertation, supervised by Ioannis Kosmidis.

The CTVsuggestTrain R package has a single exported function, Train_model(), which constructs features and trains a multinomial logistic regression model to classify CRAN packages into the available CRAN Task Views. For a more detailed description of the model, see the Model section of the CTVsuggest Overview vignette.
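To make "multinomial logistic regression" and "classification probabilities" concrete, here is a minimal generic sketch. It is not the code inside Train_model(): it assumes the nnet package (shipped with standard R installations) and uses the built-in iris data as a stand-in, whereas the features and fitting routine that CTVsuggestTrain actually uses are described in the vignette.

```r
# Generic illustration only: iris stands in for the package features,
# and nnet::multinom is an assumed fitting routine, not necessarily
# the one used by Train_model().
library(nnet)

# Fit a multinomial logistic regression with a three-class response.
fit <- multinom(Species ~ Sepal.Length + Sepal.Width,
                data = iris, trace = FALSE)

# One row per observation, one column per class; each row sums to 1.
probs <- predict(fit, newdata = iris, type = "probs")

# Store the probabilities as a data.frame, analogous in shape to a
# table of per-class probabilities for each package.
prob_df <- as.data.frame(probs)
head(prob_df)
```

Each row of prob_df holds the estimated probability of every class for one observation, which is the same shape of output as a table of Task View probabilities per package.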

Note that you can ignore the CTVsuggestTrain package entirely if you only want suggestions from the CTVsuggest package. I use CTVsuggestTrain to retrain the model weekly so that the predictions provided by CTVsuggest stay up to date. Packaging the code makes model training easier for me to carry out and makes the model building transparent for others to inspect.

For further detail on the workflow, see the Packages Workflow section of the CTVsuggest Overview vignette.

Installation

You can install the development version of CTVsuggestTrain from GitHub with:

# install.packages("devtools")
devtools::install_github("DylanDijk/CTVsuggestTrain")

Example

The following code saves the model, the model accuracy, and the data.frame of package classification probabilities to an "OUTPUT" directory in your current working directory.

library(CTVsuggestTrain)
Train_model(save_output = TRUE, save_path = "OUTPUT/")

The code above is what I run to retrieve an up-to-date model. Train_model() takes a while to run: on my machine (Windows, Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz, 2112 MHz, 4 cores, 8 logical processors) it takes 30 minutes.
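Because the run is long, it can be useful to time it and check what was written afterwards. The sketch below is a hypothetical wrapper around the documented call. Only Train_model(), save_output and save_path come from the example above. Creating the directory beforehand is a precaution based on the assumption that Train_model() may not create it, and the names of the saved files are not assumed.

```r
# Hypothetical wrapper: times the documented call and lists the results.
out_dir <- "OUTPUT"

# Assumption: create the directory up front in case Train_model() does not.
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

if (requireNamespace("CTVsuggestTrain", quietly = TRUE)) {
  # system.time() returns user, system and elapsed (wall-clock) seconds.
  timing <- system.time(
    CTVsuggestTrain::Train_model(save_output = TRUE,
                                 save_path = paste0(out_dir, "/"))
  )
  message("Elapsed minutes: ", round(timing[["elapsed"]] / 60, 1))

  # List whatever files the run wrote, without assuming their names.
  print(list.files(out_dir))
} else {
  message("CTVsuggestTrain is not installed; skipping the training run.")
}
```

The requireNamespace() guard lets the script be sourced on a machine without the package installed, in which case it only prints a message.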