CollabDataLoaders_from_df — CollabDataLoaders_from

Create a `DataLoaders` suitable for collaborative filtering from `ratings`.

CollabDataLoaders_from_df(
  ratings,
  valid_pct = 0.2,
  user_name = NULL,
  item_name = NULL,
  rating_name = NULL,
  seed = NULL,
  path = ".",
  bs = 64,
  val_bs = NULL,
  shuffle_train = TRUE,
  device = NULL
)

Arguments

ratings: A data frame of ratings with user, item, and rating columns
valid_pct: The fraction of the dataset randomly set aside for validation (reproducible when seed is set)
user_name: The name of the column containing the user (defaults to the first column)
item_name: The name of the column containing the item (defaults to the second column)
rating_name: The name of the column containing the rating (defaults to the third column)
seed: Random seed for the validation split
path: The working folder
bs: The batch size
val_bs: The batch size for the validation DataLoader (defaults to bs)
shuffle_train: Whether to shuffle the training DataLoader
device: The device to use, e.g. cpu or cuda

Value

A `DataLoaders` object for collaborative filtering.

Examples


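if (FALSE) {

library(fastai)
library(data.table)  # provides fread() and the join syntax used below
library(zeallot)     # provides the %<-% multi-assignment operator

# Download the MovieLens 100k dataset
URLs_MOVIE_LENS_ML_100k()
c(user, item, title) %<-% list('userId', 'movieId', 'title')
# Ratings: one row per (user, item, rating, timestamp)
ratings = fread('ml-100k/u.data', col.names = c(user, item, 'rating', 'timestamp'))
# Movie metadata: five descriptive columns followed by 19 genre columns
movies = fread('ml-100k/u.item', col.names = c(item, 'title', 'date', 'N', 'url',
                                               paste('g', 1:19, sep = '')))
# Attach each movie's title to its ratings
rating_movie = ratings[movies[, .SD, .SDcols = c(item, title)], on = item]
# Use the title column as the item and hold out 10% of rows for validation
dls = CollabDataLoaders_from_df(rating_movie, seed = 42, valid_pct = 0.1, bs = 64,
                                item_name = title, path = 'ml-100k')

}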
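
The column defaults described under Arguments can be illustrated with a small hand-built table. This is only a minimal sketch: `toy_ratings` and its values are hypothetical, and the calls inside `if (FALSE)` assume the fastai R package and its Python backend are installed. Under the documented defaults, the two calls should select the same user, item, and rating columns.

```r
# Hypothetical toy ratings table. Column order matters for the defaults:
# first column = user, second = item, third = rating.
toy_ratings <- data.frame(
  userId  = c(1, 1, 2, 2, 3, 3, 4, 4),
  movieId = c(10, 20, 10, 30, 20, 30, 10, 20),
  rating  = c(4, 5, 3, 2, 5, 4, 1, 3)
)

if (FALSE) {
  library(fastai)

  # Rely on the positional defaults for user_name, item_name, rating_name
  dls_default <- CollabDataLoaders_from_df(toy_ratings, valid_pct = 0.25,
                                           seed = 42, bs = 2)

  # Name the columns explicitly; same seed, so the same split is expected
  dls_named <- CollabDataLoaders_from_df(toy_ratings, valid_pct = 0.25,
                                         seed = 42, bs = 2,
                                         user_name = 'userId',
                                         item_name = 'movieId',
                                         rating_name = 'rating')
}
```

Naming the columns explicitly is the safer choice when the data frame carries extra columns (such as a timestamp) or when the column order is not guaranteed.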