Create a `DataLoaders` suitable for collaborative filtering from `ratings`.

CollabDataLoaders_from_df(
  ratings,
  valid_pct = 0.2,
  user_name = NULL,
  item_name = NULL,
  rating_name = NULL,
  seed = NULL,
  path = ".",
  bs = 64,
  val_bs = NULL,
  shuffle_train = TRUE,
  device = NULL
)

Arguments

ratings

A data frame (or data.table) of ratings, with one column each for the user, the item, and the rating

valid_pct

The fraction of the dataset to set aside at random for validation (pass `seed` to make the split reproducible)

user_name

The name of the column containing the user (defaults to the first column)

item_name

The name of the column containing the item (defaults to the second column)

rating_name

The name of the column containing the rating (defaults to the third column)

seed

Random seed for the validation split

path

The folder to work in

bs

The batch size

val_bs

The batch size for the validation DataLoader (defaults to bs)

shuffle_train

Whether to shuffle the training DataLoader

device

The device to use, e.g. cpu or cuda

Value

A `DataLoaders` object

Examples


if (FALSE) {

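library(fastai)
library(data.table)  # fread() and the join syntax used below
library(zeallot)     # %<-% destructuring assignment

# Download the MovieLens 100k dataset
URLs_MOVIE_LENS_ML_100k()
c(user, item, title) %<-% list('userId', 'movieId', 'title')
ratings = fread('ml-100k/u.data', col.names = c(user, item, 'rating', 'timestamp'))
movies = fread('ml-100k/u.item', col.names = c(item, 'title', 'date', 'N', 'url',
                                               paste0('g', 1:19)))

# Attach the movie title to each rating by joining on the item id
rating_movie = ratings[movies[, .SD, .SDcols = c(item, title)], on = item]

# Hold out 10% of the ratings for validation; use the title as the item column
dls = CollabDataLoaders_from_df(rating_movie, seed = 42, valid_pct = 0.1, bs = 64,
                                item_name = title, path = 'ml-100k')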

}
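
As a rough illustration of what `valid_pct` and `seed` describe, the base R sketch below holds out a random fraction of a toy ratings table. It is not the library's internal implementation, and the toy data and the names `n_valid`, `valid_idx`, `train`, and `valid` are made up for this example. The columns are in the default order: user first, item second, rating third.

```r
# Toy ratings table in the default column order (user, item, rating).
# The values are invented for illustration only.
ratings <- data.frame(
  user   = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  item   = c(10, 11, 10, 12, 11, 12, 10, 13, 12, 13),
  rating = c(4, 5, 3, 2, 5, 4, 1, 3, 4, 5)
)

valid_pct <- 0.2
seed <- 42

# Fixing the seed makes the random split reproducible across runs.
set.seed(seed)
n_valid <- round(valid_pct * nrow(ratings))
valid_idx <- sample(nrow(ratings), n_valid)

# Rows drawn for validation are removed from the training set.
train <- ratings[-valid_idx, ]
valid <- ratings[valid_idx, ]

nrow(train)  # 8
nrow(valid)  # 2
```

With 10 rows and `valid_pct = 0.2`, two rows go to validation and eight stay in training. Rerunning with the same `seed` selects the same rows.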