Handles everything you need to assemble a mini-batch of inputs and targets, as well as decode the dictionary produced as a byproduct of the tokenization process in the `encodes` method.
HF_CausalLMBeforeBatchTransform(
  hf_arch,
  hf_tokenizer,
  max_length = NULL,
  padding = TRUE,
  truncation = TRUE,
  is_split_into_words = FALSE,
  n_tok_inps = 1,
  ignore_token_id = -100,
  ...
)
- `hf_arch`: the Hugging Face architecture name (e.g. `"gpt2"`)
- `hf_tokenizer`: the Hugging Face tokenizer
- `max_length`: maximum length to pad/truncate sequences to
- `padding`: whether to pad sequences
- `truncation`: whether to truncate sequences to `max_length`
- `is_split_into_words`: whether the inputs are already split into words (pre-tokenized)
- `n_tok_inps`: the number of tokenizer inputs
- `ignore_token_id`: the token id used to mark target positions the loss should ignore
- `...`: additional arguments
Value: None
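A minimal usage sketch follows. It assumes `hf_arch` and `hf_tokenizer` have already been obtained from a pretrained causal language model checkpoint (for example via the package's Hugging Face helpers); the `max_length` value of 128 is arbitrary and purely illustrative:

```r
library(fastai)

# Assumed to exist already: `hf_arch` (architecture name, e.g. "gpt2") and
# `hf_tokenizer` (the matching Hugging Face tokenizer object).
before_batch_tfm <- HF_CausalLMBeforeBatchTransform(
  hf_arch,
  hf_tokenizer,
  max_length = 128,       # pad/truncate every sequence to 128 tokens
  padding = TRUE,         # pad shorter sequences
  truncation = TRUE,      # truncate longer ones
  ignore_token_id = -100  # target ids to skip when computing the loss
)
```

The resulting transform runs before each mini-batch is collated, so tokenization, padding, and target masking all happen in one place.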