Grab data from TensorFlow Speech Commands (2.3 GB):
commands_path = "SPEECHCOMMANDS"
audio_files = get_audio_files(commands_path)
length(audio_files$items)
# [1] 105835
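A quick sanity check on a listing like this is to count clips per class. The sketch below uses only base R and assumes each clip sits in a folder named after its spoken word (the layout that labelling by parent folder relies on). The helper `count_by_label` and the example paths are hypothetical, made up for illustration.

```r
# Hypothetical helper: count files per label, where the label is taken
# to be the name of the folder that directly contains each file.
count_by_label <- function(paths) {
  labels <- basename(dirname(paths))
  table(labels)
}

# Made-up example paths, not real files from the dataset
paths <- c("SPEECHCOMMANDS/yes/a.wav",
           "SPEECHCOMMANDS/yes/b.wav",
           "SPEECHCOMMANDS/no/c.wav")
counts <- count_by_label(paths)
print(counts)
```

With the real data, the same function could be applied to a character vector of the file paths returned by the listing step.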
Prepare the dataset and load it into a data loader:
DBMelSpec = SpectrogramTransformer(mel=TRUE, to_db=TRUE)
a2s = DBMelSpec()
crop_4000ms = ResizeSignal(4000)
tfms = list(crop_4000ms, a2s)
auds = DataBlock(blocks = list(AudioBlock(), CategoryBlock()),
get_items = get_audio_files,
splitter = RandomSplitter(),
item_tfms = tfms,
get_y = parent_label)
audio_dbunch = auds %>% dataloaders(commands_path, item_tfms = tfms, bs = 20)
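To make the `ResizeSignal(4000)` step concrete: it brings every clip to a fixed duration of 4000 ms so that batches have a uniform shape. The base R sketch below illustrates the idea on a plain numeric vector. It is a simplified stand-in, not the library's implementation: the function name `resize_signal` is hypothetical, the 16 kHz sample rate is an assumption about the dataset, and the sketch always crops from the start and pads zeros at the end, whereas the library may choose offsets differently.

```r
# Simplified illustration of fixed-length resizing (not the fastai code).
# sig: numeric vector of samples; duration_ms: target length in milliseconds.
resize_signal <- function(sig, duration_ms, sample_rate = 16000) {
  target <- as.integer(round(duration_ms / 1000 * sample_rate))
  n <- length(sig)
  if (n >= target) {
    sig[seq_len(target)]           # crop: keep the first `target` samples
  } else {
    c(sig, numeric(target - n))    # pad: append zeros up to `target`
  }
}

one_second <- rep(1, 16000)               # a 1 s clip at 16 kHz
resized <- resize_signal(one_second, 4000)
length(resized)                           # 64000 samples = 4 s at 16 kHz
```

Under these assumptions a one-second clip ends up mostly padding, which is worth keeping in mind when choosing the target duration.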
Inspect a batch:
audio_dbunch %>% show_batch(figsize = c(15, 8.5), nrows = 3, ncols = 3, max_n = 9, dpi = 180)
Before fitting, change the model's first convolution from 3 input channels to 1:
torch = torch()
nn = nn()
learn = Learner(audio_dbunch, xresnet18(pretrained = FALSE), nn$CrossEntropyLoss(), metrics=accuracy)
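# set the first conv layer's input channels from 3 to 1
learn$model[0][0][['in_channels']] %f% 1L
# keep one input channel of the existing weights (Python-style index 1),
# then restore the channel dimension with unsqueeze
new_weight_shape = torch$nn$parameter$Parameter(
(learn$model[0][0]$weight %>% narrow('[:,1,:,:]'))$unsqueeze(1L))
# assign the single-channel weights back to the layer with %f%
learn$model[0][0][['weight']] %f% new_weight_shape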
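The weight surgery above can be hard to picture, so here is the same shape change on a plain base R array. This is only an analogy and not the torch code: the weight shape `c(32, 3, 3, 3)` (output channels, input channels, kernel height, kernel width) is an assumption about the first convolution, and the values are random. Note that the Python-style index `1` in `'[:,1,:,:]'` is zero-based, so it corresponds to index `2` in R.

```r
set.seed(1)

# Assumed shape of the first conv weights: (out, in, kh, kw)
w <- array(rnorm(32 * 3 * 3 * 3), dim = c(32, 3, 3, 3))

# Keep only the second input channel; drop = FALSE keeps the channel
# dimension, which plays the role of narrow(...) followed by unsqueeze(1)
w1 <- w[, 2, , , drop = FALSE]

dim(w)   # 32 3 3 3
dim(w1)  # 32 1 3 3
```

The result has the layout a convolution with a single input channel expects, which is why the layer's `in_channels` is set to 1 alongside the new weights.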
Weights and biases can be saved and visualized on wandb.ai:
wandb: Currently logged in as: henry090 (use `wandb login --relogin` to force relogin)
wandb: Tracking run with wandb version 0.10.8
wandb: Syncing run macabre-zombie-2
wandb: ⭐️ View project at https://wandb.ai/henry090/speech_recognition_from_R
wandb: 🚀 View run at https://wandb.ai/henry090/speech_recognition_from_R/runs/2sjw3juv
wandb: Run data is saved locally in wandb/run-20201030_224503-2sjw3juv
wandb: Run `wandb off` to turn off syncing.
Now we can train our model:
learn %>% fit_one_cycle(3, lr_max=slice(1e-2), cbs = list(WandbCallback()))
epoch train_loss valid_loss accuracy time
------ ----------- ----------- --------- -----
WandbCallback requires use of "SaveModelCallback" to log best model
0 0.590236 0.728817 0.787121 04:18
WandbCallback was not able to get prediction samples -> wandb.log must be passed a dictionary
1 0.288492 0.310335 0.908490 04:19
2 0.182899 0.196792 0.941088 04:10
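For readers curious how the learning rate moves during `fit_one_cycle`, the sketch below reproduces the general shape of a one-cycle schedule in base R: a cosine warm-up to `lr_max`, followed by a cosine decay to a very small value. The `lr_max = 1e-2` comes from the call above. The other values (`div = 25`, `div_final = 1e5`, `pct_start = 0.25`) are assumptions based on my understanding of fastai's defaults, and the function names are hypothetical, so treat this as an illustration and not the library's code.

```r
# Cosine interpolation from `start` to `end` as pos goes from 0 to 1
cos_anneal <- function(start, end, pos) {
  start + (1 + cos(pi * (1 - pos))) * (end - start) / 2
}

# pos is the fraction of training completed, between 0 and 1.
# Assumed defaults: div, div_final and pct_start are not taken from the text.
one_cycle_lr <- function(pos, lr_max = 1e-2, div = 25,
                         div_final = 1e5, pct_start = 0.25) {
  ifelse(pos < pct_start,
         # warm-up: lr_max / div -> lr_max
         cos_anneal(lr_max / div, lr_max, pos / pct_start),
         # decay: lr_max -> lr_max / div_final
         cos_anneal(lr_max, lr_max / div_final,
                    (pos - pct_start) / (1 - pct_start)))
}

# Learning rate at the start, at the peak, and at the end of training
lrs <- one_cycle_lr(c(0, 0.25, 1))
print(lrs)
```

Under these assumptions the rate starts at 4e-4, peaks at 1e-2 a quarter of the way through training, and ends near 1e-7.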
View the dashboard for this run here:
https://wandb.ai/henry090/speech_recognition_from_R/runs/2sjw3juv?workspace=user-henry090