Smart Batching Tutorial - Speed Up BERT Training!



Smart Batching is a combination of two techniques: "dynamic padding" and "uniform length batching." Both are about reducing the number of [PAD] tokens we have to add to our text, which speeds up training and doesn't appear to impact accuracy! I first learned about these techniques from Michaël Benesty in his excellent blog post: https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9e (a minimal sketch of the approach appears after the links below).

======== Code ========
Here is the Colab notebook from this video: https://colab.research.google.com/drive/1Er23iD96x_SzmRG8md1kVggbmz0su_Q5
Some key parts of my notebook come from the code Michaël shared here: https://gist.github.com/pommedeterresautee/1a334b665710bec9bb65965f662c94c8

======== More BERT ========
Looking for an in-depth walkthrough of how BERT works? Check out my new BERT eBook! https://www.chrismccormick.ai/offers/nstHFTrM
Or bundle it with all of my BERT-related content (current and planned): https://www.chrismccormick.ai/offers/cEGFXP2Z

==== Updates ====
Sign up to hear about new content on my blog and channel: https://www.chrismccormick.ai/subscribe
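To make the two techniques concrete, here is a minimal sketch in Python, assuming a Hugging Face BERT tokenizer. The function name make_smart_batches, the batch size, and the max_length value are illustrative choices, not taken from the notebook. The idea: sort the samples by token length so each batch holds similarly sized sequences (uniform length batching), then pad each batch only to that batch's own longest sequence (dynamic padding).

import random
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def make_smart_batches(sentences, batch_size=16):
    # Tokenize without padding so we know each sample's true length.
    encoded = [tokenizer.encode(s, truncation=True, max_length=128)
               for s in sentences]

    # Uniform length batching: sort by length so each batch contains
    # similarly sized samples, then slice into batches.
    encoded.sort(key=len)
    batches = [encoded[i:i + batch_size]
               for i in range(0, len(encoded), batch_size)]

    # Shuffle the *order of batches* (not individual samples) to keep
    # some randomness between epochs while preserving uniform lengths.
    random.shuffle(batches)

    # Dynamic padding: pad each batch only to its own max length,
    # rather than to a global maximum for the whole dataset.
    padded_batches = []
    for batch in batches:
        max_len = max(len(ids) for ids in batch)
        input_ids = torch.tensor(
            [ids + [tokenizer.pad_token_id] * (max_len - len(ids))
             for ids in batch])
        attention_mask = (input_ids != tokenizer.pad_token_id).long()
        padded_batches.append((input_ids, attention_mask))
    return padded_batches

if __name__ == "__main__":
    demo = ["short text", "a somewhat longer example sentence", "tiny"]
    for ids, mask in make_smart_batches(demo, batch_size=2):
        print(ids.shape)  # each batch is padded only to its own max length

Because batches are formed from length-sorted samples, a batch of short sentences never pays the padding cost of the dataset's longest sequence, which is where the training speedup comes from.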

Comments:
  1. Thank you for your amazing videos. This idea (uniform batching) is very similar to how RNNs/LSTMs were being trained for language modeling (for example AWD-LSTM and ULMFIT implement a similar thing) to speed things up. Come to think of it, a number of ideas that were explored in LSTMs might work well with Transformers too.
