Smart Batching Tutorial - Speed Up BERT Training!
Smart Batching is a combination of two techniques: "Dynamic Padding" and "Uniform Length Batching". Both have to do with reducing the number of `[PAD]` tokens we have to add to our text, which speeds up training and doesn't appear to hurt accuracy! (An illustrative code sketch follows after the links below.) I first learned about these techniques from Michaël Benesty in his excellent blog post: https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9e

======== Code ========
Here is the Colab Notebook from this video: https://colab.research.google.com/drive/1Er23iD96x_SzmRG8md1kVggbmz0su_Q5
Some key pieces of my Notebook came from the code Michaël shared here: https://gist.github.com/pommedeterresautee/1a334b665710bec9bb65965f662c94c8

======== More BERT ========
Looking for a deeper dive into how BERT works? Check out my new BERT eBook! https://www.chrismccormick.ai/offers/nstHFTrM
Or bundle it with all of my BERT-related content (current and planned): https://www.chrismccormick.ai/offers/cEGFXP2Z

==== Updates ====
Sign up to hear about new content on my blog and channel: https://www.chrismccormick.ai/subscribe
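As a reference point, here is a minimal Python sketch of the two techniques combined. It assumes a Hugging Face `BertTokenizer`; the function and variable names (`make_smart_batches`, etc.) are illustrative and not taken from the Colab notebook or Michaël's gist, and the batching logic is a simplification rather than the exact approach used in the video.

```python
# A minimal sketch of "smart batching": uniform length batching
# (group similar-length samples together) plus dynamic padding
# (pad each batch only to its own longest sequence).
import random
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def make_smart_batches(texts, batch_size=32):
    # Tokenize without padding so we know each sample's true length.
    encoded = [tokenizer.encode(t, truncation=True, max_length=512)
               for t in texts]

    # Uniform length batching: sort by length so each batch contains
    # similarly-sized samples, minimizing wasted [PAD] tokens.
    encoded.sort(key=len)

    batches = []
    for i in range(0, len(encoded), batch_size):
        batch = encoded[i:i + batch_size]

        # Dynamic padding: pad only to the longest sample *in this batch*,
        # not to a global maximum sequence length.
        max_len = max(len(ids) for ids in batch)
        input_ids = [ids + [tokenizer.pad_token_id] * (max_len - len(ids))
                     for ids in batch]
        attention_mask = [[1] * len(ids) + [0] * (max_len - len(ids))
                          for ids in batch]

        batches.append((torch.tensor(input_ids),
                        torch.tensor(attention_mask)))

    # Shuffle the batch order so training still sees a random sequence
    # of (internally similar-length) batches each epoch.
    random.shuffle(batches)
    return batches
```

In a real training loop you would re-build (or re-shuffle) these batches each epoch and carry the labels through alongside the input IDs; the point of the sketch is just that every batch is padded only to its own longest sequence.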

3 comments
  1. Thank you for your amazing videos. This idea (uniform batching) is very similar to how RNNs/LSTMs were being trained for language modeling (for example AWD-LSTM and ULMFIT implement a similar thing) to speed things up. Come to think of it, a number of ideas that were explored in LSTMs might work well with Transformers too.
