Smart Batching Tutorial - Speed Up BERT Training!
Smart Batching is a combination of two techniques: "Dynamic Padding" and "Uniform Length Batching". Both have to do with reducing the number of `[PAD]` tokens we have to add to our text, which speeds up training and doesn't appear to hurt accuracy! (An illustrative code sketch follows after the links below.) I first learned about these techniques from Michaël Benesty in his excellent blog post: https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9e

======== Code ========
Here is the Colab Notebook from this video: https://colab.research.google.com/drive/1Er23iD96x_SzmRG8md1kVggbmz0su_Q5
Some key pieces of my Notebook came from the code Michaël shared here: https://gist.github.com/pommedeterresautee/1a334b665710bec9bb65965f662c94c8

======== More BERT ========
Looking for a deeper dive into how BERT works? Check out my new BERT eBook! https://www.chrismccormick.ai/offers/nstHFTrM
Or bundle it with all of my BERT-related content (current and planned): https://www.chrismccormick.ai/offers/cEGFXP2Z

==== Updates ====
Sign up to hear about new content on my blog and channel: https://www.chrismccormick.ai/subscribe
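As a reference point, here is a minimal Python sketch of the two techniques combined. It assumes a Hugging Face `BertTokenizer`; the function and variable names (`make_smart_batches`, etc.) are illustrative and not taken from the Colab notebook or Michaël's gist, and the batching logic is a simplification rather than the exact approach used in the video.

```python
# A minimal sketch of "smart batching": uniform length batching
# (group similar-length samples together) plus dynamic padding
# (pad each batch only to its own longest sequence).
import random
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def make_smart_batches(texts, batch_size=32):
    # Tokenize without padding so we know each sample's true length.
    encoded = [tokenizer.encode(t, truncation=True, max_length=512)
               for t in texts]

    # Uniform length batching: sort by length so each batch contains
    # similarly-sized samples, minimizing wasted [PAD] tokens.
    encoded.sort(key=len)

    batches = []
    for i in range(0, len(encoded), batch_size):
        batch = encoded[i:i + batch_size]

        # Dynamic padding: pad only to the longest sample *in this batch*,
        # not to a global maximum sequence length.
        max_len = max(len(ids) for ids in batch)
        input_ids = [ids + [tokenizer.pad_token_id] * (max_len - len(ids))
                     for ids in batch]
        attention_mask = [[1] * len(ids) + [0] * (max_len - len(ids))
                          for ids in batch]

        batches.append((torch.tensor(input_ids),
                        torch.tensor(attention_mask)))

    # Shuffle the batch order so training still sees a random sequence
    # of (internally similar-length) batches each epoch.
    random.shuffle(batches)
    return batches
```

In a real training loop you would re-build (or re-shuffle) these batches each epoch and carry the labels through alongside the input IDs; the point of the sketch is just that every batch is padded only to its own longest sequence.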

3 comments
  1. Thank you for your amazing videos. This idea (uniform batching) is very similar to how RNNs/LSTMs were being trained for language modeling (for example AWD-LSTM and ULMFIT implement a similar thing) to speed things up. Come to think of it, a number of ideas that were explored in LSTMs might work well with Transformers too.
