Smart Batching Tutorial - Speed Up BERT Training!



Smart Batching is a combination of two techniques: "dynamic padding" and "uniform length batching." Both are about reducing the number of [PAD] tokens we have to add to our text, which speeds up training and doesn't appear to impact accuracy! I first learned about these techniques from Michaël Benesty in his excellent blog post: https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9e (a minimal sketch of the approach appears after the links below).

======== Code ========
Here is the Colab notebook from this video: https://colab.research.google.com/drive/1Er23iD96x_SzmRG8md1kVggbmz0su_Q5
Some key parts of my notebook come from the code Michaël shared here: https://gist.github.com/pommedeterresautee/1a334b665710bec9bb65965f662c94c8

======== More BERT ========
Looking for an in-depth walkthrough of how BERT works? Check out my new BERT eBook! https://www.chrismccormick.ai/offers/nstHFTrM
Or bundle it with all of my BERT-related content (current and planned): https://www.chrismccormick.ai/offers/cEGFXP2Z

==== Updates ====
Sign up to hear about new content on my blog and channel: https://www.chrismccormick.ai/subscribe
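To make the two techniques concrete, here is a minimal sketch in Python, assuming a Hugging Face BERT tokenizer. The function name make_smart_batches, the batch size, and the max_length value are illustrative choices, not taken from the notebook. The idea: sort the samples by token length so each batch holds similarly sized sequences (uniform length batching), then pad each batch only to that batch's own longest sequence (dynamic padding).

import random
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def make_smart_batches(sentences, batch_size=16):
    # Tokenize without padding so we know each sample's true length.
    encoded = [tokenizer.encode(s, truncation=True, max_length=128)
               for s in sentences]

    # Uniform length batching: sort by length so each batch contains
    # similarly sized samples, then slice into batches.
    encoded.sort(key=len)
    batches = [encoded[i:i + batch_size]
               for i in range(0, len(encoded), batch_size)]

    # Shuffle the *order of batches* (not individual samples) to keep
    # some randomness between epochs while preserving uniform lengths.
    random.shuffle(batches)

    # Dynamic padding: pad each batch only to its own max length,
    # rather than to a global maximum for the whole dataset.
    padded_batches = []
    for batch in batches:
        max_len = max(len(ids) for ids in batch)
        input_ids = torch.tensor(
            [ids + [tokenizer.pad_token_id] * (max_len - len(ids))
             for ids in batch])
        attention_mask = (input_ids != tokenizer.pad_token_id).long()
        padded_batches.append((input_ids, attention_mask))
    return padded_batches

if __name__ == "__main__":
    demo = ["short text", "a somewhat longer example sentence", "tiny"]
    for ids, mask in make_smart_batches(demo, batch_size=2):
        print(ids.shape)  # each batch is padded only to its own max length

Because batches are formed from length-sorted samples, a batch of short sentences never pays the padding cost of the dataset's longest sequence, which is where the training speedup comes from.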

Comments:
  1. Thank you for your amazing videos. This idea (uniform batching) is very similar to how RNNs/LSTMs were being trained for language modeling (for example AWD-LSTM and ULMFIT implement a similar thing) to speed things up. Come to think of it, a number of ideas that were explored in LSTMs might work well with Transformers too.
