Machine Learning in R: Speeding Up Model Building with Parallel Computing

Do you want to cut the time it takes to build a machine learning model? In this video, I show you how to speed up model building by using parallel computing.

Timeline
0:30 Launch RStudio or RStudio.cloud
0:34 Download the code from the Data Professor GitHub
1:08 Open the dhfr-parallel-speed-up.R file
1:20 1. Load the DHFR dataset
1:52 2. Check for missing values
1:56 3. Set the seed for a reproducible model
2:03 4. Split the data into 80/20 subsets
2:28 Timing our code
4:22 Let's use doParallel for parallel computing
5:46 Does parallel computing speed up hyperparameter tuning?
8:21 Closing remarks

Data: https://raw.githubusercontent.com/dataprofessor/data/master/dhfr.csv
Code: https://github.com/dataprofessor/code/blob/master/dhfr/dhfr-
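The steps in the timeline can be sketched roughly as follows. This is a minimal sketch, not the video's actual script: it assumes the caret, randomForest and doParallel packages are installed, that the outcome column in dhfr.csv is named `Y`, and that a random forest (`method = "rf"`) is the model being timed. The seed value is arbitrary, and the worker count of 5 is taken from the comments on this page.

```r
library(caret)       # createDataPartition(), train()
library(doParallel)  # registerDoParallel(); also attaches parallel and foreach

# 1. Load the DHFR dataset (URL from the video description)
dhfr <- read.csv(
  "https://raw.githubusercontent.com/dataprofessor/data/master/dhfr.csv",
  stringsAsFactors = TRUE
)

# 2. Check for missing values
sum(is.na(dhfr))

# 3. Set the seed so the split and the model are reproducible (value is arbitrary)
set.seed(100)

# 4. Split into 80% training / 20% testing, stratified on the outcome
#    (assumption: the outcome column is named Y)
idx      <- createDataPartition(dhfr$Y, p = 0.8, list = FALSE)
training <- dhfr[idx, ]
testing  <- dhfr[-idx, ]

# Time the model building without parallel computing
t_seq <- system.time(
  fit_seq <- train(Y ~ ., data = training, method = "rf")
)

# Time the same model building with a doParallel backend
cl <- makePSOCKcluster(5)   # 5 worker processes
registerDoParallel(cl)
t_par <- system.time(
  fit_par <- train(Y ~ ., data = training, method = "rf")
)
stopCluster(cl)
registerDoSEQ()             # switch foreach back to sequential execution

# Compare wall-clock times in seconds
c(sequential = t_seq[["elapsed"]], parallel = t_par[["elapsed"]])
```

caret's `train()` picks up whatever foreach backend is registered, so the only change between the two timed runs is the `registerDoParallel(cl)` call. How much faster the parallel run is depends on the machine and the number of resamples.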

6 comments

Data Professor says:

December 10, 2020 at 12:12 AM

QUESTION OF THE DAY: How much faster did this speed up your ML model building? Comments down below!
Help support this YouTube channel by hitting the Subscribe button, Like button and Comment down below!
Maksim 093 says:

December 10, 2020 at 12:12 AM

Is it possible to force RStudio to compute in parallel from the beginning of a session, without writing code just before a model?
Song Youk says:

December 10, 2020 at 12:12 AM

Questions: Is there any specific reason why you use the number 5 as the argument of makePSOCKcluster? And is there any kind of maximum limit for this value? If so, how can we know this maximum number? Last but not least… What is the definition of a cluster? Is it different from a core or a CPU? TensorFlow also uses parallel processing with the GPU. What is the difference between doParallel (R) and GPU parallelism (CUDA, Python)? Sorry for the many questions.
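On the worker-count question above: in the parallel package, a "cluster" is a set of worker R processes, which is not the same thing as a core or a CPU. The sketch below shows one common way to choose the number of workers from the machine's core count instead of hard-coding 5. It is an illustration under assumptions, not the video author's reasoning: leaving one core free is a convention, not a rule, and the variable names are made up for this example.

```r
library(doParallel)  # also attaches parallel and foreach

# Number of logical cores reported by the OS (may be NA on some platforms)
n_cores <- parallel::detectCores()

# Convention used here: leave one core free, but always start at least one worker
n_workers <- max(1, n_cores - 1, na.rm = TRUE)

cl <- makePSOCKcluster(n_workers)  # start n_workers background R processes
registerDoParallel(cl)             # make them the backend for %dopar%

getDoParWorkers()                  # how many workers foreach will use

# A trivial parallel loop to confirm the backend works
roots <- foreach(i = 1:4, .combine = c) %dopar% sqrt(i)

stopCluster(cl)
registerDoSEQ()                    # return to sequential execution
```

Asking for more workers than the machine has cores is allowed, but it rarely helps, because the extra processes just compete for the same cores.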
Someone Else says:

December 10, 2020 at 12:12 AM

Does RStudio have a VLOOKUP that works for large datasets? Excel's VLOOKUP only works for smaller datasets…
Niraj kandpal says:

December 10, 2020 at 12:12 AM

Note: I could not follow the code properly in RStudio on the cloud, so I followed this code in a Kaggle notebook. There was an issue with "stopCluster(cl)": after calling it, running the model again without parallel computing was not possible. This was solved using registerDoSEQ().
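The fix described in this comment can be sketched as follows. It is a minimal illustration that assumes the doParallel package is installed. The two-worker cluster and the toy loop are made up for the example and stand in for the model training in the video.

```r
library(doParallel)  # also attaches parallel and foreach

cl <- makePSOCKcluster(2)
registerDoParallel(cl)

# Runs on the two worker processes
par_res <- foreach(i = 1:3, .combine = c) %dopar% (i^2)

stopCluster(cl)
# The workers are gone, but foreach still has the stopped cluster registered.
# Registering the sequential backend makes later %dopar% loops (and caret
# models that rely on foreach) run in the current session again.
registerDoSEQ()

seq_res <- foreach(i = 1:3, .combine = c) %dopar% (i^2)
```

Calling `registerDoSEQ()` right after every `stopCluster(cl)` keeps the session usable whether or not a new cluster is started later.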
Shweta Redkar says:

December 10, 2020 at 12:12 AM

Cool. Informative tutorial and helpful. Do you work on protein-ligand interactions?
