Custom and Distributed Training with TensorFlow
In Custom and Distributed Training with TensorFlow, you learn about distributed training and how Azure Machine Learning supports it for deep learning models. In distributed training, the workload of training a model is split up and shared among multiple processors, called worker nodes, which work in parallel to speed up model training. Distributed training can be used for traditional ML models, but it is better suited to compute- and time-intensive tasks, such as training deep neural networks.
There are two main types of distributed training: data parallelism and model parallelism. For distributed training of deep learning models, the Azure Machine Learning SDK for Python supports integrations with the popular frameworks PyTorch and TensorFlow. Both frameworks employ data parallelism for distributed training and can leverage Horovod to optimize compute speeds; a minimal sketch of this pattern follows below.

Most Coursera courses are part of a specialization, a micro-credential that also includes a capstone project and is geared toward in-demand technology skills. Custom and Distributed Training with TensorFlow includes extensive course content, online videos, quizzes, capstone projects at every level, and virtual classes by some of the best educators in the industry.
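To make the data-parallel pattern concrete, here is a minimal sketch of a Horovod training step in TensorFlow. It assumes Horovod's TensorFlow bindings (horovod.tensorflow) are installed; the model, loss, and learning rate are placeholders rather than course code, and the script would be launched with a tool such as horovodrun so that one process runs per worker.

```python
import tensorflow as tf
import horovod.tensorflow as hvd

hvd.init()  # one process per worker

# Pin each process to one GPU (standard Horovod setup).
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.set_visible_devices(gpus[hvd.local_rank()], "GPU")

# Placeholder model and loss; any Keras model works the same way.
model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# Common convention: scale the learning rate by the number of workers.
optimizer = tf.keras.optimizers.SGD(0.01 * hvd.size())

@tf.function
def train_step(x, y, first_batch):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    # The data-parallel step: average gradients across all workers.
    tape = hvd.DistributedGradientTape(tape)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    if first_batch:
        # Start every worker from identical weights.
        hvd.broadcast_variables(model.variables, root_rank=0)
        hvd.broadcast_variables(optimizer.variables(), root_rank=0)
    return loss
```

Each worker computes gradients on its own shard of the data, and the distributed tape averages them across workers before the weight update.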
In this course, you will:
- Learn about Tensor objects, the fundamental building blocks of TensorFlow, understand the difference between eager and graph modes in TensorFlow, and learn how to use a TensorFlow tool to calculate gradients.
- Build your own custom training loops using GradientTape and TensorFlow Datasets to gain more flexibility and visibility into your model training (see the custom-loop sketch after this list).
- Learn about the benefits of generating code that runs in graph mode, take a peek at what graph code looks like, and practice generating this more efficient code automatically with TensorFlow's tools (see the tf.function example after this list).
- Harness the power of distributed training to process more data and train larger models faster; get an overview of various distributed training strategies; and practice working with one strategy that trains on multiple GPU cores and another that trains on multiple TPU cores (a MirroredStrategy sketch follows this list).
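As a taste of the custom-loop material, here is a minimal sketch of a training loop built with tf.GradientTape and tf.data; the tiny random dataset and one-layer model are placeholders, not course code, and TensorFlow Datasets (tfds) plugs into the same loop.

```python
import tensorflow as tf

# Toy data via tf.data; a tfds dataset would be iterated the same way.
xs = tf.random.normal((256, 4))
ys = tf.random.uniform((256,), maxval=3, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((xs, ys)).shuffle(256).batch(32)

model = tf.keras.Sequential([tf.keras.layers.Dense(3)])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

for epoch in range(3):
    for x_batch, y_batch in dataset:
        # GradientTape records the forward pass so gradients can be
        # computed with respect to the trainable variables afterward.
        with tf.GradientTape() as tape:
            logits = model(x_batch, training=True)
            loss = loss_fn(y_batch, logits)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
    print(f"epoch {epoch}: last-batch loss = {loss.numpy():.4f}")
```

Writing the loop yourself exposes every step that model.fit normally hides, which is exactly the flexibility and visibility the course highlights.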
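The graph-mode bullet refers to tf.function, which traces Python code into an optimized, reusable graph. A small illustration with a toy function of my own choosing:

```python
import tensorflow as tf

def square_sum(x):
    # Plain Python/TensorFlow function; runs op by op in eager mode.
    return tf.reduce_sum(x * x)

# Tracing the function turns it into a reusable, optimizable graph.
square_sum_graph = tf.function(square_sum)

x = tf.constant([1.0, 2.0, 3.0])
print(square_sum(x))        # eager execution
print(square_sum_graph(x))  # graph execution, same result

# Peek at the graph code AutoGraph generates from the Python source.
print(tf.autograph.to_code(square_sum))
```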
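And for the distributed-training bullet, a minimal sketch using tf.distribute.MirroredStrategy, the multi-GPU strategy; tf.distribute.TPUStrategy follows the same scope-and-fit pattern on TPU hardware. The model and data here are placeholders.

```python
import tensorflow as tf

# MirroredStrategy replicates the model across all GPUs visible to this
# process; with no GPUs it falls back to a single replica on CPU.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Variables (the model and optimizer) must be created inside the scope.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")

# Placeholder data; model.fit shards each batch across the replicas.
xs = tf.random.normal((128, 8))
ys = tf.random.normal((128, 1))
model.fit(xs, ys, batch_size=32, epochs=2, verbose=2)
```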
This course offers:
- Flexible deadlines: Reset deadlines based on your availability.
- Shareable Certificate: Earn a Certificate upon completion.
- 100% online
- Course 2 of 4 in the TensorFlow: Advanced Techniques Specialization
- Approximately 29 hours to complete
- Subtitles: English
Enroll here: https://www.coursera.org/learn/custom-distributed-training-with-tensorflow