NVIDIA Labs



NVIDIA Deep Learning Institute

LAB 1 : OpenACC – 2X in 4 steps (90 minutes)

Learn how to accelerate your C/C++ or Fortran application using OpenACC to harness the massively parallel power of NVIDIA GPUs. OpenACC is a directive-based approach to computing: you provide compiler hints to accelerate your code instead of writing the accelerator code yourself. In 90 minutes, you will experience a four-step process for accelerating applications using OpenACC :
1. Characterize and profile your application
2. Add compute directives
3. Add directives to optimize data movement
4. Optimize your application using kernel scheduling


LAB 2 : Image Classification with NVIDIA DIGITS (120 minutes)

This lab shows you how to leverage deep neural networks (DNNs), specifically convolutional neural networks (CNNs), within the deep learning workflow to solve a real-world image classification problem using NVIDIA DIGITS on top of the Caffe framework and the MNIST handwritten digits dataset.
In this lab, you will learn how to :
1. Architect a Deep Neural Network to run on a GPU
2. Manage the process of data preparation, model definition, model training and troubleshooting
3. Use validation data to test and try different strategies for improving model performance

On completion of this lab, you will be able to use NVIDIA DIGITS to architect, train, evaluate and enhance the accuracy of CNNs on your own image classification application.


LAB 3 : Object Detection with NVIDIA DIGITS (120 minutes)

This lab introduces students to object detection, one of four primary computer vision tasks, by trying three different approaches : sliding window, fully convolutional network (FCN), and DIGITS’ DetectNet network model. In this lab, you will learn how to :
1. Measure object detection approaches in relation to three metrics : model training time, model accuracy and speed of detection during deployment
2. Implement a sliding window approach to object detection
3. Convert fully connected networks to fully convolutional networks (FCN)
4. Use DIGITS’ DetectNet for more efficient object detection

On completion of this lab, you will understand the merits of each approach and know how to detect objects in real-world datasets using neural networks trained with NVIDIA DIGITS on the Caffe framework.
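The sliding window approach named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `sliding_windows` and `detect` are hypothetical, the image is a small list of lists, and `mean_brightness` is a stand-in for the trained CNN that the lab would use to score each patch.

```python
def sliding_windows(height, width, win, stride):
    """Yield the (top, left) corner of every win x win window in the image."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield top, left


def detect(image, win, stride, classify, threshold=0.5):
    """Score every window with `classify` and keep those above the threshold."""
    height, width = len(image), len(image[0])
    hits = []
    for top, left in sliding_windows(height, width, win, stride):
        patch = [row[left:left + win] for row in image[top:top + win]]
        score = classify(patch)
        if score >= threshold:
            hits.append((top, left, score))
    return hits


def mean_brightness(patch):
    """Toy scoring function (assumption): stands in for a trained classifier."""
    total = sum(sum(row) for row in patch)
    return total / (len(patch) * len(patch[0]))


# A 6 x 6 toy image with one bright 2 x 2 "object" at rows 2-3, columns 4-5.
image = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (4, 5):
        image[r][c] = 1.0

print(detect(image, win=2, stride=2, classify=mean_brightness))
```

The classifier runs once per window, so the cost grows quickly as the stride shrinks. That per-window cost is what makes speed of detection during deployment one of the metrics used to compare the three approaches.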


LAB 4 : Neural Network Deployment with NVIDIA DIGITS and TensorRT (120 minutes)

In this lab, you will learn how to :
1. Understand the role of batch size in inference performance
2. Apply various optimizations to the inference process
3. Explore inference for a variety of DNN architectures trained in other DLI labs

On completion of this lab, you will be able to execute a full deep learning workflow : from loading data, to training a neural network, to deploying that trained network to production.
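To see why batch size matters for inference performance, consider a toy cost model in which each batch pays a fixed overhead plus a cost per image. This is a sketch, not a measurement: the function names are hypothetical, and the overhead and per-image costs are made-up numbers, not figures from TensorRT or any particular GPU.

```python
def batch_latency_ms(batch_size, overhead_ms=2.0, per_image_ms=0.5):
    """Time to process one batch: fixed overhead plus a per-image cost (assumed model)."""
    return overhead_ms + per_image_ms * batch_size


def throughput_images_per_s(batch_size, overhead_ms=2.0, per_image_ms=0.5):
    """Images processed per second when batches run back to back."""
    latency = batch_latency_ms(batch_size, overhead_ms, per_image_ms)
    return 1000.0 * batch_size / latency


for bs in (1, 8, 64):
    print(bs, batch_latency_ms(bs), round(throughput_images_per_s(bs), 1))
```

Under this model, larger batches spread the fixed overhead over more images, so throughput rises, but each request waits longer for its batch to finish. A deployment has to balance the two.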


LAB 5 : Step by Step Implementation and Optimization of Simulations in Quantitative Finance (105 minutes)

The goal of this lab is to give a guided tour through the essentials of CUDA parallelization in mathematical finance. We consider pricing a bullet option under a local volatility (LV) model using either Monte Carlo (MC) simulation or an implicit discretization scheme for the associated partial differential equation (PDE). The example is close to a real banking application: the LV model is derived from the SVI implied volatility model of Gatheral & Jacquier, using the Andersen & Brotherton-Ratcliffe expression based on the Dupire equation. Several optimizations are studied, such as the judicious use of shared memory and registers for two discretization scales. Thanks to a simple trick proposed by Abbas-Turki & Graillat, we also see how to use parallel cyclic reduction (PCR) to solve tridiagonal systems of any size.
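As background for the PCR part, here is a minimal sequential sketch of parallel cyclic reduction in plain Python. It is a textbook-style formulation, not the trick of Abbas-Turki & Graillat and not the lab's CUDA code. The function name `pcr_solve` is hypothetical, out-of-range neighbours are simply skipped so that any system size works, and no pivoting is done, so a diagonally dominant matrix is assumed.

```python
def pcr_solve(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for x.

    Requires a[0] == 0 and c[-1] == 0. Each pass eliminates the neighbours
    at distance `stride` from every equation, then doubles the stride.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        new_a, new_b, new_c, new_d = [0.0] * n, list(b), [0.0] * n, list(d)
        # On a GPU, each index i would be handled by its own thread.
        for i in range(n):
            lo, hi = i - stride, i + stride
            if lo >= 0:
                alpha = -a[i] / b[lo]  # cancels x[lo] in equation i
                new_a[i] = alpha * a[lo]
                new_b[i] += alpha * c[lo]
                new_d[i] += alpha * d[lo]
            if hi < n:
                gamma = -c[i] / b[hi]  # cancels x[hi] in equation i
                new_c[i] = gamma * c[hi]
                new_b[i] += gamma * a[hi]
                new_d[i] += gamma * d[hi]
        a, b, c, d = new_a, new_b, new_c, new_d
        stride *= 2
    # Every equation is now decoupled: b[i] * x[i] = d[i].
    return [d[i] / b[i] for i in range(n)]


# Example with 5 unknowns (not a power of two); the exact solution is 1, 2, 3, 4, 5.
lower = [0.0, -1.0, -1.0, -1.0, -1.0]
diag = [4.0, 4.0, 4.0, 4.0, 4.0]
upper = [-1.0, -1.0, -1.0, -1.0, 0.0]
rhs = [2.0, 4.0, 6.0, 8.0, 16.0]
print(pcr_solve(lower, diag, upper, rhs))
```

All updates within a pass are independent of one another, which is why the method maps well to GPU threads: about log2(n) passes are needed, instead of the n sequential steps of the classical Thomas algorithm.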