GPU Technology Conference 2021 | OpenACC (2022)

The GTC 2021 conference offers several opportunities to learn more about the intersection of HPC, AI, and data science. Browse a variety of talks, tutorials, posters, and meet-the-experts hangouts across topics such as OpenACC and programming languages, developer tools, and industry-specific research and applications. This year's GTC was a digital conference, with on-demand recordings now available. Registration for NVIDIA's developer program is required to view content.

Connect with the Experts

Directive-Based GPU Programming with OpenACC

OpenACC is a programming model designed to help scientists and developers get started with GPUs faster and work more efficiently by maintaining a single source code for multiple platforms. OpenACC experts discuss how to start accelerating your code on GPUs, continue optimizing your GPU code, start teaching OpenACC, host or participate in a hackathon, and more.



Porting VASP to GPU Using OpenACC: Exploiting the Asynchronous Execution Model

Martijn Marsman, Senior Scientist, University of Vienna

NVIDIA GPUs accelerate the most important applications in quantum chemistry (like Gaussian, VASP, Quantum ESPRESSO, GAMESS, NWChem, and CP2K) and molecular dynamics (like GROMACS, NAMD, LAMMPS, and Amber) that are also very popular in materials science, biophysics, drug discovery, and other domains. We'll answer your questions about how to get the best performance for your specific workload or figure out how you can benefit from accelerated computing.

Fluid Dynamic Simulations of Euplectella aspergillum Sponge

Giorgio Amati, Senior HPC Engineer, CINECA

We present our experience in simulating the flow around a silica-based sponge, Euplectella aspergillum, using a TOP500 machine equipped with NVIDIA GPUs. A Lattice Boltzmann Method (LBM)-based code was used to explore the fluid dynamical features of this complex structure. We'll present some physical results, together with details of the code implementation and performance figures (up to about 4,000 V100 GPUs) for our MPI+OpenACC LBM code.

Aerodynamic Flow Control Simulations with Many GPUs on the Summit Supercomputer


Nicholson Koukpaizan, Postdoctoral Research Associate, Oak Ridge National Laboratory

A GPU-accelerated computational fluid dynamics (CFD) solver was used for aerodynamic flow control simulations on the Summit supercomputer at Oak Ridge Leadership Computing Facility. The GPU implementation of the FORTRAN 90 code relies on OpenACC directives to offload work to the GPU and message-passing interface (MPI) for multi-core/multi-device parallelism, taking advantage of the CUDA-aware MPI capability. We'll address implementation details, as well as performance results and optimization. Finally, we'll present new scientific results obtained by leveraging the GPUs for the control of aerodynamic flow separation using fluidic oscillators (actuators that generate spatially oscillating jets without any moving part). We'll add a few details pertaining to CFD and aerodynamic flow control to make the talk accessible to people who are not necessarily familiar with these domains.

On Scalability, Portability, and Maintainability of Commercial CFD Solver HiFUN on NVIDIA GPU

Munikrishna Nagaram, Chief Technology Officer, S & I Engineering Solutions

OpenACC provides a parallel programming paradigm for porting legacy computational fluid dynamics (CFD) solvers to hybrid HPC platforms without compromising the readability and maintainability of the code. HiFUN, a message-passing interface (MPI)-based, super-scalable, industry-standard CFD code, has adopted the OpenACC framework to exploit its supercomputing advantage on GPU-based hybrid platforms. We'll highlight (1) the OpenACC way of porting a production CFD code while retaining the MPI parallelism and source-code maintainability; (2) profiling and performance analysis; (3) performance studies on Volta GPUs; and (4) performance evaluation on NVIDIA's newer Ampere architecture.

A Tale of Two Programming Models: Enhancing Heterogeneity, Productivity, and Performance through OmpSs-2 + OpenACC Interoperation

Simon Garcia de Gonzalo, Postdoctoral Researcher, Barcelona Supercomputing Center

Learn about the new interoperation between two pragma-based programming models: OmpSs-2 and OpenACC. Two pragma-based models designed to function completely independently and unaware of each other can be made to collaborate effectively with minimal additional programming. We'll go over the separation of duties between the models and describe in depth the mechanism needed for interoperation. We'll provide concrete code examples using ZPIC, a 2D plasma simulator application written in OmpSs-2, OpenACC, and OmpSs-2 + OpenACC. We'll compare the performance and programmability benefits of the OmpSs-2 + OpenACC ZPIC implementation against the single-model implementations. OmpSs-2 + OpenACC is part of the latest OmpSs-2 release, and all ZPIC implementations are open source.

Introducing Developer Tools for Arm and NVIDIA systems

David Lecomber, Senior Director, Arm

NVIDIA GPUs on Arm servers are here. In migrating to, or developing on, Arm servers with NVIDIA GPUs, developers using native code, CUDA, and OpenACC continue to need tools and toolchains to succeed and to get the most out of applications. We'll explore the role of key tools and toolchains on Arm servers, from Arm, NVIDIA and elsewhere — and show how each tool fits in the end-to-end journey to production science and simulation.


Panel: Present and Future of Accelerated Computing Programming Approaches

Sunita Chandrasekaran, Assistant Professor, University of Delaware | Bryce Lelbach, HPC Programming Models Architect, NVIDIA | Christian Trott, Principal Member of Staff, Sandia National Laboratories | Stephen Jones, CUDA Architect, NVIDIA | Jeff Larkin, Senior Developer Technologies Software Engineer and OpenACC Technical Committee Chair, NVIDIA | Joel Denny, Computer Scientist, Oak Ridge National Laboratory | Jack Deslippe, Application Performance Group Lead, NERSC, Lawrence Berkeley National Lab

With endless choices of programming environments in the parallel computing universe at a time of exascale computers, which programming model should you choose? Are base languages, like C++ and Fortran, a solid choice for today's codes? Will OpenACC and OpenMP directives stay around long enough to warrant investing time and effort in them now for application acceleration on GPUs? Or should a researcher go back to the low-level programming models, like CUDA, to extract maximum performance of the code? Join our panel of experts as they debate the ultimate answer for the future parallel programmer.

Accelerating Machine Learning Applications Using CUDA Graph and OpenACC

Leonel Toledo, Recognized Researcher, Barcelona Supercomputing Center (BSC) | Antonio J. Peña, Senior Researcher, Barcelona Supercomputing Center

We'll showcase the integration of CUDA Graph with OpenACC, which allows developers to write applications that exploit GPU parallelism while increasing coding productivity. Since many scientific applications require high-performance computing systems for their calculations, it's important to provide a mechanism that allows developers to exploit the system's hardware to achieve the expected performance.

We will also explore the most important technical details regarding the integration of CUDA Graph and OpenACC. This allows programmers to define the workflow as a set of GPU tasks, potentially executing more than one at the same time.

Examples will be provided using CUDA, C++, and OpenACC; registrants are expected to be familiar with at least the fundamentals of these programming languages.

The ChEESE Effort Toward Building a GPU Ecosystem for Earth Science

Piero Lanucara, CINECA

We'll present the ChEESE project's effort toward building a GPU ecosystem. Our presentation is split into two main pillars:


The ChEESE flagship applications and their GPU capabilities: technical aspects of the code development and the technology used (CUDA or OpenACC, and the motivation for each choice), as well as benchmarking numbers on ChEESE use cases.

Where these applications will run (the "systems") and what new science is made possible thanks to NVIDIA GPUs: in the case of ChEESE, which demonstrators will run on PRACE and EuroHPC systems?

Devito: High-Performance Imaging and Inversion Codes from Symbolic Computation and Python in Seconds

Gerard Gorman, Associate Professor, Imperial College London

Devito is a domain-specific language (DSL) and code generation framework for designing highly optimized finite difference kernels for use in inversion methods. Devito utilizes SymPy to allow the definition of operators from high-level symbolic equations and generates optimized and automatically tuned code specific to a given target architecture, including ARM, GPUs, POWER, x86, and Xeon Phi. Devito is currently used in industry for petascale seismic imaging. Applications in other areas, such as medical imaging and scalable machine learning, are under development. Symbolic computation is a powerful tool that allows users to build complex solvers from only a few lines of a Python DSL, use advanced code optimization methods to generate parallel high-performance code, and (re)develop production-ready software in hours rather than months.

Inside NVC++ and NVFORTRAN

Bryce Lelbach, HPC Programming Models Architect, NVIDIA

Learn about the architecture and latest features in NVC++ and NVFORTRAN. We'll cover the details of an exciting new feature of NVC++ announced earlier at GTC. We'll also discuss the latest developments in standard parallelism in C++ and Fortran. With the NVIDIA HPC compilers, programming GPUs has never been easier! Our session involves four programming models: ISO Standard, CUDA, OpenACC, and OpenMP; two languages: C++ and Fortran; and one toolchain: the NVIDIA HPC compiler.

Materials Design Toward the Exascale: Porting Electronic Structure Community Codes to GPUs

Andrea Ferretti, Senior Researcher and Chair of the MaX Executive Committee, CNR Nanoscience Institute


Materials are crucial to science and technology, and connected to major societal challenges ranging from energy and environment to information and communication, and manufacturing. Electronic structure methods have become key to materials simulations, allowing scientists to study and design new materials before running actual experiments. The MaX Centre of Excellence — Materials design at the eXascale — is devoted to enabling materials modeling at the frontiers of current and future HPC architectures. MaX focuses on popular open-source community codes in the electronic structure field (Quantum ESPRESSO, Yambo, Siesta, Fleur, CP2K, and BigDFT). We'll discuss the performance and portability of MaX flagship codes, with a special focus on GPU accelerators. Porting to GPUs has been demonstrated (all codes are released GPU-ready), following diverse strategies to address both performance and maintainability, while keeping the community engaged.

Talks Related to GPU Hackathons

Unraveling the Universe with Petascale Graph Networks

Christina Kreisch, Graduate Student Researcher, Princeton University | Miles Cranmer, Graduate Student Researcher, Princeton University

Learn about using graph networks to find interpretable representations of physical laws in the universe with petabytes of data. We have 44,100 n-body simulations, each with over 20,000 nodes, where each node can be connected to ~30 other nodes. Leveraging NVIDIA’s toolkits and improving GPU utilization, we achieved over 8,000x speed-up in pre-processing. We'll demo our optimizations and graph network, which is built to back-propagate gradients through an entire simulation. One of the greatest challenges in astrophysics is understanding the relationship between galaxies and underlying parameters of the universe. Modeling such relationships with petabytes of data has been computationally prohibitive. We construct and train our graph network in an interpretable way, which allows us to contribute to existing theory by interpreting our graph network with symbolic regression in addition to providing new constraints on cosmological parameters. You don't need any particular prior knowledge for our session.

Scaling Graph Generative Models for Fast Detector Simulations in High-Energy Physics

Ali Hariri, Graduate Research Assistant, American University of Beirut

Accurate and fast simulation of particle physics processes is crucial for the high-energy physics community. Simulating the particle showers and interactions in the detector is both time-consuming and computationally expensive. The main goal of a fast simulator in the context of the Large Hadron Collider (LHC) is to map the events from the generation level to the reconstruction level. Traditional detector fast simulation approaches based on non-parametric techniques can significantly improve the speed of the full simulation; however, they also suffer from lower levels of fidelity. For this reason, alternative approaches based on machine-learning techniques can provide faster solutions, while maintaining higher levels of fidelity. We'll introduce a graph neural network-based autoencoder model that provides effective reconstruction of detector simulation for LHC collisions.

Tutorials and Training

Zero to GPU Hero with OpenACC

Jeff Larkin,Senior Developer Technologies Software Engineer and OpenACC Technical Committee Chair, NVIDIA

Porting and optimizing legacy applications for GPUs doesn't have to be difficult when you use the right tools. OpenACC is a directive-based parallel programming model that enables C, C++, and Fortran applications to be ported to GPUs quickly while maintaining a single code base. Learn the basics of parallelizing an application using OpenACC directives and the NVIDIA HPC compiler. Also learn to identify the important parts of your application, parallelize those parts for the GPU, optimize data movement, and improve GPU performance. Become a GPU hero by joining this session.



What is GPU Technology Conference? ›

The GPU Technology Conference and the global GTC event series offer valuable training as well as a showcase for the most vital work in the computing industry today, including high-performance computing, artificial intelligence and deep learning, healthcare, virtual reality, accelerated analytics, and self-driving cars.

What is GTC 2021? ›

NVIDIA GTC 2021 is an annual NVIDIA developer conference. Event type: Conference. Industry: Artificial Intelligence, GPU, Information Technology, Software. Organizer: NVIDIA. Start date: Apr 12, 2021.

What is Cuda library? ›

NVIDIA® CUDA-X, built on top of NVIDIA CUDA®, is a collection of libraries, tools, and technologies that deliver dramatically higher performance—compared to CPU-only alternatives— across multiple application domains, from artificial intelligence (AI) to high performance computing (HPC).

What is GTC conference? ›

NVIDIA GTC (GPU Technology Conference) is a global AI conference for developers that brings together developers, engineers, researchers, inventors, and IT professionals. Topics focus on artificial intelligence (AI), computer graphics, data science, machine learning and autonomous machines.

What is Nvidia known for? ›

Nvidia Corporation is a technology company known for designing and manufacturing graphics processing units (GPUs). The company was founded in 1993 by Jen-Hsun "Jensen" Huang, Curtis Priem and Chris Malachowsky and is headquartered in Santa Clara, Calif.

What is Nvidia Maxine? ›

NVIDIA Maxine is a suite of GPU-accelerated AI SDKs and cloud-native microservices for deploying AI features that enhance audio, video, and augmented reality effects in real time. Maxine's state-of-the-art models create high quality effects that can be achieved with standard microphone and camera equipment.

What is Nvidia demand? ›

NVIDIA On-Demand is the home for NVIDIA resources from GPU Technology Conferences (GTCs) and other leading industry events. The content includes NVIDIA keynotes, technical and industry sessions, demos, research posters, and more.

What is the Nvidia inception program? ›

NVIDIA Inception is a free program designed to help your startup evolve faster through access to cutting-edge technology and NVIDIA experts, opportunities to connect with venture capitalists, and co-marketing support to heighten your company's visibility.

Does AMD have CUDA? ›

CUDA is limited to NVIDIA hardware.

Is CUDA only for NVIDIA? ›

Unlike OpenCL, CUDA-enabled GPUs are only available from Nvidia.

Can CUDA run on CPU? ›

A single source tree of CUDA code can support applications that run exclusively on conventional x86 processors, exclusively on GPU hardware, or as hybrid applications that simultaneously use all the CPU and GPU devices in a system to achieve maximal performance.

What is Nvidia Jarvis? ›

What Is Jarvis? NVIDIA Jarvis is an application framework for multimodal conversational AI services that delivers real-time performance on GPUs.

How does Nvidia Maxine work? ›

Maxine provides real-time, AI-driven face and body tracking along with background removal from a standard camera feed. Artists can track and mask performers in a live performance setting for a variety of creative use cases, eliminating the challenges of special hardware-tracking solutions.

What is Nvidia Merlin? ›

NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA GPUs. It enables data scientists, machine learning engineers, and researchers to build high-performing recommenders at scale. Merlin includes tools to address common ETL, training, and inference challenges.

What is TensorRT? ›

NVIDIA® TensorRT, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.

How many start up members are in the Nvidia inception program? ›

More than 660 robotics startups are members of Inception, paving the next wave of AI, through digital and physical robots.

What is Nvidia Riva? ›

NVIDIA® Riva is a GPU-accelerated speech AI SDK for building and deploying fully customizable, real-time AI pipelines that deliver world-class accuracy in all clouds, on premises, at the edge, and on embedded devices.

What is Rapids Nvidia? ›

RAPIDS is a suite of open-source software libraries and APIs for executing data science pipelines entirely on GPUs—and can reduce training times from days to minutes. Built on NVIDIA® CUDA-X AI, RAPIDS unites years of development in graphics, machine learning, deep learning, high-performance computing (HPC), and more.

What is CUDA used for? ›

CUDA is a parallel computing platform and programming model for general computing on graphical processing units (GPUs). With CUDA, you can speed up applications by harnessing the power of GPUs.

What should I use CUDA for? ›

Using the CUDA Toolkit you can accelerate your C or C++ applications by updating the computationally intensive portions of your code to run on GPUs. To accelerate your applications, you can call functions from drop-in libraries as well as develop custom applications using languages including C, C++, Fortran and Python.

What is CUDA in Python? ›

CUDA Python provides uniform APIs and bindings for inclusion into existing toolkits and libraries to simplify GPU-based parallel processing for HPC, data science, and AI. CuPy is a NumPy/SciPy compatible Array library from Preferred Networks, for GPU-accelerated computing with Python.

Do I need CUDA? ›

CUDA needs to be installed in addition to the display driver unless you use conda or pip with the cudatoolkit package. TensorFlow and PyTorch need the system CUDA install if you install them with pip without cudatoolkit, or build them from source.




Article information

Author: Aron Pacocha

Last Updated: 11/07/2022
