Samples Overview

Characteristics

Name	Main Characteristics / Demonstrated Features
3D Diffusion	Memory-bandwidth-bound stencil code with full time integration on device. Uses pointers to swap device memory between timesteps.
Particle Push	Compute-bound code with full time integration on device. Uses pointers to swap device memory between timesteps. Demonstrates high speedup for trigonometric functions on GPU.
Poisson on FEM Solver with Jacobi Approximation	Memory-bandwidth-bound Jacobi stencil code in a complete solver setup with multiple kernels. Reduction using GPU-compatible BLAS calls. Uses pointers to swap device memory between iterations.
MIDACO Ant Colony Solver with MINLP Example	Heavily compute-bound problem function, parallelized on two levels for optimal distribution on both CPU and GPU. Automatic privatization of 1D code to 3D version for GPU parallelization. Data is copied between host and device for every iteration (solver currently only running on CPU).
Simple Stencil Example	Stencil code.
Stencil With Local Array Example	Stencil code with local array. Tests Hybrid Fortran's array reshaping in conjunction with stencil codes.
Stencil With Passed In Scalar From Array Example	Stencil code with a scalar input that's being passed in as a single value from an array in the wrapper.
Parallel Vector and Reduction Example	Separate parallelizations for CPU/GPU with unified codebase, parallel vector calculations without communication. Automatic privatization of 1D code to 3D version for GPU parallelization. Shows a reduction as well.
Simple OpenACC Example	Based on the Parallel Vector example. Shows off the OpenACC backend and the use of multiple parallel regions in one subroutine.
OpenACC Branching Example	Based on the Simple OpenACC example. Tests branches around parallel regions implemented using OpenACC.
OpenACC Module Data Example	Tests different ways of using module data with an OpenACC implementation.
OpenACC with Hybrid Code (Device + Host code callable) Example	Hybrid Fortran kernel subroutines are callable from host-only code when using the OpenACC implementation. This example demonstrates that feature.
Mixed Implementations Example	Tests the @scheme directive which can be used to have different implementations for different parts of your code.
Strides Example	Like the Parallel Vector example, but with blocking of the data domain (in case GPU memory is too small).
Tracing Example	Tests kernels with different real and integer data types using the tracing implementation, which automatically tracks down errors.
Early Returns Example	Tests different return statements within your kernels.
Array Accessor Functions Example	Tests more complicated array access patterns like 'a(min(n_max,i),j)' with the Hybrid Fortran parser.
5D Parallel Vector Example	Tests parallel (in two dimensions) computation of up to 5D data in different configurations. This is used to emulate the data setup of many physical processes packages.
Simple Weather	An unscientifically simple weather model, accelerated with Hybrid Fortran, used as an academic example to explain the framework.
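
Several samples above swap device-memory pointers between timesteps or iterations instead of copying data back. The idea can be illustrated with a minimal sketch, written here in plain Python rather than Hybrid Fortran and reduced to one dimension. The function name `diffuse`, the coefficient `alpha`, and the fixed-boundary treatment are assumptions made for illustration only and are not taken from the sample sources.

```python
def diffuse(initial, steps, alpha=0.1):
    """Explicit 1D diffusion with fixed boundary values, using two buffers.

    Hypothetical illustration of double buffering; not the sample's code.
    """
    current = list(initial)
    following = list(initial)  # boundary cells keep their initial values
    for _ in range(steps):
        # Three-point stencil on the interior cells only.
        for i in range(1, len(current) - 1):
            following[i] = current[i] + alpha * (
                current[i - 1] - 2.0 * current[i] + current[i + 1]
            )
        # Swap the two references instead of copying the data back.
        current, following = following, current
    return current


if __name__ == "__main__":
    print(diffuse([0.0, 0.0, 1.0, 0.0, 0.0], steps=2))
```

After the swap, the buffer that was just written becomes the input of the next step, so no per-step copy is needed. On a GPU the same pattern presumably avoids an extra device-memory copy per timestep.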

Link to Sources, Available Versions and Implementation Accuracy

Name	Source	Root Mean Square Error Bounds	Reference C Implementation (OpenACC + OpenMP)	Reference CUDA C Implementation	Reference Fortran Implementation (OpenACC)
3D Diffusion	Link	1E-8 [3]	Yes	Yes	Yes
Particle Push	Link	1E-11	Yes	Yes	Yes
Poisson on FEM Solver with Jacobi Approximation	Link	1E-7 [1]	No	No	No
MIDACO Ant Colony Solver with MINLP Example	Link	1E-3 [3]	No	No	No
Simple Stencil Example	Link	1E-8	No	No	No
Stencil With Local Array Example	Link	1E-8	No	No	No
Stencil With Passed In Scalar From Array Example	Link	1E-8	No	No	No
Parallel Vector and Reduction Example	Link [2]	1E-8	No	No	No
Simple OpenACC Example	Link	1E-8	No	No	No
OpenACC Branching Example	Link	1E-8	No	No	No
OpenACC Module Data Example	Link	1E-8	No	No	No
OpenACC with Hybrid Code (Device + Host code callable) Example	Link	1E-8	No	No	No
Mixed Implementations Example	Link	1E-8	No	No	No
Strides Example	Link	1E-8	No	No	No
Tracing Example	Link	1E-8	No	No	No
Early Returns Example	Link	1E-8	No	No	No
Array Accessor Functions Example	Link	1E-8	No	No	No
5D Parallel Vector Example	Link	1E-8	No	No	No
Simple Weather	Link	n/a	No	No	No

[1]: The number of iterations needed to reach this error level depends on the problem domain size. The value given is an upper bound on the error after an unspecified, long runtime; the solver converges 'eventually'. Note that this solver's algorithm is not good enough for production use. It is included here for demonstration purposes only.

[2]: Example obtained when typing 'make example' in the Hybrid Fortran directory.

[3]: Compared to analytic solution
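
To make the error bounds in the table concrete, here is a minimal sketch of a root mean square error check in Python. It uses the textbook definition of RMSE. How the Hybrid Fortran test scripts actually compute and compare the error is not shown in this overview, so the function names `rmse` and `within_bound` and the default bound are assumptions for illustration.

```python
import math


def rmse(values, reference):
    """Root mean square error between two equally long sequences."""
    if len(values) != len(reference):
        raise ValueError("sequences must have the same length")
    if not values:
        raise ValueError("sequences must not be empty")
    squared_sum = sum((v - r) ** 2 for v, r in zip(values, reference))
    return math.sqrt(squared_sum / len(values))


def within_bound(values, reference, bound=1e-8):
    """True if the RMSE against the reference does not exceed the bound."""
    return rmse(values, reference) <= bound


if __name__ == "__main__":
    computed = [1.0, 2.0, 3.0]
    analytic = [1.0, 2.0, 3.0 + 1e-9]
    print(within_bound(computed, analytic))
```

Here `reference` would be either the output of a reference implementation or, for the rows marked [3], an analytic solution sampled at the same grid points.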
