Achtung:

Sie haben Javascript deaktiviert!
Sie haben versucht eine Funktion zu nutzen, die nur mit Javascript möglich ist. Um sämtliche Funktionalitäten unserer Internetseite zu nutzen, aktivieren Sie bitte Javascript in Ihrem Browser.

High Performance Computing

We co-operate with the group of Prof. Dr. Christian Plessl and the PC2 to develop tools for efficient use of parallel hardware as FPGAs and GP-GPUs for scientific computing .

Next topic: Semiconductor Theory

Related publications by the TET group


Open list in Research Information System

Solving Maxwell's Equations with Modern C++ and SYCL: A Case Study

A. Afzal, C. Schmitt, S. Alhaddad, Y. Grynko, J. Teich, J. Förstner, F. Hannig, in: Proceedings of the 29th Annual IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2018, pp. 49-56

In scientific computing, unstructured meshes are a crucial foundation for the simulation of real-world physical phenomena. Compared to regular grids, they allow resembling the computational domain with a much higher accuracy, which in turn leads to more efficient computations.<br />There exists a wealth of supporting libraries and frameworks that aid programmers with the implementation of applications working on such grids, each built on top of existing parallelization technologies. However, many approaches require the programmer to introduce a different programming paradigm into their application or provide different variants of the code. SYCL is a new programming standard providing a remedy to this dilemma by building on standard C ++17 with its so-called single-source approach: Programmers write standard C ++ code and expose parallelism using C++17 keywords. The application is<br />then transformed into a concrete implementation by the SYCL implementation. By encapsulating the OpenCL ecosystem, different SYCL implementations enable not only the programming of CPUs but also of heterogeneous platforms such as GPUs or other devices. For the first time, this paper showcases a SYCL-<br />based solver for the nodal Discontinuous Galerkin method for Maxwell’s equations on unstructured meshes. We compare our solution to a previous C-based implementation with respect to programmability and performance on heterogeneous platforms.<br


    OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes

    T. Kenter, G. Mahale, S. Alhaddad, Y. Grynko, C. Schmitt, A. Afzal, F. Hannig, J. Förstner, C. Plessl, in: Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM), IEEE, 2018

    The exploration of FPGAs as accelerators for scientific simulations has so far mostly been focused on small kernels of methods working on regular data structures, for example in the form of stencil computations for finite difference methods. In computational sciences, often more advanced methods are employed that promise better stability, convergence, locality and scaling. Unstructured meshes are shown to be more effective and more accurate, compared to regular grids, in representing computation domains of various shapes. Using unstructured meshes, the discontinuous Galerkin method preserves the ability to perform explicit local update operations for simulations in the time domain. In this work, we investigate FPGAs as target platform for an implementation of the nodal discontinuous Galerkin method to find time-domain solutions of Maxwell's equations in an unstructured mesh. When maximizing data reuse and fitting constant coefficients into suitably partitioned on-chip memory, high computational intensity allows us to implement and feed wide data paths with hundreds of floating point operators. By decoupling off-chip memory accesses from the computations, high memory bandwidth can be sustained, even for the irregular access pattern required by parts of the application. Using the Intel/Altera OpenCL SDK for FPGAs, we present different implementation variants for different polynomial orders of the method. In different phases of the algorithm, either computational or bandwidth limits of the Arria 10 platform are almost reached, thus outperforming a highly multithreaded CPU implementation by around 2x.


      Flexible FPGA design for FDTD using OpenCL

      T. Kenter, J. Förstner, C. Plessl, in: Proc. Int. Conf. on Field Programmable Logic and Applications (FPL), IEEE, 2017

      Compared to classical HDL designs, generating FPGA with high-level synthesis from an OpenCL specification promises easier exploration of different design alternatives and, through ready-to-use infrastructure and common abstractions for host and memory interfaces, easier portability between different FPGA families. In this work, we evaluate the extent of this promise. To this end, we present a parameterized FDTD implementation for photonic microcavity simulations. Our design can trade-off different forms of parallelism and works for two independent OpenCL-based FPGA design flows. Hence, we can target FPGAs from different vendors and different FPGA families. We describe how we used pre-processor macros to achieve this flexibility and to work around different shortcomings of the current tools. Choosing the right design configurations, we are able to present two extremely competitive solutions for very different FPGA targets, reaching up to 172 GFLOPS sustained performance. With the portability and flexibility demonstrated, code developers not only avoid vendor lock-in, but can even make best use of real trade-offs between different architectures.


        Accelerating Finite Difference Time Domain Simulations with Reconfigurable Dataflow Computers

        H. Giefers, C. Plessl, J. Förstner, ACM SIGARCH Computer Architecture News (2014), 41(5), pp. 65-70


        Convey Vector Personalities – FPGA Acceleration with an OpenMP-like Effort?

        B. Meyer, J. Schumacher, C. Plessl, J. Förstner, in: Proc. Int. Conf. on Field Programmable Logic and Applications (FPL), IEEE, 2012, pp. 189-196

        Although the benefits of FPGAs for accelerating scientific codes are widely acknowledged, the use of FPGA accelerators in scientific computing is not widespread because reaping these benefits requires knowledge of hardware design methods and tools that is typically not available with domain scientists. A promising but hardly investigated approach is to develop tool flows that keep the common languages for scientific code (C,C++, and Fortran) and allow the developer to augment the source code with OpenMPlike directives for instructing the compiler which parts of the application shall be offloaded the FPGA accelerator. In this work we study whether the promise of effective FPGA acceleration with an OpenMP-like programming effort can actually be held. Our target system is the Convey HC-1 reconfigurable computer for which an OpenMP-like programming environment exists. As case study we use an application from computational nanophotonics. Our results show that a developer without previous FPGA experience could create an FPGA-accelerated application that is competitive to an optimized OpenMP-parallelized CPU version running on a two socket quad-core server. Finally, we discuss our experiences with this tool flow and the Convey HC-1 from a productivity and economic point of view.


          Transformation of scientific algorithms to parallel computing code: subdomain support in a MPI-multi-GPU backend

          B. Meyer, C. Plessl, J. Förstner, in: Symp. on Application Accelerators in High Performance Computing (SAAHPC), IEEE Computer Society, 2011, pp. 60-63


          Open list in Research Information System

          Funding, co-operations

          Contact

          Dr. Yevgen Grynko

          Theoretical Electrical Engineering

          Yevgen Grynko
          Phone:
          +49 5251 60-3815
          Fax:
          +49 5251 60-3524
          Office:
          P 1.5.02.2

          Samer Alhaddad

          Theoretical Electrical Engineering

          Samer Alhaddad
          Phone:
          +49 5251 60-3819
          Office:
          P1.5.02.2

          Head of the group

          Prof. Dr. Jens Förstner

          Theoretical Electrical Engineering

          Jens Förstner
          Phone:
          +49 5251 60-3013
          Fax:
          +49 5251 60-3524
          Office:
          P1.5.01.1

          Office hours:

          on request (during lecture break)

          The University for the Information Society