# Automotive High Performance Computing

At the 22nd International Forum on Advanced Microsystems for Automotive Applications (AMAA 2018)

11-12 September 2018, Berlin, Germany

Knut Hufeld, Infineon Technologies AG Knut.hufeld@infineon.com +49 89 234 52653

Prof. Jürgen Becker, KIT; Dr. Dominik Reinhardt, BMW Group; Dr. Matthias Traub, BMW Group; Prof. Mladen Berekovic, Uni zu Lübeck

#### restricted

















4. Example - EPI European Processor Initiative
5. Closing remark





(1) https://www.intel.com/content/www/us/en/automotive/driving-safety-advanced-driver-assistance-systems-self-driving-technology-paper.html

#### What will cars be like in the future? Automotive systems are getting more and more complex



Accelerators for autonomous driving and big data

- Security
- AI (e.g. fusion), neural networks

> Enabling next generation of computing platforms 2025 and beyond

Special automotive requirements (e.g. extended temperature ranges, ...) have to be considered. High integration of different in-vehicle domains (e.g. chassis, body, AD, ...)

Hardware and software solutions

Processes, Methods and Tools

Scalable Safety Architecture

Fail-safe → Fail-operational (ASIL B → ASIL D)

#### What will cars be like in the future? Pursuing dependability





#### What will cars be like in the future? Key applications rely on sufficient compute performance





Source: C. Grote, Automotive Software Technology - Shaping Tomorrow's Ecosystem, 2017.

#### What will cars be like in the future? Degree of automation will directly depend on eHPC







eHPC (embedded High Performance Computing): increasing demand for computing power: **1000x**

HPC (High Performance Computing): increasing demand for computing power: **100 000x**

#### What will cars be like in the future? Multi-Core is set on the Technology-Roadmap ...





#### What will cars be like in the future? Challenge flexibility and dynamic operation



#### **Reconfigurable Architectures:**

- Integration of accelerators / Co-Processors
- Fully customizable
- Dynamically reconfigurable
- Pure HW-Description



#### **Heterogeneous MPSoC (e.g. Xilinx Zynq):**

- Flexibility & dynamic integration of accelerators / Co-Processors
- Integrated COTS Multicore
- Increased customizability
- SW and HW development (HW-SW Co-Design)



#### What will cars be like in the future? Research Challenges of Embedded Multicores



- Common resources shared between different execution units can lead to system dysfunction (malfunctions or loss of functions) caused by:
  - Time interferences (determinism issues)
  - Space interferences (segregation issues)
  - Common Cause Failures (e.g. SEE)
  - Race Condition
- Issues depend on multicore architecture:
  - Mono-Bus / Multi-Bus / Crossbar / NoC / etc.
  - Core local memory or only shared memory
  - Lock-Step-Mode core / end2end ECC / etc.
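The race-condition item above can be illustrated with a small host-side sketch. It is not automotive code: the function name `run_counter` and the use of Python threads are assumptions made purely for illustration. Several execution units update one shared resource, and a lock serializes the read-modify-write so that no update is lost.

```python
import threading


def run_counter(n_threads: int, n_increments: int) -> int:
    """Increment one shared counter from several threads, guarded by a lock.

    Hypothetical illustration: the counter stands in for any resource shared
    between execution units. Without the lock, the read-modify-write in
    `counter += 1` could interleave between threads and lose updates.
    """
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(n_increments):
            with lock:  # serialize access to the shared resource
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


if __name__ == "__main__":
    print(run_counter(4, 1000))  # prints 4000
```

The lock restores correctness but also introduces waiting time, which is one source of the time interference listed above.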

- Mitigations needed for safe and secure usage (in SW or HW):
  - Failure Detection: Monitoring, Voting
  - Failure Isolation: Partitioning, Time Slicing / Deadlines, Budgeting
  - Failure Correction: Function Recovery, Redundancy, Architectural Patterns
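As a minimal sketch of the "Voting" mitigation listed above, the following shows a software majority voter over redundant channel outputs (for example 2-out-of-3). The function names `majority_vote` and `faulty_channels` are hypothetical and not taken from the slides. Real systems often implement voting in hardware or in a safety controller, and this sketch assumes that channel outputs can be compared for exact equality.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value agreed on by a strict majority of channels, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count * 2 > len(outputs) else None


def faulty_channels(outputs: Sequence[Hashable]) -> List[int]:
    """Return indices of channels that disagree with the voted value.

    If no strict majority exists, every channel is reported as suspect.
    """
    voted = majority_vote(outputs)
    if voted is None:
        return list(range(len(outputs)))
    return [i for i, out in enumerate(outputs) if out != voted]


if __name__ == "__main__":
    readings = [42, 42, 41]           # three redundant computations
    print(majority_vote(readings))    # prints 42
    print(faulty_channels(readings))  # prints [2]
```

With three channels the voter masks a single faulty channel and identifies it, which covers failure detection. Isolating or recovering the faulty channel is a separate step.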



- Embedded FPGA for Safety, Security and Determinism
- Connection to "Monitoring Infrastructure"
- Configurable "Hardware Support" for Determinism
  - Especially for Access to Peripherals
- Fail-Operational Hardware Support
  - By dynamic Reconfiguration
  - By dynamic Redundancy
  - Dynamic Migration
- Hardware Updates in Field










#### Motivation – eHPC in Automotive Innovation driver Autonomous Driving



- Performance challenges in different domains
  - Computer vision
    - Recognition (e.g. Intel i7)
    - Image classification (Xilinx Everest)
    - Semantic segmentation (NVIDIA PX2)
  - Data fusion
    - Cameras
    - Lidar
    - Radar
  - Connectivity / 5G (Intel Go)



[Figure: input, feature extraction (extract features, encode in a vector), feed into classifier (Support Vector Machine or similar); "hierarchical learning" with weights over pixels, edges, textons, parts, objects]



Taken from (1).





- (1) https://newsroom.intel.de/news/sensors-the-eyes-and-ears-of-autonorhous-formations/
- (2) https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8237683
- (3) Xilinx, Inc. Architectures for Accelerating Deep Neural Networks.
- (4) https://blogs.nvidia.com/blog/2016/01/05/eyes-on-the-road-how-autonomous-cars-understand-what-theyre-seeing/

Taken from (4).

#### Motivation – eHPC in Automotive Innovation driver Economic Importance



- Processors for ADAS will be a major source of revenue for semiconductor manufacturers
- Need to perform an increasing number of safety-critical and computation-intensive tasks
- State-of-the-art solutions are unable to meet these requirements

For semiconductor companies, processors and optical semiconductors are expected to account for most hardware revenues for advanced driver-assistance systems in 2025.



<sup>1</sup> Figures may not sum to 100%, because of rounding.

<sup>2</sup> Autonomous emergency braking, adaptive cruise control, and forward-collision warning.

<sup>3</sup> Includes, among other categories, back-side monitoring and traffic-signal recognition.

McKinsey&Company

Taken from (1).

(1) https://www.mckinsey.com/industries/semiconductors/our-insights/advanced-driver-assistance-systems-challenges-and-opportunities-ahead







#### Snapshot - Current solutions for eHPC Intel – CPU combined with FPGA

- Prototypes of autonomous cars usually built on top of conventional hardware
- High-end server CPUs combined with programmable FPGA logic
  - e.g. Intel Xeon 6138P



### Skylake + FPGA on Purley



- Power for FPGA is drawn from socket & requires modified Purley platform specs
- Platform Modifications include Stackup, Clock, Power Delivery, Debug, Power up/down sequence, Misc IO pins (see BOM cost section)

Taken from (2).

Taken from (1).

- (1) Jennifer Huffstetler. Intel Processors and FPGAs—Better Together. 2018.
- (2) Ian Cutress. Intel Shows Xeon Scalable Gold 6138P with Integrated FPGA, Shipping to Vendors. 2018.





- Lightweight ARM+FPGA solutions
  - General purpose CPU for non-critical software
  - Complex operations mapped to FPGA (includes Machine Learning, Vision)
  - Functional safety assured
  - Low power design feasible

- Available products include
  - Xilinx Zynq, UltraScale MPSoC
  - Intel Cyclone
- Insufficient CPU performance for HPC (e.g. Zynq-7000: Cortex-A9)



Most Responsive and Reconfigurable

https://www.xilinx.com/products/design-tools/embedded-vision-zone.html

#### Xilinx Everest

- Heterogeneous architecture
  - Compute efficiency
  - Reduced power
  - Software programmable
- SW programmable engine
  - Domain specific architecture
  - Hardened 7nm technology
  - Throughput-oriented, low-latency
- Programmable logic
  - 'Soft' logic
  - Flexibility
  - Custom memory hierarchy

(1) Juanjo Noguera et al. HW/SW Programmable Engine: Domain Specific Architecture for Project Everest. Xilinx, Inc. 2018.



Taken from (1).





[Figure labels: Genomics, Storage, Database, Network IP, Risk modeling, Custom Memory Hierarchy, Feature, Volume, Map, Data]



- NVIDIA Drive PX series
  - Computer boards for deep learning and autonomous driving
  - Based on Maxwell, Pascal, and Volta **GPU Microarchitecture**

- Jan 2015: Drive PX
- Jan 2016: Drive PX 2 AutoChauffeur (up to ASIL D)



#### NVIDIA Drive PX 2 AutoChauffeur (2016)

- 2x Tegra SoC, each featuring
  - 4x Cortex-A57
  - 2x Denver core
  - Pascal iGPU
  - 8GB LPDDR4 (50+ GB/s from CPU/iGPU)
- 2x Pascal discrete GPUs
  - 4GB GDDR5 (80+ GB/s from GPU)
- AURIX safety microprocessor
- Performance & Power
  - 24 DL TOPS, 8 TFLOPS
  - 250W TDP (board)

E.g. used by Tesla for autonomous driving

(1) https://videocardz.com/58800/nvidia-drive-px-2-has-pascal-gpu-with-4gb-gddr5-memory



Taken from (1).





#### NVIDIA Drive PX 2



(1) Pradeep Kumar Gupta. An overview of NVIDIA's autonomous vehicles platform. NVIDIA. 2017.

- NVIDIA Drive PX Pegasus
  - 2x Tegra Xavier SoC
    - 8-core NVIDIA custom Carmel ARM64
    - 1x Volta iGPU (512 CUDA cores)
  - 2x next-gen dGPU
  - AURIX TC3xx Safety Processor
  - 16 GB LPDDR4
  - Designed for ASIL D
  - Performance & Power
    - 320 TOPS
    - 500W TDP (Board)
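The board-level figures quoted on these slides allow a rough performance-per-watt comparison. The sketch below uses only the numbers stated above: 24 DL TOPS at 250 W for the Drive PX 2 AutoChauffeur and 320 TOPS at 500 W for the Drive PX Pegasus. The helper name `tops_per_watt` is hypothetical. The result should be read with caution, because TDP is an upper bound rather than measured power, and the two TOPS figures are peak values that may not refer to the same numeric precision.

```python
def tops_per_watt(tops: float, tdp_watts: float) -> float:
    """Peak TOPS divided by board TDP; a coarse efficiency indicator only."""
    if tdp_watts <= 0:
        raise ValueError("TDP must be positive")
    return tops / tdp_watts


# (peak TOPS, board TDP in watts) as quoted on the slides
BOARDS = {
    "Drive PX 2 AutoChauffeur": (24.0, 250.0),
    "Drive PX Pegasus": (320.0, 500.0),
}

efficiency = {name: tops_per_watt(t, w) for name, (t, w) in BOARDS.items()}
gain = efficiency["Drive PX Pegasus"] / efficiency["Drive PX 2 AutoChauffeur"]

if __name__ == "__main__":
    for name, value in efficiency.items():
        print(f"{name}: {value:.3f} TOPS/W")
    print(f"Pegasus vs. PX 2: {gain:.1f}x")
```

Under these assumptions the PX 2 comes out at about 0.096 TOPS/W and the Pegasus at 0.64 TOPS/W, roughly a 6.7x improvement on paper.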




#### Snapshot - Current solutions for eHPC zFAS: AUDI Driver Assistance Platform

- Advanced Driver Assistance System Platform
  - Presented at CES 2014
  - 2017: in Audi A8 series
  - Used to enable level 3 autonomous driving
- Contains
  - NVIDIA Tegra K1 (later to be replaced by Tegra X1)
    - Quad-core Cortex-A15 with additional NVIDIA low-power processing unit
    - Kepler GPU (192 CUDA cores)
  - Infineon Aurix
  - Altera Cyclone FPGA
  - MobilEye EyeQ3

Taken from (1).







| Device                  | TOPS | ASIL  | DMIPS | TFLOPS | Remarks                                                         |
|-------------------------|------|-------|-------|--------|-----------------------------------------------------------------|
| NVIDIA Drive PX Xavier  | 30   | C     | -     | 6.3    | A complete SoC with a GPU, CPU and accelerators.                |
| NVIDIA Drive PX Pegasus | 320  | C / D | -     | N/A    | A computing platform. Contains an AURIX TC3xx safety processor. |
| AURIX TC39x             | -    | D     | 600   | -      | Fully compliant to ISO 26262.                                   |
| Intel Go                | 100  | C / D | -     | 1.5    | Supports dedicated accelerator cards (e.g. FPGA).               |
| Intel Xeon Platinum     | 5.2  | QM    | 5000  | 3.6    | Not designed for safety applications.                           |
| KALRAY MPPA             | 8    | B     | -     | 5      | Up to 1024 cores.                                               |







#### Example - EPI European Processor Initiative Positioning the Project





- European Approach for HPC-Technology for Exascale Super Computer
- HPC General Purpose Processor (GPP) by Bull/CEA
- Total: 120 M€, 23 partners, 4 years, start: Oct 18
- eHPC GPP for Automotive, total 20 M€ → AD
- Automotive core group:





#### Example - EPI European Processor Initiative Enabler of the next digitalization step in automotive E/E-Architecture



#### Example - EPI European Processor Initiative Challenges for E/E-Architecture and development process





#### Example - EPI European Processor Initiative CE-World CPUs enable highly performant compute platforms



"CE-driven" semiconductors are becoming more and more powerful.

Across all industries, this capability is most apparent in high-performance processors.

Powerful operating systems are increasingly used in automotive electronic control units.

μCs

Copyright © Infineon Technologies AG 2018. All rights reserved.

#### Example - EPI European Processor Initiative Automotive eHPC platform, 1st reference implementation 2020



\* eHPC – embedded High Performance Computing


#### Example - EPI European Processor Initiative EPI Demonstrator based on MODULAR COMPUTING PLATFORM







#### EPI-Demonstrator











Even if we will not drive totally autonomously, completely connected and fully electric by 2030 - does it mean that we have failed?

No! Any step taking us closer to this ambitious vision should be recognized as a success!

... Active safety ... ADAS ... Highly automated driving in defined environments ...

... and it is one thing to draw a big visionary picture, but it is another to accomplish the necessary details.



First of all, technology is not an end in itself!

...we do have to develop mature solutions and use cases that are ready for everyday use.

Research area: Artificial Intelligence; the topic is challenging enough.

This gives industry the opportunity to benefit from academia and institutes that specialize in the field.



Solutions for established infrastructures in Germany, France or Sweden? The established countries only represent a few percent of mankind.

Already today, European car makers act globally, and their cars run in the streets of São Paulo, Cairo or Bombay.

Emerging markets will become more and more important. New technologies must be flexible and meet different standards and infrastructures.

#### Apart from that:

Never forget: it takes 100% effort to achieve 90% automation. Scaling it up to full automation takes another 100% on top. The last mile is always the hardest part of the run.

# Thank you very much for your kind attention!

Part of your life. Part of tomorrow.

