DornerWorks

Accelerate Development of High-Performance Products for Aerospace and Defense on Powerful FPGA Platforms

Posted on February 6, 2023 by Matthew Russell

In the aerospace and defense industry, size, weight, and power (SWaP) are critical factors in the design and development of high-performance systems.

This is one of the reasons why our engineers focus on helping customers accelerate development of these systems on powerful FPGA-based platforms, specifically the RFSoC platform.

The RFSoC platform is a powerful and flexible FPGA-based system that enables high-performance radar systems at reduced SWaP. Its parallel processing capabilities allow large amounts of data to be processed in real time. This reduces the need for bulky and power-hungry processors, enabling smaller and more efficient systems.

DornerWorks engineers are developing a component of one of those systems using the RFSoC as a foundation. As an AMD Xilinx Premier partner, DornerWorks has served as a launch partner in accelerating applications for the RFSoC and Versal platforms.

CASE STUDY: Reduced SWaP with AMD Xilinx RFSoC

Brian Kachala is an Embedded Software Engineer at DornerWorks
Juan Morales is an FPGA engineer at DornerWorks

DornerWorks engineer Brian Kachala implemented the DBSCAN algorithm on the RFSoC, starting from existing open source implementations. Working with DornerWorks Senior FPGA Engineer Juan Morales, he developed a component that makes up part of a processing chain for an RF system.

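DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering algorithm that is well-suited for finding clusters of arbitrary shape in large datasets. It groups points that have enough close neighbors and labels isolated points as noise. This makes it a good fit for radar processing systems, which often have to deal with large amounts of data and complex signal patterns.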
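To make the idea concrete, here is a minimal software sketch of DBSCAN in C++. It is not the DornerWorks implementation, nor the scikit-learn or mlpack code; the names `Point`, `region_query`, `dbscan`, `kNoise`, and `kUnvisited` are hypothetical, and the brute-force neighbor search is an assumption chosen for brevity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D sample type used only for this illustration.
struct Point { float x; float y; };

constexpr int kUnvisited = -2;  // point not yet examined
constexpr int kNoise = -1;      // point that belongs to no cluster

// Returns the indices of all points within eps of pts[i] (including i itself).
std::vector<std::size_t> region_query(const std::vector<Point>& pts,
                                      std::size_t i, float eps) {
    std::vector<std::size_t> out;
    const float eps2 = eps * eps;  // compare squared distances, no sqrt needed
    for (std::size_t j = 0; j < pts.size(); ++j) {
        const float dx = pts[i].x - pts[j].x;
        const float dy = pts[i].y - pts[j].y;
        if (dx * dx + dy * dy <= eps2) out.push_back(j);
    }
    return out;
}

// Returns one label per point: 0, 1, 2, ... for clusters, kNoise for outliers.
std::vector<int> dbscan(const std::vector<Point>& pts, float eps,
                        std::size_t min_pts) {
    assert(eps > 0.0f && min_pts >= 1);
    std::vector<int> labels(pts.size(), kUnvisited);
    int cluster = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (labels[i] != kUnvisited) continue;
        std::vector<std::size_t> seeds = region_query(pts, i, eps);
        if (seeds.size() < min_pts) {  // too few neighbors: not a core point
            labels[i] = kNoise;
            continue;
        }
        labels[i] = cluster;
        // The seed list grows while we walk it, so index rather than iterate.
        for (std::size_t k = 0; k < seeds.size(); ++k) {
            const std::size_t q = seeds[k];
            if (labels[q] == kNoise) labels[q] = cluster;  // border point
            if (labels[q] != kUnvisited) continue;
            labels[q] = cluster;
            const std::vector<std::size_t> nbrs = region_query(pts, q, eps);
            if (nbrs.size() >= min_pts) {  // q is a core point: keep expanding
                seeds.insert(seeds.end(), nbrs.begin(), nbrs.end());
            }
        }
        ++cluster;
    }
    return labels;
}
```

Calling `dbscan` on two tight groups of three points plus one distant point, with `eps = 1.5` and `min_pts = 3`, labels the groups 0 and 1 and marks the distant point as `kNoise`. Note that this version relies on `std::vector`, which grows at run time, so it is a software reference rather than code an HLS tool could synthesize directly.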

When implemented on the RFSoC, DBSCAN can take advantage of the FPGA’s parallel processing capabilities to quickly and efficiently analyze radar data. This allows for real-time processing of large amounts of data, reducing the need for bulky and power-hungry processors.

In addition to its performance benefits, DBSCAN has the advantage of being widely available in open source libraries. This means that it can be integrated into existing systems and customized to meet specific requirements. This flexibility makes DBSCAN a valuable tool for any company looking to develop high-performance radar processing systems.

What solution did DornerWorks engineers implement?

DornerWorks engineers looked at the scikit-learn implementation in Python and the mlpack implementation in C++ as starting points.

To help software developers take full advantage of its FPGA devices, AMD Xilinx provides a High-Level Synthesis (HLS) tool that allows developers to write C, C++, or SystemC code and then automatically generate FPGA-optimized RTL (register-transfer level) code.

“We started off with those as a base and I used the HLS tool to translate those algorithms into code that’s able to be synthesized,” Kachala says. “You have to do some translation because there are some constructs that aren’t available in C or aren’t available in a format that the tool understands how to use.”

Kachala translated the ML algorithms and added optimizations using the HLS tool. Morales completed the FPGA portion of the development in Vivado, while ensuring the system met timing requirements.

Morales worked to improve the timing of the system, which runs at 150 MHz on the default setting. He brought all the code together and added Makefile scripts for building the Vivado project and the Vitis bare-metal app.

What does it take to port the system to the AMD Xilinx Versal platform?

Both the RFSoC and Versal platforms are designed to handle large amounts of data in real time, making them well-suited for high-performance applications such as radar and communications systems. Both platforms can accelerate AI and machine learning workloads, which allows for efficient implementation of complex algorithms and neural networks. Both are also designed for power-efficient operation, making them well-suited for resource-constrained systems.

On the engineering end, it only takes a few modifications to the Vivado project, targeting a Versal board instead of the RFSoC architecture. Running this system on the Versal board adds the potential for other powerful AI/ML integrations with the Versal AI Core.

How does the HLS tool help developers implement algorithms on an FPGA?

One of the key benefits of HLS is that it allows software developers to work at a higher level of abstraction, without needing detailed knowledge of the underlying hardware. This makes it easier to write and debug code, and makes it possible to quickly prototype and test different algorithms. Additionally, HLS can automatically optimize the generated RTL code for performance, area, and power, which can yield efficient implementations in far less development time than manual RTL coding typically requires.

To get started with HLS, software developers will first need to have a Xilinx development environment set up, including the Xilinx Vivado Design Suite. Once this is done, developers can write their C, C++, or SystemC code and then use the HLS tool to generate the RTL code for the FPGA. The HLS tool also includes built-in simulation, so the source code can be tested before synthesis and the generated RTL can be checked against the same test bench, making it easier to ensure that the implementation is working as expected.

“The tool will give you guidance on different ways that the current algorithm state can be improved,” Kachala says. “That is super helpful.”

The HLS tool can show a developer where efficiencies and improvements can be made, or point out bottlenecks or restrictions.

“The tool definitely helped me in that sense,” Kachala says. “On top of it, I’m not very fluent in RTL language. It would take more learning on my part to actually just get it working on an FPGA.”

As an embedded software engineer, Kachala’s background is primarily in C++, Python, and higher-level C.

“It was interesting seeing the initial implementation, how it estimates the runtime and the resource utilization,” he says. “My initial iteration of it was about 10 times slower than running it in software. After going through multiple iterations and improvements, I got it between 20 to 50 times quicker than what the software implementation could have been. It’s just really cool seeing the improvement that you could get with [HLS].”

Morales concurs, noting that HLS pragmas help achieve the desired end result, because the code the HLS tool needs in order to infer the intended hardware isn’t always intuitive.
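As a rough illustration of what a pragma looks like in practice, the sketch below counts how many stored points fall within a radius of a query point, which is the kind of inner loop a clustering algorithm spends its time in. The function `count_neighbors` and the capacity `kMaxPoints` are hypothetical and are not taken from the DornerWorks design. `#pragma HLS PIPELINE II=1` asks the HLS tool to start a new loop iteration every clock cycle; whether the tool can meet that target depends on the device and the clock, and an ordinary C++ compiler simply ignores the pragma.

```cpp
#include <cassert>

// Assumed compile-time capacity; fixed sizes let the tool map arrays to memory.
constexpr int kMaxPoints = 64;

// Counts how many of the first n stored points lie within eps of (qx, qy).
// Both arrays must hold kMaxPoints entries; entries at index n and above
// are read but never counted.
int count_neighbors(const float xs[kMaxPoints], const float ys[kMaxPoints],
                    int n, float qx, float qy, float eps) {
    assert(n >= 0 && n <= kMaxPoints);
    const float eps2 = eps * eps;  // compare squared distances, no sqrt needed
    int count = 0;
    // A constant trip count lets the tool report a fixed latency for the loop.
    for (int j = 0; j < kMaxPoints; ++j) {
#pragma HLS PIPELINE II=1
        const float dx = xs[j] - qx;
        const float dy = ys[j] - qy;
        const bool hit = (j < n) && (dx * dx + dy * dy <= eps2);
        count += hit ? 1 : 0;
    }
    return count;
}
```

The loop always runs `kMaxPoints` times and masks out unused entries with `j < n` instead of stopping early. That is a design choice made for this sketch: it trades a few wasted iterations for a loop whose timing is known at synthesis time.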

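One important consideration when working with HLS is that not all C, C++, or SystemC code can be automatically translated into RTL code. The HLS tool has certain limitations and constraints, so it is important for developers to understand these and design their code accordingly. For example, HLS generally does not support dynamic memory allocation or recursive function calls, because the hardware’s resources must be fixed at synthesis time, so these features need to be avoided in the source code. Some C++ features are also restricted: standard library containers that allocate memory at run time are typically not synthesizable, and virtual functions are supported only in limited cases where the call target can be resolved at compile time. Templates, by contrast, are resolved at compile time and are commonly used in HLS code. Developers should check the tool’s documentation for these limitations before starting to use it.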
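The restrictions on recursion and dynamic memory usually mean restructuring code rather than abandoning an algorithm. The sketch below shows one common pattern under stated assumptions: a graph traversal that would naturally be written recursively is rewritten with a fixed-capacity work list and bounded loops. The names `mark_reachable` and `kNodes` are hypothetical, and the adjacency matrix stands in for whatever neighbor relation a real design would use.

```cpp
#include <cassert>

// Assumed compile-time node count; all storage below is sized from it.
constexpr int kNodes = 8;

// Marks every node reachable from start and returns how many were reached.
// A recursive visit(v) would be the natural software version; here a
// fixed-size array acts as the work list, so no heap and no recursion.
int mark_reachable(const bool adj[kNodes][kNodes], int start,
                   bool seen[kNodes]) {
    assert(start >= 0 && start < kNodes);
    int stack[kNodes];  // static storage replaces new/malloc
    int top = 0;
    for (int i = 0; i < kNodes; ++i) seen[i] = false;
    seen[start] = true;
    stack[top++] = start;
    int visited = 0;
    // Each node is pushed at most once, so kNodes iterations always suffice.
    for (int iter = 0; iter < kNodes && top > 0; ++iter) {
        const int v = stack[--top];
        ++visited;
        for (int w = 0; w < kNodes; ++w) {
            if (adj[v][w] && !seen[w]) {
                seen[w] = true;  // mark on push so the stack cannot overflow
                stack[top++] = w;
            }
        }
    }
    return visited;
}
```

Marking a node as seen when it is pushed, rather than when it is popped, is what guarantees the work list never needs more than `kNodes` slots. That bound is what lets the array size, and therefore the hardware memory, be fixed at compile time.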

By allowing developers to work at a higher level of abstraction and automatically generating FPGA-optimized RTL code, HLS can significantly speed up the development process and help achieve faster and more efficient implementations. However, it is important to understand the limitations of HLS and design the code accordingly.

If you need help accelerating development on an FPGA-based project, or think an FPGA can bring higher performance to your system, schedule a meeting with our team and turn your ideas into reality.
