![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/1661e5a787af7012f18fbea5a91b1b5a.jpg?width=720&quality=85%2C50)
HIGH-LEVEL FPGA PROGRAMMING FOR NANOSECOND TIMING IN TERABIT COMMUNICATION
Next-generation data communication using laser signals between ground stations and satellites will be at the terabit per second level. Given the high demands on data quality and processing speed, wavefront sensors and FPGAs are essential ingredients of the required communication terminals. Demcon and Qbaylogic demonstrate the potential of high-level functional FPGA programming.
Jan Kuper Joost Kauffman
Within the Tomcat project (Terabit Optical Communication Adaptive Terminal), part of the European Space Agency’s Artes Strategic Program Line Scylight, TNO is in charge of developing an optical ground station, including an optical ground terminal (Figure 1). From a satellite, a terminal will receive laser signals that are affected by atmospheric conditions such as temperature variations and turbulence, which induce deformations of the beam’s wavefront. Adaptive optics can counteract these deformations using a segmented deformable mirror in which each segment is individually actuated based on the input provided by a wavefront sensor.
TNO and Demcon jointly built an upgrade that brings a wavefront sensor to the high sample rate (5 kHz) required for this application. It’s one of the laser communication instruments developed and marketed by the Enschede-based company with the Dutch FSO instruments consortium, supported by the knowledge institute. This project involved dedicated optical hardware and data-processing software, implemented in FPGAs, as 5 kHz sampling couldn’t be achieved on a PC.
Figure 1: Within the Tomcat project, an optical ground station, including an optical ground terminal, for laser satellite communication is being developed.
The wavefront sensor comprises an array of lenslets that each focus part of the incoming signal on a subregion of camera sensor pixels. To calibrate the mirror segments, 256 two-dimensional points of gravity of the corresponding subregions of each image have to be calculated on an FPGA. The camera sends the image to the FPGA line after line, in packages of 8 pixels of 12 bits each, together with some control bits for validity and end-of-line, at a rate of 80 MHz, whereas the Tomcat specification requires a 200 MHz FPGA rate. Hence, on average, one package of pixels arrives every 2.5 cycles of the FPGA clock. The transfer of a full image takes 157 μs and the calculations have to be completed within 3 μs after the last package has arrived.
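The timing budget follows from simple arithmetic on the clock rates quoted above. The sketch below works it out in plain Haskell; all names are illustrative and not taken from the Tomcat code.

```haskell
-- Clock rates quoted in the article, in MHz.
cameraClockMHz, fpgaClockMHz :: Double
cameraClockMHz = 80
fpgaClockMHz   = 200

-- Average number of FPGA clock cycles between two packages of pixels.
cyclesPerPackage :: Double
cyclesPerPackage = fpgaClockMHz / cameraClockMHz

-- Number of FPGA clock cycles that fit in a time span given in microseconds.
cyclesIn :: Double -> Double
cyclesIn micros = micros * fpgaClockMHz

-- The 3 microsecond deadline after the last package, expressed in FPGA cycles.
deadlineCycles :: Double
deadlineCycles = cyclesIn 3
```

Here cyclesPerPackage evaluates to 2.5 and deadlineCycles to 600, so the final computation has a budget of 600 FPGA clock cycles after the last package arrives.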
Haskell
Demcon engaged University of Twente (UT) spinoff Qbaylogic, because of its design methodology for high-level FPGA programming. The advantages are that dependencies in the various processes, for example regarding the order in which the data are received or transmitted, can be dealt with adequately, and that exact timing, down to the nanosecond level, can be achieved. This high-level functional methodology offers a fast design process with full control over FPGA code efficiency.
A central element in the design methodology, based on the functional programming language Haskell, is the open-source compiler Clash, the result of over 10 years of (ongoing) UT research. It translates a functional specification written in Haskell into any of the standardized hardware description languages (HDLs, such as VHDL, Verilog and SystemVerilog). Haskell isn’t commonly used in industry, one of the reasons being that it offers less control over CPU performance than, for example, C. However, on an FPGA, the situation is the other way around: with Haskell, a designer has more control over performance than with C/C++. The reason is that a functional Haskell specification is structural, while a C/C++ program describes behavior rather than structure. Hence, Clash generates HDL code in a structure-preserving way, whereas a high-level synthesis tool starting from C/C++ needs transformations that are hard to grasp and may generate unintended hardware with unpredictable performance.
Worthwhile aspects of the methodology include its fundamentally model-based character and the availability of adequate abstraction mechanisms, among which are higher-order functions, embedded languages and typing mechanisms. Another advantage is that cycle-accurate simulation on a functional level is possible at all design stages.
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/5f3139680955d37af93e42719937b13f.jpg?width=720&quality=85%2C50)
Correctness
Haskell, close to mathematics, is suitable for expressing the model of an application. In the Tomcat project, this was the basis for fast and effective communication about the precise functionality definition. In addition, such a model is executable, so that its correctness can be checked. The model then is the starting point for the design of an FPGA architecture that has the same functionality and meets the performance requirements. Note that all design steps are performed in the same language and hence are executable and testable. This increases the productivity (and the satisfaction) of the designer and greatly contributes to the correctness of the design.
A first abstraction mechanism is higher-order functions (HOFs), called “higher-order” because they take another function as an argument. HOFs express architectural patterns in which a given function is used repetitively. For example, in Tomcat, the parallel pairwise multiplication of a vector of pixels, ps, with a vector of indices, is, can be formulated using the HOF zipWith as zipWith (*) ps is (see also Figure 2). The first argument of zipWith can be any binary function, in this case (*), for multiplication. In practice, vectors and matrices tend to be huge and the straightforward modeling with HOFs may lead to architectures that either don’t fit on the FPGA at hand or may give rise to a slow clock. In such cases, modifications of HOFs are available with proven correctness by which a design can be pipelined or otherwise executed over time.
Figure 2: The architectural pattern of zipWith, together with another higher-order function, fold, which expresses an accumulative application of a binary operation on a sequence of values. The dot product of two vectors is a combination of zipWith and fold.
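The combination of zipWith and fold can be sketched in a few lines. The example below is plain Haskell on lists, used here as a stand-in for the fixed-size vectors of Clash; the function names are illustrative, and foldr plays the role of the fold pattern.

```haskell
-- Pairwise multiplication of pixels and indices, as in zipWith (*) ps is.
-- Lists stand in for Clash's fixed-size vectors (an assumption of this sketch).
multiplyPairwise :: [Int] -> [Int] -> [Int]
multiplyPairwise ps is = zipWith (*) ps is

-- The dot product: multiply pairwise, then accumulate the products with (+).
dotProduct :: [Int] -> [Int] -> Int
dotProduct ps is = foldr (+) 0 (zipWith (*) ps is)
```

For example, dotProduct [1, 2, 3] [4, 5, 6] computes 1*4 + 2*5 + 3*6. Replacing (*) or (+) by another binary function reuses the same architectural pattern for a different computation.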
Embedded languages, the second abstraction mechanism, offer the possibility of hiding the underlying bit representation of certain constructs. They’re very practical for an instruction set of a processor or for the states of a state machine and are very helpful in avoiding errors. In Tomcat, a small embedded language is defined for packages of pixels arriving from the camera (Figure 3). Note that an embedded language is defined as a data type so that a function can be directly defined on constructs of such a language by using a technique called pattern matching (Figure 4).
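To give an impression of such an embedded language, the sketch below defines a data type for camera packages and a function on it by pattern matching. It is an illustration in plain Haskell, not the Tomcat code: the names Package, Pixels and NoPackage are hypothetical, a list stands in for a vector of 8 pixels, and Word16 stands in for a 12-bit word.

```haskell
import Data.Word (Word16)

-- Stand-in for a 12-bit pixel value (assumption: plain Haskell has no 12-bit type).
type Pixel = Word16

-- A small embedded language for what arrives from the camera in one clock cycle.
data Package
  = Pixels Bool [Pixel]  -- end-of-line marker and the pixels of one package
  | NoPackage            -- clock cycle in which no new package arrives
  deriving (Show, Eq)

-- Pattern matching extracts the parts of a package and names them eol and pxls.
-- This example function returns the end-of-line marker and the sum of the pixels.
f :: Package -> (Bool, Int)
f (Pixels eol pxls) = (eol, sum (map fromIntegral pxls))
f NoPackage         = (False, 0)
```

Because the function has one clause per constructor, the compiler can warn when a case is forgotten, and the bit-level encoding of a package never appears in the function body.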
The importance of Clash’s typing mechanism, the third abstraction mechanism, can hardly be overstressed, because many, if not most, errors made in practice are type errors. A strong type-checking mechanism catches errors in an early design stage. As part of the typing mechanism, Clash can derive the type of some component, even if the designer doesn’t indicate the type explicitly. In Tomcat, this feature was often used: when the specification wasn’t accepted by the type system and its error message was somewhat cryptic, it would help to isolate a function and ask Clash what type it ‘thinks’ the function has.
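A minimal illustration of this way of working, in plain Haskell: the function below is deliberately written without a type signature, so that the compiler has to infer one. The function name is hypothetical, and the interactive command shown in the comment is the one offered by GHCi, on which the Clash interactive environment is based.

```haskell
-- No type signature is given on purpose: the compiler infers one.
weightedSum ps is = sum (zipWith (*) ps is)

-- In an interactive session, the inferred type can be requested with
--   :type weightedSum
-- The answer is a function from two lists of some numeric type
-- to a single value of that same type.
```

Comparing the inferred type with the intended one often shows where a specification and the designer’s expectation diverge.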
Right the first time
The methodology yielded a fully pipelined architecture in which incoming packages of pixels are first regrouped into vectors of the size needed for the application and then multiplied in parallel by index values, after which the results are added in a tree-shaped adding mechanism. Next, the results are accumulated per relevant region, after which a pipelined division operation is applied for the actual point-of-gravity computation. For each subregion, this computation is completed 70 clock cycles after its last pixel has arrived on the FPGA. At 200 MHz, this corresponds to 0.35 μs, whereas the requirement was 3 μs. The VHDL generated by Clash was adapted to the required interface and straightforwardly integrated with the VHDL for the surrounding architecture as developed in other project parts.
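The functionality behind these steps can be summarized in an untimed reference model. The sketch below is plain Haskell under stated assumptions, not the pipelined Tomcat architecture: a subregion is given as a list of pixel lines, indices start at 0, the division uses floating point instead of a pipelined divider, and all names are illustrative.

```haskell
type Pixel = Int

-- Pairwise (tree-shaped) summation, mirroring an adder tree in hardware.
treeSum :: [Int] -> Int
treeSum []  = 0
treeSum [x] = x
treeSum xs  = treeSum (pairUp xs)
  where
    pairUp (a : b : rest) = (a + b) : pairUp rest
    pairUp rest           = rest

-- Multiply each pixel of a line by its index and add the products.
weightedLine :: [Pixel] -> Int
weightedLine ps = treeSum (zipWith (*) [0 ..] ps)

-- Point of gravity of one subregion: index-weighted sums divided by the
-- total intensity. Returns Nothing for a completely dark subregion.
pointOfGravity :: [[Pixel]] -> Maybe (Double, Double)
pointOfGravity rows
  | total == 0 = Nothing
  | otherwise  = Just (ratio sumX, ratio sumY)
  where
    lineTotals = map treeSum rows
    total      = treeSum lineTotals
    sumX       = treeSum (map weightedLine rows)
    sumY       = treeSum (zipWith (*) [0 ..] lineTotals)
    ratio s    = fromIntegral s / fromIntegral total
```

A model like this can serve as the executable specification against which a pipelined implementation is tested; for instance, a subregion with a single bright pixel should yield the coordinates of that pixel.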
The functional level and abstraction mechanisms of the Clash methodology made the communication between Demcon and Qbaylogic fluent and effective since the language used for the design process is close to the language in which the application was defined originally. The resulting architecture, which satisfied the requirements, was created within the required development time; in fact, it was right the first time.
Jan Kuper is co-founder of Qbaylogic and Joost Kauffman is a senior system engineer at Demcon Focal, both in Enschede.
Edited by Nieke Roos
Figure 3: The small embedded language defined in Tomcat for packages of pixels arriving from the camera. The first clause is for packages containing a Boolean for an end-of-line marker and a vector of 8 pixels (defined as 12-bit words). The second clause is needed at those clock cycles when no new package of pixels arrives. Clash has a default translation of values of such types into bit patterns, but the designer can also define a bit representation.
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/cd35d0f56e308f8190b846f2548bcb38.jpg?width=720&quality=85%2C50)
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/71a704716a9b21d86ebc01ea1c231ddb.jpg?width=720&quality=85%2C50)
Figure 4: Using pattern matching, a function f can be directly defined on constructs of an embedded language. Parts of packages may be automatically extracted and given names (eol, pxls), which may then be used in the body of the function definition.
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/c423f04bb8ba451ea2380e366d460260.jpg?width=720&quality=85%2C50)
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/47b07b7c6938203e6cfa71e06e9ab177.jpg?width=720&quality=85%2C50)
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/ff3203772153268f1039dcb519f2cc61.jpg?width=720&quality=85%2C50)
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/9a66f7765f261fc3d2a0c74052384152.jpg?width=720&quality=85%2C50)
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/37617068b270b637d57213d38152d527.jpg?width=720&quality=85%2C50)
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/dd91113fe0e5242431d59975d8f111bd.jpg?width=720&quality=85%2C50)
Marcel Pelgrom consults on analog IC design.
![](https://assets.isu.pub/document-structure/200903140837-f2e0b329ba51ae2e7a82a286a2f8c9bd/v1/2011191d427e3452b6f23d2ea99f766a.jpg?width=720&quality=85%2C50)
Mastering complexity
When chess world champion Magnus Carlsen was asked what he had learned during his training sessions with former world champion Kasparov, the answer was simple: rapidly analyzing a complex chess position. And indeed, Carlsen is one of the fastest to find the crucial move in an unknown position. An important aspect of chess analysis is to recognize structures: specific arrangements of pawns and pieces that create a solid defense or allow specific attacking tactics. Experienced players cherish such structures, as they’re integral to their game.
Similar mechanisms can be recognized in designing and analyzing complex technical systems, in both the hardware and the software domain. In the era when a telephone exchange required a big hall, the only way to keep control over the design was to strictly separate functions and structures. The same is valid for a ‘simple’ system-on-chip IC, with various I/O channels, a few processing units and a diverse collection of memories. Clear function definitions and interfaces allow you to set aside irrelevant side issues when surveying the system.
Managing complexity starts with selecting people. Brilliant engineers design simple systems. Electronics and software mavericks, on the other hand, are a real danger, as their smart and elegant ideas mess up the simplicity of structures and turn mastering complexity into a nightmare. Equally dangerous are people with hammers who treat every issue as a nail. These engineers, once successful with a trick, apply the same solution to every problem.
Engineers must be able to adapt their way of thinking to the system they design. That starts with acquiring a variety of tools and skills, which allows you to easily switch from analyzing a control loop in the time domain to its frequency domain or root-locus view. Real masters of complexity can simplify the system to its essence without getting distracted.
They can change their perspective and analyze a problem from various sides. A broad technical experience allows them to use ideas and methods from other disciplines. That is where they outperform their run-of-the-mill colleagues. They see the common thread running through the project and from experience distinguish essential operations from nice-to-haves.
Implementing a system and choosing its technical structures is a consequence of a rigorous analysis of the required objective for the system. Forty years in IC design have taught me that there’s always only one objective that really matters – the rest is negotiable. A system built for serving multiple objectives is by definition too complex to master. Extensions, upgrades, modifications or even repairs drown in an uncontrollable workload.
The objective sometimes comes with the specifications, but mostly it’s more than that. Finding the crucial objective can be trivial. For Miele household appliances, it’s reliability. For Ferrari cars, it’s appearance. For Apple electronics, it’s ease of use. When Apple wanted to introduce the smartphone, Steve Jobs had every proposal brought into his office. He knew what his company objective was and how it should translate into a phone. Every proposed device not satisfying that was smashed against the wall.
During the implementation phase, the objective is leading and recognizable in the backbone of the system. Engineers have an unhealthy tendency to economize their designs, as if the Excel generals in the company would be able to spot the waste in hardware or code. Every structure has its own set of supporting functions; optimization isn’t advisable in an early design phase.
Complexity requires an appropriate level of administration. In the days when a single engineer could build a chip or software application, some scribbling might suffice. Today, progress in a project is virtually impossible without adequate documentation and communication. Every engineer can easily identify a dozen more attractive tasks, yet a clear way of working is a must to master complexity. Organizing the daily workload is paramount: keeping track of experiments, sorting, labeling and storing test and simulation results, and keeping modifications separate from the golden code are elements necessary for avoiding communication pitfalls in modern system design.
Designing complex systems requires a comprehensive approach, from personnel selection to documenting the daily progress. Mastering complexity is foremost an exercise in discipline, a virtue desperately needed in today’s world.