# New Methodologies for SET Characterization and Mitigation in Flash-Based FPGAs

Sana Rezgui, Member, IEEE, J.J. Wang, Member, IEEE, Eric Chan Tung, Brian Cronquist, Member, IEEE, and John McCollum

*Abstract*—New SET characterization and mitigation techniques unique for non-volatile FPGAs are investigated. Their implementation on a flash-based FPGA and evaluation in-beam show their efficacy with little area overhead but moderately high time penalty for highly-scaled technologies.

*Index Terms*—SET Characterization and Mitigation, reprogrammable and non-volatile FPGAs, radiation testing.

## I. INTRODUCTION

Reprogrammable Field Programmable Gate Arrays (FPGA) constitute an effective ASIC replacement for various applications in the military and aerospace markets. However, their sensitivity to Single Event Effects (SEE), in addition to other radiation effects, warrants investigation, characterization and mitigation [1]. Indeed, as transistor feature sizes have scaled down, their critical charges for SEE have scaled down as well. As a consequence, SEE can affect both the sequential and the combinational logic. In the first case, they are called Single Event Upsets (SEU); in the second, they are called Single Event Transients (SET).

If hardening is incorporated into a non-volatile FPGA's sequential logic, SET can become the primary source of observable errors [2-7], as seen in the One-Time Programmable (OTP) FPGAs. On non-volatile FPGA or ASIC circuits, their effect remains "transient" if it is not captured by a memory cell. Triple Modular Redundancy (TMR), the most commonly used SEE mitigation technique, is hence not a requirement for the combinational logic gates, which avoids high area overhead and reduces the implementation complexity of mitigated designs. Indeed, instead of tripling the design to vote out the wrong path, the SET can be filtered at the input of a TMR'd memory element (Flip-Flop (FF), latch, SRAM, etc.). This is the main idea of this paper: the evaluation of a new SET mitigation technique, based on SET filtering, for non-volatile FPGAs (NV-FPGA).

Manuscript received July 20, 2007. This work was supported in part by the Air Force Research Laboratory (AFRL).

Sana Rezgui, *Member, IEEE*, J.J. Wang, Brian Cronquist and John McCollum are with Actel Corporation, Mountain View, CA 94043 USA (650-318-4928; fax: 650-318-2571; e-mail: sana.rezgui@actel.com).

J.J. Wang, e-mail: j.j.wang@actel.com

Eric Chan Tung was with Actel Corporation, Mountain View, CA 94043 USA. He is now with the University of Toronto, Toronto, ON Canada.

Brian Cronquist, e-mail: brian.cronquist@actel.com

John McCollum, e-mail: john.mccollum@actel.com

Most of the previously proposed SET filtering techniques [3-7] require either duplicating the logic or delaying the signal issued from the combinational logic cells by more than the SET pulse width. The SET mitigation solution selected in this paper mainly uses the latter variant. Its efficacy therefore depends on the maximum SET pulse width, since a wide pulse could result in a high time penalty for the mitigated design. For this purpose, new methodologies for the calculation of the SET cross-section with no mitigation and for the measurement of the maximum SET pulse width are also required and will be described in this paper. The results of the Heavy Ion (HI) in-beam testing of both mitigated and non-mitigated designs implemented on a 0.13-µm ProASIC3 FPGA core, running at up to 50 MHz, will be presented and discussed.

## II. BASIC SET DETECTION AND MITIGATION

The proposed technique for the measurement of the SET cross-section of the combinational logic is derived from a technique used in [3-6]. As shown in Fig. 1, conceptually the design utilizes an inverter-string connected to a latch to capture SET in the inverters. In normal operation, the input of the inverter-string and the Reset-input of the latch remain at '0'. The application of a momentary '1' to the Set-input can cause the latch to go to the set state with an output of '1'. Resetting the latch recovers the output to '0'. Consequently, any SET having a pulse width wider than the latch setup time will trigger the state transition from '0' to '1'. This SET-detection technique yields a true combinational logic SET cross-section, since its result does not depend on the clock speed.



Fig. 1: SET Characterization Circuit

As shown by Baze [5], the same technique can be enhanced to measure the SET pulse width and also to mitigate SET effects. Fig. 2 shows a conceptual design. There are three basic components: 1) a combinational logic called target (for SET generation), 2) an SET filter, which controls the minimum detectable pulse-width of an SET, and 3) an asynchronous latch to capture and register the occurrence of an SET as a static state. The SET filter uses an inverter string to delay the signal along one path and uses a guard-gate to pass only those transients with widths exceeding the delay. Fig. 2 shows a guard-gate of 4 transistors; it functions as an AND gate when the 2 input-signals agree, or as a latch of the previous state when input signals differ.



Fig. 2: SET pulse-width measurement and SET mitigation circuit
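The filtering behavior described above can be illustrated with a minimal discrete-time sketch. It assumes an idealized guard gate that follows its inputs when they agree and holds its previous output otherwise, and it models the inverter string as a pure delay of `delay_steps` samples. The function names, the sampling step and the pulse shapes are illustrative assumptions and are not taken from the tested circuit.

```python
def guard_gate(direct, delayed, previous):
    """Idealized guard gate: follow the inputs when they agree, else hold."""
    return direct if direct == delayed else previous


def filter_set(signal, delay_steps):
    """Filter a sampled 0/1 signal with a delay line feeding a guard gate.

    `signal` is a list of samples and `delay_steps` is the delay of the
    inverter string expressed in samples (hypothetical time base).
    """
    output = []
    previous = signal[0]
    for t, direct in enumerate(signal):
        # The delayed path sees the signal as it was `delay_steps` samples ago.
        delayed = signal[t - delay_steps] if t >= delay_steps else signal[0]
        previous = guard_gate(direct, delayed, previous)
        output.append(previous)
    return output


def pulse(width, length=40, start=5):
    """Build a quiet '0' line carrying one positive transient of `width` samples."""
    return [1 if start <= t < start + width else 0 for t in range(length)]


# With a 6-sample delay, a 4-sample transient is removed and a 7-sample one passes.
short_out = filter_set(pulse(4), delay_steps=6)
long_out = filter_set(pulse(7), delay_steps=6)
```

In this model a transient reaches the output only if it is strictly wider than the delay, which mirrors the trade-off between the filtered pulse width and the added latency.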

This technique is easy to implement in an NV-FPGA with minimum hardware overhead. However, a trade-off between performance and hardening is unavoidable: the wider the filtered SET pulse, the lower the maximum allowed frequency of the mitigated design. This penalty will become more severe with advancing technologies and lower core operating voltages. Nevertheless, this technique is considered the baseline approach for SET mitigation on the ProASIC3 FPGA family. Its implementation on this device requires some changes and enhancements to detect and mitigate SET. In the following, the test design used for the validation of the SET detection technique will be called "Test Design A" and the one used for the evaluation of the SET mitigation technique "Test Design B". Brief descriptions of the selected FPGA internal architecture, as well as of the test vehicle for the implementation of these novel approaches, are provided in the next section.

## III. PROASIC3 IMPLEMENTATION OF ENHANCED SET DETECTION AND MITIGATION

### A. Device

The Actel ProASIC3 family is both non-volatile and reprogrammable, which is enabled by an advanced Flash-based, 130-nm LVCMOS process with 7 metal layers. This product family has up to 3 million system gates in the core logic area, 504 kbits of true dual-port SRAM, 616 single-ended I/O, and 300 differential I/O pairs. Also included on chip are 1 kbit of nonvolatile Flash ROM (FROM) and up to 6 integrated phase-locked loops (PLL). Two devices from the ProASIC3 product family, the A3P250 and the A3P1000, are selected for the implementation and demonstration of hardened designs.

The FPGA core consists of a number of logic tiles called "VersaTiles" and routing structures (as shown in Fig. 3). Each logic tile is a combination of CMOS logic and flash switches and can be configured as a three-input logic function (Look-Up Table: LUT 3) or as a D-flip-flop (with or without enable), or as a latch by programming the appropriate flash switch interconnections [8]. VersaTiles can flexibly map the logic and sequential gates of a design and are connected with each other through routing structures and floating gate (FG) switches as shown in the bottom right side of Fig. 3.



Fig. 3: ProASIC3 Core VersaTile and Flash-Based Switch

The flash switch storing the programming information includes two transistors that share the same FG. One is the sensing transistor, which is only used for writing and verification of the FG voltage. The other is the switching transistor, used to connect or separate routing nets, or to configure a logic tile as well as to erase the FG. These flash switches are distributed throughout the device to provide nonvolatile, reconfigurable programming to connect signal lines to the appropriate Logic Tile inputs and outputs.

In 0.13-µm technology, the active junctions of both the CMOS logic and the switches in a "VersaTile" or routing structures are expected to be SET sensitive. The purpose of the proposed SET characterization technique is to determine the SET cross-section of the smallest cell unit accessed by the user in the FPGA core. The smallest cell unit found in this FPGA is a routing FG switch and a logic tile configured as an inverter. In this paper, this cell unit will be called a Logic Cell like-Inverter (LCI) and the main idea will be to study the SET effects on an LCI and how to mitigate them. More thorough SEE characterization of the rest of the A3P programmable architectures is given in [9].

### B. SET Detection based on "Test Design A"

To implement "Test Design A" on an A3P, the latch used for SET detection should itself be mitigated to avoid unwanted errors. The latch is tripled as well as its reset signal. The output signal of the inverter-string is connected to each latch-input and each latch is implemented on one separate logic tile. Each latch-output signal is routed to a separate output pad to allow the "master" FPGA to decide whether an SET has occurred on the target combinational logic (the inverter-string) or the remainder of the FPGA's design. Hence, the SET cross-section is calculated based on a comparison between the 3 output values (Dout\_TR0, Dout\_TR1 and Dout\_TR2) as follows:

- Any discrepancy between the 3 output signals, where only one of them is equal to '1' should be due to an SEU in one of the latches, the FG switches used to connect one of the latches or the active regions of its corresponding output pad. The Finite State Machine (FSM) built in the "master" FPGA will then count it as an SEU in a latch and reset the latch storing the wrong information.
- Any agreement of 2 or 3 output signals on the logic state '1' should indicate the occurrence of an SET in the target combinational logic. Indeed, since an SET occurring in any of the LCIs should propagate to the 3 latches, the 3 output signals should be at '1'; if one of them is in reset mode, the two others will still indicate the occurrence of an SET. Upon the detection and count of an SET, the FSM in the "master" FPGA resets the 3 latches to allow the detection of the next transient. As a result, a few SET may not be counted during the reset period of the 3 latches. This should be negligible given the SET rate (on the order of seconds, as will be seen in beam testing) and the high speed of error-correction in each latch (less than 40 ns).
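The decision rule in the two items above can be sketched as a small classifier. This is only an illustration of the counting logic, assuming that each sample of the three outputs is taken after the latches were reset following the previous event; the function names and the sample format are hypothetical and do not describe the actual FSM in the "master" FPGA.

```python
def classify_latch_outputs(dout_tr0, dout_tr1, dout_tr2):
    """Classify one sample of the three latch outputs (each 0 or 1)."""
    ones = dout_tr0 + dout_tr1 + dout_tr2
    if ones == 0:
        return "none"   # nothing captured
    if ones == 1:
        return "SEU"    # a single latch (or its routing/pad) was upset
    return "SET"        # 2 or 3 latches agree: transient in the inverter-string


def count_events(samples):
    """Count SEU and SET events over a sequence of (d0, d1, d2) samples."""
    counts = {"SEU": 0, "SET": 0}
    for sample in samples:
        kind = classify_latch_outputs(*sample)
        if kind != "none":
            counts[kind] += 1
    return counts


example = count_events([(0, 0, 0), (1, 0, 0), (1, 1, 1), (0, 1, 1)])
```

A sample with two outputs at '1' is still counted as an SET, which covers the case where one latch is in reset mode when the transient arrives.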

Fig. 4 shows the block diagram of the Test design A with 450 LCI, together with a compact scheme of the SET detection circuit implemented on the "master" FPGA.

| Part                   | A3P250 | A3P1000 |
|------------------------|--------|---------|
| System Gates           | 250K   | 1M      |
| D-Flip-Flops           | 6,144  | 24,576  |
| RAM Kbits              | 36     | 144     |
| Flash-ROM              | 1K     | 1K      |
| Secure (AES) ISP       | Yes    | Yes     |
| Integrated PLL         | 1      | 1       |
| Global Signals         | 18     | 18      |
| I/O Banks              | 4      | 4       |
| Single-Ended I/O       | 151    | 154     |
| Differential I/O Pairs | 34     | 35      |



Fig. 4: Block Diagram of the SET Detection in the FPGA Combinational Logic "Test Design A"

### C. SET Pulse Width Measurements and Mitigation based on "Test Design B"

As in the case of "Test Design A", a few additional enhancements and modifications are made for the implementation of "Test Design B" on the A3P FPGA. The specific guard gate cell (Fig. 5a) is replaced by an FPGA VersaTile configured as a LUT3 that performs the same logic function. Fig. 5b shows the selected gate-level implementation, a NAND C-element [10], named GG in this paper. The C-element described in [10] is an asynchronous logic component. Its output reflects the inputs when the states of all inputs match, and it remains in this state until both inputs transition to the other state.

Furthermore, because of its sensitivity to SET and to avoid any SET single point of failure, this cell is tripled and each GG-output is connected to a separate latch. The delay block is implemented by means of LCI cells and will be called LCI-delay. Fig. 6 shows the proposed scheme for the SET pulse width measurement as well as for the SET mitigation to be evaluated.



Fig. 5: Different Implementations of the Guard-Gate Cell



Fig. 6: Block Diagram of the SET Mitigation in the FPGA Combinational Logic "Test Design B"

## IV. PROASIC3 RADIATION TESTS AND RESULTS

### A. Test Setup

The test setup includes two boards: 1) a "master" board for the monitoring and control of the DUT FPGA in-beam and 2) a "slave" board for the communication between the host PC and the master board (through two USB ports). The "master" board includes an A3P1000-FG484 (called "master" FPGA in this paper), and a DUT FPGA mounted on a PQ208 package that could support either an A3P250-PQ208 or an A3P1000-PQ208. The board's external clock is supplied by an oscillator of 33 MHz. To acquire higher test frequencies, the internal PLL of the master FPGA is used. The DUT power is provided by an HP power supply through a GPIB bus connected to a PC for current sensing and monitoring.

Furthermore, short IO "channels", each consisting of an input routed immediately to a nearby output, have been added between the "master" FPGA and the DUT. There are 38 Single-Ended (SE) and 13 Low Voltage Differential Signaling (LVDS) I/O channels in total on both FPGAs. This type of board architecture allows the implementation of several separate designs on the same DUT and their simultaneous testing.

### B. FPGA Core SET Characterization & Mitigation

#### 1) Test Designs

Considering the high number of IOs connecting the DUT to the "master" FPGA, many sub-designs could be implemented and exercised simultaneously on the same DUT with no apparent interaction between them (from user point of view). This feature was very beneficial to test the non-mitigated and mitigated designs at the same beam test conditions. Indeed, the beam test design is a set of 12 sub-designs, which are 1) a non-mitigated design, and 2) eleven mitigated LCI-delay implementations from 2 to 22 LCI in length.

As deduced from the design's timing analysis, the delay through each LCI is approximately 500 picoseconds, so the LCI-delay in the logic varies between 1 and 11 ns. The main purpose of increasing the test delay to 11 ns is to make sure that there will be no observed errors. Fig. 7 shows the resulting implementations of test designs A and B after modification. "Test Design A", used for SET detection, is implemented on channel 1, while the various designs for "Test Design B" are implemented on channels 2 to 12.



Fig. 7: Non-Mitigated and Mitigated Test Designs with Various LCI-Delays but no IO Bank SET Mitigation
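The mapping between channels, delay-chain lengths and filter delays can be tabulated with a short sketch. It assumes the nominal 500 ps per LCI quoted above and a step of 2 LCI between consecutive channels, which is consistent with eleven implementations spanning 2 to 22 LCI but is only stated explicitly for channels 2 to 4; the function name is illustrative.

```python
LCI_DELAY_NS = 0.5  # approximate delay per LCI from the timing analysis


def lci_delay_table(first_channel=2, min_lci=2, max_lci=22, step=2):
    """Return {channel: (number of LCI, delay in ns)} for the mitigated channels.

    The step of 2 LCI per channel is an assumption for channels beyond 4.
    """
    table = {}
    for offset, n_lci in enumerate(range(min_lci, max_lci + 1, step)):
        table[first_channel + offset] = (n_lci, n_lci * LCI_DELAY_NS)
    return table


table = lci_delay_table()
for channel, (n_lci, delay_ns) in table.items():
    print(f"channel {channel}: {n_lci} LCI, {delay_ns:.1f} ns")
```

Under these assumptions channel 2 filters pulses up to 1 ns and channel 12 up to 11 ns.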

#### 2) Test Results

Fig. 8 shows the cross-sections obtained for channels 1, 2, 3 and 4. Channel 2 uses a 2 LCI-delay, channel 3 uses a 4 LCI-delay and channel 4 uses a 6 LCI-delay. Beyond the 6 LCI-delay, the rest of the channels showed some error-scattering. This includes errors at low LET (< 10 MeV-cm<sup>2</sup>/mg) for the highest delay channel.

This effect was conjectured to be due to SET on the enable signal of the IO banks. As all the sub-designs are implemented with different delays, it is difficult to observe this event at once on all the IO channels and consequently to verify the true origin of this phenomenon. However, the cross-section of the error-scattering ($1.7 \times 10^{-7}$ cm<sup>2</sup>/IO-Bank), as given in Fig. 8 (in hollow circles), equals the wide SET cross-section on the enable signal of a single IO bank, which suggests that this error-scattering might truly be due to SET on the used IO banks. Nonetheless, additional beam experiments were performed to investigate the origin of these errors, as will be shown in the following section.



Fig. 8: SEE Sensitivities of Non-Mitigated and Mitigated Test Designs with Various LCI-Delays

The results for channel 1 show an SET cross-section per LCI of $10^{-7}$ cm<sup>2</sup> and an LET<sub>th</sub> no less than 3.45 MeV-cm<sup>2</sup>/mg. Additionally, adding a 2-LCI SET mitigation (channel 2) yields only a small reduction of the design's sensitivity to SET (less than half) and no change in the LET<sub>th</sub> value, which shows that most of the SET occurring on this FPGA have a pulse width wider than 1 nanosecond. However, increasing the LCI-delay from 2 to 4 LCI reduces the saturation cross-section to $2 \times 10^{-8}$ cm<sup>2</sup>/LCI, which is almost 5 times less than when no mitigation is used (channel 1). No great improvement in the LET<sub>th</sub> was observed, though.

The LCI-delay has to be increased for further SET hardening. A 6 LCI-delay (channel 4) increases the LET<sub>th</sub> to 19 MeV-cm<sup>2</sup>/mg and reduces the cross-section to 3.9 × 10<sup>-9</sup> cm<sup>2</sup>/LCI (almost 10 times less than in the case of the non-mitigated LCI cell), while only one error was observed, at an LET of 58.72 MeV-cm<sup>2</sup>/mg, for the channel 5 design at a fluence of 1.1 × 10<sup>8</sup> Xenon particles. Note that some of these errors on channels 1 to 5 could be due to SET on the enable signal of the used IO banks, which means that these cross-sections could be overestimated and the LET<sub>th</sub> underestimated.
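For readers less familiar with the quantity, a cross-section is conventionally obtained by dividing the number of observed errors by the particle fluence, optionally normalized by the number of cells under test. The sketch below applies this standard definition; it assumes that the quoted fluence is expressed in particles/cm<sup>2</sup>, and the function name and the cell count of 450 used for normalization are illustrative assumptions rather than values stated for the channel 5 design.

```python
def cross_section_cm2(n_errors, fluence_per_cm2, n_cells=1):
    """Return the cross-section in cm^2 (per cell when n_cells > 1)."""
    if fluence_per_cm2 <= 0 or n_cells <= 0:
        raise ValueError("fluence and cell count must be positive")
    return n_errors / (fluence_per_cm2 * n_cells)


# One error at a fluence of 1.1e8 particles/cm^2 (fluence unit assumed).
per_design = cross_section_cm2(1, 1.1e8)

# Hypothetical normalization by 450 LCI, the count quoted for Test Design A.
per_lci = cross_section_cm2(1, 1.1e8, n_cells=450)

print(f"per design: {per_design:.2e} cm^2, per LCI: {per_lci:.2e} cm^2")
```

With a single observed error the statistical uncertainty is large, so such a value is better read as an order-of-magnitude estimate.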

Except for this error-scattering, the obtained results demonstrate that the maximum SET pulse width on an A3P FPGA core is between 3 and 4 ns. The increase of the LET<sub>th</sub> with the SET pulse width demonstrates that higher LET HI hits result in wider SET pulses.

### C. IO SET Test and Abnormality

#### 1) Test Design

Two IO standards have been targeted for beam testing: LVCMOS33 (Low Voltage CMOS operating at 3.3V) and LVDS25 powered at 2.5V. The DUT FPGA is configured with designs that implement 38 short SE "channels" of an input routed immediately to a nearby output, and 13 LVDS IO channels routed also in the same manner. Using pins in close physical proximity minimizes the routing resources and therefore the number of FG switches. The IO test design was tested at 3 different frequencies (2, 16 and 50 MHz). The simplest version of this test design is illustrated for one SE IO channel in Fig. 9.



Fig. 9: Scheme of the Single-Ended IOs' Testing

#### 2) Test Results

The radiation test experiments have been performed in heavy-ion beams for an LET varying between 6 and 83.04 MeV-cm<sup>2</sup>/mg. Beam test results showed 3 types of transient errors: 1) an SET on a single IO channel, 2) an SET observed for only one clock cycle that disrupts an entire IO bank (and only one), and 3) an SET that lasts for 2 (or 7) clock cycles when running the design at 16 MHz (or 50 MHz) and also disrupts a whole IO bank.

As the product's designers, we could investigate the origin of this SET on the IO bank (short or wide). Indeed, there is only one signal common to all the IOs of a single IO bank, and that is a global enable signal. The circuit used to provide this signal is composed of combinational logic and latches. If an SET occurs on the combinational circuit after the latches, it lasts for a few nanoseconds, like any other SET on the CMOS logic or the FG switches. If an SET on the combinational logic induces an SEU in a latch, or if an SEU occurs in the latch itself, it disables the IOs for 250 ns, which is exactly what is seen in beam. This circuit was originally designed to avoid in-rush current, but it appears that it can be disrupted in an HI radiation environment.

This event is similar to the so-called IO SEFI event that has been observed in Xilinx Virtex FPGAs [11], since both affect many I/Os at the same time. However, as opposed to the Virtex FPGA case, this SET affects only one A3P IO bank at a time and never all of them at once, which should allow its mitigation, as will be demonstrated later in this paper. In addition, unlike in the Virtex FPGAs, it does not require any error correction such as scrubbing (requiring milliseconds); it is simply a transient and clears within at most 250 ns. Furthermore, the cross-sections of both events are very different: $2 \times 10^{-7}$ cm<sup>2</sup>/IO-Bank for the A3P and $4 \times 10^{-6}$ cm<sup>2</sup>/FPGA for the Virtex FPGA.

Furthermore, none of the observed errors required reconfiguration of the DUT and none of them was observed when running the design at 2 MHz. At 16 MHz, only one event of error-type 3 was observed, starting from an LET of 67.8 MeV-cm<sup>2</sup>/mg at a fluence of $1.5 \times 10^6$ Xenon particles. At 50 MHz though, the 3 error-types were clearly observed and their cross-sections are provided in Fig. 10.



These data show that the FPGA's IOs are susceptible to SET, which occur primarily at high frequencies (50 MHz). In addition, because a type-3 SET lasts for 250 to 280 ns (7\*2\*20 ns = 280 ns when running the design at 50 MHz and 2\*2\*62.5 ns = 250 ns when running the design at 16 MHz), a mitigation solution based on SET filtering would impose a huge and impractical time penalty on the DUT design. However, a TMR implementation of the used IOs, where each of the tripled inputs or outputs uses a different IO bank, could mitigate these types of errors. It should be mentioned that this cross-section is very low ($1.7 \times 10^{-7}$ cm<sup>2</sup>/IO-bank) and in some applications, such as video imaging, should not constitute a good justification to use TMR for the IOs. The saturation cross-section of error-type 2 is 2.2 × 10<sup>-6</sup> cm<sup>2</sup>/IO-Bank. The threshold LET (LET<sub>th</sub>) for all errors is around 7 MeV-cm<sup>2</sup>/mg. More testing will be done to prove that the errors of types 1 and 2 are similar to those occurring on the CMOS logic or FG switches of 0.13-µm technology, which could allow their filtering.
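The duration arithmetic quoted above, and the reason a delay-based filter is impractical for this event, can be reproduced with a short sketch. The factor of 2 is taken directly from the arithmetic in the text and its origin is not explained here, so it is kept as an explicit parameter; the 0.5 ns per LCI is the approximate value from the timing analysis, and the function names are illustrative.

```python
import math

LCI_DELAY_NS = 0.5  # approximate delay per LCI from the timing analysis


def transient_duration_ns(clock_cycles, frequency_mhz, factor=2):
    """Duration of a type-3 event as cycles * factor * clock period.

    `factor` reproduces the multiplier used in the text (assumption: kept as-is).
    """
    period_ns = 1000.0 / frequency_mhz
    return clock_cycles * factor * period_ns


def lci_needed_to_filter(duration_ns, lci_delay_ns=LCI_DELAY_NS):
    """Number of LCI a delay chain would need to cover the given duration."""
    return math.ceil(duration_ns / lci_delay_ns)


at_50mhz = transient_duration_ns(7, 50)  # 7 cycles at 50 MHz
at_16mhz = transient_duration_ns(2, 16)  # 2 cycles at 16 MHz
chain_50mhz = lci_needed_to_filter(at_50mhz)
```

A delay chain of several hundred LCI per filtered signal, compared with the 2 to 22 LCI used in the core tests, illustrates why TMR across IO banks is preferred for this event.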

### D. FPGA Core Full SET Mitigation

In order to explain the error-scattering issue and to eliminate it, for the test design we have tripled each input and output (I/O) where each I/O uses a different IO bank from the 2 other Inputs or Outputs. Each set of tripled input is voted and its voter's output is driven to the target circuit (chain of inverters) while each of the 3 latches' outputs are separated on 3 different IO banks. The final test design is depicted in Fig. 11. In addition, the number of LCI has been increased in each sub-design from 450 to 486 so 100% of the FPGA VersaTiles are used. Therefore, each logic tile in the part is either configured as an inverter (for combinational target circuit or delay chain), a NOR gate for a latch, or a NAND gate for a GG cell.





obtained for LET < 78.5 MeV-cm<sup>2</sup>/mg. This means that wider SET events (> 3 ns and < 4 ns) start appearing at an LET higher than 43 MeV-cm<sup>2</sup>/mg. Such SET events have a very low occurrence rate in the environment, since their underlying cross-section is very low (2 × 10<sup>-9</sup> cm<sup>2</sup>/LCI) compared to the total SET cross-section.

Also, since no errors have been observed on designs with more than 6 LCI-delays, these data clearly show that the previously observed error-scattering was due to SET effects on the enable signal of a used IO bank. Indeed, only TMR (using 3 different IO banks) could mitigate such an event, and the SET filter was simply overwhelmed in the previous experiments (section "B. FPGA Core SET Characterization & Mitigation").

The above test designs were meant to measure the SET pulse width and the SET cross-sections per LCI. However, a real design's implementation combines both combinational and sequential logic. This should be tested to prove the efficacy of the proposed SEE mitigation technique. This will be targeted in the following section.

## V. VALIDATION OF PROPOSED SEE MITIGATION SOLUTION

As shown in the previous sections, the A3P FPGA core if configured as combinational logic is sensitive to SET but using a mitigation technique based on SET filtering could guarantee its immunity to these events. Indeed, based on the pulse width measurements given in the previous section, the new proposed SET mitigation solution requires a 6-LCI delay chain (3 ns) so all SET could be filtered at the inputs of sequential elements for LET < 43 MeV-cm<sup>2</sup>/mg. The block diagram that depicts the final proposed mitigation solution combining TMR and SET filtering is given in Fig. 13.



As mentioned earlier in the Introduction, instead of using an SET filter with a delay element, the combinational logic between two sequential elements (FF, latch or memory) could be duplicated, with the outputs of these 2 combinational logic paths serving as the GG-inputs, as shown in Fig. 14.



Fig. 14: SEE Mitigation Scheme 2

There are some limitations to this second SET mitigation scheme that could be seen only in the dynamic test mode. Indeed, in the static test mode, any SET in one logic path will be filtered by the GG. However in the dynamic test mode and unlike the single-string guard gate solution, any SET during the rising or falling edges of the input signal that would last longer than the sum of the DFF setup time and the clock duty cycle is likely to be registered by the DFF at the clock edge.

Despite this residual sensitivity to SET, the time performance of such an implementation is much improved compared to the single-string solution. The area overhead is, however, twice as high as in the case of the single-string solution. Furthermore, such a mitigation solution is harder to implement because it involves changes to the whole design rather than only to the library cells of a DFF or a latch.

In the following, several mitigation solutions are tested so that the best one can be selected based on its time penalty, area overhead and SEE immunity. Two main test designs are implemented, with and without SET mitigation of the enable signal of the used IO banks, to provide a better understanding of the A3P's radiation performance in both cases. The results of HI in-beam testing of mitigated designs implemented on an A3P1000-PQ208 FPGA core, running at up to 50 MHz, are presented and discussed.

### A. Without Mitigation of the IO Banks

#### 1) Test Designs

The A3P1000-PQ208 was selected for improved design integration and to collect better statistics on the mitigated designs since their SEE cross-sections are expected to be very low. The first design (D1) evaluates the efficacy of several employed mitigation solutions based on SET filtering and TMR implementations. For comparative study, non-mitigated and mitigated test designs have been implemented and tested simultaneously in the same DUT. D1 includes 6 sub-designs as shown in Fig. 15 and detailed in the following:

1. Original Design (D1-1 or SRL): The original design is a shift register of 6 DFF where 10 inverters are always inserted between each 2 DFF. This sub-design is repeated 5 times in the FPGA design (using 5 separate IO channels).
2. TMR'd Design (D1-2): This design is practically the same as the original design (D1-1) except that the DFF are TMR'd. This design is repeated 5 times in the FPGA design.
3. SET Filtered Design as shown in Fig. 13 (D1-3): This sub-design is the mitigated version of the original design using the SET filter with a 6 LCI delay chain. The global input-data is filtered at the input of the first DFF while the clock is tripled. This sub-design is repeated 20 times in the FPGA design to collect as many statistics as possible.
4. SET Filtered Design as shown in Fig. 13 (D1-4): This sub-design is similar to the design D1-3, except that the global clock is also filtered in this case. This sub-design is repeated 4 times in the FPGA design. Because of the delay element inserted in the path of the global clock signal, testing was not possible at 50 MHz.
5. SET Filtered Design (D1-5, as in Fig. 13): This sub-design is similar to D1-4 except that the clock is duplicated and filtered by 3 GG cells as described in Fig. 14. This sub-design is repeated 4 times in the DUT and allows testing at higher frequencies (up to 50 MHz) since no delay is inserted in the clock path.
6. TMR'd Design (D1-6): This sub-design is the full TMR version of the original design (IOs, combinational logic and DFF). This sub-design is repeated 4 times in the FPGA design.
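To make the TMR'd shift-register idea in the list above concrete, the following behavioral sketch models a 6-stage register in which every stage holds three copies of its bit and a majority vote feeds the next stage. It is a simplified illustration under stated assumptions: the 10 inverters between stages are omitted (an even number of inversions leaves the logic value unchanged), timing is ignored, and the class and function names are hypothetical.

```python
def majority(a, b, c):
    """Bitwise 2-out-of-3 majority vote."""
    return (a & b) | (b & c) | (a & c)


class TmrShiftRegister:
    """Shift register whose stages each store three redundant copies."""

    def __init__(self, depth=6):
        self.stages = [[0, 0, 0] for _ in range(depth)]

    def upset(self, stage, domain):
        """Flip one copy of one stage, emulating an SEU in a single domain."""
        self.stages[stage][domain] ^= 1

    def clock(self, data_in):
        """Shift by one position; each stage loads the vote of the previous one."""
        voted = [majority(*copies) for copies in self.stages]
        next_bits = [data_in] + voted[:-1]
        self.stages = [[bit, bit, bit] for bit in next_bits]
        return majority(*self.stages[-1])


def run(pattern, upset_after=None, stage=0, domain=0):
    """Shift `pattern` through the register, optionally injecting one upset."""
    register = TmrShiftRegister()
    outputs = []
    for cycle, bit in enumerate(pattern):
        outputs.append(register.clock(bit))
        if cycle == upset_after:
            register.upset(stage, domain)
    return outputs


pattern = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
clean = run(pattern)
hit = run(pattern, upset_after=3, stage=2, domain=1)
```

In this model a single upset in one domain is voted out at the next clock edge, so `hit` matches `clean`; an event that reaches all three domains at once, such as a transient on a shared IO bank, is not covered by this sketch.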

For all the sub-designs, the output and the input pads even when filtered are always connected directly to the DFF to avoid any additional delays due to the board's propagation delays. This allows beam testing at faster speeds. Note that all the mitigated designs are implemented with no mitigation of SET on the IO banks. For instance, for design D1-6, the tripled inputs (outputs) are always on the same IO bank, in such a way that if the wide "transient" event does occur on a used IO bank, it would propagate to the 3 TMR domains and will be clearly observed on the tripled outputs.

It should also be mentioned that the tiles corresponding to successive inverters are placed as close as possible to each other, as are the TMR'd DFF, so that Multiple Bit Upsets (MBU) will have the highest effect on these designs [12]. However, within the tiles that are configured as inverters, MBU effects have the least impact on the design, since only one inverter is used in each tile and the other CMOS logic in the tile is not adjacent to it; hence there is no impact on the logic tile output. This should not affect the design's overall response, because the effect remains within the tile, affecting only one domain and therefore having no impact on a TMR output. This ensures that the results are valid.

#### 2) Test Results

The beam test results for D1-1, D1-2 and D1-3 running at 2, 16 and 50 MHz are shown in Fig. 16. Note that all SEE cross-sections are calculated for just one SRL (with or without mitigation). As SEE sensitivities were very low at 2 MHz, Weibull curves are not displayed; only data points are shown in this case.

Overall, it is clear that a mitigated design, whether with TMR of the DFF only or based on SET filtering combined with TMR of the DFF, has a much lower SEE cross-section than a non-mitigated design (D1-1). In addition, Fig. 16 shows that increasing the level of mitigation from SEU mitigation only (D1-2) to SET & SEU mitigation (D1-3) reduced the overall SEE cross-section by approximately one and a half orders of magnitude at a given frequency. Note that for this first step of data analysis, the global errors on IO banks were not taken into account, so that the impact of the SEE mitigation effort could be clearly demonstrated. In addition, the results obtained for the 3 designs (D1-1, D1-2 and D1-3) show an increase in the SEE cross-sections with increasing frequency.



Fig. 15: Non-Mitigated and Mitigated Designs with no Mitigation of the IO banks enable signals



Fig. 16: SEE Sensitivities of Non-Mitigated and Mitigated Designs with No Mitigation of the IO banks

At 50 MHz, the results shown in Fig. 17 indicate that all SET mitigation solutions are efficient and lead to approximately the same result. Also, the designs' underlying cross-section ($2 \times 10^{-6}$ cm<sup>2</sup>/SRL) is very comparable to the saturation cross-section of the sum of error-types 2 and 3 observed on an IO bank ($2.4 \times 10^{-6}$ cm<sup>2</sup>/IO-Bank), which was expected and again confirms the consistency of the results.

The data also show that D1-6 (TMR-All) has lower SEE cross-sections at LET < 43 MeV-cm<sup>2</sup>/mg compared to D1-3, D1-4 and D1-5 (employing SET filtering & TMR). This has been observed at all tested frequencies and is shown in Fig. 17. This could be due to a lack of statistics, although the fluences were higher than $10^7$ beam particles at each LET.

More testing will be done though to gain better statistics.

It should be mentioned, though, that a design using exclusively SET filtering rather than TMR of the combinational logic and the IOs, without regard to IO bank mitigation, should show lower SEE sensitivity. Indeed, if error type 2 observed on the IO banks could be filtered, then the design's cross-section should be that of the wide SET event on the enable signal of an IO bank (error type 3). Since D1-4 could not be tested at 50 MHz, this could be observed at 16 MHz. Again, more testing is needed.

It is clear, then, that without mitigation of the enable signal of each IO bank, SEE immunity could not be obtained with any of the employed mitigation solutions (based on TMR or on SET filtering combined with TMR), but the worst-case design cross-section can be calculated at a given frequency from the number of IO banks used. It is also clear that if a 30% reduction in the maximum frequency is acceptable (compared to 15% in the case of TMR-All), the SET filtering solution could be very beneficial in terms of hardware overhead and simplicity of implementation.



Fig. 17: SEE Sensitivities of SEE Mitigated Designs (D1-3, D1-4, D1-5 and D1-6) with No Mitigation of the IO banks

## B. With Mitigation of the IO Banks

The main objective of this second test of SEE mitigation is to achieve complete SEE immunity with either mitigation technique (TMR-All or SET filtering combined with TMR). Two mitigated designs have been implemented to mitigate both the FPGA core and the enable signal of each used IO bank. The two designs are similar to D1-5 (SET filtering combined with TMR) and D1-6 (TMR-All), but this time with mitigation of the global enable signal of each used IO bank. For that purpose, all used IOs have been tripled, and each of the three copies of an input/output (I/O) was placed on a separate IO bank from the two other copies. In addition, all I/Os from the same domain are always placed on the same IO bank, so errors do not propagate from one domain to another.

The first design, called here D2, is similar to design D1-6 and implements a fully TMR'd shift register with combinational logic inserted between each two TMR'd DFF. The second design (D3) implements a similar shift register combined with combinational logic, this time mitigated with SET filtering and TMR (similar to design D1-5). To gain better statistics, both designs have been tested independently using the maximum of the FPGA's resources.

### 1) Test Designs

#### a) Mitigation Based on TMR

The test design of the TMR'd shift registers with combinational logic uses 87.3% of the FPGA core. It is implemented as 12 separate sub-designs. Each sub-design uses 34 TMR'd DFF (102 DFF), with a tripled chain of 16 inverters inserted between each two consecutive TMR'd DFF. In total, each sub-design uses 1584 tiles configured as inverters, 102 configured as DFF and 102 as majority voters. The whole design (12 sub-designs) uses 21456 of the 24576 logic tiles available in the device.
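The tile counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes one tripled inverter chain per gap between consecutive TMR'd DFF (33 gaps for 34 DFF) and tripled voters, which is the accounting that reproduces the quoted figures. The function and constant names are hypothetical.

```python
# Tile budget for the TMR'd shift-register design, using the counts quoted in the text.
TILES_AVAILABLE = 24576      # logic tiles in the device (from the text)
SUB_DESIGNS = 12             # number of sub-designs (from the text)
TMR_DFF_PER_SUB = 34         # TMR'd DFF per sub-design (from the text)
INVERTERS_PER_CHAIN = 16     # inverters between two consecutive TMR'd DFF (from the text)
REDUNDANCY = 3               # TMR: every element is tripled


def tmr_sub_design_tiles(n_tmr_dff: int, chain_len: int) -> dict:
    """Tiles used by one fully TMR'd sub-design, split by function.

    Assumption: one inverter chain per gap between consecutive DFF.
    """
    gaps = n_tmr_dff - 1
    return {
        "inverters": gaps * chain_len * REDUNDANCY,
        "dff": n_tmr_dff * REDUNDANCY,
        "voters": n_tmr_dff * REDUNDANCY,
    }


per_sub = tmr_sub_design_tiles(TMR_DFF_PER_SUB, INVERTERS_PER_CHAIN)
total = SUB_DESIGNS * sum(per_sub.values())
utilization = total / TILES_AVAILABLE

print(per_sub)
print(f"{total} tiles, {utilization:.1%} of the core")
```

Under these assumptions the per-sub-design split, the device total and the utilization all agree with the numbers given in the text.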

#### b) Mitigation Based on SET Filtering & TMR

The second test design implements a shift register in which combinational logic is always inserted, and its output filtered, between each two TMR'd DFF. The SET filter uses a delay chain of 6 LCI. In addition, the clock and data inputs (Din) are tripled to allow mitigation of the IO banks. This design is implemented as 12 separate sub-designs.

Each sub-design uses 52 TMR'd DFF (156 DFF), with 8 inverters and a 3 GG SET filter with a 6 LCI delay inserted between each two TMR'd DFF. In total, each sub-design uses 408 tiles configured as inverters, 520 for the SET filters, 156 configured as DFF and 52 as majority voters (the voters are not tripled). Note that the tiles reserved for the SET filter are at their maximum usage; however, the number of combinational logic cells could be increased with no limitation other than the design's maximum allowed frequency. This means that, unlike with TMR, the hardware overhead does not increase with the complexity or size of the design. The whole design (12 sub-designs) uses 18432 of the 24576 logic tiles available in the device (75% of the FPGA core).
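The claim that the SET-filter overhead stays fixed while the TMR overhead grows with the logic can be illustrated with a simplified per-stage tile count. This is a sketch under stated assumptions, not the authors' accounting: a "stage" is one TMR'd DFF plus the combinational cells feeding it, the filter cost of 10 tiles per stage is derived from the quoted 520 filter tiles for 52 TMR'd DFF, and the overhead is measured against an unmitigated stage (the logic cells plus a single DFF). It does not attempt to model the device-level totals, and all function names are hypothetical.

```python
FILTER_TILES_PER_STAGE = 520 // 52  # 10 tiles, derived from the quoted sub-design counts


def unmitigated_stage(logic_cells: int) -> int:
    """Tiles for the logic plus one plain DFF."""
    return logic_cells + 1


def overhead_tmr_all(logic_cells: int) -> int:
    """Extra tiles when logic, DFF and voter are all tripled."""
    mitigated = 3 * logic_cells + 3 + 3       # tripled logic, 3 DFF, 3 voters
    return mitigated - unmitigated_stage(logic_cells)


def overhead_set_filter(logic_cells: int) -> int:
    """Extra tiles when the logic is kept single and its output is filtered."""
    mitigated = logic_cells + FILTER_TILES_PER_STAGE + 3 + 1  # logic, filter, 3 DFF, 1 voter
    return mitigated - unmitigated_stage(logic_cells)


for cells in (1, 8, 16, 64):
    print(cells, overhead_tmr_all(cells), overhead_set_filter(cells))
```

In this simplified model the filter-based overhead is a constant per stage, whereas the TMR overhead grows linearly with the number of combinational cells, so the filter approach only pays off once a stage contains more than a handful of cells.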

### 2) Test Results

The beam tests gave encouraging results in terms of SEE mitigation. Indeed, as shown in Fig. 18, no errors were observed on the TMR'd design (D2) for LET < 97.8 MeV-cm<sup>2</sup>/mg. D3 demonstrated SEE sensitivity (SET in this case, since all DFF are TMR'd and are therefore SEU immune) at LET > 43 MeV-cm<sup>2</sup>/mg. Indeed, the results presented previously, in section "2. FPGA Core Full SET Mitigation", showed that an SET filter using a delay of 6 LCI can filter SET only for LET < 43 MeV-cm<sup>2</sup>/mg. For higher LET, the SET can last longer than 3 ns and therefore will not be mitigated. This new result validates the previous data and confirms the correctness and consistency of the employed test and mitigation methodologies, since the same conclusion was obtained with two very different DUT designs.

In addition, if the Weibull curve of the SET cross-section per LCI mitigated by an SET filter employing a 6 LCI delay with 3 GG, shown in Fig. 13, is multiplied by the number of TMR'd DFF (52 in this case), the result is the worst-case SEE cross-section of a design employing this type of SEE mitigation, shown in gray in Fig. 18. This worst case should correspond to the highest design cross-section, obtained when running at the maximum allowed frequency. Indeed, if we assume that only one SET in all the combinational logic inserted between two TMR'd DFF can be caught by the receiving TMR'd DFF, the highest cross-section of any design whose sequential elements are SEU immune should be the number of memory elements (DFF, latches, SRAM address locations, etc.) multiplied by the SET cross-section of an LCI. This is simply the worst-case calculation and is independent of the design's frequency.
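The worst-case bound described above is a single multiplication, sketched below. As an illustrative input, the example uses the SET cross-section of about $2 \times 10^{-9}$ cm<sup>2</sup>/LCI quoted elsewhere in this paper for SET wider than 3 ns; treating that single value as the per-LCI cross-section is an assumption made here for the sake of the example, since the bound plotted in Fig. 18 follows the full Weibull curve over LET. The function name is hypothetical.

```python
def worst_case_design_cross_section(n_memory_elements: int,
                                    sigma_set_per_lci_cm2: float) -> float:
    """Upper bound on the design cross-section (cm^2) when all memory elements
    are SEU immune and at most one SET per stage can be captured."""
    return n_memory_elements * sigma_set_per_lci_cm2


N_TMR_DFF = 52               # TMR'd DFF per sub-design (from the text)
SIGMA_SET_PER_LCI = 2e-9     # cm^2/LCI, illustrative value quoted in the paper

sigma_worst = worst_case_design_cross_section(N_TMR_DFF, SIGMA_SET_PER_LCI)
print(f"worst-case cross-section: {sigma_worst:.2e} cm^2")
```

Because the bound depends only on the number of memory elements, it can be evaluated for a new design before any beam test, which is what makes it useful as a prediction tool.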

In reality (in space), and even in beam, the probability that an SET is caught by a DFF during its window of vulnerability (setup and hold time) is less than or equal to 1. Therefore, it should not matter whether there are many combinational logic gates between two TMR'd DFF or just one, since the probability for an SET to induce an SEU always remains less than or equal to 1. This is mainly due to the very short window of vulnerability of a DFF (around 30 picoseconds). Therefore, only one of these SET could induce an SEU, and the others will be just transient.
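To give a feel for how far below 1 this probability usually is, the sketch below uses a common first-order latching-window model, in which an SET is captured if it overlaps the DFF's window of vulnerability within a clock period. The model itself is an assumption introduced here for illustration and is not taken from the beam data; only the 30 ps window, the 3 ns pulse width and the tested frequencies come from the text. The function name is hypothetical.

```python
def capture_probability(set_width_ns: float, clock_mhz: float,
                        window_ns: float = 0.03) -> float:
    """First-order probability that one SET overlaps the capture window.

    Assumed model: (pulse width + window of vulnerability) / clock period,
    capped at 1 for pulses longer than a clock period.
    """
    period_ns = 1000.0 / clock_mhz
    return min(1.0, (set_width_ns + window_ns) / period_ns)


# A 3 ns SET at the three tested clock frequencies.
for f_mhz in (2, 16, 50):
    print(f_mhz, "MHz:", round(capture_probability(3.0, f_mhz), 4))

# A pulse longer than the clock period is always captured in this model.
print(capture_probability(250.0, 50))
```

In this model the capture probability grows with the clock frequency and saturates at 1, which is consistent with using 1 as the frequency-independent worst case.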

This is valid only for non-volatile FPGAs, since SET effects are only transient in this type of circuit. The obtained data confirmed this assumption: all the data points collected during the irradiation of this test design lie under this worst-case curve, which supports the validity of this new theory, at least for this design. The theory could prove very useful in predicting the highest SEE cross-section (the worst case) of any given design, regardless of its complexity, implemented on the A3P FPGAs and employing this SEE mitigation technique. Beam test experiments on more complex designs (processors) are needed to consolidate this theory.

Furthermore, based on the same assumption, and since all SET were filtered for the channels employing an 8 LCI delay chain, a design similar to D3 but using the 8 LCI delay chain should show no SEE sensitivity. Additional testing will be needed to demonstrate that.



Fig. 18: SEE Sensitivities of Full SEE Mitigated Designs

## VI. RECOMMENDATIONS FOR A NEW RADIATION TOLERANT REPROGRAMMABLE NV-FPGA: RTA3P

The previous results highlight the efficacy of the tested SEE mitigation techniques, whether by fully employing TMR or by using SET filtering combined with TMR. Based on these results, five mitigation solutions could be proposed according to the designers' needs and the mission's requirements:

1. If the wide SET IO bank cross-section of $2 \times 10^{-7}$ cm<sup>2</sup>/IO bank can be tolerated, or triplication of the IOs is not acceptable, and the part is intended to operate at LET < 43 MeV-cm<sup>2</sup>/mg, then the mitigated design should be implemented as in Fig. 13 (D1-3). A 6 LCI delay chain should be used, which would limit the design's maximum frequency to 70 MHz.
2. If the wide SET IO bank cross-section of $2 \times 10^{-7}$ cm<sup>2</sup>/IO bank can be tolerated, or triplication of the IOs is not acceptable, and the part is intended to operate at very high LET, then the mitigated design should be implemented as in Fig. 13 (D1-3) but with a delay chain of 4 ns (8 LCI) to fully mitigate any SET, which would limit the maximum frequency to 60 MHz.
3. For full SEE mitigation even at very high LET, the DUT design should be mitigated as in Fig. 13 (D1-3). The delay chain should be 4 ns (8 LCI) to fully mitigate any SET, which would limit the design's maximum frequency to 60 MHz. The IOs must be tripled and separated on 3 different IO banks.
4. For full SEE mitigation at LET < 43 MeV-cm<sup>2</sup>/mg, the mitigated design should be implemented as in Fig. 13 (D1-3). The delay chain should be 3 ns (6 LCI) to fully mitigate any SET, which would limit the maximum frequency to 70 MHz. The IOs must be tripled and separated on 3 different IO banks.
5. If there is a tight restriction on the design's performance and a timing penalty of 30% is not acceptable, then the design should be TMR'd, or a trade-off should be made so that the delay chain can be shortened for an acceptable SEE cross-section.
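The delay-chain choice underlying these solutions can be sketched as a small lookup. The two options and their frequency limits are the ones listed above (6 LCI for 3 ns and 70 MHz, 8 LCI for 4 ns and 60 MHz); the selection rule, that a chain is adequate when its delay is at least the widest SET expected in the target environment, is an assumption used for illustration, and the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class DelayChainOption(NamedTuple):
    n_lci: int            # number of LCI in the delay chain
    delay_ns: float       # filter delay
    max_freq_mhz: int     # resulting limit on the design's maximum frequency


# Options quoted in the recommendations, ordered from shortest to longest delay.
OPTIONS = (
    DelayChainOption(n_lci=6, delay_ns=3.0, max_freq_mhz=70),
    DelayChainOption(n_lci=8, delay_ns=4.0, max_freq_mhz=60),
)


def pick_delay_chain(max_set_width_ns: float) -> Optional[DelayChainOption]:
    """Return the shortest listed chain whose delay covers the widest expected SET,
    or None if no listed chain is long enough."""
    for option in OPTIONS:
        if max_set_width_ns <= option.delay_ns:
            return option
    return None


print(pick_delay_chain(2.5))    # SET shorter than 3 ns
print(pick_delay_chain(3.5))    # SET between 3 ns and 4 ns
print(pick_delay_chain(250.0))  # wide SET on an IO bank enable: not covered by a delay chain
```

The last call returns `None`, which reflects the point made in the list: the wide SET on the IO bank enable signals cannot be handled by a delay chain and requires tripling the IOs across separate banks instead.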

We would recommend solutions 1 or 3, because the HI population with LET higher than 43 MeV-cm<sup>2</sup>/mg is very low in space environments. The design's SEE cross-section in such environments should be extremely low, especially given the very low SET cross-section ($2 \times 10^{-9}$ cm<sup>2</sup>/LCI). For critical missions operating in harsher radiation environments, at LET mostly higher than 43 MeV-cm<sup>2</sup>/mg, an 8 LCI delay would be recommended for SET suppression, although it entails a higher hardware overhead and time penalty.

Optimized versions of the proposed SET mitigation scheme, aimed at reducing area overhead or time penalty, are being studied. For instance, configuring each logic tile in the delay chain as more than a simple inverter, so as to increase its delay, could reduce the size of an SET filter by half.

## VII. CONCLUSION

In this paper, novel SET characterization and mitigation methodologies have been designed, tested and validated on reprogrammable non-volatile FPGAs of the Actel A3P product family. HI in-beam experiments were performed at LBNL and TAMU. For the first time, SET pulse widths were measured and SET cross-sections calculated on an FPGA, independently of the frequency. The results showed that SET pulse widths for LET < 43 MeV-cm<sup>2</sup>/mg are shorter than 3 ns. Beyond this LET, SET pulse widths can be wider than 3 ns but are always shorter than 4 ns, with a very low underlying cross-section ($2 \times 10^{-9}$ cm<sup>2</sup>/LCI). A wide SET event (that could last for 250 ns) has also been observed on the enable signal of each single IO bank. No FPGA reconfiguration was required due to beam irradiation.

In addition, novel SEE mitigation solutions were proposed, based on SET filtering of the combinational logic and TMR of the sequential elements. Their efficacy was validated by demonstrating full SEE immunity in-beam. The proposed test and mitigation methodologies, as well as the presented beam results, will be the foundation for radiation tolerant product developments (RTA3P) and beyond.

#### REFERENCES

- [1] J.J. Wang, N. Charest, G. Kuganesan, C.K. Huang, M. Yip, H.S. Chen, J. Borillo, S. Samiee, F. Dhaoui, J. Sun, S. Rezgui, J. McCollum and B. Cronquist, "Investigating and Modeling Total Ionizing Dose and Heavy Ion effects in Flash-Based Field Programmable Gate Array", *RADECS 2006*, Athens, Greece.
- [2] M. Berg, J.J. Wang, R. Ladbury, S. Buchner, H. Kim, J. Howard, K. Label, A. Phan, T. Irwin and M. Friendlich, "An Analysis of Single Event Upset Dependencies on High Frequency and Architectural Implementations within Actel RTAX-S Family Field Programmable Gate Arrays", *IEEE TNS*, Vol. 53, No. 6, Dec. 2006, pp. 3569-3574.
- [3] A. Balasubramanian, B.L. Bhuva, J.D. Black and L.W. Massengill, "RHBD Techniques for Mitigating Effects of Single-Event Hits Using Guard-Gates", *IEEE TNS*, Vol. 52, No. 6, Dec. 2005, pp. 2531-2535.
- [4] R.L. Shuler, C. Kouba and P.M. O'Neill, "SEU Performance of TAG Based Flip-Flops", *IEEE TNS*, Vol. 52, No. 6, Dec. 2005, pp. 2550-2553.
- [5] M.P. Baze, J. Wert, J.W. Clement, M.G. Hubert, A. Witulski, O.A. Amusan, L. Massengill and D. McMorrow, "Propagating SET Characterization Technique for Digital CMOS Libraries", *IEEE TNS*, Vol. 53, No. 6, Dec. 2006, pp. 3472-3478.
- [6] R.L. Shuler, A. Balasubramanian, B. Narasimham, B.L. Bhuva, P.M. O'Neill and C. Kouba, "The effectiveness of TAG or Guard-Gates in SET Suppression Using Delay and Dual-Rail Configurations at 0.35 um", *IEEE TNS*, Vol. 53, No. 6, Dec. 2006, pp. 3428-3431.
- [7] D. Mavis and P. Eaton, "SEU and SET Modeling and Mitigation in Deep-Submicron Technologies", *IRPS 2007*, Albuquerque, pp. 293-305.
- [8] ProASIC3 Flash Family FPGAs Datasheet. Available: http://www.actel.com/documents/PA3Architecture\_DS.pdf
- [9] S. Rezgui, J.J. Wang, E. Chan Tung, B. Cronquist and J. McCollum, "Comprehensive SEE Characterization of 0.13-µm Flash-Based FPGAs by Heavy-Ion Beam Test", Data Workshop, *RADECS 2007*, France.
- [10] S. Mitra, M. Zhang, N. Seifert, B. Gill, S. Waqas and K.S. Kim, "Combinational Logic Soft Error Correction", *IEEE ITC*, Nov. 2006.
- [11] G. Swift, S. Rezgui, J. George, C. Carmichael, M. Napier, J. Maksimowictz, J. Moore, A. Lesea, R. Koga and T.F. Wrobel, "Dynamic Testing of Xilinx Virtex-II Field Programmable Gate Array (FPGA) Input/Output Blocks (IOBs)", *IEEE TNS*, Vol. 51, No. 6, Dec. 2004, pp. 3469-3479.
- [12] H. Quinn, P. Graham, J. Krone, M. Caffrey and S. Rezgui, "Radiation-Induced Multi-Bit Upsets in SRAM-Based FPGAs", *IEEE TNS*, Vol. 52, No. 6, Dec. 2005, pp. 2455-2461.