
Boosting wireless subsystem performance with FPGA co-processing

Advanced wireless functions such as a Turbo decoder Forward Error Correction (FEC)
block require a degree of flexibility and performance
that makes them well suited to DSP processors coupled with a performance-optimized
“accelerator” engine for timing-critical functions. Designers of these systems face
a variety of implementation options, including hardened blocks available as part of a
market-specific DSP processor and FPGAs configured in either a pre-/post-processor or
co-processor topology to a DSP. Rapidly evolving standards, such as WiMAX or LTE (WCDMA
Long Term Evolution), often favour the flexibility of a reconfigurable accelerator based
on an FPGA.

Once the decision has been made to use an FPGA-based accelerator, system architects
need to choose between a pre-/post-processing and a co-processing topology. This is often
a trade-off between control and performance. Configuring
an FPGA as a co-processor is attractive to traditional DSP developers because it is a more
software-centric design flow that puts the DSP in direct control of the data handoff
process. Configuring the FPGA as a pre-processor is a more hardware-oriented
implementation, but it can significantly improve overall system performance
by streamlining the interface.

For example, consider the latest proposals
for 3G LTE, where the transmission time interval (TTI) has been reduced to 1 ms, down
from 2 ms for HSDPA and 10 ms for WCDMA. This essentially requires that data be processed
from the receiver to the output of the MAC layer in less than 1000 µs. The diagram in
Figure 2 shows that using an SRIO port on the DSP running at 3.125 Gbps, with 1.25 coding
and 200-bit overhead for the Turbo decode function,
results in a DSP-to-FPGA transfer delay of 230 µs. Nearly a quarter of the TTI period is
required just to transfer the data. Taking the other expected delays into account, this
pushes the Turbo codec performance requirement to a very demanding 75.8 Mbps for 50
users.
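
The timing budget above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the 1000 µs TTI and the 230 µs transfer delay quoted in the text, and the function names are hypothetical. The "other expected delays" are not itemized in the article, so they are left out rather than guessed.

```python
# Figures quoted in the article.
TTI_US = 1000.0       # LTE transmission time interval, in microseconds
TRANSFER_US = 230.0   # DSP-to-FPGA transfer delay over SRIO, in microseconds


def transfer_fraction(tti_us, transfer_us):
    """Share of the TTI consumed by moving data between DSP and FPGA."""
    return transfer_us / tti_us


def remaining_budget_us(tti_us, *delays_us):
    """Time left in the TTI after subtracting the listed delays (hypothetical helper)."""
    remaining = tti_us - sum(delays_us)
    if remaining <= 0:
        raise ValueError("delays exceed the TTI")
    return remaining


fraction = transfer_fraction(TTI_US, TRANSFER_US)
budget = remaining_budget_us(TTI_US, TRANSFER_US)
print(f"transfer uses {fraction:.0%} of the TTI, leaving {budget:.0f} us")
```

Any further delays (receiver processing, MAC handling, and so on) would be passed as extra arguments to `remaining_budget_us`, shrinking the window left for the Turbo decoder.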

However, if the DSP latency is removed and the FPGA is used to process the turbo codes
as a largely independent post-processor, the time saved by not having
to transfer the data at high bandwidth between DSP and FPGA lowers the required throughput
of the Turbo decoder to 47 Mbps. This lower throughput requirement allows more
cost-effective devices to be used and reduces system power dissipation.

As well as Turbo decoding, there are many other functions that can benefit from this
design approach. More details on the complete
range of solutions available to wireless systems design engineers can be found at www.xilinx.com/esp/wireless



Contact Information

Xilinx, Inc.

2100 Logic Drive
San Jose, CA, 95124
USA

tele: 408.559.7778
fax: 408.559.7114
more_info@xilinx.com
www.xilinx.com
