Fourth-generation
wireless and mobile systems are currently the focus of
research and development. Broadband wireless systems based on orthogonal
frequency division multiplexing (OFDM) will allow packet-based, high-data-rate
communication suitable for video transmission and mobile internet
applications. Considering this, we propose a data path architecture
using dedicated hardware for the baseband processor. The most computationally
intensive parts of such a high-data-rate system are the 64-point
inverse FFT in the transmit direction and the Viterbi decoder in
the receive direction. Accordingly, an appropriate design methodology
for constructing them has to be chosen, considering a) how much silicon area
is needed, b) how easily the particular architecture can be made
flat for implementation in VLSI, c) in the actual implementation, how
many wire crossings and how many long wires carrying signals to
remote parts of the design are necessary, and d) how small the power
consumption can be. This paper describes a novel 64-point FFT/IFFT
processor which has been developed as part of a large research project
to develop a single-chip wireless modem.
2. SYSTEM SPECIFICATIONS AND PROBLEM IDENTIFICATION
The complete structure of a modem specified by the IEEE 802.11a
standard is shown in fig. 1. It consists of a data scrambler, modulator,
convolutional encoder, interleaver, 64-point IFFT/FFT, demodulator,
deinterleaver, Viterbi decoder, and descrambler. The standard specifies
a data rate ranging from 6 to 54 Mb/s. Depending on the desired rate,
the modulation scheme adopted can be binary phase shift keying (BPSK),
quaternary phase shift keying (QPSK), or quadrature amplitude modulation (QAM)
with 1-6 bits/subcarrier.
The encoding rates supported in the standard are 1/2, 2/3, and 3/4.
The bandwidth of the transmitted signal is 20 MHz and the OFDM symbol
duration is 4 µs, including 0.8 µs for a guard interval [1]. Thus,
in effect, the FFT/IFFT has to be computed within 4 µs.
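
As a rough cross-check of the 6 to 54 Mb/s range quoted above, the data rate can be estimated from the bits per subcarrier, the coding rate, and the 4 µs symbol duration. The sketch below assumes 48 data-carrying subcarriers per OFDM symbol, a figure taken from the IEEE 802.11a standard and not stated in this text. The function name `data_rate_mbps` is hypothetical.

```python
from fractions import Fraction

# Assumption (from IEEE 802.11a, not stated in the text above):
# 48 of the 64 subcarriers carry data.
DATA_SUBCARRIERS = 48
SYMBOL_DURATION_US = 4  # OFDM symbol duration, including the 0.8 us guard interval


def data_rate_mbps(bits_per_subcarrier, coding_rate):
    """Net data rate in Mb/s (bits per microsecond) for one mode."""
    coded_bits_per_symbol = DATA_SUBCARRIERS * bits_per_subcarrier
    return coded_bits_per_symbol * Fraction(coding_rate) / SYMBOL_DURATION_US


lowest = data_rate_mbps(1, Fraction(1, 2))   # 1 bit/subcarrier (BPSK), rate 1/2
highest = data_rate_mbps(6, Fraction(3, 4))  # 6 bits/subcarrier (QAM), rate 3/4
print(lowest, highest)
```

Under that assumption, the slowest and fastest combinations reproduce the two ends of the quoted range.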
In general,
FFT implementations typically fall into one of two categories:
1) methods based on the direct Fourier transform and 2) methods based
on direct hardware implementations of established FFT signal flow
graphs [8] [14]. A problem with these solutions is that the approach
adopted on the algorithmic level typically takes little account
of its implications at the architecture, data flow, or chip design
levels. Thus, many of these designs [8]-[10] may be irregular, dominated
by wiring, and may have heavy overheads in terms of data storage
[15]. In a complex system, deployment of such strategies may result
in severe disadvantages, because of the tight timing constraints
and implicit requirement of low power consumption.
The conventional
Cooley-Tukey radix-2 FFT algorithm requires 192 complex butterfly
operations for a 64-point FFT computation. Considering that one
FFT has to be computed within 4 µs, one butterfly operation has
to be completed within 20.8 ns, which results in a 48 MHz clock frequency
for a single butterfly architecture. The synthesis result for a
radix-2 butterfly unit (one complex multiplication and two complex
additions) in IHP 0.25-µm technology shows that it occupies 0.18 mm²
area and dissipates 17 mW of power at that frequency. On top of this
butterfly unit, one needs memory to store the complex twiddle factors
and complex intermediate data, serial-to-parallel and parallel-to-serial
converters at the inputs and outputs, respectively, and complicated
addressing logic and control circuitry. With all these circuit
modules combined, the power dissipation of the entire
processor is expected to be quite high. Moreover, the input data arrives at
20 MHz clock frequency, and thus, it is more appropriate to operate
the FFT module at that frequency. In order to satisfy the time constraint
at this frequency, one has to employ multiple butterfly units in
parallel, which in turn increases the area and power dissipation.
Alternatively, since most implementations of the IEEE 802.11a
standard oversample the incoming data at 40 or 80 MHz, a single
radix-2 butterfly based FFT module would have to be operated at an 80 MHz
clock frequency (the next available frequency above the 48 MHz actually
required). This approach satisfies the timing constraint, but at
the cost of high power consumption.
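
The timing budget in the paragraph above can be reproduced with a short sketch. It uses a textbook recursive radix-2 decimation-in-time FFT, not the processor described in this paper, counts the butterflies it performs, and derives the per-butterfly time and clock rate from the 4 µs symbol duration. The estimate of parallel units at 20 MHz assumes one butterfly per unit per clock cycle with no memory or pipeline overhead. That assumption and the helper name `fft_radix2` are illustrative.

```python
import cmath
import math


def fft_radix2(x, counter):
    """Recursive radix-2 decimation-in-time FFT; counter[0] tallies butterflies."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2], counter)
    odd = fft_radix2(x[1::2], counter)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # one complex multiplication
        out[k] = even[k] + t                            # two complex additions
        out[k + n // 2] = even[k] - t
        counter[0] += 1
    return out


N = 64
counter = [0]
x = [complex(i % 7, -(i % 5)) for i in range(N)]  # arbitrary test input
X = fft_radix2(x, counter)

# Sanity check against the direct DFT definition.
direct = [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
          for k in range(N)]
max_err = max(abs(a - b) for a, b in zip(X, direct))

butterflies = counter[0]                         # (N/2) * log2(N)
SYMBOL_NS = 4000                                 # 4 us OFDM symbol
time_per_butterfly_ns = SYMBOL_NS / butterflies  # budget for one butterfly
clock_mhz = butterflies / (SYMBOL_NS / 1000)     # single-butterfly clock rate

# Assumption: one butterfly per unit per cycle, no overhead cycles.
cycles_at_20mhz = 20 * SYMBOL_NS // 1000         # clock cycles per symbol at 20 MHz
units_at_20mhz = math.ceil(butterflies / cycles_at_20mhz)
cycles_at_80mhz = 80 * SYMBOL_NS // 1000         # clock cycles per symbol at 80 MHz

print(butterflies, round(time_per_butterfly_ns, 1), clock_mhz)
print(units_at_20mhz, cycles_at_80mhz >= butterflies)
```

Under these idealized assumptions, a 20 MHz clock leaves 80 cycles per symbol, so more than one butterfly unit is needed, whereas 80 MHz leaves enough cycles for a single unit.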
In order to speed up the FFT computation, more advanced solutions
have been proposed that increase the radix [15] [16]. These
approaches increase the arithmetic complexity within the
butterfly itself. The radix-4 FFT algorithm is the most popular and
has the potential to satisfy the current need. However, a single
radix-4 butterfly requires three complex multiplications and eight
complex additions. Thus, in order to carry out one radix-4 butterfly
operation per clock cycle, one needs to complete 12 real multiplications
and 16 real additions at each cycle. Since multipliers are typically
very power-hungry elements in a VLSI design, this type of arrangement
results in significant power consumption.
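
The operation count quoted above can be illustrated with a minimal sketch of a textbook radix-4 decimation-in-time butterfly; it is not the architecture proposed in this paper, and the function names are hypothetical. The three twiddle-factor products are the only general complex multiplications. Multiplication by ±j is a swap of real and imaginary parts with a sign change and is treated as free. The real-addition count below covers only the eight complex additions, matching the figure in the text; additions inside the complex multiplications are not counted.

```python
import cmath


def mul_neg_j(z):
    """Multiply by -j without a multiplier: (x + jy) * (-j) = y - jx."""
    return complex(z.imag, -z.real)


def mul_pos_j(z):
    """Multiply by +j without a multiplier: (x + jy) * j = -y + jx."""
    return complex(-z.imag, z.real)


def radix4_butterfly(a, b, c, d, w1, w2, w3):
    """One radix-4 DIT butterfly: 3 complex multiplications, 8 complex additions."""
    b, c, d = w1 * b, w2 * c, w3 * d   # three twiddle multiplications
    t0 = a + c                         # first rank: four complex additions
    t1 = a - c
    t2 = b + d
    t3 = b - d
    return (t0 + t2,                   # second rank: four complex additions
            t1 + mul_neg_j(t3),
            t0 - t2,
            t1 + mul_pos_j(t3))


# With unity twiddle factors the butterfly is a plain 4-point DFT.
x = [1 + 2j, -3 + 0.5j, 0.25 - 1j, 2 + 2j]
y = radix4_butterfly(*x, 1, 1, 1)
direct = [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / 4) for n in range(4))
          for k in range(4)]
max_err = max(abs(p - q) for p, q in zip(y, direct))

COMPLEX_MULTS = 3
COMPLEX_ADDS = 8
real_mults = COMPLEX_MULTS * 4   # four real multiplications per complex product
real_adds = COMPLEX_ADDS * 2     # two real additions per complex addition

# A 64-point radix-4 FFT has (N/4) * log4(N) butterflies.
radix4_butterflies_64 = (64 // 4) * 3
print(real_mults, real_adds, radix4_butterflies_64)
```

The trade-off is visible in the counts: far fewer butterflies per transform than radix-2, but each one needs many multipliers active at once if it is to finish in a single clock cycle.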