Design and VHDL Implementation of a Multiple Antenna DS/SS Code Acquisition Module without Doppler Frequency Search Targeted for GPS purposes
by Enric M. Calvo
To my dearest Gemmeta: the greatest thing I have done in my life is also yours.
Acknowledgements
It is a great fortune in life to reach a moment like this one, after many years of effort and sacrifice, of sorrows and glories. It is also a great fortune to be able to share this moment with the people one loves most. But it is an even greater fortune to have been able to share everything it took to get here with these people, even though some of them, for one reason or another, cannot be here. Words fall short to thank, and to express with the joy I feel, everything that so many people have done and have given up for me, starting with my closest family and reaching my best friends, who so many times believed in me much more than I did myself. Thanks to my three dear women for taking care, so very many times, of everything I could not, and for worrying so much about me when things were difficult and being so happy for me when things turned out better than I imagined. Thanks to all of you, Xavi, Mireia, Raúl, Emili, Ferran, for having helped me so much, so much, SO MUCH in the most difficult moments of life. You have shown me in incredible ways how much you love me. To everyone at Pompeia, Ana, Maria, Berta, Ari and Ton, for having always been so patient, so attentive and so cheerful. Keep it up!
To Kitus, for continuing to feed my dreamy, utopian engineering streak; to Miquel, for having shared so many hopes for knowledge and so many sleepless nights (either watching stars or finishing lab assignments); to LluisMa, for continuing to inspire me with the beauty of communications; to Cristian, for all the joy you have always given us all; to Arnau, for all the metacommunicational-philosophical discussions and dreams we have had; to Ester and Raquel, for being among the loveliest girls I have met; to Jordi, for having cared so much about me; to David, Sergi and Lluís, for having had the courage to start XipNova and to radiate passion for engineering; to all of BJT (Davidm, Néstor, Marisa, Sonia, Joel, etc.), and especially to Paco and Helí for having helped me so many times. One of the most special thanks that I want to give with all my heart is for Gregori. With this project you have given me not just one great opportunity but infinitely many, which I will never be able to repay. Thank you for the ability you have always had to captivate and enthuse students and to point out, in few words, concepts that are beautiful, that excite us and that cry out to be studied. And thank you also for everything not so technical in which you have helped me, for which I know I will never be able to thank you enough. Xavi (Villares) has also been one of the people who has helped me most with the project. I must admit that it is a pleasure to work with people like you because, besides being an excellent engineer (almost a Doctor), you are a kind, attentive and, above all, good person. Good luck with the doctorate! But the most special mention I want to make, for all the burden she has had to share with me these last years, is for my favourite little flower, Daisy. Thank you with all my heart for having made such a great effort, for having had so much patience with me and, above all, for having always been by my side. A good part of what I am and will be I will always owe to you.
Contents
Acknowledgements
Preface
1 GPS Fundamentals and Signal Structure
  1.1 GPS fundamentals
    1.1.1 Range and Pseudorange
    1.1.2 Space Vehicles
  1.2 Spread Spectrum Fundamentals
    1.2.1 Pseudorandom sequences
  1.3 GPS Signal Structure and Frequency Plan
I DS/SS Synchronization
2 Quantization Effects in DS/SS
  2.1 Quantization Loss
    2.1.1 1 and 2 bit Quantizers
      2.1.1.1 Performance of 1-bit and 2-bits ADCs
3 DS/SS Acquisition
  3.1 Conventional Techniques: 2D Serial Search
    3.1.1 Sliding Correlator
    3.1.2 Performance of Sliding Correlator
  3.2 Avoiding Doppler Search
    3.2.1 Doppler Effects
      3.2.1.1 Carrier Doppler Frequency Shift
      3.2.1.2 Chiprate Doppler
      3.2.1.3 Relationship between Carrier Frequency Doppler and Chiprate Doppler
  3.3 Description of a DS/SS Communication System
    3.3.1 Timing Recovery
      3.3.1.1 Basic Concepts
      3.3.1.2 Analysis of Interpolation/Decimation for Timing Recovery
    3.3.2 Signal Model
      3.3.2.1 Chiprate Doppler approximation
      3.3.2.2 Input SNR and C/N0 definition
      3.3.2.3 A/D conversion
    3.3.3 Chip Detection
      3.3.3.1 Chip Matched Filter
      3.3.3.2 Decimation by Nsc
    3.3.4 Symbol Detection
      3.3.4.1 Despreading
      3.3.4.2 Integrate & Dump by Ncs
  3.4 Description of the Code Acquisition System
    3.4.1 Signal Model
    3.4.2 Chip Detection
    3.4.3 Coherent Detection – Correlator
    3.4.4 Non-Coherent Detection – Squarer and Non-Coherent I&D
  3.5 Simplifications and other requirements for GPS
    3.5.1 GPS Single Point Solution: Number of channels
  3.6 Cells Test Statistic
    3.6.1 Sequential Test (SEQ)
    3.6.2 Validation Dwell (VAL)
    3.6.3 Sidelobe-Check Procedure (SLC)
    3.6.4 Correlators Parallelization for Faster Acquisition
      3.6.4.1 Maximum-Finding Procedure (MAXF)
      3.6.4.2 Careful Parallelization
  3.7 Spatial Diversity in Synchronization
    3.7.1 Code Acquisition Enhancement: Nant Antennas
II VHDL Implementation
4 Code Acquisition Hardware Implementation
  4.1 Specifications and Total Parameterization
    4.1.1 Acquisition Related Parameters
  4.2 Single Correlator
    4.2.1 Gold Code Generator
    4.2.2 Coherent Correlators: Coherent Detection
      4.2.2.1 PRN Code Multiplication
      4.2.2.2 Integrate & Dump: Coherent Integrator
    4.2.3 Power Detection – Non-Coherent Detection
      4.2.3.1 Squarer – Non-Coherent Chip Detection
      4.2.3.2 Non-Coherent Integrator Implementation
    4.2.4 Correlator Control blocks
      4.2.4.1 Dwell Control FSM
      4.2.4.2 Set Next Hypothesis block (setnexthyp)
      4.2.4.3 Coherent I&D Control FSM (fsm coh ctrl)
  4.3 Parallelization among Ncorr correlators
    4.3.1 Overview of the Parallelization
    4.3.2 Correlator changes
    4.3.3 Dwell Control FSM changes
    4.3.4 Set Next Hypothesis (setnexthyp) changes
  4.4 Enhancement: Antenna Arrays
  4.5 Acquisition Module Simulations
    4.5.1 Model Simulation and construction
5 Conclusions & Future Work
A Impact of Frequency Doppler on the Correlator
  A.1 Without Data Modulation
  A.2 With Data Modulation
B Acronyms
Bibliography
List of Figures
1.1 An LFSR generates PRN sequences
1.2 GPS spectrum allocation
1.3 GPS Modulator Diagram
2.1 2-bit and 1-bit A/D Converter transfer function
2.2 1-bitMAG ADC transfer function
2.3 Theoretical ADC conversion losses
3.1 PN autocorrelation function (squared pulse)
3.2 Sliding Correlator
3.3 Acquisition Uncertainty Region or Hypothesis Space (HS)
3.4 Timing recovery from analog to digital approaches
3.5 Generating shifted signals analogically/digitally
3.6 Equivalent DS/SS Receivers in tracking mode
3.7 DS/SS Acquisition Module Block Diagram
3.8 ∆SNR(dB) vs. Ncoh in the correlator
3.9 Definition of PFA and PM
3.10 FSM for the cell test control for 1 correlator
3.11 Sequential test of hypothesis
3.12 PM and PFA in the VAL dwell
3.13 Code Sidelobes vs. mτ due to partial correlation
3.14 Spatial Diversity in Acquisition
4.1 Two-tap selection C/A-code generator implementation
4.2 Two-tap selection C/A-code generator implementation
4.3 Simplification of DS/SS Acquisition Module
4.4 Integration part of the Coherent Integrate & Dump
4.5 Implementation of the squarer
4.6 Non-Coherent Integrator Implementation
4.7 Dwell Control FSM
4.8 Sidelobe Check Procedure (SLC)
4.9 Coherent Control FSM
4.10 Sample Bit-Level Parallelized Correlator
4.11 Simulation in C of the mean time acquisition. Both are for worst case Doppler of fd = 6000Hz
4.12 The whole acquisition process
4.13 After Validation Process Succeeds
4.14 Zoom just after the MAXF procedure
4.15 Zoom to check that the correct epoch is delivered
A.1 Normalized plot of ∆SNR (linear)
A.2 Probability Mass Function of the r.v. N1
List of Tables
1.1 GPS signal specifications
4.1 GPS related parameters, some imposed and some chosen by design
4.2 Parameters chosen in the hardware design
4.3 GPS C/A Code Generator parameters for various SVs
4.4 XOR truth table
4.5 ROM translation from ADC SIGN/MAG to 2C's
Preface
The final aim of this project is to provide the hardware implementation of a Direct Sequence Spread Spectrum (DS/SS) code acquisition unit suited for a GPS receiver (RX). Code acquisition is the first of the two tasks that a DS/SS receiver must perform in order to synchronize with the input signal. Its mission is to find the correct alignment between the spreading sequence locally generated at the receiver and the sequence that spread the data in the transmitter (TX). For several reasons (relative movement, oscillator inaccuracies in the TX and the RX, etc.), this alignment drifts slightly over time, which makes the first task (code acquisition) insufficient on its own. The second task of a DS/SS synchronization unit is, once a coarse alignment has been established between the two signals, to track it constantly. This means that every RX must include a tracking unit, basically composed of a code-tracking subsystem (dubbed DLL) plus a carrier frequency-tracking subsystem (also called PLL). Building a complete GPS receiver is out of the scope of this project. In such a receiver, apart from the synchronization unit, a considerable amount of data must be processed before the global position of the user can be given. Therefore, only the acquisition module is of interest here, and the information about GPS given in the sequel is mostly limited to what is needed to develop and understand this project. For further information on the whole GPS system the reader is referred to [4]. As usual, this work starts where other works ended. In particular, the implementation of the communication system described herein takes most of its algorithms from previous works [1], [2] related to the Automated Transfer Vehicle Spread Spectrum Demodulator designed at the Universitat Politècnica de Catalunya (UPC). The few changes with respect to those algorithms will be noted and justified.
One of the main efforts made in this project has been to keep the assumptions as general as possible, with the intention of obtaining a final implementation that is useful not only for a GPS RX but also for any other DS/SS system, simply by tuning some design parameters. In this sense, considerable effort went into making as many parameters as possible configurable without having to modify any internals of the design. The hardware implementation is given in the form of VHDL code so that it can be mapped onto several targets after recompilation, for instance any FPGA family or maybe even an ASIC.
Chapter 1
GPS Fundamentals and Signal Structure
Mankind has been eager to discover and explore new worlds for ages. We have long needed navigation and spatial orientation, relying at first on the Sun and the Moon and later on the stars. Nowadays, with the fast evolution of technology, it is a network of satellites around the Earth that plays the role of the ancient stars and allows us to find our position and kinematics on the Earth by simply broadcasting time-stamped signals.
1.1 GPS fundamentals
1.1.1 Range and Pseudorange
The principle of GPS positioning is based on the concept of time of arrival (TOA) ranging to determine user location. This concept entails measuring the propagation time of signals broadcast simultaneously from satellites at known locations. Ideally, the satellite-to-user range is given by $R = c\,(\mathrm{TOA} - \mathrm{TOT})$, where $c$ is the speed of light and TOT is the time of transmission stamped on the signal. This TOT is a coarse time stamp included as part of the binary data transmitted in the GPS signal, and a fine adjustment is achieved by tracking the PRN code shift of the received signal. In theory, a receiver can estimate its position using three TOA measurements if the satellite and receiver clocks are synchronized. In practice, however, receivers typically use inexpensive, inaccurate quartz oscillators as local clocks, which are set only approximately to GPS time. The inaccurate receiver clock introduces a timing offset, or clock bias, with respect to true GPS time, and thus the difference between the observed TOA and TOT is a function of both the satellite-to-receiver range and the receiver clock offset:
$$ PR = c\,(\mathrm{TOA} - \mathrm{TOT}) = R + c\,b \tag{1.1} $$
where $R$ is the satellite-to-user range, $b$ is the receiver clock bias, and $PR$ is the resulting range-like observable known as the pseudorange.
The receiver can overcome this clock bias problem by adding one more equation, the distance to a fourth satellite, so as to obtain a system of four equations in four unknowns. Thus, in addition to the 3D coordinates of its spatial position, a user also needs to estimate the receiver clock offset, so at least 4 satellites need to be observed.
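The four-unknown system can be illustrated with a short numerical sketch. The Python below runs a plain Newton iteration on equation (1.1) written once per satellite. The satellite coordinates, the receiver position and the 1 ms clock bias are made-up values chosen only for illustration, and the helper names (`solve_linear`, `solve_position`) are hypothetical; nothing here is taken from the receiver described in this work.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def solve_position(sats, pseudoranges, iterations=10):
    """Newton iteration on PR_i = |sat_i - p| + c*b for p = (x, y, z) and b."""
    x = y = z = cb = 0.0  # start at the Earth's centre; cb is c*b in metres
    for _ in range(iterations):
        jac, res = [], []
        for (sx, sy, sz), pr in zip(sats, pseudoranges):
            rng = math.sqrt((sx - x) ** 2 + (sy - y) ** 2 + (sz - z) ** 2)
            res.append(pr - (rng + cb))                # measured minus predicted
            jac.append([(x - sx) / rng, (y - sy) / rng, (z - sz) / rng, 1.0])
        dx, dy, dz, dcb = solve_linear(jac, res)
        x, y, z, cb = x + dx, y + dy, z + dz, cb + dcb
    return (x, y, z), cb / C

# Hypothetical geometry (metres): four satellites and a user on the x axis.
SATS = [(26560e3, 0.0, 0.0), (15000e3, 20000e3, 8000e3),
        (15000e3, -20000e3, 8000e3), (15000e3, 0.0, -21000e3)]
TRUE_POS = (6371e3, 0.0, 0.0)
TRUE_BIAS = 1e-3  # receiver clock bias b in seconds (assumed)

pseudoranges = [math.dist(s, TRUE_POS) + C * TRUE_BIAS for s in SATS]
est_pos, est_bias = solve_position(SATS, pseudoranges)
```

With only three pseudoranges the Jacobian would have three rows for four unknowns, which is the reason a fourth satellite is needed.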
1.1.2 Space Vehicles
The GPS satellites, or space vehicles (SVs), are placed in medium Earth orbits and move at a speed of about 4 km/s, completing two orbits per sidereal day so that each satellite repeats almost the same ground track once a day. Although each SV carries a pair of ultra-stable atomic clocks on board, the satellite clocks are kept in synchronism by the monitoring control segment, which uploads the updated parameters for each SV clock together with the navigation message broadcast by each satellite.
1.2 Spread Spectrum Fundamentals
For the Global Positioning System, DS/SS was the modulation chosen for satellite signaling. Spread spectrum techniques have been widely used for military purposes because they are robust against strong interference, and because the signal can hardly be detected or recovered by unauthorized receivers that have no a priori knowledge of the despreading sequence. GPS signal modulation falls into the broad category of signals known as spread spectrum modulation, which uses a transmission bandwidth many times greater than the strict information bandwidth. At the receiver, the signal is despread before it is demodulated. The spectrum can be spread using several techniques, the two most popular being:
1. changing the carrier frequency in a very fast pseudorandom manner, known as Frequency Hopping Spread Spectrum (FH/SS);
2. multiplying the low bitrate information sequence by a high bitrate pseudorandom noise-like (PRN) sequence, known as Direct Sequence Spread Spectrum (DS/SS).
In the design of the GPS system the latter approach was chosen as the spreading method, and it will therefore be described here to a certain extent.¹ Spreading and despreading are symmetrical operations: both multiply the input signal by a PRN signal. That means that the spread signal can be despread at the RX only if the spreading sequence is known a priori. At the same time, every signal (interference) added on the way from the TX to the RX is multiplied by the PRN sequence only once, at the receiver, and is therefore spread by it. Thus, after despreading, a desired signal that was buried in noise becomes a strong narrowband signal, whereas strong interferers are spread over a much wider bandwidth and become weaker within the information bandwidth.
¹ For readers interested in the former method, see [5]. For a deep study of spread spectrum, see [3].
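The symmetry between spreading and despreading can be sketched in a few lines of Python. This is only a baseband toy model under stated assumptions: the 31-chip spreading sequence, the four data bits and the interferer amplitude of 5 are hypothetical values, and `m_sequence` is an illustrative helper rather than anything defined in this work.

```python
def m_sequence(taps, r):
    """One period of the LFSR sequence a_n = XOR of a_{n-i} for i in taps."""
    state = [1] * r                      # state[i-1] holds a_{n-i}
    out = []
    for _ in range(2 ** r - 1):
        new = 0
        for i in taps:
            new ^= state[i - 1]
        out.append(new)
        state = [new] + state[:-1]
    return out

# Map bits {0, 1} to chips {+1, -1}; 31 chips per data bit (assumed).
chips = [1 - 2 * bit for bit in m_sequence((2, 5), 5)]
n_chips = len(chips)

data = [+1, -1, -1, +1]                              # low-rate information bits
tx = [d * c for d in data for c in chips]            # spreading
interferer = [5.0] * len(tx)                         # strong constant interferer
rx = [t + i for t, i in zip(tx, interferer)]

# Despreading multiplies by the same sequence; integrate over each bit.
despread = [r * chips[k % n_chips] for k, r in enumerate(rx)]
sums = [sum(despread[b * n_chips:(b + 1) * n_chips]) for b in range(len(data))]
decisions = [1 if s > 0 else -1 for s in sums]
```

In this sketch the interferer is five times larger than each transmitted chip, yet after despreading and integration it contributes only 5 in magnitude to each sum, against 31 from the desired signal.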
The ratio between the transmission and information bandwidths in a DS/SS system is called the spreading factor or processing gain, and the bits of the spreading sequence are called chips. Spread spectrum modulation has several properties, all of them improving at higher processing gain, that make it ideal for navigation applications:
• position location and velocity estimation.
• high tolerance to unintentional or intentional interference.
• low detectability of the transmitted signal by an unintended receiver.
• multiple access communications by a large population of users transmitting in the same bandwidth.
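As a rough worked example of the processing gain, the Python below takes the ratio of chip rate to data rate as a proxy for the bandwidth ratio. It uses the GPS figures quoted in Section 1.3 of this document (1.023 Mchips/s and 10.23 Mchips/s codes, 50 bps navigation data); the function name is hypothetical.

```python
import math

def processing_gain_db(chip_rate_hz, bit_rate_bps):
    """Spreading factor in dB, taking chip rate / bit rate as the bandwidth ratio."""
    return 10.0 * math.log10(chip_rate_hz / bit_rate_bps)

ca_gain_db = processing_gain_db(1.023e6, 50.0)   # C/A code, about 43 dB
p_gain_db = processing_gain_db(10.23e6, 50.0)    # P code, 10 dB more
```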
1.2.1 Pseudorandom sequences
We have so far described the potential advantages of spread spectrum modulation. Now we will describe what makes all of this possible. As noted before, one of the key points in DS/SS is the spreading sequence, which should be a pseudorandom noise-like (PRN) sequence. A PRN or PN sequence is a signal whose statistical properties resemble those of noise but that is in fact deterministic. These sequences can be generated easily if the characteristic polynomial is known, but they appear as random noise to those users who do not have such information. In the vast majority of DS/SS communication systems binary PN sequences are used, since they are the easiest to implement and process, and they behave almost optimally. The randomness properties that the ultimately deterministic sequences should achieve are:
1. The relative frequencies of "0" and "1" are each $1/2$.
2. The number of runs of zeros or ones of length $n$ is a fraction $1/2^n$ of the total.
3. If the random sequence is shifted by any number of positions, the resulting sequence has an equal number of agreements and disagreements with the original sequence.
If these properties are met, the behaviour of a PRN sequence is equivalent to that of a binary independent random sequence, also known as a Bernoulli sequence. The convenience of this fact is that PRN sequences can be generated by a simple linear operation specified by a finite number of binary parameters. A PRN sequence can be generated by a Linear Feedback Shift Register (LFSR) as in Figure 1.1. At each clock tick the register shifts all its contents to the right. The sequence $a_n$ propagates through the register, with each term generated linearly from the preceding $r$ terms according to the formula
$$ a_n = \sum_{i=1}^{r} c_i\, a_{n-i} \tag{1.2} $$
Figure 1.1: An LFSR generates PRN sequences. $c_i$ are binary connection coefficients: $c_i = 0$ means no connection while $c_i = 1$ means connection. Adders (mod 2) are implemented as XOR gates.
where $c_i$ are connection coefficients (1 for connection, 0 for no connection), and addition is done modulo 2. We define the generating function of the LFSR as
$$ G(z) = a_0 + a_1 z^{-1} + a_2 z^{-2} + \cdots = \sum_{n=0}^{\infty} a_n z^{-n} \tag{1.3} $$
where $z^{-1}$ represents a delay of $T_c$ seconds. Combining (1.2) and (1.3) we may express the latter as a ratio of finite polynomials
$$ G(z) = \frac{g_0(z)}{f(z)} \tag{1.4} $$
where $f(z)$ is the characteristic polynomial of the LFSR and $g_0(z)$ depends on the initial condition vector.
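The recurrence (1.2) can be sketched directly in Python. The example below assumes a small register with $r = 5$ and connections $c_2 = c_5 = 1$ (a choice known to give a maximal length sequence of period 31); the coefficients, the seed and the function names are illustrative and not taken from this work. It also checks the randomness properties listed above, which a maximal length sequence meets only approximately: one more "1" than "0" per period, and one more disagreement than agreement for any non-zero shift.

```python
def lfsr_sequence(c, seed):
    """One period of a_n = sum_i c_i * a_{n-i} (mod 2), as in equation (1.2).

    c[i-1] is the connection coefficient c_i; seed is [a_{-1}, ..., a_{-r}].
    """
    r = len(c)
    state = list(seed)                   # state[i-1] holds a_{n-i}
    out = []
    for _ in range(2 ** r - 1):
        a_n = 0
        for c_i, a_prev in zip(c, state):
            a_n ^= c_i & a_prev          # modulo-2 sum of the connected stages
        out.append(a_n)
        state = [a_n] + state[:-1]       # shift: a_n becomes a_{n-1}
    return out

def agreements_minus_disagreements(seq, shift):
    """Compare the sequence with a cyclically shifted copy of itself."""
    n = len(seq)
    return sum(1 if seq[k] == seq[(k + shift) % n] else -1 for k in range(n))

C_COEFFS = [0, 1, 0, 0, 1]               # assumed: a_n = a_{n-2} XOR a_{n-5}
seq = lfsr_sequence(C_COEFFS, seed=[1, 0, 0, 0, 0])
ones = sum(seq)
shifted_corr = {agreements_minus_disagreements(seq, s) for s in range(1, len(seq))}
```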
Gold Codes Properties
The Gold codes selected for the C/A signal are a family of codes formed as the product (or modulo-2 sum) of two different, properly paired maximal length LFSR (MLFSR) sequences, both of the same period P = 1023. This widely used family of codes was also selected for the Code Division Multiple Access (CDMA) scheme of the GPS constellation thanks to its special correlation properties:
• Low crosscorrelation between two different sequences of the same family.
• Capability to generate many codes with similar properties from two LFSRs.
A whole family of Gold codes can be generated simply by time-shifting one of the two basic LFSR sequences that generate it. This can be achieved very easily thanks to the Cycle and Add property of MLFSR sequences.
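A minimal sketch of how such a family arises is given below. The feedback taps of the two registers (stages 3 and 10 for the first, stages 2, 3, 6, 8, 9 and 10 for the second) and the delays of 5 and 6 chips are assumptions taken from the public GPS interface specification, where they correspond to PRN 1 and PRN 2; they are not stated in this section. The function names are hypothetical.

```python
def lfsr_output(feedback_taps, r=10):
    """One period of the last-stage output of an LFSR seeded with all ones."""
    state = [1] * r
    out = []
    for _ in range(2 ** r - 1):
        out.append(state[-1])
        fb = 0
        for t in feedback_taps:
            fb ^= state[t - 1]
        state = [fb] + state[:-1]
    return out

g1 = lfsr_output((3, 10))                  # assumed G1 feedback taps
g2 = lfsr_output((2, 3, 6, 8, 9, 10))      # assumed G2 feedback taps

def gold(delay):
    """Gold code: G1 plus (mod 2) a copy of G2 delayed by `delay` chips."""
    n = len(g1)
    return [g1[k] ^ g2[(k - delay) % n] for k in range(n)]

def correlate(a, b, shift):
    """Agreements minus disagreements between a and b cyclically shifted."""
    n = len(a)
    return sum(1 if a[k] == b[(k + shift) % n] else -1 for k in range(n))

code_a, code_b = gold(5), gold(6)
cross = {correlate(code_a, code_b, s) for s in range(1023)}
auto_sidelobes = {correlate(code_a, code_a, s) for s in range(1, 1023)}
auto_peak = correlate(code_a, code_a, 0)
```

Under these assumptions the crosscorrelation and the autocorrelation sidelobes only take values in {−65, −1, 63}, small compared with the autocorrelation peak of 1023, which is the low-crosscorrelation property listed above.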
Cycle and Add Property
Maximal length LFSR sequences have a special property, named the Cycle and Add Property, that allows us to generate a time-shifted copy of the sequence as the modulo-2 addition of two other shifted copies of it. Symbolically, using the polynomial notation, we denote the original sequence by $G_0(z)$ and the shifted sequence by $G_\tau(z)$, both of length $2^r - 1$, and their respective initial conditions by $g_0(z)$ and $g_\tau(z)$. Since $G_0(z)$ and $G_\tau(z)$ are generated linearly from $g_0(z)$ and $g_\tau(z)$ according to (1.4),
$$ G_0(z) = \frac{g_0(z)}{f(z)}, \qquad G_\tau(z) = \frac{g_\tau(z)}{f(z)} \tag{1.5} $$
Since the polynomial operations are linear, it follows from the distributive law that the modulo-2 sum of the two sequences has polynomial
$$ G_0(z) + G_\tau(z) = \frac{g_0(z) + g_\tau(z)}{f(z)} \tag{1.6} $$
This means that the sum can be generated by the initial condition polynomial $g_0(z) + g_\tau(z)$. As long as $\tau$ is not a multiple of the period, this is a non-zero and therefore valid initial vector, and because a maximal length LFSR cycles through all non-zero states, the sequence generated is itself a time shift of the original sequence. In the GPS system this property is used in order to generate the time-shifted sequences that select each Gold code of the family.
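The property can be checked numerically with a short Python sketch. It assumes the 10-stage register with feedback taps at stages 2, 3, 6, 8, 9 and 10 (the G2 register of the public GPS interface specification, not given in this section) and adds the contents of stages 2 and 6, the tap pair that specification assigns to PRN 1. All names are illustrative.

```python
def lfsr_states(feedback_taps, r=10):
    """Successive register states [s1..sr] of an LFSR seeded with all ones."""
    state = [1] * r
    states = []
    for _ in range(2 ** r - 1):
        states.append(state)
        fb = 0
        for t in feedback_taps:
            fb ^= state[t - 1]
        state = [fb] + state[:-1]        # new list, earlier states stay intact
    return states

G2_TAPS = (2, 3, 6, 8, 9, 10)            # assumed feedback connections
states = lfsr_states(G2_TAPS)
g2 = [s[9] for s in states]              # sequence seen at stage 10
two_tap = [s[1] ^ s[5] for s in states]  # modulo-2 sum of stages 2 and 6

def find_delay(seq, ref):
    """Return d such that seq[k] == ref[k - d] for all k, or None."""
    n = len(ref)
    for d in range(n):
        if all(seq[k] == ref[(k - d) % n] for k in range(n)):
            return d
    return None

delay = find_delay(two_tap, g2)
```

Each stage of the register holds a shifted copy of the same sequence, so the sum of two stages is again a shifted copy; here `delay` comes out as 5 chips, without any extra delay hardware.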
1.3 GPS Signal Structure and Frequency Plan
From the GPS signaling specifications one can see that the system was born for military rather than civilian applications. The definition of the system includes two services: the Standard Positioning Service (SPS) and the Precise Positioning Service (PPS). Ranging errors have been specified to be better than 10m for the PPS, in contrast to 100m for the SPS. For this purpose, each SV broadcasts two types of PRN ranging signals:²
• the C/A code (C/A for Clear or Coarse Acquisition).
• the P code (P for Precision), which is usually encrypted and is then called P(Y) code.
The C/A code is a 1.023MHz signal available to all civilian users, and is needed to aid in the synchronization of the 10.23MHz P code. 1.023MHz was chosen as the chiprate of the C/A code for practical reasons: the C/A code has a period of 1023 chips, so at 1.023MHz every C/A period takes 1ms. The P code is a 10.23MHz signal intended for authorized users that need more than standard precision. To achieve the specifications of the PPS, ionospheric and tropospheric delay effects need to be taken into account. The ionospheric delay depends on frequency, so it can be estimated by transmitting the same information in two disjoint bands and measuring the delay difference between both bands.

Figure 1.2: GPS spectrum allocation (C/A code, 2.046MHz wide, and P(Y) code, 20.46MHz wide, around L1; P(Y) code, 20.46MHz wide, around L2; frequency axis F).

The GPS signal consists of two components, Link 1 or L1 and Link 2 or L2, both in the L-band. L1 is called the primary carrier and is centered at 154 × 10.23MHz = 1575.42MHz, and the secondary carrier L2 is centered at 120 × 10.23MHz = 1227.60MHz. The signal bandwidth at both center frequencies is 20.46MHz. Since GPS uses rectangular (squared) chip pulse shaping, this bandwidth passes 10 (one-sided) spectral lobes of the C/A code but only the main lobe of the P code. L1 carries an unbalanced QPSK modulation, with the C/A code on the in-phase component and the P code on the quadrature (90° rotated) component, 3dB weaker than the C/A power (hence unbalanced). Also on this band, the 50bps navigation data signal modulates both the in-phase and quadrature components identically. L2, however, is only BPSK modulated by the P code and by the 50bps navigation data sequence.³ A graphical representation of the spectrum of the transmitted signals in the two bands can be seen in Figure 1.2, and an outline of the signal modulator at the TX can be seen in Figure 1.3. The received power levels on Earth of the three GPS signals transmitted by the SVs are extremely weak and depend on the user elevation angle. The minimum received signal power levels, at an elevation angle of 5° from the user's horizon, are −160dBW for the C/A code, −163dBW for the P code at L1, and −166dBW for the P code at L2. A brief summary of signal specifications, power levels and frequency band allocation can be found in Table 1.1.
² Actually there exists a third PRN, used only in a time-gated mode for a Nudet (Nuclear Detonation) Detection System (NDS).
³ Navigation data modulation can be disabled by ground command.
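The numbers of the frequency plan are simple multiples of the 10.23MHz reference and can be recomputed in a few lines. The Python below is only a sanity check of the figures quoted in this section; the helper names are hypothetical, and the lobe count assumes a rectangular chip, whose spectrum has nulls at multiples of the chiprate.

```python
import math

F0_MHZ = 10.23                       # reference frequency, equal to the P code chiprate
l1_mhz = 154 * F0_MHZ                # primary carrier
l2_mhz = 120 * F0_MHZ                # secondary carrier
RF_BANDWIDTH_MHZ = 20.46

def lobes_one_sided(chiprate_mhz):
    """Spectral lobes of a rectangular-chip code within half the RF bandwidth."""
    return round((RF_BANDWIDTH_MHZ / 2) / chiprate_mhz)

def dbw_to_watts(power_dbw):
    """Convert a power level in dBW to watts."""
    return 10.0 ** (power_dbw / 10.0)

ca_lobes = lobes_one_sided(1.023)            # C/A code
p_lobes = lobes_one_sided(10.23)             # P code
ca_period_ms = 1023 / 1.023e6 * 1e3          # 1023 chips at 1.023 Mchips/s
ca_min_power_w = dbw_to_watts(-160.0)        # minimum received C/A power
```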
Figure 1.3: GPS Modulator Diagram (the 50 Hz navigation data is modulo-2 added to the 1.023 MHz C/A-code and to the 10.23 MHz P(Y)-code; both BPSK-modulate the 1575.42 MHz L1 carrier, the P(Y) branch with a +90° phase shift and −3 dB, to form the L1 signal; the P(Y) branch modulates the 1227.60 MHz L2 carrier to form the L2 signal).

                                      C/A at L1    P(Y) at L1    P(Y) at L2
Center Frequency (MHz)                1575.42      1575.42       1227.60
RF Bandwidth (MHz)                    20.46        20.46         20.46
Sidelobes in Passband (one-sided)     10           1             1
PRN Chiprate (Mchips/sec)             1.023        10.23         10.23
PRN Period                            1 ms         1 week (a)    1 week (a)
Navigation Data (bps)                 50           50            50
Minimum Received Power (dBW)          −160         −163          −166
Minimum Received C/N0 (dBHz) (b)      40           40            40
(a) The P code is 38 weeks long, but is subdivided so that each of 37 possible GPS satellites or ground transmitters gets a 1-week period code.
(b) Taking into account a 10.23MHz bandwidth for the P-code signal and 1.023MHz for the C/A-code signal.
Table 1.1: GPS signal specifications
Part I
DS/SS Synchronization
Chapter 2
Quantization Effects in DS/SS
Usually the effect of quantization is reduced to a single formula that gives the word width necessary for a given SNR. However, this approximation is valid only when the word width is large. In this chapter we study in detail the effect that quantization has on the received DS/SS signal when the number of bits is minimal (1 or 2).
2.1 Quantization Loss
In DS/SS communications it is common to have the spread signal power much lower than the noise power, i.e. the receiver typically works at very low SNR and uses correlators to recover the transmitted information. In such adverse conditions, to any unintended receiver there seems to be no signal in the band apart from noise. As noise dominates, it would be pointless to use many bits at the Analog to Digital Converter (ADC), since the only thing that would improve is the way noise is represented. Thus, it is customary to use a one-bit ADC for low-end receivers and maybe two- or three-bit ADCs for higher-end purposes. In the following sections we study the effect of the number of bits on the SNR degradation at the ADC stage for these cases and reach a conclusion about the best tradeoff between ADC loss and hardware burden.
2.1.1 1 and 2 bit Quantizers
By digitizing it is understood that analog samples are mapped into discrete values at discrete time instants. This mapping implies discarding information from the original signal, and therefore a loss in SNR is expected. The mapping for a 2-bit ADC can be described by a transfer function like that of Figure 2.1. We will use from now on the nomenclature usually found in the literature: the sign of the sample is given by the SIGN bit, and whether the magnitude of the sample crossed the threshold $\Delta$ is given by the MAG bit. We use this transfer function because it leads to a closed-form expression in $L$ and $\Delta$ that is useful for evaluating the behaviour of either 1-bit or 2-bit ADCs. Parameters $L$ and $\Delta$ will be chosen to achieve maximum performance (minimum SNR degradation).

Figure 2.1: 2-bit A/D Converter transfer function (A/D output levels $\pm 1$ and $\pm L$ versus A/D input, with thresholds at $0$ and $\pm\Delta$). It becomes the transfer function for a 1-bit ADC setting $L = 1$.

The signal model under study is the following:
$$ r(t) = \sqrt{P_s}\, s(t) + n(t) \tag{2.1} $$
where the transmitted signal $s(t) = \pm 1$ is a BPSK DS/SS modulated signal with power $P_s$ and $n(t)$ is additive white Gaussian noise (AWGN) with zero mean and power $\sigma_n^2$. An assumption of very low signal-to-noise ratio at the input ($\mathrm{SNR}_i$) is made, as we are dealing with spread spectrum communication systems where typically $P_s \ll \sigma_n^2$. In our case (GPS), $\mathrm{SNR}_i$ is expected to be around −20dB, so low-SNR approximations apply. The sampled received waveform is
$$ r_k = \sqrt{P_s}\, s_k + n_k \tag{2.2} $$
where the noise samples $n_k$ are assumed to be independent. It is known that one property of the matched filter (or correlator) is that it maximizes the SNR at its output. Thus, we should consider SNR degradation only in the case that our PN generator is aligned to the received sequence, i.e. the PN sequence $\hat{s}_k \in \{\pm 1\}$ generated at the receiver is aligned to the received sequence and thus $\hat{s}_k s_k = +1$ for all $k$. Otherwise the SNR that we would be calculating would not correspond to the real one. As our purpose is to calculate the SNR degradation caused by the quantizer between the correlator's input and output, we shall first find the expression for the correlator output. Since the quantizer transfer function $Q[\cdot]$ is odd, $\hat{s}_k Q[x] = Q[\hat{s}_k x]$, and in the aligned case
$$ R(0) = \sum_{k=1}^{M} \hat{s}_k\, Q[r_k] = \sum_{k=1}^{M} \hat{s}_k\, Q\!\left[\sqrt{P_s}\, s_k + n_k\right] = \sum_{k=1}^{M} Q\!\left[\sqrt{P_s} + n'_k\right] = \sum_{k=1}^{M} Q[y_k] \tag{2.3} $$
where $y_k = \sqrt{P_s} + n'_k$ has mean $\mu_y = \sqrt{P_s}$ and variance $\sigma_y^2 = \sigma_n^2$, and where $n'_k = \hat{s}_k n_k$ also has independent samples. The signal-to-noise ratio at the output of the correlator, $\mathrm{SNR}_o$, depends on the mean and variance at its output:

$$\mathrm{SNR}_o = \frac{E^2(R(0))}{\operatorname{var}(R(0))} = \frac{M^2 E^2(Q[y_k])}{M \operatorname{var}(Q[y_k])} = \frac{M\, m_{Q[y]}^2}{\operatorname{var}(Q[y_k])} \tag{2.4}$$
The factor $M$ is contributed by the correlator and is called the correlator gain. We will not consider it here, since we are focusing only on ADC losses. Note that since the received and generated PN sequences are assumed to be aligned, this mean and variance are calculated as for an equivalent "all ones" input. The mean of the quantizer output, $m_{Q[y]} = E(Q[y_k])$, for the transfer function depicted in Figure 2.1 is

$$E(Q[y_k]) = L \Pr\{y_k > \Delta\} + \Pr\{0 < y_k < \Delta\} - \Pr\{-\Delta < y_k < 0\} - L \Pr\{y_k < -\Delta\} \tag{2.5}$$

Using the following relations

$$\Pr\{0 < y_k < \Delta\} = \Pr\{y_k > 0\} - \Pr\{y_k > \Delta\} \tag{2.6a}$$

$$\Pr\{-\Delta < y_k < 0\} = \Pr\{y_k < 0\} - \Pr\{y_k < -\Delta\} \tag{2.6b}$$

we can write $E(Q[y_k])$ as

$$E(Q[y_k]) = (L-1) \Pr\{y_k > \Delta\} - (L-1) \Pr\{y_k < -\Delta\} + \Pr\{y_k > 0\} - \Pr\{y_k < 0\} \tag{2.7}$$

Noticing that, by the symmetry of the Gaussian PDF around $\mu_y$,

$$\Pr\{y_k > 0\} - \Pr\{y_k < 0\} = \Pr\{0 < y_k < 2\mu_y\} = 2 \Pr\{0 < y_k < \mu_y\} \tag{2.8}$$

then $m_{Q[y]}$ is

$$E(Q[y_k]) = 2 \Pr\{0 < y_k < \mu_y\} + (L-1) \left[\Pr\{y_k > \Delta\} - \Pr\{y_k < -\Delta\}\right] \tag{2.9}$$

These probabilities can be expressed with the $Q(x)$ function as

$$\Pr\{0 < y_k < \mu_y\} = Q(0) - Q\!\left(\frac{\mu_y}{\sigma_n}\right) \tag{2.10}$$

$$\Pr\{y_k > \Delta\} - \Pr\{y_k < -\Delta\} = Q\!\left(\frac{\Delta - \mu_y}{\sigma_n}\right) - Q\!\left(\frac{\Delta + \mu_y}{\sigma_n}\right) \tag{2.11}$$

Recall that $Q(x)$ is called the right-tail probability: it is the probability of exceeding a given value $x$ under the standard normal probability density function (PDF) $\mathcal{N}(0, 1)$, where $\mathcal{N}(\mu, \sigma^2)$ denotes a normal PDF with mean $\mu$ and variance $\sigma^2$. The function $Q(x)$ is also termed the complementary cumulative distribution function, and it cannot be evaluated in closed form. It is defined as

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^2/2}\, dt \le 1 \tag{2.12}$$
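and is related to the well-known error function $\operatorname{erf}(x)$ by

$$Q(x) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right) \tag{2.13}$$

where $\operatorname{erf}(x)$ is

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^2}\, dt, \qquad |\operatorname{erf}(x)| \le 1 \tag{2.14}$$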
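At very low $\mathrm{SNR}_i$ we can use the following approximations to the $Q(x)$ function:

$$Q(x - \delta) - Q(x + \delta) \approx \frac{1}{\sqrt{2\pi}}\, (2\delta)\, e^{-x^2/2}, \qquad \delta \ll 1 \tag{2.15a}$$

$$Q(x - \delta) + Q(x + \delta) \approx 2Q(x), \qquad \delta \ll 1 \tag{2.15b}$$

Thus,

$$
\begin{aligned}
E(Q[y_k]) &= 2\left[Q(0) - Q\!\left(\frac{\mu_y}{\sigma_n}\right)\right] + (L-1)\left[Q\!\left(\frac{\Delta - \mu_y}{\sigma_n}\right) - Q\!\left(\frac{\Delta + \mu_y}{\sigma_n}\right)\right] && (2.16) \\
          &\approx 2\, \frac{1}{\sqrt{2\pi}}\, \frac{\mu_y}{\sigma_n} + (L-1)\, \frac{1}{\sqrt{2\pi}}\, \frac{2\mu_y}{\sigma_n}\, e^{-\Delta^2/2\sigma_n^2} && (2.17)
\end{aligned}
$$

where in the approximation step (2.16)-(2.17) we have used the approximation $\exp\!\left(-\frac{(\mu_y/2)^2}{2\sigma_n^2}\right) \approx 1$, as $\mathrm{SNR}_i = (\mu_y/\sigma_n)^2$ and at very low SNR we have $(\mu_y/\sigma_n)^2 \ll 1$. Regrouping (2.17) we finally have

$$E(Q[y_k]) = \sqrt{\frac{2}{\pi}} \left[1 + (L-1)\, e^{-\Delta^2/2\sigma_n^2}\right] \sqrt{\mathrm{SNR}_i} \tag{2.18}$$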
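As a quick numerical sanity check of the low-SNR approximation, the following Python sketch compares the exact quantizer mean, obtained from (2.9) to (2.11), with the closed form (2.18). The operating point ($\mathrm{SNR}_i = -20$ dB, $L = 3$, $\Delta/\sigma_n = 1$, $\sigma_n = 1$) and all function names are illustrative assumptions, not part of the original design.

```python
import math

def q_func(x):
    """Right-tail probability Q(x) of a standard normal, computed via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mean_exact(L, delta, mu, sigma=1.0):
    """Exact E(Q[y_k]) obtained by inserting (2.10) and (2.11) into (2.9)."""
    central = 2.0 * (q_func(0.0) - q_func(mu / sigma))
    outer = q_func((delta - mu) / sigma) - q_func((delta + mu) / sigma)
    return central + (L - 1.0) * outer

def mean_approx(L, delta, mu, sigma=1.0):
    """Low-SNR closed form (2.18)."""
    snr_i = (mu / sigma) ** 2
    bracket = 1.0 + (L - 1.0) * math.exp(-delta ** 2 / (2.0 * sigma ** 2))
    return math.sqrt(2.0 / math.pi) * bracket * math.sqrt(snr_i)

# Assumed operating point: sigma_n = 1 and SNR_i = -20 dB, hence mu_y = 0.1.
mu = 10.0 ** (-20.0 / 20.0)
exact = mean_exact(3.0, 1.0, mu)
approx = mean_approx(3.0, 1.0, mu)
rel_err = abs(approx - exact) / exact
print(f"exact={exact:.6f}  approx={approx:.6f}  rel_err={rel_err:.2e}")
```

At this operating point the two values differ by well under 1%, which supports using (2.18) at GPS-like input SNRs.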
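The variance can be expressed as

$$
\begin{aligned}
\sigma_o^2 = \operatorname{var}(Q[y_k]) &= E(Q^2[y_k]) - E^2(Q[y_k]) && (2.19) \\
                                        &\approx E(Q^2[y_k]) && (2.20)
\end{aligned}
$$

where it has been considered that $E^2(Q[y_k]) \ll E(Q^2[y_k])$, as $E^2(Q[y_k])$ is proportional to $\mathrm{SNR}_i$ and we are working at very low $\mathrm{SNR}_i$. With similar reasoning and calculations we find that the expectation of the squared quantizer output is

$$
\begin{aligned}
E(Q^2[y_k]) &= L^2 \Pr\{y_k > \Delta\} + (-L)^2 \Pr\{y_k < -\Delta\} + 1^2 \Pr\{0 < y_k < \Delta\} + (-1)^2 \Pr\{-\Delta < y_k < 0\} \\
            &= (L^2 - 1) \left[\Pr\{y_k > \Delta\} + \Pr\{y_k < -\Delta\}\right] + \Pr\{y_k > 0\} + \Pr\{y_k < 0\} \\
            &= (L^2 - 1) \left[\Pr\{y_k > \Delta\} + \Pr\{y_k < -\Delta\}\right] + 1 \\
            &= (L^2 - 1) \left[Q\!\left(\frac{\Delta - \mu_y}{\sigma_n}\right) + Q\!\left(\frac{\Delta + \mu_y}{\sigma_n}\right)\right] + 1
\end{aligned}
\tag{2.21}
$$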
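which, by approximation (2.15b), can be written as

$$E(Q^2[y_k]) \approx 1 + 2(L^2 - 1)\, Q\!\left(\frac{\Delta}{\sigma_n}\right) \tag{2.22}$$

Thus, the A/D conversion loss for low $\mathrm{SNR}_i$ scenarios can be stated as

$$L_{ADC}\!\left(L, \frac{\Delta}{\sigma_n}\right) = \frac{\mathrm{SNR}_i}{\mathrm{SNR}_o / M} = \frac{\mathrm{SNR}_i}{m_{Q[y]}^2 / \sigma_o^2} = \frac{\sigma_o^2}{m_{Q[y]}^2 / \mathrm{SNR}_i} = \frac{1 + 2(L^2 - 1)\, Q\!\left(\frac{\Delta}{\sigma_n}\right)}{\frac{2}{\pi} \left[1 + (L-1)\, e^{-\Delta^2/2\sigma_n^2}\right]^2} \tag{2.23}$$

2.1.1.1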
Performance Comparison for 2-bits, 1-bit(SIGN) and 1-bit(MAG) ADCs
With the ADC transfer function of Figure 2.1 we have three quantization options to implement at the receiver, trading off hardware burden and performance. These options are:
• 2-bits, in which case $L$ and $\Delta$ should be chosen to minimize the loss $L_{ADC}(L, \Delta/\sigma_n)$.
• 1-bit(SIGN), or 1-bit keeping only the SIGN information. This case is the same as setting $L = 1$, and $\Delta$ then has no meaning.
• 1-bit(MAG), or 1-bit keeping only the samples that exceed $\Delta$, along with their sign information. This case is similar to setting $L = \infty$, but the loss $L_{ADC}(\Delta/\sigma_n)$ still needs to be minimized over $\Delta$. More simply, this case can be regarded as if the ADC had a zero-level output and a transfer function like that of Figure 2.2.

Figure 2.2: 1-bit A/D Converter transfer function with zero-level output (A/D output versus A/D input, with output levels $\pm 1$ and thresholds $\pm\Delta$). It only keeps those samples exceeding $\Delta$ (MAG bits).
It is straightforward to use equation (2.23) to evaluate the 1-bit(SIGN) ADC by simply setting $L = 1$. In this case the conversion loss is $\pi/2$, or 1.96 dB. This loss is obviously independent of $\Delta$, as the ADC behaves just like a "sign(x)" function, so no threshold apart from 0 is used. Figure 2.3 plots the ADC conversion losses for the 2-bits, 1-bit(SIGN) and 1-bit(MAG) cases.

Figure 2.3: $L_{ADC}(L, \Delta/\sigma_n)$, theoretical ADC conversion losses. A/D conversion loss (dB) versus $\Delta/\sigma_n$ from 0 to 4, with curves for $L = 1$ (1-bit SIGN), $L = 2$, $L = 3$ and $L = 3.5$ (2 bits), and $L = \infty$ (1-bit MAG).
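The closed form (2.23) is easy to evaluate numerically. The following Python sketch, whose function names are illustrative assumptions, computes the conversion loss in dB for the 1-bit(SIGN) case ($L = 1$) and for a 2-bit quantizer with $L = 3$ and $\Delta/\sigma_n = 1$.

```python
import math

def q_func(x):
    """Right-tail probability Q(x) of a standard normal, computed via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def adc_loss(L, delta_over_sigma):
    """Low-SNR A/D conversion loss (linear scale) from equation (2.23)."""
    d = delta_over_sigma
    num = 1.0 + 2.0 * (L ** 2 - 1.0) * q_func(d)
    den = (2.0 / math.pi) * (1.0 + (L - 1.0) * math.exp(-d ** 2 / 2.0)) ** 2
    return num / den

def to_db(x):
    return 10.0 * math.log10(x)

loss_sign_db = to_db(adc_loss(1.0, 1.0))   # L = 1: the threshold plays no role
loss_2bit_db = to_db(adc_loss(3.0, 1.0))   # 2 bits, L = 3, Delta/sigma_n = 1
print(f"1-bit(SIGN): {loss_sign_db:.2f} dB, 2-bit: {loss_2bit_db:.2f} dB")
```

Sweeping `delta_over_sigma` for several values of `L` with the same function should reproduce curves of the kind shown in Figure 2.3.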
In the case of the 2-bits ADC only the curves for $L = 2$, $L = 3$ and $L = 3.5$ have been plotted. The curves for $L > 3.5$ would either overlap these or show higher conversion losses, so they have been omitted for visual simplicity. The pair chosen for a 2-bits ADC is $L = 3$, $\Delta/\sigma_n = 1.0$ ($L$ could have been chosen between 2.5 and 4.0 without much difference, but 3 is more suitable for implementation purposes). In this case the conversion loss is 0.55 dB. As said before, the 1-bit(SIGN) case has a constant loss of 1.96 dB for all thresholds, but we would like to emphasize one fact that can be observed in Figure 2.3. The 1-bit(MAG) ADC performs better than the 1-bit(SIGN) ADC for a range of threshold values, but unfortunately worsens a lot beyond a certain point. The explanation is the following: by keeping only those samples that are beyond the threshold we are weighting only samples that are likely to have been "good" realizations, hence improving $\mathrm{SNR}_o$. But if we set the threshold too high, we become too "selective" or too "picky", and very few samples will cross the threshold, hence degrading $\mathrm{SNR}_o$.

The advantages of 1-bit(SIGN) are robustness in AWGN (it has the same performance for all thresholds) and even simpler hardware, since no AGC is needed to dynamically adjust the threshold. In contrast, 1-bit(MAG) can reduce the SNR degradation while keeping the hardware very simple too, but it requires a more critical AGC in order to keep the threshold in the low-loss region. The 1-bit(MAG) ADC achieves a minimum conversion loss of 0.92 dB at $\Delta/\sigma_n = 0.61$, and a loss of 1.32 dB at $\Delta/\sigma_n = 1$.$^{1}$

In this project, as in many GPS receivers, the 2-bits option is preferred in order to keep conversion losses as low as possible while allowing for a certain hardware simplicity. As shown in the previous paragraphs, in this case we choose $L = 3$ and $\Delta/\sigma_n = 1.0$. Setting the threshold at $\Delta/\sigma_n = 1.0$ can also be regarded as forcing the AGC+ADC to output given statistics for the SIGN and MAG bits (always assuming that the system is working in AWGN only, with no CW interference). The SIGN bit is expected to take each value about 50% of the time. The MAG bit, however, will have statistics different from 50%, depending on the $\Delta$ chosen.$^{2}$ The probability that the threshold $\Delta$ is exceeded depends, as we know, on the $Q(x)$ function:

$$\Pr\{y_k > \Delta\} + \Pr\{y_k < -\Delta\} = \Pr\{|y_k| > \Delta\} \approx 2Q(\Delta/\sigma_n) \tag{2.24}$$

For $\Delta/\sigma_n = 1$ this probability is $2Q(\Delta/\sigma_n) = 31.7\%$, so these are the expected statistics of the MAG bit. Fortunately, these facts are taken into account in most GPS receivers, and most of those with 2-bit ADCs have an AGC that continuously seeks $\Delta/\sigma_n = 1$.$^{3}$
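The figures quoted for the MAG bit can be checked with a few lines of Python. This is a minimal sketch under two stated assumptions: the 1-bit(MAG) loss is taken as the limit of (2.23) for $L \to \infty$, which works out to $\pi\, Q(\Delta/\sigma_n)\, e^{\Delta^2/\sigma_n^2}$ (a derivation made here, not a formula quoted from the text), and the optimum threshold is located by a simple grid search with step 0.01. All names are illustrative.

```python
import math

def q_func(x):
    """Right-tail probability Q(x) of a standard normal, computed via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mag_loss(d):
    """Assumed 1-bit(MAG) loss: limit of (2.23) as L -> infinity, d = Delta/sigma_n."""
    return math.pi * q_func(d) * math.exp(d * d)

def to_db(x):
    return 10.0 * math.log10(x)

# Fraction of time the MAG bit is active at Delta/sigma_n = 1, equation (2.24).
mag_duty = 2.0 * q_func(1.0)

# Grid search for the threshold that minimizes the 1-bit(MAG) loss.
grid = [i / 100.0 for i in range(1, 401)]
best = min(grid, key=mag_loss)
loss_best_db = to_db(mag_loss(best))
loss_at_one_db = to_db(mag_loss(1.0))

print(f"MAG duty at 1.0: {100.0 * mag_duty:.1f} %")
print(f"best threshold: {best:.2f} -> {loss_best_db:.2f} dB")
print(f"loss at 1.0: {loss_at_one_db:.2f} dB")
```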
$^{1}$ It is important to note the conversion loss at this threshold, as most 2-bit GPS RF front-end receivers are designed for it, so if we intended to use 1-bit(MAG) in our design we could not avoid using it.
$^{2}$ Nevertheless, in the case of the 1-bit(MAG) ADC the optimum threshold at 0.61 makes the MAG statistics be around 50%. This can be understood if we think that, with only 1 bit of information, the maximum information is given when the probability of each outcome is 1/2. In the 2-bit case this does not have to hold, because samples that cross the threshold are given more weight, hence carry more information, so the statistics do not have to be the same. This threshold has to be set carefully if we do not want to "miss" good samples or be "misled" by mistakes that would weigh more and hence be more harmful.
$^{3}$ The AGC+ADC modules of the GP2010 RF front end by Zarlink chosen for our project are designed with this in mind [10].
Chapter 3
DS/SS Acquisition

In Direct Sequence Spread Spectrum communication systems, synchronization can be split into two parts: code acquisition and tracking. Acquisition of the pseudorandom signal (or Code Acquisition) is the first stage in the synchronization process. Its purpose is to provide coarse epoch synchronization to the incoming signal, i.e., to align the receiver's generated pseudorandom signal to that transmitted by the satellite as much as possible.$^{1}$ Under these conditions the tracking subsystem can be initialized, because the synchronization error provided by the acquisition unit is expected to be relatively small, usually a fraction of a chip. Achieving this coarse alignment is necessary to be able to remove the spreading sequence $c_k(t)$ from the spread signal sent by the $k$-th satellite. Once the signal is despread, the data sequence can be estimated correctly.
3.1
Conventional Techniques: 2D Serial Search
As described in several books and papers [5],[4] dealing with DS/SS acquisition, one of the most intuitive ways to align the incoming PN sequence with the one generated at the receiver is by means of matched filtering (MF). This is not surprising, as we know that the MF is optimum in the SNR sense [11]. At the output of a filter matched to the incoming PN sequence we would obtain the correlation values of that sequence with the PN sequence generated at our receiver. From the correlation properties of PN sequences we know that high and very low correlation values are expected from aligned and misaligned sequences, respectively. A typical PRN correlation can be seen in Figure 3.1.
3.1.1
Sliding Correlator
A very simple code acquisition system based on the MF is the one shown in Figure 3.2, called the sliding correlator. Its blocks compute the correlation, at a given timing error $\varepsilon = \tau_0 - \hat{\tau}$, between the incoming spread signal $r(t)$ and our despreading sequence $c(t)$. Here $\tau_0$ is the initial unknown timing error, $\hat{\tau}$ is the estimated timing error, and the objective of the sliding correlator is to set $\varepsilon \approx 0$. It does so by expecting the output level at $s(t)$ to exceed an arbitrary threshold $\theta$. Until this threshold is surpassed by an autocorrelation value we will not consider our reference to be coarsely aligned to $r(t)$, since in that situation $\varepsilon$ is not likely to be close to 0. Meanwhile we cyclically change the alignment difference between $r(t)$ and $c(t)$. Incrementing the alignment difference cyclically will eventually lead the correlator to an almost perfect alignment, depending on the step size with which the alignment is changed. Each of the discrete values that $\hat{\tau}$ can take, and that needs to be checked for validation, is called a cell or hypothesis, and the whole set to which $\hat{\tau}$ belongs is the uncertainty region or simply the hypothesis space (HS). We want to emphasize a fact that has to be clear from now on: hypotheses or cells are code-phase differences; they are by no means "chips". It is because of this PN-sequence shifting that this correlator is dubbed sliding correlator, and because of its behaviour it belongs to a widely studied class of DS/SS synchronizers called Serial Search.$^{2}$

Figure 3.1: PN autocorrelation function $R(\tau)$ for the (theoretical) perfectly square pulse (axis labels: peak value 1, threshold $\theta$, $-T_c$, $T_c$, $\tau_{max}$, off-peak level $-1/P$). Remember that this function is periodic with period $P \cdot T_c$ seconds.

Figure 3.2: Sliding Correlator. The received signal $r(t - \tau_0)$, with $r(t) = d(t)c(t) + n(t)$, is multiplied by the PN generator output $c(t - \hat{\tau})$ and integrated from 0 to $\lambda T_c$ to give $s(t)$; the squared output $s^2(t)$ is compared against the threshold $\theta$, and the result controls the PN generator.

$^{1}$ Later we will see that in DS/SS "as much as possible" translates into "within 1/2 chip error".
$^{2}$ There are other alternatives to serial search, based on estimation of the LFSR state register or on FFT methods [?], that will not be discussed in this project, as the proposed method is also based on serial search.
Several parts can be identified in Figure 3.2. The multiplication by $c(t)$ followed by the integrator behaves as a matched filter. The squarer non-linearity is used to suppress the data modulation; otherwise the integrator output would depend on the sign of every data bit sent. In the special case of having a CDMA pilot for synchronization aiding purposes, this squarer would no longer be necessary. When $\theta$ is eventually exceeded, the comparator will stop changing $\hat{\tau}$, i.e. it will stop changing the phase of the generated PN sequence. In detection theory literature [11] the signal that enters the comparator to be tested against a threshold is usually called the test statistic.
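The behaviour just described can be sketched in a few lines of Python. This is an illustrative toy model, not the implementation developed in this project: it assumes one sample per chip, a search step of one full chip, a single data bit per code period, a dwell of one full period ($\lambda = P$), unquantized samples with Gaussian noise of $\sigma_n = 2$, and a 1023-chip maximal-length sequence generated with the feedback polynomial $1 + x^3 + x^{10}$. All names and parameter values are hypothetical.

```python
import random

def pn_sequence():
    """1023-chip m-sequence from a 10-stage LFSR (polynomial 1 + x^3 + x^10)."""
    reg = [1] * 10
    chips = []
    for _ in range(1023):
        chips.append(1 - 2 * reg[9])      # map bit 0/1 to chip +1/-1
        feedback = reg[2] ^ reg[9]        # taps at stages 3 and 10
        reg = [feedback] + reg[:9]
    return chips

def sliding_correlator(rx, code, theta):
    """Serial search: return the first code phase whose squared correlation exceeds theta."""
    n = len(code)
    for tau_hat in range(n):              # one cell per chip (assumed step size)
        s = sum(rx[k] * code[(k - tau_hat) % n] for k in range(n))
        if s * s > theta:                 # the squarer removes the data-bit sign
            return tau_hat
    return None

rng = random.Random(1)
code = pn_sequence()
P = len(code)
tau0 = 200                                # unknown delay to be found (in chips)
data_bit = -1                             # data modulation, unknown to the receiver
sigma_n = 2.0                             # noise std, i.e. about -6 dB SNR per chip
rx = [data_bit * code[(k - tau0) % P] + rng.gauss(0.0, sigma_n) for k in range(P)]

theta = (P / 2.0) ** 2                    # threshold halfway to the ideal peak P
acquired = sliding_correlator(rx, code, theta)
print("true delay:", tau0, "acquired:", acquired)
```

In this toy setting the aligned cell yields a correlation close to $\pm P$ and every other cell a value close to $-1$ plus noise, so the search stops at the true delay. At GPS-like SNRs a single code period would not be enough, which is one motivation for the dwell strategies discussed next.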
3.1.2
Performance of Sliding Correlator
There are several reasons why the basic sliding correlator performs rather poorly in practice.

1. The PN-sequence MF used is not a transversal filter but a serial filter. This means that every correlation computation takes $\lambda$ times longer than with a transversal filter. $\lambda$ is the number of chips to be integrated and $\lambda T_c$ is the amount of time required to output a single correlation value, in comparison with only $T_c$ for a traditional filter. The reason for doing it this way is that implementing a matched filter for a whole period of a PN sequence could take many taps, depending on the sequence length. That would require too much hardware for one single operation in the receiver, so it is not adequate for systems where PN sequences are longer than a certain value.

2. So far, all cells in the uncertainty region are tested during the same amount of time $\lambda T_c$. The time we spend comparing the two signals is called observation time or dwell. It is plausible to expect that a wrong hypothesis will output low correlation values all along the dwell. Depending on how this dwell time is set, we can think of three possible types of tests: Single Dwell, Multiple Dwell and Sequential Test. In Single Dwell, the observation time is the same for all cells. Multiple Dwell architectures make up to $K$ observations of different durations on the same cell, starting with shorter times and gradually increasing them. This way the mean observation time per cell is seriously reduced, since wrong hypotheses are discarded sooner. Sequential Test architectures use variable observation times, based on the assumption that good hypotheses tend to improve in time while bad hypotheses tend to worsen little by little, thus reducing the mean observation time per hypothesis even further. It is important to choose the dwell approach carefully, since the false-alarm probability, the miss-detection probability and the mean acquisition time of our receiver will depend on it.

3. Doppler can spoil correlation values significantly. In the following sections we will see that correlating two sequences, one of which suffers from Doppler shift, makes the correlation properties valid only locally. So correlations cannot be computed for as long as we may need in order to raise our signal, buried in noise, above it. That is why most commercial receivers need to try to compensate for a hypothetical unknown Doppler in the acquisition process if they want to correlate for longer than Doppler allows. In fact, this leads us to a two-dimensional hypothesis space very well known by all DS/SS receiver designers. In Figure 3.3 we can see that the HS is a 2D space: one axis holds all possible code offsets $\hat{\tau}_k$ for $1 \le k \le P$, and the other axis holds all the Doppler shift cases to be tested to achieve a desired carrier frequency offset resolution.

In this project we will try to make a high-performance spread spectrum synchronizer for GPS without incurring excessive hardware burden. That means that our correlators will not use transversal filters, to avoid using too much hardware for that sole operation; that we will use a combination of sequential test and multiple dwell configurations; and that we will try to make our receiver work in a 1-dimensional HS instead of the well-known 2D HS. Later on we will improve performance by parallelizing the whole acquisition system and taking advantage of an antenna array.

Figure 3.3: Acquisition Uncertainty Region or Hypothesis Space (HS). One cell is $1/N_{cc}$ chips wide (usually 1/2 chip) and one Doppler bin is $f_d/N_{bins}$ Hz wide. $P$ is the LFSR sequence period and $N_{cc}$ the number of cells per chip, giving $P \cdot N_{cc}$ cells per Doppler bin (usually $1023 \cdot 2 = 2046$ in GPS). The figure marks the cells already tested and the cell being tested.
3.2
Avoiding Doppler Search
A new DS/SS synchronization algorithm, described in [1] and thoroughly studied in [2], was developed under the constraint that code acquisition should be done before carrier synchronization. This constraint entailed that the hypothesis space search could not include a frequency search. Other constraints concerned the acquisition speed over several C/N0 ranges and the requirement that decisions not be fed back. In this project we focus on these constraints without seeking ultra-fast code acquisition. The reason is that in GPS one frame takes 30 seconds to be received, so it is wasteful to spend much more hardware logic to synchronize in, e.g., 100 ms instead of 500 ms when it takes so long to gather enough information to start single-point calculations. To avoid the frequency search, a simple but careful study of the Doppler effects had to be done; the result was a very robust algorithm that searches only in the code space and is really fast for a wide range of C/N0. Basically, the algorithm follows the same principle shown in Figure 3.2, where the sliding correlator can be thought of as a simple power detector used to test whether a code-phase hypothesis is good, i.e. whether it surpasses a power threshold $\theta$. To fully understand how the proposed algorithm avoids a joint code/frequency acquisition over a 2D HS, let us study the Doppler effects in detail.
3.2.1
Doppler Effects
It is known that in any communication system Doppler has an expansive or compressive effect on the received wave; equivalently, it decreases or increases the apparent carrier frequency of the incoming signal when the TX and RX are moving apart or coming closer, respectively.$^{3}$ This shift $\Delta f$ is measured in Hz if absolute Doppler is considered, or as the ratio between $\Delta f$ and $f_c$ if relative Doppler is preferred. Obviously this ratio is dimensionless, and we will see later that not only the relative but also the absolute value is of great importance when analyzing the restrictions applicable to our synchronization method. In GPS applications, we must be aware that a carrier frequency shift of about $\Delta f = \pm 6$ kHz can exist. This parameter needs to be considered in the design of the code acquisition subsystem, since in the very initial synchronization phase we have absolutely no information about the Doppler frequency shift of the received satellite signal.

In DS/SS communications, Doppler affects the received signal in two closely related ways:
1. Carrier Doppler Frequency Shift ($\Delta f$), which will limit the maximum coherent observation time.
2. Chiprate Doppler ($\Delta r$), which will limit the maximum noncoherent observation time.

$^{3}$ Also, TX/RX clock oscillators can suffer from bias, thus shifting $f_c$. To simplify the analysis, in the sequel we consider that these errors are all included in the overall Doppler frequency shift.
3.2.1.1
Carrier Doppler Frequency Shift
The fact that the received signal is frequency shifted limits the amount of time over which the matched filter can integrate in the search for a PRN correlation peak. In other words, there exists a maximum observation time during which we can look for similarities between the received signal and our PRN code (hence coherent, as both phases are taken into account in the comparison). Having a Doppler frequency shifted signal means that the matched filter is comparing a reference signal with a replica that is modulated by a frequency equal to the absolute carrier Doppler frequency shift $\Delta f$.$^{4}$
3.2.1.2
Chiprate Doppler
The chiprate Doppler $\Delta r$ is an expansion/compression of the PRN sequence caused by the same effect as carrier Doppler: a radial velocity component between TX and RX. However, as the chiprate is several orders of magnitude lower than the carrier frequency, this expansion/compression is less noticeable, yet it needs to be considered. As the received PRN sequence is slowly drifting while we compare it to our reference, there exists a maximum amount of time beyond which we could not guarantee that the hypothesis delivered to the PLL/DLL has a maximum timing error of $\pm 1/2$ chip. This is the reason why the chiprate Doppler sets a maximum non-coherent observation time.
3.2.1.3
Relationship between Carrier Frequency Doppler and Chiprate Doppler
The two effects explained so far are caused by the same phenomenon, so they must be closely related. The relationship between carrier frequency Doppler and chiprate Doppler is given by the following expression:

$$\frac{\Delta r}{r_c} = \frac{\Delta f}{f_c} \tag{3.1}$$

This expression states a natural fact: the relative expansion/compression of the chiprate equals the relative expansion/compression of the RF carrier due to Doppler.$^{5}$

$^{4}$ Notice that in the RF front end, where mixers downconvert $f_c$ to IF, the absolute Doppler $\Delta f$ is conserved throughout the downconversion process.
$^{5}$ A numerical example clarifies this further. We use the P-code chiprate $r_c = 10.23$ Mchips/s and the L1 carrier frequency $f_c = 154\, r_c = 1575.42 \cdot 10^6$ cycles/s. $K = f_c / r_c = 154$ carrier cycles/chip is the number of carrier cycles that fit in one chip. So, if the Doppler is $\Delta f = 6000$ cycles/s, the number of chips $\Delta r$ that the signal is shifted in one second is $\Delta r = 6000\ \text{(cycles/s)} / K\ \text{(cycles/chip)}$. Thus $\Delta r = \Delta f / K = \Delta f / (f_c / r_c)$, from which (3.1) follows.
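Relation (3.1) can be turned into numbers with a short Python sketch. The P-code figures are those of the footnote example; the C/A-code chiprate of 1.023 Mchips/s and the "time to drift half a chip" figure are additions made here for illustration, the latter being only a rough bound that ignores every effect other than a constant chiprate Doppler. Function and constant names are hypothetical.

```python
F_L1 = 1575.42e6        # L1 carrier frequency (Hz)
DOPPLER_HZ = 6000.0     # worst-case carrier Doppler considered in the text (Hz)

def chiprate_doppler(delta_f, chip_rate, carrier=F_L1):
    """Chiprate Doppler from (3.1): delta_r = delta_f * r_c / f_c, in chips/s."""
    return delta_f * chip_rate / carrier

def half_chip_drift_time(delta_r):
    """Time (s) for the code phase to drift by 1/2 chip at a constant chiprate Doppler."""
    return 0.5 / abs(delta_r)

dr_p = chiprate_doppler(DOPPLER_HZ, 10.23e6)    # P code, as in the footnote example
dr_ca = chiprate_doppler(DOPPLER_HZ, 1.023e6)   # C/A code (assumed here)
t_p = half_chip_drift_time(dr_p)
t_ca = half_chip_drift_time(dr_ca)

print(f"P code:   {dr_p:.2f} chips/s, half-chip drift after {1e3 * t_p:.1f} ms")
print(f"C/A code: {dr_ca:.2f} chips/s, half-chip drift after {1e3 * t_ca:.1f} ms")
```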
3.3
Description of a DS/SS Communication System
In Figure 3.6(a) we can see a block diagram of a general DS/SS communication system once it is synchronized and is continuously tracking the signal in order not to lose lock. Note that the blocks that apply estimated corrections (phase rotator, time-variant interpolator, variable decimator) have been depicted, while those that estimate those corrections have been omitted for visual simplicity. Figure 3.6(b) shows a simplification of the RX in Figure 3.6(a). The simplification assumes that the timing recovery subsystem is working as expected, so that $T_c/T_s$ can be treated as rational. Since this is a rather serious assumption, we will first demonstrate its feasibility, that is, we will show how the timing recovery is done, and from that point on we will use the simplified RX to describe how the DS/SS RX works.
3.3.1
Timing Recovery
3.3.1.1
Basic Concepts
In digital implementations of communication systems, it is preferred to place the ADC stage as close as possible to the antenna, so that all the signal processing in the RX is done by means of simpler, smaller, less power-consuming and more replicable functions: digital gates. All-digital implementations avoid any signal processing before the ADC and never change the ADC sampling rate, so that the sample rate is always fixed at $f_s = 1/T_s$. This means that the A/D conversion stage is an asynchronous sampling process with respect to the transmitter clock,$^{6}$ and that at the receiver there exist only samples at multiples of the sampling period, $t = kT_s$. It is called asynchronous because the ratio of symbol period to sampling period, $T/T_s$, is not rational (in our case $T$ is the chip period $T_c$).

We must stress that $T_c$ is incommensurate with $T_s$ and that the RX can only see samples on a $T_s$ time basis, because this forces us to find an algorithm that gives out samples at multiples of $T_c$ based on calculations on samples at multiples of $T_s$. We know this can be done because, by the Nyquist sampling theorem, all the information of a bandlimited signal $y(t)$ is contained in its samples $y(kT_s)$ if the analog signal has been sampled at a rate of at least twice its bandwidth. The second thing this algorithm should do is give out samples at $t = nT_c + \hat{\varepsilon} T_c$, i.e. samples of the matched filter output at the best sampling instants ($\varepsilon T_c$ is a fraction of a chip, $\varepsilon \le 1$). These two key operations of the algorithm are called Timing Recovery.

In digital signal processing the conversion of sample rates is studied under the name of "Multirate Signal Processing", which suggests interpolation and decimation techniques. Note that because of the irrational ratio $T_c/T_s$ this multirate signal processing will require an irrational interpolation/decimation ratio.

So the first aspect we address in this section is to show how we can take for granted that at the MF output of our receiver there are samples at multiples of $T_c$ with a best-sampling-instant correction $\hat{\varepsilon} T_c$. We want to show that a matched filter plus a time-variant interpolator plus a variable decimator behave exactly as if we were taking samples at the maximum of the MF output with symbol period $T_c$, regardless of the actual sampling period $T_s$ of our receiver. In short, we will obtain samples at $t = nT_c + \hat{\varepsilon} T_c$ out of samples at $t = kT_s$. The equivalence we want to show is summarized in Figure 3.4, where we go from a completely analog approach to a fully digital implementation step by step, from top to bottom.

Figure 3.4: Timing recovery equivalences from fully analog to fully digital approaches, step by step. (a) Fully Analog: matched filter $g_{CMF}(t)$ whose output is sampled at $t = nT_c + \hat{\varepsilon} T_c$. (b) Hybrid: sampling at $t = kT_s$, matched filter $g_{MF}(kT_s)$, D/A acting as interpolator, and sampling at $t = nT_c + \hat{\varepsilon} T_c$. (c) Fully Digital: sampling at $t = kT_s$, matched filter $g_{MF}(kT_s)$, interpolator $h_I(kT_s, \mu_n)$ and variable decimator at $m_n T_s$. (d) Fully Digital: sampling at $t = kT_s$, interpolator $h_I(kT_s, \mu_n)$, matched filter $g_{MF}(kT_s)$ and variable decimator at $m_n T_s$, with output $r(m_n T_s + \mu_n T_s) = r(nT_c + \hat{\varepsilon} T_c)$.

$^{6}$ Even with the best clock standards, there would always exist some differences due to other effects like Doppler, heat, etc., that would sink our receiver if we assumed synchronism.
We briefly describe now the steps followed in Figure 3.4 to explain them later in further detail.
(a) to (b) It is possible to go from the fully analog implementation (a) to the hybrid one thanks to the theorem of equivalence between discrete-time and continuous-time signal processing.

(b) to (c) The D/A can be regarded as a "perfect" interpolator. In the digital counterpart the interpolator will try to interpolate samples for the variable decimator, so that when it takes samples at $m_n T_s$, those samples are equivalent to a sample taken at $nT_c$.

(c) to (d) The last step has two possible interpretations. The first one is that the filters $g_{MF}(kT_s)$ and $h_I(kT_s, \mu)$ are linear, so they can be swapped. The second interpretation, as we will see later in more detail in equation 3.13, is that $y(kT_s)$ can also be interpolated by a fraction of a sample $\mu_n$, so that when these samples are processed by the matched filter and the variable decimator, they will look as if the symbol time were a multiple of the time between consecutive samples.

This fact may be more easily seen with an example. Let us suppose that the sampling rate is slightly above the minimum sample rate, i.e. $2T_s < T_c$ but with $2T_s \approx T_c$ (e.g. $2.01\, T_s = T_c$).$^{7}$ Initially, it looks as if the interpolator did not have to interpolate any sample and the variable decimator were decimating by 2 (taking 1 sample out of 2). But after 100 symbols some drift has accumulated: 100 symbols span 201 samples. Because of the accumulated spare sample, at this moment we are no longer sampling exactly at the maximum of the MF, but just midway between two maxima. To fix this error, we need the interpolator and the variable decimator to do the following:

1. Interpolator: it must delay every input sample by 1/200th of a sample. Doing so, it looks as if $T_c/T_s$ were an integer (2, in the example).
2. Decimator: after 200 samples, the interpolator has delayed the signal by one sample (half a symbol) and it cannot delay the signal any further. It is then that the decimator has to discard that sample and start decimating by 2 again from the next sample on. Also, the interpolator has to be reset to start over the "incremental interpolation" from 0 delay to 1/2 symbol (one sample) delay.

The advantage of (d) over (c) is that it simplifies the hardware implementation of the matched filter, since in that case output samples are not needed at every clock, only at multiples of it, which allows the use of slower (simpler) matched filter implementations. They are usually Finite Impulse Response (FIR) filters.
3.3.1.2
Analysis of Interpolation/Decimation for Timing Recovery
Let us now formally analyze how the evolution from the fully analog to the fully digital implementations in Figure 3.4 is done. At every step we will try to interpret the equations as if they were to be implemented in hardware. Our aim in the chip detection stage of the receiver is to end up with a stream of chips at the output of the MF, so we are trying to get the samples $\tilde{r}(nT_c + \hat{\varepsilon} T_c)$ as a function of only $T_s$. We may rewrite this expression as

$$
\begin{aligned}
\tilde{r}(nT_c + \hat{\varepsilon} T_c) &= \tilde{r}(t)\big|_{t = nT_c + \hat{\varepsilon} T_c} = \tilde{y}(t) * \tilde{g}_{MF}(t)\big|_{t = nT_c + \hat{\varepsilon} T_c} \\
&= \int_{-\infty}^{\infty} \tilde{y}(\tau)\, g_{MF}(t - \tau)\, d\tau\, \bigg|_{t = nT_c + \hat{\varepsilon} T_c} && (3.2) \\
&= T_s \sum_{k=-\infty}^{\infty} \tilde{y}(kT_s)\, g_{MF}(nT_c + \hat{\varepsilon} T_c - kT_s) && (3.3)
\end{aligned}
$$

where (3.3) is obtained by evaluating the integrand at $\tau = kT_s$.

$^{7}$ This example assumes that $T_c/T_s$ is rational, without loss of generality and only for illustrative purposes.
Equations (3.2) and (3.3) can be depicted as in Figure 3.4, where it can be seen that they correspond exactly to the fully analog and hybrid timing recovery implementations. By the Nyquist Sampling Theorem we know that

$$x(t') = \sum_{i=-\infty}^{\infty} x(iT_s)\,\operatorname{sinc}\!\left(\frac{\pi}{T_s}\,(t' - iT_s)\right) \tag{3.4}$$

$$x(t+\tau) = \sum_{i=-\infty}^{\infty} x(iT_s+\tau)\,\operatorname{sinc}\!\left(\frac{\pi}{T_s}\,(t - iT_s)\right) \tag{3.5}$$

And with a very simple change of variable

$$x(t+\tau) = \sum_{i=-\infty}^{\infty} x(iT_s)\,\operatorname{sinc}\!\left(\frac{\pi}{T_s}\,(t + \tau - iT_s)\right) \tag{3.6}$$
Equation (3.6) is interesting since it shows that any delayed version of a signal is perfectly represented by a sampled version of the signal filtered with a proper interpolator $h_I(kT_s, \tau)$. Equations (3.5) and (3.6) have the same output, but the underlying ideas for their practical implementation are quite different. These equations can be depicted graphically as in Figure 3.5: while the first requires analog control over the ADC to choose the sampling instant, the second can generate any delayed version of the output digitally with a filter $h_I(kT_s, \tau)$.

Now we can write the MF as a sampled version of the MF interpolated by a filter. The MF in equation (3.3) is sampled at $kT_s$ and the output delayed to $nT_c + \hat{\varepsilon}T_c$, so

$$
\begin{aligned}
g_{MF}(nT_c + \hat{\varepsilon}T_c - kT_s) &= g_{MF}(-(t-\tau))\big|_{t=kT_s,\ \tau=nT_c+\hat{\varepsilon}T_c} && \text{(3.7)} \\
&= \sum_{i=-\infty}^{\infty} g_{MF}(iT_s)\,\operatorname{sinc}\!\left(\frac{\pi}{T_s}\,(nT_c + \hat{\varepsilon}T_c - kT_s - iT_s)\right) && \text{(3.8)}
\end{aligned}
$$
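The digital delay of equation (3.6) can be illustrated with a short numerical sketch. It is only an illustration under stated assumptions: the test tone, the number of samples and the names `sinc`, `delayed_sample`, `TS` and `F0` are hypothetical, and the infinite sum is truncated to the available samples, so the result is approximate.

```python
import math


def sinc(x: float) -> float:
    """sin(x)/x, the convention used in equations (3.4)-(3.6)."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def delayed_sample(samples, ts, t, tau):
    """Approximate x(t + tau) from x(i*Ts), i = 0..len(samples)-1,
    by truncating the sum of equation (3.6) to the available samples."""
    return sum(x_i * sinc(math.pi / ts * (t + tau - i * ts))
               for i, x_i in enumerate(samples))


# Hypothetical test signal: a tone at 0.1 cycles per sample (inside the Nyquist band).
TS = 1.0
F0 = 0.1 / TS
x = [math.cos(2.0 * math.pi * F0 * i * TS) for i in range(2001)]

t_mid = 1000 * TS   # a sampling instant far from both edges of the record
tau = 0.3 * TS      # desired fractional delay
estimate = delayed_sample(x, TS, t_mid, tau)
exact = math.cos(2.0 * math.pi * F0 * (t_mid + tau))
print(abs(estimate - exact))  # small truncation error
```

A practical interpolator $h_I(kT_s, \tau)$ would use a short FIR approximation instead of this long truncated sinc; the sketch only shows that the delayed value is recoverable from samples taken at $kT_s$.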
[Figure 3.5: Generating shifted signals by (a) Analog delayed sampling, (b) Digital interpolation. In (a), $x(t)$ is sampled at $t = kT_s + \tau$ to give $x(kT_s + \tau)$; in (b), $x(t)$ is sampled at $t = kT_s$ and the samples $x(kT_s)$ pass through the interpolator $h_I(kT_s, \tau)$ to give $x(kT_s + \tau)$.]
Since our intention is to leave the last expression as a function of $T_s$ only, we need to express $nT_c + \hat{\varepsilon}T_c$ in terms of the sampling period $T_s$. We mentioned before that the ratio $T_c/T_s$ is irrational, so no simplification is possible. But we can write it another way:

$$
\begin{aligned}
nT_c + \hat{\varepsilon}T_c &= T_s\left[\operatorname{int}\!\left(\frac{nT_c}{T_s} + \frac{\hat{\varepsilon}T_c}{T_s}\right) + \mu_n\right] && \text{(3.9)} \\
&= T_s\,[m_n + \mu_n] && \text{(3.10)}
\end{aligned}
$$

where $m_n$ and $\mu_n$ are the integer and fractional parts, in units of the sampling period, of the time instant of the $n$-th chip, i.e. the instant that would be obtained by sampling at $nT_c + \hat{\varepsilon}T_c$. It is of crucial importance to note that both the integer and the fractional part are time-variant, i.e. their values depend on $n$. These two values have to be estimated by the TED so that they can be compensated for. They could be calculated beforehand only if $T_s$ and $T_c$ were known with infinite precision and did not vary at all over time. This is obviously impossible in practice, so they have to be estimated on the fly. Now the MF can be expressed in terms of $T_s$ only:

$$g_{MF}(nT_c + \hat{\varepsilon}T_c - kT_s) = g_{MF}([m_n + \mu_n]\,T_s - kT_s) \tag{3.11}$$
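As an illustration of equations (3.9) and (3.10), the following sketch computes the pair $(m_n, \mu_n)$ for the earlier example with $T_c = 2.01\,T_s$. It assumes that the ratio and the timing estimate are known exactly, which the text explains is not the case in a real receiver; the function name `strobe_schedule` and its arguments are hypothetical. Exact rational arithmetic is used so that the integer part is not disturbed by floating-point rounding.

```python
import math
from fractions import Fraction


def strobe_schedule(tc_over_ts, eps_hat, num_chips):
    """Return [(m_n, mu_n)] for n = 0..num_chips-1, following (3.9)-(3.10).

    tc_over_ts and eps_hat are Fractions (assumed perfectly known here).
    """
    schedule = []
    for n in range(num_chips):
        t = (n + eps_hat) * tc_over_ts   # (n*Tc + eps_hat*Tc) / Ts
        m_n = math.floor(t)              # integer part: basepoint sample index
        mu_n = t - m_n                   # fractional part, 0 <= mu_n < 1
        schedule.append((m_n, mu_n))
    return schedule


sched = strobe_schedule(Fraction(201, 100), Fraction(0), 201)
# Step of the variable decimator between consecutive chips.
steps = [sched[n + 1][0] - sched[n][0] for n in range(200)]
print(sched[50], sched[100], steps.count(3))
```

In this idealized run the decimator advances by 2 samples almost always, $\mu_n$ grows by 1/100 of a sample per chip, and once every 100 chips the step becomes 3 samples while $\mu_n$ wraps back to zero, which is the "discard one sample and reset the interpolator" behaviour described in the example.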
Finally, substituting into the previous equations we can see that the implementation is now fully digital, and that it allows the two possible digital interpretations in Figure 3.4:

$$
\begin{aligned}
\tilde{r}(nT_c + \hat{\varepsilon}T_c) &= \tilde{r}(m_n T_s + \mu_n T_s) \\
&= T_s \sum_{k=-\infty}^{\infty} y(kT_s)\, g_{MF}([m_n + \mu_n]\,T_s - kT_s) && \text{(3.12)} \\
&= T_s \sum_{k=-\infty}^{\infty} y(kT_s + \mu_n T_s)\, g_{MF}(m_n T_s - kT_s) && \text{(3.13)}
\end{aligned}
$$
If we carefully examine these equations we will see that they suggest different implementations:

• Equation (3.12) suggests that the interpolator should be placed after the MF, to make its impulse response appear slightly delayed (by $\mu_n T_s$), and that its output should then be sampled again at the integer sample instant $m_n T_s$ (see Figure 3.4c).
• Equation (3.13) instead suggests that the incoming $y(t)$ should be sampled at instants $kT_s + \mu_n T_s$, but we have already seen in Figure 3.5 that the result is the same if we sample at $kT_s$ and place the interpolator immediately afterwards (see Figure 3.4d).

High Sampling Rate Simplification ($T_s \ll T_c$)

In the Timing Recovery subsystem the function of the interpolator is, as we have seen, to delay the input signal to a convenient point so that, when sampling later at integer sampling instants, we achieve the same result as if we had sampled at integer symbol instants. Receivers sometimes use a sampling rate much higher than twice the bandwidth of the signal, especially nowadays that high-speed, low-cost ADCs are widely available. In this case, the interpolator may be redundant. The reason is that if, for instance, we are sampling one symbol with 10 samples ($10T_s = T_c$) and we need to delay the signal by a fraction of a sample, simply discarding one sample at the variable decimator is equivalent to having the interpolator delay the signal by 1/10th of a symbol. This step may look too coarse, but we may check its impact on the SNR loss of our system and decide whether it is negligible or not. Very probably the associated loss will be negligible in comparison to other losses in the system, and the hardware will be advantageously simpler.
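To get a feeling for how coarse the decimator-only timing correction is, the sketch below estimates the worst-case amplitude loss. It rests on two assumptions that are not taken from the text: the chip pulse is rectangular, so that the matched filter output is a triangle of width $2T_c$ and the amplitude at a timing error of $e$ chips is $1 - |e|$; and the decimator always picks the sample nearest to the peak, so the residual error is at most half a sample. The function name is hypothetical.

```python
import math


def worst_case_timing_loss_db(samples_per_chip: int) -> float:
    """Worst-case loss (dB) when timing is corrected only by dropping samples.

    Assumes a rectangular chip pulse (triangular matched filter output) and a
    residual timing error of at most half a sample.
    """
    worst_error_chips = 0.5 / samples_per_chip
    return 20.0 * math.log10(1.0 - worst_error_chips)


for n_sc in (2, 4, 10, 20):
    print(n_sc, round(worst_case_timing_loss_db(n_sc), 2))
```

Under these assumptions the loss at 10 samples per chip is below half a dB, whereas at 2 samples per chip it is about 2.5 dB. Band-limited pulses would give different figures, so this only indicates the order of magnitude.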
3.3.2
Signal Model
With the equivalences seen in the previous section we have shown that the receivers in Figures 3.6(a) and 3.6(b) are equivalent. It has also been shown that samples at $nT_c + \varepsilon T_c$ can be extracted from samples at $kT_s$, so for the analytical study we can use the second scheme, where the ADC samples at instants that already include the timing error compensation, since it is conceptually simpler and the same information is obtained from both. In this section we describe the signal model used in the study of the DS/SS receiver. The received signal can be expressed as

$$y(t) = s(t) * h_c(t) + w(t) \tag{3.14}$$

where $h_c(t)$ is the channel impulse response and $w(t)$ is the AWGN introduced by the channel. The signal sent by the TX, $s(t)$, is

$$s(t) = \sqrt{2}\,\operatorname{Re}\!\left\{\tilde{s}(t)\,e^{j\omega_o t}\right\} \tag{3.15}$$
and $\tilde{s}(t)$ is the complex envelope of $s(t)$, which can be expressed as⁸

$$\tilde{s}(t) \triangleq \sum_{i=-\infty}^{\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_T(t - iT_c) \tag{3.16}$$
where $d \in \{\pm 1\}$ are the data bits, $c \in \{\pm 1\}$ is the spreading sequence, $g_T(t)$ is the pulse shape used in transmission, $P$ is the period of the spreading sequence, and $Q = m_1 P$ ($m_1$ integer) is the processing gain, i.e. the number of chips in every data symbol. In the sequel $Q$ may also be denoted by the constant $N_{cs}$ (number of chips per symbol). When $m_1 = 1$ the processing gain coincides with the factor by which the spectrum has been spread. To obtain large processing gains without making code acquisition too burdensome, this constant is chosen greater than one, $m_1 > 1$. In GPS applications $m_1 = 20$, since every data symbol spans 20 PN periods, and $P = 1023$. The operation $\lfloor i/Q \rfloor = \operatorname{int}(i/Q)$ denotes the integer part of the division, and the operation $|i|_P = i \bmod P$ is the modulo-$P$ operation on $i$.

We assume that the channel adds an unknown delay $\tau$ to the received signal (timing error) that needs to be estimated by a timing error detector⁹ (TED), which we assume to be working correctly. We also assume that the channel adds a phase component $\theta(t)$ consisting of a random initial phase and a Doppler carrier frequency shift:

$$\theta(t) = 2\pi f_d t + \varphi = \omega_d t + \varphi$$

Thus, the baseband equivalent received signal $\tilde{y}(t)$ at the output of the I/Q demodulator is

$$\tilde{y}(t) = \tilde{s}(t - \tau)\,e^{j\theta(t)} + \tilde{w}'(t) \tag{3.17}$$

where $\tau = \varepsilon T_c$ and $\varepsilon < 1$, because for the chip detection system only errors of no more than one chip are of interest when the receiver is in tracking mode. In the following sections, when acquiring the code, the span of the timing error will be considered to be one PN period ($P$ chips).

3.3.2.1 Chiprate Doppler approximation
There is a small detail worth noting about the Doppler effect. Strictly speaking, we cannot write the received signal component as the signal $\tilde{s}(t)$ modulated by a frequency $f_d$, since doing so ignores the compression/expansion suffered by $\tilde{s}(t)$. Rigorously, we should write the signal component as

$$\tilde{s}(t) \triangleq e^{j\theta(t)} \sum_{i=-\infty}^{\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_T\!\left(t - \tau - iT_c\left(1 - \frac{\Delta r}{r_c}\right)\right) \tag{3.18}$$

where the relative chiprate Doppler $\Delta r / r_c$ is explicitly accounted for as an expansion/compression of the pulses. This effect should be seriously considered when designing receivers meant to work in high-dynamics environments, but since the maximum Doppler considered in this project is not very high, this term will be neglected in the sequel. With this approximation, the Doppler effect is modeled only as a modulation of the signal by $f_d$ Hz.

⁸ For simplicity, we will work with the baseband equivalents of the signals, whose notation carries a tilde (∼) on top. Note also that the baseband equivalence definition used here carries a factor $\sqrt{2}$ and thus conserves signal power.
⁹ In DS/SS communications, the most popular TED is the DLL.

3.3.2.2 Input SNR and C/N0 definition
The power spectral density of the complex Gaussian noise $\tilde{w}'(t)$ is $S_{\tilde{w}}(f)$, the same as that of $\tilde{w}(t)$, the noise component after the phase correction block if this operation were done in the analog domain. It is known that $S_{\tilde{w}}(f)$ is the Fourier transform of the autocorrelation function $R_{\tilde{w}}(\tau)$:

$$S_{\tilde{w}}(f) = \mathcal{F}\{R_{\tilde{w}}(\tau)\} \tag{3.19}$$

The correlation function can be written as $R_{\tilde{w}}(\tau) = E\{\tilde{w}(t)\,\tilde{w}^*(t+\tau)\}$, and the complex baseband equivalent of $w(t)$ can be expressed as $\tilde{w}(t) = i_{\tilde{w}}(t) + j\,q_{\tilde{w}}(t)$, where $i_{\tilde{w}}(t)$ and $q_{\tilde{w}}(t)$ are its in-phase and quadrature components, respectively. They are independent Gaussian processes with zero mean and power spectral density $N_0/2$ in the band $|f| < W/2$. Simplifying the previous expression, we reach $R_{\tilde{w}}(\tau) = R_{i_{\tilde{w}}}(\tau) + R_{q_{\tilde{w}}}(\tau)$, and $S_{\tilde{w}}(f)$ is therefore

$$
\begin{aligned}
S_{\tilde{w}}(f) &= \mathcal{F}\{R_{i_{\tilde{w}}}(\tau)\} + \mathcal{F}\{R_{q_{\tilde{w}}}(\tau)\} = S_{i_{\tilde{w}}}(f) + S_{q_{\tilde{w}}}(f) \\
&= 2\,S_{i_{\tilde{w}}}(f) \\
&= N_0 \qquad \forall\, |f| < W/2
\end{aligned}
\tag{3.20}
$$
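The mean power of the received signal $\tilde{s}(t)$ is

$$
\begin{aligned}
P_{\tilde{s}} = R_{\tilde{s}}(0) &= \lim_{T \to +\infty} \frac{1}{T} \int_0^T E\left\{|\tilde{s}(t)|^2\right\} dt \\
&= \lim_{T \to +\infty} \frac{1}{T} \int_0^T \sum_{i=-\infty}^{+\infty} |g_T(t - iT_c)|^2\, dt \\
&= \lim_{N_c \to +\infty} \frac{1}{N_c T_c} \sum_{i=-\infty}^{+\infty} \int_{-iT_c}^{-iT_c + N_c T_c} |g_T(s)|^2\, ds \\
&= \lim_{N_c \to +\infty} \frac{1}{N_c T_c}\, N_c \int_{-\infty}^{+\infty} |g_T(s)|^2\, ds \\
&= E_c / T_c = E_c\, r_c
\end{aligned}
\tag{3.21}
$$

where $E_c = \int_{-\infty}^{+\infty} g_T^2(s)\, ds$ is the energy of the transmitted shaping pulse and $r_c = 1/T_c$ is the chiprate of the spreading signal.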
[Figure 3.6: Equivalent DS/SS Receivers in tracking mode. Panels: (a) Complete Receiver, (b) Simplified Receiver. Blocks shown: A/D Sampler & Quantizer, Phase & Frequency Correction (from PLL/NCO), Interpolator $h_I(kT_s, \mu)$ (from Timing Estimator, DLL), Chip Matched Filter $g_R(kT_s)$, Variable Decimator, despreading by $c_{|n|_P}$, I&D $h_{MA}(nT_c)$, symbol decision $\hat{d}_l$.]
The noise power is then

$$P_{\tilde{w}} = \int_{-\infty}^{+\infty} S_{\tilde{w}}(f)\, df = N_0 W \tag{3.22}$$

and at this point the SNR is

$$\mathrm{SNR} = \frac{E_c\, r_c}{N_0 W} \tag{3.23}$$
We will see how despreading improves the SNR at later stages of the RX chain: in the despreading process the noise is filtered by much narrower filters, whereas the power of the signal component remains unchanged along the RX as long as the operations are power conservative, hence improving the signal-to-noise ratio. Thus, in DS/SS systems the raw input SNR, or the chip-to-noise ratio $E_c/N_0$, can be very low (on the order of −20 to −40 dB) as long as the spreading ratio is large enough to ensure that $E_b/N_0$ is high enough for reliable transmission.

These facts show that the quality of the system cannot be expressed with the SNR parameter alone. This is why another parameter is defined in DS/SS systems: $C/N_0$, or carrier-to-noise ratio. It is defined as

$$C/N_0 = W \cdot (\mathrm{SNR}) \quad [\mathrm{Hz}] \tag{3.24}$$

and has units of Hz (or dBHz if expressed in logarithmic scale). At any $k$-th stage in the receiver¹⁰ the SNR can be calculated inversely:

$$\mathrm{SNR}_k = \frac{C/N_0}{W_k} \tag{3.25}$$

where $W_k$ is the output bandwidth at the $k$-th stage. From the definition of $C/N_0$ we see that the ratio is independent of the RF filter bandwidth, which is always related to the chiprate parameter $r_c$, so we now have a bandwidth-normalized quality parameter.

3.3.2.3 A/D conversion
Sampling at instants $kT_s + \hat{\varepsilon}T_c$ yields

$$\tilde{y}'(kT_s + \hat{\varepsilon}T_c) = \tilde{s}(kT_s - \epsilon T_c) + \tilde{w}_{y'}(kT_s) \tag{3.26}$$

where $\epsilon = \varepsilon - \hat{\varepsilon}$ is the residual timing error. Because the DS/SS receiver is now considered to be synchronized, we assume that $\theta(kT_s)$ has been corrected by the PLL/NCO pair and that the DLL is providing an approximation $\hat{\varepsilon}$ of the fractional timing error $\varepsilon$ for the simplified RX (Fig. 3.6(b)), in the form of the pair $m_n$ and $\mu_n$ for the complete RX (Fig. 3.6(a)). These assumptions allow us to regard the ADC as a process synchronized to the timing recovery subsystem, which therefore takes samples at variable strobe instants. This allows us to study the synchronized receiver as if the ratio $T_c/T_s$ were an integer, although we know well that this is not a realistic assumption. To keep the notation simple, the noise component has been written without the $\hat{\varepsilon}T_c$ shift, because its statistical properties remain the same.

¹⁰ As long as the operations performed up to that point are linear.

Combining equations (3.26) and (3.16) we obtain a detailed description of the phase-corrected baseband equivalent of the received signal for the RX in tracking mode:

$$\tilde{y}'(kT_s + \hat{\varepsilon}T_c) = \sum_{i=-\infty}^{\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_T(kT_s - \epsilon T_c - iT_c) + \tilde{w}_{y'}(kT_s) \tag{3.27}$$
In the ADC stage we are concerned with discrete samples of analog signals, so it is important to note that, from this stage on, a sampled signal expressed for instance as $y(kT_s)$ could also be written as $y[k]$. In this project the former is preferred since, while stating that it refers to a discrete sequence, it also gives an idea of the underlying sampling process. The noise can be completely described by its statistical properties. The power spectral density of the discrete noise sequence is denoted $S_{\tilde{w}_{y'}}\!\left(e^{j2\pi f}\right)$ and is related to the analog noise power spectral density by

$$S_{\tilde{w}_{y'}}\!\left(e^{j2\pi f}\right) = \frac{1}{T_s} \sum_{k=-\infty}^{+\infty} S_{\tilde{w}}\!\left(\frac{f - k}{T_s}\right) = N_0 f_s \qquad \forall\, |f| \le \frac{W/2}{f_s} \tag{3.28}$$
The noise spectrum is completely flat in the particular case in which the sampling rate $f_s$ equals the front-end receiver bandwidth $W$. Otherwise, it is flat only in a narrower bandwidth. If sampling has been done at least at the Nyquist rate, there should be no difference between the signal and noise powers of the discrete sequence and those of their analog counterparts. We can double-check this by computing the power of the signal component of $\tilde{y}'(kT_s)$, $R_{\tilde{y}'_s}(0)$, and the noise power $P_{\tilde{w}_{y'}}$ in the discrete case. For the discrete case,

$$R_{\tilde{y}'_s}(0) = \lim_{N \to +\infty} \frac{1}{2N+1} \sum_{k=-N}^{N} E\left\{\left|\tilde{y}'_s(kT_s)\right|^2\right\} \tag{3.29}$$
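and the expectation can be calculated separately:

$$
\begin{aligned}
E\left\{\left|\tilde{y}'_s(kT_s)\right|^2\right\} &= E\left\{\sum_{m=-\infty}^{+\infty} \sum_{i=-\infty}^{+\infty} d_{\lfloor i/Q \rfloor}\, d^*_{\lfloor m/Q \rfloor}\, c_i\, c^*_m\, g_T(kT_s - iT_c)\, g_T^*(kT_s - mT_c)\right\} \\
&= \sum_{i=-\infty}^{\infty} |g_T(kT_s - iT_c)|^2
\end{aligned}
\tag{3.30}
$$

and performing some changes of variable we reach

$$R_{\tilde{y}'_s}(0) = \frac{1}{N_{sc}} \sum_{k=-\infty}^{+\infty} |g_T(kT_s)|^2 = \frac{E_c f_s}{N_{sc}} = E_c\, r_c \tag{3.31}$$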
where we have used the result of the theorem of equivalence between analog and discrete-time signals:¹¹

$$T_s \sum_{k=-\infty}^{+\infty} |g_T(kT_s)|^2 = \int_{-\infty}^{+\infty} |g_T(t)|^2\, dt \tag{3.32}$$

and, as defined previously, the right-hand side of the last equation is the energy of the transmitted chip, $E_c$. The noise power after the ADC is

$$P_{\tilde{w}_{y'}} = \int_{-1/2}^{+1/2} S_{\tilde{w}_{y'}}\!\left(e^{j2\pi f}\right) df = (N_0 f_s)\,(W/f_s) = N_0 W \tag{3.33}$$

Thus, the SNR after the ADC is the same as the one obtained in the analog domain:

$$\mathrm{SNR}_{y'} = \frac{E_c\, r_c}{N_0 W} \tag{3.34}$$

For the typical case where $f_s = W$ the ratio can be written as

$$\mathrm{SNR}_{y'} = \frac{E_c}{N_0 N_{sc}} \tag{3.35}$$

¹¹ Bandlimited signals sampled at least at the Nyquist rate.

3.3.3 Chip Detection

After the A/D conversion, the next stage in the receiver chain is the chip detection block, which consists of a chip matched filter (CMF) followed by a decimator. Note that the implementation of this block would change slightly in a real design, as in Figure 3.6(a), where there is an interpolator before the CMF and the decimator is variable.
3.3.3.1 Chip Matched Filter
After convolution with the incoming signal, the CMF is expected to form a global pulse response such that, after decimation, no output sample contains interference from previous chips (or symbols, in general). This is called an ISI-free pulse, and it is designed according to Nyquist’s rules of pulse design. The filter $g_R(kT_s)$ matched to the transmitted pulse $g_T(kT_s)$ is designed so that the global response is a Nyquist pulse:

$$g_N(kT_s) = g_T(kT_s) * g_R(kT_s) \tag{3.36}$$

For convenience and without loss of generality we choose the CMF $g_R(kT_s)$ to have unit energy in the discrete-time domain:

$$\sum_{k=-\infty}^{+\infty} |g_R(kT_s)|^2 = 1 \tag{3.37}$$

Since $g_R(kT_s)$ has unit energy and is matched to $g_T(kT_s)$, it can be expressed as

$$g_R(kT_s) = \frac{1}{\sqrt{E_c f_s}}\, g_T^*(-kT_s) \tag{3.38}$$

and therefore the Nyquist pulse $g_N(kT_s)$ has the following forms in the time and frequency domains, respectively:

$$
\begin{aligned}
g_N(kT_s) &= \frac{1}{\sqrt{E_c f_s}}\, g_T(kT_s) * g_T^*(-kT_s) && \text{(3.39)} \\
&= \sqrt{E_c f_s}\; g_R(kT_s) * g_R^*(-kT_s) && \text{(3.40)} \\
G_N\!\left(e^{j2\pi f}\right) &= \frac{1}{\sqrt{E_c f_s}} \left|G_T\!\left(e^{j2\pi f}\right)\right|^2 && \text{(3.41)} \\
&= \sqrt{E_c f_s}\, \left|G_R\!\left(e^{j2\pi f}\right)\right|^2 && \text{(3.42)}
\end{aligned}
$$
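The normalization in equations (3.37) and (3.38) can be checked numerically with a small sketch. The pulse samples, the sampling period and all names below are hypothetical, and $E_c$ is computed from the samples through the discrete sum $T_s \sum_k |g_T(kT_s)|^2$, which is taken here as its definition.

```python
import math


def convolve(a, b):
    """Full discrete convolution of two finite real sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


TS = 0.25                      # sampling period (hypothetical), so Tc = 4*Ts = 1
FS = 1.0 / TS
g_t = [0.5, 1.0, 1.0, 0.5]     # hypothetical real transmit pulse, one chip long

e_c = TS * sum(v * v for v in g_t)          # chip energy from the samples
scale = math.sqrt(e_c * FS)
g_r = [v / scale for v in reversed(g_t)]    # (3.38); conjugation is a no-op for a real pulse

g_n = convolve(g_t, g_r)                    # global pulse of (3.36)
energy_g_r = sum(v * v for v in g_r)        # should be 1, as required by (3.37)
peak = max(g_n)                             # should equal sqrt(Ec*fs)
print(energy_g_r, peak, scale)
```

For this example the peak of $g_N$ equals $\sqrt{E_c f_s}$, which is the value of $g_N(0)$ used later when the signal power after decimation is computed.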
At the output of the CMF we have

$$\tilde{r}(kT_s) = \tilde{r}_s(kT_s) + \tilde{r}_n(kT_s) = \tilde{y}'(kT_s) * g_R(kT_s) \tag{3.43}$$

where $\tilde{r}_s(kT_s) = \tilde{s}(kT_s) * g_R(kT_s)$ and $\tilde{r}_n(kT_s) = \tilde{w}_{y'}(kT_s) * g_R(kT_s)$. After the CMF stage and before decimation, the noise has been coloured by the filter in the following way:

$$S_{\tilde{r}_n}\!\left(e^{j2\pi f}\right) = S_{\tilde{w}_{y'}}\!\left(e^{j2\pi f}\right) \left|G_R\!\left(e^{j2\pi f}\right)\right|^2 \tag{3.44}$$

A remark must be made about the notation of noise components. Throughout the theoretical analysis we will distinguish between white and coloured noises in the notation. The former will be denoted by the letter $n$ with a subindex indicating the stage they refer to, whereas the latter will be denoted by the letter of the stage they refer to with the subindex $n$, for noise. In this case, after the CMF the noise process is $\tilde{r}_n(kT_s)$, and after decimation, where we will see that the noise is white again, it will be dubbed $\tilde{n}_{r'}$.
3.3.3.2 Decimation by $N_{sc}$
Decimation of the filtered signal is the process by which the signal is sampled at its highest peak, free of interference from any other symbol as long as there is no timing error. Sampling at that peak is useful because it allows us to recover, with a single sample, the energy of a whole pulse containing one chip of the signal. Again, to simplify the notation, we define the decimated sequence as a time-shifted version of the original sequence:

$$\tilde{r}'(nT_c) = \tilde{r}'_s(nT_c) + \tilde{n}_{r'}(nT_c) = \tilde{r}(nT_c + \hat{\varepsilon}T_c) = \tilde{r}(kT_s + \hat{\varepsilon}T_c)\big|_{k = \frac{T_c}{T_s} n} \tag{3.45}$$

Note that the decimation factor is $N_{sc} = T_c/T_s$, i.e. the number of samples per chip. Substituting and developing the last equation, with $\epsilon = \varepsilon - \hat{\varepsilon}$ as in (3.26), yields

$$
\begin{aligned}
\tilde{r}'(nT_c) &= \sum_{i=-\infty}^{\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_N((n-i)\,T_c - \epsilon T_c) + \tilde{n}_{r'}(nT_c) \\
&= d_{\lfloor n/Q \rfloor}\, c_{|n|_P}\, g_N(\epsilon T_c) + \sum_{i \ne n} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_N((n-i)\,T_c - \epsilon T_c) + \tilde{n}_{r'}(nT_c)
\end{aligned}
\tag{3.46}
$$

where we define the signal term and the ISI term

$$
\begin{aligned}
\tilde{r}'_s(nT_c) &= d_{\lfloor n/Q \rfloor}\, c_{|n|_P}\, g_N(\epsilon T_c) \\
\tilde{r}'_{ISI}(nT_c) &= \sum_{i \ne n} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_N((n-i)\,T_c - \epsilon T_c)
\end{aligned}
\tag{3.47}
$$
In this project, the ISI has been considered negligible because the scope of the project was to develop a parameterized code acquisition module assuming Gaussian interference only. In the sequel, the ISI term will be carried from stage to stage as a reminder of where it is and of the effect it can have on our system, but it will eventually be neglected when calculating signal-to-noise ratios or other parameters of interest. After decimation it makes sense to calculate the SNR again because now, as long as synchronization has been achieved properly, we expect each sample to carry the content of one chip without interference from previous chips. The power of the signal component after decimation is

$$R_{\tilde{r}'_s}(0) = E\left\{\left|\tilde{r}'_s(nT_c)\right|^2\right\} \tag{3.48}$$

which can be developed as

$$
\begin{aligned}
E\left\{\left|\tilde{r}'_s(nT_c)\right|^2\right\} &= E\left\{d_{\lfloor n/Q \rfloor}\, d^*_{\lfloor n/Q \rfloor}\, c_{|n|_P}\, c^*_{|n|_P}\, g_N(\epsilon T_c)\, g_N^*(\epsilon T_c)\right\} \\
&= |g_N(\epsilon T_c)|^2 \approx |g_N(0)|^2
\end{aligned}
\tag{3.49}
$$
where in the last step we have considered $\epsilon \approx 0$, since the receiver is in tracking mode. So, using eq. (3.32) and noting that

$$g_N(0) = \frac{1}{\sqrt{E_c f_s}} \sum_{\lambda=-\infty}^{+\infty} |g_T(\lambda T_s)|^2 = \sqrt{E_c f_s} \tag{3.50}$$

we conclude that

$$R_{\tilde{r}'_s}(0) = E_c f_s \tag{3.51}$$
Regarding the noise, after decimation the power spectral density of the discrete noise sequence suffers from aliasing:

$$
\begin{aligned}
S_{\tilde{n}_{r'}}\!\left(e^{j2\pi f}\right) &= \frac{1}{N_{sc}} \sum_{i=0}^{N_{sc}-1} S_{\tilde{r}_n}\!\left(e^{j2\pi \frac{f-i}{N_{sc}}}\right) \\
&= \frac{N_0 f_s}{N_{sc}} \sum_{i=0}^{N_{sc}-1} \left|G_R\!\left(e^{j2\pi \frac{f-i}{N_{sc}}}\right)\right|^2
\end{aligned}
\tag{3.52}
$$

By eq. (3.42) we know that

$$\sum_{i=0}^{N_{sc}-1} \left|G_R\!\left(e^{j2\pi \frac{f-i}{N_{sc}}}\right)\right|^2 = \frac{1}{\sqrt{E_c f_s}} \sum_{i=0}^{N_{sc}-1} G_N\!\left(e^{j2\pi \frac{f-i}{N_{sc}}}\right) = N_{sc} \tag{3.53}$$

since, by design, $G_N\!\left(e^{j2\pi f}\right)$ is a Nyquist pulse, and all Nyquist pulses fulfill the condition of adding up to a constant in the frequency domain when copies of the spectrum are overlapped at the appropriate spacing. In our case, this condition entails

$$\frac{1}{N_{sc}} \sum_{i=0}^{N_{sc}-1} G_N\!\left(e^{j2\pi \frac{f-i}{N_{sc}}}\right) = \sqrt{E_c f_s} \tag{3.54}$$

The last assertion can be easily understood if we consider that, after sampling, the desired ISI-free response must be a delta and, as is commonly known, the Fourier transform of a delta is a flat spectrum. Hence,

$$S_{\tilde{n}_{r'}}\!\left(e^{j2\pi f}\right) = N_0 f_s \qquad \forall\, 0 \le f < 1 \tag{3.55}$$

and the total noise power is

$$P_{\tilde{n}_{r'}} = N_0 f_s \tag{3.56}$$

Finally, the SNR at the output of the CMF is

$$\mathrm{SNR}_{\tilde{r}'} = \frac{E_c}{N_0} \tag{3.57}$$
Note that, up to this point, the system behaves exactly as a generic BPSK system would, where we send binary chips instead of binary symbols. Hence, the equivalent of the quality parameter Eb /N0 in the former system here becomes Ec /N0 . This parameter refers to the quality with which chips are detected, but we eventually expect the data symbols and not the chips to be recovered successfully. Later we will see how the final Ec /N0 can be set as low as desired keeping at the same time Eb /N0 constant, changing only the processing gain parameter.
3.3.4 Symbol Detection

3.3.4.1 Despreading
By despreading we mean undoing the process that was performed in the TX when the narrowband information signal was multiplied by a broadband (or spreading) signal. Despreading is achieved by multiplying by the complex conjugate of the spreading sequence so that the two sequences cancel each other. However, we will use only real spreading sequences, so despreading becomes an aligned multiplication of the input signal by the same spreading sequence $c_{|n|_P}$. At the output of the multiplier we find

$$
\begin{aligned}
\tilde{x}(nT_c) &= \tilde{r}'(nT_c)\, c^*_{|n|_P} && \text{(3.58)} \\
&= d_{\lfloor n/N_{cs} \rfloor}\, g_N(\epsilon T_c) + \tilde{x}_{ISI}(nT_c) + \tilde{n}_x(nT_c) && \text{(3.59)}
\end{aligned}
$$

where we can see that, at this point, we expect $Q$ identical samples scaled by the effect of the residual timing error $\epsilon$, plus interference (ISI and noise). If we look carefully at $\tilde{x}(nT_c)$ we can see that it is the same result we would have obtained if it were the oversampled sequence of the analog signal bearing only the narrowband information, i.e.

$$\tilde{x}(nT_c) = \tilde{x}(t)\big|_{t=nT_c} \tag{3.60}$$

and the signal component $\tilde{x}_s(t)$ can be expressed as

$$\tilde{x}_s(t) = \sum_{l=-\infty}^{\infty} d_l\, g_N(\epsilon T_c)\, p(t - lT_b) \tag{3.61}$$

where we have defined $p(t)$ as a square pulse of unit amplitude and duration $T_b$ seconds:

$$p(t) = \Pi\!\left(\frac{t - T_b/2}{T_b}\right) \tag{3.62}$$

From the previous statements we may infer that, as long as the (de)spreading operations are properly carried out, the DS/SS operation is transparent to us. Actually, spreading a signal can be seen as a special form of modulation that makes the information signal believe it is going through an infinite-bandwidth channel, because $1/T_c \gg 1/T_b$. This is one of the reasons why symbols use square pulses whereas chips may or may not¹² use Nyquist pulses other than rectangular ones.

3.3.4.2 Integrate & Dump by $N_{cs}$

This block could also have been called Symbol Matched Filter (SMF), since the I&D operation behaves as a filter matched to a perfectly square pulse.

¹² In GPS, chips use almost perfectly rectangular pulses, whereas UMTS applications or the yet to be deployed Galileo Positioning System use a roll-off factor $\beta \ne 0$.
The integration operation is performed by a filter dubbed $h_{MA}(nT_c)$, where the subindex MA stands for moving average. “Moving average” is the name given to filters that integrate the incoming samples and have a parameter that controls the rate at which previous integrations are forgotten. In our case, the I&D does not forget any integration until the dump line is asserted, at which point the integrator is reset and all previous values are forgotten. After the integration we have $\tilde{z}(nT_c)$:

$$
\begin{aligned}
\tilde{z}(nT_c) &= \tilde{z}_s(nT_c) + \tilde{z}_{ISI}(nT_c) + \tilde{z}_n(nT_c) \\
&= \tilde{x}(nT_c) * h_{MA}(nT_c) \\
&= \frac{1}{N_{cs}} \sum_{i=0}^{N_{cs}-1} \tilde{x}(nT_c - iT_c)
\end{aligned}
\tag{3.63}
$$

and at this point the noise power spectral density is shaped by the MA filter:

$$S_{\tilde{z}_n}\!\left(e^{j2\pi f}\right) = S_{\tilde{n}_x}\!\left(e^{j2\pi f}\right) \left|H_{MA}\!\left(e^{j2\pi f}\right)\right|^2 \tag{3.64}$$

where

$$\left|H_{MA}\!\left(e^{j2\pi f}\right)\right|^2 = \frac{\sin^2(\pi f N_{cs})}{N_{cs}^2\, \sin^2(\pi f)} \tag{3.65}$$
Again, the noise has been coloured by a filter, but it will again be whitened by decimation. The final sequence in the symbol detection process, before the hard decision on the estimated data sequence, is $\tilde{z}(lT_b)$:

$$\tilde{z}(lT_b) = \tilde{z}(nT_c)\big|_{n = lN_{cs}} = \tilde{z}_s(lT_b) + \tilde{z}_{ISI}(lT_b) + \tilde{n}_z(lT_b) \tag{3.66}$$

The signal component has the form

$$
\begin{aligned}
\tilde{z}_s(lT_b) &= \frac{1}{N_{cs}} \sum_{i=0}^{N_{cs}-1} d_{\left\lfloor \frac{lN_{cs} - i}{N_{cs}} \right\rfloor}\, g_N(\epsilon T_c) \\
&= \frac{1}{N_{cs}} \sum_{i=0}^{N_{cs}-1} d_l\, g_N(\epsilon T_c) \\
&= d_l\, g_N(\epsilon T_c)
\end{aligned}
\tag{3.67}
$$

and the noise component becomes

$$
\begin{aligned}
S_{\tilde{n}_z}\!\left(e^{j2\pi f}\right) &= \frac{1}{N_{cs}} \sum_{i=0}^{N_{cs}-1} S_{\tilde{z}_n}\!\left(e^{j2\pi \frac{f-i}{N_{cs}}}\right) \\
&= \frac{N_0 f_s}{N_{cs}} \sum_{i=0}^{N_{cs}-1} \left|H_{MA}\!\left(e^{j2\pi \frac{f-i}{N_{cs}}}\right)\right|^2 \\
&= \frac{N_0 f_s}{N_{cs}} \qquad \forall\, 0 \le f < 1
\end{aligned}
\tag{3.68}
$$
where in the last step we have used the fact that $h_{MA}(nT_c)$ is an ISI-free pulse, so its folded spectrum is flat: after decimation we obtain a delta with no interference from other symbols. This can be seen easily in the time domain, since $h_{MA}(nT_c) * h_{MA}(nT_c)$ is a triangular pulse with a duration of $2N_{cs} - 1$ samples, so the sampled peak of amplitude $1/N_{cs}$ is surrounded by $N_{cs} - 1$ samples on each side. Thus, overlapping these triangular pulses every $N_{cs}$ samples gives a result in which there is no interference between any two contiguous symbols. The delta obtained by decimation can be expressed in the Fourier domain:

$$\frac{1}{N_{cs}} \sum_{i=0}^{N_{cs}-1} \left|H_{MA}\!\left(e^{j2\pi \frac{f-i}{N_{cs}}}\right)\right|^2 = \mathcal{F}\left\{\frac{1}{N_{cs}}\, \delta[l]\right\} = \frac{1}{N_{cs}} \qquad \forall\, 0 \le f < 1 \tag{3.69}$$
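Both the closed form (3.65) and the folding identity (3.69) can be verified numerically. The sketch below is only a check under the assumption that $h_{MA}$ has $N_{cs}$ taps equal to $1/N_{cs}$, as in (3.63); the function names and the chosen values of $f$ and $N_{cs}$ are hypothetical.

```python
import cmath
import math


def h_ma_sq_direct(f: float, n_cs: int) -> float:
    """|H_MA(e^{j2*pi*f})|^2 computed from the taps h[n] = 1/Ncs, n = 0..Ncs-1."""
    h = sum(cmath.exp(-2j * math.pi * f * n) for n in range(n_cs)) / n_cs
    return abs(h) ** 2


def h_ma_sq_closed(f: float, n_cs: int) -> float:
    """Closed form of equation (3.65); not defined at integer f."""
    return math.sin(math.pi * f * n_cs) ** 2 / (n_cs ** 2 * math.sin(math.pi * f) ** 2)


def folded_spectrum(f: float, n_cs: int) -> float:
    """Left-hand side of equation (3.69): aliased sum after decimation by Ncs."""
    return sum(h_ma_sq_direct((f - i) / n_cs, n_cs) for i in range(n_cs)) / n_cs


N_CS = 20
print(h_ma_sq_direct(0.3, N_CS), h_ma_sq_closed(0.3, N_CS))
print(folded_spectrum(0.3, N_CS), 1.0 / N_CS)
```

The folded spectrum does not depend on $f$, which is the frequency-domain statement that the decimated noise is white again.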
After the I&D, the SNR becomes

$$\mathrm{SNR}_{\tilde{z}} = \frac{P_{\tilde{z}_s}}{P_{\tilde{n}_z}} = \frac{E_c f_s}{N_0 f_s / N_{cs}} = \frac{E_c N_{cs}}{N_0} = \frac{E_b}{N_0} \tag{3.70}$$

which is the same SNR as the one we would expect in a BPSK modulation scheme.
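The despreading and integrate & dump chain described above can be sketched at chip rate as follows. This is a simplified illustration, not the receiver of this project: it assumes perfect synchronization, $g_N(\epsilon T_c) = 1$, no ISI, real signals, and a random ±1 sequence standing in for a real PN code. All names and parameter values are hypothetical.

```python
import random


def spread(bits, code, periods_per_bit):
    """Chip sequence d_{floor(i/Q)} * c_{i mod P}, with Q = periods_per_bit * P."""
    p = len(code)
    q = periods_per_bit * p
    return [bits[i // q] * code[i % p] for i in range(len(bits) * q)]


def despread_and_dump(chips, code, periods_per_bit):
    """Multiply by the aligned code and average over each block of Q chips."""
    p = len(code)
    q = periods_per_bit * p
    out = []
    for l in range(len(chips) // q):
        acc = sum(chips[i] * code[i % p] for i in range(l * q, (l + 1) * q))
        out.append(acc / q)     # dump: one value per data symbol
    return out


rng = random.Random(0)
code = [rng.choice((-1, 1)) for _ in range(31)]   # stand-in for a PN code, P = 31
bits = [1, -1, -1, 1]
tx = spread(bits, code, 2)                        # Q = 62 chips per symbol

noisy = [c + rng.gauss(0.0, 2.0) for c in tx]     # per-chip noise larger than the signal
z_clean = despread_and_dump(tx, code, 2)
z_noisy = despread_and_dump(noisy, code, 2)
print(z_clean)
print(z_noisy)
```

In the noiseless case the output reproduces the data symbols exactly. With noise, the averaging over $Q$ chips reduces the noise standard deviation by a factor $\sqrt{Q}$ while leaving the signal amplitude unchanged, which is the SNR improvement from $E_c/N_0$ to $E_b/N_0$ stated in (3.70).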
3.4
Description of the Code Acquisition System
In the previous section we have seen how the DS/SS receiver works when synchronized. In this section we address the main theoretical subject of this project: code acquisition in the synchronization process of a DS/SS communication system. Our intention is to design an all-digital DS/SS acquisition system and to improve its performance both by using a test statistic somewhat more complex [1],[2] than the one described in Section 3.1.1 and, later, by parallelizing the HS among several correlators. Figure 3.7 shows a block diagram of the acquisition system designed in this project. Note that the receiver chain changes from the “Correlator” blocks on; these changes are meant to perform the acquisition of the code.

As we said previously, the main purpose of the code acquisition system in a DS/SS receiver is to find the timing error at the input, i.e. to find the best $\hat{\tau}$ so that eventually $\varepsilon \approx 0$. We can consider the two sequences sufficiently aligned if, when the estimate is delivered to a Delay-Locked Loop (DLL), the DLL is able to start tracking. DLLs can usually track PN signals when the timing error is less than half a chip, so this coarse resolution will be the target precision of the whole code acquisition system. Also, once this estimate is found, a Phase-Locked Loop (PLL) is started in order to achieve carrier synchronization, i.e. to correct phase and frequency errors.¹³ To achieve this half-chip resolution, sampling the signal at twice the chiprate, $f_s = 2r_c$, is enough, and the Nyquist criterion is respected too. Thus, in the sequel we can assume $N_{sc} = 2$ samples per chip.

¹³ It can be demonstrated that a PLL seeking phase and frequency errors can be started under worse conditions than those sufficient for the DLL to converge.
[Figure 3.7: DS/SS Acquisition Module Block Diagram. Blocks shown: Chip Detection (Chip Matched Filter $g_R(kT_s)$ and decimation at $k = \frac{T_c}{T_s} n$), Correlation Estimator (Coherent Detection: multiplication by the PN sequence $c_{|n|_P}$ and I&D $h_{MA}(nT_c)$ with reset, decimated at $n = \frac{T_{coh}}{T_c} l$), Power Estimator (Non-Coherent Detection: $\|\cdot\|^2$ and Post-Detection I&D $h_{pd}(nT_{coh})$ with reset), and FSM operating on the test statistic $T(Z)$.]
3.4.1 Signal Model
Although the signal model was already defined in the previous section, we must now be aware that the phase error has not been corrected, since in the acquisition stage the DLL is not working. Thus, the discrete sequence at the input of the CMF is

$$\tilde{y}(kT_s) = \sum_{i=-\infty}^{+\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_T(kT_s - iT_c - \tau)\, e^{j\theta(kT_s)} + \tilde{w}(kT_s) \tag{3.71}$$

where all parameters were defined in Section 3.3.2.
3.4.2 Chip Detection

The result of the chip matched filter is

\[ \tilde{r}(kT_s) = \tilde{y}(kT_s) * g_R(kT_s) = \tilde{r}_s(kT_s) + \tilde{r}_n(kT_s) \tag{3.72} \]

where

\[ \tilde{r}_s(kT_s) = \frac{1}{\sqrt{E_c f_s}} \sum_{i=-\infty}^{+\infty} \sum_{\lambda=-\infty}^{+\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_T(\lambda T_s - iT_c - \tau)\, g_T^*(\lambda T_s - kT_s)\, e^{j\theta(\lambda T_s)} \tag{3.73} \]

After the simple change of variable $\lambda T_s - kT_s = \gamma T_s$, $\tilde{r}_s(kT_s)$ becomes

\[ \tilde{r}_s(kT_s) = \frac{1}{\sqrt{E_c f_s}} \sum_{i=-\infty}^{+\infty} \sum_{\gamma=-\infty}^{+\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_T(\gamma T_s + kT_s - iT_c - \tau)\, g_T^*(\gamma T_s)\, e^{j\theta(\gamma T_s + kT_s)} \tag{3.74} \]

In the last summation we see that the Doppler effect prevents us from performing a normal matched filtering operation. But the CMF is computed with a finite number of samples, with $\gamma$ restricted to a small range, so we may approximate

\[ \theta(\gamma T_s + kT_s) = 2\pi \frac{f_d}{f_s}(\gamma + k) + \varphi \approx \theta(kT_s) \tag{3.75} \]

since $f_d/f_s \approx 0$. The term $e^{j\theta(kT_s)}$ can then be factored out, affecting each symbol exclusively from this point on. Now the CMF can be easily computed, so that $\tilde{r}(kT_s)$ can be approximated by

\[ \tilde{r}(kT_s) = e^{j\theta(kT_s)} \sum_{i=-\infty}^{+\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_N(kT_s - iT_c - \tau) \tag{3.76} \]

Decimating the sequence to get approximately one sample per chip,

\[ \tilde{r}_s(nT_c) = \tilde{r}(kT_s)\big|_{k=nN_{sc}} \approx e^{j\theta(nT_c)} \sum_{i=-\infty}^{+\infty} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_N(nT_c - iT_c - \tau) \tag{3.77} \]
It is worth noting that each sample of the sequence $\tilde{r}(nT_c)$ carries the same information we would have obtained in a BPSK receiver: the information of one chip (or symbol, in the BPSK case) with a fractional chip timing error, regardless of whether the RX has already acquired the PRN code or not. Recall that in DS/SS, synchronizing means not only sampling at the peak of the CMF output, but also ensuring that the spreading and despreading sequences are aligned. To account for this chip timing error, it is convenient to decompose $\tau$ into integer and fractional parts:

\[ \frac{\tau}{T_c} = m_\tau + \mu_\tau \tag{3.78} \]

where $m_\tau = \mathrm{int}(\tau/T_c)$ is the integer part of the chip-normalized timing error and $\mu_\tau$ is the remaining fractional part. This way, $\tilde{r}(nT_c)$ now is

\[ \tilde{r}_s(nT_c) = d_{\lfloor (n-m_\tau)/Q \rfloor}\, c_{|n-m_\tau|_P}\, g_N(\mu_\tau T_c)\, e^{j\theta(nT_c)} + \tilde{r}_{ISI}(nT_c) \tag{3.79} \]

where the ISI term is

\[ \tilde{r}_{ISI}(nT_c) = e^{j\theta(nT_c)} \sum_{i \neq n-m_\tau} d_{\lfloor i/Q \rfloor}\, c_{|i|_P}\, g_N([n-m_\tau]T_c - iT_c - \mu_\tau T_c) \approx 0 \tag{3.80} \]

It is interesting to pay attention to the information that $\tilde{r}_s(nT_c)$ carries: the data symbol and the $m_\tau$-th sent chip, all scaled by the amplitude of the pulse shape response when sampled with some timing error $\mu_\tau$, and with an unknown phase error. The data symbols and the phase error will be eliminated in the squarer, whereas the other two factors will be used in the code acquisition process. The power of the signal at the output of the CMF can now be computed easily:

\[ R_{\tilde{r}_s}(0) = E\{\tilde{r}_s(nT_c)\, \tilde{r}_s^*(nT_c)\} = |g_N(\mu_\tau T_c)|^2 \tag{3.81} \]

and, as can be seen, this power equals $E_c f_s$ when $\mu_\tau = 0$.
3.4.3 Coherent Detection – Correlator

To check whether the PRN code phase has been acquired or not, we need some device that correlates the incoming signal with a shifted local reference. This is the mission of the Correlator block, which consists of a multiplier plus an I&D. After the multiplication we have

\[ \tilde{x}(nT_c) = \tilde{r}(nT_c)\, c^*_{|n|_P} = \tilde{x}_s(nT_c) + \tilde{x}_{ISI}(nT_c) + \tilde{n}_x(nT_c) \tag{3.82} \]

where the signal component becomes

\[ \tilde{x}_s(nT_c) = d_{\lfloor (n-m_\tau)/Q \rfloor}\, c_{|n-m_\tau|_P}\, c^*_{|n|_P}\, g_N(\mu_\tau T_c)\, e^{j\theta(nT_c)} \tag{3.83} \]
and whose statistical mean shows that only in the synchronized state can we expect outputs different from 0:

\[ E\{\tilde{x}_s(nT_c)\} = \delta_{m_\tau}\, g_N(\mu_\tau T_c)\, e^{j\theta(nT_c)} \tag{3.84} \]

At the output of the integration block, there is

\[ \tilde{z}(nT_c) = \tilde{z}_s(nT_c) + \tilde{z}_{ISI}(nT_c) + \tilde{z}_n(nT_c) = \tilde{x}(nT_c) * h_{MA}(nT_c) = \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} \tilde{x}(nT_c - iT_c) \tag{3.85} \]

where we should note that, now that we are in the acquisition stage, we do not accumulate $N_{cs}$ samples, but only $N_{coh}$ (the number of samples integrated coherently), with $N_{coh} < N_{cs}$. The reason for integrating over fewer chips (less time) than in tracking was qualitatively explained in Section 3.2.1 and will be quantitatively addressed in the following lines. After dumping the accumulation of $N_{coh} = T_{coh}/T_c$ samples, we obtain

\[ \tilde{z}_s(lT_{coh}) = \tilde{z}_s(nT_c)\big|_{n=lN_{coh}} = \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} \tilde{r}([lN_{coh}-i]T_c)\, c^*_{|lN_{coh}-i|_P} \tag{3.86} \]

A careful look at this equation reveals some key aspects of the operation we are performing. Firstly, note that only windows of $N_{coh}$ samples are being accumulated, and secondly note that what we are accumulating is the product of two signals. This operation closely resembles an unbiased estimator of the correlation at the origin for sampled sequences. Thus, we could write

\[ \hat{R}_{\tilde{r}c,N_{coh}}(0) = \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} \tilde{r}([lN_{coh}-i]T_c)\, c^*_{|lN_{coh}-i|_P} = \tilde{z}_s(lT_{coh}) \tag{3.87} \]

More generally, we know that $\hat{R}_{XY,N}(0) = \frac{1}{N}\sum_{i=0}^{N-1} X_i Y_i \approx R_{XY}(0)$ and that $R_{XY}(0) = E\{XY\}$. Obviously we expect $\tilde{r}$ and $c$ to be correlated when coarsely aligned, but if we develop $\tilde{z}_s(lT_{coh})$ we reach an expression that can be seen as the correlation of two uncorrelated processes

\[ \tilde{z}_s(lT_{coh}) = e^{j\theta(lN_{coh}T_c)}\, g_N(\mu_\tau T_c)\, \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} d_{\lfloor (lN_{coh}-i)/Q \rfloor}\, e^{-j\omega_d iT_c}\, c_{|lN_{coh}-m_\tau-i|_P}\, c^*_{|lN_{coh}-i|_P} \tag{3.88} \]

in which we can interpret the sum as the correlation of $c_{|n-m_\tau|_P}\, c^*_{|n|_P}$ with $d_{\lfloor n/Q \rfloor}\, e^{j\omega_d nT_c}$. We know that these two sequences are uncorrelated. Since $R_{XY}(0) = E\{XY\}$ turns into $R_{XY}(0) = E\{X\}E\{Y\}$ for uncorrelated processes $X$ and $Y$, we can approximate the correlation estimator $\hat{R}(0)$ by the product of two expectation estimators and compute each expectation separately:

\[ \tilde{z}_s(lT_{coh}) \approx e^{j\theta(lN_{coh}T_c)}\, g_N(\mu_\tau T_c)\, \hat{R}_{c,N_{coh}}(m_\tau)\, e^{-j\omega_d (N_{coh}-1)T_c/2} \sqrt{D_Q(N_{coh})} \tag{3.89} \]
where

\[ \hat{R}_{c,N_{coh}}(m_\tau) = \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} c_{|lN_{coh}-m_\tau-i|_P}\, c^*_{|lN_{coh}-i|_P} \tag{3.90} \]

\[ e^{-j\omega_d (N_{coh}-1)T_c/2} \sqrt{D_Q(N_{coh})} = \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} d_{\lfloor (lN_{coh}-i)/Q \rfloor}\, e^{-j\omega_d iT_c} \tag{3.91} \]

Note that $\hat{R}_{c,N_{coh}}(m_\tau)$ should have been defined as $\hat{R}_{c,N_{coh}}(lN_{coh}, m_\tau)$, because partial correlations also depend on the starting point. For simplicity we will use the former notation and will emphasize the points where this dependence should be accounted for. The derivation of eq. (3.91) and the definition of the term $D_Q(N_{coh})$ can be found in Appendix A; here we only use the result.

At this point we can find the power of the signal component at the output of the I&D and the noise power, in order to find the SNR at that point. The signal power is

\[ E\{|\tilde{z}_s(lT_{coh})|^2\} = g_N^2(\mu_\tau T_c)\, \hat{R}^2_{c,N_{coh}}(m_\tau)\, D_Q(N_{coh}) = \hat{R}^2_{r,c}(\tau)\, D_Q(N_{coh}) \tag{3.92} \]

where we have defined $\hat{R}_{r,c}(\tau) = g_N(\mu_\tau T_c)\, \hat{R}_{c,N_{coh}}(m_\tau)$ as the whole correlation function. It will be more useful to normalize it, so that

\[ \hat{R}(\tau) = \frac{1}{E_c f_s}\, \hat{R}^2_{r,c}(\tau) = \frac{g_N^2(\mu_\tau T_c)}{E_c f_s}\, \hat{R}^2_{c,N_{coh}}(m_\tau) \tag{3.93} \]

Now $|\hat{R}(\tau)| \le 1$ is the normalized global correlation function, and we note that it takes into account three facts:
• the fact that the PRN sequences are misaligned,
• the fact that at this stage there still exists a fractional timing error $\mu_\tau$,
• and, most importantly, the fact that we are performing partial correlations of the PRN sequences.

The noise power is the same as that of Eq. (3.68) but with $N_{cs}$ replaced by $N_{coh}$,

\[ E\{|\tilde{z}_n(lT_{coh})|^2\} = \frac{N_0 f_s}{N_{coh}} = \frac{2\sigma^2}{N_{coh}} \tag{3.94} \]

yielding an SNR¹⁴ at the output of the coherent integrator

\[ \mathrm{SNR}_{coh} = \frac{E_c}{N_0}\, \hat{R}(\tau)\, N_{coh} D_Q(N_{coh}) = \frac{E_c}{N_0}\, \hat{R}(\tau)\, \Delta\mathrm{SNR} \tag{3.95} \]
Up to this point we have used the same blocks that we introduced in Section 3.3, in order to reuse the equations found there. However, in the acquisition module our main aim is to compute the power at the output of the coherent correlator, and it is preferable to do so with the noise normalized. We will see later that, done this way, the power detector has to decide whether the statistics of the samples belong to a pdf centered at the noise power (constant for all possible $N_{coh}$) or to one shifted by an amount proportional to the SNR. This normalization with respect to noise does not affect the $\mathrm{SNR}_{coh}$ already found, and it can be achieved with a multiplying factor $\sqrt{N_{coh}}$ in the coherent accumulator, for instance. Now the coherent I&D can be seen as a filter with unity gain, thus preserving the noise power at its input. Hence, from now on we may rewrite the signal component power and noise power as

\[ P_{\tilde{z}_s} = s^2 = \hat{R}^2_{r,c}(\tau)\, N_{coh} D_Q(N_{coh}) \tag{3.96} \]

\[ P_{\tilde{z}_n} = 2\sigma^2 = N_0 f_s \tag{3.97} \]

¹⁴ Rigorously we could not call this ratio an SNR, because synchronization has not been achieved and thus the amount of signal power at this point varies depending on $\tau$.

Looking at eq. (3.95) we can see several key aspects of the design of the whole acquisition system. The SNR depends, basically, on the terms $\hat{R}(\tau)$ and $N_{coh} D_Q(N_{coh})$. The former makes evident that not much signal power can be expected unless the sequences are at least roughly aligned.¹⁵ The latter shows, when plotted, the restrictions that our correlator is bound to, and establishes an optimum coherent observation time for the correlator. It is in fact a term that helps us improve the SNR, and that is why it is also dubbed $\Delta\mathrm{SNR}$. Figure 3.8 shows a plot of the $N_{coh} D_Q(N_{coh})$ function for the particular case of the GPS acquisition unit designed in this project. In it one can see that the SNR improves linearly while the number of coherent integrations is small compared with the $r_c/f_d$ ratio, and starts decreasing abruptly when they are of the same magnitude. Note that no SNR increment is obtained for a range of $N_{coh}$. In this range, our receiver would be completely blind and could not detect the instant when the two PRN sequences are aligned.
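The shape of the $\Delta\mathrm{SNR}$ curve can be reproduced with a short numerical sketch. It rests on an assumption that is not taken from Appendix A: no data transition falls inside the coherent window, so the sum in eq. (3.91) reduces to a Dirichlet kernel and $D_Q(N_{coh}) = \left[\sin(N_{coh}\,\omega_d T_c/2) / (N_{coh}\sin(\omega_d T_c/2))\right]^2$. The function names and the relation $E_c/N_0 = (C/N_0)/r_c$ used for the last figure are likewise illustrative choices.

```python
import math

def delta_snr(n_coh, f_d, r_c):
    """Coherent gain N_coh * D(N_coh) for a pure Doppler rotation.

    Assumption: the data symbol is constant over the window, so the sum of
    eq. (3.91) is a Dirichlet kernel in omega_d * T_c / 2 = pi * f_d / r_c.
    """
    x = math.pi * f_d / r_c
    if x == 0.0:
        return float(n_coh)
    d = (math.sin(n_coh * x) / (n_coh * math.sin(x))) ** 2
    return n_coh * d

def to_db(ratio):
    return 10.0 * math.log10(ratio)

F_D = 6e3        # worst-case Doppler shift in Hz (value quoted for Figure 3.8)
R_C = 1.023e6    # GPS C/A chip rate in chips/s

gain_64_db = to_db(delta_snr(64, F_D, R_C))
best_n = max(range(1, 341), key=lambda n: delta_snr(n, F_D, R_C))

# Coherent SNR from eq. (3.95) for C/N0 = 40 dBHz and aligned codes (R = 1),
# assuming Ec/N0 = (C/N0) / r_c.
CN0_DB_HZ = 40.0
snr_coh_db = CN0_DB_HZ - to_db(R_C) + gain_64_db

print(f"gain at N_coh = 64: {gain_64_db:.2f} dB")
print(f"best N_coh in this model: {best_n}")
print(f"SNR_coh at 40 dBHz: {snr_coh_db:.2f} dB")
```

Under these assumptions the gain at $N_{coh} = 64$ comes out within a tenth of a dB of the 15.9 dB quoted for Figure 3.8, and the resulting coherent SNR at 40 dBHz is close to the −4 dB used later for Figure 3.9. The optimum found by this simplified model may differ by a sample or so from the plotted one.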
3.4.4 Non-Coherent Detection – Squarer and Non-Coherent I&D

The coherent detection block allows us to improve the SNR in order to see whether the incoming signal is despread properly or not. If the two PRN sequences are reasonably aligned, the despreading operation will be successful and a power detector at the output of the correlator will tell us so. However, the limitation on the coherent observation time due to phase errors and possibly data modulation does not allow the SNR increment to be high enough to perform a reliable detection. That is why we need the non-coherent detection blocks. While the squarer eliminates the data sequence and the phase uncertainty, the post-detection I&D averages the output samples of the squarer so that finally a test detection block decides how likely it is that the hypothesis being tested is the right one. Unfortunately, this non-coherent observation time is also limited, as we saw in Section 3.2.1, if we want to make sure that the hypothesis we end up with is the same as the one we started testing.

¹⁵ For the cases of $N_{coh}$ large enough. Otherwise code sidelobes may mislead the detector.

Figure 3.8: Plot of $\Delta\mathrm{SNR}$ (dB) vs. $N_{coh}$ in the correlator for the worst case of Doppler shift (6 kHz) and the GPS C/A code chip rate ($r_c$ = 1.023 MHz). Maximum gain of 15.9 dB at $N_{coh}$ = 64. [Plot title: SNR Increment. Axes: SNR increment (dB), 0 to 16, vs. $N_{coh}$ on a logarithmic scale.]

Squarer

At the output of the squarer block there is

\[ Z(lT_{coh}) = |\tilde{z}(lT_{coh})|^2 = \hat{R}^2_{r,c}(\tau)\, N_{coh} D_Q(N_{coh}) + Z_n(lT_{coh}) \tag{3.98} \]

The squarer is a non-linear block, so the statistics of the noise component have changed. The probability density function (PDF) of $Z(lT_{coh})$ is no longer Gaussian. Instead, its PDF is a noncentral chi-squared with two degrees of freedom, which arises from summing the squares of two independent and identically distributed (IID) Gaussian random variables with nonzero means. In our case, we had two independent noise components of power $\sigma^2$ each, with mean equal to the square root of half the power of the signal component $\tilde{z}_s(lT_{coh})$, i.e. $\sqrt{s^2/2}$. We may call these two random variables $X_i$ and $X_q$, for the in-phase and quadrature components respectively, and define them as $X_i, X_q \sim \mathcal{N}\left(\sqrt{s^2/2}, \sigma^2\right)$.

In the particular case that the sequences are not aligned, and thus $\hat{R}(\tau) = 0$, the PDF of $Z(lT_{coh})$ is a central chi-squared with two degrees of freedom. For the non-central chi-squared with two degrees of freedom, we have
that its PDF is

\[ f_Z(Z) = \frac{1}{2\sigma^2}\, e^{-\frac{s^2+Z}{2\sigma^2}}\, I_0\!\left(\frac{\sqrt{s^2 Z}}{\sigma^2}\right) \tag{3.99} \]

where $s^2$ and $\sigma^2$ were defined in (3.96) and (3.97), respectively, as the power of the signal component and the power of each of the two noise components of $\tilde{z}(lT_{coh})$. The $I_0(x)$ function is the modified Bessel function of the first kind and order 0. For order $\alpha$ it is defined as

\[ I_\alpha(x) = \sum_{k=0}^{+\infty} \frac{(x/2)^{2k+\alpha}}{k!\,(\alpha+k)!} \tag{3.100} \]

By using the result that if $\xi \sim \mathcal{N}(\mu, \sigma^2)$, then

\[ E\{\xi^2\} = \mu^2 + \sigma^2 \]
\[ E\{\xi^4\} = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4 \]
\[ \mathrm{var}\{\xi^2\} = E\{\xi^4\} - E^2\{\xi^2\} = 4\mu^2\sigma^2 + 2\sigma^4 \]

we can find the mean and variance of $Z$ simply from the mean and variance of $Z_s$ and $Z_n$, because $\mathrm{var}\{Z\} = \mathrm{var}\{Z_n\}$ and $E\{Z\} = E\{Z_s\} + E\{Z_n\}$:

\[ \mathrm{var}\{Z\} = \mathrm{var}\{X_i^2 + X_q^2\} = 2\,\mathrm{var}\{X_i^2\} = 4s^2\sigma^2 + 4\sigma^4 \tag{3.101} \]

\[ E\{Z\} = E\{Z_s\} + E\{Z_n\} = 2E\{X_i^2\} = s^2 + 2\sigma^2 \tag{3.102} \]

Non-Coherent I&D

At the output of the non-coherent I&D there is

\[ T(Z) = Z(lT_{coh}) * h_{pd}(lT_{coh}) = \frac{1}{L} \sum_{i=0}^{L-1} Z(lT_{coh} - iT_{coh}) = \left\langle \hat{R}^2_{r,c}(\tau) \right\rangle_L N_{coh} D_Q(N_{coh}) + T(Z_n) \tag{3.103} \]

where we define $T(Z_s) \triangleq \left\langle \hat{R}^2_{r,c}(\tau) \right\rangle_L N_{coh} D_Q(N_{coh}) = s^2$ and recall the comment made for Equation (3.90): since $\hat{R}^2_{r,c}(\tau)$ is a computation of partial correlations and depends on the starting point, at the output of the non-coherent I&D we have a time average of length $L$, which we denote with the symbols $\langle\cdot\rangle_L$ (see Figure 3.13 for a plot of the code sidelobes for the case of our acquisition unit). For short, we will also denote the signal and noise components as $T_s$ and $T_n$, respectively, and their sum as $T$. Thus, we can now compute the code sidelobes due to partial correlation at the output of the non-coherent I&D.

Since the post-detection integrator accumulates $L$ pairs of IID squared Gaussian random variables, $T(Z)$ obeys the noncentral chi-squared PDF with $2L$ degrees of freedom, which can be expressed as

\[ f_T(T) = \frac{L}{2\sigma^2} \left(\frac{T}{s^2}\right)^{\frac{L-1}{2}} e^{-\frac{L(s^2+T)}{2\sigma^2}}\, I_{L-1}\!\left(\frac{\sqrt{LT \cdot Ls^2}}{\sigma^2}\right) \tag{3.104} \]

where $I_{L-1}(x)$ can be evaluated with eq. (3.100).

In the sequel we will make basically two assumptions: when the PRN sequences are not aligned ($|\tau| > T_c$), the correlation $\hat{R}(\tau) \approx 0$, whereas when they are aligned ($\tau = 0$), $\hat{R}(\tau) = 1$. These two possibilities (wrong or right hypothesis) render two PDFs that, thanks to the central limit theorem, become Gaussian for $L$ large enough (it starts to apply for $L > 10$), and this simplifies our study enormously. We will refer to these PDFs as $f_0(T)$ and $f_1(T)$ for $f_T(T)|_{\tau > T_c}$ and $f_T(T)|_{\tau = 0}$, respectively. Figure 3.9 depicts these two PDFs.

Figure 3.9: Plot of $f_0(T)$ (left) and $f_1(T)$ (right) and definition of $P_{FA}$ and $P_M$. The plot is normalized with respect to $\sigma^2$, and drawn for $L = 50$ and $\mathrm{SNR}_{coh} = -4$ dB. [Axes: $f_T(T)$ vs. $T(Z)$. The curve for $\tau > T_c$ is centered at 2 and the curve for $\tau = 0$ at $2 + 2\mathrm{SNR}_{coh}$; the threshold is $\theta = 2 + \mathrm{SNR}_{coh}$, with the areas $P_M$ and $P_{FA}$ on either side of it.]
Other possible values of $\tau$ between these two ($0 < |\tau| < T_c$) render a PDF that lies between the other two PDFs. The moments of $T(Z)$ are the following:

\[ E\{T\} = m_T = E\{Z_s\} + E\{Z_n\} = s^2 + 2\sigma^2 = 2\sigma^2 (1 + \mathrm{SNR}_{coh}) \tag{3.105} \]

\[ \mathrm{var}\{T\} = \sigma_T^2 = \mathrm{var}\{T_n\} = \frac{\mathrm{var}\{Z_n\}}{L} = \frac{4\sigma^4}{L} (1 + 2\,\mathrm{SNR}_{coh}) \tag{3.106} \]
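The moments in (3.105) and (3.106) can be checked with a small Monte Carlo sketch. It simply draws the in-phase and quadrature variables $X_i, X_q \sim \mathcal{N}(\sqrt{s^2/2}, \sigma^2)$, squares and averages them over $L$ samples, and compares the sample mean and variance of $T$ with the closed forms. The function name, the seed and the numeric values are illustrative assumptions, not parameters of the actual design.

```python
import random

def simulate_t(s2, sigma2, n_noncoh, trials, seed=1):
    """Sample mean and variance of T = (1/L) * sum(Xi^2 + Xq^2)."""
    rng = random.Random(seed)
    mu = (s2 / 2.0) ** 0.5       # mean of each Gaussian component
    sd = sigma2 ** 0.5           # standard deviation of each component
    samples = []
    for _ in range(trials):
        acc = 0.0
        for _ in range(n_noncoh):
            xi = rng.gauss(mu, sd)
            xq = rng.gauss(mu, sd)
            acc += xi * xi + xq * xq     # squarer output Z
        samples.append(acc / n_noncoh)   # post-detection average T
    mean = sum(samples) / trials
    var = sum((t - mean) ** 2 for t in samples) / (trials - 1)
    return mean, var

S2, SIGMA2, L = 0.8, 1.0, 50             # hypothetical values
snr_coh = S2 / (2.0 * SIGMA2)            # SNR_coh = s^2 / (2 sigma^2)

mean_theory = 2.0 * SIGMA2 * (1.0 + snr_coh)                  # eq. (3.105)
var_theory = 4.0 * SIGMA2 ** 2 / L * (1.0 + 2.0 * snr_coh)    # eq. (3.106)
mean_sim, var_sim = simulate_t(S2, SIGMA2, L, trials=4000)

print(f"mean: simulated {mean_sim:.3f}, theory {mean_theory:.3f}")
print(f"var:  simulated {var_sim:.4f}, theory {var_theory:.4f}")
```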
3.5 Simplifications and other requirements for GPS
In Section 1.3 it was seen that in GPS there is no pulse shaping, and that the rectangular pulses are filtered to the 10th lobe for the C/A case at the satellite. However, our RF frontend passes only the mainlobe of the C/A signal (bandwidth of 2.046 MHz). The loss associated with this filtering is less than 0.5 dB [4], and it can be computed as the ratio of the power of the sidelobes vs. sidelobes plus mainlobe. It is so low because the spectrum of the signal is basically a squared sinc and therefore decays at a $1/f^2$ rate.

At the input of the system there are the I/Q components of the complex baseband equivalent resulting from IF sampling. Since the RF frontend gives out only the mainlobe, the CMF becomes a simple Integrate & Dump of $N_{sc} = 2$ samples per chip. It is important to note that this operation is lossless in any other DS/SS communication system that uses only rectangular pulse shaping. In a system with a different pulse shape, using this simplification is equivalent to neglecting the ISI of the pulse shaping. But this operation is exactly the same as the one performed in the coherent I&D, so in order to simplify hardware we can join these two operations. The only changes that have to be accounted for are:
• that now the PN generator needs to hold the same chip value for 2 samples (or $N_{sc}$ in general). Also, the increments from one cell to the next have to be done in steps of 1/2 chip (or $1/N_{sc}$, in general),
• and that now the I&D will work at twice the previous frequency ($N_{sc}$ times the previous frequency, in general).
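The equivalence between the two-stage chain (CMF as an I&D over $N_{sc}$ samples, then despreading and coherent accumulation) and the merged single I&D running at the sample rate can be illustrated with a minimal sketch. The function names, the integer sample values and the random test code are hypothetical; a real ±1 code is assumed, so no conjugation is needed, and normalization factors are left out.

```python
import random

def cmf_then_correlate(samples, code, n_sc):
    """Two stages: I&D over n_sc samples per chip, then despread and sum."""
    total = 0
    for n, chip in enumerate(code):
        chip_sum = sum(samples[n * n_sc:(n + 1) * n_sc])   # CMF output
        total += chip_sum * chip                           # despreading
    return total

def merged_correlate(samples, code, n_sc):
    """One stage: the PN generator holds each chip for n_sc samples."""
    held = [chip for chip in code for _ in range(n_sc)]
    return sum(y * c for y, c in zip(samples, held))

rng = random.Random(0)
N_SC = 2
code = [rng.choice((-1, 1)) for _ in range(64)]
samples = [rng.randint(-3, 3) for _ in range(64 * N_SC)]

two_stage = cmf_then_correlate(samples, code, N_SC)
merged = merged_correlate(samples, code, N_SC)
print(two_stage, merged)
```

Both functions return the same accumulation, which is why a single accumulator clocked at the sample rate can replace the CMF plus the coherent I&D when the pulse is rectangular.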
3.5.1 GPS Single Point Solution: Number of channels
Our objective with this project was to end up with a DS/SS receiver useful for GPS applications. To compute the single point solution (global position) of a receiver, it is necessary to get information from 4 satellites (see Section 1.1.1). Our receiver therefore needs at least 4 tracking correlators, or channels, to be useful. This means that the GPS receiver needs to have the receiver chain described so far replicated $N_{track}$ times.
3.6 Cells Test Statistic
This is the part where the performance improvement is made for one single correlator. In Section 3.6.4 we will see how the whole acquisition module can be parallelized among several correlators, which is the only improvement left once the cells test statistic has been optimized. As previously pointed out in Section 3.1.2, one of the things that can really make the difference between two correlators is the way cells are tested. For best performance, in this project we have chosen a combination of sequential test and multiple dwell [1]. Later on we will see that these two tests are not enough, due to the nature of the serial search and to Doppler limitations on the coherent integration time. The part of the acquisition process that performs the tests in a more intelligent way than simply comparing with a threshold can be modeled by a Finite State Machine (FSM) like the one in Figure 3.10. In the sequel we may use the names of the states as abbreviations when referring to the tests.

Figure 3.10: Simple Finite State Machine obeyed by the cell test control block for the case of only one correlator. [States shown: IDLE, SEQ, VAL, SL, MAXF.]
3.6.1 Sequential Test (SEQ)
This is the first type of test that is applied to every hypothesis. Its main aim is to discard wrong hypotheses very early in the acquisition process. To achieve this early discarding, the test relies on the fact that good hypotheses will always be above an increasing threshold, whereas wrong hypotheses may be above this threshold only for a few non-coherent integrations. This behaviour is depicted in Figure 3.11.

Figure 3.11: Sequential test. Hypotheses are tested for every new non-coherent integration. Correct epochs are expected to be above the threshold line, unlike wrong epochs. [Axes: non-coherent integration output vs. number of non-coherent integrations. Elements shown: initial threshold, threshold slope, correct hypothesis, wrong hypothesis.]
Unlike the validation dwell, the sequential test performs a test for every non-coherent integration. As soon as the output falls under the threshold slope, the current hypothesis is declared wrong. Then, to start testing the next epoch, the coherent and non-coherent integrators are reset and the PN generator is delayed half a chip. This test goes on for a maximum number of integrations $L_{seq}$, after which the tested hypothesis is declared to have passed the test and the state machine advances one step, to the validation dwell.

While the key property of the sequential test is that it discards wrong hypotheses very early, its main drawback is that it yields an unacceptable probability of false alarm $P_{FA}$ (false acquisition). It is the validation dwell's objective to improve this probability, and the VAL procedure actually sets the final $P_{FA}$ of our system.
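The decision rule of the sequential test can be sketched in a few lines. This is only an illustration under stated assumptions: the non-coherent output is taken to be a running accumulation of squarer samples, and the threshold line is modeled as `initial_threshold + slope * n` after the n-th integration. The function name, the argument names and the numeric values are hypothetical and are not taken from the VHDL design.

```python
def sequential_test(z_values, initial_threshold, slope, l_seq):
    """Return (passed, integrations_used) for one hypothesis.

    z_values: squarer outputs Z(l*Tcoh) for the cell under test.
    The cell is discarded as soon as the accumulated output drops below
    the rising threshold line; it passes after l_seq integrations.
    """
    if len(z_values) < l_seq:
        raise ValueError("need at least l_seq squarer outputs")
    acc = 0.0
    for n in range(1, l_seq + 1):
        acc += z_values[n - 1]                      # non-coherent accumulation
        threshold = initial_threshold + slope * n   # assumed linear threshold
        if acc < threshold:
            return False, n                         # early discard
    return True, l_seq

# Hypothetical outputs: a strong cell, a weak cell, and a cell that
# starts strong but cannot keep up with the threshold slope.
print(sequential_test([3.0] * 10, 0.5, 2.5, 10))
print(sequential_test([2.0] * 10, 0.5, 2.5, 10))
print(sequential_test([4.0] + [2.0] * 9, 0.5, 2.5, 10))
```

The third call shows the property the text relies on: a wrong cell may stay above the line for a few integrations, but it is dropped long before $L_{seq}$ is reached.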
3.6.2 Validation Dwell (VAL)
In this procedure the presumably correct epoch delivered by the SEQ is tested for a constant amount of time, no matter whether it gives negative results at any point in the integration. Unlike the SEQ, only the last integration $L_{val}$ is considered in the VAL: if it is positive we consider that a good hypothesis has been found, whereas if it is negative we go back to SEQ to test the next hypothesis. It is therefore a single or fixed dwell procedure that yields the probabilities of miss $P_M$ and false alarm $P_{FA}$ depicted in Figure 3.12. These probabilities were defined graphically in Figure 3.9, and analytically we see that they are given by the non-central chi-squared right-tail probabilities:

\[ P_{FA} = \int_{\theta}^{+\infty} f_T(T)\, dT \Big|_{\tau > T_c} = \int_{\theta}^{+\infty} f_0(T)\, dT \tag{3.107} \]

\[ P_M = \int_{-\infty}^{\theta} f_T(T)\, dT \Big|_{\tau = 0} = 1 - \int_{\theta}^{+\infty} f_1(T)\, dT \tag{3.108} \]

But thanks to the central limit theorem we know that for $L$ large enough these probabilities can be approximated very well by the Gaussian right-tail probability $Q(x)$ (see Eq. (2.12)), and then we can rewrite the two probabilities as:

\[ P_{FA} \approx Q\!\left(\frac{\theta - m_0}{\sqrt{\sigma_0^2}}\right) \tag{3.109} \]

\[ P_M \approx 1 - Q\!\left(\frac{\theta - m_1}{\sqrt{\sigma_1^2}}\right) \tag{3.110} \]

where $P_M = 1 - P_d$, with $P_d$ the probability of detection of a good hypothesis. In Figure 3.12 we can see that both probabilities decrease monotonically with the number of non-coherent integrations. Thus, $L_{val}$ should be chosen as large as possible in order to minimize $P_{FA}$ and $P_M$, but in Section 3.2.1 it was seen that there is a maximum non-coherent observation time if we do not want to end the test checking a different hypothesis than the one we started with. Hence $L_{val}$ is limited by the chiprate Doppler.
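The Gaussian approximation can be evaluated numerically with the standard library alone. The sketch below normalizes everything to $\sigma^2 = 1$, takes the means and variances of $T$ from (3.105) and (3.106) under each hypothesis, and uses the midpoint threshold $\theta = 2 + \mathrm{SNR}_{coh}$ shown in Figure 3.9 as a default. The function names are illustrative, and the choice of threshold is an assumption rather than the value used in the final design.

```python
import math

def q_function(x):
    """Gaussian right-tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def val_probabilities(snr_coh, n_noncoh, theta=None):
    """Approximate (P_FA, P_M) of a fixed dwell, normalized to sigma^2 = 1."""
    m0 = 2.0                                            # mean, wrong cell
    var0 = 4.0 / n_noncoh                               # variance, wrong cell
    m1 = 2.0 * (1.0 + snr_coh)                          # mean, right cell
    var1 = 4.0 / n_noncoh * (1.0 + 2.0 * snr_coh)       # variance, right cell
    if theta is None:
        theta = 2.0 + snr_coh                           # midpoint threshold
    p_fa = q_function((theta - m0) / math.sqrt(var0))
    # P_M = 1 - Q((theta - m1)/sigma1) = Q((m1 - theta)/sigma1)
    p_m = q_function((m1 - theta) / math.sqrt(var1))
    return p_fa, p_m

snr = 10.0 ** (-4.0 / 10.0)          # SNR_coh = -4 dB, as in Figure 3.9
p_fa_50, p_m_50 = val_probabilities(snr, 50)
p_fa_200, p_m_200 = val_probabilities(snr, 200)
print(f"L = 50:  P_FA = {p_fa_50:.4f}, P_M = {p_m_50:.4f}")
print(f"L = 200: P_FA = {p_fa_200:.2e}, P_M = {p_m_200:.2e}")
```

With the midpoint threshold $P_M$ is larger than $P_{FA}$, because the variance of $T$ is larger when the signal is present. Both fall as the number of non-coherent integrations grows.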
Figure 3.12: Probabilities of miss $P_M$ and false alarm $P_{FA}$ for the Validation Dwell vs. the number of non-coherent integrations in the VAL procedure, $L_{val}$. [Plot title: Performance of the non-coherent detector, C/N0 = 40 dBHz. Curves: Prob. of Miss, Prob. of False Alarm. Axes: probability on a logarithmic scale, $10^{-8}$ to $10^{-1}$, vs. $L_{VAL}$, 100 to 1000.]
Validation Dwell is not enough

There is one fact in our design that makes the VAL procedure insufficient to ensure that we have acquired the PRN code. The reason is that our acquisition is based on a serial search approach. This means that in the search for a good hypothesis, the first hypothesis that is above the threshold is considered to be the good one. But a strong code sidelobe could mislead the detector and make it believe it has found the correct hypothesis when actually it is only a powerful code sidelobe. This is worsened in two situations:
1. in the case that Doppler and chiprate allow the coherent integration to be performed during very few chip periods,
2. and in the case where the DS/SS receiver has to work in a wide range of C/N0, since in high C/N0 scenarios code sidelobes are expected to be powerful and therefore prone to mislead.

Therefore, after the VAL procedure we can only say that the receiver is aligned to a lobe of enough SNR, and this fact can already be used at this stage to start the PLL to acquire the carrier frequency Doppler. In order to make sure that synchronization has been achieved with the mainlobe instead of a sidelobe, a block that performs this test is necessary. This block is called Sidelobe-Check Procedure (SLC).

3.6.3 Sidelobe-Check Procedure (SLC)
In GPS the expected C/N0 is about 40 dBHz, but if possible it would be desirable that our system worked under worse conditions.

Worst Sidelobes in GPS

The assertion that the acquisition module may take a code sidelobe for the mainlobe can be clearly seen in Figure 3.13, where the time-average function $\left\langle \hat{R}^2_{c,N_{coh}}(m_\tau) \right\rangle_L$ has been plotted for $L = 1$ (no average) and $L = 100$. There we see that, if the incoming signal C/N0 range is greater than the worst sidelobe ratio at the output of the non-coherent I&D, our detector may be misled. That is because the acquisition works all the time searching for the lowest C/N0, but the power of the mainlobe at the minimum C/N0 may well equal the power of a sidelobe for a much stronger incoming signal. It is also worth noting that partial correlation does not improve for greater $L$.

Figure 3.13: Code sidelobes vs. $m_\tau$ due to partial correlation at the output of the non-coherent I&D, $\left\langle \hat{R}^2_{c,N_{coh}}(m_\tau) \right\rangle_L$. Worst sidelobes are −8.76 dB at $m_\tau = 620$ for $L = 1$ and −15.54 dB at $m_\tau = 665$ for $L = 100$. [Panels: (a) Code Sidelobes after Non-Coherent I&D, $L = 1$, $N_{coh} = 85$; (b) Code Sidelobes after Non-Coherent I&D, $L = 100$, $N_{coh} = 85$. Axes: $\langle R_c^2(m_\tau) \rangle_L$ in dB, 0 to −30, vs. $m_\tau$, 100 to 1000.]
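The partial-correlation sidelobes of eq. (3.90) and their time average can be explored with a short sketch. As a stand-in for a full C/A Gold code, it uses the 1023-chip m-sequence produced by the G1 shift register of the C/A generator (feedback from stages 3 and 10, all-ones initial state), so the numbers it prints are not those of Figure 3.13. The function names and the choice of $L = 10$ for the averaged case are illustrative assumptions.

```python
import math

def g1_sequence():
    """1023-chip m-sequence (taps at stages 3 and 10), mapped to +/-1."""
    reg = [1] * 10
    chips = []
    for _ in range(1023):
        chips.append(1 - 2 * reg[9])     # bit 0 -> +1, bit 1 -> -1
        feedback = reg[2] ^ reg[9]       # XOR of stages 3 and 10
        reg = [feedback] + reg[:9]       # shift towards stage 10
    return chips

def partial_corr(code, window, m_tau, n_coh):
    """Partial correlation of eq. (3.90) for a real +/-1 code."""
    p = len(code)
    base = window * n_coh                # starting point l * N_coh
    acc = sum(code[(base - m_tau - i) % p] * code[(base - i) % p]
              for i in range(n_coh))
    return acc / n_coh

def averaged_sidelobe(code, m_tau, n_coh, n_windows):
    """Time average over n_windows of the squared partial correlation."""
    return sum(partial_corr(code, w, m_tau, n_coh) ** 2
               for w in range(n_windows)) / n_windows

code = g1_sequence()
N_COH = 85
worst_l1 = max(averaged_sidelobe(code, m, N_COH, 1) for m in range(1, 1023))
worst_l10 = max(averaged_sidelobe(code, m, N_COH, 10) for m in range(1, 1023))
print(f"worst sidelobe, L = 1:  {10 * math.log10(worst_l1):.2f} dB")
print(f"worst sidelobe, L = 10: {10 * math.log10(worst_l10):.2f} dB")
```

The starting point `window * n_coh` enters the index explicitly, which is the dependence on $lN_{coh}$ that the shortened notation of eq. (3.90) hides.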
Sidelobe-Checking is not enough

There is another reason why the SLC cannot deliver the final best hypothesis. Chiprate Doppler causes a deviation in the delivered hypothesis, because there is some time gap between the moment the SLC procedure finds one hypothesis to be the best one and the moment when it delivers that hypothesis. A block is needed to perform the final hypothesis test, and this block must perform the test in parallel if we want to minimize the chiprate Doppler effect. This block is meant to test simultaneously the delivered hypothesis and the closest hypotheses on the left and on the right of that presumably best hypothesis. Obviously this block makes sense only when there are several correlators working in parallel, since otherwise the test cannot be performed in parallel. It is further explained in Section 3.6.4.1.
3.6.4 Correlators Parallelization for Faster Acquisition
In order to decrease the amount of time required by one single correlator to find the good hypothesis, we split the HS in $N_{corr}$ equal parts and make every correlator deal with only a fraction $HS/N_{corr}$ of the hypotheses of the whole HS. The mean acquisition time for a system made up of $N_{corr}$ correlators is approximately $1/N_{corr}$ of the total acquisition time, if we consider the SLC and MAXF procedures much less time consuming than the SEQ and VAL. Again, we want to repeat that hypotheses are not "chips" of the PN code, but phase differences between the incoming PN spreading sequence and the locally generated replica. Thus, if the whole HS is split in $N_{corr}$ parts or sub-HS, each of the correlators can work in its region only until another correlator claims to have found a presumably good hypothesis.

3.6.4.1 Maximum-Finding Procedure (MAXF)

As commented in Section 3.6.3, there is a need to eventually perform a parallel test to make sure that the drift caused by the chiprate Doppler has not moved the best hypothesis. Since this drift is presumably small, it is plausible to expect the best hypothesis to have moved from the one found in the SLC to the next or the previous hypothesis in the HS (or maybe two hypotheses further in either direction). Therefore, the proposed technique for the MAXF is to make use of some of the correlators on the left and on the right¹⁶ of the winner correlator at the SLC stage, so that each of them tests the ±1, ±2, ... hypotheses, i.e. the contiguous hypotheses, to make sure that the best hypothesis is delivered to the DLL regardless of the chiprate Doppler effect. This test is again based on the Fixed Dwell test. It performs the test for a fixed amount of time, after which the non-coherent integrations of all the involved correlators are compared and the one with the highest output value is considered to carry the best hypothesis.

¹⁶ Here left and right graphically refer to the correlators in charge of testing the previous and the next sub-HS, respectively. The correlators at both ends consider each other one of their side-correlators.

3.6.4.2 Careful Parallelization

Parallelizing correlators does not change anything in the previous analyses. Nevertheless, the parallelization plays a very important role in the hardware implementation, as we will see in Section 4.3. The reason is that correlators need to keep perfect knowledge of what they and the other correlators are doing all the time. Since the HS are actually phase differences between the PRN sequence generated at the receiver and the one received at the input, care must be taken to synchronize all correlators working in parallel if we do not want their hypotheses to overlap with those of other correlators.

Changes in the state machine of Figure 3.10 are necessary to account for the parallelization. The reason is that now correlators need to interact and exchange some information, since they are working in different zones in the search for the same goal. At the beginning of the acquisition process, every correlator is in the SEQ and VAL procedures and works completely independently from any other correlator. It is not until one of them claims to have found a good hypothesis that the correlators start to interact. The successful correlator in the validation dwell should be able to interrupt the others, no matter whether they are in the SEQ or VAL procedure. At this point, all of them perform the SLC, with the special note that when one of them finds a good hypothesis it affects the operation of the other correlators. This is done by increasing the threshold slope and initial threshold, which are common to all of the correlators.
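The partition of the HS into non-overlapping sub-HS, and the wrap-around notion of side-correlators from footnote 16, can be sketched as follows. The cell count of 2046 assumes a 1023-chip code searched in half-chip steps, and four correlators is a hypothetical choice. Both function names are illustrative and do not correspond to entities in the VHDL design.

```python
def split_hypothesis_space(n_cells, n_corr):
    """Return one (first_cell, last_cell) range per correlator.

    Ranges are contiguous and non-overlapping. When n_cells is not a
    multiple of n_corr, the first correlators take one extra cell.
    """
    base, extra = divmod(n_cells, n_corr)
    ranges = []
    start = 0
    for k in range(n_corr):
        size = base + (1 if k < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

def side_correlators(k, n_corr):
    """Left and right neighbours of correlator k, wrapping at both ends."""
    return (k - 1) % n_corr, (k + 1) % n_corr

N_CELLS = 1023 * 2      # assumed: 1023 chips, half-chip steps
N_CORR = 4              # hypothetical number of acquisition correlators

sub_hs = split_hypothesis_space(N_CELLS, N_CORR)
for k, (first, last) in enumerate(sub_hs):
    print(k, first, last, side_correlators(k, N_CORR))
```

Because 2046 is not a multiple of 4, the parts here are only approximately equal. Any such imbalance has to be handled explicitly so that no cell is tested twice or left untested.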
3.7 Spatial Diversity in Synchronization
In wireless communications, the use of antenna arrays has been widely recognized as a promising means to increase the system capacity, to improve the signal quality and to extend the coverage. Several types of systems using antenna arrays have been proposed, including adaptive array techniques for DS/SS systems. However, most of these systems assume perfect code synchronization at the RX, and the improvement of initial synchronization is rarely mentioned. In this section we address this issue.

As previously stated, when synchronizing, the receiver has no information about timing, phase or frequency. With an array of antennas, the Direction of Arrival (DOA) is another unknown in our system. This means that the signals at the input of the antenna array cannot be added coherently, i.e. using the phase information, to feed a DS/SS receiver like the one described so far, because in doing so we would be unintentionally synthesizing a broadside radiation pattern. Since we cannot be sure that the satellite we are seeking is in the broadside direction, we need to use the spatial diversity information delivered by the $N_{ant}$ antennas in a non-coherent way. Looking at the receiver in Figure 3.7 we can see that there is one block where the phase information of the received signal vanishes: the squarer. This is the point where the information coming from the multiple antennas should be added, because the sum there is non-coherent.
3.7.1 Code Acquisition Enhancement: $N_{ant}$ Antennas
Let us see what consequences arise from the fact that we are computing a non-coherent summation. We saw in Section 4.2.3.1 that this non-linearity changes the statistical properties of the noise. But we saw in Section 3.4.4 that, by the central limit theorem, the statistics become Gaussian again after the integration of L non-coherent values, provided that L ≫ 1. Thus, an array of antennas helps by increasing L by a factor Nant. We can therefore substitute the factor L by L Nant in all previous equations. This has obvious consequences on the non-coherent integrator: now that we have Nant times more squared samples to compute the statistical mean, the variance of the unbiased estimate computed by the non-coherent I&D will be Nant times lower. The mean, however, will remain the same:

\begin{align}
E\{T_{N_{ant}}\} &= E\{T\} = 2\sigma^2 \left(1 + \mathrm{SNR}_{coh}\right) \tag{3.111}\\
\mathrm{var}\{T_{N_{ant}}\} &= \frac{\mathrm{var}\{T\}}{N_{ant}} = \frac{4\sigma^4}{N_{ant} L} \left(1 + 2\,\mathrm{SNR}_{coh}\right) \tag{3.112}
\end{align}
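As a quick numerical check of (3.111) and (3.112), the following Python sketch evaluates both expressions. The function names and the example values of σ², SNRcoh and L are assumptions made only for this illustration.

```python
def detector_mean(sigma2, snr_coh):
    """Mean of the non-coherent detector output, as in (3.111)."""
    return 2.0 * sigma2 * (1.0 + snr_coh)

def detector_variance(sigma2, snr_coh, n_noncoh, n_ant=1):
    """Variance of the detector output after n_noncoh non-coherent
    integrations over n_ant antennas, as in (3.112)."""
    return 4.0 * sigma2 ** 2 * (1.0 + 2.0 * snr_coh) / (n_ant * n_noncoh)

# Hypothetical operating point: sigma^2 = 1, SNRcoh = 0.5, L = 100.
single = detector_variance(1.0, 0.5, 100, n_ant=1)
array4 = detector_variance(1.0, 0.5, 100, n_ant=4)
print(detector_mean(1.0, 0.5), single, array4)
```

With four antennas the variance drops by a factor of four while the mean is unchanged, as stated above.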
As can be seen in Figure 3.14, another very important factor to take into account when using an array of antennas is that the hardware burden increases proportionally to Nant. The total number of correlators for a code-acquisition scheme parallelized in Ncorr correlators and Nant antennas is Ncorr Nant, which means that the hardware overhead increases approximately in the same proportion if we consider the computational blocks to be more expensive, in terms of area, than the control blocks. The reason is that control blocks need not be replicated in a receiver with spatial diversity, because only the blocks up to the squarer need to be independent for every antenna and for every sub-HS.
[Block diagram: each of the Nant RF inputs feeds an RF frontend with I/Q demodulator and Ncorr parallel non-coherent correlators (chip matched filter, decimator, coherent I&D and squarer). The squared outputs Z of all antennas are added per correlator and delivered to the non-coherent I&D, control FSM and PN generator.]
Figure 3.14: Configuration of Spatial Diversity in Acquisition. The receiver uses an array of Nant antennas and parallel correlators by a factor Ncorr.
Part II
VHDL Implementation
Chapter 4
Code Acquisition Hardware Implementation
In this chapter we describe how the Direct-Sequence Spread Spectrum Code Acquisition Module described in the previous chapters was implemented for the specific case of a GPS receiver with a multiple-antenna frontend. The design is described in order of increasing complexity, from a single correlator to parallelization among Ncorr correlators.
4.1
Specifications and Total Parameterization
In the design of the DS/SS acquisition module there has been one key idea that we have tried to keep in mind during the whole design flow: reusability. Although the final implementation targets GPS, since that is the application our physical RF frontend is intended for, the VHDL hardware design was developed so that it could solve more than one problem at once, i.e. so that it could be used for any other generic DS/SS communication system. To do so, it should offer the possibility of choosing among several complexities depending on the specific application. In this sense, it was designed so that changing several parameters and recompiling for the target architecture yields a new acquisition module suited to those particular needs. The parameters that have been chosen as variable are:
• Ncorr, the number of acquisition correlators that search for a good hypothesis and deliver it to the tracking correlators.
• Ntrack, the number of tracking correlators (or channels). Once a good hypothesis has been found for a particular code (in GPS, once an SV has been found), a tracking correlator keeps that hypothesis for two purposes: first, so that the PLL+DLL can work on it, and second, to free the acquisition correlator so that it can start another search.
• Nsc, the number of samples that the ADC takes for every chip. For some applications it is interesting to work with high-speed ADCs, and maybe later swap to lower-speed ADCs and check whether the degradation is acceptable or not.
• Nhc, the number of hypotheses per chip. In acquisition, 2 hypotheses per chip (that is, 1/2 chip error resolution) are usually enough for the PLL+DLL to start tracking, but it could someday be necessary to test more than 2 hypotheses per chip, so this parameter takes care of that.
• Nbs, the number of bits per sample of the ADC. For DS/SS applications two bits are usually more than enough, since with them we achieve almost the minimum A/D conversion loss (0.55 dB). However, for multiple-antenna applications where situations such as intentional or unintentional jamming may be present and of interest to study, the ADC width should allow for a larger dynamic range. In these cases more than 2 bits are mandatory.
As can be seen from the previous list of parameters, the design tries to cover as many applications as possible in a single design: from simple low-cost DS/SS receivers to high-end receivers where the only optimization criterion is performance. The drawback of such a design is that it may not allow aggressive optimizations for specific cases that would (probably) result in less hardware burden. Hence our design trades off ultimate optimization for flexibility. Nevertheless, this drawback is becoming less and less important with the ever-increasing size of FPGAs. Other parameters that have been taken into account in the parameterization, but that belong to more specific parts of the design, are listed in Table 4.2.
Notation
To illustrate the VHDL models, the figures have been drawn to be as easy to follow visually as possible, using the following notation:
• All blocks with a ˆ mark in the surrounding box need the clock signal, because they are sequential blocks. The rest are combinatorial.¹
• Names of VHDL blocks are in this font,
• and names of signals are in this other font,
• generics in VHDL blocks are written in font GENERIC,
• constants are written in font CONSTANT,
• variables also in this font,
• and FSM states are referred to in the font STATE.
¹ In the VHDL model, blocks defined inside processes are sequential only if they are activated when either Clk or Reset change, regardless of events on other signals.
GPS related parameters
Parameter                    Symbol   Value
Chiprate                     rc       1.023 MHz
Carrier frequency Doppler    fd       ±6000 Hz
Chiprate Doppler             ∆r       ±3.9 ppm
Sampling frequency           fs       2rc

Table 4.1: GPS related parameters, some imposed and some chosen by design
GPS related Specifications
The general specifications concerning the GPS system can be extracted from Section 1.3, although some parameters have to be chosen by the designer. These parameters, both those determined by the GPS system and those chosen by design, are listed in Table 4.1. The Carrier Frequency Doppler has been chosen to be ±6000 Hz because it has been assumed that our receiver will not be exposed to high dynamics. The code acquisition module works on baseband samples at the sampling frequency given in Table 4.1. If the RF frontend does not deliver samples at that frequency, a resampling module can be attached, with the great advantages that it simplifies the whole receiver and also allows for lower power consumption [12].
4.1.1
Acquisition Related Parameters
When studying the code acquisition system we saw that some blocks depend on a number of parameters. Some of those parameters are constant, while others need to change depending on the procedure being run. The two basic numbers that need to be chosen carefully are the number of coherent integrations Ncoh and the number of non-coherent integrations L. It was seen that, in general terms, these two numbers are approximately determined once rc and fd have been defined. While Ncoh = 68 seems optimum (see Figure A.1), Ltotal = 1547 is the maximum number of non-coherent integrations during which one hypothesis can be observed. This number is obtained with the following reasoning: at the final delivery of the correct hypothesis we may expect a maximum deviation of 0.5 chips, so the global observation time cannot be longer than 0.5/∆r = 128.33 ms. This translates into a total number of non-coherent accumulations of (0.5/∆r) · (1023 chips / 1 ms) / Ncoh = 1544, which can be divided among all the dwells seen in Section 3.6. However, the reasoning described so far is only approximate, and the fastest acquisition time does not depend only on those optimum parameters. To obtain the best performance and the fastest acquisition, and because the system is non-linear, we had no other choice than to simulate the whole system. The simulation has been carried out in C, with a program written by the authors of [1],[2]. With this program, and after many simulations, some of the values that seemed to perform very well are those in Table 4.2.
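The arithmetic behind the observation-time bound can be reproduced with a short Python sketch. It assumes that ∆r is the chip-rate offset in chips/s obtained by scaling the carrier Doppler fd by rc/fL1, with the GPS L1 carrier fL1 = 1575.42 MHz (a value not stated in this section); the function names are illustrative only.

```python
RC = 1.023e6       # chip rate rc [chips/s], Table 4.1
FD_MAX = 6000.0    # maximum carrier Doppler fd [Hz], Table 4.1
F_L1 = 1575.42e6   # GPS L1 carrier [Hz]; assumed, not given in this section

def chip_rate_doppler(fd=FD_MAX, rc=RC, carrier=F_L1):
    """Chip-rate offset in chips/s, assuming it scales like the carrier Doppler."""
    return fd * rc / carrier

def max_observation_time(max_slip_chips=0.5):
    """Longest time [s] a hypothesis can be observed before slipping max_slip_chips."""
    return max_slip_chips / chip_rate_doppler()

def max_noncoherent_accumulations(n_coh):
    """Number of coherent blocks of n_coh chips that fit in the observation time."""
    return max_observation_time() * RC / n_coh

print(round(max_observation_time() * 1e3, 2), "ms")
for n_coh in (68, 85):
    print(n_coh, int(max_noncoherent_accumulations(n_coh)))
```

Under these assumptions the observation time comes out as 128.33 ms, and the figure of 1544 accumulations is reproduced when Ncoh = 85 (the value listed in Table 4.2) is used in the expression; Ncoh = 68 would give roughly 1930.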
Parameters chosen in Parallelization
Parameter   Value
Ncoh        85
L_SEQ       125
L_VAL       800
L_SLC       128
L_MAXF      200
SlopeSEQ    1.625 (a)
ThSEQ       0.6875 (b)

(a) The slope is the bit-quantization of 2.2955*170/256. The factor 180/256 is because we divide by 16 instead of √(2∗85).
(b) The initial threshold is the bit-quantization of 1.0*180/256.

Table 4.2: Parameters chosen in the hardware design
4.2
Single Correlator
There are several key aspects of the design that should be clear before studying it in detail. They will be dealt with in the following sections (or have already been considered in the theoretical study in Section 3.4) according to the following outline:
• It is important to keep in mind that this design is actually a power detector: a very sophisticated power detector in which a positive detection means that the incoming PN sequence and our locally generated PN sequence seem to be aligned.
• The power detection scheme takes place mostly in the correlator, where we can find all the operations needed to non-coherently detect power above a given threshold.
• Equally important, the correlator contains the statistical tests applied to the detection variables, in the form of a Finite State Machine (FSM). These tests are what will make the whole acquisition system perform very well or very poorly. Due to their non-linear nature, the parameters used in these tests had to be optimized heuristically, as described in Section 4.5.
• In order to improve the average acquisition time we will then parallelize the hypothesis search scheme among several correlators. With this, we will be trading off hardware complexity against average acquisition time. In fact we will see that when parallelizing the algorithm, hardware complexity grows considerably due to the necessary inter-correlator collaboration, so it will be explained in further detail in Section 4.3 (see also Section 3.6.4).
• When moving from a single antenna to an array of antennas, great improvements can be made if the spatial-diversity information is accounted for. Nevertheless, this information has to be combined at a stage of the receiver where phase information is not considered, otherwise we could be synthesizing undesired antenna radiation patterns. Note that this improvement is not as good as it would be in the coherent case, since in the non-coherent case the increased noise does not allow for linear improvements.
Other concepts that should be clear will be explained in every section where they apply. In particular, all components before the squarer need to work identically for the I and Q components.
4.2.1
Gold Code Generator
The Gold code generator used for the C/A code in GPS is made of two maximal-length LFSRs of ten registers each. Each SV uses a different code for transmission, and each of these codes has to be generated at the receiver if we want to decode the data sent by any SV. All codes can be generated by combining the first LFSR with a slight modification of the second LFSR.² The modification of the second LFSR consists in taking the modulo-2 sum (XOR) of two time-offset versions of the code. Figure 4.1 shows the implementation of a code generator recommended in the ICD-GPS-200. Since it relies on selecting 2 different taps as the delayed versions of the same code, this generator is capable of generating only (10!/(10 − 2)!)/2 = 45 possible codes. Of these 45, 9 are unbalanced codes and the remaining 36 are reserved for GPS.
[Block diagram: two 10-stage shift registers G1 and G2 with preset and clock inputs; a 2-tap selector on G2 is combined with the G1 output to form the C/A PN code.]
Figure 4.1: Two-tap selection C/A-code generator implementation
There is an alternative implementation that is capable of generating any Gold code produced by the two LFSRs, and it is therefore the implementation chosen in this project. This implementation relies on the Cycle and Add property of the codes, which states that the sum of the output of an LFSR and a delayed version of that same output results in another delayed version of the original sequence. Thus, we can make a more flexible C/A Gold code generator by freezing the output of G2 for a number of cycles equivalent to the delay that the two-tap selection implementation would have given for the two selected taps. The final implementation is depicted in Figure 4.2, though in the next sections we will see slight modifications that make the PRN generator more flexible for different configurations.
² In the sequel we may refer to the first LFSR as G1 and to the second as G2.
[Block diagram: blocks g1lfsr and g2lfsr, each a 10-stage shift register with preset and clock; an Initial Delay down-counter holds the G2 clock, and both outputs are combined into PRNOutput.]
Figure 4.2: Two-tap selection C/A-code generator implementation
For either implementation there is some data that should be available to the designer. In the case of the 2-tap selection code generator, we need to know which pair of taps corresponds to which SV. In our case we need to know the initial delay of G2. This data is available in Table 4.3.
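The equivalence between the two generators can be checked with a behavioural Python sketch. It is not the VHDL of this project: the feedback polynomials of G1 and G2 are taken from the GPS interface specification rather than from this section, the function names are hypothetical, and freezing G2 is modelled as a cyclic delay of its output (an assumption that matches the steady-state behaviour after the initial hold).

```python
G1_TAPS = (3, 10)               # G1 feedback: 1 + x^3 + x^10
G2_TAPS = (2, 3, 6, 8, 9, 10)   # G2 feedback: 1 + x^2 + x^3 + x^6 + x^8 + x^9 + x^10
PERIOD = 1023

def lfsr_states(feedback_taps, length=PERIOD):
    """Yield successive 10-stage register states (stage 1 first), preset to all ones."""
    reg = [1] * 10
    for _ in range(length):
        yield tuple(reg)
        fb = 0
        for t in feedback_taps:
            fb ^= reg[t - 1]
        reg = [fb] + reg[:-1]           # shift towards stage 10

def g1_sequence():
    return [s[9] for s in lfsr_states(G1_TAPS)]

def g2_sequence():
    return [s[9] for s in lfsr_states(G2_TAPS)]

def ca_code_two_tap(tap_a, tap_b):
    """Two-tap selection: XOR of two G2 stages, then XOR with the G1 output."""
    g2_sel = [s[tap_a - 1] ^ s[tap_b - 1] for s in lfsr_states(G2_TAPS)]
    return [a ^ b for a, b in zip(g1_sequence(), g2_sel)]

def ca_code_delayed(delay):
    """G1 output XOR the plain G2 output cyclically delayed by `delay` chips."""
    g2 = g2_sequence()
    return [a ^ g2[(k - delay) % PERIOD] for k, a in enumerate(g1_sequence())]

def first_ten_octal(code):
    return format(int("".join(str(c) for c in code[:10]), 2), "o")

prn1 = ca_code_two_tap(2, 6)            # taps listed for PRN 1 in Table 4.3
print(first_ten_octal(prn1), prn1 == ca_code_delayed(5))
```

For PRN 1 the two-tap selection 2⊕6 and a G2 delay of 5 chips yield the same sequence, and the first ten chips match the octal value listed in Table 4.3.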
4.2.2
Coherent Correlators: Coherent Detection
The Coherent Detection part is where we accumulate all the chips, multiplied by a PN sequence that is hopefully well aligned, while taking into account the phase of the input signal. If the receiver were in tracking mode instead of acquisition mode we could call this part Coherent Symbol Detection. However, as we saw in Section 3.4.3, we will not accumulate during a whole symbol period because of Doppler, and we will need to go from coherent to non-coherent detection to get rid of the phase information (which will eventually be estimated by the PLL once acquisition is done) and simply check whether there is enough power at the output.
SV ID No.   GPS PRN Signal No.   2-Tap Sel.   G2 delay (chips)   First 10 C/A chips (octal)
1           1                    2⊕6          5                  1440
2           2                    3⊕7          6                  1620
3           3                    4⊕8          7                  1710
4           4                    5⊕9          8                  1744
5           5                    1⊕9          17                 1133
6           6                    2⊕10         18                 1455
7           7                    1⊕8          139                1131
8           8                    2⊕9          140                1454
9           9                    3⊕10         141                1626
10          10                   2⊕3          251                1504
11          11                   3⊕4          252                1642
12          12                   5⊕6          254                1750
13          13                   6⊕7          255                1764
14          14                   7⊕8          256                1772
15          15                   8⊕9          257                1775
16          16                   9⊕10         258                1776
17          17                   1⊕4          469                1156
18          18                   2⊕5          470                1467
19          19                   3⊕6          471                1633
20          20                   4⊕7          472                1715
21          21                   5⊕8          473                1746
22          22                   6⊕9          474                1763
23          23                   1⊕3          509                1063
24          24                   4⊕6          512                1706
25          25                   5⊕7          513                1743
26          26                   6⊕8          514                1761
27          27                   7⊕9          515                1770
28          28                   8⊕10         516                1774
29          29                   1⊕6          859                1127
30          30                   2⊕7          860                1453
31          31                   3⊕8          861                1625
32          32                   4⊕9          862                1712
- (a)       33                   5⊕10         863                1745
- (a)       34 (b)               4⊕10         950                1713
- (a)       35                   1⊕7          947                1134
- (a)       36                   2⊕8          948                1456
- (a)       37 (b)               4⊕10         950                1713

(a) These PRN sequences are reserved for ground transmitters.
(b) PRN sequences 34 and 37 are identical.

Table 4.3: GPS C/A Code Generator parameters for various SVs and ground transmitters.
For coherent detection, we have to follow two steps:
1. multiply the input signal by the locally generated PRN code, and
2. accumulate the result for an amount of time long enough to see whether the correlation is high and thus the effective SNR increases (this will be done in the next blocks).
These two operations amount to a coherent correlator, and will succeed in despreading the incoming signal as long as the PRN spreading sequence is reasonably well aligned with the sequence generated at the receiver. To perform the chip detection, we saw in previous chapters that we can find a chip at the output of the Matched Filter (MF) if it is sampled by the variable decimator at the correct instants. However, in GPS there is no pulse shaping (i.e. square pulses are used), so some simplifications can be made.³ Since the MF for a square pulse is a perfect I&D, we can merge it into the integrator part of the coherent I&D of the correlator. Let us first recall how the acquisition process worked in a scheme like that of Figure 3.7, with a variable decimator and a MF before the multiplication by the spreading sequence; after that we will see that it is equivalent to the proposed simplified scheme.
Previous scheme
In the first case (the one closer to the scheme studied in theory, see Figure 3.7), the PN generator was designed to give out PN sequences that could be delayed by 1 chip, so it was the variable decimator's mission to provide resolutions finer than 1 chip. We will focus on the case of 2 hypotheses per chip, i.e. 2 samples per chip.⁴ Since in the acquisition process we want a resolution of 1/2 chip, when seeking the good hypothesis the FSM should advance to the following hypothesis (that is, 1/2 chip) by alternating between these two steps:
1. For the same PN phase, and if only one hypothesis has been tested, we can discard one sample in the variable decimator, start sampling at the next sample, and keep on sampling at exactly the same rate of 1 sample every 2 samples. This is equivalent to saying that we now sample at odd instants when before we were sampling at even instants, and vice versa.
2. When both even and odd instants have been evaluated for a given PN phase, we know that both hypotheses for that PN phase have been tested, so it is time to jump to the next PN phase by delaying the PN-sequence generator by 1 chip.
³ This simplification might seem contradictory with the aim of designing a general-purpose DS/SS RX. This decision was taken in the belief that ISI at a non-matched filter is negligible in comparison with the noise at very low SNRs.
⁴ This equivalence is not strictly true, but we will use it for illustrative purposes. We could for instance be sampling at 10 samples per chip but seek 1/2 chip errors, so we would simply need to skip 5 samples to check the next hypothesis.
Proposed scheme
The proposed scheme does the same but in a different way (see Figure 4.3). The previous procedure is equivalent to having an I&D that, instead of accumulating "detected chips", accumulates samples multiplied by the output of a PN generator. Obviously, it has to accumulate the same number of chips as in the previous case times the number of samples per chip, i.e. Neffcoh = Ncoh Nsc, where Neffcoh is the effective number of coherent samples the I&D integrates, Ncoh is the number of coherent chips the I&D in the previous scheme would integrate, and Nsc = Tc/Ts is the number of samples per chip. This PN generator needs to hold the same chip value for 2 clock cycles because we need to multiply two incoming samples by the same estimated chip. Jumping to the next hypothesis is now done simply by delaying the PN generator 1 clock cycle, since that multiplies the incoming sequence by a PN sequence delayed 1/2 chip.
[Block diagram: a 1/2 chip PN generator multiplies the input samples, followed by the coherent I&D + CMF (chip detection + correlator, coherent detection), the squarer and the post-detection I&D (power estimator, non-coherent detection); the FSM resets both integrators and drives the PN generator.]
Figure 4.3: Simplification of the DS/SS Acquisition Module Block Diagram seen in theory for the particular case of number of samples per chip Nsc = 2 and square pulse shaping (or suboptimum chip detection, if ISI is neglected).
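The equivalence between the previous and the proposed scheme can be sketched in Python for Nsc = 2. This is a noise-free toy model with a hypothetical 8-chip sequence and cyclic indexing, intended only to show that multiplying every sample by a held PN value gives the same accumulation, and that delaying the PN generator by one clock cycle tests the next half-chip hypothesis.

```python
def hold(chips, n_sc):
    """PN generator that holds every chip value for n_sc clock cycles."""
    return [c for c in chips for _ in range(n_sc)]

def previous_scheme(samples, chips, n_sc):
    """Chip matched filter (I&D over one chip), decimation, despreading, accumulation."""
    total = 0
    for k, chip in enumerate(chips):
        detected_chip = sum(samples[k * n_sc:(k + 1) * n_sc])
        total += chip * detected_chip
    return total

def proposed_scheme(samples, chips, n_sc, pn_delay=0):
    """Despread every sample with the held PN value and accumulate Ncoh*Nsc samples.
    pn_delay is the delay of the PN generator in clock cycles (cyclic model)."""
    pn = hold(chips, n_sc)
    n = len(pn)
    return sum(samples[i] * pn[(i - pn_delay) % n] for i in range(n))

# Hypothetical short +/-1 sequence, for illustration only.
chips = [1, -1, -1, 1, -1, 1, 1, 1]
N_SC = 2
tx = hold(chips, N_SC)
rx = tx[-1:] + tx[:-1]      # received signal delayed by one sample (1/2 chip)

print(proposed_scheme(rx, chips, N_SC, pn_delay=0),
      proposed_scheme(rx, chips, N_SC, pn_delay=1))
```

With the PN generator delayed by one clock cycle the accumulation reaches its maximum of Ncoh Nsc, while the undelayed hypothesis yields a lower value.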
4.2.2.1
PRN Code Multiplication
The PRN code can take the values −1 and 1, so multiplication translates into changing or keeping the sign of the input signal. This operation is reminiscent of the XOR operation, which inverts the first bit (X1) or leaves it as it is depending on the second bit (X2). Table 4.4 shows the truth table of the exclusive-OR (XOR, ⊕) logical operation; substituting 0 by 1 and 1 by −1 we obtain the desired truth table for a {−1, 1} multiplier. Hence, while for theoretical analysis the {−1, 1} representation is more convenient, in a hardware implementation this multiplication can be done very easily with a single XOR gate, provided the input data is in SIGN+MAG format.⁵
⁵ It would not be so straightforward if the data format of the ADC were, for instance, two's complement (2C), since negating a word implies negating all its bits and adding one to the result.
X1   X2   X1⊕X2
0    0    0
0    1    1
1    0    1
1    1    0

Table 4.4: XOR truth table. Note that if we swap 0's and 1's for 1 and −1 respectively, we have the same truth table required for the despreader.
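The mapping between the XOR truth table and the {−1, 1} multiplier can be verified exhaustively with a few lines of Python. The helper names are hypothetical; the bit-to-level convention (0 for +1, 1 for −1) is the one stated in the text.

```python
def bit_to_level(bit):
    """Map logic 0 to +1 and logic 1 to -1."""
    return 1 - 2 * bit

def despread_sign(sign_bit, prn_bit):
    """Despreading in sign/magnitude format: only the sign bit is XORed."""
    return sign_bit ^ prn_bit

for x1 in (0, 1):
    for x2 in (0, 1):
        product = bit_to_level(x1) * bit_to_level(x2)
        print(x1, x2, despread_sign(x1, x2), product)
```

In every row the level associated with X1⊕X2 equals the product of the levels associated with X1 and X2.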
Note that since this block only applies a logical operation to the SIGN bit of the ADC word, it is combinatorial, so no register is necessary. Also, since the signal we are trying to despread was spread by a fast BPSK signal, the despreading sequence is real and applies equally to both I and Q components.
4.2.2.2
Integrate & Dump: Coherent Integrator
Once the multiplication by the PN sequence has been done, the first thing we should do in the I&D is to translate the resulting values into values that are ready for accumulation. This entails translating every sample to its corresponding value in the ideal ADC transfer function (see Figure 2.1 on page 10) and into 2C numerical representation, so that normal adders can be used. In Section 2 it was seen that a uniform 2-bit ADC behaves almost optimally, so a uniform translation is assumed to be equally good for any ADC word width and is parameterized for a generic width of Nbs bits. We have dubbed this translation "ROM" in Figure 4.4 and in the VHDL design, not because it is physically implemented as a ROM, but because it behaves as such (it could also have been called a LUT).
[Block diagram of IDCoh: DataIn passes through the ROM (ROM_out) into a saturated-arithmetic (SA) adder whose output Acc is registered in a DFF (DataOut); LastAcc is fed back through a 1-clock delay, with Clk and Reset inputs.]
Figure 4.4: Integration part of the Coherent Integrate & Dump described in IDCoh.vhd.
The adder used in the integration is a saturated-arithmetic adder. This means that overflow and underflow are taken into account in the accumulation process. This is necessary to avoid wrapping from large positive values to small negative values when adding two positive numbers (and the other way round with negative values). Identical IDCoh blocks were used for the I and Q components. Although the decimator belongs to this block, it will be commented on later, together with the block that controls it. One thing remains to be dealt with: normalization.
Parameterization
The parameterization of this block has been done in two aspects: the input word width and the output word width. The output word width is parameterized with the ACC_WIDTH generic; at compile time this value becomes a constant and an adder of that width is implemented. Parameterizing the input word width (ACC_WIDTH) means that the "ROM" accepts virtually any input word width and translates it to the necessary width so that accumulation can take place normally. It can be seen from Table 4.5 that this table can be built as a combinatorial transformation of the input values.

2-bit ADC
SIGN/MAG   Value   ROM_out
1 1        +3      0 1 1
1 0        +1      0 0 1
0 0        −1      1 1 1
0 1        −3      1 0 1

3-bit ADC
SIGN/MAG   Value   ROM_out
1 11       +7      0 11 1
1 10       +5      0 10 1
1 01       +3      0 01 1
1 00       +1      0 00 1
0 00       −1      1 11 1
0 01       −3      1 10 1
0 10       −5      1 01 1
0 11       −7      1 00 1

Table 4.5: ROM translation from 2- and 3-bit ADC output in the form of SIGN/MAG to 2C. With some negations and the addition of a trailing 1, a combinatorial conversion truth table is easily obtained. Slanted numbers are simply negated in the conversion.

The translation is the following:
1. The sign bit is directly negated.
2. The magnitude portion of the input is kept for positive inputs and negated for negative inputs.
3. A trailing bit set to 1 is appended to the word (this can be understood since all output values are odd, so the last bit of the word necessarily has to be 1).
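The three translation steps, together with the saturated accumulation, can be modelled bit by bit in Python. This is a behavioural sketch and not the contents of IDCoh.vhd: the function names are hypothetical, and the sign convention (sign bit 1 meaning positive) is the one read from Table 4.5.

```python
def rom_translate(word, n_bs):
    """Sign/magnitude ADC word of n_bs bits (sign bit 1 = positive, as in Table 4.5)
    to an (n_bs + 1)-bit two's-complement bit pattern."""
    mag_bits = n_bs - 1
    mag_mask = (1 << mag_bits) - 1
    sign = (word >> mag_bits) & 1
    mag = word & mag_mask
    out_sign = sign ^ 1                                   # step 1: negate the sign bit
    out_mag = mag if sign == 1 else (~mag) & mag_mask     # step 2: keep or negate magnitude
    return (out_sign << n_bs) | (out_mag << 1) | 1        # step 3: append a trailing 1

def to_signed(pattern, width):
    """Interpret a width-bit pattern as a two's-complement integer."""
    return pattern - (1 << width) if pattern >> (width - 1) else pattern

def saturating_add(a, b, width):
    """Adder that clips at the two's-complement limits instead of wrapping around."""
    hi = (1 << (width - 1)) - 1
    lo = -(1 << (width - 1))
    return max(lo, min(hi, a + b))

# Accumulate a few 2-bit ADC words in a hypothetical 8-bit accumulator.
acc = 0
for adc_word in (0b11, 0b10, 0b00, 0b01, 0b11):
    acc = saturating_add(acc, to_signed(rom_translate(adc_word, 2), 3), 8)
print(acc)
```

Running the translation over all 2-bit and 3-bit input words reproduces the Value column of Table 4.5, and the saturating adder keeps the accumulator at its limit instead of changing sign on overflow.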
4.2.3
Power Detection – Non-Coherent Detection
Since we are in the acquisition process we have no knowledge about timing, phase or frequency. Checking whether the PN sequence phase is aligned is the same as searching for a coarse estimate of the timing of the incoming signal. Therefore, we need to get rid of the other parameters which, if taken into account, could mislead us into believing that the PN sequences are misaligned when they are not. The frequency term is neglected by design, because the coherent integrator does not integrate beyond the point at which Doppler would become harmful. The phase term is eliminated by an envelope detector, i.e. a squarer, so that from that point on detection is no longer affected by phase misalignments.
4.2.3.1
Squarer – Non-Coherent Chip Detection
Squaring a complex signal is commonly known as calculating the squared modulus or squared norm of a vector. The squared modulus is proportional to the power of that signal, hence the fact that the whole design can be seen, from a very simplistic viewpoint, as a sophisticated power detector. Several options were considered for the implementation of the squarer. CORDIC is an extremely efficient algorithm capable of computing, among many other mathematical operations, the norm of a complex vector √(x² + y²), but it requires a LUT of N² bits and N clock cycles for an output precision of N bits. It is not completely clear that this LUT could be shared by all correlators.⁶ Besides, we saw in Section 3.7.1 that the number of necessary non-coherent correlators is Nant Ncorr, and this number could easily be as high as 64 for the case of 8 antennas and 8 parallel searching correlators. So the best options seem to be those that use fewer resources. For our purposes the squaring operation is necessary but the square root is not, since its effect can be accounted for in the thresholds of the non-coherent integrator that follows the squarer. Hence, it was decided that a simple adder per channel, used as a multiplier, would suit our needs while consuming very few resources. Its graphical representation is in Figure 4.5.
⁶ Ideally, the LUT should be shared by several correlators, but we could run out of routing resources, maybe not for the sole implementation of the acquisition module but for the rest of the synchronization system.
[Block diagram of Squarer: the absolute value of DataIn is loaded into reg_data_in and into a left-shift register with preload (abs_data_in); the MSB of the shift register selects, through a MUX, whether reg_data_in or 0 is added to the accumulator temp_data_out, which is left-shifted 1 bit at each step and delivered as DataOut. Control inputs: Clk, Reset and NewCoh2SQR.]
Figure 4.5: Implementation of the squarer as a simple conditional-adder plus shift. It can be found in Squarer.vhd
The algorithm that performs the squaring operation is the following. While there is no new coherent accumulation to square (NewCoh2SQR = 0) we register every coherent integration in two registers, both in absolute value. The first register holds the value of the coherent integration (reg_data_in) to use it as the addend in every summation, while the second holds it in order to left-shift it (shift from LSB to MSB), so that the i-th bit of the word selects whether the accumulation weighted by 2^i has to be done or not. This is the same long-multiplication algorithm that people use by hand, with the simplification that the partial multiplications are binary, i.e. by 1 or 0 (hence we either add or not). In the simplified case of squaring, it is equivalent to accumulating the value to be squared, left-shifted as many positions as the index of the bit we are multiplying by indicates. An equation makes this clearer:

\begin{align}
w^2 = (b_{L-1} b_{L-2} \ldots b_1 b_0)^2 &= (b_{L-1} b_{L-2} \ldots b_1 b_0)\, w \tag{4.1}\\
&= (2^{L-1} b_{L-1})\, w + (2^{L-2} b_{L-2})\, w + \cdots + (2^{1} b_1)\, w + (2^{0} b_0)\, w \tag{4.2}\\
&= \sum_{i=0}^{L-1} (2^{i} b_i)\, w \tag{4.3}
\end{align}
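The expansion in (4.1) to (4.3) can be turned into a small behavioural model in Python. This is a sketch and not the contents of Squarer.vhd: the function name and the width parameter are assumptions, and the loop processes one bit per iteration, MSB first, as one clock cycle would.

```python
def square_shift_add(value, width):
    """Square |value| using only left shifts and conditional additions.
    |value| must fit in `width` bits."""
    w = abs(value)                 # first register: constant addend
    shift_reg = w                  # second register: supplies the bits b_i, MSB first
    mask = (1 << width) - 1
    acc = 0
    for _ in range(width):
        acc <<= 1                              # multiply the running sum by the base (2)
        if (shift_reg >> (width - 1)) & 1:     # current MSB decides whether w is added
            acc += w
        shift_reg = (shift_reg << 1) & mask    # bring the next bit to the MSB position
    return acc

print(square_shift_add(-5, 3), square_shift_add(200, 8))
```

Processing the bits MSB first and doubling the running sum at every step weights the bit b_i by 2^i once the loop finishes, so no explicit multiplier is needed.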
In our design w is the value held by the first register, and it does not change during the squaring process. The second register is used to obtain the b_i values, and the multiplication by 2^i is performed in the accumulation loop, where the value is left-shifted one position, thus multiplying the accumulation so far by the base being used (2 in the binary case). When a pulse is asserted requesting that the coherent output be squared, the values held in the two registers are frozen and used to start the process explained above.
4.2.3.2
Non-Coherent Integrator Implementation
For the implementation of the non-coherent integrator a reduction in the word width has been made.
As we saw in Section 3.6, the tests performed in all the procedures consist in checking whether the accumulated squared output is above a threshold. These comparisons are done either after a fixed amount of time or at every non-coherent integration, depending on the procedure being run. Performing this test is equivalent to performing an accumulated incremental test, which consists in checking that the accumulation of increments, i.e. of the input minus the slope (increment = input − slope), is above 0. This has two favourable effects: the accumulation word can be narrower, and the comparison reduces to checking the sign bit of the result. Conceptually, this is as if we had rotated the lines in Figure 3.11 by ϕ = atan(slope). The initial negative threshold is important in SEQ, since it determines how fast wrong hypotheses are discarded and also the grace period given to good hypotheses that do not have a good start. Setting it very negative would increase the mean acquisition time, but setting it too close to zero could make the test miss a good hypothesis with a bad initial realization. This threshold has negligible effects on VAL. The final implementation can be seen in Figure 4.6 and is contained in the file Dwell.vhd.
[Block diagram of Dwell: a full adder with Carry = 1 computes Increment = Input − Slope, and a saturated-arithmetic (SA) adder adds it to LastAcc to produce Acc (DwellAcc), registered in a DFF; a MUX controlled by Reset selects ThOffset or the delayed Acc as LastAcc. Inputs: Input, Slope, NewCoh, Reset, ThOffset and Clk.]
Figure 4.6: Non-Coherent integrator implementation in Dwell.vhd
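The equivalence between the accumulated incremental test and a comparison against a sloped threshold line can be sketched in Python. This is a behavioural model with hypothetical names, not the contents of Dwell.vhd; the integer slope 26 and offset −11 correspond to the SlopeSEQ and ThSEQ values of Table 4.2 scaled by 16, where both the scaling and the negative sign of the offset are assumptions made for the example.

```python
def dwell_sign_test(inputs, slope, th_offset, width=16):
    """Accumulate (input - slope) starting from th_offset, with saturation,
    and report at every step whether the accumulator is non-negative."""
    hi = (1 << (width - 1)) - 1
    lo = -(1 << (width - 1))
    acc = th_offset
    flags = []
    for x in inputs:
        acc = max(lo, min(hi, acc + (x - slope)))   # saturated-arithmetic adder
        flags.append(acc >= 0)                      # sign bit clear
    return flags

def threshold_line_test(inputs, slope, th_offset):
    """Reference test: compare the plain accumulation with the line n*slope - th_offset."""
    total = 0
    flags = []
    for n, x in enumerate(inputs, start=1):
        total += x
        flags.append(total >= n * slope - th_offset)
    return flags

squared_outputs = [30, 40, 10, 50, 20]   # hypothetical squarer outputs
print(dwell_sign_test(squared_outputs, slope=26, th_offset=-11))
print(threshold_line_test(squared_outputs, slope=26, th_offset=-11))
```

As long as the accumulator does not saturate, both tests give the same decisions, while the incremental form needs a narrower word and only inspects the sign bit.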
Some facts to note from the implementation are:
• The slope is subtracted from the input signal by adding the negative of Slope (remember that in 2C a negative number is built by negating bit-wise and adding 1 to the result).
• The NewCoh input signal behaves as the sampling strobe at the output of the squarer, indicating that a new coherent accumulation has been delivered and squared.
• The saturated-arithmetic adder is used in case there is an overflow in the accumulation. This situation could happen in high C/N0 scenarios or with a too narrow accumulation word.
• Note also that the Reset signal only selects whether LastAcc takes the value ThOffset (initial threshold offset) or the previously accumulated value (Acc).
Actually, the implementation described herein may vary, depending on the FPGA synthesizer and the target FPGA used.
4.2.4
Correlator Control blocks
This is one of the most important points, not only because these blocks control the tests and therefore finally set the mean acquisition time, but also because the previously described blocks need to be carefully controlled and signaled if we do not want to miss any clock cycle, which could make our design unstable. To coordinate the way the previously described blocks interact, they need to be controlled in order. This is the task of the FSMs and of other control blocks that are preferably represented as VHDL code. All the control blocks inside a correlator are described in the following sections: fsmcontrol (FSM) for the dwell control, setnexthyp (VHDL) for all the signaling to the PRN generator, and fsm_coh_ctrl (FSM) for the dumping control of the coherent I&D and the squarer. Basically, an FSM is a block with several inputs and outputs that contains several states in which the block can be at a given time, and some variables to take note of certain facts (counters, boolean values, etc.). Jumping from one state to another is done by means of conditional transitions, which usually have an associated action. In the FSM diagrams presented in the sequel, transitions are arrows labeled with a condition (in underlined italics font) to be fulfilled by the inner variables or block inputs, and the actions associated with that transition, to be performed on variables or output signals, appear in boxes. Note that in VHDL and in all FSMs the counters have always been instantiated as down-counting variables, since this makes the logic synthesized for the comparison operation a simple NAND instead of a subtraction.
4.2.4.1
Dwell Control FSM
The Finite State Machine (FSM) that controls the observation time per hypothesis is depicted in Figure 4.7. It looks a bit more complicated than might be expected for a single correlator, because this is the version that also handles the parallelization of correlators, so some of its signals are needed only in the parallel design. In this section we outline the main points of this FSM; the remaining signals, which enable the FSM of one correlator to interact with the FSMs of the other correlators, are covered in Section 4.3. The variables and signals used in this FSM (for the case of only one correlator) are the following; input/output signals are marked with (i)/(o) and variables with (v):
Naccs (v) Number of non-coherent integrations performed. It is used as the counter of the maximum number of iterations to be performed in the SEQ, VAL, SLC and MAXF states.
[Figure 4.7 (state diagram; entity fsmcontrol, architecture behavioral). States shown: IDLE, SEQ, VAL, LFSR_TRNSF, SLC, SLC_DONE, CORRECT_HYP, MAXF and ACQ_DONE, plus the junction J1. Signal names shown: CLK, Reset, NewCoh, Sign, ChangeHyp, JumpToSL, JumpToMAX, SLReset, NextHyp, AccReset, ValOK, DoubleTHs, SLdone, MAXdone and DwellCtrlFSMState. Variables: Naccs, waitncycles and sl_ntrials. Generic: MAX_SLtrials. On a passed validation (Sign = POSITIV and Naccs = 0), ValOK and DoubleTHs are asserted and waitncycles is loaded with NhypPerChip*LFSRWidth - 1.]
Note in the figure: The LFSR state transfer cannot be handled by GlobalFSM. Every dwell control FSM is required to wait long enough to ensure that no NextHyp is coming, since that would mangle the transferred state.
Figure 4.7: Dwell Control FSM fsmcontrol. It is in charge of checking whether tests are passed or not. It controls all the necessary signaling for the non-coherent observation time and requests the next hypothesis when necessary.
NewCoh (i) A new squared coherent value has been delivered to the non-coherent integrator.
Sign (i) It indicates whether the non-coherent increments accumulated so far are above the slope threshold or not. As commented in Section 4.2.3.2, integrating only increments has the advantage that a sign check is enough to see whether the accumulation has crossed the threshold.
NextHyp (o) This signal is asserted to tell another FSM that it should advance one hypothesis. In the design it was preferable to separate the FSM that deals with the non-coherent observation time (this one) from the one that deals with hypothesis searching. The reason is that this gives more flexibility when, for several reasons, hypothesis jumping is not necessarily regular (as will be seen later in the setnexthyp block).
AccReset (o) Signal devoted to resetting the non-coherent accumulator.
ValOK (o) ValOK stands for Validation Test Passed; it is an output signal that is asserted when the correlator has finally found a good hypothesis.
DoubleTHs (o) This output is mostly used in the SLC to indicate that a more powerful sidelobe has been found, and that therefore the initial threshold (THOffset) and the threshold slope (Slope) have to be doubled to keep on checking for more powerful code sidelobes until only one is found. This last sidelobe found will be considered the mainlobe.
The FSM behaves in the following way:
• From the IDLE state the FSM jumps to the SEQ state to start the first test of the current hypothesis, allowing for a maximum number of non-coherent integrations (Naccs) of Lseq. As long as the tests are above the threshold, i.e. Sign is positive, it remains in this state. As soon as an accumulation renders a negative result, it advances to the next hypothesis by asserting NextHyp and resetting the non-coherent accumulator.
• If the SEQ test has succeeded we can say we have a hypothesis that is likely to be right, so we start the VAL test, which consists in checking only the sign of the accumulation at the end of the integration period. The VAL test is the one that sets the probabilities of miss PM and false alarm PFA. If the result is negative, we jump to the next hypothesis and restart from IDLE (see footnote 7).
• When the VAL test succeeds, it is very likely that the current hypothesis is the good one, but in scenarios with a high C/N0 dynamic range the VAL test could be passing a sidelobe that has as much power as the expected mainlobe of a much weaker signal. Therefore, it is necessary to check whether any other hypothesis is more powerful, that is, whether we have detected a code sidelobe instead of the code mainlobe. The mission of the SLC state can also be seen as avoiding false synchronization with the code sidelobes and forcing synchronization with the mainlobe of the partial correlation. For lack of space, the SLC stage is shown separately in Figure 4.8. It consists of a series of fast sequential dwells that quickly discard those hypotheses that are not powerful enough. Note that in the transition from FASTSEQ to SLFOUND we are implicitly taking note of the estimation as a valid one. This is done in two ways:
1. The variable sl_ntrials is reset to its initial constant value MAX_SLtrials every time a good hypothesis is found. For the case of only one correlator this variable equals the number of hypotheses to be tested. Resetting this variable every time a good hypothesis is found is very convenient, because it ensures that when all the sidelobe checking has been done, the correlator will be set on the last best hypothesis it has found. MAX_SLtrials is not defined in this FSM; it is a VHDL generic (see footnote 8) that is assigned the number of hypotheses to be checked by that correlator in the instantiation inside every correlator (see footnote 9).
2. In the GlobalControl FSM (a state diagram that serves as the orchestra director for all the correlators working in parallel), it must be recorded which correlator was the last one to assert DoubleTHs, since that one will be considered the winner correlator.
• After the SLC procedure we jump to the preparation of the correlators for the MAXF procedure, but that will be explained in detail in Section 4.3, since the MAXF can be performed only when using more than one correlator.
Footnote 7: Note that in the FSM graph there is a red J1 pseudo-state. It is not a state, so it takes no clock cycles. It is called a "junction" and it simply makes the two incoming transitions share some common actions, in our case asserting both AccReset and NextHyp.
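The SEQ and VAL tests described above can be summarised in a short behavioural sketch. It is written in Python rather than VHDL and is not the code of this project. The function name, the return strings and the argument acc_init (the value loaded into the accumulator on AccReset, whose sign convention is not fixed here) are assumptions, as is the choice to treat a zero accumulation as positive. The input must provide at least l_seq + l_val samples.

```python
def dwell_tests(increments, acc_init, slope, l_seq, l_val):
    """Run the SEQ test followed by the VAL test on one hypothesis.

    increments: squared coherent values, one per NewCoh strobe.
    acc_init:   value loaded into the accumulator on AccReset (assumption).
    Returns "NEXT_HYP" if the hypothesis is rejected, "VAL_OK" otherwise.
    """
    samples = iter(increments)

    # SEQ: the sign is checked after every non-coherent accumulation.
    acc = acc_init
    for _ in range(l_seq):
        acc += next(samples) - slope
        if acc < 0:
            return "NEXT_HYP"

    # VAL: the accumulator is reset and only the final sign is checked.
    acc = acc_init
    for _ in range(l_val):
        acc += next(samples) - slope
    return "VAL_OK" if acc >= 0 else "NEXT_HYP"
```

The difference between the two tests shows up when the accumulation dips below zero and recovers: that rejects the hypothesis during SEQ, but is tolerated during VAL as long as the final value is not negative.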
4.2.4.2
Set Next Hypothesis block (setnexthyp)
This is a VHDL-coded block that deals with all the signaling necessary to jump from one hypothesis to the following one. There are several reasons why this block was made independent of the Dwell Control FSM. One reason is that we wanted to keep the dwell control FSM as simple as possible, because it is already complex enough to debug and to verify. Secondly, this block gives more flexibility to the design, since it is constructed in such a way that different ways of advancing and testing hypotheses can be tried. But most importantly, this block comes in very handy when parallelizing the acquisition modules, since it lets the dwell control FSM ask for the next hypothesis transparently even when, because of the parallelization, this task entails different actions depending on which hypothesis is being tested.
Footnote 8: A generic in VHDL is a parameter of a block that can be changed at the time of instantiation at a higher hierarchy level. It thus allows reusability, because one can design general models that depend on values unknown a priori, hence parameterizing designs.
Footnote 9: Later we will see that the last correlator will rarely be checking the same number of hypotheses as the others, since the number of hypotheses to be checked by every correlator is Nhyp = ⌊HS/Ncorr⌋ except for the last one, which takes the remaining HS − (Ncorr − 1)Nhyp hypotheses.
[Figure 4.8 (state diagram; hierarchical level: SLC). Elements shown: Entry, FASTSEQ, SLRESET, SLFOUND and Exit. Naccs is loaded with L_SLC - 1 and decremented on NewCoh = '1'; on the "Naccs = 0 and Sign = POSITIV" transition DoubleTHs is asserted and sl_ntrials is reloaded with MAX_SLtrials (the valid estimation is written down); the procedure exits when sl_ntrials = 0.]
Note in the figure: Every time a sidelobe is found we reset the counter that keeps looking for more sidelobes. Since this counter is the one that allows us to exit from the SLC state, it serves as the pointer to the valid estimation. Thanks to the NextHyp FSM, when we have checked all the hypotheses corresponding to this correlator we can be sure to be at the same hypothesis as the last one found before; this is guaranteed by the action "sl_ntrials := MAX_SLtrials" in the "Naccs=0 and Sign=POSITIV" transition. This means that when leaving the SLC state we just need another control FSM to tell us whether we were the last correlator to find a good hypothesis (so we are the mainlobe) or whether we have to go to a hypothesis close to the one found by the winner correlator. This control FSM memorizes which correlator last doubled the thresholds, and forces the correlators beside the winner Corr(i) to move to the hypotheses: Corr(i-2): HypEstimated - 2; Corr(i-1): HypEstimated - 1; Corr(i+1): HypEstimated + 1; Corr(i+2): HypEstimated + 2.
Figure 4.8: Sidelobe Check Procedure (SLC)
Implementing the correlator according to the simplifications explained in Section 4.2.2 requires some control over the PRN generator depending on Nsc. As was seen in Figure 4.3, where the acquisition receiver of Figure 3.7 had been simplified, we now need a PRN generator capable of delivering PRN waveforms delayed by only half a chip (for the case Nsc = 2) instead of one chip. To do this, it was decided to add some more control lines to the PRN generator and to make the setnexthyp block responsible for them. These control lines are meant to behave as enable signals for the PRN generator, so that it advances one chip if the enable line allows it, or holds the output value if it does not (see footnote 10). This behaviour is very interesting if we are trying to parameterize the design with the number of samples per chip Nsc. In the sequel we dub the behaviour of these control lines Freezing Waves, or FW for simplicity (see footnote 11). The word waves is in plural because there are two LFSRs to freeze, and instead of using a single line to freeze the two LFSRs we take advantage of these two separate enable lines to implement the PRN generator as an initially-delayed-2nd-LFSR Gold code generator like that of Figure 4.2. Therefore, another task to be accomplished by the setnexthyp block is to select the SV code to use depending on the input G2CodeDelay. This input tells the block how many bits the second LFSR should be delayed in order to generate the PRN code matching the selected SV. To these ends, the block works as follows.
The most important action, and therefore the main point we should keep in mind, is that the assertion and deassertion of the freezing waves is carried out by the counter variable. It is a countdown variable (see footnote 12) that keeps the FreezeG1 and FreezeG2 signals asserted while counter ≠ 0, that is, for Nsc − 1 clock cycles, and deasserts them for one single clock cycle when counter equals 0. It is understood that while FreezeG1 or FreezeG2 is asserted the PRN generator is frozen, that is, it holds the same value for more than one cycle; in this way we can be sure that the chip rate at the output of the PRN generator is the same as the input chip rate. By slightly varying the content of this counter we also vary the time the two Freezing Waves are asserted, and thus we jump from hypothesis to hypothesis.
Most of the time every correlator has to deal with only a fraction of the whole HS, so we need to make sure that jumping from the last hypothesis of one correlator takes us to the first hypothesis of the same correlator's sub-HS. This is trivial in a single-correlator design, and it is trivial as well when the PRN runs in free-run mode, in which all the correlators have to go through the whole HS, but it is a little trickier for the case of Ncorr correlators not running in free-run mode. Since hypotheses are phase differences between the input and the PRN generator, we just need to make sure that we keep the PRN of the correlator frozen during the exact number of cycles ∆counter that results from the difference between the last hypothesis and the first. Thus, when we are at the last hypothesis of our sub-HS and jump to the first hypothesis of the same sub-HS, we simply need to add that number of cycles to the counter variable, i.e. ∆counter = (total no. of hypotheses − no. of hypotheses to be checked by this correlator).
Footnote 10: Note also that the behaviour can be completely opposite, so holding the old value for a fraction of a chip can be substituted by letting it run for a fraction of a chip instead.
Footnote 11: The freezing wave concept is also used in the comments of the VHDL source code.
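The freezing-wave counter can be modelled in a few lines. The sketch below is a Python illustration, not the VHDL of setnexthyp. The function names, the argument next_hyp_at (the cycles at which a hypothesis jump is requested) and the choice to jump by freezing the generator for extra cycles are assumptions; as footnote 10 points out, the opposite convention is equally possible.

```python
def freeze_wave(n_sc, n_cycles, next_hyp_at=(), extra_freeze=1):
    """Return the per-cycle enable pattern (True = PRN generator advances)."""
    counter = n_sc - 1                 # countdown: frozen while counter != 0
    pattern = []
    for cycle in range(n_cycles):
        if cycle in next_hyp_at:
            counter += extra_freeze    # hold the PRN longer: hypothesis jump
        if counter == 0:
            pattern.append(True)       # freeze deasserted for a single cycle
            counter = n_sc - 1
        else:
            pattern.append(False)      # freeze asserted, output value is held
            counter -= 1
    return pattern


def wrap_delta(total_hyp, own_hyp):
    """Extra count needed to go from the last to the first hypothesis of a sub-HS."""
    return total_hyp - own_hyp
```

For Nsc = 2 the generator advances every second cycle. Requesting a jump at cycle 2 delays every later advance by one cycle, which corresponds to a shift of one hypothesis (half a chip) between the input and the local PRN.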
The 1 added to ∆counter in the VHDL code is due to the fact that this code is executed when a NextHyp = 1 condition is given, which requires advancing one more hypothesis. There are two more variables to comment on in the VHDL code of this block: updateCurrHyp and hypToRun. The former is used to update the current hypothesis at the same rate as the PRN is being updated; it is incremented in the same way as counter, with the difference that it counts hypotheses in unit increments no matter how many cycles we decide to put between hypotheses (see footnote 13). The latter is used to know how many hypotheses of our sub-HS have been checked, so that we can rewind to the first hypothesis when the whole sub-HS has been explored.
During the initialization of the system, the second LFSR needs to be kept frozen for a number of bits that depends on the selected SV. Note that in the VHDL code this is done by a small piece of code that does not allow the second LFSR to advance until the value of the input G2CodeDelay has been decremented to 0.
Footnote 12: Again, a countdown is preferred because it consumes fewer resources when performing comparisons.
Footnote 13: One possibility for subdividing the HS would be interlaced hypotheses, in which two consecutive hypotheses belonging to the same correlator are separated by hypotheses belonging to other correlators, in the fashion $H_1^{(1)}, H_2^{(1)}, \ldots, H_{N_{corr}}^{(1)}, H_1^{(2)}, H_2^{(2)}, \ldots, H_{N_{corr}}^{(2)}, \ldots$, where $H_i^{(j)}$ is the j-th hypothesis of the i-th correlator. In this project we have not found any real improvement in doing so, and therefore consecutive hypotheses are distributed contiguously: $H_1^{(1)}, H_1^{(2)}, \ldots, H_2^{(1)}, H_2^{(2)}, \ldots, H_{N_{corr}}^{(1)}, H_{N_{corr}}^{(2)}, \ldots$.
4.2.4.3
Coherent I&D Control FSM (fsm_coh_ctrl)
This FSM is in charge of controlling the coherent integrate & dump and synchronizing it with the squarer. It is composed of two smaller FSMs: IDCoh_state and SQRsyncFSM (see Figure 4.9). The former keeps count of how many chips have been integrated and eventually asserts the signal SQRcountdown to tell the latter that it can assert NewCoh2SQR to start the squaring computation, and that it has to start counting the number of cycles necessary for the squaring to take place. The dumping of the coherent I&D is done in a special way, as was implicitly indicated in Section 4.2.3.1. The adders in the coherent integrator are combinatorial, and so is the abs() function in the squarer. So reg_data_in is always registering the absolute value of the coherent integration, and when the NewCoh2SQR signal is asserted, the last registered value is the one that will be squared.
One might ask why there are two FSMs instead of only one, when it seems feasible that a single FSM could take care of all the signaling involved. The reason is that while the squaring is being done, the coherent integrator can start another coherent integration without waiting for the squaring to finish. Besides, when we upgrade the design to exploit the spatial diversity given by an antenna array, we will only need to modify the simple FSM SQRsyncFSM.
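The reason for splitting the control into two FSMs can be seen in a cycle-by-cycle sketch. The following Python model is an illustration only, not the VHDL of fsm_coh_ctrl. The function name, the argument sqr_latency (number of cycles the squarer needs) and the exact cycle offsets are assumptions; the Hold input is ignored, and sqr_latency is assumed not to exceed n_coh.

```python
def coh_ctrl(n_cycles, n_coh, sqr_latency):
    """Return the cycles at which NewCoh2SQR and NewSQR are asserted."""
    counter = n_coh - 1        # chips left in the current coherent integration
    sqrdelay = None            # None while the squarer-sync FSM is idle
    new_coh2sqr, new_sqr = [], []
    for cycle in range(n_cycles):
        # Squarer synchronisation FSM: counts down the squaring latency.
        if sqrdelay is not None:
            if sqrdelay == 0:
                new_sqr.append(cycle)    # squared value is ready
                sqrdelay = None
            else:
                sqrdelay -= 1
        # Coherent integrate & dump FSM: counts chips and dumps.
        if counter == 0:
            new_coh2sqr.append(cycle)    # hand the value to the squarer
            sqrdelay = sqr_latency - 1
            counter = n_coh - 1          # the next integration starts at once
        else:
            counter -= 1
    return new_coh2sqr, new_sqr
```

With n_coh = 4 and a two-cycle squarer, the dumps occur every four cycles regardless of the squaring, and each squared value follows its dump two cycles later, so the integrator never waits for the squarer.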
4.3
Parallelization among Ncorr correlators
As we saw in Section 3.6.4, several correlators can decrease the mean acquisition time if each of them focuses on only a fraction of the whole HS. The parallelization of the acquisition system is a rather delicate task because it requires all the correlators to interact. This can be done in two ways: we could either give some intelligence to every correlator, or we could keep them simple and have them controlled by a super-structure control block. In this project we have chosen the latter approach, and the control block is called GlobalControl. This entails two things with respect to the correlators.
1. Firstly, every correlator has to give information to the GlobalControl block, which combines it in order to know exactly what stage the acquisition is in.
2. Secondly, once this information has been distilled, every correlator is informed of what it should be doing, so the correlators must accept disruptions in their current operation and jump to the stage they are told to go to, as commanded by GlobalControl.
[Figure 4.9 (state diagrams; entity fsm_coh_ctrl, architecture fsm_coh_ctrl_arch). Sub-FSMs: IDCoh_state (variable counter, internal signal SQRcountdown) and SQRsyncFSM (variable sqrdelay). States shown: IDLE, COH_ACC, IDLESQR, SQR_SYNC and WAITSQR. Signal names shown: Clk, Reset, Hold, IDCohReset, NewCoh2SQR and NewSQR.]
Notes in the figure:
+ NewSQR (New Squared sample): there is a new squared (non-coherent) sample that can be integrated by the non-coherent I&D.
+ NewCoh2SQR (New Coherent Accumulation to Square): activates the squarer operation because a new coherent integration appears to be available.
+ sqrdelay (Squarer Delay) is initialized to ACC_WIDTH - 2 because (1) for a signal to stay asserted for N cycles the counter has to be initialized to N - 1, and (2) the sign of the input signal (of width ACC_WIDTH) will certainly be positive after the abs(·) operation performed on DataIn at the squarer.
+ This FSM is very important because it is the one that synchronizes the IDCoh output, the squarer output and the I&D reset. If the squarer is combinatorial (no delay), SQR_SYNC can be bypassed. If the approach selected is an n-cycle squarer (e.g. CORDIC) instead, a second FSM, SQRsyncFSM, is needed in order to activate the squarer.
Figure 4.9: Coherent Control FSM fsm_coh_ctrl. Note that there are two sub-FSMs. The first one, IDCoh_state, takes care of the I&D (number of integrations, dumping and resetting) and by means of SQRcountdown triggers the second FSM, dubbed SQRsyncFSM, whose aim is to indicate when the squaring operation has completed.
4.3.1
Overview of the Parallelization
Parallelization turned out to be a tough task, since there were many control lines from and to the correlators to be taken into account, and also because the design has to be conceived in a parallel fashion instead of the more common sequential way to which programming languages have accustomed us. In the single-correlator case considered previously, there was only one correlator in charge of all the testing procedures. Briefly, in the parallel case, every correlator works on its own in the beginning, up until one correlator claims to have found a hypothesis to be examined more exhaustively. At this point a higher-level control machine (GlobalControl) transfers the PRN-generator state of the winner correlator to a spare PRN generator so that the PLL can start acquiring the carrier frequency offset. After this, the control block tells each correlator to start examining its sub-HS more closely in the search for the mainlobe, i.e. they should jump to the SLC state. Every time a correlator finds a hypothesis that is above the threshold slope, it requests an increment in the current thresholds (both threshold offset and threshold slope) so that all correlators start looking for a more powerful sidelobe. This continues until all correlators have tested their sub-HS, and the last correlator to have requested an increment in thresholds is considered the one with the best hypothesis. At the output of the SLC state the best timing estimation may have drifted a little, so we should perform a parallel test on the delivered estimation and the surrounding estimations to check which of them is the strongest. To perform the parallel test we first make the correlators that will participate in the MAXF procedure jump to the hypotheses at both sides of the winner by means of the CORRECT_HYP state.
After the MAXF procedure is done and the GlobalControl block decides which correlator owns the best timing, the PRN-generator state of the succeeding correlator is again transferred to the PRN generator of a tracking correlator in order to start the DLL and fine-tune the synchronization. At this point we can say that the PRN code of the selected SV has finally been acquired.
Differences with the proposed parallelization scheme in the ATV
It is worth mentioning that the parallelization scheme developed in this project differs from the scheme proposed in the ATV project [1]. The key idea that differs between the two schemes is the PRN-state transfer. In [1], the correlator that happened to find a sufficiently powerful hypothesis was literally frozen to help the PLL start converging. This meant that the sub-HS span covered by the frozen correlator had to be absorbed by another correlator.
However, in this project the PRN-state transfer approach was adopted, since it was found to make the FSMs involved far more hardware friendly, and the hardware burden incurred by adding a spare PRN generator was negligible (only a few registers and some XOR gates are needed). In fact, the burden is not merely negligible but zero in the case of a multiple-channel tracking DS/SS system like GPS, because once an SV has been found and synchronized, the coarse acquisition module needs to be restarted to search for another SV while the previously found SV code is kept in tracking mode.
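The SLC coordination outlined in the overview (thresholds doubled on every find, last requester wins) can be sketched as follows. This is a deliberately simplified Python model, not the GlobalControl logic of this project. It visits the correlators one after another instead of in parallel, replaces the sequential slope test by a single comparison of a hypothetical per-hypothesis metric against one positive threshold, and uses function and argument names that are assumptions.

```python
def sidelobe_check(power, sub_hs, threshold):
    """Return (winner correlator, best hypothesis, final threshold).

    power:     detection metric per hypothesis index (hypothetical values).
    sub_hs:    list with the hypotheses assigned to each correlator.
    threshold: initial threshold, assumed to be positive.
    """
    winner, best = None, None
    found = True
    while found:                        # repeat until a full pass finds nothing
        found = False
        for corr, hyps in enumerate(sub_hs):
            for hyp in hyps:
                if power[hyp] >= threshold:
                    winner, best = corr, hyp    # last one to double wins
                    threshold *= 2              # models DoubleTHs
                    found = True
    return winner, best, threshold
```

In the check below the peak at hypothesis 3 is the last one able to exceed the doubled thresholds, so correlator 0 is reported as the winner.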
4.3.2
Correlator changes
For simplicity and robustness of the design, it was decided that repeatability and reusability were necessary requirements for all correlators. That is, we had to design a single correlator that could be instantiated as many times as necessary in one design, but that would also be useful for any other DS/SS communication system. Thus, it was necessary to design a block that covered all possible differences between correlators, and these differences had to be handled somehow, either by parameterizing them in the form of generic clauses in the code or by using constants, defined in a separate package for easy modification, for the bus widths. To this end, we designed a correlator that is valid for all instances and differs only in some instantiation values, and whose bus widths are defined not numerically but by named constants, so that a slightly different design can be obtained by simply changing the value of those constants. All the constants that define the configurable widths and other options of the design are grouped in a VHDL package that will be described later in Section ??. For a visual impression of the single-antenna parameterized parallel correlator, see the block diagram in Figure 4.10.
The instantiation variables (also called generics) that affect the correlator have to do with which hypotheses are tested by each correlator and the way they are tested. These three generics are:
CorrNUM_HYPO is the size of the hypothesis subspace to be checked by this correlator, i.e. the span of hypotheses covered by this correlator. In the instantiation, all correlators except for the last one cover the same number of hypotheses, the integer part of the division of the whole HS among Ncorr correlators:
$$\mathrm{CorrNUM\_HYPO}_i = \lfloor HS/N_{corr} \rfloor, \qquad i \in [1, N_{corr}-1] \qquad (4.4)$$
and the last correlator takes the remaining hypotheses:
$$\mathrm{CorrNUM\_HYPO}_{N_{corr}} = HS - (N_{corr}-1)\lfloor HS/N_{corr} \rfloor \qquad (4.5)$$
CorrINITIAL_HYP is the first hypothesis of the correlator's subspace. In the parameterization,
$$\mathrm{CorrINITIAL\_HYP}_i = i \cdot \mathrm{CorrNUM\_HYPO}_i \qquad \forall i \neq N_{corr}. \qquad (4.6)$$
CorrINITIAL_WAIT is the number of cycles that the correlator should initially wait before considering that it is at its first hypothesis. It is meant to put each correlator exactly at its first hypothesis when the system is started up.
One might ask why both CorrINITIAL_HYP and CorrINITIAL_WAIT are necessary if they seem to be identical. In the chosen design they are indeed equal, but for the sake of generality they were kept separate to allow different hypothesis-testing schemes. CorrINITIAL_WAIT can differ from CorrINITIAL_HYP when the sub-HS of one correlator is not contiguous to the sub-HS of the correlators at both sides of it. For instance, if we wanted to check hypotheses in an interlaced HS, the hypothesis distribution would be:
$$H_1^{(1)}, H_2^{(1)}, \cdots, H_{N_{corr}}^{(1)}, H_1^{(2)}, H_2^{(2)}, \cdots, H_{N_{corr}}^{(2)}, \cdots \qquad (4.7)$$
where $H_i^{(j)}$ refers to the j-th hypothesis of the i-th correlator. For the case of an interlaced hypothesis space, CorrINITIAL_HYP would not change its value (because from a logical point of view every correlator has a disjoint subset of hypotheses), but the phase difference of the first hypothesis would be different. In this project we have not found any real improvement in doing so; instead, fewer difficulties arise in the hardware implementation if we use consecutive hypotheses distributed contiguously:
$$H_1^{(1)}, H_1^{(2)}, \cdots, H_1^{(\mathrm{CorrNUM\_HYPO}_1)},$$
$$H_2^{(1)}, H_2^{(2)}, \cdots, H_2^{(\mathrm{CorrNUM\_HYPO}_2)},$$
$$H_{N_{corr}}^{(1)}, H_{N_{corr}}^{(2)}, \cdots, H_{N_{corr}}^{(\mathrm{CorrNUM\_HYPO}_{N_{corr}})}$$
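The contiguous distribution of the HS among the correlators can be computed as in the sketch below. It is a Python illustration of the instantiation values, not code from this project. The function name is hypothetical, the correlators are indexed from zero so that the first sub-HS starts at hypothesis 0 (an assumption about the indexing of the formula for the first hypothesis), and CorrINITIAL_WAIT is set equal to CorrINITIAL_HYP as in the chosen design. The value HS = 2046 used in the check is only an example (1023 chips at two samples per chip).

```python
def partition_hypotheses(hs, n_corr):
    """Return one dict of generic values per correlator (zero-based order)."""
    base = hs // n_corr                       # integer part of HS / Ncorr
    plan = []
    for k in range(n_corr):
        if k < n_corr - 1:
            num = base                        # all correlators but the last
        else:
            num = hs - (n_corr - 1) * base    # the last takes the remainder
        plan.append({
            "CorrNUM_HYPO": num,
            "CorrINITIAL_HYP": k * base,
            "CorrINITIAL_WAIT": k * base,     # equal to the initial hypothesis
        })
    return plan
```

For HS = 2046 and four correlators this gives sub-HS sizes of 511, 511, 511 and 513 starting at hypotheses 0, 511, 1022 and 1533, so the subspaces tile the whole HS without gaps or overlaps.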
4.3.3
Dwell Control FSM changes
As previously stated, the parallelization of the design may entail some disruptions in the normal operation of a correlator. These disruptions arise because in the beginning all correlators work independently, but as soon as one finds a valid hypothesis after the VAL procedure, the remaining correlators have to jump to the sidelobe-check procedure (SLC) to keep on refining the validation. This disruption is considered in every dwell control FSM as a possible transition from the states IDLE, SEQ and VAL to the LFSR_TRANSF state (labelled LFSR_TRNSF in Figure 4.7). These transitions have been coloured in red for visual identification in Figure 4.7. The LFSR_TRANSF state is basically a pause state where all correlators have to wait for the PRN transfer to take place from the succeeding correlator to the PRN generator of a tracking correlator. For the parallelization of the design, there are some more facts to be taken into account in the dwell control FSM (fsmcontrol). The additional variables and control lines that are of specific use in the parallelization of the design are the following:
waitncycles (v) This variable is used as a temporary countdown to wait for the PRN transfer to complete.
sl_ntrials (v) It is used in two ways for the same purpose. On the one hand, it keeps count in the SLC of how many hypotheses are left to be checked during this procedure. On the other hand, every time the sequential test performed by the SLC procedure succeeds, it is reset to the value of MAX_SLtrials, which has been assigned the value CorrNUM_HYPO in the instantiation of the FSM, i.e. the number of hypotheses to be checked by that correlator. As we said previously, resetting this variable ensures that when the SLC is over, the correlator is exactly at the best timing estimation it has found.
SLdone (o) As soon as the SLC has ended, this signal is asserted in order to allow the correlators to jump hypotheses one by one covering the whole HS, not only their sub-HS. It is important to note that unless SetNextHyp is told otherwise, hypotheses would keep cycling within the sub-HS covered by that correlator, and therefore the correlator would never reach the expected hypothesis. To circumvent this problem, the dwell control FSM itself has to allow every correlator to run in free-run mode by asserting this signal.
ChangeHyp (i) This input keeps the correlator jumping hypotheses while it is being moved to one of the hypotheses next to the winner. It is important to note that, because there are several blocks between the block that checks that the right hypothesis has been reached (GlobalControl) and the final block that stops the hypothesis jumping (SetNextHyp), there will be some inertial clock delays that have to be accounted for in GlobalControl.
JumpToSL (i) This is a disrupting signal that tells all the correlators, which so far had been working independently, that they should go immediately to the SLC procedure to look for more powerful sidelobes, if there are any. From this point on, all correlators start sharing some information that is given to GlobalControl, which distributes it among the other correlators in the form of signals that make them jump states or change thresholds.
JumpToMAX (i) With this input the GlobalControl block tells the correlator that, since all correlators taking part in the MAXF procedure have reached the SLC_DONE state, the MAXF procedure can start.
MAXdone (o) This signal informs the GlobalControl block that the accumulation output of the non-coherent integrator is valid for comparison with the integrations of the other correlators.
The SLC is a procedure that can be performed even in the case of a single correlator. However, due to its working principle, the MAXF requires more than one correlator. Thus, most of the signaling taking place after the SLC has to do with the case of a receiver made up of several correlators, and from the previous definition of signals and variables we can see that it works in the following way.
After the SLC is done, SLdone is asserted to allow the setnexthyp block to run in free-mode. This means that hypothesis jumps are no longer confined to the sub-HS of each correlator. Then, GlobalControl makes some correlators at both sides of the winner correlator move to the hypotheses next to the best one, because the MAXF needs slight variations of the best timing in order to test them in parallel. This is done while GlobalControl keeps ChangeHyp asserted, so the FSM is kept in the CORRECT_HYP state. Note that in this state NextHyp is held asserted as long as ChangeHyp is asserted, and the state is not left until the same control block deasserts it. When the hypothesis correction is done, the FSM waits until it is allowed to start the MAXF. This test lasts L_MAXF non-coherent accumulations. This constant is defined in a package along with the constants L_SEQ, L_VAL and L_SLC, each of which is the maximum number of non-coherent accumulations done at the dwell block for the SEQ, VAL and SLC procedures, respectively.
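The role of the sl_ntrials counter in the SLC can be illustrated with a short behavioural sketch in Python. It is not the VHDL of this project: the function name `sidelobe_check`, the list `metric` (one accumulated value per hypothesis of the sub-HS) and the rule "a test succeeds when the metric exceeds the best value found so far" are assumptions made only for illustration. What the sketch tries to show is that reloading the counter with the number of hypotheses after every success leaves the correlator on its best hypothesis when the counter reaches zero.

```python
def sidelobe_check(metric, start_hyp):
    """Behavioural model of the SLC trial counter (illustrative only).

    metric    -- hypothetical list with one accumulated value per hypothesis
                 of the sub-HS (its length plays the role of CorrNUM_HYPO)
    start_hyp -- hypothesis validated before the SLC starts
    Returns the hypothesis on which the correlator rests when the SLC ends.
    """
    num_hypo = len(metric)
    hyp = start_hyp
    best = metric[hyp]          # assumed success rule: beat the best so far
    trials_left = num_hypo      # counter loaded with the sub-HS size

    while trials_left > 0:
        hyp = (hyp + 1) % num_hypo      # jump to the next hypothesis (cyclic)
        if metric[hyp] > best:
            best = metric[hyp]
            trials_left = num_hypo      # success: reload the counter
        else:
            trials_left -= 1            # failure: one hypothesis less to check
    return hyp


if __name__ == "__main__":
    # Validated hypothesis 1 is a sidelobe; hypothesis 3 is the main peak.
    print(sidelobe_check([1, 5, 3, 9, 2], start_hyp=1))
```

Because the counter is reloaded with the full sub-HS size, exactly one complete cycle without success follows the last success, so the final jump lands on the winning hypothesis again.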
4.3.4 Set Next Hypothesis (setnexthyp) changes
Only a few changes were made to this block. They have to do with the fact that the correlator needs to jump to a hypothesis outside its sub-HS, and with the fact that, for proper coordination, all blocks need to inform the GlobalControl block of the hypothesis being tested at that moment.
FreeRun (i) When the SLC has been done, the fsmcontrol asserts this line, indicating that the MAXF can be started whenever necessary. Although this particular correlator does not necessarily have to perform a MAXF test, it must be ready in case it is required to.
CurrentHypo (o) This output signal informs GlobalControl of exactly which hypothesis the correlator is testing at that moment.
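The difference between cycling inside the sub-HS and running in free-mode can be sketched as follows. This is a behavioural illustration in Python, not the setnexthyp VHDL: the function `next_hypothesis` and the parameter `total_hypo` (size of the whole HS) are hypothetical names, while `initial_hyp` and `num_hypo` are meant to mirror the generics CorrINITIAL_HYP and CorrNUM_HYPO.

```python
def next_hypothesis(current, initial_hyp, num_hypo, total_hypo, free_run):
    """Return the hypothesis tested after `current` (illustrative model).

    free_run False: the correlator cycles inside its own sub-HS, which is
                    assumed to span num_hypo hypotheses from initial_hyp.
    free_run True : the correlator may walk through the whole HS.
    """
    if free_run:
        return (current + 1) % total_hypo
    offset = (current - initial_hyp + 1) % num_hypo
    return (initial_hyp + offset) % total_hypo


if __name__ == "__main__":
    # A correlator covering hypotheses 5..9 of a 20-hypothesis HS.
    print(next_hypothesis(9, 5, 5, 20, free_run=False))  # wraps inside sub-HS
    print(next_hypothesis(9, 5, 5, 20, free_run=True))   # leaves the sub-HS
```

In this model the value returned at each step is what CurrentHypo would report to GlobalControl.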
4.4 Enhancement: Antenna Arrays
It was shown in Section 3.7.1 that a substantial improvement in performance can be achieved with an array of antennas. The VHDL code designed in this project was written with these considerations in mind but, for lack of time, the array extension could not be completed properly and is therefore not included in this thesis.
4.5 Acquisition Module Simulations
We had a working, completely parameterized code acquisition module simulator written in C, with which “almost” optimum values for the correlator were found. The values that gave good results, and that are related to the number of coherent and non-coherent I&D, are in Table 4.2. For the two cases of C/N0 = 45 dBHz and C/N0 = 40 dBHz we have plotted the mean time of acquisition vs. the probability of acquisition in Figure ??.
4.5.1 Model Simulation and construction
The VHDL model was first constructed with “real” values, to test that the algorithm behaved as expected and as the C simulation indicated. After that, a bit-level VHDL model was constructed and tested block by block, joining parts progressively to see that they behaved as expected. The simulations presented here are for a single case, since simulating one single acquisition in VHDL can take quite long. In Figure 4.12, we have the whole acquisition process. In Figure 4.13, we have the moment just after the validation process succeeds. Note that NEWPRN now has the same sequence as ADSIGN (this signal is information taken from the MATLAB test-vector generator that gives us the true value of the PRN sequence; obviously we would not have this in the real case). In Figure 4.14, we have the end of the acquisition process after the MAXF procedure. Correlator number 2 is the “winner”. In Figure 4.15, we have zoomed into the previous figure to show that the correlator that claims to have the right hypothesis is correct.
[Figure 4.10: block diagram of correlator.vhd (correlator.bde), “Correlator for a single Antenna (BIT_LEVEL)”, Enric M. Calvo, 2003/09/1, Departament de Teoria del Senyal i Comunicacions (UPC). Blocks shown: prn_gen_gps, MultiplyIQ, IDCohIQ, squareriq, dwell, fsmcontrol, fsm_coh_ctrl, setnexthyp. Generics: CorrINITIAL_WAIT : natural := 0, CorrINITIAL_HYP : integer := 0, CorrNUM_HYPO : integer := 0.]
Figure 4.10: Sample bit-level correlator among Ncorr for the case of a single-antenna design.
[Figure 4.11: two panels, (a) C/N0 = 40 dBHz and (b) C/N0 = 45 dBHz; each plots Tacq (sec) and Terror (sec) against Probability.]
Figure 4.11: Simulation in C of the mean acquisition time. Both panels are for the worst-case Doppler of fd = 6000 Hz.
[Figure 4.12: waveform capture, project array_acq_bl, time axis 0 to 170 ms. Signals shown: Reset, ResetSource, CLK, ADSign, LFSRO, NEWPRN, PRNTransfStart, oneValidated, HoldG1, FreezeG1, JumpToSL, JumpToMAX, SLReset, Slope, THOffset, CH, CurrHyp, DwlAccs, BEST_CORR.]
Figure 4.12: The whole acquisition process
[Figure 4.13: waveform capture, project array_acq_bl, time axis 108.23 to 108.3 ms. Signals shown: Reset, ResetSource, CLK, ADSign, LFSRO, NEWPRN, PRNTransfStart, oneValidated, HoldG1, FreezeG1, JumpToSL, JumpToMAX, SLReset, Slope, THOffset, CH, CurrHyp, DwlAccs, BEST_CORR.]
Figure 4.13: After Validation Process Succeeds
[Figure 4.14: waveform capture, project array_acq_bl, time axis 156.21 to 156.26 ms. Signals shown: CLK, ADSign, LFSRO, LFSRO(0) to LFSRO(7), NEWPRN, PRNTransfStart, oneValidated, HoldG1, SLReset, DwlAccs, BEST_CORR.]
Figure 4.14: Zoom just after the MAXF procedure
[Figure 4.15: waveform capture, project array_acq_bl, time axis around 156.22 ms. Signals shown: CLK, ADSign, LFSRO, LFSRO(0) to LFSRO(7), NEWPRN, PRNTransfStart, oneValidated, HoldG1, SLReset, DwlAccs, BEST_CORR.]
Figure 4.15: Zoom to check that the correct epoch is delivered
Chapter 5
Conclusions & Future Work
In this project we have designed and implemented in hardware a generic Direct Sequence Spread Spectrum acquisition unit, which has been particularized to the case of a GPS receiver. While previous works [2] focused on the statistical characterization and the minimization of the mean acquisition time, our work has taken that as the starting point and focused on the hardware implementation issues of such a design. This entails the effects of quantization with few bits, covered in Chapter 2, and the bit-level specification of the modules and the efficient implementation of functions in a hardware-friendly yet flexible way that avoids as much overhead as possible, covered in Chapter 4.
From the beginning, the design has been conceived with the key idea of obtaining largely parameterized VHDL code, so that it could be used for other DS/SS systems. Specifically, the parameters that have been made design dependent are: the number of acquisition correlators Ncorr, the number of PRN generators that will be tracked Ntrack, the number of samples per chip, the number of hypotheses per chip, and the number of bits per sample. Other parameters that can be changed easily include the number of samples that both coherent and non-coherent accumulators will integrate, and both the initial threshold and the slope of the non-coherent accumulators. Also, the width of most of the datapaths depends on constants defined in a VHDL file.
Theoretical aspects related to the way our receiver avoids performing the carrier frequency Doppler search, and the consequences that this has on the maximum number of integrations in the coherent and non-coherent integrators, have also been addressed in Chapter 3. Appendix A presents a deeper study of these effects and of the “nominal optimums”¹ when the signal is data modulated and when it is not. For the data modulation, we have seen that the effects are negligible.
1 We say nominal here because they are approximate values close to the real ones. The real ones can only be found by simulation.
In that same chapter about acquisition we first introduced the theoretical analysis of an already synchronized DS/SS receiver, so that later we could take advantage of that analysis to further simplify the analysis of the acquisition module. Besides, the improvements obtained with an array of antennas have been described in theory but, unfortunately, due to time constraints the hardware implementation of this new configuration has not been achieved. Arrays will help us reduce the variance of the statistics in the non-coherent correlator (see Section 3.7.1). They therefore look promising: for the same C/N0 conditions, they could speed up acquisition proportionally to the number of antennas used, Nant.
The main pros and cons of our design are the following. It simplifies the correlators in the sense that we can get rid of the NCO/phase rotator pairs in the acquisition correlators. However, when mapping into a device we should see how tightly coupled the parallel correlators are. Since the correlators need to deliver and receive information to and from a single block, the routing resources of the device might be exhausted.
The natural continuation of this project is basically twofold. Firstly, the synthesis and device mapping of the generated VHDL code is a must. With it we could check for sure whether the design is sound or needs to be retouched. Secondly, the separation of the blocks necessary to perform the acquisition with an array of antennas is also important, and with it we could then check whether the acquisition is Nant times faster or not, as the equations suggest.
However, there are more aspects that can be improved or changed in this design, many of them related to the implementation of the blocks, e.g. the CMF or the squarer. For instance, the CMF designed in this project is included in the coherent accumulation, which is a simplification that can be made because we are not using pulse shaping. So, one thing that might be required in some DS/SS systems is the implementation of the chip matched filter as an isolated entity, which would make our design useful for a broader range of applications. Unfortunately, including filters in FPGAs may be rather device-dependent, which would subtract flexibility from our design.
Another thing that could be done is to make the C simulation of the mean acquisition time more automatic. The simulation is non-linear, so it requires extensive runs varying some parameters slightly to see whether the result improves or not. Doing this by hand can be tedious, so including genetic algorithms might give close-to-optimum results. A further improvement would be to make the slight changes needed to simulate multiple-antenna receivers.
Appendix A
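Impact of Frequency Doppler on the Correlator
In Equation (3.89) on page 43 we defined the output of the correlator as
\[
\tilde{z}_s(lT_{coh}) \approx e^{j\theta(lN_{coh}T_c)}\, g_N^{*}(\mu_\tau T_c)
\left( \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} c_{|n-m_\tau-i|_P}\, c_{|n-i|_P} \right)
\cdot \left( \frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} d_{\lfloor (lN_{coh}-i)/Q \rfloor}\, e^{-j\omega_d i T_c} \right)
\tag{A.1}
\]
and now our interest is focused on the term
\[
\frac{1}{N_{coh}} \sum_{i=0}^{N_{coh}-1} d_{\lfloor (lN_{coh}-i)/Q \rfloor}\, e^{-j\omega_d i T_c}
= e^{-j\omega_d (N_{coh}-1) T_c/2} \sqrt{D_Q(N_{coh})}
\tag{A.2}
\]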
This term captures the joint effect of the unknown carrier frequency Doppler and the random data modulation at the same time. For fast reference, we only need to know that DQ(Ncoh) is a loss factor that takes these two effects into account and is defined as
\[
D_Q(N_{coh}) = D(N_{coh})\, L_{mod}(Q, N_{coh})
\tag{A.3}
\]
where
\[
D(N) = \left( \frac{\sin(\pi N f_d T_c)}{\pi N f_d T_c} \right)^{2}
\tag{A.4}
\]
\[
L_{mod}(Q, N) = \left( \frac{Q-N+1}{Q} \right)^{2}
\tag{A.5}
\]
On the one hand, Lmod(Q, N) is a power loss factor that accounts for the impact of data modulation when correlations of N samples are performed in a system with Q chips per data symbol (or processing gain). On the other hand, D(N) accounts for the power loss due to the effect that Doppler has on the correlation, further explained in the next subsections. In this appendix we shall study the global effect of these two terms. However, let us first limit our scope to the also common case of no data modulation.
A.1 Without Data Modulation
In this case, the summation of exponentials is easily computed, since it is known that
\[
\frac{1}{N} \sum_{i=0}^{N-1} e^{-j\omega_d i T_c}
= e^{-j\omega_d (N-1) T_c/2}\, \frac{\sin(\pi f_d N T_c)}{N \sin(\pi f_d T_c)}
\tag{A.6}
\]
The power loss due to this term is dubbed degradation D(N),
\[
D(N) = \left( \frac{\sin(\pi N f_d T_c)}{N \sin(\pi f_d T_c)} \right)^{2}
\tag{A.7}
\]
and for N ≫ 1 and πfdTc ≪ 1, this degradation can be approximated by
\[
D(N) \approx \left( \frac{\sin(\pi N f_d T_c)}{\pi N f_d T_c} \right)^{2}
\tag{A.8}
\]
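The closed form of the averaged Doppler phasor and the quality of the approximation of D(N) can be checked numerically with a few lines of Python. This is only a sanity-check sketch: the function names are hypothetical, and the numbers used in the example (the GPS C/A chip rate of 1.023 Mchip/s, a 6000 Hz Doppler and N = 64) are assumptions chosen for illustration.

```python
import cmath
import math


def doppler_sum(n, fd, tc):
    """Direct evaluation of (1/N) * sum_{i=0}^{N-1} exp(-j*2*pi*fd*i*Tc)."""
    return sum(cmath.exp(-2j * math.pi * fd * i * tc) for i in range(n)) / n


def closed_form(n, fd, tc):
    """Closed form of the same sum: a phase term times a ratio of sines."""
    phase = cmath.exp(-2j * math.pi * fd * (n - 1) * tc / 2)
    return phase * (math.sin(math.pi * fd * n * tc)
                    / (n * math.sin(math.pi * fd * tc)))


def degradation_exact(n, fd, tc):
    """Power loss D(N) with N*sin(pi*fd*Tc) in the denominator."""
    return (math.sin(math.pi * n * fd * tc)
            / (n * math.sin(math.pi * fd * tc))) ** 2


def degradation_approx(n, fd, tc):
    """Approximation of D(N) valid for N >> 1 and pi*fd*Tc << 1."""
    x = math.pi * n * fd * tc
    return (math.sin(x) / x) ** 2


if __name__ == "__main__":
    n, fd, tc = 64, 6000.0, 1.0 / 1.023e6   # assumed example values
    print(abs(doppler_sum(n, fd, tc) - closed_form(n, fd, tc)))
    print(degradation_exact(n, fd, tc), degradation_approx(n, fd, tc))
```

For these values the direct sum and the closed form agree to numerical precision, and the two expressions of D(N) differ only in the fourth decimal place.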
Impact on the SNR
The SNR at the output of the coherent I&D was defined in Equation (3.95) (see page 44) as
\[
\mathrm{SNR}_{coh} = \frac{E_c}{N_0}\, \hat{R}(\tau)\, \Delta\mathrm{SNR}
\tag{A.9}
\]
with ∆SNR = Ncoh DQ(Ncoh). A parameterized plot of this ∆SNR (see Figure A.1) shows that in the coherent I&D there is an optimum number of integrations that depends only on the chip-rate/Doppler ratio rc/fd. We see that the maximum ∆SNRmax = 0.23 rc/fd is reached at Nopt ≈ 0.4 rc/fd, and also that ∆SNR completely cancels at Nnul = rc/fd. There are several things to note from this plot and from these equations. Note that at Nnul a null output is produced, since it is the number of samples for which we have been accumulating the Doppler phasor during a complete rotation. Note also that in the coherent I&D the degradation restarts for every new coherent accumulation, contrary to what intuition may suggest. Every new accumulation is independent of all the past ones, and the degradation appears because the similarity (squared modulus of the projection) between the reference (first phasor) and the incoming signal (second phasor) is large for short observation periods but degrades for longer periods. The longer the observation period is, the greater the evidence that the two phasors have slightly different frequencies. Thus, the degradation due to carrier frequency Doppler sets a maximum coherent observation time for the coherent I&D.¹
1 We should also keep in mind that this plot of ∆SNR is for the worst Doppler case, and that at lower Dopplers the increment in SNR will be equal to or above it.
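The position and height of the ∆SNR maximum quoted above can be reproduced with a brute-force search. The sketch below is illustrative only: it assumes no data modulation (so ∆SNR = N·D(N)), uses the approximated D(N), and the names `delta_snr`, `optimum` and `ratio` (standing for rc/fd) are hypothetical.

```python
import math


def delta_snr(n, ratio):
    """Linear SNR gain N*D(N) for a chip-rate/Doppler ratio rc/fd = ratio."""
    x = math.pi * n / ratio          # pi * N * fd * Tc
    return n * (math.sin(x) / x) ** 2


def optimum(ratio):
    """Search N = 1 .. rc/fd for the value that maximizes N*D(N)."""
    best_n = max(range(1, int(ratio) + 1), key=lambda n: delta_snr(n, ratio))
    return best_n, delta_snr(best_n, ratio)


if __name__ == "__main__":
    ratio = 1000.0                   # assumed rc/fd, for illustration only
    n_opt, gain = optimum(ratio)
    print(n_opt / ratio, gain / ratio)   # normalized optimum and maximum
    print(delta_snr(int(ratio), ratio))  # value at N = rc/fd (the null)
```

With this model the maximum appears at about 0.37 rc/fd, which the text rounds to 0.4 rc/fd, and its height is about 0.23 rc/fd.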
[Figure A.1: plot of ∆SNRlin(N) = N · D(N) versus N, annotated with ∆SNRmax ≈ 0.23 rc/fd, Nopt ≈ 0.4 Nnul and Nnul = rc/fd.]
Figure A.1: Normalized plot of ∆SNR (linear).
A.2 With Data Modulation
To account for the fact that low-bitrate data may be modulating the signal, we propose using statistics to see the loss that we may expect on average in the code acquisition stage. We may break the coherent integration in two when a data-sign toggle falls in the middle of a coherent accumulation:
\[
\left| \frac{1}{N} \sum_{i=0}^{N-1} d_{\lfloor (lN-i)/Q \rfloor}\, e^{-j\omega_d i T_c} \right|
= \left| \frac{1}{N} \sum_{i=0}^{N_1-1} e^{-j\omega_d i T_c} - \frac{1}{N} \sum_{i=N_1}^{N-1} e^{-j\omega_d i T_c} \right|
\tag{A.10}
\]
where N1 is a discrete random variable whose statistics are of interest to us and that represents the number of samples that contribute negatively to the coherent accumulation. Note that N1 can take values in the range [0, N/2]. Values beyond N/2 make the other part of the broken accumulation dominant over the negative one. Now we can try to find the probability mass function (p.m.f.) of the random variable N1. To do so, we will assume that the incoming signal and the local replica are aligned, since when they are not, the correlation function R̂(τ) can be considered to produce the maximum loss.
In the correlation procedure, the power of fragments of N samples is computed constantly. The starting chip of the code can be any between the first one (1) and the last one (Q = m1 P), where P is the period of the PRN code and m1 ≥ 1 is an integer that contributes to the processing gain. Only N − 1 possible accumulations out of Q may contain a data-sign toggle,² each with equal probability 1/Q. But the losses are identical for symmetric pairs of broken integrations (i.e., the two cases when: (i) the small fraction is in the first bit while the big fraction is in the second, and (ii) the small fraction is in the second bit while the big one is in the first one). This means that the probability for N1 to take each of the values in [1, N/2 − 1] is 2/Q. The special case N1 = N/2, for N even, has probability 1/Q. And the vast majority of cases, in which no data-sign toggle breaks the correlation, has probability (Q − N + 1)/Q. These probabilities draw a p.m.f. like the one in Figure A.2,³ which resembles a Dirac delta for Q ≫ N. As we said in the previous section, the coherent observation time cannot be long because of Doppler. This means that the number of samples integrated is not high in comparison with the number of chips without a sign toggle. It then seems plausible to consider all possible N1 ≠ 0 negligible, so that the delta at 0 becomes the amplitude loss factor due to data modulation. Thus, we define the power loss factor as
\[
L_{mod}(Q, N) = \left( \frac{Q-N+1}{Q} \right)^{2}
\tag{A.11}
\]
and we can see that in many practical situations this factor is negligible. For instance, in GPS Q = 20 · 1023 = 20460 and Nopt ≈ 85, yielding a loss factor Lmod = 0.99181 ≈ 1, which is a loss of about 0.036 dB. In comparison with any other losses in the system, this loss is negligible.
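The probabilities described above and the GPS figures for Lmod can be checked with a short Python sketch. The function names are hypothetical, and the handling of odd N (values 1 to (N − 1)/2, each with probability 2/Q) is an assumption that extends the even-N description in the text.

```python
import math


def n1_pmf(q, n):
    """Probability mass function of N1 as a dict {value: probability}."""
    pmf = {0: (q - n + 1) / q}            # no data-sign toggle inside
    for k in range(1, (n - 1) // 2 + 1):  # symmetric pairs of broken sums
        pmf[k] = 2 / q
    if n % 2 == 0:
        pmf[n // 2] = 1 / q               # toggle exactly in the middle
    return pmf


def l_mod(q, n):
    """Power loss factor ((Q - N + 1) / Q)^2 due to data modulation."""
    return ((q - n + 1) / q) ** 2


if __name__ == "__main__":
    q, n = 20 * 1023, 85                  # GPS values quoted in the text
    loss = l_mod(q, n)
    print(sum(n1_pmf(q, n).values()))     # total probability
    print(loss, -10 * math.log10(loss))   # loss factor and loss in dB
```

The masses add up to one for both even and odd N, and for Q = 20460 and N = 85 the loss factor and its value in dB match the figures quoted in the text.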
2 Note that we are not making any assumption about the possibility that N be a submultiple of Q. Generally we should look for N and P relatively prime, so that we may assume equal probability for every code chip to be the first one in the integration.
3 Note that, to be rigorous, the probability at each discrete value should have been multiplied by the loss associated with that value. However, its overall effect has been considered negligible.
[Figure A.2: plot of pN1(N1) versus N1, with marked probability levels (Q − N + 1)/Q and 2/Q and axis marks at N/2 and N − 1.]
Figure A.2: Probability Mass Function of the r.v. N1. Despite the aspect ratio, this PMF can be regarded as a delta function.
Appendix B
Acronyms
2C’s Two’s complement
A/D Analog to Digital Converter
ADC Analog to Digital Converter
ASIC Application Specific Integrated Circuit
ATV Automated Transfer Vehicle
AWGN Additive White Gaussian Noise
BPSK Binary Phase Shift Keying
C/A Code Coarse Acquisition Code
CDMA Code Division Multiple Access
CMF Chip Matched Filter
CORDIC COordinate Rotation DIgital Computer
DLL Delay-Locked Loop
DOA Direction of Arrival
DS/SS Direct Sequence Spread Spectrum
FDMA Frequency Division Multiple Access
FIR Finite Impulse Response
FFT Fast Fourier Transform
FPGA Field-Programmable Gate Array
FSM Finite State Machine
HS Hypotheses Space, or Hypotheses Search Space
I&D Integrate & Dump (integrator + decimator)
IID Independent and Identically Distributed
IF Intermediate Frequency
ISI Inter-Symbol Interference
LSB Least Significant Bit
LUT Look-Up Table
MA Moving Average
MAG MAGnitude bit in a 2-bit ADC
MSB Most Significant Bit
P Code Precision Code
P(Y) Code Encrypted Precision Code
PDF Probability Density Function
PMF Probability Mass Function
PN PseudoNoise
PLL Phase-Locked Loop
PPS Precision Positioning Service
PRN PseudoRandom Noise-like
QPSK Quaternary Phase Shift Keying
RX Receiver
S/A Selective Availability
SIGN SIGN bit in a 2-bit ADC
SPS Standard Positioning Service
sub-HS Hypotheses Subspace
SV Space Vehicle
TDMA Time Division Multiple Access
TED Timing Error Detector
TOA Time Of Arrival
TOT Time Of Transmission
TX Transmitter
VHDL VHSIC Hardware Description Language
VHSIC Very High Speed Integrated Circuit
Bibliography
[1] J. Sala, J. Riba, G. Vázquez, F. Rey, O. Muñoz, “Algorithmical Design and Bit Level Specification of the ATV S-Band Spread Spectrum Demodulators,” Alcatel Espacio, Feb. 2000.
[2] J. Villares, “Sincronismo de Código en Transponders DS/SS para el Automated Transfer Vehicle (ATV),” Ms.Thesis, Universitat Politècnica de Catalunya, 1998.
[3] A. J. Viterbi, CDMA Principles of Spread Spectrum Communication. Reading, MA: Addison-Wesley, 1995.
[4] B. Parkinson and J. Spilker, Global Positioning System: Theory and Applications, Volume I. Washington, D.C.: American Institute of Aeronautics and Astronautics, 1996.
[5] M.K. Simon, J.K. Omura, R.A. Scholtz, and B.K. Levitt, Spread Spectrum Handbook. New York: McGraw-Hill, 1994.
[6] R. B. Ward, “Acquisition of Pseudonoise Signals by Sequential Estimation,” IEEE Trans. on Commun., vol. 13, pp. 475-483, Dec. 1965.
[7] U. Mengali, A. N. D’Andrea, Synchronization Techniques for Digital Receivers. Plenum Press, 1997.
[8] F. Amoroso, “Adaptive A/D Converter to Suppress CW Interference in DSPN Spread-Spectrum Communications,” IEEE Trans. on Commun., vol. COM-31, pp. 1117-1123, Oct. 1983.
[9] J. J. Spilker, Jr., Digital Communications by Satellite. Englewood Cliffs, NJ: Prentice-Hall, 1977.
[10] Anonymous, GP2010 “GPS Receiver Hardware Design Application Note”, www.zarlink.com
[11] S. M. Kay, Fundamentals of Statistical Signal Processing: Detection Theory. Upper Saddle River, NJ: Prentice-Hall, 1998.
[12] W. Namgoong, T.H. Meng, “Minimizing Power Consumption in DS/SS Correlators by Resampling IF Samples – Parts I & II,” IEEE Trans. on Circuits and Systems, vol. 48, no. 5, pp. 450-470, May 2001.
Index
Symbols
C/N0, 29
(CMF), 33
A
asynchronous sampling, 22
B
broadside, 55
C
cell, 17
channels, 49
chip matched filter, 33
coherent, 20
conditional transitions, 73
Cycle and Add, 63
D
Decimation, 24
decimation, 22
Delay-Locked Loop, 39
Direction of Arrival, 55
DLL, 39
DOA, 55
dwell, 18
E
envelope, 70
F
Finite Impulse Response, 24
Finite State Machine, 49, 62
FIR, 24
Freezing Waves, 77
FSM, 49, 62
FW, 77
G
generic, 76
H
hypothesis, 17
Hypothesis Space, 19
hypothesis space, 17
I
Integrate & Dump, 49
Interpolation, 24
interpolation, 22
L
LFSR, 3
  characteristic polynomial, 4
M
MAG, 9
matched filtering, 16
MAXF, 54
Maximum-Finding Procedure, 54
moving average, 38
Multiple Dwell, 18
N
noncoherent, 20
P
Phase-Locked Loop, 39
PLL, 39
polynomial
  characteristic, 4
power detector, 62
PRN-state transference, 81
processing gain, 3
Pulse shaping, 6
S
sampling period, 22
saturated arithmetics, 68
SEQ, 50
Sequential Test, 18, 50
Sidelobe-Check Procedure, 53
SIGN, 9
Single Dwell, 18
SLC, 53
Sliding Correlator, 16
SMF, 37
spreading factor, 3
Symbol Matched Filter, 37
symbol period, 22
T
TED, 28
test statistic, 18
time observation, 18
Timing Recovery, 22
tracking correlators, 49
U
Uncertainty Region, 19
V
VAL, 51
Validation Dwell, 51