Week 05: CORDIC and Some Remaining Pynq
CORDIC Time
Design and Build a CORDIC
We, as in you, are going to build a CORDIC. I showed a fun Python implementation in class the other day:
import sys
import math

# precompute the per-stage rotation angles, in degrees
cordic_angles = [180/math.pi*math.atan(2**(-i)) for i in range(17)]

x = int(sys.argv[1])
y = int(sys.argv[2])
z = 0

# actual run-time:
for i in range(16):
    if y > 0:  # sgn(y) == 1
        xn = x + 1/(2**i)*y
        yn = y - 1/(2**i)*x
        zn = z - cordic_angles[i]
    else:  # sgn(y) == -1
        xn = x - 1/(2**i)*y
        yn = y + 1/(2**i)*x
        zn = z + cordic_angles[i]
    print(f"x:{xn}, y:{yn}, z:{zn}")
    x = xn
    y = yn
    z = zn

print(x/1.646)  # divide out the CORDIC gain to get the magnitude
print(z)
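The floating-point loop above uses divisions by powers of two, which in hardware become arithmetic right shifts. As a rough sketch of what an integer-only version might look like (not the version from class): the function name `cordic_vectoring`, the 2**16-counts-per-circle angle scaling, and the choice to add the stage angle when y > 0 (so that z reads directly as the vector's angle, the opposite sign of the listing above) are all my own assumptions.

```python
import math

STAGES = 16
# Assumption: a full circle (2*pi) is represented as 2**16 counts.
ANGLES = [round(math.atan(2.0**-i) * 2**16 / (2 * math.pi)) for i in range(STAGES)]

def cordic_vectoring(x: int, y: int) -> tuple[int, int]:
    """Integer vectoring-mode CORDIC for inputs with x >= 0 (quadrants 1 and 4).

    Returns (magnitude still scaled by the CORDIC gain, angle in counts).
    Python's >> on negative ints is an arithmetic shift, like >>> on a
    signed value in SystemVerilog.
    """
    z = 0
    for i in range(STAGES):
        if y > 0:
            # rotate clockwise to drive y toward 0
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
        else:
            # rotate counter-clockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
    return x, z

mag, ang = cordic_vectoring(1000, 1000)
print(mag / 1.646, ang)  # magnitude near 1414, angle near 8192 (45 degrees)
```

Each loop iteration corresponds to one pipeline stage in the hardware version. With inputs this small the truncation from the shifts is visible in the last digit or two, which is one reason to compare against an epsilon rather than exact values.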
Specification
You are to build an AXI-streaming CORDIC module that calculates the angle and magnitude of a 2D vector. The module can have an interface identical to that of the j_math, fir_15, etc. modules of the past few weeks, with a 32-bit AXIS S port, where data is fed in, and a 32-bit AXIS M port, where it sends out the resulting data. The standard signals, including tready, tvalid, tlast, and tstrb (always set to 0xF), should be included. Feel free to parameterize things, but you can also hardcode it for 32-bit usage and that's totally fine.
The module should take in a 32-bit tdata on its S port, which is to be interpreted as two 16-bit signed integers representing the orthogonal measurements. Bits 0 through 15 carry the X/Real/I dimension value and bits 16 through 31 carry the Y/Imaginary/Quadrature dimension. Using these two dimensions as the input, your module should run a sixteen-stage CORDIC which generates the angle (a 16-bit unsigned or signed representation of angle going from 0 to 2π or -π to π, respectively [your choice]) and the magnitude (a 16-bit unsigned value). The order of these two terms in the output word is up to you. I put magnitude in the lower 16 bits and angle in the upper 16.
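For building test vectors, the input packing described above can be sketched in Python like this. The helper names `pack_iq` and `unpack_iq` are made up for illustration; only the bit layout comes from the spec.

```python
def unpack_iq(tdata: int) -> tuple[int, int]:
    """Split a 32-bit word into signed 16-bit (x, y): x in bits 15:0, y in bits 31:16."""
    def to_signed16(v: int) -> int:
        v &= 0xFFFF
        return v - 0x10000 if v & 0x8000 else v
    return to_signed16(tdata), to_signed16(tdata >> 16)

def pack_iq(x: int, y: int) -> int:
    """Pack signed 16-bit x and y into one 32-bit word (two's complement)."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)

word = pack_iq(3, -4)
print(hex(word), unpack_iq(word))  # 0xfffc0003 (3, -4)
```

The same masking works for unpacking the output word in your monitor, whichever order you pick for magnitude and angle.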
The module should be fully pipelined, meaning it should have sixteen stages of CORDIC in it. Note that the last few stages of the angle accumulation will likely be useless, since their angle increments are less than 1 (and therefore 0), but the module still needs 16 stages to ensure the magnitude gets fully resolved. Both angle and magnitude are found using a CORDIC in circular vectoring mode (the same version we looked at in lecture 08), so you don't need to have two CORDICs running in parallel. One properly designed CORDIC will produce both values.
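To see where the angle increments run out, here is a small sketch that builds the per-stage angle table as integers. It assumes a full circle is 2**16 counts and that entries are rounded to nearest; a different angle format or truncation will move the cutoff by a stage or so.

```python
import math

ANGLE_BITS = 16  # assumption: 2*pi radians maps to 2**16 counts
SCALE = 2**ANGLE_BITS / (2 * math.pi)

# One entry per CORDIC stage: atan(2**-i) expressed in counts.
angle_table = [round(math.atan(2.0**-i) * SCALE) for i in range(16)]

for i, a in enumerate(angle_table):
    print(f"stage {i:2d}: {a}")
```

With this scaling, stage 0 is 8192 counts (45 degrees) and the stage 15 entry rounds to 0, so that stage only refines the magnitude.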
Because of limitations discussed in class, the CORDIC will only work reliably in quadrants 1 and 4. If your input vector is in quadrant 2 or 3, you should rotate it by π radians into quadrant 4 or 1, respectively, before the first stage of the CORDIC, and then be sure to un-rotate your angle calculation at the end of the pipeline. Magnitude shouldn't be impacted by any of this (thankfully). This will likely mean you need to pipeline this rotate/no-rotate signal along with the main pipeline of CORDIC math.
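One way to model the rotate/un-rotate step in a Python reference model is sketched below. The function names and the `flipped` flag are hypothetical, and the un-rotate assumes an unsigned angle with 2**16 counts per circle, so π is 0x8000.

```python
def pre_rotate(x: int, y: int) -> tuple[int, int, bool]:
    """Rotate by pi (negate x and y) when the vector is in quadrant 2 or 3 (x < 0).

    Note: negating -32768 does not fit in 16 signed bits, so the hardware
    datapath needs at least one extra bit after this step.
    """
    flipped = x < 0
    if flipped:
        x, y = -x, -y
    return x, y, flipped

def post_rotate(angle: int, flipped: bool) -> int:
    """Undo the pi rotation on the final angle and wrap to 16 bits."""
    if flipped:
        angle += 0x8000
    return angle & 0xFFFF

print(pre_rotate(-3, 4))        # (3, -4, True): quadrant 2 lands in quadrant 4
print(post_rotate(0xF000, True))  # 28672, i.e. 0x7000
```

In hardware, `flipped` is the one-bit signal that rides along the pipeline next to x, y, and z.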
CORDIC Gain
CORDICs have a gain of approximately 1.646 due to the pseudorotations. Be sure to account for that through either multiplying/dividing after the calculations are done or pre-scaling and then post-dividing (see lecture for numbers for a 16 bit operation).
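If you want to check the gain number yourself, it is the product of the per-stage stretch factors sqrt(1 + 2^(-2i)). The sketch below computes it for 16 stages and shows one possible fixed-point correction; the Q16 reciprocal constant and the name `remove_gain` are my assumptions, not the numbers from lecture.

```python
import math

STAGES = 16
# Each pseudorotation stretches the vector by sqrt(1 + 2**(-2*i)).
K = math.prod(math.sqrt(1 + 2.0**(-2 * i)) for i in range(STAGES))
print(K)  # about 1.6468

# Assumption: correct the gain with one multiply by 1/K held as a Q16 constant.
INV_K_Q16 = round((1 / K) * 2**16)

def remove_gain(raw_mag: int) -> int:
    """Scale a gain-inflated magnitude back down: (raw * (1/K)) in fixed point."""
    return (raw_mag * INV_K_Q16) >> 16

print(remove_gain(16468))  # close to 10000
```

A multiply by a constant followed by a shift maps onto a single DSP slice, which is why the reciprocal form is usually preferred over an actual divider.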
Latency
Unlike the FIR or jmath, there is no way to generate an output from a valid input after only one clock cycle. This should end up being an approximately 16-stage pipelined system (so a throughput of 1 sample per clock cycle and a latency of 16 clock cycles). If you need a couple of extra cycles to take care of the gain and/or angle rotations, that's fine. The big thing we care about is it being fully pipelined; a few extra cycles of latency are much, much less of a concern. Make sure signals like tlast and tstrb get conveyed along with the main data payload appropriately.
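A simple way to predict when outputs (and their tlast) should appear in your testbench is to model the pipeline as a fixed-length shift register. This is only a sketch: the class name is made up, the depth of 16 is an assumption you should change to match your actual latency, and it ignores backpressure (it assumes the downstream tready is always high).

```python
from collections import deque

STAGES = 16  # assumption: total pipeline latency in clock cycles

class PipelineModel:
    """Shift-register model: whatever goes in comes out `depth` clocks later."""

    def __init__(self, depth: int = STAGES):
        # None marks an empty (invalid) pipeline slot.
        self.regs = deque([None] * depth, maxlen=depth)

    def clock(self, beat):
        """Advance one clock. `beat` is (tdata, tlast, tstrb), or None if no valid input."""
        out = self.regs[-1]          # oldest slot leaves the pipeline
        self.regs.appendleft(beat)   # newest beat enters; maxlen drops the old one
        return out

pipe = PipelineModel()
outs = [pipe.clock((i, i == 2, 0xF) if i < 3 else None) for i in range(20)]
print(outs[16], outs[18])  # (0, False, 15) (2, True, 15)
```

The point is that tlast and tstrb sit in the same slot as their data the whole way through, so they pop out on the same cycle.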
Testing
You should be able to use your AXIS testing framework from last week with some minor modifications to your models and testing (in fact, the models should be much easier to integrate than scipy's lfilter). You will likely not be able to get this bit-accurate with a numpy version, so it is OK to test that results are within an epsilon or so of the true answers. My magnitude, for example, always ended up being a little bit smaller than the true answer. It was good enough. If you'd like to use a Scoreboard with this, you can override the compare method in a Scoreboard subclass.
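The epsilon check itself can be plain Python that your overridden compare method (or a simple assert in the test) calls. Here is a minimal sketch; the function names, the (magnitude, angle) tuple layout, and the default tolerances of 4 counts are all placeholders. The one subtle part is that angles wrap, so 65535 and 0 should count as one step apart for a 16-bit angle.

```python
def angle_close(got: int, exp: int, eps: int, bits: int = 16) -> bool:
    """Compare two angles modulo 2**bits so values near the wrap point still match."""
    mod = 1 << bits
    diff = (got - exp) % mod
    return min(diff, mod - diff) <= eps

def result_close(got, exp, mag_eps: int = 4, ang_eps: int = 4) -> bool:
    """got and exp are (magnitude, angle) tuples of integers."""
    got_mag, got_ang = got
    exp_mag, exp_ang = exp
    return abs(got_mag - exp_mag) <= mag_eps and angle_close(got_ang, exp_ang, ang_eps)

print(result_close((4998, 1), (5000, 65534)))  # True: 2 counts off in magnitude, 3 in angle
print(angle_close(100, 110, 4))                # False
```

Pick the tolerances from what your own design actually produces rather than these placeholder values.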
Testing on Vivado
You've all experienced the differences in Verilog interpretation that can come up between Icarus Verilog and Vivado. To help with that, it sure would be nice if we could run our Cocotb verification framework using Vivado's Verilog interpreter, since that's ultimately what we're putting things on. Thankfully, there is vicoco. This should be installed on the lab machines in the 6.S965 area, and you're free to run it. All you need to do is:
- At the top of your file(s), replace the line: from cocotb.runner import get_runner
  with: from vicoco.vivado_runner import get_runner
- Replace the line: sim = os.getenv("SIM", "icarus")
  with: sim = os.getenv('SIM','vivado')
- Make sure your System/Verilog files have timescales in them (like
Then it should just run, except use Vivado instead of icarus.
If you're extra lazy and find yourself not in lab, but want to test stuff, the server build framework we use for 6.205 is also at your disposal and for 2025 supports remote simulations with Vivado! Installation instructions are here. You should already have an account on the fpga3.mit.edu/lab-bc2 endpoint, but if not, reach out to Joe.
To run simulations, all one needs to do then is (assuming you're already in a folder with your Verilog in a folder called hdl
...
lab-bc simulate ./ sim/test_cordic_with_vivado.py
Note that some of the local text highlighting that Cocotb naturally does may get lost in transit (but the text will still be there). Waveforms will still show up in sim_build, etc. Use these sims especially to catch sign errors and other things before you throw this onto the FPGA in another week.
OK make it happen.