# Assembly Language Programming

## September 18, 2011

### How To Implement CORDIC

Filed under: Assembly Language, JavaScript, Mathematics — Tags: , , , , — programmer209 @ 3:20 pm

## References

A good discussion of the CORDIC algorithm can be found here:

The original 1959 paper by Volder:

## CORDIC

CORDIC (Coordinate Rotation Digital Computer) is an algorithm that enables the calculation of trigonometric functions using only addition and bit shifts. The method was originally invented in the fifties by Jack Volder for aircraft navigation systems.

The basic method as described by Volder has 2 modes – Rotational and Vectoring – corresponding to the 2 basic problems this algorithm is able to solve:

Rotational: given an angle θ, find Cos(θ) and Sin(θ).

This problem is solved by initially defining a vector of unit length that is aligned with the x-axis. This vector is then rotated until its angle with respect to the x-axis is θ. The coordinates (x,y) of the rotated vector will then be equal to Cos(θ) and Sin(θ) respectively.

Vectoring: given a vector (x,y), find R (the magnitude) and θ.

The vector (x,y) is just rotated until it is aligned with the x-axis. The new value of x will then give the magnitude of the vector R, while θ will be the angle the vector was rotated through to become aligned with the x-axis.

## The Mathematics of Cordic

When a vector (x,y) is rotated by an angle θ, the new coordinates (x’,y’) are given by the equations:

x’ = x.Cos(θ) – y.Sin(θ)

y’ = y.Cos(θ) + x.Sin(θ)

Dividing through by Cos(θ) gives:

x’/Cos(θ) = x – y.Tan(θ)

y’/Cos(θ) = y + x.Tan(θ)

Using the identity:

1/Cos(θ) = sqrt(1 + Tan2(θ))

Gives:

x’.sqrt(1 + Tan2(θ)) = x – y.Tan(θ)

y’.sqrt(1 + Tan2(θ)) = y + x.Tan(θ)

Cordic uses this last form of the equations to implement vector rotations. But they need to be further adjusted so that they can be implemented using only additions and bit shifts. It is these adjustments that turn Cordic into an iterative method for solving the 2 basic problems defined above.

The first adjustment is to restrict the values that Tan(θ) can take on, to powers of negative 2. In other words, Tan(θ) will only be allowed to take on the values 1, 1/2, 1/4, 1/8, … etc. This allows the terms x.Tan(θ) and y.Tan(θ) to be expressed as bit shifts:

x.Tan(θ) = x >> i

y.Tan(θ) = y >> i

Where Tan(θ) = 1/pow(2,i) for some i >= 0.

This means that only specific angles of rotation are allowed, and that any required rotation of a vector will have to be built up by summing these specific angles. So multiple rotations (iterations) will be required to reach the target angle of rotation.

The second adjustment is to deal with the factor: sqrt(1 + Tan2(θ)). It is handled by defining x” and y” as follows:

x” = x’.sqrt(1 + Tan2(θ))

y” = y’.sqrt(1 + Tan2(θ))

And the equations become:

x” = x – (y >> i)

y” = y + (x >> i)

These equations describe a pseudo-rotation where the angle of rotation remains the same, but where the length of the vector will grow by the removed factor.

For any given number of iterations, the growth in the length of the vector can be accounted for by pre-computing a factor known as K. Given that:

sqrt(1 + Tan2(θ)) = sqrt(1 + 1/pow(4,i))

Then for N pseudo-rotations, K is computed by multiplying all of these factors together for i = 0 to N. Since 1/pow(4,i) converges to zero, this means that the K factor will converge to a constant value after a given number of iterations.

In my Javascript code below, K converges to a value of 1.6467602581210654 after 26 iterations.

There is one final detail to be considered before the final form of the equations for the Cordic pseudo-rotations can be defined. This is the direction of the angle of pseudo-rotation, which can be either positive or negative, which means that Tan(θ) will be either positive or negative. This is handled by including the term d(i) in the equations, which will be either 1 or -1 depending on the direction of rotation. So the final form of the equations for a pseudo-rotation is:

 ```x(i+1) = x(i) - d(i).(y(i) >> i) y(i+1) = y(i) + d(i).(x(i) >> i) z(i+1) = z(i) - d(i).atan(pow(2,i)) ```

The z term is the current angle of rotation for the vector (x,y). The atan term gives the angle being added for each i.

For each of the 2 modes of Cordic, the above equations are applied as below. The K length factor is accounted for by building it into the initial vector, which means that the final vector will be of the correct length.

## Rotational Mode

 ```Find Cos(θ) and Sin(θ), where -90o <= θ <= 90o: Start with the vector x = 1/K and y = 0. Rotate the vector towards the target angle θ as follows: - if z(i) < θ, set d(i) = 1. - if z(i) >= θ, set d(i) = -1. ```

## Vectoring Mode

 ```Find (R,θ) for the vector (x,y), where x >= 0: The target angle is 0 (the x axis). Start with the vector x(0) = x/K and y(0) = y/K. Rotate the vector towards the target angle 0 as follows: - if z(i) < 0, set d(i) = 1. - if z(i) >= 0, set d(i) = -1. ```

## Implementation

The aim of this project is to implement Cordic in assembly language using only shifts, adds and table lookups. To do this I will be using my PDP-11 assembly language simulator.

## Javascript – Floating Point

To begin with I write a version in Javascript using floating point numbers, to demonstrate the basic approach to implementing Cordic. Since this is using floating point numbers, there is no need to use the bit shifts as discussed above.

Note: input angles are in degrees.

 ``` CORDIC

```

## Javascript – Fixed Point

Next, the above code is modified such that the values x, y, z and theta are represented as 32-bit signed integers (as they would have to be in an assembly language program targeted to a processor of limited power that did not have a Floating Point Unit). The scheme where a floating point number is represented by a large integer is known as fixed point.

The floating point values are scaled up to 32 bit signed integers as follows:

For the rotational mode, x and y are scaled up to a 32-bit signed integer by multiplying by 2 to the power of 30. In other words, x and y are simply bit shifted to the left by 30 bits. This is valid because |x| and |y| (since they represent Cos(θ) and Sin(θ)) will never be greater than 1. Theta is scaled up in the same way, but is first divided by 100 (so that it can never be more than 1).

For the vectoring mode, x and y are allowed to be greater than 1. Therefore, both x and y are divided by the factor (x+y) before being scaled up to large integers. Theta is scaled up in the same way as it was for the rotational mode.

Since the core logic in this code (the Cordic iterations) is to be implemented in PDP-11 assembly language, the program outputs the scaled integer values in octal, which gives something against which the PDP-11 code can be tested.

Note that the 2 versions of the Javascript program given in this post both use 32 iterations to calculate Cordic. This is the correct number of iterations to use for a value whose precision is 32 bits. If any more than 32 iterations were used, it would have no effect on the final result because in the cordic equations the result of the bit shifts would be zero once i reached values of 32 and above.

 ``` CORDIC

```

## PDP-11 Implementation – 32 Bit

The PDP-11 assembly language code here was written using my PDP-11 Assembly Language Simulator.

This code only implements the rotational mode of Cordic. The inputs and outputs from this code are the 32-bit fixed point integer representations of the floating point values (x, y, z) and theta. These values are scaled as described above. The PDP-11 code here does not do the actual scaling of course.

Since the PDP-11 is a 16-bit machine, but we are using 32-bit integers, the code below uses double words.

The value x is initialised to X_H = 23335 and X_L = 35552, which is the 32-bit octal representation of 1/K. The values y and z are initialised to zero. No matter what the input angle, (x, y, z) are always initialised to these values.

For this example, theta is set to -30 degrees. In the fixed point octal representation this is: TH_H = 166314 and TH_L = 146315. This is the only part of the code that needs to be changed, if a different input angle theta is wanted.

The set of angles of rotation for each i are stored as an array in the code, starting from the label ATAN_I (in the 32-bit fixed point representation of course).

 ``` ; CORDIC - (32 bit) ; Rotational Mode - find Cos(theta) and Sin(theta) mov #ATAN_I p_ARR loop: ; N = 32 to 1 mov N Shift ; Shift = 0 to -31 sub #32d Shift ; dir = 1 mov #1 DIR ; is (z - theta) < 0 ? mov Z_H r0 mov Z_L r1 mov TH_H r2 mov TH_L r3 sub r3 r1 sbc r0 sub r2 r0 tst r0 bmi rotate ; dir = -1 neg DIR rotate: ; get atan[i] and increment p_ARR mov @p_ARR ATAN_H add #2 p_ARR mov @p_ARR ATAN_L add #2 p_ARR ; which direction ? tst DIR bpl next_1 ; negate atan[i] com ATAN_L com ATAN_H add #1 ATAN_L adc ATAN_H next_1: ; z = z + atan[i] add ATAN_L Z_L adc Z_H add ATAN_H Z_H ; x1 = x mov X_H X1_H mov X_L X1_L ; y1 = y mov Y_H Y1_H mov Y_L Y1_L ; x >> i mov X_H r0 mov X_L r1 ashc Shift r0 ; y >> i mov Y_H r2 mov Y_L r3 ashc Shift r2 ; which direction ? tst DIR bpl next_2 ; negate (x >> i) com r1 com r0 add #1 r1 adc r0 ; negate (y >> i) com r3 com r2 add #1 r3 adc r2 next_2: ; x1 = x - (y >> i) sub r3 X1_L sbc X1_H sub r2 X1_H ; y1 = y + (x >> i) add r1 Y1_L adc Y1_H add r0 Y1_H ; update x = x1 mov X1_H X_H mov X1_L X_L ; update y = y1 mov Y1_H Y_H mov Y1_L Y_L dec N bne loop halt ; loop counter = 32 to 1 N: .word 32d ; left shift = (N - 32) = 0 to -31 Shift: .word ; direction of rotation = -1 or +1 DIR: .word ; x, y and z X_H: .word 23335 X_L: .word 35552 Y_H: .word Y_L: .word Z_H: .word Z_L: .word ; x1 and y1 = temp x and y X1_H: .word X1_L: .word Y1_H: .word Y1_L: .word ; theta TH_H: .word 166314 TH_L: .word 146315 ; atan for the current i = (N - 32) ATAN_H: .word ATAN_L: .word ; pointer to atan[i] p_ARR: .word ; atan[i] = atan(pow(2,-i)) for i = 0 to 31 ATAN_I: .word 16314, 146314 .word 10400, 65401 .word 4373, 131270 .word 2217, 56330 .word 1111, 171125 .word 445, 41112 .word 222, 125116 .word 111, 53114 .word 44, 125512 .word 22, 52652 .word 11, 25325 .word 4, 112552 .word 2, 45265 .word 1, 22532 .word 0, 111255 .word 0, 44526 .word 0, 22253 .word 0, 11125 .word 0, 4452 .word 0, 2225 .word 0, 1112 .word 0, 445 .word 0, 222 .word 0, 111 .word 0, 44 .word 0, 22 .word 0, 11 .word 0, 4 .word 0, 2 .word 0, 1 .word 0, 0 .word 0, 0 .end 0 ```

## PDP-11 Implementation – 16 Bit

This version uses 16-bit integers to represent the values x, y, z and theta. The floating point values are scaled up to 16-bit signed integers by multiplying by 2 to the power of 14.

Here, only 16 iterations of Cordic are computed (instead of 32 iterations as for the 32-bit version above). Also, K is computed for i = 0 to 15 only, giving K = 1.6467602578654548. This will make no difference however, since K for N=16 and K for N=32 only differ in the last few decimal places.

In the code below, x is initialised to 23335, which is the 16-bit octal fixed point integer representation of 1/K. As before, y and z are initialised to 0. No matter what the input angle theta is, (x,y,z) are always initialised to these values.

Theta is initialised to 166315 which is the scaled octal integer representation of -30 degrees. The scaled integer representation of theta is calculated by first dividing theta by 100 and then multiplying by 2 to the power of 14.

Of course this version won’t be as accurate as the 32-bit version. When converted back to floating point, the output values are only accurate to about 3 or 4 decimal places.

 ``` ; CORDIC - (16 bit) ; Rotational Mode - find Cos(theta) and Sin(theta) mov #ATAN_I p_ARR loop: ; N = 16 to 1 mov N Shift ; Shift = 0 to -15 sub #16d Shift ; dir = 1 mov #1 DIR ; is (z - theta) < 0 ? mov Z r0 mov THETA r1 sub r1 r0 tst r0 bmi rotate ; dir = -1 neg DIR rotate: ; get atan[i] and increment p_ARR mov @p_ARR ATAN add #2 p_ARR ; which direction ? tst DIR bpl next_1 ; negate atan[i] neg ATAN next_1: ; z = z + atan[i] add ATAN Z ; x1 = x and y1 = y mov X X1 mov Y Y1 ; x >> i mov X r0 ash Shift r0 ; y >> i mov Y r1 ash Shift r1 ; which direction ? tst DIR bpl next_2 ; negate (x >> i) and (y >> i) neg r0 neg r1 next_2: ; x1 = x - (y >> i) sub r1 X1 ; y1 = y + (x >> i) add r0 Y1 ; update x = x1 and y = y1 mov X1 X mov Y1 Y dec N bne loop halt ; loop counter = 16 to 1 N: .word 16d ; left shift = (N - 16) = 0 to -15 Shift: .word ; direction of rotation = -1 or +1 DIR: .word ; x, y and z X: .word 23335 Y: .word Z: .word ; x1 and y1 = temp x and y X1: .word Y1: .word ; theta THETA: .word 166315 ; atan for the current i = (N - 16) ATAN: .word ; pointer to atan[i] p_ARR: .word ; atan[i] = atan(pow(2,-i)) for i = 0 to 15 ATAN_I: .word 16314 .word 10400 .word 4373 .word 2217 .word 1111 .word 445 .word 222 .word 111 .word 44 .word 22 .word 11 .word 4 .word 2 .word 1 .word 0 .word 0 .end 0 ```
Older Posts »