# Assembly Language Programming

## Javascript CRC-16 Calculator

Here is a Javascript CRC-16 calculator I have written to demonstrate the different techniques that can be used to implement the CRC calculation in code.

A CRC is calculated by dividing a generator polynomial into the message. The long division process is done according to the mathemetics of polynomial division mod 2, so the conventional long division process is not used. Instead the various left shifts of the generator polynomial bit pattern are simply XORed into the message bit pattern.

For CRC-16 the generator polynomial is 17 bits in length. The msb (bit 16) and lsb (bit 0) are always 1. Since the msb is always 1, when the polynomial is written in hex form it is the convention just to show the lower 16 bits. For example the IBM polynomial 18005 is usually just written as 8005. My program follows this convention.

## Long Division Method

To visualise the long division process, imagine the message written as a binary number (or bit pattern). The leading bits in the message will be the most significant (or leftmost) bits in the bit pattern. The long division process is begun by left shifting the 17 bit polynomial such that the msb of the polynomial is aligned with the msb of the message. The polynomial is XORed into the message only if the msb of the message is 1 and as a result the msb of the message will be cleared. The polynomial is then right shifted by 1 bit and the process repeats. In this way every bit in the message is tested against the polynomial, with the polynomial being XORed into the message if the message bit that is aligned with the msb of the shifted polynomial is 1. Of course the bits in the message will change as the polynomial is XORed into it, and all of the bits to the left of the current shifted polynomial will become 0.

The remainder from this division process is the CRC. To capture the remainder, 16 zero bits are appended to the right of the message bit pattern. When the long division process reaches the point where the msb of the polynomial is aligned with the lsb (the rightmost bit) of the message and the final XOR is done (only if the lsb was 1) these 16 bits will hold the CRC value.

## Register Method

If the CRC is being computed on a machine where there is a lot of memory and processing power available, then it is easy to implement the above long division method. First create arrays for each of the message and polynomial. Then shift and XOR the polynomial array into the message array as described above.

However on an embedded system where the memory and processing power is limited, this approach is not feasible. This leads to the register method for computing CRCs.

Even though the polynomial for 16 bit CRC is 17 bits in length, a CRC-16 calculation can be done with a 16 bit register. There are three factors that make this possible. One, when an operation is performed on a register that results in a carry from the msb of the data contained in the register, the CPU will capture and store the carry bit. Two, each iteration of the long division process only effects 17 bits within the message. Three, all bits in the message that are to the left of the shifted polynomial are cleared and will not be a part of the final CRC value.

The carry bit together with the 16 bit register gives 17 bits. This bit pattern can be visualised as aligning with the 17 bit polynomial, with the msb of the polynomial aligning with the processor carry bit. We know that the msb of the polynomial is always 1 and the effect it will have if the carry bit is also 1. Therefore, there is no need to store the msb of the polynomial and the remaining bits of the polynomial can be stored in a 16 bit register.

Therefore, a 16 bit CRC can be calculated by shifting the message bits through a register. Each bit that is shifted out of the register and into the processor carry bit is tested. If it is 1, the lower 16 bits of the polynomial are XORed into the register containing the message bits. Once the bit that has just been shifted out of the register has been tested and acted upon, it is no longer needed and can just be discarded. The process works like this:

 ```- Initialise the CRC register: CRC = the 16 leftmost bits of M. (1) Left shift CRC: CRC = CRC << 1 - the processor carry bit now stores the msb: carry = msb(CRC) (2) Left shift the next bit from M into CRC: CRC = CRC + bit (3) Test the carry bit and XOR if 1: - if(carry == 1) CRC = CRC ^ Poly (4) Repeat from (1) until the last bit from M has been shifted through the register. - The register now contains the CRC. - The padding bits are used to push the last message bit completely through the register. ```

## Table Method

The table method allows a processor to process a message byte by byte to calculate the CRC. This is a faster and more efficient way than processing a message bit by bit.

The table method is derived as follows. Visualise the long division process above as applied to any 3 bytes within the message. If each bit in the first (leftmost) byte is tested against the polynomial and an XOR done whenever one of the bits is 1, then a bit pattern can be built up by the combined XOR of the shifted polynomials. This bit pattern will have a length of 24 bits and therefore effect all 3 of the bytes being considered. We already know that when this bit pattern is XORed with the 3 bytes, that the first byte will just becomes zero. So the upper 8 bits of the bit pattern can just be discarded. What is left is the 16 bit pattern that will be XORed with bytes 2 and 3. For any given byte, this 16 bit pattern will always be the same. Therefore a table of these values can be built. The index for this table will be the byte values from 0 to 255. With this table, the long division process can be done byte by byte:

 ```- Let T[byte] = the 16 bit value as described above. - Let the bytes B(i) B(i+1) B(i+2) B(i+3) be a portion of a message. - B(i) is used as the table index to get T[B(i)]. - B(i) is set to zero and T[B(i)] is XORed into the message: B(i+1)' B(i+2)' B(i+3) ... - B(i+1)' is used as the table index to get T[B(i+1)']. - B(i+1)' is set to zero and T[B(i+1)'] is XORed into the message: B(i+2)'' B(i+3)' ... - And so on ... ```

The register method to calculate a CRC can be significantly enhanced if a table is used:

 ```- Initialise the CRC register: CRC = the 16 leftmost bits of M. (1) Using the upper byte from the register B(u), get T[B(u)]. (2) Left shift the register by 1 byte. B(u) is shifted out of the register. (3) Load the next message byte into the lower byte of the register. (4) XOR T[B(u)] into the register. (5) Repeat from (1) until the last message byte has travelled through the register. - The 2 padding bytes are used to push the last message byte completely through the register. ```

If the message is a hexadecimal number that potentially contains an odd number of digits, the table method can be adapted to handle this. The message is just processed digit by digit (or nibble by nibble) using a table that is indexed by a 4 bit value. The table values will still be 16 bits however (for CRC-16).

A message could even be processed in blocks of 2 bytes, but the table needed for this would have 65,536 elements, which would require over 130 KB of memory.

## Direct Table Method

There is a further refinement to the table method known as the direct table method. It works like this:

 ```- Initialise the CRC register to zero: CRC = 0 - Get the first message byte. (1) Get the upper byte from the register B(u). (2) XOR B(u) with the message byte B(M) to get the table index. (3) Left shift the register by 1 byte. B(u) is shifted out of the register. - The lower byte becomes zero. (4) XOR T[B(u) xor B(M)] into the register. (5) Repeat from (1) until the last message byte has been used. - The padding bits are not needed. ```

In this method the message bytes don’t travel through the register at all. They are XORed directly with the upper byte of the register to obtain a table index. Most implementations of CRC use this method, since it is the fastest and most efficient out of all the methods.

The relationship between the direct table method and the more clearcut long division and table methods is far from obvious. However, the direct table method can be derived in only a few simple steps by considering the long division process.

To derive the direct table method, we just need to keep in mind that the XOR operation is both associative and commutative:

 ```Commutative: A ^ B = B ^ A Associative: A ^ (B ^ C) = (A ^ B) ^ C ```

Consider the portion of the message starting from the byte B(i), where the bytes B(0) to B(i-1) have already been processed:

 ```B(i) B(i+1) B(i+2) B(i+3) B(i+4) .... B(n) The message is n bytes long. We are ready to process the byte B(i), which will involve getting the value T[B(i)] and XORing it into the bytes B(i+1) and B(i+2). But before doing this we define a secong bit string which is all zero, except for the byte at position i+2. The byte at i+2 is set to the same value as the byte B(i+2) in the message. Then the message can be defined as: { B(i) B(i+1) 0 B(i+3) B(i+4) .... B(n) } XOR { 0 0 B(i+2) 0 0 .... 0 } Next we XOR T[B(i)] into the message: { B(i+1)' X B(i+3) B(i+4) .... B(n) } XOR { 0 B(i+2) 0 0 .... 0 } The value X is just the lower byte of T[B(i)] - it's value doesn't really matter. Next we XOR T[B(i+1)'] into the message: { X' B(i+3)' B(i+4) .... B(n) } XOR { B(i+2) 0 0 .... 0 } We are now ready to process the X' byte, but to process the message properly we also have to deal with B(i+2). Unlike the previous terms in the bit string to the right of the XOR, B(i+2) is non-zero and will have an effect on the CRC calculation. So we recombine the 2 bit strings to get: (X' xor B(i+2)) B(i+3)' B(i+4)' .... 0 And it becomes clear that the index for the table T[] is just: { X' xor B(i+2) } So for every byte B(i) in the message a bit string can be defined that is all zero except for the byte at position i which has a value of B(i). Let { B(i) } be any one of these bit strings. Then the message can be defined as the xor of all these bit strings: M = { M' } xor { B(0) } xor { B(1) } xor .... { B(i) } .... xor { B(n) } - where { M'} is just a zero bit string. Then by the discussion above, the CRC can be calculated just by processing M' - with the bit string { B(i) } only needing to be XORed in when the byte at position i within M' is about to be processed. ```

As applied to the table method above using the register, this means that the bytes do not need to be shifted through the register. Instead they are just directly XORed with the most recent byte to be popped out of the register to generate a table index.

## Initial and Final Values

Many CRC-16 implemetations specify an initial value which is XORed into the first 2 bytes of the message. CRC-16 also allows a final value to be XORed into the register once the CRC has been calculated. Both the initial and final values are 16 bit values.

My program supports both these options.

## Reflection and Reversed Polynomials

These aspects of CRC-16 are not implemented in my program.

## Program Usage

The input message can be entered as a Text String or a Hexadecimal Number. Press the [ Toggle Input Type ] to select one of these types of input. If a hexadecimal number is the input, it can include whitespace for formatting purposes. This whitespace will just be ignored when the number is processed. The [ Clear Input ] button clears the input area.

To compute the CRC for a message just press the [ Calc CRC ] button.

The user can enter values for the generator polynomial, the initial XOR value and the final XOR value. These are all entered as 4 digit hex numbers. The default values for these parameters are those for the IBM CRC-16 implementation: 8005 for the generator polynomial, FFFF for the initial XOR value, and 0000 for the final XOR value.

The program can also display the lookup tables for 4 bit and 8 bit indexes. Just press one of the [ Show Table ] buttons.

## HTML/Javascript Source Code:

To grab this code, just select it with the mouse, and copy and paste into a text file.

 ``` CRC 16

CRC-16: Text String Input

For hexadecimal number input, whitespace is allowed.

Generator Polynomial: Init: Final:

Enter the message:

```