# Assembly Language Programming

## October 25, 2012

### Digital Signature Algorithm – Generating Large Prime Numbers

Written in x86 assembly language using the Flat Assembler (FASM), this program generates the large prime numbers that are required for the Digital Signature Algorithm (DSA).

The DSA primes p and q are two large prime numbers, which satisfy the relation:

p mod q = 1

In other words, q is a prime factor of p-1.

## Digital Signature Algorithm

The Digital Signature Algorithm (DSA) is outlined in chapter 4 of the Digital Signature Standard (DSS), as published by the U.S. National Institute of Standards and Technology:

The actual mathematical details for the DSA are specified in the fips 186-3 appendices.

In the DSA, a digital signature is generated by applying a private key (x) to the hash of the message to be signed. A public key (y) is used for verification of the signature. The keys x and y are generated from the domain parameters {p, q, g}. The DSA is based on the ElGamal signature scheme.

In this project, the only parts of the DSA I implement are those related to the generation of the two large prime numbers q and p. I do not compute the generator g, the keys x and y, or implement the actual signature process.

These are the parameters output by my program:

 Small Prime c The starter prime of less than 33 bits in length used to generate the prime q. Small Prime c0 The starter prime of less than 33 bits in length used to generate the prime p0. First Seed The initial value of the seed for the hash function. Q Seed The value of the hash function seed at the time the prime q was found. P0 Seed The value of the hash function seed at the time the prime p0 was found. P Seed The value of the hash function seed at the time the prime p was found. Qgen Counter The value of the counter at the time the prime q was found. P0gen Counter The value of the counter at the time the prime p0 was found. Pgen Counter The value of the counter at the time the prime p was found. Q The prime q, with a bit length as given by N. P0 The prime p0, the starter prime for p. P The prime p, with a bit length as given by L.

The various seed and counter values allow the primes q and p to be verified, since a particular first seed will always generate the same set of prime numbers and their corresponding seed and counter values.

To generate these numbers, I implement the following sections from FIPS 186-3:

 A1.2 Construction and Validation of the Provable Primes p and q. C.2 Conversion Between Bit Strings and Integers. C.6 Shawe-Taylor Random_Prime Routine. C.7 Trial Division. C.8 Sieve Procedure.

## The Keys x and y

I don’t implement the key generation process in my code, but I will briefly discuss it. The private key x is a randomly generated integer value in the range [1,q-1]. The public key y is in the range [1,p-1]. It is calculated from x:

y(x)=gx mod p

The only unknown in the expression above is the private key x, all of the other parameters are publically available. So how does the value of x remain secret? The security of x depends on the well known discrete logarithm problem. Here x is the discrete logarithm (to base g) of y. Currently there is no known method to quickly compute discrete logarithms, and a brute force approach to find x is out of the question because of the size of the DSA primes. Therefore y(x) is basically a one way function.

## DSA Security Strength

There are two factors which determine the security strength (or bit strength) of the DSA signature process. The first is the size of the primes p and q, and the second is the choice of hash function used to hash the seed.

The DSA length parameters (L,N) define the size of the two primes, where L is the bit length of p and N is the bit length of q. These are restricted to certain values by the DSS.

The DSS also specifies which hash algorithms can be used in the DSA (the SHA algorithms). Not only is the hash function used to generate the signature for a particular message M, but it is also used to generate the primes p and q.

NIST Special Publication 800-57 (SP 800-57 Part 1) provides some information on the security strengths for the (L,N) pair and hash algorithms. See tables 2 and 3 on pages 63 and 64:

The DSS recommends that the security strength of the hash function either match or exceed that of the selected (L,N) pair. In my program I have chosen the hash functions such that they equal the security strength of each (L,N) pair.

The table shows which combinations of (L,N), SHA functions and seed lengths are supported by my program:

L N Hash() Bits of Security Seed Length
1024 160 SHA-1 80 160 bits
2048 224 SHA-224 112 224 bits
2048 256 SHA-256 128 256 bits
3072 256 SHA-256 128 256 bits

## Seeds and Counters

The generation process requires an initial seed value known as the first seed. The hash function is applied to this seed at various points during the prime generation process, to construct starter values for each of the primes. Every time the hash function is used in this way, the seed value is incremented by 1. The value of the first seed can either be set by the user or be randomly generated.

The DSS requires that the bit length of the seed has to be at least equal to N, the bit length of the prime q. In my code I set the length of the seed equal to N bits, as can be seen in the table above.

The generation process also uses a counter, which is initialised to zero. This counter is incremented every time a new candidate prime is generated, and when a candidate is confirmed to be prime the value of the counter is saved.

Using the seed and counter method to generate p and q allows an end user to verify that these values are legitimate. They just take the supplied first seed value and apply the same calculations to it.

## Generating q

The method defined in Appendix C.6 (Shawe-Taylor Random_Prime Routine) is used to construct the primes q and p0. There are two parts to this method. Steps 3-13 are used to generate a starter prime of less than 33 bits in length, by repeatedly hashing the seed until a value is obtained that is confirmed to be prime. The sieve procedure outlined in Appendix C.8 is used to test the candidates for this small prime. This method is basically the Sieve of Eratosthenes, an algorithm for testing primes that dates back to Ancient Greece.

Steps 16-34 make up the second part of the method.

Note that the method is recursive. If the input length is greater than 32 bits, the method jumps to step 14 where it calls itself again (but with approx half the length). Once the length is less than 33 bits, steps 3-13 of the algorithm are executed to generate the initial starter prime. Then for each time the method was called at step 14, steps 16-34 are executed.

So for example, if N = 160 the flow of execution looks like this:

 ST_Random_Prime(160) - Length greater than 32, jump to step 14. ST_Random_Prime(81) - Length greater than 32, jump to step 14. ST_Random_Prime(42) - Length greater than 32, jump to step 14. ST_Random_Prime(22) - Execute steps 3-13 to get c0 and return. ST_Random_Prime(42) - Execute steps 16-34 and return the next c0. ST_Random_Prime(81) - Execute steps 16-34 and return the next c0. ST_Random_Prime(160) - Execute steps 16-34 and return q.

Steps 16-34 use the input prime c0 to construct a larger prime c, as follows:

The seed is hashed again to produce a string of random bits (x) which is then adjusted so that it has a bit length equal to the input length.

A parameter t is then calculated from x using c0:

t = ceiling(x/2c0)

An initial candidate value for the prime c is then defined as:

c = 2tc0 + 1

This candidate value is then tested for primality. If it is not prime, then the value of t is incremented (t = t + 1), and c is recalculated:

c’ = 2(t + 1)c0 + 1 = c + 2c0

This procedure is repeated until a value of c is found that is prime.

## Generating p

The process specified in Appendix A1.2 (Construction and Validation of the Provable Primes p and q) is used to construct the largest prime p. The primes q and p0 are used to construct candidates for p as follows:

p = 2tqp0 + 1

Using q in this way guarantees that q will be a prime factor of p-1, as required.

As for the case of q above, successive candidate values are generated until a prime is obtained:

p’ = 2(t + 1)qp0 + 1 = p + 2qp0

## Testing Primality

To test the primality of the candidate primes in the construction process for p and q, the DSA uses the Pocklington-Lehmer Primality Test. For some random integer a, this test guarantees that a number N will be prime if it satisfies the following conditions:

aN-1 mod N = 1

GCD(a(N-1)/p – 1,N) = 1

– where p is a prime factor of N-1.

The DSA generates a random integer a (with a bit length of at least N bits) to test candidates for the prime q, by using the same method that was used to generate x (above). This value is then adjusted so that it is in the range [2,c-2]:

a = 2 + [ a mod (c-3) ]

To test candidates (c = 2tc0 + 1) for the prime q (and p0), the DSA computes a parameter:

z = a2t mod c

It then tests the condition:

GCD(z-1,c) = 1

Since 2t = (c-1)/c0, this is the same as the GCD part of the Pocklington-Lehmer Primality Test.

The DSA then tests the condition:

zc0 mod c = 1

But since:

zc0 = (a2t)c0 = (a(c-1)/c0)c0 = ac-1

– this test is the same as:

ac-1 mod c = 1

For candidates of the prime p, the DSA applies the following tests (for z = a2tq mod p):

GCD(z-1,p) = 1

zp0 mod p = 1

Again it can be shown that this is just the Pocklington-Lehmer Primality Test.

## Fast Exclusion Of Non-Primes

To test the candidates for the primes p and q, the DSA uses the Pocklington-Lehmer Primality Test. However, this test requires the use of exponentiation which takes a lot of processing time. Therefore I have introduced an additional step in the testing process, which is not specified in the DSA. The aim of this extra step is to quickly prove that the candidate is not a prime. By doing this the need to carry out the exponentiation can be avoided, thus speeding up the search process for the primes p and q.

To quickly prove that a candidate is non-prime, I just generate some random numbers and use the GCD function to determine if they are coprime (i.e. GCD = 1) with the candidate. If even one of these random numbers is not coprime with the candidate, then the candidate value is not a prime and can be rejected. Quite a few random numbers are needed for this test, because it has been found that any random pair of numbers are likely to be coprime about 60% of the time. Further more, it turns out that some of the candidates generated by the DSA are testing coprime with over 95% of the random values. To make sure that these candidates can be eliminated, I generate 50 random values for testing. This test will not exclude all non-primes because some of the candidates test coprime against 100% of the random values (even if there are a 1000 randoms being generated).

By applying the above test, I can quickly reject about 70-80% of the candidate primes and save a large amount of processing time. For example, the exponentiation computation to test the candidate for p0 (where L = 1024 and L0 = 513) takes about 2 seconds. It would normally take 20 seconds to test 10 candidates in this way, but by applying the GCD filter this time can be reduced to about 4-6 seconds. The time savings are even greater for the larger primes. The exponentiation to test the candidate for p (L = 1024) takes about 6 seconds. Considering that hundreds or even thousands of candidates (up to the limit of 4 x L specified by the DSA) may need to be tested before a value for the prime p is found, the reduction in processing time can be quite significant.

## How To Hash The Seed

In the DSA, the SHA hash functions are applied to the seed to generate pseudo-random starter values for the primes p and q. The seed and the starter values are integers, but the inputs and outputs for the SHA functions are bit strings. Therefore conversion between bit strings and integer values is required (as discussed in Appendix C.2).

The byte order in the seed integer value is little endian. To convert this to a message (bit string) that can be hashed, the bytes are reversed such that the byte order becomes big endian. The SHA 1, 224 and 256 algorithms process a message in blocks of 512 bits. Since the maximum length of the seed is 256 bits, it only requires a single 512 bit block as the input to the hash function. This block is in big endian byte order, with the first 160, 224 or 256 bits being the seed. The remaining bits in the block are the padding bits. These are all zero, with the exception of the first bit following the message (the seed) which is set to 1, and the last 16 bits which are set to the length of the message (the seed length).

The SHA functions output the result as an array of 32 bit words:

H[0]H[1]H[2]H[3]H[4] – e.g. for SHA-1.

This output is converted to a number, just by reversing the array indexing (e.g. H[4] becomes H'[0] etc):

H'[4]H'[3]H'[2]H'[1]H'[0]

It is this form of the hash output that is used to generate the pseudo-random starter values for the candidate primes.

## Program Speed

The program uses a lot of processing power, so it can take a while to generate the primes. For L = 1024, the program should be able to generate some primes in less than 30 minutes (on a decent machine). For L = 2048 and L = 3072, it can take up to a couple of hours (or longer) to generate the primes.

## Test Vectors

A zip file containing test vectors can be found on the NIST website. It is called: 186-3dsatestvectors.zip

## Usage

The program allows the primes to be generated either all at once, or one at a time.

To generate all of the primes at once, press the [Generate All] button. Note that it can take up to one or two hours for a result to be returned.

To generate the primes one at a time (in sequence), first press the [Generate q] button to compute q. Once q has been found, press [Generate p0] to compute the prime p0. Finally, press [Generate p] to find the prime p. The program only takes a couple of minutes to find q, up to 30 minutes to find p0, and one or two hours to find p.

Of course, before generating any primes the L,N pair should be selected by pressing the appropiate button.

After the L,N pair has been set, the user may optionally enter a value for the first seed and press the [Capture] button to get this value. If a first seed value is not supplied in this way, a random value for the first seed will be internally generated by the program.

## The FASM x86 Code

To grab this code, select it with the mouse and copy it. It can then be pasted directly into the FASM IDE.