Fast fixed-point multiplication library
From Retrosoftware
Richtw
Revision as of 19:55, 20 September 2009
Fast fixed-point multiplication library in 6502, using base 127 fixed-point values
Beebs don't multiply numbers very well. Even with the kind of routine I demonstrated here, we're still limited to multiplying unsigned numbers and getting an unsigned result.
Now, there are many occasions (e.g. in a 3D game) where you want to be able to do the equivalent of:
x = r * SIN(angle)
Obviously, SIN(angle) is a fractional value between -1.0 and 1.0, and one thing the Beeb definitely does not excel at is floating point! Other relatively modern platforms without floating-point hardware (e.g. the Nintendo DS) get around this limitation by storing the sine value multiplied by a constant, fetching it when required, multiplying the two values, and then dividing afterwards by the same constant:
x = r * (SIN(angle) * 256) / 256
If the constant is a nice round power of two, this is a cheap operation. This representation of a fractional value multiplied by a constant is called 'fixed point'.
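The idea can be sketched quickly in Python (rather than 6502; the scale of 256 and the names here are illustrative, not the library's final choices):

```python
# Minimal sketch of power-of-two fixed point with a scale of 256.
SCALE = 256  # 1.0 is represented by 256

def to_fixed(f):
    """Store a fraction as a scaled integer, e.g. 0.5 -> 128."""
    return round(f * SCALE)

def fixed_mul(a, b_fixed):
    """a * (b_fixed / 256): multiply, then drop the low 8 bits."""
    return (a * b_fixed) >> 8  # >> floors, which matters for negative results

print(fixed_mul(100, to_fixed(0.5)))  # -> 50
```

Because the scale is a power of two, the divide costs nothing: it is just discarding the low byte of the product.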
Well, we know all that anyway. Wouldn't it be nice if we could do this on a Beeb? Here I present one possible way of achieving this in 6502, which maximises precision whilst requiring only 8-bit values.
Possible fixed-point representations
Let's suppose we only need to be able to represent values between -1.0 and 1.0. This is often all we need as it encompasses all the values of a sine/cosine table, a reciprocal table (for doing quick divisions), and even allows quick interpolation between two values, e.g. x = start * t + end * (1-t). As long as we can quickly multiply by any value between -1.0 and 1.0 with reasonable precision, we fulfil our remit! If we could find a way to store this fixed point value in 8 bits, it would be great, as we already have a super-fast table-based multiplication algorithm (as discussed on the other wiki page) which multiplies 8-bit values.
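The interpolation above can be sketched the same way (Python again, using 256 as an illustrative scale for t; the library's final representation differs):

```python
# x = start * t + end * (1 - t), with t held as t_fixed = t * 256.
# Note that with this formula t = 1.0 selects start and t = 0.0 selects end.
def lerp_fixed(start, end, t_fixed):
    return (start * t_fixed + end * (256 - t_fixed)) >> 8

print(lerp_fixed(10, 20, 128))  # t = 0.5 -> 15
```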
Fixed-point with 8 fractional bits
How about the approach mentioned above? Multiplying all the fractional values by 256, so that 1.0 is represented by 256, 0.5 by 128, -1.0 by -256, and so on. This would be an efficient representation because in order to multiply an 8-bit value by one of these fixed-point values, we just do a regular 8-bit * 8-bit multiply, and then throw away the bottom 8 bits of the result. The top 8 bits are the answer!
Seems simple - apart from one small problem: values between -256 and 256 don't fit into 8 bits. In fact, we need 10 bits to represent this entire range, and that leaves a lot of redundancy, because with 10 bits we could represent values between -512 and 511.
I'm not going to explore this approach in this article, because, while providing reasonably good precision, it won't work in a straightforward way with our existing multiply routine, and the aim here is to produce a very fast fixed-point multiplication routine, hopefully requiring only 8-bit values.
Fixed-point with 7 fractional bits
So how about multiplying by 128 instead? This way, -1.0 is represented by -128 and 1.0 by 128. The problem here is the asymmetry of the two's complement representation: in 8 bits we can only store values from -128 to +127, which means that with 128 as our fixed-point constant we can happily store -1.0, but we can't reach +1.0. This is going to be a problem, so let's throw this idea away straight away.
Fixed-point with 6 fractional bits
So, finally, how about multiplying by 64? Then values between -64 and +64 represent our required range of -1.0 to +1.0, and this fits easily into a byte. When we want the result, it's simply a case of calculating a*b / 64, where of course the division can be performed by a few simple shifts. This approach would work fine, but it's starting to become very imprecise - the fractional part is only represented to the nearest 1/64th - and, what's more, there are a lot of wasted values which we will probably never use (-128 to -65 and 65 to 127, representing values less than -1.0 and greater than 1.0 respectively).
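A sketch of this variant (Python, illustrative names, non-negative inputs):

```python
# Scale of 64: dividing the product by 64 is just a 6-bit shift.
def mul_frac_64(a, b_fixed):
    return (a * b_fixed) >> 6

print(mul_frac_64(200, 32))  # 0.5 is stored as 32, so 200 * 0.5 -> 100
# but precision is coarse: fractions resolve only to the nearest 1/64
```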
At this point, I realised that, thinking outside the box a bit, we didn't need to limit our fixed-point constant to a power of two value. In fact, we could choose anything we wanted, because we already perform our multiplication with lookup tables.
Base 127 fixed-point representation
It sounds complicated, but the premise is actually very simple: use 127 as our fixed-point constant. Thus -1.0 is represented by -127, and +1.0 is represented by +127. When we want to calculate a * b (where b is a fixed-point value), we just calculate:
result = a * b / 127
Division by 127? Yuk! But never fear: we won't actually have to divide by 127 by hand. In fact we will build lookup tables which are predivided by 127.
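The premise can be sketched in Python (not 6502). Following the usual squares-table multiply from the other wiki page - a*b = (a+b)²/4 - (a-b)²/4 - we can pre-divide the squares table by 127 when we build it, so that two table reads and a subtraction yield a*b/127 directly. The table size, rounding, and handling of negative inputs here are my assumptions, not necessarily what the library does:

```python
# Illustrative sketch, not the library's actual tables: approximate a*b/127
# using the identity a*b = (a+b)^2/4 - (a-b)^2/4, with the squares table
# pre-divided by 127 at build time.

TABLE_SIZE = 512  # big enough for a+b when a and b are 8-bit (assumption)
table = [round((x * x) / (4 * 127)) for x in range(TABLE_SIZE)]

def mul_div_127(a, b):
    """Approximate a * b / 127 for non-negative a, b via two table reads."""
    return table[a + b] - table[abs(a - b)]

# +1.0 is stored as 127, so multiplying by it gives back (roughly) a:
print(mul_div_127(100, 127))  # -> 100
print(mul_div_127(100, 64))   # roughly 100 * 0.5 -> 50
```

Because the two table entries are each rounded independently, some inputs come out a count or so off; that small error is the price of baking the division into the tables.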