To avoid veering off-topic in Thomas's line drawing thread, I thought I'd start a new one about multiplication.
Beebs don't multiply numbers very well. Even with the kind of routine I put on the wiki, we're still limited to multiplying unsigned numbers and getting an unsigned result.
Now, there are many occasions (e.g. in a 3D game) where you want to be able to do the equivalent of:
Code:
x = r * SIN(angle)
Obviously, SIN(angle) is a fractional value between -1.0 and 1.0, and one thing the Beeb definitely
does not excel in is floating point! Other relatively modern platforms which don't have floating point hardware either (e.g. the Nintendo DS) get around this limitation by fetching the sine value multiplied by a constant, multiplying the two values, and then dividing afterwards by the same constant:
Code:
x = r * (SIN(angle) * 256) / 256
If the constant is a nice round power of two, this is a cheap operation. This representation of a fractional value multiplied by a constant is called 'fixed point'.
Well, we know all that anyway. Wouldn't it be nice if we could do this on a Beeb?
In Thomas's thread, I was trying to find an efficient way to represent a value between -1.0 and 1.0, which we could easily multiply by, in order to yield the elusive value of
x, above! Obviously, it would be better if the fixed-point value fit nicely into 8 bits, and with as little wastage as possible. The main problem is the asymmetry of the two's complement representation - we can store values from -128 to +127, which means that if we use 128 as our fixed point constant, we can happily store -1.0, but we can't reach 1.0.
To cut a long story short, I realised that we can actually use 127 as our constant (in other words, so that -1.0 is represented by -127, and +1.0 is represented by +127), and then write a multiply routine which automatically divides the result of a*b by 127.
This turns out to be really easy, because we already use a table to lookup the result of the multiplication.
Recall:
Code:
(a+b)^2 = (a^2 + b^2 + 2ab) -> (I)
(a-b)^2 = (a^2 + b^2 - 2ab) -> (II)
(I)-(II) yields:
(a+b)^2 - (a-b)^2 = 4ab
=> ab = f(a+b) - f(a-b)
where f(x) = x^2 / 4
So, it follows that if we were simply to divide this table by a further 127, we would automatically get a divide by 127 for free!
Code:
ab / 127 = f(a+b) - f(a-b)
where f(x) = x^2 / 4 / 127
Obvious really!
The next thing we need to do is make sure it works with signed inputs. The above method will only work with unsigned inputs, so we have to do some other stuff in order that it works well.
The naïve solution is to write code like this:
Code:
Remember sign of (val1 EOR val2)
If sign of val1 is negative, negate val1.
If sign of val2 is negative, negate val2.
Multiply val1 * val2.
If the sign we remembered was negative, negate the result.
However, this costs cycles, and also means that the case where any of the inputs is negative is slower than the case where both inputs are positive.
A better solution is to do the table lookup and subtract in different ways, depending on the two signs of the inputs (4 different possibilities).
For example, in the cases where the result needs to be negated, we do the table lookup with positive indices as required, but then reverse the order of the subtraction, like this:
Code:
a*b = f(a-b')-f(a+b') where b' is -b
Anyway, here's a bit of code which does the trick. It executes in an average of 58 cycles (including the subroutine call and return), which is pretty quick!
Code:
\\ ***********************************************************************************************
\\ *
\\ * Fixed-point multiplication
\\ *
\\ * Essentially this is a routine which multiplies two signed 8-bit numbers, and divides the
\\ * result by 127. It does this very quickly! (58 cycles on average, including the subroutine
\\ * call).
\\ *
\\ * This is useful because we can multiply the values in our sine table by 127 (allowing us to
\\ * correctly represent 1.0 and -1.0), and multiply this by another sine value, or a vertex
\\ * coordinate (in the range -127..+127) without losing any magnitude or precision.
\\ *
\\ * The multiplication technique used is the old favourite:
\\ *
\\ * a*b = f(a+b) - f(|a-b|), where f(x) = a^2 / 4, with x = 0..255
\\ *
\\ * Because a byte can only represent -128..127, we have to limit the input range to -127..127,
\\ * otherwise it would be possible to calculate -128*-1.0 (= an out of range result: 128!).
\\ *
\\ * The result is an approximation due to various rounding errors, but tests show that:
\\ * 75% of the results are accurate to within +/-0.5
\\ * 99% of the results are accurate to within +/-1.0
\\ *
\\ * This should be ok for now!
\\ *
\\ ***********************************************************************************************
val1 = &70
val2 = &71
ORG &4000
\\-------------------------------------------------------------------------------------------------
\\ Multiply X * cos(Y)
\\
\\ X is in the range -127..127, and can be thought of as a fixed-point number if you wish, where
\\ 127 represents 1.0.
\\ Y is an angle between 0..255, such that 360 degrees is represented by 256.
\\
\\ Result returned in A
\\-------------------------------------------------------------------------------------------------
.MultiplyByCosine
TYA
CLC
ADC #64
TAY
; fall through
\\-------------------------------------------------------------------------------------------------
\\ Multiply X * sin(Y)
\\
\\ X is in the range -127..127, and can be thought of as a fixed-point number if you wish, where
\\ 127 represents 1.0.
\\ Y is an angle between 0..255, such that 360 degrees is represented by 256.
\\
\\ Result returned in A
\\-------------------------------------------------------------------------------------------------
.MultiplyBySine
LDA SineTable,Y
TAY
; fall through
\\-------------------------------------------------------------------------------------------------
\\ Multiply X * Y
\\
\\ Both X and Y are signed numbers in the range -127..127.
\\ One or both of them is considered as a fixed-point number, where 127 represents 1.0.
\\
\\ Result returned in A
\\-------------------------------------------------------------------------------------------------
.SignedMultiplyFixed
{
STX val1 ; store these two values, we will need to do
STY val2 ; arithmetic with them later
CLC ; clear C in anticipation
TYA:BMI val2neg ; handle cases where val2 is negative
\\ If we get here, val2 is positive
TXA:BMI val1neg_val2pos
\\ If we get here, val1 is positive and val2 positive.
\\ This is the straightforward f(val1+val2) - f(|val1-val2|)
; On entry, A contains val1, C is clear
.val1pos_val2pos
ADC val2
TAX ; X = val1+val2 (in range 0..255)
SEC
LDA val1
SBC val2 ; val1-val2 (in range -127..127)
BCS skipnegate ; see if val1-val2 was negative
EOR #255 ; if so negate it to yield a positive value
ADC #1
SEC ; set C flag in anticipation
.skipnegate
TAY ; Y = |val1-val2| (in range 0..127)
LDA MultTable,X ; C guaranteed to be set here
SBC MultTable,Y
RTS
\\ If we get here, val1 is negative and val2 is positive.
\\ The method being used requires unsigned numbers, so we do the table lookup with -val1.
\\ We would then need to negate the result at the end, but in order to save this,
\\ we just reverse the subtraction of the two table lookups, i.e.
\\ f(|val1-val2|) - f(val1+val2)
\\ However, as we just said, we use -val1, so what we now get is:
\\ f(|-(val1+val2)|) - f(val2-val1)
\\ = f(|val1+val2|) - f(val2-val1)
; On entry, A contains val1, C is clear
.val1neg_val2pos
ADC val2 ; val1+val2 (in range -127..127)
BCS skipnegate2 ; see if val1+val2 was negative
EOR #255 ; if so negate it to yield a positive value
ADC #1
SEC ; set C flag in anticipation
.skipnegate2
TAX ; X = |val1+val2| (in range 0..127)
LDA val2 ; C guaranteed to be set here
SBC val1
TAY ; Y = val2-val1 (in range 0..255)
SEC
LDA MultTable,X
SBC MultTable,Y
RTS
\\ If we get here, val2 is negative
.val2neg
TXA:BMI val1neg_val2neg
\\ If we get here, val1 is positive and val2 is negative
\\ This is similar to the last case; we need to negate the result, and do the table
\\ lookup with -val2, so we end up with:
\\ f(|val1+val2|) - f(val1-val2)
; On entry, A contains val1, C is clear
.val1pos_val2neg
ADC val2 ; val1+val2 (in range -127..127)
BCS skipnegate3 ; see if val1+val2 was negative
EOR #255 ; if so negate it to yield a positive value
ADC #1
SEC ; set C flag in anticipation
.skipnegate3
TAX ; X = |val1+val2| (in range 0..127)
LDA val1 ; C guaranteed to be set here
SBC val2
TAY ; Y = val1-val2 (in range 0..255)
SEC
LDA MultTable,X
SBC MultTable,Y
RTS
\\ If we get here, val1 is negative and val2 is negative
\\ As well as negating the result, we have to do the table lookup with -val1 and -val2.
\\ So, this leaves us with:
\\ f(-(val1+val2)) - f(|val2-val1|)
; On entry, A contains val1, C is clear
.val1neg_val2neg
ADC val2 ; val1+val2 (in range -255..0)
EOR #255
ADC #0
TAX ; X = -(val1+val2) (in range 0..255)
SEC
LDA val2
SBC val1 ; val2-val1 (in range -127..127)
BCS skipnegate4 ; see if it was negative
EOR #255 ; negate it again if so
ADC #1
SEC ; set C flag in anticipation
.skipnegate4
TAY ; Y = |val2-val1| (in range 0..127)
LDA MultTable,X ; C guaranteed to be set here
SBC MultTable,Y
RTS
}
.EndOfSignedMultiplyFixed
\\ Here are the tables, align them to a page boundary
\\ 512 bytes in total... not so bad, really.
ALIGN 256
\\ Sine/cosine table, values multiplied by 127
.SineTable
FOR n, 0, 255
a = SIN(n/128*PI) * 127
IF a >= 0
EQUB INT(a + 0.5)
ELSE
EQUB INT(a - 0.5)
ENDIF
NEXT
\\ Table used by the multiplication routine to perform (a*b)/127
.MultTable
FOR n, 0, 255
EQUB INT((n*n / 4) / 127 + 0.5)
NEXT
...and here is a zip of the BeebAsm source and a .ssd image containing a test program, so you can see that it really works!
