04-28-2009, 09:34 PM
Okay, I confess: I'm not an assembler user. This is why I need a compiler.
That said, I love to nosey into things I don't understand. I often learn from that.
I was looking at an old piece of assembler code I found, when trying (fairly unsuccessfully) to learn Z80 code:
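8 Bit number multiply
Shifts Num1 right to examine it one bit at a time. Shifts Num2 up. Adds DE to HL if the bit is set.
Result is Num1 * Num2 in HL.
        LD HL,0          ; HL = running total
        LD DE,(NUM2)     ; DE = Num2 (loads 16 bits, so the byte after NUM2 must be zero)
        LD A,(NUM1)      ; A = Num1
LOOP    SRL A            ; Divide A by 2 - copying the 1's column bit into the carry flag
                         ; (SRL rather than RR, so a stale carry can't be rotated into bit 7)
        JR NC,JP1        ; Jump over the add if the bit was clear
        ADD HL,DE        ; Leaves the zero flag from SRL A untouched
JP1     RET Z            ; Leave when we finish - A has gone to zero
        SLA E            ; }
        RL D             ; } Multiply DE*2
        JR LOOP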
Naturally, when I found this again, I couldn't resist comparing it to the 8 bit multiply assembler in the library. I'm not sure which is faster (Boriel's code is a little unclear, in the fastcall case, about which registers are set with parameters; I think it's A and HL, though). I have a suspicion this code is faster, if only because it doesn't loop for a whole 8 bits if A is small - it quits as soon as A hits zero, which might optimize a few loops out.
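To put a rough number on that early exit, here's a little Python model of the routine above. It's only a sketch: the function name and the pass counter are my own inventions, it assumes the high byte of DE starts at zero, and it counts trips through LOOP rather than real T-states, so it says nothing about how the library routine compares.

```python
def shift_add_multiply(num1, num2):
    """Model of the Z80 loop: returns (product, passes through LOOP)."""
    a = num1 & 0xFF       # A register holds Num1
    de = num2 & 0xFF      # DE register pair holds Num2 (high byte assumed zero)
    hl = 0                # HL register pair is the running total
    passes = 0
    while True:
        passes += 1
        carry = a & 1     # the 1's column bit that falls into the carry flag
        a >>= 1           # divide A by 2
        if carry:
            hl = (hl + de) & 0xFFFF   # ADD HL,DE
        if a == 0:        # RET Z - nothing left in A to examine
            return hl, passes
        de = (de << 1) & 0xFFFF       # SLA E / RL D - multiply DE by 2


print(shift_add_multiply(3, 200))   # small Num1: (600, 2)
print(shift_add_multiply(200, 3))   # large Num1: (600, 8)
```

Same answer both ways round, but with the small number in A the loop runs twice instead of eight times. So if the trick is worth anything, it's worth most when the smaller operand goes in A.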
Anyway, just for your perusal. Like I said, I'm a little beyond my depth on this one; but I love to optimize where I can!
I'm glad LCD mentioned the HiSoft compiler - because I think it does an awesome job. The limitation with it is that it's ON the Spectrum, so you need room for BASIC + compiler + compiled code [though it can be clever and delete the BASIC as it goes to make more room]. I'm hoping Boriel's compiler will eventually exceed the speed of HiSoft's - because a compiler running outside the Spectrum can be much cleverer than one locked into it!