***boriel*** · 06-04-2010, 08:13 PM

Okay, today I've managed to find a little time for 8-bit multiplication. This is the current (new) code:

Code:
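    ld b, 8          ; 8 bits to process
    ld l, a          ; L = first factor
    xor a            ; A = 0 (accumulator)
__MUL8LOOP:
    add a, a         ; a *= 2
    sla l            ; carry = next bit of L, most significant first
    jp nc, __MUL8B   ; bit is 0: skip the add
    add a, h         ; bit is 1: add the second factor
__MUL8B:
    djnz __MUL8LOOP
    ret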
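
To check the loop's logic away from the Spectrum, here is a minimal Python model of the routine above. It is only a sketch: the name `mul8_model` is made up for illustration, and it assumes the factors arrive in A and H and that the result is left in A, truncated to 8 bits.

```python
def mul8_model(a: int, h: int) -> int:
    """Model of the loop above: returns (a * h) modulo 256."""
    h &= 0xFF
    l = a & 0xFF                        # ld l, a
    acc = 0                             # xor a
    for _ in range(8):                  # ld b, 8 ... djnz __MUL8LOOP
        acc = (acc + acc) & 0xFF        # add a, a
        carry = l >> 7                  # sla l: top bit goes to carry
        l = (l << 1) & 0xFF
        if carry:                       # jp nc, __MUL8B skips the add when carry is 0
            acc = (acc + h) & 0xFF      # add a, h
    return acc


print(mul8_model(8, 165))  # 8 * 165 = 1320, truncated to a byte: prints 40
```

Because the accumulator is doubled before each add, the bits of the first factor are consumed from the most significant end, and anything above bit 7 simply falls off.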

And this is yours, with JR replaced by JP (2 T-states faster each time the jump is taken). I've also commented it heavily, because some unneeded instructions have been removed.

Code:
    LD E, H  ; H is the 2nd factor
    LD HL, 0
    LD D, L  ; D = 0, so DE = the original H
    ;LD A, (NUM1) ;; Not needed, already done
LOOP:
    ;; RR A ; (Divide A by 2 - copying the 1's column bit into the carry flag.)
    RRA ; 1 byte, 4 T-states; RR A is 2 bytes, 8 T-states. Same rotation, but RRA leaves Z untouched (RR A sets it), so the RET Z below no longer tests A
    ; NOTE: JR is 3 T-states faster than JP when the condition is not met, and 2 T-states slower when it is
    ; Most numbers are likely to be small (more 0 bits than 1 bits), so the jump is usually taken: use JP
    JP NC, JP1 ; (Jump over the add if we have to)
    ADD HL, DE ; 11 T-states
JP1:
    RET Z ; (Leave when we finish - A has gone to zero) ; 5 T-states if condition not met, 11 if met
    SLA E  ; 8 T-states
    ; RL D   ;; Multiply DE*2 ; Needless in 8 bit, only E is needed!
    JP LOOP ; 2 T-states faster than JR
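
The JR-versus-JP comments in the routine above come down to a small piece of arithmetic, sketched here in Python. The T-state figures are the standard Z80 timings (JR cc: 12 taken, 7 not taken; JP cc: 10 either way). The function names are made up for illustration.

```python
JR_CC_TAKEN = 12      # T-states for JR cc when the jump is taken
JR_CC_NOT_TAKEN = 7   # T-states for JR cc when execution falls through
JP_CC = 10            # T-states for JP cc, taken or not


def jr_expected(p_taken: float) -> float:
    """Average cost of JR cc when the jump is taken with probability p_taken."""
    return p_taken * JR_CC_TAKEN + (1.0 - p_taken) * JR_CC_NOT_TAKEN


def break_even() -> float:
    """Probability of a taken jump at which JR cc and JP cc cost the same."""
    return (JP_CC - JR_CC_NOT_TAKEN) / (JR_CC_TAKEN - JR_CC_NOT_TAKEN)


print(break_even())       # 0.6
print(jr_expected(0.75))  # 10.75, already slower than JP's flat 10
```

So JP only pays off when the jump is taken more than 60% of the time, which is the bet being made on small numbers with mostly zero bits. JP also costs one byte more than JR.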

Note that your routine returns its result in L, not in A. Perhaps it could be rearranged?

Here's the benchmark I've used to time MUL8:

Code:
DIM a as UByte = 8
DIM t as UInteger AT 23672 ' t = FRAMES
DIM q as UByte
DIM tmp as UInteger
POKE t, 0 : ' Sets the clock to 0 in a single instruction
FOR tmp = 0 to 65534
    q = a * 165
NEXT
PRINT CAST(Fixed, t) / 50
END ' End the program OK instead of with a STOP error (STOP is an "error")
PRINT q ' Avoid -O3 variable removal

I haven't timed your routine. To do so, edit mul8.asm in library-asm, replacing the mul8 code with yours.
Also remember that you have the old mul8, not the new one. With the new one, this benchmark takes 8.11 seconds.
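
To put the 8.11 figure in per-multiplication terms, here is a rough conversion in Python. It assumes a 3.5 MHz CPU clock (the 48K Spectrum's nominal speed, not stated above). The result covers the whole FOR iteration, including the loop overhead, the assignment and the frame interrupts, so it is only an upper bound on mul8 itself.

```python
ITERATIONS = 65535        # FOR tmp = 0 TO 65534
TOTAL_SECONDS = 8.11      # benchmark result with the new mul8
CLOCK_HZ = 3_500_000      # assumption: nominal 48K Spectrum clock

seconds_per_iteration = TOTAL_SECONDS / ITERATIONS
t_states_per_iteration = seconds_per_iteration * CLOCK_HZ

print(f"{seconds_per_iteration * 1e6:.1f} microseconds per iteration")
print(f"about {t_states_per_iteration:.0f} T-states per iteration")
```

That works out to roughly 124 microseconds, or about 433 T-states, per pass through the loop. The frame counter ticks 50 times a second, so the measurement itself is only good to 0.02 seconds.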