Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Faster Square Roots
#1
I've got a version that does LONG as well as uInteger later in this thread.

I note that the compiler uses the Spectrum ROM square root routine. This routine is hideously slow. It actually calculates x^(0.5) instead, and takes ages about it. The Newton-Raphson method would be a lot faster, and pretty easy to put in.

If you are willing to sacrifice accuracy, an integer square root would be faster still. For a lot of situations, an integer root would be just fine - for example, if I had to calculate the nearest of two objects on screen, I'm going to have to use pythagoras' theorum to calculate distances. [ A^2 = B^2 + C^2 ] that needs square roots to make a distance. But probably the nearest whole pixel would be a perfectly good enough result!

So, here are two functions, and some code to demonstrate them. One is a perfect replacement for sqr.asm in the library, actually - it's full floating point compatible, 100% accurate, and about 3-6 times faster. It actually uses the FP-Calculator in the rom. It just doesn't use the SQR command. [Note: It comes back with something instead of an error in case of a negative square root request. Boriel - you might want to change that behavor. Not sure.] The integer version... well - look for yourself! I reckon it's about 40 times faster than the fast version.

Also: Should be able to do a something similar for a 32 bit LONG integer.

Copy and compile this program. I hope you like it:

Code:
FUNCTION FASTCALL SQRT (radicand as FLOAT) as FLOAT
ASM
; FLOATS arrive in A ED CB
;A is the exponent.
          
          AND   A               ; Test for zero argument
          RET   Z               ; Return with zero.
          
          ;Strictly we should test the number for being negative and quit if it is.
          ;But let's assume we like imaginary numbers, hmm?
          ; If you'd rather break it change to a jump to an error below.
          ;BIT   7,(HL)          ; Test the bit.
          ;JR    NZ,REPORT       ; back to REPORT_A
                                ; 'Invalid argument'
          RES 7,E               ; Now it's a positive number, no matter what.
          
          call __FPSTACK_PUSH   ; Okay, We put it on the calc stack. Stack contains ABS(x)
                    
          ;   Halve the exponent to achieve a good guess.(accurate with .25 16 64 etc.)

                                ; Remember, A is the exponent.
          XOR   $80             ; toggle sign of exponent
          SRA   A               ; shift right, bit 7 unchanged.
          INC   A               ;
          JR    Z,ASIS          ; forward with say .25 -> .5
          JP    P,ASIS          ; leave increment if value > .5
          DEC   A               ; restore to shift only.
ASIS:     XOR   $80             ; restore sign.
          
          call __FPSTACK_PUSH   ; Okay, NOW we put the guess on the stack
          rst  28h    ; ROM CALC    ;;guess,x
          DEFB $C3              ;;st-mem-3              x,guess
          DEFB $02              ;;delete                x

SLOOP:    DEFB  $31             ;;duplicate             x,x.
          DEFB  $E3             ;;get-mem-3             x,x,guess
          DEFB  $C4             ;;st-mem-4              x,x,guess
          DEFB  $05             ;;div                   x,x/guess.
          DEFB  $E3             ;;get-mem-3             x,x/guess,guess          
          DEFB  $0F             ;;addition              x,x/guess+guess          
          DEFB  $A2             ;;stk-half              x,x/guess+guess,.5
          DEFB  $04             ;;multiply              x,(x/guess+guess)*.5
          DEFB  $C3             ;;st-mem-3              x,newguess
          DEFB  $E4             ;;get-mem-4             x,newguess,oldguess
          DEFB  $03             ;;subtract              x,newguess-oldguess
          DEFB  $2A             ;;abs                   x,difference.
          DEFB  $37             ;;greater-0             x,(0/1).
          DEFB  $00             ;;jump-true             x.

          DEFB  SLOOP - $       ;;to sloop              x.

          DEFB  $02             ;;delete                .
          DEFB  $E3             ;;get-mem-3             retrieve final guess.
          DEFB  $38             ;;end-calc              sqr x.

          jp __FPSTACK_POP
          
END ASM
END FUNCTION


FUNCTION FASTCALL SQRT16(radicand as uInteger) as uByte
asm
    XOR A
    AND A
    
    ld  a,l
    ld  l,h
    ld    de,0040h    ; 40h appends "01" to D
    ld    h,d

    ld b,7
sqrt16loop:
    sbc    hl,de        ; IF speed is critical, and you don't mind spending the extra bytes, you could unroll this loop 7 times instead of DJNZ.
    jr    nc,$+3    
    add    hl,de        
    ccf            
    rl    d        
    rla            
    adc    hl,hl        
    rla            
    adc    hl,hl        
    DJNZ sqrt16loop
    
    sbc    hl,de        ; optimised last iteration
    ccf
    rl    d

    ld a,d
end asm
END FUNCTION


FUNCTION t AS ULONG
   RETURN INT((65536 * PEEK (23674) + 256 * PEEK(23673) + PEEK (23672)))
END FUNCTION

CLS

DIM a,b as float
DIM i as uInteger
DIM time as long

PRINT "ROM","FAST"

REM show it's as accurate
for i=1 to 15
    LET a=rnd * 32768
    PRINT SQR(a),SQRT(a)
next i

PRINT
PRINT "Over 500 Cycles:"
PRINT

REM ROM version 500 times.
LET time=t()
for i=1 to 500
    b=SQR(a)
next i
PRINT "Rom routine: ";t()-time;" Frames."

REM MY version 500 times.
LET time=t()
for i=1 to 500
    b=SQRT(a)
next i
PRINT "Fast routine: ";t()-time;" Frames."

PRINT
PRINT "PRESS A KEY"

PAUSE 1: PAUSE 0

CLS

PRINT "NUM   FAST      INTEGER"

REM show it's as accurate
for i=1 to 15
    LET a=INT(rnd * 32768)
    PRINT a;TAB 6;SQRT(a);TAB 16;SQRT16(a)
next i

PRINT
PRINT "Over 500 Cycles:"
PRINT

REM MY version 500 times.
LET time=t()
for i=1 to 500
    b=SQRT(a)
next i
PRINT "Fast routine: ";t()-time;" Frames."

REM MY Integer version 500 times.
LET time=t()
for i=1 to 500
    b=SQRT16(a)
next i
PRINT "Integer routine: ";t()-time;" Frames."
Reply


Messages In This Thread

Forum Jump:


Users browsing this thread: 1 Guest(s)