ZX BASIC:ISqrt.bas

From BorielWiki
Jump to: navigation, search

An Integer square root is the nearest whole number smaller than the full square root answer. So the integer square root of 10 is 3 instead of 3.162277. You'd get the same answer as INT(SQR(x)) with an integer square root function.

For things like games programming, this is often near enough - for example, the distance formula, based on pythagoras' equation A^2=B^2+C^2 only works if you square root the answer. If you need to find the distances between your items, then you're going to be doing a lot of square roots, and you're going to need to do them FAST.

This function returns integer square roots. For numbers less than 65536, it's about 100 times faster, because it can do 16 bit calculation. For longer numbers, it has to do 32 bit calculations, which are less than optimal on an 8 bit processor! It's still about 50 times faster than the ROM routine, however.

If you want completely accurate results, you should use the floating point fast routine over at fSqrt.bas.

 
FUNCTION FASTCALL iSqrt (num AS ULONG) AS UINTEGER
REM incoming is DEHL
REM output is HL
 
ASM
LD A,D
OR E
JP Z, sqrtLF16bit ; we're inside a 16 bit number. We can use the faster version.
 
LD b,16 ; b times round
EXX ; Out TO root AND REM - we're doing most of this in alternate registers.
LD DE,0
LD HL,0 ; DEHL = REMainder
LD BC,0 ; BC = root
EXX   ;back TO num AND LOOP
sqrtLFasmloop:
EXX  ; out TO root AND REM
 
SLA  C ; root <<= 1
RL  B   ;
 
SLA L ; REM=rem<<1
RL  H  ;
RL  E    ;
RL  D     ;
 
  
SLA L ; REM=rem<<1
RL  H  ;
RL  E    ;
RL  D     ;
EXX  ; back TO Num AND LOOP 
 
LD a,d    ; A = inputnum>>30
AND 192
RLCA
RLCA
 
SLA  L ; num <<= 1
RL  H
RL  E
RL  D
 
SLA  L ; num <<= 1
RL  H
RL  E
RL  D
 
EXX  ; out TO root AND REM
 
  ADD A,L     ; a=a+L              ; REM=REM+num>>30
  LD L,A      ; a-> L               ;
  JR NC, sqrtLFasmloophop1           ;
  INC H
  JR NC, sqrtLFasmloophop1
  INC DE           ;                               ;
 
sqrtLFasmloophop1:
  INC BC                       ; root=root+1
sqrtLFasmloophop2:
 
  ; DEHL = REMainder
  ; BC = root
  
  ; IF REM >= root then
  LD A,D
  OR E
  JR NZ, sqrtLFasmthen ; IF REM > 65535 then rem is definitely > root and we go to true
  
  LD A, H
  CP B
  JR C, sqrtLFasmelse ; H<B - that is REM<root so rem>=root is false and we go to else 
  JR NZ, sqrtLFasmthen ; H isn't zero though, so we could do a carry from it, so we're good to say HL is larger.
  
  ; IF h is out, THEN it's down to L and C 
  LD A,L
  CP C
  JR C, sqrtLFasmelse ; L<C - that is REM<root so rem>=root is false and we go to else
  ; must be true - GO TO true.
      
  sqrtLFasmthen:
  ;REMainder=remainder-root
  AND A ; clear carry flag
  SBC HL,BC ; take root away from the lower half of REM.
  JP NC, sqrtLFasmhop3 ; we didn't take away too much, so we're okay to loop round.
 
  ; IF we're here, we did take away too much. We need to borrow from DE
  DEC DE ; borrow off DE
  
  sqrtLFasmhop3:
  INC BC ;root=root+1
  JP sqrtLFasmloopend
                  
  ;else 
  sqrtLFasmelse:
  DEC BC ;root=root-1
  ;end IF
  
sqrtLFasmloopend:
  
EXX  ; back TO num
  DJNZ sqrtLFasmloop
  
EXX ; out TO root AND REM
  PUSH BC
EXX ; back TO normal
POP HL
   
    
SRA  H
RES 7,H
RR  L       ; Hl=HL/2 - root/2 is the answer.
jr sqrtLFexitFunction
 
sqrtLF16bit:
 
ld  a,l
ld  l,h
ld   de,0040h   ; 40h appends "01" TO D
ld   h,d
ld b,7
sqrtLFsqrt16loop:
  sbc   hl,de      ; IF speed is critical, AND you don't mind spending the extra bytes,
                   ; you could unroll this LOOP 7 times instead of DJNZ.
  
  ; deprecated because of issues - jr nc,$+3 (note that IF you unroll this LOOP, you'll need 7 labels for the jumps the other way!)
  jr    nc,sqrtLFsqrthop1   
  add   hl,de      
 
sqrtLFsqrthop1:      
  ccf         
  rl    d      
  rla         
  adc   hl,hl      
  rla         
  adc   hl,hl      
DJNZ sqrtLFsqrt16loop
      
sbc   hl,de      ; optimised last iteration
ccf
rl   d
ld h,0
ld l,d 
ld de,0
sqrtLFexitFunction:   
END ASM
 
END FUNCTION
Personal tools
Namespaces

Variants
Actions
Navigation
Toolbox
Translate
Google AdSense