Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Not so much a bug, as slow!
#1
This program was posted on the WOS forums:

Code:
For x=-100 To 100
For y=-100 To 100
If (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 Or (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 then plot x+100,96-y
Next y
Next x

And if run as is, takes about 9 seconds in Basic, I think.

As ZX Basic:

Code:
FUNCTION t() as uLong
asm
    DI
    LD DE,(23674)
    LD D,0
    LD HL,(23672)
    EI
end asm
end function

DIM x,y as integer
DIM time as uLONG

time=t()

For x=-100 To 100
For y=-100 To 100
If (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 Or (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 then plot x+100,96-y
END IF
Next y
Next x  

time=t()-time
print CAST(float,time)/50

It takes 53 seconds!
Reply
#2
Okay. Fairly convinced there's a bug here.

zxb version 1.3.0-S928

Here's an expanded program in full - was looking at perhaps faster squaring?

Anyway, the retruned value is weird.

A statement of: print x;" ";fastSquares(ABS x);" = ";CAST(uLong, x)*x ;"[]"

Comes back lines like

-61 3721 = 3721 [] 0 []

There's no way it should print anything after it's confirmed the function does the same thing as the X*X calculation. No [] twice. (I put that on the end because it was adding a bunch of zeroes instead without it!)

Code:
FUNCTION t() as uLong
asm
    DI
    LD DE,(23674)
    LD D,0
    LD HL,(23672)
    EI
end asm
end function


FUNCTION FASTCALL fastSquares (number as Byte) as uInteger
ASM
; Calculates (number^2) - does this by the fastest (and most size expensive) method - lookup table.
                        ; This function arrives with "number" - a byte in register A
AND A                   ; Set flags based on A
BIT 7,a                 ; Check if this is a negative number
JR Z,fastSquaresStart ; No? We jump into main routine.
NEG                     ; Negate A to make it positive.

fastSquaresStart:    
ADD A,A                 ; Think of this as A=A+A - that is it doubles the value in A (Because each entry in our table is 2 bytes long)
LD C,A                  ; LET C=A
LD B,0                  ; LET B=0
LD HL,SquaresTable  ; LET HL=address Of Table of HalfSquares.
ADD HL,BC               ; ADD the value in BC to the value in HL - means that HL is the address of the answer result.
LD A,(HL)               ; LET A=PEEK (HL)
INC HL                  ; HL=HL+1
LD L,(HL)               ; LET L=PEEK (HL)
LD H,A                  ; LET H=A
RET                     ; Return - with the answere in HL (High byte in H, Low Byte in L)

;table of (a^2)

SquaresTable:
DEFB 0,0   ;0^2=0
DEFB 0,1   ;1^2=1
DEFB 0,4   ;2^2=4
DEFB 0,9   ;3^2=9
DEFB 0,16   ;4^2=16
DEFB 0,25   ;5^2=25
DEFB 0,36   ;6^2=36
DEFB 0,49   ;7^2=49
DEFB 0,64   ;8^2=64
DEFB 0,81   ;9^2=81
DEFB 0,100   ;10^2=100
DEFB 0,121   ;11^2=121
DEFB 0,144   ;12^2=144
DEFB 0,169   ;13^2=169
DEFB 0,196   ;14^2=196
DEFB 0,225   ;15^2=225
DEFB 1,0   ;16^2=256
DEFB 1,33   ;17^2=289
DEFB 1,68   ;18^2=324
DEFB 1,105   ;19^2=361
DEFB 1,144   ;20^2=400
DEFB 1,185   ;21^2=441
DEFB 1,228   ;22^2=484
DEFB 2,17   ;23^2=529
DEFB 2,64   ;24^2=576
DEFB 2,113   ;25^2=625
DEFB 2,164   ;26^2=676
DEFB 2,217   ;27^2=729
DEFB 3,16   ;28^2=784
DEFB 3,73   ;29^2=841
DEFB 3,132   ;30^2=900
DEFB 3,193   ;31^2=961
DEFB 4,0   ;32^2=1024
DEFB 4,65   ;33^2=1089
DEFB 4,132   ;34^2=1156
DEFB 4,201   ;35^2=1225
DEFB 5,16   ;36^2=1296
DEFB 5,89   ;37^2=1369
DEFB 5,164   ;38^2=1444
DEFB 5,241   ;39^2=1521
DEFB 6,64   ;40^2=1600
DEFB 6,145   ;41^2=1681
DEFB 6,228   ;42^2=1764
DEFB 7,57   ;43^2=1849
DEFB 7,144   ;44^2=1936
DEFB 7,233   ;45^2=2025
DEFB 8,68   ;46^2=2116
DEFB 8,161   ;47^2=2209
DEFB 9,0   ;48^2=2304
DEFB 9,97   ;49^2=2401
DEFB 9,196   ;50^2=2500
DEFB 10,41   ;51^2=2601
DEFB 10,144   ;52^2=2704
DEFB 10,249   ;53^2=2809
DEFB 11,100   ;54^2=2916
DEFB 11,209   ;55^2=3025
DEFB 12,64   ;56^2=3136
DEFB 12,177   ;57^2=3249
DEFB 13,36   ;58^2=3364
DEFB 13,153   ;59^2=3481
DEFB 14,16   ;60^2=3600
DEFB 14,137   ;61^2=3721
DEFB 15,4   ;62^2=3844
DEFB 15,129   ;63^2=3969
DEFB 16,0   ;64^2=4096
DEFB 16,129   ;65^2=4225
DEFB 17,4   ;66^2=4356
DEFB 17,137   ;67^2=4489
DEFB 18,16   ;68^2=4624
DEFB 18,153   ;69^2=4761
DEFB 19,36   ;70^2=4900
DEFB 19,177   ;71^2=5041
DEFB 20,64   ;72^2=5184
DEFB 20,209   ;73^2=5329
DEFB 21,100   ;74^2=5476
DEFB 21,249   ;75^2=5625
DEFB 22,144   ;76^2=5776
DEFB 23,41   ;77^2=5929
DEFB 23,196   ;78^2=6084
DEFB 24,97   ;79^2=6241
DEFB 25,0   ;80^2=6400
DEFB 25,161   ;81^2=6561
DEFB 26,68   ;82^2=6724
DEFB 26,233   ;83^2=6889
DEFB 27,144   ;84^2=7056
DEFB 28,57   ;85^2=7225
DEFB 28,228   ;86^2=7396
DEFB 29,145   ;87^2=7569
DEFB 30,64   ;88^2=7744
DEFB 30,241   ;89^2=7921
DEFB 31,164   ;90^2=8100
DEFB 32,89   ;91^2=8281
DEFB 33,16   ;92^2=8464
DEFB 33,201   ;93^2=8649
DEFB 34,132   ;94^2=8836
DEFB 35,65   ;95^2=9025
DEFB 36,0   ;96^2=9216
DEFB 36,193   ;97^2=9409
DEFB 37,132   ;98^2=9604
DEFB 38,73   ;99^2=9801
DEFB 39,16   ;100^2=10000
DEFB 39,217   ;101^2=10201
DEFB 40,164   ;102^2=10404
DEFB 41,113   ;103^2=10609
DEFB 42,64   ;104^2=10816
DEFB 43,17   ;105^2=11025
DEFB 43,228   ;106^2=11236
DEFB 44,185   ;107^2=11449
DEFB 45,144   ;108^2=11664
DEFB 46,105   ;109^2=11881
DEFB 47,68   ;110^2=12100
DEFB 48,33   ;111^2=12321
DEFB 49,0   ;112^2=12544
DEFB 49,225   ;113^2=12769
DEFB 50,196   ;114^2=12996
DEFB 51,169   ;115^2=13225
DEFB 52,144   ;116^2=13456
DEFB 53,121   ;117^2=13689
DEFB 54,100   ;118^2=13924
DEFB 55,81   ;119^2=14161
DEFB 56,64   ;120^2=14400
DEFB 57,49   ;121^2=14641
DEFB 58,36   ;122^2=14884
DEFB 59,25   ;123^2=15129
DEFB 60,16   ;124^2=15376
DEFB 61,9   ;125^2=15625
DEFB 62,4   ;126^2=15876
DEFB 63,1   ;127^2=16129
DEFB 64,0   ;128^2=16384
DEFB 65,1   ;129^2=16641
DEFB 66,4   ;130^2=16900
DEFB 67,9   ;131^2=17161
DEFB 68,16   ;132^2=17424
DEFB 69,25   ;133^2=17689
DEFB 70,36   ;134^2=17956
DEFB 71,49   ;135^2=18225
DEFB 72,64   ;136^2=18496
DEFB 73,81   ;137^2=18769
DEFB 74,100   ;138^2=19044
DEFB 75,121   ;139^2=19321
DEFB 76,144   ;140^2=19600
DEFB 77,169   ;141^2=19881
DEFB 78,196   ;142^2=20164
DEFB 79,225   ;143^2=20449
DEFB 81,0   ;144^2=20736
DEFB 82,33   ;145^2=21025
DEFB 83,68   ;146^2=21316
DEFB 84,105   ;147^2=21609
DEFB 85,144   ;148^2=21904
DEFB 86,185   ;149^2=22201
DEFB 87,228   ;150^2=22500
DEFB 89,17   ;151^2=22801
DEFB 90,64   ;152^2=23104
DEFB 91,113   ;153^2=23409
DEFB 92,164   ;154^2=23716
DEFB 93,217   ;155^2=24025
DEFB 95,16   ;156^2=24336
DEFB 96,73   ;157^2=24649
DEFB 97,132   ;158^2=24964
DEFB 98,193   ;159^2=25281
DEFB 100,0   ;160^2=25600
DEFB 101,65   ;161^2=25921
DEFB 102,132   ;162^2=26244
DEFB 103,201   ;163^2=26569
DEFB 105,16   ;164^2=26896
DEFB 106,89   ;165^2=27225
DEFB 107,164   ;166^2=27556
DEFB 108,241   ;167^2=27889
DEFB 110,64   ;168^2=28224
DEFB 111,145   ;169^2=28561
DEFB 112,228   ;170^2=28900
DEFB 114,57   ;171^2=29241
DEFB 115,144   ;172^2=29584
DEFB 116,233   ;173^2=29929
DEFB 118,68   ;174^2=30276
DEFB 119,161   ;175^2=30625
DEFB 121,0   ;176^2=30976
DEFB 122,97   ;177^2=31329
DEFB 123,196   ;178^2=31684
DEFB 125,41   ;179^2=32041
DEFB 126,144   ;180^2=32400
DEFB 127,249   ;181^2=32761
DEFB 129,100   ;182^2=33124
DEFB 130,209   ;183^2=33489
DEFB 132,64   ;184^2=33856
DEFB 133,177   ;185^2=34225
DEFB 135,36   ;186^2=34596
DEFB 136,153   ;187^2=34969
DEFB 138,16   ;188^2=35344
DEFB 139,137   ;189^2=35721
DEFB 141,4   ;190^2=36100
DEFB 142,129   ;191^2=36481
DEFB 144,0   ;192^2=36864
DEFB 145,129   ;193^2=37249
DEFB 147,4   ;194^2=37636
DEFB 148,137   ;195^2=38025
DEFB 150,16   ;196^2=38416
DEFB 151,153   ;197^2=38809
DEFB 153,36   ;198^2=39204
DEFB 154,177   ;199^2=39601
DEFB 156,64   ;200^2=40000
DEFB 157,209   ;201^2=40401
DEFB 159,100   ;202^2=40804
DEFB 160,249   ;203^2=41209
DEFB 162,144   ;204^2=41616
DEFB 164,41   ;205^2=42025
DEFB 165,196   ;206^2=42436
DEFB 167,97   ;207^2=42849
DEFB 169,0   ;208^2=43264
DEFB 170,161   ;209^2=43681
DEFB 172,68   ;210^2=44100
DEFB 173,233   ;211^2=44521
DEFB 175,144   ;212^2=44944
DEFB 177,57   ;213^2=45369
DEFB 178,228   ;214^2=45796
DEFB 180,145   ;215^2=46225
DEFB 182,64   ;216^2=46656
DEFB 183,241   ;217^2=47089
DEFB 185,164   ;218^2=47524
DEFB 187,89   ;219^2=47961
DEFB 189,16   ;220^2=48400
DEFB 190,201   ;221^2=48841
DEFB 192,132   ;222^2=49284
DEFB 194,65   ;223^2=49729
DEFB 196,0   ;224^2=50176
DEFB 197,193   ;225^2=50625
DEFB 199,132   ;226^2=51076
DEFB 201,73   ;227^2=51529
DEFB 203,16   ;228^2=51984
DEFB 204,217   ;229^2=52441
DEFB 206,164   ;230^2=52900
DEFB 208,113   ;231^2=53361
DEFB 210,64   ;232^2=53824
DEFB 212,17   ;233^2=54289
DEFB 213,228   ;234^2=54756
DEFB 215,185   ;235^2=55225
DEFB 217,144   ;236^2=55696
DEFB 219,105   ;237^2=56169
DEFB 221,68   ;238^2=56644
DEFB 223,33   ;239^2=57121
DEFB 225,0   ;240^2=57600
DEFB 226,225   ;241^2=58081
DEFB 228,196   ;242^2=58564
DEFB 230,169   ;243^2=59049
DEFB 232,144   ;244^2=59536
DEFB 234,121   ;245^2=60025
DEFB 236,100   ;246^2=60516
DEFB 238,81   ;247^2=61009
DEFB 240,64   ;248^2=61504
DEFB 242,49   ;249^2=62001
DEFB 244,36   ;250^2=62500
DEFB 246,25   ;251^2=63001
DEFB 248,16   ;252^2=63504
DEFB 250,9   ;253^2=64009
DEFB 252,4   ;254^2=64516
DEFB 254,1   ;255^2=65025
END ASM
END FUNCTION

DIM x,y as integer
DIM time as uLONG

'time=t()
'
For x=-100 To 100
For y=-100 To 100
'If ((x<<1)-25)*((x<<1)-25)+(y-50)*(y-50)<200 Or ((x<<1)+25)*((x<<1)+25)+(y-50)*(y-50)<200 then plot x+100,96-y
'END IF
print x;" ";fastSquares(ABS x);"  =  ";CAST(uLong, x)*x ;"[]"
Next y
Next x  

'time=t()-time
'print CAST(float,time)/50
Reply
#3
As an addendum - if I ask it to calculate X^2 in the middle of the loop, it crashes the program. Calculating X*X works fine.
Reply
#4
britlion Wrote:This program was posted on the WOS forums:

Code:
For x=-100 To 100
For y=-100 To 100
If (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 Or (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 then plot x+100,96-y
Next y
Next x

And if run as is, takes about 9 seconds in Basic, I think.
I've run the above code in Sinclair BASIC, and it takes 688secs. (11min. 20secs. aprox.)
[Image: slowrun.th.png]

This is the listing:
[Image: slowlisting.th.png]

In ZX Basic, by default it uses Byte type, it overflows, and produces a random pattern and an Out of Screen error at the end.
When declaring float, it works, but it takes *twice* the time it does with Sinclair BASIC. This is something I need to investigate further.
So using Integer is the way to go, and it takes about a minute (52 secs, as you said).

From my point of view, this is okay. There is some optimizations to be done there (e.g. common factors, etc.). I hope to have some more time to check for this.
Reply
#5
britlion Wrote:Okay. Fairly convinced there's a bug here.

zxb version 1.3.0-S928

Here's an expanded program in full - was looking at perhaps faster squaring?

Anyway, the retruned value is weird.

A statement of: print x;" ";fastSquares(ABS x);" = ";CAST(uLong, x)*x ;"[]"

Comes back lines like

-61 3721 = 3721 [] 0 []

There's no way it should print anything after it's confirmed the function does the same thing as the X*X calculation. No [] twice. (I put that on the end because it was adding a bunch of zeroes instead without it!)
This is also ok. It happens that when the number changes from 6 digits to 4, what you see is the remaining digits from previous calculation.
Ensure that line erases previous results by adding extra spaces after "[]", this way:
Code:
print x;" ";fastSquares(ABS x);"  =  ";CAST(uLong, x)*x ;"[]    " ' <= 4 spaces is enough
Reply
#6
britlion Wrote:As an addendum - if I ask it to calculate X^2 in the middle of the loop, it crashes the program. Calculating X*X works fine.
This is expected, unfortunately: ZX Basic uses the ROM, and a^b raises Invalid Argument if a < 0.
Reply
#7
boriel Wrote:
britlion Wrote:As an addendum - if I ask it to calculate X^2 in the middle of the loop, it crashes the program. Calculating X*X works fine.
This is expected, unfortunately: ZX Basic uses the ROM, and a^b raises Invalid Argument if a < 0.

Wow. You're right. The ROM is completely wrong there. (-1 ^ 2 = 1). No, that's not your fault!
Reply
#8
Anyway, it's very strange that it takes *twice* the time when using FP comparing to Sinclair BASIC. It should take about the same. Still checking... :oops:
Reply
#9
Code:
asm
jr ZXBASICBorielVersionEnd
db "ZX Boriel BASIC version 1.3.0-s924"
ZXBASICBorielVersionEnd:
end asm


DIM x AS FLOAT
DIM y AS FLOAT
'DIM x AS INTEGER
'DIM y AS INTEGER

POKE 23674,0: POKE 23673,0: POKE 23672,0
FOR x=-100 TO 100
FOR y=-100 TO 100
IF (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 OR (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 THEN
PLOT x+100,96-y ' zxb version
'PLOT x+100,y-100 ' original BASIC version
END IF
NEXT y: NEXT x
PRINT (65536*PEEK 23674+256*PEEK 23673+PEEK 23672)/50

This program gives me:

Original BASIC
2040.34

ZX Boriel BASIC version 1.3.0-s924
1477

ZX Boriel BASIC version 1.3.0-s967
1491

ZX Boriel BASIC version 1.3.0-s979
1491
Reply
#10
There was a bug in the PNG listing in both the POKE statement and in the PEEK argument. 23674 got written as 26374.
Reply
#11
Darkstar Wrote:There was a bug in the PNG listing in both the POKE statement and in the PEEK argument. 23674 got written as 26374.
Confusedhock: OMG!!
I overlooked that!! Thank you. Smile
Anyway, I've found a floating point library. It is faster, but a bit less precise (it's like the traditional Ansi C IEEE FP library, 4 bytes per float, not 5). I guess I can implement it, and will be much faster, a bit less precise (but enough for most purposes), and take more memory. But more important: it will allow FP calculations in many Z80 machines, like SNES not having FP Rom routines.
Reply
#12
boriel Wrote:
Darkstar Wrote:There was a bug in the PNG listing in both the POKE statement and in the PEEK argument. 23674 got written as 26374.
Confusedhock: OMG!!
I overlooked that!! Thank you. Smile
Anyway, I've found a floating point library. It is faster, but a bit less precise (it's like the traditional Ansi C IEEE FP library, 4 bytes per float, not 5). I guess I can implement it, and will be much faster, a bit less precise (but enough for most purposes), and take more memory. But more important: it will allow FP calculations in many Z80 machines, like SNES not having FP Rom routines.

Just another one of those silly bugs that can always creep in.

Thanks for letting us know about the Ansi C IEEE FP library.
I do hope that the memory usage of it will not be more than the FLOAT versions that are now in use (s924 and older) and then I mean when it's compiled
and also dynamic run time usage. As you know you canĀ“t rely on the ZX ROM for other platforms so this will have to get implemented sooner or later.
But this will break compatibilty with the original ZX basic and it seems that the current version of FLOAT is somewhat faster than the original interperted
version, so why not keep it in the ZXB and add a SINGLE 32bit datatype that uses the new Ansi C IEEE FP library? Then the old FLOAT library will be accessible
through the --sinclair option and only pulled in if you define a variable as a FLOAT. In the end, mabe all Sinclair specific command will be used that way
like the SAVE command that will not work on the SNES and hardware specific commands like PRINT (screen addressing issues). This way comapitbillty
can be ensured while generic routines like the Ansi C IEEE FP library and commands like GOSUB can be kept the same across all individual platforms. That
reminds me that the FOR/NEXT routines are still not in accord with BASIC standars but with C instead as far as I know. I think it was in DO/LOOP but I
am not sure that the zxb compiler did a complex test to see if it had reached the exit condition when a CP 0 would have been fine.
Reply
#13
boriel Wrote:Anyway, I've found a floating point library. It is faster, but a bit less precise (it's like the traditional Ansi C IEEE FP library, 4 bytes per float, not 5). I guess I can implement it, and will be much faster, a bit less precise (but enough for most purposes), and take more memory.

In case the FP library mentioned above was not added to ZX BASIC yet, I would like to provide my 2 cents about this idea...

It seems ZX BASIC is used almost exclusively for games, assuming this list is accurate enough:

<!-- m --><a class="postlink" href="http://www.boriel.com/wiki/en/index.php/ZX_BASIC:Released_Programs">http://www.boriel.com/wiki/en/index.php ... d_Programs</a><!-- m -->

Notice that games almost never use FP for performance reasons, except to calculate INT (RND * n) typically. Also it's important for games to save as much memory as possible. Because of this, my suggestions are:
  • Change the ZX BASIC builder so it will only include the FP library if the program really uses FP somehow;


  • To avoid INT(RND * n) from forcing the builder to include the FP library (and probably also to optimize compiled code), change RND to accept an optional parameter, such that using RND(n) will return an UINTEGER value between 0 and n-1, and just using RND without parameters will return a FLOAT value as before. Alternatively, if this dual behavior would be a problem for the parser, then provide instead a new function INTRND(n) that works as I described here.

Makes sense?
Reply
#14
einar Wrote:
  • Change the ZX BASIC builder so it will only include the FP library if the program really uses FP somehow;

  • To avoid INT(RND * n) from forcing the builder to include the FP library (and probably also to optimize compiled code), change RND to accept an optional parameter, such that using RND(n) will return an UINTEGER value between 0 and n-1, and just using RND without parameters will return a FLOAT value as before. Alternatively, if this dual behavior would be a problem for the parser, then provide instead a new function INTRND(n) that works as I described here.

Makes sense?
It does! Wink
ZX Basic already does that. Not only it does not include any FP library if not used, but only includes the minimum code for the required functions (there is some #include and #require in the .asm libraries for that). When using a FP library not in ROM, the same will be done, but now the FP functions will take much more RAM (e.g. SIN, COS, etc).
Reply
#15
boriel Wrote:ZX Basic already does that.

Great! Smile

But does it also mean that INT(RND * n) is optimized such that it doesn't really use FP or even integer multiplication?

My concern is that, although RND implementation in ZX BASIC is really fast now (I believe it's now using Patrik Rak's implementation, right?), if it can only be accessed using FP calculations, it would be quite inefficient.
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)