Not so much a bug, as slow!

This program was posted on the WOS forums:

Code:
```
For x=-100 To 100
For y=-100 To 100
If (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 Or (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 then plot x+100,96-y
Next y
Next x
```

And if run as is, takes about 9 seconds in Basic, I think.

As ZX Basic:

Code:
```
FUNCTION t() as uLong
asm
    DI
    LD DE,(23674)
    LD D,0
    LD HL,(23672)
    EI
end asm
end function

DIM x,y as integer
DIM time as uLONG
time=t()
For x=-100 To 100
For y=-100 To 100
If (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 Or (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 then plot x+100,96-y
END IF
Next y
Next x
time=t()-time
print CAST(float,time)/50
```

It takes 53 seconds!
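
For readers unfamiliar with the timing trick: the `t()` function reads the three-byte FRAMES counter at addresses 23672 to 23674, which the Spectrum increments on every frame interrupt, and the program divides the difference by 50 to get seconds. A rough Python sketch of that arithmetic follows; the function name and the sample byte values are made up for illustration.

```python
def frames_to_seconds(low, mid, high, rate_hz=50):
    """Combine the three FRAMES bytes (23672, 23673, 23674) into seconds."""
    frames = low + 256 * mid + 65536 * high
    return frames / rate_hz

# Hypothetical reading: 2650 frames = 90 + 256 * 10
print(frames_to_seconds(90, 10, 0))  # 53.0
```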

Re: Not so much a bug, as slow!

Okay. Fairly convinced there's a bug here.

zxb version 1.3.0-S928

Here's an expanded program in full - was looking at perhaps faster squaring?

Anyway, the returned value is weird.

A statement of: print x;" ";fastSquares(ABS x);" = ";CAST(uLong, x)*x ;"[]"

Comes back lines like

-61 3721 = 3721 [] 0 []

There's no way it should print anything after it's confirmed the function does the same thing as the X*X calculation. No [] twice. (I put that on the end because it was adding a bunch of zeroes instead without it!)

Code:
```
FUNCTION t() as uLong
asm
    DI
    LD DE,(23674)
    LD D,0
    LD HL,(23672)
    EI
end asm
end function

FUNCTION FASTCALL fastSquares (number as Byte) as uInteger
ASM
; Calculates (number^2) - does this by the fastest (and most size expensive) method - lookup table.
                        ; This function arrives with "number" - a byte in register A
AND A                   ; Set flags based on A
BIT 7,a                 ; Check if this is a negative number
JR Z,fastSquaresStart   ; No? We jump into main routine.
NEG                     ; Negate A to make it positive.
fastSquaresStart:
ADD A,A                 ; Think of this as A=A+A - that is it doubles the value in A (Because each entry in our table is 2 bytes long)
LD C,A                  ; LET C=A
LD B,0                  ; LET B=0
LD HL,SquaresTable      ; LET HL=address of table of squares.
ADD HL,BC               ; ADD the value in BC to the value in HL - means that HL is the address of the answer result.
LD A,(HL)               ; LET A=PEEK (HL)
INC HL                  ; HL=HL+1
LD L,(HL)               ; LET L=PEEK (HL)
LD H,A                  ; LET H=A
RET                     ; Return - with the answer in HL (High byte in H, Low Byte in L)

;table of (a^2)
SquaresTable:
DEFB 0,0   ;0^2=0
DEFB 0,1   ;1^2=1
DEFB 0,4   ;2^2=4
DEFB 0,9   ;3^2=9
DEFB 0,16   ;4^2=16
DEFB 0,25   ;5^2=25
DEFB 0,36   ;6^2=36
DEFB 0,49   ;7^2=49
DEFB 0,64   ;8^2=64
DEFB 0,81   ;9^2=81
DEFB 0,100   ;10^2=100
DEFB 0,121   ;11^2=121
DEFB 0,144   ;12^2=144
DEFB 0,169   ;13^2=169
DEFB 0,196   ;14^2=196
DEFB 0,225   ;15^2=225
DEFB 1,0   ;16^2=256
DEFB 1,33   ;17^2=289
DEFB 1,68   ;18^2=324
DEFB 1,105   ;19^2=361
DEFB 1,144   ;20^2=400
DEFB 1,185   ;21^2=441
DEFB 1,228   ;22^2=484
DEFB 2,17   ;23^2=529
DEFB 2,64   ;24^2=576
DEFB 2,113   ;25^2=625
DEFB 2,164   ;26^2=676
DEFB 2,217   ;27^2=729
DEFB 3,16   ;28^2=784
DEFB 3,73   ;29^2=841
DEFB 3,132   ;30^2=900
DEFB 3,193   ;31^2=961
DEFB 4,0   ;32^2=1024
DEFB 4,65   ;33^2=1089
DEFB 4,132   ;34^2=1156
DEFB 4,201   ;35^2=1225
DEFB 5,16   ;36^2=1296
DEFB 5,89   ;37^2=1369
DEFB 5,164   ;38^2=1444
DEFB 5,241   ;39^2=1521
DEFB 6,64   ;40^2=1600
DEFB 6,145   ;41^2=1681
DEFB 6,228   ;42^2=1764
DEFB 7,57   ;43^2=1849
DEFB 7,144   ;44^2=1936
DEFB 7,233   ;45^2=2025
DEFB 8,68   ;46^2=2116
DEFB 8,161   ;47^2=2209
DEFB 9,0   ;48^2=2304
DEFB 9,97   ;49^2=2401
DEFB 9,196   ;50^2=2500
DEFB 10,41   ;51^2=2601
DEFB 10,144   ;52^2=2704
DEFB 10,249   ;53^2=2809
DEFB 11,100   ;54^2=2916
DEFB 11,209   ;55^2=3025
DEFB 12,64   ;56^2=3136
DEFB 12,177   ;57^2=3249
DEFB 13,36   ;58^2=3364
DEFB 13,153   ;59^2=3481
DEFB 14,16   ;60^2=3600
DEFB 14,137   ;61^2=3721
DEFB 15,4   ;62^2=3844
DEFB 15,129   ;63^2=3969
DEFB 16,0   ;64^2=4096
DEFB 16,129   ;65^2=4225
DEFB 17,4   ;66^2=4356
DEFB 17,137   ;67^2=4489
DEFB 18,16   ;68^2=4624
DEFB 18,153   ;69^2=4761
DEFB 19,36   ;70^2=4900
DEFB 19,177   ;71^2=5041
DEFB 20,64   ;72^2=5184
DEFB 20,209   ;73^2=5329
DEFB 21,100   ;74^2=5476
DEFB 21,249   ;75^2=5625
DEFB 22,144   ;76^2=5776
DEFB 23,41   ;77^2=5929
DEFB 23,196   ;78^2=6084
DEFB 24,97   ;79^2=6241
DEFB 25,0   ;80^2=6400
DEFB 25,161   ;81^2=6561
DEFB 26,68   ;82^2=6724
DEFB 26,233   ;83^2=6889
DEFB 27,144   ;84^2=7056
DEFB 28,57   ;85^2=7225
DEFB 28,228   ;86^2=7396
DEFB 29,145   ;87^2=7569
DEFB 30,64   ;88^2=7744
DEFB 30,241   ;89^2=7921
DEFB 31,164   ;90^2=8100
DEFB 32,89   ;91^2=8281
DEFB 33,16   ;92^2=8464
DEFB 33,201   ;93^2=8649
DEFB 34,132   ;94^2=8836
DEFB 35,65   ;95^2=9025
DEFB 36,0   ;96^2=9216
DEFB 36,193   ;97^2=9409
DEFB 37,132   ;98^2=9604
DEFB 38,73   ;99^2=9801
DEFB 39,16   ;100^2=10000
DEFB 39,217   ;101^2=10201
DEFB 40,164   ;102^2=10404
DEFB 41,113   ;103^2=10609
DEFB 42,64   ;104^2=10816
DEFB 43,17   ;105^2=11025
DEFB 43,228   ;106^2=11236
DEFB 44,185   ;107^2=11449
DEFB 45,144   ;108^2=11664
DEFB 46,105   ;109^2=11881
DEFB 47,68   ;110^2=12100
DEFB 48,33   ;111^2=12321
DEFB 49,0   ;112^2=12544
DEFB 49,225   ;113^2=12769
DEFB 50,196   ;114^2=12996
DEFB 51,169   ;115^2=13225
DEFB 52,144   ;116^2=13456
DEFB 53,121   ;117^2=13689
DEFB 54,100   ;118^2=13924
DEFB 55,81   ;119^2=14161
DEFB 56,64   ;120^2=14400
DEFB 57,49   ;121^2=14641
DEFB 58,36   ;122^2=14884
DEFB 59,25   ;123^2=15129
DEFB 60,16   ;124^2=15376
DEFB 61,9   ;125^2=15625
DEFB 62,4   ;126^2=15876
DEFB 63,1   ;127^2=16129
DEFB 64,0   ;128^2=16384
DEFB 65,1   ;129^2=16641
DEFB 66,4   ;130^2=16900
DEFB 67,9   ;131^2=17161
DEFB 68,16   ;132^2=17424
DEFB 69,25   ;133^2=17689
DEFB 70,36   ;134^2=17956
DEFB 71,49   ;135^2=18225
DEFB 72,64   ;136^2=18496
DEFB 73,81   ;137^2=18769
DEFB 74,100   ;138^2=19044
DEFB 75,121   ;139^2=19321
DEFB 76,144   ;140^2=19600
DEFB 77,169   ;141^2=19881
DEFB 78,196   ;142^2=20164
DEFB 79,225   ;143^2=20449
DEFB 81,0   ;144^2=20736
DEFB 82,33   ;145^2=21025
DEFB 83,68   ;146^2=21316
DEFB 84,105   ;147^2=21609
DEFB 85,144   ;148^2=21904
DEFB 86,185   ;149^2=22201
DEFB 87,228   ;150^2=22500
DEFB 89,17   ;151^2=22801
DEFB 90,64   ;152^2=23104
DEFB 91,113   ;153^2=23409
DEFB 92,164   ;154^2=23716
DEFB 93,217   ;155^2=24025
DEFB 95,16   ;156^2=24336
DEFB 96,73   ;157^2=24649
DEFB 97,132   ;158^2=24964
DEFB 98,193   ;159^2=25281
DEFB 100,0   ;160^2=25600
DEFB 101,65   ;161^2=25921
DEFB 102,132   ;162^2=26244
DEFB 103,201   ;163^2=26569
DEFB 105,16   ;164^2=26896
DEFB 106,89   ;165^2=27225
DEFB 107,164   ;166^2=27556
DEFB 108,241   ;167^2=27889
DEFB 110,64   ;168^2=28224
DEFB 111,145   ;169^2=28561
DEFB 112,228   ;170^2=28900
DEFB 114,57   ;171^2=29241
DEFB 115,144   ;172^2=29584
DEFB 116,233   ;173^2=29929
DEFB 118,68   ;174^2=30276
DEFB 119,161   ;175^2=30625
DEFB 121,0   ;176^2=30976
DEFB 122,97   ;177^2=31329
DEFB 123,196   ;178^2=31684
DEFB 125,41   ;179^2=32041
DEFB 126,144   ;180^2=32400
DEFB 127,249   ;181^2=32761
DEFB 129,100   ;182^2=33124
DEFB 130,209   ;183^2=33489
DEFB 132,64   ;184^2=33856
DEFB 133,177   ;185^2=34225
DEFB 135,36   ;186^2=34596
DEFB 136,153   ;187^2=34969
DEFB 138,16   ;188^2=35344
DEFB 139,137   ;189^2=35721
DEFB 141,4   ;190^2=36100
DEFB 142,129   ;191^2=36481
DEFB 144,0   ;192^2=36864
DEFB 145,129   ;193^2=37249
DEFB 147,4   ;194^2=37636
DEFB 148,137   ;195^2=38025
DEFB 150,16   ;196^2=38416
DEFB 151,153   ;197^2=38809
DEFB 153,36   ;198^2=39204
DEFB 154,177   ;199^2=39601
DEFB 156,64   ;200^2=40000
DEFB 157,209   ;201^2=40401
DEFB 159,100   ;202^2=40804
DEFB 160,249   ;203^2=41209
DEFB 162,144   ;204^2=41616
DEFB 164,41   ;205^2=42025
DEFB 165,196   ;206^2=42436
DEFB 167,97   ;207^2=42849
DEFB 169,0   ;208^2=43264
DEFB 170,161   ;209^2=43681
DEFB 172,68   ;210^2=44100
DEFB 173,233   ;211^2=44521
DEFB 175,144   ;212^2=44944
DEFB 177,57   ;213^2=45369
DEFB 178,228   ;214^2=45796
DEFB 180,145   ;215^2=46225
DEFB 182,64   ;216^2=46656
DEFB 183,241   ;217^2=47089
DEFB 185,164   ;218^2=47524
DEFB 187,89   ;219^2=47961
DEFB 189,16   ;220^2=48400
DEFB 190,201   ;221^2=48841
DEFB 192,132   ;222^2=49284
DEFB 194,65   ;223^2=49729
DEFB 196,0   ;224^2=50176
DEFB 197,193   ;225^2=50625
DEFB 199,132   ;226^2=51076
DEFB 201,73   ;227^2=51529
DEFB 203,16   ;228^2=51984
DEFB 204,217   ;229^2=52441
DEFB 206,164   ;230^2=52900
DEFB 208,113   ;231^2=53361
DEFB 210,64   ;232^2=53824
DEFB 212,17   ;233^2=54289
DEFB 213,228   ;234^2=54756
DEFB 215,185   ;235^2=55225
DEFB 217,144   ;236^2=55696
DEFB 219,105   ;237^2=56169
DEFB 221,68   ;238^2=56644
DEFB 223,33   ;239^2=57121
DEFB 225,0   ;240^2=57600
DEFB 226,225   ;241^2=58081
DEFB 228,196   ;242^2=58564
DEFB 230,169   ;243^2=59049
DEFB 232,144   ;244^2=59536
DEFB 234,121   ;245^2=60025
DEFB 236,100   ;246^2=60516
DEFB 238,81   ;247^2=61009
DEFB 240,64   ;248^2=61504
DEFB 242,49   ;249^2=62001
DEFB 244,36   ;250^2=62500
DEFB 246,25   ;251^2=63001
DEFB 248,16   ;252^2=63504
DEFB 250,9   ;253^2=64009
DEFB 252,4   ;254^2=64516
DEFB 254,1   ;255^2=65025
END ASM
END FUNCTION

DIM x,y as integer
DIM time as uLONG
'time=t()
'
For x=-100 To 100
For y=-100 To 100
'If ((x<<1)-25)*((x<<1)-25)+(y-50)*(y-50)<200 Or ((x<<1)+25)*((x<<1)+25)+(y-50)*(y-50)<200 then plot x+100,96-y
'END IF
print x;" ";fastSquares(ABS x);"  =  ";CAST(uLong, x)*x ;"[]"
Next y
Next x
'time=t()-time
'print CAST(float,time)/50
```
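
As a cross-check of the table layout (high byte first, then low byte), here is a small Python sketch that regenerates the entries and models the lookup. The names `squares_table` and `fast_squares` are illustrative only, and the model assumes the routine behaves exactly as its comments describe, with `ADD A,A` treated as an 8-bit add.

```python
def squares_table():
    """(high byte, low byte) of n*n for n = 0..255, in the DEFB order above."""
    return [divmod(n * n, 256) for n in range(256)]

def fast_squares(a):
    """Model of the lookup for a signed byte argument (-128..127)."""
    a &= 0xFF                    # the value as it sits in register A
    if a & 0x80:                 # BIT 7,A: negative?
        a = (-a) & 0xFF          # NEG
    offset = (a + a) & 0xFF      # ADD A,A is an 8-bit add, so it wraps
    flat = [b for pair in squares_table() for b in pair]
    high, low = flat[offset], flat[offset + 1]
    return high * 256 + low

# Print the first few entries in the same format as the listing
for n, (hi, lo) in enumerate(squares_table()[:4]):
    print(f"DEFB {hi},{lo}   ;{n}^2={n * n}")

print(fast_squares(-61))   # 3721
```

If this model is right, the doubled index only fits in 8 bits for arguments 0 to 127, so the entries from 128 upward would never be reached by this routine, and an argument of -128 would wrap to entry 0.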

Re: Not so much a bug, as slow!

As an addendum - if I ask it to calculate X^2 in the middle of the loop, it crashes the program. Calculating X*X works fine.

Re: Not so much a bug, as slow!

britlion wrote: This program was posted on the WOS forums:

Code:
```
For x=-100 To 100
For y=-100 To 100
If (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 Or (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 then plot x+100,96-y
Next y
Next x
```

And if run as is, takes about 9 seconds in Basic, I think.

I've run the above code in Sinclair BASIC, and it takes 688 secs (11 min 20 secs approx.).

This is the listing:

In ZX Basic, the variables default to the Byte type, which overflows and produces a random pattern and an Out of Screen error at the end.
When they are declared as Float, it works, but it takes *twice* as long as it does in Sinclair BASIC. This is something I need to investigate further.
So using Integer is the way to go, and it takes about a minute (52 secs, as you said).
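
To see why the Byte type cannot work here while Integer can, the following Python sketch wraps the largest values from the loop into 8-bit and 16-bit signed ranges. It only models the final wrap-around, not the compiler's actual promotion rules, and the helper names are made up for illustration.

```python
def to_int8(v):
    """Wrap an integer into the signed 8-bit range -128..127."""
    return (v + 128) % 256 - 128

def to_int16(v):
    """Wrap an integer into the signed 16-bit range -32768..32767."""
    return (v + 32768) % 65536 - 32768

# Largest sum the test computes: x = -100, y = -100
worst = (-100 // 2 - 25) ** 2 + (-100 - 50) ** 2   # 5625 + 22500

print(worst)              # 28125
print(to_int16(worst))    # 28125, fits in a 16-bit Integer
print(to_int8(worst))     # -35, nonsense once squeezed into a Byte
print(to_int8(-100 - 50)) # 106, even y-50 alone does not fit
```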

From my point of view, this is okay. There are some optimizations to be done there (e.g. common factors, etc.). I hope to have some more time to check for this.
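
One possible reading of "common factors" is hoisting the repeated subexpressions out of the inner loop. The Python sketch below does that by hand and checks that it selects the same points as the original test. It uses floating-point division as Sinclair BASIC would, and it is not a description of what the ZX Basic compiler does or will do.

```python
def original_points():
    """Points selected by the test exactly as written in the listing."""
    points = set()
    for x in range(-100, 101):
        for y in range(-100, 101):
            if ((x / 2 - 25) * (x / 2 - 25) + (y - 50) * (y - 50) < 200
                    or (x / 2 + 25) * (x / 2 + 25) + (y - 50) * (y - 50) < 200):
                points.add((x + 100, 96 - y))
    return points

def hoisted_points():
    """Same test with the x-only and y-only terms computed once each."""
    points = set()
    for x in range(-100, 101):
        half = x / 2
        left = (half - 25) * (half - 25)     # depends on x only
        right = (half + 25) * (half + 25)    # depends on x only
        for y in range(-100, 101):
            dy2 = (y - 50) * (y - 50)        # shared by both comparisons
            if left + dy2 < 200 or right + dy2 < 200:
                points.add((x + 100, 96 - y))
    return points

print(original_points() == hoisted_points())  # True
```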

Re: Not so much a bug, as slow!

britlion wrote:Okay. Fairly convinced there's a bug here.

zxb version 1.3.0-S928

Here's an expanded program in full - was looking at perhaps faster squaring?

Anyway, the returned value is weird.

A statement of: print x;" ";fastSquares(ABS x);" = ";CAST(uLong, x)*x ;"[]"

Comes back lines like

-61 3721 = 3721 [] 0 []

There's no way it should print anything after it's confirmed the function does the same thing as the X*X calculation. No [] twice. (I put that on the end because it was adding a bunch of zeroes instead without it!)

This is also OK. When the number changes from 6 digits to 4, what you see are the remaining digits from a previous calculation.
Make sure the line erases the previous results by adding extra spaces after "[]", this way:
Code:
`print x;" ";fastSquares(ABS x);"  =  ";CAST(uLong, x)*x ;"[]    " ' <= 4 spaces is enough`
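
The effect can be imitated in a few lines of Python: writing a shorter string over a longer one without clearing the row leaves the tail of the old text visible. The `overprint` helper and the choice of the x=-100 line as the "old" row are assumptions for illustration, not a model of the real screen routines.

```python
def overprint(row, text):
    """Write text at column 0 of a row without clearing what was there."""
    return text + row[len(text):]

old = '-100 10000  =  10000[]'   # a longer line printed earlier on this row
new = '-61 3721  =  3721[]'      # the shorter line printed over it

print(overprint(old, new))           # -61 3721  =  3721[]0[]
print(overprint(old, new + '    '))  # trailing spaces wipe the leftovers
```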

Re: Not so much a bug, as slow!

britlion wrote:As an addendum - if I ask it to calculate X^2 in the middle of the loop, it crashes the program. Calculating X*X works fine.

This is expected, unfortunately: ZX Basic uses the ROM, and a^b raises Invalid Argument if a < 0.
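
The usual explanation is that the ROM evaluates a^b as EXP(b*LN a), and LN is undefined for negative arguments. A short Python sketch of the same approach shows the failure mode; `power_via_logs` is a made-up name, and Python's `ValueError` merely stands in for the ROM's Invalid Argument report.

```python
import math

def power_via_logs(a, b):
    """Compute a^b as exp(b * ln(a)), which only works for a > 0."""
    return math.exp(b * math.log(a))

print(round(power_via_logs(3, 2), 6))   # 9.0

try:
    power_via_logs(-3, 2)
except ValueError as err:
    print("fails for a negative base:", err)

print((-3) * (-3))   # 9, plain multiplication has no such restriction
```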

Re: Not so much a bug, as slow!

boriel wrote:
britlion wrote:As an addendum - if I ask it to calculate X^2 in the middle of the loop, it crashes the program. Calculating X*X works fine.

This is expected, unfortunately: ZX Basic uses the ROM, and a^b raises Invalid Argument if a < 0.

Wow. You're right. The ROM is completely wrong there ((-1)^2 = 1). No, that's not your fault!

Re: Not so much a bug, as slow!

Anyway, it's very strange that it takes *twice* the time when using FP compared to Sinclair BASIC. It should take about the same. Still checking...

Re: Not so much a bug, as slow!

Code:
```
asm
jr ZXBASICBorielVersionEnd
db "ZX Boriel BASIC version 1.3.0-s924"
ZXBASICBorielVersionEnd:
end asm

DIM x AS FLOAT
DIM y AS FLOAT
'DIM x AS INTEGER
'DIM y AS INTEGER
POKE 23674,0: POKE 23673,0: POKE 23672,0
FOR x=-100 TO 100
FOR y=-100 TO 100
IF (x/2-25)*(x/2-25)+(y-50)*(y-50)<200 OR (x/2+25)*(x/2+25)+(y-50)*(y-50)<200 THEN
PLOT x+100,96-y ' zxb version
'PLOT x+100,y-100 ' original BASIC version
END IF
NEXT y: NEXT x
PRINT (65536*PEEK 23674+256*PEEK 23673+PEEK 23672)/50
```

This program gives me:

Original BASIC
2040.34

ZX Boriel BASIC version 1.3.0-s924
1477

ZX Boriel BASIC version 1.3.0-s967
1491

ZX Boriel BASIC version 1.3.0-s979
1491

Re: Not so much a bug, as slow!

There was a bug in the PNG listing in both the POKE statement and in the PEEK argument. 23674 got written as 26374.

Re: Not so much a bug, as slow!

Darkstar wrote:There was a bug in the PNG listing in both the POKE statement and in the PEEK argument. 23674 got written as 26374.

OMG!!
I overlooked that!! Thank you.
Anyway, I've found a floating point library. It is faster, but a bit less precise (it's like the traditional ANSI C IEEE FP library, 4 bytes per float, not 5). I guess I can implement it; it will be much faster, a bit less precise (but enough for most purposes), and take more memory. But more importantly, it will allow FP calculations on many Z80 machines that lack FP ROM routines, like the SNES.
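
To give a feel for the precision difference: a 4-byte IEEE 754 single carries a 24-bit significand, while the Spectrum's 5-byte format carries a 32-bit mantissa. The Python sketch below round-trips values through a 4-byte single with the standard `struct` module. It assumes the library in question really uses the IEEE single layout, which the post only suggests.

```python
import struct

def to_single(value):
    """Round-trip a value through a 4-byte IEEE 754 single."""
    return struct.unpack('<f', struct.pack('<f', value))[0]

# 2^24 + 1 needs 25 significant bits, one more than a single can hold
print(to_single(16777217.0))   # 16777216.0

# Decimal fractions lose digits compared with a wider format
print(to_single(0.1) == 0.1)   # False
```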

Re: Not so much a bug, as slow!

boriel wrote:
Darkstar wrote:There was a bug in the PNG listing in both the POKE statement and in the PEEK argument. 23674 got written as 26374.

OMG!!
I overlooked that!! Thank you.
Anyway, I've found a floating point library. It is faster, but a bit less precise (it's like the traditional ANSI C IEEE FP library, 4 bytes per float, not 5). I guess I can implement it; it will be much faster, a bit less precise (but enough for most purposes), and take more memory. But more importantly, it will allow FP calculations on many Z80 machines that lack FP ROM routines, like the SNES.

Just another one of those silly bugs that can always creep in.

Thanks for letting us know about the ANSI C IEEE FP library.
I do hope that its memory usage will not exceed that of the FLOAT implementation now in use (s924 and older), both in compiled size and in dynamic run-time usage. As you know, you can't rely on the ZX ROM on other platforms, so this will have to be implemented sooner or later. But it will break compatibility with the original ZX BASIC, and the current FLOAT implementation seems to be somewhat faster than the original interpreted version, so why not keep it in ZXB and add a SINGLE 32-bit data type that uses the new ANSI C IEEE FP library? The old FLOAT library would then be accessible through the --sinclair option and pulled in only if you define a variable as FLOAT. In the end, maybe all Sinclair-specific commands will be handled that way, like the SAVE command, which will not work on the SNES, and hardware-specific commands like PRINT (screen addressing issues). This way compatibility can be ensured, while generic routines like the ANSI C IEEE FP library and commands like GOSUB can be kept the same across all individual platforms.

That reminds me that the FOR/NEXT routines are still not in accord with BASIC standards but with C instead, as far as I know. I think it was in DO/LOOP, but I am not sure, that the zxb compiler did a complex test to see if it had reached the exit condition when a CP 0 would have been fine.

Re: Not so much a bug, as slow!

boriel wrote: Anyway, I've found a floating point library. It is faster, but a bit less precise (it's like the traditional ANSI C IEEE FP library, 4 bytes per float, not 5). I guess I can implement it; it will be much faster, a bit less precise (but enough for most purposes), and take more memory.

In case the FP library mentioned above was not added to ZX BASIC yet, I would like to provide my 2 cents about this idea...

It seems ZX BASIC is used almost exclusively for games, assuming this list is accurate enough:

http://www.boriel.com/wiki/en/index.php ... d_Programs

Notice that games almost never use FP for performance reasons, except to calculate INT (RND * n) typically. Also it's important for games to save as much memory as possible. Because of this, my suggestions are:

• Change the ZX BASIC builder so it will only include the FP library if the program really uses FP somehow;

• To avoid INT(RND * n) from forcing the builder to include the FP library (and probably also to optimize compiled code), change RND to accept an optional parameter, such that using RND(n) will return an UINTEGER value between 0 and n-1, and just using RND without parameters will return a FLOAT value as before. Alternatively, if this dual behavior would be a problem for the parser, then provide instead a new function INTRND(n) that works as I described here.

Makes sense?
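
As a rough illustration of the RND(n) idea, the Python sketch below maps a 16-bit random value onto 0..n-1 using only an integer multiply and a shift. The names `int_rnd` and `raw16` are hypothetical, Python's `random` module stands in for whatever generator the runtime uses, and nothing here reflects how ZX BASIC actually implements RND.

```python
import random

def int_rnd(n, raw16):
    """Map a 16-bit random value (0..65535) onto 0..n-1 with integer math only."""
    return (raw16 * n) >> 16

rng = random.Random(1)   # stand-in for the runtime's random generator
rolls = [int_rnd(6, rng.getrandbits(16)) + 1 for _ in range(10)]
print(rolls)             # ten dice rolls, each between 1 and 6
```

The high 16 bits of the 32-bit product play the role of INT (RND * n), so no floating-point routine is involved.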

Re: Not so much a bug, as slow!

einar wrote:
• Change the ZX BASIC builder so it will only include the FP library if the program really uses FP somehow;
• To avoid INT(RND * n) from forcing the builder to include the FP library (and probably also to optimize compiled code), change RND to accept an optional parameter, such that using RND(n) will return an UINTEGER value between 0 and n-1, and just using RND without parameters will return a FLOAT value as before. Alternatively, if this dual behavior would be a problem for the parser, then provide instead a new function INTRND(n) that works as I described here.

Makes sense?

It does!
ZX Basic already does that. Not only does it leave out the FP library when it is not used, it also includes only the minimum code for the required functions (there are some #include and #require directives in the .asm libraries for that). When using an FP library that is not in ROM, the same will be done, but the FP functions (e.g. SIN, COS, etc.) will then take much more RAM.

Re: Not so much a bug, as slow!

boriel wrote:ZX Basic already does that.

Great!

But does it also mean that INT(RND * n) is optimized such that it doesn't really use FP or even integer multiplication?

My concern is that, although the RND implementation in ZX BASIC is really fast now (I believe it's now using Patrik Rak's implementation, right?), it would be quite inefficient if it can only be accessed through FP calculations.