Compiler Speed Trials
#1
I know that ZX Basic is amazing, but I was wondering how it stood up to other basic compilers that were around for use on the ZX Spectrum. We know that Hisoft basic was pretty fast, for example, and LCD mentioned another compiler the other day that was pretty amazing too.

Let me borrow from an article in Crash Magazine: http://www.crashonline.org.uk/19/compilers.htm

In this article, Simon Goodwin talks about several compilers. Hisoft Basic isn't one of them - it wasn't out yet. He doesn't list the benchmark programs, either, but they can be inferred from this:

Code:
Benchmark BM1 : A null-action FOR, REPEAT or DO loop, executed
                1000 times.

Benchmark BM2 : A  null-action explicitly-coded loop  executed
                1000 times.

Benchmark BM3 : BM2 plus A=K/K*K+K-K in the loop.

Benchmark BM4 : BM2 plus A=K/2*3+4-5 in the loop.

Benchmark BM5 : BM4  plus  a branch to null-action  subroutine
                from inside the loop.

Benchmark BM6 : BM5  plus  an array declaration  M(5),  and  a
                null-action  FOR  loop (of 1-5)  also  in  the
                loop.

Benchmark BM7 : BM6 plus M(L)=A in this 1-5 loop.

Benchmark BM8 : A  square  function,   log  function  and  sin
                function  in  an  explicitly-coded  FOR  loop,
                repeated 100 times.

Benchmark BM9 : Prime  numbers in the range 1-1000 are printed
                to the screen,  calculated in an outer loop of
                1000 and an inner loop of 500,  with no tricks
                at  all.  This  is  a very  bad  prime  number
                routine  indeed,  but a very useful basis  for
                inter-machine,    interpreter   and   compiler
                comparisons.

Simon didn't use Benchmark 9, and I can see why - it's not clearly specified. BM1 to BM8 are pretty clear, however.

My own testing with Sinclair Basic gave very slightly different results. In all cases, my programs were very slightly faster than the timings Goodwin gave in the magazine article. Perhaps he specified things a little differently; perhaps he was using a stopwatch in hand, and human error was the result; perhaps a different version of the ZX Spectrum was used. I got the computer to time the programs using the 50 frames per second interrupt timer. For very fast running programs I increased the number of loops by a factor of 10 or 100 and scaled the result back down.
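
A rough sketch of that timing arithmetic, written in Python rather than BASIC so it can be run anywhere. The function names and the 223-frame reading are made up for illustration; the only facts taken from the thread are the 50-per-second frame counter at 23672 onwards and the scale-back-down step.

```python
FRAMES_PER_SECOND = 50  # the interrupt timer ticks 50 times a second

def frames_value(low, mid, high=0):
    """Combine the counter bytes at 23672, 23673 and 23674 into one count."""
    return low + 256 * mid + 65536 * high

def elapsed_seconds(start_frames, end_frames, loop_multiplier=1):
    """Seconds for one standard run; loop_multiplier undoes a 10x or 100x loop count."""
    return (end_frames - start_frames) / FRAMES_PER_SECOND / loop_multiplier

# Hypothetical reading: 223 frames elapsed during the run
print(elapsed_seconds(0, frames_value(223, 0)))  # 4.46
```

Since one frame is 0.02 seconds, a program that finishes in a frame or two cannot be timed meaningfully on its own, which is why the loop count gets multiplied up first.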

The compilers Goodwin tested were:

A Mehmood's "Compiler".
MCODER
Softek's FP and IS
And a little cheekily, Zip 1.5. He wrote that himself, I believe.

The first two rows are for Sinclair Basic. The first being Simon Goodwin's numbers, the second being my own. All times are in seconds, smaller is better.

Code:
                          BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8                      BMDRAW
Sinclair                  4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30                    80.18
Boriel's ZX BASIC         0.038    0.032    0.30     0.15     0.16     0.328    2.20     24.0

ZX Basic 1.26-r1603 -O3                                                         0.94     20.78 (17.14 with fSin)
ZX Basic 1.2.8-s682 -O3                                                         0.88     20.56 (16.94 with fSin)  21.14
ZX Basic 1.2.8-s758 -O3                                                         0.90     20.76 (17.10 with fSin)  21.32

HiSoft FP                 0.82     1.34     7.26     7.30     7.32     12.52    14.40    21.9
HS Integer                0.042    0.67     0.08     0.088    0.334    0.50     10.76

Mehmood                   *        0.065    9.0      4.2      4.2      *        *        *
ZIP 1.5                   0.031    0.064    0.194    0.108    0.115    0.29     0.46     *
TOBOS                     0.58     0.82     2.02     1.76     2.34     6.68     8.72     0.746
SOFTEK FP                 1.75     2.1      8.7      9.4      9.4      19.7     24.0     22.5
SOFTEK IS                 0.058    0.076    0.57     0.98     0.99     1.32     *        *
MCODER2                   0.043    0.097    0.62     0.90     0.92     1.17     1.47     *
The actual code used is listed below. It's possible to work out what BM1-6 are, because each one simply adds code until you end up with BM7. BM8's main loop is listed separately.

Code:
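REM BM7
REM t() returns the 24-bit FRAMES counter (system variable at 23672-23674) in DEHL
FUNCTION t() as uLong
asm
    LD DE,(23674)   ; E = high byte of FRAMES
    LD D,0          ; clear the unrelated byte picked up from 23675
    LD HL,(23672)   ; HL = low two bytes of FRAMES
end asm
end function

goto start

REM null-action subroutine (added in BM5)
subroutine:
return

start:
DIM time,i as uInteger
DIM k,var,j as uByte
let time =t()
LET k=5
LET i=0

REM explicitly-coded loop, 1000 passes
label:
LET i=i+1
LET var=k/2*3+4-5
gosub subroutine
DIM M(5) as uInteger
FOR j=0 to 4
LET M(j)=i
NEXT j
IF i<1000 then GOTO label: END IF

REM elapsed frames / 50 = seconds
print (CAST (FLOAT,t())-time)/50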

BM 8 replaces most of the code with:
Code:
REM BM8
DIM i,j as ubyte
j=2
FOR i=1 to 100
result=j^2
result=ln(j)
result=sin(j)
next i
This was changed from using constants to using a variable, to prevent constant folding optimizations.
RESULTS and DISCUSSION

First up, passing all the benchmarks and more, clearly Boriel's work is by far the most flexible and comprehensive compiler available. It blows the spots off everything else in terms of WHAT it can compile, and all credit to him for creating it. It is excellent!

In terms of performance, it's pretty amazing, too. It's the second fastest of all the compilers listed here. Only ZIP goes faster, generally. BM7 is a little disappointing, in that the produced code seems to be slower than both MCODER 2 and Zip by a quite significant margin. Perhaps some examination of array handling code could improve this. With version 1.25 beta, sadly, I couldn't use -O3 as an option - the programs all failed to compile with this option enabled, so I couldn't see if peephole optimization would make a difference. It's worth noting that most on-Spectrum compilers refused to deal with floating point numbers. In this roundup, only Softek FP could do it, and that barely faster than Basic. Boriel's compiler blew me away with the FP result, frankly. I had to check to see if it was doing it correctly, it was so amazing! There might be some sneaky optimization happening, but printing the numbers as it created them did seem to work fine. (Note: It WAS cheating. It was putting in constants at compile time. A clever option, but not what we were aiming to test. This number has been changed.)

Fixed Hisoft Basic numbers. These corrected numbers do in fact show it produces some of the fastest code available, sometimes beaten by ZIP 1.5. It far outmatches what ZIP can do, however, in that it deals with FP as well as integer - and it seems to do both faster than the competition. Of course ZX BASIC excels at being FP and integer aware as well.

Added in Tobos. It's fully FP, so tends to be slow where integer math could improve things. But look at BM8!

ZX BASIC, in short: solid and well optimized. Seems to be slow in BM7 (array handling). Very clever use of constant insertion produced a BM8 time of 0.1, but the times are now corrected because that was cheating a little!
[Edit] - Array handling speed has been dramatically increased with later versions. Boriel has stated that he will be looking into further array optimizations similar to Hisoft Basic methods - so we can hope for another doubling of speed, perhaps! :shock:
#2
First of all, a big thank you. Wow! What impressive work! :o

britlion Wrote:I know that ZX Basic is amazing, but I was wondering how it stood up to other basic compilers that were around for use on the ZX Spectrum. We know that Hisoft basic was pretty fast, for example, and LCD mentioned another compiler the other day that was pretty amazing too.


The above benchmarks are interesting. I'm somewhat surprised by BM7 :( I didn't expect array handling to be so slow.
BTW do these compilers handle multi-dimensional arrays?

Britlion Wrote:First up, passing all the benchmarks and more, clearly Boriel's work is by far the most flexible and comprehensive compiler available. It blows the spots off everything else in terms of WHAT it can compile, and all credit to him for creating it. It is excellent!
:oops: Thank you. I now feel more motivated!!! :twisted:

Quote:In terms of performance, it's pretty amazing, too. It's the second fastest of all the compilers listed here. Only ZIP goes faster, generally. BM7 is a little disappointing, in that the produced code seems to be slower than both MCODER 2 and Zip by a quite significant margin. Perhaps some examination of array handling code could improve this. With version 1.25 beta, sadly, I couldn't use -O3 as an option - the programs all failed to compile with this option enabled, so I couldn't see if peephole optimization would make a difference.
Yes, -O3 definitely makes a difference in array-access speed :!: This is something I'm currently fixing. It seems I reintroduced 2 old bugs (one in the peephole optimizer and another in the comparators, both already fixed once). I'm currently working on them.

Quote:It's worth noting that most on-Spectrum compilers refused to deal with floating point numbers. In this roundup, only Softek FP could do it, and that barely faster than Basic. Boriel's compiler blew me away with the FP result, frankly. I had to check to see if it was doing it correctly, it was so amazing! There might be some sneaky optimization happening, but printing the numbers as it created them did seem to work fine.

In short: Solid and well optimized. Seems to be slow in BM7 (array handling). Amazing BM8 (Floating Point) speed!

I will edit this post as and when I get around to testing a couple of other compilers. Hisoft Basic and Tobos - both able to do floating point code - are certainly going to be looked at!
I'm happy to read this.
For the FP, it's somewhat odd: I just use constant folding (precalculation) and ROM-CALC for that. Please check that the FP results are right... I mean, do you print the FP calculation result on the screen? Do they match? How strange... :| I like this FP result, but... like you, I'm surprised too.
#3
boriel Wrote:First of all, a big thank you. Wow! What impressive work! :o

I like ZX Basic A LOT. I want it to produce the best code possible. *grin* Perhaps we can get it to the point where it gives that upstart C compiler a run for its money. *hmmph* A zx spectrum should be coded in basic *laugh*

boriel Wrote:The above benchmarks are interesting. I'm somewhat surprised by BM7 :( I didn't expect array handling to be so slow.
BTW do these compilers handle multi-dimensional arrays?

No. Yours is the only one that will do that from this list. I think Hisoft Basic did, though. Some of them won't do it at all. ZIP might be fast, but it's VERY limited in what it can do - not even string handling, I believe.

Boriel Wrote:Yes, -O3 definitely makes a difference in array-access speed :!: This is something I'm currently fixing. It seems I reintroduced 2 old bugs (one in the peephole optimizer and another in the comparators, both already fixed once). I'm currently working on them.

When the optimizer is back together, I'll re-run the speed tests, certainly. Sadly, -O1 seems to work, but anything higher than 1 just fails at the moment.

Boriel Wrote:I'm happy to read this.
For the FP, it's somewhat odd: I just use constant folding (precalculation) and ROM-CALC for that. Please check that the FP results are right... I mean, do you print the FP calculation result on the screen? Do they match? How strange... :| I like this FP result, but... like you, I'm surprised too.

I was too, especially given the little tutorial I wrote about using smaller data types when possible to get the fastest code - we know handling five byte numbers (and in the rom routines too) is slower than integers.

You can see the code I wrote and ran - it finds sin(2), ln(2) and 2^2 in the loop. I'm pretty sure the 2^2 uses integer math, and rightly so; but sin and ln...well... if you're using ROM routines, I have no idea why it came back that fast. Does it work out a fixed result for sin(2) and just use that? That would be one reason it works so quickly!

It ISN'T printing the numbers, because it would make it very slow - PRINT is a pretty time consuming thing to do. I'm wondering if it might be possible to have a faster print routine that doesn't do all the bounds checking, and control character checking. I tested the loop with printing on - and it spun through some decimals happily. I timed it with printing off. I did wonder if the compiler would optimize out data that wasn't being used. The time results for this test were staggeringly fast.
#4
Added in Hisoft trials.

All the reviews and Hisoft said it was the fastest. Not what I found here, by a long margin. It was certainly the most flexible - in its ability to deal with large programs and floating point as well as integer. But the integer benchmarks I managed ran very slowly compared to the competition.

Oh dear.

Did I make a mistake? Here's the optimized BM7 program - used to avoid using DEF FN (which Hisoft allows, but most compilers don't)

Code:
7 REM :INT +a,k,v,i,m()
8 REM : OPEN#
9 CLS
10 POKE 23672,0: POKE 23673,0
90 POKE 23672,0
100 LET a=0: LET k=5: LET v=0
110 LET a=a+1
120 LET v=k/2*3+4-5
130 GO SUB 1000
140 DIM m(5)
150 FOR i=1 TO 5
160 LET m(i)=a
170 NEXT i
200 IF a<1000 THEN GO TO 110
210 PRINT (PEEK 23672+256*PEEK 23673)/50
999 STOP
1000 RETURN
#5
I tested tobos with

let v=SIN(i)
let v=i^2
let v=LN(i)

instead of numbers that could be replaced with fixed constants (sin (2) could be replaced, for example). It ran in 0.74 seconds instead of the 0.5 for constants. I don't think it's cheating.

When I did the same thing with ZX Basic, it took :!: 24 seconds, instead....
#6
britlion Wrote:I was too, especially given the little tutorial I wrote about using smaller data types when possible to get the fastest code - we know handling five byte numbers (and in the rom routines too) is slower than integers.

You can see the code I wrote and ran - it finds sin(2), ln(2) and 2^2 in the loop. I'm pretty sure the 2^2 uses integer math, and rightly so; but sin and ln...well... if you're using ROM routines, I have no idea why it came back that fast. Does it work out a fixed result for sin(2) and just use that? That would be one reason it works so quickly!

It ISN'T printing the numbers, because it would make it very slow - PRINT is a pretty time consuming thing to do. I'm wondering if it might be possible to have a faster print routine that doesn't do all the bounds checking, and control character checking. I tested the loop with printing on - and it spun through some decimals happily. I timed it with printing off. I did wonder if the compiler would optimize out data that wasn't being used. The time results for this test were staggeringly fast.

Ok, that's the explanation: precalculation and constant folding => 2^2 => 4, and so on. All those values are constant, and they're calculated at compile time. Even more, if O3 were in use, this program could be reduced to a single NOP, as it does nothing (it does not print on the screen). :!:

Try declaring a Float variable a = 2, and use Sin(a), Ln(a), 2^a, etc... It should have a speed similar to the ROM-BASIC FP calc (that is, slow).
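
To make the constant folding point concrete, here is a toy folder sketched in Python. It is not how ZX Basic is implemented; the tuple-based expression tree and the `fold` function are invented for illustration. Expressions built only from constants collapse to a number at "compile time", while anything touching a variable has to be left for run time.

```python
import math

# Functions the toy folder knows how to precalculate
FUNCS = {"sin": math.sin, "ln": math.log}

def fold(expr):
    """Fold a tiny expression tree.

    Leaves are numbers (constants) or strings (variable names);
    inner nodes are tuples of the form (operator, arg, ...).
    """
    if isinstance(expr, (int, float, str)):
        return expr
    op, *args = expr
    args = [fold(a) for a in args]
    if all(isinstance(a, (int, float)) for a in args):
        if op in FUNCS:
            return FUNCS[op](args[0])
        if op == "^":
            return args[0] ** args[1]
    return (op, *args)  # something is not constant: keep the operation

print(fold(("^", 2, 2)))   # 4
print(fold(("sin", "j")))  # ('sin', 'j')
```

With constants in the loop there is nothing left to calculate at run time, which would explain a 0.1 second BM8; with a variable the full FP routine runs on every pass.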
#7
britlion Wrote:I tested tobos with

let v=SIN(i)
let v=i^2
let v=LN(i)

instead of numbers that could be replaced with fixed constants (sin (2) could be replaced, for example). It ran in 0.74 seconds instead of the 0.5 for constants. I don't think it's cheating.

When I did the same thing with ZX Basic, it took :!: 24 seconds, instead....
Ok, this is more consistent with the ROM FP-CALC. The ROM FP calculator is very powerful... but slow. Tobos and SOFTEK are using their own optimized FP routines, so they probably use more memory and/or have less precision. I have some FP calc routines for the Z80 with a 3-byte mantissa (ZX Basic uses 4 bytes), or I could write my own, but this would require a lot of testing and I don't know if it would be worth the hassle. E.g. who will use FP for games?
#8
boriel Wrote:Ok, this is more consistent with the ROM FP-CALC. The ROM FP calculator is very powerful... but slow. Tobos and SOFTEK are using their own optimized FP routines, so they probably use more memory and/or have less precision. I have some FP calc routines for the Z80 with a 3-byte mantissa (ZX Basic uses 4 bytes), or I could write my own, but this would require a lot of testing and I don't know if it would be worth the hassle. E.g. who will use FP for games?

Yes. Well, first up - congrats on spotting constants and optimizing them. No other compiler back in the day did that. Brilliant move!

As for the "is it worth it" question - it depends what you want to make here - something that's special purpose or something that's the best all rounder.

I just had a look for math routines, and ran across a package of 48 bit floating point routines. Hmm. That would be interesting - being able to go to 32, 40 or 48 bit FP. I really don't know if anyone would use that at this stage, though.

That said, the compiler isn't a long hop from being able to work with other z80 devices, like a TI-89 or a Gameboy.

Know of any good routines that work on the FIXED type?
#9
I was doing something 'wrong' with hisoft basic - it was using a floating point division. It always uses a floating point division unless it's in the form INT(a/b) in which case it uses integer division.

I'll be retesting and posting new times - only fair, since I assume the other compilers with integer variables are using integer division at the v=k/2*3+4-5 stage. I'm changing that to INT(k/2)*3+4-5. The time for benchmark 7 went down to 0.5 seconds. Much improved.
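
For what it's worth, the two forms of the expression do not give the same value when k is odd. A quick Python sketch (the function names are made up, and Python's float and floor stand in for the Spectrum's FP division and INT):

```python
import math

def bm_expression_float(k):
    """k/2*3+4-5 evaluated with a floating point division."""
    return k / 2 * 3 + 4 - 5

def bm_expression_int(k):
    """INT(k/2)*3+4-5: INT rounds down, so the division yields a whole number."""
    return math.floor(k / 2) * 3 + 4 - 5

print(bm_expression_float(5))  # 6.5
print(bm_expression_int(5))    # 5
```

So with k=5 the INT form is not just faster, it computes a slightly different result. That doesn't matter for a speed test, but it is worth keeping in mind when comparing compilers.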
#10
Well, -O2 & -O3 seem to be fixed (most of the -On problems were related to previous fixes, in fact).
Compiling BM7 with -O3 reduces execution time to 0.16 secs :!: :wink:
So, as I said, -O3 has a great positive impact on array access performance.

[Image: benchmarko3.th.png]
(Screenshot)

Suggestion: try the benchmarks again using -O3, to see if it improves times on other benchmarks too (download ZX Basic v1.2.5-r1489b here: http://www.boriel.com/files/zxb/zxbasic-1.2.5r1489b.msi )
#11
Finally got a quick chance to test this. Yes, -O3 does improve BM7 - but the 0.16 seconds value isn't really fair. -O3 reports that variables are not used and optimizes out all the loops!

Putting a print M(1),k,i,var at the end of the program makes it actually do the work rather than skip it (but putting the print AFTER the time is recorded doesn't extend the time), and it duly recorded a time of 2.12 seconds.

This is a noticeable improvement, but still a long way behind the code that Hisoft Basic and other integer compilers make. It still looks as though the array handling is somewhat behind other implementations.

Incidentally, also tested BM 8, with -O3, which shows about a 16% improvement!

It's clear that things like Tobos use very highly optimized FP math structures. Changing out sin for the fSin function listed in the library (which breaks with -O3 btw) ran in 17 seconds instead - showing 7 seconds of speed up (41%). There might be a very strong case, at some point, for looking into optional faster FP functions.
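
A note on how the 41% figure falls out, sketched in Python with made-up helper names. Going from 24 seconds to 17 can be quoted either as a speed ratio or as time saved, and the two give quite different percentages:

```python
def percent_faster(old_seconds, new_seconds):
    """Speed gain: how much more work per second the new run does."""
    return (old_seconds / new_seconds - 1) * 100

def percent_time_saved(old_seconds, new_seconds):
    """Time reduction: what share of the old running time has gone."""
    return (old_seconds - new_seconds) / old_seconds * 100

print(round(percent_faster(24, 17)))      # 41
print(round(percent_time_saved(24, 17)))  # 29
```

The 41% quoted above is the first kind (7 seconds saved, measured against the new 17 second time).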
#12
britlion Wrote:Finally got a quick chance to test this. Yes, -O3 does improve BM7 - but the 0.16 seconds value isn't really fair. -O3 reports that variables are not used and optimizes out all the loops!
I don't remember that optimization. Which code snippet are you using? BM7 above?

Quote:Putting a print M(1),k,i,var at the end of the program makes it actually do the work rather than skip it (but putting the print AFTER the time is recorded doesn't extend the time), and it duly recorded a time of 2.12 seconds.
Did you also run this modified version on the other compilers? (They might also be doing some optimizations, so just to be sure.) Please paste the benchmark code here, or tell me which one you are using.

If you made some modifications to BM7 and/or BM8, please paste them here. Also, the code MUST be the same (e.g. no function calls for any compiler, or function calls for all of them, etc...).

Quote:This is a noticeable improvement, but still a long way behind the code that Hisoft Basic and other integer compilers make. It still looks as though the array handling is somewhat behind other implementations.
Do the other compilers allow multidimensional arrays of float / string / Integers?

Quote:Incidentally, also tested BM 8, with -O3, which shows about a 16% improvement!

It's clear that things like Tobos use very highly optimized FP math structures. Changing out sin for the fSin function listed in the library (which breaks with -O3 btw) ran in 17 seconds instead - showing 7 seconds of speed up (41%). There might be a very strong case, at some point, for looking into optional faster FP functions.
As I said before, we could add a --fast-floating-point option to include fast FP routines instead of the ROM ones (most compilers do this). This will eat memory for sure (in fact, z88dk uses the ROM calc routines too). FP routines aren't used in games. The most common technique is to use precomputed table values (mostly in demos).
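
As an illustration of the precomputed table technique, here is a minimal sketch in Python. The table size, the scaling to a signed byte and the name `fast_sin` are all assumptions chosen for the example, not anything from ZX Basic or its library.

```python
import math

TABLE_SIZE = 256  # assumption: one full turn split into 256 steps
SCALE = 127       # assumption: results scaled to fit a signed byte

# Built once up front; on a Spectrum this would be data baked into the program
SINE_TABLE = [round(math.sin(2 * math.pi * i / TABLE_SIZE) * SCALE)
              for i in range(TABLE_SIZE)]

def fast_sin(angle_step):
    """Look up a scaled sine; angle_step is in 1/256ths of a turn and wraps."""
    return SINE_TABLE[angle_step % TABLE_SIZE]

print(fast_sin(0), fast_sin(64), fast_sin(192))  # 0 127 -127
```

The trade is the usual one from this thread: 256 bytes of memory and much lower precision in exchange for replacing an FP call with a single indexed read.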
#13
boriel Wrote:
britlion Wrote:Finally got a quick chance to test this. Yes, -O3 does improve BM7 - but the 0.16 seconds value isn't really fair. -O3 reports that variables are not used and optimizes out all the loops!
I don't remember that optimization. Which code snippet are you using? BM7 above?

Hmm. You yourself said:
boriel Wrote:if O3 were in use, this program could be reduced to a single NOP, as it does nothing (it does not print on the screen).

All I did was add the line print M(1),k,i,var to the end of BM7 listed above. It prints it once, and it prints it AFTER it's worked out how long it took; so it's not like the print line took 2 seconds to print. It HAS to be that -O3 recognizes the loops and variables aren't being "used" and deletes them from the code. It normally would be quite right to do so, as well. Using them in a print statement, slows the whole thing back down.

Boriel Wrote:Did you also run this modified version on the other compilers? (They might also be doing some optimizations, so just to be sure.) Please paste the benchmark code here, or tell me which one you are using.

If you made some modifications to BM7 and/or BM8, please paste them here. Also, the code MUST be the same (e.g. no function calls for any compiler, or function calls for all of them, etc...).

It's really not modified, apart from asking it to print the values of some variables right before it ends. As for running it on other compilers; I didn't - I don't even /have/ all of them handy. I do have hisoft basic, however, and I tested that one myself. I'll get round to remaking it and add in that print statement to be sure, but I don't think it even has something like the -O3 as an optimization loop. It does seem to treat array variables as fast as any other type of variable.

Boriel Wrote:Do the other compilers allow multidimensional arrays of float / string / Integers?

You asked that before, and I'll say again: Almost all of them only allow single dimension number arrays. (Zip compiler won't deal with strings AT ALL!). Hisoft basic does allow multidimensional arrays and string arrays:
Hisoft Basic Manual Wrote:HiSoft BASIC supports numeric and string arrays of up to 2 dimensions. Ordinary string variables behave as in BASIC except that they must not exceed in length the amount of space reserved for them at compile time. By default this is 257 bytes (to allow a string of up to 255 characters, plus 2 bytes for the length) but it can be changed by means of the REM : LEN directive.

Quite outside the array issues, I think Hisoft Basic's string handling is vastly less flexible, but far faster than yours. Your strings are always mutable, and of variable sizes. Hisoft uses a far less memory-efficient fixed string system, not in the heap, which is faster to use. As always, it's memory size and waste vs speed, here. Luckily I've never seen your neat and efficient strings running too slowly, so I think the only "issue" here is array handling. I don't know why your version of the array handling code is the slowest of the tested integer compilers. Is it to do with it handling more variable types, and having to deal with that?

Does it internally know the difference between an array of strings and an array of bytes? Are these different types to the compiler? A fixed size array of numbers is really just a lookup table. Your variable sized string handling (while brilliant in its memory efficiency) makes string arrays much much more awkward to deal with. If you're using the same code to handle arrays of ANYTHING, I can see why hisoft seems to be going faster, here.
#14
britlion Wrote:
boriel Wrote:
britlion Wrote:Finally got a quick chance to test this. Yes, -O3 does improve BM7 - but the 0.16 seconds value isn't really fair. -O3 reports that variables are not used and optimizes out all the loops!
I don't remember that optimization. Which code snippet are you using? BM7 above?

Hmm. You yourself said:
boriel Wrote:if O3 were in use, this program could be reduced to a single NOP, as it does nothing (it does not print on the screen).
This "reduced to a single NOP" optimization was removed in 1.2.5 since an empty loop makes sense for programmers who need execution delay. But there are other "unused variables" optimizations which only optimize SPACE (memory) not SPEED (code). So I still don't understand this difference in Speed. Anyway, here is your latest BM7 code:
Code:
7 REM :INT +a,k,v,i,m()
8 REM : OPEN#
9 CLS
10 POKE 23672,0: POKE 23673,0
90 POKE 23672,0
100 LET a=0: LET k=5: LET v=0
110 LET a=a+1
120 LET v=k/2*3+4-5
130 GO SUB 1000
140 DIM m(5)
150 FOR i=1 TO 5
160 LET m(i)=a
170 NEXT i
200 IF a<1000 THEN GO TO 110
210 PRINT (PEEK 23672+256*PEEK 23673)/50
999 STOP
1000 RETURN
I'm going to test with this code. Please, confirm this is the correct test-case.

Britlion Wrote:
Boriel Wrote:Do the other compilers allow multidimensional arrays of float / string / Integers?

You asked that before, and I'll say again: Almost all of them only allow single dimension number arrays. (Zip compiler won't deal with strings AT ALL!). Hisoft basic does allow multidimensional arrays and string arrays:
Hisoft Basic Manual Wrote:HiSoft BASIC supports numeric and string arrays of up to 2 dimensions. Ordinary string variables behave as in BASIC except that they must not exceed in length the amount of space reserved for them at compile time. By default this is 257 bytes (to allow a string of up to 255 characters, plus 2 bytes for the length) but it can be changed by means of the REM : LEN directive.

Quite outside the array issues, I think Hisoft Basic's string handling is vastly less flexible, but far faster than yours. Your strings are always mutable, and of variable sizes. Hisoft uses a far less memory-efficient fixed string system, not in the heap, which is faster to use. As always, it's memory size and waste vs speed, here. Luckily I've never seen your neat and efficient strings running too slowly, so I think the only "issue" here is array handling. I don't know why your version of the array handling code is the slowest of the tested integer compilers. Is it to do with it handling more variable types, and having to deal with that?
ZX Basic allows strings of up to 65535 chars per element (2-byte length). I could reduce it to 1 byte (up to 255), but I know of many Sinclair BASIC programs that use long strings (e.g. 1000 chars) for strange purposes. In fact I used this technique very often myself. Thus, not supporting strings of 256+ chars would break compatibility with Sinclair Basic (even more).

Britlion Wrote:Does it internally know the difference between an array of strings and an array of bytes? Are these different types to the compiler? A fixed size array of numbers is really just a lookup table. Your variable sized string handling (while brilliant in its memory efficiency) makes string arrays much much more awkward to deal with. If you're using the same code to handle arrays of ANYTHING, I can see why hisoft seems to be going faster, here.
Yes, having a 1-dimensional array (a vector) is very fast. For a vector of bytes, you can even use IX + n indirections. Arrays of up to 2 dimensions can also be optimized (in fact, z88dk also uses 2-dimensional arrays or vectors, I can't recall now, but in the end you have to compute the element offset yourself).

ZX BASIC tries to be as compatible as possible with Sinclair Basic, so it allows multi-dimensional arrays. Each dimension carries out a multiplication. It also allows "any size" elements (BTW, strings are pointers to the heap, hence 2-byte elements). Currently, only element sizes of 1, 2, 4 and 5 bytes are allowed. The multiply chain (1 multiplication per dimension) ends with an extra multiplication (by the element size) to get the final element offset.

This final element-size multiplication could be slightly optimized:
  • Size 1 => Return
  • Size 2 => Shift Left, Return
  • Size 4 => Shift Left, Shift Left, Return
  • Size 5 => BC = Index, Shift Left, Shift Left, +BC, Return

But in the future, when objects / struct are available, "anysize" elements will appear => multiplication.
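
The offset calculation described above can be sketched like this in Python. It only models the arithmetic, not the actual Z80 code ZX Basic emits; the function names and the row-major layout are assumptions made for the example.

```python
def times_element_size(index, size):
    """Multiply by the element size, using shifts and adds for the known sizes."""
    if size == 1:
        return index
    if size == 2:
        return index << 1
    if size == 4:
        return index << 2
    if size == 5:
        return (index << 2) + index  # x*4 + x
    return index * size  # "any size" elements need a general multiply

def element_offset(indices, dims, size):
    """Byte offset of an element, assuming a row-major array with lengths dims."""
    linear = 0
    for idx, dim in zip(indices, dims):
        linear = linear * dim + idx  # one multiplication per dimension
    return times_element_size(linear, size)

print(element_offset([3], [5], 2))        # 6
print(element_offset([1, 2], [4, 5], 5))  # 35
```

For a plain vector the loop runs once, so nearly all of the per-access cost is in the general multiply chain, which is the part a vector-only compiler never has to pay for.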

Update: You can implement an array as a cascade of look-up tables. This is the way I implement them in C (it's really fast), but in a 48K-memory machine this is prohibitive! :|
#15
Ok: I downloaded Hisoft Basic 1.1 from World of Spectrum, then compiled and ran your test BM7. It prints 4.8 secs.
I recompiled it with ZX BASIC (just declaring the variables as uIntegers, which is the +INT equivalent) and executed it. It prints 2 secs.
This is 100% faster (or x2 speed).

Conclusion: Running *THE SAME* program, ZX Basic compiles better than Hisoft 1.1

Observations:
  • -O3 does indeed remove unused vars, since variables you don't use for IO (PRINT, POKE, PEEK, BEEP, IN, OUT, function calls) count as unused. A dummy PRINT at the end avoids this (a future compiler #pragma will prevent it).
  • If a newer version of Hisoft Basic (or another compiler) speeds up to 0.2 secs or similar, it could also be removing unused vars.
  • HiSoft is a VERY GOOD compiler. It is slower because it's also checking for CTRL+Break (you can stop running programs). ZX Basic will also have this feature in the 1.2.6 final with --enable-break
  • I don't know if Hisoft is doing constant optimization (e.g. K/2*3+4+5 => K/2*3+9)