Compiler Speed Trials
#31
*chuckle* of course. Yes. Um.

Well, a command line option there, or a #define, would work as well for a ROM print version.
#32
boriel Wrote:Now this REALLY needs intensive testing (array and FOR...NEXT loops). I'm uploading a new 1.2.6 beta-r1571, if someone is interested.

Well, I've been happily compiling with this version, and so far nothing major seems to have exploded....

Anyone else?
#33
Just ran the BM7 and BM8 tests on the latest test build with -O3.

BM8 is still sitting the same (and -O3 made no change), but BM7. Wow.

1.10 seconds unoptimized, and 0.94 seconds with -O3 added to the mix.

Boriel, the changes to the array handling stuff have made an enormous difference; and the loop code improvement can't have hurt any. Way to go!

This is what I like to see in a tool that can use the PC's memory space to pull in optimizations and options that could NEVER have fit onto the Spectrum itself.

While technically still not quite the fastest array handling of the compilers, still maintaining the far more versatile array options AND halving the time taken is a pretty impressive feat.
#34
britlion Wrote:While technically still not quite the fastest array handling of the compilers, still maintaining the far more versatile array options AND halving the time taken is a pretty impressive feat.
And it can hardly be optimized much further: ZX BASIC allows n-dimensional arrays, HiSoft BASIC does not. The small overhead you see comes from the n-dimensional check. The N dimensions are pushed onto the stack, whereas HiSoft uses just registers. That could be done for up to 2 dimensions, but no more. ZX BASIC has a generic array-addressing routine to save memory, so array accesses with 1, 2, ..., N dimensions all go through this routine.
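
For readers following along, here is a rough sketch of what a generic N-dimensional addressing routine has to compute, written in Python rather than Z80 assembly. The row-major layout, the function names and the bounds check are assumptions made for illustration; this is not the compiler's actual routine.

```python
def generic_offset(indexes, sizes, elem_size):
    """Byte offset of an element, for any number of dimensions.

    Hypothetical helper: 'sizes' holds the element count of each
    dimension and a row-major layout is assumed.
    """
    if len(indexes) != len(sizes):
        raise ValueError("wrong number of subscripts")
    offset = 0
    for idx, size in zip(indexes, sizes):
        if not 0 <= idx < size:
            raise IndexError("subscript out of range")
        offset = offset * size + idx  # one multiply per dimension
    return offset * elem_size


def one_dim_offset(index, elem_size):
    """Special case for a single subscript: no loop, one multiply."""
    return index * elem_size
```

The loop is where the per-dimension cost comes from; the one-dimensional version shows how little work is left when the number of subscripts is known in advance.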

BTW: your BM7 has been added to the compiler's benchmark directory. This directory is only included in the .zip version.
#35
This does raise a question, Boriel:

Could the compiler recognize a case of dealing with, say, a one dimensional array, and handle that completely differently - say by hanging onto the index in a register (or register pair)? The overwhelming majority of arrays are one dimensional - and it's of course possible in a cross-compiler to build in optimized options for different cases. (We don't have the memory limit for this sort of thing that an on-Spectrum compiler has!)

Sure, if someone wants an 8 dimensional array, good luck to them - the compiler handles that. (Which is utterly staggering, by the way. This is the most flexible compiler for the Spectrum by far!) But, having said that, since probably 90% of arrays are single dimension, I'd say making that a special case, one the compiler creates the tightest code for, might be worthwhile.
#36
britlion Wrote:Could the compiler recognize a case of dealing with, say, a one dimensional array, and handle that completely differently - say by hanging onto the index in a register (or register pair)?
Most arrays are 1 or 2d, that's right. But keep in mind we're being compatible with Sinclair BASIC and FreeBASIC. It's just that the Z80 is not very powerful. There's an alternative (faster) way to calculate the offset, but it requires much more memory: a table of pointers to pointers to pointers... (one table of pointers per dimension), so it needs only sums + pops (no multiplications). I could try some kind of #pragma fastarray to enable it.
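
To make the memory-for-speed trade concrete, here is a small Python sketch of a simpler relative of that idea: one lookup table of precomputed partial offsets per dimension, rather than true pointers to pointers. The function names and the row-major layout are assumptions for illustration only.

```python
def build_tables(sizes, elem_size):
    """Build one table per dimension; entry i holds i * stride.

    Done once, ahead of time. The tables are the extra memory cost.
    """
    tables = []
    stride = elem_size
    for size in reversed(sizes):
        tables.append([i * stride for i in range(size)])
        stride *= size
    tables.reverse()
    return tables


def table_offset(indexes, tables):
    """Runtime lookup: only table reads and additions, no multiplies."""
    return sum(table[i] for table, i in zip(tables, indexes))
```

For a 3 x 4 array of 2-byte elements this builds two tables with 7 entries in total; the extra memory grows with the sum of the dimension sizes.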
Anyway, you (and I mean all of you) might be making the same mistake I explain here. You don't want BASIC. You want C with "BASIC keywords", to suit your needs (e.g. writing games). This will eventually degrade the language to a low-level C-like one (and people will prefer C instead). :|

The main idea of ZX BASIC is to compile Sinclair BASIC programs into machine code (which has been almost achieved, perhaps READ, DATA and RESTORE could be added in the future).

Then think of having a high level language. There are some problems here with that:
  • High level languages require less code, but are difficult to compile or must be completely interpreted (like BASIC VAL "2 * x + sin(y + 1)").
  • They also add some overhead to the execution trying to guess what we, the humans, wanted to do. So some high level things must not be used in critical parts. e.g. in the critical cycle of a game painting.
  • They also add some memory overhead (so 48k might not be enough...)
Okay, but they also bring more interesting things: They make your program shorter, easier to maintain and to port; you will probably come back to your program after a long time and remember what your program is trying to do without much commenting; it's more productive and even portable.

So for what you're trying to do, instead of breaking array support, you should try a different approach: e.g. implement a vector object yourself in ASM. It can be done. There are functions called allocate and deallocate already implemented in <alloc.bas>. Use them to create a dynamic memory block.
Then access your block with an addr integer variable, and read it with PEEK(<typesize>, addr). This is a pointer (ZX BASIC currently does not implement them, so you have to work this way).
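
As a rough illustration of that "address variable plus PEEK" pattern, here is a Python simulation. It is not ZX BASIC code: `toy_allocate`, `peek16`, `poke16` and the heap start address are stand-ins invented for this sketch and do not reflect how the routines in alloc.bas work internally.

```python
MEMORY = bytearray(65536)  # stand-in for the Z80's 64K address space
_next_free = 0x8000        # arbitrary heap start, chosen for this example


def toy_allocate(nbytes):
    """Hypothetical bump allocator: hand out the next free address."""
    global _next_free
    addr = _next_free
    if addr + nbytes > len(MEMORY):
        raise MemoryError("out of memory")
    _next_free += nbytes
    return addr


def poke16(addr, value):
    """Store a 16-bit value, low byte first (Z80 byte order)."""
    MEMORY[addr] = value & 0xFF
    MEMORY[addr + 1] = (value >> 8) & 0xFF


def peek16(addr):
    """Read a 16-bit value back from two consecutive bytes."""
    return MEMORY[addr] | (MEMORY[addr + 1] << 8)


# A 10-element "vector" of 16-bit values, addressed by hand.
base = toy_allocate(10 * 2)
for i in range(10):
    poke16(base + 2 * i, i * i)
```

The address arithmetic (`base + 2 * i`) is what the program has to do itself when the language has no pointer type.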
#37
*chuckle* Sometimes I think I'm your biggest fan, and biggest critic all rolled into one.

(Never EVER think I don't love the compiler and believe it is awesome, because I do).

And I /really/ don't want it to be a low level language at all. I'm coding library options so people don't have to do that for things like graphics.

I don't even really want a vector (though it's an interesting idea, and it might grow on me; especially (now I consider it) for text based data) - but I do want the produced assembler to be as efficient as possible for a given BASIC input. That's what my focus has been in comparing it with other compilers.

Here, for example, I'm talking about the compiler adding to its own complexity: taking Sinclair BASIC arrays and choosing either complex code for a complex multidimensional array, or a faster, simpler method for a simple one dimensional array - that is, it codes it like Hisoft Basic's method if it's one D.

Is that not possible - have two code style options for arrays; and one be faster? I'm not even talking about using tables and choosing it for ALL arrays, I'm talking about the compiler looking at the code and saying

1> hey, this array was defined as one dimension AND it is smaller than a register, so I'll use register B to index it while we're in there....
2> hey, this array was defined as one d and it needs BC to index it. Okay, I can do that.
3> ooh, this is a three dimensional array, well, I have code for that, even if it's slower...

That sort of thing. You seem stuck on the idea that all arrays have to be treated with the same methods....
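
As a sketch of the kind of decision being described (in Python for brevity; the function name, the returned labels and the 256-element threshold for a single 8-bit register are assumptions, not anything that exists in the compiler):

```python
def choose_array_strategy(bounds):
    """Pick an addressing strategy from the DIM bounds.

    'bounds' is a list of (lower, upper) pairs, one per dimension,
    so DIM a(5) would be [(0, 5)].
    """
    if len(bounds) == 1:
        lower, upper = bounds[0]
        if upper - lower + 1 <= 256:
            return "8-bit register index"
        return "16-bit register pair index"
    return "generic n-dimensional routine"
```

The source syntax stays the same in every case; only the code generator's choice changes.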

I like the flexibility - but unless I'm completely mistaken, this could produce code as fast (or faster) than the fastest compiler we tested AND still have the option to be more flexible (which costs speed).

How is that wanting the /language/ to change to a lower level syntax? In each case the syntax is DIM a(5) or DIM a(5,3,4,3) - I want the compiler to be smarter; not the language syntax to be more complex!
#38
britlion Wrote:How is that wanting the /language/ to change to a lower level syntax? In each case the syntax is DIM a(5) or DIM a(5,3,4,3) - I want the compiler to be smarter; not the language syntax to be more complex!
I also agree. Hmmm. I think something could be done. E.g. arrays of 1 or 2 dimensions with byte-sized indexes, that is, DIM a(0 TO 255) or DIM b(0 TO 255, 0 TO 255), could be optimized using the above method. I will try to copy the HiSoft BASIC method :)
#39
Since we're a few versions on, I thought I'd spin the BM7 and BM8 through the new versions. I've updated post 1, but here is the important table:

Code:
                       BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8
    Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30

    ZIP 1.5            0.031    0.064    0.194    0.108    0.115    0.29     0.46     *
    TOBOS              0.58     0.82     2.02     1.76     2.34     6.68     8.72     0.746

    Hisoft Basic
    Integer Optimized  0.026    0.042    0.67     0.08     0.088    0.334    0.50     10.76

    Boriel's ZX BASIC  0.038    0.032    0.30     0.15     0.16     0.328    2.20     24.0
    ZX Basic 1.26 -O3                                                        2.12     20.78
    ZX Basic 1.26-r1603 -O3                                                  0.94     20.78 (17.14 with fSin)
    ZX Basic 1.2.8-r2153 -O3                                                 1.36     29.06 (24.18 with fSin)

It seems that the code is less optimized than it used to be, with a significant speed loss from 1.26 to 1.2.8 (about 40% slower on both BM7 and BM8). I still want to know how Hisoft did BM8 in 10 seconds. And TOBOS seems to do it in 0.7! It seems likely they both have optimized trig and logarithmic functions. I do have an optimized SIN function I'm adding in, but it's not saving that much. I need to find a good fast ln function to add to the library, it seems. Though it is clear that Hisoft is definitely moving faster on loops.
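
For anyone checking the "about 40%" figure, the arithmetic can be reproduced from the 1.26-r1603 and 1.2.8-r2153 rows of the table. The helper name below is made up for this sketch.

```python
def slowdown_percent(old_seconds, new_seconds):
    """How much slower the new timing is, as a percentage of the old."""
    return (new_seconds - old_seconds) / old_seconds * 100


# Timings taken from the table: 1.26-r1603 versus 1.2.8-r2153, both with -O3.
bm7 = slowdown_percent(0.94, 1.36)
bm8 = slowdown_percent(20.78, 29.06)
bm8_fsin = slowdown_percent(17.14, 24.18)

print(f"BM7: {bm7:.1f}%  BM8: {bm8:.1f}%  BM8 with fSin: {bm8_fsin:.1f}%")
```

All three come out between roughly 40% and 45%.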

The scary thing is that BM8 now seems a LOT slower than Sinclair BASIC in this version?? - even with fast Sin!
More trial runs a few versions down the road!
BL
#40
Glad to see you're back :!: :!: :!: :wink:

I think I've fixed a few more errors (you pointed them out somewhere, but I couldn't locate your comment in this forum). Please read http://www.boriel.com/forum/post2081.html#p2081 (the download link is also in that post).
Code generation has been fixed and is supposed to be as good as it used to be.
#41
Here's the latest version (seems to be 573? Odd numbering!)

This latest version improves on the last, but not in the BM8 trial. It also fails to compile the fSin function with the error:
Traceback (most recent call last):
  File "zxb.py", line 312, in <module>
  File "zxb.py", line 265, in main
  File "backend\__init__.pyc", line 2386, in emmit
  File "backend\__init__.pyc", line 926, in _cast
NameError: global name '_fixed_oper' is not defined

Code:
                       BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8
    Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30
    TOBOS              0.58     0.82     2.02     1.76     2.34     6.68     8.72     0.746

    Hisoft Basic
    Integer Optimized  0.026    0.042    0.67     0.08     0.088    0.334    0.50     10.76

    Boriel's ZX BASIC  0.038    0.032    0.30     0.15     0.16     0.328    2.20     24.0
    ZX Basic 1.26 -O3                                                        2.12     20.78
    ZX Basic 1.26-r1603 -O3                                                  0.94     20.78 (17.14 with fSin)
    ZX Basic 1.2.8-r2153 -O3                                                 1.36     29.06 (24.18 with fSin)
    ZX Basic 1.2.8-r573 -O3                                                  1.32     29.06 (fails to compile with fSin)

More trial runs a few versions down the road!
#42
New benchmarks with s620. Boriel, it still doesn't like using the fSin code; and I'm not seeing any significant speedup. 1.26-r1603 is still the speed king for ZX Basic. Hisoft fastest overall (though clearly not as powerful!).

Code:
                       BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8
    Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30
    ZX Basic 1.26 -O3                                                        2.12     20.78
    ZX Basic 1.26-r1603 -O3                                                  0.94     20.78 (17.14 with fSin)
    ZX Basic 1.2.8-r2153 -O3                                                 1.36     29.06 (24.18 with fSin)
    ZX Basic 1.2.8-r573 -O3                                                  1.32     29.06 (fails to compile with fSin)
    ZX Basic 1.2.8-s620 -O3                                                  1.32     29.00 (Does not compile with fSin)
#43
britlion Wrote:It also fails to compile the fSin function with the error:
NameError: global name '_fixed_oper' is not defined
This is obviously a compiler error. Please link me to the fSin library (in the Wiki?), or attach it to this thread if you have changed it. I need to track this bug. :oops:
Better: file a bug in a new thread.
#44
New benchmarks with 1.2.8.s644. 1.26-r1603 is still the speed king for ZX Basic. Hisoft fastest overall.

Just out of curiosity, I downloaded the "legacy" version on the website. It's version r1812. This also shows the slowdown, if you are trying to track when this happened.

Boriel, do you still have the code available to wind it back to 1.26-r1603? We don't have all versions for download any more!

Code:
                       BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8
    Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30
    ZX Basic 1.26 -O3                                                        2.12     20.78
    ZX Basic 1.26-r1603 -O3                                                  0.94     20.78 (17.14 with fSin)
    ZX Basic 1.26-r1812 -O3                                                  1.36     29.00 (24.22 with fSin)
    ZX Basic 1.2.8-r2153 -O3                                                 1.36     29.06 (24.18 with fSin)
    ZX Basic 1.2.8-s644 -O3                                                  1.32     29.00 (24.22 with fSin)
#45
britlion Wrote:New benchmarks with 1.2.8.s644. 1.26-r1603 is still the speed king for ZX Basic. Hisoft fastest overall.
A question: are you using float or fixed? Or both?