Compiler Speed Trials

<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Fri Nov 29, 2013 6:22 pm

Re: Compiler Speed Trials

Here's the latest with 1.3.0 s1121:
(using my benchmark suite listed above)

  Code:
                           BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8                       BMDRAW
        Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30                     80.18
       ZX Basic 1.26 -O3                                                          2.12    20.78
       ZX Basic 1.26-r1603 -O3                                                    0.94    20.78 (17.14 with fSin)
       ZX Basic 1.2.8-r2153 -O3                                                   1.36    29.06 (24.18 with fSin)
       ZX Basic 1.2.8-s644 -O3                                                    1.34    29.02 (24.22 with fSin)   30.42
       ZX Basic 1.2.8-s682 -O3                                                    0.88    20.56 (16.94 with fSin)   21.14
       ZX Basic 1.2.8-s696 -O3                                                    0.90    20.60 (16.98 with fSin)   21.18
       ZX Basic 1.2.8-s758 -O3                                                    0.90    20.76 (17.10 with fSin)   21.32
       ZX Basic 1.2.9-s815 -O3                                                    0.90    20.54 (16.92 with fSin)   21.08
       ZX Basic 1.3.0-s971 -O3                                                    0.90    20.80 (17.16 with fSin)   21.40
       ZX Basic 1.3.0-s1121 -O3                                                   0.898   20.818(17.200 with fSin)  21.40



Mixed results. I increased the iteration count to get a finer view than the frames variable allows on a single run. Not much change: if anything it's very slightly slower, though once you allow for rounding errors in the previous results it's /very/ close. Still worth noting that some other compilers (e.g. Zip2) have aced that integer test in half the time.
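
To make the timing resolution concrete, here is a rough sketch in Python (not part of the benchmark suite). It assumes the timings come from the Spectrum's 50 Hz frame counter, so a single run can only be resolved to one frame, and that averaging over several iterations is what yields the extra decimal place. The function names and the iteration count of 10 are made up for illustration.

```python
FRAMES_PER_SECOND = 50  # PAL ZX Spectrum frame interrupt rate


def frames_to_seconds(frames, iterations=1):
    """Average time per run, given a frame count measured over `iterations` runs."""
    return frames / FRAMES_PER_SECOND / iterations


def resolution(iterations=1):
    """Smallest per-run time difference that one frame tick can represent."""
    return 1 / FRAMES_PER_SECOND / iterations


print(frames_to_seconds(23))  # a single run lasting 23 frames
print(resolution(1))          # one measurement per run
print(resolution(10))         # hypothetical: 10 iterations per measurement
```

With more iterations per measurement, each frame tick stands for a smaller slice of a single run, which is why three-decimal figures only appear in the later rows.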

And again - bugfixes are unfortunately making the code a little bit more convoluted, I think.
<<

boriel

Site Admin

Posts: 1500

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Fri Nov 29, 2013 10:37 pm

Re: Compiler Speed Trials

This is impressive!! :shock:
You're describing exactly what I did: with -O3 and Byte comparison, I had to add an extra ccf instruction (4 T-states).
<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Sat Nov 30, 2013 3:48 pm

Re: Compiler Speed Trials

That would explain it.

Zip2 (and Zip1.5, as originally tested in post 1) does seem to be quite a lot tighter, finishing BM7 in 0.46 seconds (23 frames). As I noted, however, the downside is that this is an integer-only compiler - but it proves there's a much shorter solution to the integer code compile.

Demo: https://dl.dropboxusercontent.com/u/490 ... erTest.z80

And even that fast result has code that a clever compiler could tighten.

E.g. poke 16384,255

Could be
  Code:
LD a,255
LD (16384),A

20 T states (7 + 13), and move on.

It does:
  Code:
LD HL,16384
PUSH HL
LD HL,00000
POP DE
EX DE,HL
LD (HL),E


Which is pretty generic code for the simplest case of setting a memory address, but it costs 52 T states. It can make sense if the numbers are calculated - but it does treat all numbers as 16 bit, and just masks to 8 bit where necessary. I can't quite blame it for this - it has to fit inside the ZX Spectrum, alongside the BASIC and the compiled result. Space is tight there!
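
As a quick sanity check on the cost of the two POKE sequences, this Python sketch tallies the documented Z80 T-state counts for each instruction form. The table only covers the instructions used in these two snippets, and the names `DIRECT_POKE` and `ZIP_POKE` are labels invented for this example.

```python
# Documented Z80 T-state counts for the instruction forms used in the two snippets.
T_STATES = {
    "LD A,n": 7,
    "LD (nn),A": 13,
    "LD HL,nn": 10,
    "PUSH HL": 11,
    "POP DE": 10,
    "EX DE,HL": 4,
    "LD (HL),E": 7,
}


def cost(sequence):
    """Total T-states for a straight-line instruction sequence."""
    return sum(T_STATES[op] for op in sequence)


DIRECT_POKE = ["LD A,n", "LD (nn),A"]
ZIP_POKE = ["LD HL,nn", "PUSH HL", "LD HL,nn", "POP DE", "EX DE,HL", "LD (HL),E"]

print("direct: ", cost(DIRECT_POKE))
print("generic:", cost(ZIP_POKE))
```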

Here's
  Code:
LET V=INT(k/2)*3+4-5


  Code:
LD HL,(54864) ; Fetch variable k

SRA H
RR L   ; Divide

PUSH HL   ;save   

ADD HL,HL
POP DE
ADD HL,DE ; multiply by 3

INC HL
INC HL
INC HL
INC HL ; Add 4

DEC HL
DEC HL
DEC HL
DEC HL
DEC HL ; Subtract 5

LD (54952),HL ; Store back into v


Aaaagh. Starts well. Goes a bit strange. I think you'd at least optimise +4-5 into -1, and save some hassle. (Though it's bad programming to put +4 - 5 to be fair) I wonder when it stops doing INC and DEC? What if that was -200? or -5000? :) (Simon Goodwin probably says inc is faster for small numbers, and "small" is apparently at least 5)
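
On the question of when INC/DEC stops being worth it, here is a back-of-envelope sketch in Python. It folds the constants first (+4-5 becomes -1) and then picks whichever is cheaper in T-states: repeated INC HL / DEC HL at 6 T each, or LD DE,nn followed by ADD HL,DE at 10 + 11 T. It assumes DE is free to clobber and looks at speed only, not code size. This is not how Zip 2 actually decides; the function names are invented for illustration.

```python
# Z80 T-state costs for the two ways of adjusting HL by a constant.
INC_DEC_HL_T = 6   # INC HL or DEC HL
LD_DE_NN_T = 10    # LD DE,nn
ADD_HL_DE_T = 11   # ADD HL,DE


def fold_constants(terms):
    """Fold a run of constant additions, e.g. [+4, -5] -> -1."""
    return sum(terms)


def adjust_hl(delta):
    """Return the faster instruction list that adds `delta` to HL."""
    if delta == 0:
        return []
    op = "INC HL" if delta > 0 else "DEC HL"
    inc_cost = abs(delta) * INC_DEC_HL_T
    add_cost = LD_DE_NN_T + ADD_HL_DE_T
    if inc_cost <= add_cost:
        return [op] * abs(delta)
    # A negative delta is added as its 16-bit two's complement.
    return [f"LD DE,{delta & 0xFFFF}", "ADD HL,DE"]


print(adjust_hl(fold_constants([4, -5])))  # the +4-5 from the example
print(adjust_hl(3))
print(adjust_hl(4))
print(adjust_hl(-200))
```

Under these assumptions the break-even point on speed is three: a fourth INC HL already costs more T-states than the LD DE / ADD HL,DE pair, although four INCs are no larger in bytes.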



So I think a /really/ smart compiler, should be faster than this. :D

(Easy for me to say, isn't it?)
<<

boriel

Site Admin

Posts: 1500

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Sat Nov 30, 2013 4:03 pm

Re: Compiler Speed Trials

britlion wrote:That would explain it.

Zip2 (and Zip1.5, as originally tested in post 1) does seem to be quite a lot tighter, finishing BM7 in 0.46 seconds (23 frames). As I noted, however, the downside is that this is an integer-only compiler - but it proves there's a much shorter solution to the integer code compile.

Demo: https://dl.dropboxusercontent.com/u/490 ... erTest.z80

And even that fast result has code that a clever compiler could tighten.

E.g. poke 16384,255

Could be
  Code:
LD a,255
LD (16384),A

20 T states (7 + 13), and move on.

It does:
  Code:
LD HL,16384
PUSH HL
LD HL,00000
POP DE
EX DE,HL
LD (HL),E


I can't test it now, but if ZX BASIC is producing such code, then it's most likely a bug. POKE should be a direct LD whenever possible. :?: :?: :?:

britlion wrote:Which is pretty generic code for the simplest case of setting a memory address, but it costs 52 T states. It can make sense if the numbers are calculated - but it does treat all numbers as 16 bit, and just masks to 8 bit where necessary. I can't quite blame it for this - it has to fit inside the ZX Spectrum, alongside the BASIC and the compiled result. Space is tight there!

Here's
  Code:
LET V=INT(k/2)*3+4-5


  Code:
LD HL,(54864) ; Fetch variable k

SRA H
RR L   ; Divide

PUSH HL   ;save   

ADD HL,HL
POP DE
ADD HL,DE ; multiply by 3

INC HL
INC HL
INC HL
INC HL ; Add 4

DEC HL
DEC HL
DEC HL
DEC HL
DEC HL ; Subtract 5

LD (54952),HL ; Store back into v


Aaaagh. Starts well. Goes a bit strange. I think you'd at least optimise +4-5 into -1, and save some hassle. (Though it's bad programming to put +4 - 5 to be fair) I wonder when it stops doing INC and DEC? What if that was -200? or -5000? :) (Simon Goodwin probably says inc is faster for small numbers, and "small" is apparently at least 5)
So I think a /really/ smart compiler, should be faster than this. :D

(Easy for me to say, isn't it?)

Hmm. This can also be introduced in ZX BASIC with -O3. Let's try...
<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Sat Nov 30, 2013 10:44 pm

Re: Compiler Speed Trials

boriel wrote:I can't test it now, but if ZX BASIC is producing such code, then it's most likely a bug. POKE should be a direct LD whenever possible. :?: :?: :?:


No, as I tried to explain, but seem to have failed :(

- that's code produced by the Zip 2 compiler. Which, despite being demonstrably poor as I showed above, completes the task twice as quickly as ZXBasic's generated code.
<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Fri Jan 31, 2014 11:48 pm

Re: Compiler Speed Trials

It's time for the new version to get tested!

C:\>zxb --version
zxb 1.4.0-s1779

(using my benchmark suite listed above)

  Code:
                           BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8                       BMDRAW
        Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30                     80.18
       ZX Basic 1.26 -O3                                                          2.12    20.78
       ZX Basic 1.26-r1603 -O3                                                    0.94    20.78 (17.14 with fSin)
       ZX Basic 1.2.8-r2153 -O3                                                   1.36    29.06 (24.18 with fSin)
       ZX Basic 1.2.8-s644 -O3                                                    1.34    29.02 (24.22 with fSin)   30.42
       ZX Basic 1.2.8-s682 -O3                                                    0.88    20.56 (16.94 with fSin)   21.14
       ZX Basic 1.2.8-s696 -O3                                                    0.90    20.60 (16.98 with fSin)   21.18
       ZX Basic 1.2.8-s758 -O3                                                    0.90    20.76 (17.10 with fSin)   21.32
       ZX Basic 1.2.9-s815 -O3                                                    0.90    20.54 (16.92 with fSin)   21.08
       ZX Basic 1.3.0-s971 -O3                                                    0.90    20.80 (17.16 with fSin)   21.40
       ZX Basic 1.3.0-s1121 -O3                                                   0.898   20.818(17.200 with fSin)  21.40
       ZX Basic 1.4.0-s1779 -O3                                                   0.892   20.628(17.420 with fSin)  21.22



The good: It all compiles and runs. And runs very very slightly faster. We're into much less than a frame (Only visible with more iterations).

The bad: One test (fsin) actually went slower than previously. And we're still about double the time of zip2 compiler for integer results - as discussed above. Zip2 is actually producing that pretty awful code posted in the last few posts, but it's still twice as fast as ZXB for simple integer maths. I suspect the asm modules are mostly untouched - meaning that most of the code the new version produces is identical. It's how it's getting there that's changed!
<<

boriel

Site Admin

Posts: 1500

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Sat Feb 01, 2014 1:59 am

Re: Compiler Speed Trials

britlion wrote:It's time for the new version to get tested!

C:\>zxb --version
zxb 1.4.0-s1779

The bad: One test (fsin) actually went slower than previously. And we're still about double the time of zip2 compiler for integer results - as discussed above. Zip2 is actually producing that pretty awful code posted in the last few posts, but it's still twice as fast as ZXB for simple integer maths. I suspect the asm modules are mostly untouched - meaning that most of the code the new version produces is identical. It's how it's getting there that's changed!

In fact, ZX BASIC 1.4 is supposed to always produce the same or better code than 1.3. Try compiling with --asm and diffing the output (for both fSin versions).
Regarding Zip: now that ZX BASIC is ready for development again, we can discuss the memory / speed tradeoff (many compilers offer one). So you could choose --optimize-for-speed or --optimize-for-memory
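
For the suggestion of comparing the --asm output of two compiler versions, a small Python sketch using the standard `difflib` module could look like this. The two listings are made-up fragments standing in for real compiler output, and `asm_diff` and the file names are hypothetical.

```python
import difflib


def asm_diff(old_asm, new_asm, old_name="old.asm", new_name="new.asm"):
    """Return unified-diff lines between two assembly listings given as text."""
    return list(difflib.unified_diff(
        old_asm.splitlines(), new_asm.splitlines(),
        fromfile=old_name, tofile=new_name, lineterm=""))


# Made-up fragments, NOT real compiler output.
OLD = "ld a, (_x)\nld h, 3\ncall MUL8\nld (_y), a\n"
NEW = "ld a, (_x)\nld b, a\nadd a, a\nadd a, b\nld (_y), a\n"

for line in asm_diff(OLD, NEW):
    print(line)
```

An empty result means the two versions emitted identical assembly for that source file, so any timing difference would have to come from somewhere else.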
<<

boriel

Site Admin

Posts: 1500

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Tue Feb 25, 2014 9:52 pm

Re: Compiler Speed Trials

BTW: Happy birthday, Britlion :)
<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Tue Feb 25, 2014 11:47 pm

Re: Compiler Speed Trials

boriel wrote:BTW: Happy birthday, Britlion :)

Thank you very much ;)
<<

ardentcrest

Posts: 101

Joined: Fri Oct 11, 2013 5:19 pm

Post Wed Feb 26, 2014 8:05 am

Re: Compiler Speed Trials

Happy late one :lol:
I'm always on the chat or facebook.
<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Tue Mar 18, 2014 5:25 am

Re: Compiler Speed Trials

boriel wrote:So you could choose --optimize-for-speed or --optimize-for-memory


That would be a cool option.

I think better, though, might be to have it as an inline option - so the critical bits can be speed optimised, but the less critical bits can be memory optimised. Routines that need to be fast (e.g. sprites) are one thing, but some others are out of the game loop, or on intro screens etc. For example, I think the redefine key routine is probably not speed critical; but I bet you don't want it bloated. It probably only runs once.
<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Sat Jul 14, 2018 11:35 pm

Re: Compiler Speed Trials

C:\>zxb --version
zxb 1.8.3

(using my benchmark suite listed above)

  Code:
                           BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8                       BMDRAW
        Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30                     80.18
       ZX Basic 1.26 -O3                                                          2.12    20.78
       ZX Basic 1.26-r1603 -O3                                                    0.94    20.78 (17.14 with fSin)
       ZX Basic 1.2.8-r2153 -O3                                                   1.36    29.06 (24.18 with fSin)
       ZX Basic 1.2.8-s644 -O3                                                    1.34    29.02 (24.22 with fSin)   30.42
       ZX Basic 1.2.8-s682 -O3                                                    0.88    20.56 (16.94 with fSin)   21.14
       ZX Basic 1.2.8-s696 -O3                                                    0.90    20.60 (16.98 with fSin)   21.18
       ZX Basic 1.2.8-s758 -O3                                                    0.90    20.76 (17.10 with fSin)   21.32
       ZX Basic 1.2.9-s815 -O3                                                    0.90    20.54 (16.92 with fSin)   21.08
       ZX Basic 1.3.0-s971 -O3                                                    0.90    20.80 (17.16 with fSin)   21.40
       ZX Basic 1.3.0-s1121 -O3                                                   0.898   20.818(17.200 with fSin)  21.40
       ZX Basic 1.4.0-s1779 -O3                                                   0.892   20.628(17.420 with fSin)  21.22
       ZX Basic 1.4.0-s1980 -O3                                                   0.884   20.818(17.202 with fSin)  21.40
       ZX Basic 1.8.3       -O3                                                   0.874   20.818(17.192 with fSin)  21.40


All right. It's been a long long time since I ran this lot, and I thought it was past time we checked to see if Boriel was being kept honest :)

The new refactored version of the compiler works great! I did have to change some code that used if...then...statement and then else statement stuff - the new one line If syntax tripped it up. But those were trivial changes that took no time to fix.

The good news is the code on the latest build is a hair faster - fastest ever, actually. Way to go! The Zip compiler still holds the crown for fastest code (by about a factor of 2, which is startling), finishing Benchmark 7 in 0.47 seconds vs zxb's 0.87 seconds - but Zip is a very cut-down, integer-only compiler with very limited scope. I'm surprised that the optimisation routes it uses for simple code haven't been looked at, however. We certainly had a discussion some years ago about how it honestly doesn't cheat.

I think the reason is that it inlines code for small cases. Looking at the assembly, it seems that zxb does do a very simple divide by two on a byte value (it runs srl a, and we're done). But for multiplication by three, it doesn't recognise that as an easy case, and sets h to three, and runs the generic a*h code with a call - it could have inlined push, add hl, hl, pop de, add hl, de, and we're done. Most multiplies are probably lower than 5, and certainly lower than 8, and optimising for *2, *3, *4 etc could be a big boost. Similarly, add one or subtract one is caught as a simple case, but add 4 and add 5 aren't - zip sees these as cases to simply run inc a few more times, rather than run a generic add code, which is why it comes out faster. I think catching cases of small adds and small subtracts and running them as inc inc inc inc etc is probably reasonable. Obviously you can take it too far, but again, most changes for +/- are going to be small.
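
To illustrate the idea of inlining small multiplies, here is a Python sketch that emits a shift-and-add sequence for HL * k and checks it with a tiny interpreter. It is an assumption-laden illustration, not zxb's or Zip's code generator: it works on 16-bit HL rather than the byte case mentioned above, it copies HL into DE with LD D,H / LD E,L instead of the PUSH HL / POP DE pair, and the function names are invented.

```python
def emit_mul_const(k):
    """Emit Z80-style instructions computing HL = HL * k (mod 65536), k >= 1.

    Shift-and-add: DE keeps the original value, ADD HL,HL doubles, and
    ADD HL,DE adds the original back in for each set bit below the top one.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    code = []
    bits = bin(k)[3:]  # bits of k below the leading 1
    if "1" in bits:
        code += ["LD D,H", "LD E,L"]  # DE = original HL (clobbers DE)
    for bit in bits:
        code.append("ADD HL,HL")
        if bit == "1":
            code.append("ADD HL,DE")
    return code


def run(code, hl):
    """Tiny interpreter for the subset of instructions emitted above."""
    de = 0
    for op in code:
        if op == "LD D,H":
            de = (hl & 0xFF00) | (de & 0x00FF)
        elif op == "LD E,L":
            de = (de & 0xFF00) | (hl & 0x00FF)
        elif op == "ADD HL,HL":
            hl = (hl + hl) & 0xFFFF
        elif op == "ADD HL,DE":
            hl = (hl + de) & 0xFFFF
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return hl


print(emit_mul_const(3))
print(run(emit_mul_const(3), 7))
```

A real compiler would presumably cap this at some small k and fall back to the generic multiply call beyond that, since the inlined sequence grows with the number of bits in k.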

Anyway, that said, this is all moving in the right direction - the compiler is more powerful, and actually faster all at once. It's getting smarter, and hats off to Boriel for keeping this a fantastic piece of software.

<<

boriel

Site Admin

Posts: 1500

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Mon Jul 30, 2018 9:28 pm

Re: Compiler Speed Trials

You're absolutely right. Great analysis.
I've switched to a new job (yes, again!) so I have been busy and have paused ZX Basic development for a little while until I settle in.

But the new version 1.9 (still beta) finally allows anyone (i.e. you) to program their own peephole optimizer using a DSL (a domain-specific micro-language). It already works for -O1 and -O2. For -O3 it's a bit harder.

This means that it no longer uses Python to optimize code, but this much simpler language instead, so anyone can create their own optimization schemes and even contribute to the compiler that way.
Indeed, this new optimizer already optimizes further (especially for 32-bit values).
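
For readers unfamiliar with the term, a peephole optimizer slides a small window over the generated instructions and swaps known patterns for cheaper equivalents. The Python sketch below shows the general technique only; it is NOT the ZX Basic 1.9 DSL, whose syntax is not shown in this thread, and the rule set is made up for illustration.

```python
# Each rule maps a window of adjacent instructions to a cheaper replacement.
# Illustrative rules only; a real optimizer must also check flags and side effects.
RULES = [
    (("PUSH HL", "POP DE"), ("LD D,H", "LD E,L")),  # copy HL to DE without the stack
    (("INC HL", "DEC HL"), ()),                     # cancels out
    (("DEC HL", "INC HL"), ()),                     # cancels out
]


def peephole(code):
    """Apply the rewrite rules repeatedly until no rule matches any more."""
    code = list(code)
    changed = True
    while changed:
        changed = False
        for pattern, replacement in RULES:
            n = len(pattern)
            i = 0
            while i + n <= len(code):
                if tuple(code[i:i + n]) == pattern:
                    code[i:i + n] = replacement
                    changed = True
                else:
                    i += 1
    return code


# The "+4 then -5" sequence from earlier in the thread collapses to one DEC HL.
print(peephole(["INC HL"] * 4 + ["DEC HL"] * 5))
print(peephole(["PUSH HL", "POP DE"]))
```

Note that the PUSH HL / ADD HL,HL / POP DE sequence from the multiply-by-three example is left alone by these rules, because the push and pop are not adjacent.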
<<

britlion

Posts: 777

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Mon Sep 17, 2018 9:40 pm

Re: Compiler Speed Trials

Think we'll get it as fast as the old Zip 2 compiler, which is nearly 2x faster? :)