
Compiler Speed Trials

<<

britlion

Posts: 766

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Fri Nov 29, 2013 6:22 pm

Re: Compiler Speed Trials

Here's the latest with 1.3.0 s1121:
(using my benchmark suite listed above)

  Code:
                           BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8                       BMDRAW
        Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30                     80.18
       ZX Basic 1.26 -O3                                                          2.12    20.78
       ZX Basic 1.26-r1603 -O3                                                    0.94    20.78 (17.14 with fSin)
       ZX Basic 1.2.8-r2153 -O3                                                   1.36    29.06 (24.18 with fSin)
       ZX Basic 1.2.8-s644 -O3                                                    1.34    29.02 (24.22 with fSin)   30.42
       ZX Basic 1.2.8-s682 -O3                                                    0.88    20.56 (16.94 with fSin)   21.14
       ZX Basic 1.2.8-s696 -O3                                                    0.90    20.60 (16.98 with fSin)   21.18
       ZX Basic 1.2.8-s758 -O3                                                    0.90    20.76 (17.10 with fSin)   21.32
       ZX Basic 1.2.9-s815 -O3                                                    0.90    20.54 (16.92 with fSin)   21.08
       ZX Basic 1.3.0-s971 -O3                                                    0.90    20.80 (17.16 with fSin)   21.40
       ZX Basic 1.3.0-s1121 -O3                                                   0.898   20.818(17.200 with fSin)  21.40



Mixed results. I increased the iteration count to get a finer view than the frames variable allows on a single run. Not much change: if anything it's very, very slightly slower, though once we allow for rounding errors in the previous results it's /very/ close. Still worth noting that some other compilers (e.g. Zip2) have aced that integer test in half the time.
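For anyone wondering why more iterations help: assuming the timings come from the Spectrum's 50 Hz frame counter, a single run can only be read in 0.02 s steps, so repeating the benchmark and dividing by the repeat count gives finer per-run figures. A rough sketch in Python (the function names and the 10x repeat count are assumptions for illustration, not taken from the benchmark suite):

```python
FRAMES_PER_SECOND = 50  # assumption: PAL Spectrum, frame counter ticks 50 times a second

def frames_to_seconds(frames, repeats=1):
    """Convert an elapsed frame count into seconds per single run."""
    return frames / FRAMES_PER_SECOND / repeats

def resolution(repeats=1):
    """Smallest per-run time difference the frame counter can show."""
    return 1 / FRAMES_PER_SECOND / repeats

# One run measured as 45 frames reads as about 0.90 s.
print(frames_to_seconds(45))
# Hypothetical: 10 runs taking 449 frames read as about 0.898 s per run.
print(frames_to_seconds(449, 10))
# Per-run resolution improves from about 0.02 s to about 0.002 s.
print(resolution(1), resolution(10))
```

That would be consistent with the older rows being multiples of 0.02 and the newest row having three decimals, but the actual repeat count is a guess.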

And again - bugfixes are unfortunately making the code a little bit more convoluted, I think.
<<

boriel

Site Admin

Posts: 1463

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Fri Nov 29, 2013 10:37 pm

Re: Compiler Speed Trials

This is impressive!! :shock:
You're describing exactly what I did: with -O3 and Byte comparison, I had to add an extra ccf instruction (4 T-states).
<<

britlion

Posts: 766

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Sat Nov 30, 2013 3:48 pm

Re: Compiler Speed Trials

That would explain it.

Zip2 (and Zip1.5, as originally tested in post 1) does seem to be quite a lot tighter, finishing BM7 in 0.46 seconds (23 frames). As I noted, however, the downside is that this is an integer-only compiler - but it proves there's a much shorter solution to the integer code compile.

Demo: https://dl.dropboxusercontent.com/u/490 ... erTest.z80

And even that fast result has code that a clever compiler could tighten.

E.g. poke 16384,255

Could be
  Code:
LD a,255
LD (16384),A

20 T states, and move on.

It does:
  Code:
LD HL,16384
PUSH HL
LD HL,00000
POP DE
EX DE,HL
LD (HL),E


Which is pretty generic code for the simplest case of setting a memory address, but it costs 52 T states. It can make sense if the numbers are calculated - but it treats all numbers as 16 bit, and just masks to 8 bit where necessary. I can't quite blame it for this - it has to fit inside the ZX Spectrum, alongside the BASIC and the compiled result. Space is tight there!
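To make the comparison easy to check, here is a small tally of the two POKE sequences using the standard documented Z80 instruction timings. By that count the direct version comes to 20 T states and the generic one to 52. The table and function names are just for illustration:

```python
# T-state costs for the Z80 instructions used in the two sequences above
# (standard documented timings; "nn" is a 16-bit immediate, "n" an 8-bit one).
T_STATES = {
    "LD A,n": 7,
    "LD (nn),A": 13,
    "LD HL,nn": 10,
    "PUSH HL": 11,
    "POP DE": 10,
    "EX DE,HL": 4,
    "LD (HL),E": 7,
}

def total_t_states(sequence):
    """Sum the T-states of a straight-line instruction sequence."""
    return sum(T_STATES[op] for op in sequence)

direct = ["LD A,n", "LD (nn),A"]
zip2 = ["LD HL,nn", "PUSH HL", "LD HL,nn", "POP DE", "EX DE,HL", "LD (HL),E"]

print(total_t_states(direct))  # 20
print(total_t_states(zip2))    # 52
```

So the generic sequence costs roughly two and a half times the direct one for a constant POKE.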

Here's
  Code:
LET V=INT(k/2)*3+4-5


  Code:
LD HL,(54864) ; Fetch variable k

SRA H
RR L   ; Divide

PUSH HL   ;save   

ADD HL,HL
POP DE
ADD HL,DE ; multiply by 3

INC HL
INC HL
INC HL
INC HL ; Add 4

DEC HL
DEC HL
DEC HL
DEC HL
DEC HL ; Subtract 5

LD (54952),HL ; Store back into v


Aaaagh. Starts well. Goes a bit strange. I think you'd at least optimise +4-5 into -1, and save some hassle. (Though it's bad programming to put +4 - 5 to be fair) I wonder when it stops doing INC and DEC? What if that was -200? or -5000? :) (Simon Goodwin probably says inc is faster for small numbers, and "small" is apparently at least 5)
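As a rough answer to the "when should it stop doing INC and DEC" question, here is a sketch that folds the constants first and then compares repeated INC HL / DEC HL against LD DE,nn / ADD HL,DE using the documented Z80 timings. The function names are made up for illustration, and this is not how Zip2 or ZX Basic actually decide:

```python
T_INC_DEC_HL = 6   # INC HL / DEC HL: 6 T-states, 1 byte each
T_LD_DE_NN = 10    # LD DE,nn: 10 T-states, 3 bytes
T_ADD_HL_DE = 11   # ADD HL,DE: 11 T-states, 1 byte

def fold_constants(terms):
    """Collapse a chain of constant additions such as +4-5 into one offset."""
    return sum(terms)

def cost_inc_dec(offset):
    """(T-states, bytes) when stepping HL with repeated INC HL / DEC HL."""
    steps = abs(offset)
    return steps * T_INC_DEC_HL, steps

def cost_add_de(offset):
    """(T-states, bytes) for LD DE,offset / ADD HL,DE (this clobbers DE)."""
    if offset == 0:
        return 0, 0
    return T_LD_DE_NN + T_ADD_HL_DE, 4

def cheapest(offset):
    """Pick the faster form; INC/DEC wins ties because it leaves DE alone."""
    inc_dec, add_de = cost_inc_dec(offset), cost_add_de(offset)
    if inc_dec[0] <= add_de[0]:
        return "inc/dec", inc_dec
    return "ld de/add", add_de

offset = fold_constants([+4, -5])
print(offset, cheapest(offset))                  # -1 ('inc/dec', (6, 1))
print(cost_inc_dec(4)[0] + cost_inc_dec(-5)[0])  # 54 T-states as emitted above
for n in (3, 4, 200, 5000):
    print(-n, cheapest(-n))
```

On these numbers INC/DEC only wins on speed up to an offset of 3; from 4 onwards the LD DE / ADD HL,DE pair is a fixed 21 T states, provided DE is free to be overwritten.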



So I think a /really/ smart compiler should be faster than this. :D

(Easy for me to say, isn't it?)
<<

boriel

Site Admin

Posts: 1463

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Sat Nov 30, 2013 4:03 pm

Re: Compiler Speed Trials

britlion wrote:That would explain it.

Zip2 (and Zip1.5, as originally tested in post 1) does seem to be quite a lot tighter, finishing BM7 in 0.46 seconds (23 frames). As I noted, however, the downside is that this is an integer-only compiler - but it proves there's a much shorter solution to the integer code compile.

Demo: https://dl.dropboxusercontent.com/u/490 ... erTest.z80

And even that fast result has code that a clever compiler could tighten.

E.g. poke 16384,255

Could be
  Code:
LD a,255
LD (16384),A

20 T states, and move on.

It does:
  Code:
LD HL,16384
PUSH HL
LD HL,00000
POP DE
EX DE,HL
LD (HL),E


I can't test it right now, but if ZX BASIC is producing such code, then it's most likely a bug. POKE should be a direct LD whenever possible. :?: :?: :?:

britlion wrote:Which is pretty generic code for the simplest case of setting a memory address, but it costs 52 T states. It can make sense if the numbers are calculated - but it treats all numbers as 16 bit, and just masks to 8 bit where necessary. I can't quite blame it for this - it has to fit inside the ZX Spectrum, alongside the BASIC and the compiled result. Space is tight there!

Here's
  Code:
LET V=INT(k/2)*3+4-5


  Code:
LD HL,(54864) ; Fetch variable k

SRA H
RR L   ; Divide

PUSH HL   ;save   

ADD HL,HL
POP DE
ADD HL,DE ; multiply by 3

INC HL
INC HL
INC HL
INC HL ; Add 4

DEC HL
DEC HL
DEC HL
DEC HL
DEC HL ; Subtract 5

LD (54952),HL ; Store back into v


Aaaagh. Starts well. Goes a bit strange. I think you'd at least optimise +4-5 into -1, and save some hassle. (Though it's bad programming to put +4 - 5 to be fair) I wonder when it stops doing INC and DEC? What if that was -200? or -5000? :) (Simon Goodwin probably says inc is faster for small numbers, and "small" is apparently at least 5)
So I think a /really/ smart compiler should be faster than this. :D

(Easy for me to say, isn't it?)

Hmm. This can also be introduced in ZX BASIC with -O3. Let's try...
<<

britlion

Posts: 766

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Sat Nov 30, 2013 10:44 pm

Re: Compiler Speed Trials

boriel wrote:I can't test it right now, but if ZX BASIC is producing such code, then it's most likely a bug. POKE should be a direct LD whenever possible. :?: :?: :?:


No, as I tried to explain, but seem to have failed :(

- that's code produced by the Zip 2 compiler, which, despite being demonstrably poor as I showed above, completes the task twice as quickly as ZXBasic's generated code.
<<

britlion

Posts: 766

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Fri Jan 31, 2014 11:48 pm

Re: Compiler Speed Trials

It's time for the new version to get tested!

C:\>zxb --version
zxb 1.4.0-s1779

(using my benchmark suite listed above)

  Code:
                           BM1      BM2      BM3      BM4      BM5      BM6      BM7      BM8                       BMDRAW
        Sinclair           4.46     8.46     21.56    19.82    25.34    60.82    87.44    23.30                     80.18
       ZX Basic 1.26 -O3                                                          2.12    20.78
       ZX Basic 1.26-r1603 -O3                                                    0.94    20.78 (17.14 with fSin)
       ZX Basic 1.2.8-r2153 -O3                                                   1.36    29.06 (24.18 with fSin)
       ZX Basic 1.2.8-s644 -O3                                                    1.34    29.02 (24.22 with fSin)   30.42
       ZX Basic 1.2.8-s682 -O3                                                    0.88    20.56 (16.94 with fSin)   21.14
       ZX Basic 1.2.8-s696 -O3                                                    0.90    20.60 (16.98 with fSin)   21.18
       ZX Basic 1.2.8-s758 -O3                                                    0.90    20.76 (17.10 with fSin)   21.32
       ZX Basic 1.2.9-s815 -O3                                                    0.90    20.54 (16.92 with fSin)   21.08
       ZX Basic 1.3.0-s971 -O3                                                    0.90    20.80 (17.16 with fSin)   21.40
       ZX Basic 1.3.0-s1121 -O3                                                   0.898   20.818(17.200 with fSin)  21.40
       ZX Basic 1.4.0-s1779 -O3                                                   0.892   20.628(17.420 with fSin)  21.22



The good: It all compiles and runs. And runs very, very slightly faster. The differences are now much less than a frame (only visible with more iterations).

The bad: One test (fSin) actually went slower than previously. And we're still about double the time of the Zip2 compiler for integer results - as discussed above. Zip2 is actually producing that pretty awful code posted in the last few posts, but it's still twice as fast as ZXB for simple integer maths. I suspect the asm modules are mostly untouched - meaning that most of the code the new version produces is identical. It's how it's getting there that's changed!
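To put numbers on "very slightly", here is a quick sketch that works out the percentage change between the last two rows of the table and the ratio against the 0.46 s Zip2 BM7 figure quoted earlier in the thread. The dictionary layout and function name are just for illustration:

```python
ZIP2_BM7 = 0.46  # seconds, the Zip2 BM7 figure quoted earlier in the thread

# (1.3.0-s1121, 1.4.0-s1779) timings in seconds, copied from the table above
results = {
    "BM7": (0.898, 0.892),
    "BM8": (20.818, 20.628),
    "BM8 with fSin": (17.200, 17.420),
    "BMDRAW": (21.40, 21.22),
}

def percent_change(old, new):
    """Relative change in percent; positive means the new build is slower."""
    return (new - old) / old * 100

for name, (old, new) in results.items():
    print(f"{name}: {percent_change(old, new):+.2f}%")

print(f"ZXB / Zip2 on BM7: {results['BM7'][1] / ZIP2_BM7:.2f}x")
```

That comes out at under 1% faster on BM7, BM8 and BMDRAW, a bit over 1% slower on the fSin variant, and a little under 2x the Zip2 time on BM7.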
<<

boriel

Site Admin

Posts: 1463

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Sat Feb 01, 2014 1:59 am

Re: Compiler Speed Trials

britlion wrote:It's time for the new version to get tested!

C:\>zxb --version
zxb 1.4.0-s1779

The bad: One test (fSin) actually went slower than previously. And we're still about double the time of the Zip2 compiler for integer results - as discussed above. Zip2 is actually producing that pretty awful code posted in the last few posts, but it's still twice as fast as ZXB for simple integer maths. I suspect the asm modules are mostly untouched - meaning that most of the code the new version produces is identical. It's how it's getting there that's changed!

In fact, ZX BASIC 1.4 is supposed to always produce the same or better code than 1.3. Try compiling with --asm under both versions and diffing the output (for both fSin versions).
Regarding Zip: now that ZX BASIC is ready for development again, we can discuss the memory / speed tradeoff (many compilers offer one), so that you could choose --optimize-for-speed or --optimize-for-memory.
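For the --asm comparison, something along these lines could do the diffing once both listings are saved to disk. This is only a sketch using Python's difflib; the file names and the two tiny listings are made up and stand in for the real compiler output:

```python
import difflib

def normalise(asm_text):
    """Drop comments and blank lines so only real instructions are compared.

    Caveat: a ';' inside a quoted string would be cut too.
    """
    lines = []
    for raw in asm_text.splitlines():
        code = raw.split(";", 1)[0].strip()
        if code:
            lines.append(code)
    return lines

def asm_diff(old_text, new_text,
             old_name="bm_1.3.0.asm", new_name="bm_1.4.0.asm"):
    """Return a unified diff of two assembler listings as a list of lines."""
    return list(difflib.unified_diff(
        normalise(old_text), normalise(new_text),
        fromfile=old_name, tofile=new_name, lineterm=""))

# Tiny made-up listings standing in for the two compiler outputs.
old_listing = "ld a, 255 ; value\nld (16384), a\nccf\nret\n"
new_listing = "ld a, 255\nld (16384), a\nret\n"

for line in asm_diff(old_listing, new_listing):
    print(line)
```

An empty diff would mean both versions emit the same instructions for that benchmark, which would point the fSin slowdown at something other than the generated code.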
<<

boriel

Site Admin

Posts: 1463

Joined: Wed Nov 01, 2006 6:18 pm

Location: Santa Cruz de Tenerife, Spain

Post Tue Feb 25, 2014 9:52 pm

Re: Compiler Speed Trials

BTW: Happy birthday, Britlion :)
<<

britlion

Posts: 766

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Tue Feb 25, 2014 11:47 pm

Re: Compiler Speed Trials

boriel wrote:BTW: Happy birthday, Britlion :)

Thank you very much ;)
<<

ardentcrest

Posts: 94

Joined: Fri Oct 11, 2013 5:19 pm

Post Wed Feb 26, 2014 8:05 am

Re: Compiler Speed Trials

Happy late one :lol:
I'm always on the chat or facebook.
<<

britlion

Posts: 766

Joined: Mon Apr 27, 2009 7:26 pm

Location: Slough, Berkshire, UK

Post Tue Mar 18, 2014 5:25 am

Re: Compiler Speed Trials

boriel wrote:so that you could choose --optimize-for-speed or --optimize-for-memory.


That would be a cool option.

I think better, though, might be to have it as an inline option - so the critical bits can be speed optimised, but the less critical bits can be memory optimised. Routines that need to be fast (e.g. sprites) are one thing, but some others are out of the game loop, or on intro screens etc. For example, I think the redefine key routine is probably not speed critical; but I bet you don't want it bloated. It probably only runs once.