Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Serious FOR bug in latest dev build s1964
#1
Hi,

First, thanks for the compiler it's my first time playing with it.
I think I've found a nasty bug though.

A FOR loop that goes to 255 causes a hang in latest dev build 1.4.1964. Loops to 254 and 256 etc seem to work OK. It's my first time using the compiler so I don't know if it's a new problem or not. It seems to occur regardless of optimisation level.
Code:
10 FOR n=1 TO 255
20 NEXT n
A portion of the generated asm seems to assume the counter is word sized even though it's just a byte so that may be the problem:
Code:
__LABEL0:
        ld a, 255
        ld hl, (_n - 1)              <-- looks a bit dodgy
        cp h
        jp nc, __LABEL3
I'm running python 2.7.6 btw.
Hope that's enough to investigate it...

Colm
Reply
#2
This is not a bug. Look:
Code:
10 FOR n=1 TO 255
20 NEXT n

This will loop until n>255. The problem is, your variable n is defined as UBYTE (or perhaps it's undefined so compiler assumed UBYTE by default?), so you will never get n>255, therefore your loop will repeat forever.

Either replace it with FOR n=0 TO 254 or redefine n as UINTEGER.

Code:
__LABEL0:
        ld a, 255
        ld hl, (_n - 1)              <-- looks a bit dodgy
        cp h
        jp nc, __LABEL3

This code is correct. There's no instruction ld h,(_n) so it's using ld hl,(_n-1) instead, which will affect h in the same way.
Reply
#3
It was undefined so it was implicitly defined as UBYTE. And yes it works fine if explicitly defined as UINTEGER.
So not so serious then but I still think it's a minor bug because the compiler is clearly using the FOR range to decide what implicit size n should have - shouldn't it switch to UINTEGER at 255 instead of 256 as it appears to do now?

I accept your explanation of the assembler btw...
Reply
#4
ivorget Wrote:It was undefined so it was implicitly defined as UBYTE. And yes it works fine if explicitly defined as UINTEGER.
So not so serious then but I still think it's a minor bug because the compiler is clearly using the FOR range to decide what implicit size n should have - shouldn't it switch to UINTEGER at 255 instead of 256 as it appears to do now?

OK, makes sense!

Although I suppose it gave you a warning telling you it was assuming UBYTE type for n. A compiler warning (usually) means the compiler is simply trying to guess how to fix the programmer's fault. For this reason, I would classify this case as "room for improvement" in the compiler, not really as a bug Smile
Reply
#5
I have seen those warnings for other undefined variables but unfortunately there is no such warning if first use is a FOR (unless I specify --explicit) so that should probably be fixed at least. After testing some more, I also found the same problem occurs where ever the (upper-range + step) exceeds a type boundary like 65535, -128 etc.

I was curious enough to hack a fix myself though it probably needs to be thought threw a bit more by someone more familiar with the compiler such as yourself. Anyway here it is:

If zxbparser.py line 1351:
Code:
id_type = common_type(common_type(p[4], p[6]), p[7])
is changed to these two lines:
Code:
beyond = make_number(p[6].value + p[7].value,p.lineno(2)) # upperbound + step
id_type = common_type(common_type(p[4], beyond), p[7])

that seems to fix the problem Smile
Reply
#6
Sorry for the delay in answering this (a very busy week!). This problem is very well known and discussed in other posts, and it's related to the loop boundary. The problem is wat einar pointed above: the loop uses 8 bits and it overflows. The problem is the compiler, usually, cannot guess it in compile time. Consider this:

Code:
DIM a AS UBYTE
RANDOMIZE
LET a = INT(RND * 256)
FOR i = 0 TO RND * 2:
   [...]
END

The compiler cannot guess the value of a in compile time, so it won't be able to know if the buffer overflows or not.
If it does, it should emit a warning or convert a to Uinteger. If it not, then it should leave it as it is.
The difference both in memory and performance is enough to leave it as it is, this is problem everybody (Me included!) bumps into when starting using a compiler. It's equivalent to C: for (i = 0; i < 256; i++) being i of byte type.

ivorget Wrote:I was curious enough to hack a fix myself though it probably needs to be thought threw a bit more by someone more familiar with the compiler such as yourself. Anyway here it is:

If zxbparser.py line 1351:
Code:
id_type = common_type(common_type(p[4], p[6]), p[7])
is changed to these two lines:
Code:
beyond = make_number(p[6].value + p[7].value,p.lineno(2)) # upperbound + step
id_type = common_type(common_type(p[4], beyond), p[7])

that seems to fix the problem Smile
Not sure (will study it this evening), but I guess the code I put above will make your fix to fail, because value is only for constants expressions.
For the record:
  • 3 + 25 * (32 / 3) is a constant expression
  • 2 * a or RND * 4 is not.
Having said that, it's true that for constant expressions (like your first example) the compiler may warn you of the problem.
Also, another trick is to convert your for to a DO LOOP, which allows you to use 8 byte integers:
Code:
LET n = 1
DO
[...]
LOOP UNTIL n = 255
Reply
#7
Actually I agree with ivorget. I also think this could be improved. I only disagree classifying it as a bug, I think it's an improvement, because guessing the datatype the programmer wanted was not supposed to be the compiler's responsibility, and there's no perfect solution for all cases anyway.

However solving this problem is trickier than suggested. When both upperbound and step are constant expressions, type should not be inferred based only on upperbound and step separately (as current implementation) or upperbound+step and step (as ivorget suggested). The compiler must consider all 3. Otherwise some of the following cases won't be identified properly:

Code:
FOR f=x TO 255 STEP 1: NEXT f      ' cannot be UBYTE due to upperbound+step = 255+1
FOR f=x TO 256 STEP -1: NEXT f     ' cannot be UBYTE due to upperbound = 256
FOR f=x TO -129 STEP 1: NEXT f     ' cannot be BYTE due to upperbound = -129
FOR f=x TO -128 STEP -1: NEXT f    ' cannot be BYTE due to upperbound+step = -128-1
Reply
#8
@Boriel,

Yeah I got a bit carried away and my hack will probably crash or fail it encounters non-constant values but maybe there's still some potential there if some protective code were added for the non-constant cases.

I guess I naively thought that a FOR loop did work like DO UNTIL but I see that even with sinclair BASIC it works more like a C for loop so its obviously standard behaviour.

Probably the most important thing right so is to add the missing 'implicit' warning for FOR (when not using --explicit).

@Einar,
Actually the compiler does consider all three because it calls common_type twice. So hopefully your cases would be handled OK. My hack would still fail with variable upperbound and step of course...
Reply
#9
ivorget Wrote:@Einar,
Actually the compiler does consider all three because it calls common_type twice. So hopefully your cases would be handled OK.

They are not.

Current implementation will think that FOR f=0 TO 254 STEP 2 should be UBYTE, because both 254 and 2 will fit into UBYTE. But it should be UINTEGER because comparing with 254+2 requires UINTEGER.

Your proposed change will think that FOR f=x TO -128 STEP 2 should be BYTE, because both -128+2=-126 and 2 will fit into BYTE. But it should be INTEGER.

Also it seems both current implementation and your proposed change will think that FOR f=255 TO 1 STEP -1 should be INTEGER, in order to accommodate all values 255, 1 and -1. But UBYTE would suffice.
Reply
#10
Quote:Your proposed change will think that FOR f=x TO -128 STEP 2 should be BYTE, because both -128+2=-126 and 2 will fit into BYTE. But it should be INTEGER.

But for that loop to be doing anything useful x would have to be <= -128 so in most cases it would already be INTEGER and so wouldn't it be the determining type?
Reply
#11
ivorget Wrote:
Quote:Your proposed change will think that FOR f=x TO -128 STEP 2 should be BYTE, because both -128+2=-126 and 2 will fit into BYTE. But it should be INTEGER.

But for that loop to be doing anything useful x would have to be <= -128 so in most cases it would already be INTEGER and so wouldn't it be the determining type?

If x is a not a constant expression, the compiler may not be able to determine its type, so it will have to guess the correct type based on upperbound -128 and step 2 only.

Frankly, guessing types is a bad idea. ZX BASIC provides it just to reduce effort for adapting programs originally written in Sinclair BASIC (that doesn't have types), so it makes sense to provide it. But there's no perfect solution for all cases.
Reply
#12
Hi guys,

For the simple case of for a=1 to 255, that should totally work as a byte. Both values are in range. Its intuitive, and not unreasonable and certainly possible to implement. If "everyone bumps their head on this" then doesnt it make sense to fix it? Even if the value of a is 0 after the loop thats ok. But bombing out should not be acceptable or something every zx basic coder needs to add to a list of "stuff that doesnt work as u expect" in their heads.

If however the type is too small to fit the range then yeah, thats the coders problem

Just my 2 cents.

Smile
Reply
#13
stomachcontents Wrote:Hi guys,

For the simple case of for a=1 to 255, that should totally work as a byte. Both values are in range. Its intuitive, and not unreasonable and certainly possible to implement. If "everyone bumps their head on this" then doesnt it make sense to fix it? Even if the value of a is 0 after the loop thats ok. But bombing out should not be acceptable or something every zx basic coder needs to add to a list of "stuff that doesnt work as u expect" in their heads.

If however the type is too small to fit the range then yeah, thats the coders problem

Just my 2 cents.

Smile
Agree, but if the compilers adds a workaround about it, then the overload of doing so will make the FOR for uByte very suobptimal. I mean, FOR with Ubyte is very fast, because it does not check many things.
Will think more of this. Maybe a compiler flag like --check_overflow for this will work: if the programmer enable it, the code will run slower but will detect this (this is already done for Arrays, BTW)
Reply
#14
I ran across this bug as well and I'm a bit puzzled.

How do you plot pixels across ZX Spectrum screen using UBYTE?

None of the build-in loops can help, I end up with solution like this (which is almost as fast as for .. next loop)


Code:
DIM x as UBYTE
do
  plot x,30
  if x = 255 then exit do
  x = x + 1
loop

When I use:


Code:
for x=0 to 254
  print at 0,0;x
next x
print at 1,0;x

I see 254 and printed 255 below, but I can't be in the loop for 255 because it will be overflown.

Or maybe I'm missing something?
Reply
#15
(01-21-2021, 09:44 PM)smok Wrote: I ran across this bug as well and I'm a bit puzzled.

How do you plot pixels across ZX Spectrum screen using UBYTE?

None of the build-in loops can help, I end up with solution like this (which is almost as fast as for .. next loop)


Code:
DIM x as UBYTE
do
  plot x,30
  if x = 255 then exit do
  x = x + 1
loop

When I use:


Code:
for x=0 to 254
  print at 0,0;x
next x
print at 1,0;x

I see 254 and printed 255 below, but I can't be in the loop for 255 because it will be overflown.

Or maybe I'm missing something?

That's it.
You can do FOR i = 0 to 255 as Ubyte, because it will overflow and never reach 255. A warning will be issued in the future when these patterns are catched. The reason is for how FOR semantics is implemented. The loop will stop *after* the iterator variable goes beyond the limit, which will never happen in this case.

Use UInteger for plot coords. Indeed Draw x y are UIntegers (their value ranges form -255 to +255).

An alternative approach in your case is:
Code:
DIM x as UByte
x = 0
DO
    PRINT AT 0, 0; x;
    LET x = x + 1
LOOP UNTIL x = 0

The loop you write is also ok, but look it starts to get "cumbersome". You got it. That's the problem: The FOR once it reaches the end, needs to increment and execute the body only if it's equal or lower. There's another solution I'm thinking of, but in the meantime please use the LOOP approach.
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)