Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Mixing variable types in a numerical calculation
#1
My program is calculating attribute memory addresses for screen characters. In the interests of only DIMing what I need, I store the screen coordinates as UBYTE since I only need numbers from 0-32 at most.

I have spent a long time to find that the compiler doesn't perform the maths properly when these UBYTE variables are used with larger numbers in a calculation - even though I'm never asking for that UBYTE variable to be made any larger. See this example:
Code:
DIM x,y AS UBYTE
DIM attrMem AS INTEGER
CLS
y=18
x=4
attrMem=22528+y*32+x
POKE attrMem,16 'red paper, black ink

The result should be a red square AT 18,4. However, it is printed in the wrong location on the screen unless I change variables x and y to type INTEGER. Why would this be, since I am not asking for y or x to be expanded beyond their existing values? Is this expected behaviour, or a bug? I haven't really programmed any languages which have needed me to be so specific about number types before, so apologies if this is a silly question.
Reply
#2
This is a recurrent question:

You're facing truncation because y * 32 is computed on first place (* has precedence over +), and y * 32 = 576 ( > 255). Some compilers do type promotion (coercion) automatically. ZX Basic does not because for Z80 (8 bits, no multiplication instruction) this will degrade performance. In the future this feature will be added as an option.

You can cast explicitly:
Code:
attrMem=22528+CAST(Uinteger, y)*32+x
The above will fix it, signaling that y value must used as an Uinteger in that multiplication. See CAST for an explanation.

Anyway, there's already a very fast function for calculating the ATTR address:
Code:
#include <attr.bas>

DIM x,y AS UBYTE
CLS
y=18
x=4
POKE AttrAddr(y, x),16 'red paper, black ink

ATTR function is implemented in a library (#include <attr.bas>). And so is POINT.
Reply
#3
Thanks for the explanation. That is pretty counter-intuitive and a bit horrible to debug when the results aren't as expected.

It's a very un-BASIC-like programming discipline, to have to break down every single mathematical expression and consider whether each variable will max out within the scope of the calculation. I'll probably just DIM more variables as UINTEGER to avoid this kind of problem, so my program's performance and memory footprint will likely end up worse than if the compiler offered coercion. I did spot some truncation issue in edge cases elsewhere in my code for similar reasons.

For the example given though I will use that suggested library, which I had also spotted after I posted this topic.
I keep thinking about that Ariane rocket that blew up because of a casting error Smile
Reply
#4
(01-05-2021, 01:15 PM)patters Wrote: Thanks for the explanation. That is pretty counter-intuitive and a bit horrible to debug when the results aren't as expected.

It's difficult to break down every single mathematical expression and consider whether each variable will max out. I'll probably just DIM more variables as UINTEGER to avoid this kind of problem, so my program's performance and memory footprint will likely end up worse than if the compiler offered coercion. I did spot some truncation issue in edge cases elsewhere in my code for similar reasons.

For the example given though I will use that suggested library, which I had also spotted after I posted this topic.

As already said before, type coercion is on the TODO's queue. Sorry, but I've had no time to implement it.
Currently coercion is done only on assignation (LET).
Reply
#5
Incidentally, what happens when you multiple two UBYTES variables together, which one needs to become UINTEGER? Or is it both?
Reply
#6
(01-05-2021, 01:42 PM)patters Wrote: Incidentally, what happens when you multiple two UBYTES variables together, which one needs to become UINTEGER? Or is it both?

One is enough. ZX Basic will understand and promote the other one automatically.
Keep in mind that C compilers automatically promote *every* operand into "int" (which is the native word size of the architecture). Z80 "int" is 1 byte or 16 bytes depending on what you consider the "native integer" register size.
More info here: https://stackoverflow.com/questions/4353...other-char

I think using that strategy is a performance killer for the Speccy Sad  (Intel CPUs have 16, 32 and 64 bits with MUL-tiplication in hardware. That's another thing). My implementation will be that the type to be promoted will be the one expected in the LEFT part of the LET assignation (in your case, Integer).
Reply
#7
Using the type from the variable being assigned certainly seems sensible.

How big a task is it for you to implement the compiler coercion, is it worth me hanging back with my programming to wait? Is it a few days/weeks away or is it months away?
Reply
#8
I'm already working in other bugfixes and features, so let's say weeks.
Better proceed on your way (your program will be compatible in future versions anyway).
Reply
#9
Ok, thanks for the update

Below is the part of my code where I originally ran into this issue. I eventually arrived at the above example while painstakingly debugging it:

Code:
castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

Without that CAST, it just gives the wrong answer since all the PEEK results are UBYTE.
In the above expression, did I put the CAST in the correct location?
Reply
#10
(01-05-2021, 04:44 PM)patters Wrote: Ok, thanks for the update

Below is the part of my code where I originally ran into this issue. I eventually arrived at the above example while painstakingly debugging it:

Code:
castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

Without that CAST, it just gives the wrong answer since all the PEEK results are UBYTE.
In the above expression, did I put the CAST in the correct location?

As a "rule of thumb" whenever you use a multiplication with an UByte, you better use it.
If "SkyCol" is Ubyte, you better cast it too.

However in this case I guess 'castleIntactAttr' is a 8-bit variable?? If so, there should be not overflow. What are you trying to do there? You're adding

Also remember you can use bit shift << and >> in this case:
64 * n  is (n SHL 6) for example. Which can also be written as (n << 6) (I guess that's BRIGHT).
What happens if you set BRIGHT 1 in peek(n + 4) and peek(n + 6) ?? does this work in Sinclair BASIC??

Better use bAND, bOR, bXOR, SHL, and SHR operators for this.
Reply
#11
castleIntactAttr is the sum of the attributes for all four character positions occupied by the castle. It will be checked every increment of the cannon shot trajectories. If it changes, then the shot has hit the castle. However, depending on how fast the trajectory maths ends up once compiled, I may be able to use PLOTs instead of multiple DRAWs. If this is the case I can simply check the PLOT coordinates are not within box occupied by the castle characters and abandon this ATTR method.

Since castleIntactAttr is the sum of four 8bit numbers it does go higher than 255, which is where I ran into the problem. The PEEK (n+4), PEEK (n+5) etc. are colour and brightness values from a lookup table of DEFB entries, since the castle sprites need different colours and brightness values depending on the landscape colours. The castle hasn't been drawn at this point so the ATTR function isn't useful here. I invoke this as part of a sub to update the castle graphics when the landscape or time of day changes, and so this is a good place to recalculate the castleIntactAttr after the graphics have changed.

To understand this a bit better:

castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

        UINTEGER=(UBYTE*UINTEGER)+(UBYTE*UBYTE)+UBYTE+(UBYTE*UBYTE)+(UBYTE*UBYTE)+UBYTE
                                   always<255           [font=Monaco, Consolas, Courier, monospace]always<255    [font=Monaco, Consolas, Courier, monospace]always<255       (they are attribute bits)[/font][/font]
[font=Monaco, Consolas, Courier, monospace][font=Monaco, Consolas, Courier, monospace]So[/font][/font]
        UINTEGER=    UINTEGER    +  UBYTE      +UBYTE+    UBYTE    +   UBYTE     +UBYTE

At this last point will the compiler be intelligent enough to add all of these UBYTES to the UINTEGER left to right and hence not overflow?
Reply
#12
(01-05-2021, 06:25 PM)patters Wrote: castleIntactAttr is the sum of the attributes for all four character positions occupied by the castle. It will be checked every increment of the cannon shot trajectories. If it changes, then the shot has hit the castle. However, depending on how fast the trajectory maths ends up once compiled, I may be able to use PLOTs instead of multiple DRAWs. If this is the case I can simply check the PLOT coordinates are not within box occupied by the castle characters and abandon this ATTR method.

Checking a point within a square region is definitely much faster and easy to implement. Also I don't understand why you just dont check ATTR position of the cannon itself? When it matches that of the castle you have a hit.
Reply
#13
(01-05-2021, 06:25 PM)patters Wrote: Since castleIntactAttr is the sum of four 8bit numbers it does go higher than 255, which is where I ran into the problem. The PEEK (n+4), PEEK (n+5) etc. are colour and brightness values from a lookup table of DEFB entries, since the castle sprites need different colours and brightness values depending on the landscape colours. The castle hasn't been drawn at this point so the ATTR function isn't useful here. I invoke this as part of a sub to update the castle graphics when the landscape or time of day changes, and so this is a good place to recalculate the castleIntactAttr after the graphics have changed.

To understand this a bit better:

castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

        UINTEGER=(UBYTE*UINTEGER)+(UBYTE*UBYTE)+UBYTE+(UBYTE*UBYTE)+(UBYTE*UBYTE)+UBYTE
                                   always<255           [font=Monaco, Consolas, Courier, monospace]always<255    [font=Monaco, Consolas, Courier, monospace]always<255       (they are attribute bits)[/font][/font]
[font=Monaco, Consolas, Courier, monospace][font=Monaco, Consolas, Courier, monospace]So[/font][/font]
        UINTEGER=    UINTEGER    +  UBYTE      +UBYTE+    UBYTE    +   UBYTE     +UBYTE

At this last point will the compiler be intelligent enough to add all of these UBYTES to the UINTEGER left to right and hence not overflow?
Yes, it should work. In such case, don't use UByte variables and simplify the above expression. If the expression is that complicated, using UByte you're not gaining anything. On the contrary, you're wasting memory and speed, because conversion takes time and generates more code.

Also you can use a macro to simplify such expression:

Code:
#define UPEEK(x)  CAST(Uinteger, Peek(x))
castleIntactAttr=(64* UPEEK(n+4)+8*skyCol+PEEK (n+5)+64*UPEEK (n+6)+8*skyCol+PEEK (n+7))*2


Finally, if you want to compare 4 bytes, for a change it's better to store them in a Ulong (32 bits) integer. You can PEEK not only bytes, but also integers, longs and floats with PEEK (<type>, <address>):
Code:
DIM castleIntactAttr as ULong

castleIntactAttr = PEEK(ULong, n + 4)

This is definitely *much* faster and cleaner.
Reply
#14
Thanks for the tips. I didn't know about macros, this would be useful to add the wiki (I couldn't find anything about it). I had forgotten about the Z80 registers being used in pairs for 16bit numbers - I had 8bits stuck in my mind so I was trying to use UBYTE wherever I could.

The four attribute bytes I need to poll in the screen memory are not all sequential (it's two pairs in fact) so I'm a little unsure of the optimal way to aggregate the four bytes into a single ULONG. How is the below code? It seems a mess, but I guess it's still probably faster than the addition I was doing? Is it? I'm not familiar with the intricacies of how a processor calculates arithmetic at the ASM level, but my guess is that it's more complex than a few PEEKs and POKEs.

Code:
DIM a1, a2 AS UINTEGER         'target gfx attribute addresses to poll (two pairs of bytes)
DIM e1, e2 AS UBYTE            'expected attribute bytes (sprite is 4 chars, but has vertical symmetry)
DIM expectedAttrs AS ULONG

'at beginning of game round (performance not important)
e1=16: e2=16                                 'these attributes do change with different landscape colours
POKE @exp, e1: POKE @exp+1,e2                'Is a union needed? Would inline ASM be better, to use a register pair? Perf not important here though
POKE (UINTEGER @exp+2, PEEK (UINTEGER @exp)) 'copy the first two bytes again
expectedAttrs=PEEK (ULONG, @exp)

'during each cycle of the cannon shot trajectory (performance-critical)
POKE (UINTEGER @test,   PEEK (UINTEGER,a1))  'copy the first pair of bytes to the ULONG
POKE (UINTEGER @test+2, PEEK (UINTEGER,a2))  'copy the second pair of bytes to the ULONG
IF PEEK (ULONG, @test) <> expectedAttrs THEN PRINT "castle hit"

STOP

exp:
ASM
DEFB 0,0,0,0 ; is there a better way to reserve bytes at a referenced address?
END ASM

test:
ASM
DEFB 0,0,0,0
END ASM
Reply
#15
(01-06-2021, 02:59 PM)patters Wrote: Thanks for the tips. I didn't know about macros, this would be useful to add the wiki (I couldn't find anything about it). I had forgotten about the Z80 registers being used in pairs for 16bit numbers - I had 8bits stuck in my mind so I was trying to use UBYTE wherever I could.

The four attribute bytes I need to poll in the screen memory are not all sequential (it's two pairs in fact) so I'm a little unsure of the optimal way to aggregate the four bytes into a single ULONG. How is the below code? It seems a mess, but I guess it's still probably faster than the addition I was doing? Is it? I'm not familiar with the intricacies of how a processor calculates arithmetic at the ASM level, but my guess is that it's more complex than a few PEEKs and POKEs.

Code:
DIM a1, a2 AS UINTEGER         'target gfx attribute addresses to poll (two pairs of bytes)
DIM e1, e2 AS UBYTE            'expected attribute bytes (sprite is 4 chars, but has lateral symmetry)
DIM expectedAttrs AS ULONG

'at beginning of game round (performance not important)
e1=16: e2=16                                 'these attributes do change with different landscape colours
POKE @exp, e1: POKE @exp+1,e2                'Is a union needed? Would inline ASM be better, to use a register pair? Perf not important here though
POKE (UINTEGER @exp+2, PEEK (UINTEGER @exp)) 'copy the first two bytes again
expectedAttrs=PEEK (ULONG, @exp)

'during each cycle of the cannon shot trajectory (performance-critical)
POKE (UINTEGER @test,   PEEK (UINTEGER,a1))  'copy the first pair of bytes to the ULONG
POKE (UINTEGER @test+2, PEEK (UINTEGER,a2))  'copy the second pair of bytes to the ULONG
IF PEEK (ULONG, @test) <> expectedAttrs THEN PRINT "castle hit"

STOP

exp:
ASM
DEFB 0,0,0,0 ; is there a better way to reserve bytes at a referenced address?
END ASM

test:
ASM
DEFB 0,0,0,0
END ASM

No need to reserve memory in this case: just declare an ULong variable and use @ULong + 0, @ULong + 1,etc..
Still not sure what are you trying to achieve with this. I suppose the castles I saw in the demo you send me are 2x2 attr cells and you want to store their values. If so:


Code:
DIM expectedAttr, testAttr as ULong
DIM e1 AT @expectedAttr as UByte      ' e1 maps into the first byte of expectedAttr
DIM e2 AT @expectedAttr + 1 as UByte  ' e2 maps into the 2nd byte of expectedAttr
DIM e3 AT @expectedAttr + 2as UByte   ' e3 maps into the 3rd byte of expectedAttr
DIM e4 AT @expectedAttr + 3as UByte   ' e4 maps into the 4th byte of expectedAttr
...

POKE Uinteger @testAttr, PEEK (Uinteger, ATTRaddr(castleRow1, castleCol1))
POKE Uinteger @testAttr + 2, PEEK (Uinteger, ATTRaddr(castleRow2, castleCol1))


Still believe you might be complicating things a bit, and doing premature optimization? Big Grin
I mean, consider use a much simpler solution first (even checking position by position). See how fast it goes and optimize later (i.e. you're starting to use this compiler: use arrays of bytes for this and compare the 4 positions instead of doing this dirty thing on first place).
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)