Login

patters · (This post was last modified: 01-05-2021, 01:32 AM by patters.)

My program is calculating attribute memory addresses for screen characters. In the interests of only DIMing what I need, I store the screen coordinates as UBYTE since I only need numbers from 0-32 at most.

I have spent a long time to find that the compiler doesn't perform the maths properly when these UBYTE variables are used with larger numbers in a calculation - even though I'm never asking for that UBYTE variable to be made any larger. See this example:

Code:
DIM x,y AS UBYTE

DIM attrMem AS INTEGER

CLS

y=18

x=4

attrMem=22528+y*32+x

POKE attrMem,16 'red paper, black ink

The result should be a red square AT 18,4. However, it is printed in the wrong location on the screen unless I change variables x and y to type INTEGER. Why would this be, since I am not asking for y or x to be expanded beyond their existing values? Is this expected behaviour, or a bug? I haven't really programmed any languages which have needed me to be so specific about number types before, so apologies if this is a silly question.

***boriel*** · (This post was last modified: 01-05-2021, 12:24 PM by boriel.)

This is a recurrent question:

You're facing truncation because y * 32 is computed on first place (* has precedence over +), and y * 32 = 576 ( > 255). Some compilers do type promotion (coercion) automatically. ZX Basic does not because for Z80 (8 bits, no multiplication instruction) this will degrade performance. In the future this feature will be added as an option.

You can cast explicitly:

Code:
attrMem=22528+CAST(Uinteger, y)*32+x

The above will fix it, signaling that y value must used as an Uinteger in that multiplication. See CAST for an explanation.

Anyway, there's already a very fast function for calculating the ATTR address:

Code:
#include <attr.bas>

DIM x,y AS UBYTE

CLS

y=18

x=4

POKE AttrAddr(y, x),16 'red paper, black ink

ATTR function is implemented in a library (#include <attr.bas>). And so is POINT.

patters · (This post was last modified: 01-05-2021, 01:26 PM by patters.)

Thanks for the explanation. That is pretty counter-intuitive and a bit horrible to debug when the results aren't as expected.

It's a very un-BASIC-like programming discipline, to have to break down every single mathematical expression and consider whether each variable will max out within the scope of the calculation. I'll probably just DIM more variables as UINTEGER to avoid this kind of problem, so my program's performance and memory footprint will likely end up worse than if the compiler offered coercion. I did spot some truncation issue in edge cases elsewhere in my code for similar reasons.

For the example given though I will use that suggested library, which I had also spotted after I posted this topic.
I keep thinking about that Ariane rocket that blew up because of a casting error Smile

***boriel*** · (This post was last modified: 01-05-2021, 01:23 PM by boriel.)

(01-05-2021, 01:15 PM)patters Wrote: Thanks for the explanation. That is pretty counter-intuitive and a bit horrible to debug when the results aren't as expected.

It's difficult to break down every single mathematical expression and consider whether each variable will max out. I'll probably just DIM more variables as UINTEGER to avoid this kind of problem, so my program's performance and memory footprint will likely end up worse than if the compiler offered coercion. I did spot some truncation issue in edge cases elsewhere in my code for similar reasons.

For the example given though I will use that suggested library, which I had also spotted after I posted this topic.

As already said before, type coercion is on the TODO's queue. Sorry, but I've had no time to implement it.
Currently coercion is done only on assignation (LET).

patters · (This post was last modified: 01-05-2021, 01:44 PM by patters.)

Incidentally, what happens when you multiple two UBYTES variables together, which one needs to become UINTEGER? Or is it both?

***boriel*** · (This post was last modified: 01-05-2021, 01:52 PM by boriel.)

(01-05-2021, 01:42 PM)patters Wrote: Incidentally, what happens when you multiple two UBYTES variables together, which one needs to become UINTEGER? Or is it both?

One is enough. ZX Basic will understand and promote the other one automatically.
Keep in mind that C compilers automatically promote *every* operand into "int" (which is the native word size of the architecture). Z80 "int" is 1 byte or 16 bytes depending on what you consider the "native integer" register size.
More info here: https://stackoverflow.com/questions/4353...other-char

I think using that strategy is a performance killer for the Speccy Sad

(Intel CPUs have 16, 32 and 64 bits with MUL-tiplication in hardware. That's another thing). My implementation will be that the type to be promoted will be the one expected in the LEFT part of the LET assignation (in your case, Integer).

patters · (This post was last modified: 01-05-2021, 03:40 PM by patters.)

Using the type from the variable being assigned certainly seems sensible.

How big a task is it for you to implement the compiler coercion, is it worth me hanging back with my programming to wait? Is it a few days/weeks away or is it months away?

***boriel*** · (This post was last modified: 01-05-2021, 04:06 PM by boriel.)

I'm already working in other bugfixes and features, so let's say weeks.
Better proceed on your way (your program will be compatible in future versions anyway).

patters · (This post was last modified: 01-05-2021, 05:33 PM by patters.)

Ok, thanks for the update

Below is the part of my code where I originally ran into this issue. I eventually arrived at the above example while painstakingly debugging it:

Code:
castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

Without that CAST, it just gives the wrong answer since all the PEEK results are UBYTE.
In the above expression, did I put the CAST in the correct location?

***boriel*** · 01-05-2021, 05:58 PM

(01-05-2021, 04:44 PM)patters Wrote: Ok, thanks for the update

Below is the part of my code where I originally ran into this issue. I eventually arrived at the above example while painstakingly debugging it:

Code:
castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

Without that CAST, it just gives the wrong answer since all the PEEK results are UBYTE.
In the above expression, did I put the CAST in the correct location?

As a "rule of thumb" whenever you use a multiplication with an UByte, you better use it.
If "SkyCol" is Ubyte, you better cast it too.

However in this case I guess 'castleIntactAttr' is a 8-bit variable?? If so, there should be not overflow. What are you trying to do there? You're adding

Also remember you can use bit shift << and >> in this case:
64 * n is (n SHL 6) for example. Which can also be written as (n << 6) (I guess that's BRIGHT).
What happens if you set BRIGHT 1 in peek(n + 4) and peek(n + 6) ?? does this work in Sinclair BASIC??

Better use bAND, bOR, bXOR, SHL, and SHR operators for this.

patters · (This post was last modified: 01-05-2021, 06:51 PM by patters.)

castleIntactAttr is the sum of the attributes for all four character positions occupied by the castle. It will be checked every increment of the cannon shot trajectories. If it changes, then the shot has hit the castle. However, depending on how fast the trajectory maths ends up once compiled, I may be able to use PLOTs instead of multiple DRAWs. If this is the case I can simply check the PLOT coordinates are not within box occupied by the castle characters and abandon this ATTR method.

Since castleIntactAttr is the sum of four 8bit numbers it does go higher than 255, which is where I ran into the problem. The PEEK (n+4), PEEK (n+5) etc. are colour and brightness values from a lookup table of DEFB entries, since the castle sprites need different colours and brightness values depending on the landscape colours. The castle hasn't been drawn at this point so the ATTR function isn't useful here. I invoke this as part of a sub to update the castle graphics when the landscape or time of day changes, and so this is a good place to recalculate the castleIntactAttr after the graphics have changed.

To understand this a bit better:

castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

UINTEGER=(UBYTE*UINTEGER)+(UBYTE*UBYTE)+UBYTE+(UBYTE*UBYTE)+(UBYTE*UBYTE)+UBYTE
always<255 always<255 always<255 (they are attribute bits)
So
UINTEGER= UINTEGER + UBYTE +UBYTE+ UBYTE + UBYTE +UBYTE

At this last point will the compiler be intelligent enough to add all of these UBYTES to the UINTEGER left to right and hence not overflow?

***boriel*** · (This post was last modified: 01-06-2021, 08:39 AM by boriel.)

(01-05-2021, 06:25 PM)patters Wrote: castleIntactAttr is the sum of the attributes for all four character positions occupied by the castle. It will be checked every increment of the cannon shot trajectories. If it changes, then the shot has hit the castle. However, depending on how fast the trajectory maths ends up once compiled, I may be able to use PLOTs instead of multiple DRAWs. If this is the case I can simply check the PLOT coordinates are not within box occupied by the castle characters and abandon this ATTR method.

Checking a point within a square region is definitely much faster and easy to implement. Also I don't understand why you just dont check ATTR position of the cannon itself? When it matches that of the castle you have a hit.

***boriel*** · (This post was last modified: 01-06-2021, 09:05 AM by boriel.)

(01-05-2021, 06:25 PM)patters Wrote: Since castleIntactAttr is the sum of four 8bit numbers it does go higher than 255, which is where I ran into the problem. The PEEK (n+4), PEEK (n+5) etc. are colour and brightness values from a lookup table of DEFB entries, since the castle sprites need different colours and brightness values depending on the landscape colours. The castle hasn't been drawn at this point so the ATTR function isn't useful here. I invoke this as part of a sub to update the castle graphics when the landscape or time of day changes, and so this is a good place to recalculate the castleIntactAttr after the graphics have changed.

To understand this a bit better:

castleIntactAttr=(64*CAST(UINTEGER, PEEK (n+4))+8*skyCol+PEEK (n+5)+64*PEEK (n+6)+8*skyCol+PEEK (n+7))*2

UINTEGER=(UBYTE*UINTEGER)+(UBYTE*UBYTE)+UBYTE+(UBYTE*UBYTE)+(UBYTE*UBYTE)+UBYTE
always<255 always<255 always<255 (they are attribute bits)
So
UINTEGER= UINTEGER + UBYTE +UBYTE+ UBYTE + UBYTE +UBYTE

At this last point will the compiler be intelligent enough to add all of these UBYTES to the UINTEGER left to right and hence not overflow?

Yes, it should work. In such case, don't use UByte variables and simplify the above expression. If the expression is that complicated, using UByte you're not gaining anything. On the contrary, you're wasting memory and speed, because conversion takes time and generates more code.

Also you can use a macro to simplify such expression:

Code:
#define UPEEK(x)  CAST(Uinteger, Peek(x))

castleIntactAttr=(64* UPEEK(n+4)+8*skyCol+PEEK (n+5)+64*UPEEK (n+6)+8*skyCol+PEEK (n+7))*2

Finally, if you want to compare 4 bytes, for a change it's better to store them in a Ulong (32 bits) integer. You can PEEK not only bytes, but also integers, longs and floats with PEEK (<type>, <address>):

Code:
DIM castleIntactAttr as ULong

castleIntactAttr = PEEK(ULong, n + 4)

This is definitely *much* faster and cleaner.

patters · (This post was last modified: 01-06-2021, 05:07 PM by patters.)

Thanks for the tips. I didn't know about macros, this would be useful to add the wiki (I couldn't find anything about it). I had forgotten about the Z80 registers being used in pairs for 16bit numbers - I had 8bits stuck in my mind so I was trying to use UBYTE wherever I could.

The four attribute bytes I need to poll in the screen memory are not all sequential (it's two pairs in fact) so I'm a little unsure of the optimal way to aggregate the four bytes into a single ULONG. How is the below code? It seems a mess, but I guess it's still probably faster than the addition I was doing? Is it? I'm not familiar with the intricacies of how a processor calculates arithmetic at the ASM level, but my guess is that it's more complex than a few PEEKs and POKEs.

Code:
DIM a1, a2 AS UINTEGER         'target gfx attribute addresses to poll (two pairs of bytes)

DIM e1, e2 AS UBYTE            'expected attribute bytes (sprite is 4 chars, but has vertical symmetry)

DIM expectedAttrs AS ULONG

'at beginning of game round (performance not important)

e1=16: e2=16                                 'these attributes do change with different landscape colours

POKE @exp, e1: POKE @exp+1,e2                'Is a union needed? Would inline ASM be better, to use a register pair? Perf not important here though

POKE (UINTEGER @exp+2, PEEK (UINTEGER @exp)) 'copy the first two bytes again

expectedAttrs=PEEK (ULONG, @exp)

'during each cycle of the cannon shot trajectory (performance-critical)

POKE (UINTEGER @test,   PEEK (UINTEGER,a1))  'copy the first pair of bytes to the ULONG

POKE (UINTEGER @test+2, PEEK (UINTEGER,a2))  'copy the second pair of bytes to the ULONG

IF PEEK (ULONG, @test) <> expectedAttrs THEN PRINT "castle hit"

STOP

exp:

ASM

DEFB 0,0,0,0 ; is there a better way to reserve bytes at a referenced address?

END ASM

test:

ASM

DEFB 0,0,0,0

END ASM

***boriel*** · (This post was last modified: 01-06-2021, 05:42 PM by boriel.)

(01-06-2021, 02:59 PM)patters Wrote: Thanks for the tips. I didn't know about macros, this would be useful to add the wiki (I couldn't find anything about it). I had forgotten about the Z80 registers being used in pairs for 16bit numbers - I had 8bits stuck in my mind so I was trying to use UBYTE wherever I could.

The four attribute bytes I need to poll in the screen memory are not all sequential (it's two pairs in fact) so I'm a little unsure of the optimal way to aggregate the four bytes into a single ULONG. How is the below code? It seems a mess, but I guess it's still probably faster than the addition I was doing? Is it? I'm not familiar with the intricacies of how a processor calculates arithmetic at the ASM level, but my guess is that it's more complex than a few PEEKs and POKEs.

Code:
DIM a1, a2 AS UINTEGER 'target gfx attribute addresses to poll (two pairs of bytes) DIM e1, e2 AS UBYTE 'expected attribute bytes (sprite is 4 chars, but has lateral symmetry) DIM expectedAttrs AS ULONG 'at beginning of game round (performance not important) e1=16: e2=16 'these attributes do change with different landscape colours POKE @exp, e1: POKE @exp+1,e2 'Is a union needed? Would inline ASM be better, to use a register pair? Perf not important here though POKE (UINTEGER @exp+2, PEEK (UINTEGER @exp)) 'copy the first two bytes again expectedAttrs=PEEK (ULONG, @exp) 'during each cycle of the cannon shot trajectory (performance-critical) POKE (UINTEGER @test, PEEK (UINTEGER,a1)) 'copy the first pair of bytes to the ULONG POKE (UINTEGER @test+2, PEEK (UINTEGER,a2)) 'copy the second pair of bytes to the ULONG IF PEEK (ULONG, @test) <> expectedAttrs THEN PRINT "castle hit" STOP exp: ASM DEFB 0,0,0,0 ; is there a better way to reserve bytes at a referenced address? END ASM test: ASM DEFB 0,0,0,0 END ASM

No need to reserve memory in this case: just declare an ULong variable and use @ULong + 0, @ULong + 1,etc..
Still not sure what are you trying to achieve with this. I suppose the castles I saw in the demo you send me are 2x2 attr cells and you want to store their values. If so:

Code:
DIM expectedAttr, testAttr as ULong

DIM e1 AT @expectedAttr as UByte      ' e1 maps into the first byte of expectedAttr

DIM e2 AT @expectedAttr + 1 as UByte  ' e2 maps into the 2nd byte of expectedAttr

DIM e3 AT @expectedAttr + 2as UByte   ' e3 maps into the 3rd byte of expectedAttr

DIM e4 AT @expectedAttr + 3as UByte   ' e4 maps into the 4th byte of expectedAttr

...

POKE Uinteger @testAttr, PEEK (Uinteger, ATTRaddr(castleRow1, castleCol1))

POKE Uinteger @testAttr + 2, PEEK (Uinteger, ATTRaddr(castleRow2, castleCol1))

Still believe you might be complicating things a bit, and doing premature optimization? Big Grin

I mean, consider use a much simpler solution first (even checking position by position). See how fast it goes and optimize later (i.e. you're starting to use this compiler: use arrays of bytes for this and compare the 4 positions instead of doing this dirty thing on first place).

Login
Username:
Password:	Lost Password?
	Remember me