boriel wrote: Using IY would be nice. I'm thinking of relocating the TIMER routine into RAM to free the IY register (or just use something like PUSH IY, DI, call routine, POP IY, EI, RET).
Yeah, I wouldn't want to mess with that either -- it is something BASIC programmers expect. There's a point where the BASIC system just gets in the way and you are probably there now.
I also think structs would be a plus, but I haven't yet figured out how to implement them efficiently on such a limited machine: na_th_an suggested using an IY+n scheme to reference struct fields, but IY is used by the TIMER interrupt routine.
How did you implement structs?
As you say, the addressing modes are limited, so there is no clear preferred solution. I always look at what machine code programmers do, and I think compilers should try to emulate that if possible.
M/c programmers organize the data in their structs so they can be walked, reading and writing data as it is needed. Random access in the struct is avoided if possible. So in my own asm code I use HL to point at a member in the struct and I do a lot of 'LD r,(HL); INC HL' in there as needed. That makes it 13 cycles (and two bytes) per sequential access vs 19 cycles (and three bytes) per random access using an index register, i.e. 'LD r,(IX+d)'. z88dk uses this method and retains the current struct pointer in the hope that future accesses are near past accesses. In ideal situations this should lead to smaller code, if not faster code, but it depends on the programmer organizing the struct in a good order, as the compiler will not reorder struct members.
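To put rough numbers on that trade-off, here is a small C sketch of a cost model. It is not how z88dk works internally; `struct sprite`, `walk_cost` and `indexed_cost` are made-up names for illustration. The model assumes the pointer is only ever moved with INC HL / DEC HL (6 cycles each), that HL starts at the first member and is left pointing at the member just read, and it only counts reads.

```c
#include <stddef.h>

/* Z80 cycle (T-state) costs for the instructions discussed in the post. */
enum {
    T_LD_R_HL  = 7,   /* LD r,(HL)                  */
    T_STEP_HL  = 6,   /* INC HL or DEC HL           */
    T_LD_R_IDX = 19   /* LD r,(IX+d) / LD r,(IY+d)  */
};

/* Hypothetical record, used only to get some member offsets. */
struct sprite {
    unsigned char x;
    unsigned char y;
    unsigned char frame;
    unsigned char flags;
};

/*
 * Cycles needed to read the members at the given byte offsets, in the
 * given order, by walking a single pointer in HL.
 * Assumption: HL starts at offset 0 and stays on the last member read.
 */
static int walk_cost(const size_t *offsets, size_t n)
{
    size_t pos = 0;
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        /* Distance HL has to move, forwards or backwards. */
        size_t dist = offsets[i] > pos ? offsets[i] - pos : pos - offsets[i];

        total += (int)dist * T_STEP_HL + T_LD_R_HL;
        pos = offsets[i];
    }
    return total;
}

/* Cycles needed to read n members through an index register. */
static int indexed_cost(size_t n)
{
    return (int)n * T_LD_R_IDX;
}
```

With the four members read in declaration order (offsets 0, 1, 2, 3), `walk_cost` comes to 46 cycles against 76 for `indexed_cost(4)`. Read them in the order flags, x, frame, y (offsets 3, 0, 2, 1) and the walk costs 82 cycles, so the indexed version wins. A real compiler has other ways to move HL, so treat this only as an illustration of why member order matters.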
There are a lot of ifs and maybes in that, so it may be more efficient (not to mention easier) to do indexed register access for the general case. z88dk has chosen not to intrude on asm extensions (user code or library), so it has avoided using index registers or reserving registers for exclusive use. The thinking has been (I guess, since I was not there from the beginning, but I do go along with it) that strong optimization would be introduced to resolve a lot of issues, and that maximum performance would come out of asm library code, where most execution time would be spent. We don't have that uber optimization yet and there is still a long way to go.
In other words, I don't have the right answer.