
AlcoholicsAnonymous · 08-30-2015, 06:38 AM

boriel Wrote:* Supporting memory banks internally makes the compiler very platform dependable, and definitely much more complex. This has already been discussed, and the simplest approach is to mimic LOAD! and the similar commands, with external functions (.bas files with inline asm). Trying to mimic a "linear memory model" with the ZX Spectrum RAM scheme is, IMHO, a waste of time. Not only is complex, but also makes the compiler infinitely more hard. Each block code must be "measured" (in size) and rearranged (e.g. backpatching during linking phase which is currently done by the assembler itself).

It's not easy to support a linear address space invisibly. The machines vary widely in how capable their banking mechanisms are. The Spectrum probably has the weakest banking scheme (+3 excluded), only allowing memory to be paged into the top 16k bank. The most flexible is probably the Enterprise (although it's not the only one with a similar scheme), which divides the address space into four 16k pages, each of which can have any of 256 physical memory pages paged in. But not all memory pages are equal: some are "contended" (shared with the display hardware) and some might be partly allocated by the operating system.
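To make the Enterprise-style scheme concrete, here is a minimal sketch in C of how four 16k slots with an 8-bit page number each turn a 16-bit logical address into a 22-bit physical one (256 pages x 16k = 4 MB). The names `bank_map` and `physical_address` are made up for illustration and don't come from any real library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one 8-bit page register per 16K slot of the
   Z80's 64K logical address space. */
typedef struct {
    uint8_t page[4];
} bank_map;

/* Translate a 16-bit logical address into a 22-bit physical address. */
uint32_t physical_address(const bank_map *map, uint16_t logical)
{
    assert(map != NULL);
    unsigned slot = logical >> 14;           /* top 2 bits pick the slot   */
    uint32_t offset = logical & 0x3FFFu;     /* low 14 bits: offset in page */
    return ((uint32_t)map->page[slot] << 14) | offset;
}
```

With pages `{0x00, 0x01, 0xFE, 0xFF}` mapped in, logical address 0xC000 lands at physical 0x3FC000. A compiler that wanted to hide this would have to track the contents of all four registers at every access.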

In z88dk we're giving the programmer both the ability and the responsibility to place things in memory, and trusting the programmer to write code that pages things in as needed. This is done with SECTIONs in the assembler/linker: the program assigns code and data to named sections (aka containers), each of which can have its own ORG address, and the linker simply spits out one binary per section. So you could create one section for each of the Spectrum's extra 16k banks and put stuff into it, but the program is still responsible for making sure the correct page is active before accessing that data or code.
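What "the program is still responsible" means in practice can be shown with a small host-side model in C. This is only a sketch, not z88dk code: `page_in`, `peek` and `poke` are hypothetical helpers, and the array stands in for the 128K Spectrum's eight 16k RAM banks, which appear one at a time in the window at 0xC000 (selected on real hardware by a write to port 0x7FFD).

```c
#include <assert.h>
#include <stdint.h>

#define BANK_SIZE   0x4000u   /* 16K */
#define WINDOW_BASE 0xC000u   /* banks are paged in at the top of memory */
#define NUM_BANKS   8u

static uint8_t banks[NUM_BANKS][BANK_SIZE];   /* modelled physical RAM */
static unsigned active_bank;                  /* bank visible in the window */

/* Stand-in for the paging write (an OUT to port 0x7FFD on hardware). */
void page_in(unsigned bank)
{
    assert(bank < NUM_BANKS);
    active_bank = bank;
}

/* Reads and writes through the window hit whichever bank is active. */
uint8_t peek(uint16_t addr)
{
    assert(addr >= WINDOW_BASE);
    return banks[active_bank][addr - WINDOW_BASE];
}

void poke(uint16_t addr, uint8_t value)
{
    assert(addr >= WINDOW_BASE);
    banks[active_bank][addr - WINDOW_BASE] = value;
}
```

The same address 0xC000 holds different data depending on the last `page_in` call, so forgetting that call before a `peek` silently reads the wrong bank. The linker can place data in a bank's section, but it can't make that call for you.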

The C standard is starting to define what to do with banked memory in embedded systems in a technical report, which is the step prior to being adopted as a standard. The scheme is similar: the programmer defines by name which sections code and data are poured into. The implementation is supposed to generate code that banks automatically when needed, and the mechanism is probably to supply one bank called 'common' that is always paged in and contains the necessary banking code. But there are many issues to overcome, particularly with performance.
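A rough idea of what such a 'common' bank routine might do, again as a host-side C sketch. This is my guess at the shape of the mechanism, not something taken from the technical report: `far_call`, `current_bank` and `double_in_bank3` are all hypothetical names.

```c
#include <assert.h>

#define NUM_BANKS 8

int current_bank;                    /* bank visible in the paged window */

typedef int (*banked_fn)(int);

/* Assumed to live in the always-mapped common area: page in the
   callee's bank, call it, then restore the caller's bank. */
int far_call(int bank, banked_fn fn, int arg)
{
    assert(bank >= 0 && bank < NUM_BANKS);
    int saved = current_bank;
    current_bank = bank;             /* on hardware: write the paging port */
    int result = fn(arg);
    current_bank = saved;            /* undo the switch before returning */
    return result;
}

/* Hypothetical routine assumed to be linked into bank 3. */
int double_in_bank3(int x)
{
    assert(current_bank == 3);       /* its code is only visible in bank 3 */
    return 2 * x;
}
```

Every cross-bank call pays for the save, switch and restore, which is where the performance worry comes from.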

Anyway it's not such a bad thing to leave the responsibility with the programmer.

Quote:* For the FP format, 32 bits is not "faster". In fact, the 32 bits FP standard is temporarily converted to 40 bits in a format very similar to the one used by the FP ROM. FP ROM was used both for memory saving and compatibility purposed. I've been reading (and looking for) 32 bits FP routines for Z80 and haven't found anything of interest (any help will be welcome in this area).

On the z80 the natural float size is 48 bits, and you'll see that many floating point implementations from the 70s and early 80s are 48-bit. At 48 bits the z80 can keep the float calculation in registers just as well as at any smaller size, so those implementations naturally aimed for the largest precision that was still fast. One implementation called math48 was used in Turbo Pascal, and it can hold two 48-bit floats at the same time: one in BCDEHL and the other in BCDEHL' (in the exx set). That suits an fp library perfectly, as nearly all functions take either one or two parameters.

It's also a joy to program in assembly language, which always indicates that the implementation is well suited to the processor. Have a look at this code implementing the atan2() function:

Code:
; double atan2(double y, double x)

SECTION code_fp_math48

PUBLIC am48_atan2

EXTERN am48_dcmpa, am48_ddiv, am48_atan, am48_dneg
EXTERN am48_dconst_pi, am48_dadd, am48_dpopret

am48_atan2:

   ; compute arctan y/x in the interval [-pi, +pi] radians
   ;
   ; enter : AC = double y
   ;         AC'= double x
   ;
   ; exit  : AC' = atan2(y,x)
   ;         carry reset
   ;
   ; uses  : af, af', bc', de', hl'

   push bc                     ; save AC
   push de
   push hl

   exx
   call am48_dcmpa             ; fabs(x) >= fabs(y) ? (eff |x| - |y|)
   exx

   jr nc, greater_equal

less:

   ; AC = y
   ; AC'= x

   call am48_ddiv
   call am48_atan
   call am48_dneg              ; AC'= -atan(x/y)

   ld a,b
   and $80                     ; a = sgn(y)

   call am48_dconst_pi
   dec l                       ; AC = pi/2

join:

   or b
   ld b,a                      ; AC = sgn(y)*pi/2

   call am48_dadd
   jp am48_dpopret

greater_equal:

   ; AC = y
   ; AC'= x

   ld a,b
   and $80
   push af                     ; save sgn(y)

   exx

   ; AC = x
   ; AC'= y
   ; stack = sgn(y)

   call am48_ddiv
   call am48_atan              ; AC'= atan(y/x)

   pop af                      ; a = sgn(y)

   bit 7,b
   jp z, am48_dpopret          ; if x >= 0 done

   call am48_dconst_pi         ; AC = pi
   jr join

;double atan2(double y, double x)
;{
;   double a;
;
;   if (fabs(x) >= fabs(y))
;   {
;      a = atan(y/x);
;      if (x < 0.0)
;      {
;         if (y >= 0.0)
;            a += _pi ;
;         else a -= _pi ;
;      }
;   }
;   else
;   {
;      a = -atan(x/y);
;      if (y < 0.0)
;         a -= _halfpi;
;      else
;         a += _halfpi;
;   }
;
;   return a;
;}

AC indicates the double in the primary register set and AC' the double in the exx set. AC stands for accumulator.
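Two tricks in the listing are easier to see with the register layout spelled out. The C sketch below models one accumulator on a desktop machine. The layout is partly inferred from the listing (`and $80` on B picks out the sign, `dec l` turns pi into pi/2, so L carries the exponent) and partly assumed: I'm taking the exponent to be biased by 128, L = 0 to mean zero, and the mantissa to be 40 bits with an implied leading 1 whose slot is reused for the sign. The names `fp48`, `fp48_to_double`, `fp48_half` and `fp48_with_sign_of` are hypothetical and are not part of the math48 library.

```c
#include <stdint.h>

/* One 48-bit accumulator, fields named after the Z80 registers.
   Assumed layout: bit 7 of B = sign, rest of B plus C, D, E, H =
   mantissa (implied leading 1), L = exponent biased by 128. */
typedef struct {
    uint8_t b, c, d, e, h, l;
} fp48;

/* Convert to a host double so values can be inspected. */
double fp48_to_double(fp48 x)
{
    if (x.l == 0)
        return 0.0;                           /* assumed encoding of zero */
    uint64_t m = ((uint64_t)(x.b | 0x80) << 32) | ((uint64_t)x.c << 24)
               | ((uint64_t)x.d << 16) | ((uint64_t)x.e << 8) | x.h;
    double v = (double)m / 1099511627776.0;   /* m / 2^40, in [0.5, 1) */
    int exp = (int)x.l - 128;
    for (; exp > 0; --exp) v *= 2.0;
    for (; exp < 0; ++exp) v *= 0.5;
    return (x.b & 0x80) ? -v : v;
}

/* "dec l": lowering the exponent by one halves the value. */
fp48 fp48_half(fp48 x)
{
    if (x.l != 0)
        x.l--;
    return x;
}

/* "and $80" on y, then "or b" on x: give a positive x the sign of y. */
fp48 fp48_with_sign_of(fp48 x, fp48 y)
{
    x.b |= (uint8_t)(y.b & 0x80);
    return x;
}
```

Under these assumptions pi is `{0x49, 0x0F, 0xDA, 0xA2, 0x21, 0x82}`, and `fp48_with_sign_of(fp48_half(pi), y)` is the sgn(y)*pi/2 that the `less:` branch builds before the add. Both operations cost one or two instructions on the z80 because sign and exponent each sit in a register of their own.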

The 32-bit single precision float was an invention for 32-bit processors, since that's the maximum width those processors can handle in their quickest instructions. On the z80 the natural size is 48 bits, as mentioned. A 16-bit float might be another option, but that's too limited for a float type and is more suitable for a fixed point type.

With a 48-bit float the mantissa is 40 bits, and with a 32-bit float the mantissa would be 24 bits. The only advantage the 32-bit float has in terms of execution speed is that bit shifting has to be done at most 24 times rather than 40. But on a z80 there is a sense that you're losing free precision bits when constraining to 32 bits.
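The 24 versus 40 figure comes from the bit-serial loops at the heart of multiply and divide. Here is a minimal C sketch of a shift-and-add mantissa multiply that counts its iterations. It is a generic textbook loop, not the math48 routine (a real z80 implementation may work a byte at a time or exit early), and `mantissa_mul` is a hypothetical name.

```c
#include <stdint.h>

/* Multiply two mantissas of the given width one multiplier bit at a
   time, keeping the high half of the product. The number of loop
   passes is returned through *steps. Widths up to 40 bits fit easily
   in the 64-bit accumulator. */
uint64_t mantissa_mul(uint64_t a, uint64_t b, unsigned bits, unsigned *steps)
{
    uint64_t acc = 0;
    *steps = 0;
    for (unsigned i = 0; i < bits; ++i) {
        if (b & 1u)
            acc += a;       /* add the multiplicand when the bit is set */
        acc >>= 1;          /* shift the partial product right */
        b >>= 1;            /* move on to the next multiplier bit */
        ++*steps;
    }
    return acc;             /* floor(a * b / 2^bits) */
}
```

Multiplying 0.75 by 0.5 as 24-bit mantissas (`0xC00000` and `0x800000`) takes 24 passes, and the same values as 40-bit mantissas take 40, so the loop cost grows with the mantissa width while the register traffic around it stays about the same.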

Boriel, you mentioned you found a 32-bit fp implementation for the z80. Do you have a link for it?
