Try this for faster multiply? - Printable Version
Thread: Try this for faster multiply? (/showthread.php?tid=335)
Try this for faster multiply? - britlion - 03-23-2011

(Where /should/ I be putting this sort of thing?)

Plug this in for mul16.asm

Code:
__MUL16: ; Multiplies HL with the last value stored into the stack

I think it saves about 110 T states per multiply on average, according to my tests. If I counted correctly, it's 10 bytes longer.

Why it's faster: SLA C is a long, slow opcode compared to just doing RLA. It's faster to loop twice and roll the A register round the two halves than it is to roll the 16-bit pair.

Also, in this case JR is a better choice than the original JP instruction. Not only is it a byte shorter, it's also faster on average. Probably. 16 JP NC instructions = 160 T states. JR is 7 if the condition fails, 12 if it passes. We can assume that, of the 16 bits, half will be 1 and half will be 0, so that's an average of (8*12)+(8*7)=152 T states. It's worth saving the byte, which compensates for the double loop being a few extra bytes.

Could also probably shave a little time by using dec b && jp nc _mul16loop, since that will jump most times. Probably not worth the bytes.

Having two short loops actually speeds up the DJNZ a little too.

Re: Try this for faster multiply? - britlion - 03-23-2011

I'm wondering if similar optimizations could be made with other 16-bit operations?
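
For anyone who wants to check the JP vs JR and DJNZ numbers from the first post, here is a rough Python sketch of the arithmetic. It is not part of the routine itself. It assumes the standard Z80 timings (JP cc = 10, JR cc = 12 taken / 7 not taken, DJNZ = 13 taken / 8 on fall-through) and ignores contended memory on the Spectrum. The function names are made up for illustration.

```python
# Rough T-state model for the jump instructions discussed above.
# Assumption: standard Z80 timings, no memory contention.
JP_CC = 10            # JP cc,nn costs 10 T states, taken or not
JR_TAKEN = 12         # JR cc,e when the jump is taken
JR_NOT_TAKEN = 7      # JR cc,e when the condition fails
DJNZ_TAKEN = 13       # DJNZ when B != 0 after the decrement (loops again)
DJNZ_NOT_TAKEN = 8    # DJNZ on the final pass (falls through)

BITS = 16             # one conditional jump per multiplier bit


def jp_total(bits=BITS):
    """Total cost of one JP cc per bit."""
    return bits * JP_CC


def jr_total(taken, bits=BITS):
    """Total cost of one JR cc per bit when `taken` of them jump."""
    return taken * JR_TAKEN + (bits - taken) * JR_NOT_TAKEN


def djnz_total(iterations, loops):
    """Total DJNZ cost for `iterations` passes split evenly over `loops` loops."""
    per_loop = iterations // loops
    return loops * ((per_loop - 1) * DJNZ_TAKEN + DJNZ_NOT_TAKEN)


if __name__ == "__main__":
    print("JP  x16:", jp_total())
    print("JR  x16, half taken:", jr_total(8))
    # Largest number of taken jumps for which JR is still no slower than JP.
    print("JR break-even:", max(t for t in range(BITS + 1) if jr_total(t) <= jp_total()))
    print("DJNZ, one 16-pass loop:", djnz_total(16, 1))
    print("DJNZ, two 8-pass loops:", djnz_total(16, 2))
```

Under these assumptions JR stays ahead as long as no more than 9 of the 16 jumps are taken, and splitting the loop in two saves 5 T states on the DJNZ because there are two cheap fall-throughs instead of one.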