Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Fast Integer Square Roots
A few of you have wondered why I've been obsessing over speeding up math functions, cos the compiler is for games! :-)

Well, here's a concrete example of Fast square roots in action. Part of the football algorithm is "Who do I move, and where?" on the field. I'm hoping for better graphics, but this was a mock up to look at the math of that problem. If a player is moving with the ball, who is nearest to their path to intercept them?

With players all over the place, you have to work out distances to vectors. I finally got this working correctly (it sometimes broke, the last time I looked at this - and it turned out I was crossing the 16 bit number barrier. With a newer, longer sqrt function I can go back and make the algorithm work) and here it is in two versions - one using the ROM sqr() function, and the other using isqrt.bas from - this is a concrete example of why the slow ROM routines can become a problem for certain aspects of certain games. The speed difference is startling. The ONLY difference in the source code is changing the square root used from ROM to library. A matter of a couple of characters and an include.

It picks a point for a ball, and draws the pass line to another player. Then every other player has a line drawn to the nearest point on that line, or the nearest end if that IS the nearest point on the line. But look at the speed difference!

Note: press ENTER for a new screen.


isqrt -
This is really impressive!
On the other hand, I would like to benchmark your fast FP routines. If they really improve speed we should consider to integrate them directly into the compiler (e.g. using a --fast-floating-point command line flag as other compilers do).

What do you think?
Go right ahead! The source code is listed in the library.

Note that for square roots, there are two options - an accurate one, about 6 times faster than the ROM version (which ironically, still uses the ROM's FP calculator. It's just a better algorithm by far than the horrid kludge built into the ROM - though admittedly, them doing it in 7 bytes was a neat trick). This should always give the same result as the ROM routine. You might consider the bytes it costs reasonable for general use (would it be better to default to this, and have a --small option instead? It does make the compiler look more impressive Smile ) I think I just hand counted it at 45 bytes. That's not a huge cost for X6 speed up of a function.

The integer one is about 100 times faster than the ROM routine for 16 bit numbers, and 50 times faster for 32 bit numbers. It gives an integer result, not a full floating point one. Have a play with them, and see what you think. I think I would recommend leaving the integer one as a #include option. It behaves differently, though it does go like greased lightning!

For those demos, I used the integer version; which is fine for most quick and dirty games calculations.

Play with them, and you should see how they work.

You might want to adjust the source for fsqrt.bas so that it reports an error for negative numbers, though, if you are considering it as a default. Right now it cheats and makes all numbers positive.
Oh yes - I assume that the compiler is only including run times for functions that are used?

I'm a little worried about some of the error messages coming up when I use -O3 that suggests it has issues removing run-time code it isn't using.

Forum Jump:

Users browsing this thread: 1 Guest(s)