String Slicing (*solved*)
#1
Not sure whether this is as intended, but it's not quite compatible with Sinclair Basic:

LET K$="1234567890"
PRINT K$(TO 3)
PRINT K$( 3 TO)

All seems to compile - though strings are numbered from 0, whereas Sinclair basic numbers them from 1.

So ZX Basic gives the output:

1234, where Sinclair basic would give the output 123

If the aim is to be able to compile Sinclair Basic, we're not there with that one. (Perhaps a switch, or configurable option?)

Also:

Print K$(3) fails. We would need to use k$(3 TO 3). Is this a bug?

PRINT K$( TO ) also fails, and Sinclair Basic allows it. It's an unusual corner case, though - it means the same thing as just K$. If it were to be supported for compatibility, a preprocessor tweak ought to be able to optimize it away before the compiler sees it.
#2
britlion Wrote:Not sure whether this is as intended, but it's not quite compatible with Sinclair Basic:

LET K$="1234567890"
PRINT K$(TO 3)
PRINT K$( 3 TO)

All seems to compile - though strings are numbered from 0, whereas Sinclair basic numbers them from 1.
That's right. They start from 0. The reason is that computing a string index is a little faster when it starts from 0 (otherwise, a "dec" asm instruction would be needed). In the future, the --sinclair compatibility flag will enable strings starting from 1 (as it currently does with arrays).
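
To make the indexing difference concrete, here is a minimal sketch in Python (used purely for illustration; it is not ZX Basic code). The helper name `slice_inclusive` and its `base` parameter are assumptions made up for this example: `base=0` mimics the behaviour described in this thread, `base=1` mimics Sinclair Basic.

```python
def slice_inclusive(s, first=None, last=None, base=0):
    """Return s(first TO last) with both bounds inclusive.

    base=0 mimics the 0-based slicing described in this thread,
    base=1 mimics Sinclair Basic. Omitted bounds default to the
    start and the end of the string.
    """
    lo = base if first is None else first
    hi = len(s) - 1 + base if last is None else last
    # Convert to Python's 0-based, end-exclusive slice.
    return s[lo - base : hi - base + 1]


k = "1234567890"
print(slice_inclusive(k, last=3, base=0))   # K$(TO 3), 0-based: 1234
print(slice_inclusive(k, last=3, base=1))   # K$(TO 3), 1-based: 123
print(slice_inclusive(k, first=3, base=0))  # K$(3 TO), 0-based: 4567890
print(slice_inclusive(k, first=3, base=1))  # K$(3 TO), 1-based: 34567890
```

The only difference between the two modes is the `- base` adjustment, which corresponds to the extra decrement mentioned above.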

Quote:So ZX Basic gives the output:

1234, where Sinclair basic would give the output 123

If the aim is to be able to compile Sinclair Basic, we're not there with that one. (Perhaps a switch, or configurable option?)
Well, that's the idea above. On the other hand, the compiler tries to be somewhat Sinclair compatible, but it's more similar to FreeBasic. The reason is that FreeBasic, like ZX Basic, is a standard and is oriented to compiled execution, whilst Sinclair Basic is designed more to be used as an interpreted language. Hence, emulating some Sinclair Basic instructions like VAL or VAL$ is almost impossible (VAL$ is not implemented, and VAL behaves as in FreeBasic).

Quote:Also:

Print K$(3) fails. We would need to use k$(3 TO 3). Is this a bug?

PRINT K$( TO ) also fails, and Sinclair Basic allows it. It's an unusual corner case, though - it means the same thing as just K$. If it were to be supported for compatibility, a preprocessor tweak ought to be able to optimize it away before the compiler sees it.
Could be a grammar bug. I will try to include these grammar cases.

Thanks a lot for your feedback.
#3
Wouldn't it produce faster code to get the preprocessor to do it?

That is if the --sinclair flag is on:

turn f$( TO ) -> f$

and f$( 7 TO 10) -> f$(6 TO 9)

before the compiler sees it.

This saves the compiled code from having to make the adjustments at run time, yes?

Yes - I know, if it's f$(a TO b) - we can't do that so easily, and we'll have to dec the number before we use it; which means that being less compatible will be faster code. But it's a thought to optimization if the string is sliced by constants.
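
A rough sketch of the constant-only rewrite suggested above, written in Python purely for illustration. The function name `rewrite_constant_slices` and the regular expression are hypothetical, not part of the actual ZX Basic preprocessor, and the pass is deliberately naive: it does not skip string literals or REM comments, and it leaves slices with non-constant bounds untouched.

```python
import re

# Matches name$( [const] TO [const] ) where the bounds are integer literals.
_SLICE = re.compile(r"([A-Za-z]\w*\$)\(\s*(\d*)\s*TO\s*(\d*)\s*\)", re.IGNORECASE)


def rewrite_constant_slices(line):
    """Rewrite 1-based constant slices into 0-based ones (hypothetical pass)."""
    def fix(match):
        name, lo, hi = match.group(1), match.group(2), match.group(3)
        if not lo and not hi:
            return name                       # f$( TO ) -> f$
        lo = str(int(lo) - 1) if lo else ""   # shift each given bound down by one
        hi = str(int(hi) - 1) if hi else ""
        left = lo + " " if lo else ""
        right = " " + hi if hi else ""
        return f"{name}({left}TO{right})"
    return _SLICE.sub(fix, line)


print(rewrite_constant_slices("PRINT f$( 7 TO 10)"))  # PRINT f$(6 TO 9)
print(rewrite_constant_slices("PRINT f$( TO )"))      # PRINT f$
print(rewrite_constant_slices("PRINT f$(a TO b)"))    # unchanged
```

Slices such as f$(a TO b) fall through unchanged, so they would still need the decrement at run time.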

You probably can't preprocess f$(7) so easily, because I suspect the preprocessor couldn't know if it's an array or a string variable. In this case, the compiler needs to be able to recognize that's a fair format for string variables, and means a slice of it.

Incidentally, I found out that fixed size strings don't seem to be supported:

Dim a$(10,5) makes a 2D array of variable length strings, not 10 strings of length 5; so you can't access a$(3) as you can in Sinclair basic.

It's a niggle, but I suspect that there are compiler optimizations that could be made if we know the strings are fixed in length - and, of course, if our aim is ever to be able to straight compile any Sinclair Basic program, it's another incompatibility issue.

I'm not sure if the compiler should really ever be able to deal with a Sinclair program with no modifications, but it's one possible goal.


For the record, Boriel; I come here and post problems and bug issues. It sounds like I only complain.
But I'm a BIG fan of the work, I think it's awesome. Don't ever forget that! You've done a fantastic job so far!
#4
britlion Wrote:Wouldn't it produce faster code to get the preprocessor to do it?

That is if the --sinclair flag is on:

turn f$( TO ) -> f$

and f$( 7 TO 10) -> f$(6 TO 9)

before the compiler sees it.

This saves the compiled code from having to make the adjustments at run time, yes?
Not exactly. The compiler will try to work everything out at compile time (not runtime). It will recognise f$( TO ) --> f$ and simply ignore the '(TO)' part. The problem with f$(<expression>), e.g. f$(4 + 5*8 - int(sqrt(a))), is that it could mean either a function call (a function named f$) or a slice of an alphanumeric variable, f$( x ). But I think it can be done regardless of this little problem: the compiler should be smart enough to handle it. I will try to implement it.

Quote:Yes - I know, if it's f$(a TO b) - we can't do that so easily, and we'll have to dec the number before we use it; which means that being less compatible will be faster code. But it's a thought to optimization if the string is sliced by constants.

You probably can't preprocess f$(7) so easily, because I suspect the preprocessor couldn't know if it's an array or a string variable. In this case, the compiler needs to be able to recognize that's a fair format for string variables, and means a slice of it.

Incidentally, I found out that fixed size strings don't seem to be supported:
That's right. But I wanted to make it FreeBasic compatible. On the other hand, FreeBasic has a fixed-length string, declared as String * N. We can discuss it in the future, if you find this feature is needed.

The syntactic ambiguity you mentioned, f$(7), could be resolved by the compiler using semantic information (the variable's data type). By default it will be taken as a function call, unless you have previously declared it as an array of strings (using DIM). Function names can also carry the dollar sign (it's optional, by the way) for alphanumeric functions.
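
As a sketch of the idea (in Python, for illustration only), the decision could be driven by a table of previously declared names. The function `classify_call`, the `symbols` table and its kind labels are all hypothetical, and treating a declared string variable as a slice is an assumption based on the rest of this thread, not something confirmed about the compiler's internals.

```python
def classify_call(name, symbols):
    """Decide what name(<expression>) means, given the declared symbols.

    symbols maps an identifier to "array", "string" or "function"
    (a made-up table standing in for the compiler's semantic information).
    """
    kind = symbols.get(name)
    if kind == "array":
        return "array element"    # previously declared with DIM
    if kind == "string":
        return "string slice"     # assumption: declared string variable
    return "function call"        # the default described above


symbols = {"a$": "array", "k$": "string"}
print(classify_call("a$", symbols))  # array element
print(classify_call("k$", symbols))  # string slice
print(classify_call("f$", symbols))  # function call
```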

Quote:Dim a$(10,5) makes a 2D array of variable length strings, not 10 strings of length 5; so you can't access a$(3) as you can in Sinclair basic.
That's right. You will have to do something like this:
Code:
DIM a$(10)
FOR i = 0 TO 10 : REM a$ lower bound is 0, upper bound is 10. Otherwise use DIM a$(1 TO 10) or the --array-base=1 option.
     a$(i) = "     ":  REM 5 spaces
NEXT i

PRINT a$(3 TO 3) : REM Will try to allow a$(3) in the future

Quote:It's a niggle, but I suspect that there are compiler optimizations that could be made if we know the strings are fixed in length - and, of course, if our aim is ever to be able to straight compile any Sinclair Basic program, it's another incompatibility issue.
I will try to implement it. :| But I don't know if I can do it. The compiler was originally targeted at game programming (even text adventure games), so I allowed dynamic strings, because having an array of 100 elements of 20 bytes each is a waste of memory considering the 48K the ZX has.

Quote:I'm not sure if the compiler should really ever be able to deal with a Sinclair program with no modifications, but it's one possible goal.
I try to get closer, but it's almost impossible in some cases. For example, this cannot be guessed at compile time, so it always has to be interpreted:
Code:
10 LET a = 5
20 LET a$ = "a"
30 IF RND > 0.5 THEN LET a$ = a$ + "+a"
40 PRINT VAL a$
Run it on the ZX Spectrum several times. It should print 5 or 10 at random. Computing the value of a$ means parsing the content of a$ at runtime, hence it must be interpreted.
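
The same point can be sketched in Python (for illustration only): reproducing VAL needs an expression evaluator and the variable table available at run time. The helper names `val` and `run` are made up for this example, and Python's `eval` merely stands in for the interpreter a compiled program would have to carry. Do not use `eval` like this on untrusted input.

```python
import random


def val(expr, variables):
    """Rough stand-in for Sinclair Basic's VAL: evaluate expr at run time."""
    # Empty __builtins__ so that only the supplied variables can be resolved.
    return eval(expr, {"__builtins__": {}}, dict(variables))


def run(rnd=random.random):
    a = 5                         # 10 LET a = 5
    a_str = "a"                   # 20 LET a$ = "a"
    if rnd() > 0.5:               # 30 IF RND > 0.5 THEN LET a$ = a$ + "+a"
        a_str = a_str + "+a"
    return val(a_str, {"a": a})   # 40 PRINT VAL a$


print(run())  # 5 or 10, depending on the random draw
```

The string handed to `val` is only known once the program runs, so nothing about it can be resolved ahead of time.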

Quote:For the record, Boriel; I come here and post problems and bug issues. It sounds like I only complain.
But I'm a BIG fan of the work, I think it's awesome. Don't ever forget that! You've done a fantastic job so far!
Thanks a lot for your appreciation (really!). And no problem: I know you are just commenting for improvements. Really, feedback is *VERY* important to me, because many compiler bugs are not detected until someone tries to compile some source code (I've run many, many tests, but it's not enough; more testing is needed, and that happens by using it!)

BTW: At the time of writing this, F$(TO) is already done!

UPDATE: I've finally implemented F(), F(TO), and F(<expression>) for strings. It will be available in 1.1.7.
