There should be case sensitivity options and entry shortcut/tokens #12

Open
erroneus0 opened this issue Mar 8, 2021 · 7 comments

Comments

@erroneus0

Commodore 64 BASIC had two display modes: one where all characters were displayed in upper-case (and shift-KEY displayed a graphical character), and another where all characters were displayed in lower-case and shift-KEY displayed an upper-case letter.

When in "upper-case mode" (the default) you were actually writing in lower-case.

All that said, the entry of lines and commands should be case insensitive. The interpreter should also treat commands and variable names as "monocase", which is to say upper-case as far as the BASIC interpreter is concerned.

So all entries not within quotes or following a REM command should be converted into uppercase upon entry. This way the program listing is correct.

Additionally, token shortcuts like "?" for "PRINT" should be supported. Not that I would expect "POKE" to work, but the shortcut for the POKE command would be pO (P and Shift-O).

@pentalive

I would like to second this suggestion:
So all entries not within quotes or following a REM command should be converted into uppercase upon entry. This way the program listing is correct.

@mist64
Owner

mist64 commented Sep 23, 2022

Remember that cbmbasic is a statically recompiled version of the unmodified 6502 implementation in the C64 ROM, bundled with C support code to properly interface with the Unix/Windows command line.

In standard ASCII, 'A' is 0x41 and 'a' is 0x61. On a C64, typing an "unshifted" 'A' (upper case in upper/graphics mode and lower case in upper/lower mode) produces code 0x41, and a shifted 'A' produces 0xc1. The code in this project that sends Unix command codes to the interpreter does not do any character conversion, so uppercase characters become "unshifted" characters (so uppercase keywords like PRINT work), and lower case characters become key codes (0x61..0x7a) that BASIC will happily just store as what they are, but they are not recognized as either upper or lower case.

As for the suggestion that the C support code should make all lowercase input upper case: Besides quoted text and REM statements, another problem is that no conversion should happen for text that is input while the program is running. A possible approach would be:

  • no conversion when in quote = 1 (every quote toggles it, RETURN resets it)
  • no conversion after REM is typed, until the end of the line, except in quotes
  • no conversion when the program is running (there is a flag in zero page for direct mode)

Doable, but tricky. And not very pretty, since the C level will be amended with knowledge about some internal details of how BASIC works.
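
The three rules above can be sketched in C roughly as follows. This is only an illustration, not code from cbmbasic: the function name upcase_basic_line and the program_running parameter are hypothetical (the real direct-mode flag lives in zero page), and REM is detected with a naive substring match rather than the ROM's own tokenizer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of cbmbasic): uppercase one entered line
 * in place, leaving quoted text and everything after REM untouched.
 * program_running is an assumed stand-in for BASIC's direct-mode flag. */
void upcase_basic_line(char *line, int program_running)
{
    int in_quote = 0;   /* toggled by every '"'; starts at 0 for each line */

    assert(line != NULL);
    if (program_running)
        return;         /* text typed at INPUT/GET must stay as entered */

    for (size_t i = 0; line[i] != '\0'; i++) {
        unsigned char c = (unsigned char)line[i];
        if (c == '"') {
            in_quote = !in_quote;
            continue;
        }
        if (in_quote)
            continue;
        line[i] = (char)toupper(c);
        /* Naive REM detection: the last three unquoted characters spell REM.
         * Everything after it is a comment, so stop converting. */
        if (i >= 2 && memcmp(&line[i - 2], "REM", 3) == 0)
            break;
    }
}
```

Calling it on `10 print "Hello": rem keep This` with program_running set to 0 uppercases only `print` and `rem`. The substring match also fires on identifiers that merely contain the letters REM, which is one reason this approach is tricky to get right outside the interpreter.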

Here's a different suggestion: PETSCII mode.

  • All input does PETSCII conversion, i.e. lower case input (i.e. unshifted) gets converted to codes 0x41..0x5a, upper case (shifted) to 0xc1..0xda; everything else stays the same.
  • Output does the opposite conversion (plus 0x61..0x7a -> upper case).

Keywords will be displayed as lower case, but the user can type them unshifted, and shortcuts like "pO" will work.
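
A minimal sketch of the two mappings described above, assuming a per-byte conversion at the C level. The function names are hypothetical and do not exist in cbmbasic; only the code ranges come from the description.

```c
#include <assert.h>

/* Input direction: host ASCII byte -> code handed to BASIC. */
unsigned char ascii_to_petscii(unsigned char c)
{
    if (c >= 0x61 && c <= 0x7a)          /* lower case = unshifted */
        return (unsigned char)(c - 0x20);
    if (c >= 0x41 && c <= 0x5a)          /* upper case = shifted */
        return (unsigned char)(c + 0x80);
    return c;                            /* everything else stays the same */
}

/* Output direction: code printed by BASIC -> host ASCII byte. */
unsigned char petscii_to_ascii(unsigned char c)
{
    if (c >= 0x41 && c <= 0x5a)          /* unshifted -> lower case */
        return (unsigned char)(c + 0x20);
    if (c >= 0xc1 && c <= 0xda)          /* shifted -> upper case */
        return (unsigned char)(c - 0x80);
    if (c >= 0x61 && c <= 0x7a)          /* alternate shifted range -> upper case */
        return (unsigned char)(c - 0x20);
    return c;
}

/* Sanity check: every ASCII letter survives input followed by output. */
void check_letter_roundtrip(void)
{
    for (int c = 'A'; c <= 'Z'; c++)
        assert(petscii_to_ascii(ascii_to_petscii((unsigned char)c)) == c);
    for (int c = 'a'; c <= 'z'; c++)
        assert(petscii_to_ascii(ascii_to_petscii((unsigned char)c)) == c);
}
```

With this mapping, typing `pO` sends 0x50 followed by 0xcf, which is the unshifted-P, shifted-O sequence that the keyword abbreviation relies on.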

The downside of this mode is that BASIC can no longer be used to work with text that is neither PETSCII nor 7-bit-ASCII: The current version will happily do Unicode:

10 PRINT"Hällo Wörld!"

The conversion (especially the part where both 0x61..0x7a and 0xc1..0xda represent shifted in PETSCII) will break Unicode.
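
To make the collision concrete: in UTF-8, "ä" is the byte pair 0xc3 0xa4 and "ö" is 0xc3 0xb6, and the lead byte 0xc3 falls inside the shifted range 0xc1..0xda. The sketch below (a hypothetical helper, not part of cbmbasic) counts the bytes in a string that an output conversion of that range would rewrite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count bytes that lie in the PETSCII shifted range
 * 0xc1..0xda. In a UTF-8 string these are multi-byte lead bytes, and an
 * output conversion would turn them into plain ASCII letters. */
size_t count_bytes_in_shifted_range(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (c >= 0xc1 && c <= 0xda)
            n++;
    }
    return n;
}
```

For the UTF-8 encoding of `Hällo Wörld!` the count is 2: each 0xc3 lead byte would be emitted as 0x43 ('C'), leaving a stray continuation byte behind and producing invalid UTF-8.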

That's why it should be a mode (e.g. cbmbasic -p) if we decide to do this.

(Note: The ? shortcut works right now.)

@mist64
Owner

mist64 commented Sep 23, 2022

And another alternative suggestion: Only when supplying a .bas file as an argument, parse the full file in C code, fix keyword case, maybe even tokenize it. Then feed it to the BASIC interpreter. In fact, I believe this is exactly what petcat already does, so an option to feed .bas files through this tool could be added.

@TheGeekOnSkates

Hey there,

First off, let me say, I LOVE this project! It's insanely cool, and it inspired me to create my own BASIC, Breakaway BASIC (currently in version 0.4 I think - nowhere near as robust or portable as yours). In addition to being fun and retro (and nostalgic for those who had a C64 or other 8-bit computer growing up - which is actually not me), it has some really interesting practical applications for scripting. Its syntax is a lot more human-readable, easy-to-follow, and generally just more sane than i.e. Bash (which also played a role in me building a BASIC of my own, lol). But your BASIC has file I/O, multi-instruction lines (using the ":" delimiter) and a whole lot more. My BASIC... well, I'm still trying to implement simple stuff like string variables. 😆 So to @mist64 You sir are the man! Bottom line, I admire the heck out of this project and want to get involved. Just forked it.

Having said that... a few thoughts and questions come to mind:

  1. First of all, my BASIC has some features I wish this BASIC had. Lowercase is one of them, and I won't get into the others cuz those are other "issues" entirely. Which kinda begs the question, how important is it that there be no OS-dependent code? I'm all for portability, but it would be nice if pressing up-arrow worked like in other shells (sorry, guess I lied, I did bring up another "issue" entirely 😆 ).
  2. Specific to lowercase... I really like the PETSCII mode idea. All the characters in PETSCII have Unicode equivalents (see one of my other projects, the Geek-Rig). And I mean, if you really want to support other characters (and I get that, the whole i18n thing), maybe you could have a way to redefine them. lol, obviously I don't mean like on the C64 where you could change what they look like; I mean, a way to assign character codes to key codes. Like okay, idk how your POKE works (or if it even has one), but I could see something like POKE 0, 128512 (1F600 hex) and then doing PRINT CHR$(0) would create a smiley emoji instead of an "@". Of course this would mean allowing numbers > 65535, which might not be possible, but it was just a thought.
  3. Most importantly: If I want to look at your code, maybe take on PETSCII mode, or maybe just add some of the stuff my BASIC had (not necessarily to contribute, though it could be for that), where would I start? What's the right way to get to know your code? There are a lot of files to go over, and I'll be messing with this after-hours (I have a day job, also writing code, lol). And I don't really know anything about how Commodore BASIC works under the hood. I didn't even know that info was available anywhere. I would love to find out - for my own BASIC at least, if not contributing here - but again, not having a ridiculous amount of spare time doesn't help.

Anyway, thanks again for the amazing project! cbmbasic is awesome!

@ratboy666

ratboy666 commented Nov 17, 2022 via email

@secristr

secristr commented Nov 17, 2022 via email

@TheGeekOnSkates

@ratboy666 Awesome! I just forked your BASIC and I'm looking forward to messing with it. It might be easier to understand (I didn't see nearly as many files, lol). I did go on to read the "internals" section of cbmbasic's README, and it sounds like yours is more my speed... but cbmbasic is still awesome. 😄

PS: Your nickname is hilarious! "the Geek on Skates" is an inside joke only my family and closest friends would know, so I can't even imagine what "ratboy666" means. 😆


6 participants