
Discussion of proposal 2015-006 (Additional string conversion functionality) #7

Open
rossberg-old opened this issue Aug 16, 2015 · 13 comments

Comments

@rossberg-old

This issue is for discussion of proposal 2015-006 (Additional string conversion functionality).

@JohnReppy
Contributor

I think that the REs for specifying the accepted formats are not the right ones. I propose

StringCvt.BIN   [+~-]?[0-1]+(_[0-1]+)*
StringCvt.OCT   [+~-]?[0-7]+(_[0-7]+)*
StringCvt.DEC   [+~-]?[0-9]+(_[0-9]+)*
StringCvt.HEX   [+~-]?(0x|0X)?[0-9a-fA-F]+(_[0-9a-fA-F]+)*

In other words, allow the underscore as a digit separator, but not as an initial or terminal character
in a numeric literal.

BTW, this change agrees with the way that I wrote up this feature in the Proposed Definition of SuccessorML.
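
To make the proposed formats concrete, here is a minimal Python sketch that checks candidate strings against the four patterns above. The patterns are copied from the proposal; the `FORMATS` table and the `accepts` helper are hypothetical names for illustration, not part of the Basis Library, and whole-string matching with Python's `re` stands in for a real scanner, which would consume a prefix of a character stream.

```python
import re

# Patterns copied from the proposal above. The keys mirror the StringCvt
# radix names but are only labels in this sketch.
FORMATS = {
    "BIN": r"[+~-]?[0-1]+(_[0-1]+)*",
    "OCT": r"[+~-]?[0-7]+(_[0-7]+)*",
    "DEC": r"[+~-]?[0-9]+(_[0-9]+)*",
    "HEX": r"[+~-]?(0x|0X)?[0-9a-fA-F]+(_[0-9a-fA-F]+)*",
}


def accepts(radix: str, text: str) -> bool:
    """Return True if the whole of `text` matches the pattern for `radix`."""
    return re.fullmatch(FORMATS[radix], text) is not None


if __name__ == "__main__":
    samples = [("DEC", "1_000_000"), ("DEC", "_1"), ("DEC", "1_"),
               ("DEC", "1__2"), ("HEX", "0xFF_FF"), ("HEX", "0x_5")]
    for radix, text in samples:
        print(radix, repr(text), accepts(radix, text))
```

Because every underscore must be followed by at least one digit, these patterns reject leading, trailing, and doubled underscores, as well as an underscore directly after the 0x prefix.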

@rossberg
Member

Good point. Shouldn't the underscore at least be allowed after a 0x, though?

@JohnReppy
Contributor

Something like 0x_5 looks weird to me, but perhaps it should be allowed. I just took a look at what Rust allows for numeric literals and it is

[+~-]?[0-9][_0-9]*

for decimal literals and

[+~-]?(0x|0X)?[0-9a-fA-F_]+

for hexadecimal literals (modulo the SML tilde as minus). Are there other examples of languages that use underscore in numeric literals that we might compare to?
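
As a quick check of what the two patterns quoted above accept, here is a small Python sketch. It tests only the regular expressions as transcribed in this comment (with the SML tilde kept); it makes no claim about what the Rust compiler actually does. `RUST_DEC`, `RUST_HEX`, and `matches` are hypothetical names.

```python
import re

# The two patterns exactly as quoted in the comment above.
RUST_DEC = re.compile(r"[+~-]?[0-9][_0-9]*")
RUST_HEX = re.compile(r"[+~-]?(0x|0X)?[0-9a-fA-F_]+")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if `pattern` matches the whole of `text`."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ["1_", "1__2", "_1"]:
        print("dec", repr(text), matches(RUST_DEC, text))
    for text in ["0x_5", "0x_", "_"]:
        print("hex", repr(text), matches(RUST_HEX, text))
```

Read literally, the decimal pattern allows trailing and repeated underscores but not a leading one, while the hexadecimal pattern accepts strings with no digit at all, such as 0x_ or a lone underscore.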

@rossberg
Member

So 0x_ is a legal literal in Rust? That looks weird.

OCaml also introduced them a while ago. It does not seem to allow an underscore after 0x.

Whatever we pick, the rules for word prefixes should be consistent with it.
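
For comparison, the OCaml rule can be sketched in Python as follows. The patterns are an assumption based on one reading of the lexical grammar in the OCaml manual (a digit must come first, after which digits and underscores mix freely); they cover only decimal and hexadecimal integers and ignore suffixes and other radixes. `OCAML_DEC`, `OCAML_HEX`, and `ocaml_accepts` are hypothetical names.

```python
import re

# Assumed transcription of OCaml's integer-literal grammar:
# the first character after the optional sign and prefix is a digit,
# and underscores may appear anywhere after it.
OCAML_DEC = re.compile(r"-?[0-9][0-9_]*")
OCAML_HEX = re.compile(r"-?0[xX][0-9A-Fa-f][0-9A-Fa-f_]*")


def ocaml_accepts(text: str) -> bool:
    """Return True if `text` is a decimal or hex literal under the assumed rule."""
    return any(p.fullmatch(text) for p in (OCAML_DEC, OCAML_HEX))


if __name__ == "__main__":
    for text in ["1_000", "0xFF_FF", "0x5_", "0x_5", "_1"]:
        print(repr(text), ocaml_accepts(text))
```

Under this reading, 0x_5 is rejected because a digit must directly follow the prefix, while a trailing underscore as in 0x5_ is accepted.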

@JohnReppy
Contributor

The reference states

A hex literal starts with the character sequence U+0030 U+0078 (0x) and continues as any mixture of hex digits and underscores.

but the implementation may do something different.

@JohnReppy
Contributor

Java only allows underscores between digits, but allows an arbitrary number of them. see here.
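
The Java rule described above (underscores only between digits, any number of them) can be written as a single pattern. This Python sketch covers only the digit part of a decimal literal; it deliberately ignores Java's radix prefixes, its leading-zero octal rule, and suffixes such as L. `JAVA_DIGITS` and `java_digits_ok` are hypothetical names.

```python
import re

# A digit, optionally followed by any mix of digits and underscores
# that ends in a digit. Runs of underscores are allowed in the middle.
JAVA_DIGITS = re.compile(r"[0-9]([0-9_]*[0-9])?")


def java_digits_ok(text: str) -> bool:
    """Return True if `text` fits the between-digits-only underscore rule."""
    return JAVA_DIGITS.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ["7", "1_2_3", "1__000", "1_", "_1"]:
        print(repr(text), java_digits_ok(text))
```

The only difference from the patterns proposed earlier in this thread is that a run such as 1__000 is accepted.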

@olopierpa

On Tue, Aug 18, 2015 at 8:54 AM, John Reppy wrote:

Are there other examples of languages that use underscore in numeric
literals that we might compare to?

Ada (not at start of digits, not at end of digits, no two consecutive
underscores)
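
The three Ada conditions listed above describe the same set of digit strings as a pattern of the form [0-9]+(_[0-9]+)*. The Python sketch below checks that claim by brute force over short strings; it looks only at the digit sequence and ignores Ada's based literals and exponents. `ada_rule`, `SEPARATED`, and `agree_up_to` are hypothetical names.

```python
import re
from itertools import product


def ada_rule(text: str) -> bool:
    """Digits and underscores only, no underscore at either end, no '__'."""
    if not text or any(c not in "0123456789_" for c in text):
        return False
    return not (text.startswith("_") or text.endswith("_") or "__" in text)


# Digit groups separated by single underscores.
SEPARATED = re.compile(r"[0-9]+(_[0-9]+)*")


def agree_up_to(max_len: int) -> bool:
    """Compare both formulations on every string over {'1', '_'} up to max_len."""
    for n in range(max_len + 1):
        for chars in product("1_", repeat=n):
            text = "".join(chars)
            if ada_rule(text) != (SEPARATED.fullmatch(text) is not None):
                return False
    return True


if __name__ == "__main__":
    print(agree_up_to(8))
```

An exhaustive check over a two-character alphabet is enough here because neither formulation distinguishes one digit from another.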

@larsbergstrom

@rossberg 0x_ in Rust generates the error "no valid digits found for number". See: https://play.rust-lang.org/?gist=ac5c4c71e296bed79a0e&version=stable

I think the text in the Rust language reference could be clearer on this point: https://doc.rust-lang.org/stable/reference.html#integer-literals
CC @nmatsakis

@JohnReppy
Contributor

So it looks like the wording in the Rust reference should be

A hex literal starts with the character sequence U+0030 U+0078 (0x) and continues as any mixture of hex digits and underscores containing at least one digit.
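
The revised wording above can be turned into a pattern directly: a 0x prefix, then any mixture of hex digits and underscores in which at least one hex digit appears. The Python sketch below encodes that sentence and nothing more; it is not taken from the Rust compiler, and `HEX_AT_LEAST_ONE_DIGIT` and `hex_ok` are hypothetical names.

```python
import re

# "0x", then digits and underscores in any order, with one mandatory
# hex digit somewhere in the mixture.
HEX_AT_LEAST_ONE_DIGIT = re.compile(
    r"0x[0-9a-fA-F_]*[0-9a-fA-F][0-9a-fA-F_]*"
)


def hex_ok(text: str) -> bool:
    """Return True if `text` fits the revised wording."""
    return HEX_AT_LEAST_ONE_DIGIT.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ["0x_5", "0x5_", "0x_", "0x__", "0x"]:
        print(repr(text), hex_ok(text))
```

This version still accepts 0x_5 and 0x5_, so it is more permissive than the separator-only patterns proposed at the top of the thread.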

@MatthewFluet

In the MLton implementation (https://github.com/MLton/mlton/blob/master/mlton/front-end/ml.lex#L152), we require leading and trailing digits and no consecutive underscores. As I recall, there is no ambiguity or lexical trouble in allowing leading, trailing, or repeated underscores, although doing so has a very minor impact on backwards compatibility:

fun foo _1__2_ = true
   | foo _ _ _ _ _ _ = false
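
The compatibility issue in the example above comes from how the parameter text _1__2_ is split into tokens. The toy tokenizer below is a Python sketch, not MLton's lexer: it knows only two kinds of token (an integer literal and the wildcard _) and compares a strict literal rule with one hypothetical permissive rule. `STRICT`, `PERMISSIVE`, and `tokens` are illustrative names.

```python
import re

# Strict rule, as described for MLton above: digits at both ends and
# single underscores between digit groups. A bare "_" is a wildcard.
STRICT = re.compile(r"[0-9]+(?:_[0-9]+)*|_")

# One possible permissive rule (an assumption for this sketch):
# underscores anywhere, provided the token contains at least one digit.
PERMISSIVE = re.compile(r"[0-9_]*[0-9][0-9_]*|_")


def tokens(rule: re.Pattern, text: str) -> list:
    """Split `text` into the tokens recognised by `rule`, left to right."""
    return rule.findall(text)


if __name__ == "__main__":
    print(tokens(STRICT, "_1__2_"))
    print(tokens(PERMISSIVE, "_1__2_"))
```

Under the strict rule the text splits into six tokens, which matches the six wildcards in the second clause of foo; under the permissive rule it becomes a single literal, so the two clauses would no longer take the same number of arguments.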

@JohnReppy
Contributor

This discussion is better attached to the Successor ML Definition; the Basis Library should follow whatever we agree to do there. I've created an issue for that purpose.

@JohnReppy
Contributor

Closed by mistake.

@JohnReppy JohnReppy reopened this Aug 18, 2015
@RobertHarper
Contributor

it’s very minor, but it might be worth it to be compatible with ocaml’s notation. i can’t really say why, except that further divergence isn’t helpful.

bob


Bob Harper
Carnegie Mellon CSD

