Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define ordering and equality on Char #306

Merged
merged 4 commits into from
Dec 7, 2020

Conversation

oxinabox
Copy link
Contributor

@oxinabox oxinabox commented Dec 6, 2020

This closes #301
It defines a built in operation codepoint that should return the ASCII code point.
Ideally it would return the unicode codepoint, but we are a fair way from unicode support, see #302.
So rright now for characters outside the ASCII character set it returns nonsense.

@google-cla
Copy link

google-cla bot commented Dec 6, 2020

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Dec 6, 2020
@oxinabox oxinabox changed the base branch from main to dev December 6, 2020 17:35
prelude.dx Outdated Show resolved Hide resolved
@google-cla
Copy link

google-cla bot commented Dec 6, 2020

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

prelude.dx Outdated Show resolved Hide resolved
src/lib/Imp.hs Outdated Show resolved Hide resolved
src/lib/Type.hs Outdated Show resolved Hide resolved
@google-cla google-cla bot added cla: yes and removed cla: no labels Dec 6, 2020
src/lib/Type.hs Outdated
@@ -728,6 +728,9 @@ typeCheckOp op = case op of
i |: TC (IntRange (IdxRepVal 0) (IdxRepVal $ fromIntegral vectorWidth))
return $ BaseTy $ Scalar sb
ThrowError ty -> ty|:TyKind $> ty
-- TODO: type check that c is a character
-- TODO: this should really be a 32 bit integer for unicode reasons: 8 bit is just 1 codeunit
CodePoint c -> return $ BaseTy $ Scalar Int8Type
Copy link
Contributor Author

@oxinabox oxinabox Dec 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably there is more checking i should do to make sure the c is a character not some other type?
I don't know how though.

Right now i get an error that crashes the repl if i try to do nonsense

>=> %codePoint 1
dex: src/lib/Imp.hs:(194,27)-(275,2): Non-exhaustive patterns in Con (CharCon x)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the check would be something like c |: CharTy. But you might still get a compiler error, just an earlier one, if you write %codePoint 1. The %-prefixed names are supposed to be wrapped in the prelude and not used directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does c |: CharTy do?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It checks that c has type CharTy. (|:) is defined here and CharTy is defined here.

Copy link
Contributor Author

@oxinabox oxinabox Dec 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, now we get a better error that doesn't crash the REPL

>=> %codePoint 'a'
97
>=> %codePoint 1
Compiler bug!
Please report this at github.com/google-research/dex-lang/issues

assertion failure ():
Char != Int32

checking decl:
tmp:Int8 = %codePoint 1
Checking body types
Checking module:
Module (Typed)
  unevaluated decls:
  tmp:Int8 = %codePoint 1
  tmp1:Int8 = tmp
  _ans_:Int8 = tmp1

  evaluated bindings:
  ()

Though still i guess not great since it says it is a compiler bug.
But still noone should use %codePoint directly anyway, we have codepoint in the prelude for that, as you say.

examples/eval-tests.dx Outdated Show resolved Hide resolved
@oxinabox
Copy link
Contributor Author

oxinabox commented Dec 6, 2020

I think I have confused the bot a lot.

Copy link
Collaborator

@dougalm dougalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

src/lib/Type.hs Outdated
@@ -728,6 +728,9 @@ typeCheckOp op = case op of
i |: TC (IntRange (IdxRepVal 0) (IdxRepVal $ fromIntegral vectorWidth))
return $ BaseTy $ Scalar sb
ThrowError ty -> ty|:TyKind $> ty
-- TODO: type check that c is a character
-- TODO: this should really be a 32 bit integer for unicode reasons: 8 bit is just 1 codeunit
CodePoint c -> return $ BaseTy $ Scalar Int8Type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the check would be something like c |: CharTy. But you might still get a compiler error, just an earlier one, if you write %codePoint 1. The %-prefixed names are supposed to be wrapped in the prelude and not used directly.

@apaszke apaszke merged commit 6f79a59 into google-research:dev Dec 7, 2020
@oxinabox oxinabox deleted the ox/codepoint branch December 7, 2020 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Equality/Ordering for Characters
3 participants