Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infer method calls on known types and warn about nonexistent methods #11399

Open
wants to merge 137 commits into
base: develop
Choose a base branch
from

Conversation

radeusgd
Copy link
Member

Pull Request Description

  • Implements Inferring types of method calls on Atoms #9812
  • Allows us to propagate type inference through method calls, at least in a basic way.
  • Adds a 'lint warning': when calling a method on an object of a known specific (non-Any) type that does not exist on that type, a warning is reported, because such a call would always result in No_Such_Method error at runtime.
    • Any has special behaviour - if x : Any we don't know what x might be, so we allow all methods on it and defer to the runtime to see if they will be valid or not.
    • This check is not proving correctness, but instead it's only triggering for 'provably wrong' situations.

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.
  • If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries,
    or the Snowflake database integration, a run of the Extra Tests has been scheduled.
    • If applicable, it is suggested to paste a link to a successful run of the Extra Tests.

…ce tests

Ideally, the static analysis tests should be runnable without the Graal runtime context. We'd like to be able to run these static passes without Graal runtime, e.g. inside of code editors. But refactoring import resolution and PackageRepository to work without Graal is out of scope of the current types work. It can be planned for later, when needed. For now the focus is to make type checking provide actual value - before that any editor integration wouldn't be useful anyway.
@radeusgd
Copy link
Member Author

radeusgd commented Oct 28, 2024

Benchmarks scheduled:

GitHub
Hybrid visual and textual functional programming. Contribute to enso-org/enso development by creating an account on GitHub.
GitHub
Hybrid visual and textual functional programming. Contribute to enso-org/enso development by creating an account on GitHub.

@radeusgd radeusgd linked an issue Oct 29, 2024 that may be closed by this pull request
2 tasks
Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the (amount of) @Builtin_Method changes.

@@ -129,6 +129,10 @@ type Comparable

new value:Any comparator:Any -> Comparable = Comparable.By value comparator

## PRIVATE
less_than_builtin : Any -> Any -> Boolean | Nothing
less_than_builtin = @Builtin_Method "Comparable.less_than_builtin"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this method. @Builtin_Methods are supposed to be only in private modules. Moreover there already is such a less_than_builtin in Ordering_Helpers.enso.

Copy link
Member Author

@radeusgd radeusgd Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this method is indeed wrong, as I don't see a Comparable.less_than_builtin builtin in Java.

It was probably an artifact from some merges that happened along the way (or just my oversight). I'll remove this one.

@@ -37,8 +37,10 @@ type Default_Comparator
## PRIVATE
hash : Number -> Integer
hash x = Default_Comparator.hash_builtin x

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just trying to apply consistent formatting.

@@ -91,6 +91,9 @@ type Project_Description
namespace : Text
namespace self = self.ns

## PRIVATE
enso_project_builtin module = @Builtin_Method "Project_Description.enso_project_builtin"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't put Builtin_Methods into non-private modules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@radeusgd has a good point. The corresponding builtin method is already defined as EnsoProjectNode. Thus, this code does not introduce any changes. It seems that BuiltinFunction.isAutoRegister makes it possible to defined a builtin method in Java without any shadow definition in Enso code. I think having these kind of shadow definitions is beneficial, as we can immediately see all the definitions of builtin methods on a type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's how auto register was supposed to work.

@@ -832,6 +832,12 @@ type File
to_display_text : Text
to_display_text self = self.to_text

## PRIVATE
copy_builtin self target options = @Builtin_Method "File.copy_builtin"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing all these (undesirable) @Builtin_Methods changes, I wonder how are they related to type inference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no changes to builtin methods structure.

I'm only reproducing in Enso code the structure that has already been there in the builtins. That is - we already were having calls to File.copy_builtin or Comparable.less_than_builtin in the code, they were just lacking the related definitions.

If that structure is undesirable, we should restructure the builtins, but that is not really in the scope of the PR. But if you will not allow me to merge it without that, I will probably be forced to do this unrelated change.

I wonder how are they related to type inference?

I've added checking of method calls. What methods are available on an object was taken from the type definitions. If a builtin method was called e.g. File.copy_builtin but there was no copy_builtin defined on File, my type inference was giving warnings about calling nonexstent methods. I don't think builtins shall be a special case (apart from a naming convention I cannot easily tell if a method is a builtin or doesn't exist). Most of our builtins already had such 'shadow definitions', there was just a few that were lacking and I've added these missing ones. See for example, File.input_stream_builtin - it's been there for a long time. I think we should do a proper refactor of builtins to move them into private modules. But that's the #6282 ticket.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, these changes are not absolutely required for the type inference changes. They are there to make the tests pass, but I can @Ignore a few tests and revert these changes so that we can handle them separately. Maybe I should have done them as a separate PR like with #11422

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like these kind of shadow definitions of builtin methods. They don't (at least they should not) introduce any changes (that is, they don't introduce any new builtins) and they make it easy to see which builtin methods are defined on a type. It may be worth revisiting BuiltinMethod.autoRegister annotation argument.

See my other comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a builtin method was called e.g. File.copy_builtin but there was no copy_builtin defined on File, my type inference was giving warnings about calling nonexstent methods.

I see. That's indeed related.

I don't think builtins shall be a special case

Builtin methods are special case! They are not behaving as normal methods.

we should do a proper refactor of builtins to move them into private modules. ... that's the #6282 ticket.

... shadow definitions of builtin methods ... They don't introduce any changes ...

OK.

worth revisiting BuiltinMethod.autoRegister annotation argument.

Builtins...carry a autoRegister property to determine if a builtin method should be automatically registered with the type.

What's the benefit of registering a builtin automatically with the type, @hubertp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while ago but IIRC that was specifically requested either by you or @kustosz, so that we don't have to write all those builtin methods as it is done now in this PR.

Copy link
Member

@Akirathan Akirathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good overall. Published few minor comments. How am I supposed to actually run the static type checker from the command line? I have tried

enso --enable-static-analysis --no-ir-caches --compile ~/tmp/Proj

on a project with this single source Main.enso:

from Standard.Base import all

foo x:Integer -> Integer =
    x + 1

main =
    foo "Hello"

and expected an error, but it normally compiles.

Moreover, do we have an issue that tracks integration of static type checking into our tests?

@@ -3,12 +3,13 @@
/** Defines a stage of compilation of the module. */
public enum CompilationStage {
INITIAL(0),
AFTER_PARSING(1),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this change?

Copy link
Member Author

@radeusgd radeusgd Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was introducing a new stage: AFTER_TYPE_INFERENCE_PASSES, that was supposed to be between two other existing passes. I thought that multiplying the numbers by 10 allows us to introduce such middle stages in the future in an easier manner - one would not need to modify all values as there will still be 'free' integers between existing stage numbers.

But I can just bring this back to a natural 0, 1, 2... ordering if that's preferred - please let me know.

private final Type associatedType;
private final Module module;
private final Map<String, Supplier<TruffleObject>> polyglotSymbols;
private final Map<String, Type> types;
private final Map<Type, Map<String, Supplier<Function>>> methods;

/**
* First key is target type, second key is source type. The value is the conversion function from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated line

.invokeMember(MethodNames.TopScope.LEAK_CONTEXT)
.asHostObject();

protected Module compile(Source src) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DRY: I believe that instead of this, and manual context creation, you could use methods from ContextUtils, like

public static org.enso.compiler.core.ir.Module compileModule(Context ctx, String src) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I think that these just did not exist when I wrote this, and I did not notice now that this was introduced - thanks for letting me know!

* This should be aligned with Type.allTypes in the interpreter.
*/
public class TypeHierarchy {
public TypeScopeReference getParent(TypeScopeReference type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor to static method? No need for this to be instance.

@@ -5,15 +5,11 @@

/** A helper class providing the builtin types. */
public class BuiltinTypes {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are not all the methods in this class static, and this class does not have a private constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Initially I thought that I will need to initialize the BuiltinTypes with some references to the compiler context to be able to know more info about these builtins. After all it was handled a different way. I did not change it as I did not want to close the way for BuitlinTypes having some members that would have to be initialized.

But given that for now it could be static, I guess I can change it, and if the need ever arises, revert such change. As you are right that for now this does not serve much purpose.


private void assertInferredType(IR ir, String expected) {
TypeRepresentation inferred = getInferredType(ir);
assertEquals(expected, inferred.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Relying on correct toString implementation in tests does not seem like a good idea. What about moving the current toString implementation to a class in tests dir in the same package? It would then look something like this:

assertEquals(expected, TestUtil.toString(inferred));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relying on correct toString implementation in tests does not seem like a good idea.

Hmm, I never looked at it this way. IMHO it may make sense to test toString, but I guess it's a habit from working on Libs where the to_text implementation is part of the public user-facing API and is supposed to be well tested.

I'm happy to move it to a helper if you think that will be better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relying on correct toString implementation in tests does not seem like a good idea

Why? toString() format is part of an API. But yeah, having for example:

assertTypeRepresentation(....)

might be more flexible.

import org.graalvm.polyglot.Source;
import scala.jdk.javaapi.CollectionConverters;

public abstract class StaticAnalysisTest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Methods in this class seems useful even outside of static analysis. Could you try either merging them to ContextUtils or just to the same package and refactoring the methods here as static?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will see what I can do.

This PR was in progress for such a long time (because I was rarely finding time to work on it), that I think StaticAnalysisTest was created before ContextUtils. I guess if I were creating it now I'd indeed put them somewhere around there.

@Akirathan
Copy link
Member

Could you also add another test that creates a dummy project and invokes the type checker and checks that there are some warnings? Something like this:

diff --git a/engine/runtime-integration-tests/src/test/java/org/enso/compiler/test/TypeInferenceTest.java b/engine/runtime-integration-tests/src/test/java/org/enso/compiler/test/TypeInferenceTest.java
index 416c067a0..c4999481f 100644
--- a/engine/runtime-integration-tests/src/test/java/org/enso/compiler/test/TypeInferenceTest.java
+++ b/engine/runtime-integration-tests/src/test/java/org/enso/compiler/test/TypeInferenceTest.java
@@ -2,14 +2,20 @@ package org.enso.compiler.test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.assertNotNull;
 import static org.junit.Assert.assertTrue;
 import static org.junit.Assert.fail;
 
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
 import java.net.URI;
+import java.nio.file.Files;
+import java.nio.file.Path;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Optional;
 import java.util.Set;
+import org.enso.common.RuntimeOptions;
 import org.enso.compiler.core.IR;
 import org.enso.compiler.core.ir.Module;
 import org.enso.compiler.core.ir.ProcessingPass;
@@ -19,11 +25,19 @@ import org.enso.compiler.core.ir.module.scope.definition.Method;
 import org.enso.compiler.pass.analyse.types.InferredType;
 import org.enso.compiler.pass.analyse.types.TypeInferencePropagation;
 import org.enso.compiler.pass.analyse.types.TypeRepresentation;
+import org.enso.test.utils.ContextUtils;
+import org.enso.test.utils.ProjectUtils;
 import org.graalvm.polyglot.Source;
 import org.junit.Ignore;
 import org.junit.Test;
 import scala.Option;
 
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.Matchers.is;
+import static org.hamcrest.Matchers.notNullValue;
+import static org.hamcrest.Matchers.containsString;
+
+
 public class TypeInferenceTest extends StaticAnalysisTest {
   @Test
   public void zeroAryCheck() throws Exception {
@@ -1430,6 +1444,38 @@ public class TypeInferenceTest extends StaticAnalysisTest {
     assertAtomType("local.Project1.modB.Typ_Y", findAssignment(foo, "x2"));
   }
 
+  @Test
+  public void staticTypeCheckerReportsWarningsOnProject() throws IOException {
+    var mainSrc = """
+        type My_Type
+            Cons data
+        
+        foo x:My_Type -> My_Type =
+            My_Type.Cons (x.data + 1)
+        
+        bar =
+            foo "Hello"
+        
+        main =
+            42
+        """;
+    Path projDir = Files.createTempDirectory("enso-tests");
+    ProjectUtils.createProject("Proj", mainSrc, projDir);
+    var out = new ByteArrayOutputStream();
+    var ctxBuilder= ContextUtils.defaultContextBuilder()
+        .option(RuntimeOptions.DISABLE_IR_CACHES, "true")
+        .option(RuntimeOptions.ENABLE_STATIC_ANALYSIS, "true")
+        .option(RuntimeOptions.STRICT_ERRORS, "true")
+        .out(out)
+        .err(out)
+        .logHandler(out);
+    ProjectUtils.testProjectRun(ctxBuilder, projDir, res -> {
+      assertThat(res.isNumber(), is(true));
+      assertThat(res.asInt(), is(42));
+      assertThat(out.toString(), containsString("Type mismatch"));
+    });
+  }
+
   private TypeRepresentation getInferredType(IR ir) {
     var option = getInferredTypeOption(ir);
     assertTrue(

Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • avoid snake case in Java code
  • make public methods final when they aren't supposed to be overriden
  • avoid extends in algorithms - I believe they aren't necessary and just complicate the algorithm with external references.

@@ -29,4 +29,5 @@
exports org.enso.compiler.phase;
exports org.enso.compiler.phase.exports;
exports org.enso.compiler.refactoring;
exports org.enso.compiler.common_logic;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No snakecase, please. That's not typical Java convention.

}

/** Runs the main processing on a module, that will build the module scope for it. */
public void processModule(Module moduleIr, BindingsMap bindingsMap) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall this method be overriden? If not, make it final.

private final Logger logger = LoggerFactory.getLogger(BuildScopeFromModuleAlgorithm.class);

/** The scope builder to which the algorithm will register the entities. */
protected final ModuleScopeBuilderType scopeBuilder;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this field protected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think child classes were also using it in the implementations of the protected methods. If that ceased to be the case and I forgot to notice, I will make it private. Otherwise, if it's used in child classes, do you think it can stay protected?

Because alternatively, I could make it private and store yet another reference to the scope builder in each child class. But storing the same reference twice in the same class felt weird for me and I didn't think it was necessary, that's why I opted for protected final. But I'm open to a different approach if you think it's preferred.

TypeScopeReferenceType,
ImportExportScopeType,
ModuleScopeType extends
CommonModuleScopeShape<FunctionType, TypeScopeReferenceType, ImportExportScopeType>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! One should always beware of what one asks for!

ModuleScopeType extends
CommonModuleScopeShape<FunctionType, TypeScopeReferenceType, ImportExportScopeType

Showing object algebras to a functional programmers may lead to code that I no longer understand...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is any need for T1 extends CT1<....>.

If you want to perform operation on ModuleScopeType, then add a protected abstract method that works on ModuleScopeType.

Avoid extends in the algorithm generic arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, okay I can rewrite it to such a form. It felt to me that it's a bit easier and allows me to separate concerns - the parts providing the ModuleScopeType interface were separated from the main algorithm and could be shared across 2 different 'algorithm' abstract classes.

If I put these methods directly into the algorithm, I'll need to duplicate them for each one. How can I ensure I can still share them? I guess I need to create a separate class ModuleScopeInterface<ModuleScopeType> that will allow me to act on the ModuleScopeType values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, okay I can rewrite it to such a form.

That's indeed my preference. As an alternative...

need to duplicate them for each one

... you can try to convince me that your design is right. I just feel scared for a signature like:

ModuleScopeType extends
        CommonModuleScopeShape<FunctionType, TypeScopeReferenceType, ImportExportScopeType>

* <li>Finally, methods imported from other modules.
* </ol>
*/
public FunctionType lookupMethodDefinition(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be overriden or not? If not, make final.

Copy link
Contributor

@hubertp hubertp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of changes here, making it hard to review. It would help, in the future, if scope changes were committed separately because I think they can.

I don't see the need for explicit builtin definitions so maybe I'm missing something. Auto-register was defined for that. If no longer allow/desire it then that functionality can go as well.

@@ -91,6 +91,9 @@ type Project_Description
namespace : Text
namespace self = self.ns

## PRIVATE
enso_project_builtin module = @Builtin_Method "Project_Description.enso_project_builtin"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's how auto register was supposed to work.

@@ -832,6 +832,12 @@ type File
to_display_text : Text
to_display_text self = self.to_text

## PRIVATE
copy_builtin self target options = @Builtin_Method "File.copy_builtin"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while ago but IIRC that was specifically requested either by you or @kustosz, so that we don't have to write all those builtin methods as it is done now in this PR.

override protected def processConversion(
conversion: Method.Conversion
): Unit = {
lazy val where =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change it do def and make rootScopeInfo accept a function instead. Having a lazy val here and below is rather pointless. I know it has been like that before :)

)
case _ =>
throw new CompilerError(
"Conversion bodies must be functions at the point of codegen."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include toType and fromType for easier identification of bug

override protected def processMethodDefinition(
method: Method.Explicit
): Unit = {
lazy val where =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

"Cannot create a qualified name from an empty list of parts."
)
}
QualifiedName(parts.dropRight(1), parts.last)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
QualifiedName(parts.dropRight(1), parts.last)
QualifiedName(parts.init, parts.last)

@radeusgd
Copy link
Member Author

radeusgd commented Nov 7, 2024

There is a lot of changes here, making it hard to review. It would help, in the future, if scope changes were committed separately because I think they can.

They certainly couldn't have been done separately, as I had to first figure out what I need in StaticModuleScope to be able to find what is common enough that can be relatively easily extracted to the common base. Now, having created it, yes, I could artificially split the code changes into 2 separate patches to make the review easier - it just required additional work that I thought maybe will not be necessary. But please let me know if I should do that. Just noting that I could not have avoided the additional work of splitting by 'better planning', as at the beginning I didn't know what the split would have to look like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inferring types of method calls on Atoms
4 participants