Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

experiment with named classes in operations #265

Merged
merged 35 commits into from
Apr 6, 2022
Merged

Conversation

erikerlandson
Copy link
Owner

a followup to #264, alternative to reducing inlined coad bloat

@armanbilge
Copy link
Contributor

Yes, thanks! Exactly what I had in mind :)

@erikerlandson
Copy link
Owner Author

@armanbilge how would you recommend measuring/verifying the reduction of code bloat? write some operator invocations and measure .jar size?

@armanbilge
Copy link
Contributor

Right, that would probably be the way to do it. Another metric would be to introspect the jars and count the number of class files inside.

@armanbilge
Copy link
Contributor

Here's a Cats PR focused on class size. But that was generated sources, not macros.
typelevel/cats#3871

Macros makes it a lot harder. Actually, maybe what you want to measure is not the absolute artifact size, but how it grows. E.g. do more ops calls create more anonymous classes linearly, or is it a constant factor of code size.

@erikerlandson
Copy link
Owner Author

Experiment results.

Control: using the current system, anonymous classes. The main point of comparison is the code generated for study2 object,
which invokes pow on some variables of differing quantity types:

$ ls *study2*class
'study2$$anon$1.class'  'study2$$anon$3.class'  'study2$$anon$5.class'  'study2$.class'
'study2$$anon$2.class'  'study2$$anon$4.class'  'study2$$anon$6.class'   study2.class

$ javap -c $(ls *study2*class) | wc -l
251

We see that six anonymous classes were generated. Three of these were Pow anonymous classes, and three are SimplifyUnit anonymous classes.

Here are the corresponding results for using the named class PowImplFP:

$ ls *study2*class
'study2$$anon$1.class'  'study2$$anon$2.class'  'study2$$anon$3.class'  'study2$.class'   study2.class

$ javap -c $(ls *study2*class) | wc -l
137

We can see that using the named class only 3 anonymous classes were created (again, the SimplifyUnit ones).
No anonymous classes were required for PowImplFP, even though the type signatures are different in each case.

Comparing the total lines of byte code from javap -c we see that that total bytecode is cut in half.

This probably is not the most precise possible comparison, but the overall picture is pretty clear that doing this has a huge impact on the anonymous classes and corresponding byte code generated. Based on this experiment I feel pretty confident that this can be done basically everywhere. It's a bit annoying but the benefits appear to be massive.

@erikerlandson
Copy link
Owner Author

cc @cquiroz

@armanbilge
Copy link
Contributor

Thank you very much for doing the experiment! Cool, good to see this lines up with our intuition and most importantly that your fix works!

@cquiroz
Copy link

cquiroz commented Apr 1, 2022

Cool, this is looking great, thanks a lot

@erikerlandson
Copy link
Owner Author

erikerlandson commented Apr 1, 2022

Applying this same measurement to QuantitySuite:

$ ls -1 QuantitySuite*.class | wc -l
350

$ javap -c QuantitySuite*.class | wc -l
12649

I should be able to track the improvement from adding named classes across coulomb

or "all anonymous classes in the unit testing"

$ ls -l *anon*.class | wc -l
420

$ javap -c *anon*.class | wc -l
15158

"all class files in testing"

$ ls -l *.class | wc -l
427

$ javap -c *.class | wc -l
16012

@cquiroz
Copy link

cquiroz commented Apr 2, 2022

That's quite a difference

@erikerlandson
Copy link
Owner Author

@armanbilge While I've been doing this, I noticed something interesting. The various typeclasses that are subclasses of X => Y (functions) were not producing anonymous types in the output code, even though Function and Conversion are both abstract classes. (abstract classes I defined were). I'm not sure why that is, unless functions have special status to the compiler.

It got me thinking about the operator classes, take Add as an example:

abstract class Add[VL, UL, VR, UR]:
    type VO
    type UO
    def apply(ql: Quantity[VL, UL], qr: Quantity[VR, UR]): Quantity[VO, UO]

This is very close to being a function:

abstract class Add[VL, UL, VR, UR] extends (Quantity[VL, UL], qr: Quantity[VR, UR]) => Quantity[VO, UO]

Except that the types VO and UO cannot be forward resolved, or at least none of the hackery I've tried so far has persuaded the compiler.

It's not a big deal if it can't work, but if it could, it might allow more elegant definitions.

@armanbilge
Copy link
Contributor

Ah yes, you've discovered the magic of single-abstract-methods (SAMs) and invokedynamic introduced in Scala 2.12.0 :)

The FunctionN classes in Scala’s standard library are now Single Abstract Method (SAM) types, and all SAM types are treated uniformly – from type checking through code generation. No class file is generated for a lambda; invokedynamic is used instead.

https://www.scala-lang.org/news/2.12.0/

@erikerlandson
Copy link
Owner Author

erikerlandson commented Apr 5, 2022

I think I cleared up one issue that was confusing me - the "SAM/lambda" trick will work with inline given but it will still generate anonymous classes. If you remove inline it will also not generate the anonymous classes.

When I define my own named classes, I can avoid anonymous class generation regardless.

@armanbilge
Copy link
Contributor

@erikerlandson I couldn't quite follow your last comment ... does that mean you found a workaround, to avoid generating anonymous classes without having to name them?

@erikerlandson
Copy link
Owner Author

@armanbilge for any of my inline given functions, I had to explicitly define a named class to avoid anonymous generation. This was also true for SAM classes.

For given functions (no inline), assigning a lambda to a SAM class would not generate any anonymous classes.

@armanbilge
Copy link
Contributor

Ah, thanks for the clarification. Makes sense about given-not-inline. Interesting observation about inline given 🤔

@erikerlandson
Copy link
Owner Author

when using inline given, it is still fine to assign a lambda to a SAM class. It compiles and runs correctly, it just also happens to generate an anonymous class file for the class.

@armanbilge
Copy link
Contributor

Yeah, I wonder if there's some underlying compiler issue there. Scala 2 has some bugs that prevent SAMs in some situations where otherwise they could be ok.

@erikerlandson
Copy link
Owner Author

possibly it's a "bug" in the sense that it could avoid the anonymous generation but it doesn't. Regardless, it's been a useful exercise. Most of my operator classes require a named class but I've been able to simplify it considerably from my first attempt.

@erikerlandson
Copy link
Owner Author

Before (anonymous classes)

$ ls -l *.class | wc -l
427

$ javap -c *.class | wc -l
16012

After (using all named classes)

$ ls -l *.class | wc -l
7
$ javap -c *.class | wc -l
782

Near as I can tell, coulomb is now introducing NO anonymous classes into the compiled code.

@erikerlandson erikerlandson merged commit 12adbf4 into scala3 Apr 6, 2022
@cquiroz
Copy link

cquiroz commented Apr 6, 2022

This is great, it is especially useful for scala.js

@erikerlandson erikerlandson deleted the named-classes branch April 9, 2022 00:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants