Replies: 11 comments
-
Agree 100%. Already did a couple tests like that here and there but not nearly covering all the functionality we have |
Beta Was this translation helpful? Give feedback.
-
We shouldn't really need to if we don't store these as videos (because of the lossy compression), although that will increase refimage size. But each animation can usually be lowered to 2-4 frames anyway for what we care about so it's more like |
Beta Was this translation helpful? Give feedback.
-
One thing I think not mentioned here is update tests. We have a couple of those here and there but we should systematically check that we can update plot observables with different lengths without them breaking. Even if we have to use the val strategy at least we should ensure that that workaround is always available. |
Beta Was this translation helpful? Give feedback.
-
I think we'll discover lots of stuff to add in unit tests if we actually sit down and check coverage without high level calls. But I agree that the synchronous updates should be tested. |
Beta Was this translation helpful? Give feedback.
-
Another example from #3102: |
Beta Was this translation helpful? Give feedback.
-
Maybe we should check for literature on reference image testing. It really seems like the default method we're using is not suitable. |
Beta Was this translation helpful? Give feedback.
-
Another minor suggestion for reference images: I think it would be good to sort these into folders based on the file they come from. I sometimes look at the generated images directly and it's always a pain because it's just a big pile of images. |
Beta Was this translation helpful? Give feedback.
-
Maybe we should also add timings to our tests so we can track which tests eat up a lot of time. |
Beta Was this translation helpful? Give feedback.
-
Oh yeah, been meaning to do something like that. TimerOutputs.jl might make this relatively painless? I also wanted to have it for the docs to understand which pages or examples take the most time |
Beta Was this translation helpful? Give feedback.
-
For docs, maybe TimerOutputs could be useful if |
Beta Was this translation helpful? Give feedback.
-
For the docs it might be enough to use a running index for each codeblock. Name is tough, internally I use hashes of the code. |
Beta Was this translation helpful? Give feedback.
-
I wanted to summarize some goals and issues with Makies testing here.
We currently rely quite heavily on cross-backend reference image tests, which need to have a relatively high tolerance to deal with differences between OpenGL, WebGL and Cairo. This leads to small visual changes being missed by testing and in pr reviews regularly. To improve I think we broadly need to do two things - reduce our reliance on reference image tests by improving coverage on (code based) unit tests and improve the reliability of reference image tests.
Code based unit tests
Currently most of Makie's unit tests are part of Makie.jl, with little to none in the backends. This is understandable since the backends are mostly responsible for rendering, however there is also some code that can and should be tested there. Makie.jl itself has fairly spotty testing. Ultimately we will need to spend some time here to figure out what is missing an what is reasonable to test. For now here is an incomplete list of possible additions:
convert_arguments()
, i.e. every possible conversion path. This may also help us find redundant or missing methods, if there are any.angle(p1, p2)
,to_ndim()
, etcparent(scene)
,events()
, etcI also want to note that test coverage can be quite deceptive. I tried just
f, a, p = scatter(Point2f(0)); @test p[1][] == Point2f(0)
a while ago and got 26% coverage with just that. It may make sense to avoid and/or exclude these kind of full-stack calls to get more useful results.Reference image tests
For reference images we should start including (more) backend specific tests. These tests should be running with low tolerance and be very specific, like unit tests. I.e. they should not include an Axis, they should not rely on camera placement, they should be as independent of Makie.jl as they can be. They should test the specific things implemented by the backend and aim to check edge cases as well. There is some previous discussion on this in #298.
For cross-backend tests we likely need to do some cleanup. Some questions we can ask ourselves here are:
And more generally:
Another group of reference image tests that would be good to include are the images from documentation. These, I think, would not aim to test functionality as much, but rather make sure that our documentation looks correct.
General remarks:
Some other failures of reference images:
Beta Was this translation helpful? Give feedback.
All reactions