tests: integration test battery #3654

Merged
merged 61 commits into from
Jul 6, 2020

oliver-sanders
Member

@oliver-sanders oliver-sanders commented Jun 13, 2020

Implementation of the proposal #3616
Closes #3547

TODO:

  • Tidy some of the integration tests.
  • Sort out the integration test skips.

Highlights:

  • Only ±2500 lines, nice small change (how the hell did that happen).
  • Flaky unit tests now fixed.
  • Successful test runs no longer leave behind files in ~/cylc-run.
  • New integration test framework.
  • Separation of CLI/scheduler functionality in cylc/flow/scheduler*.py.
  • All three test batteries now colocated in tests/.

Usage:

To run all tests:

$ pytest tests/u tests/i
$ etc/bin/run-functional-tests tests/f tests/k

One day this will be simplified further to:

$ pytest tests

Philosophy

There is some kind of scale with testing approaches: unit < integration < functional.

  • Unit tests, quick to write, quick to debug.
  • Functional tests, slow to write, slow to debug.
  • (integration tests somewhere in-between).

When debugging, start with the unit tests and move up once you've fixed them all. The integration tests don't stand a chance of passing if the unit tests are broken, and the functional tests don't stand a chance of passing if the integration tests are broken.

Keep the test types separate, use the right tool for the job.

I've put README.md files in each of the test directories. Take a look and see what you think; I'm not a testing expert, so edits are welcome!

Integration Tests:

Documentation:

  • For information on the new "integration tests" see tests/i/README.md
  • For working, example-based documentation on writing integration tests see tests/i/test_examples.py

Implementation:

  • Test flows get created in a directory hierarchy similar to the functional tests.
  • These dirs get cleaned up in the event tests pass, else they are left behind for debugging.
  • Don't expect to find suite logs in these directories, you won't find them!
  • Captured logging is presented by pytest in traceback and can be used in tests.
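
The "clean up on pass, keep on failure" behaviour above can be sketched with the standard library alone. This is a minimal illustration, not the fixture used in this PR; the `run_dir` helper and the file name are hypothetical.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def run_dir(prefix='flow-'):
    """Yield a temporary run directory; delete it only if the body succeeds.

    Hypothetical helper for illustration, not the fixture used in this PR.
    """
    path = Path(tempfile.mkdtemp(prefix=prefix))
    yield path  # if the body raises, the rmtree below is never reached
    shutil.rmtree(path)


# Passing "test": the directory is removed afterwards.
with run_dir() as passed:
    (passed / 'placeholder.txt').write_text('some test artefact')

# Failing "test": the directory is left behind for debugging.
try:
    with run_dir() as failed:
        raise AssertionError('simulated test failure')
except AssertionError:
    pass
```

Leaving out the `try`/`finally` around the `yield` is deliberate here: it is what makes the clean-up conditional on success.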

Parallelism:

  • The tests use pytest-asyncio:
    • This allows you to write async test functions.
    • They are run in SERIES just like regular pytest tests.
    • pytest-asyncio is not for parallelising tests (see GH issues).
  • Pytest now uses pytest-xdist for parallelism:
    • Pytest is configured to run tests from the same module together.
    • This means if you define a module-scoped fixture it will only get created once.
    • So we can run a flow once, then perform multiple tests on it.
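
As a rough sketch of the pattern described above (not code from this PR): a module-scoped fixture is built once and shared by several async tests. `ToyFlow` and `toy_flow` are hypothetical stand-ins for a real running workflow, and the `asyncio` marker is assumed to be provided by the pytest-asyncio plugin.

```python
import asyncio

import pytest


class ToyFlow:
    """Hypothetical stand-in for a running workflow (not Cylc's Scheduler)."""

    def __init__(self):
        self.tasks = {'foo': 'waiting', 'bar': 'waiting'}

    async def task_state(self, name):
        await asyncio.sleep(0)  # pretend to talk to the running flow
        return self.tasks[name]


@pytest.fixture(scope='module')
def toy_flow():
    # Module scope: created once, then shared by every test in this module.
    return ToyFlow()


@pytest.mark.asyncio
async def test_foo_is_waiting(toy_flow):
    assert await toy_flow.task_state('foo') == 'waiting'


@pytest.mark.asyncio
async def test_bar_is_waiting(toy_flow):
    assert await toy_flow.task_state('bar') == 'waiting'
```

Sharing the fixture only pays off if the tests from one module land on the same pytest-xdist worker, which is what the configuration described above is for.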

Scheduler:

  • You can now run workflows by importing the Scheduler class in Python.
  • The CLI stuff (e.g. daemonize, sys.exit, logging) has been moved to scheduler_cli.
  • You can run multiple schedulers in the same asyncio event loop.
  • Interestingly this opens up running sub-suites in the same event loop as the parent.
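
To illustrate the idea of several schedulers sharing one event loop, here is a toy sketch. `ToyScheduler` is hypothetical and does not reflect the real Scheduler API; it only shows two long-running coroutines interleaving on the loop created by `asyncio.run`.

```python
import asyncio


class ToyScheduler:
    """Hypothetical stand-in for a scheduler; not Cylc's Scheduler API."""

    def __init__(self, name, cycles):
        self.name = name
        self.cycles = cycles
        self.log = []

    async def run(self):
        for cycle in range(self.cycles):
            self.log.append(f'{self.name}: cycle {cycle}')
            # Yield control so other schedulers on the loop can progress.
            await asyncio.sleep(0)
        return self.name


async def main():
    parent = ToyScheduler('parent', 3)
    child = ToyScheduler('child', 2)
    # Both coroutines run on the single event loop created by asyncio.run.
    finished = await asyncio.gather(parent.run(), child.run())
    return finished, parent.log, child.log


finished, parent_log, child_log = asyncio.run(main())
```

The same shape is what would make a sub-suite in the parent's loop possible: the child is just another coroutine awaited alongside the parent.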

Caveats:

  • Test teardown fails on NFS. The error is suppressed until resolution of "public database not closed correctly - consider persisting the connection" (#3666).
  • The stdout of cylc run is not displayed in colour after daemonize().
  • The distinction between unit/integration/functional tests is a grey area.
  • I've had to [temporarily] skip the client connection tests due to an asyncio event loop incompatibility. Fixed.
  • The scheduler exits with os._exit rather than sys.exit, this is not ideal.
  • Pytest waits for approx 10s after running integration tests, asyncio shutdown issue?

Requirements check-list:

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
    • Contains off-topic changes documented above
  • Appropriate tests are included (unit and/or functional).
  • No change log entry required (why? e.g. invisible to users).
  • No documentation update required.
    • documentation now in readme files under tests/

@oliver-sanders oliver-sanders added this to the cylc-8.0a3 milestone Jun 13, 2020
@oliver-sanders oliver-sanders self-assigned this Jun 13, 2020
Member Author

@oliver-sanders oliver-sanders left a comment

Three tests didn't port well from the unit test framework and have errors which need cleaning up, any ideas?

)


@pytest.mark.skip('TODO: the delta doesnt seem to have an id for some reason')
Member Author

The delta seems to be missing the id field.

Comment on lines +188 to +190
@pytest.mark.skip(
    reason='TODO: trigger_tasks is resulting in traceback due to '
    'missing task_globs arg')
Member Author

Failing due to an argument error in the server?

assert len(update_tasks) == len(collect_states(data, TASK_PROXIES))


@pytest.mark.skip('TODO: fix this test')
Member Author

The update doesn't do anything because the workflow hasn't changed state, how do I trick it?

Member

@kinow kinow left a comment

I think 95% of the changes in this pull request are files being moved.

Probably 2% are the scheduler/CLI changes, which I will leave to others with more experience there to review/comment.

The rest are test related, and it looks great.

In Java, normally we would have unit & functional tests. Mocks, and even things like the pytest fixtures, exist in JUnit/TestNG too, but they are used in both unit & functional tests. Functional tests would be for things related to stories/requirements for features of the system (e.g. we send a JMS message to a queue for every new blog post), and integration tests for things that require external modules/systems (e.g. the system is able to send a JMS message to ActiveMQ version 1.2.3).

In JS it depends on the framework, but nowadays people are converging towards unit tests without a DOM, or with a mocked DOM, and end-to-end tests with cypress/selenium/nightwatch, meaning that there is a browser with a valid DOM. I don't think I have seen integration tests in JS yet, only unit/functional/e2e/acceptance testing, all but unit being very similar.

For a Ruby developer, I think they have some sort of tests for model classes that are unit. Then they have specs that may be functional tests, and Capybara which they say is for acceptance, which looks like functional or e2e tests too 😕

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur, unless I really understand both the system under test and the test harness & tools, as that may differ in a different project (even though there are definitions on Wikipedia, books, etc).

After these changes in Cylc, I think I will write unit tests for any small changes, that can be done without a running workflow or with minimum mocking.

If I require mocking complex objects, or have parts of the system working/running, I will probably need an integration test, using the fixtures in the integration module.

And I would write a functional test if I introduced something new in Cylc that I would like to make sure it works as a whole (even if I have parts of the new feature being tested with unit tests, maybe I could still want to have a complete end to end test of all the parts of the system?).

Is that close to the right use of the new different tests @oliver-sanders for Cylc?

Cheers
Bruno

p.s.: ISTQB has a glossary for testing, it used to be popular some 10 years ago, there are probably over 20 different types of testing 👀

SchedulerStop(StopMode.AUTO.value)
)
except asyncio.TimeoutError:
# but be prepared to use the nuclear option
Member

😄

@pytest.fixture(scope='session')
def port_range():
    ports = glbl_cfg().get(['suite servers', 'run ports'])
    return min(ports), max(ports)
Member

🎉

sys.exit(1)
self.close_logs()
* Initialise the network components.
* Initialise mangers.
Member

s/mangers/managers

It is free software, you are welcome to
redistribute it under certain conditions;
see `COPYING' in the Cylc source distribution.
""" % CYLC_VERSION
Member

I think this was in the scheduler before? It makes sense to keep it here IMO 👍

@@ -77,6 +84,7 @@
     "overrides the suite definition.")


+@lru_cache()
Member

Sounds like a good idea!


The Scheduler itself should be a Python object you can import and
run in a regular Python session so cannot contain this kind of
functionality.
Member

+1 ! The code of the Scheduler class/object looks way better. I can't comment further as I am not that familiar, so will leave to others (especially @hjoliver ) to review that. And it got some types added to the class attributes 🎉 thanks!

raise exc2 from None
ret = 2
except Exception:
ret = 3
Member

Was going to ask why not throw/return the exception. But then realized this is the code to be used in exit() function. Maybe add a comment in the function stating that the returned value is an int, and that it's used to provide the command execution status/exit code?
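
For illustration only, the kind of docstring being suggested might look like the sketch below. The function and exception names, and the specific exit codes, are assumptions made for this example and are not taken from scheduler_cli.py.

```python
import asyncio


class SchedulerStopped(Exception):
    """Hypothetical exception used only in this sketch."""


async def run_flow(outcome):
    """Hypothetical coroutine standing in for the scheduler main loop."""
    if outcome == 'stop':
        raise SchedulerStopped()
    if outcome == 'crash':
        raise RuntimeError('unexpected failure')


def run_and_get_exit_code(outcome) -> int:
    """Run the flow and return an int for use as the process exit code.

    Exceptions are translated into a status value rather than re-raised,
    because the caller passes the result straight to sys.exit().
    The numeric values here are illustrative assumptions.
    """
    try:
        asyncio.run(run_flow(outcome))
        ret = 0
    except SchedulerStopped:
        ret = 1
    except Exception:
        ret = 3
    return ret
```

A caller would then do something like `sys.exit(run_and_get_exit_code(...))`, which is why the function swallows the exception.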

@oliver-sanders
Member Author

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur

Yes, the distinction is subtle and I think in truth it's really more of a scale between unit and functional rather than a distinct clustering.

I've put the framework for running suites within the integration battery to help "draw a line in the sand".

Is that close to the right use of the new different tests

Yep, that's what I was thinking, bang on!

My thinking is:

  • Unit tests to get coverage up.
  • Unit/Integration tests to cover all eventualities / use cases
  • One functional test to make sure that core functions are intact (but preferably only one, no need to test all eventualities here)

But I'm no testing expert! Input greatly appreciated.

@hjoliver
Member

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur ...

Someone wrote a helpful article for us 😁

https://www.softwaretestinghelp.com/the-difference-between-unit-integration-and-functional-testing/

@kinow
Member

kinow commented Jun 16, 2020

I've put the framework for running suites within the integration battery to help "draw a line in the sand".

👍 We have at least a clear definition for Cylc of what is what, and where each test should go. I can't promise I will get it right when I have to choose between the three types, but I feel confident that by reading the docs and looking at existing tests I should find the right type, and eventually that will become natural 🤞

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur ...

Someone wrote a helpful article for us 😁

https://www.softwaretestinghelp.com/the-difference-between-unit-integration-and-functional-testing/

Yup! I had seen this one too. And each directory has the helpful README that @oliver-sanders wrote. If anyone has problems or is confused about which one to write, I think at least someone can point to those files and also to existing tests as reference.

@oliver-sanders
Member Author

Yep, I think I used the diagram from that article in my proposal. I did a bit of reading, which was reassuring: it was good to see people confirm what I've been slowly realising.

The really big bit is time to write and time to debug. We spend so much time on functional tests, I personally sunk nearly an entire week into them (less than the time it took to write this new framework) just to fix broken tests. Hopefully this will help accelerate future development, might take a while to get there though...

@oliver-sanders
Member Author

Rebased and deconflicted, test changes ported from #3500.

@oliver-sanders
Member Author

  • Rebased.
  • Deconflicted.
  • Fixed asyncio issues (you will need to pip install again).
  • Suppressed NFS issue as unrelated to this change.

@oliver-sanders
Member Author

@datamel and @kinow thank you for stepping in to review.

There are a couple of specific aspects which could do with standalone reviewing:

  • @hjoliver best make sure you're happy with the scheduler / scheduler_cli re-alignment and the general gist of the change.
  • @dwsutherland if you could cast your eye over the asyncio stuff in scheduler/scheduler_cli.
  • @dwsutherland best make sure I've not butchered the tests.

    --ignore=cylc/flow/data_messages_pb2.py
testpaths =
    cylc/flow/
    tests/lib/python/
    tests/unit/
Contributor

@datamel datamel Jul 1, 2020

Suggested change
-    tests/unit/
+    tests/unit/
+    tests/integration/

Adding this path enables integration tests to be picked up correctly by visual studio code for easy debugging. I'm happy for this change not to go in (as it relates to my debugging process) but may help others too. Without it I'm not 100% sure why pytest was picking up the integration tests...hmmm.

Member Author

@oliver-sanders oliver-sanders Jul 1, 2020

I purposefully left the integration tests out of testpaths so that when you run pytest it defaults to running tests/unit. If there's another way to define the default test set I'm game

Contributor

Ah, no worries :)

@oliver-sanders
Member Author

The last commit should fix the hanging functional test. Hopefully this is the last time I have to mess with asyncio in this PR!

Even though it's only a few lines, getting the asyncio code right has been a lot of work; the integration test framework only took one day. Here are some lessons learned:

  • I no longer believe in magic, but I now believe in ghosts.
  • Async generators must be shut down manually. You might not use async generators in your code, but that doesn't mean they aren't there. If you use asyncio.run, it creates and closes an event loop for you, tidying up async generators correctly. Look at the source code for asyncio.run; it's not actually doing much under the hood.
  • If sys.exit hangs, it means you've got threads (which have sleep statements) that haven't been shut down correctly. Keep track of threads closely and ensure they are always shut down. If os._exit fixes it then this is probably the issue.
  • Daemonisation is a hard barrier that event loops cannot persist through. This is totally expected but it means you need to tidy up before daemonisation else the original process won't be able to exit.
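
A minimal sketch of the two shutdown lessons above, using only the standard library. It is not the code from scheduler.py; `ticker`, `consume_first` and `Heartbeat` are hypothetical names. When managing the loop by hand, `loop.shutdown_asyncgens()` finalises async generators that are still suspended, and a background thread is stopped and joined explicitly so it cannot block interpreter exit.

```python
import asyncio
import threading

closed = []  # records which async generators were finalised
gens = []    # keeps the generator alive, as a long-lived object might


async def ticker(n):
    """A small async generator whose finally block marks it as closed."""
    try:
        for i in range(n):
            yield i
    finally:
        closed.append('ticker')


async def consume_first():
    gen = ticker(10)
    gens.append(gen)
    async for value in gen:
        return value  # leaves the generator suspended, not finished


def run_with_manual_loop():
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(consume_first())
    finally:
        # What asyncio.run does for you: finalise outstanding async gens.
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()


class Heartbeat(threading.Thread):
    """Hypothetical background thread with a sleep in its loop."""

    def __init__(self):
        super().__init__()
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait doubles as an interruptible sleep.
        while not self._stop_event.wait(timeout=0.01):
            pass

    def stop(self):
        self._stop_event.set()
        self.join()


heartbeat = Heartbeat()
heartbeat.start()
first_value = run_with_manual_loop()
heartbeat.stop()  # without this, a non-daemon thread would block exit
```

Using an `Event` rather than `time.sleep` in the thread means `stop()` returns promptly instead of waiting out the sleep.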

@oliver-sanders
Member Author

Merged and de-conflicted (tried rebase, it was horrible).

@oliver-sanders
Member Author

Fixed tests/f/cylc-ping/03 which was dependent on key_housekeeping being called before remrun.

@hjoliver
Member

hjoliver commented Jul 3, 2020

Here are some lessons learned:...

That sounds like some stuff that I don't want to rediscover the hard way in the future. Would it be worth copying to code comments for reference (in scheduler.py?).

@oliver-sanders
Member Author

Yep, will do.

@oliver-sanders
Member Author

Deconflicted.

@@ -122,7 +124,8 @@ def __str__(self):
         return self.value


-class Scheduler(object):
+@dataclass
+class Scheduler:
Member

Nice 👍

Member

@hjoliver hjoliver left a comment

Impressive/massive job 👍 I scanned almost everything; tests work; I like the new integration test framework; my two comments above aren't important so I'll merge this and we can get on with the horrifying conflict resolution ...

@hjoliver best make sure you're happy with the scheduler / scheduler_cli re-alignment and the general gist of the change.

I'm happy with it - one question above indicates I don't fully understand your use of asyncio yet ... but it works fine, and I guess it doesn't matter if some asyncio tasks are still entirely blocking.

  • @dwsutherland if you could cast your eye over the asyncio stuff in scheduler/scheduler_cli.
  • @dwsutherland best make sure I've not butchered the tests.

David is on leave for a week and I think he can come back to this post merge. The first bullet point (scheduler/_cli) seems simple enough (my question above notwithstanding) and the second is "just tests".

@hjoliver hjoliver merged commit e458603 into cylc:master Jul 6, 2020
@hjoliver hjoliver modified the milestones: cylc-8.0a3, cylc-8.0b0 Feb 25, 2021
Successfully merging this pull request may close these issues.

fix flaky unit test
4 participants