tests: integration test battery #3654

oliver-sanders · 2020-06-13T19:03:30Z

Implementation of the proposal #3616
Closes #3547

TODO:

Tidy some of the integration tests.
Sort out the integration test skips.

Highlights:

Only ±2500 lines, nice small change (how the hell did that happen).
Flaky unit tests now fixed.
Successful test runs no longer leave behind files in ~/cylc-run.
New integration test framework.
Separation of CLI/scheduler functionality in cylc/flow/scheduler*.py.
All three test batteries now colocated in tests/.

Usage:

To run all tests:

$ pytest tests/u tests/i
$ etc/bin/run-functional-tests tests/f tests/k

One day this will be simplified further to:

$ pytest tests

Philosophy

There is some kind of scale with testing approaches: unit < integration < functional.

Unit tests, quick to write, quick to debug.
Functional tests, slow to write, slow to debug.
(integration tests somewhere in-between).

When debugging, start with the unit tests and move up once you've fixed them all. The integration tests don't stand a chance of passing if the unit tests are broken; the functional tests don't stand a chance of passing if the integration tests are broken.

Keep the test types separate, use the right tool for the job.

I've put README.md files in each of the test directories, take a look and see what you think, I'm not a testing expert - edits welcome!

Integration Tests:

Documentation:

For information on the new "integration tests" see tests/i/README.md
For functional documentation on writing integration tests see tests/i/test_examples.py

Implementation:

Test flows get created in a directory hierarchy similar to the functional tests.
These dirs get cleaned up in the event tests pass, else they are left behind for debugging.
Don't expect to find suite logs in these directories, you won't find them!
Captured logging is presented by pytest in traceback and can be used in tests.

Parallelism:

The tests use pytest-asyncio:
- This allows you to write async test functions.
- They are run in SERIES just like regular pytest tests.
- pytest-asyncio is not for parallelising tests (see GH issues).
Pytest now uses pytest-xdist for parallelism:
- Pytest is configured to run tests from the same module together.
- This means if you define a module-scoped fixture it will only get created once.
- So we can run a flow once, then perform multiple tests on it.

Scheduler:

You can now run workflows by importing the Scheduler class in Python.
The CLI stuff (e.g. daemonize, sys.exit, logging) has been moved to scheduler_cli.
You can run multiple schedulers in the same asyncio event loop.
Interestingly this opens up running sub-suites in the same event loop as the parent.
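
To make the "multiple schedulers in one event loop" point concrete, here is a small sketch using only the standard library. `ToyScheduler` and `run_all` are hypothetical names invented for this illustration; they do not reflect the real Scheduler class or its arguments. The only thing shown is that several long-running coroutines can share one loop as long as each of them yields control regularly.

```python
import asyncio


class ToyScheduler:
    """Hypothetical stand-in for a scheduler; NOT the Cylc Scheduler API."""

    def __init__(self, name, cycles):
        self.name = name
        self.cycles = cycles
        self.log = []

    async def run(self):
        for cycle in range(self.cycles):
            self.log.append(f'{self.name}: cycle {cycle}')
            # Yield control so other schedulers in the loop can progress.
            await asyncio.sleep(0)
        return self.name


async def run_all(*schedulers):
    # All schedulers share the one event loop that asyncio.run() creates.
    return await asyncio.gather(*(s.run() for s in schedulers))


if __name__ == '__main__':
    parent = ToyScheduler('parent', 2)
    child = ToyScheduler('child', 3)
    print(asyncio.run(run_all(parent, child)))
```

A sub-suite run this way would presumably look like the `child` object here: just another coroutine scheduled alongside its parent.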

Caveats:

~~Test teardown fails on NFS.~~ Error suppressed until resolution of public database not closed correctly - consider persisting the connection #3666
The stdout of cylc run is not displayed in colour after daemonize().
The distinction between unit/integration/functional tests is a grey area.
~~I've had to [temporarily] skip the client connection tests due to an asyncio event loop incompatibility.~~ Fixed.
~~The scheduler exits with os._exit rather than sys.exit, this is not ideal.~~
~~Pytest waits for approx 10s after running integration tests, asyncio shutdown issue?~~

Requirements check-list:

I have read CONTRIBUTING.md and added my name as a Code Contributor.
Contains logically grouped changes (else tidy your branch by rebase).
Does not contain off-topic changes (use other PRs for other changes).
- Contains off-topic changes documented above
Appropriate tests are included (unit and/or functional).
No change log entry required (why? e.g. invisible to users).
No documentation update required.
- documentation now in readme files under tests/

oliver-sanders

Three tests didn't port well from the unit test framework and have errors which need cleaning up, any ideas?

oliver-sanders · 2020-06-15T09:31:30Z

tests/integration/test_publisher.py

+)
+
+
+@pytest.mark.skip('TODO: the delta doesnt seem to have an id for some reason')


The delta seems to be missing the id field.

oliver-sanders · 2020-06-15T09:32:22Z

tests/integration/test_resolvers.py

+@pytest.mark.skip(
+    reason='TODO: trigger_tasks is resultin in traceback due to '
+    'missing task_globs arg')


Failing due to an argument error in the server?

oliver-sanders · 2020-06-15T09:33:17Z

tests/integration/test_data_store_mgr.py

+    assert len(update_tasks) == len(collect_states(data, TASK_PROXIES))
+
+
+@pytest.mark.skip('TODO: fix this test')


The update doesn't do anything because the workflow hasn't changed state, how do I trick it?

cylc/flow/hostuserutil.py

kinow

I think 95% of the changes in this pull request are files being moved.

Probably 2% are the scheduler/CLI changes, which I will leave to others with more experience there to review/comment.

The rest are test related, and it looks great.

In Java, normally we would have unit & functional tests. Mocks, and even things like the pytest fixtures, exist in JUnit/TestNG too, but they are used in both unit & functional tests. Functional tests would be for things related to stories/requirements for features of the system (e.g. we send a JMS message to a queue for every new blog post), and integration tests for things that require external modules/systems (e.g. the system is able to send a JMS message to ActiveMQ version 1.2.3).

In JS it depends on the framework, but nowadays people are converging towards unit tests without a DOM, or with a mocked DOM, and end-to-end tests with cypress/selenium/nightwatch, meaning that they have a browser with a valid DOM. I don't think I have seen integration tests in JS yet, only unit/functional/e2e/acceptance testing, all but unit being very similar.

For a Ruby developer, I think they have some sort of tests for model classes that are unit. Then they have specs that may be functional tests, and Capybara which they say is for acceptance, which looks like functional or e2e tests too 😕

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur, unless I really understand both the system under test and the test harness & tools, as that may differ in a different project (even though there are definitions on Wikipedia, books, etc).

After these changes in Cylc, I think I will write unit tests for any small changes, that can be done without a running workflow or with minimum mocking.

If I require mocking complex objects, or have parts of the system working/running, I will probably need an integration test, using the fixtures in the integration module.

And I would write a functional test if I introduced something new in Cylc that I would like to make sure it works as a whole (even if I have parts of the new feature being tested with unit tests, maybe I could still want to have a complete end to end test of all the parts of the system?).

Is that close to the right use of the new different tests @oliver-sanders for Cylc?

Cheers
Bruno

p.s.: ISTQB has a glossary for testing, it used to be popular some 10 years ago, there are probably over 20 different types of testing 👀

kinow · 2020-06-15T23:29:39Z

tests/integration/__init__.py

+                        SchedulerStop(StopMode.AUTO.value)
+                    )
+        except asyncio.TimeoutError:
+            # but be prepared to use the nuclear option


kinow · 2020-06-15T23:31:15Z

tests/integration/conftest.py

+@pytest.fixture(scope='session')
+def port_range():
+    ports = glbl_cfg().get(['suite servers', 'run ports'])
+    return min(ports), max(ports)


kinow · 2020-06-15T23:47:06Z

cylc/flow/scheduler.py

-                sys.exit(1)
-            self.close_logs()
+        * Initialise the network components.
+        * Initialise mangers.


s/mangers/managers

kinow · 2020-06-15T23:50:55Z

cylc/flow/scheduler_cli.py

+It is free software, you are welcome to
+redistribute it under certain conditions;
+see `COPYING' in the Cylc source distribution.
+""" % CYLC_VERSION


I think this was in the scheduler before? It makes sense to keep it here IMO 👍

kinow · 2020-06-15T23:51:22Z

cylc/flow/scheduler_cli.py

@@ -77,6 +84,7 @@
    "overrides the suite definition.")


+@lru_cache()


Sounds like a good idea!

kinow · 2020-06-15T23:52:35Z

cylc/flow/scheduler_cli.py

+
+    The Scheduler itself should be a Python object you can import and
+    run in a regular Python session so cannot contain this kind of
+    functionality.


+1 ! The code of the Scheduler class/object looks way better. I can't comment further as I am not that familiar, so will leave to others (especially @hjoliver ) to review that. And it got some types added to the class attributes 🎉 thanks!

cylc/flow/scheduler_cli.py

kinow · 2020-06-15T23:56:59Z

cylc/flow/scheduler_cli.py

+            raise exc2 from None
+        ret = 2
+    except Exception:
+        ret = 3


Was going to ask why not throw/return the exception. But then realized this is the code to be used in exit() function. Maybe add a comment in the function stating that the returned value is an int, and that it's used to provide the command execution status/exit code?
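
A possible shape for the comment suggested above, as a sketch only: `ExpectedError`, `exit_code_for` and the meaning attached to each code are hypothetical, chosen to mirror the `ret = 2` / `ret = 3` pattern in the diff rather than to describe what scheduler_cli actually does.

```python
class ExpectedError(Exception):
    """Hypothetical error type used only for this sketch."""


def exit_code_for(func) -> int:
    """Run ``func`` and translate the outcome into a process exit code.

    Returns:
        int: The value to hand to ``sys.exit``. In this sketch:
            0 if ``func`` returned normally,
            2 if it raised an anticipated error,
            3 if it raised anything else.
    """
    try:
        func()
    except ExpectedError:
        # Anticipated failure (illustrative mapping, not Cylc's).
        return 2
    except Exception:
        # Anything unanticipated.
        return 3
    return 0
```

The caller would then finish with something like `sys.exit(exit_code_for(main))`, which keeps the exit logic in the CLI layer and out of the importable object.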

oliver-sanders · 2020-06-16T08:17:38Z

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur

Yes, the distinction is subtle and I think in truth it's really more of a scale between unit and functional rather than a distinct clustering.

I've put the framework for running suites within the integration battery to help "draw a line in the sand".

Is that close to the right use of the new different tests

Yep, that's what I was thinking, bang on!

My thinking is:

Unit tests to get coverage up.
Unit/Integration tests to cover all eventualities / use cases
One functional test to make sure that core functions are intact (but preferably only one, no need to test all eventualities here)

But I'm no testing expert! Input greatly appreciated.

hjoliver · 2020-06-16T09:47:51Z

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur ...

Someone wrote a helpful article for us 😁

https://www.softwaretestinghelp.com/the-difference-between-unit-integration-and-functional-testing/

kinow · 2020-06-16T09:59:08Z

I've put the framework for running suites within the integration battery to help "draw a line in the sand".

👍 we have at least a clear definition for Cylc of what is what, and where each test should go. I can't promise I will get it right when I have to choose between the three types, but I feel confident that reading the docs and looking at existing tests I should find the right test, and eventually that will become natural 🤞

So the idea of a functional and unit test is clear for me. The idea of functional and integration is a bit blur ...

Someone wrote a helpful article for us 😁

https://www.softwaretestinghelp.com/the-difference-between-unit-integration-and-functional-testing/

Yup! I had seen this one too. And each directory has the helpful README that @oliver-sanders wrote. If anyone has problems or is confused about which one to write, I think at least someone can point to those files and also to existing tests as reference.

oliver-sanders · 2020-06-16T12:11:22Z

Yep, I think I used the diagram from that article in my proposal. I did a bit of reading, and it was quite confidence-giving to see people confirm what I've been slowly realising.

The really big bit is time to write and time to debug. We spend so much time on functional tests, I personally sunk nearly an entire week into them (less than the time it took to write this new framework) just to fix broken tests. Hopefully this will help accelerate future development, might take a while to get there though...

oliver-sanders · 2020-06-18T12:20:11Z

Rebased and deconflicted, test changes ported from #3500.

* move cli stuff into the cli
* move functional stuff out of the cli
* add an interface for creating scheduler options objects
* tidy the --format argument
* move daemonise logic to scheduler_cli
* move event loop logic to scheduler_cli
* move logging into scheduler_cli
* move start message to scheduler_cli
* store id as top-level attr

oliver-sanders · 2020-06-30T10:05:21Z

Rebased.
Deconflicted.
Fixed asyncio issues (you will need to pip install again).
Suppressed NFS issue as unrelated to this change.

oliver-sanders · 2020-06-30T10:18:39Z

@datamel and @kinow thank you for stepping in to review.

There are a couple of specific aspects which could do with standalone reviewing:

@hjoliver best make sure you're happy with the scheduler / scheduler_cli re-alignment and the general gist of the change.
@dwsutherland if you could cast your eye over the asyncio stuff in scheduler/scheduler_cli.
@dwsutherland best make sure I've not butchered the tests.

datamel · 2020-07-01T12:21:16Z

pytest.ini

    --ignore=cylc/flow/data_messages_pb2.py
 testpaths =
-    cylc/flow/
-    tests/lib/python/
+    tests/unit/


Suggested change

-    tests/unit/
+    tests/unit/
+    tests/integration/

Adding this path enables integration tests to be picked up correctly by visual studio code for easy debugging. I'm happy for this change not to go in (as it relates to my debugging process) but may help others too. Without it I'm not 100% sure why pytest was picking up the integration tests...hmmm.

I purposefully left the integration tests out of testpaths so that when you run pytest it defaults to running tests/unit. If there's another way to define the default test set I'm game

Ah, no worries :)

oliver-sanders · 2020-07-02T08:55:38Z

The last commit should fix the hanging functional test. Hopefully this is the last time I have to mess with asyncio in this PR!

Even though it's only a few lines getting the asyncio code right has been a lot of work, the integration test framework only took one day. Here are some lessons learned:

I no longer believe in magic, but I now believe in ghosts.
Async generators must be manually shut down; you might not use async generators in your code but that doesn't mean they aren't there. If you use asyncio.run it creates and closes an event loop for you, tidying up async gens correctly. Look at the source code for asyncio.run, it's not actually doing much under the hood.
If sys.exit hangs it means you've got threads (which have sleep statements) that haven't been shut down correctly. Keep track of threads closely and ensure they are always shut down. If os._exit fixes it then this is probably the issue.
Daemonisation is a hard barrier that event loops cannot persist through. This is totally expected but it means you need to tidy up before daemonisation, else the original process won't be able to exit.
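
The async generator lesson can be sketched with the standard library alone. The names below (`ticker`, `STILL_OPEN`, `run_manually`) are made up for the illustration, and `run_manually` is a simplified version of the tidy-up that `asyncio.run` performs, not a copy of it. The key call is `loop.shutdown_asyncgens()`, which closes any async generators the loop has seen that are still open.

```python
import asyncio

CLEANED_UP = []
STILL_OPEN = []


async def ticker():
    """An async generator whose cleanup must run at shutdown."""
    n = 0
    try:
        while True:
            yield n
            n += 1
    finally:
        CLEANED_UP.append('ticker')


async def main():
    gen = ticker()
    # Keep a reference so the generator outlives main() un-closed,
    # much as one buried inside a library might.
    STILL_OPEN.append(gen)
    return await gen.__anext__()


def run_manually(coro):
    """Run a coroutine on a fresh loop, then tidy up before closing it."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        try:
            # Close any async generators that are still open.
            loop.run_until_complete(loop.shutdown_asyncgens())
        finally:
            loop.close()
```

If the `shutdown_asyncgens()` step is left out, the generator's `finally` block is not run before the loop closes, which is the kind of dangling state that is hard to trace later.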

oliver-sanders · 2020-07-02T09:33:34Z

Merged and de-conflicted (tried rebase, it was horrible).

oliver-sanders · 2020-07-02T10:32:51Z

Fixed tests/f/cylc-ping/03 which was dependent on key_housekeeping being called before remrun.

hjoliver · 2020-07-03T02:45:38Z

Here are some lessons learned:...

That sounds like some stuff that I don't want to rediscover the hard way in the future. Would it be worth copying to code comments for reference (in scheduler.py?).

oliver-sanders · 2020-07-03T07:53:17Z

Yep, will do.

oliver-sanders · 2020-07-03T08:00:06Z

Deconflicted.

hjoliver · 2020-07-06T00:35:55Z

cylc/flow/scheduler.py

@@ -122,7 +124,8 @@ def __str__(self):
        return self.value


-class Scheduler(object):
+@dataclass
+class Scheduler:


cylc/flow/scheduler.py

pytest.ini

hjoliver

Impressive/massive job 👍 I scanned almost everything; tests work; I like the new integration test framework; my two comments above aren't important so I'll merge this and we can get on with the horrifying conflict resolution ...

@hjoliver best make sure you're happy with the scheduler / scheduler_cli re-alignment and the general gist of the change.

I'm happy with it - one question above indicates I don't fully understand your use of asyncio yet ... but it works fine, and I guess it doesn't matter if some asyncio tasks are still entirely blocking.

@dwsutherland if you could cast your eye over the asyncio stuff in scheduler/scheduler_cli.

@dwsutherland best make sure I've not butchered the tests.

David is on leave for a week and I think he can come back to this post merge. The first bullet point (scheduler/_cli) seems simple enough (my question above notwithstanding) and the second is "just tests".

oliver-sanders added this to the cylc-8.0a3 milestone Jun 13, 2020

oliver-sanders self-assigned this Jun 13, 2020

oliver-sanders force-pushed the scheduler-options branch from 46c84cc to 0e41d68 Compare June 15, 2020 09:29

oliver-sanders commented Jun 15, 2020

datamel reviewed Jun 15, 2020

cylc/flow/hostuserutil.py (outdated, resolved)

kinow approved these changes Jun 16, 2020

oliver-sanders force-pushed the scheduler-options branch from 0e41d68 to 6c79c30 Compare June 18, 2020 12:19

kinow approved these changes Jun 18, 2020

oliver-sanders added 17 commits June 19, 2020 16:16

pytest: add pytest-xdist to developer dependencies

bacfa16

tests: scheduler_cli

6dc3b86

itests: new integration test framework for cylc in python

e7ccde2

itests: meta-testing

e2cc259

tests: add readme files to explain differences

f0dc942

itests: conftest setup

526b56a

itests: add cylc.flow.network.client.SuiteRuntimeClient tests

b70f83f

temp: remove optparer2nametuple dead end

ad97f7c

itests: mutiprocessing -> asycio

dae3b07

itests: test the tests

565afec

itests: convert SuiteRuntimeClient to itest

04bb253

itests: convert Publisher to itest

3d98fce

itests: session, module and function scoping

018e355

itests: allow logging level selection

d37f046

tests: test option_parsers.Options

96bbd0b

scheduler: re-organisation of startup sequence

c298985

pycodestyle++

743c3ff

tests/i: simplify teardown

8619d62

TomekTrzeciak mentioned this pull request Jul 1, 2020

Execute user scripts in a subshell #3440

Merged

11 tasks

scheduler_cli: simplify cli teardown

714bcf1

datamel reviewed Jul 1, 2020

oliver-sanders added 2 commits July 1, 2020 17:26

pytest: exclude un-collectable test file

f8b7e54

tests/i: tidy test utilities away out of sight

9d90b44

This was referenced Jul 1, 2020

Db faker for functional tests #3591

Open

Integration tests: restart with faked task pool #3668

Closed

scheduler: shutdown logic for paritally initiated flows

9e681d7

Merge remote-tracking branch 'upstream/master' into scheduler-options…

3c29153

….merge.scheduler_key_tidy.1

tests: fix tests/f/cylc-ping/03

e865e9e

oliver-sanders mentioned this pull request Jul 2, 2020

Calling publisher coroutine from another thread #3564

Closed

oliver-sanders added 2 commits July 3, 2020 08:54

Merge remote-tracking branch 'upstream/master' into scheduler-options

425b869

scheduler_cli: add asyncio notes

a19966f

hjoliver reviewed Jul 6, 2020

hjoliver approved these changes Jul 6, 2020

hjoliver merged commit e458603 into cylc:master Jul 6, 2020

This was referenced Jul 6, 2020

pytest: remove unnecessarily ignored files #3677

Merged

tests/i: remove test skips #3678

Closed

hjoliver modified the milestones: cylc-8.0a3, cylc-8.0b0 Feb 25, 2021
