Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[frozen] pyDKB.dataflow.stage tests #196

Open
wants to merge 28 commits into
base: pyDKB
Choose a base branch
from
Open

[frozen] pyDKB.dataflow.stage tests #196

wants to merge 28 commits into from

Conversation

Evildoor
Copy link
Contributor

Tests for AbstractStage and ProcessorStage.

So far the tests are checking that the command line arguments are processed correctly.

Now the name of an argument that causes assertion failure should be
obtainable.
This includes mode dictionary creation, removal of 'mode' member
from args_to_add dictionary and repositioning of the latter for
the sake of consistency.
This way the code should be more flexible and easier to read.
Both short and long versions of defining an argument are doing
the same thing and are processed by argparse. Therefore, it is
presumed that testing both versions of defining each argument
is enough and overriding tests can be done just for one version.
Note: new version includes mode args check that was missing.
In current version the line of code is pointless because arg
is either 'source' or 'dest' which are redefined in the next few
lines. However, the code reader may not know that. Furthermore,
this line may be of use if it would be necessary to test
some other arguments' interaction with hdfs.
tearDown() and check_args() were modified to accomodate
the changes.
The argument differs from the others because its type is not
string - therefore, argparse performs some manipulations with
the supplied file, and these manipulations have to be taken
into consideration.
Argparse module throws SystemExit when something is wrong with the
supplied arguments, regardless of the problem's type. Implemented a
function to intercept such behaviour and retrieve argparse' message.
Analysis of the message is used to check that the encountered
problem is the expected one.
These arguments have a fixed sets of allowed values. The tests
are checking that the code correctly displays error information
when an unexpected argument value is supplied.
The functions now perform 'short'-related manipulations by
defining varialbes and using them in functions instead of
having two different calls of the same function.
Now the tests check that the parameter in question is a file,
rather than not None.
@Evildoor Evildoor added the pyDKB label Jan 25, 2019
@Evildoor Evildoor self-assigned this Jan 25, 2019
@@ -170,6 +170,24 @@ def f(self):
setattr(ProcessorStageArgsTestCase, 'test_%s_%s' % (arg, val), f)


def add_arg_incorrect(arg, short=False):
if short:
val = 'i'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be better to use here 'incorrect' as value instead of 'i'?
I have mentioned it earlier, but the comment was buried after push --force :) -- values for the parameters being tested are more likely to be one-char length, so using the longer word we are almost guaranteed from hitting a new valid value by chance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -55,81 +55,74 @@ class ProcessorStageArgsTestCase(unittest.TestCase):

def setUp(self):
self.stage = pyDKB.dataflow.stage.ProcessorStage()
self.args = dict(self.default_args)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name it expected_args (or exp_args)? Longer, right, but more readable... it`s just now it makes me remid myself every now and then how it is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@mgolosova
Copy link
Collaborator

- Replace stage.parse_args() with stage.configure().
- Update tests to set input_dir if source is either stream or hdfs.
- Add an input file name into most tests - 'file' is now a default source, and
  it demands an input file. Exceptions: test_i/test_input_dir - all files in
  the input dir will be taken, so input file is not mandatory; all tests
  including --hdfs; test_default - its role is to simulate an execution with
  no command line arguments.
- Change test_default() to catch the error about missing input file.
- Change add_override_mode() - file source/destination is not allowed in
  map-reduce mode, meaning that such combination raises an error that must be
  caught.
- Define stage.configure() arguments in list args in some tests to follow
  the 80 symbols line length restriction without stretching a single code line
  into 3+ lines.
If EOM is set to empty string, EOP will be set to '\n', unless overridden.
It is possible that value 'i' will become valid in the future, while
'incorrect' is much 'safer' in this regard - all of the values are
single letters.
@Evildoor
Copy link
Contributor Author

And maybe there should be two tests (with raw and interpreted strings passed); I dont know what can go wrong, but it doesnt mean it can`t :)

Done.

This combination:

args['hdfs'] = True
args['source'] = 'h'
args['dest'] = 'h'

appears here and there; maybe a shourtcut can be used (as with mode-dependent parameters)? Kind of:

hdfs_args = {'hdfs': True, 'source': 'h', 'dest': 'h'}
<...>
self.stage.parse_args(['--hdfs'])
self.args.update(hdfs_args)

Done.

@Evildoor
Copy link
Contributor Author

So, unless there are objections or comments, then I'll split the file into two, simplify the case-processing mechanism, set a fixed name for a test case, and change the "how-to:tests" accordingly.

Done.

@mgolosova, all comments seem to be addressed. I believe that we should follow your suggestion in trello - merge this pull request into pyDKB (if no further changes are required, of course) before proceeding with further tests.

@Evildoor Evildoor changed the title [WIP] pyDKB.dataflow.stage tests pyDKB.dataflow.stage tests Apr 30, 2019
@mgolosova
Copy link
Collaborator

@Evildoor, I am sorry for putting this PR aside for such a long time.

Tried to restore what was going on here, but when I try to run tests I see things like this:

$ git fetch
$ git checkout origin/pyDKB-tests
HEAD is now at 7dad261... Divide test_ProcessorStage.py per how-to: tests.
$ cd pyDKB/dataflow/stage/
$ python -m unittest discover  -t ../../../
<...>
ERROR: test_override_dest_f_mode_m (pyDKB.dataflow.stage.tests.ProcessorStage.test_ProcessorStage.ProcessorStageArgsTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/dkb/dkb-dev.git/Utils/Dataflow/pyDKB/dataflow/stage/tests/ProcessorStage/test_ProcessorStage.py", line 222, in f
    self.stage.parse_args(['--' + arg, val, '--mode', mode_val])
  File "/home/dkb/dkb-dev.git/Utils/Dataflow/pyDKB/dataflow/stage/AbstractStage.py", line 204, in parse_args
    sys.exit(1)
SystemExit: 1
<...>

I don't think it is how the test supposed to react. Have I missed something, maybe tests should be run with some additional parameters?

The part with sys.exit(1) came here with merge from pyDKB branch (c81a096), so it was not here when tests were created -- but now it seems like the tests code requires some update to handle the exit in case of wrong arguments.


I believe this PR is still actuall since untit tests for the library would be a Good Thing, but for now I can't review it.

Copy link
Collaborator

@mgolosova mgolosova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests seem not to be fully adapted to the pyDKB version that was merged here.
There were no global changes in the pyDKB.dataflow.stage after this PR, so I believe it won't be harder to adapt tests to the latest version (0.3.20190916), than to the one merged here in c81a096.

@Evildoor Evildoor changed the title pyDKB.dataflow.stage tests [frozen] pyDKB.dataflow.stage tests Jul 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants