Cleanups from #30 #38

jurraca · 2024-11-20T21:23:10Z

Following up on #30:

Make dict indexed by IP version
Subnet indexes are both integers
refactor some of the logic on inclusion
rename included function to contains_row
pull out the extra file to dataframe logic into its own function
add a basic test for the BaseNetworkIndex class
address pylint stuff

fjahr

The refactorings look very good already, thanks!

But I am not sure how to execute the test and I think we could use some documentation on that. Maybe you add a small README in the tests folder to describe how to execute the tests. Here is what I get when I run pytest tests/merge_base_class_test.py from the project root dir:

_______________________________________________________________________________________________________ ERROR collecting tests/merge_base_class_test.py _______________________________________________________________________________________________________
ImportError while importing test module '/Users/XXX/projects/python/kartograf/tests/merge_base_class_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/merge_base_class_test.py:3: in <module>
    from kartograf.merge import BaseNetworkIndex
E   ModuleNotFoundError: No module named 'kartograf'

fjahr · 2024-11-24T13:23:53Z

tests/merge_base_class_test.py

+def test_base_dict_create():
+    base = BaseNetworkIndex()
+    state = base.get()
+    assert state == {4: {}, 6: {}}


nit: This is a bit pedantic but I don't think it's great testing style to expose the internals. Ideally the tests would check the internal through the interface we need and we wouldn't need a special function (get) to expose the internals. It's fine for me to keep it for now but I would consider it a code smell if we can not refactor the internals without changing the test. My alternative suggestion would be to randomly check 10 keys of IPv4 and 10 keys of the IPv6 dict via the inclusion function and ensure that no result is ever returned. For IPv4 it could even be all 255 and wouldn't take much time either.

not pedantic, thanks! I modified the test to check for one IPv4 row, the logic is equally valid for 1 or 10 rows I think. Still need to expand it to cover IPv6 though.

fjahr · 2024-11-24T13:31:25Z

tests/merge_base_class_test.py

+    state = base.get()
+    network = "10.10.0.0/16"
+    base.update(network)
+    assert state[4][10]


Maybe assert here on the actual value we expect?

And similar to what I wrote above the test could work with adding the network and then checking inclusion without looking at the internals.

jurraca · 2024-11-25T11:30:09Z

Thanks. Added pytest to dev dependencies, and the tests/__init__.py empty file it needs to find the project modules, as well as a note in the README.

fjahr · 2024-11-26T00:39:10Z

I tried to reproduce the results of the latest map in asmap-data with 8c476bc but I got this error:

--- Merging RPKI and IRR data ---

Parse base file to dictionary
Traceback (most recent call last):
  File "/Users/FJ/projects/python/kartograf/./run", line 95, in <module>
    Kartograf.map(args)
  File "/Users/FJ/projects/python/kartograf/kartograf/kartograf.py", line 83, in map
    merge_irr(context)
  File "/Users/FJ/projects/python/kartograf/kartograf/timed.py", line 10, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/FJ/projects/python/kartograf/kartograf/merge.py", line 71, in merge_irr
    general_merge(
  File "/Users/FJ/projects/python/kartograf/kartograf/merge.py", line 147, in general_merge
    base.update(pfx)
  File "/Users/FJ/projects/python/kartograf/kartograf/merge.py", line 36, in update
    root_net = int(str(pfx).split(":", maxsplit=1)[0])
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: '2a11'

fjahr · 2024-11-26T11:55:53Z

An idea for another test based on this: You could extract all the first numbers from the IPv6 space in a recent run and use that as a testing body for the IPv6 logic where IPv4 is using all numbers exhaustively (or generates something random).

jurraca · 2024-11-27T18:18:17Z

I tried to reproduce the results of the latest map in asmap-data with 8c476bc but I got this error:

🤦‍♂️ my bad, fixed

An idea for another test based on this:

The problem with randomizing inputs is if there's a problem with the test logic it results in a flaky test instead of straight failing. I was looking at a library like hypothesis which might be good for generating all the test cases for networks.

I added tests for format_pfx and get_root_network to test the handling of IP addr/network inputs, though the overall flow of how we catch invalid prefixes or network seems to need more followup work. If we test the functions that wrap ipaddress.ip_network(net) and catch errors, it should be a good start to coverage.

…r clarity

…all fixups

fjahr · 2024-12-08T17:54:08Z

The problem with randomizing inputs is if there's a problem with the test logic it results in a flaky test instead of straight failing. I was looking at a library like hypothesis which might be good for generating all the test cases for networks.

I am not suggesting to randomize any inputs. Just taking all the numbers from a recent run an make a test with these that is completely static, it would just be a pretty exhaustive fixture file that should catch regressions like the previous. The fixture file doesn't have to be updated often, maybe once every few months or just whenever someone volunteers to do it.

fjahr · 2024-12-08T17:55:40Z

tACK

All looks good to me now, I did extensive testing, reproduced multiple recent runs and did two new runs.

jurraca force-pushed the merge-cleanup branch from 1a1f6b7 to 6f7a396 Compare November 20, 2024 21:35

fjahr reviewed Nov 24, 2024

View reviewed changes

jurraca force-pushed the merge-cleanup branch 2 times, most recently from 3113502 to 8c476bc Compare November 25, 2024 11:29

jurraca force-pushed the merge-cleanup branch from 1669cd5 to d4a7323 Compare November 27, 2024 18:13

jurraca added 9 commits December 3, 2024 11:54

simplify check_inclusion result handling ; rename var and function fo…

02dd9ca

…r clarity

split dict by IP version, add basic tests for class methods

95caedc

pull out extra file dataframe creation into function, formatting & sm…

7747f71

…all fixups

add pylint to dev deps, add tests/__init__.py so pytest can find modules

ffb6c4d

add testing section to README

d0053ec

handle IPv6 in test extra_df generation

30d4bf1

handle IPv6 correctly, extract root network function

07c4004

update base_class test to handle multiple networks, add ipv6 to tests

2b8b0bc

add tests for util.py and update format_pfx logic

97f0fbe

jurraca force-pushed the merge-cleanup branch from d4a7323 to 97f0fbe Compare December 3, 2024 11:59

fjahr merged commit 7c67bb7 into asmap:master Dec 8, 2024
1 check passed

jurraca deleted the merge-cleanup branch December 9, 2024 17:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cleanups from #30 #38

Cleanups from #30 #38

jurraca commented Nov 20, 2024

fjahr left a comment

fjahr Nov 24, 2024

jurraca Nov 25, 2024 •

edited

Loading

fjahr Nov 24, 2024

fjahr Nov 24, 2024

jurraca commented Nov 25, 2024

fjahr commented Nov 26, 2024

fjahr commented Nov 26, 2024

jurraca commented Nov 27, 2024

fjahr commented Dec 8, 2024 •

edited

Loading

fjahr commented Dec 8, 2024

Cleanups from #30 #38

Cleanups from #30 #38

Conversation

jurraca commented Nov 20, 2024

fjahr left a comment

Choose a reason for hiding this comment

fjahr Nov 24, 2024

Choose a reason for hiding this comment

jurraca Nov 25, 2024 • edited Loading

Choose a reason for hiding this comment

fjahr Nov 24, 2024

Choose a reason for hiding this comment

fjahr Nov 24, 2024

Choose a reason for hiding this comment

jurraca commented Nov 25, 2024

fjahr commented Nov 26, 2024

fjahr commented Nov 26, 2024

jurraca commented Nov 27, 2024

fjahr commented Dec 8, 2024 • edited Loading

fjahr commented Dec 8, 2024

jurraca Nov 25, 2024 •

edited

Loading

fjahr commented Dec 8, 2024 •

edited

Loading