-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test vectors #37
Test vectors #37
Conversation
kartograf/util/generate_data.py
Outdated
|
||
MAX_ASN = 33521664 | ||
|
||
def generate_ip(ip_type="v4", subnet_size="16"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about having these test specific util functions under a util folder in tests, similar to bitcoin core: https://github.com/bitcoin/bitcoin/tree/master/test? If we don't need these in the normal code, I would just put them in the test space and let them be shared there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jurraca When moving the file from the normal util to tests/util you could also get rid of the additional util folder and avoid the util/util namespace. I think we only need a util folder when the split up the util file which would then create namespaces like util/foo, util/bar etc. If you want to split it up now that would be also fine with me but I don't think it's needed yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I misunderstood.
16e0d8a
to
cd535cf
Compare
a542492
to
04953b8
Compare
Cleaned this up a bit. edit: in fact, this 04953b8 commit is a bug fix -- tests are broken. I can split it out on its own if that's better. lmk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test vectors part looks good at first pass but it seems to me there is something missing in the pfx part (or I am just not getting it ;)
kartograf/util.py
Outdated
formatted_pfx = str(ipaddress.ip_network(pfx)) | ||
return f"{formatted_pfx}" | ||
return str(ipaddress.ip_address(pfx)) | ||
except ValueError: | ||
return pfx | ||
|
||
|
||
def is_valid_pfx(pfx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function is only used in tests it should go into tests/util.py or directly into the test file if it's just used in one file. But I'm not sure if this is what was intended, the commit description makes it sound a bit like the is_valid_pfx
should be used in the actual code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is run before format_pfx
then the try block could could be removed there I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, thanks that makes sense. Is there a reason we're checking ipaddress.ip_address(pfx)
actually? Shouldn't we reject things that are not networks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we're checking ipaddress.ip_address(pfx) actually?
Hm, not sure I understand the question correctly but the idea in my old code was to use str
-> ipaddress
-> str
to standardize the formatting. Stuff like leading zeros should be handled consistently through this for example. At least that is the rationale I remember from taking a quick look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is of course that I expected ipaddress
to handle this better than I ever could by writing a custom parser. I.e. parsing to ipaddress
should be able to read as many weird formats as possible and then stringifying ipaddress
should always be consistent. But honestly I can't say that I have put in a lot of effort to verify that ipaddress
is super solid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant checking ipaddress.ip_address(pfx)
vs ipaddress.ip_network(pfx)
-- the first checks an address, the other a network. Don't we expect only networks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, I misunderstood the question. If I remember correctly single IPs are valid entries in RPKI repos and IRR. So there is nothing wrong with them per se although i don't remember if I looked into if there are specific use cases behind it. But this looks like something where I encountered an error and then looked at the spec and then went with what the spec allowed.
0c1c703
to
e21877d
Compare
Would have been easier for me to keep it separate. It's not an issue if you open many small PRs if you are explicit which ones you want me to look at first. As I mentioned in the other one you now, you can make the others drafts or say "this depends on X, review that one first". |
yea, sorry about this. Separated the bug fix out to #47 and rebased this branch on that. |
Looks good to me but needs rebase first |
create tests/util folder and resolve local import
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK, leaving some suggestion for a follow-up but merging this for now since it's a good addition either way.
l = f.readlines() | ||
final_network_count = len(l) | ||
expected_count = (len(base_nets) + len(extra_nets)) - (base_subnet_count + extra_subnet_count) | ||
assert final_network_count == expected_count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to check the count but it leaves the possibility of uncaught bugs open if the count is offset to the correct number again. Imagine we add two networks, one should go into final and the other not (simple one valid case + one invalid case). If we don't check here which one goes in the behavior could be the opposite of what we expect (invalid in, valid out) and the test would still pass.
So as a follow-up I think it would be great if this is switched from checking the count to checking the actual content matches. That shouldn't be too bigh of a change, instead of counting in read_test_vectors, build a list of included ones and then compare the list to a readout of the final_result. I hope it works as simple as I imagine it :)
def test_merge_from_fixtures(tmp_path): | ||
''' | ||
Assert that general_merge merges subnets correctly. | ||
''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
formatting-nit: A bit inconsistent to use '''
here and """
above
Stacked on #32
Two commits:
Let me know if this is not ok, I struggled with Python's imports logic a bit.