Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test vectors #37

Merged
merged 1 commit into from
Jan 21, 2025
Merged

Test vectors #37

merged 1 commit into from
Jan 21, 2025

Conversation

jurraca
Copy link
Contributor

@jurraca jurraca commented Nov 18, 2024

Stacked on #32

Two commits:

  • first moves all util-style modules to their own folder/namespace
  • second adds test vectors/fixtures to test network merging from. Also adds an init.py to the tests folder so pytest can resolve modules.

Let me know if this is not ok, I struggled with Python's imports logic a bit.


MAX_ASN = 33521664

def generate_ip(ip_type="v4", subnet_size="16"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about having these test specific util functions under a util folder in tests, similar to bitcoin core: https://github.com/bitcoin/bitcoin/tree/master/test? If we don't need these in the normal code, I would just put them in the test space and let them be shared there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jurraca When moving the file from the normal util to tests/util you could also get rid of the additional util folder and avoid the util/util namespace. I think we only need a util folder when the split up the util file which would then create namespaces like util/foo, util/bar etc. If you want to split it up now that would be also fine with me but I don't think it's needed yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I misunderstood.

@jurraca jurraca force-pushed the test-vectors branch 2 times, most recently from 16e0d8a to cd535cf Compare December 3, 2024 12:39
@jurraca jurraca force-pushed the test-vectors branch 2 times, most recently from a542492 to 04953b8 Compare December 30, 2024 12:56
@jurraca
Copy link
Contributor Author

jurraca commented Dec 30, 2024

Cleaned this up a bit.
I reverted the change to format_pfx from 97f0fbe , since it just removed the leading zero of private networks and resulted in invalid IP networks ( 0.128.0.0/16 -> .128.0.0/15). The follow ups to this will be changing how we check that the prefix we ingest is invalid vs valid but poorly formatted in the parsing logic.

edit: in fact, this 04953b8 commit is a bug fix -- tests are broken. I can split it out on its own if that's better. lmk.

Copy link
Collaborator

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test vectors part looks good at first pass but it seems to me there is something missing in the pfx part (or I am just not getting it ;)

formatted_pfx = str(ipaddress.ip_network(pfx))
return f"{formatted_pfx}"
return str(ipaddress.ip_address(pfx))
except ValueError:
return pfx


def is_valid_pfx(pfx):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this function is only used in tests it should go into tests/util.py or directly into the test file if it's just used in one file. But I'm not sure if this is what was intended, the commit description makes it sound a bit like the is_valid_pfx should be used in the actual code?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is run before format_pfx then the try block could could be removed there I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, thanks that makes sense. Is there a reason we're checking ipaddress.ip_address(pfx) actually? Shouldn't we reject things that are not networks?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we're checking ipaddress.ip_address(pfx) actually?

Hm, not sure I understand the question correctly but the idea in my old code was to use str -> ipaddress -> str to standardize the formatting. Stuff like leading zeros should be handled consistently through this for example. At least that is the rationale I remember from taking a quick look.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is of course that I expected ipaddress to handle this better than I ever could by writing a custom parser. I.e. parsing to ipaddress should be able to read as many weird formats as possible and then stringifying ipaddress should always be consistent. But honestly I can't say that I have put in a lot of effort to verify that ipaddress is super solid.

Copy link
Contributor Author

@jurraca jurraca Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just meant checking ipaddress.ip_address(pfx) vs ipaddress.ip_network(pfx) -- the first checks an address, the other a network. Don't we expect only networks?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I misunderstood the question. If I remember correctly single IPs are valid entries in RPKI repos and IRR. So there is nothing wrong with them per se although i don't remember if I looked into if there are specific use cases behind it. But this looks like something where I encountered an error and then looked at the spec and then went with what the spec allowed.

kartograf/util.py Outdated Show resolved Hide resolved
@jurraca jurraca force-pushed the test-vectors branch 3 times, most recently from 0c1c703 to e21877d Compare January 10, 2025 18:45
@jurraca jurraca mentioned this pull request Jan 11, 2025
@fjahr
Copy link
Collaborator

fjahr commented Jan 12, 2025

edit: in fact, this 04953b8 commit is a bug fix -- tests are broken. I can split it out on its own if that's better. lmk.

Would have been easier for me to keep it separate. It's not an issue if you open many small PRs if you are explicit which ones you want me to look at first. As I mentioned in the other one you now, you can make the others drafts or say "this depends on X, review that one first".

@jurraca
Copy link
Contributor Author

jurraca commented Jan 14, 2025

Would have been easier for me to keep it separate.

yea, sorry about this. Separated the bug fix out to #47 and rebased this branch on that.

@fjahr
Copy link
Collaborator

fjahr commented Jan 19, 2025

Looks good to me but needs rebase first

create tests/util folder and resolve local import
Copy link
Collaborator

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, leaving some suggestion for a follow-up but merging this for now since it's a good addition either way.

l = f.readlines()
final_network_count = len(l)
expected_count = (len(base_nets) + len(extra_nets)) - (base_subnet_count + extra_subnet_count)
assert final_network_count == expected_count
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine to check the count but it leaves the possibility of uncaught bugs open if the count is offset to the correct number again. Imagine we add two networks, one should go into final and the other not (simple one valid case + one invalid case). If we don't check here which one goes in the behavior could be the opposite of what we expect (invalid in, valid out) and the test would still pass.

So as a follow-up I think it would be great if this is switched from checking the count to checking the actual content matches. That shouldn't be too bigh of a change, instead of counting in read_test_vectors, build a list of included ones and then compare the list to a readout of the final_result. I hope it works as simple as I imagine it :)

def test_merge_from_fixtures(tmp_path):
'''
Assert that general_merge merges subnets correctly.
'''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting-nit: A bit inconsistent to use ''' here and """ above

@fjahr fjahr merged commit d6f675c into asmap:master Jan 21, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants