Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extractor: Add list based lookups #619

Merged
merged 2 commits into from
Dec 25, 2024
Merged

extractor: Add list based lookups #619

merged 2 commits into from
Dec 25, 2024

Conversation

aditya-nambiar
Copy link
Contributor

@aditya-nambiar aditya-nambiar commented Dec 25, 2024

Important

Add LIST_LOOKUP type to extractors for list-based lookups, update query engine, and add tests.

  • Behavior:
    • Add LIST_LOOKUP type to ExtractorType in featureset_pb2.py and featureset_pb2.pyi.
    • Update Extractor class in featureset.py to support LIST_LOOKUP.
    • Modify _get_generated_extractors and _validate in featureset.py to handle list-based features.
  • Query Engine:
    • Implement _compute_list_lookup_extractor in query_engine.py to handle list-based lookups.
    • Update run_extractors in query_engine.py to process LIST_LOOKUP extractors.
  • Testing:
    • Add tests for list-based lookups in test_featureset.py and test_invalid_featureset.py.
    • Update existing tests in test_expr.py and test_complex_autogen_extractor.py to reflect changes.
  • Miscellaneous:
    • Update pyproject.toml to include attrs dependency.

This description was created by Ellipsis for bb924f2. It will automatically update as commits are pushed.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 22e839f in 1 minute and 30 seconds

More details
  • Looked at 779 lines of code in 12 files
  • Skipped 1 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. fennel/testing/query_engine.py:470
  • Draft comment:
    Remove print statements used for debugging purposes. Consider using a logging framework if needed for debugging.
  • Reason this comment was not posted:
    Marked as duplicate.
2. fennel/testing/query_engine.py:471
  • Draft comment:
    Remove print statements used for debugging purposes. Consider using a logging framework if needed for debugging.
  • Reason this comment was not posted:
    Marked as duplicate.
3. fennel/featuresets/featureset.py:704
  • Draft comment:
    Remove print statements used for debugging purposes. Consider using a logging framework if needed for debugging.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_rEmp52xpSvgtHjqx


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

input_feature.name
].explode()

print(repeated_timestamps)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove print statements used for debugging purposes. Consider using a logging framework if needed for debugging.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 0055a20 in 41 seconds

More details
  • Looked at 830 lines of code in 13 files
  • Skipped 1 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. fennel/internal_lib/to_proto/to_proto.py:727
  • Draft comment:
    Simplify the extractor type check by using a set for comparison.
    if extractor.extractor_type in {ExtractorType.LOOKUP, ExtractorType.LIST_LOOKUP}:
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in fennel/internal_lib/to_proto/to_proto.py has a repeated logic for checking the extractor type. This can be simplified by using a set for comparison.

Workflow ID: wflow_DXCh8OltvKXMNLKD


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 9f87486 in 44 seconds

More details
  • Looked at 833 lines of code in 13 files
  • Skipped 1 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. fennel/client_tests/test_complex_autogen_extractor.py:349
  • Draft comment:
    Consider using relativedelta from the dateutil library to calculate age_years more clearly and accurately.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in fennel/client_tests/test_complex_autogen_extractor.py uses a manual calculation for age_years that could be simplified using a library function. This would improve readability and maintainability.
2. fennel/featuresets/featureset.py:702
  • Draft comment:
    Good use of get_origin to check for list types. This ensures proper handling of generic types.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The code in fennel/featuresets/featureset.py uses get_origin to check for list types. This is a good practice to ensure type checking for generic types.
3. fennel/testing/query_engine.py:493
  • Draft comment:
    Consider refactoring the repeated pattern of setting dataset_lookup to improve code clarity and reduce redundancy.
  • Reason this comment was not posted:
    Confidence changes required: 30%
    The code in fennel/testing/query_engine.py has a repeated pattern for handling dataset_lookup which could be refactored for clarity.
4. fennel/testing/query_engine.py:487
  • Draft comment:
    Good use of fillna for handling default values in dataframes. This is a standard practice for managing missing data.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The code in fennel/testing/query_engine.py uses fillna for handling default values, which is a good practice for handling missing data.

Workflow ID: wflow_ZYreki6olbI58Tb8


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on ff27874 in 43 seconds

More details
  • Looked at 77 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. fennel/testing/mock_client.py:530
  • Draft comment:
    Remove the print statement in the exception handling block. Use logging if necessary.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    This is a testing/mock client file used for testing purposes. The print statement is added in an error case to help with debugging by showing the problematic dataframe before raising the exception. While using logging would be more standard, in a testing context having print statements for debugging is quite reasonable. The print provides immediately visible debug info that could be helpful during test failures.
    The comment makes a valid point that logging is generally preferred over print statements in production code. Print statements can be problematic if output needs to be captured or redirected.
    However, this is test code where print statements are commonly used for debugging visibility. The print statement serves a clear debugging purpose by showing the problematic dataframe before the exception.
    The comment should be deleted. While it makes a technically valid point, using print for debugging in test code is reasonable and the print statement serves a clear purpose here.

Workflow ID: wflow_gMizF6igq6OgoGvD


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on bb924f2 in 23 seconds

More details
  • Looked at 75 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. fennel/testing/mock_client.py:530
  • Draft comment:
    Remove the print statement inside the exception handling block to clean up the code.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The print statement in the exception handling block is not necessary for production code. It can be removed to clean up the code.

Workflow ID: wflow_BGqUV3FajGLQSBFW


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@aditya-nambiar aditya-nambiar merged commit 2ef4d63 into main Dec 25, 2024
8 checks passed
@aditya-nambiar aditya-nambiar deleted the aditya/list-lookup branch December 25, 2024 05:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants