the big date extract rewrite #34
base: dev
get_ordinal -> get_date_ordinal test exceptions fix other lang tests (fallback parser) exception error mesage more tests fixes eras fix more eras get_ordinal unittests before present negative eras - count years backwards from {reference_date} add new_year + valentines_day holidays fix week parsing (weeks start at sundays) fix fallback parser unittests - get_XXX_range datetime fallback fix upcoming {X} strict parser fallback - extract_time parser fallback - anchorDate param parser fallback fallback parser tests strict parser fallback - extract_datetime more date resolutions - Enum only strict parser fallback DatetimeResolution eras as reference_dates not absolute_dates named eras version pinning active_location location based holidays test location based holidays "easter" "mid summer" named holidays support timedelta fallback test relative delta extract_time in follow up PR duplicate rebase use average number of days in 1 year/month for duration calculations handle "the" utterance remainder comments + pep8 better year parsing duplicate code BC vs infinity error message comments / readability add location_markers for location parsing handle "standalone" {season} remove duplicate code boiler plate cleanup more tests location aware seasons extract_date start typo extract_calendar_duration handle relativedelta constraints relativedelta default add "replace_token" more tests more tests more tests more resolutions, + tests DurationResolution average number of days in month/year revert brain fart use weeks instead of days long duration
@coderabbitai review
Actions performed
Review triggered.
Walkthrough
The changes introduce new functionalities and data structures for improved handling of dates, times, locations, and seasons across multiple languages. This includes adding various enums, new functions for extracting and parsing dates, location management, and corresponding unit tests to ensure the accuracy of these enhancements.
Sequence Diagram(s)
sequenceDiagram
participant User
participant lingua_franca.location
participant lingua_franca.parse
User->>lingua_franca.location: set_active_location(latitude, longitude)
User->>lingua_franca.parse: extract_date(text, lang, location)
lingua_franca.parse->>lingua_franca.location: get_active_location()
lingua_franca.location->>lingua_franca.parse: return active_location
lingua_franca.parse->>User: return parsed_date
Actionable comments posted: 29
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (13)
- lingua_franca/lang/common_data_en.py (8 hunks)
- lingua_franca/lang/parse_common.py (2 hunks)
- lingua_franca/lang/parse_en.py (7 hunks)
- lingua_franca/location.py (1 hunks)
- lingua_franca/parse.py (5 hunks)
- lingua_franca/time.py (2 hunks)
- requirements.txt (1 hunks)
- test/test_time_utils.py (1 hunks)
- test/unittests/test_parse_en.py (4 hunks)
- test/unittests/test_parse_es.py (2 hunks)
- test/unittests/test_parse_fr.py (3 hunks)
- test/unittests/test_parse_it.py (2 hunks)
- test/unittests/test_parse_pt.py (2 hunks)
Files not summarized due to errors (2)
- lingua_franca/lang/parse_en.py: Error: Message exceeds token limit
- test/unittests/test_parse_en.py: Error: Message exceeds token limit
Files skipped from review due to trivial changes (1)
- requirements.txt
Additional context used
Ruff
test/unittests/test_parse_es.py
16-16: datetime.time imported but unused. Remove unused import: datetime.time (F401)
268-268: Undefined name extract_date (F821)
273-273: Undefined name extract_date (F821)
test/unittests/test_parse_pt.py
23-23: Redefinition of unused get_gender from line 21. Remove definition: get_gender (F811)
24-24: Redefinition of unused extract_datetime from line 21. Remove definition: extract_datetime (F811)
26-26: Redefinition of unused extract_number from line 21. Remove definition: extract_number (F811)
27-27: Redefinition of unused normalize from line 21. Remove definition: normalize (F811)
lingua_franca/parse.py
28-28: lingua_franca.location.get_active_hemisphere imported but unused. Remove unused import: lingua_franca.location.get_active_hemisphere (F401)
31-31: dateparser imported but unused. Remove unused import: dateparser (F401)
216-216: Undefined name lang_code (F821)
362-365: Use ternary operator hemisphere = Hemisphere.SOUTH if lat < 0 else Hemisphere.NORTH instead of if-else-block. Replace if-else-block with hemisphere = Hemisphere.SOUTH if lat < 0 else Hemisphere.NORTH (SIM108)
368-368: Undefined name extract_date_en (F821)
374-374: Undefined name _log_unsupported_language (F821)
388-388: Local variable remainder is assigned to but never used. Remove assignment to unused variable remainder (F841)
436-436: Local variable code is assigned to but never used. Remove assignment to unused variable code (F841)
441-441: Undefined name _log_unsupported_language (F821)
451-451: Local variable remainder is assigned to but never used. Remove assignment to unused variable remainder (F841)
463-466: Use ternary operator year = anchor_date.year if anchor_date else now_local().year instead of if-else-block. Replace if-else-block with year = anchor_date.year if anchor_date else now_local().year (SIM108)
476-476: Undefined name get_named_dates_en (F821)
479-479: Undefined name _log_unsupported_language (F821)
497-497: Undefined name get_named_eras_en (F821)
498-498: Undefined name date (F821)
lingua_franca/time.py
20-20: Redefinition of unused datetime from line 16. Remove definition: datetime (F811)
132-132: Multiple isinstance calls for month, merge into a single call. Merge isinstance calls for month (SIM101)
145-145: Multiple isinstance calls for weekday, merge into a single call. Merge isinstance calls for weekday (SIM101)
test/test_time_utils.py
21-21: lingua_franca.time.get_week_number imported but unused. Remove unused import: lingua_franca.time.get_week_number (F401)
test/unittests/test_parse_it.py
17-17: datetime.date imported but unused. Remove unused import (F401)
17-17: datetime.timedelta imported but unused. Remove unused import (F401)
lingua_franca/lang/parse_en.py
33-33: Redefinition of unused timedelta from line 18. Remove definition: timedelta (F811)
33-33: Redefinition of unused datetime from line 18. Remove definition: datetime (F811)
33-33: Redefinition of unused time from line 18. Remove definition: time (F811)
35-35: Redefinition of unused invert_dict from line 30. Remove definition: invert_dict (F811)
36-36: Redefinition of unused ReplaceableNumber from line 30. Remove definition: ReplaceableNumber (F811)
36-36: Redefinition of unused partition_list from line 30. Remove definition: partition_list (F811)
36-36: Redefinition of unused tokenize from line 30. Remove definition: tokenize (F811)
36-36: Redefinition of unused Token from line 30. Remove definition: Token (F811)
36-36: Redefinition of unused Normalizer from line 30. Remove definition: Normalizer (F811)
37-37: Redefinition of unused is_numeric from line 29. Remove definition: is_numeric (F811)
37-37: Redefinition of unused look_for_fractions from line 29. Remove definition: look_for_fractions (F811)
38-38: Redefinition of unused _ARTICLES_EN from line 23. Remove definition: _ARTICLES_EN (F811)
38-38: lingua_franca.lang.common_data_en._NUM_STRING_EN imported but unused. Remove unused import: lingua_franca.lang.common_data_en._NUM_STRING_EN (F401)
39-39: Redefinition of unused _LONG_ORDINAL_EN from line 23. Remove definition: _LONG_ORDINAL_EN (F811)
39-39: Redefinition of unused _LONG_SCALE_EN from line 23. Remove definition: _LONG_SCALE_EN (F811)
39-39: Redefinition of unused _SHORT_SCALE_EN from line 23. Remove definition: _SHORT_SCALE_EN (F811)
39-39: Redefinition of unused _SHORT_ORDINAL_EN from line 24. Remove definition: _SHORT_ORDINAL_EN (F811)
42-42: Redefinition of unused re from line 17. Remove definition: re (F811)
43-43: Redefinition of unused json from line 16. Remove definition: json (F811)
45-45: Redefinition of unused resolve_resource_file from line 22 (F811)
50-50: Redefinition of unused now_local from line 31. Remove definition: now_local (F811)
52-52: lingua_franca.location.get_active_location imported but unused. Remove unused import: lingua_franca.location.get_active_location (F401)
2010-2010: Loop control variable dt not used within loop body. Rename unused dt to _dt (B007)
2018-2018: Loop control variable dt not used within loop body. Rename unused dt to _dt (B007)
2025-2025: Loop control variable dt not used within loop body. Rename unused dt to _dt (B007)
2709-2709: Local variable wordPrevPrev is assigned to but never used. Remove assignment to unused variable wordPrevPrev (F841)
2714-2714: Local variable wordNextNextNext is assigned to but never used. Remove assignment to unused variable wordNextNextNext (F841)
2757-2760: Use ternary operator _delta = 7 - _w + int_week if int_week < _w else int_week - _w instead of if-else-block. Replace if-else-block with _delta = 7 - _w + int_week if int_week < _w else int_week - _w (SIM108)
3295-3298: Use ternary operator method = 1 if _year.year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if _year.year < 1583 else 3 (SIM108)
3308-3311: Use ternary operator method = 1 if _year.year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if _year.year < 1583 else 3 (SIM108)
3318-3321: Use ternary operator method = 1 if _year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if _year < 1583 else 3 (SIM108)
3328-3331: Use ternary operator method = 1 if ref_date.year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if ref_date.year < 1583 else 3 (SIM108)
3516-3523: Use ternary operator _year = _base + _year if _delta > _year else _base - 100 + _year instead of if-else-block (SIM108)
test/unittests/test_parse_en.py
35-35: Redefinition of unused DAYS_IN_1_YEAR from line 31. Remove definition: DAYS_IN_1_YEAR (F811)
35-35: Redefinition of unused DAYS_IN_1_MONTH from line 31. Remove definition: DAYS_IN_1_MONTH (F811)
39-39: Redefinition of unused datetime from line 17. Remove definition: datetime (F811)
39-39: Redefinition of unused timedelta from line 17. Remove definition: timedelta (F811)
2946-2946: simple_NER imported but unused; consider using importlib.util.find_spec to test for availability (F401)
Additional comments not posted (36)
lingua_franca/location.py (4)
22-25: Enum Hemisphere implementation looks good. The enum is correctly implemented with two values: NORTH and SOUTH.
27-33: Function get_active_location implementation looks good. The function correctly returns the default latitude and longitude.
55-62: Function get_active_location_code implementation looks good. The function correctly returns the default location ISO code.
17-19: Global variables initialization looks good. The global variables __default_location, __latitude, and __longitude are correctly initialized.
lingua_franca/lang/common_data_en.py (7)
17-20: Imports look good. The new imports are necessary for the new functionality and correctly implemented.
295-303: Dictionary _WEEKDAY_EN implementation looks good. The dictionary correctly maps integer keys to weekday names.
305-318: Dictionary _MONTH_EN implementation looks good. The dictionary correctly maps integer keys to month names.
320-328: Dictionary _WEEKDAY_SHORT_EN implementation looks good. The dictionary correctly maps integer keys to short weekday names.
330-343: Dictionary _MONTH_SHORT_EN implementation looks good. The dictionary correctly maps integer keys to short month names.
345-348: Dictionary _HEMISPHERES_EN implementation looks good. The dictionary correctly maps Hemisphere enum values to lists of corresponding names.
350-355: Dictionary _SEASONS_EN implementation looks good. The dictionary correctly maps Season enum values to lists of corresponding names.
test/unittests/test_parse_pt.py (1)
320-338: Test method test_fallback_parser implementation looks good. The test method correctly checks the date extraction functionality in Portuguese.
lingua_franca/parse.py (2)
Line range hint 126-150: Ensure correct integration of new parameters. The new parameters resolution and replace_token are correctly integrated. Ensure that the function calls are updated to include these parameters.
102-123: Add docstring details and verify function usage. The function extract_calendar_duration wraps extract_duration with a specific resolution. Ensure that the docstring provides enough detail and verify the function usage in the codebase. However, ensure that all function calls to extract_calendar_duration are correctly implemented.
Verification successful
No issues found with the function usage.
The function extract_calendar_duration is not called anywhere else in the codebase, so there are no existing usages to verify against the new implementation.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify all function calls to `extract_calendar_duration` match the new implementation.
# Test: Search for the function usage. Expect: Only occurances of the new implementation.
rg --type python -A 5 $'extract_calendar_duration'
Length of output: 80
Script:
#!/bin/bash
# Description: Verify all function calls to `extract_calendar_duration` match the new implementation.
# Test: Search for the function usage. Expect: Only occurrences of the new implementation.
rg --glob '*.py' -A 5 $'extract_calendar_duration'
Length of output: 362
lingua_franca/time.py (6)
117-122: LGTM! The function int_to_month is correctly implemented.
124-129: LGTM! The function int_to_weekday is correctly implemented.
158-203: LGTM! The functions for calculating date ranges are correctly implemented.
205-475: LGTM! The function get_date_ordinal is correctly implemented.
477-650: LGTM! The functions for handling seasons are correctly implemented.
652-654: LGTM! The function get_week_number is correctly implemented.
test/unittests/test_parse_fr.py (3)
Line range hint 17-28: LGTM! The imports and the new test class TestExtractDate are correctly implemented.
30-42: LGTM! The test method test_fallback_parser is correctly implemented and covers necessary cases.
30-42: LGTM! The modifications to the existing test cases are correctly implemented.
test/test_time_utils.py (2)
27-77: LGTM! The test cases in the TestRanges class are comprehensive and well-structured.
451-519: LGTM! The test cases in the TestHemisphere class are comprehensive and well-structured.
lingua_franca/lang/parse_common.py (3)
466-484: LGTM! The DurationResolution Enum is well-defined and each constant uniquely represents a specific time resolution.
486-953: LGTM! The DateTimeResolution Enum is comprehensive and well-defined, with each constant uniquely representing a specific date or time resolution.
956-960: LGTM! The Season Enum is straightforward and well-defined, with each constant uniquely representing a season.
test/unittests/test_parse_it.py (2)
25-26: LGTM! The now_local and extract_date imports are necessary for the new test method.
40-47: LGTM! The new test method test_fallback_parser is well-structured and necessary for testing fallback behavior to the dateparser module.
lingua_franca/lang/parse_en.py (6)
222-227: LGTM! The changes to _extract_fraction_with_text_en are correct.
272-277: LGTM! The changes to _extract_decimal_with_text_en are correct.
Line range hint 586-836: LGTM! The changes to extract_duration_en are correct.
1921-1943: LGTM! The changes to get_named_dates_en are correct.
1946-1959: LGTM! The changes to get_named_eras_en are correct.
2034-3560: LGTM! The changes to extract_date_en are correct.
Tools
Ruff
2709-2709: Local variable wordPrevPrev is assigned to but never used. Remove assignment to unused variable wordPrevPrev (F841)
2714-2714: Local variable wordNextNextNext is assigned to but never used. Remove assignment to unused variable wordNextNextNext (F841)
2757-2760: Use ternary operator _delta = 7 - _w + int_week if int_week < _w else int_week - _w instead of if-else-block. Replace if-else-block with _delta = 7 - _w + int_week if int_week < _w else int_week - _w (SIM108)
3295-3298: Use ternary operator method = 1 if _year.year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if _year.year < 1583 else 3 (SIM108)
3308-3311: Use ternary operator method = 1 if _year.year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if _year.year < 1583 else 3 (SIM108)
3318-3321: Use ternary operator method = 1 if _year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if _year < 1583 else 3 (SIM108)
3328-3331: Use ternary operator method = 1 if ref_date.year < 1583 else 3 instead of if-else-block. Replace if-else-block with method = 1 if ref_date.year < 1583 else 3 (SIM108)
3516-3523: Use ternary operator _year = _base + _year if _delta > _year else _base - 100 + _year instead of if-else-block (SIM108)
from datetime import datetime, time, date, timedelta
from dateutil.relativedelta import relativedelta
Remove unused imports.
The date and timedelta imports are unused and should be removed to clean up the code.
- from datetime import datetime, time, date, timedelta
+ from datetime import datetime, time
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from datetime import datetime, time
from dateutil.relativedelta import relativedelta
Tools
Ruff
17-17: datetime.date imported but unused. Remove unused import (F401)
17-17: datetime.timedelta imported but unused. Remove unused import (F401)
@@ -28,6 +28,15 @@
from lingua_franca.time import default_timezone, to_local
from lingua_franca.parse import extract_langcode
from lingua_franca.parse import yes_or_no
from lingua_franca.time import DAYS_IN_1_YEAR, DAYS_IN_1_MONTH
Remove redefinition of DAYS_IN_1_YEAR and DAYS_IN_1_MONTH.
Both DAYS_IN_1_YEAR and DAYS_IN_1_MONTH are imported on this line and redefined on line 35. Only one of the two imports is needed, so this line can be dropped:
- from lingua_franca.time import DAYS_IN_1_YEAR, DAYS_IN_1_MONTH
from lingua_franca.lang.parse_common import DateTimeResolution, Season
from lingua_franca.location import Hemisphere
from lingua_franca.lang.parse_en import extract_date_en
from datetime import date, datetime, timedelta
Remove redefinition of datetime and timedelta.
The names datetime and timedelta are redefined on line 39.
- from datetime import date, datetime, timedelta
+ from datetime import date
Committable suggestion
from datetime import date
Tools
Ruff
39-39: Redefinition of unused datetime from line 17. Remove definition: datetime (F811)
39-39: Redefinition of unused timedelta from line 17. Remove definition: timedelta (F811)
Season.SPRING)

try:
    import simple_NER
Remove unused import simple_NER.
The simple_NER module is imported but not used.
- import simple_NER
Committable suggestion
Delete the line: import simple_NER
Tools
Ruff
2946-2946: simple_NER imported but unused; consider using importlib.util.find_spec to test for availability (F401)
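Ruff's hint about importlib.util.find_spec can be sketched as follows. This is a minimal illustration rather than code from the PR: the helper name module_available, the HAS_SIMPLE_NER flag and the test class are assumptions; only importlib.util.find_spec and unittest.skipUnless are real standard-library calls.

```python
import importlib.util
import unittest


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


# Hypothetical flag; the test file under review uses a try/except import instead.
HAS_SIMPLE_NER = module_available("simple_NER")


class TestOptionalDependency(unittest.TestCase):
    @unittest.skipUnless(HAS_SIMPLE_NER, "simple_NER is not installed")
    def test_requires_simple_ner(self):
        # Imported only when the test actually runs, so the name is used.
        import simple_NER
        self.assertIsNotNone(simple_NER)
```

Checking availability this way avoids an import that exists only for its side effect, which is what triggers F401.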
class TestOrdinals(unittest.TestCase):
    def test_first_day(self):
        ref_date = date(day=27, month=2, year=4567)
        week_start, week_end = get_week_range(ref_date)
        weekend_start, weekend_end = get_weekend_range(ref_date)

        self.assertEqual(get_date_ordinal(1, ref_date,
                                          DateTimeResolution.DAY),
                         date(day=1, month=1, year=1))
        # TODO not implemented yet
        # self.assertEqual(get_ordinal(1, ref_date,
        #                              DateTimeResolution.DAY_OF_WEEK),
        #                  week_start)
        # self.assertEqual(get_ordinal(1, ref_date,
        #                              DateTimeResolution.DAY_OF_WEEKEND),
        #                  weekend_start)
        self.assertEqual(get_date_ordinal(1, ref_date,
                                          DateTimeResolution.DAY_OF_MONTH),
                         ref_date.replace(day=1))
        self.assertEqual(get_date_ordinal(1, ref_date,
                                          DateTimeResolution.DAY_OF_YEAR),
                         ref_date.replace(day=1, month=1))
        self.assertEqual(get_date_ordinal(1, ref_date,
                                          DateTimeResolution.DAY_OF_DECADE),
                         date(day=1, month=1, year=4560))
        self.assertEqual(get_date_ordinal(1, ref_date,
                                          DateTimeResolution.DAY_OF_CENTURY),
                         date(day=1, month=1, year=4500))
        self.assertEqual(get_date_ordinal(1, ref_date,
                                          DateTimeResolution.DAY_OF_MILLENNIUM),
                         date(day=1, month=1, year=4000))
LGTM! But implement the missing tests.
The test cases in the TestOrdinals class are comprehensive and well-structured. However, there are TODO comments indicating missing tests.
Do you want me to implement the missing tests or open a GitHub issue to track this task?
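The decade, century and millennium expectations in the test above follow from flooring the reference year to a multiple of 10, 100 or 1000. A minimal sketch of that arithmetic, assuming a hypothetical helper first_day_of_span (this is not the PR's get_date_ordinal, whose implementation is not shown on this page):

```python
from datetime import date


def first_day_of_span(ref_date: date, span_years: int) -> date:
    """Return 1 January of the span (in years) that contains ref_date.

    span_years=10 gives the decade, 100 the century, 1000 the millennium.
    The year is floored to a multiple of span_years.
    """
    start_year = (ref_date.year // span_years) * span_years
    # datetime.date has no year 0, so the first span is clamped to year 1
    # (a choice made for this sketch only).
    return date(max(start_year, 1), 1, 1)


ref = date(day=27, month=2, year=4567)
print(first_day_of_span(ref, 10))    # 4560-01-01
print(first_day_of_span(ref, 100))   # 4500-01-01
print(first_day_of_span(ref, 1000))  # 4000-01-01
```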
# hard-parse, fallback to dateparser
# this brings "free support" for many languages
print("No dates found, falling back to strict parser")
_dates = search_dates(text, languages=[lang_code],
                      settings={'RELATIVE_BASE': anchorDate,
                                'STRICT_PARSING': True})
if _dates is not None:
    # return first datetime only
    # TODO extract_datetimes
    # TODO extract_datetime_range
    date_str, extracted_date = _dates[0]
    remainder = text.replace(date_str, "")
    return extracted_date, remainder
Fix undefined variable lang_code.
The variable lang_code is used but not defined in the function.
- _dates = search_dates(text, languages=[lang_code],
+ lang_code = get_primary_lang_code(lang)
+ _dates = search_dates(text, languages=[lang_code],
Committable suggestion
# hard-parse, fallback to dateparser
# this brings "free support" for many languages
print("No dates found, falling back to strict parser")
lang_code = get_primary_lang_code(lang)
_dates = search_dates(text, languages=[lang_code],
                      settings={'RELATIVE_BASE': anchorDate,
                                'STRICT_PARSING': True})
if _dates is not None:
    # return first datetime only
    # TODO extract_datetimes
    # TODO extract_datetime_range
    date_str, extracted_date = _dates[0]
    remainder = text.replace(date_str, "")
    return extracted_date, remainder
Tools
Ruff
216-216: Undefined name lang_code (F821)
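The fallback in this comment takes the first match from the search and strips it from the input. The sketch below shows that pattern in isolation. fallback_extract and fake_search are hypothetical names, and the search callable is assumed to behave like dateparser's search_dates, which returns a list of (matched_text, datetime) tuples or None when nothing is found.

```python
from datetime import datetime
from typing import Callable, List, Optional, Tuple

SearchFn = Callable[[str], Optional[List[Tuple[str, datetime]]]]


def fallback_extract(text: str, search: SearchFn) -> Optional[Tuple[datetime, str]]:
    """Return (first_datetime, remainder), or None if the search finds nothing."""
    matches = search(text)
    if not matches:  # covers both None and an empty list
        return None
    date_str, extracted = matches[0]
    # Unlike the snippet under review, remove only the first occurrence
    # of the matched text and collapse the leftover whitespace.
    remainder = " ".join(text.replace(date_str, "", 1).split())
    return extracted, remainder


def fake_search(text: str) -> Optional[List[Tuple[str, datetime]]]:
    """Stand-in for a real date search; recognises one fixed phrase."""
    phrase = "on 30 June 2017"
    if phrase in text:
        return [(phrase, datetime(2017, 6, 30))]
    return None


print(fallback_extract("remind me on 30 June 2017 to call", fake_search))
```

Passing the search function in as a parameter keeps the sketch runnable without the dateparser dependency and makes the fallback easy to unit-test.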
@localized_function()
def extract_date(text, anchor_date=None, lang="", location=None):
    """
    Extracts date information from a sentence. Parses many of the
    common ways that humans express dates, including relative dates
    like "5 days from today", "tomorrow', and "Tuesday".

    Vague terminology are given arbitrary values, accounting for
    geographic location, like:
        - summer = XXX
        - spring = XXX
        - winter = XXX

    Args:
        text (str): the text to be interpreted
        anchor_date (:obj:`datetime`, optional): the date to be used for
            relative dating (for example, what does "tomorrow" mean?).
            Defaults to the current local date/time.
        lang (str): the BCP-47 code for the language to use, None uses default
        location (str, float, float): ISO code, lat, lon of reference
            location, used for holidays and seasons

    Returns:
        [:obj:`date`, :obj:`str`]: 'date' is the extracted date
            as a date object in the user's local timezone.
            'leftover_string' is the original phrase with all date
            related keywords stripped out. See examples for further
            clarification

            Returns 'None' if no date related text is found.

    Examples:

        >>> extract_date(
        ... "What is the weather like the day after tomorrow?",
        ... date(2017, 06, 30)
        ... )
        [datetime.date(2017, 7, 2), 'what is weather like']

        >>> extract_date(
        ... "Set up an appointment 2 weeks from Sunday",
        ... date(2016, 02, 19)
        ... )
        [datetime.datetime(2016, 3, 6), 'set up appointment']

        >>> extract_date(
        ... "Set up an appointment",
        ... date(2016, 02, 19)
        ... )
        None
    """

    lang_code = get_primary_lang_code(lang)

    if not anchor_date:
        anchor_date = now_local()

    if location is not None:
        code, lat, lon = location
    else:
        code = get_active_location_code()
        lat, lon = get_active_location()

    if lat < 0:
        hemisphere = Hemisphere.SOUTH
    else:
        hemisphere = Hemisphere.NORTH

    if lang_code == "en":
        extracted_date = extract_date_en(text, anchor_date,
                                         hemisphere=hemisphere,
                                         location_code=code)
    else:
        extracted_date = None
        # TODO: extract_date for other languages
        _log_unsupported_language(lang_code, ['en'])

    if extracted_date is None:
        # hard-parse, fallback to dateparser
        # this brings "free support" for many languages
        print("No dates found, falling back to strict parser")
        _dates = search_dates(text, languages=[lang_code],
                              settings={'RELATIVE_BASE': anchor_date,
                                        'STRICT_PARSING': True})
        if _dates:
            # return first date only
            # TODO extract_dates
            # TODO extract_date_range
            date_str, extracted_datetime = _dates[0]
            remainder = text.replace(date_str, "")
            extracted_date = extracted_datetime.date()

    return extracted_date
Fix undefined names and unused assignments.
The names extract_date_en and _log_unsupported_language are undefined in this module (F821) and need to be defined or imported. The local variable remainder is assigned but never used (F841). In the code shown here, code is passed to extract_date_en as location_code, so it is used; the F841 finding for code at 436-436 refers to a different function.
Committable suggestion was skipped due to low confidence.
Tools
Ruff
362-365: Use ternary operator hemisphere = Hemisphere.SOUTH if lat < 0 else Hemisphere.NORTH instead of if-else-block. Replace if-else-block with hemisphere = Hemisphere.SOUTH if lat < 0 else Hemisphere.NORTH (SIM108)
368-368: Undefined name extract_date_en (F821)
374-374: Undefined name _log_unsupported_language (F821)
388-388: Local variable remainder is assigned to but never used. Remove assignment to unused variable remainder (F841)
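The location handling in extract_date (an explicit location tuple, otherwise the active location, then a hemisphere chosen by the sign of the latitude) can be sketched on its own. Hemisphere mirrors the enum named in the review, but its values, the default coordinates and resolve_hemisphere are placeholders for illustration, not the library's code.

```python
from enum import Enum
from typing import Optional, Tuple


class Hemisphere(Enum):
    # Values are assumptions; the review only confirms the two member names.
    NORTH = 0
    SOUTH = 1


# Hypothetical default: (ISO code, latitude, longitude).
_DEFAULT_LOCATION: Tuple[str, float, float] = ("US", 38.97, -95.23)


def resolve_hemisphere(
    location: Optional[Tuple[str, float, float]] = None,
) -> Tuple[str, Hemisphere]:
    """Return (location_code, hemisphere) for an explicit or default location."""
    code, lat, _lon = location if location is not None else _DEFAULT_LOCATION
    # The single-expression form Ruff suggests (SIM108).
    hemisphere = Hemisphere.SOUTH if lat < 0 else Hemisphere.NORTH
    return code, hemisphere


print(resolve_hemisphere(("BR", -15.79, -47.88)))
print(resolve_hemisphere())
```

As in the reviewed code, a latitude of exactly 0 falls on the NORTH side of the comparison.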
@@ -15,6 +15,16 @@
#
from datetime import datetime
from dateutil.tz import gettz, tzlocal
from dateutil.relativedelta import relativedelta
from lingua_franca.lang import get_primary_lang_code
from datetime import timedelta, datetime, date
Remove redefined import.
The import datetime is redefined.
- from datetime import timedelta, datetime, date
+ from datetime import timedelta, date
Committable suggestion
from datetime import timedelta, date
Tools
Ruff
20-20: Redefinition of unused datetime from line 16. Remove definition: datetime (F811)
def weekday_to_int(weekday, lang=None):
    if isinstance(weekday, int) or isinstance(weekday, float):
        return int(weekday)
    lang_code = get_primary_lang_code(lang)
    inv_map = {}
    if lang_code.startswith("en"):
        inv_map = {v: k for k, v in _WEEKDAY_SHORT_EN.items()}

    for short in inv_map:
        if weekday.startswith(short):
            return inv_map[short]
    return None
Merge isinstance calls.
Merge multiple isinstance calls for weekday.
- if isinstance(weekday, int) or isinstance(weekday, float):
+ if isinstance(weekday, (int, float)):
Committable suggestion
def weekday_to_int(weekday, lang=None):
    if isinstance(weekday, (int, float)):
        return int(weekday)
    lang_code = get_primary_lang_code(lang)
    inv_map = {}
    if lang_code.startswith("en"):
        inv_map = {v: k for k, v in _WEEKDAY_SHORT_EN.items()}
    for short in inv_map:
        if weekday.startswith(short):
            return inv_map[short]
    return None
Tools
Ruff
145-145: Multiple isinstance calls for weekday, merge into a single call. Merge isinstance calls for weekday (SIM101)
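A self-contained version of the merged check is sketched below. The mapping is a placeholder: the PR's _WEEKDAY_SHORT_EN is only described as mapping integer keys to short weekday names, so the numbering used here (0 = Monday), the lower-casing step and the dropped lang parameter are assumptions made for this example.

```python
from typing import Optional, Union

# Placeholder table; the real _WEEKDAY_SHORT_EN may use different keys.
_WEEKDAY_SHORT = {0: "mon", 1: "tue", 2: "wed", 3: "thu",
                  4: "fri", 5: "sat", 6: "sun"}


def weekday_to_int(weekday: Union[int, float, str]) -> Optional[int]:
    """Map a weekday name (or number) to its integer key, or None if unknown."""
    # One isinstance call with a tuple of types, as SIM101 recommends.
    if isinstance(weekday, (int, float)):
        return int(weekday)
    inv_map = {name: key for key, name in _WEEKDAY_SHORT.items()}
    lowered = weekday.lower()
    for short, key in inv_map.items():
        # Prefix match, so "Monday" and "mon" both resolve to the same key.
        if lowered.startswith(short):
            return key
    return None


print(weekday_to_int("Monday"), weekday_to_int(3.0), weekday_to_int("xyz"))
```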
def month_to_int(month, lang=None):
    if isinstance(month, int) or isinstance(month, float):
        return int(month)
    lang_code = get_primary_lang_code(lang)
    inv_map = {}
    if lang_code.startswith("en"):
        inv_map = {v: k for k, v in _MONTH_SHORT_EN.items()}
    for short in inv_map:
        if month.startswith(short):
            return inv_map[short]
    return None
Merge isinstance calls.
Merge multiple isinstance calls for month.
- if isinstance(month, int) or isinstance(month, float):
+ if isinstance(month, (int, float)):
Committable suggestion
def month_to_int(month, lang=None):
    if isinstance(month, (int, float)):
        return int(month)
    lang_code = get_primary_lang_code(lang)
    inv_map = {}
    if lang_code.startswith("en"):
        inv_map = {v: k for k, v in _MONTH_SHORT_EN.items()}
    for short in inv_map:
        if month.startswith(short):
            return inv_map[short]
    return None
Tools
Ruff
132-132: Multiple isinstance calls for month, merge into a single call. Merge isinstance calls for month (SIM101)
This is mainly a code backup of a WIP rewrite of the date extractor (HelloChatterbox#26).
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Dependencies
python-dateutil, holidays, and dateparser.