Feat/extract timespan #37

JarbasAl · 2022-11-26T18:39:22Z

Summary by CodeRabbit

New Features
- Introduced functions to extract and normalize time durations from text in German, English, Spanish, and Portuguese.
- Added support for new time duration units (months, years, decades, centuries, and millennia).
Bug Fixes
- Improved accuracy in converting phrases to durations by updating regex patterns and number normalization.
Tests
- Added comprehensive tests for all new duration extraction functions across multiple languages.
Chores
- Updated documentation and configuration files to support new features and improvements.

codecov · 2022-11-26T18:40:57Z

Codecov Report

❗ No coverage uploaded for pull request base (dev@135d8c5).
The diff coverage is n/a.

@@          Coverage Diff          @@
##             dev     #37   +/-   ##
=====================================
  Coverage       ?   0.00%           
=====================================
  Files          ?      65           
  Lines          ?   16825           
  Branches       ?       0           
=====================================
  Hits           ?       0           
  Misses         ?   16825           
  Partials       ?       0

lingua_franca/lang/parse_en.py

NeonDaniel · 2022-11-28T20:36:35Z

lingua_franca/lang/parse_en.py

+
+def extract_timespan_en(text, time_unit=TimespanUnit.TIMEDELTA, replace_token=""):
+ """
+ Convert an english phrase into a number of seconds


Isn't this just extract_duration?

Yes, but it changes the return type and adds the replace token. It's mainly a compatibility thing, so that extract_duration always means the same (Mycroft) thing.
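
To make the distinction drawn in this reply concrete, here is a minimal stdlib-only sketch. It is not the lingua_franca implementation: the names `sketch_extract_duration` and `sketch_extract_timespan`, the `_UNIT_SECONDS` table, the `as_seconds` flag, and the regex are all hypothetical. Only the idea comes from the thread: one function keeps a fixed `(timedelta, remainder)` return, while the other lets the caller choose the return type and a `replace_token`.

```python
import re
from datetime import timedelta

# Hypothetical unit table; the real parser supports many more units.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s+(second|minute|hour|day)s?\b")


def _scan(text, replace_token=""):
    """Sum every '<number> <unit>' match and substitute it with replace_token."""
    total = 0.0

    def _consume(match):
        nonlocal total
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        return replace_token

    remainder = _PATTERN.sub(_consume, text)
    return total, remainder


def sketch_extract_duration(text):
    """Legacy-style shape: always (timedelta, remainder), matches removed."""
    total, remainder = _scan(text)
    return timedelta(seconds=total), remainder.strip()


def sketch_extract_timespan(text, as_seconds=False, replace_token=""):
    """Timespan-style shape: the caller picks the return type and the token."""
    total, remainder = _scan(text, replace_token)
    value = total if as_seconds else timedelta(seconds=total)
    return value, remainder.strip()


print(sketch_extract_duration("set a timer for 5 minutes"))
print(sketch_extract_timespan("wait 2 hours please", as_seconds=True,
                              replace_token="_"))
```

Keeping the older function's return shape fixed means existing callers are unaffected, while new callers opt in to the extra parameters.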

JarbasAl · 2024-07-18T23:55:30Z

@coderabbitai review

coderabbitai · 2024-07-18T23:55:38Z

Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2024-07-18T23:55:41Z

Walkthrough

The recent changes introduce new functions and updates across multiple language modules to enhance text parsing capabilities in lingua_franca. Major additions include new functions for extracting durations and timespans in German, English, Spanish, and Portuguese, along with normalization functions and extensive updates to unit tests. These improvements enable more accurate and versatile time-related text processing.

Changes

| Files | Change Summary |
| --- | --- |
| `lingua_franca/lang/parse_de.py` | Added imports, new `extract_timespan_de` function, modified `extract_duration_de`, updated `GermanNormalizer`, added `normalize_de` function. |
| `lingua_franca/lang/parse_en.py` | Added imports, new `extract_timespan_en` function, modified `extract_duration_en`, added `normalize_en` function. |
| `lingua_franca/lang/parse_es.py`, `lingua_franca/lang/parse_pt.py` | Added imports, new `extract_duration_es` and `extract_duration_pt` functions. |
| `lingua_franca/parse.py` | Added new `extract_timespan` function. |
| `lingua_franca/res/text/de-de/normalize.json` | Introduced new text normalization rules for German. |
| `lingua_franca/time.py` | Added `TimespanUnit` enum, constants `DAYS_IN_1_YEAR`, `DAYS_IN_1_MONTH`. |
| `test/unittests/test_parse_es.py`, `test/unittests/test_parse_pt.py` | Added imports, new tests for `extract_duration`, `extract_timespan`, and related functions. |

coderabbitai

Actionable comments posted: 16

Outside diff range, codebase verification and nitpick comments (4)

lingua_franca/lang/parse_de.py (1)

Line range hint 1267-1269: Refactor to use GermanNormalizer.

The TODO comment indicates that the function should use GermanNormalizer for normalization.

- def normalize_de(text, remove_articles=True):
-     """ German string normalization """
-     # TODO return GermanNormalizer().normalize(text, remove_articles)
-     words = text.split()  # this also removed extra spaces
-     normalized = ""
-     for word in words:
-         if remove_articles and word in ["der", "die", "das", "des", "den",
-                                         "dem"]:
-             continue
- 
-         # Expand common contractions, e.g. "isn't" -> "is not"
-         contraction = ["net", "nett"]
-         if word in contraction:
-             expansion = ["nicht", "nicht"]
-             word = expansion[contraction.index(word)]
- 
-         # Convert numbers into digits, e.g. "two" -> "2"
- 
-         if word in _DE_NUMBERS:
-             word = str(_DE_NUMBERS[word])
- 
-         normalized += " " + word
- 
-     return normalized[1:]  # strip the initial space
+ def normalize_de(text, remove_articles=True):
+     return GermanNormalizer().normalize(text, remove_articles)

Tools

Ruff

1267-1267: Redefinition of unused normalize_de from line 1218

(F811)

lingua_franca/lang/parse_en.py (1)

1741-1757: Suggestion: Improve the text normalization comment.

The comment about the text normalization hack can be improved for clarity.

- # NOTE: above is a nasty hack, _convert_words_to_numbers normalized the
- # string so we can do this, but this is essentially incorrect since each
- # replaced number word should be replaced with a single replace_token
+ # NOTE: The above is a workaround. _convert_words_to_numbers normalizes the
+ # string, allowing us to replace two words (number and unit) with a single
+ # replace_token. Ideally, each number word should be replaced with a single
+ # replace_token.

test/unittests/test_parse_en.py (2)

600-606: Ensure consistency in test cases.

The test_extract_duration method already covers some cases for months and years. Ensure consistency and avoid redundancy.

-        self.assertEqual(extract_duration("1 month"),
-                         (timedelta(days=DAYS_IN_1_MONTH), ""))
-        self.assertEqual(
-            extract_duration("1 month"),
-            (timedelta(days=DAYS_IN_1_MONTH), ""))

Line range hint 2521-2527: Consider localization for gender tests.

The gender tests are not localized. Consider whether localization is needed for English.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 135d8c5 and dead6ff.

Files selected for processing (11)

lingua_franca/lang/parse_de.py (6 hunks)
lingua_franca/lang/parse_en.py (4 hunks)
lingua_franca/lang/parse_es.py (2 hunks)
lingua_franca/lang/parse_pt.py (2 hunks)
lingua_franca/parse.py (3 hunks)
lingua_franca/res/text/de-de/normalize.json (1 hunks)
lingua_franca/time.py (1 hunks)
test/unittests/test_parse_de.py (7 hunks)
test/unittests/test_parse_en.py (3 hunks)
test/unittests/test_parse_es.py (3 hunks)
test/unittests/test_parse_pt.py (3 hunks)

Files not summarized due to errors (2)

test/unittests/test_parse_de.py: Error: Message exceeds token limit
test/unittests/test_parse_en.py: Error: Message exceeds token limit

Additional context used

Ruff

lingua_franca/parse.py

16-16: datetime imported but unused

Remove unused import: datetime

(F401)

test/unittests/test_parse_es.py

24-24: lingua_franca.parse.get_gender imported but unused

Remove unused import: lingua_franca.parse.get_gender

(F401)

24-24: Redefinition of unused extract_datetime from line 22

Remove definition: extract_datetime

(F811)

24-24: Redefinition of unused extract_number from line 21

Remove definition: extract_number

(F811)

24-24: Redefinition of unused normalize from line 21

Remove definition: normalize

(F811)

24-24: Redefinition of unused yes_or_no from line 22

Remove definition: yes_or_no

(F811)

25-25: lingua_franca.time.to_local imported but unused

Remove unused import: lingua_franca.time.to_local

(F401)

test/unittests/test_parse_pt.py

23-23: lingua_franca.time.to_local imported but unused

Remove unused import: lingua_franca.time.to_local

(F401)

lingua_franca/lang/parse_es.py

1128-1128: Function definition does not bind loop variable unit_en

(B023)

1131-1131: re may be undefined, or defined from star imports

(F405)

1139-1139: Function definition does not bind loop variable unit_en

(B023)

1141-1141: Function definition does not bind loop variable unit_en

(B023)

1143-1143: Function definition does not bind loop variable unit_en

(B023)

1145-1145: Function definition does not bind loop variable unit_en

(B023)

1147-1147: Function definition does not bind loop variable unit_en

(B023)

1152-1152: re may be undefined, or defined from star imports

(F405)

lingua_franca/lang/parse_pt.py

1156-1156: Function definition does not bind loop variable unit_en

(B023)

1167-1167: Function definition does not bind loop variable unit_en

(B023)

1169-1169: Function definition does not bind loop variable unit_en

(B023)

1171-1171: Function definition does not bind loop variable unit_en

(B023)

1173-1173: Function definition does not bind loop variable unit_en

(B023)

1175-1175: Function definition does not bind loop variable unit_en

(B023)

lingua_franca/lang/parse_de.py

144-144: Function definition does not bind loop variable unit_en

(B023)

1267-1267: Redefinition of unused normalize_de from line 1218

(F811)

lingua_franca/lang/parse_en.py

605-605: Function definition does not bind loop variable unit_en

(B023)

610-610: Function definition does not bind loop variable unit_en

(B023)

612-612: Function definition does not bind loop variable unit_en

(B023)

614-614: Function definition does not bind loop variable unit_en

(B023)

616-616: Function definition does not bind loop variable unit_en

(B023)

618-618: Function definition does not bind loop variable unit_en

(B023)
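
The B023 findings listed above all point at the same Python pitfall: a function defined inside a loop looks up the loop variable when it is called, not when it is defined. The sketch below reproduces that behaviour in isolation and shows one common fix, binding the current value through a default argument. The `units` list and the `make_*` helpers are made up for illustration and are not the code in the flagged files.

```python
units = ["seconds", "minutes", "hours"]


def make_late_bound():
    """Closures read `unit` at call time, so all of them see the last value."""
    callbacks = []
    for unit in units:
        callbacks.append(lambda n: f"{n} {unit}")
    return callbacks


def make_early_bound():
    """A default argument captures the current value of `unit` per iteration."""
    callbacks = []
    for unit in units:
        callbacks.append(lambda n, unit=unit: f"{n} {unit}")
    return callbacks


late = [cb(1) for cb in make_late_bound()]
early = [cb(1) for cb in make_early_bound()]
print(late)   # every entry uses the final loop value, "hours"
print(early)  # each entry keeps its own unit
```

If a closure is called within the same iteration that defines it, it behaves correctly even though the linter still reports it. Whether that is the case for the flagged lines cannot be determined from this review alone.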

Additional comments not posted (87)

lingua_franca/res/text/de-de/normalize.json (5)

1-8: Configuration settings look good.

The initial configuration settings for text normalization in German are correctly defined. They include options for lowercase conversion, number-to-digit conversion, contraction expansion, symbol removal, accent removal, article removal, and stopword removal.

9-25: Contractions mapping is comprehensive.

The contractions mapping for German contractions is well-defined and covers common contractions.

26-29: Word replacements are appropriate.

The word replacements for "m" to "meter" and "km" to "kilometer" are correctly defined.

30-63: Number replacements are thorough.

The number replacements for German words to digits are comprehensive and cover numbers from 0 to 90.

64-72: Stopwords and articles are correctly listed.

The stopwords list is empty, and the articles list includes common German articles.
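
As a rough illustration of how a configuration like the one reviewed above could drive normalization, here is a sketch with an inline dictionary standing in for normalize.json. The key names (`lowercase`, `contractions`, `word_replacements`, `number_replacements`, `articles`, and so on), the sample entries, and the `sketch_normalize` helper are assumptions made for this example. The actual file's schema and the `GermanNormalizer` implementation are not shown in this review.

```python
# Inline stand-in for a normalize.json file; all key names are hypothetical.
CONFIG = {
    "lowercase": True,
    "numbers_to_digits": True,
    "expand_contractions": True,
    "remove_articles": True,
    "contractions": {"zum": "zu dem", "im": "in dem"},
    "word_replacements": {"m": "meter", "km": "kilometer"},
    "number_replacements": {"eins": "1", "zwei": "2", "drei": "3"},
    "articles": ["der", "die", "das", "des", "den", "dem"],
}


def sketch_normalize(text, config=CONFIG):
    """Apply the enabled steps word by word and rejoin with single spaces."""
    if config.get("lowercase"):
        text = text.lower()
    words = text.split()  # split() also collapses repeated whitespace

    if config.get("expand_contractions"):
        expanded = []
        for word in words:
            expanded.extend(config["contractions"].get(word, word).split())
        words = expanded

    result = []
    for word in words:
        if config.get("remove_articles") and word in config["articles"]:
            continue
        word = config["word_replacements"].get(word, word)
        if config.get("numbers_to_digits"):
            word = config["number_replacements"].get(word, word)
        result.append(word)
    return " ".join(result)


print(sketch_normalize("Die Strecke ist zwei km lang"))
```

Expanding contractions before removing articles means an article exposed by an expansion (for example "dem" from "zum") is dropped as well. That ordering is a design choice in this sketch, not something the review confirms.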

lingua_franca/time.py (2)

21-38: Enum TimespanUnit is well-defined.

The TimespanUnit enum is comprehensive and includes various time units for timespan calculations.

42-43: Constants for time calculations are correctly defined.

The constants DAYS_IN_1_YEAR and DAYS_IN_1_MONTH are accurately defined for time calculations.
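
To show how an enum like TimespanUnit and the two day-count constants might fit together, the sketch below converts a timedelta into a requested unit. The member names mirror those exercised by the tests in this PR, but the subset chosen, the member values, the numeric constants, and the `sketch_convert` helper are assumptions, not the contents of lingua_franca/time.py.

```python
from datetime import timedelta
from enum import Enum, auto

# Assumed values for illustration; check lingua_franca/time.py for the real ones.
DAYS_IN_1_YEAR = 365.2425
DAYS_IN_1_MONTH = 30.42


class SketchTimespanUnit(Enum):
    TIMEDELTA = auto()
    TOTAL_SECONDS = auto()
    TOTAL_MINUTES = auto()
    TOTAL_HOURS = auto()
    TOTAL_DAYS = auto()
    TOTAL_WEEKS = auto()
    TOTAL_MONTHS = auto()
    TOTAL_YEARS = auto()


# Length of one unit expressed in days.
_DAYS_PER_UNIT = {
    SketchTimespanUnit.TOTAL_DAYS: 1,
    SketchTimespanUnit.TOTAL_WEEKS: 7,
    SketchTimespanUnit.TOTAL_MONTHS: DAYS_IN_1_MONTH,
    SketchTimespanUnit.TOTAL_YEARS: DAYS_IN_1_YEAR,
}
# Length of one unit expressed in seconds.
_SECONDS_PER_UNIT = {
    SketchTimespanUnit.TOTAL_SECONDS: 1,
    SketchTimespanUnit.TOTAL_MINUTES: 60,
    SketchTimespanUnit.TOTAL_HOURS: 3600,
}


def sketch_convert(delta, unit=SketchTimespanUnit.TIMEDELTA):
    """Return `delta` unchanged or as a float count of the requested unit."""
    if unit is SketchTimespanUnit.TIMEDELTA:
        return delta
    if unit in _SECONDS_PER_UNIT:
        return delta.total_seconds() / _SECONDS_PER_UNIT[unit]
    days = delta.total_seconds() / 86400
    return days / _DAYS_PER_UNIT[unit]


one_month = timedelta(days=DAYS_IN_1_MONTH)
print(sketch_convert(one_month, SketchTimespanUnit.TOTAL_WEEKS))
print(sketch_convert(timedelta(hours=36), SketchTimespanUnit.TOTAL_DAYS))
```

Fixed day counts for months and years are averages, so results in those units are approximations. A calendar-aware type such as dateutil's relativedelta avoids that by keeping months and years as separate fields.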

lingua_franca/parse.py (2)

29-29: Imports for time calculations are appropriate.

The imports for TimespanUnit, DAYS_IN_1_MONTH, and DAYS_IN_1_YEAR from lingua_franca.time are correctly added.

132-203: Function extract_timespan is well-defined but has a TODO comment.

The extract_timespan function is comprehensive and handles various time units. However, there is a TODO comment indicating that the remainder is imperfect and ignores the replace_token.

Ensure that the TODO comment is addressed in future updates.

test/unittests/test_parse_es.py (18)

16-17: Imports for timedelta and relativedelta are appropriate.

The imports for timedelta from datetime and relativedelta from dateutil.relativedelta are correctly added.

22-22: Import extract_duration is appropriate.

The import for extract_duration from lingua_franca.parse is correctly added.

249-267: Tests for extract_duration are comprehensive.

The tests for the extract_duration function cover various scenarios and time units.

268-291: Tests for non-standard units in extract_duration are thorough.

The tests for non-standard units like months, years, decades, centuries, and millennia are well-defined and cover various scenarios.

308-327: Tests for extract_timespan are comprehensive.

The tests for the extract_timespan function cover various scenarios and time units.

329-367: Tests for extract_timespan with RELATIVEDELTA are thorough.

The tests for the extract_timespan function with the RELATIVEDELTA time unit are well-defined and cover various scenarios.

368-396: Tests for extract_timespan with TOTAL_MICROSECONDS are comprehensive.

The tests for the extract_timespan function with the TOTAL_MICROSECONDS time unit are well-defined and cover various scenarios.

397-425: Tests for extract_timespan with TOTAL_MILLISECONDS are comprehensive.

The tests for the extract_timespan function with the TOTAL_MILLISECONDS time unit are well-defined and cover various scenarios.

426-450: Tests for extract_timespan with TOTAL_SECONDS are thorough.

The tests for the extract_timespan function with the TOTAL_SECONDS time unit are well-defined and cover various scenarios.

451-471: Tests for extract_timespan with TOTAL_MINUTES are comprehensive.

The tests for the extract_timespan function with the TOTAL_MINUTES time unit are well-defined and cover various scenarios.

472-493: Tests for extract_timespan with TOTAL_HOURS are thorough.

The tests for the extract_timespan function with the TOTAL_HOURS time unit are well-defined and cover various scenarios.

494-516: Tests for extract_timespan with TOTAL_DAYS are comprehensive.

The tests for the extract_timespan function with the TOTAL_DAYS time unit are well-defined and cover various scenarios.

517-538: Tests for extract_timespan with TOTAL_WEEKS are thorough.

The tests for the extract_timespan function with the TOTAL_WEEKS time unit are well-defined and cover various scenarios.

539-561: Tests for extract_timespan with TOTAL_MONTHS are comprehensive.

The tests for the extract_timespan function with the TOTAL_MONTHS time unit are well-defined and cover various scenarios.

562-583: Tests for extract_timespan with TOTAL_YEARS are thorough.

The tests for the extract_timespan function with the TOTAL_YEARS time unit are well-defined and cover various scenarios.

584-604: Tests for extract_timespan with TOTAL_DECADES are comprehensive.

The tests for the extract_timespan function with the TOTAL_DECADES time unit are well-defined and cover various scenarios.

605-625: Tests for extract_timespan with TOTAL_CENTURIES are thorough.

The tests for the extract_timespan function with the TOTAL_CENTURIES time unit are well-defined and cover various scenarios.

626-646: Tests for extract_timespan with TOTAL_MILLENNIUMS are comprehensive.

The tests for the extract_timespan function with the TOTAL_MILLENNIUMS time unit are well-defined and cover various scenarios.

test/unittests/test_parse_pt.py (17)

260-287: Ensure proper handling of imperfect remainders.

The TODO comments indicate that the handling of imperfect remainders is not yet implemented. Ensure these cases are handled correctly in future updates.

376-396: Tests for extract_timespan look good!

The tests cover various time units and edge cases.

397-435: Tests for extract_timespan_delta look good!

The tests cover various time units using TimespanUnit.RELATIVEDELTA.

436-464: Tests for extract_timespan_microsegundos look good!

The tests cover various time units in microseconds.

465-493: Tests for extract_timespan_milisegundos look good!

The tests cover various time units in milliseconds.

494-518: Tests for extract_timespan_segundos look good!

The tests cover various time units in seconds.

519-539: Tests for extract_timespan_minutos look good!

The tests cover various time units in minutes.

540-561: Tests for extract_timespan_horas look good!

The tests cover various time units in hours.

562-584: Tests for extract_timespan_dias look good!

The tests cover various time units in days.

585-606: Tests for extract_timespan_semanas look good!

The tests cover various time units in weeks.

607-629: Tests for extract_timespan_months look good!

The tests cover various time units in months.

630-651: Tests for extract_timespan_years look good!

The tests cover various time units in years.

652-672: Tests for extract_timespan_decades look good!

The tests cover various time units in decades.

673-693: Tests for extract_timespan_centuries look good!

The tests cover various time units in centuries.

694-714: Tests for extract_timespan_millennia look good!

The tests cover various time units in millennia.

Line range hint 320-348: Tests for get_gender look good!

The tests cover various cases for gender extraction in Portuguese.

Line range hint 351-374: Tests for yes_or_no look good!

The tests cover various cases for yes/no extraction in Portuguese.

lingua_franca/lang/parse_es.py (1)

Line range hint 1159-1177: Ensure get_gender_es handles context correctly.

The function now has a default value for the context parameter. Ensure it handles context correctly.

lingua_franca/lang/parse_pt.py (2)

23-23: LGTM! Imports are necessary and correctly used.

The updated imports for timedelta, now_local, DAYS_IN_1_MONTH, and DAYS_IN_1_YEAR are necessary for the new functionality in the code.

Also applies to: 31-31

1102-1121: Well-documented function.

The docstring provides a clear and comprehensive explanation of the function's purpose, arguments, and return values.

lingua_franca/lang/parse_de.py (3)

117-117: Ensure the correctness of the regex pattern and replacement logic.

The regex pattern and replacement logic for converting German phrases to duration should be thoroughly tested to ensure they cover all edge cases.

154-154: Ensure the correctness of the regex pattern and replacement logic.

The regex pattern and replacement logic for converting German phrases to timespan should be thoroughly tested to ensure they cover all edge cases.

1263-1265: Ensure the correctness of JSON loading.

The usage of resolve_resource_file and JSON loading should be thoroughly tested to ensure the configuration is loaded correctly.

test/unittests/test_parse_de.py (15)

703-746: Ensure comprehensive test coverage for extract_timespan with days.

The new test cases for extract_timespan with days should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with days.

The test cases for extract_timespan with days in test_parse_de.py and test_parse_en.py cover various scenarios, including edge cases and common use cases. The provided test cases appear to be comprehensive.

test/unittests/test_parse_de.py

test/unittests/test_parse_en.py

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with days.

# Test: Search for test cases related to `extract_timespan` with days.
rg --type python -A 5 $'extract_timespan.*days'
Length of output: 79

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with days.

# Test: Search for test cases related to `extract_timespan` with days.
rg --glob '*.py' -A 5 'extract_timespan.*days'
Length of output: 3102

504-563: Ensure comprehensive test coverage for extract_timespan with milliseconds.

The new test cases for extract_timespan with milliseconds should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with milliseconds.

The test cases for extract_timespan with milliseconds in test/unittests/test_parse_de.py cover various scenarios, including edge cases and common use cases. The test cases appear to be well-implemented and thorough.

test/unittests/test_parse_de.py

test/unittests/test_parse_en.py

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with milliseconds.

# Test: Search for test cases related to `extract_timespan` with milliseconds.
rg --type python -A 5 $'extract_timespan.*milliseconds'
Length of output: 87

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with milliseconds.

# Test: Search for test cases related to `extract_timespan` with milliseconds.
rg --type py -A 5 $'extract_timespan.*milliseconds'
Length of output: 1025

658-702: Ensure comprehensive test coverage for extract_timespan with hours.

The new test cases for extract_timespan with hours should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Comprehensive test coverage for extract_timespan with hours verified.

The test cases for extract_timespan with hours are well-covered in both the German (test_parse_de.py) and English (test_parse_en.py) test files. No further actions are required.

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with hours.

# Test: Search for test cases related to `extract_timespan` with hours.
rg --type python -A 5 $'extract_timespan.*hours'
Length of output: 80

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with hours.

# Test: Search for test cases related to `extract_timespan` with hours.
rg --type py -A 5 $'extract_timespan.*hours'
Length of output: 2984

614-657: Ensure comprehensive test coverage for extract_timespan with minutes.

The new test cases for extract_timespan with minutes should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with minutes.

The test cases for extract_timespan with minutes in both test_parse_de.py and test_parse_en.py provide comprehensive coverage, including various scenarios and edge cases. No further action is needed.

test/unittests/test_parse_de.py

test/unittests/test_parse_en.py

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with minutes.

# Test: Search for test cases related to `extract_timespan` with minutes.
rg --type python -A 5 $'extract_timespan.*minutes'
Length of output: 82

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with minutes.

# Test: Search for test cases related to `extract_timespan` with minutes.
rg -A 5 'extract_timespan.*minutes'
Length of output: 5779

566-613: Ensure comprehensive test coverage for extract_timespan with seconds.

The new test cases for extract_timespan with seconds should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with seconds.

The test cases for extract_timespan with seconds in test_parse_de.py cover various scenarios, including edge cases and common use cases. Similar tests are also present in test_parse_en.py, indicating comprehensive coverage.

test/unittests/test_parse_de.py: test_extract_timespan_seconds_de

test/unittests/test_parse_en.py: test_extract_timespan_seconds_en

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with seconds.

# Test: Search for test cases related to `extract_timespan` with seconds.
rg --type python -A 5 $'extract_timespan.*seconds'
Length of output: 82

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with seconds.

# Test: Search for test cases related to `extract_timespan` with seconds.
rg --type py -A 5 $'extract_timespan.*seconds'
Length of output: 6865

462-503: Ensure comprehensive test coverage for extract_timespan with microseconds.

The new test cases for extract_timespan with microseconds should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

19-25: Ensure new imports are used.

The new imports for extract_duration, extract_timespan, and TimespanUnit should be used in the file. Verify their usage in the test methods.

Verification successful

Ensure new imports are used.

The new imports for extract_duration, extract_timespan, and TimespanUnit are correctly used in the file.

extract_duration is used in multiple test cases such as test_extract_duration_de.

extract_timespan is used in multiple test cases such as test_extract_timespan_de.

TimespanUnit is used in multiple test cases such as test_extract_timespan_ambiguous.

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage of new imports in the file.

# Test: Search for the usage of `extract_duration`, `extract_timespan`, and `TimespanUnit`.
rg --type python -A 5 $'extract_duration|extract_timespan|TimespanUnit'
Length of output: 103

Script:
#!/bin/bash
# Description: Verify the usage of new imports in the file.

# Test: Search for the usage of `extract_duration`, `extract_timespan`, and `TimespanUnit`.
rg --glob '*.py' -A 5 'extract_duration|extract_timespan|TimespanUnit'
Length of output: 157997

Line range hint 220-250: Ensure comprehensive test coverage for extract_duration.

The new test cases for extract_duration should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Test coverage for extract_duration is comprehensive.

The test cases for extract_duration cover a wide range of scenarios and languages, ensuring robust validation of the function.

test/unittests/test_parse_de.py

test/unittests/test_parse_ru.py

test/unittests/test_parse_pt.py

test/unittests/test_parse_nl.py

test/unittests/test_parse_pl.py

test/unittests/test_parse_cs.py

test/unittests/test_parse_fr.py

test/unittests/test_parse_sv.py

test/unittests/test_parse_az.py

test/unittests/test_parse_es.py

test/unittests/test_parse_fa.py

test/unittests/test_parse_en.py

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_duration`.

# Test: Search for test cases related to `extract_duration`.
rg --type python -A 5 $'extract_duration'
Length of output: 71

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_duration`.

# Test: Search for test cases related to `extract_duration`.
rg --type py -A 5 $'extract_duration'
Length of output: 57945

747-788: Ensure comprehensive test coverage for extract_timespan with weeks.

The new test cases for extract_timespan with weeks should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with weeks.

The test cases for extract_timespan with weeks in test_parse_de.py are comprehensive and cover various scenarios, including different time units, edge cases, and phrases with additional text. No further actions are necessary.

test/unittests/test_parse_de.py: Comprehensive test cases for extract_timespan with weeks.

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with weeks.

# Test: Search for test cases related to `extract_timespan` with weeks.
rg --type python -A 5 $'extract_timespan.*weeks'
Length of output: 80

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with weeks.

# Test: Search for test cases related to `extract_timespan` with weeks.
rg --glob '*.py' -A 5 $'extract_timespan.*weeks'
Length of output: 3148
260-289: Ensure comprehensive test coverage for extract_timespan.

The new test cases for extract_timespan should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Comprehensive test coverage for extract_timespan verified.

The test cases for extract_timespan are extensive and cover various scenarios, including different time units and edge cases. The function is well-tested across multiple languages.

Files and lines with test cases:

test/unittests/test_parse_de.py

test/unittests/test_parse_pt.py

test/unittests/test_parse_es.py

test/unittests/test_parse_en.py
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan`.

# Test: Search for test cases related to `extract_timespan`.
rg --type python -A 5 $'extract_timespan'
Length of output: 71

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan`.

# Test: Search for test cases related to `extract_timespan`.
rg --type py -A 5 $'extract_timespan'
Length of output: 81947
789-833: Ensure comprehensive test coverage for extract_timespan with months.

The new test cases for extract_timespan with months should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with months.

The new test cases for extract_timespan with months in test_parse_de.py contribute to a broader set of tests across multiple languages, ensuring comprehensive coverage.

test/unittests/test_parse_pt.py

test/unittests/test_parse_es.py

test/unittests/test_parse_de.py

test/unittests/test_parse_en.py
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with months.

# Test: Search for test cases related to `extract_timespan` with months.
rg --type python -A 5 $'extract_timespan.*months'
Length of output: 81

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with months.

# Test: Search for test cases related to `extract_timespan` with months.
rg --type py -A 5 $'extract_timespan.*months'
Length of output: 5467
877-915: Ensure comprehensive test coverage for extract_timespan with decades.

The new test cases for extract_timespan with decades should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with decades.

The test cases for extract_timespan with decades are present across multiple language files, indicating comprehensive coverage.

test/unittests/test_parse_pt.py

test/unittests/test_parse_es.py

test/unittests/test_parse_en.py

test/unittests/test_parse_de.py
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with decades.

# Test: Search for test cases related to `extract_timespan` with decades.
rg --type python -A 5 $'extract_timespan.*decades'
Length of output: 82

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with decades.

# Test: Search for test cases related to `extract_timespan` with decades.
rg -A 5 'extract_timespan.*decades'
Length of output: 2961
956-975: Ensure comprehensive test coverage for extract_timespan with millennia.

The new test cases for extract_timespan with millennia should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Comprehensive test coverage for extract_timespan with millennia verified.

The test cases for extract_timespan with millennia are well-covered across multiple language-specific test files, ensuring thorough validation of the function.

test/unittests/test_parse_de.py

test/unittests/test_parse_es.py

test/unittests/test_parse_pt.py

test/unittests/test_parse_en.py
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with millennia.

# Test: Search for test cases related to `extract_timespan` with millennia.
rg --type python -A 5 $'extract_timespan.*millennia'
Length of output: 84

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with millennia.

# Test: Search for test cases related to `extract_timespan` with millennia.
rg --type py -A 5 'extract_timespan.*millennia'
Length of output: 1960
834-876: Ensure comprehensive test coverage for extract_timespan with years.

The new test cases for extract_timespan with years should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Ensure comprehensive test coverage for extract_timespan with years.

The test cases for extract_timespan with years are well-covered across multiple language-specific test files, including test_parse_de.py, test_parse_es.py, test_parse_pt.py, and test_parse_en.py. The variety of test cases in test_parse_de.py confirms thorough testing for different durations and edge cases.

test/unittests/test_parse_de.py

test/unittests/test_parse_es.py

test/unittests/test_parse_pt.py

test/unittests/test_parse_en.py
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with years.

# Test: Search for test cases related to `extract_timespan` with years.
rg --type python -A 5 $'extract_timespan.*years'
Length of output: 80

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with years.

# Test: Search for test cases related to `extract_timespan` with years.
rg --type py -A 5 'extract_timespan.*years'
Length of output: 2918
916-955: Ensure comprehensive test coverage for extract_timespan with centuries.

The new test cases for extract_timespan with centuries should cover various scenarios, including edge cases and common use cases. Verify that all relevant scenarios are tested.

Verification successful

Comprehensive test coverage for extract_timespan with centuries verified.

The test cases for extract_timespan with centuries cover a wide range of scenarios, including different time units and edge cases across multiple languages. No further action is needed.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with centuries.

# Test: Search for test cases related to `extract_timespan` with centuries.
rg --type python -A 5 $'extract_timespan.*centuries'
Length of output: 84

Script:
#!/bin/bash
# Description: Verify the test coverage for `extract_timespan` with centuries.

# Find all Python files and search for test cases related to `extract_timespan` with centuries.
fd --type f --extension py --exec rg 'extract_timespan.*centuries' {}
Length of output: 369

Script:
#!/bin/bash
# Description: Extract the contents of test functions related to `extract_timespan` with centuries.

# Extract the contents of the test functions found in the previous step.
rg --type py -A 20 'def test_extract_timespan_centuries' 
Length of output: 6477
lingua_franca/lang/parse_en.py (6)

18-18: LGTM! Import changes are necessary and appropriate.

The addition of the math module and the modifications to the from lingua_franca.time import statement are necessary for the new functionality.

Also applies to: 31-31

1713-1713: LGTM! The new function normalize_en is necessary and appropriate.

The addition of the normalize_en function for English string normalization is well-implemented.

1715-1737: LGTM! The new function extract_timespan_en is comprehensive and well-implemented.

The function handles converting English phrases into durations and includes comprehensive handling for various time units and conversions.

1769-1806: LGTM! The handling of time units for TIMEDELTA is comprehensive.

The function correctly handles various time units for TIMEDELTA and includes appropriate conversions.

1822-1858: LGTM! The handling of time units for RELATIVEDELTA is comprehensive.

The function correctly handles various time units for RELATIVEDELTA and includes appropriate conversions.

1892-1965: LGTM! The handling of time units for other TimespanUnit values is comprehensive.

The function correctly handles various time units for other TimespanUnit values and includes appropriate conversions.

test/unittests/test_parse_en.py (16)

1934-1949: Ensure accurate extraction of microseconds.

The extraction of microseconds should be accurate and handle small floating point errors.

1977-1989: Ensure accurate extraction of milliseconds.

The extraction of milliseconds should be accurate and handle small floating point errors.

2046-2057: Ensure accurate extraction of seconds.

The extraction of seconds should be accurate and handle small floating point errors.

2098-2109: Ensure accurate extraction of minutes.

The extraction of minutes should be accurate and handle small floating point errors.

2146-2157: Ensure accurate extraction of hours.

The extraction of hours should be accurate and handle small floating point errors.

2194-2205: Ensure accurate extraction of days.

The extraction of days should be accurate and handle small floating point errors.

2242-2253: Ensure accurate extraction of weeks.

The extraction of weeks should be accurate and handle small floating point errors.

2288-2299: Ensure accurate extraction of months.

The extraction of months should be accurate and handle small floating point errors.

2340-2351: Ensure accurate extraction of years.

The extraction of years should be accurate and handle small floating point errors.

2387-2398: Ensure accurate extraction of decades.

The extraction of decades should be accurate and handle small floating point errors.

2430-2441: Ensure accurate extraction of centuries.

The extraction of centuries should be accurate and handle small floating point errors.

2474-2485: Ensure accurate extraction of millennia.

The extraction of millennia should be accurate and handle small floating point errors.

Line range hint 2530-2567: Ensure comprehensive yes/no tests.

The yes/no tests cover a wide range of cases. Ensure that all possible edge cases are covered.

Line range hint 2569-2581: Ensure accurate language code extraction.

The language code extraction tests should ensure accurate results with high confidence.

Line range hint 2583-2596: Ensure accurate language code extraction for specific cases.

The tests for specific language codes should ensure accurate results with high confidence.

1698-1711: Ensure accurate normalization in tests.

The normalization of "fifty seven and a half" to "57.5" may not be accurate. Consider reviewing the normalization logic.

coderabbitai · 2024-07-18T23:59:50Z

lingua_franca/lang/parse_pt.py

+ time_units[unit_en] = 0
+
+ def repl(match):
+ time_units[unit_en] += float(match.group(1))


Fix the loop variable binding issue.

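The functions defined inside the loop do not bind the loop variable unit_en: they look it up when they are called, not when they are defined. Any call made after the loop has advanced would see a later value of unit_en.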

-        def repl(match):
-            time_units[unit_en] += float(match.group(1))
-            return ''
+        def repl(unit):
+            def _repl(match):
+                time_units[unit] += float(match.group(1))
+                return ''
+            return _repl

-        text = re.sub(unit_pattern, repl, text)
+        text = re.sub(unit_pattern, repl(unit_en), text)

-        def repl_non_std(match):
-            val = float(match.group(1))
-            if unit_en == "months":
-                val = DAYS_IN_1_MONTH * val
-            if unit_en == "years":
-                val = DAYS_IN_1_YEAR * val
-            if unit_en == "decades":
-                val = 10 * DAYS_IN_1_YEAR * val
-            if unit_en == "centurys":
-                val = 100 * DAYS_IN_1_YEAR * val
-            if unit_en == "millenniums":
-                val = 1000 * DAYS_IN_1_YEAR * val
-            time_units["days"] += val
-            return ''
+        def repl_non_std(unit):
+            def _repl_non_std(match):
+                val = float(match.group(1))
+                if unit == "months":
+                    val = DAYS_IN_1_MONTH * val
+                if unit == "years":
+                    val = DAYS_IN_1_YEAR * val
+                if unit == "decades":
+                    val = 10 * DAYS_IN_1_YEAR * val
+                if unit == "centurys":
+                    val = 100 * DAYS_IN_1_YEAR * val
+                if unit == "millenniums":
+                    val = 1000 * DAYS_IN_1_YEAR * val
+                time_units["days"] += val
+                return ''
+            return _repl_non_std

-        text = re.sub(unit_pattern, repl_non_std, text)
+        text = re.sub(unit_pattern, repl_non_std(unit_en), text)

Also applies to: 1167-1176

Tools

Ruff

1156-1156: Function definition does not bind loop variable unit_en

(B023)
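
For readers unfamiliar with B023, the following is a minimal, self-contained sketch of the late-binding behaviour the rule warns about. The function names are made up for illustration and are not part of lingua_franca. Note that in the reviewed diff the callback is handed to re.sub within the same iteration, so it runs before unit_en changes; the warning matters once a callback outlives its iteration.

```python
def make_callbacks_late():
    """Closures that read the loop variable when they are called."""
    callbacks = []
    for unit in ("seconds", "minutes", "hours"):
        def cb():
            return unit  # looked up at call time, after the loop has finished
        callbacks.append(cb)
    return callbacks


def make_callbacks_bound():
    """Closures that capture the current value through a default argument."""
    callbacks = []
    for unit in ("seconds", "minutes", "hours"):
        def cb(unit=unit):  # the default is evaluated once per iteration
            return unit
        callbacks.append(cb)
    return callbacks


late = [cb() for cb in make_callbacks_late()]
bound = [cb() for cb in make_callbacks_bound()]
print(late)
print(bound)
```

The factory-function form suggested in the diff above achieves the same binding as the default-argument form shown here; which one to use is a matter of taste.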

coderabbitai · 2024-07-18T23:59:50Z

lingua_franca/lang/parse_pt.py

+def extract_duration_pt(text):
+ """
+ Convert an portuguese phrase into a number of seconds
+ Convert things like:
+ "10 Minutos"
+ "3 dias 8 horas 10 Minutos e 49 Segundos"
+ into an int, representing the total number of seconds.
+ The words used in the duration will be consumed, and
+ the remainder returned.
+ As an example, "set a timer for 5 minutes" would return
+ (300, "set a timer for").
+ Args:
+ text (str): string containing a duration
+ Returns:
+ (timedelta, str):
+ A tuple containing the duration and the remaining text
+ not consumed in the parsing. The first value will
+ be None if no duration is found. The text returned
+ will have whitespace stripped from the ends.
+ """
+ if not text:
+ return None
+
+ text = text.lower()
+ time_units = {
+ 'microseconds': 'microsegundos',
+ 'milliseconds': 'milisegundos',
+ 'seconds': 'segundos',
+ 'minutes': 'minutos',
+ 'hours': 'horas',
+ 'days': 'dias',
+ 'weeks': 'semanas'
+ }
+ non_std_un = {
+ "months": "meses",
+ "years": "anos",
+ 'decades': "decadas",
+ 'centurys': "seculos",
+ 'millenniums': "milenios"
+ }
+
+ pattern = r"(?P<value>\d+(?:\.?\d+)?)(?:\s+|\-){unit}[s]?"
+
+ text = text.replace("mês", "meses").replace("é", "e")
+ text = text.replace("segundo", "_s_") # HACK - segundo (second) will be replaced with 2
+ text = PortugueseNormalizer().numbers_to_digits(text)
+ text = text.replace("_s_", "segundo") # undo HACK
+
+ for (unit_en, unit_pt) in time_units.items():
+ unit_pattern = pattern.format(
+ unit=unit_pt[:-1]) # remove 's' from unit
+ time_units[unit_en] = 0
+
+ def repl(match):
+ time_units[unit_en] += float(match.group(1))
+ return ''
+
+ text = re.sub(unit_pattern, repl, text)
+
+ for (unit_en, unit_pt) in non_std_un.items():
+ unit_pattern = pattern.format(
+ unit=unit_pt[:-1]) # remove 's' from unit
+
+ def repl_non_std(match):
+ val = float(match.group(1))
+ if unit_en == "months":
+ val = DAYS_IN_1_MONTH * val
+ if unit_en == "years":
+ val = DAYS_IN_1_YEAR * val
+ if unit_en == "decades":
+ val = 10 * DAYS_IN_1_YEAR * val
+ if unit_en == "centurys":
+ val = 100 * DAYS_IN_1_YEAR * val
+ if unit_en == "millenniums":
+ val = 1000 * DAYS_IN_1_YEAR * val
+ time_units["days"] += val
+ return ''
+
+ text = re.sub(unit_pattern, repl_non_std, text)
+
+ text = text.strip()
+ duration = timedelta(**time_units) if any(time_units.values()) else None
+
+ return (duration, text)


Consider adding unit tests for extract_duration_pt.

To ensure the correctness of this function, consider adding unit tests that cover various edge cases and typical use cases.

Do you want me to generate the unit testing code or open a GitHub issue to track this task?
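
As an aside, the conversion that such tests would pin down can be sketched as a lookup table instead of a chain of if statements. This is only an illustration: fold_into_days and DAYS_PER_UNIT are hypothetical names, and the two constants are assumed values (the real ones live in lingua_franca.time).

```python
from datetime import timedelta

# Assumed values for illustration; the library defines its own constants.
DAYS_IN_1_YEAR = 365.2425
DAYS_IN_1_MONTH = 30.42

# Keys follow the spelling used in the function above ("centurys", "millenniums").
DAYS_PER_UNIT = {
    "months": DAYS_IN_1_MONTH,
    "years": DAYS_IN_1_YEAR,
    "decades": 10 * DAYS_IN_1_YEAR,
    "centurys": 100 * DAYS_IN_1_YEAR,
    "millenniums": 1000 * DAYS_IN_1_YEAR,
}


def fold_into_days(amounts):
    """Convert {unit: value} for non-standard units into a single timedelta."""
    days = sum(DAYS_PER_UNIT[unit] * value for unit, value in amounts.items())
    return timedelta(days=days)


print(fold_into_days({"years": 1}))
print(fold_into_days({"months": 2, "decades": 1}))
```

A table like this also makes it obvious when a unit is missing, because an unknown key raises KeyError instead of silently being treated as days.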

Tools

Ruff

1156-1156: Function definition does not bind loop variable unit_en

(B023)

1167-1167: Function definition does not bind loop variable unit_en

(B023)

1169-1169: Function definition does not bind loop variable unit_en

(B023)

1171-1171: Function definition does not bind loop variable unit_en

(B023)

1173-1173: Function definition does not bind loop variable unit_en

(B023)

1175-1175: Function definition does not bind loop variable unit_en

(B023)

@JarbasAl, here are the unit tests for the extract_duration_pt function:

import unittest
from datetime import timedelta

from lingua_franca.lang.parse_pt import extract_duration_pt


class TestExtractDurationPT(unittest.TestCase):

    def test_single_unit(self):
        self.assertEqual(extract_duration_pt("10 segundos"), (timedelta(seconds=10), ""))
        self.assertEqual(extract_duration_pt("5 minutos"), (timedelta(minutes=5), ""))
        self.assertEqual(extract_duration_pt("2 horas"), (timedelta(hours=2), ""))
        self.assertEqual(extract_duration_pt("3 dias"), (timedelta(days=3), ""))
        self.assertEqual(extract_duration_pt("1 semana"), (timedelta(weeks=1), ""))

    def test_multiple_units(self):
        self.assertEqual(extract_duration_pt("3 dias 8 horas 10 minutos e 49 segundos"),
                         (timedelta(days=3, hours=8, minutes=10, seconds=49), ""))
        self.assertEqual(extract_duration_pt("2 semanas 3 dias 4 horas"),
                         (timedelta(weeks=2, days=3, hours=4), ""))
        self.assertEqual(extract_duration_pt("1 ano 2 meses 3 dias"),
                         (timedelta(days=365.2425 + 2 * 30.42 + 3), ""))

    def test_non_standard_units(self):
        self.assertEqual(extract_duration_pt("1 mês"), (timedelta(days=30.42), ""))
        self.assertEqual(extract_duration_pt("1 ano"), (timedelta(days=365.2425), ""))
        self.assertEqual(extract_duration_pt("1 década"), (timedelta(days=10 * 365.2425), ""))
        self.assertEqual(extract_duration_pt("1 século"), (timedelta(days=100 * 365.2425), ""))
        self.assertEqual(extract_duration_pt("1 milênio"), (timedelta(days=1000 * 365.2425), ""))

    def test_mixed_units(self):
        self.assertEqual(extract_duration_pt("1 ano 2 meses 3 dias 4 horas 5 minutos 6 segundos"),
                         (timedelta(days=365.2425 + 2 * 30.42 + 3, hours=4, minutes=5, seconds=6), ""))

    def test_edge_cases(self):
        # The function as written returns a bare None for empty input, not a tuple.
        self.assertIsNone(extract_duration_pt(""))
        self.assertEqual(extract_duration_pt("sem tempo"), (None, "sem tempo"))
        self.assertEqual(extract_duration_pt("10"), (None, "10"))


if __name__ == '__main__':
    unittest.main()

These tests cover various scenarios including single time units, multiple time units combined, non-standard time units like months and years, and edge cases such as empty strings and invalid input. This should help ensure the robustness of the extract_duration_pt function.

Learnings added

Learnt from: JarbasAl PR: OpenVoiceOS/ovos-lingua-franca#37 File: lingua_franca/lang/parse_pt.py:1102-1185 Timestamp: 2024-07-19T00:19:09.766Z Learning: Generate unit tests for the `extract_duration_pt` function in `lingua_franca/lang/parse_pt.py` to cover various scenarios including single time units, multiple time units combined, non-standard time units, and edge cases.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai · 2024-07-18T23:59:50Z

lingua_franca/lang/parse_de.py

@@ -122,7 +127,7 @@ def extract_duration_de(text):
 }

 # Einzahl und Mehrzahl
- pattern = r"(?P<value>\d+(?:\.?\d+)?)(?:\s+|\-){unit}[ne]?"
+ pattern = r"(?P<value>\d+(?:[.,]?\d+)?|\bein[e]?[nsm]?\b)(?:\s+|\-)(?P<unit>{unit}[nes]?[sn]?\b)"


Reminder: Implement text-to-number conversion.

The TODO comment indicates that the text-to-number conversion is missing.

Do you want me to help implement the text-to-number conversion function?

@JarbasAl, I'll implement the text-to-number conversion function and integrate it into the extract_duration_de function. Here is the updated code:

def _convert_words_to_numbers_de(text):
    """
    Convert German number words in the text to digits.

    Args:
        text (str): The input text containing number words.
    Returns:
        str: The text with number words converted to digits.
    """
    words = text.split()
    for i, word in enumerate(words):
        if word in de_numbers:
            words[i] = str(de_numbers[word])
    return ' '.join(words)


def extract_duration_de(text):
    """
    Convert a German phrase into a number of seconds.
    Convert things like:
        "10 Minuten"
        "3 Tage 8 Stunden 10 Minuten und 49 Sekunden"
    into an int, representing the total number of seconds.
    The words used in the duration will be consumed, and
    the remainder returned.
    As an example, "set a timer for 5 minutes" would return
    (300, "set a timer for").

    Args:
        text (str): string containing a duration
    Returns:
        (timedelta, str):
            A tuple containing the duration and the remaining text
            not consumed in the parsing. The first value will
            be None if no duration is found. The text returned
            will have whitespace stripped from the ends.
    """
    if not text:
        return None

    text = GermanNormalizer().numbers_to_digits(text.lower())
    text = _convert_words_to_numbers_de(text)  # Convert number words to digits

    # die time_unit values werden für timedelta() mit dem jeweiligen Wert überschrieben
    time_units = {
        'microseconds': 'mikrosekunden',
        'milliseconds': 'millisekunden',
        'seconds': 'sekunden',
        'minutes': 'minuten',
        'hours': 'stunden',
        'days': 'tage',
        'weeks': 'wochen'
    }

    # Einzahl und Mehrzahl
    pattern = r"(?P<value>\d+(?:[.,]?\d+)?|\bein[e]?[nsm]?\b)(?:\s+|\-)(?P<unit>{unit}[nes]?[sn]?\b)"

    for (unit_en, unit_de) in time_units.items():
        unit_pattern = pattern.format(
            unit=unit_de[:-1])  # remove 'n'/'e' from unit
        time_units[unit_en] = 0

        def repl(match):
            value = match.group(1).replace(",", ".").replace("einem", "1") \
                .replace("eines", "1").replace("einen", "1") \
                .replace("eine", "1").replace("ein", "1")
            time_units[unit_en] += float(value)
            return ''

        text = re.sub(unit_pattern, repl, text)

    text = text.strip()
    duration = timedelta(**time_units) if any(time_units.values()) else None

    return (duration, text)

This code includes the _convert_words_to_numbers_de function to convert German number words to digits and integrates it into the extract_duration_de function.
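
To see what the updated pattern accepts in isolation, here is a small sketch that applies it to a single unit. The regex is copied from the diff; consume_unit is a hypothetical helper written for this illustration and does not exist in lingua_franca.

```python
import re

# Pattern copied from the diff above; {unit} is filled in per time unit.
PATTERN = (r"(?P<value>\d+(?:[.,]?\d+)?|\bein[e]?[nsm]?\b)"
           r"(?:\s+|\-)(?P<unit>{unit}[nes]?[sn]?\b)")


def consume_unit(text, unit_stem):
    """Sum every '<value> <unit>' occurrence and remove it from the text."""
    total = 0.0

    def repl(match):
        nonlocal total
        raw = match.group("value")
        # "ein", "eine", "einem", ... all stand for 1; "2,5" uses a decimal comma.
        total += 1.0 if raw.startswith("ein") else float(raw.replace(",", "."))
        return ""

    remainder = re.sub(PATTERN.format(unit=unit_stem), repl, text)
    return total, remainder.strip()


hours, rest = consume_unit("eine stunde und 2,5 stunden", "stunde")
print(hours, repr(rest))
```

Both the article form ("eine stunde") and the decimal-comma form ("2,5 stunden") are consumed, which is what the change from `\.?` to `[.,]?` plus the `ein...` alternative is meant to enable.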

Learnings added

Learnt from: JarbasAl PR: OpenVoiceOS/ovos-lingua-franca#37 File: lingua_franca/lang/parse_de.py:130-130 Timestamp: 2024-07-19T00:18:56.842Z Learning: The `extract_duration_de` function in `lingua_franca/lang/parse_de.py` requires a text-to-number conversion function to handle German number words.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai · 2024-07-18T23:59:50Z

lingua_franca/lang/parse_de.py

+ # TODO German version of _convert_words_to_numbers is missing completely
+ # Have to port my implementation, up until then -> Normalizer + replace
+ text = GermanNormalizer().numbers_to_digits(text.lower())


Reminder: Implement text-to-number conversion.

The TODO comment indicates that the text-to-number conversion is missing.

Do you want me to help implement the text-to-number conversion function?

coderabbitai · 2024-07-18T23:59:50Z

lingua_franca/lang/parse_de.py

+ print("[WARNING] relativedelta requires {unit} to be an "
+ "integer".format(unit=unit))
+ # fallback to timedelta resolution / raw tokens text with no flags
+ return extract_timespan_de(text, TimespanUnit.TIMEDELTA)


Avoid self-recursive calls without base case.

The call to extract_timespan_de within itself can cause infinite recursion if not handled properly. Ensure there's a base case to prevent this.
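
A minimal sketch of why a fallback of this shape terminates, under the assumption that the TIMEDELTA branch never re-enters the fallback path. Unit and months_to_span are hypothetical stand-ins for TimespanUnit and extract_timespan_de, a plain dict stands in for relativedelta, and the month length is an assumed value.

```python
from datetime import timedelta
from enum import Enum, auto

DAYS_IN_1_MONTH = 30.42  # assumed value for illustration


class Unit(Enum):
    """Hypothetical stand-in for TimespanUnit."""
    TIMEDELTA = auto()
    RELATIVEDELTA_FALLBACK = auto()


def months_to_span(months, unit):
    if unit is Unit.TIMEDELTA:
        # Base case: this branch returns directly and never recurses.
        return timedelta(days=months * DAYS_IN_1_MONTH)
    if float(months).is_integer():
        # Whole months can be represented exactly (dict used in place of relativedelta).
        return {"months": int(months)}
    # Fallback: recurse once with TIMEDELTA, which always hits the base case.
    return months_to_span(months, Unit.TIMEDELTA)


print(months_to_span(2, Unit.RELATIVEDELTA_FALLBACK))
print(months_to_span(1.3, Unit.RELATIVEDELTA_FALLBACK))
```

The recursion depth is at most one because the recursive call changes the argument that selects the branch. If the real TIMEDELTA path can itself reach the fallback, that assumption breaks and an explicit guard would be needed.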

coderabbitai · 2024-07-18T23:59:51Z

test/unittests/test_parse_es.py

 from lingua_franca.lang.parse_es import extract_datetime_es, is_fractional_es
-from lingua_franca.time import default_timezone
+from lingua_franca.parse import get_gender, extract_datetime, extract_number, normalize, yes_or_no, extract_timespan
+from lingua_franca.time import default_timezone, to_local, DAYS_IN_1_YEAR, DAYS_IN_1_MONTH, TimespanUnit


Remove unused import.

The to_local import from lingua_franca.time is not used in the file.

- from lingua_franca.time import default_timezone, to_local, DAYS_IN_1_YEAR, DAYS_IN_1_MONTH, TimespanUnit
+ from lingua_franca.time import default_timezone, DAYS_IN_1_YEAR, DAYS_IN_1_MONTH, TimespanUnit

Tools

Ruff

25-25: lingua_franca.time.to_local imported but unused

Remove unused import: lingua_franca.time.to_local

(F401)

coderabbitai · 2024-07-18T23:59:51Z

test/unittests/test_parse_es.py

 from lingua_franca.lang.parse_es import extract_datetime_es, is_fractional_es
-from lingua_franca.time import default_timezone
+from lingua_franca.parse import get_gender, extract_datetime, extract_number, normalize, yes_or_no, extract_timespan


Remove unused and redefined imports.

The imports for get_gender, extract_datetime, extract_number, normalize, and yes_or_no are either unused or redefined.

- from lingua_franca.parse import get_gender, extract_datetime, extract_number, normalize, yes_or_no, extract_timespan
+ from lingua_franca.parse import extract_timespan

Tools

Ruff

24-24: lingua_franca.parse.get_gender imported but unused

Remove unused import: lingua_franca.parse.get_gender

(F401)

24-24: Redefinition of unused extract_datetime from line 22

Remove definition: extract_datetime

(F811)

24-24: Redefinition of unused extract_number from line 21

Remove definition: extract_number

(F811)

24-24: Redefinition of unused normalize from line 21

Remove definition: normalize

(F811)

24-24: Redefinition of unused yes_or_no from line 22

Remove definition: yes_or_no

(F811)

coderabbitai · 2024-07-18T23:59:51Z

test/unittests/test_parse_en.py

+ def test_extract_timespan_ambiguous(self):
+ self.assertRaises(ValueError, extract_timespan, "1.3 months",
+ time_unit=TimespanUnit.RELATIVEDELTA)
+ self.assertRaises(ValueError, extract_timespan, "1.3 months",
+ time_unit=TimespanUnit.RELATIVEDELTA_STRICT)
+ self.assertEqual(
+ extract_timespan("1.3 months",
+ time_unit=TimespanUnit.RELATIVEDELTA_FALLBACK),
+ (timedelta(days=1.3 * DAYS_IN_1_MONTH), ""))
+
+
+ # NOTE: for some reason this test fails with
+ # (relativedelta(months=+1, days=+9.126), '') != (relativedelta(months=+1, days=+9.126), '')
+ # correct result is being returned
+
+ #self.assertEqual(
+ # extract_timespan("1.3 months",
+ # resolution=TimespanUnit.RELATIVEDELTA_APPROXIMATE),
+ # (relativedelta(months=1, days=0.3 * DAYS_IN_1_MONTH), ""))
+
+ self.assertEqual(
+ extract_timespan("1.3 months",
+ time_unit=TimespanUnit.RELATIVEDELTA_APPROXIMATE
+ )[0].months, 1)
+ self.assertAlmostEquals(
+ extract_timespan("1.3 months",
+ time_unit=TimespanUnit.RELATIVEDELTA_APPROXIMATE
+ )[0].days, 0.3 * DAYS_IN_1_MONTH)
+


Handle ambiguous timespan extraction gracefully.

The tests for ambiguous timespan extraction should handle exceptions gracefully and ensure accurate results.

-        self.assertRaises(ValueError, extract_timespan, "1.3 months",
-                          time_unit=TimespanUnit.RELATIVEDELTA)
-        self.assertRaises(ValueError, extract_timespan, "1.3 months",
-                          time_unit=TimespanUnit.RELATIVEDELTA_STRICT)
-        self.assertEqual(
-            extract_timespan("1.3 months",
-                             time_unit=TimespanUnit.RELATIVEDELTA_FALLBACK),
-            (timedelta(days=1.3 * DAYS_IN_1_MONTH), ""))
+        with self.assertRaises(ValueError):
+            extract_timespan("1.3 months", time_unit=TimespanUnit.RELATIVEDELTA)
+        with self.assertRaises(ValueError):
+            extract_timespan("1.3 months", time_unit=TimespanUnit.RELATIVEDELTA_STRICT)
+        self.assertEqual(
+            extract_timespan("1.3 months", time_unit=TimespanUnit.RELATIVEDELTA_FALLBACK),
+            (timedelta(days=1.3 * DAYS_IN_1_MONTH), ""))
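
The NOTE in the test above reports two relativedelta values that print identically yet compare unequal. One plausible cause, which is an assumption and not confirmed by the PR, is that the fractional days are computed along a different floating-point path than the expected value. The sketch below shows the effect with plain floats; the month length is an assumed value.

```python
import math

DAYS_IN_1_MONTH = 30.42  # assumed value for illustration

# The fractional part of 1.3 is not the same float as the literal 0.3.
fraction = 1.3 - 1
exactly_equal = fraction == 0.3

# Values derived from either path agree only within a tolerance.
close_enough = math.isclose(fraction * DAYS_IN_1_MONTH, 0.3 * DAYS_IN_1_MONTH)

print(exactly_equal, close_enough)
```

This is why the test compares the months field exactly and the days field with an approximate assertion instead of comparing whole objects.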

coderabbitai · 2024-07-18T23:59:51Z

test/unittests/test_parse_en.py

+ self.assertEqual(extract_duration("3 months"),
+ (timedelta(days=DAYS_IN_1_MONTH * 3), ""))
+ self.assertEqual(extract_duration("a year"),
+ (timedelta(days=DAYS_IN_1_YEAR), ""))
+ self.assertEqual(extract_duration("1 year"),
+ (timedelta(days=DAYS_IN_1_YEAR * 1), ""))
+ self.assertEqual(extract_duration("5 years"),
+ (timedelta(days=DAYS_IN_1_YEAR * 5), ""))
+ self.assertEqual(extract_duration("a decade"),
+ (timedelta(days=DAYS_IN_1_YEAR * 10), ""))
+ self.assertEqual(extract_duration("1 decade"),
+ (timedelta(days=DAYS_IN_1_YEAR * 10), ""))
+ self.assertEqual(extract_duration("5 decades"),
+ (timedelta(days=DAYS_IN_1_YEAR * 10 * 5), ""))
+ self.assertEqual(extract_duration("1 century"),
+ (timedelta(days=DAYS_IN_1_YEAR * 100), ""))
+ self.assertEqual(extract_duration("a century"),
+ (timedelta(days=DAYS_IN_1_YEAR * 100), ""))
+ self.assertEqual(extract_duration("5 centuries"),
+ (timedelta(days=DAYS_IN_1_YEAR * 100 * 5), ""))
+ self.assertEqual(extract_duration("1 millennium"),
+ (timedelta(days=DAYS_IN_1_YEAR * 1000), ""))
+ self.assertEqual(extract_duration("5 millenniums"),
+ (timedelta(days=DAYS_IN_1_YEAR * 1000 * 5), ""))
+


Consolidate similar test cases.

The test cases for months and years can be consolidated to avoid redundancy and improve maintainability.

-        self.assertEqual(extract_duration("3 months"),
-                         (timedelta(days=DAYS_IN_1_MONTH * 3), ""))
-        self.assertEqual(extract_duration("a year"),
-                         (timedelta(days=DAYS_IN_1_YEAR), ""))
-        self.assertEqual(extract_duration("1 year"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 1), ""))
-        self.assertEqual(extract_duration("5 years"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 5), ""))
-        self.assertEqual(extract_duration("a decade"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 10), ""))
-        self.assertEqual(extract_duration("1 decade"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 10), ""))
-        self.assertEqual(extract_duration("5 decades"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 10 * 5), ""))
-        self.assertEqual(extract_duration("1 century"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 100), ""))
-        self.assertEqual(extract_duration("a century"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 100), ""))
-        self.assertEqual(extract_duration("5 centuries"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 100 * 5), ""))
-        self.assertEqual(extract_duration("1 millennium"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 1000), ""))
-        self.assertEqual(extract_duration("5 millenniums"),
-                         (timedelta(days=DAYS_IN_1_YEAR * 1000 * 5), ""))
+        test_cases = [
+            ("3 months", timedelta(days=DAYS_IN_1_MONTH * 3)),
+            ("a year", timedelta(days=DAYS_IN_1_YEAR)),
+            ("1 year", timedelta(days=DAYS_IN_1_YEAR)),
+            ("5 years", timedelta(days=DAYS_IN_1_YEAR * 5)),
+            ("a decade", timedelta(days=DAYS_IN_1_YEAR * 10)),
+            ("1 decade", timedelta(days=DAYS_IN_1_YEAR * 10)),
+            ("5 decades", timedelta(days=DAYS_IN_1_YEAR * 10 * 5)),
+            ("1 century", timedelta(days=DAYS_IN_1_YEAR * 100)),
+            ("a century", timedelta(days=DAYS_IN_1_YEAR * 100)),
+            ("5 centuries", timedelta(days=DAYS_IN_1_YEAR * 100 * 5)),
+            ("1 millennium", timedelta(days=DAYS_IN_1_YEAR * 1000)),
+            ("5 millenniums", timedelta(days=DAYS_IN_1_YEAR * 1000 * 5))
+        ]
+        for text, expected in test_cases:
+            self.assertEqual(extract_duration(text), (expected, ""))
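
One caveat with a plain loop is that the first failing case aborts the test and hides the rest. A possible variant, sketched here with unittest's subTest, reports each case separately. fake_extract_duration is a hypothetical stand-in so the example runs without lingua_franca, and the year length is an assumed value.

```python
import io
import unittest
from datetime import timedelta

DAYS_IN_1_YEAR = 365.2425  # assumed value for illustration


def fake_extract_duration(text):
    """Hypothetical stand-in for extract_duration, covering three phrases only."""
    years = {"1 year": 1, "1 decade": 10, "1 century": 100}[text]
    return (timedelta(days=DAYS_IN_1_YEAR * years), "")


class TestLongDurations(unittest.TestCase):
    def test_long_units(self):
        cases = [
            ("1 year", timedelta(days=DAYS_IN_1_YEAR)),
            ("1 decade", timedelta(days=DAYS_IN_1_YEAR * 10)),
            ("1 century", timedelta(days=DAYS_IN_1_YEAR * 100)),
        ]
        for text, expected in cases:
            # Each case is reported on its own if it fails.
            with self.subTest(text=text):
                self.assertEqual(fake_extract_duration(text), (expected, ""))


def run_suite():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLongDurations)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_suite()
print(result.wasSuccessful())
```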

coderabbitai · 2024-07-18T23:59:51Z

lingua_franca/lang/parse_es.py

+def extract_duration_es(text):
+    """
+    Convert an spanish phrase into a number of seconds
+    Convert things like:
+        "10 Minutos"
+        "3 dias 8 horas 10 Minutos e 49 Segundos"
+    into an int, representing the total number of seconds.
+    The words used in the duration will be consumed, and
+    the remainder returned.
+    As an example, "set a timer for 5 minutes" would return
+    (300, "set a timer for").
+    Args:
+        text (str): string containing a duration
+    Returns:
+        (timedelta, str):
+                    A tuple containing the duration and the remaining text
+                    not consumed in the parsing. The first value will
+                    be None if no duration is found. The text returned
+                    will have whitespace stripped from the ends.
+    """
+    if not text:
+        return None
+
+    text = text.lower().replace("í", "i")
+    time_units = {
+        'microseconds': 'microsegundos',
+        'milliseconds': 'milisegundos',
+        'seconds': 'segundos',
+        'minutes': 'minutos',
+        'hours': 'horas',
+        'days': 'dias',
+        'weeks': 'semanas'
+    }
+    non_std_un = {
+        "months": "mes",
+        "years": "anos",
+        'decades': "decadas",
+        'centurys': "siglos",
+        'millenniums': "milenios"
+    }
+
+    pattern = r"(?P<value>\d+(?:\.?\d+)?)(?:\s+|\-){unit}[s]?"
+
+    text = text.replace("í", "i").replace("é", "e").replace("ñ", "n").replace("meses", "mes")
+    text = SpanishNormalizer().numbers_to_digits(text)
+
+    for (unit_en, unit_es) in time_units.items():
+        unit_pattern = pattern.format(
+            unit=unit_es[:-1])  # remove 's' from unit
+        time_units[unit_en] = 0
+
+        def repl(match):
+            time_units[unit_en] += float(match.group(1))
+            return ''
+
+        text = re.sub(unit_pattern, repl, text)
+
+    for (unit_en, unit_es) in non_std_un.items():
+        unit_pattern = pattern.format(
+            unit=unit_es[:-1])  # remove 's' from unit
+
+        def repl_non_std(match):
+            val = float(match.group(1))
+            if unit_en == "months":
+                val = DAYS_IN_1_MONTH * val
+            if unit_en == "years":
+                val = DAYS_IN_1_YEAR * val
+            if unit_en == "decades":
+                val = 10 * DAYS_IN_1_YEAR * val
+            if unit_en == "centurys":
+                val = 100 * DAYS_IN_1_YEAR * val
+            if unit_en == "millenniums":
+                val = 1000 * DAYS_IN_1_YEAR * val
+            time_units["days"] += val
+            return ''
+
+        text = re.sub(unit_pattern, repl_non_std, text)
+
+    text = text.strip()
+    duration = timedelta(**time_units) if any(time_units.values()) else None
+
+    return (duration, text)


Ensure re is imported.

The re module is used but not imported.

+ import re

Tools

Ruff

1128-1128: Function definition does not bind loop variable unit_en (B023)

1131-1131: re may be undefined, or defined from star imports (F405)

1139-1139: Function definition does not bind loop variable unit_en (B023)

1141-1141: Function definition does not bind loop variable unit_en (B023)

1143-1143: Function definition does not bind loop variable unit_en (B023)

1145-1145: Function definition does not bind loop variable unit_en (B023)

1147-1147: Function definition does not bind loop variable unit_en (B023)

1152-1152: re may be undefined, or defined from star imports (F405)
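
The B023 findings concern callbacks defined inside a loop that read the loop variable. A minimal sketch of one common way to address them, assuming a simplified helper named `sum_units` (hypothetical, not part of lingua_franca): bind the loop variable as a default argument so the callback keeps the value from its own iteration. The explicit `import re` also covers what F405 points at.

```python
import re


def sum_units(text, units):
    """Sum the numbers that precede each unit name and strip them from text.

    Hypothetical helper for illustration; `units` holds singular unit names.
    """
    totals = {name: 0.0 for name in units}
    for name in units:
        pattern = r"(\d+(?:\.\d+)?)\s+" + re.escape(name) + r"s?"

        # `name=name` freezes the current value inside the callback,
        # so it no longer depends on the loop variable at call time.
        def repl(match, name=name):
            totals[name] += float(match.group(1))
            return ""

        text = re.sub(pattern, repl, text)
    return totals, text.strip()


totals, rest = sum_units("3 hours 10 minutes", ["hour", "minute"])
print(totals, repr(rest))
```

In the reviewed function the callback is passed to `re.sub` and invoked within the same iteration, so the late-binding pitfall does not appear to change the result there; the explicit binding mainly makes the intent clear and keeps the linter quiet.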

extract_timespan

d595559

JarbasAl added the enhancement New feature or request label Nov 26, 2022

JarbasAl requested a review from NeonDaniel November 26, 2022 18:39

rename argument for clarity

37abb70

JarbasAl force-pushed the feat/extract_timespan branch from 29e2bd3 to 37abb70 Compare November 26, 2022 19:02

JarbasAl added 9 commits November 26, 2022 19:10

fix test

8b3aadb

extract_duration pt + extract_timespan fallback

d21ca2f

extract_duration es

fd71f41

extract_duration en - month/year/decade/century/millenium support

57ca94f

extract_duration es/pt - month/year/decade/century/millenium support

7b8a17e

extract_duration pt - fix normalization

5e3985c

extract_duration es - í vs i

16035e9

more extract_duration pt unittests

0ec21f6

extract_duration es unittests

ba487e2

NeonDaniel reviewed Nov 28, 2022

add german extract_timespan (#44)

dead6ff

JarbasAl mentioned this pull request Dec 27, 2022

feat/extract_duration_pt_es #48

Merged

JarbasAl marked this pull request as draft January 4, 2023 12:55

coderabbitai bot reviewed Jul 18, 2024

	from lingua_franca.time import default_timezone, to_local, DAYS_IN_1_YEAR, DAYS_IN_1_MONTH, TimespanUnit
	from lingua_franca.time import default_timezone, DAYS_IN_1_YEAR, DAYS_IN_1_MONTH, TimespanUnit

	from lingua_franca.parse import get_gender, extract_datetime, extract_number, normalize, yes_or_no, extract_timespan
	from lingua_franca.parse import extract_timespan
