Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make some string comparisons case insensitive #707

Open
ysuarez opened this issue Nov 1, 2023 · 5 comments
Open

Make some string comparisons case insensitive #707

ysuarez opened this issue Nov 1, 2023 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@ysuarez
Copy link
Contributor

ysuarez commented Nov 1, 2023

noticed some bits of code using the Python built-in string methods of .startswith() and .endswith() that are case sensitive. Though for some of those situations it may be better the checks were case insensitive.

The check are mostly related to checking if a “file” field value starts with “http” and at least one situation were we check if a file suffix ends in “.xslx.” Usually these strings will be lowercase, but we may find situations where the incoming data is upper case or mix case.

For example,

islandora_workbench/workbench

Lines 2123 to 2126 in b38d37e

if config['task'] != 'create_from_files' and config['input_csv'].startswith('http') is True:
get_csv_from_google_sheet(config)
if config['task'] != 'create_from_files' and config['input_csv'].endswith('.xlsx') is True:
get_csv_from_excel(config)

I have a WIP PR that just adds a call to .lower() in the method chaining, to address this issues. Let me know this is a change that I should implement to all the relevant .startswith() and .endswith() uses.

@mjordan
Copy link
Owner

mjordan commented Nov 1, 2023

Thanks for catching this @ysuarez , I'll when you're ready.

@ysuarez ysuarez mentioned this issue Nov 2, 2023
7 tasks
@ysuarez
Copy link
Contributor Author

ysuarez commented Nov 2, 2023

I just put in the PR, but I realized my code was not 100% ready so I reverted a few things from this first PR. Though I seem to be correctly catching an .xlsx suffix with uppercase letters.

Need to work more to account for uppercase http prefix in 1) "file" values and 2) Google Sheets URLs. I will create another PR(s) next week.

@ysuarez
Copy link
Contributor Author

ysuarez commented Nov 12, 2023

@mjordan I just pushed a new commit to PR #708 to make the check of an URL in input_csvcase-insensitive when the code just checks the 'http' prefix. The code passed the local tests I ran.

I will need another week or so to work on making the various metadata.csv file column 'http' prefix checks case-insentive.

Would you like to merge these changes so far and I can make another PR, or should I keep working on making more 'http' check case-insensitive and then we merge? Thanks

@mjordan mjordan added the bug Something isn't working label Nov 23, 2023
@ysuarez
Copy link
Contributor Author

ysuarez commented Dec 11, 2023

@mjordan I just updated #708 to add the final set uses of lower() in front of .startswith('http') and did some trial runs with mix case uses of 'http' and the use of lower() worked.

I think this can be merged now. Let em know what changes you would like to me to make.

@mjordan
Copy link
Owner

mjordan commented Dec 11, 2023

Thanks, I'll review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants