Club names and aliases not correctly mapped at teamname_replacements.json #703

Open
4 of 12 tasks
MartiONE opened this issue Sep 12, 2024 · 2 comments · May be fixed by #755
Labels
bug Something isn't working

Comments

MartiONE commented Sep 12, 2024

Describe the bug
The bug appears when calling read_team_history on ClubElo. The file containing aliases for team names that may differ from those on the ClubElo website is loaded correctly and stored in the TEAMNAME_REPLACEMENTS variable in _config.py.
However, when read_team_history in the ClubElo class filters the names to process, it applies the mapping in the reverse direction. The problematic line is this one

Affected scrapers
This affects the following scrapers:

  • ClubElo
  • ESPN
  • FBref
  • FiveThirtyEight
  • FotMob
  • Match History
  • SoFIFA
  • Understat
  • WhoScored

Code example
Given a minimal teamname_replacements.json like

{"Tottenham": ["Tottenham Hotspur", "Tottenham Hotspur FC", "Spurs"]}

and then you run

import soccerdata as sd
elo = sd.ClubElo()
elo.read_team_history(team="Spurs")
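For illustration, the lookup this example expects can be sketched in plain Python. This is only a sketch under assumptions: the alias-to-canonical layout of the flattened mapping and the helper name resolve_team are hypothetical and are not taken from the soccerdata source.

```python
import json

# Same minimal file content as in the example above.
raw = '{"Tottenham": ["Tottenham Hotspur", "Tottenham Hotspur FC", "Spurs"]}'
aliases_by_canonical = json.loads(raw)

# Flatten to alias -> canonical name. This layout is an assumption about
# what TEAMNAME_REPLACEMENTS holds; the issue does not show it.
canonical_by_alias = {}
for canonical_name, aliases in aliases_by_canonical.items():
    for alias in aliases:
        canonical_by_alias[alias] = canonical_name


def resolve_team(team):
    """Return the canonical name for an alias, or the input unchanged."""
    return canonical_by_alias.get(team, team)


resolved_alias = resolve_team("Spurs")
resolved_canonical = resolve_team("Tottenham")
```

With a lookup like this, both "Spurs" and "Tottenham" resolve to the same club, which is the behaviour the call above expects instead of a ValueError.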

Error message

ValueError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 elo.read_team_history(team="Spurs")

File ~/jupyter-env/venv/lib/python3.11/site-packages/soccerdata/clubelo.py:179, in ClubElo.read_team_history(self, team, max_age)
    176         df.replace({"team": TEAMNAME_REPLACEMENTS}, inplace=True)
    177         return df
--> 179 raise ValueError(f"No data found for team {team}")

ValueError: No data found for team Spurs

Additional context
The same problem may also occur in other, more specialized classes; I did not check in depth: here, here and here

Contributor Action Plan

  • I can fix this issue and will submit a pull request.
  • I’m unsure how to fix this, but I'm willing to work on it with guidance.
  • I’m not able to fix this issue.

This is a trivial but important change. I can also fix the tests or make them check the file. I'd also suggest avoiding one-liners with variable names like k and v, as those tend to be hard to debug.
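As a rough sketch of what a more readable version could look like, the helper below collects every spelling of a club with an explicit loop and descriptive names instead of a k, v one-liner. The function name candidate_team_names and the alias-to-canonical layout of the mapping are assumptions for illustration, not the library's actual code.

```python
def candidate_team_names(team, canonical_by_alias):
    """Collect every spelling that refers to the same club as `team`."""
    # Resolve an alias to its canonical name; unknown names pass through.
    canonical_name = canonical_by_alias.get(team, team)
    candidates = [canonical_name]
    # Add every alias that maps to the same canonical name.
    for alias, mapped_name in canonical_by_alias.items():
        if mapped_name == canonical_name and alias not in candidates:
            candidates.append(alias)
    # Keep the user's own spelling in case it is not in the mapping.
    if team not in candidates:
        candidates.append(team)
    return candidates


# Hypothetical mapping, flattened from the minimal JSON in the code example.
example_mapping = {
    "Tottenham Hotspur": "Tottenham",
    "Tottenham Hotspur FC": "Tottenham",
    "Spurs": "Tottenham",
}
from_alias = candidate_team_names("Spurs", example_mapping)
from_canonical = candidate_team_names("Tottenham", example_mapping)
```

Because the helper resolves in both directions, passing an alias or the canonical name yields the same list of candidates, and a test can assert exactly that against the contents of the replacements file.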

MartiONE added the bug label Sep 12, 2024
MartiONE (Author) commented

Hey @probberechts, am I in the clear to provide a fix for this? :)

probberechts (Owner) commented

Great catch! It would be great if you provide a fix.
