Update Categories - fixed pipeline #308

Open
wants to merge 4 commits into
base: main

Conversation

provinzio
Contributor

@provinzio provinzio commented May 30, 2024

Issue #, if available: replaces #274

Description of changes: Fixed the pipeline of the other PR by converting the encoding of the categories.yaml file to UTF-8.
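For readers who hit the same pipeline failure, an encoding conversion like the one described can be sketched as follows. The PR does not say which encoding the file originally had, so `source_encoding` is an assumption the caller has to supply, and `reencode_to_utf8` is a hypothetical helper name, not something from this repository.

```python
from pathlib import Path


def reencode_to_utf8(path: Path, source_encoding: str) -> None:
    """Rewrite a text file in place as UTF-8 (without a byte order mark).

    source_encoding is an assumption: the original encoding of the file
    is not stated in the PR and must be known or detected by the caller.
    """
    # Work on bytes so that line endings are preserved exactly.
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
```

A call such as `reencode_to_utf8(Path("categories.yaml"), "utf-16")` would cover the case where the file had been saved as UTF-16; a wrong guess for the source encoding either raises `UnicodeDecodeError` or produces garbled text, so the result is worth checking by eye.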

I removed the old category.yaml file and replaced it with the complete category list that I scraped from kleinanzeigen.de.
These should be all categories as of March 2024.
I built a category tree, so all leaf categories are included in the YAML file (with name and path).

I haven't tested the categories, just fixed the pipeline.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

cooukiez and others added 3 commits March 14, 2024 16:34
Delete old categories file for new file
I have collected all categories from kleinanzeigen.de
Format is:
category_name:
    subcategory_name:
         subsubcategory_name: path
...
@sebthom
Contributor

sebthom commented May 30, 2024

I don't think this works. Categories are currently expected to be a flat structure, not a tree.

I am not too big a fan of updating the categories like this anyway, as it may break existing ads.

@provinzio
Contributor Author

I actually have no idea how the categories work, but I just did some quick regex search-and-replace magic to flatten the structure.

Hope it helps.
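As an alternative to regex editing, flattening the tree can be sketched in a few lines once the YAML has been loaded into nested dictionaries. This is a minimal sketch, not the approach used in this PR: `flatten_categories` is a hypothetical helper, and it assumes that every leaf maps a category name to its path string, as in the tree format quoted above.

```python
from typing import Any, Dict, Optional


def flatten_categories(
    tree: Dict[str, Any], flat: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Collect every leaf (name -> path) of a nested category mapping."""
    if flat is None:
        flat = {}
    for name, value in tree.items():
        if isinstance(value, dict):
            # Inner node: descend into the subcategories.
            flatten_categories(value, flat)
        elif name in flat:
            # Two leaves share a name; refuse to overwrite silently.
            raise ValueError(f"duplicate leaf category name: {name!r}")
        else:
            flat[name] = value
    return flat
```

Raising on a repeated leaf name is a deliberate choice here: a flat mapping cannot hold two entries under one key, so collisions have to be renamed by hand rather than dropped.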

@provinzio
Contributor Author

Ah ok, now I see your point... I'll reformat the categories.yaml

@provinzio provinzio marked this pull request as draft May 30, 2024 19:22
@provinzio
Contributor Author

provinzio commented May 30, 2024

Ok, I added all missing categories from #274 at the bottom.

I haven't sorted them in, because that could be too much of a pain.

I was quite random when changing the keys for the new duplicated entries. Please have a look and tell me what you think about it.

Easter egg: Reciever has multiple different typos but is never written without typos (reciver, receiver)
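Since entries were appended at the bottom of a flat file, a quick check for keys that occur more than once may help a reviewer, because some YAML loaders accept repeated keys without complaint. The sketch below is an assumption-laden illustration, not part of this PR: `duplicate_top_level_keys` is a hypothetical helper that only looks at unindented `key: value` lines and does not handle quoted keys or keys containing a colon.

```python
import re
from collections import Counter
from typing import List

# An unindented line that is not a comment, up to its first colon.
KEY_RE = re.compile(r"^([^\s#][^:]*):")


def duplicate_top_level_keys(yaml_text: str) -> List[str]:
    """Return the top-level keys that appear more than once, sorted."""
    keys = []
    for line in yaml_text.splitlines():
        match = KEY_RE.match(line)
        if match:
            keys.append(match.group(1).strip())
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)
```

Running it over the text of categories.yaml returns an empty list when every top-level key is unique.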

@provinzio provinzio marked this pull request as ready for review May 30, 2024 20:54
@Vel-San

Vel-San commented Jun 22, 2024

@provinzio Good approach!

However, the bot is still unable to publish ads even with this PR. I am still getting random errors for Art, e.g.:

[ERROR] TimeoutError: Failed to set special attribute [art_s]

For category

special_attributes:
  art_s: sonstige
  condition_s: like_new
  versand_s: f

I think the problem is with the download option, where the category is not saved like the ones you have updated.
It requires mapping for it to work properly.


4 participants