Add support to specify the search term for a tv show #175

Open
JorTurFer opened this issue Feb 4, 2022 · 14 comments

@JorTurFer

Nefarious monitors the availability of new episodes for TV shows, and this is an awesome feature ❤️
The problem is that on non-English trackers, uploaders sometimes name the torrents with the Spanish translation and sometimes with the English name. In these cases, it'd be nice if I could override the search term for a specific TV show.

For example, The Legend of Vox Machina is translated as La leyenda de Vox Machina. Nefarious tries to look for La leyenda de Vox Machina, but the uploader uploaded the whole show using the original name instead of the translated one. For movies this is not a problem because I go and select the torrent manually, but for TV shows it means that I have to add all the torrents one by one, and I also lose the automatic tracking of new episodes.

Thanks for this awesome tool! ❤️ ❤️ ❤️ ❤️

@lardbit
Owner

lardbit commented Feb 4, 2022

Hey @JorTurFer,

Yeah that certainly makes sense. I'm trying to think of the best way to solve this. I'd like to avoid having users manually editing records.

TMDB returns language-specific results, but it also includes the original_name. So, when searching for The Legend of Vox Machina with Spanish as the language, it returns "name": "La leyenda de Vox Machina" and "original_name": "The Legend of Vox Machina".

So, there could just be a checkbox for shows/movies titled "use original title when searching" and then nefarious would use the original_name vs name result when searching torrents.

Would that suffice for your use case?

Example requesting Spanish results:

https://api.themoviedb.org/3/search/tv?api_key=21c8985a267ac3f11ea75baf2c05c3ba&query=The%20Legend%20of%20Vox%20Machina&language=es

{
    "page": 1,
    "results": [{
        "backdrop_path": "/lX33BV2g6O2B6PwMtTUSyzGrfq9.jpg",
        "first_air_date": "2022-01-27",
        "genre_ids": [
            16,
            10765
        ],
        "id": 135934,
        "name": "La leyenda de Vox Machina",
        "origin_country": [
            "US"
        ],
        "original_language": "en",
        "original_name": "The Legend of Vox Machina",
        "overview": "Son un hatajo de pendencieros inadaptados reconvertidos en mercenarios. A Vox Machina le interesa más el dinero fácil y la cerveza barata que proteger el reino. Pero cuando este se ve amenazado por algo maligno, esta bulliciosa panda se da cuenta de que nadie más puede restablecer la justicia. Lo que empezó como un día de pago más es ahora la historia del origen de los nuevos héroes de Exandria.",
        "popularity": 178.913,
        "poster_path": "/4fqfhmVNOHe2nLcligiVMtMnfeM.jpg",
        "vote_average": 8.8,
        "vote_count": 28
    }],
    "total_pages": 1,
    "total_results": 1
}
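
Given a result like the one above, the checkbox idea could be sketched roughly as follows. This is only an illustration: `pick_search_title` and the `use_original_title` flag are hypothetical names, not existing nefarious code. It relies on TMDB using `name`/`original_name` for TV results and `title`/`original_title` for movie results.

```python
def pick_search_title(result, use_original_title=False):
    """Return the title to use when searching torrents (hypothetical helper)."""
    # TV results carry name/original_name, movie results carry title/original_title
    localized = result.get("name") or result.get("title")
    original = result.get("original_name") or result.get("original_title")
    # Fall back to the localized title if no original title is present
    if use_original_title and original:
        return original
    return localized


result = {
    "name": "La leyenda de Vox Machina",
    "original_name": "The Legend of Vox Machina",
}
print(pick_search_title(result))
print(pick_search_title(result, use_original_title=True))
```

With the flag off the search term stays "La leyenda de Vox Machina"; with it on, the search uses "The Legend of Vox Machina".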

@JorTurFer
Author

JorTurFer commented Feb 4, 2022

That's perfect for my use case :)

@lardbit
Owner

lardbit commented Feb 4, 2022

Great. I don't think this will be a difficult implementation. I'll leave this ticket open until I have time to work on it.

@JorTurFer
Author

Thanks!! No rush at all 😄

@lardbit
Owner

lardbit commented Feb 14, 2022

I think I spoke too soon on the difficulty of this task. All the torrent name parsing logic (borrowed from sonarr/radarr) largely expects Latin-based languages, which is obviously very limiting. In addition, I'm noticing that foreign-language films (relative to the USA) often include both the original title and the English title in the release name, which makes parsing out the title a separate challenge.

For instance, searching for the movie Parasite (whose original Korean title is 기생충) returns results like:

기생충 Parasite (2019) (2160p BluRay x265 HEVC 10bit HDR AAC 7.1 Bandi)

@JorTurFer
Author

In that case no worries, thanks for trying ❤️
WDYT about the option of specifying the name manually? Maybe it's easier.
If not, don't worry. As I said, you tried :) You're building an awesome system.

@lardbit
Owner

lardbit commented Feb 14, 2022

I think specifying the title manually would have the same challenges. For instance, if you're searching for the original title of Parasite (i.e. 기생충), then nefarious would have to parse results like 기생충 Parasite (2019) (2160p BluRay x265 HEVC 10bit HDR AAC 7.1 Bandi) and ignore the title Parasite to match the original title. It gets a little tricky. Maybe I'm missing something, though. Do you have an idea for how to solve this?

@JorTurFer
Author

But you could ignore the current value if another value has been specified. I mean, if I specify 기생충 because I know that my tracker uses the original name, nefarious could drop Parasite from the search and search only with the provided name: {givenName} (2160p BluRay x265 HEVC 10bit HDR AAC 7.1 Bandi).
The provided name would replace the original one, so no extra parsing is needed.

@lardbit
Owner

lardbit commented Feb 14, 2022

That's true. Maybe it could be that simple. So, if the "search original name" option is chosen, we'd just need to strip out any matching translated part. If we're looking for the Japanese film Spirited Away (original title 千と千尋の神隠し) and a search result was:

劇場版 千と千尋の神隠し Spirited.Away (Sen to Chihiro no Kamikakushi) (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED] (Jap,Eng,Fre,Ger,Fin,Kor,Chi)

We'd have to remove Spirited.Away (and any other word-separator variation of it).

I'll give this a test and see how well it does.
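
A rough sketch of that stripping step, assuming a hypothetical helper `strip_translated_title` (not something that exists in nefarious). It builds a regex from the translated title that tolerates spaces, dots, underscores, or hyphens between words, then removes the match from the release name.

```python
import re


def strip_translated_title(release_name, translated_title):
    """Remove the translated title from a release name (hypothetical helper)."""
    words = translated_title.split()
    # Allow any run of common word separators between the title's words
    body = r"[\s._-]+".join(re.escape(word) for word in words)
    # Only match whole words, so "Parasite" does not match inside "Parasites"
    pattern = re.compile(r"(?<!\w)" + body + r"(?!\w)", flags=re.IGNORECASE)
    stripped = pattern.sub(" ", release_name)
    # Collapse the whitespace left behind by the removal
    return re.sub(r"\s{2,}", " ", stripped).strip()


name = "劇場版 千と千尋の神隠し Spirited.Away (Sen to Chihiro no Kamikakushi)"
print(strip_translated_title(name, "Spirited Away"))
```

This only handles the translated title appearing as a contiguous run of words; a release that reorders or abbreviates the title would still slip through.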

@lardbit
Owner

lardbit commented Feb 14, 2022

Well, shoot. I just tested the current parsing logic in nefarious and it doesn't match anything for the above result, with Spirited Away removed. nefarious has a command line parsing utility to tell you what it parses and it unfortunately didn't work:

python manage.py re-test-movie "劇場版 千と千尋の神隠し (Sen to Chihiro no Kamikakushi) (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED] (Jap,Eng,Fre,Ger,Fin,Kor,Chi)"

Returns None.

I'm assuming we'd have to update the parsing logic, and I haven't dug into that yet. Maybe it's a non-Latin language issue?

Parsing logic for movies:
https://github.com/lardbit/nefarious/blob/master/src/nefarious/parsers/movie.py

@JorTurFer
Author

Let me try during the day using that name with the series I mentioned (The Legend of Vox Machina).
Should I append the quality to the name? I mean, what should the command look like? (Sorry, I'm a total noob with Python.)

@JorTurFer
Author

Something like python manage.py re-test-movie "La Legenda de Vox Machina (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED]" ?

@lardbit
Owner

lardbit commented Feb 15, 2022

Yeah, you have the command right. I usually just manually run a search against jackett to find real responses to test with, and then pass the title to the command line utility to see how nefarious parses it. I found out a couple of things. First, most of the parsers expect a year in the title, which is why the previous examples weren't working. Second, a while ago I added a unicode "transliteration" to ASCII (e.g. Ö becomes O), which aimed to match titles that indexers produce in ASCII instead of the original characters. However, that is messing up our intention here. We could conditionally disable that transliteration when the search original title option is enabled.
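
To make the transliteration point concrete, here is a minimal sketch using only the standard library. It is an approximation: nefarious may use a different transliteration library, and `normalize_title` and its `use_original_title` flag are hypothetical names. The stdlib approach decomposes accented characters and drops anything that is not ASCII, which shows why a Japanese title cannot survive the step.

```python
import unicodedata


def to_ascii(title):
    """Crude ASCII transliteration: decompose accents, drop non-ASCII characters."""
    decomposed = unicodedata.normalize("NFKD", title)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def normalize_title(title, use_original_title=False):
    """Normalize a title for matching (hypothetical helper)."""
    if use_original_title:
        # Keep the original characters so non-Latin titles can still match
        return title.strip().lower()
    return to_ascii(title).strip().lower()


print(normalize_title("Öffentlich"))
print(repr(normalize_title("千と千尋の神隠し")))
print(normalize_title("千と千尋の神隠し", use_original_title=True))
```

The second call prints an empty string because every character is discarded, while the third keeps the title intact.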

Here's a quick way to set up your local python environment to be able to run the command line utility:

Change to nefarious source directory

cd nefarious/src

Create new python virtual environment (in /tmp)

python3 -m venv /tmp/nefarious

Install python dependencies:

/tmp/nefarious/bin/pip install -r requirements.txt

Parse 1:

/tmp/nefarious/bin/python manage.py re-test-movie "La Legenda de Vox Machina 2022"

{'title': 'la legenda de vox machina', 'year': ['2022'], 'match_name': 'Normal movie format, e.g: Mission.Impossible.3.2011', 'quality': 'Bluray-720p', 'resolution': 'unknown', 'hc': False}

Parse 2 without transliteration (just return title in function):

/tmp/nefarious/bin/python manage.py re-test-movie "劇場版 千と千尋の神隠し 2013 (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED]"

{'title': '劇場版 千と千尋の神隠し', 'year': ['2013'], 'match_name': 'Normal movie format, e.g: Mission.Impossible.3.2011', 'quality': 'Bluray-720p', 'resolution': 'unknown', 'hc': False}
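
The year requirement can be illustrated with a deliberately simplified pattern. This is a hypothetical sketch, not the actual regex from parsers/movie.py: `YEAR_PATTERN` and `parse_release` are made-up names, and the real parsers extract much more (quality, resolution, and so on). It only shows why a release name without a year yields None while the same name with a year parses.

```python
import re

# Hypothetical, simplified pattern: a title, separators, then a four-digit year
YEAR_PATTERN = re.compile(
    r"^(?P<title>.+?)[\s._(\[]+(?P<year>(?:19|20)\d{2})(?!\d)"
)


def parse_release(name):
    """Return title and year, or None when no year is found (hypothetical helper)."""
    match = YEAR_PATTERN.search(name)
    if match is None:
        return None
    return {
        "title": match.group("title").strip().lower(),
        "year": match.group("year"),
    }


print(parse_release("劇場版 千と千尋の神隠し (BD 1280x720p AVC AACx9 Subx7).mp4"))
print(parse_release("劇場版 千と千尋の神隠し 2013 (BD 1280x720p AVC AACx9 Subx7).mp4"))
```

The first call finds no year and returns None; the second returns the Japanese title together with the year 2013.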

Movie Parsers
https://github.com/lardbit/nefarious/blob/master/src/nefarious/parsers/movie.py

@lardbit
Owner

lardbit commented Feb 15, 2022

So, long story short, disabling transliteration when searching original titles may do the trick. I'll investigate.
