About re-training #2

Open
zqlsnr opened this issue May 9, 2024 · 1 comment
zqlsnr commented May 9, 2024

Hi, dennisvdang! Thank you for open-sourcing your project. I have tried the model and it works really well. I also have a question: if I want to re-train the model on the harmonixset dataset, could you guide me on how to do it? There also seems to be an issue with the CSV files under ./data: when I run the preprocessing, the TrackName, Artists, and Genre fields are all empty. Do you have any suggestions for resolving this?

dennisvdang (Owner) commented

Hi, I'm glad to hear you were able to try the model successfully! Re-training the model on the harmonixset dataset sounds like a great idea, and I'd be interested in seeing how it turns out.

Regarding the preprocessing steps, the only "necessary" step is the silence trimming, which can be accomplished using this function:

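from functools import reduce

from pydub import AudioSegment
from pydub.silence import detect_nonsilent


def strip_silence(audio_path):
    """Remove silent stretches from an audio file, overwriting it in place.

    Note: this drops every silent stretch of at least 500 ms, not only
    the leading and trailing ones, and always writes MP3 data.
    """
    sound = AudioSegment.from_file(audio_path)
    # Each entry is a [start_ms, end_ms] pair for a non-silent region;
    # silence means at least 500 ms below -50 dBFS.
    nonsilent_ranges = detect_nonsilent(
        sound, min_silence_len=500, silence_thresh=-50)
    # Concatenate the non-silent regions in their original order.
    stripped = reduce(lambda acc, val: acc + sound[val[0]:val[1]],
                      nonsilent_ranges, AudioSegment.empty())
    # Overwrite the source file (output is MP3 regardless of input format).
    stripped.export(audio_path, format='mp3')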

The other preprocessing steps were there to manage audio file metadata in case I wanted to use that information later, but I didn't end up using it. So all I really did for preprocessing was clean up the audio by removing any leading or trailing silence. Everything else can be adjusted to match your own dataset and dataframe structure. I'll also revisit the preprocessing notebook and make a more generalized version of it. Hope this helps!
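One thing to watch: as written, strip_silence concatenates every non-silent range, so quiet passages in the middle of a track are cut as well, which shifts any timestamps measured against the original audio. If only the leading and trailing silence should go, a minimal sketch could look like the following. The names edge_trim_bounds and trim_edge_silence, the separate dst_path argument, and picking the export format from the file extension are all assumptions for illustration, not part of the project's code; the thresholds are copied from the function above.

```python
from pathlib import Path


def edge_trim_bounds(nonsilent_ranges, total_ms):
    """Return (start_ms, end_ms) spanning the first to the last non-silent range.

    If no non-silent range was found, keep the whole clip unchanged.
    """
    if not nonsilent_ranges:
        return 0, total_ms
    return nonsilent_ranges[0][0], nonsilent_ranges[-1][1]


def trim_edge_silence(src_path, dst_path, min_silence_len=500, silence_thresh=-50):
    """Trim only leading/trailing silence and write the result to dst_path."""
    # Imported here so edge_trim_bounds stays usable without pydub installed.
    from pydub import AudioSegment
    from pydub.silence import detect_nonsilent

    sound = AudioSegment.from_file(src_path)
    ranges = detect_nonsilent(
        sound, min_silence_len=min_silence_len, silence_thresh=silence_thresh)
    start, end = edge_trim_bounds(ranges, len(sound))
    # Assumption: the extension of dst_path names the export format.
    fmt = Path(dst_path).suffix.lstrip(".") or "mp3"
    sound[start:end].export(dst_path, format=fmt)
```

Writing to a separate dst_path leaves the source files untouched, so the trimming can be re-run with different thresholds if the first pass is too aggressive.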
