About re-training #2

zqlsnr · 2024-05-09T01:25:46Z

Hi, dennisvdang! Thank you for open-sourcing your project. I have tried the model and it works really well. I also have a question: if I want to re-train the model on the harmonixset dataset, could you guide me on how to do it? There seems to be an issue with the CSV files under ./data. When I run the preprocessing, the TrackName, Artists, and Genre fields are all empty. Do you have any suggestions for resolving this?

dennisvdang · 2024-05-10T19:21:07Z

Hi, I'm glad to hear you were able to try the model successfully! Re-training the model on the harmonixset dataset sounds like a great idea, and I'd be interested in seeing how it turns out.

Regarding the preprocessing steps, the only "necessary" step is the silence trimming, which can be accomplished using this function:

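from functools import reduce
from pydub import AudioSegment
from pydub.silence import detect_nonsilent

def strip_silence(audio_path):
    """Removes silent parts from an audio file, overwriting it in place."""
    sound = AudioSegment.from_file(audio_path)
    # [start_ms, end_ms] ranges of audio louder than -50 dBFS; only gaps of
    # at least 500 ms count as silence.
    nonsilent_ranges = detect_nonsilent(
        sound, min_silence_len=500, silence_thresh=-50)
    # Concatenate the non-silent ranges into a single segment.
    stripped = reduce(lambda acc, val: acc + sound[val[0]:val[1]],
                      nonsilent_ranges, AudioSegment.empty())
    # Overwrites the original file; the output is always encoded as MP3.
    stripped.export(audio_path, format='mp3')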

The other preprocessing steps were to manage audio file metadata in case I wanted to use that information later (but I didn't end up using it). So all I really did for preprocessing was clean up the audio by removing any leading or trailing silences. Anything else can be adjusted to match your dataset/dataframe structure preferences. I'll also revisit the preprocessing notebook and make a more generalized version of it. Hope this helps!
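
For anyone re-training on their own copy of a dataset, the trimming step could be applied to a whole folder of audio with a small driver like the one below. This is only a sketch under assumptions: the name process_audio_dir, the folder layout, and the default ".mp3" filter are hypothetical and not part of the repository. The default filter is ".mp3" because strip_silence above always exports MP3 back to the same path.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def process_audio_dir(
    root: str,
    process_fn: Callable[[str], None],
    extensions: Iterable[str] = (".mp3",),
) -> List[str]:
    """Apply process_fn to every matching file under root.

    Hypothetical helper (not from the repository). Returns the paths
    for which process_fn raised, so they can be inspected afterwards.
    """
    wanted = {ext.lower() for ext in extensions}
    failed = []
    # sorted() keeps the processing order reproducible between runs.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        try:
            process_fn(str(path))
        except Exception as exc:
            # Keep going if a single file is unreadable or corrupt.
            print(f"skipped {path}: {exc}")
            failed.append(str(path))
    return failed
```

With strip_silence defined as above, a call such as process_audio_dir("path/to/audio", strip_silence) would trim every MP3 under that folder in place (the path is a placeholder). Since the files are overwritten, working on a copy of the dataset is the safer choice.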
