Zappandy/spoken_language_detector

Project

Instructions

Before running the project, create a directory called model_output.

Dataset

The spoken language dataset. Be mindful that it is 16 GB in size.

Extra optional tools

You can check this thread comparing JupyterLab and Jupyter Notebook.

pip3 install jupyterlab
pip3 install notebook

These tools were only used to run .ipynb files and facilitate visualizations!

Jupyterlab guide

Python script to jupyter-notebook converter

Exploratory steps

How did we clean up the files?

ls <dir> | grep -o '.....$' | sort -u  # lists the distinct 5-character filename endings
ls <dir> | grep -o '^es.*'  # finds the Spanish ones

For our work, we used the test set, stored in a local directory such as

/media/andres/2D2DA2454B8413B5/test/test/

The final version of this cleanup is the file_cleaner script found in this directory. It copies the Spanish files to a new directory, which is given as its second argument.
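As a rough illustration of the copy step described above, here is a minimal Python sketch. It is not the actual file_cleaner script: the function name copy_language_files, the prefix parameter, and the assumption that Spanish files are recognisable by an "es" filename prefix (as in the grep command above) are all hypothetical choices made for this example.

```python
import shutil
from pathlib import Path


def copy_language_files(src_dir, dest_dir, prefix="es"):
    """Copy every regular file in src_dir whose name starts with prefix into dest_dir.

    Hypothetical helper: the real file_cleaner script may work differently.
    Returns the sorted list of copied file names.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    # Create the target directory if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.iterdir()):
        # Skip subdirectories and files for other languages.
        if path.is_file() and path.name.startswith(prefix):
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

Calling copy_language_files("<dir>", "<new dir>") would leave the source directory untouched and return the names of the files it copied, which makes it easy to check how many Spanish samples were found.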

Theory

Tutorial on mel spectrograms
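To make the idea of the mel scale a little more concrete, the sketch below converts between hertz and mels using one widely used formula, m = 2595 * log10(1 + f / 700). This is only an illustration under that assumption: audio libraries offer several variants of the mel scale, and this README does not say which one the project relies on. The function names are made up for the example.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (2595 * log10(1 + f / 700) variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 edge frequencies in Hz, equally spaced on the mel scale.

    These edges are what a mel filter bank is usually built from:
    each band spans three consecutive edges.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]
```

Because the edges are evenly spaced in mels, they end up densely packed at low frequencies and spread out at high frequencies, which is the property that makes mel spectrograms a common input for speech models.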
