This repository collects snippets, finished or still in development, used for machine learning in Python, such as: CSV to ARFF, CSV percentage split.
1. CSV percentage split (supervised learning): takes a CSV file with semicolon delimiters and splits it proportionally to each true label, exporting two files: one for training (normally 80%) and another for testing (normally 20%).
1. Make split-supervised-learning.py executable:
$ chmod +x split-supervised-learning.py
2. Place the CSV file to be split inside /data/raw.
3. Make sure the values are delimited by semicolons and the header titles are enclosed in quotes.
4. Run the Python script with the true label name, the file name from /data/raw (without the .csv extension), the training rate and the testing rate. Examples:
$ ./split-supervised-learning.py lettr letters 80 20
$ ./split-supervised-learning.py quality wine 90 10
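To illustrate what "split proportionally to each true label" means, here is a minimal sketch of a stratified split using only the standard library. It is not the code of split-supervised-learning.py: the function name `stratified_split`, the `seed` parameter and the in-memory sample data are assumptions made for this example.

```python
import csv
import io
import random
from collections import defaultdict


def stratified_split(rows, label, train_rate, seed=0):
    """Split rows (dicts) so that each label value sends roughly
    train_rate percent of its rows to training and the rest to testing."""
    # Group the rows by the value of the true label column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[label]].append(row)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, test = [], []
    for members in groups.values():
        rng.shuffle(members)
        cut = round(len(members) * train_rate / 100)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Hypothetical sample: semicolon-delimited, header titles in quotes.
sample = '"x";"quality"\n' + "".join(
    f"{i};{'good' if i % 2 else 'bad'}\n" for i in range(10)
)
rows = list(csv.DictReader(io.StringIO(sample), delimiter=";"))

train, test = stratified_split(rows, "quality", 80)
print(len(train), len(test))
```

With 5 "good" and 5 "bad" rows and a training rate of 80, each label contributes 4 rows to training and 1 to testing, so both output sets keep the original label proportions.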
KAWASAKI, Davi // davishinjik [at] gmail.com
FLAUSINO, Matheus // matheus.negocio [at] gmail.com
Feel free to contact me or open a pull request with any relevant updates:
KAWASAKI, Davi // davishinjik [at] gmail.com