This is the project repo for the Coursera course Getting and Cleaning Data.
The project has the following files and folders:
- README.md - this file
- codebook.md - code book describing the tidy data set
- run_analysis.R - R script to process the data and transform it to a tidy data set
- UCI HAR Dataset/ - data downloaded from https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip. See the README.txt file in the folder for a description of this raw data set.

The run_analysis.R script performs the following steps:
- Read the X_train file
- Read the subject_train file
- Read the y_train file
- Read the features file
- Replace the column names of X_train with the feature names
- Combine subject_train, y_train and X_train
- Repeat the previous steps for X_test, subject_test and y_test
- Merge the combined train and test data
- Extract only the mean and standard deviation for each measurement
- Create a narrow tidy data set.
- Create a wide tidy data set with the average of each variable for each activity and each subject
- Write the tidy data set to file
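
The steps above can be sketched roughly as follows. This is a minimal base R illustration, not the contents of run_analysis.R: the function names `read_split` and `tidy_means` are hypothetical, the file layout is assumed to match the unzipped UCI HAR Dataset folder, and the sketch skips the intermediate narrow data set and goes straight to the averaged wide one.

```r
# Hypothetical helper: read one split ("train" or "test") and attach the
# subject and activity columns to the measurements.
read_split <- function(data_dir, split, feature_names) {
  path <- function(prefix) {
    file.path(data_dir, split, paste0(prefix, "_", split, ".txt"))
  }
  # check.names = FALSE keeps the feature names (e.g. "tBodyAcc-mean()-X") as-is
  x <- read.table(path("X"), col.names = feature_names, check.names = FALSE)
  subject <- read.table(path("subject"), col.names = "subject")
  activity <- read.table(path("y"), col.names = "activity")
  cbind(subject, activity, x)
}

# Hypothetical driver: merge both splits, keep the mean()/std() columns and
# average each of them per subject and activity.
tidy_means <- function(data_dir) {
  features <- read.table(file.path(data_dir, "features.txt"),
                         col.names = c("index", "name"),
                         stringsAsFactors = FALSE)
  merged <- rbind(read_split(data_dir, "train", features$name),
                  read_split(data_dir, "test", features$name))
  # Assumption: "mean and standard deviation" means names containing
  # mean() or std()
  keep <- grepl("mean\\(\\)|std\\(\\)", names(merged))
  aggregate(merged[, keep, drop = FALSE],
            by = list(subject = merged$subject, activity = merged$activity),
            FUN = mean)
}

# Example usage (the output file name is an assumption):
# tidy <- tidy_means("UCI HAR Dataset")
# write.table(tidy, "tidy_data.txt", row.names = FALSE)
```

The result has one row per subject and activity pair, with one column per retained measurement. See codebook.md for the variables the actual script produces.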