Goal: To clarify the intended preprocessing algorithm for refactoring (starting with DLC outputs and ending right before the creation of the training set). Feedback on the below approach from Luiz and others is encouraged.
Create a new main function, preprocess_posedata(), in a new script, preprocessing.py.
This function should have four main steps, with a final fifth step saving the cleaned data to file and optionally outputting a figure visualizing the before-and-after for the user to QC. Currently, some of these functions exist in align_egocentrical.py and create_trainset.py.
lowconf_cleaning(): Load the DeepLabCut CSV, which contains 3 columns per DLC-tracked body part (X pixel coordinate, Y pixel coordinate, DLC confidence/likelihood value). For each body part, nan out any frames whose confidence is below confidence_threshold, for that body part only. Linearly interpolate over the nan-ed frames.
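A minimal sketch of the low-confidence step for a single body part, assuming the X, Y, and confidence columns have already been read into 1-D arrays. The default threshold of 0.9, the helper name interpolate_nans(), and the choice to hold the nearest valid value at the start and end of the recording are assumptions for illustration, not existing VAME behavior.

```python
import numpy as np


def interpolate_nans(values):
    """Linearly interpolate over NaNs; edge NaNs take the nearest valid value."""
    values = np.asarray(values, dtype=float).copy()
    nans = np.isnan(values)
    if nans.all() or not nans.any():
        return values  # nothing to fill, or nothing to fill from
    idx = np.arange(values.size)
    values[nans] = np.interp(idx[nans], idx[~nans], values[~nans])
    return values


def lowconf_cleaning(x, y, likelihood, confidence_threshold=0.9):
    """Nan out low-confidence frames of ONE body part, then interpolate.

    Returns the cleaned x and y arrays and the percentage of frames replaced.
    """
    x = np.asarray(x, dtype=float).copy()
    y = np.asarray(y, dtype=float).copy()
    low = np.asarray(likelihood, dtype=float) < confidence_threshold
    x[low] = np.nan
    y[low] = np.nan
    return interpolate_nans(x), interpolate_nans(y), 100.0 * low.mean()
```

Returning the percentage of replaced frames alongside the data keeps the information needed for a per-session QC report.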
egocentrically_align_and_center(): Using the two reference key points (usually belly and tailbase), find and apply the rotation matrix required to align all body parts egocentrically. Shift the points so that the belly (or whichever point is the centered reference point) is at (0, 0) and the orientation reference point always lies along the -y axis, with its x always zero. <Suggest adding the belly centering to crop_and_flip() and renaming it to crop_flip_center.> Save PE-seq.npy, which holds the ego-aligned and belly-centered poses BEFORE any IQR cleaning. Consider renaming PE-seq to egoaligned_pose_estimation.npy.
-- Are the units of PE-seq.npy pixels? I think so yes.
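The centering and rotation described in this step can be sketched as follows. This is an illustration under stated assumptions, not the existing align_egocentrical.py code: poses are assumed to be an array of shape (n_frames, n_bodyparts, 2) in pixels, the reference points are passed as column indices, and "-y" means negative y in array coordinates.

```python
import numpy as np


def egocentrically_align_and_center(poses, centered_idx, orientation_idx):
    """Center on one body part and rotate so another lies on the -y axis.

    poses: array of shape (n_frames, n_bodyparts, 2), in pixels (assumed layout).
    """
    poses = np.asarray(poses, dtype=float)
    # Shift every frame so the centered reference point sits at (0, 0).
    centered = poses - poses[:, centered_idx:centered_idx + 1, :]
    ref = centered[:, orientation_idx, :]
    # Current angle of the orientation reference point in each frame.
    theta = np.arctan2(ref[:, 1], ref[:, 0])
    # Rotate by phi so the reference point ends at angle -pi/2 (the -y axis).
    phi = -np.pi / 2 - theta
    cos, sin = np.cos(phi), np.sin(phi)
    rot = np.stack(
        [np.stack([cos, -sin], axis=-1), np.stack([sin, cos], axis=-1)],
        axis=-2,
    )  # shape (n_frames, 2, 2), one rotation matrix per frame
    # Apply each frame's rotation matrix to all of that frame's body parts.
    return np.einsum("fij,fbj->fbi", rot, centered)
```

Because only a translation and a rotation are applied, the output stays in the same units as the input.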
outlier_cleaning(): For each body part of a given session, z-score the aligned coordinates and use IQR_val to identify outliers; nan them out and interpolate over the nans. Repeat for all body parts. Then recompute the z-score so that it is no longer biased by the removed outliers.
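One possible reading of the outlier step, sketched for a single coordinate series. The exact rule is an assumption here: a sample is flagged when its absolute z-score exceeds iqr_factor times the interquartile range of the z-scored series. The parameter name iqr_factor, its default of 4.0, and the helper interpolate_nans() are hypothetical.

```python
import numpy as np


def interpolate_nans(values):
    """Linearly interpolate over NaNs; edge NaNs take the nearest valid value."""
    values = np.asarray(values, dtype=float).copy()
    nans = np.isnan(values)
    if nans.all() or not nans.any():
        return values
    idx = np.arange(values.size)
    values[nans] = np.interp(idx[nans], idx[~nans], values[~nans])
    return values


def outlier_cleaning(series, iqr_factor=4.0):
    """Flag IQR-based outliers in one coordinate series and interpolate them.

    Returns the cleaned series, its recomputed z-score, and the percentage
    of samples that were replaced.
    """
    series = np.asarray(series, dtype=float)
    std = np.nanstd(series)
    if std == 0:
        # Constant series: nothing can be an outlier.
        return series.copy(), np.zeros_like(series), 0.0
    z = (series - np.nanmean(series)) / std
    q1, q3 = np.nanpercentile(z, [25, 75])
    # ASSUMED rule: |z| beyond iqr_factor * IQR of the z-scored data.
    outliers = np.abs(z) > iqr_factor * (q3 - q1)
    cleaned = series.copy()
    cleaned[outliers] = np.nan
    cleaned = interpolate_nans(cleaned)
    # Second z-score, computed without the influence of the removed outliers.
    clean_std = cleaned.std()
    z_clean = (cleaned - cleaned.mean()) / clean_std if clean_std > 0 else np.zeros_like(cleaned)
    return cleaned, z_clean, 100.0 * outliers.mean()
```

Calling this once per coordinate of each body part and collecting the returned percentages would give the per-session outlier statistics.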
savgol_filtering(): If Savitzky-Golay filtering is desired, apply it at this point.
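A sketch of the optional smoothing step using scipy.signal.savgol_filter along the time axis. The enabled flag and the default window_length and polyorder values are assumptions for illustration; in practice they would presumably come from the project config.

```python
import numpy as np
from scipy.signal import savgol_filter


def savgol_filtering(poses, window_length=9, polyorder=2, enabled=True):
    """Optionally smooth each coordinate over time (axis 0).

    poses: array whose first axis is time (frames). window_length must not
    exceed the number of frames and must be greater than polyorder.
    """
    poses = np.asarray(poses, dtype=float)
    if not enabled:
        return poses  # filtering not requested: pass data through unchanged
    return savgol_filter(poses, window_length=window_length,
                         polyorder=polyorder, axis=0)
```

Running the filter after outlier cleaning means single-frame tracking jumps are removed first rather than being smeared into neighboring frames by the smoothing window.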
save_and_visualize_cleaned_egoaligned_poses(): Save PE-seq_clean.npy. Consider renaming PE-seq_clean to cleaned_egoaligned_pose_estimation.npy. Plot a visualization showing the raw DLC time series, followed by the results of steps 1, 2, 3, and 4, for all body parts over a sample 1-minute window of one session.
Anything described above that currently exists in create_trainset(), such as the IQR cleaning and Savitzky-Golay filtering, should be removed from create_trainset(), which should only create the train and test data sets, not transform the data.
katiekly added the refactor label (Improve internal software structure without changing observable outcomes) on Dec 7, 2024
Users should specify the reference body parts by name (string), not by index number. The two reference points should be passed as arguments or entered into the config as separate parameters for clarity, named "centered_reference_point" and "orientation_reference_point".
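A small sketch of how name-based reference points could be resolved to column indices. The function name resolve_reference_indices() and the bodyparts list are hypothetical; only the two parameter names come from the proposal above.

```python
def resolve_reference_indices(bodyparts, centered_reference_point,
                              orientation_reference_point):
    """Map the two reference body part names to their indices.

    bodyparts: list of body part names in the order they appear in the data.
    """
    names = (centered_reference_point, orientation_reference_point)
    missing = [name for name in names if name not in bodyparts]
    if missing:
        raise ValueError(
            f"Unknown reference body part(s) {missing}; available: {bodyparts}"
        )
    if centered_reference_point == orientation_reference_point:
        raise ValueError("The two reference points must be different body parts.")
    return (bodyparts.index(centered_reference_point),
            bodyparts.index(orientation_reference_point))
```

Failing early with the list of available names makes a typo in the config easier to spot than a silent mismatch in index numbers.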
Rather than creating a visual as described above, let's instead save a table that reports, for each session (rows), the percentage of frames removed and interpolated over due to low confidence (column 1) and the percentage of frames removed and interpolated over due to IQR outlier detection (column 2).
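The table could be produced along these lines. This is a sketch under assumptions: each session supplies two boolean masks of shape (n_frames, n_bodyparts), a frame counts as "removed" if any body part was flagged in it, and the column names and CSV output format are hypothetical.

```python
import csv
import io

import numpy as np

FIELDS = ["session", "pct_lowconf_interpolated", "pct_iqr_outlier_interpolated"]


def cleaning_report_rows(masks_by_session):
    """Build one report row per session.

    masks_by_session: dict mapping session name to a pair
    (lowconf_mask, outlier_mask) of boolean arrays, shape (n_frames, n_bodyparts).
    """
    rows = []
    for session, (lowconf_mask, outlier_mask) in masks_by_session.items():
        # ASSUMPTION: a frame is counted if ANY body part was flagged in it.
        low = np.asarray(lowconf_mask, dtype=bool).any(axis=1)
        out = np.asarray(outlier_mask, dtype=bool).any(axis=1)
        rows.append({
            "session": session,
            "pct_lowconf_interpolated": round(float(100.0 * low.mean()), 2),
            "pct_iqr_outlier_interpolated": round(float(100.0 * out.mean()), 2),
        })
    return rows


def write_cleaning_report(rows, file_obj):
    """Write the report rows as CSV to an open text file object."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Usage example with tiny made-up masks (4 frames, 2 body parts).
example_masks = {
    "session_01": (
        np.array([[True, False], [False, False], [False, False], [False, False]]),
        np.array([[False, False], [True, True], [False, True], [False, False]]),
    )
}
example_rows = cleaning_report_rows(example_masks)
example_buffer = io.StringIO()
write_cleaning_report(example_rows, example_buffer)
```

If percentages per body part are preferred over per frame, the any(axis=1) reduction can be dropped and one column emitted per body part instead.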