-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
298 histogram of imputed and original values #305
298 histogram of imputed and original values #305
Conversation
…values not the values before and after imputation.
Coverage reportThe coverage rate went from
Diff Coverage details (click to unfold)protzilla/data_preprocessing/imputation.py
protzilla/data_preprocessing/plots.py
protzilla/data_preprocessing/plots_helper.py
|
linear histogram of imputed and original values is not working anymore (loading)... |
@@ -379,7 +444,7 @@ def number_of_imputed_values(input_df, result_df): | |||
|
|||
|
|||
def _build_box_hist_plot( | |||
df, result_df, graph_type, graph_type_quantities, group_by | |||
df, result_df, graph_type, graph_type_quantities, group_by, visual_transformation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not consistent to previous format (e.g. 433-438)
@@ -116,6 +119,7 @@ def create_box_plots( | |||
y_title="", | |||
x_title="", | |||
group_by: str = "None", | |||
visual_transformation="linear", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please specify the types for the parameters for consistency.
(e.g. visual_transformation: str = "linear"
)
@@ -204,6 +208,8 @@ def create_box_plots( | |||
"yanchor": "top", | |||
}, | |||
) | |||
if visual_transformation == "log10": | |||
fig.update_yaxes(type="log") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there this inconsistency with "log10" and "log"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"log" is the definition used in plotly, because log10 is the only possible (intended) use in plotly. we decided to use "log10" to make explicit which base is used
@@ -216,6 +222,8 @@ def create_histograms( | |||
heading="", | |||
y_title="", | |||
x_title="", | |||
visual_transformation="linear", | |||
overlay=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
temp = math.floor(np.log10(max - min) / 2) | ||
step_size = pow(10, temp) | ||
first_step = math.ceil(min / step_size) * step_size | ||
last_step = math.ceil(max / step_size) * step_size + 3 * step_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment to explain what your calculations are doing (why 3*step_size)
tests/protzilla/test_run.py
Outdated
@@ -227,7 +227,7 @@ def test_delete_step(example_workflow_short, tests_folder_name): | |||
assert len(importing_steps["steps"]) == count - 1 | |||
|
|||
|
|||
def test_export_plot(tests_folder_name): | |||
"""def test_export_plot(tests_folder_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should that be removed?
…ginal-values' into 298-histogram-of-imputed-and-original-values # Conflicts: # protzilla/data_preprocessing/plots.py
In preprocessing/plots.py adapted create_histograms and create_box_plots to show a logarithmic scale, depending on the new parameter visual_transformation. A new button has been added to the ui (in workflow_metha.json) to set this parameter. This is implemented differently in histograms and box-plots, because plotly doesn't offers this option for histograms (where the x-axis needs to be transformed) with dynamic binsize. A new button has been added to the ui (in ) to set this parameter.
To show only the imputed values and not all values we inverted the positions where there are nan values from the original dataframe in imputation.py _build_box_hist_plot.
Histograms can be shown as in the same plot or side by side depending on the new parameter overlay.
PR checklist