Milestone 1 Review #36

mohamad-amin · 2021-12-01T03:26:52Z

Good job! Here is my feedback for the milestone 1 assessment.

Project proposal: reasoning
You might need to pay more attention to these parts:

"Clearly state the research question and any natural sub-questions you need to address, and their type." In your proposal, have you analyzed the different situations that might arise when working with textual data? Why do you use logistic regression if you are facing a classification problem? If you are doing classification rather than regression, why do you report an AUC score? Moreover, details like the AUC score are not very accessible to a less technical reader.
What about data visualization? What specifically are you going to do?
For these algorithms, which packages will you use? Have you thought of using wrapper algorithms (such as the Boruta algorithm) for feature selection?

Exploratory data analysis in a literate code document: VIZ
Have you looked at the HTML report file that you provided? It does not open on GitHub. First, your report should open on GitHub so that everyone can see it. Second, you don't need to convert it to HTML. That's why it breaks. Please do not convert your notebooks to HTML files again.
Exploratory data analysis in a literate code document: QUALITY

It's nice that you have used the pandas profiling tool, but where is the motivation for the things you have done? How do you want to handle the missing values? What did you infer from your analysis? Just plotting the results without drawing any conclusions seems a bit pointless.

PANDASANG1231 · 2021-12-02T01:53:20Z

@mohamad-amin Hey, thank you for the feedback. It is really helpful.

I think your points are very clear. Just one question: I am not sure I understand this sentence correctly.
"Why do you use logistic regression if you are facing a classification problem?" Although logistic regression has a name ending with the word 'regression', it is actually not a regression method. It applies a sigmoid (logistic) function to the output of a linear model, which turns it into a binary classification algorithm. So do you mean we should try other classification algorithms besides LR, or do you think LR is not a good algorithm for classification? Thanks
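To make the point above concrete, here is a minimal sketch in plain Python of how logistic regression produces class predictions and why an AUC score applies to them. The toy data, the weights, and the helper names (`predict_proba`, `auc`) are assumptions made up for illustration; they are not taken from the project, and the parameters are not fitted.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Logistic regression: sigmoid of a linear combination of the features."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one feature per sample, label 1 = "high rating".
X = [[0.5], [1.5], [2.0], [3.0]]
y = [0, 0, 1, 1]
weights, bias = [2.0], -3.5  # made-up parameters, not fitted

# Probabilities come from the sigmoid; thresholding at 0.5 gives the class.
probs = [predict_proba(x, weights, bias) for x in X]
preds = [1 if p >= 0.5 else 0 for p in probs]

print(preds)
print(auc(y, probs))
```

The AUC here depends only on how the predicted probabilities rank positives against negatives, which is why it is a classification metric rather than a regression one.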

mohamad-amin · 2021-12-02T05:09:24Z

Hey, sorry, isn't your problem inherently a regression problem (the ramen ratings)?
I assumed it would be a numerical rating. Am I wrong?

PANDASANG1231 · 2021-12-02T22:52:17Z

Yeah, in the end we changed it into a binary classification problem. Maybe we can state that more clearly in the summary.

datallurgy · 2021-12-08T02:19:29Z

Hi @mohamad-amin!

Re: Comment 5: Pandas-profiling does not render in the ipynb file and only exports to HTML and JSON (see the pandas-profiling to_file documentation). I understand it's not ideal, as the HTML does not render on GitHub because it's interactive, but the file is easily downloadable and you can open it in a browser. It doesn't print nicely to PDF either; we considered uploading a PDF of the EDA as well.

What would be your recommendation for rendering pandas-profiling reports?
