Submission: GROUP 17: Giant_Pumpkins_Weight_Prediction #15
Comments
(Work in progress) Data analysis review checklist
Reviewer: @stevenleung2018
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: About 1.5 hours.
Review Comments:
There are, however, a few issues I have spotted:
(base) stevenprivate@StevenMac ~/mds/522/Giant_Pumpkins_Weight_Prediction (main)
$ conda env create -f environment.yaml
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- pyqt5-sip==4.19.18=py39h415ef7b_8
- graphite2==1.3.13=1000
- openjpeg==2.4.0=hb211442_1
- libffi==3.4.2=h8ffe710_5
- selenium==3.141.0=py39hb82d6ee_1003
- intel-openmp==2021.4.0=h57928b3_3556
- pandas==1.3.4=py39h2e25243_1
- win_inet_pton==1.1.0=py39hcbf5309_3
- ucrt==10.0.20348.0=h57928b3_0
- libxgboost==1.3.0=h0e60522_3
- xorg-libx11==1.7.2=hcd874cb_0
- sqlite==3.36.0=h8ffe710_2
- setuptools==59.2.0=py39hcbf5309_0
- zeromq==4.3.4=h0e60522_1
- nodejs==14.17.4=h57928b3_0
- preshed==3.0.6=py39h415ef7b_1
- libsodium==1.0.18=h8d14728_1
- xorg-libice==1.0.10=hcd874cb_0
- murmurhash==1.0.6=py39h415ef7b_2
- cairo==1.16.0=hb19e0ff_1008
- cffi==1.15.0=py39h0878f49_0
- jpeg==9d=h8ffe710_0
- xorg-libxau==1.0.9=hcd874cb_0
- libpng==1.6.37=h1d00b33_2
- libwebp==1.2.1=h57928b3_0
- pyqt==5.12.3=py39hcbf5309_8
- statsmodels==0.13.1=py39h5d4886f_0
- scikit-learn==1.0.1=py39he931e04_2
- zlib==1.2.11=h8ffe710_1013
- gettext==0.19.8.1=ha2e2712_1008
- py-xgboost==1.3.0=py39hcbf5309_3
- pyqtwebengine==5.12.1=py39h415ef7b_8
- tornado==6.1=py39hb82d6ee_2
- psutil==5.8.0=py39hb82d6ee_2
- libclang==11.1.0=default_h5c34c98_1
- libbrotlicommon==1.0.9=h8ffe710_6
- xorg-libxt==1.2.1=hcd874cb_2
- certifi==2021.10.8=py39hcbf5309_1
- m2w64-gcc-libs-core==5.3.0=7
- graphviz==2.49.3=hefbd956_0
- libbrotlienc==1.0.9=h8ffe710_6
- catalogue==2.0.6=py39hcbf5309_0
- libxcb==1.13=hcd874cb_1004
- harfbuzz==3.1.1=hc601d6f_0
- pyzmq==22.3.0=py39he46f08e_1
- zstd==1.5.0=h6255e5f_0
- regex==2021.11.10=py39hb82d6ee_0
- xorg-libxpm==3.5.13=hcd874cb_0
- vega-cli==5.17.0=h0e60522_4
- srsly==2.4.2=py39h415ef7b_0
- pcre==8.45=h0e60522_0
- pyrsistent==0.18.0=py39hb82d6ee_0
- scipy==1.7.2=py39hc0c34ad_0
- ipykernel==6.5.0=py39h832f523_1
- xorg-libsm==1.2.3=hcd874cb_1000
- cython-blis==0.7.5=py39h5d4886f_1
- pthread-stubs==0.4=hcd874cb_1001
- jbig==2.1=h8d14728_2003
- lcms2==2.12=h2a16943_0
- vc==14.2=hb210afc_5
- matplotlib==3.5.0=py39hcbf5309_0
- libwebp-base==1.2.1=h8ffe710_0
- fonttools==4.28.1=py39hb82d6ee_0
- debugpy==1.5.1=py39h415ef7b_0
- importlib-metadata==4.8.2=py39hcbf5309_0
- libbrotlidec==1.0.9=h8ffe710_6
- chardet==4.0.0=py39hcbf5309_2
- python==3.9.7=h7840368_3_cpython
- libxml2==2.9.12=hf5bbc77_1
- liblapack==3.9.0=12_win64_mkl
- libblas==3.9.0=12_win64_mkl
- llvmlite==0.36.0=py39ha0cd8c8_0
- numpy==1.21.4=py39h6635163_0
- libzlib==1.2.11=h8ffe710_1013
- m2w64-gmp==6.1.0=2
- pysocks==1.7.1=py39hcbf5309_4
- tk==8.6.11=h8ffe710_1
- libglib==2.70.1=h3be07f2_0
- lerc==3.0=h0e60522_0
- brotli==1.0.9=h8ffe710_6
- msys2-conda-epoch==20160418=1
- cryptography==36.0.0=py39h7bc7c5c_0
- pywin32==302=py39hb82d6ee_2
- libdeflate==1.8=h8ffe710_0
- numba==0.53.0=py39h69f9ab1_0
- vega-lite-cli==4.17.0=h57928b3_2
- fribidi==1.0.10=h8d14728_0
- catboost==1.0.3=py39hcbf5309_1
- click==8.0.3=py39hcbf5309_1
- m2w64-gcc-libs==5.3.0=7
- xz==5.2.5=h62dcd97_1
- jupyter_core==4.9.1=py39hcbf5309_1
- matplotlib-base==3.5.0=py39h581301d_0
- gts==0.7.6=h7c369d9_2
- jedi==0.18.1=py39hcbf5309_0
- ca-certificates==2021.10.8=h5b45459_0
- brotli-bin==1.0.9=h8ffe710_6
- libgd==2.3.3=h8bb91b0_0
- markupsafe==2.0.1=py39hb82d6ee_1
- lightgbm==3.3.1=py39h415ef7b_1
- libtiff==4.3.0=hd413186_2
- xorg-libxdmcp==1.1.3=hcd874cb_0
- kiwisolver==1.3.2=py39h2e07f2f_1
- m2w64-gcc-libgfortran==5.3.0=6
- qt==5.12.9=h5909a2a_4
- m2w64-libwinpthread-git==5.0.0.4634.697f757=2
- xorg-xextproto==7.3.0=hcd874cb_1002
- pyqt-impl==5.12.3=py39h415ef7b_8
- fontconfig==2.13.1=h1989441_1005
- pango==1.48.10=h33e4779_2
- vs2015_runtime==14.29.30037=h902a5da_5
- spacy==3.2.0=py39hefe7e4c_0
- freetype==2.10.4=h546665d_1
- lz4-c==1.9.3=h8ffe710_1
- xorg-kbproto==1.0.7=hcd874cb_1002
- mkl==2021.4.0=h0e2418a_729
- brotlipy==0.7.0=py39hb82d6ee_1003
- xorg-xproto==7.0.31=hcd874cb_1007
- thinc==8.0.12=py39hefe7e4c_0
- tbb==2021.4.0=h2d74725_1
- pyqtchart==5.12=py39h415ef7b_8
- cymem==2.0.6=py39h415ef7b_2
- libiconv==1.16=he774522_0
- pandoc==2.16.2=h8ffe710_0
- icu==68.2=h0e60522_0
- ipython==7.29.0=py39h832f523_2
- expat==2.4.1=h39d44d4_0
- pixman==0.40.0=h8ffe710_0
- libcblas==3.9.0=12_win64_mkl
- pydantic==1.8.2=py39hb82d6ee_2
- pillow==8.4.0=py39h916092e_0
- shap==0.40.0=py39h2e25243_0
- xorg-libxext==1.3.4=hcd874cb_1
- getopt-win32==0.1=h8ffe710_0
Thus the subsequent The
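Many entries in the log above carry Windows-looking build strings (for example `12_win64_mkl`) or are Windows-only packages (`pywin32`, `vs2015_runtime`, `m2w64-*`), which suggests the environment file was exported on Windows with build strings pinned, so it cannot be solved on macOS. The following is a minimal sketch of one way to make such pins more portable, assuming that reading of the log is right. The helper names `strip_build`, `is_windows_only` and `portable_specs` are hypothetical, and the Windows-only list is an illustrative subset taken from the log, not an exhaustive one.

```python
# Illustrative subset of Windows-only packages seen in the log (assumption:
# not exhaustive; other entries may also be platform-specific).
WINDOWS_ONLY = {
    "pywin32", "ucrt", "vc", "vs2015_runtime",
    "win_inet_pton", "getopt-win32", "msys2-conda-epoch",
}
WINDOWS_ONLY_PREFIXES = ("m2w64-",)


def strip_build(spec):
    """Drop the build string: 'pandas==1.3.4=py39h2e25243_1' -> 'pandas=1.3.4'.

    Accepts both the 'name==version=build' form printed in the error message
    and the 'name=version=build' form used in an environment file.
    """
    parts = [p for p in spec.strip().split("=") if p]
    if len(parts) < 2:
        return parts[0] if parts else ""
    name, version = parts[0], parts[1]
    return f"{name}={version}"


def is_windows_only(spec):
    """Return True if the package name is in the illustrative Windows-only list."""
    name = spec.strip().split("=")[0]
    return name in WINDOWS_ONLY or name.startswith(WINDOWS_ONLY_PREFIXES)


def portable_specs(specs):
    """Remove Windows-only packages and build strings from a list of pins."""
    return [strip_build(s) for s in specs if not is_windows_only(s)]


# A few pins copied from the log above.
sample = [
    "pandas==1.3.4=py39h2e25243_1",
    "pywin32==302=py39hb82d6ee_2",
    "m2w64-gmp==6.1.0=2",
    "scikit-learn==1.0.1=py39he931e04_2",
]
cleaned = portable_specs(sample)
print(cleaned)
```

In practice it is usually simpler to regenerate the file on the original machine with `conda env export --no-builds` or `conda env export --from-history`, both of which avoid platform-specific build strings. Version pins may still need loosening if a given version is not published for the target platform.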
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Data analysis review checklist
Reviewer: @RamiroMejia
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1.5
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Data analysis review checklist
Reviewer: @riddhisansare
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1.7 hours.
Review Comments:
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Thanks, Ramiro, for the input; we truly appreciate your detailed observations. The following are our immediate plans to implement your feedback and suggestions:
Data analysis review checklist
Reviewer: @ruben1dlg
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1.5
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Thanks, everyone, for your valuable feedback, which helps us improve our project. We have made the following changes in response to your comments:
Submitting authors: @mahsasarafrazi, @shivajena, @Rowansiv, @imtvwy
Repository: https://github.com/UBC-MDS/Giant_Pumpkins_Weight_Prediction
Report link: https://github.com/UBC-MDS/Giant_Pumpkins_Weight_Prediction/blob/main/doc/pumpkin.html
Abstract/executive summary:
This project builds a regression-based machine learning model to estimate the weight of giant pumpkins from features such as year of cultivation, place, and over-the-top (OTT) size, in order to predict next year’s winner of the GP competition. Several regression models (Linear, Ridge and Random Forest) were trained and cross-validated on the training data. For the Ridge model, the hyperparameter (α) was optimised to return the best cross-validation score. This model performed fairly well on the test data, which led us to finalise it as our prediction model. The best cross-validation score is 0.6666134 and the mean test score is 0.6619808. The Random Forest model had similar cross-validation and test scores but was not chosen for this report because of its high fit times. Therefore, for the purpose of reproducibility, we have decided to use the Ridge model as our prediction model. Other models may also be tried on the data for better performance and precision.
The data used for this project comes from BigPumpkins.com.
The dataset is a public domain resource describing the attributes of giant pumpkins grown in different regions of around 20 countries across the world. The raw data used in this analysis can be found here: https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-19/pumpkins.csv
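The Ridge tuning step described in the abstract (choose α by cross-validation, then score once on held-out test data) can be sketched without any dependencies as follows. This is not the authors' pipeline. It assumes a single numeric feature standing in for OTT size, synthetic data generated in the script, R² as the score (the abstract does not say which metric produced its scores), and a hypothetical grid of α values. In a real project, scikit-learn's `Ridge` with `GridSearchCV` would be the more usual tool.

```python
import random
from statistics import mean


def fit_ridge_1d(xs, ys, alpha):
    """Fit y ~ intercept + slope * x with an L2 penalty on the slope only."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope towards zero
    return y_bar - slope * x_bar, slope


def r2_score(ys, preds):
    """Coefficient of determination of predictions against true values."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def cv_score(xs, ys, alpha, k=5):
    """Mean validation R^2 over k folds (fold = index modulo k)."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        valid = [i for i in range(len(xs)) if i % k == fold]
        intercept, slope = fit_ridge_1d(
            [xs[i] for i in train], [ys[i] for i in train], alpha
        )
        preds = [intercept + slope * xs[i] for i in valid]
        scores.append(r2_score([ys[i] for i in valid], preds))
    return mean(scores)


# Synthetic stand-in data (assumption): weight grows linearly with size, plus noise.
rng = random.Random(0)
xs = [rng.uniform(200, 500) for _ in range(200)]
ys = [4.0 * x - 600 + rng.gauss(0, 150) for x in xs]

# Hold out the last 20% as a test set that the tuning step never sees.
x_train, y_train = xs[:160], ys[:160]
x_test, y_test = xs[160:], ys[160:]

# Hypothetical alpha grid; pick the value with the best cross-validation score.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
cv_scores = {a: cv_score(x_train, y_train, a) for a in alphas}
best_alpha = max(cv_scores, key=cv_scores.get)

# Refit on all training data with the chosen alpha, then score once on the test set.
intercept, slope = fit_ridge_1d(x_train, y_train, best_alpha)
test_r2 = r2_score(y_test, [intercept + slope * x for x in x_test])
print(f"best alpha: {best_alpha}, CV R^2: {cv_scores[best_alpha]:.3f}, test R^2: {test_r2:.3f}")
```

Keeping the test set out of the α search is what makes a test score comparable to a cross-validation score in the way the abstract reports them.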
Editor: @mahsasarafrazi, @shivajena, @Rowansiv, @imtvwy
Reviewer: @RamiroMejia, @riddhisansare, @stevenleung2018, @ruben1dlg