Driven by curiosity, eager self-learner, open-source contributor, PyData volunteer and speaker, currently searching for a Data Scientist role where I can fully employ my analytical, programming, and communication skills.
Passionate about developing and deploying machine learning models. Proficient in using Python for data analysis and machine learning tasks. Skilled in using data visualization tools to communicate insights effectively. Familiar with cloud platforms such as Azure and Google Cloud. I want to drive decision-making and solve problems using data.
-
I'm currently working on:
PyData Amsterdam 2024 Talk: I will be talking about my experience of stepping into the rabbit hole of contributing to open-source software, highlighting key learnings and practical steps for beginners. The talk covers overcoming self-doubt, learning through collaboration, and the unexpected joys of community engagement: what you can learn from contributing to open source as an aspiring Data Scientist, and what you probably will not.
PyData Amsterdam 2024 Open Source Sprint: Narwhals. Narwhals is an extremely lightweight and extensible compatibility layer between dataframe libraries, and it needs your help! An open source sprint is the perfect opportunity to make your first contribution to open source. The core maintainers of the Narwhals package will prepare a list of easy and accessible first issues to get started with, and will be present in this session to guide you through your first commit to the package. It is a great way to give back to the Python ecosystem while having some fun.
Contributing to the Dask/PyArrow backend in Narwhals: At Narwhals, we're committed to helping you build dataframe-agnostic tools. Whether your users prefer pandas, Polars dataframes, or even PyArrow tables, Narwhals has you covered. There's still plenty of work to do, so if you'd like to contribute and enhance Narwhals, feel free to check out our Contributing Guide and join us on Discord.
Did ChatGPT replace Juniors? Inspired by personal curiosity and a 2023 Hackathon challenge (won in the "Most Polished" category), this project investigates the impact of large language models like ChatGPT on entry-level roles in tech. Demonstrated skills include data cleaning, data wrangling, data analysis, and modeling, using tools such as Python, APIs, Polars, and hvPlot.
-
I'm currently learning Polars, and that's Ritchie Vink, creator of Polars, with my graffiti:
-
All of my projects are available at https://github.com/anopsy
-
If you'd like to hire me, check my CV
-
I write about my learning journey on https://medium.com/@anopsy28
-
How to reach me: [email protected]
-
Fun fact: I paint graffiti portraits
Selected Portfolio Projects
┣━━ contributing to OSS at:
┃   ┣━━ scikit-lego
┃   ┃   ┣━━ contributed to docs
┃   ┃   ┗━━ made ColumnSelector dataframe agnostic using Narwhals
┃   ┣━━ narwhals
┃   ┃   ┣━━ worked on pyarrow/dask backend implementation
┃   ┃   ┗━━ contributed to docs and tests
┃   ┗━━ embetter
┃       ┣━━ deprecated a method
┃       ┗━━ added pre-commit hooks
┃
┣━━ Juniors_vs_ChatGPT
┃   - Did ChatGPT replace Juniors and Interns?
┃   ┣━━ data cleaning
┃   ┣━━ data wrangling
┃   ┣━━ data analysis
┃   ┣━━ modeling
┃   ┗━━ python/API/polars/hvplot
┃
┣━━ Compensation Prediction
┃   - How much do Engineers earn?
┃   ┣━━ data modeling
┃   ┣━━ model evaluation
┃   ┣━━ containerization using docker
┃   ┣━━ building streamlit app
┃   ┗━━ python/scikit-learn/streamlit/docker
┃
┣━━ MaskMap: Decoding the Hidden Spectrum
┃   - Prototype of a diagnosis support tool using the power of NLP to identify symptoms of Autistic Masking
┃   ┣━━ data scraping
┃   ┣━━ data cleaning
┃   ┣━━ modeling
┃   ┣━━ deploying
┃   ┗━━ python/pandas/FastAPI
┃
┣━━ Equity in Healthcare: Women in Data Science Datathon 2024
┃   - WIDS Datathon Project predicting a timely diagnosis of Metastatic Cancer
┃   ┣━━ data cleaning
┃   ┣━━ data wrangling
┃   ┣━━ data analysis
┃   ┣━━ modeling
┃   ┗━━ python/pandas/ensemble/keras
┃
┣━━ Relative Search Volumes Analysis
┃   - Search Volumes for Autism vs Autism Spectrum Disorder around the world
┃   ┣━━ data scraping
┃   ┣━━ data cleaning
┃   ┣━━ modeling WIP
┃   ┗━━ python/pandas
┃
┣━━ Steelplate Defect Visual EDA
┃   - Colorful joyplots for Visual EDA
┃   ┣━━ data visualization
┃   ┣━━ ensemble
┃   ┗━━ python/pandas/xgb/seaborn
┃
┣━━ hossenfelder - WIP
┃   - Data Analysis and Prediction of views on Sabine Hossenfelder YT channel
┃   ┣━━ data scraping
┃   ┣━━ data cleaning
┃   ┣━━ modeling WIP
┃   ┗━━ python/pandas
┃
┗━━ MyFalaClassifier - WIP
    - Detector of surfable waves
    ┣━━ live-stream scraping
    ┣━━ image processing
    ┣━━ transfer learning
    ┣━━ deploying
    ┗━━ python/keras