On this page, I demonstrate how I solve business challenges using Data Science concepts and tools.
Data Scientist/Analyst
I am a Mechanical Engineering student, and I had my first experience with data on the Technology team of a Public Health organization. There, I carried out Data Analysis projects that involved tracking and monitoring Dengue cases in the city, managing exam orders, and following up on Covid-19 cases. This experience helped me understand the key data analysis tools and how to use them to generate insights that support data-driven decision making.
I have also worked in the Logistics department of a company in the automotive industry, where I generated customer demand, inventory, and purchase order reports through data analysis. To this end, I developed efficient data extraction and transformation processes that made relevant logistics key performance indicators (KPIs) easily accessible through interactive dashboards. During this time, I also led several data flow automation projects, which helped optimize work performance and streamline the team's access to important information.
Currently, I am a member of a Data Science community where I run personal projects to gain experience in solving business problems with data analysis concepts and tools. In this community, I share knowledge with other members through study groups, monitoring, discussions, and competitions.
Skills:
- Programming Languages:
  - Python for Data Analysis
    - Linear Algebra and Data Manipulation: Pandas, NumPy
    - Data Visualization: Streamlit, Matplotlib, Seaborn, Plotly
    - Machine Learning: Scikit-Learn, SciPy (classification, regression, and clustering)
    - Web Scraping: Beautiful Soup, Selenium
    - API Building: Flask
  - SQL
- Data Visualization Tools:
  - Power BI
  - Tableau
  - Google Looker Studio
- Machine Learning Deployment:
  - AWS (S3, EC2, RDS)
  - Heroku
  - Render
- Other tools:
  - Microsoft Excel
An insurance company that provides Health Insurance to its customers needs to predict whether last year's policyholders will also be interested in the Car Insurance it offers. The company surveyed these customers, asking whether they were interested in car insurance. With the survey results and the customers' characteristics, the company can use Machine Learning techniques to maximize profit for this product.
In this context, I developed a Learning to Rank Machine Learning model that ranks customers by their propensity to buy auto insurance, so salespeople can target the customers who are most likely to buy it.
Compared to contacting a random selection of customers, the model proved to be about 3 times more efficient, generating an extra gain of 35 million dollars.
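The comparison with random selection can be sketched as a lift calculation: the share of interested customers among the top-k ranked by the model, divided by the share expected when contacting customers at random. This is only a minimal illustration, not the project's code. The helper name `lift_at_k`, the scores, and the labels are all hypothetical.

```python
def lift_at_k(scores, labels, k):
    """Positive rate in the top-k ranked customers divided by the overall positive rate."""
    # Rank customers from highest to lowest predicted propensity.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]
    # Share of interested customers among the k contacted by the model.
    precision_at_k = sum(label for _, label in top_k) / k
    # Share of interested customers expected when contacting k customers at random.
    base_rate = sum(labels) / len(labels)
    return precision_at_k / base_rate


# Hypothetical propensity scores and survey answers (1 = interested in car insurance).
scores = [0.91, 0.15, 0.78, 0.05, 0.66, 0.30, 0.10, 0.22, 0.85, 0.02]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]

lift = lift_at_k(scores, labels, k=3)
print(f"Lift at k=3: {lift:.2f}")
```

A lift of 1 means the ranking is no better than random. Contacting the whole base always gives a lift of exactly 1, which is why the metric is read at a fixed contact budget.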
An E-commerce company wants to implement a loyalty program for its most valuable customer group, called "Insiders", so the marketing team can provide benefits to this group and encourage other customers to join it. To identify these customers, the company provided a dataset containing all transactions that occurred between Nov-2016 and Dec-2017.
For this problem, an unsupervised Machine Learning model was developed to cluster customers based on their similarities. This made it possible to identify a subset of high-value customers who made up just 15.7% of the total customer base yet contributed 51.7% of the total revenue.
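Figures like the customer share and revenue share above can be computed once every customer has a cluster label. The sketch below assumes the clustering step has already assigned those labels. The helper name `cluster_shares`, the cluster names, and the amounts are hypothetical illustration data, not values from the project.

```python
from collections import defaultdict


def cluster_shares(customers):
    """Return {cluster: (share_of_customers, share_of_revenue)}."""
    counts = defaultdict(int)
    revenue = defaultdict(float)
    # Accumulate the number of customers and the revenue per cluster.
    for cluster, total_spent in customers:
        counts[cluster] += 1
        revenue[cluster] += total_spent
    total_customers = sum(counts.values())
    total_revenue = sum(revenue.values())
    return {
        cluster: (counts[cluster] / total_customers, revenue[cluster] / total_revenue)
        for cluster in counts
    }


# Hypothetical (cluster label, total amount spent) pairs, one per customer.
customers = [("insiders", 500.0), ("insiders", 300.0)] + [("regular", 100.0)] * 8

shares = cluster_shares(customers)
for cluster, (customer_share, revenue_share) in shares.items():
    print(f"{cluster}: {customer_share:.1%} of customers, {revenue_share:.1%} of revenue")
```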
The European pharmacy chain Rossmann plans to allocate a portion of its budget to renovating its stores. To calculate the amount that will be dedicated to this purpose, the company's CFO asked the data team to develop a sales revenue forecasting solution for the next 6 weeks.
A Machine Learning regression model was developed and achieved a MAPE (mean absolute percentage error) of 14%. The model predicts sales of $283.7M for the next 6 weeks.
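For reference, MAPE is the average of the absolute errors expressed as a fraction of the actual values. Below is a minimal sketch of the calculation. The helper name `mape` and the sales figures are hypothetical and unrelated to the Rossmann data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.14 means 14%)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Each term is |actual - predicted| / |actual|; it is undefined when actual is 0,
    # so rows with zero sales have to be removed before calling this function.
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Hypothetical actual and forecast sales values.
actual = [200.0, 400.0, 500.0]
predicted = [180.0, 440.0, 500.0]

error = mape(actual, predicted)
print(f"MAPE: {error:.1%}")
```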