My name is Tuân, and I am currently working toward becoming a Data Engineer. I have a master's degree in Artificial Intelligence, Systems, and Data from PSL University.
- Programming Languages
- Data Technologies: Apache Spark, Kafka, Cassandra, Lakehouse
- GCP Services: Pub/Sub, Dataflow, BigQuery, Cloud Storage, Dataplex
- Deployment and Automation: Terraform, Bash Script, Ansible, GitLab CI
- Monitoring: Grafana
- Dashboard: Looker Studio
- Docker, Linux
I am spending some time on the following projects:
- IoT automobile: My first data pipeline for connected IoT devices. In this project, I monitor data from automobiles and store it in a big-data database. I have two solutions: one on GCP and one open-source. You can check my source code (mostly Bash scripts and Terraform).
- MLOps Zoomcamp: A bootcamp from DataTalks.Club, in which I am learning how to deploy an end-to-end Machine Learning model. I will update my homework and project regularly.
- TIAD: An NLP contest that I participated in during 2021.
The books that I am reading:
- Fundamentals of Data Engineering (Book by Joe Reis and Matt Housley)
- Spark: The Definitive Guide: Big Data Processing Made Simple (Book by Bill Chambers and Matei Zaharia)
- Data Engineering with Apache Spark, Delta Lake, and Lakehouse: Create Scalable Pipelines that Ingest, Curate, and Aggregate Complex Data in a Timely and Secure Way (Book by Manoj Kukreja)