I'm a seasoned Software Developer and I'm interested in everything ML/AI
🌱 I'm currently learning with DataTalks Club Data Engineering Zoomcamp
- LLM-powered Question Answering Slack bot (Ongoing)
- End-to-end ML Project on Blood Vessel Segmentation
- Diabetes classification model training and deployment
- End-to-end MLOps Pipeline
- Scientific verification of climate change-related news articles (NLP-powered project for Omdena)
to accompany 3 Zoomcamps by DataTalksClub
DE and ML Zoomcamps branch
Connecting the dots
LlamaIndex
Vector DB
Milvus and
Zilliz (Cloud-Native Milvus)
Orchestration
Prefect
Embeddings
BAAI/bge-small-en-v1.5
Re-ranker
Cohere re-ranker
MLOps Zoomcamp branch
The course FAQ Google Document and the course repo are indexed into the Pinecone vector store.
Semantic search then retrieves the pieces most similar (and hopefully most relevant) to the question asked.
Finally, the retrieved pieces are passed as context to a conversational LLM, which forms the final answer.
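The retrieve-then-answer flow above can be sketched in a few lines. This is only a minimal illustration, not the bot's actual code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for the vector store, and `FAQ_CHUNKS`, `retrieve`, and `build_prompt` are hypothetical names invented for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the conversational LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


# Hypothetical FAQ chunks, made up for illustration only.
FAQ_CHUNKS = [
    "Homework deadlines are listed in the course repo README.",
    "You can join the course after the start date and still submit homework.",
    "Certificates require completing the final project.",
]

if __name__ == "__main__":
    question = "Can I join the course after it has started?"
    print(build_prompt(question, retrieve(question, FAQ_CHUNKS, top_k=1)))
```

In the real project the similarity search would run inside the vector store over dense embeddings, and the assembled prompt would be sent to the LLM rather than printed.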
The Star of the show
LangChain
Vector DB
Pinecone
Orchestration
Prefect
Semantic Search
Sentence Transformers
Fine-tune Ultralytics YOLOv8 segmentation model on 3D Hierarchical Phase-Contrast Tomography (HiP-CT) data from human kidneys to segment blood vessels.
EDA => Training => Hyperparameter tuning => Deployment as a service (FastAPI) => Containerization => Deployment to AWS EKS
- Trained several models
- Logistic regression
- Random Forest
- XGBoost
- Hyperparameter tuning with
- Deploy containerized model with FastAPI on AWS Elastic Beanstalk
A tool and an API for scientific verification of climate change-related news articles and global warming stance detection
NLP Framework
Haystack
Vector DB
FAISS
API
FastAPI
UI
Streamlit
Semantic Search
Sentence Transformers
Developed as part of Omdena's Detecting Bias in Climate Reporting in English and German Language News Media Local Chapter Challenge