diff --git a/README.md b/README.md
index c624a12..2e78353 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,20 @@
 
 This project provides a tool to rank and summarize CVs based on a given job description. The application is built using FastAPI for the backend, with endpoints for ranking and summarizing CVs. It also includes evaluation metrics such as MRR, NDCG, and BERTScore for performance measurement.
 
+## Table of Contents
+
+- [Features](#features)
+- [Prerequisites](#prerequisites)
+- [Installation](#installation)
+  - [Setup with Docker](#setup-with-docker)
+- [Running the Application](#running-the-application)
+- [Project Structure](#project-structure)
+- [Process Flow Diagram](#process-flow-diagram)
+- [Reason for Choice of Models](#reason-for-choice-of-models)
+- [Reason for Selecting These Evaluation Metrics](#reason-for-selecting-these-evaluation-metrics)
+- [Future Improvements](#future-improvements)
+- [Contributing and Feedback](#contributing-and-feedback)
+
 ## Features
 
 - **CV Ranking**: Rank CVs based on their relevance to a job description.
@@ -38,6 +52,27 @@ Before setting up the project, ensure you have the following installed:
    pip install -r requirements.txt
    ```
 
+### Setup with Docker
+
+If you prefer to use Docker, follow these steps:
+
+1. **Ensure Docker is Installed:**
+
+   Make sure Docker is installed on your system. You can download it from [Docker's official website](https://www.docker.com/get-started).
+
+2. **Build and Run the Services:**
+
+   ```bash
+   docker-compose up --build
+   ```
+
+   This command builds the Docker images for the backend and the frontend and starts both services.
+
+3. **Access the Application:**
+
+   - The FastAPI application will be available at `http://localhost:8000`.
+   - The Streamlit frontend will be accessible at `http://localhost:8501`.
+
 ## Running the Application
 
 1. **Start the FastAPI Application:**
@@ -196,10 +231,14 @@ flowchart TD
 
 
 
-
-
 ## Future Improvements
 
 - Implement additional evaluation metrics.
 - Improve the robustness of the text extraction and cleaning processes.
-- Implement a good vector database structure with milvus or weaviate.
+- Implement a proper vector database layer using Milvus or Weaviate.
+
+## Contributing and Feedback
+
+Your feedback and contributions are highly appreciated! If you have ideas on how to improve this project or encounter any issues, feel free to open an issue or submit a pull request on GitHub.
+
+You can also reach out to me directly for any discussions or suggestions.
diff --git a/backend/metrics.py b/backend/metrics.py
index 9efc713..0045187 100644
--- a/backend/metrics.py
+++ b/backend/metrics.py
@@ -19,6 +19,7 @@ def mean_reciprocal_rank(ranked_cvs: list, weighted_scores: list) -> float:
         if weighted_scores[idx] > 0:
             ranks.append(1 / (idx + 1))
     return sum(ranks) / len(ranked_cvs) if ranked_cvs else 0.0
+
 def evaluate_and_log_mrr(ranked_cvs: list, logger) -> None:
     """
     Evaluates the Mean Reciprocal Rank (MRR) score for a list of ranked CVs and logs the result.
@@ -51,7 +52,9 @@ def dcg(scores):
     """
     return sum(score / math.log2(idx + 2) for idx, score in enumerate(scores))
 
-def ndcg(ranked_cvs, logger):
+def ndcg(ranked_cvs, logger):
+    # Rethink this ideal DCG: it does not truly measure the correctness of the ranking model.
+    # Also think about a way to obtain a ground truth (see the sketch after this hunk).
     """
     Calculate the Normalized Discounted Cumulative Gain (NDCG) for a list of ranked CVs.
 
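The comment added to `ndcg` above flags that deriving the ideal DCG from the model's own weighted scores says little about whether the ranking is actually correct. As a minimal sketch of the ground-truth direction that comment points to, the snippet below computes NDCG against externally supplied relevance grades (for example, recruiter-assigned labels). The `ground_truth_relevance` mapping and the function name are hypothetical and are not part of the current `backend/metrics.py`.

```python
import math


def dcg(scores):
    # Discounted cumulative gain of relevance scores, taken in ranked order.
    return sum(score / math.log2(idx + 2) for idx, score in enumerate(scores))


def ndcg_against_ground_truth(ranked_cv_ids, ground_truth_relevance):
    # Relevance of each CV in the order the model ranked them; unlabelled CVs count as 0.
    gains = [ground_truth_relevance.get(cv_id, 0) for cv_id in ranked_cv_ids]
    # The ideal ordering comes from the labels themselves, not from the model's scores.
    ideal_gains = sorted(ground_truth_relevance.values(), reverse=True)[: len(gains)]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical recruiter-assigned grades: 3 = strong match, 0 = irrelevant.
    labels = {"cv_a": 3, "cv_b": 0, "cv_c": 2, "cv_d": 1}
    model_ranking = ["cv_c", "cv_a", "cv_d", "cv_b"]
    print(round(ndcg_against_ground_truth(model_ranking, labels), 3))  # ~0.922
```

Even a small labelled set of this kind would make NDCG (and MRR) meaningfully interpretable; without it, the metric mostly measures the model's agreement with itself.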
@@ -84,6 +87,14 @@ def evaluate_and_log_ndcg(ranked_cvs, logger):
     """
     ndcg(ranked_cvs, logger)
 
 def log_bert_scores(generated_summary, logger):
+    # The BERTScore calculation is wrong: the candidate and the reference should not be the same text.
+    # Explore options for obtaining a reference or ground truth. Some options I have considered:
+    # - Have a professional build a reference dataset for initial testing and evaluation.
+    # - Use round-trip consistency.
+    # - Instead of relying solely on BERTScore, monitor other metrics such as summary length,
+    #   coherence, coverage, and whether key information is preserved.
+
+    # Each of these has its own problems; I will address them later (see the sketch below).
     """
     Logs the BERT scores (Precision, Recall, and F1) for a given generated summary.
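For the BERTScore issue flagged in the comment above, the first option listed there (a small, professionally written reference set) could look roughly like the sketch below, which scores a generated summary against a distinct reference text rather than against itself. This is only an illustration: it assumes the `bert-score` package is installed, and `reference_summaries`, the CV id, and the logger wiring are placeholders rather than the project's actual interfaces.

```python
import logging

from bert_score import score  # pip install bert-score

# Hypothetical human-written reference summaries, keyed by CV id.
reference_summaries = {
    "cv_a": "Senior backend engineer with six years of Python and FastAPI experience, "
            "focused on building and deploying ranking and NLP services.",
}


def log_bert_scores_against_reference(cv_id, generated_summary, logger):
    # Compare the generated summary to a curated reference, not to itself.
    reference = reference_summaries.get(cv_id)
    if reference is None:
        logger.warning("No reference summary for %s; skipping BERTScore.", cv_id)
        return
    precision, recall, f1 = score([generated_summary], [reference], lang="en", verbose=False)
    logger.info(
        "BERTScore for %s - P: %.4f, R: %.4f, F1: %.4f",
        cv_id, precision.mean().item(), recall.mean().item(), f1.mean().item(),
    )


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log_bert_scores_against_reference(
        "cv_a",
        "Backend engineer experienced in Python, FastAPI, and NLP services.",
        logging.getLogger(__name__),
    )
```

Because a curated set will not cover every incoming CV, the other options in the comment (round-trip consistency, plus auxiliary checks on summary length, coverage, and coherence) would complement rather than replace this.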