fix broken test case
Charles Yuan authored and committed Dec 16, 2024
1 parent f292038 commit 9aa3f48
Showing 3 changed files with 38 additions and 109 deletions.
Binary file modified examples/sample_data/resume_1.pdf
Binary file not shown.
134 changes: 31 additions & 103 deletions tests/outputs/correct_pdf_output.txt
@@ -1,115 +1,43 @@
STOXX INDEX METHODOLOGY GUIDE
John Doe
[email protected] | (123) 456-7890 | San Francisco, CA

## CONTENTS
## Professional Summary


| 6.5.1. | OVERVIEW | 49 |
|---|---|---|
| 6.5.2. | INDEX REVIEW | 49 |
| 6.5.3. | ONGOING MAINTENANCE | 51 |

Experienced Machine Learning Engineer with 5+ years of expertise in developing and deploying ML models. Skilled in Python, TensorFlow, and cloud-based ML solutions. Passionate about leveraging AI to solve complex business problems.


| 7. | STOXX BENCHMARK INDICES (BMI) | 52 |
|---|---|---|
| 7.1. | STOXX GLOBAL INDICES | 52 |
| 7.1.1. | OVERVIEW | 52 |
| 7.1.2. | INDEX REVIEW | 53 |
| 7.1.3. | ONGOING MAINTENANCE | 55 |
| 7.2 | STOXX GLOBAL 1800 AND DERIVED INDICES | 56 |
| 7.2.1. | OVERVIEW | 56 |
| 7.2.2. | INDEX REVIEW | 56 |
| 7.2.3. | ONGOING MAINTENANCE | 58 |
| 7.3 | SIZE INDICES BASED ON THE STOXX GLOBAL INDICES | 60 |
| 7.3.1. | OVERVIEW | 60 |
| 7.3.2. | INDEX REVIEW | 60 |
| 7.3.3. | ONGOING MAINTENANCE | 62 |
| 7.4 | SECTOR INDICES BASED ON THE STOXX GLOBAL INDICES | 63 |
| 7.4.1. | OVERVIEW | 63 |
| 7.4.2. | INDEX REVIEW | 63 |
| 7.4.3. | ONGOING MAINTENANCE | 64 |
| 7.5 | STOXX EUROPE 600 AND EURO STOXX SUPERSECTOR INDICES: 30% / 15% CAPS | 65 |
| 7.5.1. | OVERVIEW | 65 |
| 7.5.2. | INDEX REVIEW | 65 |
| 7.5.3. | ONGOING MAINTENANCE | 66 |
| 7.6 | STOXX REGIONAL REAL ESTATE INDICES: 20% CAPS67 | 67 |
| 7.6.1. | OVERVIEW | 67 |
| 7.6.2. | INDEX REVIEW | 67 |
| 7.6.3. | ONGOING MAINTENANCE | 67 |
| 7.7 | STOXX EMERGING MARKETS 800 LO | 68 |
| 7.7.1. | OVERVIEW | 68 |
| 7.7.2. | INDEX REVIEW | 68 |
| 7.7.3. | ONGOING MAINTENANCE | 68 |
| 7.8 | STOXX INDUSTRY AND SUPERSECTOR LEGACY INDICES | 70 |
| 7.8.1. | OVERVIEW | 70 |
| 7.8.2. | INDEX REVIEW | 71 |
| 7.8.3. | ONGOING MAINTENANCE | 71 |
| 7.9 | EURO STOXX SUPERSECTOR 5/10/40 INDICES | 72 |
| 7.9.1. | OVERVIEW | 72 |
| 7.9.2. | INDEX REVIEW | 72 |
| 7.9.3. | ONGOING MAINTENANCE | 73 |
| 7.10 | STOXX EUROPE 600 INDUSTRY 30-15 INDICES | 74 |
| 7.10.1. | OVERVIEW | 74 |
| 7.10.2. | INDEX REVIEW | 74 |
| 7.10.3. | ONGOING MAINTENANCE | 75 |

## Skills


| 7.11. | STOXX SEMICONDUCTOR 30 INDEX | 76 |
|---|---|---|
| 7.11.1. | OVERVIEW | 76 |
| 7.11.2. | INDEX REVIEW | 76 |
| 7.11.3. | ONGOING MAINTENANCE | 77 |
| Python | TensorFlow | PyTorch | Scikit-learn | Deep Learning |
|---|---|---|---|---|
| Natural Language Processing | Computer Vision | AWS SageMaker | Docker | Git |


## 8. STOXX EQUAL WEIGHT INDICES
## Work Experience


| 8.1. | STOXX EQUAL WEIGHT INDICES | 78 |
|---|---|---|
| 8.1.1. | OVERVIEW | 78 |
| 8.1.2. | INDEX REVIEW | 78 |
| 8.1.3. | ONGOING MAINTENANCE | 78 |

### Senior Machine Learning Engineer
TechCorp Inc., San Francisco, CA
June 2019 - Present
- Led a team of 5 ML engineers in developing a state-of-the-art recommendation system, increasing user engagement by 35%
- Implemented and optimized deep learning models for natural language processing tasks, improving accuracy by 20%
- Designed and deployed scalable ML pipelines using AWS SageMaker, reducing model training time by 40%

## 9. STOXX BLUE-CHIP INDICES
### Machine Learning Engineer
Al Innovations Ltd., Boston, MA
August 2016 - May 2019
- Developed computer vision algorithms for autonomous vehicles, achieving 98% accuracy in object detection
- Collaborated with cross-functional teams to integrate ML models into production systems
- Authored technical documentation and conducted knowledge-sharing sessions on ML best practices


| 9.1. | STOXX GLOBAL AND COUNTRY BLUE-CHIP INDICES | 80 |
|---|---|---|
| 9.1.1. | OVERVIEW | 80 |
| 9.1.2. | INDEX REVIEW | 81 |
| 9.1.3. | ONGOING MAINTENANCE | 84 |
| 9.2 | EURO STOXX 50 | 85 |
| 9.2.1. | OVERVIEW | 85 |
| 9.2.2. | INDEX REVIEW | 85 |
| 9.2.3. | ONGOING MAINTENANCE | 86 |
| 9.3 | STOXX REGIONAL BLUE-CHIP INDICES | 88 |
| 9.3.1. | OVERVIEW | 88 |
| 9.3.2. | INDEX REVIEW | 88 |
| 9.3.3. | ONGOING MAINTENANCE | 89 |
| 9.4 | STOXX GLOBAL 150 | 91 |
| 9.4.1. | OVERVIEW | 91 |
| 9.4.2. | INDEX REVIEW | 91 |
| 9.4.3. | ONGOING MAINTENANCE | 91 |
| 9.5 | STOXX BALKAN 50 EQUAL WEIGHT | 92 |
| 9.5.1. | OVERVIEW | 92 |
| 9.5.2. | INDEX REVIEW | 92 |
| 9.5.3. | ONGOING MAINTENANCE | 93 |
| 9.6 | STOXX CANADA 60 | 94 |
| 9.6.1. | OVERVIEW | 94 |
| 9.6.2. | INDEX REVIEW | 94 |
| 9.6.3. | ONGOING MAINTENANCE | 95 |

## Education

## 10. STOXX DIVIDEND INDICES
Master of Science in Computer Science, Specialization in Machine Learning
Stanford University, Stanford, CA (2014 - 2016)


| 10.1. | STOXX SELECT DIVIDEND INDICES | 96 |
|---|---|---|
| 10.1.1. | OVERVIEW | 96 |
| 10.1.2. | INDEX REVIEW | 96 |
| 10.1.3. | STOXX SELECT DIVIDEND INDICES | 99 |
| 10.1.4. | ONGOING MAINTENANCE | 101 |
| 10.2 | STOXX ASEAN-FIVE SELECT DIVIDEND 50 | 104 |
| 10.
Bachelor of Science in Computer Engineering
Massachusetts Institute of Technology, Cambridge, MA (2010 - 2014)

## Certifications

- AWS Certified Machine Learning - Specialty
- Google Cloud Professional Machine Learning Engineer
13 changes: 7 additions & 6 deletions tests/test.py
@@ -48,13 +48,12 @@ def setUp(self):

     def test_pdf_sync_parse(self):
         """Synchronous PDF Parse"""
-        working_file = "./examples/sample_data/stoxx_index_guide_0003.pdf"
+        working_file = "./examples/sample_data/resume_1.pdf"
         correct_output_file = "./tests/outputs/correct_pdf_output.txt"

         # extract
         markdown_list, elapsed_time = self.ap.parse(file_path=working_file)
         markdown = "\n".join(markdown_list)

         self.assertFalse(markdown.startswith("Error:"), markdown)
         correct_output = get_ground_truth(correct_output_file)
         percentage = compare_markdown(markdown, correct_output)
@@ -66,7 +65,7 @@ def test_pdf_sync_parse(self):

     def test_pdf_sync_parse_with_file_content(self):
         """Synchronous PDF Parse with file content"""
-        working_file = "./examples/sample_data/stoxx_index_guide_0003.pdf"
+        working_file = "./examples/sample_data/resume_1.pdf"
         correct_output_file = "./tests/outputs/correct_pdf_output.txt"

         with open(working_file, "rb") as file:
@@ -90,7 +89,7 @@ def test_pdf_sync_parse_with_file_content(self):

     def test_pdf_async_parse_and_fetch(self):
         """Asynchronous PDF Parse and Fetch"""
-        working_file = "./examples/sample_data/stoxx_index_guide_0003.pdf"
+        working_file = "./examples/sample_data/resume_1.pdf"
         correct_output_file = "./tests/outputs/correct_pdf_output.txt"

         # extract
@@ -109,15 +108,17 @@ def test_pdf_async_parse_and_fetch(self):

def test_pdf_async_parse_and_fetch_with_file_content(self):
"""Asynchronous PDF Parse and Fetch with file content"""
working_file = "./examples/sample_data/stoxx_index_guide_0003.pdf"
working_file = "./examples/sample_data/resume_1.pdf"
correct_output_file = "./tests/outputs/correct_pdf_output.txt"

with open(working_file, "rb") as file:
file_content = base64.b64encode(file.read()).decode("utf-8")
file_type = Path(working_file).suffix.lower().lstrip(".")

# extract
file_id = self.ap.async_parse(file_content=file_content, file_type=file_type)
file_id = self.ap.async_parse(
file_content=file_content, file_type=file_type
)
self.assertFalse(file_id.startswith("Error:"), file_id)
# fetch
markdown_list = self.ap.async_fetch(file_id=file_id)
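
The tests in this diff call `get_ground_truth` and `compare_markdown`, but neither helper is defined in the hunks shown. The following is a minimal sketch of what such helpers could look like, assuming the ground truth is a plain UTF-8 text file and that similarity is scored line by line with the standard library's `difflib`. Both function bodies are hypothetical stand-ins and are not taken from the repository.

```python
import difflib
from pathlib import Path


def get_ground_truth(file_path: str) -> str:
    """Read the expected Markdown from disk (hypothetical stand-in)."""
    return Path(file_path).read_text(encoding="utf-8")


def compare_markdown(generated: str, expected: str) -> float:
    """Return a 0-100 line-level similarity score (hypothetical stand-in)."""

    def normalize(text: str) -> list[str]:
        # Strip surrounding whitespace and drop blank lines so that
        # spacing differences alone do not lower the score.
        return [line.strip() for line in text.splitlines() if line.strip()]

    matcher = difflib.SequenceMatcher(None, normalize(generated), normalize(expected))
    return matcher.ratio() * 100


if __name__ == "__main__":
    # Identical content apart from blank lines scores 100.
    print(compare_markdown("# Title\n\nbody\n", "# Title\nbody"))
```

A test could then assert that the returned percentage exceeds some chosen threshold. The threshold the repository actually uses is not visible in this diff, so none is assumed here.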