Runtime Error on Large Repositories #1

Open
bfuerholz opened this issue Jun 9, 2024 · 0 comments

Description
When the backend fetches and processes a large repository on Vercel, it fails in one of two ways because of limits in the serverless environment: the request ends with a 504 Gateway Timeout, or the scraper raises a read-only file system error when it tries to write its output file.

Error Details
Error 1: 504 Gateway Timeout
Status Code: 504
Gateway Timeout

Error 2: Read-only File System
Error scraping repository: [Errno 30] Read-only file system: 'bfuerholz_bfuerholz_20240609135955.txt'

Steps to Reproduce
1. Deploy the backend on Vercel.
2. Use the frontend to fetch a large GitHub repository.
3. Observe the runtime error in the backend logs.

Expected Behavior
The backend should handle large repositories without running into timeout or filesystem errors.

Actual Behavior
The backend encounters runtime errors and fails to process large repositories.

Possible Solutions
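Increase Timeout Limit: Raise the maximum duration of the Vercel serverless function (the maxDuration setting in the function configuration). The ceiling depends on the Vercel plan, so this only helps repositories that finish within that cap.
Implement Asynchronous Processing: Use a task queue like Celery to offload the processing of large repositories to background tasks. The workers need a long-lived host, since a serverless function cannot keep a worker process running.
Use Persistent Storage: Write temporary files and processing results to an external store such as AWS S3 or a database instead of the function's local filesystem.
Optimize File Processing: Improve the efficiency of the file fetching and processing logic to reduce execution time.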
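
One way to keep a single invocation under the execution limit, whichever of the options above is chosen, is to process files against a time budget and hand back whatever is left. The sketch below is only an illustration under assumptions: process_with_budget, handle, and budget_seconds are hypothetical names, not part of the current backend, and the real fetching logic would replace handle.

```python
import time
from typing import Callable, Iterable, List, Tuple


def process_with_budget(
    paths: Iterable[str],
    handle: Callable[[str], str],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[str], List[str]]:
    """Process paths until the time budget runs out.

    Returns (results, remaining). `remaining` holds the paths that were not
    processed, so the caller can submit them in a follow-up request.
    All names here are hypothetical; `handle` stands in for the real
    per-file fetch/processing step.
    """
    deadline = clock() + budget_seconds
    pending = list(paths)
    results: List[str] = []
    index = 0
    # Check the clock before each file, never in the middle of one.
    while index < len(pending) and clock() < deadline:
        results.append(handle(pending[index]))
        index += 1
    return results, pending[index:]
```

The frontend would then call the endpoint repeatedly with the remaining paths until the list is empty. The budget should sit comfortably below the platform timeout, because a single slow file can still overrun it.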

Additional Context
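This issue occurs because Vercel serverless functions have a maximum execution time, and their filesystem is read-only except for the temporary directory /tmp. The backend writes its output file to a relative path in the working directory, which triggers Errno 30. Files under /tmp are not guaranteed to survive between invocations, so they cannot serve as persistent storage either.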
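
As a rough sketch of how the Errno 30 failure could be avoided, the output file can be created in the system temporary directory instead of the working directory. The helper names output_path and save_scrape_result are hypothetical and not taken from the backend; the file name pattern is inferred from the log entries (owner_repo_timestamp.txt).

```python
import os
import tempfile
from datetime import datetime, timezone
from typing import Optional


def output_path(owner: str, repo: str, base_dir: Optional[str] = None) -> str:
    """Build an output path inside a writable directory.

    Hypothetical helper. The file name mirrors the pattern seen in the logs.
    base_dir defaults to the system temp directory, typically /tmp on Linux.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    directory = base_dir or tempfile.gettempdir()
    return os.path.join(directory, f"{owner}_{repo}_{stamp}.txt")


def save_scrape_result(owner: str, repo: str, text: str) -> str:
    """Write the scraped text to the temp directory and return its path."""
    path = output_path(owner, repo)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return path
```

This only removes the write error. The file should be returned in the same response or uploaded to external storage before the function exits, since the temporary directory is not persistent.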

Proposed Workaround
For now, consider running the backend on a platform that supports longer execution times and persistent storage, such as AWS Lambda (up to 15 minutes per invocation) with S3 for output, or a dedicated server.

Logs
[ERROR] 2024-06-09T13:59:55.831Z 4386ea68-b43b-4121-8177-2a4205ad57ff Error scraping repository: [Errno 30] Read-only file system: 'bfuerholz_bfuerholz_20240609135955.txt'
[ERROR] 2024-06-09T13:59:44.788Z ceac7550-6716-472d-a6fb-12d4befc539e Error scraping repository: [Errno 30] Read-only file system: 'bfuerholz_bfuerholz_20240609135944.txt'
