Online weight update [WIP] #2119
test/srt/stderr.txt
Outdated
Delete this, it's useless.
test/srt/stdout.txt
Outdated
Delete this, it's useless.
@classmethod
def init_process(cls, rank, world_size, base_url, model_name, server_pid):
    try:
        # 设置分布式环境
Do not use Chinese annotations.
You can rebase and add your 2-GPU test here: sglang/.github/workflows/pr-test.yml, lines 106 to 109 in fa27161.
    return ret
except Exception as e:
    return ORJSONResponse(
        {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Copilot Autofix AI 1 day ago
To fix the problem, we need to ensure that the exception details are not exposed to the user. Instead, we should log the exception details on the server and return a generic error message to the user. This can be achieved by modifying the exception handling block to log the exception and return a generic error message.
- Modify the exception handling block in the `get_memory_pool_size` function to log the exception and return a generic error message.
- Add the necessary import for logging if it is not already present.
@@ -218,4 +218,5 @@
     except Exception as e:
+        logging.error("Exception occurred in get_memory_pool_size: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
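The pattern this suggestion describes can be sketched without a web framework. The snippet below is a minimal illustration under stated assumptions, not the code in this PR: `generic_error_response` and `get_memory_pool_size_stub` are hypothetical names, and the helper returns a plain `(body, status)` tuple where the real handler would build an `ORJSONResponse`.

```python
import logging
from http import HTTPStatus

logger = logging.getLogger(__name__)

GENERIC_MESSAGE = "An internal error has occurred."


def generic_error_response(exc, handler_name):
    """Log the exception on the server and return a body that hides its details.

    Hypothetical helper: returns (body, status_code) instead of a framework
    response object so the sketch stays dependency-free.
    """
    # The exception message and traceback go to the server log only.
    logger.error("Exception occurred in %s", handler_name, exc_info=exc)
    body = {"error": {"message": GENERIC_MESSAGE}}
    return body, HTTPStatus.BAD_REQUEST


def get_memory_pool_size_stub():
    """Stand-in for a handler whose underlying call fails."""
    try:
        raise RuntimeError("secret path /srv/models/weights.bin not found")
    except Exception as e:
        return generic_error_response(e, "get_memory_pool_size")
```

Passing the exception through `exc_info` keeps the traceback in the log record, so the client-facing body never needs to carry `str(e)`.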
    return ORJSONResponse(ret, status_code=200)
except Exception as e:
    return ORJSONResponse(
        {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Copilot Autofix AI 1 day ago
To fix the problem, we need to ensure that detailed exception messages are not exposed to the user. Instead, we should log the detailed error information on the server and return a generic error message to the user. This can be achieved by modifying the exception handling code to log the exception and return a generic error message.
- Import the `traceback` module to log the stack trace.
- Modify the exception handling blocks to log the stack trace and return a generic error message.
@@ -31,2 +31,3 @@
 import torch
+import traceback
 
@@ -284,4 +285,5 @@
     except Exception as e:
+        logging.error(traceback.format_exc())
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -296,4 +298,5 @@
     except Exception as e:
+        logging.error(traceback.format_exc())
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -308,4 +311,5 @@
     except Exception as e:
+        logging.error(traceback.format_exc())
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
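Because the same `try`/`except` body recurs in each handler, one option is to centralize it in a decorator. The sketch below is an assumption about how that could look, using only the standard library; `hide_exception_details` and `open_session_stub` are hypothetical names. It relies on `logger.exception`, which records the active traceback, so a separate `traceback.format_exc()` call is not needed.

```python
import functools
import logging
from http import HTTPStatus

logger = logging.getLogger(__name__)


def hide_exception_details(func):
    """Wrap a handler so failures are logged and answered with a generic body."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Unhandled error in %s", func.__name__)
            return (
                {"error": {"message": "An internal error has occurred."}},
                HTTPStatus.BAD_REQUEST,
            )

    return wrapper


@hide_exception_details
def open_session_stub(capacity):
    """Hypothetical synchronous stand-in for a request handler."""
    if capacity <= 0:
        raise ValueError(f"invalid capacity: {capacity}")
    return {"session_id": "demo"}, HTTPStatus.OK
```

The handlers in a real server are likely `async`; an async variant would declare `async def wrapper` and `await func(...)`, which this synchronous sketch leaves out.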
    return ret
except ValueError as e:
    return ORJSONResponse(
        {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Copilot Autofix AI 1 day ago
To fix the problem, we need to ensure that detailed exception messages are not exposed to the user. Instead, we should log the detailed error message on the server and return a generic error message to the user. This can be achieved by modifying the exception handling code to log the exception and return a generic error message.
- Import the `logging` module if not already imported.
- Modify the exception handling code to log the exception message using the `logging` module.
- Return a generic error message to the user.
@@ -325,3 +325,4 @@
     except ValueError as e:
-        out = {"error": {"message": str(e)}}
+        logging.error(f"Error in stream_results: {e}")
+        out = {"error": {"message": "An internal error has occurred."}}
         yield b"data: " + orjson.dumps(
@@ -341,4 +342,5 @@
     except ValueError as e:
+        logging.error(f"Error in generate_request: {e}")
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -356,4 +358,5 @@
     except ValueError as e:
+        logging.error(f"Error in init_parameter_update_group_request: {e}")
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -370,4 +373,5 @@
     except ValueError as e:
+        logging.error(f"Error in get_weights_by_parameter_name_request: {e}")
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -386,4 +390,5 @@
     except ValueError as e:
+        logging.error(f"Error in update_parameter_from_distributed_request: {e}")
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
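The `stream_results` hunk differs from the others because the error is emitted from inside a streaming generator rather than returned. A minimal sketch of that case follows. It uses the standard `json` module in place of `orjson`, and the `b"\n\n"` event terminator, the `{"text": ...}` chunk shape, and the `stream_results_stub` name are all assumptions, since the hunk is cut off after `orjson.dumps(`.

```python
import json
import logging

logger = logging.getLogger(__name__)


def stream_results_stub(chunks):
    """Yield server-sent-event style byte chunks; hide details if a chunk fails."""
    try:
        for chunk in chunks:
            if chunk is None:
                # Simulated failure carrying internal detail.
                raise ValueError("tokenizer state corrupted at offset 1234")
            yield b"data: " + json.dumps({"text": chunk}).encode() + b"\n\n"
    except ValueError as e:
        # The detailed message stays in the server log.
        logger.error("Error in stream_results: %s", e)
        out = {"error": {"message": "An internal error has occurred."}}
        yield b"data: " + json.dumps(out).encode() + b"\n\n"
```

Once the error event is yielded the generator ends, so any chunks after the failing one are never sent.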
    return ret
except ValueError as e:
    return ORJSONResponse(
        {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Copilot Autofix AI 1 day ago
To fix the problem, we need to ensure that detailed error messages and stack traces are not exposed to the user. Instead, we should log the detailed error information on the server and return a generic error message to the user. This can be achieved by modifying the exception handling code to log the exception and return a generic error message.
- Modify the exception handling code in the `generate_request`, `init_parameter_update_group_request`, `get_weights_by_parameter_name_request`, `update_parameter_from_distributed_request`, and `encode_request` functions.
- Use the `logging` module to log the detailed error message on the server.
- Return a generic error message to the user.
@@ -325,3 +325,4 @@
     except ValueError as e:
-        out = {"error": {"message": str(e)}}
+        logging.error("Error in stream_results: %s", str(e))
+        out = {"error": {"message": "An internal error has occurred."}}
         yield b"data: " + orjson.dumps(
@@ -341,4 +342,5 @@
     except ValueError as e:
+        logging.error("Error in generate_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -356,4 +358,5 @@
     except ValueError as e:
+        logging.error("Error in init_parameter_update_group_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -370,4 +373,5 @@
     except ValueError as e:
+        logging.error("Error in get_weights_by_parameter_name_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -386,4 +390,5 @@
     except ValueError as e:
+        logging.error("Error in update_parameter_from_distributed_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -403,4 +408,5 @@
     except ValueError as e:
+        logging.error("Error in encode_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
    return ret
except ValueError as e:
    return ORJSONResponse(
        {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Copilot Autofix AI 1 day ago
To fix the problem, we need to ensure that detailed exception messages are not exposed to the user. Instead, we should log the detailed error message on the server and return a generic error message to the user. This can be achieved by modifying the exception handling code to log the exception and return a generic error message.
- Import the `logging` module if it is not already imported.
- Replace the return statements that expose the exception message with code that logs the exception and returns a generic error message.
@@ -296,4 +296,5 @@
     except Exception as e:
+        logging.error("Error in open_session: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -308,4 +309,5 @@
     except Exception as e:
+        logging.error("Error in close_session: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -325,3 +327,4 @@
     except ValueError as e:
-        out = {"error": {"message": str(e)}}
+        logging.error("Error in stream_results: %s", str(e))
+        out = {"error": {"message": "An internal error has occurred."}}
         yield b"data: " + orjson.dumps(
@@ -341,4 +344,5 @@
     except ValueError as e:
+        logging.error("Error in generate_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -356,4 +360,5 @@
     except ValueError as e:
+        logging.error("Error in init_parameter_update_group_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -370,4 +375,5 @@
     except ValueError as e:
+        logging.error("Error in get_weights_by_parameter_name_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
@@ -386,4 +392,5 @@
     except ValueError as e:
+        logging.error("Error in update_parameter_from_distributed_request: %s", str(e))
         return ORJSONResponse(
-            {"error": {"message": str(e)}}, status_code=HTTPStatus.BAD_REQUEST
+            {"error": {"message": "An internal error has occurred."}}, status_code=HTTPStatus.BAD_REQUEST
         )
Motivation
Modifications
Checklist