[pull] master from ray-project:master #2395

pull · 2023-11-10T03:40:44Z

See Commits and Changes for more details.


The Data Arrow 6 test jobs are failing on master and blocking people from merging. There are a lot of tests, so I soft-fail the entire job. The Arrow 12 job still guards against PRs that break the data tests. Signed-off-by: can <[email protected]>


This change is a code refactoring that includes several changes:

- Remove the logic for generating the keys, which is now handled by Packer.
- Remove one meaningless UT case that mocked too much. The e2e pipeline covers that case.
- Rewrite the scheduler that schedules the next Ray node. The previous "factory mode" was not a good fit.
- Remove the unused code for reusing nodes. On vSphere, thanks to instant clone, we don't need to cache the VM; we can always launch a new Ray node quickly. For old nodes, we just delete the VM after the Ray process on it leaves the Ray cluster.
- Remove the node cache. The cache can become outdated when node information changes on the vSphere side. Only one place uses the node cache, so the benefit of keeping it is not worth the potential issues.
- Move functions that do not depend on the VsphereNodeProvider instance to the outer scope instead of keeping them as member functions of VsphereNodeProvider.
- Some changes caused by the changes above.
- Some Python syntax-level cleanups.

Signed-off-by: Chen Jing <[email protected]>


As a byproduct of the recent documentation rewrites, the Train docs contain several code snippets that aren't tested. This PR updates the snippets to test the ones that can be reasonably tested. --------- Signed-off-by: Balaji Veeramani <[email protected]>

This PR adds support for running MPI-based code on top of Ray. The support is implemented as a runtime env plugin. To enable it, add the following `mpi` entry to the `runtime_env` in the `ray.remote` options:

    @ray.remote(
        runtime_env={
            "mpi": {
                "args": ["-n", "4"],
                "worker_entry": "mpi_worker.run",
            }
        }
    )
    def f():
        pass

Here `mpi_worker.run` is the function that the processes with rank > 0 run. It is invoked as `import mpi_worker; mpi_worker.run()`. Parameters need to be passed with MPI `comm.bcast`. The process with rank 0 still runs the remote function `f`. Signed-off-by: Yi Cheng <[email protected]>
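To illustrate how a dotted entry such as `mpi_worker.run` can be turned into a callable, here is a minimal sketch using only the standard library. The helper name `resolve_entry` is hypothetical and this is not the plugin's actual code; it only shows the general technique, and why `importlib.import_module` is more convenient than `__import__` for dotted module names.

```python
import importlib


def resolve_entry(entry: str):
    """Resolve a dotted "module.function" string to a callable.

    Hypothetical helper for illustration; not part of Ray.
    """
    module_name, _, func_name = entry.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {entry!r}")
    # import_module returns the named (sub)module itself, whereas
    # __import__("os.path") returns the top-level package `os`.
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# Stand-in for "mpi_worker.run": resolve a function from the stdlib.
join = resolve_entry("os.path.join")
```

With a real worker module on the import path, `resolve_entry("mpi_worker.run")()` would have the same effect as `import mpi_worker; mpi_worker.run()`.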


When you launch a read task generated by `FileBasedDatasource`, Ray serializes the `FileBasedDatasource` instance because the read task calls `FileBasedDatasource._read_stream`. This wasn't an issue before, because `FileBasedDatasource` was stateless and therefore quick to serialize. However, #40900 moved state such as the input file paths from `_FileBasedDatasourceReader` to `FileBasedDatasource`. As a result, `FileBasedDatasource` is now slow to serialize, and read tasks can be slow to launch. To fix this issue, this PR stores the paths and file sizes in the object store and only stores references to them in `FileBasedDatasource`. --------- Signed-off-by: Balaji Veeramani <[email protected]>
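The effect described above can be sketched without Ray. The example below is an assumption-laden illustration, not the PR's code: a plain dict stands in for the object store (in Ray, `ray.put` and `ray.get` play that role), and the class names, `put`, `get`, and `serialized_state_size` are all hypothetical. It compares how many bytes of instance state would travel with each task when the paths are held inline versus behind a reference.

```python
import pickle
import uuid

# Stand-in for a shared object store (assumption: in Ray this would be
# ray.put / ray.get rather than a module-level dict).
_OBJECT_STORE = {}


def put(value):
    """Store a value once and return a small reference key."""
    ref = uuid.uuid4().hex
    _OBJECT_STORE[ref] = value
    return ref


def get(ref):
    """Look up a previously stored value by its reference key."""
    return _OBJECT_STORE[ref]


class InlineDatasource:
    """Holds the path list directly, so its state grows with the file count."""

    def __init__(self, paths):
        self._paths = list(paths)

    def read_paths(self):
        return self._paths


class RefDatasource:
    """Holds only a reference; the path list lives in the store."""

    def __init__(self, paths):
        self._paths_ref = put(list(paths))

    def read_paths(self):
        return get(self._paths_ref)


def serialized_state_size(obj):
    """Bytes needed to pickle the instance state shipped with each task."""
    return len(pickle.dumps(vars(obj)))


paths = [f"s3://bucket/part-{i:06d}.parquet" for i in range(10_000)]
inline_size = serialized_state_size(InlineDatasource(paths))
ref_size = serialized_state_size(RefDatasource(paths))
print(f"inline: {inline_size} bytes, by reference: {ref_size} bytes")
```

The by-reference state stays roughly constant in size no matter how many paths there are, which is why launching many read tasks becomes cheaper.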

can-anyscale and others added 9 commits November 9, 2023 10:52

[serve] Add route_prefix flag to serve run (#41011)

6dc2c99

Signed-off-by: Cindy Zhang <[email protected]> Co-authored-by: Edward Oakes <[email protected]>

[tune] New persistence mode cleanup: Delete legacy utilities (`air.re…

837ec26

…mote_storage`, etc.) (#40207) This PR removes some legacy utilities, including `air._internal.remote_storage`, `TrainableUtil`, and more. --------- Signed-off-by: Justin Yu <[email protected]>

[2/2][Serve] Support dynamic log (#40735)

00a1dcb

Follow-up PR to implement dynamic logging. --------- Signed-off-by: Sihan Wang <[email protected]>

[core] Change __import__ to importlib.import_module for MPI plugi…

5a0fd38

…n. (#41062) Signed-off-by: Yi Cheng <[email protected]>

pull bot added the ⤵️ pull label Nov 10, 2023

pull bot merged commit 2d3865f into miqdigital:master Nov 10, 2023
1 check passed
