[pull] master from ray-project:master #2399

Merged: 24 commits merged on Nov 15, 2023

@pull pull bot commented Nov 15, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

rickyyx and others added 24 commits November 13, 2023 23:19
The TFRecords docstring says that non-.tfrecord files are filtered out, but that isn't true. We don't specify any file extension filter.

Signed-off-by: Balaji Veeramani <[email protected]>
Adds a dropdown for operator-level metrics for each dataset row in the Ray Data table.
In addition to storing dataset state, `_StatsActors` now also stores operator-level state.

Signed-off-by: Andrew Xue <[email protected]>
Co-authored-by: Alan Guo <[email protected]>
Fault tolerance is table stakes for Ray Data, and this PR adds the feature for batch inference. To ensure that tasks are retried in the case of system failures (like nodes crashing), this PR configures max_retries for all remote tasks. It also adds a chaos release test.

---------

Signed-off-by: Balaji Veeramani <[email protected]>
…in reversed insert order(#40897)

Ray >=2.8 will not support Python 3.7, so we can use reversed/items (introduced in [Python 3.8](https://docs.python.org/3/whatsnew/3.8.html#other-language-changes)) instead of list/[::-1] to avoid converting the whole topology dict into a list many times.
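A minimal sketch of the idiom described above. The `topology` dict and its keys are made up for illustration and are not Ray's actual data structure. Since Python 3.8, dicts and their views support `reversed()` directly, so iterating in reverse insertion order needs no intermediate list.

```python
# Hypothetical stand-in for an operator topology kept in insertion order.
topology = {"read": 0, "map": 1, "write": 2}

# Older idiom: materializes the full list of items before reversing it.
old_order = [name for name, _ in list(topology.items())[::-1]]

# Python 3.8+: dict views are reversible, so no intermediate list is built.
new_order = [name for name, _ in reversed(topology.items())]

assert old_order == new_order == ["write", "map", "read"]
```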

Signed-off-by: z4y1b2 <[email protected]>
…#41111)

This PR deprecates the `num_blocks` parameter of `random_shuffle()` to avoid surfacing the block concept in our API.

Signed-off-by: Cheng Su <[email protected]>
These C++ tests have blocked postmerge twice.

Signed-off-by: can <[email protected]>
Fix REST API system logging config


---------

Signed-off-by: Sihan Wang <[email protected]>
Co-authored-by: Cindy Zhang <[email protected]>
Signed-off-by: allenwang28 <[email protected]>
Signed-off-by: Allen Wang <[email protected]>
ESXi 7.0 introduced Dynamic DirectPath IO, an option for GPU passthrough.

DirectPath IO and Dynamic DirectPath IO both map a physical PCIe device directly to a VM. The main difference is that DirectPath IO gives the VM direct access to the physical device at all times, while Dynamic DirectPath IO gives the VM access to the physical device only while the VM is powered on.

This PR adds support for Dynamic DirectPath IO. To enable it, configure the provider like this:

provider:
    ...
    vsphere_config:    
      .....
      gpu_config:
        dynamic_pci_passthrough: True

Signed-off-by: Chen Hui <[email protected]>
quay.io has issues pulling older images (https://status.quay.io/incidents/z7sbjqmb34p1). Let's upgrade the manylinux image while we are here.

Signed-off-by: can <[email protected]>
Improve the setup_ray_cluster documentation for the num_worker_nodes argument

Signed-off-by: Weichen Xu <[email protected]>
The link in the docstring for ray.util.state.list_jobs was broken because of a syntax error (an extra space between :ref: and the target).
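For illustration, here is a small check for this class of mistake, assuming the broken form is a role such as `:ref:` followed by whitespace before the backquoted target. The regex, the helper name `find_broken_roles`, and the sample strings are hypothetical and not taken from the Ray source.

```python
import re

# Matches a Sphinx-style role (e.g. :ref:) followed by whitespace before the
# opening backquote; in that form the role is not attached to its target.
BROKEN_ROLE = re.compile(r":(\w+):\s+`")

def find_broken_roles(text: str) -> list:
    """Return the names of roles separated from their target by whitespace."""
    return BROKEN_ROLE.findall(text)

# Hypothetical docstring lines; the label is a placeholder.
broken = "See :ref: `the jobs guide <hypothetical-label>` for details."
fixed = "See :ref:`the jobs guide <hypothetical-label>` for details."

print(find_broken_roles(broken))
print(find_broken_roles(fixed))
```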

Signed-off-by: angelinalg <[email protected]>
Build arm64 Docker images in civ2

Signed-off-by: can <[email protected]>
…ss code in databricks hook (#40823)

Fix the code for getting the Databricks entry_point, and remove useless code in the Databricks hook.

The removed code calls an internal (and unstable) Databricks API that we don't actually need.

Signed-off-by: Weichen Xu <[email protected]>
…parameter (#41120)

Several users have run into friction when configuring file extension filters. Our current API requires configuring partition_filter, which is confusing because file extensions have nothing to do with partitions (e.g., Hive-style partitions).
It's cleaner to add a file_extensions parameter to file-based APIs.
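To illustrate the intended behavior, here is a plain-Python sketch of extension-based filtering. It is not Ray's implementation: `filter_by_extension` and the sample paths are hypothetical, and only the parameter name `file_extensions` comes from the description above.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Optional

def filter_by_extension(
    paths: Iterable[str], file_extensions: Optional[List[str]] = None
) -> List[str]:
    """Keep paths whose final suffix is in file_extensions (case-insensitive).

    With file_extensions=None every path is kept, i.e. no filter is applied.
    """
    if file_extensions is None:
        return list(paths)
    allowed = {ext.lower().lstrip(".") for ext in file_extensions}
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lower().lstrip(".") in allowed
    ]

# Hypothetical listing: the extension filter is independent of the
# Hive-style partition directory (year=2023) in the path.
paths = ["data/year=2023/a.parquet", "data/year=2023/_SUCCESS", "data/b.PARQUET"]
parquet_only = filter_by_extension(paths, file_extensions=["parquet"])
```

Keeping the extension check separate from any partition logic is the point of the sketch: the filter looks only at the file name, never at the directory structure.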

---------

Signed-off-by: Balaji Veeramani <[email protected]>
These tests don't seem to be flaky based on the flaky dashboard
This PR fixes a bug with fault tolerance that caused duplicate blocks to be written if a single block write fails.

Also fixes a bug where the write silently succeeds if MAX_RETRY_CNT is reached.
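The two failure modes can be sketched in plain Python. This is an illustrative model, not the actual datasource code: `write_blocks`, `write_block`, `flaky_write`, and the retry limit value are hypothetical, and only the name MAX_RETRY_CNT comes from the description above. Retrying at the granularity of a single block avoids rewriting blocks that already succeeded, and re-raising after the last attempt prevents a silent success.

```python
from typing import Callable, List

MAX_RETRY_CNT = 3  # Assumed value, for illustration only.

def write_blocks(blocks: List[str], write_block: Callable[[str], None]) -> None:
    """Write each block, retrying only the block that failed."""
    for block in blocks:
        for attempt in range(1, MAX_RETRY_CNT + 1):
            try:
                write_block(block)
                break  # Block written; move on without rewriting earlier blocks.
            except IOError:
                if attempt == MAX_RETRY_CNT:
                    # Surface the failure instead of returning as if it succeeded.
                    raise

# Usage: a fake sink whose first write of block "b" fails once.
written: List[str] = []
failures = {"b": 1}

def flaky_write(block: str) -> None:
    if failures.get(block, 0) > 0:
        failures[block] -= 1
        raise IOError(f"transient failure writing {block}")
    written.append(block)

write_blocks(["a", "b", "c"], flaky_write)
```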

---------

Signed-off-by: Matthew Tang <[email protected]>
…thmConfig.rl_module_spec was NOT a @property yet) breaks when trying to load from this checkpoint. (#41157)
…en reading large files (#40533)

Adds an example of a workaround for reading large files with ray.data.read_json: set the block size used by PyArrow's JSON loader.

---------

Signed-off-by: Scott Lee <[email protected]>
@pull pull bot added the ⤵️ pull label Nov 15, 2023
@pull pull bot merged commit d4cae1d into miqdigital:master Nov 15, 2023
1 check passed