forked from tensorflow/tensorflow
[pull] master from tensorflow:master #238
Open
pull
wants to merge
1,124
commits into
GesuBackups:master
from
tensorflow:master
+106,575
−67,163
Conversation
Imported from GitHub PR openxla/xla#19426. After this change to the test inputs (openxla/xla@b10653f), the "too many blocks" exception is no longer triggered (the shape is not big enough). Due to the low importance of the test, it was decided to disable it. Copybara import of the project: -- ee36ca0f0c0dddb8db724327af78f0b59de3903b by Milica Makevic <[email protected]>: Disable gpu_too_many_blocks_test for rocm. Merging this change closes #19426. PiperOrigin-RevId: 697974812
PiperOrigin-RevId: 697988497
Change existing PyTorch composites to unify the upsample-bilinear composites from JAX and PyTorch. PiperOrigin-RevId: 698012187
Updates LLVM usage to match [b03a747fc0fc](llvm/llvm-project@b03a747fc0fc) PiperOrigin-RevId: 698047676
For Linux platforms, the build rule generates the `auditwheel show` log in the output regardless of the compliance-check flag. PiperOrigin-RevId: 698048983
PiperOrigin-RevId: 698049836
…e ALL broadcast-like inputs on TFLite ops that support implicit broadcasting PiperOrigin-RevId: 698054216
* Change the default QNN graph config to use the HTP FP16 precision backend config; this is required to correctly compile FP32 ops. * Create a 1-element 1D tensor out of a scalar value, since QNN ops always use a ranked tensor type as input. PiperOrigin-RevId: 698081261
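The scalar-wrapping step can be sketched as follows. This is a minimal illustration in which tensors are modeled as (shape, flat_data) pairs and `to_ranked_tensor` is a hypothetical helper, not the actual QNN API:

```python
def to_ranked_tensor(value):
    """Wrap a scalar into a 1-element rank-1 tensor.

    Tensors are modeled here as (shape, flat_data) pairs; this is a
    hypothetical helper, not the actual QNN API.
    """
    if isinstance(value, (int, float)):
        return ([1], [value])          # scalar -> ranked 1-element tensor
    shape, data = value
    return (shape, data)               # already ranked: pass through

print(to_ranked_tensor(3.5))  # ([1], [3.5])
```

Always emitting a ranked shape (here `[1]`) means downstream code never has to special-case rank-0 inputs.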
… the .td definition. PiperOrigin-RevId: 698082807
* Add FC op legalization and test data. * Add Select/Select_v2 op legalization. * Misc cleanups. PiperOrigin-RevId: 698094953
PiperOrigin-RevId: 698111562
…rializing any modules. Also pulled the deserialization a little further up the stack and only do it if the input doesn't already have a full module op. PiperOrigin-RevId: 698116466
PiperOrigin-RevId: 698121993
PiperOrigin-RevId: 698132925
PiperOrigin-RevId: 698133024
… Todo (resolved) PiperOrigin-RevId: 698133747
This fixes an issue with gsutil, which expects Python 3.5-3.11:

```
Error: gsutil requires Python version 2.7 or 3.5-3.11, but a different version is installed.
```

PiperOrigin-RevId: 698134102
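The version range in that error message can be sketched as a small predicate; this only illustrates the documented range, it is not gsutil's actual check:

```python
import sys

def gsutil_supported(version_info=sys.version_info):
    """True iff the interpreter is Python 2.7 or 3.5-3.11, the range
    described in the gsutil error message above."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) == (2, 7):
        return True
    return major == 3 and 5 <= minor <= 11

print(gsutil_supported((3, 12, 0)))  # False
```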
PiperOrigin-RevId: 698137647
PiperOrigin-RevId: 698150097
PiperOrigin-RevId: 698163185
…. This Extend() call would also lead to a memory-assignment issue, since it wasn't accompanied by the necessary chunk commit requests. We also add a VerifyAllocations() function that uses a BufferIntervalTree to check for overlapping allocations before scheduling the asynchronous copies. This is an extra check for the correctness of MsaAlgorithm allocations, and is only applied if options_.verify is enabled in the MSA options (it is disabled by default). PiperOrigin-RevId: 698164396
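The overlap check can be sketched as follows. This is a minimal sort-and-sweep illustration over hypothetical (start, end, name) intervals; the real MSA verification uses a BufferIntervalTree over allocated chunks, which this does not reproduce:

```python
def find_overlaps(allocations):
    """Return pairs of allocations whose [start, end) intervals overlap.

    `allocations` is an iterable of hypothetical (start, end, name)
    tuples; a sort-and-sweep stands in for the BufferIntervalTree query.
    """
    overlaps = []
    active = []  # (end, name) of allocations still live at the sweep point
    for start, end, name in sorted(allocations):
        active = [(e, n) for e, n in active if e > start]  # expire finished
        for _, other in active:
            overlaps.append((other, name))                 # still live: overlap
        active.append((end, name))
    return overlaps

allocs = [(0, 10, "a"), (5, 15, "b"), (20, 30, "c")]
print(find_overlaps(allocs))  # [('a', 'b')]
```

A verifier like this would flag any nonempty result before the asynchronous copies are scheduled.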
PiperOrigin-RevId: 698164750
PiperOrigin-RevId: 698164921
This change adds the legalization pass from IFRT to VIFRT. Legalization uses a templated OpConversion class, which is refined via the `IFRT` <-> `VIFRT` and `mlir::Func::*` <-> `VIFRT` op mappings defined in `map_ifrt_to_vifrt.h`. The change also versions `mlir::func::FuncOp`, `mlir::func::ReturnOp`, and `mlir::func::CallOp`, which provides two advantages: 1) we can use the templated OpConversion class rather than implementing a separate converter for each op, and 2) we can restrict the surface of possible breaking changes to just builtin types and attributes. Moreover, the change versions `mlir::FunctionType` and `mlir::TypeAttr` so that the generic op converter can be used, and so that the types allowed in functions are restricted to just builtin and IFRT types. PiperOrigin-RevId: 698168526
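The idea of one generic converter driven by a mapping table, rather than a hand-written converter per op, can be sketched as follows. The mapping entries and names here are hypothetical stand-ins, not the real contents of `map_ifrt_to_vifrt.h`:

```python
# Hypothetical op mapping, standing in for the IFRT <-> VIFRT and
# mlir::func::* <-> VIFRT tables defined in map_ifrt_to_vifrt.h.
IFRT_TO_VIFRT = {
    "func.func": "vifrt.FuncV1",
    "func.return": "vifrt.ReturnV1",
    "func.call": "vifrt.CallV1",
}

def legalize(op_name, mapping=IFRT_TO_VIFRT):
    """One table-driven converter covering every mapped op, instead of a
    separate converter class per op."""
    if op_name not in mapping:
        raise ValueError(f"no VIFRT equivalent for {op_name}")
    return mapping[op_name]

print(legalize("func.return"))  # vifrt.ReturnV1
```

Because all conversions flow through one function, adding support for a new op only means adding a table entry.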
Also fixed invalid C++ header usage. PiperOrigin-RevId: 698170878
…td definition. PiperOrigin-RevId: 698171237
PiperOrigin-RevId: 698174417
The next step is to migrate NcclComm and NcclOwnedComm to std::unique_ptr<Communicator> and proper virtual inheritance. PiperOrigin-RevId: 699233544
…filer_test This was originally proposed in openxla/xla#16102, but I still ran into an issue where it failed by a slight margin:

```
Expected: (profiler.MeasureClockCyclesPerOp(HloOpcode::kDivide, F64) .value() .clock_cycles()) > (300), actual: 296 vs 300
```

That said, I ran 1000 tests and did not encounter this issue. Reducing the threshold to 280, since the bound seems very tight and a flaky test is no good either way. PiperOrigin-RevId: 699233864
… for XLA:CPU PiperOrigin-RevId: 699234540
PiperOrigin-RevId: 699235057
Also fixes a few missing includes. Uses C++ includes instead of C ones. PiperOrigin-RevId: 699237969
PiperOrigin-RevId: 699238045
Define APIs for compiling LLVM modules to functions required by the XLA:CPU runtime: kernels, comparators, etc. Implementation largely exists as SimpleOrcJit in service/cpu, but it's tightly coupled with "legacy" XLA. PiperOrigin-RevId: 699239722
PiperOrigin-RevId: 699242142
PiperOrigin-RevId: 699247019
* De-dupe logic in test common and model_buffer. * Factor the flatbuffer model wrapper out of the class in test common and move it to flatbuffer_tools. * Add some extra helpers for flatbuffers in flatbuffer_tools, and add a test. * Hide all usage of `std::filesystem` in one .cc file; technically `<filesystem>` is an unapproved header. * Update model_load to use the flatbuffer tools. * Pull some of the member functions of "model unpacker" out into non-member functions. PiperOrigin-RevId: 699249089
…utions Performance is comparable to the synchronous version. Detailed results ('old' is the synchronous execution, 'new' is async execution; both use the same custom algorithm for transposed conv):

name                                                      old cpu/op   new cpu/op   delta
BM_Conv1DStrided/process_time                             29.4ms ± 6%  29.7ms ± 5%  ~       (p=0.841 n=5+5)
BM_Conv1DTransposedStrided/process_time                   29.6ms ± 2%  30.7ms ± 2%  +3.52%  (p=0.008 n=5+5)
BM_Conv1DTransposedStridedNonDefaultLayout/process_time   28.5ms ± 3%  28.3ms ± 1%  ~       (p=0.222 n=5+5)

name                                                      old time/op  new time/op  delta
BM_Conv1DStrided/process_time                             2.68ms ± 7%  2.72ms ± 5%  ~       (p=0.548 n=5+5)
BM_Conv1DTransposedStrided/process_time                   7.91ms ± 3%  7.98ms ± 5%  ~       (p=0.548 n=5+5)
BM_Conv1DTransposedStridedNonDefaultLayout/process_time   7.00ms ± 2%  7.32ms ± 4%  +4.58%  (p=0.016 n=5+5)

PiperOrigin-RevId: 699250549
Updates LLVM usage to match [556ea5265a25](llvm/llvm-project@556ea5265a25) PiperOrigin-RevId: 699251575
NCCL implementation details will have private visibility, and for all external users (Thunks etc.) we'll export them via a public header that uses the xla/core/collectives APIs. PiperOrigin-RevId: 699256314
…ator, encoded as follows:

```
_TENSOR_V1_<name>: {
  TENSOR_SHAPE: Vector<i64>,
  TENSOR_TYPE: tflite::TensorType (casted to i64),
  TENSOR_DATA: Vector<f32> or Vector<i64>
}
```

PiperOrigin-RevId: 699272982
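That encoding can be sketched as follows. `encode_tensor` is a hypothetical helper, and the integer passed for the tensor type is a stand-in for a `tflite::TensorType` enum value cast to i64; only the key names follow the message:

```python
def encode_tensor(name, shape, tensor_type, data):
    """Pack one tensor into the flat-map layout described above.

    Hypothetical helper: key names follow the commit message, and
    `tensor_type` is an integer stand-in for tflite::TensorType as i64.
    """
    return {
        f"_TENSOR_V1_{name}": {
            "TENSOR_SHAPE": [int(d) for d in shape],  # Vector<i64>
            "TENSOR_TYPE": int(tensor_type),          # enum cast to i64
            "TENSOR_DATA": list(data),                # Vector<f32> or Vector<i64>
        }
    }

entry = encode_tensor("weights", (2, 2), 1, [0.1, 0.2, 0.3, 0.4])
print(sorted(entry["_TENSOR_V1_weights"]))
# ['TENSOR_DATA', 'TENSOR_SHAPE', 'TENSOR_TYPE']
```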
The pass runs over a VIFRT module, and tries to convert it to a given target version. PiperOrigin-RevId: 699279298
PiperOrigin-RevId: 699279921
PiperOrigin-RevId: 699286343
PiperOrigin-RevId: 699309235
…annot be null PiperOrigin-RevId: 699310290
…ying the naming. PiperOrigin-RevId: 699317601
StreamExecutorGpuClient topology description as well. PiperOrigin-RevId: 699320139
nullptr is handled here. PiperOrigin-RevId: 699323007
Also: * Add some helper functions for checking that a litert op matches a tfl op, which can also be re-used in other contexts. * Add some quantization-related helper functions to flatbuffer_tools. * Update dump for quantization. * Move things around a bit and add quantization stuff to the model_util support checks. PiperOrigin-RevId: 699333588
PiperOrigin-RevId: 699337598
StreamExecutorGpuTopologyDescription rather than parsing it for every compile. PiperOrigin-RevId: 699344815
… with array output and multiple users. It may trigger a compilation error, such as in the added test target. PiperOrigin-RevId: 699357851
PiperOrigin-RevId: 699361885
PiperOrigin-RevId: 699397857
PiperOrigin-RevId: 699409569
PiperOrigin-RevId: 699467519
See Commits and Changes for more details.
Created by pull[bot]