Merge upstream, 2024-03-09 #2944
Merged
In some cases exits can lack LC PHI nodes for the virtual operand. We have to create them when the epilog loop requires them which also allows us to remove some only halfway correct fixups. This is the variant triggering for alternate exits. PR tree-optimization/114099 * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Create and fill in a needed virtual LC PHI for the alternate exits. Remove code dealing with that missing. * gcc.dg/vect/vect-early-break_120-pr114099.c: New testcase.
* sv.po, zh_CN.po: Update.
The PR complains that for the __builtin_stdc_bit_* "builtins" the diagnostics don't mention the name of the builtin the user used, but __builtin_{clz,ctz,popcount}g instead (which is what the FE immediately lowers it to). The following patch repeats the checks from check_builtin_function_arguments which are there done on BUILT_IN_{CLZ,CTZ,POPCOUNT}G, such that they diagnose it with the name of the "builtin" the user actually used before it is gone. 2024-02-26 Jakub Jelinek <[email protected]> PR c/114042 * c-parser.cc (c_parser_postfix_expression): Diagnose __builtin_stdc_bit_* argument with ENUMERAL_TYPE or BOOLEAN_TYPE type or if signed here rather than on the replacement builtins in check_builtin_function_arguments. * gcc.dg/builtin-stdc-bit-2.c: Adjust testcase for actual builtin names rather than names of builtin replacements.
…ata section [PR113617] If default_elf_select_rtx_section is called to put a reference to some local symbol defined in a comdat section into memory, which happens more often since the r14-4944 RA change, linking might fail. default_elf_select_rtx_section puts such constants into .data.rel.ro.local etc. sections and if linker chooses comdat sections from some other TU and discards the one to which a relocation in .data.rel.ro.local remains, linker diagnoses error. References to private comdat symbols can only appear from functions or data objects in the same comdat group, so the following patch arranges using .data.rel.ro.local.pool.<comdat_name> and similar sections. 2024-02-26 Jakub Jelinek <[email protected]> H.J. Lu <[email protected]> PR rtl-optimization/113617 * varasm.cc (default_elf_select_rtx_section): For references to private symbols in comdat sections use .data.relro.local.pool.<comdat>, .data.relro.pool.<comdat> or .rodata.<comdat> comdat sections. * g++.dg/other/pr113617.C: New test. * g++.dg/other/pr113617.h: New test. * g++.dg/other/pr113617-aux.cc: New test.
…R114012] PR fortran/114012 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_conv_procedure_call): Evaluate non-trivial arguments just once before assigning to an unlimited polymorphic dummy variable. gcc/testsuite/ChangeLog: * gfortran.dg/pr114012.f90: New test.
gcc/ * config/avr/avr.cc (avr_out_compare) [AVR_TINY]: Remove code in an "if avr_adiw_reg_p()" block that's dead for AVR_TINY.
Some options that are pure optimizations were not tagged as such. gcc/ * config/avr/avr.opt (mcall-prologues, mrelax, maccumulate-args) (mstrict-X): Tag as "Optimization".
gcc.dg/attr-weakref-1.c FAILs on 32 and 64-bit Solaris/x86 with the native assembler: FAIL: gcc.dg/attr-weakref-1.c (test for excess errors) UNRESOLVED: gcc.dg/attr-weakref-1.c compilation failed to produce executable Excess errors: Assembler: attr-weakref-1.c "/var/tmp//ccUSaysF.s", line 171 : Multiply defined symbol: "Wv3a" This is a bug in the native as, which isn't seeing fixes recently. Since only a single subtest is affected, this patch omits that one. Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu. 2024-02-24 Rainer Orth <[email protected]> gcc/testsuite: PR ipa/70582 * gcc.dg/attr-weakref-1.c (dg-additional-options): Define SOLARIS_X86_AS as appropriate. (lv3, Wv3a, pv3a): Wrap in !SOLARIS_X86_AS. (main): Likewise for chk (pv3a).
The following implements manual update for multi-exit loop prologue peeling during vectorization. PR tree-optimization/114081 * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Perform manual dominator update for prologue peeling. (vect_do_peeling): Properly update dominators after adding the prologue-around guard. * gcc.dg/vect/vect-early-break_121-pr114081.c: New testcase.
…[PR114044] While it seems a lot of places in various optimization passes fold bit query internal functions with INTEGER_CST arguments to INTEGER_CST when there is a lhs, when lhs is missing, all the removals of such dead stmts are guarded with -ftree-dce, so with -fno-tree-dce those unfolded ifn calls remain in the IL until expansion. If they have large/huge BITINT_TYPE arguments, there is no BLKmode optab and so expansion ICEs, and bitint lowering doesn't touch such calls because it doesn't know they need touching, functions only containing those will not even be further processed by the pass because there are no non-small BITINT_TYPE SSA_NAMEs + the 2 exceptions (stores of BITINT_TYPE INTEGER_CSTs and conversions from BITINT_TYPE INTEGER_CSTs to floating point SSA_NAMEs) and when walking there is no special case for calls with BITINT_TYPE INTEGER_CSTs either, those are for normal calls normally handled at expansion time. So, the following patch adjusts the expansion of these 6 ifns by doing nothing if there is no lhs, and also, just in case the user disabled all possible passes that would fold this, handles the case of setting lhs to an ifn call with an INTEGER_CST argument. 2024-02-27 Jakub Jelinek <[email protected]> PR rtl-optimization/114044 * internal-fn.def (CLRSB, CLZ, CTZ, FFS, PARITY): Use DEF_INTERNAL_INT_EXT_FN macro rather than DEF_INTERNAL_INT_FN. * internal-fn.h (expand_CLRSB, expand_CLZ, expand_CTZ, expand_FFS, expand_PARITY): Declare. * internal-fn.cc (expand_bitquery, expand_CLRSB, expand_CLZ, expand_CTZ, expand_FFS, expand_PARITY): New functions. (expand_POPCOUNT): Use expand_bitquery. * gcc.dg/bitint-95.c: New test.
When folding a multiply CHRECs are handled like {a, +, b} * c is {a*c, +, b*c} but that isn't generally correct when overflow invokes undefined behavior. The following uses unsigned arithmetic unless either a is zero or a and b have the same sign. I've used simple early outs for INTEGER_CSTs and otherwise use a range-query since we lack a tree_expr_nonpositive_p and get_range_pos_neg isn't a good fit. PR tree-optimization/114074 * tree-chrec.h (chrec_convert_rhs): Default at_stmt arg to NULL. * tree-chrec.cc (chrec_fold_multiply): Canonicalize inputs. Handle poly vs. non-poly multiplication correctly with respect to undefined behavior on overflow. * gcc.dg/torture/pr114074.c: New testcase. * gcc.dg/pr68317.c: Adjust expected location of diagnostic. * gcc.dg/vect/vect-early-break_119-pr114068.c: Do not expect loop to be vectorized.
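The overflow concern can be made concrete with a small sketch. It assumes a CHREC {a, +, b} stands for the value a + b*i in iteration i; the struct and helper names are hypothetical and are not GCC's. Both forms are evaluated in unsigned arithmetic, where wrap-around is defined, so they always agree. In the signed type the folded step b*c can overflow even though every value the loop actually computes fits, for instance with a = -1000000000, b = 1500000000, c = 2 and iterations 0 and 1. The last helper is one conservative reading of the rule stated in the message (it treats b = 0 as "not the same sign").

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a CHREC {base, +, step}: its value in
   iteration i is base + step * i.  Not GCC's data structure.  */
struct chrec { int32_t base; int32_t step; };

/* Evaluate ({a, +, b} * c) at iteration i in unsigned arithmetic,
   where wrap-around is well defined.  */
static uint32_t eval_then_multiply (struct chrec ch, int32_t c, uint32_t i)
{
  uint32_t v = (uint32_t) ch.base + (uint32_t) ch.step * i;
  return v * (uint32_t) c;
}

/* Evaluate the folded form {a*c, +, b*c} at iteration i, again in
   unsigned arithmetic.  */
static uint32_t eval_folded (struct chrec ch, int32_t c, uint32_t i)
{
  uint32_t base = (uint32_t) ch.base * (uint32_t) c;
  uint32_t step = (uint32_t) ch.step * (uint32_t) c;
  return base + step * i;
}

/* Assumed reading of the commit message: the fold may stay in the
   signed type only if a is zero or a and b have the same sign.  */
static bool can_fold_signed (int32_t a, int32_t b)
{
  return a == 0 || (a > 0 && b > 0) || (a < 0 && b < 0);
}
```

For the example above, `eval_folded` and `eval_then_multiply` both yield 1000000000 at i = 1, while `can_fold_signed (-1000000000, 1500000000)` is false, so the signed fold would be rejected.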
GCC 13's changes file documents that iwmmx is deprecated. Raise the bar by warning when the mmintrin.h header is included by users, but provide a way to suppress the warning. gcc: * config/arm/mmintrin.h: Warn if this header is included without defining __ENABLE_DEPRECATED_IWMMXT.
gcc/analyzer/ChangeLog: PR analyzer/111881 * constraint-manager.cc (bound::ensure_closed): Assert that m_constant has integral type. (range::add_bound): Bail out on floating point constants. gcc/testsuite/ChangeLog: PR analyzer/111881 * c-c++-common/analyzer/conditionals-pr111881.c: New test. Signed-off-by: David Malcolm <[email protected]>
…y_{to,from}_device*} These routines map simply to the C counterpart and are meanwhile defined in OpenACC 3.3. (There are additional routine changes, including the Fortran addition of acc_attach/acc_detach, that require more work than a simple addition of an interface and are therefore excluded.) libgomp/ChangeLog: * libgomp.texi (OpenACC Runtime Library Routines): Document new 3.3 routines that simply map to their C counterpart. * openacc.f90 (openacc): Add them. * openacc_lib.h: Likewise. * testsuite/libgomp.oacc-fortran/acc_host_device_ptr.f90: New test. * testsuite/libgomp.oacc-fortran/acc-memcpy.f90: New test. * testsuite/libgomp.oacc-fortran/acc-memcpy-2.f90: New test. * testsuite/libgomp.oacc-c-c++-common/lib-59.c: Crossref to f90 test. * testsuite/libgomp.oacc-c-c++-common/lib-60.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.
This is a regression present on the mainline, 13 and 12 branches. For the attached Ada case, it's a tree checking failure on the mainline at -O:

+===========================GNAT BUG DETECTED==============================+
| 14.0.1 20240226 (experimental) [master r14-9171-g4972f97a265] GCC error:|
| tree check: expected tree that contains 'decl common' structure, |
| have 'component_ref' in tree_could_trap_p, at tree-eh.cc:2733 |
| Error detected around /home/eric/cvs/gcc/gcc/testsuite/gnat.dg/opt104.adb:

Time is a 10-byte record and Packed_Rec.T is placed at bit-offset 65 because of the packing, so tree-ssa-dse.cc:setup_live_bytes_from_ref has computed a const_size of 88 from ref->offset of 65 and ref->max_size of 80.

Then in tree-ssa-dse.cc:compute_trims:

411 int last_live = bitmap_last_set_bit (live);
(gdb) next
412 if (ref->size.is_constant (&const_size))
(gdb)
414 int last_orig = (const_size / BITS_PER_UNIT) - 1;
(gdb)
418 *trim_tail = last_orig - last_live;
(gdb) call debug_bitmap (live)
n_bits = 256, set = {0 1 2 3 4 5 6 7 8 9 10 }
(gdb) p last_live
$33 = 10
(gdb) p const_size
$34 = 80
(gdb) p last_orig
$35 = 9
(gdb) p *trim_tail
$36 = -1

In other words, compute_trims is overlooking the alignment adjustments that setup_live_bytes_from_ref applied earlier. Moreover it reads:

/* We use sbitmaps biased such that ref->offset is bit zero and the bitmap extends through ref->size. So we know that in the original bitmap bits 0..ref->size were true. We don't actually need the bitmap, just the REF to compute the trims. */

but setup_live_bytes_from_ref used ref->max_size instead of ref->size. It appears that all the callers of compute_trims assume that ref->offset is byte aligned and that the trimmed bytes are relative to ref->size, so the patch simply adds an early return if either condition is not fulfilled.

gcc/ * tree-ssa-dse.cc (compute_trims): Fix description. Return early if either ref->offset is not byte aligned or ref->size is not known to be equal to ref->max_size. (maybe_trim_complex_store): Fix description. (maybe_trim_constructor_store): Likewise. (maybe_trim_partially_dead_store): Likewise. gcc/testsuite/ * gnat.dg/opt104.ads, gnat.dg/opt104.adb: New test.
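The arithmetic in the gdb session can be replayed with a short sketch. The helper names are hypothetical and the code is a simplified stand-in, not the code in tree-ssa-dse.cc. It assumes the 88 bits come from widening the bit range [65, 65 + 80) to byte boundaries, which matches the numbers quoted in the message.

```c
#include <assert.h>

#define BITS_PER_UNIT 8

/* Number of bits covered once [offset, offset + max_size) is widened
   to byte boundaries; an assumed model of the alignment adjustment the
   message attributes to setup_live_bytes_from_ref.  */
static long aligned_live_bits (long offset, long max_size)
{
  assert (offset >= 0 && max_size >= 0);
  long lo = offset / BITS_PER_UNIT * BITS_PER_UNIT;
  long hi = (offset + max_size + BITS_PER_UNIT - 1)
	    / BITS_PER_UNIT * BITS_PER_UNIT;
  return hi - lo;
}

/* Tail trim as computed in the gdb session: index of the last byte of
   the original reference minus index of the last live byte.  */
static long tail_trim (long ref_size_bits, long last_live_byte)
{
  long last_orig = ref_size_bits / BITS_PER_UNIT - 1;
  return last_orig - last_live_byte;
}

/* Simplified form of the early-return condition the patch adds: trims
   are only computed for byte-aligned references whose size equals
   their maximum size.  */
static int trims_are_meaningful (long offset, long size, long max_size)
{
  return offset % BITS_PER_UNIT == 0 && size == max_size;
}
```

With offset 65 and max_size 80 the live bitmap spans 88 bits, that is bytes 0 through 10, while `tail_trim (80, 10)` gives -1, the nonsensical value seen in the session. `trims_are_meaningful (65, 80, 80)` is false, so the sketch bails out for this reference.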
Also handle V2BF mode. PR target/113871 gcc/ChangeLog: * config/i386/mmx.md (V248FI): Add V2BF mode. (V24FI_32): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr113871-5a.c: New test. * gcc.target/i386/pr113871-5b.c: New test.
…3,PR111802] On e.g. gcc211 the use of "%li" with unsigned HOST_WIDE_INT led to this warning: ../../src/gcc/analyzer/access-diagram.cc: In member function ‘void ana::string_literal_spatial_item::add_column_for_byte(text_art::table&, const ana::bit_to_table_map&, text_art::style_manager&, ana::byte_offset_t, ana::byte_offset_t, int, int) const’: ../../src/gcc/analyzer/access-diagram.cc:1909:40: warning: format ‘%li’ expects argument of type ‘long int’, but argument 3 has type ‘long long unsigned int’ [-Wformat=] byte_idx_within_string.ulow ())); ^ and to all values being erroneously printed as "0". Fixed thusly. gcc/analyzer/ChangeLog: PR analyzer/110483 PR analyzer/111802 * access-diagram.cc (string_literal_spatial_item::add_column_for_byte): Use %wu for printing unsigned HOST_WIDE_INT. Signed-off-by: David Malcolm <[email protected]>
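%wu is a format code of GCC's own diagnostic printers, so it cannot be shown in a standalone program. The same class of mismatch can be sketched in portable C, assuming uint64_t as a stand-in for unsigned HOST_WIDE_INT and a hypothetical helper name: the conversion is taken from <inttypes.h> so that it always matches the argument type, whatever the width of long on the host.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a byte index into buf with a conversion that matches the
   argument's type.  uint64_t is an assumed stand-in for unsigned
   HOST_WIDE_INT; the function name is illustrative only.  */
static int format_byte_index (char *buf, size_t len, uint64_t byte_idx)
{
  return snprintf (buf, len, "%" PRIu64, byte_idx);
}
```

Calling `format_byte_index (buf, sizeof buf, 4294967298)` writes the ten digits of the value into `buf`, including on hosts where long is only 32 bits wide.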
This is a (partial) reversion of r14-8987-gdd9d14f7d53 to return to eagerly emitting inline variables to the middle-end when they are declared. 'import_export_decl' will still continue to accept them, as allowing this is a pure extension and doesn't seem to cause issues with modules, but otherwise deferring the emission of inline variables appears to cause issues on some targets and prevents some code using inline variable templates from correctly linking. There might be a more targeted way to support this, but due to the complexity of handling linkage and emission I'd prefer to wait till GCC 15 to explore our options. PR c++/113970 PR c++/114013 gcc/cp/ChangeLog: * decl.cc (make_rtl_for_nonlocal_decl): Don't defer inline variables. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/inline-var10.C: New test. Signed-off-by: Nathaniel Shead <[email protected]>
… large struct I think we have no coverage for the case where structure_value_addr_parm and TYPE_NO_NAMED_ARGS_STDARG_P are both true. The if (type_arg_types != 0) n_named_args = (list_length (type_arg_types) /* Count the struct value address, if it is passed as a parm. */ + structure_value_addr_parm); else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)) n_named_args = 0; else /* If we know nothing, treat all args as named. */ n_named_args = num_actuals; code should probably have n_named_args = structure_value_addr_parm; instead of n_named_args = 0;, this testcase is an attempt to see if it is broken on any target. 2024-02-28 Jakub Jelinek <[email protected]> * gcc.dg/c23-stdarg-6.c: New test.
…ge integral types in memcpy etc. folding [PR113988] The following patch changes the memcpy etc. folding to use bitwise vector types rather than huge INTEGER_TYPEs for copying of > MAX_FIXED_MODE_SIZE lengths. The problem with the huge INTEGER_TYPEs is that they aren't supported very much, usually there are just optabs to handle moves of them, perhaps misaligned moves and that is it, so they pose problems e.g. to BITINT_TYPE lowering. 2024-02-28 Jakub Jelinek <[email protected]> PR tree-optimization/113988 * stor-layout.h (bitwise_mode_for_size): Declare. * stor-layout.cc (bitwise_mode_for_size): New function. * gimple-fold.cc (gimple_fold_builtin_memory_op): Use it. Use bitwise_type_for_mode instead of build_nonstandard_integer_type. Use BITS_PER_UNIT instead of 8. * gcc.dg/bitint-91.c: New test.
The following testcases are miscompiled, because graphite ignores boolean, enumerated or _BitInt comparisons and rewrites the code as if the comparisons were always true or always false. The INTEGER_TYPE checks were initially added in r6-2239 but at that point it was both in add_conditions_to_domain and in parameter_index_in_region. Later on the check was also added to stmt_simple_for_scop_p, and finally r8-3931 changed the stmt_simple_for_scop_p check to INTEGRAL_TYPE_P and turned the parameter_index_in_region -> assign_parameter_index_in_region into INTEGRAL_TYPE_P assertion, but the add_conditions_to_domain check for INTEGER_TYPE remained. The following patch uses INTEGRAL_TYPE_P to complete the change. 2024-02-28 Jakub Jelinek <[email protected]> PR tree-optimization/114041 * graphite-sese-to-poly.cc (add_conditions_to_domain): Check for INTEGRAL_TYPE_P rather than INTEGER_TYPE. * gcc.dg/graphite/run-id-pr114041-1.c: New test. * gcc.dg/graphite/run-id-pr114041-2.c: New test.
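The difference between the two checks can be sketched with a toy model. The enum below is an illustrative subset of type codes and the predicate names are hypothetical; only the set of codes accepted by the second predicate is meant to mirror INTEGRAL_TYPE_P.

```c
#include <stdbool.h>

/* Toy subset of type codes; illustrative only, not GCC's tree codes.  */
enum type_code
{
  INTEGER_TYPE,
  ENUMERAL_TYPE,
  BOOLEAN_TYPE,
  BITINT_TYPE,
  REAL_TYPE
};

/* The old check: only plain integer types qualify, so conditions on
   boolean, enumerated or _BitInt operands are skipped.  */
static bool is_integer_type (enum type_code code)
{
  return code == INTEGER_TYPE;
}

/* Model of INTEGRAL_TYPE_P: every integer-like type qualifies.  */
static bool is_integral_type (enum type_code code)
{
  return code == INTEGER_TYPE || code == ENUMERAL_TYPE
	 || code == BOOLEAN_TYPE || code == BITINT_TYPE;
}
```

A comparison whose operands have `BOOLEAN_TYPE` fails the first predicate and passes the second, which is the case the message describes as being silently ignored before the patch.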
The emulation via word mode tries to perform integer arithmetic on floating point values instead of floating point arithmetic. This leads to mis-compilations. Failure occurred on s390x on these existing test cases: gcc.dg/vect/tsvc/vect-tsvc-s112.c gcc.dg/vect/tsvc/vect-tsvc-s113.c gcc.dg/vect/tsvc/vect-tsvc-s119.c gcc.dg/vect/tsvc/vect-tsvc-s121.c gcc.dg/vect/tsvc/vect-tsvc-s131.c gcc.dg/vect/tsvc/vect-tsvc-s132.c gcc.dg/vect/tsvc/vect-tsvc-s2233.c gcc.dg/vect/tsvc/vect-tsvc-s421.c gcc.dg/vect/vect-alias-check-14.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c gcc/ChangeLog: PR tree-optimization/114075 * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating point vectors. Signed-off-by: Juergen Christ <[email protected]>
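Why integer arithmetic on floating point bit patterns goes wrong can be seen in a few lines. This is only an illustration of the principle, not the vectorizer's word-mode emulation. It assumes float is a 32-bit IEEE 754 binary32 value, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's object representation as a 32-bit integer.
   Assumes sizeof (float) == 4.  */
static uint32_t float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Reinterpret 32 bits as a float.  */
static float bits_float (uint32_t u)
{
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* What an integer-based emulation of "a + b" would produce: the sum of
   the two bit patterns, read back as a float.  */
static float add_as_integers (float a, float b)
{
  return bits_float (float_bits (a) + float_bits (b));
}
```

Under the stated assumption 1.0f has the bit pattern 0x3f800000, so `add_as_integers (1.0f, 1.0f)` yields the pattern 0x7f000000, which is 2 to the power 127 rather than 2.0f.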
This adds testcase from PR114075 which has been fixed by the r14-9205 change on s390x-linux with -march=z13. 2024-02-28 Jakub Jelinek <[email protected]> PR tree-optimization/114075 * gcc.dg/gomp/pr114075.c: New test.
…4 [PR91567] gcc.dg/tree-ssa/builtin-snprintf-6.c currently XPASSes on i?86-*-* configurations with -m64: XPASS: gcc.dg/tree-ssa/builtin-snprintf-6.c scan-tree-dump-times optimized "Function test_assign_aggregate" 1 (seen e.g. on i386-pc-solaris2.11, i686-pc-linux-gnu, or i386-apple-darwin*). The problem is that the xfail only handles x86_64, ignoring that i?86 configurations can also be multilibbed. This patch fixes this by handling both forms alike. Tested on i386-pc-solaris2.11, amd64-pc-solaris2.11, sparc-sun-solaris2.11, and sparcv9-sun-solaris2.11. 2024-02-28 Rainer Orth <[email protected]> gcc/testsuite: PR tree-optimization/91567 * gcc.dg/tree-ssa/builtin-snprintf-6.c (scan-tree-dump-times): Treat i?86-*-* like x86_64-*-*.
powerpc64-linux apparently (not very surprisingly) behaves the same way as powerpc64le-linux and has 4 sunk statements rather than 5, so we should xfail it on powerpc64*-*-* rather than just powerpc64le-*-*. powerpc-linux has 3 sunk statements, but the scan pattern is done for lp64 only as the comment explains. 2024-02-28 Jakub Jelinek <[email protected]> PR testsuite/111462 * gcc.dg/tree-ssa/ssa-sink-18.c: XFAIL also on powerpc64.
libstdc++-v3/ChangeLog: * include/std/stacktrace: Add nodiscard attribute to all functions without side effects.
libstdc++-v3/ChangeLog: * include/bits/alloc_traits.h: Include <bits/stl_iterator.h> for __make_move_if_noexcept_iterator.
Cygwin should use std::fwrite, not WriteConsoleW. And the -lstdc++exp library is only needed when running the tests on *-*-mingw*. libstdc++-v3/ChangeLog: * include/std/ostream (vprint_unicode) [__CYGWIN__]: Use POSIX code path for Cygwin instead of Windows. * include/std/print (vprint_unicode) [__CYGWIN__]: Likewise. * testsuite/27_io/basic_ostream/print/1.cc: Only add -lstdc++exp for *-*-mingw* targets. * testsuite/27_io/print/1.cc: Likewise.
When parsing a std::chrono::sys_days (or a sys_time with an even longer period) we should not require a time-of-day to be present in the input, because we can't represent that in the result type anyway. Rather than trying to decide which specializations should require a time-of-day and which should not, follow the direction of Howard Hinnant's date library, which allows extracting a sys_time of any period from input that only contains a date, defaulting the time-of-day part to 00:00:00. This seems consistent with the intent of the standard, which says it's an error "If the parse fails to decode a valid date" (i.e., it doesn't care about decoding a valid time, only a date). libstdc++-v3/ChangeLog: PR libstdc++/114240 * include/bits/chrono_io.h (_Parser::operator()): Assume hours(0) for a time_point, so that a time is not required to be present. * testsuite/std/time/parse/114240.cc: New test.
…mic_compare_and_swapsi. If the hardware does not support LAMCAS, atomic_compare_and_swapsi needs to be implemented through "ll.w+sc.w". In the implementation of the instruction sequence, it is necessary to determine whether the two registers are equal. Since LoongArch's comparison instructions do not distinguish between 32-bit and 64-bit, the two operand registers that need to be compared must be sign-extended. One of the operand registers is obtained from memory through the "ll.w" instruction, which ensures that the sign extension is carried out. However, the value of the other operand register is not guaranteed to be sign-extended. gcc/ChangeLog: * config/loongarch/sync.md (atomic_cas_value_strong<mode>): In loongarch64, a sign extension operation is added when operands[2] is a register operand and the mode is SImode. gcc/testsuite/ChangeLog: * g++.target/loongarch/atomic-cas-int.C: New test.
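The failure mode can be modelled in portable C by treating 64-bit integers as register contents. This is a sketch of the principle only; the helper names are hypothetical and no LoongArch instruction is actually executed.

```c
#include <stdbool.h>
#include <stdint.h>

/* 64-bit register contents after loading a 32-bit value with sign
   extension, which the message says "ll.w" guarantees.  */
static uint64_t sign_extend_32 (uint32_t v)
{
  uint64_t r = v;
  if (v & UINT32_C (0x80000000))
    r |= UINT64_C (0xffffffff00000000);
  return r;
}

/* A full-width register comparison, modelling a compare instruction
   that does not distinguish between 32-bit and 64-bit operands.  */
static bool regs_equal (uint64_t a, uint64_t b)
{
  return a == b;
}
```

For the 32-bit value 0x80000000, the sign-extended register holds 0xffffffff80000000 while a register that was merely zero-extended holds 0x0000000080000000. `regs_equal` then reports a mismatch although the low 32 bits agree, which is why the expected value has to be sign-extended as well.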
When the value of the macro DEFAULT_CFLAGS is set to '-ansi -pedantic-errors', the regname-fp-s9.c test fails. To solve this problem, add the compilation option '-Wno-pedantic -std=gnu90' to this test case. gcc/testsuite/ChangeLog: * gcc.target/loongarch/regname-fp-s9.c: Add compilation option '-Wno-pedantic -std=gnu90'.
When I added the -mnoreturn-no-callee-saved-registers option to i386.opt, I forgot to regenerate i386.opt.urls, and Mark's CI kindly reminded me of that. Fixed thusly. 2024-03-09 Jakub Jelinek <[email protected]> * config/i386/i386.opt.urls: Regenerate.
gcc/ * config/avr/avr.cc (avr_rtx_costs_1) [PLUS]: Determine cost for usum_widenqihi and add_zero_extend1. [MINUS]: Determine costs for udiff_widenqihi, sub+zero_extend, sub+sign_extend. * config/avr/avr.md (*addhi3.sign_extend1, *subhi3.sign_extend2): Compute exact insn lengths. (*usum_widenqihi3): Allow input operands to commute.
…to allow the IE to LE linker relaxation In Binutils we need to make IE to LE relaxation only allowed when there is an R_LARCH_RELAX after R_LARCH_TLS_IE_PC_{HI20,LO12} so an invalid "partial" relaxation won't happen with the extreme code model. So if we are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an R_LARCH_RELAX to allow the relaxation. The IE to LE relaxation does not require the pcalau12i and the ld instruction to be adjacent, so we don't need to limit ourselves to use the macro. For the distro maintainers backporting changes: this change depends on r14-8721, without r14-8721 R_LARCH_RELAX can be emitted mistakenly in the extreme code model. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_print_operand_reloc): Support 'Q' for R_LARCH_RELAX for TLS IE. (loongarch_output_move): Use 'Q' to print R_LARCH_RELAX for TLS IE. * config/loongarch/loongarch.md (ld_from_got<mode>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/loongarch/tls-ie-relax.c: New test. * gcc.target/loongarch/tls-ie-norelax.c: New test. * gcc.target/loongarch/tls-ie-extreme.c: New test.
… MEMs [PR114284] Before the recent PR111267 r14-8319 fwprop changes, fwprop would never try to propagate what was not considered PROFITABLE, where the profitable part actually was partly about profitability, partly about very good reasons not to actually propagate and partly for cases where propagation is completely incorrect. In particular, classify_result has: /* Allow (subreg (mem)) -> (mem) simplifications with the following exceptions: 1) Propagating (mem)s into multiple uses is not profitable. 2) Propagating (mem)s across EBBs may not be profitable if the source EBB runs less frequently. 3) Propagating (mem)s into paradoxical (subreg)s is not profitable. 4) Creating new (mem/v)s is not correct, since DCE will not remove the old ones. */ if (single_use_p && single_ebb_p && SUBREG_P (old_rtx) && !paradoxical_subreg_p (old_rtx) && MEM_P (new_rtx) && !MEM_VOLATILE_P (new_rtx)) return PROFITABLE; and didn't mark any other MEM_P (new_rtx) or rtxes which contain a MEM in its subrtxes as PROFITABLE. Now, since r14-8319 the profitable_p method has been renamed to likely_profitable_p and has just a minor role. Now, rule 4) above is something that isn't about profitability, but about correct behavior, if you propagate mem/v, the code is miscompiled. This particular case has been fixed elsewhere by Haochen in r14-9379. But I think even the 1) and 2) and maybe 3) are a strong don't do it, don't rely solely on rtx costs, increasing the number of loads of the same memory, even when cached, is undesirable, canceling load hoisting can be undesirable as well. So, the following patch restores the previous behavior if src contains any MEMs, in that case likely_profitable_p () is taken as the old profitable_p () as a requirement rather than just a hint. For propagation of something which doesn't load from memory this keeps the r14-8319 behavior.
2024-03-09 Jakub Jelinek <[email protected]> PR target/114284 * fwprop.cc (try_fwprop_subst_pattern): Don't propagate src containing MEMs unless prop.likely_profitable_p ().
gcc/ * config/avr/avr.md: Fix typos in comment, indentation glitches and some other nits.
tschwinge force-pushed the tschwinge/merge-upstream branch from 90bb4d8 to 013b520 on April 10, 2024 13:06
Merging in several stages, towards #2802 and further.
This must of course not be rebased by GitHub merge queue, but has to become a proper Git merge. (I'll handle that, once ready.)