Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enum size depends on ordering of members #117238

Open
tromey opened this issue Oct 26, 2023 · 9 comments · May be fixed by #131684
Open

Enum size depends on ordering of members #117238

tromey opened this issue Oct 26, 2023 · 9 comments · May be fixed by #131684
Assignees
Labels
A-layout Area: Memory layout of types C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tromey
Copy link
Contributor

tromey commented Oct 26, 2023

I tried this code:

enum E {
    E1,
    E2,
    E3,
    E4,
    E5,
    E6,
    E7(bool),
    E8,
    E9,
    E10,
    E11,
    E12,
    E13,
    E14,
    E15,
    E16,
    E17,
    E18,
    E19,
    E20,
    E21,
    E22,
    E23,
    E24,
    E25,
    E26,
    E27,
    E28,
    E29,
    E30,
    E31,
    E32,
    E33,
    E34,
    E35,
    E36,
    E37,
    E38,
    E39,
    E40,
    E41,
    E42,
    E43,
    E44,
    E45,
    E46,
    E47,
    E48,
    E49,
    E50,
    E51,
    E52,
    E53,
    E54,
    E55,
    E56,
    E57,
    E58,
    E59,
    E60,
    E61,
    E62,
    E63,
    E64,
    E65,
    E66,
    E67,
    E68,
    E69,
    E70,
    E71,
    E72,
    E73,
    E74,
    E75,
    E76,
    E77,
    E78,
    E79,
    E80,
    E81,
    E82,
    E83,
    E84,
    E85,
    E86,
    E87,
    E88,
    E89,
    E90,
    E91,
    E92,
    E93,
    E94,
    E95,
    E96,
    E97,
    E98,
    E99,
    E100,
    E101,
    E102,
    E103,
    E104,
    E105,
    E106,
    E107,
    E108,
    E109,
    E110,
    E111,
    E112,
    E113,
    E114,
    E115,
    E116,
    E117,
    E118,
    E119,
    E120,
    E121,
    E122,
    E123,
    E124,
    E125,
    E126,
    E127,
    E128,
    E129,
    E130,
    E131,
    E132,
    E133,
    E134,
    E135,
    E136,
    E137,
    E138,
    E139,
    E140,
    E141,
    E142,
    E143,
    E144,
    E145,
    E146,
    E147,
    E148,
    E149,
    E150,
    E151,
    E152,
    E153,
    E154,
    E155,
    E156,
    E157,
    E158,
    E159,
    E160,
    E161,
    E162,
    E163,
    E164,
    E165,
    E166,
    E167,
    E168,
    E169,
    E170,
    E171,
    E172,
    E173,
    E174,
    E175,
    E176,
    E177,
    E178,
    E179,
    E180,
    E181,
    E182,
    E183,
    E184,
    E185,
    E186,
    E187,
    E188,
    E189,
    E190,
    E191,
    E192,
    E193,
    E194,
    E195,
    E196,
    E197,
    E198,
    E199,
    E200,
    E201,
    E202,
    E203,
    E204,
    E205,
    E206,
    E207,
    E208,
    E209,
    E210,
    E211,
    E212,
    E213,
    E214,
    E215,
    E216,
    E217,
    E218,
    E219,
    E220,
    E221,
    E222,
    E223,
    E224,
    E225,
    E226,
    E227,
    E228,
    E229,
    E230,
    E231,
    E232,
    E233,
    E234,
    E235,
    E236,
    E237,
    E238,
    E239,
    E240,
    E241,
    E242,
    E243,
    E244,
    E245,
    E246,
    E247,
    E248,
    E249,
    E250,
    E251,
    E252,
    E253,
    E254,
    E255,
}

enum F {
    F1,
    F2,
    F3,
    F4,
    F5,
    F6,
    F7,
    F8,
    F9,
    F10,
    F11,
    F12,
    F13,
    F14,
    F15,
    F16,
    F17,
    F18,
    F19,
    F20,
    F21,
    F22,
    F23,
    F24,
    F25,
    F26,
    F27,
    F28,
    F29,
    F30,
    F31,
    F32,
    F33,
    F34,
    F35,
    F36,
    F37,
    F38,
    F39,
    F40,
    F41,
    F42,
    F43,
    F44,
    F45,
    F46,
    F47,
    F48,
    F49,
    F50,
    F51,
    F52,
    F53,
    F54,
    F55,
    F56,
    F57,
    F58,
    F59,
    F60,
    F61,
    F62,
    F63,
    F64,
    F65,
    F66,
    F67,
    F68,
    F69,
    F70,
    F71,
    F72,
    F73,
    F74,
    F75,
    F76,
    F77,
    F78,
    F79,
    F80,
    F81,
    F82,
    F83,
    F84,
    F85,
    F86,
    F87,
    F88,
    F89,
    F90,
    F91,
    F92,
    F93,
    F94,
    F95,
    F96,
    F97,
    F98,
    F99,
    F100,
    F101,
    F102,
    F103,
    F104,
    F105,
    F106,
    F107,
    F108,
    F109,
    F110,
    F111,
    F112,
    F113,
    F114,
    F115,
    F116,
    F117,
    F118,
    F119,
    F120,
    F121,
    F122,
    F123,
    F124,
    F125,
    F126,
    F127,
    F128,
    F129,
    F130,
    F131,
    F132,
    F133,
    F134,
    F135,
    F136,
    F137,
    F138,
    F139,
    F140,
    F141,
    F142,
    F143,
    F144,
    F145,
    F146,
    F147,
    F148,
    F149,
    F150,
    F151,
    F152,
    F153,
    F154,
    F155,
    F156,
    F157,
    F158,
    F159,
    F160,
    F161,
    F162,
    F163,
    F164,
    F165,
    F166,
    F167,
    F168,
    F169,
    F170,
    F171,
    F172,
    F173,
    F174,
    F175,
    F176,
    F177,
    F178,
    F179,
    F180,
    F181,
    F182,
    F183,
    F184,
    F185,
    F186,
    F187,
    F188,
    F189,
    F190,
    F191,
    F192,
    F193,
    F194,
    F195,
    F196,
    F197,
    F198,
    F199,
    F200,
    F201,
    F202,
    F203,
    F204,
    F205,
    F206,
    F207,
    F208,
    F209,
    F210,
    F211,
    F212,
    F213,
    F214,
    F215,
    F216,
    F217,
    F218,
    F219,
    F220,
    F221,
    F222,
    F223,
    F224,
    F225,
    F226,
    F227,
    F228,
    F229,
    F230,
    F231,
    F232,
    F233,
    F234,
    F235,
    F236,
    F237,
    F238,
    F239,
    F240,
    F241,
    F242,
    F243,
    F244,
    F245,
    F246,
    F247,
    F248,
    F249,
    F250,
    F251,
    F252,
    F253,
    F254,
    F255(bool),
}

fn main () {
    let ae = E::E1;
    let be = E::E7(true);
    let ce = E::E8;
    let af = F::F1;
    let bf = F::F255(true);
    let cf = F::F8;
}

Then I ran this in gdb and examined the size of the enums:

(gdb) p sizeof(ae)
$1 = 2
(gdb) p sizeof(af)
$2 = 1

I expected these to have the same size. The two enums differ only in the placement of the (bool)-carrying member.

Instead, the two enums have different sizes.

rustc --version --verbose:

prentzel. rustc --version --verbose
rustc 1.73.0 (cc66ad468 2023-10-03)
binary: rustc
commit-hash: cc66ad468955717ab92600c770da8c1601a4ff33
commit-date: 2023-10-03
host: x86_64-unknown-linux-gnu
release: 1.73.0
LLVM version: 17.0.2
@tromey tromey added the C-bug Category: This is a bug. label Oct 26, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 26, 2023
@Jules-Bertholet
Copy link
Contributor

@rustbot label A-layout

@rustbot rustbot added the A-layout Area: Memory layout of types label Oct 26, 2023
@the8472
Copy link
Member

the8472 commented Oct 27, 2023

I suspect an inclusive/exclusive range confusion or an off-by-one error here.
With 254 variants the niche optimization works on any variant. With 255 variants it only works on the last one.

@rustbot claim

@the8472 the8472 added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Oct 27, 2023
@the8472
Copy link
Member

the8472 commented Nov 1, 2023

@Veykril was the change in 2d73d30 meant to fix an observed bug or was it just based on the assumption that the count there was supposed to be equivalent to RangeInclusive::count?


Still trying to understand whether this is a limitation by the way we currently encode/decode niche values or an oversight in how the ranges are calculated.
To support a discriminant in the middle of the range it has to wrap around... but on the other hand if the nicheful variant is in the middle then we don't need to represent that value after wraparound.

/// Niche (values invalid for a type) encoding the discriminant:
/// Discriminant and variant index coincide.
/// The variant `untagged_variant` contains a niche at an arbitrary
/// offset (field `tag_field` of the enum), which for a variant with
/// discriminant `d` is set to
/// `(d - niche_variants.start).wrapping_add(niche_start)`.
///
/// For example, `Option<(usize, &T)>` is represented such that
/// `None` has a null pointer for the second tuple field, and
/// `Some` is the identity function (with a non-null reference).
Niche {
untagged_variant: VariantIdx,
niche_variants: RangeInclusive<VariantIdx>,
niche_start: u128,
},

let needs_disc =
|index: VariantIdx| index != largest_variant_index && !absent(&variants[index]);
let niche_variants = all_indices.clone().find(|v| needs_disc(*v)).unwrap()
..=all_indices.rev().find(|v| needs_disc(*v)).unwrap();
let count =
(niche_variants.end().index() as u128 - niche_variants.start().index() as u128) + 1;

Variants::Multiple {
tag_encoding:
TagEncoding::Niche { untagged_variant, ref niche_variants, niche_start },
tag_field,
..
} => {
if variant_index != untagged_variant {
let niche = self.project_field(bx, tag_field);
let niche_llty = bx.cx().immediate_backend_type(niche.layout);
let niche_value = variant_index.as_u32() - niche_variants.start().as_u32();
let niche_value = (niche_value as u128).wrapping_add(niche_start);
// FIXME(eddyb): check the actual primitive type here.
let niche_llval = if niche_value == 0 {
// HACK(eddyb): using `c_null` as it works on all types.
bx.cx().const_null(niche_llty)
} else {
bx.cx().const_uint_big(niche_llty, niche_value)
};
OperandValue::Immediate(niche_llval).store(bx, niche);
}
}

@Veykril
Copy link
Member

Veykril commented Nov 2, 2023

That line was supposed to be an inlining of RangeInclusive::size_hint because the Step implementation for the index type had to be cfg gated (which prevents the Iterator impl). In hindsight, I could've just turned the indices to u128 first and then consruct the range of those etc. So no, there was no change of behavior intended, apologies for that if that is the case.

@the8472
Copy link
Member

the8472 commented Nov 2, 2023

Ah, that was just a fix for another commit in the same PR. Nevermind then. I only saw the commit in isolation.

@the8472
Copy link
Member

the8472 commented Nov 2, 2023

Ok I think the issue here stems from how the variant index range of niche-encoded variants is calculated.

A bool occupies values 0 and 1, which leaves 2..=255 (254 values) as possible niches. In principle that's enough to represent the remaining variants. But to keep things simple codegen only generates a branch to check if it's the non-niche variant and otherwise shifts the range of tag values down (with a subtraction) to match the variant indices.
E0 is index 0, E255 is index 255. So it would need a value range of 2..=257 to represent that range because the value 6 will be unused for E6.

To do this properly we have to detect that the variant index range ends up being more compact if we rely on wraparound and have it start at E8 instead.

Should be feasible but this is more than fixing an off-by-one error. I'm not sure if I have time to get to this soon.

@rustbot release-assignment

@adwinwhite
Copy link
Contributor

@rustbot claim

@the8472
Copy link
Member

the8472 commented Oct 14, 2024

Since you're working on this: I have a commit for the "// FIXME(eddyb)" in a local branch. I can make a PR for that. I don't think it's actually relevant to this issue, but it should be one less thing to worry about.

@adwinwhite
Copy link
Contributor

That's great. I haven't looked into that part yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-layout Area: Memory layout of types C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants