RFC: Allow type inference for const or static #3546

Open
wants to merge 19 commits into
base: master

@Neo-Zhixing Neo-Zhixing commented Dec 21, 2023

Allow type inference for const or static when the RHS is known.

const PI = 3.1415; // inferred as f64
static MESSAGE = "Hello, World!"; // inferred as &'static str
const FN = std::string::String::default; // inferred as the unnameable, zero-sized function item type associated with this item. Its type is reported by `type_name_of_val` as ::std::string::String::default
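
For comparison, here is a minimal sketch of how the same three items have to be written in today's Rust, with every type spelled out. The `fn() -> String` annotation on the third item is an assumption about how one might choose to annotate it: it coerces the function item into a function pointer, which is not the type the proposal would infer.

```rust
// Today's Rust: every `const` and `static` item needs an explicit type.
pub const PI: f64 = 3.1415;
pub static MESSAGE: &str = "Hello, World!";

// The function item type cannot be written down, so the closest explicit
// annotation is a function pointer type, which forces a coercion.
pub const FN: fn() -> String = String::default;
```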

Pre-RFC discussion: #1349

cc #1623

Rendered

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 21, 2023
# Motivation
[motivation]: #motivation

Rust currently requires explicit type annotations for `const` and `static` items.
Member

It would be good to write a little bit about why Rust is like this currently.

Member

I agree, this should have a longer explanation of Rust's "rule" of "no inference in signatures", how this RFC is breaking it, and why this is okay.

Author

Great suggestion! I actually don't know why the design was made that way. Maybe someone from the Rust team can help explain?

The "type is missing in const" error was emitted from the parser so my guess would be that it was just difficult to infer types when consts were implemented.

Contributor

It's like this currently because it was decided that all public API points should be "obviously semver stable" rather than "quick to type".

Author

Thanks for explaining this. I've incorporated it into the RFC.

from the initial value. For example:

```rs
const PI = 3.1415; // inferred as f64
```
Member

I would personally prefer that the actual type of numeric or float types be known rather than picking arbitrary defaults (e.g. i32 or f64). I'm not too keen on it in a local context either tbh but there it's mitigated by the compiler being able to infer the real type from the surrounding code most of the time.

const PI = 3.1415; // error
const PI = 3.1415_f64; // inferred as f64

Author

I would prefer more consistent behavior with let bindings. So let's make a simple vote here:
🎉: const PI = 3.1415; // error
🚀: const PI = 3.1415; // inferred as f64

Member

It's not a big issue for me either way it's just that a few times I've only later realised a type has been unexpectedly made i32. But it's easily fixed and more of an annoyance than a problem per se. It doesn't help that an i32 is very rarely what I want.

Contributor

If it follows the normal rust inference default types then it's at least no worse than what let does.

Member

Well it's worse in the sense that let can use future code to infer the type, so mostly this is a non issue there unless there's a lot of generics involved.

Contributor

Honestly I would most like if the const was of {float} type but people aren't ready for that conversation maybe

@Rudxain Rudxain Jan 2, 2025

@Lokathor do you mean something like Go const? 🤔

```rs
const PI = 3.1415; // inferred as f64
static MESSAGE = "Hello, World!"; // inferred as &'static str
const FN_PTR = std::string::String::default; // inferred as fn() -> String
```
Member

technically the const would have the function item's type instead of a function pointer type.
https://doc.rust-lang.org/reference/types/function-item.html

Author

@Neo-Zhixing Neo-Zhixing Dec 22, 2023

This is actually a good point - I guess there's not much point in coercing the function item's type into a function pointer type for const items. But should we coerce for static items? That way you get to reassign the static items with functions of the same signature.

Contributor

I'm confused about why a person would want it as a static at all, so perhaps we shouldn't allow it at all in the first version of this.

@Lokathor maybe they just use the const as a quick switch between two cfg implementations?

Contributor

It makes sense as a const but little sense as a static

Member

When the type is a ZST, like function item types, const and static are essentially interchangeable at runtime. When the type is a function pointer, unless it's wrapped in something with interior mutability, a static is basically just a const with a stable address. IIRC LLVM will still constant-propagate the value to most uses, since it's marked read-only so LLVM knows the value won't change.

Member

Doing implicit coercions into an unknown target type sounds very confusing. If the right-hand side is String::default, then the resulting type should be the actual type of the expression, the function item ZST.
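
The distinction between the function item type and a function pointer discussed in this thread can be observed in today's Rust with local bindings. This is a small sketch; the helper names `fn_item_size` and `fn_pointer_size` are made up for illustration.

```rust
use std::mem::size_of_val;

/// Size of the function item type: it carries no data, so it is zero-sized.
pub fn fn_item_size() -> usize {
    let item = String::default; // type: the unnameable function item
    size_of_val(&item)
}

/// Size after an explicit coercion to a function pointer.
pub fn fn_pointer_size() -> usize {
    let ptr: fn() -> String = String::default; // type: fn() -> String
    size_of_val(&ptr)
}
```

The first helper returns 0 and the second returns the size of a pointer, so the two candidate inferred types differ in layout as well as in name.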

@Aloso

Aloso commented Dec 22, 2023

Both of these drawbacks could be addressed using an allow-by-default clippy lint for const and static types.

Make that a rustc lint, and make it warn by default, unless

  • The type is unnameable
  • The item was generated by a macro
  • The item is private or not accessible outside the crate (i.e. not part of the public API)

Then I'd be very happy with this proposal.

One open question is to what extent type placeholders should be allowed, and if they should silence the lint:

const X: _ = 3.14;
const Y: [_; 4] = [1, 2, 3, 4];
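
For reference, type placeholders of this shape are already accepted on `let` bindings, which is the behaviour the question above would extend to items. A minimal sketch, with a made-up helper name and made-up values:

```rust
use std::mem::size_of_val;

pub fn placeholders_in_let() -> (usize, usize) {
    // `_` for the whole type: inferred as f32 from the suffixed literal.
    let x: _ = 2.5_f32;
    // `_` for the element type only; the length is still written out.
    let y: [_; 4] = [1_u8, 2, 3, 4];
    (size_of_val(&x), size_of_val(&y))
}
```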

@Aloso

Aloso commented Dec 22, 2023

I'd like to point out that all of the mentioned downsides also apply to function return types. The main difference is that functions support impl Trait as return type, but const/static items do not. Allowing this would make Rust more expressive:

const FOO: impl Fn() -> i32 = || {
    todo!();
};
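
As a point of comparison, this is what functions already allow today: the return type is written as `impl Fn() -> i32` and the concrete closure type stays hidden. The body returning `42` is a placeholder chosen for the sketch.

```rust
// Works in today's Rust: callers only learn that the value implements
// `Fn() -> i32`, not which closure type it is.
pub fn foo() -> impl Fn() -> i32 {
    || 42
}
```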

However, not all `const` or `static` items are public, and explicit typing isn't always important for semver stability.
Requiring explicit typing for this reason seems a bit heavy-handed.

Both of these drawbacks could be addressed using an allow-by-default clippy lint for `const` and `static` types.
Member

I don't think an allow-by-default lint addresses concerns about it being potentially confusing. Having a lint can be useful, but it doesn't address the problem because almost everyone won't be using it (but may be using inference).

Contributor

Fair enough, happy to withdraw this suggestion and go with what you've suggested instead.

@chrysn

chrysn commented Dec 28, 2023

Many of the cases where this would be helpful would also be addressed by type_alias_impl_trait; in particular, it helps with dealing with unnameable types and macro / code generator output, without the drawbacks of loss of clarity and semver trouble (because the trait is guaranteed, and is also the only property usable in other crates).

This may not be an argument against this RFC, but at least warrants discussion in the alternatives.

@kpreid
Contributor

kpreid commented Dec 28, 2023

Many of the cases where this would be helpful would also be addressed by type_alias_impl_trait

That's a good thing to note as an alternative, but it won't help with array lengths, which are one of the ways the current rules frequently create pain (especially for include_bytes!()ed arrays where the length isn't even a property of the same source code file):

static ARRAY_OF_UNINTERESTING_LENGTH = [
    "foo",
    "bar",
    // ...
];

In the cases where a TAIT such as impl AsRef<[T]> could reasonably be used, you can also use a static &[T] slice, but not all cases are like that.
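
The two options available today can be sketched as follows; the item names are invented for the example.

```rust
// Array: the length is part of the type and must be kept in sync by hand
// whenever an element is added or removed.
pub static NAMES_ARRAY: [&str; 2] = ["foo", "bar"];

// Slice: no length in the type, at the cost of storing a reference to the
// data rather than the data itself.
pub static NAMES_SLICE: &[&str] = &["foo", "bar"];
```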

@JarredAllen

I feel like providing explicit types for values should be assumed to be the default, and use of the _ placeholder in types looks better to me:

// I prefer this
const FOO: _ = "hello";
// over this
const FOO = "hello";

The use of placeholders also I think works better in the event of a "partially-nameable" type, e.g.:

struct MyWrapper<T>(T);

// We can name `MyWrapper`, but not its generic argument.
static FOO: MyWrapper<_> = MyWrapper(|| todo!());

Also, for type inference around literals specifically, I'd prefer it not guess between options. I think this could lead to some confusion around things like this:

fn takes_u8(_: u8) {}
fn takes_u16(_: u16) {}

// Demonstrating type inference on local variables.
fn local() {
    let foo = 7;
    // Uncommenting either of these lines works, but uncommenting both results in a compile error.
    // takes_u8(foo);
    // takes_u16(foo);
    // If both lines are commented, `foo` is an i32; otherwise, the uncommented line changes its type.
}

// Demonstrating type inference on static variables.
fn local_static() {
    static foo: _ = 7; // Inferred as i32 since there's no uses.
    // Can uncommenting one of the below lines work?
    // I don't like uses of a static changing its type, but it also feels wrong to let inference change a literal's inferred type based on uses in locals, but not statics.
    // takes_u8(foo);
    // takes_u16(foo);
}

I'd prefer if literals still have to be explicitly annotated (either on the static definition or in the literal itself).
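
The local-variable half of that example can be checked in today's Rust by looking at the size of each binding. This is only a sketch: the helper `literal_sizes` and the float binding are additions for illustration.

```rust
use std::mem::size_of_val;

fn takes_u8(_: u8) {}

pub fn literal_sizes() -> (usize, usize, usize) {
    let unconstrained = 7; // no constraining use: falls back to i32
    let constrained = 7; // the call below fixes this binding to u8
    takes_u8(constrained);
    let float = 2.5; // no constraining use: falls back to f64
    (
        size_of_val(&unconstrained),
        size_of_val(&constrained),
        size_of_val(&float),
    )
}
```

The identical literal `7` ends up as a 4-byte or a 1-byte integer depending only on later uses, which is the behaviour that would be surprising on an item.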

Both of these drawbacks could be addressed using an allow-by-default clippy lint for const and static types.

Given that the main use-case IMO is for types that can't be named, I think making a deny-by-default lint in rustc for a placeholder in the place of anything nameable (except for macro-generated code) would also be a good idea. And once TAIT is stabilized, I think it should expand to also lint any type in a public API (but not array lengths), regardless of whether it's macro-generated or nameable, since that can be used instead.

@Neo-Zhixing
Author

Neo-Zhixing commented Jan 9, 2024

I do think that requiring at least a _ placeholder is a good idea. It nudges people to specify explicit types when they can by making const inference "opt-in".

Given the semver compatibility concerns, I also think that it's a good idea to require numerical types used for inference to have explicit typing.

I've updated the RFC to reflect this.

However, I don't think a deny-by-default lint is a good idea. Some types can just be too cumbersome to type, and I think we should leave this decision up to the users.

@tmccombs

However, I don't think a deny-by-default lint is a good idea. Some types can just be too cumbersome to type, and I think we should leave this decision up to the users.

in that case I think it would be totally reasonable to add an #[allow(...)] attribute to the item. And I think such a lint should (at least by default) only apply to public items.

@FrankHB

FrankHB commented Mar 29, 2024

Use a type alias. If your const type is too complex, it will make it painful to use in downstream code. You can infer a complex type, but you can't ignore it, and you must still bend your code to fit the specific type.

For a public const, sure. But for a local const or static, that seems like overkill.

No. This has nothing to do with publicity or locality, but with the intent of the API author. Until the language supports expressing "unspecified" items explicitly in the syntax, it is legitimate to encode the concrete type in the public code and document that parts of it are implementation details and subject to change.

@caseyross

Here's my perspective as someone new to Rust:

I was confused by the compiler asking me to add a type for a const string literal, as it seemed like the type would be already known to the compiler.

I would support making type declarations optional for any const/static where there is exactly 1 possible type. I think this is an unimpeachable ergonomics improvement -- however, as mentioned earlier in the thread, there is an argument against it if, through unrelated RFCs, there exists a possibility of making currently unambiguous types ambiguous in the future. (Though doing something like that sounds like a bad idea in the first place.)

I wouldn't support automatic type inference for numeric literals or other values where the literal value could fit multiple types. I think the user should always know what type is coming out of the const/static declaration, and the compiler should not be making any choices regarding that type.

@chorman0773

Ok, now I want to use that array in my downstream code. I need to know its exact length, because my API requires an array of specific size. Am I supposed to manually count the array size each time?

For the original author, writing the array size is trivial. Just write 0, and the compiler will complain that it doesn't match the actual size 4. Write it down, done. It's a trivial annoyance, and it doesn't matter much unless you change the lengths of your arrays frequently (in which case it's a semver break anyway, so it should be made explicit and done deliberately, rather than the compiler silently swallowing the breaking change).

You're trading a trivial ease-of-writing change into an issue for each one of your downstream consumers.

FTR, I personally think the array issue is persuasive. If I've got a local static that contains a list of upwards of a thousand values (say, syscall addresses in a kernel), I'm not going to want to scroll up a thousand+ lines of code to also update the length of the array.
And to answer the "use a slice" point before it's made again, slap #[no_mangle] on the array, and store the .len() in a different (also #[no_mangle]) static. Or pass it in via a sym and const operand to global_asm!(). Slices can't be accessed directly via FFI or assembly. The syscall point is I think very important here, because here's what a syscall wrapper might be:

static SYSCALLS: _ = [/*build syscall array*/];

global_asm!(
    ".globl __syscall_dest_lstar",
    "__syscall_dest_lstar:",
    "cmp eax, {SYSCALL_LIMIT}",
    "jae 2f",
    "mov rax, qword ptr [{SYSCALLS}+rip]",
    "test rax, rax",
    "jz 2f",
    "swapgs",
    "push rcx",
    "push rsp",
    "mov rsp, gs:[0]",
    "mov rcx, r10",
    "call rax",
    "pop rsp",
    "pop rcx",
    "swapgs",
    "sysret",
    "2:",
    "mov eax, {ERROR_UNSUPPORTED_KERNEL_FUNCTION}",
    "sysret",
    ERROR_UNSUPPORTED_KERNEL_FUNCTION = const errors::UNSUPPORTED_KERNEL_FUNCTION,
    SYSCALLS = sym SYSCALLS,
    SYSCALL_LIMIT = const SYSCALLS.len()
);

It's fairly simple - check if the syscall number (in eax/rax) is below the syscall limit, if it's not, then return UNSUPPORTED_KERNEL_FUNCTION, otherwise load the address of the syscall, check it for being null, and then convert from syscall convention to Sys-V convention (and do a stack switch), call the function, switch back to syscall, and sysret. Of course, it's simple because SYSCALLS is an array, not a slice.
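
The array-versus-slice point can be illustrated without any assembly. In the sketch below, the table contents and all names are hypothetical; the only thing shown is what each `static` symbol actually contains.

```rust
use std::mem::size_of_val;

// Hypothetical table; the entries stand in for handler addresses.
pub static SYSCALL_TABLE: [usize; 3] = [0x1000, 0x2000, 0];
pub static SYSCALL_SLICE: &[usize] = &[0x1000, 0x2000, 0];

/// The array static holds the entries themselves.
pub fn array_symbol_size() -> usize {
    size_of_val(&SYSCALL_TABLE)
}

/// The slice static holds only a pointer and a length; the entries live
/// elsewhere, so assembly indexing the symbol directly would read the wrong data.
pub fn slice_symbol_size() -> usize {
    size_of_val(&SYSCALL_SLICE)
}
```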

@scottmcm
Member

scottmcm commented Jan 3, 2025

I think this is too broad, and we shouldn't do it for the same reason we're not planning on accepting leaving off function return types.

We've talked about similar things in the past. For example, we once discussed fn foo(MyType { a, b }) without needing the : MyType -- which can be inferred from the pattern -- and ended up not accepting that. Nor did we accept the simpler case of just eliding array sizes in consts, where I still think the same thing I did in #2545 (comment)

So rather than just leaving this open for another year, I propose
@rfcbot fcp close


Note that I see that as "not this way", not that we'd be rejecting anything at all in this space.

Specifically, I think that allowing static FOO: impl Trait = ...; would be good -- like we allow -> impl Trait! -- for the situations where you don't want to write or commit to an exact type.

And I also look forward to other things that allow simplifying the RHS of the static/const while still keeping a type annotation. For example, if we had something like

static HIGH_COMPRESSION: CompressionOptions = .{
    lookback_size: 1 << 20,
    symbol_size: 9,
    dictionary_length: 1 << 10,
};

Then you'd still be able to avoid re-mentioning the CompressionOptions name, but there'd also still be a type annotation in the normal way.

@rfcbot
Collaborator

rfcbot commented Jan 3, 2025

Team member @scottmcm has proposed to close this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Jan 3, 2025
@kennytm
Member

kennytm commented Jan 3, 2025

And I also look forward to other things that allow simplifying the RHS of the static/const while still keeping a type annotation. For example, if we had something like

static HIGH_COMPRESSION: CompressionOptions = .{
    lookback_size: 1 << 20,
    symbol_size: 9,
    dictionary_length: 1 << 10,
};

use _ instead of . and you'll get #3444.

static HIGH_COMPRESSION: CompressionOptions = _ { ... };

@traviscross
Contributor

traviscross commented Jan 3, 2025

@rfcbot reviewed

Agreed it's a worthwhile problem but that it's not going to be this particular solution.

As @scottmcm said, making the type opaque as with impl Trait and further type inference on the right-hand side are among the directions that seem plausible.

@tmccombs

tmccombs commented Jan 3, 2025

Would allowing type elision within the scope of a function be a plausible direction?

@scottmcm
Member

scottmcm commented Jan 5, 2025

Would allowing type elision within the scope of a function be a plausible direction?

I think it's good that items are items with item rules, regardless of where you put them. (Just speaking for me, not for the team.)

If you have a const that's local to a function and you want to omit the type, use let foo = const { … }; instead of const FOO: TheType = …; -- just like how if you want to omit the types on a function that you only use locally, you write it as a lambda expression instead of as a fn item.
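
A minimal sketch of that suggestion, assuming a toolchain where inline `const { … }` blocks are stable (Rust 1.79 or later). The function name and the array contents are made up.

```rust
pub fn local_table_sum() -> u32 {
    // No annotation: the array type, including its length, is inferred
    // from the inline const block.
    let table = const { [1_u32, 2, 3, 4] };
    table.iter().sum()
}
```

Note that `table` is an ordinary local here, so it is not usable where a `const` item would be, for example as an array length.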

@joshtriplett
Member

joshtriplett commented Jan 27, 2025

I'm going to abstain from checking a box here.

In any case, I do hope we manage to get as far as inferring array lengths one day.

@tmccombs

If you have a const that's local to a function and you want to omit the type, use let foo = const { … }; instead of const FOO: TheType = …;

But foo can't be used in const contexts. And there isn't a way to use let to create a place with a static lifetime, like you can with static.

@tmandry
Member

tmandry commented Feb 4, 2025

I'm open to this happening in the future and don't think it should be closed as "never going to do this" from the team.

This is a weird situation for our process – do I file a concern to block closing the RFC and likely continue to leave it open indefinitely without resolution? I feel funny doing that, but it would be better than making a statement from the team that I don't agree with.

@rfcbot concern but what if we did

The most important property of item types being explicit in Rust, in my view, is that there is no global type inference taking place across the entire program. This RFC is not proposing that. If the type of a const is clear from its definition site, it's a matter of style and clarity whether to include it.

We can go on making this choice for our users, or we can be open to cases where it's not helping and let users make it themselves. I find an example like this well-motivating:

const WRAPPED_PI: MyStruct<f32> = MyStruct(3.1415_f32);

The type annotation is not adding clarity or value there.

Furthermore, since bidirectional type inference has limited power in this context (we would not infer types based on usage of the const), the definition site is forced to be explicit about any ambiguous types that are not specified on the annotation itself. And we already have type inference within const sub-expressions today: If MyStruct were not generic but had a single f32 field, it would be inferred as that type without any explicit annotation on the const item.
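
The sub-expression inference mentioned here already works today. In this sketch `MyStruct` is the non-generic variant described above, and the value `2.5` and the helper name are arbitrary choices for the example.

```rust
pub struct MyStruct(pub f32);

// The literal has no suffix; it is inferred as f32 from the field type.
pub const WRAPPED: MyStruct = MyStruct(2.5);

pub fn wrapped_field_size() -> usize {
    std::mem::size_of_val(&WRAPPED.0)
}
```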

Do I think there will be cases where this makes code less clear? I'm sure there will, but don't expect them to be very common given the limited power of type inference in this context and the fact that we already do inference for sub-expressions. With that said, I would be enthusiastic about restricting the fallback behavior of type inference for integers and floats on const items, so that without the type annotation the _f32 suffix is required above.

On the precedent arguments, it looks like we didn't accept a special case for array lengths because const generics were supposed to fix it – but they haven't yet. I know some work is happening on const generics in H1 but we should really consider accepting a special case if it's not looking much closer by the end. I don't know the earlier reasoning for not accepting fn foo(MyType { a, b }), nor am I clear on whether it applies here, and without knowing it doesn't seem obvious to me we should have rejected that either.

@ChayimFriedman2

@tmandry The same argument could be applied to functions' return types. Should we also make them inferable?

@Lokathor
Contributor

Lokathor commented Feb 4, 2025

That's a whole separate discussion, please don't conflate the two things.

@tmandry
Member

tmandry commented Feb 4, 2025

@ChayimFriedman2 Valid point about my argument, but function bodies tend to be more complex than const items. For one, const items don't have arguments with their own complex types to keep track of.

This feels to me like a difference in degree that in practice becomes more of a difference in kind.

Yes there will be complex const expressions, but (in addition to being discouraged altogether) those should be welcome and encouraged to use type annotations. I would be fully supportive of an on-by-default clippy lint that fired when an unannotated const contains too many statements or doesn't mention the outer type name in its final expression.

@traviscross
Contributor

On the procedural point, I think it's perfectly valid to raise a concern on an FCP close. As usual, that should probably prompt us to discuss it (rather non-urgently), so...

@rustbot labels +I-lang-radar

On this example:

const WRAPPED_PI: MyStruct<f32> = MyStruct(3.1415_f32);

What's your view on the idea that we'd be more likely to want to solve that with RHS solutions? That is, something along the lines of, e.g.:

const WRAPPED_PI: MyStruct<f32> = _(3.1415);

Rather than:

const WRAPPED_PI: _ = MyStruct(3.1415_f32);

@rustbot rustbot added the I-lang-radar Items that are on lang's radar and will need eventual work or consideration. label Feb 5, 2025
@Lokathor
Contributor

Lokathor commented Feb 5, 2025

That would not work with functions and personally I'd find it extremely confusing.

@chorman0773

Also does not help with arrays

@traviscross
Contributor

traviscross commented Feb 5, 2025

Also does not help with [array lengths]

I've been persuaded by @scottmcm that if you don't want to write the number in the signature, then you probably don't want callers to rely on the exact number anyway, and so some solution that makes this opaque is probably correct.

@joshtriplett
Member

@tmandry The same argument could be applied to functions' return types. Should we also make them inferable?

I think we should in simple cases (e.g. if the entire body is a constructor call), but that's not on topic for this issue.
