Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] A new stack-based vector #2990

Closed
wants to merge 3 commits into from
Closed

Conversation

c410-f3r
Copy link

@c410-f3r c410-f3r commented Sep 27, 2020

This RFC, which depends and takes advantage of the upcoming stabilization of constant generics (min_const_generics), tries to propose the creation of a new "growable" vector named ArrayVec that manages stack memory and can be seen as an alternative for the built-in structure that handles heap-allocated memory, aka alloc::vec::Vec<T>.

Rendered

@c410-f3r c410-f3r changed the title A new stack-based vector [RFC] A new stack-based vector Sep 27, 2020
Copy link
Contributor

@jonas-schievink jonas-schievink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StackVec would be an incorrect name for such a data structure, since there is no reason it couldn't be constructed in a static, or on the heap via Box<StackVec<...>>.

The RFC seems fairly light on details that really should be addressed in the RFC format, for example:

  • How does the initial API of this type look? Why should it look that way?
  • Where should the type be placed, and why is that the best place?

The implementation details are really not that important, as long as the RFC is reasonably implementable at all.

# Motivation
[motivation]: #motivation

`core::collections::StackVec<T>` has several use-cases and should be conveniently added into the standard library due to its importance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

core::collections does not currently exist, so that should probably be called out explicitly (it makes sense to put this structure there, but it would be a first)

// core::collections

pub struct StackVec<T, const N: usize> {
data: MaybeUninit<[T, N]>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
data: MaybeUninit<[T, N]>,
data: MaybeUninit<[T; N]>,

}

impl<T, const N: usize> StackVec<T, N> {
// Much of the `Vec` API goes here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mirroring the Vec API doesn't really work that well for a structure with bounded capacity, since push can fail. What should happen then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning is: There must be two methods for everything that tries to add elements. One method panics at run-time and other one returns a Result.

fn push(&mut self, element: T) {  ... } // Panics at run-time
fn try_push(&mut self, element: T) -> Result<(), Error> { ... } // Ok if successful 

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mirroring the Vec API doesn't really work that well for a structure with bounded capacity

That's not really true at all, frankly. Many of the functions can in fact be exactly the same in every way, and for the ones where regular Vec would have simply increased in capacity upon becoming full, you have the option of either simply panicking or returning something like Err or None. Or doing both at different times, if it's the case that you simultaneously provide push and try_push or insert and try_insert, for example.


### Compiler internals

Unstable features can be used internally to make operations more efficient, e.g., `specialization`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how this is relevant. The use of const generics is probably more important, since that is directly part of the API, therefore potentially blocking stabilization of StackVec.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehehehe. It was originally intended for a SmallVec-like structure tied with the unstable untagged_unions feature.

Removed.

---------
```

`StackVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front and these bytes form a buffer where elements can be included dynamically.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the stack needs to be explained in an RFC or in the standard library documentation. The implementation of StackVec also would not really care about how or where it is allocated, so this section doesn't seem very relevant either.


### The current ecosystem is fine

Even with all the fragmentation, different types of memory usage is an overkill in certain situations. If someone wants to use stack memory in an embedded application, then it is just a matter of grabbing an appropriated crate.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that fragmentation is only really relevant when the type in question frequently is part of the API of other libraries. If it isn't, at worst you have a bit of code bloat when multiple similar crates are used by your dependencies.

@jonas-schievink jonas-schievink added A-collections Proposals about collection APIs A-types-libstd Proposals & ideas introducing new types to the standard library. T-libs-api Relevant to the library API team, which will review and decide on the RFC. labels Sep 27, 2020
@c410-f3r
Copy link
Author

c410-f3r commented Sep 28, 2020

@jonas-schievink Thanks for the review

Hum... The discussion of a whole API is something that will probably raise some personal concerns and I was expecting to discuss it further in a possible subsequent PR. Hope my new suggestions on this subject make sense.

StackVec would be an incorrect name for such a data structure, since there is no reason it couldn't be constructed in a static, or on the heap via Box<StackVec<...>>.

Changed temporally to ArrayVec. I can't think of anything more obvious than this name 🤔

@Lokathor
Copy link
Contributor

It seems fairly clear to me (the tinyvec lead) that we shouldn't bother much with this sort of a thing in core/alloc until we have at least min_const_generics stable. At that time, it might be appropriate to add some sort of array-based data structure to core.

However, Rust is ultimately all about paying attention to the details of a situation and selecting the best option for the current moment, so honestly it's fine to have four different crates that each offer a slightly different variant.

Comment on lines 159 to 161
pub fn pop(&mut self) -> T; // Panics at run-time

pub fn push(&mut self, element: T); // Panics at run-time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quite strongly feel that the default behavior for inserts returns something akin Result<(), InsertionError>.

So push and insert should both return Result.

pop should also return Option<T>.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quite strongly feel that the default behavior for inserts returns something akin Result<(), InsertionError>.

The compiler will probably eliminate the conditional bounding checking so yeah, let's do this.

impl<T, const N: usize> ArrayVec<T, N> {
// Constructors

pub unsafe fn from_raw_parts(_ptr: *mut T, _len: usize) -> Self;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel that we should add this method here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed


#[derive(Debug, Eq, Ord, PartialEq, PartialOrd)]
pub enum ArrayVecError {
CapacityOverflow,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CapacityOverflow,
CapacityOverflow(T),

Otherwise element is lost. Considering that afaict there is no method returning both CapacityOverflow and NoElement it also makes more sense to split this error into two distinct types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArrayVecError was intended to be a "catch-all" error container but doesn't seem to be necessary at all.

For a better overall coherence, pub fn push(&mut self, element: T) -> Result<(), ArrayVecError> is now pub fn push(&mut self, element: T) -> Result<(), T>

@@ -176,49 +181,13 @@ impl<T, const N: usize> ArrayVec<T, N> {

pub fn split_off(&mut self, at: usize) -> Self;

pub fn swap_remove(&mut self, idx: usize) -> T; // Panics at run-time
pub fn swap_remove(&mut self, idx: usize) -> Option<T>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

swap_remove uses a specific index, in this case I feel like panicking is actually better

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for removal, I think just copying Vec 1 by 1 is the correct choice

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "safe" swap_remove version is similar to get in the sense that both return an optional value when out-of-bounds. On the opposite direction, pub fn swap_remove(&mut self, idx: usize) { ... } is similar to indexing, e.g., let value = slice[10];.

In certain cases, users that don't want to panic at run-time will have to manually check if the index is within bounds before issuing a removing command. If a "panicking" swap_remove is indeed desirable, please also consider the inclusion of try_swap_remove.

Copy link
Contributor

@pickfire pickfire Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could check this at compile time to guarantee idx < N but probably not.

pub fn from_array(array: [T; N]) -> Self;

#[inline]
pub fn from_array_and_len(array: [T; N], len: usize) -> Self;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn from_array_and_len(array: [T; N], len: usize) -> Self;
pub fn from_array_and_len(array: [MaybeUninit<T>; N], len: usize) -> Self;

I am not sure if that method is valuable, as all cases where [MaybeUninit<T>; N] and a len is used we should "just" use ArrayVec directly

Copy link
Author

@c410-f3r c410-f3r Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beyond the semantic value, storage re-utilization and a possible future function "constification", from_array and from_array_and_len are right now being used to write unit-tests :)

I am not sure if that method is valuable, as all cases where [MaybeUninit; N] and a len is used we should "just" use ArrayVec directly

Something like pub const fn from_parts(array: [MaybeUninit<T>; N], len: usize) -> Self ?

}

impl<T, const N: usize> ArrayVec<T, N> {
// Constructors

pub unsafe fn from_raw_parts(_ptr: *mut T, _len: usize) -> Self;
#[inline]
pub fn from_array(array: [T; N]) -> Self;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this just ArrayVec::from(arr), I don't feel like a method is needed here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do or will eventually const generics allow something like impl<T, const N: usize, const M: usize> impl From<[T; M]> for ArrayVec<T, N> where {M <= N}?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, using feature(const_evaluatable_checked). This does require a N <= N where bound when using it in a generic context though We probably will be able to detect that Param <= Param always holds in the future, but for now that is not a priority.

Note that this from impl will be horrible for type inference, so we probably want to make it a separate method anyways imo.

Copy link

@slightlyoutofphase slightlyoutofphase Mar 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lcnr, when you said:

Isn't this just ArrayVec::from(arr), I don't feel like a method is needed here.

I personally found with staticvec that having three inherent methods, (new_from_slice, new_from_array, and new_from_const_array) allowed me to really easily (with the specialization feature flag enabled) write no less than eight slice / array related From impls in direct terms of just calling the appropriate one of those three:

impl<T, const N: usize> From<[T; N]> for StaticVec<T, N>
impl<T, const N1: usize, const N2: usize> From<[T; N1]> for StaticVec<T, N2>
impl<T: Copy, const N: usize> From<&'_ [T; N]> for StaticVec<T, N>
impl<T: Copy, const N1: usize, const N2: usize> From<&'_ [T; N1]> for StaticVec<T, N2>
impl<T: Copy, const N: usize> From<&'_ [T]> for StaticVec<T, N>
impl<T: Copy, const N: usize> From<&'_ mut [T; N]> for StaticVec<T, N>
impl<T: Copy, const N1: usize, const N2: usize> From<&'_ mut [T; N1]> for StaticVec<T, N2>
impl<T: Copy, const N: usize> From<&'_ mut [T]> for StaticVec<T, N>

I was additionally able to implement my staticvec! macro in direct terms of new_from_const_array, so there was even more code re-use benefits there.

You can see how the three inherent methods work starting here if you like.

In case you wind up wondering, also, the call to StaticVec:from inside new_from_array in turn defers to the From impl that calls new_from_const_array. Doing it like that is necessary to have the types inferred properly by rustc, as if you try to just call new_from_const_array itself there rustc complains about N2 not being the same as N.

@burdges
Copy link

burdges commented Sep 28, 2020

I'd suggest std focus on impl<T; const N usize> FromIterator<T> for [T;N] instead since doing Iterator::collect represents like half these crates' use cases.

I suspect ArrayVec<[T;N]> should remain the idiomatic approach since we'll seemingly keep discovering scenarios where associated types work better than const generics.

@frehberg
Copy link

frehberg commented Oct 1, 2020

As for stack based lists, may I turn your attention to the crate "seq" implementiing a sequence of items being managed via the function scopes https://crates.io/crates/seq


pub struct ArrayVec<T, const N: usize> {
data: MaybeUninit<[T; N]>,
len: usize,
Copy link
Contributor

@pickfire pickfire Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be usize when we probably can't even hit u16?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Probably" is not a great fit for the standard library.

Ideally we’d use specialization of some private trait with an associated type to choose the smallest integer type that can represent 0..=N. However it looks like boolean conditions based on const parameters can not be used in where clauses today, and I don’t know if supporting that is planned at all. And this would only help where align_of::<T>() is less than size_of::<usize>().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential hack until specializations works: Store the size in a [u8; M] which gets converted to a usize on demand. Calculate the M based on N.

@c410-f3r
Copy link
Author

c410-f3r commented Oct 8, 2020

A preliminary benchmark using criterion for 10, 1000 and 100000 push operations.

lines

ArrayVec and Vec will share much of the same code, so a proper adjustment for ArrayVec will probably also be beneficial for Vec.

Comment on lines 166 to 162
pub fn drain<R>(&mut self, range: R)
where
R: RangeBounds<usize>;
Copy link
Contributor

@pickfire pickfire Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this return a Drain struct like Vec did? Maybe we have Drain<T, const N: usize>?

Comment on lines 200 to 191
pub fn splice<R, I>(&mut self, range: R, replace_with: I)
where
I: IntoIterator<Item = T>,
R: RangeBounds<usize>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe splice should return a Splice like vec did, same for Drain above.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Thanks for pointing this out

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the new one is still different from what Vec did.

where
F: FnMut(&mut T) -> K,
K: PartialEq<K>;

pub fn drain<R>(&mut self, range: R)
pub fn drain<R>(&mut self, range: R) -> Option<Drain<'_, T, N>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this Option? The current one isn't Option IIRC.


pub fn split_off(&mut self, at: usize) -> Self;
pub fn split_off(&mut self, at: usize) -> Option<Self>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this Option?

@pickfire
Copy link
Contributor

Difference from Vec section? Looks like this have quite some difference from Vec from what I see, it would be good to have a section for it.

@c410-f3r
Copy link
Author

c410-f3r commented Oct 10, 2020

Difference from Vec section? Looks like this have quite some difference from Vec from what I see, it would be good to have a section for it.

Methods are checked due to out-of-bound inputs or invalid capacity, which are different from most of the Vec methods that simply abort at run-time behind the scenes. IMHO, checked methods are more secure and can be flexible enough to let the user decide if they will panic foo.insert(99999999, ELEMENT).unwrap() or will be simply ignored let _ = foo.insert(99999999, ELEMENT).

Everything you see are suggestions. If people or the lib team think that a stack-based vector is worth the std addition, then I also think it is better to discuss the API design in another PR, forum thread or RFC.

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Nov 6, 2020
@c410-f3r
Copy link
Author

With a successful crater run and no last minute blocker, min_const_generics is going to be stabilized in the 1.50 release version. rust-lang/rust#79135

More benchmarks for extend, pop, push and truncate -> https://imgur.com/a/zdGmnW2

@mbartlett21
Copy link
Contributor

Should this specify that the ArrayVec implementation can almost all be const?

In my try at implementing this, I was able to make push, pop, insert, and remove const.

Playground link

@c410-f3r
Copy link
Author

c410-f3r commented Jan 5, 2021

@mbartlett21 Thanks for the heads-up. Some features like const_intrinsic_copy weren't a thing when the RFC was written but yeah, these "stable" constant functions are only possible within the std because third-party libraries require a nightly compiler.

@mbartlett21
Copy link
Contributor

The implementation being able to be const could also be mentioned in the motivation, since until compile time allocation, this would be one of the only vector-like types to work in constant-eval.

Thinking about it, a compile-time hashmap could be useful...

@mbartlett21
Copy link
Contributor

Is the function from_array_and_len meant to leak all the elements past len, or should it take an array of MaybeUninit<T> instead of T?

@c410-f3r
Copy link
Author

Was pointed at this by this zulip thread. There's another way of defining an ArrayVec (leveraging unsizing and const generics) that avoids instantiating duplicating every function used for each N. My rough experience is that it helps compile times a good amount.

None of the existing crates do this, and I've considered publishing one that does, but no time (I tried to hassle @Lokathor into making the arrayvec in TinyVec do it but no luck). The basic idea is sketched out here https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=da40dafedb02ed7c7723eff5f16d949e . Essentially:

* You have `ArrayVec<Arr: ?Sized>`

* You use this either as `ArrayVec<[T; N]>` or `ArrayVec<[T]>`

* Vec methods go on `ArrayVec<[T]>`.

* `ArrayVec<[T; N]>` deref{,mut}s to `ArrayVec<[T]>`

* `ArrayVec<[T]>` deref{,mut}s to `[T]`

That's the basic idea, not all of this is a required way of implementing it, in particular, there might be a way to make it work without the Deref hackery — and just leverage unsizing.

This doesn't hurt codegen either — llvm can easily see through this if it decides its a good bet, but it's not forced to. It also allows you to take an arrayvec of any size as an &mut ArrayVec<[T]> or whatever, without requiring you monomorphize your arrayvec-using function for each N. I think it would be good to at least consider a design like this.

(It's a bit like the trick you see in C++ SmallVec implementations like llvms' which have a SmallVectorBase type which doesn't have the template parameter for the size, which is intended to be used as an argument and such — However, the same technique can't be used for a smallvec in rust because of the lack of MaybeUninit for ?Sized types).

When verifiable methods were proposed, many people were reluctant and this situation didn't move the RFC forward. Perhaps, the same could be said to this approach or perhaps not. Perhaps, this code arrangement could improve compile times and ergo-metrics or perhaps Arr type opacity and the yet to be figured out lack of arbitrary length types are deal breakers.

Perhaps? Maybe? Who knows? This RFC didn't receive an official answer so it is like walking in a dark room without proper directions. Personally, I just want this feature implemented and I will make any changes that the lib team wants

@mbartlett21
Copy link
Contributor

I prefer <T, N> over <[T; N]>, at least for ArrayVec. Fixing the monomorphisation problem could be done by having a separate struct, called something like DynArrayVec<T> that ArrayVec<T, N> derefs to and that has nonmonomorphised methods impl'd on it

@burdges
Copy link

burdges commented May 27, 2021

A priori, anytime <[T; N]> works then it should be preferred over <T, N> for polymorphism, meaning exploring this direction might give <impl Borrow<[T]>> eventually, although more likely a more restrictive bound. It possible this already breaks down here though. I donno but.. I've just noticed detached lengths being an anti-pattern created by generic_array elsewhere.

@SimonSapin
Copy link
Contributor

A priori, anytime <[T; N]> works then it should be preferred over <T, N> for polymorphism

I’m not necessarily disputing that, but why?

@burdges
Copy link

burdges commented May 29, 2021

You often want broader functionality like [T] ala #2990 (comment) or to support other types, like borrows or guards or tuple struct wrappers. It's not always true of course, maybe you've a specific application where being generic add unnecessary trait bounds, or maybe broader just does not work well enough somehow.

@aobatact
Copy link

I think rust-lang/wg-allocators#79 should be considered instead of creating vector type for each storage.

@c410-f3r
Copy link
Author

c410-f3r commented Dec 5, 2021

After more than one year of waiting (...), this PR is being closed due to the lack of official answers from any person or any related or unrelated team.

If desired, anyone is free to copy or reference any inner content to write another similar RFC.

@c410-f3r c410-f3r closed this Dec 5, 2021
@Lokathor
Copy link
Contributor

Lokathor commented Dec 5, 2021

there's at least two crates with array based vector types in the ecosystem. Given that, I'm not sure adding this to stdlib is particularly necessary. It's a nifty optimization, but not a huge gain in capability, so it's fine to just be a crate.

@the8472
Copy link
Member

the8472 commented Dec 5, 2021

I think we still need something like it in std to be able to collect iterators into an array. Perhaps only a subset.

@clarfonthey
Copy link

A few comments have mentioned adding some way to allow Vec to have storage on the stack instead of the heap, which would make this something that has to be added to libstd. But that's probably a different RFC.

@usbalbin
Copy link

usbalbin commented Dec 6, 2021

I think we still need something like it in std to be able to collect iterators into an array. Perhaps only a subset.

In my book that actually sounds a lot like my crate iter_fixed. It has a type IteratorFixed which is very much like a regular Iterator but with some restrictions that makes it possible to collect it into an array. (which I believe works fine outside std lib)

@the8472
Copy link
Member

the8472 commented Dec 6, 2021

There's the desire to populate an array from regular iterators, see rust-lang/rust#81615
One of the possible approaches to do that is to have an intermediate type that looks a lot like an ArrayVec.

@slightlyoutofphase
Copy link

There's another way of defining an ArrayVec (leveraging unsizing and const generics) that avoids instantiating duplicating every function used for each N. My rough experience is that it helps compile times a good amount.

Broadly speaking, in my experience, the compiler already perfectly avoids unnecessary monomorphization of const generics, with compile times not being an issue really at all, given a baseline type along the lines of anything like this:

pub struct ArrayVec<T, const N: usize> {
    data: MaybeUninit<[T; N]>,
    length: usize,
}

I do not personally think the ArrayVec<[T; N]> form has any real advantages at all if you actually are approaching things from a perspective of fully leveraging nightly const generics functionality.

@tgross35
Copy link
Contributor

tgross35 commented Sep 11, 2022

There's no reason this would be closed for good, right? (i.e., no hard "no"s from the team)

The reason I ask is that CStr was fairly recently moved to core, but there's no builtin way to convert a &str to CStr without needing to go through CString (some details and example implementation at this thread I started). Having ArrayVec would make some sort of &str -> CStr implementation easily possible in core.

I am actually a bit surprised I don't see C FFI mentioned anywhere on this thread as a good use case, since the pattern of "non-heap buffer with capacity and length" is fairly prevelant in C. Not having any user friendly representations for this in Rust means that a lot of common FFI cases tend to go through some indirection on the heap - not the end of the world in OS, but not possible in no_std. Other patterns would become easier as well.

Also, I know it was already mentioned here, but adding ArrayVec to core would be a huge boon for Rust in the embedded world too - where dependencies often must be kept to a minimum or zero, this adds insanely huge value that doesn't exist in C/C++.

So, use cases for ArrayVec are:

  • Array collection
  • Embedded
  • FFI

Which is a pretty serious list.

@Lokathor
Copy link
Contributor

You don't need to use CString if you put the \0 on your string yourself

@tgross35
Copy link
Contributor

You don't need to use CString if you put the \0 on your string yourself

That is correct, but unfortunately that precludes using any Rust functions that may return a &str

@burdges
Copy link

burdges commented Sep 11, 2022

Aside from const generics noted above, I think unsized_locals and VLAs are another reason not to rush stack based vectors into core.

It's imho worth sorting out the VLA story for both str and CStr behind the unsized_locals rustc feature flag, as if they improve calling C code much then they raise the stabilization priority unsized_locals.

@tgross35
Copy link
Contributor

Aside from const generics noted above

Are you referring to the <[T; N]> vs. <T, const N: Usize> above? IMO the debate means it's probably better left to WG members. ( I may be missing something if that's not what you meant)

It's imho worth sorting out the VLA story for both str and CStr behind the unsized_locals rustc feature flag, as if they improve calling C code much then they raise the stabilization priority unsized_locals.

Err.... Do they improve calling CFFI? I feel like they're not the best solution because:

  • A lot of time C buffers aren't unsized, they're fixed-length buffers
  • Especially on bare metal CFFIs (and especially especially on safety critical things), unsized locals are a big no no, you need to know with certainty that you won't overflow your memory. Torvalds is pretty against VLA in kernel fwiw
  • VLAs tend to (I believe) generate somewhat messy ASM compared to fixed-length buffers like this
  • Having any str/CStr functionality depend on runtime dynamics seems a bit weird, becasue CStr at least usually isn't

So I might be missing something, but aren't these independent issues? ArrayVec still seems to be like the better representation of a C buffer that is more suitable in no_std

@tgross35
Copy link
Contributor

Side note, @c410-f3r would you consider reopening this? I think a lot of people see "closed" and think that means "decided against", not just inactive (probably the thinking behind opening this PR)

@SimonSapin
Copy link
Contributor

I agree that a sized ArrayVec would be great to have in core.

At this point I think what this RFC needs is someone to "champion" it. c410-f3r said they didn’t want to keep doing that, so if someone wants to take this up comment here to coordinate with anyone else potentially interested.

Next steps are probably to discuss on Zulip to see interest from @rust-lang/libs-api members, then make a new RFC PR (possibly based on this one).

@c410-f3r
Copy link
Author

c410-f3r commented Sep 11, 2022

Side note, @c410-f3r would you consider reopening this? I think a lot of people see "closed" and think that means "decided against", not just inactive (probably the thinking behind opening this PR)

I can if the lib team is willing to accept such feature.

To give you a better idea about the situation, a very trivial and useful auxiliary function that creates custom arrays took TWO YEARS to be stabilized (rust-lang/rust#75644).

There is still no answer for the lack of official feedback. Maybe because many ppl that contribute to Rustc are volunteers? Maybe because the ppl involved have other priorities? Maybe because there aren't enough human resources? The things is, it is probably best to not have high hopes for this feature.

### Optimization

Stack-based allocation is generally faster than heap-based allocation and can be used as an optimization in places that otherwise would have to call an allocator. Some resource-constrained embedded devices can also benefit from it.

Copy link
Contributor

@tgross35 tgross35 Sep 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest adding a new section, something like:

### C FFI

An extremely common pattern in C is to pass a buffer with a maximum size and a
current length, e.g., `myfunc(int buf[BUF_SIZE], int* current_len)`, and expect a
callee function to be able to modify the `buf` and `len` to store its return data.
Currently, the best ways to interoperate with this FFI pattern in Rust without relying
on external libraries is to create a slice and manually maintain the length as changes
are made.

This is an _ideal_ use case for `ArrayVec` in core, as it is a representation of a fixed-
capacity buffer with a current length. This greatly simplifies common patterns
required when working with existing C APIs, especially in embedded development.
Further, it would provide the backbone for an easier `&str` to `CStr` transition that
is not directly possible without going through heap-based `CString`, which is not
available in `no_std` environments.

Interoperability between C and Rust is currently "good", but providing a better
representation for common C patterns is an excellent way to grow Rust as a language
and ease adoption in C-dominant environments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I don't know where best it would go, but some section here should really drive home the usability in no_std environments (embedded, bare metal, kernel, safety-critical, etc). Rust is gaining traction in kernel of course, but seems like it has yet to really break into any of the other environments.

In many of these places, dependencies are frowned upon.

@tgross35
Copy link
Contributor

At this point I think what this RFC needs is someone to "champion" it. c410-f3r said they didn’t want to keep doing that, so if someone wants to take this up comment here to coordinate with anyone else potentially interested.

I would shakily raise my hand or be willing to help somebody else, but c410 just popped into the comments so maybe it won't be necessary

I can if lib team is willing to accept such feature.

To give you a better idea about the situation, a very trivial and useful auxiliary function that creates custom arrays took TWO YEARS to be stabilized (rust-lang/rust#75644).

There is still no answer for the lack of official feedback. Maybe because many ppl that contribute to Rustc are volunteers? Maybe because the ppl involved have other priorities? Maybe because there aren't enough human resources? The things is, it is probably best to not have high hopes for this feature.

Two years is nothing in language world, have faith :) I'll bring something up on Zulip as @SimonSapin mentioned, or maybe with the embedded WG on Matrix, and see what they have to say

@c410-f3r
Copy link
Author

At this point I think what this RFC needs is someone to "champion" it. c410-f3r said they didn’t want to keep doing that, so if someone wants to take this up comment here to coordinate with anyone else potentially interested.

I would shakily raise my hand or be willing to help somebody else, but c410 just popped into the comments so maybe it won't be necessary

I can if lib team is willing to accept such feature.
To give you a better idea about the situation, a very trivial and useful auxiliary function that creates custom arrays took TWO YEARS to be stabilized (rust-lang/rust#75644).
There is still no answer for the lack of official feedback. Maybe because many ppl that contribute to Rustc are volunteers? Maybe because the ppl involved have other priorities? Maybe because there aren't enough human resources? The things is, it is probably best to not have high hopes for this feature.

Two years is nothing in language world, have faith :) I'll bring something up on Zulip as @SimonSapin mentioned, or maybe with the embedded WG on Matrix, and see what they have to say

If you are willing to push forward this feature, then feel free to take up the responsibility. My motivation to contribute to the standard library in general was severely diminished due to how things were handled in my contributions.

@tgross35
Copy link
Contributor

If you are willing to push forward this feature, then feel free to take up the responsibility. My motivation to contribute to the standard library in general was severely diminished due to how things were handled in my contributions.

I definitely feel that pain, it's far from a rust-only problem. Any big OSS project just has super slow design cycles, and usually enough pressing things that "nice to have but not directly needed" stuff can feel swept under the rug.

I've had some discussion on zulip the past couple of days, and it seems like the response is more that some questions need answering than a hard no. Since I suppose a RFC is the place for that, I will go ahead and fork this and do some updates. Will update with a new PR link after that is done

@tgross35 tgross35 mentioned this pull request Sep 15, 2022
4 tasks
@tgross35
Copy link
Contributor

Ok all, the rough draft PR is available: #3316

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-collections Proposals about collection APIs A-types-libstd Proposals & ideas introducing new types to the standard library. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the RFC.