Add Iterator::collect_array method #79659
r? @sfackler (rust-highfive has picked a reviewer for you, use r? to override)
where
    Self: Sized,
{
    struct Buf<T, const N: usize> {
Do we have this type somewhere already?
I guess there is one in the [T; N]::map() method
Seems like we need core::collections::array_vec already :D
There's an item for the next libs meeting to talk about that https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/PRs.20need.20a.20decision.20on.20.60try_generate.60.20or.20.60ArrayVec.60.20or.20.2E.2E.2E/near/218779392 :)
I'm currently writing Iterator::array_chunks and I have more or less the same code :)
(though it's a bit more convoluted, since the DoubleEndedIterator impl requires the ability to push_back)
Unfortunately we rarely get to our needs decision items 🙃
If we keep tabs on these unwind-safe-partially-initialized-array types we're introducing I'd be happy not to block any of these PRs individually on them and run a pass over later to replace them with something more general.
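For readers unfamiliar with the "unwind-safe partially-initialized array" pattern mentioned here, the following is a minimal sketch of the idea, not the code from this PR. The names `Guard` and `collect_prefix` are hypothetical. The guard tracks how many slots have been written, so that an early return or a panic inside `next()` drops only the initialized elements.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Hypothetical guard type: owns `N` possibly-uninitialized slots and
/// remembers how many of them (counting from the front) hold a value.
struct Guard<T, const N: usize> {
    slots: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> Drop for Guard<T, N> {
    fn drop(&mut self) {
        for slot in &mut self.slots[..self.len] {
            // SAFETY: the first `len` slots were written and never moved out.
            unsafe { slot.assume_init_drop() };
        }
    }
}

/// Hypothetical helper: pulls up to `N` items from `iter`.
/// Returns `None` (dropping the items already pulled) if the iterator
/// ends early. Items past the first `N` stay in the iterator.
fn collect_prefix<I, const N: usize>(iter: &mut I) -> Option<[I::Item; N]>
where
    I: Iterator,
{
    let mut guard = Guard::<I::Item, N> {
        slots: std::array::from_fn(|_| MaybeUninit::uninit()),
        len: 0,
    };
    while guard.len < N {
        // If `next()` returns `None` or panics, `guard` is dropped here
        // and cleans up the elements written so far.
        let item = iter.next()?;
        guard.slots[guard.len].write(item);
        guard.len += 1;
    }
    // SAFETY: all `N` slots are initialized. We copy the slots out bitwise
    // and forget the guard so the elements are not dropped twice.
    let slots = unsafe { ptr::read(&guard.slots) };
    mem::forget(guard);
    Some(slots.map(|slot| unsafe { slot.assume_init() }))
}
```

A more general `ArrayVec`-like type, as discussed above, would presumably replace one-off guards of this shape.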
Force-pushed: 9cf42aa to 7db95bf
With const generics, we can now collect exactly `N` elements out of the iterator. A very similar API has existed in the itertools crate for quite some time: https://docs.rs/itertools/0.9.0/itertools/trait.Itertools.html#method.collect_tuple
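To illustrate the "exactly `N`" semantics on stable Rust, here is a minimal sketch that goes through a temporary `Vec`. It is not the implementation proposed in this PR (which avoids the allocation); `collect_array_exact` is a hypothetical name.

```rust
use std::convert::TryInto; // already in the prelude on edition 2021

/// Hypothetical helper: collects into a `Vec` first, then converts.
/// Returns `None` unless the iterator yields exactly `N` items.
/// Note that this consumes the whole iterator.
fn collect_array_exact<I, const N: usize>(iter: I) -> Option<[I::Item; N]>
where
    I: Iterator,
{
    let items: Vec<I::Item> = iter.collect();
    // `Vec<T>` converts to `[T; N]` only when its length is exactly `N`.
    items.try_into().ok()
}
```

This mirrors the behaviour of itertools' `collect_tuple`, which also yields `None` when the element count does not match.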
Force-pushed: 7db95bf to 84bc38a
is there a reason to prefer this over a more general Do you intend to keep this as unstable until we are comfortable adding the impl as that is impl stable, similar to |
I think I don't understand the question, could you rephrase it? Shooting in the dark: yes, I expect this to be unstable for quite some time, both because it uses const generics and because I think the API itself needs to bake for some time. However, long term we absolutely should have something like this on stable, in one form or another.
We can add
My second question was based on my understanding that |
Thanks for clarifying, I indeed misunderstood the question. I'll continue discussion in that PR to not split it over many threads. |
I don't have a strong opinion here, but it indeed might help with nailing down the fiddly Error type, for example!
I don't think there's any prior art for this kind of fallible collection (even the result propagation impls only fail due to an inner Err rather than a size mismatch). The "collect-to-result" impl is already a bit undiscoverable, and I'd worry that a similar impl would be even more awkward to work with since you'd potentially need to explicitly name the error type chosen. |
/// ```
#[inline]
#[unstable(reason = "new API", issue = "none", feature = "iter_collect_array")]
fn collect_array<const N: usize>(self) -> Option<[Self::Item; N]>
pondering: given that this will potentially collect only a prefix, should it be &mut self? The Option is enough to know whether the iterator was exhausted in the call, so one could use .collect_array::<4>() in a loop...
Ah, I think I misunderstood initially... You want to have "exact N" semantics, but you want &mut self at the same time?
This is interesting!
The problem I am seeing here is that, to get exactly N, you'll need to fetch N + 1 elements, so one would be dropped on the floor or stuffed into the Err variant somehow.
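The N + 1 problem can be made concrete with a small sketch. This is an illustration under assumptions, not code from the PR: `collect_exactly` is a hypothetical name, and a `Vec` is used as the staging buffer for simplicity.

```rust
use std::convert::TryInto; // already in the prelude on edition 2021

/// Hypothetical helper: succeeds only if the iterator yields exactly `N`
/// items. To know that, it has to ask for item number `N + 1`.
fn collect_exactly<I, const N: usize>(mut iter: I) -> Option<[I::Item; N]>
where
    I: Iterator,
{
    // Pull at most `N` items without consuming the rest of the iterator.
    let head: Vec<I::Item> = iter.by_ref().take(N).collect();
    // Fewer than `N` items: the conversion fails and we return `None`.
    let array: [I::Item; N] = head.try_into().ok()?;
    // The extra fetch: if it produces an item, that item is lost.
    match iter.next() {
        None => Some(array),
        Some(_extra) => None, // dropped on the floor
    }
}
```

With a `&mut self` receiver the caller would keep the iterator afterwards, but the extra element would already be gone, which is the concern raised above.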
I was thinking that it.collect_array::<3>() would be equivalent to try { [it.next()?, it.next()?, it.next()?] }. So you'd get None if the length is shorter than the number requested, but if there's more you'd be able to call again to get more of them.
(Maybe with a rename like sfackler mentions -- this is reminding me of as_chunks and array_chunks on slices, so maybe something like that?)
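Since `try` blocks are unstable, the equivalence described above can be sketched on stable Rust with a helper function that uses `?` directly. The names `next_three` and `chunks_of_three` are hypothetical and only serve the illustration.

```rust
/// Hypothetical stand-in for `it.collect_array::<3>()` with "prefix"
/// semantics: takes the next three items, or returns `None` if the
/// iterator runs out first.
fn next_three<I: Iterator>(it: &mut I) -> Option<[I::Item; 3]> {
    Some([it.next()?, it.next()?, it.next()?])
}

/// Calls the helper repeatedly, as suggested in the comment above.
/// A trailing partial group is discarded.
fn chunks_of_three(values: Vec<u32>) -> Vec<[u32; 3]> {
    let mut it = values.into_iter();
    let mut out = Vec::new();
    while let Some(chunk) = next_three(&mut it) {
        out.push(chunk);
    }
    out
}
```

Note that when the input length is not a multiple of three, the leftover items are consumed and dropped by the final, failing call.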
@scottmcm for iterators, self and &mut self are the same thing - you can always use .by_ref()
@WaffleLapkin Conceptually yes. But in implementation not exactly -- see all these benchmarks for example.
And if this doesn't exhaust the iterator, then it should be &mut self for consistency with other iterator methods that don't necessarily exhaust the iterator (like all and any, which are also &mut self).
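Both points in this exchange can be seen with existing standard library methods. The sketch below uses hypothetical function names; `any`, `take` and `by_ref` are the real `Iterator` methods.

```rust
/// `any` takes `&mut self` and stops at the first match,
/// so the remaining items are still available afterwards.
fn any_leaves_the_rest() -> (bool, Vec<i32>) {
    let mut it = vec![1, 2, 3, 4, 5].into_iter();
    let found = it.any(|x| x == 2);
    let rest: Vec<i32> = it.collect();
    (found, rest)
}

/// `take` takes `self` by value, but `by_ref()` lets the caller
/// keep using the iterator after the adaptor is done.
fn by_ref_keeps_the_iterator() -> (Vec<i32>, Vec<i32>) {
    let mut it = vec![1, 2, 3, 4, 5].into_iter();
    let head: Vec<i32> = it.by_ref().take(2).collect();
    let rest: Vec<i32> = it.collect();
    (head, rest)
}
```

So a by-value `collect_array(self)` could still be used on a borrowed iterator via `by_ref()`; the argument above is about consistency and performance rather than expressiveness.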
One option to avoid dealing with the "fallible collect" question could be to instead treat this as |
Answering both #79659 (comment) and #79659 (comment): I think we should do what itertools does. It has (modernizing API) It just seems wrong for To be clear, I am still not sure whether
Urgh, the previous comment actually refers to #69985, as I am confusing the two (didn't have my morning pu'er yet :(). To clarify:
Maybe it might make sense to return some kind of Result that lets you decide how you want to deal with the overflowing case, so Result<[T; N], [T; N]> or so with maybe some convenience methods for unwrapping either Ok or Err (maybe as a general method on Result<T, T>, seemingly people want such a method anyway: https://twitter.com/djco/status/1334535545905672192 ) |
Attempting to summarize discussion here:
Implementation wise, we def should wait for fn next_n<const N: usize>(&mut self) -> Result<[T; N], ArrayVec<T, N - 1>>; I suggest the following plan for this PR:
|
FWIW, I just had a situation where I wanted exactly EDIT: for clarity, I did need partial results for this use, so unlike my other comment I wouldn't have been able to use A thought if we don't want to wait for EDIT2: That version as fn next_n<const N: usize>(&mut self) -> Result<[T; N], ArrayVec<T, N - 1>>; sounds amazing, though -- would be exactly right for the simd-sum. |
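As a rough illustration of the `next_n` shape discussed in the last two comments, here is a stable-Rust sketch. It is an approximation under stated assumptions: `ArrayVec<T, N - 1>` is not expressible on stable, so a `Vec<T>` stands in for the leftover items, and `chunked_sum` is a hypothetical, scalar stand-in for the SIMD-sum use case.

```rust
use std::convert::TryInto; // already in the prelude on edition 2021

/// Sketch of `next_n`: returns the next `N` items as an array, or the
/// (fewer than `N`) remaining items if the iterator runs out.
/// Assumption: `Vec<T>` replaces the `ArrayVec<T, N - 1>` from the thread.
fn next_n<I, const N: usize>(iter: &mut I) -> Result<[I::Item; N], Vec<I::Item>>
where
    I: Iterator,
{
    let items: Vec<I::Item> = iter.by_ref().take(N).collect();
    // On a length mismatch, `try_into` hands the `Vec` back as the error.
    items.try_into()
}

/// Sums in groups of four "lanes", then folds in the partial tail.
fn chunked_sum(values: &[u32]) -> u32 {
    let mut it = values.iter().copied();
    let mut lanes = [0u32; 4];
    loop {
        match next_n::<_, 4>(&mut it) {
            Ok(chunk) => {
                for (lane, x) in lanes.iter_mut().zip(chunk) {
                    *lane += x;
                }
            }
            Err(rest) => {
                // The partial results are not lost, unlike with `Option`.
                return lanes.iter().sum::<u32>() + rest.iter().sum::<u32>();
            }
        }
    }
}
```

The `Err` payload is what distinguishes this from the `Option`-returning variants: the caller decides what to do with an incomplete final group.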
Due to all the similar PRs floating around #81615 should help to centralize design discussion. |
…ray, r=dtolnay Add internal `collect_into_array[_unchecked]` to remove duplicate code

Unlike the similar PRs rust-lang#69985, rust-lang#75644 and rust-lang#79659, this PR only adds private functions and does not propose any new public API. The change is just for the purpose of avoiding duplicate code.

Many array methods already contained the same kind of code and there are still many array related methods to come (e.g. `Iterator::{chunks, map_windows, next_n, ...}`, `[T; N]::{cloned, copied, ...}`, ...) which all basically need this functionality. Writing custom `unsafe` code for each of those doesn't seem like a good idea. I added two functions in this PR (and not just the `unsafe` version) because I already know that I need the `Option`-returning version for `Iterator::map_windows`.

This is closely related to rust-lang#81615. I think that all options listed in that issue can be implemented using the function added in this PR. The only instance where `collect_into_array` might not be general enough is when the caller wants to handle incomplete arrays manually. Currently, if `iter` yields fewer than `N` items, `None` is returned and the already yielded items are dropped. But as this is just a private function, it can be made more general in future PRs.

And while this was not the goal, this seems to lead to better assembly for `array::map`: https://rust.godbolt.org/z/75qKTa (CC `@JulianKnodt`)

Let me know what you think :)

CC `@matklad` `@bstrie`
☔ The latest upstream changes (presumably #81732) made this pull request unmergeable. Please resolve the merge conflicts. |
I suspect that this PR wants to become a thin wrapper around the internal API that landed in #82098, yes?
Triage: what are the next steps here?
Urgh, seems like I dropped the ball on this one. I feel that the plan outlined in #79659 (comment) is still valid, but I doubt that I'll pursue it to completion soon. So optimistically closing! |
and... ? all the little features that I always end up needing and assuming rust supports have closed down PRs... |
With const generics, we can now collect exactly `N` elements out of the iterator.
A very similar API has existed in the itertools crate for quite some time:
https://docs.rs/itertools/0.9.0/itertools/trait.Itertools.html#method.collect_tuple
This needs more thorough tests before merging (especially around `Drop` and panics); I'll add them if this API as a whole feels like a good idea.