Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for array_map #75243

Closed
3 of 4 tasks
JulianKnodt opened this issue Aug 7, 2020 · 33 comments · Fixed by #87174
Closed
3 of 4 tasks

Tracking Issue for array_map #75243

JulianKnodt opened this issue Aug 7, 2020 · 33 comments · Fixed by #87174
Labels
A-const-generics Area: const generics (parameters and arguments) A-slice Area: `[T]` C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@JulianKnodt
Copy link
Contributor

JulianKnodt commented Aug 7, 2020

The feature gate for the issue is #![feature(array_map)].

Public API

impl<T, const N: usize> [T; N] {
    pub fn map<F, U>(self, f: F) -> [U; N] where F: FnMut(T) -> U;
}

Steps / History

Unresolved Questions

How should order be documented/allowed for map? See #75212 (comment) for discussion.

@JulianKnodt JulianKnodt added the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Aug 7, 2020
@JulianKnodt JulianKnodt changed the title Tracking Issue for XXX Tracking Issue for array_map Aug 7, 2020
JulianKnodt added a commit to JulianKnodt/rust that referenced this issue Aug 7, 2020
@jonas-schievink jonas-schievink added A-const-generics Area: const generics (parameters and arguments) T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Aug 7, 2020
@the8472
Copy link
Member

the8472 commented Aug 7, 2020

A possible followup optimization: Perform this in place when size_of::<T>() == size_of::<U>() && align_of::<T>() == align_of<U>(), which can be checked at compile time. Together with RVO that might save a lot of extra stack space on large arrays.

JulianKnodt added a commit to JulianKnodt/rust that referenced this issue Aug 13, 2020
Add note & example about iter order

Add doc changes

Update doc comments
JulianKnodt added a commit to JulianKnodt/rust that referenced this issue Aug 24, 2020
Add note & example about iter order

Add doc changes

Update doc comments
@vadixidav
Copy link
Contributor

vadixidav commented Aug 30, 2020

This feature is currently breaking arraymap::ArrayMap without enabling the feature:

error[E0658]: use of unstable library feature 'array_map'
  --> lambda-twist/tests/consensus.rs:25:6
   |
25 |     .map(|&p| Point3::from(p));
   |      ^^^
   |
   = note: see issue #75243 <https://github.com/rust-lang/rust/issues/75243> for more information
   = help: add `#![feature(array_map)]` to the crate attributes to enable

This code started breaking on more recent Rust versions as it seems that it thinks I am attempting to use the map method on array created by this feature, but actually I am using arraymap::ArrayMap::map.

Since I am depending on nightly anyways, I will simply use this feature instead to get around this issue.

@JulianKnodt
Copy link
Contributor Author

Ah I didn't know this crate existed, thanks for the heads up. I'm not quite sure how to resolve this, but will investigate.

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Sep 10, 2020
@KodrAus KodrAus added the A-slice Area: `[T]` label Sep 22, 2020
@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Sep 23, 2020

That seems like a compiler bug. I think a warning should show, but due to map accepting &self rather than self, it's not being seen. The problem can be seen with the following program:

trait ArrayExt {
    fn map<F: FnMut(i32) -> i32>(&self, f: F);
}

impl ArrayExt for [i32; 1] {
    fn map<F: FnMut(i32) -> i32>(&self, f: F) {}
}

fn main() {
    [4].map(|x| x);
}

In my opinion, this should print the following:

warning: a method with this name may be added to the standard library in the future
  --> src/main.rs:10:15
   |
10 |     [4].map(|x| x);
   |         ^^^
   |
   = note: `#[warn(unstable_name_collisions)]` on by default
   = warning: once this method is added to the standard library, the ambiguity may cause an error or change in behavior!
   = note: for more information, see issue #48919 <https://github.com/rust-lang/rust/issues/48919>
   = help: call with fully qualified syntax `ArrayExt::map(...)` to keep using the current method
   = help: add `#![feature(array_map)]` to the crate attributes to enable `array::<impl [T; N]>::map`

That being said, I'm kinda wondering whether [T; N]::map should use self or &self? Maybe we should have two methods, because they both have their use-cases. &mut self is also possible, albeit that one is kinda niche.

@JulianKnodt
Copy link
Contributor Author

Hm, fair questions. It does usually make sense to take &self since you'd need to make a whole new array of items, but if you want to take ownership of the elements, maybe there is a use case I imagine that probably finding an iterator solution there might be more elegant and is in the works. It would be helpful to post this in the general issues, and link to this Tracking issue for this sort of thing. For &mut self, if you want to modify elements in place, I think it makes most sense to just use a for loop over the elements and modify them that way instead of thru a method.

@LukasKalbertodt
Copy link
Member

That being said, I'm kinda wondering whether [T; N]::map should use self or &self? Maybe we should have two methods, because they both have their use-cases. &mut self is also possible, albeit that one is kinda niche.

That should be solved by #75490. Then you can simply as array.each_ref().map(...), for example. That makes it work like Option, where you also only have one map that takes self but also have Option::as_ref and Option::as_mut to work on references to the underlying data instead.

@LukasKalbertodt
Copy link
Member

What do you all think about specifying the order in which the closure is called and then already stabilizing this method? I know it hasn't been on nightly for very long, but I don't see what we would want to change about this. It's a pretty straight forward method.

Let's go through the parts of the signature:

  • Name: I'm pretty sure we all agree map is the best choice here.
  • Receiver self, &self or &mut self: as I said in my last comment, array::each_ref and array::each_mut will solve this in the same way it is solved for Option, so self by value seems like the right choice to me.
  • Closure type: FnOnce is out. But maybe Fn instead of FnMut? That's basically only useful if we want to map each element concurrently. But there is currently no similar method in std which internally/secretly uses multiple threads, so I'm pretty sure we don't want that. Apart from that, array lives in core, where we don't even have access to std::thread. Finally, all other map methods take FnMut.

Regarding the order in which the closure is called: I think it is safe to specify the obvious order (index order, starting from 0, going up). We don't work on elements concurrently as mentioned above, so I don't see why we would chose another order.

@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Jan 11, 2021

Thread usage mostly has to do with Send and Sync bound, and I don't think map should require those bounds for its callback. Let's leave multi-threaded array maps to rayon.

FnMut I think makes most sense. Using Fn doesn't prevent mutation anyway, and only will make API more annoying to use in cases where computing the value requires mutation (for example when the values are being memoized) by requiring RefCell usage.

@usbalbin
Copy link
Contributor

There were some light discussion in #80094 about something like iterators that encode the length. Would that be something to consider before stabilizing this array_map?

@eduardosm
Copy link
Contributor

It seems that map is not getting as optimized as a loop or a manually unrolled version: https://godbolt.org/z/hsEdqv

@JulianKnodt
Copy link
Contributor Author

It seems that zip is quite annoying: https://godbolt.org/z/4do8nT,
Here's enumerate instead of zip: https://godbolt.org/z/MnrEPo.
That being said, I do have a blocked PR up for optimizing it a little.

@Lokathor
Copy link
Contributor

I'm also blocked by this hitting stable.

@LukasKalbertodt
Copy link
Member

With #82098, the assembly output has improve somewhat: https://godbolt.org/z/hsEdqv (Same link @eduardosm posted above, but Godbolt automatically updates the nightly). It's still not as good as the other two versions though.

@usbalbin
Copy link
Contributor

usbalbin commented Apr 17, 2021

Sorry for the noise, please let me know if I should ask this question somewhere else and if so where :)

What is the (or is there a) long term plan when it comes to iter-like methods on [T; N]? Will there be lots of methods added as time goes or will map and generate/from_fn be the only ones? (semi rhetorical question)

My question is If the plan is to make arrays behave a bit like iterators then why not make an iterator-like thing that works with arrays(and other collections of fixed size) instead of tacking the methods directly to [T; N]? I have made a small (WIP) proof of concept iter_fixed of the idea in my post above(happens to depend on array_map).

With that, map would like something like:

let res: [_; 4] = [1, 2, 3, 4]
    .into_iter_fixed()
    .map(...)
    .collect();

a few more lines of code, sure, however it gives access to a whole lot of other methods and code can be made generic over the underlying collection.

Edit: the proposed iter-like methods I know of this far are map, from_fn and zip

@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Apr 18, 2021

Sorry for the noise, please let me know if I should ask this question somewhere else and if so where :)

What is the (or is there a) long term plan when it comes to iter-like methods on [T; N]? Will there be lots of methods added as time goes or will map and generate/from_fn be the only ones? (semi rhetorical question)

My question is If the plan is to make arrays behave a bit like iterators then why not make an iterator-like thing that works with arrays(and other collections of fixed size) instead of tacking the methods directly to [T; N]? I have made a small (WIP) proof of concept iter_fixed of the idea in my post above(happens to depend on array_map).

With that, map would like something like:

let res: [_; 4] = [1, 2, 3, 4]
    .into_iter_fixed()
    .map(...)
    .collect();

a few more lines of code, sure, however it gives access to a whole lot of other methods and code can be made generic over the underlying collection.

Edit: the proposed iter-like methods I know of this far are map, generate/from_fn(cant find the issue) and zip

Most of methods in iter_fixed are unlikely to be introduced at this point as they would need to do computations to calculate new length of array and that is not allowed on stable.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jun 22, 2021
@CryZe
Copy link
Contributor

CryZe commented Jun 22, 2021

The generated code isn't always optimal right now, but that doesn't have to block stabilization. We can always keep improving the implementation.

I'm pretty sure that the performance pitfall is not something that can in any way ever be mitigated other than through possibly clippy lints, which however is maybe good enough.

@m-ou-se
Copy link
Member

m-ou-se commented Jun 22, 2021

Are you talking specifically about large arrays? For an [u8; 8].map(..).map(..).map(..) it's no different than just applying three functions to a u64 or something. Moving or copying large arrays has always been a bit of a performance pitfall in Rust. Some of these pitfalls were avoided because we only had implementations up to a length of 32, but that's no longer the case. A warning (from clippy or rustc) about moving/copying large arrays might be useful. I don't think we need to block array.map() on it, because it's not a problem specific to map.

@CryZe
Copy link
Contributor

CryZe commented Jun 22, 2021

No, this has nothing to do with moving (that's a separate issue, that can indeed be fixed). Instead it has something to do with the fact that the maps can't ever "merge together" as they would in a normal iterator chain. It is forced to fully complete the first map before starting the second, meaning that it is forced to do 3 entire loops over the array as opposed to just a single loop with the 3 closures merged and therefore optimized together. LLVM may be smart enough to at some point (atm it definitely doesn't do this) realize that the 3 closures are order independent and do the optimization anyway, but the moment either of those do anything that's not order independent, no optimization will ever be possible.

Overall I guess it's fine merging this. There are many other ways to write slow code in Rust. This just needs a few lints and that's it.

@LukasKalbertodt
Copy link
Member

@CryZe I just played with a few examples and looked at the generated assembly. E.g. see this: https://rust.godbolt.org/z/M1KTbYfqv

For some examples, LLVM really doesn't perform too well. I wasn't quite aware of that, so thanks for bringing that up again. But note that for smallish arrays, LLVM seems to unroll all loops. And in those cases, it seems to perform quite well and combine the different operations. In the example above, the threshold is somewhere between 32 and 50. And I would expect a considerable amount of array::map uses cases to be about (very) short arrays, e.g. [f32; 3], [f32; 4] and [[f32; 4]; 4] being common in computer graphics (and the 4x4 matrix is already basically the largest example I can think of).

And while I don't know anything about how LLVM works internally, I feel like there is nothing fundamental about the design of array::map keeping LLVM from better optimizing this, right?

All that said, I think a note about this possible performance pitfall in the docs for array::map is probably warranted. But yeah, I think the number of use cases that won't encounter this optimization problem still vastly outnumbers the ones that do. So I still think stabilization is already a good idea.

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Jul 2, 2021
@rfcbot
Copy link

rfcbot commented Jul 2, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@Lokathor
Copy link
Contributor

Lokathor commented Jul 5, 2021

Even if this method is useless in 99% of cases, it lets people write default arrays. So for that alone it's worth it.

And the 99% figure is itself hyperbolic.

@PonasKovas
Copy link
Contributor

I don't know if this is the right place for this, but I'd like to report that array::map uses the stack excessively and operating on larger arrays result in unexpected stack overflows.

Minimal example: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=2d96e8f66b3432a8d1ecf0ca4331f85a

In the example, at some point more than 16 times more stack is used than the original array size.

As far as I've looked into the actual code, the only thing that stood out to me is use of core::mem::transmute_copy instead of core::mem::transmute in core::array::IntoIter::new, but I doubt it's the reason for this.

@Lokathor
Copy link
Contributor

Lokathor commented Jul 6, 2021

Works in release at least.

In general, Rust debug performance has problems like this all the time, and can hit stack size limits easily. Particularly on embedded for example.

I think Rust can work on this problem, but it's not specifically the fault of array_map.

@programmerjake
Copy link
Member

created #86912 to track that stack usage issue

@rfcbot rfcbot added the finished-final-comment-period The final comment period is finished for this PR / Issue. label Jul 12, 2021
@rfcbot
Copy link

rfcbot commented Jul 12, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@rfcbot rfcbot added to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jul 12, 2021
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Jul 16, 2021
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Jul 16, 2021
…nytm

Stabilize `[T; N]::map()`

This stabilizes the `[T; N]::map()` function, gated by the `array_map` feature. The FCP has [already completed.](rust-lang#75243 (comment))

Closes rust-lang#75243.
@bors bors closed this as completed in 7c5cabe Jul 16, 2021
JohnTitor added a commit to JohnTitor/rust that referenced this issue Jul 30, 2021
…ocs, r=m-ou-se

Add docs about performance and `Iterator::map` to `[T; N]::map`

This suboptimal code gen for some usages of array::map got a bit of
attention by multiple people throughout the community. Some cases:

- rust-lang#75243 (comment)
- rust-lang#75243 (comment)
- https://www.reddit.com/r/rust/comments/oeqqf7/unexpected_high_stack_usage/

My *guess* is that this gets the attention it gets because in JavaScript
(and potentially other languages), a `map` function on arrays is very
commonly used since in those languages, arrays basically take the role
of Rust's iterator. I considered explicitly naming JavaScript in the
first paragraph I added, but I couldn't find precedence of mentioning
other languages in standard library doc, so I didn't add it.

When array::map was stabilized, we still wanted to add docs, but that
somehow did not happen in time. So here we are. Not sure if this sounds
crazy but maybe it is worth considering beta backporting this? Only if
it's not a lot of work, of course! But yeah, stabilized array::map is
already in beta and if this problem is really as big as it sometimes seems,
might be worth having the docs in place when 1.55 is released.

CC `@CryZe`

r? `@m-ou-se` (since you were involved in that discussion and the stabilization)
JohnTitor added a commit to JohnTitor/rust that referenced this issue Jul 30, 2021
…ocs, r=m-ou-se

Add docs about performance and `Iterator::map` to `[T; N]::map`

This suboptimal code gen for some usages of array::map got a bit of
attention by multiple people throughout the community. Some cases:

- rust-lang#75243 (comment)
- rust-lang#75243 (comment)
- https://www.reddit.com/r/rust/comments/oeqqf7/unexpected_high_stack_usage/

My *guess* is that this gets the attention it gets because in JavaScript
(and potentially other languages), a `map` function on arrays is very
commonly used since in those languages, arrays basically take the role
of Rust's iterator. I considered explicitly naming JavaScript in the
first paragraph I added, but I couldn't find precedence of mentioning
other languages in standard library doc, so I didn't add it.

When array::map was stabilized, we still wanted to add docs, but that
somehow did not happen in time. So here we are. Not sure if this sounds
crazy but maybe it is worth considering beta backporting this? Only if
it's not a lot of work, of course! But yeah, stabilized array::map is
already in beta and if this problem is really as big as it sometimes seems,
might be worth having the docs in place when 1.55 is released.

CC ``@CryZe``

r? ``@m-ou-se`` (since you were involved in that discussion and the stabilization)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-generics Area: const generics (parameters and arguments) A-slice Area: `[T]` C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.