-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Array iterator classes private to hide implementation from client #193
base: main
Are you sure you want to change the base?
Changes from 12 commits
e6691a1
84535ad
42b09a7
5f5d79f
a52eb5f
73099ee
873c378
b0fd354
f492370
62a7b03
48fc8bd
255c487
869eb82
2bfd088
7bf7641
06018fc
e60bb5a
16ec9ef
5b91ce2
7fda29d
4363ab0
5dbe1ca
68b7695
c0da56d
b539018
44103dd
d61d656
18f8d49
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,285 @@ | ||
- Feature Name: array_private_classes | ||
- Start Date: 2022-01-08 | ||
- RFC PR: (leave this empty) | ||
- Pony Issue: (leave this empty) | ||
|
||
# Summary | ||
|
||
Collection classes should not expose internal classes through iterators functions. | ||
|
||
Following information hiding design principle, the builtin classes of collection | ||
data structures must not be made visible through iterator functions like `ArrayKeys`, | ||
`ArrayValues` and `ArrayPairs`. These classes can be made private as they are | ||
only used as return types for `Array` functions `keys`, `values` and `pairs`. | ||
The return values for these functions are changed to the more general interface | ||
`Iterator`. | ||
|
||
A new interface `RewindableIterator` is defined to allow for rewindable iterators, | ||
like it is the case for `Array` `values`. | ||
|
||
This design principle is applied to the other collection classes that expose | ||
internals too like: | ||
|
||
* `List` | ||
* `Map` | ||
* persistent `Map` | ||
* `Vec` | ||
* persistent `Vec` | ||
* `Set` | ||
* `Itertools` | ||
|
||
This is a breaking change for collections' client code that use now internal | ||
classes but a search on Github repositories shows that the impact should be | ||
limited. | ||
|
||
# Motivation | ||
|
||
This change brings: | ||
|
||
- Applying the design principle of | ||
[hiding implementation details](https://en.wikipedia.org/wiki/Information_hiding) | ||
but offer a general and stable interface. Returning interfaces instead of concrete | ||
classes allows changing the implementation. Usually, one must return the most | ||
general type that fullfils the contract of the function (in the case of the | ||
functions discussed in this RFC, iteration). | ||
- Collections' functions `keys`, `values` and `pairs` definitions are made more | ||
general. Iterators implementation details are not public. Internal classes used | ||
by implementation like `*Keys`, `*Values` and `*Pairs` are now | ||
[opaque data types](https://en.wikipedia.org/wiki/Opaque_data_type). Generally, | ||
when using these collection classes, clients are not interested by the iterators | ||
implementation, but by the types these iterators return and that is provided by | ||
the generic parameters. | ||
- The generic return signature of these 3 iterating functions is simpler to | ||
understand for clients of collection classes. | ||
pmetras marked this conversation as resolved.
Show resolved
Hide resolved
|
||
- Reduces the number of public classes in the standard library by hiding 18 | ||
specialised classes (iterators implementations) of which 3 are from the | ||
`builtin` module. | ||
- The interface `RewindableIterator` is added to create rewindable iterators | ||
(can be re-start from first value). | ||
|
||
This change remains compatible with the existing code base but for client code | ||
that is directly using the classes `*Keys`, `*Values` and `*Pairs`. A search on | ||
Github shows that the impact is very limited. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This doesn't seem to be motivation and should be worked into design as these are notes and statements about a some specific design points. |
||
|
||
# Detailed design | ||
pmetras marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Iterating functions in collections `keys`, `values` and `pairs` are changed to | ||
return `Iterator` and the classes that implement these iterators are made private. | ||
Here are the full implementation of these functions for the `Array` class (changes | ||
in other collection classes are identical). | ||
|
||
As the function `values` of class `Array` uses an iterator with a `rewind` function | ||
that is not part of the `Iterator` interface, a new interface `RewindableIterator` | ||
is added to enable creation of rewindable iterators. | ||
|
||
```pony | ||
fun keys(): Iterator[USize]^ => | ||
""" | ||
Return an iterator over the indices in the array. | ||
""" | ||
_ArrayKeys[A, this->Array[A]](this) | ||
|
||
fun values(): RewindableIterator[this->A]^ => | ||
""" | ||
Return an iterator over the values in the array. | ||
""" | ||
_ArrayValues[A, this->Array[A]](this) | ||
|
||
fun pairs(): Iterator[(USize, this->A)]^ => | ||
""" | ||
Return an iterator over the (index, value) pairs in the array. | ||
""" | ||
_ArrayPairs[A, this->Array[A]](this) | ||
pmetras marked this conversation as resolved.
Show resolved
Hide resolved
|
||
``` | ||
|
||
Note: To remain consistent with `Array` behaviour, functions `keys` and `pairs` | ||
should return a `RewindableIterator` too but we limited the API change to minimum | ||
as we did not understood why it was not already the case. | ||
pmetras marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
```pony | ||
interface RewindableIterator[A] is Iterator[A] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's no need for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's true for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So in the case of those, it made sure that if the Iterator interface was changed that all those concrete classes would fail compilation, is that the intention here? If yes, I think there is a larger discussion about the idea of duplicating the methods from Iterator in RewindableIterator and then using It feels odd. It might be worth it to avoid having an interface that is only I know you had some conversation around this with @jemc. I think you should discuss with him some supporting justification for this. Is this a one off? Is this establishing a pattern that we want to use in the standard library? Is it a pattern that is already being used in the standard library? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My suggestion would be to keep the For an
Or from the perspective of the interface user:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As noted below by me, I prefer the alternative design of not having There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand what is finally desired here. What would be the |
||
""" | ||
A `RewindableIterator` is an iterator that can be rewinded, that is start | ||
again from first item. The data structure being iterated on can't change the | ||
order it return iterated items. | ||
""" | ||
fun has_next(): Bool | ||
""" | ||
Return `true` when function `next` can be called to get next iteration item. | ||
""" | ||
|
||
fun ref next(): A ? | ||
""" | ||
Return the next item of the iteration or an error in case there are no other | ||
items. A previous call to `has_next` check if we can continue iteration. | ||
""" | ||
|
||
fun ref rewind(): Iterator[A]^ | ||
""" | ||
Start the iterator over again from the beginning. | ||
""" | ||
``` | ||
|
||
The code of the standard library is adapted to remove use of these now private | ||
classes, mainly in tests. Here are the files that must be changed: | ||
pmetras marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
* `packages/builtin/array.pony` as shown above | ||
* `packages/itertools/iter.pony` in function `cycle` | ||
* `packages/collections/heap.pony` in function `values` | ||
* `packages/collection/builtin/_test.pony` in class `_TestArrayValuesRewind` | ||
* `packages/collections/list.pony` | ||
* `packages/collections/map.pony` | ||
* `packages/collections/persistent/map.pony` | ||
* `packages/collections/persistent/vec.pony` | ||
* `packages/collections/set.pony` | ||
* `test/libponyc/util.cc` to change the name of the class to `_ArrayValues` | ||
|
||
## Detailed changes | ||
|
||
In order to judge how the API becomes simpler to understand for clients of the | ||
collections classes, here are the changes in the functions' signatures. The `-` | ||
line shows the old signature while the `+` one is the new: | ||
|
||
```pony | ||
// Array | ||
- fun keys(): ArrayKeys[A, this->Array[A]]^ => | ||
+ fun keys(): Iterator[USize]^ => | ||
- fun values(): ArrayValues[A, this->Array[A]]^ => | ||
+ fun values(): RewindableIterator[this->A]^ => | ||
- fun pairs(): ArrayPairs[A, this->Array[A]]^ => | ||
+ fun pairs(): Iterator[(USize, this->A)]^ => | ||
|
||
// Heap | ||
- fun values(): ArrayValues[A, this->Array[A]]^ => | ||
+ fun values(): Iterator[this->A]^ => | ||
|
||
// List | ||
- fun nodes(): ListNodes[A, this->ListNode[A]]^ => | ||
+ fun nodes(): Iterator[this->ListNode[A]]^ => | ||
- fun rnodes(): ListNodes[A, this->ListNode[A]]^ => | ||
+ fun rnodes(): Iterator[this->ListNode[A]]^ => | ||
- fun values(): ListValues[A, this->ListNode[A]]^ => | ||
+ fun values(): Iterator[this->A]^ => | ||
- fun rvalues(): ListValues[A, this->ListNode[A]]^ => | ||
+ fun rvalues(): Iterator[this->A]^ => | ||
|
||
// Map | ||
- fun keys(): MapKeys[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun keys(): Iterator[this->K]^ => | ||
- fun values(): MapValues[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun values(): Iterator[this->V]^ => | ||
- fun pairs(): MapPairs[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun pairs(): Iterator[(this->K, this->V)]^ => | ||
|
||
// Persistent Map | ||
- fun val keys(): MapKeys[K, V, H] => | ||
+ fun val keys(): Iterator[K] => | ||
- fun val values(): MapValues[K, V, H] => | ||
+ fun val values(): Iterator[V] => | ||
- fun val pairs(): MapPairs[K, V, H] => | ||
+ fun val pairs(): Iterator[(K, V)] => | ||
|
||
// Persistent Vec | ||
- fun val keys(): VecKeys[A]^ => | ||
+ fun val keys(): Iterator[USize]^ => | ||
- fun val values(): VecValues[A]^ => | ||
+ fun val values(): Iterator[A]^ => | ||
- fun val pairs(): VecPairs[A]^ => | ||
+ fun val pairs(): Iterator[(USize, A)]^ => | ||
|
||
// Set | ||
- fun values(): SetValues[A, H, this->HashSet[A, H]]^ => | ||
+ fun values(): Iterator[this->A]^ => | ||
``` | ||
|
||
# How We Teach This | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nothing in this section seems to be about teaching people anything going forward about the pattern nor explaining to people impacted by the breaking change what they need to be aware of. This section should be devoted to what if anything should be done with Pony Patterns, the Pony tutorial, standard library documentation, and release notes to adapt to the new world this RFC would bring about. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @pmetras - "How We Teach This" could include a post on the "Pony Patterns" site that demonstrates how to use a public interface to hide the implementation of a private class, rather than making that class public. Like you, I see this as a desirable design pattern, so we could have a "Pony Patterns" example showing how we did this in the Pony standard library with |
||
|
||
This change keeps the code compatible in the vast majority of cases. When client | ||
classes are defining objects of these now private types, the reason is usually | ||
to get access to the function `rewind` that was not defined in `Iterator`. By | ||
adding the interface `RewindableIterator`, client code can easily be adapted, | ||
replacing `ArrayValues[A]` by `RewindableIterator[A]`. | ||
|
||
Also, client code generally uses these functions to iterate on the returned types | ||
and does not try to access the iterator directly but is interested by the iterated | ||
items. When client code refers to the iterator type, that's generally useless and | ||
the code can be rewritten to be made shorter and more future proof. | ||
|
||
A [search on Github Pony code](https://github.com/search?q=%22ArrayValues%22+language%3APony&type=code) | ||
finds 24 files using the class `ArrayValues`, of which 6 are copies of `array.pony` file. | ||
|
||
For instance, in | ||
[xml2xpath.pony](https://github.com/redvers/pony-libxml2/blob/bbca5d98d48854bfec2c6ee110220873ecc4df34/pony-libxml2/xml2xpath.pony#L41), | ||
the code can be changed from | ||
|
||
```pony | ||
fun values(): ArrayValues[Xml2node, this->Array[Xml2node]]^ ? => | ||
if (allocated) then | ||
ArrayValues[Xml2node, this->Array[Xml2node]](nodearray) | ||
else | ||
error | ||
end | ||
``` | ||
|
||
to | ||
|
||
```pony | ||
fun values(): RewindableIterator[Xml2node]^ ? => | ||
if (allocated) then | ||
nodearray.values() | ||
else | ||
error | ||
end | ||
``` | ||
|
||
In this sample, the developer was not really concerned by the type of the iterator | ||
but that the `values` function must return an `RewindableIterator` over `Xml2node`. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do not think you can speak on behalf Red's concerns here; unless I am missing some prior conversation where Red stated this directly. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, it was only an example I found while surveying how existing code were using the concrete class on Github and that I used to demonstrate the type of impact in the RFC. I saw that in the few examples I found, these concrete class were not used for their implementation, but they were used as iterators only. In the example, I showed how an existing piece of code could be rewritten using the new functions signatures while still giving the same results. And as a consequence, the code will be simpler to understand. |
||
The new version makes the code simpler to understand. | ||
|
||
This change in `array.pony` and other collections will break such code but it | ||
can be easily adapted to use the new API. And it will make the standard library | ||
easier to learn by reducing the number of public types. | ||
|
||
# How We Test This | ||
|
||
Pony tests must continue to pass. No additional tests are need as after review | ||
the existing coverage in Pony standard library tests is sufficient | ||
|
||
# Drawbacks | ||
|
||
Will break any existing code that uses any of the classes that are currently | ||
public and will be made private by this RFC. | ||
|
||
# Alternatives | ||
|
||
Stay as is. Continue the | ||
[discussion on Zulip](https://ponylang.zulipchat.com/#narrow/stream/189959-RFCs/topic/Make.20Array.20iterators.20private). | ||
pmetras marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Instead of defining a new type of iterator with the `RewindableIterator` interface, | ||
we can consider only the rewindable part of it in an `Rewindable` interface: | ||
|
||
```pony | ||
interface Rewindable[A] | ||
fun ref rewind(): Iterator[A]^ | ||
""" | ||
Start the iterator over again from the beginning. | ||
""" | ||
``` | ||
|
||
Then one can create a rewindable iterator by composing these two interfaces: | ||
|
||
```pony | ||
type RewindableIterator[A] is (Iterator[A] & Rewindable[A]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As noted above, I prefer this approach. I think whenever possible we should have non-overlapping interfaces in the stdlib which are then joined into commonly used type alias such as this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will lead to an explosion of types in the standard library and in this case, they will need to be in builtin unless Array doesn't implement the same iterator interface as things like Map etc in collections which strikes me as weird. A key goal is to keep builtin as small as possible. If this wasn't going to start us towards either
then I might be in favor of this. As is, I am not. |
||
``` | ||
|
||
The rewindable type can be used with other types than `Iterator`, like a data | ||
structure that would implement a rewindable property. This alternative was | ||
[put aside](https://github.com/ponylang/rfcs/pull/193#discussion_r780793165) to | ||
prevent name colisions in `builtin` with user-named types. | ||
|
||
# Unresolved questions | ||
|
||
Pony language definition is not precise about | ||
[type variance](https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)) | ||
and its impact in collections classes, particularly for covariant return types | ||
of functions (in this RFC, `keys`, `values` and `pairs`). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Standard notions of covariance and contravariance do apply as normal in Pony, including covariant return types. Do you still have open/unresolved questions on this topic? If so, I can probably help resolve them. Feel free to expand your questions here or hit me up in the Zulip chat to discuss further. Once you're comfortable with the answers we can remove this from the "unresolved questions" section, since I don't think there are any Pony surprises related to this particular topic that are relevant to this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Motivation can be used a precedence for future RFCs so I want this clarified to state that we are not setting a precedence to generally avoid returning concrete types in the stdlib -- which would be a mistake in my opinion. This reasoning to use opaque types due to client disinterest does not extend beyond this RFC -- specifically we are stating that we have those expectations of disinterest for iterator implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The objective of the RFC is not about concrete type vs virtual ones, but more about applying Liskov principle. And it's difficult to say that a principle applies only on a special RFC... My initial questioning was about only about
Array
iterators and now it extends to all collection iterators, where we have to define what collection means (because collection data structures are not limited to the collection package).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vote against the RFC on those grounds. If you want to include it in the motivation you can, but I won't be in support in that case.