[RFC 0181] List index syntax #181
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,290 @@ | ||
--- | ||
feature: list-index-syntax | ||
Since we are here, what about extending this to sets? { x=1; y=2; }.["x"]

Why? You can already write

I think this is a good argument why the

I do not have a fancy technical term for this. While lists are indexed by non-negative integers, sets are indexed by keys. If we use one syntactic construction for the one and a different construction for the other, we add an otherwise unnatural distinction. The idea from piegamesde looks better since it saves two bytes. On the other hand, as speculation, with the bracketed syntax we can have nice things like
plus let
keysNeeded = [ "x" "y" ];
set = { x=1; y=2; z=4; };
in
set.keysNeeded # [ 1 2 ]

I think this should be discussed in its own RFC. IMO Since unquoted attr names can't start with a number, this fits in nicely with the current language.
This would leave |
||
start-date: 2024-07-14 | ||
author: rhendric | ||
co-authors: | ||
shepherd-team: infinisil inclyc | ||
shepherd-leader: | ||
|
||
related-issues: https://github.com/NixOS/nix/issues/10949, https://github.com/NixOS/rfcs/pull/137 | ||
--- | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
This proposal extends the attrpath syntax to include `'[' INT ']'` components that refer to elements in lists. | ||
This would enable expressions such as the following: | ||
|
||
```nix | ||
x.[0] # = builtins.elemAt x 0 | ||
x.[1] or y # = if builtins.isList x && builtins.length x > 1 then builtins.elemAt x 1 else y | ||
|
||
x ? [3].y # = builtins.isList x && builtins.length x > 3 && builtins.elemAt x 3 ? y | ||
``` | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
@sg-qwt Let's try to keep the discussions in threads, so they can be easily marked as resolved later
A primary other use case is |
||
|
||
I'm in a REPL. | ||
I'm exploring parts of Nixpkgs. | ||
I type a partial expression: | ||
|
||
``` | ||
nix-repl> someExpr.foo | ||
{ bar = { ... }; ignoreThis = { ... }; moreStuff = { ... }; } | ||
``` | ||
|
||
I hit up-arrow and keep typing to drill deeper: | ||
|
||
``` | ||
nix-repl> someExpr.foo.bar | ||
{ baz = true; qux = [ ... ]; quux = { ... }; } | ||
|
||
nix-repl> someExpr.foo.bar.qux | ||
[ { ... } ] | ||
|
||
nix-repl> someExpr.foo.bar.qux.0 | ||
error: attempt to call something which is not a function but a list | ||
|
||
at «string»:1:1: | ||
|
||
1| someExpr.foo.bar.qux.0 | ||
| ^ | ||
``` | ||
|
||
Of course that doesn't work. | ||
I don't actually type that. | ||
What I actually do is hit up-arrow, then hit Home, then type `builtins.elemAt`, then hit End, then type ` 0`. | ||
|
||
``` | ||
nix-repl> builtins.elemAt someExpr.foo.bar.qux 0 | ||
{ greatMoreStuff = { ... }; } | ||
``` | ||
|
||
Now what do I get to do? | ||
That's right, hit up-arrow, then hit Home, then type `(`, then hit End, then type `).greatMoreStuff`. | ||
|
||
--- | ||
|
||
When writing Nix code, it is relatively uncommon to want to index into a list, and `builtins.elemAt` suffices. | ||
When exploring data in a REPL, however, indexing into lists is more common, and the above example illustrates how `elemAt` is incompatible with the attrpath selector syntax that is the primary means of data exploration. | ||
Many other programming languages allow syntaxes such as | ||
`foo.bar.qux[0].moreStuff` (C, many others) | ||
or `foo.bar.qux(0).moreStuff` (Octave, Scala) | ||
or `foo.bar.qux.[0].moreStuff` (F#, OCaml) | ||
or `foo.bar.qux.0.moreStuff` (Coco, LiveScript, the `--attr` option of `nix-instantiate` and other `nix` commands). | ||
Of these, only `.[]` does not conflict with existing syntax (see [Alternatives] for more details). | ||
|
||
# Detailed design | ||
[design]: #detailed-design | ||
|
||
The `attrpath` grammar nonterminal is currently defined as | ||
|
||
``` | ||
attrpath | ||
: attrpath '.' attr | ||
| attrpath '.' string_attr | ||
| attr | ||
| string_attr | ||
; | ||
``` | ||
|
||
where `attr` is a simple identifier and `string_attr` is either a string literal or a bare `${}` interpolation. | ||
This nonterminal is used in three contexts in the grammar: | ||
* In a selector (e.g. `expr.attrpath` or `expr.attrpath or default`) | ||
* As the right-hand side of the `?` operator (e.g. `expr ? attrpath`) | ||
* As the left-hand side of a binding (e.g. `let attrpath = expr; in body` or `{ attrpath = expr; }`) | ||
|
||
This proposal adds two productions to `attrpath`: | ||
|
||
``` | ||
| attrpath '.' '[' INT ']' | ||
| '[' INT ']' | ||
``` | ||
|
||
(Supporting non-literal expressions is scoped out of this proposal; see [Future work][future].) | ||
|
||
In a selector or `?` operator, these new productions have the following semantics: | ||
* `expr.prefix.[n].suffix` is the result of evaluating `(builtins.elemAt (expr.prefix) n).suffix` | ||
* `expr ? prefix.[n].suffix` is true if and only if all of the following are true: | ||
  * `expr ? prefix` (or `prefix` is empty) | ||
  * `expr.prefix` evaluates to a list with length at least `n + 1` | ||
  * `expr.prefix.[n] ? suffix` (or `suffix` is empty) | ||
* `expr.prefix.[n].suffix or expr2` is the result of evaluating `if expr ? prefix.[n].suffix then expr.prefix.[n].suffix else expr2` | ||
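
The selector, `?`, and `or` rules above can be approximated in today's Nix. The following is a minimal sketch rather than the patch's implementation: `hasIndex`, `elemAtOr`, and the `data` value are hypothetical names introduced here purely for illustration.

```nix
let
  # Hypothetical helper modelling `xs ? [n]`:
  # true only when xs is a list with more than n elements.
  hasIndex = n: xs: builtins.isList xs && n >= 0 && builtins.length xs > n;

  # Hypothetical helper modelling `xs.[n] or default`.
  elemAtOr = n: default: xs:
    if hasIndex n xs then builtins.elemAt xs n else default;

  # Illustrative data, not taken from Nixpkgs.
  data = { foo.bar = [ { baz = 1; } ]; };
in
{
  # Models `data.foo.bar.[0].baz`.
  selected = (builtins.elemAt data.foo.bar 0).baz;

  # Models `data ? foo.bar.[0].baz`; `&&` short-circuits, so each step
  # is only evaluated once the previous check has succeeded.
  present =
    data ? foo.bar
    && hasIndex 0 data.foo.bar
    && (builtins.elemAt data.foo.bar 0) ? baz;

  # Models `data.foo.bar.[5] or "none"`.
  fallback = elemAtOr 5 "none" data.foo.bar;
}
```

With this data, `selected` is `1`, `present` is `true`, and `fallback` is `"none"`. The sketch only covers a single index; the proposed `or` form would additionally fall back when a later attribute in the suffix is missing.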
|
||
It is a syntax error if either new production is used in the left-hand side of a binding. | ||
|
||
An implementation of this design is available as patches for Nix at <https://gitlab.com/rhendric/nix-list-index-syntax/>; see instructions there for use. | ||
I don't expect much regression, but have you tested current nixpkgs trunk?

I tested one build on the |
||
|
||
# Examples and Interactions | ||
[examples-and-interactions]: #examples-and-interactions | ||
|
||
``` | ||
nix-repl> [ 1 4 9 ].[2] | ||
9 | ||
|
||
nix-repl> pkgs.maptool.meta.maintainers.[0].github | ||
"rhendric" | ||
|
||
nix-repl> [ 1 4 9 ].[3] | ||
error: list index 3 is out of bounds | ||
|
||
at «string»:1:1: | ||
|
||
1| [ 1 4 9 ].[3] | ||
| ^ | ||
|
||
nix-repl> [ 1 4 9 ] ? [2] | ||
true | ||
|
||
nix-repl> [ 1 4 9 ] ? [3] | ||
false | ||
|
||
nix-repl> [ 1 4 9 ] ? [0].[0] | ||
false | ||
|
||
nix-repl> [ [ 1 4 9 ] ] ? [0].[0] | ||
true | ||
``` | ||
|
||
The following are all syntax errors: | ||
|
||
``` | ||
nix-repl> let n = 0; in [ 4 ].[n] | ||
error: syntax error, unexpected ID, expecting INT | ||
|
||
at «string»:1:22: | ||
|
||
1| let n = 0; in [ 4 ].[n | ||
| ^ | ||
|
||
nix-repl> { a.[0] = true; } | ||
error: syntax error, index '0' not allowed here | ||
|
||
at «string»:1:3: | ||
|
||
1| { a.[0] = true; | ||
| ^ | ||
|
||
nix-repl> let a.[0] = true; in a | ||
error: syntax error, index '0' not allowed here | ||
|
||
at «string»:1:5: | ||
|
||
1| let a.[0] = true; | ||
| ^ | ||
``` | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
The prototype [patch] implementing this feature adds a net 69 lines of code to Nix, excluding tests. | ||
|
||
[patch]: https://gitlab.com/rhendric/nix-list-index-syntax/-/blob/main-dist/patches/bracketed/from-90e630a5.patch | ||
|
||
As noted in passing in the motivation section, the proposed syntax differs from the syntax already used by Nix tools on the command line for `--attr`. | ||
`nix-instantiate` has no difficulty with: | ||
|
||
``` | ||
$ nix-instantiate --eval '<nixpkgs>' -A maptool.meta.maintainers.0.github | ||
"rhendric" | ||
``` | ||
|
||
This divergence might lead to confusion. | ||
|
||
Evolving the syntax of Nix always imposes a cost on third-party tools that process Nix syntax, including syntax highlighters, linters, formatters, static analyzers, and language servers. | ||
This syntax is not a dramatic extension of the language but would require support from all of the above for them to maintain full functionality. | ||
|
||
Finally, there is an opportunity cost to claiming new syntax. | ||
One could imagine speculative features that might want to use this syntax, such as a list or string slicing syntax, or a ‘list swizzle’ operator that desugars `expr.[ 2 0 1 ]` to `[ (elemAt expr 2) (elemAt expr 0) (elemAt expr 1) ]`. | ||
It is, in my opinion, unlikely that list and string manipulation (assuming that any feature in competition for this syntax would involve lists or strings somehow) would be so common in Nix as to make this a compelling objection. | ||
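
For comparison, the speculative 'list swizzle' can already be written as an ordinary function. This is a small sketch; `swizzle` is a hypothetical name, not an existing builtin or Nixpkgs library function.

```nix
let
  # Hypothetical helper: picks the elements of xs at the given indices, in order.
  # `builtins.elemAt xs` is partially applied and mapped over the index list.
  swizzle = indices: xs: map (builtins.elemAt xs) indices;
in
swizzle [ 2 0 1 ] [ "a" "b" "c" ]
```

This evaluates to `[ "c" "a" "b" ]`, which is what the desugaring of `expr.[ 2 0 1 ]` described above would produce for the same list.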
To me, this is a major blocker for the proposed syntax. While I agree that list slices are fairly uninteresting, this syntax could be used with tremendous benefits for set slicing, as drafted out here: https://md.darmstadt.ccc.de/nix2?view=#Set-slicing-confidence-mid (ignore the comma separated lists, which are a separate language improvement proposal) Therefore, I propose going with the

I'm in agreement. Unifying all the composite type elements under one operator is probably going to be nicer long-term. The

Agreed,

I'm not so sure about that part. Basically

I think it only follows naturally. In

One of them can be done at parse time, the other one has to be delayed until the expression is forced. Btw I don't know the actual performance impact this might have, but I'd be very wary of it

As I describe in the Alternatives section, the main concern with that is less about the evaluator's performance and more about external static analysis tools. |
||
|
||
# Alternatives | ||
Another alternative which might be worth mentioning would be

Even in a REPL, the motivating case of

I agree that "pipe operator" is good but not a real alternative for this case. |
||
[alternatives]: #alternatives | ||
|
||
#### `expr.${0}` | ||
|
||
The simplest alternative, aside from doing nothing, would be to reuse the `.${expr}` syntax that Nix already supports in attrpaths. | ||
In this alternative, when evaluating `expr.${idx}`, the interpreter would determine whether `idx` evaluates to an integer or a string. | ||
`expr` would be required to be a list if the former and an attribute set if the latter. | ||
|
||
While this is parsimonious with respect to syntax, it creates more complexity for static analyzers. | ||
It is currently statically known that the `expr` in `expr.${idx}` must be a set and that `idx` must be a string. | ||
Representing a new constraint that _either_ `expr` is a set and `idx` is a string _or_ `expr` is a list and `idx` is an integer might be difficult for such tools and could result in a degraded user experience. | ||
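
The run-time dispatch this alternative implies can be sketched as a plain function. `select` is a hypothetical helper written for illustration; it is not how the interpreter implements `.${}`.

```nix
let
  # Hypothetical helper modelling `expr.${idx}` under this alternative:
  # the type of idx, known only at evaluation time, decides whether
  # expr is treated as a list or as an attribute set.
  select = idx: expr:
    if builtins.isInt idx then builtins.elemAt expr idx
    else if builtins.isString idx then expr.${idx}
    else throw "index must be an integer or a string";
in
{
  fromList = select 1 [ "a" "b" "c" ];
  fromSet = select "x" { x = 1; y = 2; };
}
```

Here `fromList` is `"b"` and `fromSet` is `1`. The branch on `builtins.isInt` versus `builtins.isString` is the either/or constraint that a static analyzer would have to represent.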
|
||
#### `expr[1]` | ||
|
||
The most common syntax, by far, for indexing into a list or array in other programming languages is `expr[idx]`. | ||
Nix is currently whitespace-insensitive with respect to attrpaths, so such an expression is indistinguishable from `expr [idx]`, an application of the function `expr` to the argument `[idx]`. | ||
It would be possible to adapt the Nix lexer to interpret the `[` character in `expr[idx]` as the opening of an indexer or a list depending on whether there is whitespace before it. | ||
Among the disadvantages of this approach would be that it would make specifying the grammar of Nix more complicated, and that it may prevent long attrpaths from being broken naturally across lines. | ||
Perhaps most fatally, despite the odds of someone writing `expr[idx]` and intending a function call being virtually nil, the conservative principles of the Nix team forbid altering the meaning of even such an unlikely bit of syntax without some sort of larger language versioning or deprecation story, which has eluded us for some time. | ||
|
||
All of the above objections apply to `expr(1)` as well, with the additional drawback that in a wide array of C-like languages, this syntax represents exactly what it currently (if coincidentally) represents in Nix. | ||
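
Both spellings are already valid Nix today, which is the source of the conflict. In the sketch below, `f` and `g` are hypothetical functions chosen only to make the current parse visible.

```nix
let
  f = xs: builtins.length xs; # takes a list
  g = n: n + 1;               # takes a number
in
{
  # Parsed as `f [ 0 ]`: f applied to a one-element list.
  bracket = f[0];
  # Parsed as `g (1)`: g applied to a parenthesized integer.
  paren = g(1);
}
```

This evaluates to `{ bracket = 1; paren = 2; }`, so repurposing either form as an indexer would change the meaning of expressions that currently parse as function application.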
|
||
#### `expr.$[2]` | ||
|
||
The syntax `expr.$[idx]` was offered as a possibility in [NixOS/nix#10949](https://github.com/NixOS/nix/issues/10949). | ||
It resembles the `${}` syntax already used in attrpaths, with the change in delimiters suggesting a shift from attribute sets to lists. | ||
However, it requires an additional character to type and its technical qualities are identical to those of the proposed syntax without the `$` character. | ||
There is at least some prior art for `.[]` in OCaml and F#; there is none that I know of for `.$[]`. | ||
|
||
#### `expr.3` | ||
Everyone arguing for this alternative: you don't have to convince me of its technical superiority. I am already convinced, as I describe below. This is the syntax I'm using in my personal Nix build. Rather, you have to convince me that the Nix maintainers can be persuaded to make a breaking change to float literal syntax on a timeline that is less than, say, five years, when they have expressed reluctance to make ‘breaking changes’ to the language even with respect to obvious buggy behavior. Gating this proposal behind something that will never happen is, in practice, equivalent to rejecting the proposal. I would rather see some syntax implemented in a couple of years than wait indefinitely for the ideal syntax. And if sane language evolution manages to actually become policy at some point, in that case we'll have the tools we need to deprecate the less-appealing syntax and migrate to the better one, should we choose to do so.

With Obviously, If you introduce If yes, why duplicating and confusing

This proposal is only for when the desired index is known; replacing

I believe that adopting While we might agree on

I think I agree here. If we agree that

It doesn't really open that gate;

this is already criticized above for runtime overhead and issues with static analyzers.
These are not truly parallel, How to read

I mean, currently, such code likely hits "attempt to call something which is not a function but an integer" error She just gets the wrong result at the end. The code looks pretty valid doing

At least for Lix, I am currently setting up infrastructure to remove URL literals, and the However, I can't guarantee that adding the new syntax will be allowed without proper language versioning tools (which are on the roadmap but it will take years still).
I'll join that camp. Especially since the main motivation for this RFC is the REPL, where the user knows which index they want to use.

(What other projects do is their business; this is a Nix RFC.) For everyone using original Nix, a reminder that you can have either the proposed syntax or the |
||
|
||
It is no accident that a simple dotted index `expr.3` was the syntax chosen for attribute paths in `nix-instantiate` and friends. | ||
It is the most straightforward expression of intent imaginable, if it is known for certain that an attribute path is what is intended. | ||
Implementing the same syntax in the Nix language would be harmonious. | ||
The drawback is that, as with the `expr[1]` case, whitespace insensitivity means that `expr.3` is indistinguishable from `expr .3`, the application of a function to a float literal. | ||
As before, abandoning whitespace insensitivity is possible, if distasteful. | ||
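
The float conflict can be observed directly in current Nix. This is a minimal sketch; `describe` is a hypothetical helper that reports the type of whatever it is applied to.

```nix
let
  # Hypothetical helper: returns the type name of its argument.
  describe = x: builtins.typeOf x;
in
{
  # Function application to the float literal `.5`.
  spaced = describe .5;
  # Same token sequence without the space: still an application, not a selection.
  unspaced = describe.5;
}
```

Given the whitespace insensitivity described above, both attributes are expected to report a float argument, which is why `expr.3` cannot be reinterpreted as an index without changing existing behaviour.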
|
||
Another approach would be to abandon float literals that don't start with a digit. | ||
There are currently no such literals in Nixpkgs. | ||
Though such literals have historically been supported by many C-like languages, some languages (Haskell, Ruby, Rust, Swift) and the [Google C++ style guide](https://google.github.io/styleguide/cppguide.html#Floating_Literals) reject them. | ||
Several standards for science and engineering, such as the United States' [NIST Guide to the SI][NIST] and [National Renewable Energy Laboratory's Communication Standards][NREL], do the same. | ||
Forbidding them would also require a capacity for language versioning or deprecation, but the end result would not require adding whitespace sensitivity to the grammar. | ||
|
||
[NIST]: https://www.nist.gov/pml/special-publication-811/nist-guide-si-chapter-10-more-printing-and-using-symbols-and-numbers#1052 | ||
[NREL]: https://www.nrel.gov/comm-standards/editorial/zero.html | ||
|
||
If the practical barriers to introducing backwards incompatibilities into Nix were not a concern, this would be far and away my preferred choice. | ||
An implementation of this option is also available at <https://gitlab.com/rhendric/nix-list-index-syntax/>, and I'm using it as my main Nix package. | ||
(This patch is marginally more complex than the other because the lexer needs to be persuaded not to see a float in `expr.1.2`, but this is _not_ a grammar conflict because `expr . 1.2` doesn't parse as anything.) | ||
|
||
#### `expr!4` (or other character) | ||
|
||
To get the parsimony of `expr.3` but without the fuss of dealing with the float conflict, we could choose another symbolic character that isn't already an infix operator. | ||
Ideally this character would be somewhat mnemonic. | ||
Pros and cons of various choices are briefly covered below, in highly subjective best-to-worst order. | ||
|
||
* `!` (+ Haskelly, resembles `.` but, like, different) | ||
* `@` (+ evokes `elemAt`; − apparently we want this to be reserved for things that create bindings) | ||
* `\` (okay I guess?) | ||
* `&` (− association with bitwise-and, but possibly in Nix that's a sufficiently remote concept to be irrelevant) | ||
* `|` (− ditto for bitwise-or) | ||
* `$` (− strong Haskell association with a different concept) | ||
* `^` (− could be used for exponentiation if `**` isn't) | ||
* `%` (− could be used for modulo) | ||
* `` ` `` (− confusing) | ||
|
||
Relative to the proposed syntax, any feasible option here seems like it would be more alienating to new users, to a degree not worth the benefit of saving two characters. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
As mentioned above, F# also supports a `.[]` syntax for indexing into arrays, though since F# 6.0 the C-like notation is recommended over this. | ||
The following quote from the [changelog entry][F#6expridx] may be relevant, as a warning to us: | ||
|
||
> Up to and including F# 5, F# has used `expr.[idx]` as indexing syntax. Allowing the use of `expr[idx]` is based on repeated feedback from those learning F# or seeing F# for the first time that the use of dot-notation indexing comes across as an unnecessary divergence from standard industry practice. | ||
|
||
[F#6expridx]: https://learn.microsoft.com/en-us/dotnet/fsharp/whats-new/fsharp-6#simpler-indexing-syntax-with-expridx | ||
|
||
# Unresolved questions | ||
[unresolved]: #unresolved-questions | ||
|
||
None at this time. | ||
|
||
# Future work | ||
[future]: #future-work | ||
|
||
This proposal is motivated primarily, if not exclusively, by the REPL use case, in which the desired index is known. | ||
While there is minimal technical challenge to allowing arbitrary expressions inside the square brackets, doing so would consume a larger slice of the syntax design space. | ||
Just as there is a difference between `.foo` and `.${foo}`, it might be desirable to have a different syntax for this new case—`.[2]` and `.$[1 + 1]`, perhaps. | ||
Discussing these issues and determining whether the benefits relative to using `elemAt` are worth the drawbacks is left as future work. |
@rhendric @inclyc @AndersonTorres (but anybody is welcome to join of course): I created #nix-rfc181:matrix.org to have more synchronous discussions. I'd like to propose having a regular 2-weekly meeting to make continuous progress on this. If that sounds fine, please enter your availability here so we can find a time that works 🙂
It looks like our availability has close to zero overlap, so I don't think that's going to work. Instead let's discuss it semi-synchronously in the Matrix channel at least. @rhendric and @inclyc can you also join #nix-rfc-181:matrix.org?
@rhendric Another ping, can you also join?