Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to break long lines at assignment first before breaking within RHS #183

Open
gadenbuie opened this issue Jan 17, 2025 · 11 comments
Open

Comments

@gadenbuie
Copy link
Contributor

In chromote, we had this line

private$event_callback_counts[[domain]] <- private$event_callback_counts[[domain]] - 1

which, when wrapped to 80 characters with air 1.0.0 becomes

private$event_callback_counts[[domain]] <- private$event_callback_counts[[
        domain
      ]] -
        1

but I'd much rather this break at the <- (without me having to introduce that break manually)

private$event_callback_counts[[domain]] <- 
  private$event_callback_counts[[domain]] - 1
@DavisVaughan
Copy link
Collaborator

That's tough because you definitely want this to break within the RHS first

foo <- this %>% that() %>% this_other_long_thing() %>% and_this_one_too_as_well()

I agree the current situation isn't great but I'm not sure what the best situation is yet without diving in a bit more.

The line break right after the <- is currently an opt-in feature (through a persistent line break). It won't ever generate one of those line breaks automatically. I don't really love seeing those kinds of line breaks just appear as soon as you hit the line limit, because I greatly prefer the lhs and rhs to at least start on the same line.

@gadenbuie
Copy link
Contributor Author

gadenbuie commented Jan 17, 2025

One way of expressing my preference is that I prefer line-wrapping strategies that minimize line breaks, with exceptions.

In my original example, I'd much prefer breaking around an entire expression over introducing several breaks. Once internal breaks are required, then I like consistent breaks.

Personally, I'd be okay with your over-long pipe chain example becoming this

foo <- 
  this %>% that() %>% this_other_long_thing() %>% and_this_one_too_as_well()

if you're only 6 characters over the line width constraint. Very often the second line will still not be under, in which case you'd move to an "internal break" on the RHS

foo <- this_is_too_long %>%
  that() %>%
  this_other_long_thing() %>%
  and_this_one_too_as_well()

@lionel-
Copy link
Collaborator

lionel- commented Jan 21, 2025

I feel like we never want to break in subset operators whose arguments is a simple literal or symbol?

@DavisVaughan
Copy link
Collaborator

I feel like we never want to break in subset operators whose arguments is a simple literal or symbol?

This is also the conclusion that I came to, making the ideal result in this case

private$event_callback_counts[[domain]] <- private$event_callback_counts[[domain]]
  - 1

@gadenbuie
Copy link
Contributor Author

gadenbuie commented Jan 22, 2025

I agree with @lionel-, but I still find it much more readable to have the entire RHS on the same line

# very easy to miss that -1 is part of the RHS
private$event_callback_counts[[domain]] <- private$event_callback_counts[[domain]]
  - 1

# My preference (possible with persistent breaks)
private$event_callback_counts[[domain]] <- 
  private$event_callback_counts[[domain]] - 1

I just ran into another example that feels related

# Original
stopifnot(
  "Running shortcuts by name requires using an R file for your shortcuts" = tolower(fs::path_ext(path)) ==  "r"
)

# air-formatted
stopifnot(
  "Running shortcuts by name requires using an R file for your shortcuts" = tolower(
    fs::path_ext(path)
  ) ==
    "r"
)

# My preference (reformatted to above even with persistent breaks)
stopifnot(
  "Running shortcuts by name requires using an R file for your shortcuts" = 
    tolower(fs::path_ext(path)) ==  "r"
)

# air-compatible formatting
stopifnot(
  "Running shortcuts by name requires using an R file for your shortcuts" = {
    tolower(fs::path_ext(path)) == "r"
  }
)

Somewhere else in the issues I read @lionel- talk about symmetry, which made this whole thing click for me. There's a symmetry in breaking once at a higher level of the AST than introducing several breaks among lower branches of the tree, which I feel results in much easier to read code.

@DavisVaughan
Copy link
Collaborator

Not entirely sure what the best solution is yet, but I thought about this a little today and compared to biome/ruff. I think if we were to do this it would break down into two problems that need to be tackled. It seems like quite a big task though.

For this particular syntax type, can we ever move the RHS to its own line?

i.e. the RHS on its own line looks nice for this binary expression

private$event_callback_counts[[domain]] <-
  private$event_callback_counts[[domain]] - 1

But I would never want it to move onto the next line for this function call

private$event_callback_counts[[domain]] <- my_special_function(with, really_really_really, long_arguments)

That should always prefer expanding to

private$event_callback_counts[[domain]] <- my_special_function(
  with, 
  really_really_really, 
  long_arguments
)

Similarly for pipes, I still think I prefer

foo <- this %>% that() %>% this_other_long_thing() %>% and_this_one_too_as_well()

turning into

foo <- this %>% 
  that() %>% 
  this_other_long_thing() %>% 
  and_this_one_too_as_well()

rather than

foo <- 
  this %>% that() %>% this_other_long_thing() %>% and_this_one_too_as_well()

In particular, the last form here only buys you 4 more characters before you'd have to break again, but now a persistent new line is in play, so as you keep typing to add more to your pipe chain Air would never remove that persistent new line on its own...that would be annoying.

Ruff does sort of have a similar construct. It adds parentheses in some cases like this one

Image

That's kind of like what we are talking about here, but R would not need parens due to how the parser works.

Ruff has a trait NeedsParentheses that each construct defines an implementation for, so like binary expressions are allowed to get wrapped in parentheses, but function calls are not because they "bring their own"

For this particular syntax type, what format is best?

Say we decide binary expressions can move to their own line. We need the formatter to decide between (in order of least to most expanded)

# option 1 - no breaks (past line length)
private$event_callback_counts[[domain]] <- private$event_callback_counts[[domain]] - 1

# option 2 - on own line AND ensure the RHS doesn't have any breaks in it
private$event_callback_counts[[domain]] <-
  private$event_callback_counts[[domain]] - 1

# option 3 - fully broken out form
private$event_callback_counts[[domain]] <- private$event_callback_counts[[
        domain
      ]] -
        1

This typically ends up being some kind of job for best_fitting!, which is an expensive macro to use. Ruff does have something like this, its called best_fit_parenthesize()
https://github.com/astral-sh/ruff/blob/05abd642a8ddb3f3b6e7c78592995c6ccfca2520/crates/ruff_formatter/src/builders.rs#L1549

You can see here how its an optimized form of the 3 options listed above
https://github.com/astral-sh/ruff/blob/05abd642a8ddb3f3b6e7c78592995c6ccfca2520/crates/ruff_formatter/src/builders.rs#L1456-L1485

Unfortunately their best_fit_parenthesize() seemed to require some changes to the biome formatter itself (which they forked but we have not), so we wouldn't be able to do that exactly.

@DavisVaughan
Copy link
Collaborator

DavisVaughan commented Jan 22, 2025

Interestingly if you try to pythonize the stopifnot example it looks roughly the same as what air currently does

Image

This idea in general is a similar but different case (assignment versus named arguments)

@DavisVaughan
Copy link
Collaborator

A big worry of mine is dealing with this combined with the persistent line break behavior after the <-. Similar to what I said about the pipe chain example above:

# If this
private$event_callback_counts[[domain]] <- my_special_function(with, really_really_really, long_arguments)

# turned into this
private$event_callback_counts[[domain]] <- 
  my_special_function(with, really_really_really, long_arguments)

# but then you typed more
private$event_callback_counts[[domain]] <- 
  my_special_function(with, really_really_really, long_arguments, long_arguments, long_arguments)

# then it would respect the persistent line break and break as
private$event_callback_counts[[domain]] <- 
  my_special_function(
    with, 
    really_really_really, 
    long_arguments, 
    long_arguments, 
    long_arguments
  )

# rather than what you'd really want, which is
private$event_callback_counts[[domain]] <- my_special_function(
  with, 
  really_really_really, 
  long_arguments, 
  long_arguments, 
  long_arguments
)

You'd have to manually remove the line break before the my_special_function() to get to the final form

@gadenbuie
Copy link
Contributor Author

Thanks for writing and sharing your thoughts!

But I would never want it to move onto the next line for this function call

private$event_callback_counts[[domain]] <- my_special_function(with, really_really_really, long_arguments)

Never is a strong word but it matches how much I would want my_special_function( to start on a new line in this case. And where you said "rather than what you'd really want", I would have said "rather than what I'd never want" 😆 .

That said, I'm certainly willing to trade some of my style preferences for easy and consistent formatting! And I think it's very defensible to have a persistent line break formula require intervention, so I understand the desire to avoid adding line break rules that have the formatter insert characters that are reserved for humans.

In other words, I could very much live with a solution that lets me add a persistent line break after = too so that the following is left alone by Air.

stopifnot(
  "Running shortcuts by name requires using an R file for your shortcuts" = 
    tolower(fs::path_ext(path)) ==  "r"
)

@gadenbuie
Copy link
Contributor Author

Actually, I suppose I can live with this:

stopifnot(
  "Running shortcuts by name requires using an R file for your shortcuts" = {
    tolower(fs::path_ext(path)) == "r"
  }
)

Joking aside, I do think it's a good and clean idea to stick with "Air won't add persistent line breaks on its own". Thanks for entertaining my suggestion!

@DavisVaughan
Copy link
Collaborator

I could very much live with a solution that lets me add a persistent line break after = too

This is not unreasonable, provided we don't come across any weirdness while implementing it, or think of any scenario where it might lock us in to some behavior.

It would be the exact same reasoning as the persistent line break after <-, i.e. "never by default, but respected if explicitly asked for"


The { solution is also pretty interesting to me because its a way to get what you're looking for without adding special cased persistent line breaks


I also think we will probably make foo[[bar]] never expand across multiple lines when bar is just a simple identifier, so that's another nice thing that came out of this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants