Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compilers can define their execution order #2114

Closed
wants to merge 3 commits into from

Conversation

rzane
Copy link
Contributor

@rzane rzane commented Dec 6, 2024

Motivation

Tapioca generates types for ActiveModel and ActiveRecord attributes, and usually does the right thing. However, there are some cases where more flexibility is necessary. It would be useful to be able to override the definitions provided by Tapioca's default compilers in very special cases.

Example 1: ActiveModel attributes without a type

Not all ActiveModel attributes can be represented with ActiveModel's built-in types. It also isn't practical to define an ActiveModel::Type for every type of value that might be passed to an ActiveModel.

For example, many ActiveModel instances take a User as a parameter (e.g. attribute :user). I'd like to be able to tell Sorbet that this is a User. Unfortunately, the Tapioca compiler defines this field as T.untyped. Without forking the ActiveModel compiler, I have no way of overriding the type defined by Tapioca.

Example 2: Column types don't necessarily represent method return types

We use a gem called steady_state. It defines a state machine and stores the state in a string column on the record.

This gem overrides the attribute reader to return an instance of SteadyState::Attribute::State, which is basically an ActiveSupport::StringInquirer.

Without forking the ActiveRecordColumns compiler, I have no way of overriding the type defined by Tapioca.

Implementation

Step 1: Allow DSL compilers to control execution order

This PR implements an optional compiler method .run_after_compilers. Compilers can tell Tapioca which other compilers they expect to precede them.

So, for example, my SteadyState compiler could say:

class Tapioca::Dsl::Compilers::SteadyState < Tapoica::Dsl::Compiler
  def self.run_after_compilers
    [Tapioca::Dsl::Compilers::ActiveRecordColumns]
  end

  # ... etc ...
end

Tapioca can use TSort to ensure that the steady state compiler runs after ActiveRecordColumns.

Step 2: Allow DSL compilers to redefine methods

This PR implements RBI::Tree#remove_method, which removes a method from the tree by name.

Tests

This PR includes tests.

@rzane rzane requested a review from a team as a code owner December 6, 2024 02:42
@rzane rzane changed the title Rzane/sort compilers Compilers can define their execution order Dec 6, 2024
@rzane
Copy link
Contributor Author

rzane commented Dec 6, 2024

This PR is an alternative to #2112

@paracycle
Copy link
Member

Without forking the ActiveModel compiler, I have no way of overriding the type defined by Tapioca.

I don't think this is true. You could prepend a module in the ActiveModelTypeHelper module and hook the type_for method to do any processing that you want. You could ship this as a "compiler" in your own application and thus modify the built-in behaviour of Tapioca.

Since the last few versions, Tapioca gives you the guarantee that it will load its internal compilers before loading custom compilers, so that this kind of modification is possible.

Without forking the ActiveRecordColumns compiler, I have no way of overriding the type defined by Tapioca.

Same comment here. Injecting behaviour into built-in Tapioca compilers is possible.

See this test for more details on how that is possible:

it "must be able to modify behaviour of existing compilers" do
@project.write!("lib/post.rb", <<~RB)
::ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")
class Post < ActiveRecord::Base
end
RB
foo = mock_gem("foo", "0.0.1") do
write!("lib/tapioca/dsl/compilers/active_record_relation_ext.rb", <<~RB)
class Tapioca::Dsl::Compilers::ActiveRecordRelations
ID_TYPES = ID_TYPES.union(["Foo"]).freeze
end
RB
end
@project.require_mock_gem(foo)
@project.require_real_gem("activerecord", require: "active_record")
@project.require_real_gem("sqlite3")
@project.bundle_install!
result = @project.tapioca("dsl Post")
assert_stdout_equals(<<~OUT, result)
Loading DSL extension classes... Done
Loading Rails application... Done
Loading DSL compiler classes... Done
Compiling DSL RBI files...
create sorbet/rbi/dsl/post.rbi
Done
Checking generated RBI files... Done
No errors found
All operations performed in working directory.
Please review changes and commit them.
OUT
assert_project_file_includes("sorbet/rbi/dsl/post.rbi", indented(<<~RBI, 4))
sig do
params(
args: T.any(String, Symbol, ::ActiveSupport::Multibyte::Chars, T::Boolean, BigDecimal, Numeric, ::ActiveRecord::Type::Binary::Data, ::ActiveRecord::Type::Time::Value, Date, Time, ::ActiveSupport::Duration, T::Class[T.anything], Foo)
).returns(::Post)
end
sig do
params(
args: T::Array[T.any(String, Symbol, ::ActiveSupport::Multibyte::Chars, T::Boolean, BigDecimal, Numeric, ::ActiveRecord::Type::Binary::Data, ::ActiveRecord::Type::Time::Value, Date, Time, ::ActiveSupport::Duration, T::Class[T.anything], Foo)]
).returns(T::Enumerable[::Post])
end
sig { params(args: NilClass, block: T.proc.params(object: ::Post).void).returns(T.nilable(::Post)) }
def find(args = nil, &block); end
RBI
assert_empty_stderr(result)
assert_success_status(result)
end
end

In general, I really don't want to add any kind of ordering guarantees to compilers, since that will be brittle and full of race conditions.

@paracycle paracycle closed this Dec 9, 2024
@rzane
Copy link
Contributor Author

rzane commented Dec 9, 2024

Thanks @paracycle, I'm aware that I could monkey patch Tapioca to achieve the desired results, but is that really the best way to solve this problem? Is there another solution to this problem that you might be more open to?

Can you explain why sorting the compilers would be brittle and full of race conditions?

@paracycle
Copy link
Member

Can you explain why sorting the compilers would be brittle and full of race conditions?

Because Tapioca does not control a central list of compilers, they could be coming from the app, any of the gems it includes, or any other source. If we were to add the ability to declare ordering, that implies:

  1. That there is a way to globally reference another compiler by name. A constant reference like in your example isn't enough, since that assumes the declaring compiler will be loaded after the other one, which isn't always correct.
  2. That no two compilers will need to depend on a particular compiler to load and, yet, still perform conflicting operations. That is, if compiler B and C both depend on A but have different ideas about what should happen with A's definitions, then they will still be conflicting. In order to break that conflict, you'd have to add ordering between B and C, but they could be coming from wildly different sources and the fact that they would have to know about each other is a code smell.

All of this is highly problematic, and means additional code and workflows to support. Instead, it is already possible to modify the behaviour of built-in compilers, and until we meet a case where that is not enough, we shouldn't have to take on the burden of adding a new feature.

@rzane
Copy link
Contributor Author

rzane commented Dec 9, 2024

A constant reference like in your example isn't enough, since that assumes the declaring compiler will be loaded after the other one, which isn't always correct.

I probably should have included in my example that the compiler needs to require the other compiler, but I thought that was implied. So, for example, you'd say:

require "tapioca/dsl/compilers/active_record_columns"

class Tapioca::Dsl::Compilers::SteadyState < Tapoica::Dsl::Compiler
  def self.run_after_compilers
    [Tapioca::Dsl::Compilers::ActiveRecordColumns]
  end

  # ... etc ...
end

In order to break that conflict, you'd have to add ordering between B and C

I agree that this could be a little awkward, but this strategy does provide a way to deal with that issue. Isn't this already a problem anyway? If B and C both define a method, your RBI files are invalid.

Instead, it is already possible to modify the behaviour of built-in compilers

Yes, through monkey-patching, which is a much bigger code smell. The type_for method is an internal API. There are no guarantees that this API won't change in a future version.

If we want the Ruby community to adopt Tapioca, this feels like a requisite feature, in my opinion. Do we really want gems that ship Tapioca compilers to be required to monkey-patch Tapioca?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants