Compilers can define their execution order #2114

rzane · 2024-12-06T02:42:44Z

Motivation

Tapioca generates types for ActiveModel and ActiveRecord attributes, and usually does the right thing. However, there are some cases where more flexibility is necessary. It would be useful to be able to override the definitions provided by Tapioca's default compilers in very special cases.

Example 1: ActiveModel attributes without a type

Not all ActiveModel attributes can be represented with ActiveModel's built-in types. It also isn't practical to define an ActiveModel::Type for every type of value that might be passed to an ActiveModel.

For example, many ActiveModel instances take a User as a parameter (e.g. attribute :user). I'd like to be able to tell Sorbet that this is a User. Unfortunately, the Tapioca compiler defines this field as T.untyped. Without forking the ActiveModel compiler, I have no way of overriding the type defined by Tapioca.

Example 2: Column types don't necessarily represent method return types

We use a gem called steady_state. It defines a state machine and stores the state in a string column on the record.

This gem overrides the attribute reader to return an instance of SteadyState::Attribute::State, which is basically an ActiveSupport::StringInquirer.

Without forking the ActiveRecordColumns compiler, I have no way of overriding the type defined by Tapioca.

Implementation

Step 1: Allow DSL compilers to control execution order

This PR implements an optional compiler method .run_after_compilers. Compilers can tell Tapioca which other compilers they expect to precede them.

So, for example, my SteadyState compiler could say:

class Tapioca::Dsl::Compilers::SteadyState < Tapoica::Dsl::Compiler
  def self.run_after_compilers
    [Tapioca::Dsl::Compilers::ActiveRecordColumns]
  end

  # ... etc ...
end

Tapioca can use TSort to ensure that the steady state compiler runs after ActiveRecordColumns.

Step 2: Allow DSL compilers to redefine methods

This PR implements RBI::Tree#remove_method, which removes a method from the tree by name.

Tests

This PR includes tests.

rzane · 2024-12-06T02:45:46Z

This PR is an alternative to #2112

paracycle · 2024-12-09T12:00:09Z

Without forking the ActiveModel compiler, I have no way of overriding the type defined by Tapioca.

I don't think this is true. You could prepend a module in the ActiveModelTypeHelper module and hook the type_for method to do any processing that you want. You could ship this as a "compiler" in your own application and thus modify the built-in behaviour of Tapioca.

Since the last few versions, Tapioca gives you the guarantee that it will load its internal compilers before loading custom compilers, so that this kind of modification is possible.

Without forking the ActiveRecordColumns compiler, I have no way of overriding the type defined by Tapioca.

Same comment here. Injecting behaviour into built-in Tapioca compilers is possible.

See this test for more details on how that is possible:

tapioca/spec/tapioca/cli/dsl_spec.rb

Lines 2250 to 2308 in cc54939

    
                   it "must be able to modify behaviour of existing compilers" do 
        
                     @project.write!("lib/post.rb", <<~RB) 
        
                       ::ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:") 
        
                       class Post < ActiveRecord::Base 
        
                       end 
        
                     RB 
        
                     foo = mock_gem("foo", "0.0.1") do 
        
                       write!("lib/tapioca/dsl/compilers/active_record_relation_ext.rb", <<~RB) 
        
                         class Tapioca::Dsl::Compilers::ActiveRecordRelations 
        
                           ID_TYPES = ID_TYPES.union(["Foo"]).freeze 
        
                         end 
        
                       RB 
        
                     end 
        
                     @project.require_mock_gem(foo) 
        
                     @project.require_real_gem("activerecord", require: "active_record") 
        
                     @project.require_real_gem("sqlite3") 
        
                     @project.bundle_install! 
        
                     result = @project.tapioca("dsl Post") 
        
                     assert_stdout_equals(<<~OUT, result) 
        
                       Loading DSL extension classes... Done 
        
                       Loading Rails application... Done 
        
                       Loading DSL compiler classes... Done 
        
                       Compiling DSL RBI files... 
        
                             create  sorbet/rbi/dsl/post.rbi 
        
                       Done 
        
                       Checking generated RBI files...  Done 
        
                         No errors found 
        
                       All operations performed in working directory. 
        
                       Please review changes and commit them. 
        
                     OUT 
        
                     assert_project_file_includes("sorbet/rbi/dsl/post.rbi", indented(<<~RBI, 4)) 
        
                       sig do 
        
                         params( 
        
                           args: T.any(String, Symbol, ::ActiveSupport::Multibyte::Chars, T::Boolean, BigDecimal, Numeric, ::ActiveRecord::Type::Binary::Data, ::ActiveRecord::Type::Time::Value, Date, Time, ::ActiveSupport::Duration, T::Class[T.anything], Foo) 
        
                         ).returns(::Post) 
        
                       end 
        
                       sig do 
        
                         params( 
        
                           args: T::Array[T.any(String, Symbol, ::ActiveSupport::Multibyte::Chars, T::Boolean, BigDecimal, Numeric, ::ActiveRecord::Type::Binary::Data, ::ActiveRecord::Type::Time::Value, Date, Time, ::ActiveSupport::Duration, T::Class[T.anything], Foo)] 
        
                         ).returns(T::Enumerable[::Post]) 
        
                       end 
        
                       sig { params(args: NilClass, block: T.proc.params(object: ::Post).void).returns(T.nilable(::Post)) } 
        
                       def find(args = nil, &block); end 
        
                     RBI 
        
                     assert_empty_stderr(result) 
        
                     assert_success_status(result) 
        
                   end 
        
                 end

In general, I really don't want to add any kind of ordering guarantees to compilers, since that will be brittle and full of race conditions.

rzane · 2024-12-09T13:48:43Z

Thanks @paracycle, I'm aware that I could monkey patch Tapioca to achieve the desired results, but is that really the best way to solve this problem? Is there another solution to this problem that you might be more open to?

Can you explain why sorting the compilers would be brittle and full of race conditions?

paracycle · 2024-12-09T15:19:57Z

Can you explain why sorting the compilers would be brittle and full of race conditions?

Because Tapioca does not control a central list of compilers, they could be coming from the app, any of the gems it includes, or any other source. If we were to add the ability to declare ordering, that implies:

That there is a way to globally reference another compiler by name. A constant reference like in your example isn't enough, since that assumes the declaring compiler will be loaded after the other one, which isn't always correct.
That no two compilers will need to depend on a particular compiler to load and, yet, still perform conflicting operations. That is, if compiler B and C both depend on A but have different ideas about what should happen with A's definitions, then they will still be conflicting. In order to break that conflict, you'd have to add ordering between B and C, but they could be coming from wildly different sources and the fact that they would have to know about each other is a code smell.

All of this is highly problematic, and means additional code and workflows to support. Instead, it is already possible to modify the behaviour of built-in compilers, and until we meet a case where that is not enough, we shouldn't have to take on the burden of adding a new feature.

rzane · 2024-12-09T17:00:32Z

A constant reference like in your example isn't enough, since that assumes the declaring compiler will be loaded after the other one, which isn't always correct.

I probably should have included in my example that the compiler needs to require the other compiler, but I thought that was implied. So, for example, you'd say:

require "tapioca/dsl/compilers/active_record_columns"

class Tapioca::Dsl::Compilers::SteadyState < Tapoica::Dsl::Compiler
  def self.run_after_compilers
    [Tapioca::Dsl::Compilers::ActiveRecordColumns]
  end

  # ... etc ...
end

In order to break that conflict, you'd have to add ordering between B and C

I agree that this could be a little awkward, but this strategy does provide a way to deal with that issue. Isn't this already a problem anyway? If B and C both define a method, your RBI files are invalid.

Instead, it is already possible to modify the behaviour of built-in compilers

Yes, through monkey-patching, which is a much bigger code smell. The type_for method is an internal API. There are no guarantees that this API won't change in a future version.

If we want the Ruby community to adopt Tapioca, this feels like a requisite feature, in my opinion. Do we really want gems that ship Tapioca compilers to be required to monkey-patch Tapioca?

rzane added 3 commits December 5, 2024 20:19

Add a hook that compilers can implement to specify their order

0cd4e1f

TSort compilers

2f4387b

Allow nodes to be removed

68cc57f

rzane requested a review from a team as a code owner December 6, 2024 02:42

rzane changed the title ~~Rzane/sort compilers~~ Compilers can define their execution order Dec 6, 2024

paracycle closed this Dec 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compilers can define their execution order #2114

Compilers can define their execution order #2114

rzane commented Dec 6, 2024 •

edited

Loading

rzane commented Dec 6, 2024

paracycle commented Dec 9, 2024

rzane commented Dec 9, 2024 •

edited

Loading

paracycle commented Dec 9, 2024

rzane commented Dec 9, 2024 •

edited

Loading

Compilers can define their execution order #2114

Compilers can define their execution order #2114

Conversation

rzane commented Dec 6, 2024 • edited Loading

Motivation

Example 1: ActiveModel attributes without a type

Example 2: Column types don't necessarily represent method return types

Implementation

Step 1: Allow DSL compilers to control execution order

Step 2: Allow DSL compilers to redefine methods

Tests

rzane commented Dec 6, 2024

paracycle commented Dec 9, 2024

rzane commented Dec 9, 2024 • edited Loading

paracycle commented Dec 9, 2024

rzane commented Dec 9, 2024 • edited Loading

rzane commented Dec 6, 2024 •

edited

Loading

rzane commented Dec 9, 2024 •

edited

Loading

rzane commented Dec 9, 2024 •

edited

Loading