Skip to content
This repository has been archived by the owner on Nov 30, 2024. It is now read-only.

Fix diff output when a fuzzy finder anything is inside an expected hash #599

Merged
merged 11 commits into from
Jun 30, 2024
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 26 additions & 0 deletions lib/rspec/support/differ.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ def diff(actual, expected)
if any_multiline_strings?(actual, expected)
diff = diff_as_string(coerce_to_string(actual), coerce_to_string(expected))
end
elsif all_hashes?(actual, expected)
diff = diff_hashes_as_object(actual, expected)
elsif no_procs?(actual, expected) && no_numbers?(actual, expected)
diff = diff_as_object(actual, expected)
end
Expand Down Expand Up @@ -56,6 +58,26 @@ def diff_as_string(actual, expected)
end
# rubocop:enable Metrics/MethodLength

def diff_hashes_as_object(actual, expected)
if defined?(RSpec::Mocks::ArgumentMatchers::AnyArgMatcher)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only call this from all_hashes?, and it already has this check

Potentially, this method can be mistakenly called from somewhere else causing errors for those not using rspec-mocks, but i’m less worried about this at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is currently called only under if all_hashes?(actual, expected) condition, and all_hashes? is also checking if the user is opting out rspec-mocks. This check in line 62 is redundant if we assume this method will only be called inside all_hashes? condition . But it is there in case someone mistakenly call the method from somewhere else, so the program won't break if rspec-mocks is opted-out.

If someone opts-out rspec-mock, diff_hashes_as_object should have the exact same behavior as diff_as_object. That's the purpose of this redundant check.

Personally, I like redundancy. But I understand sometimes it may be overkill or unnecessary... maybe defining diff_hashes_as_object as private is good enough? I don't have a strong opinion here.

anything_hash = expected.select { |_, v| RSpec::Mocks::ArgumentMatchers::AnyArgMatcher === v }

anything_hash.each_key do |k|
expected[k] = actual[k]
end

diff_string = diff_as_object(actual, expected)

anything_hash.each do |k, v|
expected[k] = v
end

diff_string
else
diff_as_object(actual, expected)
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically we define such methods conditionally rather than doing checks in the method e.g.

Suggested change
def diff_hashes_as_object(actual, expected)
if defined?(RSpec::Mocks::ArgumentMatchers::AnyArgMatcher)
anything_hash = expected.select { |_, v| RSpec::Mocks::ArgumentMatchers::AnyArgMatcher === v }
anything_hash.each_key do |k|
expected[k] = actual[k]
end
diff_string = diff_as_object(actual, expected)
anything_hash.each do |k, v|
expected[k] = v
end
diff_string
else
diff_as_object(actual, expected)
end
end
if defined?(RSpec::Mocks::ArgumentMatchers::AnyArgMatcher)
def diff_hashes_as_object(actual, expected)
anything_hash = expected.select { |_, v| RSpec::Mocks::ArgumentMatchers::AnyArgMatcher === v }
anything_hash.each_key do |k|
expected[k] = actual[k]
end
diff_string = diff_as_object(actual, expected)
anything_hash.each do |k, v|
expected[k] = v
end
diff_string
end
else
def diff_hashes_as_object(actual, expected)
diff_as_object(actual, expected)
end
end

I'd also like to see a hash built from expected rather than mutating the original hash e.g.

expected_to_diff =
  expected.reduce({}) do |hash, (key, value)|
    if RSpec::Mocks::ArgumentMatchers::AnyArgMatcher === v
      hash[key] = actual[key]  
    else
      hash[key] = expected[key]
    end
    hash
  end

This has the benefit of less enumerations as well as a safety aspect


def diff_as_object(actual, expected)
actual_as_string = object_to_string(actual)
expected_as_string = object_to_string(expected)
Expand All @@ -77,6 +99,10 @@ def no_procs?(*args)
safely_flatten(args).none? { |a| Proc === a }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I really liked about your previous changes was the potential to get rid of the recursive check and the safely_flatten method.
No pressure, as this is not the goal of the pr, but it would be a nice bonus to fixing the issue. And a performance improvement in theory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, it would be nice to get rid of that recursivity. You can count on me to brainstorm on another PR or submit a proposal to fix that issue. I am enjoying reading and getting my hands on this project, and once this PR is finished (by finished I mean either merged or closed (if you and the other maintainers believe the changes I'm proposing here are too risky, or not worth doing them)) I'd like to propose some changes to make the BuiltIn::Change matcher class diffable when it tries to match a Hash in rspec-expectations repo... but that's another story.

About your concern with safely_flatten, that method was introduced ~10 years ago. Does anyone still knows why that method is there? I think its purpose is not only to check in a one method call whether all the arguments are not Proc, but along with no_numbers? to make Differ work with deep down nested arrays inside actual and expected vars. If we stick with my previous change, we can make the method faster, but we may loose the Differ feature to diff deep down nested arrays in a seamlessly way.

Just some brainstorming and ramble here: Is it worth to keep that recursivity? when an expected and actual vars with deep nested arrays are submitted as input to Differ, is worth to display a big diff output? are big diff strings useful when you have to scroll down pages of information? or a diff string is useful only when it shows the exact difference and a little bit of context? Maybe the answers to these questions will pop up when we will see such diff strings... I don't know the answer.

On the project I am currently working on, when I'm testing the actual and expected values of a hash with nested hashes inside, I've found is useful to strip my actual hash of its nested objects in my test set-up, and inside an aggregate_failures block use one expect(...)... sentence for the main hash, and for each nested object use an extra expect(...)... sentence. That's a nice workaround instead of testing a big hash in a single expect(...)... sentence and expect a single diff to show me -everything- I want to see (the idea behind #596 )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can’t tell of the top of my head with certainty, but I can’t recall if nested hashes are (usefully) diffed.

I’d rather simplify the code if no spec fails.

end

def all_hashes?(actual, expected)
defined?(RSpec::Mocks::ArgumentMatchers::AnyArgMatcher) && (Hash === actual) && (Hash === expected)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't check if we have an any arg here, the other method already does this.

Suggested change
def all_hashes?(actual, expected)
defined?(RSpec::Mocks::ArgumentMatchers::AnyArgMatcher) && (Hash === actual) && (Hash === expected)
end
def all_hashes?(actual, expected)
(Hash === actual) && (Hash === expected)
end


def all_strings?(*args)
safely_flatten(args).all? { |a| String === a }
end
Expand Down
24 changes: 24 additions & 0 deletions spec/rspec/support/differ_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -555,6 +555,30 @@ def inspect; "<BrokenObject>"; end
expect(differ.diff(false, true)).to_not be_empty
end
end

describe "fuzzy matcher anything" do
it "outputs only key value pair that triggered diff, anything_key should absorb actual value" do
actual = { :fixed => "fixed", :trigger => "trigger", :anything_key => "bcdd0399-1cfe-4de1-a481-ca6b17d41ed8" }
expected = { :fixed => "fixed", :trigger => "wrong", :anything_key => anything }
diff = differ.diff(actual, expected)
expected_diff = dedent(<<-'EOD')
|
|@@ -1,4 +1,4 @@
| :anything_key => "bcdd0399-1cfe-4de1-a481-ca6b17d41ed8",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now when I look at this, this is indeed and improvement to -anything_key +anything_key we had.
But do we need a potentially huge value here? An uuid is fine, but what about a 5kb json-like thing?

Most certainly if we’ve anythinged it in the test, we don’t care what’s inside, including the diff?

@JonRowe does this make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you concerned by the case a 5kb json-like thing would be copied into the expected -> anything value? that will definitively be a performance issue. Would it be worth doing the change? Is there a way we can make sure the 5kb json-like thing will be a shallow copy? Maybe a shallow copy will mitigate the performance issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a shallow copy will not impact memory ram consumption. But what about CPU time? does diff_as_object iterates the actual and expected vars regardless of their .object_id? or it checks the object_id and if they match it will cease any iteration?

Another question: rspec-support module should be allowed to mutate expected vars? regardless they will be later on return them to their original state? My proposed changes on this PR are a hacky way to fold noise when the developers use anything fuzzy matcher on their hash, I understand this may not be allowed to do here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, a 5kb json-like thing will impact CPU time ONLY if the matcher fails.

  def diff_as_object(actual, expected)
    actual_as_string = object_to_string(actual) # STEP 1
    expected_as_string = object_to_string(expected) # STEP 2
    diff_as_string(actual_as_string, expected_as_string) # STEP 3
  end

STEP 1 and STEP 2 will get the string representation of 5kb json-like thing in two different strings. STEP 3 will perform the comparison of both strings. This will be an expensive operation because both strings will be huge. In contrast, without the changes on this PR, comparing the string representation of this 5kb json-like thing with anything fuzzy matcher would have been easy-peasy

AFAIK, RSpec::Support::Differ is called when an expectation matcher fails AND the expectation matcher responds to diffable?. If the matcher does not responds to diffable, expected and actual vars are assigned nil and RSpec::Support::Differ will not perform any action at all.

My assessment is this will impact performance only when the test fails and the developer is using a matcher that produces an output-diff. But this is my assessment about performance, I still don't know the answer to: should rspec-support module be allowed to mutate expected vars in the developer test bench (ie, his _spec.rb files)? or rspec-support should remain a "pure" read-only module?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @pirj's concern was about larger diffs, but as things should be identical the differ will often not print anything at all

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Identical - yes. But if it’s neighbouring with the difference, like the ‘anything_key’ here on line 567, and it is big, i would prefer to see the literal “anything” to be printed rather than 5kb of json.

| :fixed => "fixed",
|-:trigger => "wrong",
|+:trigger => "trigger",
|
EOD
expect(diff).to be_diffed_as(expected_diff)
end
it "checks the 'expected' var continues having the 'anything' fuzzy matcher, it has not mutated" do
JonRowe marked this conversation as resolved.
Show resolved Hide resolved
actual = { :fixed => "fixed", :trigger => "trigger", :anything_key => "bcdd0399-1cfe-4de1-a481-ca6b17d41ed8" }
expected = { :fixed => "fixed", :trigger => "wrong", :anything_key => anything }
differ.diff(actual, expected)
expect(expected).to eq({ :fixed => "fixed", :trigger => "wrong", :anything_key => anything })
end
end
end
end
end
Expand Down