Merge pull request #28 from Earlopain/invalid-encoding-stuff
Handle invalid encoded strings
janlelis authored Dec 26, 2024
2 parents b00c5bf + bc47d28 commit a23a070
Showing 2 changed files with 19 additions and 2 deletions.
10 changes: 8 additions & 2 deletions lib/unicode/display_width.rb
@@ -47,7 +47,14 @@ class DisplayWidth

     # Returns monospace display width of string
     def self.of(string, ambiguous = nil, overwrite = nil, old_options = {}, **options)
-      string = string.encode(Encoding::UTF_8) unless string.encoding == Encoding::UTF_8
+      # Binary strings don't make much sense when calculating display width.
+      # Assume it's valid UTF-8
+      if string.encoding == Encoding::BINARY && !string.force_encoding(Encoding::UTF_8).valid_encoding?
+        # Didn't work out, go back to binary
+        string.force_encoding(Encoding::BINARY)
+      end
+
+      string = string.encode(Encoding::UTF_8, invalid: :replace, undef: :replace) unless string.encoding == Encoding::UTF_8
       options = normalize_options(string, ambiguous, overwrite, old_options, **options)

       width = 0
@@ -236,4 +243,3 @@ def of(string, **kwargs)
     end
   end
 end

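Read in isolation, the encoding handling added to `DisplayWidth.of` amounts to roughly the following sketch. `normalize_for_width` is a hypothetical helper name and not part of the gem. Unlike the code in the diff, which calls `force_encoding` on the string it was given, this version works on a copy; that difference is an assumption made for illustration only.

```ruby
# Hypothetical helper (not part of unicode-display_width): mirrors the
# encoding steps from the diff, but operates on a copy of the input.
def normalize_for_width(string)
  if string.encoding == Encoding::BINARY
    # Binary input: optimistically reinterpret the bytes as UTF-8.
    candidate = string.dup.force_encoding(Encoding::UTF_8)
    return candidate if candidate.valid_encoding?
    # Not valid UTF-8: fall through and transcode from binary below.
  end

  # Strings already tagged as UTF-8 are passed through untouched, as in the diff.
  return string if string.encoding == Encoding::UTF_8

  # Invalid or unmappable bytes become replacement characters instead of raising.
  string.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

euro = normalize_for_width("\u20AC".b)    # binary bytes that are valid UTF-8
junk = normalize_for_width("\xFF\xFE".b)  # binary bytes that are not valid UTF-8
```

In both cases the result is a UTF-8 string with a valid encoding, which is what the width calculation after this step relies on.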
11 changes: 11 additions & 0 deletions spec/display_width_spec.rb
@@ -183,6 +183,17 @@
     it 'works with non-utf8 Unicode encodings' do
       expect( 'À'.encode("UTF-16LE").display_width ).to eq 1
     end
+
+    it 'works with a string that is invalid in its encoding' do
+      s = "\x81\x39".dup.force_encoding(Encoding::SHIFT_JIS)
+
+      # Would print as �9 on the terminal
+      expect( s.display_width ).to eq 2
+    end
+
+    it 'works with a binary encoded string that is valid in UTF-8' do
+      expect( '€'.b.display_width ).to eq 1
+    end
   end

   describe '[emoji]' do

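The two new spec inputs can also be examined with plain `String` methods, without loading the gem. The sketch below uses only core Ruby calls; the variable names are illustrative. It shows why the old `encode(Encoding::UTF_8)` call raised on the Shift_JIS input, and that the options added in this commit avoid the exception.

```ruby
# Shift_JIS input from the spec: 0x81 is a lead byte, 0x39 is not a valid trail byte.
sjis = "\x81\x39".dup.force_encoding(Encoding::SHIFT_JIS)
sjis_valid_before = sjis.valid_encoding?

# Transcoding without replacement options, as on the removed line.
old_behavior =
  begin
    sjis.encode(Encoding::UTF_8)
    :ok
  rescue Encoding::InvalidByteSequenceError
    :raised
  end

# Transcoding with the options used on the added line.
sjis_utf8 = sjis.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)

# Binary input from the spec: the UTF-8 bytes of the euro sign, tagged as binary.
euro_bytes = "\u20AC".b
euro_as_utf8 = euro_bytes.dup.force_encoding(Encoding::UTF_8)

puts sjis_valid_before
puts old_behavior
puts sjis_utf8.valid_encoding?
puts euro_as_utf8.valid_encoding?
```

The spec comment notes that the Shift_JIS string would print as �9 on a terminal, which matches the expected width of 2.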