From 620454c3cb6c8cd8e2fbe00045befb6be4431515 Mon Sep 17 00:00:00 2001
From: Jan Lelis
Date: Thu, 26 Dec 2024 18:07:33 +0100
Subject: [PATCH] Add Encoding note to README and CHANGELOG

---
 CHANGELOG.md | 7 +++++++
 README.md    | 5 +++++
 2 files changed, 12 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9fd8967..ff76fb6 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,12 @@
 # CHANGELOG
 
+## 3.1.3 (unreleased)
+
+Better handling of non-UTF-8 strings:
+
+- Data with *BINARY* encoding is interpreted as UTF-8, if possible
+- Use `invalid: :replace` and `undef: :replace` options when converting to UTF-8
+
 ## 3.1.2
 
 - Performance improvements
diff --git a/README.md b/README.md
index dd9fa2d..9b8f593 100644
--- a/README.md
+++ b/README.md
@@ -71,6 +71,11 @@ Unicode::DisplayWidth.of("·", 1) # => 1
 Unicode::DisplayWidth.of("·", 2) # => 2
 ```
 
+### Encoding Notes
+
+- Data with *BINARY* encoding is interpreted as UTF-8, if possible
+- Non-UTF-8 strings are converted to UTF-8 before measuring, using the [`{invalid: :replace, undef: :replace}` options](https://ruby-doc.org/3.3.5/encodings_rdoc.html#label-Encoding+Options)
+
 ### Custom Overwrites
 
 You can overwrite how to handle specific code points by passing a hash (or even a proc) as `overwrite:` parameter:
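
The two encoding notes added by this patch can be illustrated with plain Ruby string methods. The following is a minimal sketch, not the gem's actual implementation: the helper name `utf8_for_measuring` is hypothetical, and the fallback for BINARY data that is not valid UTF-8 (converting it with the same replace options) is an assumption, since the notes only say "if possible".

```ruby
# Hypothetical helper for illustration only; not taken from the gem's source.
def utf8_for_measuring(string)
  # Already UTF-8: nothing to convert.
  return string if string.encoding == Encoding::UTF_8

  if string.encoding == Encoding::BINARY
    # BINARY data: reinterpret the bytes as UTF-8 when they form valid UTF-8.
    candidate = string.dup.force_encoding(Encoding::UTF_8)
    return candidate if candidate.valid_encoding?
  end

  # Everything else (assumed to include BINARY data that is not valid UTF-8):
  # convert to UTF-8, replacing invalid or undefined bytes with U+FFFD.
  string.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

binary_utf8 = "\u00E4b".b                                             # UTF-8 bytes tagged as BINARY
latin1      = [0xE4].pack("C").force_encoding(Encoding::ISO_8859_1)   # "ä" in ISO-8859-1
broken      = [0xFF, 0x41].pack("C*")                                 # BINARY, not valid UTF-8

[binary_utf8, latin1, broken].each do |input|
  result = utf8_for_measuring(input)
  puts "#{input.encoding} -> #{result.encoding}, valid: #{result.valid_encoding?}"
end
```

In this sketch every input ends up as a valid UTF-8 string, so a width calculation that runs afterwards never has to deal with broken byte sequences.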