Deactivate many regex Unicode crate features
#2702
Merged
#1643 disabled many default features of the `regex` crate but left the `unicode` meta feature enabled. With the `unicode` feature enabled and `bindgen` as a build dependency, `regex-syntax` (a direct dependency of the `regex` crate) takes 7 seconds to compile as a build dependency in my application.

The `unicode` feature includes support for many Unicode character class lookups that I find it unlikely bindgen uses. Even without the `unicode` feature enabled, the `regex` crate supports Unicode. Disabling the various `unicode-*` features only removes compiled-in data tables that support various types of character classes.

From https://docs.rs/regex/latest/regex/#unicode-features:
I have retained the unicode-perl feature, which gives support for `\w`, `\s` and `\d`, because these character classes were required to get tests to pass.

Removing support for the other Unicode character classes removes the need to compile many data tables, which should significantly reduce compile times.
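
For illustration, a dependency declaration along these lines would express the change described above. This is a minimal sketch, not the exact diff from this PR: the version requirement and the `std` feature are assumptions, and only `unicode-perl` is taken from the description.

```toml
[dependencies]
# Assumed version requirement; use whatever the crate already pins.
# default-features = false drops the `unicode` meta feature and its data tables.
# `std` is assumed to be required by the build; `unicode-perl` keeps \w, \s and \d.
regex = { version = "1", default-features = false, features = ["std", "unicode-perl"] }
```

If a pattern later needs another class (for example `\p{Greek}`, which the `regex` docs place under `unicode-script`), the matching `unicode-*` feature can be added back individually rather than re-enabling the whole `unicode` meta feature.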