-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of "ambiguous ampersands" to report error when expected #83
Conversation
See validator/htmlparser#82 See validator/htmlparser#83 This switches the HTML checker to using the validator-nu branch of the htmlparser repo until validator/htmlparser#83 is reviewed and merged into the main branch.
This change started triggering a bunch of new errors for us when checking against https://validator.w3.org/nu/. The lines are all lines similar to this:
The I just would like to double check that this is bad practice and something we should fix in our code. Because it seems pretty innocuous. |
@dill0wn that looks like a regression to me. I don't think it should be flagging the query string parameters |
Yeah, it‘s a regression. See also validator/validator#1595 |
Per the current spec, it’s not bad practice and not something you need to fix in your code in order for it to be valid. We changed the parser algorithm in the spec years ago to allow for ampersands in URLs in attribute values, while still disallowing it in other cases where we wanted to disallow it — and then we attempted to change the parser to match the behavior required by the spec. It seems now we got the allow-in-attribute-values behavior done right in the parser, but as a result, somewhere along the lines, we seem to have ended up allowing ampersands everywhere. So we still need to fix that. But in the meantime, I need to revert this change. |
I’ve fixed this (reverted the patch in the PR) in the HTML checker sources and pushed the fix to https://validator.w3.org/nu/ |
…dards Mode. r=smaug. Differential Revision: https://phabricator.services.mozilla.com/D123318
to remedy an O(n^2) performance problem when parsing e.g. inline images
#64) * Mozilla bug 1701828 addendum - More comments for meta charset rewrite for Gecko. Differential Revision: https://phabricator.services.mozilla.com/D125808 Co-authored-by: Michael[tm] Smith <[email protected]>
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 3. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v2...v3) --- updated-dependencies: - dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]>
Bumps [actions/cache](https://github.com/actions/cache) from 2 to 3. - [Release notes](https://github.com/actions/cache/releases) - [Commits](actions/cache@v2...v3) --- updated-dependencies: - dependency-name: actions/cache dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]>
Bumps [actions/setup-java](https://github.com/actions/setup-java) from 2 to 3. - [Release notes](https://github.com/actions/setup-java/releases) - [Commits](actions/setup-java@v2...v3) --- updated-dependencies: - dependency-name: actions/setup-java dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]>
Differential Revision: https://phabricator.services.mozilla.com/D145571
See Mozilla bug 1775477. Co-authored-by: Henri Sivonen <[email protected]>
This basically undoes the manual change from #70.
4ef2c05
to
8c718d3
Compare
Fixes #82. 44427c0 broke handling of "ambiguous ampersands" for the case where we've already examined enough of string after the ampersand to conclude that the substring we have so far doesn't match the beginning of any known named character reference.
This change re-conforms the error-reporting for that case to the requirements in the HTML spec. Otherwise, without this change, no error is reported as expected in many or most cases where an ampersand doesn't actually start a character reference.