Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

automatic spellcheck by codespell for common misspellings. #4997

Merged
merged 2 commits into from
Feb 6, 2025

Conversation

fujitatomoya
Copy link
Collaborator

follow-up of #4980.

Copy link

github-actions bot commented Feb 6, 2025

HTML artifacts: https://github.com/ros2/ros2_documentation/actions/runs/13183844552/artifacts/2548931443.

To view the resulting site:

  1. Click on the above link to download the artifacts archive
  2. Extract it
  3. Open html-artifacts-4997/index.html in your favorite browser

@fujitatomoya fujitatomoya force-pushed the fujitatomoya/automatic-spell-check branch from b909ade to c701049 Compare February 6, 2025 04:42
@fujitatomoya fujitatomoya force-pushed the fujitatomoya/automatic-spell-check branch from c701049 to 95721f6 Compare February 6, 2025 04:43
@fujitatomoya
Copy link
Collaborator Author

some important notes related to this PR.

  • This PR enables codespell check for both make spellcheck in local environment and github action.
  • --skip="source/Releases/*": this folder is ignored because of a couple of reasons. 1st xxx-Changelog.rst is basically generated by commit log that could include the some misspelling, but i think we should keep the original commit messages here. 2nd, that includes names for authors and contributors, that compiles up the false alarms.
rosindex@3a0a0abcf2df:/tmp/doc_repository$ git ls-files '*.md' '*.rst' | xargs codespell --ignore-words=codespell_whitelist.txt 
source/Releases/Beta3-Overview.rst:29: Readded ==> Re-added, Read
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:738: explictly ==> explicitly
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:768: arbitraty ==> arbitrary
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:775: Servin ==> Serving
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:775: Idel ==> Idle
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:804: dependancy ==> dependency
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:833: Idel ==> Idle
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:909: paramter ==> parameter
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:1323: Servin ==> Serving
source/Releases/Galactic-Geochelone-Complete-Changelog.rst:1323: Idel ==> Idle
...
  • since address mistyping detected by codespell checker. #4980 (and jazzy and humble backports) has already merged, there should be no misspelling warning, that means we can run the spell check for entire repository. (we could run the spelling check for only changed part, but i do not think that is really needed.)
  • execution time is reasonably fast enough even running for all markdown and rst files.
rosindex@3a0a0abcf2df:/tmp/doc_repository$ time make spellcheck
git ls-files '*.md' '*.rst' | xargs codespell --ignore-words=codespell_whitelist.txt --skip="source/Releases/*"

real    0m0.210s
user    0m0.202s
sys     0m0.008s
  • Permissive common misspelling only. according to codespell-project, it tells that It does not check for word membership in a complete dictionary, but instead looks for a set of common misspellings. that said we can start small without false alarms, at least address mistyping detected by codespell checker. #4980 related misspelling can be detected via github action before merge.

i had tried some test for failure case,

image

image

image

it can tells where and how exactly it should be fixed.

@fujitatomoya fujitatomoya self-assigned this Feb 6, 2025
@fujitatomoya fujitatomoya added the backport-all backport at reviewers discretion; from rolling to all versions label Feb 6, 2025
@fujitatomoya
Copy link
Collaborator Author

@kscottz @christophebedard @clalancette i think this is solid and ready to fly 🤔 compared to once-sentence-per-line rule. would love to get feedback on this.

codespell_whitelist.txt Outdated Show resolved Hide resolved
codespell_whitelist.txt Outdated Show resolved Hide resolved
Copy link
Collaborator

@kscottz kscottz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fujitatomoya you are the absolute best and this looks good to go for now.

Looking at the Code Spell docs it appears they support a custom dictionary. It seems like we could add a custom dictionary to catch some common stuff like "ROS2 -> ROS 2", etc.

@fujitatomoya
Copy link
Collaborator Author

fujitatomoya commented Feb 6, 2025

@kscottz

catch some common stuff like "ROS2 -> ROS 2", etc.

actually this is really good idea. after this PR is merged, i will open the issue to brainstorm the ROS 2 doc dictionary👍

@fujitatomoya fujitatomoya merged commit b9fb2b4 into rolling Feb 6, 2025
5 checks passed
@fujitatomoya fujitatomoya deleted the fujitatomoya/automatic-spell-check branch February 6, 2025 18:05
mergify bot pushed a commit that referenced this pull request Feb 6, 2025
* automatic spellcheck by codespell for common misspellings.

Signed-off-by: Tomoya Fujita <[email protected]>

* remove indention from the whitelist.

Signed-off-by: Tomoya Fujita <[email protected]>

---------

Signed-off-by: Tomoya Fujita <[email protected]>
(cherry picked from commit b9fb2b4)
mergify bot pushed a commit that referenced this pull request Feb 6, 2025
* automatic spellcheck by codespell for common misspellings.

Signed-off-by: Tomoya Fujita <[email protected]>

* remove indention from the whitelist.

Signed-off-by: Tomoya Fujita <[email protected]>

---------

Signed-off-by: Tomoya Fujita <[email protected]>
(cherry picked from commit b9fb2b4)
fujitatomoya added a commit that referenced this pull request Feb 6, 2025
…4999)

* automatic spellcheck by codespell for common misspellings.

Signed-off-by: Tomoya Fujita <[email protected]>

* remove indention from the whitelist.

Signed-off-by: Tomoya Fujita <[email protected]>

---------

Signed-off-by: Tomoya Fujita <[email protected]>
(cherry picked from commit b9fb2b4)

Co-authored-by: Tomoya Fujita <[email protected]>
fujitatomoya added a commit that referenced this pull request Feb 6, 2025
…5000)

* automatic spellcheck by codespell for common misspellings.

Signed-off-by: Tomoya Fujita <[email protected]>

* remove indention from the whitelist.

Signed-off-by: Tomoya Fujita <[email protected]>

---------

Signed-off-by: Tomoya Fujita <[email protected]>
(cherry picked from commit b9fb2b4)

Co-authored-by: Tomoya Fujita <[email protected]>
@fujitatomoya
Copy link
Collaborator Author

fujitatomoya commented Feb 6, 2025

@clalancette @kscottz @christophebedard thanks for the review, permissive spell check is now online for all distros!

as Kat suggested above, ROS 2 dictionary approach can be good without having false alarms. i will look into that feature, and create the issue for that.

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/requesting-help-lets-make-ros-2-dictionary-for-spell-checker/42005/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-all backport at reviewers discretion; from rolling to all versions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants