Encoding error when parsing BibTeX file with multi-byte characters on Windows #20
Comments
I have a similar problem with my bib file (kwb_dummy.txt) on Windows:
I can still confirm that there is an encoding issue:

    file <- "book.bib"
    encoding <- "UTF-8"
    out <- bibtex::do_read_bib(file, encoding = encoding, srcfile(file, encoding = encoding))
    out[[1]]
    ## address
    ## "Vilnius"
    ## author
    ## "{\\v{C}}ekanavi{\\v{c}}ius, Vydas and Murauskas, Gediminas"
    ## title
    ## "{Taikomoji regresinÄ— analizÄ— socialiniuose tyrimuose}"

The contents of the "book.bib" file:

    @book{Cekanavicius2014,
    address = {Vilnius},
    author = {{\v{C}}ekanavi{\v{c}}ius, Vydas and Murauskas, Gediminas},
    title = {{Taikomoji regresinė analizė socialiniuose tyrimuose}},
    year = {2014}
    }

An RStudio project for further experimentation: bib-file--UTF-8--issue.zip

@romainfrancois This is quite an old issue. What can be done towards solving it? Solving it would also resolve issues in packages that depend on bibtex, including ropensci/RefManageR#66 and crsh/citr#67.
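The garbled title above looks like UTF-8 bytes being interpreted as Windows-1252. The following is only a minimal sketch of that hypothesis, not a diagnosis of what bibtex does internally. The variable names are illustrative, and it assumes the local iconv supports the "CP1252" encoding name.

```r
# "regresinė" written with a Unicode escape so the script is encoding-safe
original <- "regresin\u0117"

# The bytes as they would be stored in a UTF-8 encoded .bib file
utf8_bytes <- charToRaw(enc2utf8(original))

# Assumption: the reader treats those bytes as Windows-1252.
# Converting from CP1252 to UTF-8 reproduces that misinterpretation.
garbled <- iconv(rawToChar(utf8_bytes), from = "CP1252", to = "UTF-8")

print(utf8_bytes)
print(garbled)
```

If the hypothesis holds, the last letter of `garbled` is replaced by the same two-character sequence (U+00C4 followed by U+2014) that appears in the title returned by do_read_bib().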
Some findings on this:
* Upgrade testing suite to testthat 3
* Add testing for previous issues #45
* Add tests for standard bibtex entries
  As defined in BibTEX version 0.99b https://ctan.javinator9889.com/biblio/bibtex/base/btxdoc.pdf
* Update actions
* Add testing for examples
* Add snapshots for examples read.bib
  This may fail on some platforms
* Skip on non windows
  Possibly a character problem related with #20 and #43
* Not test on R 3.4
  Some changes in default parsing (snapshot), but results are still ok
* Add more tests
* Try to increase coverage
* One more tests for do_read_bib
* Fix test for do_read_bib
* Revert actions
* Add devtools for testing
* Add more tests
* Add test for multiline string
* Add non standard field names
* Add myself as author
* Move issues to inst/bib files
* Refactor tests for avoiding cluttering
* Modify do_read_bib()
  Also bump version
* Remove C code :)
* Recreate snapshots on Linux
* Check reverse dependencies
* Add backports
* Deprecate arguments
* Use seq_along
  Co-authored-by: James J Balamuta <[email protected]>
* Use seq_along again
  Co-authored-by: James J Balamuta <[email protected]>
* Remove completely header and footer
* Document internal functions
* Rename internal params
* Use vapply and new tests
* Rerun revdeps
* Update docs and snapshots
* Update revdep and check action
  Some snapshots changes due to changes on toBibTex() on later versions of R, not related with the package
* Fix action and snapshots
  Co-authored-by: James J Balamuta <[email protected]>
Thanks for this great package. I encountered a problem when using the bibtex package to parse BibTeX files with Chinese characters on Windows: read.bib could not parse the Chinese characters, regardless of whether encoding was set to "UTF-8".
Here is my session info
After digging a little, I found that encoding the input of make.bib.entry as "UTF-8" solves the problem, but I am not sure whether this is a proper solution.
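The exact change is not shown in the report, so the following is only a sketch of one way such a workaround might look: declaring that a parsed field's bytes are UTF-8 before they are passed on. The helper `mark_utf8()` is hypothetical and not part of bibtex, and the field value is simulated rather than taken from the parser.

```r
# Simulate a field value whose bytes are UTF-8 but which carries no
# encoding mark, as text returned from a parser might.
title_bytes <- charToRaw("Taikomoji regresin\u0117 analiz\u0117")
unmarked <- rawToChar(title_bytes)

# Hypothetical helper (not part of bibtex): declare the bytes to be UTF-8
# without changing them.
mark_utf8 <- function(fields) {
  Encoding(fields) <- "UTF-8"
  fields
}

marked <- mark_utf8(unmarked)

print(Encoding(unmarked))
print(Encoding(marked))
```

Marking only helps when the bytes really are UTF-8. If the text was read in another encoding, converting with iconv() would be needed instead.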