Encoding issue with GetBibEntryWithDOI #24
Comments
You can certainly use RefManageR to update the fields where you encounter bad accent formatting, e.g.
if you need to change the authors of the first entry, but it's not going to be the most efficient approach. Recent versions of R allow users to specify additional LaTeX macros in
I encountered a similar issue with both RefManageR and the bibtex package when reading a BibTeX file with multi-byte characters on Windows. I opened an issue on bibtex, please see here.
# Get current locale info
Sys.getlocale()
#> [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
# Set locale to Chinese
Sys.setlocale(locale = "Chinese")
#> [1] "LC_COLLATE=Chinese (Simplified)_China.936;LC_CTYPE=Chinese (Simplified)_China.936;LC_MONETARY=Chinese (Simplified)_China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_China.936"
bib_text <- "
@misc{text,
title = {{你好}},
language = {zh-CN},
author = {{你好}},
month = jun,
year = {2013},
pages = {163}
}
"
# change encoding to "UTF-8"
bib_text_utf8 <- enc2utf8(bib_text)
Encoding(bib_text_utf8)
#> [1] "UTF-8"
# make sure the saved BibTeX file is UTF-8 encoded
con <- file("test.bib", encoding = "UTF-8")
writeLines(bib_text_utf8, con)
close(con)
readLines("test.bib", encoding = "UTF-8")
#> [1] "" " @misc{text,"
#> [3] " title = {{你好}}," " language = {zh-CN},"
#> [5] " author = {{你好}}," " month = jun,"
#> [7] " year = {2013}," " pages = {163}"
#> [9] " }" " "
It seems that the UTF-8 encoding is not preserved when the file is read:
bib <- bibtex::do_read_bib("test.bib", "UTF-8", srcfile("test.bib", encoding = "UTF-8", "UTF-8"))
bib
#> [[1]]
#> title language author month year pages
#> "{浣犲ソ}" "zh-CN" "{浣犲ソ}" "jun" "2013" "163"
#> attr(,"entry")
#> [1] "misc"
#> attr(,"key")
#> [1] "text"
#>
#> attr(,"include")
#> character(0)
#> attr(,"strings")
#> named character(0)
#> attr(,"preamble")
#> character(0)
Currently, one workaround is to declare the encoding of the do_read_bib output as UTF-8:
Encoding(bib[[1]])
#> [1] "unknown" "unknown" "unknown" "unknown" "unknown" "unknown"
Encoding(bib[[1]]) <- "UTF-8"
bib
#> [[1]]
#> title language author month year pages
#> "{你好}" "zh-CN" "{你好}" "jun" "2013" "163"
#> attr(,"entry")
#> [1] "misc"
#> attr(,"key")
#> [1] "text"
#>
#> attr(,"include")
#> character(0)
#> attr(,"strings")
#> named character(0)
#> attr(,"preamble")
#> character(0)
@mwmclean Any insights? Thanks!
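For what it's worth, the garbled fields in the output above look like UTF-8 bytes being interpreted under the Chinese (code page 936 / GBK) locale. Here is a minimal base R sketch of that idea, independent of bibtex and RefManageR. The variable names are made up for illustration, and it assumes the platform's iconv recognises the encoding name "GBK".

```r
# UTF-8 bytes of the two characters used in the example entry (U+4F60 U+597D).
utf8_bytes <- as.raw(c(0xe4, 0xbd, 0xa0, 0xe5, 0xa5, 0xbd))

# rawToChar() returns a string with no declared encoding, similar to the
# "unknown" fields reported by Encoding(bib[[1]]).
unmarked <- rawToChar(utf8_bytes)
Encoding(unmarked)
#> [1] "unknown"

# Reading the same bytes as GBK shows how the garbling could arise.
# Assumption: this iconv build knows the name "GBK"; otherwise the result is NA.
misread <- iconv(unmarked, from = "GBK", to = "UTF-8")

# Declaring the true encoding does not change the bytes, only how R
# interprets them, which is what the workaround relies on.
marked <- unmarked
Encoding(marked) <- "UTF-8"
identical(charToRaw(marked), utf8_bytes)
#> [1] TRUE
```

Because the bytes themselves are already correct, re-declaring the encoding is enough here. A conversion with iconv() or enc2utf8() would only be needed if the bytes had actually been written in another encoding.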
Hi, I am processing a bib file with various Spanish authors.
Many of the names are garbled because of the accents, e.g. Gutia'errez-Uribe instead of Gutiérrez-Uribe.
Is there a solution?
Thanks, Robert
draft.bib.zip