Encoding issue (not UTF-8) and repeated entries #67

Open
GegznaV opened this issue Mar 15, 2020 · 5 comments

Comments

@GegznaV
Contributor

GegznaV commented Mar 15, 2020

Describe the bug

  1. Encoding issue in displaying non-ASCII characters.
  2. Repeated entries of the same source (in the bib file each source is entered only once).

To Reproduce
Call the citr RStudio add-in from the attached project: citr--UTF-8--bug.zip

Expected behavior

  1. Correct encoding (UTF-8) for all characters.
  2. Each entry is shown exactly once.

Screenshots
image

Encoding is set to UTF-8 in settings:
image

Additional context

R             3.6.3
RStudio       1.2.5033
citr          0.3.2
- Session info ----------------------------------
 setting  value                       
 version  R version 3.6.3 (2020-02-29)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       Europe/Helsinki             
 date     2020-03-15    
@GegznaV
Contributor Author

GegznaV commented Mar 15, 2020

Another encoding issue may be here:

parent_document <- readLines(parents_path[parents], warn = FALSE)

Shouldn't it be:

parent_document <- readLines(parents_path[parents], warn = FALSE, encoding = getOption("citr.encoding")) 
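
For reference, a minimal base-R sketch of what the `encoding` argument of `readLines()` does: it does not convert the input, it only marks the resulting strings as being in the given encoding. The temporary file and the fallback to "UTF-8" when `citr.encoding` is unset are assumptions made for this illustration, not citr's actual code.

```r
# Write a line containing a non-ASCII character as raw UTF-8 bytes,
# so the example does not depend on the session locale.
path <- tempfile(fileext = ".Rmd")
utf8_bytes <- c(
  charToRaw("title: "),
  as.raw(c(0xC4, 0x8D)),  # "č" (U+010D) encoded in UTF-8
  charToRaw("\n")
)
writeBin(utf8_bytes, path)

# Assumption: fall back to UTF-8 if the citr.encoding option is not set.
enc <- getOption("citr.encoding", default = "UTF-8")

# `encoding` marks the strings that are read; it does not re-encode them.
lines <- readLines(path, warn = FALSE, encoding = enc)

Encoding(lines)   # "UTF-8" when enc is "UTF-8"
validUTF8(lines)  # TRUE, the bytes on disk are valid UTF-8
```

Without the `encoding` argument, the same bytes would be treated as being in the native encoding (here Windows-1252), which is where garbled characters can appear.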

@GegznaV
Contributor Author

GegznaV commented Mar 15, 2020

#68 solves the issue of duplicated entries. It addresses one more potential issue related to encoding.

The encoding issue reported above stems from RefManageR::ReadBib(), which does not respect the value of .Encoding:

RefManageR::ReadBib("book.bib", check = FALSE, .Encoding = "UTF-8")

## [1] V. Čekanavi�ius and G. Murauskas. _Statistika ir jos taikymai I_. Vilnius:
## TEV, 2006, p. 240. ISBN: 9986-546-93-1.

##  / truncated /

## [5] V. Janilionis, V. Morkevicius, and R. Rauleckas. “III dalis. StatistinÄ—s
## analizÄ—s pavyzdžių naudojant pavyzdin\ce skaitmenin\ce duomenų baz\ce
## medžiaga”. In: _StatistinÄ— kiekybinių duomenų analizÄ— su SPSS ir Stata_.
## Kaunas, 2008. Chap. 10. Daugia, p. 393. <URL:
## http://www.lidata.eu/index.php?file=files/mokymai/stat/stat.html{\&}course{\_}file=stat{\_}III{\_}10.html>.

##  / truncated /

## Warning messages:
## 1: Janilionis2008-III-10: unknown macro '\c' 
## 2: Janilionis2008-III-10: unknown macro '\c' 
## 3: Janilionis2008-III-10: unknown macro '\c' 

This encoding issue is related to #53
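
The `Ä—` sequences in the output above are consistent with UTF-8 bytes being interpreted as Windows-1252, the native encoding of this session. A minimal base-R sketch of that mechanism follows; it only illustrates the symptom and is not taken from RefManageR's code.

```r
# "ė" (U+0117) as raw UTF-8 bytes: 0xC4 0x97.
e_dot <- rawToChar(as.raw(c(0xC4, 0x97)))
Encoding(e_dot) <- "UTF-8"

# Misread the same two bytes as Windows-1252 and convert them to UTF-8:
# 0xC4 becomes "Ä" and 0x97 becomes an em dash.
misread <- iconv(rawToChar(as.raw(c(0xC4, 0x97))), from = "CP1252", to = "UTF-8")

# Compare code points so the check is independent of the display locale.
utf8ToInt(e_dot)    # 279       (U+0117)
utf8ToInt(misread)  # 196 8212  (U+00C4, U+2014)
```

One character turning into two is the typical sign that a multi-byte UTF-8 sequence was decoded with a single-byte code page.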

@crsh
Owner

crsh commented Jun 4, 2020

Thanks for the PR. I've hardcoded the expected encoding of parent documents to UTF-8, because rmarkdown assumes UTF-8 encoding anyway and because the option citr.encoding specifies the encoding of the Bib file, not of the parent documents.

@crsh
Owner

crsh commented Jul 15, 2020

Hi @GegznaV, has this issue been resolved (except for the upstream encoding issue)?

@crsh crsh added the bug label Jul 15, 2020
@GegznaV
Contributor Author

GegznaV commented Jul 30, 2020

It seems that only the upstream issue is left.

@crsh crsh added the upstream label Jul 30, 2020