Encoding issue (not UTF-8) and repeated entries #67

GegznaV · 2020-03-15T12:51:20Z

Describe the bug

Encoding issue in displaying non-ASCII characters.
Repeated entries of the same source (in the bib file each source is entered only once).

To Reproduce
Call the citr RStudio add-in from the attached project: citr--UTF-8--bug.zip

Expected behavior

Correct encoding (UTF-8) for all characters.
Each entry is shown exactly once.

Screenshots

Encoding is set to UTF-8 in settings:

Additional context

R             3.6.3
RStudio       1.2.5033
citr          0.3.2

- Session info ----------------------------------
 setting  value                       
 version  R version 3.6.3 (2020-02-29)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       Europe/Helsinki             
 date     2020-03-15


GegznaV · 2020-03-15T14:29:35Z

Another encoding issue may be here:

citr/R/insert_citation.R

Line 107 in 0afd6f9

parent_document <- readLines(parents_path[parents], warn = FALSE)

Shouldn't it be:

parent_document <- readLines(parents_path[parents], warn = FALSE, encoding = getOption("citr.encoding"))
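
For reference, a minimal self-contained sketch of what the `encoding` argument of `readLines()` does. It only marks the strings it reads as being in the given encoding; it does not re-encode the input. The temporary file and its contents are made up for illustration, and "UTF-8" is passed directly here instead of `getOption("citr.encoding")`:

```r
# Write a small UTF-8 file so the example is self-contained
# (hypothetical content, not taken from the attached project).
path <- tempfile(fileext = ".Rmd")
line <- enc2utf8("Statistin\u0117 analiz\u0117")
writeLines(line, path, useBytes = TRUE)

# Declare the encoding of the input. The strings are marked as UTF-8
# instead of being assumed to be in the native encoding
# (CP1252 in the session reported above).
doc <- readLines(path, warn = FALSE, encoding = "UTF-8")

Encoding(doc)         # encoding mark of the line that was read
identical(doc, line)  # compare with the text that was written
```

Without the `encoding` argument, the same bytes would be left unmarked and treated as native-encoding text, which on a CP1252 session is one way non-ASCII characters can end up garbled.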

GegznaV · 2020-03-15T15:18:28Z

#68 solves the issue of duplicated entries. It addresses one more potential issue related to encoding.

The remaining encoding issue is related to RefManageR::ReadBib(), which does not respect the value of its .Encoding argument:

RefManageR::ReadBib("book.bib", check = FALSE, .Encoding = "UTF-8")

## [1] V. ÄŒekanaviÄ�ius and G. Murauskas. _Statistika ir jos taikymai I_. Vilnius:
## TEV, 2006, p. 240. ISBN: 9986-546-93-1.

##  / truncated /

## [5] V. Janilionis, V. Morkevicius, and R. Rauleckas. “III dalis. StatistinÄ—s
## analizÄ—s pavyzdžiÅ³ naudojant pavyzdin\ce skaitmenin\ce duomenÅ³ baz\ce
## medžiaga”. In: _StatistinÄ— kiekybiniÅ³ duomenÅ³ analizÄ— su SPSS ir Stata_.
## Kaunas, 2008. Chap. 10. Daugia, p. 393. <URL:
## http://www.lidata.eu/index.php?file=files/mokymai/stat/stat.html{\&}course{\_}file=stat{\_}III{\_}10.html>.

#  / truncated /

## Warning messages:
## 1: Janilionis2008-III-10: unknown macro '\c' 
## 2: Janilionis2008-III-10: unknown macro '\c' 
## 3: Janilionis2008-III-10: unknown macro '\c'

This encoding issue is related to #53
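
The garbled names in the output above look consistent with UTF-8 bytes being decoded as Windows-1252, the native encoding of the session reported in this issue. This is an assumption based on the output, not something confirmed in the RefManageR or bibtex code. A minimal sketch that reproduces the pattern for a single character:

```r
# "Č" (U+010C) is stored in UTF-8 as the two bytes 0xC4 0x8C.
utf8_bytes <- charToRaw(enc2utf8("\u010c"))

# Decode those same two bytes as Windows-1252 and convert the result
# to UTF-8. Each byte becomes its own character: 0xC4 is "Ä" and
# 0x8C is "Œ", matching the "ÄŒ" at the start of the first entry above.
misread <- iconv(rawToChar(utf8_bytes), from = "CP1252", to = "UTF-8")

misread
```

The second byte of "č" (0x8D) has no character assigned in Windows-1252, which would fit the replacement character shown in the middle of the same surname.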

crsh · 2020-06-04T08:44:37Z

Thanks for the PR. I've hardcoded the expected encoding of parent documents to UTF-8, because rmarkdown assumes UTF-8 encoding anyway and because the option citr.encoding specifies the encoding of the Bib file.

crsh · 2020-07-15T12:26:37Z

Hi @GegznaV, has this issue been resolved (except for the upstream encoding issue)?

GegznaV · 2020-07-30T02:20:30Z

It seems that only the upstream issue is left.

GegznaV mentioned this issue Mar 15, 2020

Prevent from duplicated entries and some encoding issues #68

Merged

GegznaV mentioned this issue Mar 15, 2020

Encoding error when parsing BibTeX file with multi-byte characters on Windows ropensci/bibtex#20

Open

crsh added the bug label Jul 15, 2020

crsh added the upstream label Jul 30, 2020
