Consider using fewer codepages? #41

Open
blerner opened this issue May 7, 2018 · 6 comments

Comments

@blerner

blerner commented May 7, 2018

According to the RTF spec (https://www.microsoft.com/en-us/download/details.aspx?id=10725), there are only a few codepages needed in RTF:

Code page | Name
-- | --
437 | United States IBM
708 | Arabic (ASMO 708)
709 | Arabic (ASMO 449+, BCON V4)
710 | Arabic (transparent Arabic)
711 | Arabic (Nafitha Enhanced)
720 | Arabic (transparent ASMO)
819 | Windows 3.1 (United States and Western Europe)
850 | IBM multilingual
852 | Eastern European
860 | Portuguese
862 | Hebrew
863 | French Canadian
864 | Arabic
865 | Norwegian
866 | Soviet Union
874 | Thai
932 | Japanese
936 | Simplified Chinese
949 | Korean
950 | Traditional Chinese
1250 | Eastern European
1251 | Cyrillic
1252 | Western European
1253 | Greek
1254 | Turkish
1255 | Hebrew
1256 | Arabic
1257 | Baltic
1258 | Vietnamese
1361 | Johab
10000 | MAC Roman
10001 | MAC Japan
10004 | MAC Arabic
10005 | MAC Hebrew
10006 | MAC Greek
10007 | MAC Cyrillic
10029 | MAC Latin2
10081 | MAC Turkish
57002 | Devanagari
57003 | Bengali
57004 | Tamil
57005 | Telugu
57006 | Assamese
57007 | Oriya
57008 | Kannada
57009 | Malayalam
57010 | Gujarati
57011 | Punjabi

As far as I can tell, rtf.js supports 145 code pages (counted by searching for `cptable[###] =` in the RTFJS.bundle.js file). Eliminating the ones that aren't necessary could cut down the bundle file size substantially.
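
For anyone who wants to repeat the count, here is a minimal sketch of that search. It assumes the bundle registers each code page with an assignment of the form `cptable[NNN] = ...`; the function names and the sample string are made up for illustration and are not part of rtf.js.

```javascript
// Collect the distinct code page numbers assigned in a bundle's source text.
// Assumes entries look like `cptable[437] = ...`, as in the search described above.
function countCodepages(bundleSource) {
  // `=(?!=)` matches an assignment but skips comparisons such as `==`.
  const pattern = /cptable\[(\d+)\]\s*=(?!=)/g;
  const found = new Set();
  for (const match of bundleSource.matchAll(pattern)) {
    found.add(Number(match[1]));
  }
  return found;
}

// Code pages that are registered in the bundle but absent from a given list.
function unlistedCodepages(found, listed) {
  const listedSet = new Set(listed);
  return [...found].filter((cp) => !listedSet.has(cp)).sort((a, b) => a - b);
}

// Tiny made-up sample standing in for the contents of RTFJS.bundle.js.
const sample = "cptable[437] = a(); cptable[1252] = b(); cptable[10002] = c();";
const found = countCodepages(sample);
console.log(found.size, unlistedCodepages(found, [437, 1252]));
```

With the real bundle read from disk, `unlistedCodepages` would show which registered code pages fall outside the table above.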

@zoehneto
Collaborator

zoehneto commented May 7, 2018

From the spec: "Possible values include those in the following table." So the list is not exhaustive, and a quick Google search shows that there are RTF documents which use other codepages (for example, search for ansicpg10002). For maximum document compatibility I want to keep the default as is. What I could do is load the codepages as an external module / additional script; that way you could supply your own cut-down cptable for scenarios where you know which codepages will be used.
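
A cut-down table of that kind could be built roughly as follows. This is only a sketch: it assumes the full table is a plain object keyed by code page number, and `pickCodepages` is a hypothetical helper, not something rtf.js provides.

```javascript
// Build a reduced table containing only the requested code pages.
// `fullTable` is any object keyed by code page number; entries are copied as-is.
function pickCodepages(fullTable, wanted) {
  const reduced = {};
  const missing = [];
  for (const cp of wanted) {
    if (Object.prototype.hasOwnProperty.call(fullTable, cp)) {
      reduced[cp] = fullTable[cp];
    } else {
      missing.push(cp); // requested but not available in the full table
    }
  }
  return { reduced, missing };
}

// Placeholder entries; a real table would hold encoder/decoder data.
const fullTable = { 437: "entry-437", 1252: "entry-1252", 10002: "entry-10002" };
const { reduced, missing } = pickCodepages(fullTable, [1252, 932]);
console.log(Object.keys(reduced), missing);
```

Reporting the `missing` list makes it obvious when a document needs a code page that the reduced table leaves out.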

@lounsbrough

@zoehneto - I am also interested in using this library but it currently would double the size of our deployment bundle. How hard would it be to do what you described above so that I could provide only the code page that I need?

@zoehneto
Collaborator

In theory you'd only have to add the library to the webpack externals, remove the include from the dev / prod config and add a peer dependency to the package.json. That covers the rtf.js side; you'd still need to adapt your own config to provide codepagejs appropriately. I currently don't have time to look further into it, but I'd welcome a PR if you want to implement the feature.
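
The externals part might look something like the sketch below. `externals` is a standard webpack option, but the module name `codepage`, the global name `cptable`, and the base config are assumptions for illustration; the actual rtf.js dev / prod configs may be structured differently. The sketch also assumes any existing `externals` value is in object form.

```javascript
// Return a copy of a webpack config that treats the code page library as external,
// so it is not bundled and must be provided by the consuming page or application.
// "codepage" (module name) and "cptable" (global variable) are assumed names.
function withExternalCodepage(baseConfig) {
  return {
    ...baseConfig,
    externals: {
      ...(baseConfig.externals || {}),
      codepage: "cptable",
    },
  };
}

// Hypothetical base config standing in for the real dev / prod config.
const devConfig = { mode: "development", entry: "./index.js" };
const externalized = withExternalCodepage(devConfig);
console.log(externalized.externals);
```

In a real `webpack.config.js` the returned object would be what gets exported, and the consumer would load their own (possibly reduced) table before rtf.js.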

@lounsbrough

I will take a look and see if it's something I can do quickly or if I run into any roadblocks. 👍🏼

@lounsbrough

I attempted to do this but ran into issues, mainly because this package was designed to be available in the browser and the codepage tables are baked in. What I experimented with is on this branch: https://github.com/lounsbrough/rtf.js/tree/extract-codepage. I think someone more familiar with the app would need to address this, or decide what the best path is.

@mattiaskagstrom

Hi!
The codepages currently take up the majority of our bundle size:
image
