From 53481d73e15ff0feaf2e2c0972f88921b14b5869 Mon Sep 17 00:00:00 2001
From: Justin Silvestre
Date: Wed, 30 Oct 2024 19:21:20 +0100
Subject: [PATCH] Some ocr and readings fixes, restore license, add vocab files for rest of brandt

---
 README.md                      | 12 ++++----
 docs/brandt.md                 | 50 +++++++++++++++++-----------------
 texts/LICENSE                  |  2 +-
 texts/brandt-ch04-3.vocab.tsv  |  2 +-
 texts/brandt-ch37-2.passage.md |  2 +-
 texts/brandt-ch37-2.vocab.tsv  |  7 +++++
 texts/brandt-ch38-1.vocab.tsv  |  5 ++++
 texts/brandt-ch38-2.vocab.tsv  |  9 ++++++
 texts/brandt-ch39-1.passage.md |  4 +--
 texts/brandt-ch39-1.vocab.tsv  | 20 ++++++++++++++
 texts/brandt-ch39-2.vocab.tsv  | 12 ++++++++
 texts/brandt-ch40-1.vocab.tsv  | 14 ++++++++++
 texts/brandt-ch40-2.passage.md |  2 +-
 texts/brandt-ch40-2.vocab.tsv  | 13 +++++++++
 14 files changed, 117 insertions(+), 37 deletions(-)
 create mode 100644 texts/brandt-ch37-2.vocab.tsv
 create mode 100644 texts/brandt-ch38-1.vocab.tsv
 create mode 100644 texts/brandt-ch38-2.vocab.tsv
 create mode 100644 texts/brandt-ch39-1.vocab.tsv
 create mode 100644 texts/brandt-ch39-2.vocab.tsv
 create mode 100644 texts/brandt-ch40-1.vocab.tsv
 create mode 100644 texts/brandt-ch40-2.vocab.tsv

diff --git a/README.md b/README.md
index 81bd2a1..6690ac6 100644
--- a/README.md
+++ b/README.md
@@ -27,17 +27,17 @@ The current focus is to transcribe + format the content of the 1927 textbook _In

 The biggest challenge at the moment is transcribing the portions in mixed Chinese/Latin script. OCR tools can automate some of the process, but not all of it. If you have time, please consider helping out by transcribing the remaining "Vocabulary" and "Notes" chapters listed [here](./docs/brandt.md).

-## texts   [![CC BY-NC-SA 4.0][cc-by-sa-shield]][cc-by-sa]
+## texts   [![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa]

 The following license information applies to the texts in the [texts](./texts) folder.

-[Creative Commons Attribution-ShareAlike 4.0 International License][cc-by-sa].
+[Creative Commons Non-Commercial Attribution-ShareAlike 4.0 International License][cc-by-nc-sa].

-[![CC BY-NC-SA 4.0][cc-by-sa-image]][cc-by-sa]
+[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]

-[cc-by-sa]: http://creativecommons.org/licenses/by-sa/4.0/
-[cc-by-sa-image]: https://licensebuttons.net/l/by-sa/4.0/88x31.png
-[cc-by-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg
+[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
+[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png
+[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg

 ## development

diff --git a/docs/brandt.md b/docs/brandt.md
index ccfef0a..298f6d4 100644
--- a/docs/brandt.md
+++ b/docs/brandt.md
@@ -1881,7 +1881,7 @@ total progress: 535 / 1384 tasks complete (~38%)
     - [ ] proofread
   - format content
     - [x] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
-    - [ ] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
+    - [x] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
   - add content
     - [ ] check/proofread/fill in missing Mandarin pinyin readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Vietnamese readings in `.vocab.tsv`
@@ -1890,99 +1890,99 @@ total progress: 535 / 1384 tasks complete (~38%)
     - [ ] write Chinese -> English gloss
 - Lesson 37, Text 2
   - transcribe content
-    - [ ] use OCR on text
+    - [x] use OCR on text
     - [ ] proofread OCR results
     - [ ] transcribe English definitions in Vocabulary section
     - [ ] transcribe Notes section
     - [ ] proofread
   - format content
-    - [ ] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
-    - [ ] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
+    - [x] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
+    - [x] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
   - add content
     - [ ] check/proofread/fill in missing Mandarin pinyin readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Vietnamese readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Cantonese readings in `.vocab.tsv`
-    - [ ] align transcribed Chinese + English sentences
+    - [x] align transcribed Chinese + English sentences
     - [ ] write Chinese -> English gloss
 - Lesson 38, Text 2
   - transcribe content
-    - [ ] use OCR on text
+    - [x] use OCR on text
     - [ ] proofread OCR results
     - [ ] transcribe English definitions in Vocabulary section
     - [ ] transcribe Notes section
     - [ ] proofread
   - format content
-    - [ ] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
-    - [ ] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
+    - [x] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
+    - [x] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
   - add content
     - [ ] check/proofread/fill in missing Mandarin pinyin readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Vietnamese readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Cantonese readings in `.vocab.tsv`
-    - [ ] align transcribed Chinese + English sentences
+    - [x] align transcribed Chinese + English sentences
     - [ ] write Chinese -> English gloss
 - Lesson 39, Text 2
   - transcribe content
-    - [ ] use OCR on text
+    - [x] use OCR on text
     - [ ] proofread OCR results
     - [ ] transcribe English definitions in Vocabulary section
     - [ ] transcribe Notes section
     - [ ] proofread
   - format content
-    - [ ] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
-    - [ ] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
+    - [x] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
+    - [x] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
   - add content
     - [ ] check/proofread/fill in missing Mandarin pinyin readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Vietnamese readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Cantonese readings in `.vocab.tsv`
-    - [ ] align transcribed Chinese + English sentences
+    - [x] align transcribed Chinese + English sentences
     - [ ] write Chinese -> English gloss
 - Lesson 40, Text 1
   - transcribe content
-    - [ ] use OCR on text
+    - [x] use OCR on text
     - [ ] proofread OCR results
     - [ ] transcribe English definitions in Vocabulary section
     - [ ] transcribe Notes section
     - [ ] proofread
   - format content
-    - [ ] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
-    - [ ] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
+    - [x] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
+    - [x] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
   - add content
     - [ ] check/proofread/fill in missing Mandarin pinyin readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Vietnamese readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Cantonese readings in `.vocab.tsv`
-    - [ ] align transcribed Chinese + English sentences
+    - [x] align transcribed Chinese + English sentences
     - [ ] write Chinese -> English gloss
 - Lesson 40, Text 2
   - transcribe content
-    - [ ] use OCR on text
+    - [x] use OCR on text
     - [ ] proofread OCR results
     - [ ] transcribe English definitions in Vocabulary section
     - [ ] transcribe Notes section
     - [ ] proofread
   - format content
-    - [ ] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
-    - [ ] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
+    - [x] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
+    - [x] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
   - add content
     - [ ] check/proofread/fill in missing Mandarin pinyin readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Vietnamese readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Cantonese readings in `.vocab.tsv`
-    - [ ] align transcribed Chinese + English sentences
+    - [x] align transcribed Chinese + English sentences
     - [ ] write Chinese -> English gloss
 - Lesson 40, Text 3
   - transcribe content
-    - [ ] use OCR on text
+    - [x] use OCR on text
     - [ ] proofread OCR results
     - [ ] transcribe English definitions in Vocabulary section
     - [ ] transcribe Notes section
     - [ ] proofread
   - format content
-    - [ ] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
-    - [ ] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
+    - [x] create `.passage.md` according to established format ([example](../texts/brandt-ch01-1.passage.md))
+    - [x] create `.vocab.tsv` according to established format ([example](../texts/brandt-ch01-1.vocab.tsv))
   - add content
     - [ ] check/proofread/fill in missing Mandarin pinyin readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Vietnamese readings in `.vocab.tsv`
     - [ ] check/proofread/fill in missing Cantonese readings in `.vocab.tsv`
-    - [ ] align transcribed Chinese + English sentences
+    - [x] align transcribed Chinese + English sentences
     - [ ] write Chinese -> English gloss

diff --git a/texts/LICENSE b/texts/LICENSE
index dec1a49..05d30af 100644
--- a/texts/LICENSE
+++ b/texts/LICENSE
@@ -1,3 +1,3 @@
 The following license applies to the files in this directory.

-https://creativecommons.org/licenses/by-sa/4.0/
+https://creativecommons.org/licenses/by-nc-sa/4.0/
diff --git a/texts/brandt-ch04-3.vocab.tsv b/texts/brandt-ch04-3.vocab.tsv
index 6cc7743..1b3d92d 100644
--- a/texts/brandt-ch04-3.vocab.tsv
+++ b/texts/brandt-ch04-3.vocab.tsv
@@ -29,7 +29,7 @@ Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
 苗 明三B宵平 miáo miu⁴ 묘 meo sprouts; shoots
 晝 知三尤去 zhòu zau³ 주 daylight; daytime
 等 端開一咍上?, 端開一登上? děng dang² 등 đẳng a class; a sort; equal; equally; a sign of the plural
-長 澄開三陽平?, 知開三陽上?, 澄開三陽去? zhǎng coeng⁴ 장 trường long
+長 澄開三陽平?, 知開三陽上?, 澄開三陽去? cháng coeng⁴ 장 trường long
 並 並四青上 bìng bing⁶ 병 two together; united; all; equally; also; really
 重 澄三鍾平?, 澄三鍾上?, 澄三鍾去? zhòng cung⁵ 중 trọng heavy; important; severe
 案 影開一寒去 àn on³ 안 duyên?, án? a table. A case at law
diff --git a/texts/brandt-ch37-2.passage.md b/texts/brandt-ch37-2.passage.md
index 46c9f48..e15e2c4 100644
--- a/texts/brandt-ch37-2.passage.md
+++ b/texts/brandt-ch37-2.passage.md
@@ -6,7 +6,7 @@

 探問友人疾病函

-某某仁兄大人閣下運啟者。吾兄貴體違和殊深。惦念伏思尊軀素健。今偶失檢點。乃爲二豎所侵。惟期安心靜養。定占勿藥之喜。達人自玉。皇閣下勿稍介意。未悉請何醫士診治。弟稍暇卽冨趨府看望。特此致候。順頌痊安。
+某某仁兄大人閣下運啟者。吾兄貴體違和殊深。惦念伏思尊軀素健。今偶失檢點。乃爲二豎所侵。惟期安心靜養。定占勿藥之喜。達人自玉。皇閣下勿稍介意。未悉請何醫士診治。弟稍暇卽當趨府看望。特此致候。順頌痊安。

 弟某某鞠躬月日

diff --git a/texts/brandt-ch37-2.vocab.tsv b/texts/brandt-ch37-2.vocab.tsv
new file mode 100644
index 0000000..5fb2e59
--- /dev/null
+++ b/texts/brandt-ch37-2.vocab.tsv
@@ -0,0 +1,7 @@
+Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
+惦 diàn dim³ điếm
+軀 溪三虞平 qū keoi¹ 구 xo
+健 羣開三元去 jiàn gin⁶ 건 kiện
+豎 常三虞上 shù syu⁶ 수
+藥 以開三陽入 yào joek⁶ 약 dược
+介 見開二皆去 jiè gaai³ 개 giới
\ No newline at end of file
diff --git a/texts/brandt-ch38-1.vocab.tsv b/texts/brandt-ch38-1.vocab.tsv
new file mode 100644
index 0000000..2b6205c
--- /dev/null
+++ b/texts/brandt-ch38-1.vocab.tsv
@@ -0,0 +1,5 @@
+Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
+織 章開三之去?, 章開三蒸入? zhī zik¹ 직 chức
+述 船合三眞入 shù seot⁶ 술 thuật
+頻 並三A眞平 pín pan⁴ 빈
+倣 幫三陽上 fǎng fong² 방 phỏng
\ No newline at end of file
diff --git a/texts/brandt-ch38-2.vocab.tsv b/texts/brandt-ch38-2.vocab.tsv
new file mode 100644
index 0000000..165e831
--- /dev/null
+++ b/texts/brandt-ch38-2.vocab.tsv
@@ -0,0 +1,9 @@
+Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
+青 清開四青平 qīng cing¹ 청 thanh
+竿 見開一寒平 gān gon¹ 간 cần
+臆 影開三蒸入 yì jik¹ 억
+閥 並三元入 fá fat⁶ 벌
+簡 見開二山上 jiǎn gaan² 간 giản
+峯 滂三鍾平 fēng fung¹ 봉
+鴻 匣一東平?, 匣一東上? hóng hung⁴ 홍 hồng
+茹 日開三魚平?, 日開三魚上?, 日開三魚去? rú jyu⁴ 여 nhà
\ No newline at end of file
diff --git a/texts/brandt-ch39-1.passage.md b/texts/brandt-ch39-1.passage.md
index 2a3198f..cd82081 100644
--- a/texts/brandt-ch39-1.passage.md
+++ b/texts/brandt-ch39-1.passage.md
@@ -12,7 +12,7 @@ President Ts'ao-Ku'n's Telegram of Resignation
 公鑒。
 To the Peking Cabinet of the 10th month, 13th year of the Republic, to the Senate and the House of Representatives, to high military and civil authorities of all provinces and special administrative areas, to all provincial assemblies, to all legal organizations and all newspapers for information of all citizens:

-錕泰膺重托。德薄能鲜。致令部曲橋貳紀綱失墜。
+錕泰膺重托。德薄能鮮。致令部曲橋貳紀綱失墜。
 I, K'un, was entrusted with the heavy burden (of the presidency). My virtue and ability however were so poor that a conflict among my followers broke out and all laws became ineffective (lit. fell down).

 十三年十月二十三日。馮玉祥倒戈。錕受閉錮。
@@ -36,7 +36,7 @@ I earnestly hope that all my former colleagues will do their utmost to bring abo
 錕優遊林下。獲睹承平。欣幸曷極。
 And in the quietness and freedom of my private life I will be able to witness peaceful times which will be for me the highest happiness.

-特電佈達。願共察之。曹锟。
+特電佈達。願共察之。曹錕。
 I specially send forth this telegram for general information. Ts'ao-Ku'n.
 ---
diff --git a/texts/brandt-ch39-1.vocab.tsv b/texts/brandt-ch39-1.vocab.tsv
new file mode 100644
index 0000000..566e327
--- /dev/null
+++ b/texts/brandt-ch39-1.vocab.tsv
@@ -0,0 +1,20 @@
+Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
+曹 從開一豪平 cáo cou⁴ 조 tào
+錕 見合一魂平?, 見合一魂上? kūn kwan¹ 곤
+區 溪三虞平?, 影開一侯平? qū keoi¹ 구 khu
+托 tuō tok³ 탁 thác?, thách?, thốc?, thước?, thướt?
+橋 羣開三B宵平 qiáo kiu⁴ 교 kiều
+貳 日開三脂去 èr ji⁶ 이 nhị
+紀 見開三之上 jì gei² 기 kỉ
+綱 見開一唐平 gāng gong¹ 강 cương
+閉 幫四齊去?, 幫四先入? bì bai³ 폐 bế
+錮 見一模去 gù gu³ 고
+疚 見三尤去 jiù gau³ 구 nhíu
+討 透開一豪上 tǎo tou² 토 thảo
+憨 曉開一談平?, 匣開一談去? hān ham¹ 감 hám
+馭 疑開三魚去 yù jyu⁶ 어 ngựa
+屣 生開三支上?, 生開三支去? xǐ saai²
+袍 並一豪平?, 並一豪去? páo pou⁴ 포 bào
+循 邪合三眞平 xún ceon⁴ 순
+軌 見合三B脂上 guǐ gwai² 궤 quẫy
+睹 端一模上 dǔ dou² 도 đủ
\ No newline at end of file
diff --git a/texts/brandt-ch39-2.vocab.tsv b/texts/brandt-ch39-2.vocab.tsv
new file mode 100644
index 0000000..845f317
--- /dev/null
+++ b/texts/brandt-ch39-2.vocab.tsv
@@ -0,0 +1,12 @@
+Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
+訃 滂三虞去 fù fu⁶ 부 phó
+音 影開三B侵平 yīn jam¹ 음 âm
+玲 來開四青平 líng ling⁴ 령?, 영? liếng
+翁 影一東平 wēng jung¹ 옹 ông
+泉 從合三仙平 quán cyun⁴ 천 tuyền
+純 常合三眞平?, 章合三眞上? chún seon⁴ 순?, 준? thuần
+撫 滂三虞上 fǔ fu² 무 vỗ
+靈 來開四青平 líng ling⁴ 령?, 영? linh
+弔 端開四蕭去?, 端開四青入? diào diu³ 조 điếu
+唁 疑開三B仙去 yàn jin⁶ ngon
+芻 初三虞平 chú co¹ 추 so
\ No newline at end of file
diff --git a/texts/brandt-ch40-1.vocab.tsv b/texts/brandt-ch40-1.vocab.tsv
new file mode 100644
index 0000000..db8cd13
--- /dev/null
+++ b/texts/brandt-ch40-1.vocab.tsv
@@ -0,0 +1,14 @@
+Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
+兢 見開三蒸平 jīng ging¹ 긍 cạnh
+導 定開一豪去 dǎo dou⁶ 도 đạo
+迭 定開四先入 dié dit⁶ 질 dật
+謗 幫一唐去 bàng pong³ 방 báng
+赤 昌開三清入 chì cek³ 적 xích
+遴 來開三眞去 lín leon⁴ lận
+扞 匣開一寒去 hàn hon⁶ 한
+遡 心一模去 sù sou³ 소
+帥 生合三脂去?, 生合三眞入? shuài seoi³ 솔?, 수? soái
+貞 知開三清平 zhēn zing¹ 정 trinh
+斬 莊開二咸上 zhǎn zaam² 참 trảm
+僉 清開三鹽平 qiān cim¹ 첨
+濱 幫三A眞平 bīn ban¹ 빈
\ No newline at end of file
diff --git a/texts/brandt-ch40-2.passage.md b/texts/brandt-ch40-2.passage.md
index 20d0fac..9ef646a 100644
--- a/texts/brandt-ch40-2.passage.md
+++ b/texts/brandt-ch40-2.passage.md
@@ -6,7 +6,7 @@

 慰友人喪母函

-某某仁兄大人苫次頃奉訃聞。驚知老伯母大人於某月某日駕返瑤池。驚閱之下。悼働莫名。伏維 伯母大人。閫範永垂。母儀足式。今者星墜女嫌。對萱堂而雨泣。峯願天姥。感樹木之風悲雖歸真於天上。無遺憾於人間。尙望兄台勉釋軫懷。是爲至禱期屆駕輛。自應前往執紼。謹具奠儀。尙所代薦靈凡之右。此泐。順候孝履。
+某某仁兄大人苫次頃奉訃聞。驚知老伯母大人於某月某日駕返瑤池。驚閱之下。悼慟莫名。伏維 伯母大人。閫範永垂。母儀足式。今者星墜女嫌。對萱堂而雨泣。峯願天姥。感樹木之風悲雖歸真於天上。無遺憾於人間。尙望兄台勉釋軫懷。是爲至禱期屆駕輀。自應前往執紼。謹具奠儀。尙所代薦靈凡之右。此泐。順候孝履。

 弟某某鞠躬月日

diff --git a/texts/brandt-ch40-2.vocab.tsv b/texts/brandt-ch40-2.vocab.tsv
new file mode 100644
index 0000000..b74125f
--- /dev/null
+++ b/texts/brandt-ch40-2.vocab.tsv
@@ -0,0 +1,13 @@
+Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
+苫 書開三鹽平?, 書開三鹽去? shān sim¹ 점 chôm
+悼 定開一豪去 dào dou⁶ 도 điệu
+慟 定一東去 tòng dung⁶ 통 đỏng
+閫 溪合一魂上 kǔn kwan² 곤
+萱 曉合三元平 xuān hyun¹ 훤
+姥 明一模上 lǎo mou⁵ 모 mụ
+憾 匣開一覃去 hàn ham⁶ 감 hám
+軫 章開三眞上 zhěn zan² 진
+輀 日開三之平 ér ji⁴
+紼 幫三文入 fú fat¹
+奠 定開四先去?, 端開四青去? diàn din⁶ 전
+薦 精開四先去 jiàn zin³ 천
\ No newline at end of file