Transcribe vocab up to brandt 10-3, allow comment header in TSV, handle edge case of variants used within one text where group is introduced
justinsilvestre committed Oct 31, 2024
1 parent 3ee1baf commit d708158
Showing 12 changed files with 161 additions and 93 deletions.
32 changes: 25 additions & 7 deletions src/app/prebuild.ts
@@ -77,10 +77,12 @@ async function fillInMissingReadingsInTsvs() {
const isBrandtPassage = textId.startsWith("brandt-");
if (isBrandtPassage) brandtPassagesVisited += 1;
const vocabFileContents = getPassageVocabFileContents(textId);
const { vocab, variants } = parsePassageVocabList(

const { vocab, variants, comment } = parsePassageVocabList(
textId,
vocabFileContents
);

const mainToSecondaryVariants: {
[mainVariant: string]: string[];
} = Object.entries(variants).reduce(
@@ -104,7 +106,20 @@ async function fillInMissingReadingsInTsvs() {
Object.keys(vocab).concat(isBrandtPassage ? newCharsInPassage : [])
);

for (const char of featuredChars) {
if (textId.includes("10-3")) {
console.log({ variants, vocab, newCharsInPassage, featuredChars });
}

const featuredCharsMainVariants = new Set(
[...featuredChars].flatMap((char) => {
const mainVariants = [
...(lexicon.variants[char] || []),
...(variants[char] || []),
];
return mainVariants.length ? mainVariants : [char];
})
);
for (const char of featuredCharsMainVariants) {
if (
!lexicon.vocab[char] ||
vocab[char]?.some((e) => vocabFileColumns.some((k) => !e[k.key]))
@@ -168,7 +183,7 @@ async function fillInMissingReadingsInTsvs() {
}));
}
}
const newVocabFileContents = makeVocabTsvContent(vocab, variants);
const newVocabFileContents = makeVocabTsvContent(comment, vocab, variants);

if (
Object.keys(vocab).length &&
@@ -218,7 +233,10 @@ function getEmbeddedChineseSegments(text: string) {
function writePassageVocabularyJsons(lexicon: PassageVocabWithVariants) {
for (const textId of getTextsIds()) {
const vocabFileContents = getPassageVocabFileContents(textId);
const { vocab } = parsePassageVocabList(textId, vocabFileContents);
const { vocab, variants: _passageVariants } = parsePassageVocabList(
textId,
vocabFileContents
);
const passage = parsePassage(getPassageFileContents(textId));

const vocabJsonPath = path.join(
@@ -229,9 +247,7 @@ function writePassageVocabularyJsons(lexicon: PassageVocabWithVariants) {

const variants: PassageTermVariants = {};
for (const char of passageChars) {
if (!vocab[char]) {
vocab[char] = lexicon.vocab[char];
}
vocab[char] ||= lexicon.vocab[char];
const mainVariants = lexicon.variants[char];
if (mainVariants) {
variants[char] = mainVariants;
@@ -266,10 +282,12 @@ function getMandarinReadings(
}

function makeVocabTsvContent(
comment: string | null,
vocab: Partial<Record<string, LexiconEntry[]>>,
variants: PassageTermVariants
): string | NodeJS.ArrayBufferView {
return [
...(comment ? [comment.trim()] : []),
`Traditional\tQieyun\tHanyu Pinyin\tJyutping\tKorean\tVietnamese\tEnglish`,
...Object.entries(vocab).flatMap(
([char, ee]) =>
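
The `featuredCharsMainVariants` step in the second hunk replaces each featured character with its main variants, taken from the lexicon-wide table and then from the passage's own table, and falls back to the character itself when neither table lists it. A minimal standalone sketch of that lookup follows. The `VariantsMap` type and the `toMainVariants` name are hypothetical simplifications, and the sample data is invented for illustration rather than taken from the repository.

```typescript
// Hypothetical, simplified stand-in for the project's PassageTermVariants type.
type VariantsMap = { [char: string]: string[] | undefined };

// Map each featured character to its main variants (lexicon first, then
// passage), keeping the character itself when no main variant is recorded.
function toMainVariants(
  featuredChars: Iterable<string>,
  lexiconVariants: VariantsMap,
  passageVariants: VariantsMap
): Set<string> {
  return new Set(
    Array.from(featuredChars).flatMap((char) => {
      const mainVariants = [
        ...(lexiconVariants[char] || []),
        ...(passageVariants[char] || []),
      ];
      return mainVariants.length ? mainVariants : [char];
    })
  );
}

// Invented sample data: "群" is listed as a secondary form of "羣".
const mainVariantsExample = toMainVariants(
  ["群", "腦"],
  { "群": ["羣"] },
  {}
);
console.log(Array.from(mainVariantsExample));
```

Collecting the results in a `Set` means a main variant reached through several secondary forms is only processed once by the loop that follows.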
10 changes: 8 additions & 2 deletions src/app/texts/Passage.ts
@@ -42,8 +42,14 @@ export function parsePassageVocabList(
) {
const vocab: PassageVocab = {};
const variants: PassageTermVariants = {};
let comment: string | null = null;

if (vocabFileContents) {
const lines = vocabFileContents.split("\n");
const [, commentText, body] =
vocabFileContents?.match(/^(#.*\n)?([\s\S]*)$/) || [];
comment = commentText || null;

const lines = body.split("\n");

const [, ...columnHeaders] = lines[0].trim().split("\t");
const invalidColumnHeaders = columnHeaders.filter(
@@ -92,7 +98,7 @@ export function parsePassageVocabList(
}
}
}
return { vocab, variants };
return { vocab, variants, comment };
}

export function parsePassage(passageFileContents: string) {
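
The comment-header handling added to `parsePassageVocabList` relies on one regular expression: an optional first line beginning with `#` is captured as the comment, and everything after it is treated as the TSV body. The sketch below isolates that step so its behaviour is easier to check. The `splitLeadingComment` helper is a hypothetical name that does not exist in the repository; only the pattern is taken from the diff.

```typescript
// Hypothetical helper isolating the comment-splitting step from the diff.
function splitLeadingComment(contents: string): {
  comment: string | null;
  body: string;
} {
  // Group 1: an optional first line starting with "#", including its newline.
  // Group 2: the rest of the file, which may span many lines.
  const match = contents.match(/^(#.*\n)?([\s\S]*)$/);
  // The pattern matches any string, so the fallbacks only satisfy the types.
  const [, commentText, body = ""] = match ?? [];
  return { comment: commentText || null, body };
}

const parsed = splitLeadingComment(
  "# raw OCR notes\nTraditional\tEnglish\n腦\tthe brain"
);
console.log(parsed.comment !== null, parsed.body.split("\n").length);
```

The captured comment keeps its trailing newline, which is consistent with `makeVocabTsvContent` calling `comment.trim()` before writing the header back. A first line that starts with `#` but has no newline after it is not treated as a comment by this pattern and ends up in the body instead.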
1 change: 0 additions & 1 deletion texts/brandt-ch03-3.vocab.tsv
@@ -15,6 +15,5 @@ Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
見合一寒平?, 見合一寒去? guān gun¹ quan to gaze at; to look; to inspect
端開一咍上?, 端開一登上? děng dang² đẳng to wait; a class; a rank
heoi¹
幫三文平?, 並三文去? fēn fan¹ phân
來開三陽平?, 來開三陽去? liàng loeng⁴ 량?, 양?
端開四齊平 dai¹ đây
2 changes: 1 addition & 1 deletion texts/brandt-ch04-3.vocab.tsv
@@ -37,7 +37,7 @@ Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
溪開三之上 hei² khởi to rise up. to raise; to start
清開四齊去?, 清開四先入? qiè cit³ 절?, 체? thiết to cut; urgent; pressing; very
幫三庚平 bīng bing¹ binh a soldier, a weapon; military
溪開三B脂去 hei³ khí a vessel. implements; capacity
溪開三B脂去 hei³ khí a vessel; implements; a disih; an apparatus. capacity; ability.
見開一咍去 gài koi³ to level: to adjust; all
云合三微平 wéi wai⁴ vi to oppose; to disobey
見開三B侵平?, 見開三B侵去? jìn gam³ cấm to forbid; to prohibit
2 changes: 1 addition & 1 deletion texts/brandt-ch08-1.vocab.tsv
@@ -13,7 +13,7 @@ Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
疑開三陽入 nüè joek⁶ ngược to be cruel; cruelty; to oppress.
書開三支平?, 書開三支去? shī si¹ thi to give; to bestow; to apply.
孃開三魚上?, 孃開三魚去? neoi⁵ 녀?, 여? nữa woman; female; daughter.
便 並三A仙平?, 並三A仙去? biàn bin⁶ 변?, 편? tiện convenient; then.
便 並三A仙平?, 並三A仙去? biàn bin⁶ 변?, 편? tiện convenient; cheap. ordinary; plain. then; in that case.
孃三尤上?, 孃三尤去? niǔ nau² nữu to be accustomed to; to be used to.
邪三鍾入 zuk⁶ tục common; vulgar. a custom.
云合三祭去 wèi wai⁶ vệ to escort; to guard.
89 changes: 58 additions & 31 deletions texts/brandt-ch10-1.vocab.tsv
@@ -1,35 +1,62 @@
# nao (428) the brain. 腦 無數 wu-shu-innumerable. 知 chih-chüch-perception 知血器 hsieh-blood. shen² (824)-spirits; gods. The soul; the mind. Force; expression. ch'i-a vessel; a dish; an apparatus. Capacity; ability. 神經 shen²-ching-nerves. ch'ian-all; the whole; 全 complete; perfect. nao-ti-the brain sub- stance. 充ch'ung to fill; to satisfy: to fulfil. 塞 sai (se¹) to stop up; to block. 充塞 ch'ung-sai-to fill. 電線 tien-lightning; electri- city. hsien a thread; a wire. 電線tien-hsien wires. telegraph 分fen to divide; to dis- tribute. lu²(207) the skull; the forehead. 分布 fen-put-to distribute. 互通 hu-mutual; together. 頭顱 l'ou-lu² the head. shu-a number; some. 通 t'ung to go through; to circulate. General; wholly; complete. hsiao (124) to melt; to consume; to disperse. 消息 hsiao-hsi news. rumours; 五人wu-jen-we. 動作 tung-tso to move; move. a 傅合 chuan-ling-to orders. issue 揮huil to move; to directi to shake. 指揮 chih-hui-to direct. 百 po-ti-the whole body; the mechanism of body. tsung (380) to unite: to 咸 sum up. To control. 同 chu²-position; circum- stances office. A board; an 總局 tsung-chi-a head of fice. 分局fen-chi-a branch-office. 痛楚 t'ung-ch'u³-pain; sore. 痾ko (01) (699)-sickness; pain. 癢 yang³ (151)-to itch. 荷癢 k'o-yang-itching. 觸 chut (788) to butt; to strike against. 肌膚 chi¹ (396)-the flesh. fu the skin; the flesh. 肌膚 chi-fu the flesh; the skin. 臭 hsiu-to smell. Read ch'ous strong smelling; stinking. 味 wei (495)-test; flavor. chich¹ (85) to receive; to take. 接鼻 pi-the nose. 感 kan³ to touch; to in- fluence; to excite. kan-chuch-sensation. 耗 hao-to waste; to destroy. A rat. 消耗 hsiao-hao-to spend; to waste. 睡 shui to sleep. mien (358) to close the eyes; to sleep. 睡眠 shui-mien-to sleep. t'unga boy under 15 years of age, A girl. 小時 hsiao-shih-an hour. tuta measure; a limit. To cross over.
Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
泥開一豪上?, 泥開一豪去? nǎo nou⁵ não
泥開一豪上 nǎo nou⁵ não the brain
泥開一豪去 ? ? ? ? tanned animal hide; leather. smooth, glossy. [morohashi]
知覺 perception
幫一魂上 běn bun² bản
腦體 the brain substance
心開三侵平 xīn sam¹ tâm
曉合四先入 xuè hyut³ huyết
昌三東平 chōng cung¹ sung
來一模平 lou⁴
曉合四先入 xuè hyut³ huyết blood
昌三東平 chōng cung¹ sung to fill; to satisfy; to fulfil
來一模平 lou⁴ the skull; the forehead
頭顱 the head
並一模上?, 並一侯上? bou⁶ bộ
生三虞上?, 生三虞去?, 生二江入? shù sou³ 삭?, 수? số
船開三眞平 shén san⁴ thần
從合三仙平 quán cyun⁴ toàn
定開四先去 diàn din⁶
心開三仙去 xiàn sin³
匣一模去 wu⁶ hỗ
透一東平 tōng tung¹ thông
心開三宵平 xiāo siu¹ tiêu
曉合三微平 huī fai¹ huy
精一東上 zǒng zung² tổng
羣三鍾入 guk⁶ cục
溪開二麻去
以開三陽上 yǎng joeng⁵
昌三鍾入 chù zuk¹ xúc
見開三B脂平 gei¹
幫三虞平 fu¹
昌三尤去 chòu cau³
明三微去 wèi mei⁶ vị
精開三鹽入 jiē zip³ tiếp
並三A脂去 bei⁶
見開一覃上 gǎn gam² cảm
hào hou³ hao
常合三支去 shuì seoi⁶
明四先平 mián min⁴ 면?, 민? miên
定一東平 tóng tung⁴ đồng
幫二山入 baat³ bát
定一模去?, 定開一唐入? dou⁶ 도?, 탁? độ
生三虞上 shũ ? ? count; enumerate. rebuke. permit. compare; be equal to. [kroll]
生三虞去 shù sou³ số number; some
生二江入 ? số close together, dense; tight-meshed, close-stitched. [kroll]
無數 innumerable
船開三眞平 shén san⁴ thần spirits; gods. the soul; the mind. force; expression
神經 nerves
從合三仙平 quán cyun⁴ toàn all; the whole; complete; perfect
定開四先去 diàn din⁶ lightning; electricity
心開三仙去 xiàn sin³ a thread; a wire
電線 telegraph wires
幫三文平 fēn fan¹ phân to divide; to distribute
並三文去 fèn ? ? portion; rank; allotment; fate. suppose; infer. affection; goodwill. [kanjikai, kroll, hxwd]
分佈 to distribute
匣一模去 wu⁶ hỗ mutual; together
透一東平 tōng tung¹ thông to go through; to circulate. general; wholly; complete
心開三宵平 xiāo siu¹ tiêu to melt; to consume; to disperse
消息 rumours; news
吾人 we
動作 to move; a move
傳令 to issue orders
曉合三微平 huī fai¹ huy to move; to direct; to shake
指揮 to direct
百體 the whole body; the mechanism of body
精一東上 zǒng zung² tổng to unite; to sum up. to control
羣三鍾入 guk⁶ cục a head office
分局 a branch-office
痛楚 pain; sore
溪開二麻去 sickness; pain
以開三陽上 yǎng joeng⁵ to itch
荷癢 itching
昌三鍾入 chù zuk¹ xúc to butt; to strike against
見開三B脂平 gei¹ the flesh
幫三虞平 fu¹ the skin; the flesh
肌膚 the flesh; the skin
昌三尤去 chòu cau³ strong smelling; stinking
嗅,臭 xiù cau³ to smell
明三微去 wèi mei⁶ vị test; flavor
精開三鹽入 jiē zip³ tiếp to receive; to take
並三A脂去 bei⁶ the nose
見開一覃上 gǎn gam² cảm to touch; to influence; to excite
hào hou³ hao to waste; to destroy. a rat
消耗 to spend; to waste
常合三支去 shuì seoi⁶ to sleep
明四先平 mián min⁴ 면?, 민? miên to close the eyes; to sleep
定一東平 tóng tung⁴ đồng a boy under 15 years of age. a girl.
小時 an hour
定一模去 dou⁶ độ a measure; a limit. to cross over
定開一唐入 duó ? ? measure; calculate; plan; conjecture. [kroll]
幫二山入 baat³ bát eight.
2 changes: 1 addition & 1 deletion texts/brandt-ch10-2.passage.md
@@ -34,7 +34,7 @@ How could I for cupidity and for dread of death (lit. because I see the interest
吾不能生。而使公子獨死矣。
I cannot remain alive and let the prince die alone.

突遂與公子俱逃澤中
遂與公子俱逃澤中
And then she, holding the prince in her arms (lit. together with the prince), jumped into a pool.

秦軍見而射之。
42 changes: 24 additions & 18 deletions texts/brandt-ch10-2.vocab.tsv
@@ -1,20 +1,26 @@
# 魏 weit (512)-name of (403-241 feudal State В. С.) 129/520 200% + Vocabulary. 蓋 乳ju milk; to suckle. 乳母:ju-mu-a wet nurse. ch'in² name of a feudal State (897-221 P. C.) ho² an interrogative par- ticle,--why not? would it not be better to....? ying) (202)-proper; sui- table. Ouglht; must. Read ying to answer; to correspond; to fulfil. 0 must. Business; duty; function. 務 wut to be necessary; 田 weit to fear; to be dread- ed. 公子 kung-tzu-a son of a prince; a heir-apparent. 員匿 罪 tsui-a crime; Punishment. a 族 shang to bestow; to re- ward.. nit to hide; to abscond. 誅 chu² (499)-to punish; to put to death. sin. 廢義 fei-i-to neglect duty. the tsu-a tribe; a clan; a family. 詐 cha (150) to deceive; artful; false. 120 行詐 hsing-chat-to commit a treachery.. i (98)-the wings of a bird; to shelter; to as- sist. 翼 逃 t'aos (383) to flee; escape. to i-pi-to cover; to pro- tect. 池澤 时 tse-a marsh; a pool. 射shet-to shoot. chu-here: "to hit"; "to strike" shiha dart; an arrow. To take an oath.
Traditional Qieyun Hanyu Pinyin Jyutping Korean Vietnamese English
疑合三微去 wèi ngai⁶
日三虞上 jyu⁵
從開三眞平 qín ceon⁴
書開三陽上 shǎng soeng² thưởng
疑合三微去 wèi ngai⁶ name of a feudal State
日三虞上 jyu⁵ milk; to suckle
乳母 wet nurse
從開三眞平 qín ceon⁴ name of a feudal State
公子 son of a prince; heir-apparent
書開三陽上 shǎng soeng² thưởng to bestow; to reward
清開四先平 qiān cin¹ thiên
孃開三蒸入 nik¹ 닉?, 익? nặc
從合一灰上 zuì zeoi⁶ tội
從一東入 zuk⁶ tộc
影開三蒸平?, 影開三蒸去? yīng jing³ ứng
明三虞去 mou⁶ vụ
影合三微去 wèi wai³
知三虞平 zhū zyu¹ tru
莊開二麻去 zhà zaa³ trá
定開一豪平 táo tou⁴ đào
以開三麻去?, 船開三麻去?, 以開三清入?, 船開三清入? shè se⁶ xạ
以開三蒸入 jik⁶ dực
日開三脂去 èr ji⁶ nhì
書開三脂上 shǐ ci² ngao?, thỉ?
透合一魂入?, 定合一魂入? dat⁶ đột
孃開三蒸入 nik¹ 닉?, 익? nặc to hide; to abscond
從合一灰上 zuì zeoi⁶ tội crime; punishment
從一東入 zuk⁶ tộc tribe; clan; family
影開三蒸平?, 影開三蒸去? yīng jing³ ứng to answer; to correspond; to fulfil
明三虞去 mou⁶ vụ business; duty; function
影合三微去 wèi wai³ to fear; to be dreaded
知三虞平 zhū zyu¹ tru to punish; to put to death
廢義 to neglect the duty.
莊開二麻去 zhà zaa³ trá to deceive; artful; false.
行詐 to commit a treachery.
定開一豪平 táo tou⁴ đào to flee; to escape.
以開三麻去?, 船開三麻去?, 以開三清入?, 船開三清入? shè se⁶ xạ to shoot.
以開三蒸入 jik⁶ dực the wings of a bird; to shelter; to assist.
翼蔽 to cover; to protect.
日開三脂去 èr ji⁶ nhì two
書開三脂上 shǐ ci² ngao?, thỉ? a dart; an arrow. to take an oath.
