Walter Obweger committed Aug 28, 2020
1 parent 79c4890, commit 2e9ba1e
Showing 1 changed file with 35 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
hello I'm Walter, | ||
I was lucky enough to be assigned to the project Arthur Schnitzler Briefe, led by Martin. | ||
|
||
use the hashtag to explore more on Twitter. | ||
|
||
correspSearch dot net is a platform hosted by the | ||
Berlin-Brandenburg Academy of Sciences and Humanities. it is intended to facilitate correspondence search: who wrote a letter to whom, when and where. | ||
to be searchable there, the letter metadata has to be prepared in a TEI-based format called CMIF, | ||
the Correspondence Metadata Interchange Format. | ||
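to give an idea of what one CMIF entry holds, here is a minimal sketch in Python. it builds a single TEI correspDesc element with a sender, a recipient, a date and a place. the function name corresp_desc, its parameters and the sample values are hypothetical and only for illustration; a real CMIF file also needs the TEI header and namespace, and authority references (for example GND URIs) would go into the optional ref attributes.

```python
import xml.etree.ElementTree as ET


def corresp_desc(sender, recipient, when, place=None,
                 sender_ref=None, recipient_ref=None):
    """Build one <correspDesc> element (hypothetical helper, not an official API)."""
    desc = ET.Element("correspDesc")

    # who sent the letter, from where, and when
    sent = ET.SubElement(desc, "correspAction", {"type": "sent"})
    sender_el = ET.SubElement(sent, "persName")
    sender_el.text = sender
    if sender_ref:
        sender_el.set("ref", sender_ref)  # authority URI, e.g. a GND link
    if place:
        ET.SubElement(sent, "placeName").text = place
    ET.SubElement(sent, "date", {"when": when})  # ISO date, YYYY-MM-DD

    # who received the letter
    received = ET.SubElement(desc, "correspAction", {"type": "received"})
    recipient_el = ET.SubElement(received, "persName")
    recipient_el.text = recipient
    if recipient_ref:
        recipient_el.set("ref", recipient_ref)

    return desc


# sample values are made up for illustration
entry = corresp_desc("Arthur Schnitzler", "Example Recipient", "1895-03-12", place="Wien")
print(ET.tostring(entry, encoding="unicode"))
```

the sketch leaves out everything around the entry on purpose, so it shows only the "who wrote to whom, when and where" part.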
|
||
there is an online editor available, as you see here. | ||
GND (Gemeinsame Normdatei) identifiers and geo locations are supported and resolvable. | ||
|
||
the input data were printed letters, already scanned. | ||
the larger volumes contained an index in three columns, starting with recipient and dates. | ||
|
||
there are multiple approaches to the task of converting the index into a CMIF file. | ||
|
||
Martin inspired me to use the following workflow. | ||
|
||
first we used Transkribus to perform text recognition. | ||
it is important to mention that the text region recognition needed a little help; manual selection was unavoidable. | ||
|
||
now all lines of the index were in a text file, line by line. | ||
|
||
with OpenRefine this messy data was shaped into a table. | ||
during this process, problems in text recognition surfaced. | ||
some dates looked correct at first glance but could not be, because they were not in chronological order. | ||
|
||
to detect such problems, a small Python script was written in a Jupyter notebook. | ||
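the original notebook is not reproduced here. as a rough sketch of the idea, assuming the dates of one index column are already parsed and are expected to be in ascending order, a check could look like this. the function name find_out_of_order and the sample dates are hypothetical.

```python
from datetime import date


def find_out_of_order(dates):
    """Return the positions of dates that are earlier than the latest date seen so far."""
    suspects = []
    latest = None
    for position, current in enumerate(dates):
        if latest is not None and current < latest:
            # earlier than something before it: likely a recognition error
            suspects.append(position)
        else:
            latest = current
    return suspects


# made-up sample: the third date has two digits of the year swapped
index_dates = [
    date(1895, 3, 12),
    date(1895, 4, 2),
    date(1859, 4, 9),
    date(1895, 5, 1),
]
suspects = find_out_of_order(index_dates)
print(suspects)
```

a check like this only catches dates that jump backwards. a misread date that lands too far in the future would make the correct dates after it look suspect instead, so flagged rows still need a manual look.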
|
||
as a final step, the suspect dates were edited manually in OpenRefine | ||
and the table was shaped into a CMIF JSON file, which CMIF Creator can load. | ||
the GND references were adjusted manually by Martin. | ||
|