Hacker News

thadt today at 6:50 PM

AI has been super useful for processing historical data. I interviewed a volunteer last month from the diary archive in Germany, and they're using supervised AI for diary transcription. Going from (old) personalized handwritten script to text is a lot of work, even for experienced transcribers. Being able to automate the first pass of that has been a huge boon to their processing pipeline.


Replies

superxpro12 today at 7:48 PM

Can you go a bit deeper on this?

If the risk of mistranslation is high, I fail to comprehend how letting AI "take a swing at it" does not reduce the translation quality?

How are they ensuring no drop in translation quality?

jmyeet today at 8:27 PM

I hadn't considered or read about this problem before but it makes sense.

It reminds me of the cuneiform problem. Between 500,000 and 1 million tablets have been collected. This is one of the earliest preserved writing systems. Even so, fewer than 10% of these tablets have been translated. I was surprised to learn this but it makes sense. There are several problems:

1. Scribes used a lot of shorthand;

2. Cuneiform itself changed over time;

3. Writers would use multiple languages (e.g. Sumerian, Akkadian), even on the same tablet. There are relatively few people fluent in these languages, particularly in several of them at once;

4. To some extent the tablets are 3D, so a 2D photo might not be sufficient to translate them: you might need to physically turn the tablet to see the marks accurately; and

5. In some cases the tablets are incomplete or broken, so you may need to figure out how things fit together.

I wonder if AI can help make inroads into this 90%. I really wonder what is waiting to be unearthed.
