Wednesday, November 26, 2014

Extreme badness of computer-retyped scanned text

Note: I am not speaking against scanned images, but against scans where the computer produces text rather than images. In Project Runeberg's version of Nordisk Familjebok, I have no problem reading the text from the scanned images, but more than once I have had to correct the text produced beneath them. Here is a typical and pretty bad example. First a Bible text in English translation, then the scan-produced text, then the real Latin words:

  • If the world hates you, remember it hated me first.
  • Si imtndus vos odil scitofeqnia mcpviorem Kobis odio halmit
  • Si mundus vos odit scitote quia me priorem vobis odio habuit.
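
To put a rough number on how bad the scan-produced line is, one could count how many single-character corrections it takes to turn it into the real Latin. The following is only a minimal sketch, assuming that plain edit distance (Levenshtein distance) is a fair measure of the damage. The function name `levenshtein` and the variable names are made up for this illustration and have nothing to do with whatever software produced the scan.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the part of a seen so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


# The two lines quoted above: the scan-produced text and the real words.
ocr_text = "Si imtndus vos odil scitofeqnia mcpviorem Kobis odio halmit"
real_text = "Si mundus vos odit scitote quia me priorem vobis odio habuit."

distance = levenshtein(ocr_text, real_text)
error_rate = distance / len(real_text)
print(f"{distance} character corrections needed, "
      f"about {error_rate:.0%} of the real text's length")
```

The same function could be run over any pair of scan-produced and corrected lines, which would let one compare how badly different scans came out.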

Lettres apostoliques de S. S. Léon XIII, tome 7

