Handwritten Books to Be Rescued from Obscurity by AI
Of the 53 miles of shelving in the Vatican Secret Archives, only a few documents are available to scholars because the ancient script is so difficult to scan. Soon, AI will change that.
According to a story in The Atlantic magazine, a project called In Codice Ratio is using AI and optical character recognition (OCR) software to decode ancient texts that even modern readers would find challenging to decipher. The project is organized by a team of university professors in Italy and draws on labor from college and high school students. If the technology works, other long-unseen texts from around the world will become accessible to the public in libraries.
Traditional OCR works only on typeset text: the lack of uniform spacing in handwritten documents makes it devilishly hard, so to speak, for computers to “read” the text. AI can solve that problem by reading common words in medieval Latin through a method called “jigsaw segmentation,” which divides words into vertical and horizontal bands.
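To make the idea of cutting a word into bands more concrete, here is a minimal Python sketch. It is not the project's actual code: it assumes a word image is a small grid of 0s and 1s (1 meaning ink), counts the ink in each column, and treats columns where the ink dips to a local minimum as candidate cut points. The names `column_ink`, `cut_columns`, `vertical_bands`, and the toy image `WORD` are all made up for illustration.

```python
from typing import List

Image = List[List[int]]  # rows of pixels: 1 = ink, 0 = blank (toy assumption)


def column_ink(image: Image) -> List[int]:
    """Count the ink pixels in each column of the image."""
    return [sum(column) for column in zip(*image)]


def cut_columns(profile: List[int]) -> List[int]:
    """Return interior columns where the ink count is a local minimum."""
    cuts = []
    for x in range(1, len(profile) - 1):
        if profile[x] <= profile[x - 1] and profile[x] < profile[x + 1]:
            cuts.append(x)
    return cuts


def vertical_bands(image: Image, cuts: List[int]) -> List[Image]:
    """Slice the image into vertical bands at the given cut columns."""
    bounds = [0] + cuts + [len(image[0])]
    return [[row[a:b] for row in image] for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy word: two thick strokes joined by one thin connecting line.
WORD: Image = [[int(c) for c in row] for row in ["11011", "11111", "11011"]]

profile = column_ink(WORD)
cuts = cut_columns(profile)
bands = vertical_bands(WORD, cuts)
print(profile, cuts, [len(band[0]) for band in bands])
```

Running the same count over rows instead of columns would give horizontal bands. A real system would work on grayscale scans and far messier strokes, so this only illustrates the general principle.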
Students “taught” the software by scanning images of various medieval letters; the AI then made educated determinations based on word knowledge and context. After testing and adjusting, the software identified 96% of handwritten letters correctly.
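One way to picture how word knowledge can correct a shaky letter-by-letter reading is the following Python sketch. It rests on assumptions not found in the article: that a recognizer outputs a confidence score for each candidate letter at each position, and that a list of known Latin words is available. The scores, the word list `LEXICON`, and the function names are invented for illustration.

```python
from math import prod
from typing import Dict, List

# Hypothetical list of known words the software could check readings against.
LEXICON = ["anno", "amen", "deus", "unus"]

# Made-up recognizer output: one {letter: confidence} map per letter position.
scores: List[Dict[str, float]] = [
    {"a": 0.9, "o": 0.1},
    {"u": 0.6, "n": 0.4},  # "u" and "n" look alike in many medieval hands
    {"n": 0.8, "u": 0.2},
    {"o": 0.7, "a": 0.3},
]


def word_score(word: str, scores: List[Dict[str, float]]) -> float:
    """Multiply the per-letter confidences for a candidate word."""
    if len(word) != len(scores):
        return 0.0
    return prod(s.get(ch, 0.0) for ch, s in zip(word, scores))


def best_word(scores: List[Dict[str, float]], lexicon: List[str]) -> str:
    """Pick the known word that best agrees with the letter scores."""
    return max(lexicon, key=lambda w: word_score(w, scores))


# Reading each letter in isolation takes the top-scoring letter every time.
greedy = "".join(max(s, key=s.get) for s in scores)
corrected = best_word(scores, LEXICON)
print(greedy, corrected)
```

Here the letter-by-letter reading gives "auno", which is not in the word list, while checking against known words recovers "anno". This is only a sketch of the general idea of using context, not a description of how In Codice Ratio works internally.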
Read more at The Atlantic.