Using Optical Character Recognition (OCR) to Build a DH Corpus
Bobst Library, NYU, Room 617, 70 Washington Square South, New York, NY, United States

Students will learn how to use common OCR software, including Tesseract and ABBYY FineReader, to build the text corpora they need for common DH methods such as text mining, topic modeling, bibliographic visualizations, and text-as-data analyses.

Skill Level: Beginner
Prerequisites: None
Equipment Requirements: None