BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//NYCDH Week - ECPv6.15.17//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:NYCDH Week
X-ORIGINAL-URL:https://nycdh.org/dhweek
X-WR-CALDESC:Events for NYCDH Week
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20160313T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20161106T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20170312T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20171105T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20180311T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20181104T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20190310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20191103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20200308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20201101T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20210314T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20211107T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20220313T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20221106T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20210211T180000
DTEND;TZID=America/New_York:20210211T200000
DTSTAMP:20260601T041316
CREATED:20210119T180721Z
LAST-MODIFIED:20210208T193202Z
UID:5428-1613066400-1613073600@nycdh.org
SUMMARY:Build Your Own Text-as-Data Corpus: A Print-to-Bytes Primer
DESCRIPTION:This hands-on workshop will teach participants how to construct their own digital text corpus for humanities data analysis. We’ll cover simple tools for turning printed texts in a variety of languages into computer-readable files\, the use of Optical Character Recognition (OCR) software\, and helpful tools for post-processing correction of digitized texts. We’ll also look at open-access text-as-data sources available through simple browser-based API calls. The workshop is geared toward digital humanists who need to assemble text data that are not yet compiled or in computer-readable form\, and who are looking for an introduction to the workflows and software suited to building the research materials needed for analysis. We’ll learn how to use Tesseract\, an open-source OCR engine\, consider the anatomy of an hOCR file (the output of OCR efforts)\, and deploy techniques for extracting structured information from a page. \nComputer with a text editor installed\, such as BBEdit\, TextWrangler\, Atom\, Notepad++\, or the like\, plus administrator access to install open-source software (Tesseract).
URL:https://nycdh.org/dhweek/event/build-your-own-text-as-data-corpus-a-print-to-bytes-primer/
LOCATION:Virtual\, NY\, United States
CATEGORIES:2021,Text Analysis,WIDH2021
ORGANIZER;CN="Nicholas Wolf":MAILTO:nicholas.wolf@nyu.edu
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20170208T080000
DTEND;TZID=America/New_York:20170208T100000
DTSTAMP:20260601T041316
CREATED:20170111T235354Z
LAST-MODIFIED:20170112T011215Z
UID:364-1486540800-1486548000@nycdh.org
SUMMARY:Using Optical Character Recognition (OCR) to Build a DH Corpus
DESCRIPTION:Students will learn how to use common OCR software\, including Tesseract and ABBYY FineReader\, to build the text corpora they need for common DH methods such as text mining\, topic modeling\, bibliographic visualizations\, and text-as-data analyses. \nSkill Level\nBeginner \nPrerequisites\nNone \nEquipment Requirements\nNone
URL:https://nycdh.org/dhweek/event/using-optical-character-recognition-ocr-to-build-a-dh-corpus/
LOCATION:Bobst Library\, NYU\, Room 617\, 70 Washington Square South\, New York\, NY\, 10012\, United States
CATEGORIES:Beginner,Digital Humanities
ORGANIZER;CN="Nicholas Wolf":MAILTO:nicholas.wolf@nyu.edu
GEO:40.7294345;-73.9972124
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-ADDRESS=Bobst Library NYU Room 617 70 Washington Square South New York NY 10012 United States;X-APPLE-RADIUS=500;X-TITLE=70 Washington Square South:geo:40.7294345,-73.9972124
END:VEVENT
END:VCALENDAR