Note to all attendees: Session leaders will contact you with additional information, including a meeting link, for each individual workshop, event, or demonstration. 

Building a Text Analysis Pipeline with Python

Pace University, 1 Pace Plaza, E101 1 Pace Plaza, New York

This workshop will show participants how to use Python and the Natural Language Toolkit to load a plaintext document, split it into paragraphs/sentences/words, and retrieve dictionary headwords and part-of-speech information for the words in the document. We will then create charts and visualizations for the feature counts. LEVEL: Beginner/Intermediate NOTES: Bring personal laptop; required […]
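As a rough illustration of the load/split/count steps described above, here is a minimal sketch that uses only the Python standard library. It is not the workshop's material: the function names and the sample text are hypothetical, and the regular-expression splitting is a crude stand-in for NLTK's tokenizers (`nltk.sent_tokenize`, `nltk.word_tokenize`). Headword lookup and part-of-speech tagging are left out because they need NLTK's downloadable data.

```python
import re
from collections import Counter


def split_paragraphs(text):
    """Split a plaintext document on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph):
    """Naive sentence split: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def split_words(sentence):
    """Lowercased alphabetic tokens; an apostrophe inside a word is kept."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", sentence.lower())


def word_counts(text):
    """Walk paragraphs, then sentences, then words, and count each word."""
    counts = Counter()
    for paragraph in split_paragraphs(text):
        for sentence in split_sentences(paragraph):
            counts.update(split_words(sentence))
    return counts


# Made-up sample text, standing in for a loaded plaintext file.
sample = "The cat sat. The cat slept!\n\nA dog barked at the cat."
counts = word_counts(sample)
print(counts.most_common(2))
```

The resulting `Counter` is the kind of feature count that could then be passed to a charting library.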

Free, 25 spots left

Thinking Through Word Embeddings

Babble Lab @ Pace University, Room 1105 163 William St., New York, NY, United States

Word embeddings are a family of algorithms that can be remarkably effective at representing the meanings of words, and their relationships to each other. We'll cover the basics of word embeddings: what they do, how to train a model using word2vec, and how to use them to search for synonyms and analogies. And we'll look […]
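To make the analogy search mentioned above concrete, here is a small sketch of the usual vector-arithmetic idea: find the word whose vector lies closest to b - a + c. The three-dimensional vectors are invented for illustration only; a real model, such as one trained with word2vec, learns vectors with hundreds of dimensions from a corpus. The `analogy` helper is a hypothetical name, not part of any library.

```python
import math

# Hypothetical toy vectors, hand-made for this example (not learned from data).
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by finding the word nearest b - a + c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "king", "woman"))  # prints "queen" for these toy vectors
```

Excluding the three input words from the candidates is a common convention, since the nearest vector to the target is often one of the inputs.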

Free, 15 spots left

Analyzing Twitter Data for Beginners

Fordham Lincoln Center, Quinn Library Room 234 113 W 60th Street, New York, NY, United States

Interested in analyzing conversations on Twitter but don’t know where to start? This workshop will demonstrate how to use TAGS <https://tags.hawksey.info/get-tags/>, an open source tool developed by Martin Hawksey to collect and visualize Twitter data as it happens. Aimed at novice users, this session will experiment with small datasets generated from Twitter conversations under specific […]

Free, 20 spots left

Social Media Scraping for Qualitative Research

Bobst Library, NYU, Room 617 70 Washington Square South, New York, NY, United States

Interested in incorporating social media content into your qualitative research project? This workshop will introduce the basics of using small-scale web scraping of social media for qualitative analysis. Using NCapture, a web browser extension, and NVivo, a qualitative analysis software package, this session will focus on methods to incorporate the context from web pages, online […]

Free, 20 spots left

R for Text Analysis

Studio@Butler 535 W. 114th St., New York, NY, United States

In this workshop, we will use R for text analysis, with a focus on the tidy text approach of the tidytext package. Your insights will be visualized and can also be turned into an interactive application, without any web coding skills, using Shiny for R. The workshop is open to anyone with an interest in this topic. […]

Free, 25 spots left

Word Embeddings: Can Vectors Encode Meaning?

Columbia University, CEPSR, Room 620 530 West 120th Street, New York, NY, United States

Word embeddings, or vector representations of words, are commonly used in computer science to work with and analyze text. They are particularly useful as a powerful off-the-shelf tool when using open-source word embeddings previously generated by Google, Facebook, or other technology companies based on web crawls. We present the background and justifications for using vectors […]

Free, 15 spots left

Text as Data in the Humanities

Bobst Library, NYU, Room 617 70 Washington Square South, New York, NY, United States

An introduction to text analysis for literature with a foundational overview of considerations for approaching computational text analysis in the humanities. This workshop will cover a) gathering a text corpus, b) copyright considerations, c) data cleaning, d) an introduction to computational software tools, and e) reading the output and analysis that may include word frequencies, cluster […]

Free

Advanced Topics in Word Embeddings

Studio@Butler 535 W. 114th St., New York, NY, United States

Word embeddings are the hottest new technology in natural language processing, and are used across linguistic computer science, from machine translation to information extraction and computational literary analysis. We will cover advanced topics in word embeddings, including: document similarity analysis, nearest-neighbor analysis, training vector spaces, and visualization. We will use literary texts as examples, but […]
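One common way to approach the document similarity and nearest-neighbor analysis listed above is to average the word vectors in each document and compare the averages by cosine similarity. The sketch below assumes that approach; it is not necessarily the method used in the workshop. The two-dimensional vectors, the document names, and the helper names are all hypothetical.

```python
import math

# Hypothetical toy word vectors; real ones would come from a trained model.
toy_vectors = {
    "sea": [0.9, 0.1],
    "ship": [0.8, 0.2],
    "whale": [0.85, 0.15],
    "love": [0.1, 0.9],
    "heart": [0.2, 0.8],
    "letter": [0.3, 0.7],
}

# Made-up "documents", already tokenized.
docs = {
    "voyage": ["sea", "ship", "whale"],
    "romance": ["love", "heart", "letter"],
    "storm": ["ship", "sea", "the"],
}


def doc_vector(tokens):
    """Average the vectors of known tokens; tokens without a vector are skipped."""
    known = [toy_vectors[t] for t in tokens if t in toy_vectors]
    if not known:
        raise ValueError("no token in this document has a vector")
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nearest(name):
    """Return the name of the other document most similar to the given one."""
    query = doc_vector(docs[name])
    others = [n for n in docs if n != name]
    return max(others, key=lambda n: cosine(query, doc_vector(docs[n])))


print(nearest("storm"))  # prints "voyage" for these toy vectors
```

Averaging discards word order, so this is best read as a coarse first pass over a corpus.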

Free

Intro to the Command Line

Bobst Library, NYU, Room 619 70 Washington Square S, New York, NY, United States

Learn how to use the command line to perform basic tasks. We’ll begin by discussing why humanists would want to learn something so technical, then jump into learning how to create and edit files and directories. Knowledge of the command line can be applied in many contexts, including several of the other workshops offered this […]

Free

What matters to your Congressperson?

Bobst Library, NYU, Room 619 70 Washington Square S, New York, NY, United States

What topics most preoccupy your member of Congress? Are those the sorts of things you prioritize? In this workshop, users will learn how to navigate a database of Congress-to-constituent e-newsletters and how to perform text analyses in R to get a top-level picture of what members of Congress most focus on in […]

Free

NLP for non-data scientists – Event Extraction

Columbia (Butler Library room 208B) 535 West 114th St, New York, NY, United States

The amount of text data available is mind-boggling. We will explore programmatic approaches to identify information about what happened and when it happened by gathering knowledge from text. Equipment: Python, Anaconda, laptop. Prerequisites: working familiarity with Python.
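As a very simplified illustration of pairing "when" with "what happened", the sketch below pulls dates written like "March 3, 1918" out of a text and keeps the sentence each date appears in. It is an assumption-laden toy, not the workshop's approach: the date pattern covers one format only, the sentence splitting is naive, and the sample text and the `extract_events` helper are made up for this example.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Matches dates such as "March 3, 1918"; other date formats are not handled.
DATE_PATTERN = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")


def extract_events(text):
    """Return (date, sentence) pairs for every sentence that mentions a date."""
    events = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for date in DATE_PATTERN.findall(sentence):
            events.append((date, sentence))
    return events


# Made-up sample text.
text = (
    "The library opened on March 3, 1918. Attendance was low at first. "
    "A second branch opened on April 9, 1921."
)
for date, sentence in extract_events(text):
    print(date, "->", sentence)
```

Sentences without a recognizable date are dropped, which is why the middle sentence of the sample produces no event.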

Free

Text as Data in the Humanities

Bobst Library, NYU, Room 617 70 Washington Square South, New York, NY, United States

An introduction to computational text analysis for literature with a basic introduction to software packages. This workshop is a primer for working with text as data in the humanities. This workshop will cover: gathering text corpora, data cleaning, an introduction to some computational software tools, reading the output and analysis of topic modeling and cluster analysis, […]

Introduction to WebAnno

Studio@Butler 535 W. 114th St., New York, NY, United States

WebAnno is a web-based tool for linguistic annotation (marking up) of text, with layers for morphological, syntactic, and semantic annotation. We will work through tagging named entities and relationships in a text, exporting as a tab-delimited file, and using the annotated text as input into a (Python) machine-learning algorithm for named entity recognition. Equipment Requirements: […]

Free

The Making and Knowing Project’s Digital Critical Edition and English Translation of a 16th-c. Manuscript of Artisanal Recipes

Columbia University, Fayerweather Hall, Room 513 1180 Amsterdam Avenue, New York, NY, United States

The Making and Knowing Project (Center for Science and Society, Columbia University) is excited to present Secrets of Craft and Nature in Renaissance France, a digital critical edition and English translation of a sixteenth-century French manuscript of artisanal recipes. The publication of this edition marks the culmination of over five years of iterative, collaborative, and interdisciplinary work by […]

Free

Starting to Text Mine the Digitized Library with HathiTrust Features.

Pace University, Babble Lab, Rm. 202 41 Park Row, New York, NY, United States

Millions of books have been digitized in the past two decades. Thanks to a 2014 court ruling, about 15 million books are available for computational analysis in the HathiTrust, including data about word counts on each individual page. In the next year or two, similar data will become available for JSTOR and Portico books. This […]

Free

FairCopy: A word processor for the digital humanities.

Virtual NY, United States

FairCopy is a simple and powerful tool for transcribing, editing, and studying manuscripts and historical texts. FairCopy gives humanists an editor to create TEI encoded texts without writing a single line of XML, so this rich format becomes accessible for everyone. Nick Laiacona will demonstrate the use of this new tool and its functionality. The […]

Free

Build Your Own Text-as-Data Corpus: A Print-to-Bytes Primer

Virtual NY, United States

This hands-on workshop will teach participants how to construct their own digital text corpus for conducting humanities data analysis. We'll cover simple tools for turning printed texts in a variety of languages into computer-readable files, the use of Optical Character Recognition (OCR) software, and consider helpful tools for post-process correction of digitized texts. We’ll also […]

Brooklyn College Covid-19 Archive @ A Journal of the Plague Year

Virtual NY, United States

This digital archive has collected stories and experiences from the Brooklyn College community related to the Covid-19 pandemic. The archive resides within the larger, omnibus archive, A Journal of the Plague Year. This demonstration will review the principles that guided the project and the submission process, and explore possible digital humanities projects based upon the archive […]

Free

Textual Corpus Creation with Corpus-DB

Online New York, NY, United States

In this workshop, participants will learn how to set up a text analysis project, by automatically assembling a large collection of text, using the Corpus-DB API. Corpus-DB allows digital humanities researchers to quickly assemble a textual corpus, according to publication date, literary genre, author, and more. We will generate corpora which may be of interest […]

Free

Text Analysis with a Zine Corpus

Online New York, NY, United States

Working with transcribed zines from the Barnard Zine Library, we will engage participants in the ethics and steps of creating a corpus and show how to explore it using Voyant-Tools and a pre-written Python script. Corpus metadata highlight zine creators holding one or more minoritized identities. All are welcome, and no coding experience is necessary. This […]

Free