Text as Data in the Humanities (2020-01-28)

An introduction to computational text analysis for literature, with a basic overview of relevant software packages. This workshop is a primer for working with text as data in the humanities. It will cover gathering text corpora, data cleaning, an introduction to some computational software tools, reading and interpreting the output of topic modeling and cluster analysis, and a general overview of common questions asked in computational literary studies.

To register for this event, please use the following link:

Translating Questions into Actionable Research (2020-01-21)

Researchers are often driven by a hunch, a practical problem, or a gap in existing knowledge. However, successfully translating research questions into data collection and analysis methods requires skills and experience. This workshop will review commonly used methods for collecting primary source data (questionnaires, interviews, observations), as well as qualitative and quantitative approaches to data analysis. At the end of the workshop, participants will have a better understanding of methodological options and issues that affect their research inquiries.

An Introduction to Wikidata (2020-02-04)

This workshop has been cancelled due to unforeseen circumstances. Please accept our apologies for this late notice.

If Wikipedia aims to provide access to the sum of all human knowledge, Wikidata aims to structure it. The newest project of the Wikimedia movement, Wikidata is a collaboratively edited, free repository of linked open data that connects knowledge across all 301 language editions of Wikipedia and its sister projects. This workshop will introduce attendees to Wikidata and its applications to the digital humanities, with opportunities for hands-on editing.
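
As an illustration outside the workshop itself, Wikidata's structured statements can be retrieved programmatically through its public SPARQL query service. The Python sketch below only constructs the request URL rather than fetching it (a real request needs network access); the item Q42 (Douglas Adams) and property P569 (date of birth) are standard examples from Wikidata's own documentation.

```python
from urllib.parse import urlencode

# Wikidata's public SPARQL endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

# Ask for the date of birth (wdt:P569) of item Q42 (Douglas Adams),
# a conventional first example in the Wikidata documentation.
query = """
SELECT ?birth WHERE {
  wd:Q42 wdt:P569 ?birth .
}
"""

# Build the GET request URL; fetching it (e.g. with urllib.request)
# would return the results as JSON.
url = ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
print(url)
```

The same pattern works for any item or property, which is what makes Wikidata's linked data usable in digital humanities pipelines as well as in hands-on editing.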

Equipment Requirements: laptop

Critical Data Methods: Theory & Praxis (2020-01-17)

Whether in the classroom or archive, humanities scholars and students often encounter data methods as means to an end. Processes like data modeling, analysis, and visualization — sometimes represented by particular applications or technologies — populate the proverbial DH toolbox, equipping practitioners to pursue data-driven research and project-based learning curricula. But, while these data-oriented skills and tools frequently facilitate incredible research and classroom practice, they aren’t always accompanied by a robust critical framework that centers historical, ethical, and justice-oriented concerns.

In this workshop, we will approach basic concepts in data (including data taxonomies and applications) from a critical data studies perspective. Rather than taking a tool- or software-oriented approach, we will collaborate on ways to “do” and teach data that are informed by feminist, critical race, and indigenous theories of information. Keeping in mind this year’s theme — “Histories and Representations of Communities Across the Five Boroughs” — we will engage with local archival materials and other humanities content in order to develop data praxes that are situated and self-reflective.

Participants can expect to:

  • become familiar with types of data, including structured and unstructured data
  • think critically about ways to model their research or teaching data
  • begin to explore key theorists and concepts in critical data studies, including data feminism
  • participate in an exercise that enacts critical data pedagogy by bringing humanities methods to data modeling
  • situate their own use of data within historical and epistemological matrices
  • collaborate on a shared document featuring critical data resources

This workshop is designed for humanities scholars and students who are interested in pursuing data-driven work and who want to develop critical — rather than purely instrumental — data practices. Instructors and researchers who already work extensively with data are also welcome, regardless of discipline!

Equipment Requirements: Laptop recommended (Chromebooks OK)


Introduction to OpenRefine (2020-01-27)

OpenRefine is a popular open-source application for data analysis, cleanup, and enrichment. It can help you prepare your digital humanities dataset for further analysis and visualization through:

  • text filters and facets
  • batch editing
  • assisted clustering of terms
  • splitting and merging values
  • advanced transformations, such as regular expressions
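
As a rough illustration of what "assisted clustering of terms" does under the hood, the Python sketch below approximates OpenRefine's key-collision ("fingerprint") clustering idea: variant spellings that reduce to the same normalized key are grouped together. The keying steps here are a simplification for illustration, not OpenRefine's actual implementation.

```python
import re
import unicodedata
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Simplified fingerprint key: trim, lowercase, strip accents and
    punctuation, then sort the unique tokens."""
    value = value.strip().lower()
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s]", "", value)      # drop punctuation
    tokens = sorted(set(value.split()))         # unique tokens, sorted
    return " ".join(tokens)

# Hypothetical messy values of the kind a DH dataset might contain.
names = ["New York, NY", "new york ny", "NY, New York", "Boston, MA"]

clusters = defaultdict(list)
for name in names:
    clusters[fingerprint(name)].append(name)

print(dict(clusters))
# The three New York variants collapse into one cluster; Boston stays separate.
```

In OpenRefine itself, this happens interactively: the tool proposes the clusters and you decide which variants to merge.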

It also allows you to export your operation history, which helps with research reproducibility.

In this beginner-level workshop, we’ll cover the basic features and functionalities of OpenRefine, with a taste of more advanced operations using GREL (General Refine Expression Language) and regular expressions. We’ll also have a discussion about how data cleaning fits into our digital humanities work, inspired by Katie Rawson and Trevor Muñoz’s article “Against Cleaning.”

Equipment Requirements: If possible, please bring your own laptop. Some laptops will be available to borrow on site. If you will be using your own laptop, please install the latest stable version of OpenRefine ahead of time. Workshop datasets will be made available for download closer to the date of the sessions.

OpenRefine for Beginners (2020-01-27)

Looking to organize and rearrange a large spreadsheet for a project? Join us for an interactive, step-by-step introduction to OpenRefine, an open-source desktop application described as “a powerful tool for working with messy data.” This session will cover OpenRefine basics, including editing and reconciling data, transforming data into different formats, and connecting to external data sources like Wikidata.

Equipment Requirements: Participants will need to bring a laptop with OpenRefine 3.3 and Google Chrome installed. We can guide participants through installation at the beginning of the workshop if needed. Sample data sets will be provided.

Using IMDb as a Dataset for Digital Humanities (2019-01-23)

Cindy Conaway, an associate professor in Media Studies and Communication, and Diane Shichtman, an associate professor in Information Systems, both at SUNY Empire State College, will discuss using the Internet Movie Database (IMDb) as a dataset for Digital Humanities, including its advantages and challenges. In many ways IMDb is an excellent source for Digital Humanities projects and gives media studies scholars a new way to use Digital Humanities methods: the organization makes a great deal of its very robust data freely available for download. However, much of IMDb’s data is inconsistent, incomplete, and often wrong or misleading, and the downloadable information is limited to certain categories. The presentation will also discuss the challenges of interdisciplinary work, and how changes in IMDb’s processes over several years and the differing views available to scholars can create further issues, as the presenters have found in their project tracing connections using the show Seinfeld.

Requirements: none.

Sustaining and Growing your DH Projects (2017-01-22)

What does it take for a DH project to go from concept to community treasure? While some DH projects are purely experimental, many project leaders are eager to see their work grow and develop over time and become useful to a significant community of scholars and students.

This workshop will introduce digital project leaders to the basics of dynamic sustainability, the notion that for a project to continue to grow and develop over time, its leaders must create and encourage an ongoing cycle of support. Using examples of success stories from the field, the session will offer an outline of some practical steps you can take to develop a reliable sustainability model, exploring the role of audience, the host institution, and the potential for a range of revenue sources. Participants at all stages of work, from developing proposals to running established projects, are encouraged to attend.

Workshop leader Nancy Maron is author of Sustaining the Digital Humanities, Guide to the Best Revenue Models and Funding Sources for your Digital Resources, and several other reports and case studies concerning strategies for DH support. Prior to founding BlueSky to BluePrint, she led the Sustainability and Scholarly Communications team at Ithaka S+R. She currently serves as President of the Board of the Yonkers Public Library.

Dealing with Messy Data Using OpenRefine and Other Tools (2017-01-15)

The raw data received or compiled for an analysis project is often messy, inconsistent, or in the wrong format. Learn how to use OpenRefine and Microsoft Excel to transform data into the structure you need to conduct analysis and successfully complete your project.
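
To illustrate the kind of transformation involved, here is a small, hypothetical Python sketch (not part of the workshop materials) that normalizes inconsistently formatted dates, a very common source of messiness, into a single ISO structure, leaving unrecognized values untouched for manual review.

```python
from datetime import datetime

# Formats we expect to encounter in the raw column (an assumption for
# this example; a real dataset would dictate its own list).
FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%b %d, %Y", "%B %d, %Y"]

def to_iso(raw: str) -> str:
    """Try each known format in turn; return ISO 8601 on success,
    or the original value unchanged so it can be reviewed by hand."""
    raw = raw.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return raw

# Hypothetical messy input column.
dates = [" 2017-01-15", "1/15/2017", "Jan 15, 2017", "mid-January"]
cleaned = [to_iso(d) for d in dates]
print(cleaned)
# → ['2017-01-15', '2017-01-15', '2017-01-15', 'mid-January']
```

OpenRefine and Excel offer point-and-click equivalents of this pattern, but the underlying idea is the same: make values consistent without silently discarding the ones you cannot interpret.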

Equipment Requirements
Laptop with OpenRefine installed