OpenRefine is a popular open-source application for data analysis, cleanup, and enrichment. It can help you prepare your digital humanities dataset for further analysis and visualization through:
- text filters and facets
- batch editing
- assisted clustering of terms
- splitting and merging values
- advanced transformations, such as those using regular expressions
It also allows you to export your operation history, which helps with research reproducibility.
In this beginner-level workshop, we’ll cover the basic features of OpenRefine, with a taste of more advanced operations using GREL (General Refine Expression Language) and regular expressions. We’ll also discuss how data cleaning fits into our digital humanities work, inspired by Katie Rawson and Trevor Muñoz’s article “Against Cleaning” (http://curatingmenus.org/articles/against-cleaning/).
Equipment Requirements: If possible, please bring your own laptop. Some laptops will be available to borrow on site. If you bring your own, please install the latest stable version of OpenRefine (available at http://openrefine.org/download.html) ahead of time. Workshop datasets will be made available for download closer to the date of the sessions.