Automated computer vision methods and tools offer new ways of analysing audio-visual material in the realm of the Digital Humanities (DH). While these tools have produced promising results in some applications, basic challenges remain, such as algorithmic bias and a lack of transparency, so the tools must be used carefully if they are to be applied productively and responsibly. When it comes to the socio-technical understanding of computer vision tools and methods, a major unit of sociological analysis, attentiveness, and access for configuration (for both computer vision scientists and DH scholars) is what computer science calls “ground truth”. The ground truth specifies the template or rule to follow, e.g. what an object looks like. This article aims to provide DH scholars with knowledge about how automated tools for image analysis work and how they are constructed. Based on these insights, the paper introduces an approach called “active learning” that can help to configure these tools in a more adaptive and user-centered way, so that they fit the specific requirements and research questions of the DH. We argue that both objectives need to be addressed, as this is necessary for a successful implementation of computer vision tools in the DH and related fields.

Digital Humanities, Computer Vision, Image Understanding, Machine Learning, Ground Truth Generation, Explainable Artificial Intelligence, Active Learning
Netherlands Institute for Sound and Vision
dx.doi.org/10.18146/2213-0969.2018.jethc153
VIEW Journal
Authors who publish with this journal agree to the following terms: Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access).
VIEW Journal of European Television History and Culture; Vol 7, No 14 (2018): Audiovisual & Digital Humanities; 59-72

Musik, Christoph, & Zeppelzauer, Matthias. (2018). Computer Vision and the Digital Humanities: Adapting Image Processing Algorithms and Ground Truth through Active Learning. VIEW Journal, 7(14), 59–72. doi:10.18146/2213-0969.2018.jethc153