Computer assisted enhancement of the reproducibility of qualitative data analysis in the social sciences

Principal Investigator: Martin Hájek

in cooperation with the Institute of Philosophy, Czech Academy of Sciences (Radim Hladík)

Team members: Martin Hájek, Radim Hladík, Nina Fárová, Michael Škvrňák

Application Guarantor: Faculty of Social Sciences, Charles University


The project aims to strengthen the reproducibility of qualitative data analysis (QDA) in social science research and thus increase its social accountability. It responds to the so-called reproducibility crisis, which stems from the low robustness of published scientific results. Social science disciplines that work with QDA offer limited opportunities to validate results, which reduces their credibility in the eyes of other researchers, providers, and evaluators. To achieve its objective, the project will develop and test a tool for computer-assisted QDA with built-in elements that support reproducibility. It will combine insights from the philosophy of science and qualitative social science methodology with techniques for the computer analysis of unstructured data.

The effort to increase the reproducibility of research is not a new idea; it has been with science since its beginnings. Our project is novel in that it focuses on qualitative data analysis, an area that has resisted traditional efforts to increase "reliability" because quantitative research standards cannot be mechanically transferred to it. Previous attempts have therefore led to the development of new analytical procedures, of which Grounded Theory (GT) has become a classic. With some exaggeration, it can be said that we are attempting a software update of GT with built-in support for the reproducibility of the analysis. It will be necessary to rethink how to preserve the hermeneutic principles of qualitative analysis and open coding while finding a way to control the quality of their application to specific data. The result will therefore not be a new method but a new computer-assisted protocol with built-in elements that enhance reproducibility. These elements will not be new in themselves; they will build on existing techniques of computer text analysis and natural language processing. However, these existing practices will be innovatively modified to check reproducibility by creating new indicators (e.g. intercoder agreement assessment, coding density evaluation). Another innovative feature will be the evaluation of the gender neutrality of coding, which has never been implemented in QDA software before and which, in the case of team coding, will assess whether coding procedures are comparable across coders of different genders. To date, qualitative research has used digital methods mostly passively, because it has only translated manual procedures into digital form. Our solution offers a proactive approach that leverages digital affordances (possibilities for action) to innovate qualitative methods in ways that are not feasible through traditional routes.
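To make the idea of reproducibility indicators more concrete, the sketch below shows one common way an intercoder agreement score and a simple coding density could be computed. It is only an illustration and not the project's tool: the use of Cohen's kappa, the definition of coding density as the share of segments with at least one code, and the names `cohen_kappa` and `coding_density` are all assumptions introduced here.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each assigned one code per segment.

    Assumption: kappa stands in for the "intercoder agreement" indicator;
    the project text does not say which statistic it uses.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of segments")
    n = len(codes_a)
    # Share of segments on which both coders chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's code frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical code everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


def coding_density(segment_codes):
    """Share of segments that received at least one code (hypothetical definition)."""
    if not segment_codes:
        raise ValueError("no segments given")
    coded = sum(1 for codes in segment_codes if codes)
    return coded / len(segment_codes)


# Made-up example data: four text segments coded by two coders.
coder_a = ["trust", "trust", "risk", "risk"]
coder_b = ["trust", "risk", "risk", "risk"]
kappa = cohen_kappa(coder_a, coder_b)

# Made-up example data: the codes attached to each of four segments.
density = coding_density([["trust"], [], ["risk", "trust"], []])
```

In this made-up example the coders agree on three of four segments, while chance agreement is 0.5, so kappa is 0.5; two of the four segments carry a code, so the density is 0.5. A protocol of the kind described could report such values alongside the coded data so that other researchers can judge how consistently the codes were applied.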