The Project
Working definition of bias:
Bias (n.): disproportionate weight in favour of or against an idea or thing
Biased datasets: collections of data that disproportionately represent certain perspectives or groups
Mission
Our mission is to promote ethical and responsible data practices in the humanities and social sciences by developing comprehensive guidelines for creating datasets that are not only technically sound but also fair, transparent, inclusive, and respectful of diverse perspectives and experiences.
We believe this research is relevant and necessary, as epistemic systems and data creation are deeply intertwined. The increasing use of data in the social sciences and humanities (SSH) has allowed for more diversity in approaches to research questions and has broadened the range of questions that can be asked. However, historical data used in SSH research is by nature pervaded by biases: it is shaped by the author’s positionality and context, which produces problematic language and categorisations. The complexity of this issue in SSH lies in the fact that these skewed perspectives cannot simply be replaced with less biased alternatives (a ‘rewriting’ of the archives). Instead, we believe it is the responsibility of dataset creators to critically reflect on the biases of the source data, and to acknowledge, contextualise and inform users about them. Combatting Bias underlines the necessity of responsible and transparent strategies to navigate these biases. Thus, we aim to create lasting and flexible handholds that guide dataset creators through this process.
We will therefore produce a set of guidelines for data creation that we call FAIR+. These guidelines will be rooted in the existing FAIR principles, which are aimed at developing a technically sustainable data creation process, and will combine them with crucial ethical considerations in the field of SSH, which have become more pronounced as the field increasingly uses digital infrastructures to support and enable research. The guidelines encourage the creation of documentation and datasheets that critically and transparently reflect on our data and data curation practices. In line with the FAIR+ guidelines, we will additionally create reusable templates for ethical data documentation. Through this, we hope not only to establish an ethical and sustainable framework for future data collection and curation, but also to encourage the collection of different, less represented data.
Network of expertise
Combatting Bias is a collaborative project. We are grateful to work closely with four partner projects and ten advisors, who will shape our guidelines through insights drawn from their experience and expertise in dataset creation, (digital) heritage, decolonisation, fair data, and DEI. This wide range of perspectives will enable us to develop comprehensive FAIR+ guidelines that effectively address data bias in SSH. The project itself is also embedded in two research institutes: the Huygens Institute and the International Institute of Social History (IISH).
Objectives
In short, the objectives of the project are fourfold:
- Identify biases: critically reflect on existing datasets and the dataset creation process. This leads to insights into what is disproportionately represented in datasets.
- Mitigate biases: develop guidelines to aid researchers in creating ethical and sustainable datasets through transparent, contextual and FAIR documentation.
- Promote: actively encourage the use of dataset creation guidelines in SSH research and the active (re)consideration of certain concepts (such as gender and ethnicity) used in SSH datasets. Additionally, encourage the collection of data that captures underrepresented perspectives, in order to mitigate biases.
- Build a network: exchange knowledge and experiences with experts from different fields within SSH.