The Language Data Commons of Australia (LDaCA) aims to ensure long-term access to language data collections for analysis and reuse. The sustainable management of these significant collections of intangible cultural heritage, and access to them, is underpinned by two complementary sets of guiding principles for data management and stewardship: the FAIR and CARE principles.
FAIR Principles
The FAIR principles were first published in 2016 by a group of stakeholders representing academia, industry, funding agencies and scholarly publishers. The principles aim to address issues surrounding data management and stewardship, focusing on four areas whose initials form the FAIR acronym.
Findable
Metadata and data should be easy to find for both humans and computers. Making the data findable includes:
- (Meta)data are assigned a globally unique and persistent identifier
- Data are described with rich metadata
- Metadata clearly and explicitly include the identifier of the data they describe
- (Meta)data are registered or indexed in a searchable resource
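As a rough illustration of the points above, the following Python sketch checks a metadata record for the properties that support findability. It is only a sketch under stated assumptions: the field names (`identifier`, `data_identifier`, `name`, `description`), the example DOI and the function `findability_problems` are all hypothetical and do not reflect any particular LDaCA schema or tool.

```python
# Hypothetical metadata record; every value here is illustrative only.
record = {
    # Persistent identifier for the metadata record itself (made-up DOI).
    "identifier": "https://doi.org/10.1234/example-metadata",
    # Identifier of the data that this metadata describes (made-up DOI).
    "data_identifier": "https://doi.org/10.1234/example-collection",
    "name": "Example language collection",
    "description": "Audio recordings and transcripts (illustrative).",
}

# Assumed minimum set of fields; a real schema would define its own.
REQUIRED_FIELDS = ("identifier", "data_identifier", "name", "description")


def findability_problems(rec):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not rec.get(field):
            problems.append(f"missing field: {field}")
    # Treat an identifier as resolvable only if it is an HTTP(S) URL.
    for field in ("identifier", "data_identifier"):
        value = rec.get(field, "")
        if value and not value.startswith(("https://", "http://")):
            problems.append(f"{field} is not a resolvable URL")
    return problems


print(findability_problems(record))
```

A check like this covers only the first three points; registering the record in a searchable index is a separate step that depends on the repository in use.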
Accessible
Once users find the data they need, they must know how the data can be accessed, which may involve authentication and authorisation.
- (Meta)data are retrievable by their identifier using a standardised communications protocol
- The protocol is open, free and universally implementable
- The protocol allows for an authentication and authorisation procedure, where necessary
- Metadata are accessible, even when the data are no longer available
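The points above can be sketched with HTTPS, which is one open and widely implemented protocol. The Python example below builds (but does not send) a request for a record by its identifier, with an optional authorisation header. The function name `build_metadata_request`, the bearer-token scheme, the `application/json` response format and the example DOI are assumptions made for illustration; an actual repository may use different conventions.

```python
from urllib.request import Request


def build_metadata_request(identifier, token=None):
    """Build an HTTP GET request for the record behind a resolvable identifier.

    `token` is a hypothetical access token; it is only attached when the
    resource requires authentication and authorisation.
    """
    headers = {"Accept": "application/json"}  # assumed response format
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return Request(identifier, headers=headers, method="GET")


# Open record: no credentials needed (made-up DOI).
open_request = build_metadata_request("https://doi.org/10.1234/example-collection")

# Restricted record: the same protocol, with credentials attached.
restricted_request = build_metadata_request(
    "https://doi.org/10.1234/example-collection", token="example-token"
)
```

The same request shape serves both open and restricted records, so access control is layered on the protocol rather than requiring a different retrieval mechanism.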
Interoperable
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage and processing.
- (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
- (Meta)data use vocabularies that follow FAIR principles
- (Meta)data include qualified references to other (meta)data
Reusable
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well described so that they can be replicated, combined in different settings, or both.
- (Meta)data are richly described with a plurality of accurate and relevant attributes
- (Meta)data are released with a clear and accessible data usage license
- (Meta)data are associated with detailed provenance
- (Meta)data meet domain-relevant community standards
The Australian Research Data Commons (ARDC) and LDaCA support FAIR data practices and initiatives that make data and related research outputs FAIR. At the same time, the ARDC and LDaCA acknowledge that the implementation of the principles will look different across disciplines and will need discipline-specific approaches and standards.
CARE Principles
The CARE Principles for Indigenous Data Governance were developed by the Global Indigenous Data Alliance (GIDA); they are a response to the FAIR principles and aim to complement them. GIDA highlights how the FAIR principles and the open data movement focus on increasing data sharing among researchers and entities but do not take into account power differentials and historical contexts or fully engage with Indigenous Peoples’ rights and interests. These include the rights to generate value from Indigenous data in ways that are grounded in Indigenous worldviews and to advance Indigenous innovation and self-determination. The CARE Principles also focus on four areas.
Collective Benefit
Data ecosystems shall be designed and function in ways that enable Indigenous Peoples to derive benefit from the data.
- For inclusive development and innovation
- For improved government and citizen engagement
- For equitable outcomes
Authority to Control
Indigenous Peoples’ rights and interests in Indigenous data must be recognised and their authority to control such data be empowered.
- Recognizing rights and interests
- Data for governance
- Governance of data
Responsibility
Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous Peoples’ self-determination and collective benefit.
- For positive relationships
- For expanding capability and capacity
- For Indigenous languages and worldviews
Ethics
Indigenous Peoples’ rights and wellbeing should be the primary concern at all stages of the data life cycle and across the data ecosystem.
- For minimizing harm and maximizing benefit
- For justice
- For future use
The Australian Research Data Commons (ARDC) and LDaCA support the CARE principles to further extend data management principles, ensuring that Indigenous communities benefit from the data and that authority to control Indigenous data is held by Indigenous Peoples. We have a responsibility to share how data are used to collectively benefit Indigenous Peoples, and to ensure that Indigenous Peoples’ rights and wellbeing are the primary concern at all stages of the data life cycle.