CANARIE Awards up to $3.2M in Funding to 9 Research Teams to Develop Research Data Management Software Tools

The ability to reuse research data helps accelerate discovery, allows for reproducibility of scientific results, and maximizes return on investment of research funding

Funded software tools aim to strengthen Canada’s capacity to manage research data

[Ottawa, ON]

CANARIE, a vital component of Canada’s digital infrastructure supporting research, education and innovation, today announced the nine successful recipients of its Research Data Management (RDM) funding call, issued in May 2018. This new funding will allow research teams to develop software components and tools that help Canadian researchers adopt best practices in managing data resulting from scientific research.

Data management practices impact the entire research lifecycle, from project planning and execution, to backing up data as it is created and used, and finally to its long-term preservation after the investigation is complete. RDM best practices help ensure the protection of data during the research lifecycle and beyond, and help meet the increasingly stringent requirements of research ethics and reproducibility.

The RDM stakeholder community’s broad engagement in CANARIE’s January 2018 consultation identified the priorities of this funding call.

“Canada produces some of the world’s best science. To stay competitive, our researchers need access to advanced computing and big data resources. Thanks to CANARIE, research teams across our country are developing new ways to access, store and share the massive amounts of data needed to make the life-changing discoveries that benefit all Canadians,” said the Honourable Kirsty Duncan, Minister of Science and Sport.

“CANARIE is proud to support the contribution these teams will make in strengthening Canada’s capacity to effectively manage research data,” said Mark Wolff, Chief Technology Officer at CANARIE. “Effective RDM practices not only maximize investments in science but can have a profound impact on accelerating discovery by simplifying access to data generated from scientific research.”

This funding is part of the Government of Canada’s $105 million investment supporting CANARIE through its 2015-2020 mandate.

Research Teams Awarded Funding

The following research projects will receive funding through this call. These projects contribute to the priorities identified by the RDM stakeholder community: enriching [meta]data and discovery, federated repositories / interoperability, domain-specific repositories, data deposit and curation, preservation, persistent IDs / citability, data access and analytics, and data privacy and security. They also support the FAIR Principles: Findability, Accessibility, Interoperability, and Reusability of research data.

  • Canadian Health Omics Repository, Distributed (CanDIG CHORD) – Led by Dr. Guillaume Bourque, McGill University

CanDIG is a national project that allows collaborative analysis of human health genomics data distributed across the country, giving stewards of this data complete, auditable control over data access. The CHORD project will create a federated Canadian national data service for privacy-sensitive genomic and related health data. It will also broaden the Canadian health research community’s access to the technologies and services being built by CanDIG and its international partners in the Global Alliance for Genomics and Health.

  • Dataverse for the Canadian Research Community – Led by Kate Davis, University of Toronto

Dataverse (DV) is an open-source research data repository platform, developed by Harvard University’s Institute for Quantitative Social Science with adopters and contributors from Canada, the US, and Europe. It was originally architected to serve the needs of social science researchers with small to medium-sized data files. This project will adapt Dataverse’s software architecture to address the needs of a broad range of researchers in Canada through improved scalability, support for large data files, curation workflows, and integration with Canadian storage and authentication providers.

Canadian researchers have access to many storage services suitable for the long-term preservation of digital content, including research data. The DuraCloud project will connect several Canadian preservation storage services via the DuraCloud software, which is maintained by the DuraSpace Foundation. As a result, Canadian researchers will be able to seamlessly access different storage services through a single interface.

  • FAIR Repository for Annotations, Corpora and Schemas (FRACS) – Led by André Lapointe, CRIM

Artificial intelligence-based applications require access to massive quantities of data. To enable Canada’s academic researchers to scale their AI-based projects such that they are competitive with private sector applications, large volumes of data must be coupled with detailed annotations. Annotated datasets allow models to be effectively trained and validated by machine learning algorithms.

The FRACS project will simplify the management of large-scale datasets by facilitating the creation, storage, search, manipulation and sharing of their annotations.

  • Federated Geospatial Data Discovery for Canada – Co-led by Eugene Barsky, Evan Thornberry, and Paul Lesack, University of British Columbia Library

Traditionally, research data repositories have relied on text-based searching. However, there is increasing demand for geographic components in research, examples of which include migration paths, the distribution of agricultural yields, infrared satellite imagery, the distribution of artifacts in an archaeological site, and the flow routes of water. The goal of this project is to create an extensible, open-source software method to search and discover Canadian geospatial research data using an interface specifically designed for maps, enabling users to discover geospatial resources in a more spatially intuitive way.

  • Making Identifiers Necessary to Track Evolving Data (MINTED) – Led by Reyna Jenkyns, Ocean Networks Canada (ONC), University of Victoria

ONC operates world-leading ocean observatories and dynamic data repository services. While there has been growing recognition of the benefits of and need for data citations, made evident by the introduction of the FAIR Principles, existing platforms and tools are currently only able to serve the needs of static or infrequently updated datasets.

The MINTED project will integrate best practices for dynamic dataset citation, Digital Object Identifiers (DOIs), and researcher ORCIDs into ONC’s Oceans 2.0 digital infrastructure.

  • Radiam: Management Software for Active Research Data – Led by Dr. Kevin Schneider, University of Saskatchewan

Research data, which may have value beyond the research for which it was collected, is often distributed across multiple storage devices, tools, and platforms. Simply knowing that a dataset exists, let alone finding it, presents a significant challenge. Radiam will provide a project-level metadata index of research data, regardless of where or how it is stored. Radiam will improve researchers’ ability to find and cite existing datasets by storing not only the location of the data but also the standard and custom metadata records associated with it.

  • Managing the Research Data Lifecycle using Islandora – Co-led by Donald Moses and Rosemary Le Faive, University of Prince Edward Island (UPEI)

In collaboration with Simon Fraser University and the Islandora Foundation, UPEI will build research data management capacity and integrations using the latest version of Islandora, also known as CLAW. Islandora is an open-source software framework designed to help organizations collaboratively manage, discover, and share digital assets using a best-practices, standards-based approach. The project will develop integrations with identifier, metadata, authentication, storage, and dissemination systems, supporting the FAIR principles and the research data lifecycle.

  • Research Portal for Secure Data Discovery, Access and Collaboration – Co-led by Dr. Elizabeth Theriault, Ontario Brain Institute, and Moyez Dharsee, Indoc Research

The Ontario Brain Institute (OBI) and Indoc Research have developed Brain-CODE, an extensible neuroinformatics platform designed to manage the collection, curation, analysis and sharing of different data types across several brain disorders.

To address the RDM needs of researchers studying disorders of the brain and other disease areas, this project will develop data portal software that will enable research teams to securely and seamlessly capture, query, and visualize patient data; collaborate and share datasets; and access support and training resources. The project will serve the needs of teams using Brain-CODE as well as those from collaborating institutions and the broader medical research community.

The projects funded through this call are on track to be completed before April 2020.

For more information, please contact:

Ela Yazdani
Director, Communications
CANARIE
[email protected] | 613-943-5432

About CANARIE

CANARIE strengthens Canadian leadership in science and technology by delivering digital infrastructure that supports world-class research and innovation.

CANARIE and its twelve provincial and territorial partners form Canada’s National Research and Education Network. This ultra-high-speed network connects Canada’s researchers, educators and innovators to each other and to global data, technology, and colleagues.

Beyond the network, CANARIE funds and promotes reusable research software tools and national research data management initiatives to accelerate discovery, provides identity management services to the academic community, and offers advanced networking and cloud resources to boost commercialization in Canada’s technology sector.

Established in 1993, CANARIE is a non-profit corporation, with the majority of its funding provided by the Government of Canada.

For more information, please visit: www.canarie.ca