iCRF Generator Extended Edition - Opening Interoperable Science (iCRF-OS)
dr. J.A.M. Beliën – AMC
Team member: Sander de Ridder (AMC)
Interoperability of clinical data can be achieved by using international thesauri, such as SNOMED CT, when setting up case report forms (CRFs). This is, however, complicated and time-consuming. With the iCRF Generator, standardised data definitions, such as the Clinical Building Blocks, can be reused to generate CRFs for various electronic data capture systems. The data then collected is interoperable with other data that uses these standard definitions. In this project, the iCRF Generator will be expanded with: 1) ODM export, the international standard for clinical trials; 2) support for alternative codebook sources; 3) an improved user interface for codebook selection.
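The planned ODM export can be pictured with a minimal sketch. Everything below is illustrative only: the element names (ODM, Study, MetaDataVersion, ItemDef) follow the CDISC ODM convention, but the OIDs and the item are invented, and the real export will be far richer and schema-validated.

```python
# Illustrative sketch of an ODM-style XML fragment for one CRF item, built
# with the standard library. OIDs and the item itself are invented examples.
import xml.etree.ElementTree as ET

odm = ET.Element("ODM")
study = ET.SubElement(odm, "Study", OID="ST.EXAMPLE")
mdv = ET.SubElement(study, "MetaDataVersion", OID="MDV.1", Name="Demo CRF")
# One item definition: a systolic blood pressure field of type integer.
ET.SubElement(mdv, "ItemDef", OID="IT.SYSBP",
              Name="Systolic blood pressure", DataType="integer")

xml_text = ET.tostring(odm, encoding="unicode")
print(xml_text)
```

Because the structure is plain XML, any EDC system that understands ODM can, in principle, consume such a file.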
Case-study Research & Data Reuse (CaRe & DaRe)
prof. J.J. Berends – VU
Team members: Fleur Deken (VU); Kacana Khadjavi Pour (VU); Ricarda Braukmann (DANS); Freek Dijkstra (SURF)
Qualitative case-studies are widely used in many fields, but the reuse of qualitative case study data has received little attention in open science recommendations. Qualitative case study data is seldom reused due to privacy and confidentiality restrictions. In collaboration with DANS and SURF, this project develops a novel decentralized procedure for the reuse of qualitative case study data, which maintains sovereignty over data, and makes data FAIR without open sharing. It will be tested in a pilot with data collected by different researchers on ‘fieldlabs’ as an innovation instrument. Thus, the project advances open science in hitherto neglected areas.
Q2O: Closing the gap between questionable and open research practices using a metacognitive tool.
prof.dr. A.B.H. de Bruin – UM
Team members: Wisnu Wiradhany (RUG); Farah Djalal (KU Leuven)
There is limited evidence on the efficacy of open science research practices (ORP) to minimize questionable research practices (QRP), two related practices that might involve different cognitive processes. Can we close the gap between QRP and ORP using metacognitive interventions, which have been shown to improve behaviour calibration in educational settings? In the first study, we aim to investigate the (meta)cognitive processes that are involved when researchers engage in ORP and QRP. In the second study, we will use this information to develop a metacognitive tool aimed at boosting commitment to ORP and lowering QRP.
Next stage development of SciPost’s publishing infrastructure (SciPostPI)
dr. J.S. Caux – UvA
Team members: Paula Perez (SciPost); Sergio Tapias Arze (SciPost); Jan Willem Wijnen (SciPost)
Based at the Institute of Physics (UvA), the SciPost team is developing a non-profit open science publishing platform. Governed by open science principles, SciPost journals publish open access, without paywalls or costs to authors. SciPost currently publishes five journals in physics and is now expanding to other disciplines. Working with several scientific communities in parallel necessitates further development of its IT infrastructure and web application. This project will upgrade SciPost’s infrastructure, allowing it to work with multiple teams employing parallel editorial processes and to deal with growing numbers of authors and manuscripts. Subsequently, more research communities can publish their own open science journals.
The anatomical connectome of the brain
dr. N.L.M. Cappaert – UvA
Team member: Niels van Strien (UvA)
Much research time has been devoted to determining anatomical interconnectivity of the brain. However, effective utilization of the resulting publications is hampered by the large number of available studies, by the lack of standardization in anatomical nomenclature and by the use of neuro-anatomical jargon. In this project we will assemble brain connectivity data from these publications in a peer-reviewed, public database containing brain connections using standardized terminology and related metadata. This information will be made accessible through a web-based connectome. This will help brain scientists create a better understanding of fundamental brain processes and diseases.
WikipediaCitations: A FAIR dataset of Wikipedia’s citations to its sources
dr. G. Colavizza – UvA
Team members: Ludo Waltman (LU); Nees J. van Eck (LU); Silvio Peroni (University of Bologna); Leila Zia (Wikimedia Foundation Research)
Wikipedia is an essential component of the open science ecosystem, yet it is poorly integrated with academic open science initiatives. We propose to create WikipediaCitations: a FAIR dataset of citations from Wikipedia to all its sources. Citations will be enriched with permanent identifiers and citation statements and ingested as linked data in OpenCitations. WikipediaCitations will include a documented codebase to replicate and expand upon results. This project promises to advance the open science agenda by contributing high-quality citation data useful in ‘altmetrics’ and a variety of third-party applications, and in researching and improving the reliability of Wikipedia’s contents.
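The core extraction task behind such a dataset can be sketched as pulling identifiers out of citation templates in wikitext. The sketch below uses a bare regular expression on the common {{cite …}} template convention; the project itself will need a far more robust parser, and the example text is invented.

```python
# Toy sketch: extract DOIs from {{cite ...}} templates in a wikitext snippet.
# The real pipeline must handle nested templates, named refs, and many more
# identifier types (ISBN, PMID, arXiv, ...).
import re

wikitext = (
    "Some claim.<ref>{{cite journal |title=Example |doi=10.1000/xyz123 }}</ref> "
    "Another claim.<ref>{{cite book |title=No DOI here}}</ref>"
)

# Match a doi= field inside a citation template.
dois = re.findall(r"\{\{cite [^}]*?\bdoi\s*=\s*([^|}\s]+)", wikitext)
print(dois)  # ['10.1000/xyz123']
```

Each extracted identifier can then be enriched (e.g. resolved against Crossref) before being ingested as linked data.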
A metaDAta Publication Toolbox (ADAPT): An open-source toolbox to bridge the gap between research communities and data repositories
prof.dr. M.R. Drury – UU
Team members: Richard Wessels (UU); Lora Armstrong (TUD); Madeleine de Smaele (4TU.ResearchData); Otto Lange (UU)
Researchers publish increasing amounts of data in archives called data repositories. It is important that the right descriptive terms (metadata) are added to datasets, allowing data to be found and reused. Data repositories typically allow the addition of generic metadata, but cannot be easily adapted to accommodate discipline-specific metadata schemas and vocabularies developed by international research communities. We propose to develop an open-source toolbox that helps both researchers and data managers to easily assign discipline-specific and internationally recognized metadata to their data publications. This will significantly increase the potential for future re-use of data.
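The gap such a toolbox bridges can be sketched as a simple schema check: a repository accepts generic metadata, while a discipline demands extra fields. Everything below is a minimal illustration with invented field names ("mineral_phase", "sample_locality"), not an existing standard or the toolbox's actual design.

```python
# Minimal sketch: check a dataset description against a discipline-specific
# metadata schema. Required-field sets and the record are invented examples.
REQUIRED_GENERIC = {"title", "creator", "license"}
REQUIRED_GEOLOGY = {"mineral_phase", "sample_locality"}

def missing_fields(metadata, discipline_fields):
    """Return the required fields absent from a metadata record, sorted."""
    required = REQUIRED_GENERIC | discipline_fields
    return sorted(required - metadata.keys())

record = {"title": "Olivine deformation dataset", "creator": "M. Drury",
          "license": "CC-BY-4.0", "mineral_phase": "olivine"}
print(missing_fields(record, REQUIRED_GEOLOGY))  # ['sample_locality']
```

A real toolbox would draw the discipline-specific field sets from community vocabularies rather than hard-coding them.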
Open Science Escape Room (OSER)
dr. A. Eerland - RU
Team members: Karin Fikkers (UU); Victor van Doorn (Sherlocked); Francine Boon (Sherlocked)
To establish a real culture change in science it is crucial to engage as many scholars as possible. Even though Open Science initiatives spring up like mushrooms, very few of them target scholars with little to no experience or affinity with Open Science. This is not surprising because these scholars are difficult to reach and motivate for Open Science. Yet they are needed to make Open Science the norm. This project will introduce these scholars in a playful way to the different aspects of Open Science through an online escape room and involve them in the discussion on its implementation.
dr.ir. S. Faez – UU
Team members: Maarten Voors (WUR); Antonio Forner Cuenca (TUE); Yali Tang (TUE)
To combat climate change and meet the goals of the Paris Climate Agreement, an unprecedented rapid shift to sustainable electrification is expected to happen in the next decade. This requires urgent action to increase energy generation and especially storage capacity. No single technology can scale up so quickly and at such a large scale without decentralized open innovation and rapid knowledge diffusion. We aim to set up a common knowledge pool and to build and train an expert community to enable rapid global development of open-hardware electrochemical batteries, by focusing on a versatile and locally resourced platform of redox-flow batteries.
Integration of interactive research environments to data repositories to facilitate FAIR data management practices: JupyterFAIR
dr. S. Girgin – UT
Team members: Connie Clare (4TU.ResearchData); Jose Urra Llanusa (TUD); Manuel Garcia Alvarez (TUD)
Many researchers use virtual research environments, such as JupyterLab, in which substantial amounts of data are produced throughout the research lifecycle. However, data publishing and sharing typically happen only at the end of the research, and shared data often lack important metadata, mainly due to the need for manual input. This project aims to develop and operationalize a tool (JupyterFAIR) for 'one-click', seamless integration of research environments and data repositories, including metadata transfer and data quality checks. The tool will significantly decrease the manual intervention needed to archive research data and promote more frequent data sharing in line with the FAIR principles.
Strengthening the foundations of Open Science practices at the Amsterdam Science Park Study Group
dr.ir. J. Goedhart - UvA
Team members: Marc Galland (UvA); Stacy Shinneman (UvA); Like Fokkens (UvA)
The Amsterdam Science Park Study Group is a local community of life scientists who share their expertise, organize training and promote good practices in data-related topics such as data analysis, programming and data management. Engaging early-career researchers through peer-to-peer mentoring and training is a valuable approach to fostering a cultural change toward Open Science. This proposal aims to expand our current activities and to upscale our community in order to increase the impact of the Amsterdam Science Park Study Group on Open Science practices in the life sciences.
3DWorkSpace - an open science/interactive tool for 3D datasets
dr. J.L. Hilditch – UvA
Team members: Jitte Waagen (UvA); Leon van Wissen (UvA); Loes Opgenhaffen (UvA); Hugo Huurdeman (UvA)
New tools are urgently needed for 3D datasets to improve accessibility, facilitate engagement/interaction with the datasets and promote two-directional knowledge transfer. 3DWorkSpace will adapt the open source Voyager 3D digital museum curation tool suite (Smithsonian Institution) to promote interactive engagement with traditionally complex digital datasets. Embedded structured guidance/training for gaining competence and skills for interpreting 3D datasets will allow broader narratives to be generated and open up new avenues for knowledge publication through the creation of annotated personal 3D collections that can be tailored to specific learning goals or interests.
Raincloudplots 2.0: A robust and transparent data visualisation tool
prof. R. Kievit – RUMC
Team members: Jordy van Langen (RU); Micah Allen (Aarhus University); Eric-Jan Wagenmakers (UvA)
Good communication of scientific findings relies on good data visualization. However, classic approaches such as barplots often obfuscate, rather than illustrate, the underlying data. To meet this challenge, we created ‘Raincloud plots’: a statistically robust, scientifically transparent, reproducible and aesthetically pleasing framework for data visualization. Raincloud plots show raw data, distributional characteristics, and summary statistics in a single, uncluttered plot. They are available in multiple languages and have reached users across disciplines including psychology, neuroscience, chemistry, meteorology, and philosophy. Their success has led to increasing demand for new features. In this project, we will improve our framework and expand our audience through workshops, software expansions, and the integration of raincloud plots within existing open-source statistical software (JASP), improving the transparency, accessibility, and statistical robustness of data visualization across a wide range of scientific fields.
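As a rough illustration of what a raincloud plot combines (the "cloud" showing the distribution, the "rain" of raw data points, and a boxplot summary), here is a hand-built sketch using matplotlib on synthetic data; the project's own packages and APIs differ.

```python
# Hand-built raincloud-style plot on synthetic data; a sketch, not the
# project's implementation.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=1.2, size=100)

fig, ax = plt.subplots()

# "Cloud": a half violin showing the distribution.
violin = ax.violinplot(data, positions=[0], vert=False, showextrema=False)
for body in violin["bodies"]:
    verts = body.get_paths()[0].vertices
    verts[:, 1] = np.clip(verts[:, 1], 0, None)  # keep only the upper half

# "Rain": jittered raw data points just below the cloud.
jitter = rng.uniform(-0.25, -0.05, size=data.size)
ax.scatter(data, jitter, s=8, alpha=0.5)

# Summary: a slim boxplot below the raw data points.
ax.boxplot(data, positions=[-0.35], vert=False, widths=0.08, showfliers=False)

fig.savefig("raincloud.png")
```

The appeal of the format is that no single layer is trusted alone: the raw points keep the summary honest, and the distribution keeps the raw points readable.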
FAMTAFOS: Free automated multi-language text anonymization for open science
dr. B. Kleinberg – UvT
Team members: Maximilian Mozes (University College London)
Text data are increasingly used across research disciplines. However, ethical and data protection requirements impede open science practices and data flow between researchers and stakeholders. This project uses machine learning and natural language processing methods to replace sensitive information from text data automatically and render them anonymous. We will develop an existing prototype further into a free, locally usable desktop app that allows users to anonymize English and Dutch texts at scale while maintaining complete control over their data. This project will be a step towards anonymized and shared text data by default.
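The actual tool will rely on machine learning and natural language processing; the sketch below shows only the simplest pattern-based step of such a pipeline, replacing e-mail addresses and phone-like numbers with placeholder tags. The sample text is invented.

```python
# Pattern-based redaction sketch: substitute e-mail addresses and phone-like
# digit runs with placeholders. Real anonymization needs NLP to catch names,
# locations, and other indirect identifiers.
import re

def redact(text):
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\+?\d[\d \-]{7,}\d", "[PHONE]", text)
    return text

sample = "Contact Jan at jan.jansen@example.nl or +31 6 12345678."
print(redact(sample))  # Contact Jan at [EMAIL] or [PHONE].
```

Replacing rather than deleting preserves the sentence structure, which matters when the anonymized text is itself the research data.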
Fair Metrics for FAIR Software
dr. A.L. Lamprecht – UU
Team members: Michelle Barker (Research Software Alliance); Jonathan de Bruin (UU); Carlos Martinez-Ortiz (NLeSC); Jurriaan Spaaks (NLeSC)
Open Science entails that all research resources, including software, should be FAIR (findable, accessible, interoperable, reusable). This comes with a need to assess the ‘FAIRness’ of software. To foster the cultural change towards Open Science, it is vital that such metrics incentivize developers to create FAIRer software. In this project, we will evaluate suggested FAIRness metrics for software while focusing on their incentivization potential. We will then develop recommendations for the use of FAIRness metrics and automated assessment tooling. They will enable easy and transparent FAIRness assessment, and stimulate software-developing researchers of all disciplines to create FAIRer software.
Using semantic modeling to create FAIR open data for archaeological field survey: a showcase and toolkit (SEMAFORA)
dr. P.M. van Leusen – RUG
Team members: Tymon de Haas (LU); George Bruseker (Takin.solutions); Denitsa Nenova (Takin.Solutions); Sjoerd Siebinga (Delving.eu)
Field surveys have, since about 1970, been the main method by which archaeologists discover and record findspots and individual finds at the earth’s surface. Whilst for the Mediterranean area alone the documented finds already run into the millions, the lack of documentation standards effectively prevents researchers and heritage managers from conducting large-scale analyses. This project seeks to build and showcase a software toolkit that will allow them to share and query this fundamental and irreplaceable resource in a distributed, online form, taking advantage of existing work in so-called ‘semantic’ data modelling in the cultural heritage sector.
Leiden FAIR variation datapoint: Developing a FAIR LOVD (LFVD)
dr. M. Roos - LUMC
Team members: Ivo Fokkema (LUMC); Rajaram Kaliyaperumal (LUMC); Núria Queralt-Rosinach (LUMC); Peter-Bram ‘t Hoen (RUMC)
The Leiden Open Variation Database (LOVD) is an online database containing information on patients with genetic diseases. LOVD forms a worldwide network of unique, high-quality information on genetic variants and their association with disease from 1,500,000 individuals. LOVD databases are browsed yearly by 200,000 institutes and receive hundreds of millions of requests from computer programs, but are not yet prepared for advanced, secure applications where multiple sources are queried at once (federated translational analytics, AI, machine learning). This project makes LOVD automatically Findable, Accessible, Interoperable, Reusable, and ‘computer-understandable’ (FAIR) to work with national and global FAIR infrastructure.
ChemEng KG – The Chemical Engineering Knowledge Graph
dr. A.M. Schweidtmann – TU Delft
Team member: Christoph Lange (Aachen University)
Flowsheet simulations are crucial for (bio-)chemical process development. However, no public database for flowsheet simulation files exists, which is a major hurdle, as knowledge from earlier simulations is not easily findable, accessible, interoperable, or reusable. We envision establishing an open-source knowledge graph database for flowsheet simulation data that is FAIR and linked to leading initiatives in the open science community. The “ChemEng KG” will accelerate process development in academia and industry. In addition, it will pave the way for automated process design through optimization and machine learning.
Tidystats: A reference manager for statistics
dr. W.W.A. Sleegers – UvT
Statistics are fundamental in the social sciences. Statistics underlie the description of relationships, tests of hypotheses, and serve as input for meta-analyses. Yet, statistics are often reported incorrectly and incompletely. tidystats, a reference manager for statistics, is a solution to help researchers improve how they report statistics. Similar to how reference managers are used to report citations, researchers can use tidystats to easily export and archive statistics, and dynamically insert their statistics into text editing software. This enables researchers to share and report all statistics from the many analyses they conduct, reproducibly and without error.
Common Language for Accessibility, Interoperability, and Reusability in Historical Demography (CLAIR-HD)
dr. R.J. Stapel - KNAW/IISG
Team members: Rick Mourits (KNAW/IISG); Bram van den Hout (KNAW/IISG)
One of the biggest challenges in the transition to open science is making data interoperable. Normally, ontologies and vocabularies are used to describe data, but these are generally problematic for historians as existing ontologies and vocabularies are insensitive to temporal variations. Within history, the subdiscipline of historical demography is a forerunner in dealing with this problem, as it studies large-scale reconstructions of populations and life courses. Historical demographers have designed their own ontologies and vocabularies to standardize historical data. We aim to gather these schemes, so that we can standardize existing insights into a common language for historical (demographic) data.
Umbrella HEART NL (Umbrella HEreditary heARt disease daTabase of the Netherlands)
dr. M.A. Swertz – RUG
Team members: Imke Christiaans (UMCG); Michelle Michels (ErasmusMC); Annette Baas (UMCU); Mariëlle van Gijn (UMCG)
Early recognition and treatment of inherited cardiac conditions, both of patients and of symptom-free relatives, can save lives. To enable successful prevention and treatment, genetic data and long-term clinical outcomes of large patient groups need to be studied. However, currently in the Netherlands, this data is spread over many different hospitals and registries and not automatically extracted from patient records. We aim to develop a national database that automatically extracts data and acts as an umbrella resource for existing data(bases) used in research and diagnostics, stimulating the combination and re-use of data to improve cardiogenetic patient care.
ShareTrait: a data portal for making trait data interoperable and reusable
dr. W.C.E.P. Verberk – RU
Team members: Matty Berg (VU); Jacintha Ellers (VU)
More and more data are being collected on species traits, as a means to understand how species respond to and interact with their environment. Despite these efforts, effective integration of such data is hampered by inadequate standardization and insufficient sharing of metadata. This project will streamline the collection, synthesis and reuse of data on key species traits (energy metabolism, development and fecundity). By developing pipelines for data reformatting and standardization, we will enable individual researchers to easily contribute their data to our new database, allowing the research community to tap into the wealth of existing data and achieve a synthesis.
Journal Observatory – Toward integrated information about the openness of scholarly journals
dr. L.R. Waltman – LEI
Team members: Nees Jan van Eck (LU); Tony Ross-Hellauer (Graz University of Technology); Serge Horbach (Aarhus University)
Many efforts are being made to promote open science practices in scholarly publishing. However, information on the openness of scholarly journals is highly fragmented. There are various data sources that provide information on specific aspects of openness, but there is hardly any integration of these data sources. We propose to create a Journal Observatory, an infrastructure that brings together journal information from different data sources and that makes this information openly available through a single interface. The Journal Observatory will help researchers to adopt open science practices and funders and other organizations to check compliance with open science policies.
Automatic Detection of Identifiers in Open Data
prof. dr. J.M. Wicherts – UvT
Team members: Chris Hartgerink (Liberate Science GmbH); Richard Klein (UvT)
Privacy breaches pose a major risk in the dissemination of rich datasets in the medical, social, and behavioural sciences, particularly when the data involve sensitive information. Here, we develop and validate an open tool called Automatic Detection of Identifiers in Open Data (ADIODA) allowing researchers in these fields to proactively and readily identify information in datasets that could (inadvertently) be used to (re-)identify individuals. ADIODA will be an open tool that can be easily implemented in research workflows, data audits, and editorial procedures to help protect the privacy of participants whose information is used in openly shared datasets.
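One detection heuristic that such a tool might include can be sketched simply: a column whose values are (almost) all unique can act as an identifier on its own. The check below is only an illustration; the real tool will combine many validated checks, and the column names are invented.

```python
# Illustrative heuristic: flag columns where the share of distinct values
# exceeds a threshold, since such columns can single out individuals.
def flag_identifier_columns(rows, threshold=0.9):
    """Return the columns whose fraction of distinct values exceeds threshold."""
    flagged = []
    for col in rows[0]:
        values = [row[col] for row in rows]
        if len(set(values)) / len(values) > threshold:
            flagged.append(col)
    return flagged

data = [
    {"participant_id": "P001", "age_group": "30-39", "zipcode": "1011AB"},
    {"participant_id": "P002", "age_group": "30-39", "zipcode": "2511CV"},
    {"participant_id": "P003", "age_group": "40-49", "zipcode": "6525EC"},
]
print(flag_identifier_columns(data))  # ['participant_id', 'zipcode']
```

Uniqueness alone is not sufficient (combinations of quasi-identifiers also re-identify), which is why a validated, multi-check tool is needed.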
BridgeDb and Wikidata: a powerful combination generating interoperable open research
dr. E.L. Willighagen – UM
Team members: Denise Slenter (UM); Martina Kutmon (UM); Marvin Martens (UM)
Like humans have a unique social security number and different phone numbers from various providers, so do proteins and metabolites have a unique structure but different identifiers from various databases. BridgeDb is an interoperability platform that allows combining these databases, by matching database-specific identifiers. These matches are called identifier mappings, and they are indispensable when combining experimental (omics) data with knowledge in reference databases. BridgeDb takes care of this interoperability between gene, protein, metabolite, and other databases, thus enabling seamless integration of many knowledge bases and wet-lab results. Since databases get updated continuously, so should the Open Science BridgeDb project.
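The idea of an identifier mapping can be shown with a toy lookup table, not BridgeDb's actual API: the same metabolite carries different identifiers in different databases, and a mapping lets tools translate between them. The entries below (glucose in HMDB, ChEBI and KEGG) are merely examples.

```python
# Toy identifier-mapping table: one metabolite, three database namespaces.
# BridgeDb itself maintains such mappings at scale and keeps them updated.
MAPPINGS = {
    ("HMDB", "HMDB0000122"): {"ChEBI": "CHEBI:4167", "KEGG": "C00031"},  # glucose
}

def map_identifier(source_db, source_id, target_db):
    """Translate an identifier from one database's namespace to another's."""
    return MAPPINGS.get((source_db, source_id), {}).get(target_db)

print(map_identifier("HMDB", "HMDB0000122", "KEGG"))  # C00031
```

Without such mappings, an omics dataset annotated with one database's identifiers cannot be joined to knowledge stored under another's.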
FAIR reuse of SPARQL queries through open Web sharing
dr. R.L. Zijdeman - KNAW/IISG
Team members: Menzo Windhouwer (KNAW Humanities Cluster); Bram van den Hout (KNAW/IISG)
Linked Open Data (LOD) is a way to connect any data across the Web. It is a simple and ‘open’ way of making data FAIR: Findable, Accessible, Interoperable and Reusable. However, when retrieving Linked Open Data, we write SPARQL queries that are not themselves FAIR, because there is no open way to share them. This proposal combines two existing technologies to store and share SPARQL queries in a user-friendly way. SPARQL queries thereby become FAIR too, allowing for replication of research, reuse of queries and collaborative query building by the community at large.
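What "sharing a SPARQL query in a FAIR way" amounts to can be sketched as storing the query text together with the metadata that makes it findable and reusable. The record layout, endpoint URL and query below are invented for illustration; the project's actual technologies will define their own formats.

```python
# Sketch: a SPARQL query bundled with minimal descriptive metadata, serialized
# so it can be published, found, and rerun. All field values are examples.
import json

query_record = {
    "title": "People born in Amsterdam",
    "creator": "example-researcher",
    "license": "CC0-1.0",
    "endpoint": "https://example.org/sparql",
    "query": (
        "SELECT ?person WHERE { "
        "?person <http://example.org/birthPlace> <http://example.org/Amsterdam> }"
    ),
}

serialized = json.dumps(query_record, indent=2)
restored = json.loads(serialized)
print(restored["title"])  # People born in Amsterdam
```

Publishing the record (rather than just the query string) is what enables replication: anyone can see which endpoint, license and author the query belongs to before rerunning it.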