2025 Open Research Online Devroom talks
Date and time: Saturday the 15th of February
The call will be hosted on Zoom, allowing us to provide real-time captions and translation.
Join Zoom Meeting
To be announced
Interactive Q&A doc:
To be announced
Schedule:
| S/No. | Time (CET) | End | Title | Author | Chair |
|---|---|---|---|---|---|
| 1 | 11:30 | x | Is your Community implementing the Diversity Equity and Inclusion (DEI) equivalent of Python 2.6? | Rowland Mosbergen | x |
| 2 | 12:00 | x | Will AI coding assistants kill FLOSS in research software engineering? | Giuditta Parolini | x |
| 3 | 12:30 | x | A Universe to be Decided | Mike Smith | x |
| 4 | 13:00 | x | Creating an Open Knowledge Graph for Climate | Shweata Hegde | x |
| 5 | 13:30 | x | Cartography of the Missing: Mapping Forced Disappearances in Jalisco | Angel Abundis | x |
| 6 | 14:00 | x | Yanayi Project | Claire Depardieu | x |
| 7 | 14:30 | x | Research on (re)search: FLOSS as an open knowledge infrastructure | Renée Ridgway | x |
| 8 | 15:00 | x | TMI-WEB: An Open Approach to Computational Social Science | Coraline Ada Ehmke | x |
| 9 | 15:30 | x | From open code to open contributions: open source in academia | Arielle Bennett | x |
Abstract of Talks
Is your Community implementing the Diversity Equity and Inclusion (DEI) equivalent of Python 2.6?
Abstract: Many open source research communities want to be more diverse, equitable and inclusive. However, many DEI strategies fail because the communities behind them think they are implementing the DEI equivalent of Python 3.13 when they are actually implementing the DEI equivalent of Python 2.6.
This presentation shows you how to check whether you are currently on an older version and how to upgrade if you are. I show how you can make a difference as an individual, an organisation, or a community, in practical ways that are easy to implement.
I also show examples of how I have embedded these practices into recruitment, internship programs, leadership courses, conferences, and fellowships, and the impact they have had.
Speaker: Rowland Mosbergen. Talk licence: CC-BY 4.0
Will AI coding assistants kill FLOSS in research software engineering?
Abstract: AI coding assistants are becoming popular tools for writing code and documentation among research software engineers (RSE). The legal implications of using AI coding assistants, however, are hardly ever considered. Yet, these legal implications can threaten the future development of Free/Libre and Open Source Software (FLOSS) in research due to multiple Intellectual Property Rights (IPR) issues, ranging from the copyright status of AI-generated code to the potential infringement of third-party IPR.
The aim of the talk is to draw attention to the most pressing legal issues that RSE can face when they decide to use AI coding assistants. Copyright laws and licensing practices have been crucial in shaping the world of free/libre and open source software as we know it today. RSE should be fully aware of the risks posed to FLOSS by the use of AI coding assistants and should carry out a risk-benefit analysis before they decide to adopt these tools.
A recent survey of academic postdocs' attitudes towards AI tools suggests that over half of the researchers taking up these tools use them for generating, editing, and troubleshooting code (Nordling 2023). Organisations active in creating training materials for RSE are also considering the use of these tools in their teaching curricula.
The talk takes inspiration from a blog post on a popular AI coding assistant, GitHub Copilot, written by a lawyer and open source contributor, Matthew Butterick, and entitled “This Copilot is stupid and wants to kill me”. The aim of the talk is to review the most pressing legal issues that RSE can face when they decide to use AI coding assistants, and to show how their choice of tools can prevent them from releasing their own code under a free/open source licence, make them breach the re-use conditions of open source software they build on, and expose them to litigation over inappropriate use of copyrighted material.
IPR laws and regulations for software have been crucial in shaping the world of free/libre and open source software as we know it today. FLOSS licensing models have been supporting the development of tools that can be used by everyone royalty-free, and fit in with the open science commitment embraced by many funders of academic research. The usefulness of these tools is confirmed every day by growing user communities, including RSE communities around the world. RSE should be fully aware of the risks posed to FLOSS by the use of AI coding assistants and should carry out a risk-benefit analysis before they decide to adopt these tools.
Speaker: Giuditta Parolini. Talk licence: CC-BY
A Universe to be Decided
Abstract: Deep learning’s current “hot topics” are foundation models in the vein of ChatGPT and Chinchilla. These remarkably simple models contain a few standard deep learning building blocks and are trained to predict the next item in a sequence. Surprisingly, these models’ performance scales with dataset and model size via a predictable power law. Even more astoundingly, these models have been shown to display “emergent abilities” such as knowledge (albeit not “understanding”) of arithmetic, law, geography, and history.
In 2022, a team at Google DeepMind discovered that, to train optimally, the size of these foundation models should be scaled in roughly equal proportion to the size of the dataset used to train them. This means that state-of-the-art foundation models are limited by dataset size, not by model size as previously thought. Astronomy, in theory, is awash with data suitable for training an astronomical foundation model, so such a model would not be data-constrained.
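The scaling result mentioned above can be made concrete with a small sketch (my own illustration, not code from the talk). The 2022 “Chinchilla” paper (Hoffmann et al.) found that for a fixed training compute budget of roughly C ≈ 6·N·D FLOPs, loss is minimised when parameters N and training tokens D grow in equal proportion, landing at about 20 tokens per parameter:

```python
# Sketch of the Chinchilla compute-optimal scaling rule (Hoffmann et al., 2022):
# for a compute budget C ~ 6 * N * D FLOPs, the optimum has D ~ 20 * N,
# so N = sqrt(C / (6 * 20)) and D = 20 * N.

def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Return (params, tokens) that roughly exhaust `compute_flops` optimally."""
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla itself used ~5.76e23 FLOPs; the rule recovers its actual shape
# of ~70B parameters trained on ~1.4T tokens.
params, tokens = chinchilla_optimal(5.76e23)
```

Under this rule, doubling the model without doubling the dataset wastes compute, which is why the abstract argues that dataset size, not model size, is now the binding constraint.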
At UniverseTBD we are coming together as a community to develop and provide high-quality, multi-modal, open public datasets that advance the cutting edge of both deep learning and astronomy. We want to use these data to train open astronomical foundation models that can be used both to advance science and to inspire the next generation of astronomers. We will talk about how we are creating such datasets and models and, more importantly, show how you can get involved.
Speaker: Mike Smith
Creating an Open Knowledge Graph for Climate
Abstract: semanticClimate is a global hybrid community where interns from colleges (mainly in India) create climate knowledge to help the world make informed decisions. We work with trusted material such as the UN/IPCC reports (over 15,000 pages of important but dense text). The resulting knowledge products include:
- term-based dictionaries (ontologies) enhanced with Wikipedia and Wikidata
- a Corpus tool for scraping and analysing the current Open scholarly literature
- a knowledge graph created from the above with navigation tools and F/OSS software to make this easy and automatic.
semanticClimate interns range from high-school age upwards and need no prior knowledge of software. They learn by doing, and in some weeks have two-hour online sessions daily; these are recorded and transcribed to text for all to see. Interns are encouraged to give public talks (e.g. OKFN, Wikipedia, CODATA) and to make five-minute videos. All software is modular, Git-branched, versioned, and unit-tested. Where possible we publish it in the Journal of Open Source Software.
Speaker: Shweata Hegde
Cartography of the Missing: Mapping Forced Disappearances in Jalisco
Abstract: This project maps forced disappearances in Jalisco, Mexico, at the neighborhood scale, identifying clusters of incidents within radii as small as 500–700 meters. Using named entity recognition (NER) to extract addresses from disappearance records, combined with clustering algorithms and GIS tools, the project visualizes spatial patterns and motivations behind disappearances. The interactive map categorizes data by variables like gender, status, and timeframe, revealing high-risk zones and recurring criminal patterns. This session will detail the mapping process, discuss the map’s insights, and invite feedback to improve data-driven tools for combating this humanitarian crisis.
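A hypothetical sketch of the clustering step described above (my own illustration, not the project’s actual pipeline): once NER has pulled addresses out of disappearance records and a geocoder has turned them into (lat, lon) points, a density-based pass can flag groups of incidents within a ~600 m radius.

```python
# Naive DBSCAN-style clustering of geocoded incident reports.
# Label -1 marks noise, i.e. isolated reports outside any cluster.
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def cluster(points, radius_m=600, min_pts=3):
    """Assign a cluster id to every point with >= min_pts neighbours in radius_m."""
    labels = [-1] * len(points)
    cluster_id = 0
    for i, p in enumerate(points):
        if labels[i] != -1:
            continue
        neighbours = [j for j, q in enumerate(points) if haversine_m(p, q) <= radius_m]
        if len(neighbours) < min_pts:
            continue  # too sparse: leave as noise
        for j in neighbours:
            labels[j] = cluster_id
        cluster_id += 1
    return labels

# Toy data: four reports within a few hundred metres of each other in central
# Guadalajara, plus one distant outlier that stays labelled -1.
pts = [(20.6597, -103.3496), (20.6612, -103.3480), (20.6590, -103.3510),
       (20.6605, -103.3475), (20.7500, -103.4500)]
labels = cluster(pts)
```

In practice a library implementation (e.g. scikit-learn’s DBSCAN with a haversine metric) would replace this naive O(n²) pass, but the idea is the same: the 500–700 m radii in the abstract correspond to the neighbourhood radius parameter.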
Speaker: Angel Abundis
Yanayi Project
Abstract: Rural populations in Sub-Saharan Africa and Haiti are heavily dependent on natural resources, making them particularly vulnerable to climate change. In this context of climate emergency, it is crucial to understand their vulnerabilities and value their knowledge and adaptive capacities. Operating within the framework of open science, the participatory action-research project "Yanayi" has involved rural communities from eight countries of the French-speaking Global South to collect local knowledge and develop tools for reflection, awareness, and guiding collective action on climate change. As a result, 478 climate change narratives from elders were collected in local languages, translated into French, and integrated into an open qualitative database (Epicollect5). This talk will first present the "Yanayi" project, highlighting achievements made possible by the involvement of rural communities in collecting local knowledge and creating tools to act on climate change. Student projects completed in 2023 and those currently in preparation will then be presented. Keywords: population resilience, narratives, local knowledge, adaptation strategies, climate change.
Speaker: Claire Depardieu
Research on (re)search: FLOSS as an open knowledge infrastructure
Abstract: This talk delves into the values and ethics of Free Software, traversing its negotiations of knowledge and power by Silicon Valley behemoths that are all built on foundations of Linux servers and open source software. It follows the evolving nomenclature of F(L)OSS to open source, which has been offered as a business model for (corporate) infrastructure; simultaneously it recounts how software developers worldwide built new infrastructures and technical solutions based on shared resources and open code. Yet was Free Software a dispositif (Foucault), an ideology (Stallman), a strategy, as with open source (Raymond), or an “open knowledge infrastructure” that is not understood enough? To grasp this interplay between these technical infrastructures as artefacts and human organisation, as well as the ethics and politics implicit within their design, speaking directly with those immanent is essential.
Semi-structured interviews with developers, academics, geeks and awardees from NGI (Next Generation Internet) Search projects, which have to be made open source upon completion, put forth the value of openness in relation to other core ethics: usability, modifiability and maintenance. A critical discourse analysis interweaves excerpts from the interviews structured by Star and Ruhleder’s dimensions (1996), or Star’s properties (1999), of an emerging infrastructure. The Free Software movement of past decades resurfaces through the makers/practitioners of today, who, as a recursive public (Kelty 2008), elucidate the techno-infrastructural through the medium itself (code). These interventions go beyond the critique of open source as a solution to proprietary software and copyleft ideology to show how they differ more on an ethical, than a technical, level. Drawing on a chapter for the forthcoming book Politics of Open Infrastructures, this presentation demonstrates how F(L)OSS can be deemed an open (knowledge) infrastructure through its reorientation of knowledge and power, whilst revealing some of the values and ethical considerations embedded within developing open source (search) projects.
Speaker: Renée Ridgway
TMI-WEB: An Open Approach to Computational Social Science
Abstract: Can open source help us understand each other better? What if we could use open source to explore our intersecting identities and discover connections between our shared struggles and joys?
This presentation introduces TMI-WEB, a computational social science research tool that pairs a custom Ruby on Rails web application with the graph database Neo4j to create detailed models of the multifaceted nature of intersecting social identities (e.g., race, gender, or socioeconomic status). These models come to life as interactive visualizations of interconnected nodes in an explorable universe of identities, helping social scientists uncover connections between individual and social experiences.
We are developing TMI-WEB to address critical ethical and technical challenges at the intersection of social science and computing. The project aims to provide an open alternative to proprietary research tools, advance the field of computational social science, and support research that contributes to a more equitable society.
Built in the open and designed to support innovative social research methodologies, TMI-WEB enables scalable operationalization of intersectional analysis across large, text-based datasets—a rare capability in qualitative research. From research design to technical design, TMI-WEB was developed to foster openness, accessibility, and collaboration while advancing the goals of broad social good and societal well-being.
The TMI-WEB source code repository is at https://github.com/identity-research-lab/tmi-web.
Speakers: Coraline Ada Ehmke, Jess Parris Westbrook
Bio: Coraline Ada Ehmke is an internationally recognized tech ethicist and software engineer. For more than a decade, she’s worked on practical approaches to promoting the values of diversity, equity, and justice in the technology industry, with a particular focus on open source. She is the creator of Contributor Covenant, the first and most popular code of conduct for digital communities, and the Hippocratic License, an innovative software license designed to promote and protect human rights. Coraline co-founded the Organization for Ethical Source (https://ethicalsource.dev) and serves as its Executive Director.
Dr. Jess Parris Westbrook is a tenured Associate Professor of Design at DePaul University’s Jarvis College of Computing and Digital Media in Chicago, Illinois, USA, and the Director of the Identity Research Lab. Dr. Westbrook identifies as a postnormal social researcher and critical designer, with a passion for mixing methods, crafting probes and provocations, and connecting data with imagination. Their professional background bridges studio art, applied design, academic research, and education, valuing a commitment to transdisciplinary approaches and innovative collaborations.
From open code to open contributions: open source in academia
Abstract: The world runs on open source, and academic research is no different. However, while there have been recent analyses and evaluations of the culture of open source contributions in industry, the world of academic open source contributions has been less explored, despite increasing mandates for open code and a growing consensus around the importance of funding and support for open source. This talk will present an early look at an upcoming project intending to do just this, exploring the what, where, how, why and, critically, the why nots of academic open source contributions. Participants are invited to share their thoughts and help shape the future direction of the study at the end of the presentation.
Speaker: Arielle Bennett