Highlights from RDA Plenary 13

(Matthew Viljoen) #1

Conference website: https://www.rd-alliance.org/plenaries/rda-thirteenth-plenary-meeting-philadelphia-us

Conference programme: https://www.rd-alliance.org/rda-13th-plenary-programme

Opening Plenary

The main speaker was Julia Stoyanovich (https://www.rd-alliance.org/about/organization/key-profiles/julia-stoyanovich), who introduced the common theme of the conference, Responsible Data. The talk focussed on the reality of statistical bias in data and data-processing algorithms, and on its societal impact, e.g.:

  • ads targeted at people of specific profiles (racial, sexual etc.)
  • machine bias when processing data, e.g. software used in the USA criminal justice system to predict whether offenders are likely to re-offend

The speaker’s proposal to deal with this:

  • Algorithmic transparency (not just releasing source code which can be unnecessary & often insufficient)
  • Algorithmic transparency requires Data transparency
  • Data transparency is NOT synonymous with making all data public, but data should be released whenever possible, along with data selection, collection and pre-processing methodologies, provenance, quality info, known sources of bias, and privacy-preserving statistical summaries of the data
  • Data transparency helps prevent discrimination & enables the establishment of trust

But technology alone is not enough: regulation & civic engagement are also needed. This is something we should drive through engagement with the public, both technical & non-technical.

IG in Health Data

A problem identified was the abundance of low-quality health data in health research. This has been exacerbated by legal aspects (e.g. GDPR). Emerging solutions to deal with this are:

  • Secure Multi-Party Computation (SMPC) which may be run on untrusted computing, BUT needs strong governance to be run on untrusted networking
  • Synthetic data, which can be anonymised data, or data ‘completed’ with average values to increase its quality
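
To give a feel for why SMPC can tolerate untrusted computing, here is a minimal sketch of additive secret sharing, one common building block of SMPC. It is an illustration only and not something presented in the session: the function names, the modulus and the example numbers are all assumptions, and the sketch assumes honest-but-curious parties and non-negative integer inputs whose total stays below the modulus.

```python
import secrets

# Assumed modulus for this sketch; any sufficiently large prime would do.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; any incomplete subset reveals nothing about the value."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Compute the sum of the parties' values without pooling the raw values."""
    n = len(private_values)
    # Each party splits its own value and sends one share to every party.
    distributed = [share(v, n) for v in private_values]
    # Party j only ever sees column j, which looks like random numbers.
    partial_sums = [sum(row[j] for row in distributed) % PRIME for j in range(n)]
    # Only the partial sums are published and combined into the total.
    return reconstruct(partial_sums)


# Hypothetical example: three sites each hold a private patient count.
total = secure_sum([12, 7, 30])
```

Each site learns the total but never another site's individual count. The sketch says nothing about the governance and networking concerns raised above, which a real deployment would still need to address.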

OUTCOME: follow-up discussion with Oya Deniz Beyan (Fraunhofer) from the GOFAIR project to investigate whether we can be involved in SMPC-like testing

WG on FAIR Data Maturity (chaired by Edit Herczog)

Approx. 60 attendees, mostly from the EU and North America, with some from AU and Africa. Approx. 50% are involved in infrastructure/SPs. Maggie Helstrom also connected remotely.

Discussion of the scope and methodology to create a common set of core assessment criteria for FAIRness. A bottom-up approach: definition, development, testing and delivery. Agreement over the 4 kick-off meetings across different regions so far.

Long discussion of whether the assessment should be automatic (done by machines/algorithms) or manual, using examples of the volume of data & practicality considerations.

Agreement that the scope should be cross-disciplinary rather than domain specific.

An example of a maturity criterion is the existence (or not) of a license, and if so, its nature. But there is currently no ontology of licenses.

There are currently 11 different methods which we will examine in turn.

OUTCOME: monthly meetings which EGI should participate in.

IG Observational Data to Information (OD2I) and Science Gateways

There was an introduction to the two IGs, including aspects of bringing raw data to usable information for research, and VREs or Science Gateways or Virtual Labs. These were followed by a number of different talks, including one that raised much interest about a method for Annotating Data (implemented using a MongoDB separate from the data) and tools for measuring the quality of preserved data.

I gave a presentation introducing the SKA/AENEAS project and the plans for a Science Gateway. Since there seemed to be parallels between this topic and both IGs, the chairs and I agreed to continue engagement until the end of the project.

OUTCOME: contact established with Sandra Gesing and discussions for future engagement with the IG

BoF Assessing FAIR data policy implementation in health research

The meeting opened with an introduction to the new FAIR4Health project and the landscape analysis it will conduct to assess FAIR implementation in health research. It is an H2020 project with 17 partners from 11 EU and non-EU countries. There was also an overview of the existing projects in the FAIR ecosystem: EOSC-Life, FAIRsFAIR and GOFAIR.

I mentioned the importance of engaging with existing eInfrastructures (e.g. ELIXIR, and AAI aspects) to help deliver the most important outputs: workable implementations.

There followed a discussion regarding:

  • problems health researchers face regarding research data
  • main social, technical and/or cultural barriers preventing FAIRification
  • benefits of FAIRifying health research data

Synthetic data may possibly help with the privacy issues. (But this is just anonymized data…)

Miscellaneous follow-ups

  • Mohamed Ba-Essa (KAUST library services): I re-iterated EGI’s willingness to follow up on last year’s discussions with KAUST about KAUST joining the federation. He will speak to the colleagues involved and contact me
  • Peter Elias (SCiLeD): regarding federating e-Infrastructures across Lagos (Nigeria) and Accra (Ghana), and the possibility of engaging with EGI. I sent a follow-up email; Peter will discuss internally and contact us

(Gergely Sipos) #2

Hi, were there any good slides that explain the story of how a certain group/community achieved FAIR-ness? I’d like to see concrete examples of working practices, and possibly reuse them in my ‘EOSC/EGI’ promotional talks.

Another, maybe related thing: I am working on a partnership agreement between EGI and the US Science gateway institute.

(Matthew Viljoen) #3

There was much more discussion of the projects involved in promoting FAIRness than real evidence of FAIR data from researchers. Probably the closest were reports from the Photon/Neutron community in this talk: https://rd-alliance.org/ig-research-data-needs-photon-and-neutron-science-community-rda-13th-plenary-meeting

Unfortunately I see the slides are not yet linked. I’ll contact the chair to request that they are.