iDPP Open Evaluation Challenges and FAIR Data: The BRAINTEASER Approach to Open Science

by Nicola Ferro, full professor of computer science at the Department of Information Engineering, University of Padua, Italy.

 

From its conception and, even more so, throughout its execution, the BRAINTEASER project placed a strong focus on Open Science[1] as a means both to validate its results and methods and to accelerate the transfer of its outcomes. To this end, the project developed its own approach to Open Science, built on two pillars: the adoption of the FAIR (Findable, Accessible, Interoperable, Reusable) principles[2] for data sharing and the organization of open evaluation challenges[3].

To adhere to the FAIR principles, BRAINTEASER developed an ontology[4] to model clinical data on amyotrophic lateral sclerosis (ALS) and multiple sclerosis (MS), both retrospective and prospective, as well as environmental data, and made the ontology itself available[5] according to the FAIR principles.

The BRAINTEASER ontology was used to populate a knowledge base of ALS and MS clinical data, consisting of retrospective and prospective data. The retrospective data cover about 2,204 ALS patients (static variables, ALSFRS-R questionnaires, spirometry tests, environmental/pollution data) and 1,792 MS patients (static variables, EDSS scores, evoked potentials, relapses, MRIs, environmental/pollution data); the prospective data cover about 86 ALS patients (static variables, ALSFRS-R questionnaires compiled by clinicians or by patients using the BRAINTEASER mobile application, sensor data). These datasets have also been made available[6] according to the FAIR data principles.

Both the ontology and the ALS and MS clinical datasets were developed iteratively over the years; they fueled the open evaluation challenges and were, in turn, validated through them.

The iDPP (Intelligent Disease Progression Prediction) challenges were organized under the umbrella of CLEF[7] (Conference and Labs of the Evaluation Forum) with several purposes: (i) to openly and publicly validate the prediction algorithms developed by the BRAINTEASER project; (ii) to enable other researchers to develop their own prediction methods using the BRAINTEASER ALS and MS clinical datasets; (iii) to ensure the comparability of experimental results by relying on shared, common datasets; (iv) to accelerate knowledge transfer to and from the project and foster the adoption of the best approaches; (v) to stimulate the exchange of ideas and the growth of a community by organizing an annual workshop where the results are presented and discussed. BRAINTEASER organized three yearly cycles of open evaluation challenges. All the papers, participant slides and repositories are available through the BRAINTEASER website.

The first challenge, iDPP@CLEF 2022[8], offered three tasks using ALS retrospective data: Task 1, Ranking Risk of Impairment for ALS, where participants were asked to rank subjects based on the risk of early occurrence of an event (NIV, i.e., non-invasive ventilation; PEG, i.e., percutaneous endoscopic gastrostomy; or death); Task 2, Predicting Time of Impairment for ALS, where participants were asked to predict the time of the event (NIV, PEG, or death); and Task 3, on position papers on Explainability of AI Algorithms for ALS.
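As an illustration of what a risk-ranking task such as Task 1 involves, the sketch below scores a ranking with Harrell's concordance index, a measure commonly used in survival analysis. This is an assumption made for illustration only: the text above does not state which measure the challenge used, and the function name and toy data are hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    times  -- time of the event, or of last follow-up if censored
    events -- True if the event (e.g. NIV, PEG, death) was observed
    risks  -- predicted risk score; higher means an earlier expected event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the "earlier event" of a pair
        for j in range(n):
            # A pair is comparable when subject i had the event strictly before
            # subject j's event or censoring time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one censored at month 15.
times = [5, 10, 15, 20]
events = [True, True, False, True]
risks = [0.9, 0.7, 0.4, 0.2]
print(concordance_index(times, events, risks))  # 1.0, a perfectly ordered ranking
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 to a fully inverted ranking.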

The second challenge, iDPP@CLEF 2023[9], offered three tasks using MS retrospective data and ALS retrospective data plus environmental data: Task 1, Predicting Risk of Disease Worsening (MS), where participants were asked to rank subjects based on the risk of worsening, as measured by the Expanded Disability Status Scale (EDSS); Task 2, Predicting Cumulative Probability of Worsening (MS), where participants were asked to explicitly assign the cumulative probability of worsening over different time windows, i.e., between years 0 and 2, 0 and 4, 0 and 6, 0 and 8, and 0 and 10; and Task 3, on position papers on the impact of exposure to pollutants for ALS.
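To make the output expected in a task like Task 2 concrete, the sketch below produces one cumulative probability per time window for a single subject. It assumes, purely for illustration, a constant yearly hazard of worsening (an exponential model); the text above does not prescribe any model, and the function name and hazard value are hypothetical.

```python
import math
from typing import List, Sequence

# Upper bounds (in years) of the windows named in the task: 0-2, 0-4, 0-6, 0-8, 0-10.
WINDOWS_YEARS = (2, 4, 6, 8, 10)


def cumulative_worsening_probabilities(
    hazard_per_year: float,
    windows: Sequence[float] = WINDOWS_YEARS,
) -> List[float]:
    """Return P(worsening within [0, t]) for each window end t.

    Illustrative assumption: a constant hazard, so P = 1 - exp(-hazard * t).
    """
    if hazard_per_year < 0:
        raise ValueError("hazard must be non-negative")
    return [1.0 - math.exp(-hazard_per_year * t) for t in windows]


# Hypothetical subject with a hazard of 0.1 worsening events per year.
for years, p in zip(WINDOWS_YEARS, cumulative_worsening_probabilities(0.1)):
    print(f"years 0-{years}: {p:.3f}")
```

Whatever model is used, the probabilities for nested windows should never decrease as the window widens, which is a simple sanity check on any submission of this kind.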

The third and last challenge, iDPP@CLEF 2024[10], offered three tasks using ALS and MS prospective data, plus sensor data: Task 1, Predicting ALSFRS-R Score from Sensor Data (ALS), where participants were asked to predict the twelve scores of the ALSFRS-R (ALS Functional Rating Scale – Revised), assigned by medical doctors roughly every three months, from the sensor data collected via the BRAINTEASER app; Task 2, Predicting Patient Self-assessment Score from Sensor Data (ALS), where participants were asked to predict the self-assessment scores assigned by patients from the sensor data collected via the app (self-assessment scores correspond to each of the ALSFRS-R scores but, while the latter are assigned by medical doctors during visits, the former are assigned by the patients themselves through self-evaluation in the provided app); and Task 3, Predicting Relapses from EDSS Sub-scores and Environmental Data (MS), where participants were asked to predict a relapse using environmental data and EDSS (Expanded Disability Status Scale) sub-scores.

Overall, the three annual cycles of open evaluation challenges made it possible to explore different aspects of ALS and MS progression prediction, from more consolidated approaches based on survival analysis to more exploratory ones targeting the explainability of the algorithms and the joint use of environmental and sensor data. Moreover, the open evaluation challenges were not only instrumental in delivering the BRAINTEASER ALS and MS clinical datasets; they also served to accumulate experimental evidence, e.g., comparable performance scores of different approaches, which has been shared back with the community. Being shared according to the FAIR principles, all this accumulated knowledge remains available for future reuse and research, even after the end of the BRAINTEASER project.

 


[1] https://www.unesco.org/en/open-science

[2] https://www.go-fair.org/fair-principles/

[3] https://brainteaser.health/open-evaluation-challenges/

[4] Faggioli, G., Menotti, L., Marchesin, S., Chiò, A., Dagliati, A., de Carvalho, M., Gromicho, M., Manera, U., Tavazzi, E., Di Nunzio, G. M., Silvello, G., and Ferro, N. (2024). An extensible and unifying approach to retrospective clinical data modeling: the BrainTeaser Ontology. Journal of Biomedical Semantics, 15:16:1-16:28. doi: https://doi.org/10.1186/s13326-024-00317-y

[5] https://doi.org/10.5281/zenodo.12789731

[6] https://doi.org/10.5281/zenodo.12789962

[7] https://www.clef-initiative.eu/

[8] https://brainteaser.health/open-evaluation-challenges/idpp-2022/

Guazzo, A., Trescato, I., Longato, E., Hazizaj, E., Dosso, D., Faggioli, G., Di Nunzio, G. M., Silvello, G., Vettoretti, M., Tavazzi, E., Roversi, C., Fariselli, P., Madeira, S. C., de Carvalho, M., Gromicho, M., Chiò, A., Manera, U., Dagliati, A., Birolo, G., Aidos, H., Di Camillo, B., and Ferro, N. (2022). Intelligent Disease Progression Prediction: Overview of iDPP@CLEF 2022. In Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022), pages 395-422. Lecture Notes in Computer Science (LNCS) 13390, Springer, Heidelberg, Germany. doi: https://doi.org/10.1007/978-3-031-13643-6_25

[9] https://brainteaser.health/open-evaluation-challenges/idpp-2023/

Faggioli, G., Guazzo, A., Marchesin, S., Menotti, L., Trescato, I., Aidos, H., Bergamaschi, R., Birolo, G., Cavalla, P., Chiò, A., Dagliati, A., de Carvalho, M., Di Nunzio, G. M., Fariselli, P., García Dominguez, J. M., Gromicho, M., Longato, E., Madeira, S. C., Manera, U., Silvello, G., Tavazzi, E., Tavazzi, E., Vettoretti, M., Di Camillo, B., and Ferro, N. (2023). Intelligent Disease Progression Prediction: Overview of iDPP@CLEF 2023. In Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF 2023), pages 343-369. Lecture Notes in Computer Science (LNCS) 14163, Springer, Heidelberg, Germany. doi: https://doi.org/10.1007/978-3-031-42448-9_24

[10] https://brainteaser.health/open-evaluation-challenges/idpp-2024/

Birolo, G., Bosoni, P., Faggioli, G., Aidos, H., Bergamaschi, R., Cavalla, P., Chiò, A., Dagliati, A., de Carvalho, M., Di Nunzio, G. M., Fariselli, P., García Dominguez, J. M., Gromicho, M., Guazzo, A., Longato, E., Madeira, S. C., Manera, U., Marchesin, S., Menotti, L., Silvello, G., Tavazzi, E., Tavazzi, E., Trescato, I., Vettoretti, M., Di Camillo, B., and Ferro, N. (2024). Intelligent Disease Progression Prediction: Overview of iDPP@CLEF 2024. In Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024) – Part II, pages 118-139. Lecture Notes in Computer Science (LNCS) 14959, Springer, Heidelberg, Germany. doi: https://doi.org/10.1007/978-3-031-71908-0_6
