AISym4Med is the platform that will change and improve the healthcare data system


AISym4Med aims to develop a platform that gives healthcare data engineers, practitioners, and researchers access to a trustworthy dataset system augmented with controlled data synthesis for experimentation and modeling purposes. The platform will address data privacy and security by combining new anonymization techniques, attribute-based privacy measures, and trustworthy tracking systems.


Main objectives of AISym4Med

Define a distributed architecture for health data collection, aggregation and analysis, capable of supporting secure access, exploitation, synthesis and analysis of data, while integrating tailored interfaces responding to the needs of the different target users.

Enhancing experimentation capacities via the expansion of datasets with reliable synthetic data, leveraging AI to mirror the characteristics of existing datasets without real reference to any identifiable individual, encompassing replication of images, tabular data, and time series, including auditing and quality assessment by clinicians.

Define a dedicated solution to guarantee trustworthiness and data privacy, building on an in-depth analysis of platform privacy and ethical requirements, in combination with a dedicated architecture and technological solutions to guarantee the usability of the platform.

Validating data management and experimentation functionalities against realistic use cases, with the direct involvement of target end-users and implementing iterative feedback loops in order to maximize the usability and effectiveness of the platform.

Ensuring the scalability and exploitation of the platform beyond the project implementation, in order to foster actual uptake of the platform by target stakeholders in operational environments, and guarantee the achievement of the expected impacts.



15 partners


8 countries


4 years

(or 48 months)


+6 M€

Total Budget


This project was created in response to specific needs identified in recent years:

ML algorithms need an immense amount of data to perform well in real-world settings.

Data accessibility is hindered by several issues, such as privacy and anonymization concerns.

Most healthcare data does not follow global standards, is incomplete, of low quality, and hard to assess, as it is dispersed across different databases.

Data bias needs to be reduced, as it is directly connected to the diversity of human biology.

Insufficient representation of rare pathologies in most scenarios.

Different privacy layers are needed to responsibly handle medical data containing very personal and sensitive information.

Lack of sociodemographic content or of pathological data, especially for rare pathologies.
