Exploring New Research Challenges and Collaborations in AI/BD/HCI/IoT
17-18. October 2024 at SZTAKI, Budapest, Hungary
ERCIM and JST (Japan Science and Technology Agency) are co-organizing the 5th joint workshop on ‘Exploring New Research Challenges and Collaborations in AI/BD/HCI/IoT’. This workshop aims to present recent results and emerging R&D work conducted in European institutions and in the context of the JST AIP (Advanced Integrated Intelligence Platform) project in Japan. The event will give its European and Japanese participants an opportunity to discuss research challenges in these areas, as well as to consider collaboration prospects that may arise in the context of European programs and the relevant counterpart initiatives of JST.
In this workshop, invited participants will investigate research challenges, social aspects, and potential collaboration opportunities in the following topics:
- #1 ‘Trustworthy and Reliable Human-Machine Symbiotic Collaboration’,
- #2 ‘Extracting Actionable Knowledge in the Presence of Uncertainty’,
- #3 ‘Trust in Data-driven Research: The role of Actors, Research Infrastructures and Processes’,
- #4 ‘Infrastructure and Service Resilience for Smart Society’.
Interested participants should contact Dimitris Plexousakis.
Sponsors: ERCIM (European Research Consortium for Informatics and Mathematics), JST (Japan Science and Technology Agency) AIP Network Laboratory
Dates: 17. October 2024 (Thu) 10:00〜18:00 and 18. October 2024 (Fri) 10:00〜18:00
Location: SZTAKI, Budapest, Hungary
Preliminary Program Agenda
17. October 2024 (Thursday)
10:00〜 Opening Remarks
10:10〜 Keynote: Dr. Masugi Inoue (Director General, Resilient ICT Research Center, NICT)
10:50〜 Keynote: TBD (ERCIM)
11:30〜18:00 Group Discussion (4 Groups)
18. October 2024 (Friday)
09:00〜 Invited: TBD (JST)
09:30〜 Invited: TBD (ERCIM) or Discussion Status Report
10:00〜15:00 Group Discussion (4 Groups)
15:00〜17:00 Sharing Group Discussion Results: #1~#4 from Organizers
17:00〜18:00 Wrap-Up & Closing Remarks
Group Discussion Themes and Organizers
Note: Organizers might ask participants to prepare documents (describing themselves, their vision, thoughts, etc.).
#1 Trustworthy and Reliable Human-Machine Symbiotic Collaboration
Dr. Giorgos Flouris, Dr. Theodore Patkos (Institute of Computer Science, FORTH)
Prof. Rafik HADFI (Kyoto University)
#2 Extracting Actionable Knowledge in the Presence of Uncertainty
Prof. Nicolas Spyratos (Université Paris-Saclay)
Prof. Akira UCHIYAMA (Osaka University)
#3 Trust in Data-driven Research: The role of Actors, Research Infrastructures and Processes
Prof. Andreas Rauber (Technical Univ. of Vienna)
Prof. Hisashi KASHIMA (Kyoto University)
#4 Infrastructure and Service Resilience for Smart Society
Prof. Chrysostomos Stylios (Institute of Industrial Systems, Athena Research Center)
Prof. Takuro YONEZAWA (Nagoya University)
Organizing Committees:
ERCIM: Björn Levin (RISE, ERCIM President), Dimitris Plexousakis (ICS-FORTH, responsible for scientific aspects), Andreas Rauber (SBA, responsible for outreach), Nicolas Spyratos (Prof. Emeritus, Université Paris-Saclay)
JST-AIP: Katsumi Emura (VP, F-REI), Akiko Aizawa (Professor, NII), Teruo Higashino (VP, Kyoto Tachibana University), Yasuo Okabe (Professor, Kyoto University), Hideyuki Tokuda (President, NICT)
Secretaries (ERCIM): TBD
Secretaries (JST): JST Dept. of Strategic Basic Research (ICTG), Yoshiaki Kiriha
Themes
#1: Trustworthy and Reliable Human-Machine Symbiotic Collaboration
Rafik HADFI, Theodore Patkos, Giorgos Flouris
Modern intelligent systems are affecting our everyday lives at an increasing pace. Future intelligent machines will need to exhibit capabilities that are not only effective, but also closer to human intuition and intellect. To achieve reliable human-machine symbiotic collaboration, Artificial Intelligence (AI) needs to narrow the chasm between artificial and natural intelligence and make progress on skills that humans excel in. These include, among others:
- a general understanding of how the world works;
- the exploitation of common sense knowledge that is hidden, yet pervasive, in the majority of human-to-human interactions;
- the ability to comply with social norms and values;
- the competence to engage with multi-modal forms of communication in synergetic task execution with humans;
- the ability to explain their decisions with grounded justification.
This group will discuss challenges related to engineering Cognitive AI systems, i.e., intelligent machines that exhibit forms of human-like cognition. Research in this area builds on the advancements of modern, data-driven AI technologies, but also calls for progress in symbolic, knowledge-based methods, in order to enable machines to learn from how humans create rich cognitive models about the world they live in, and how they ascribe mental states to themselves and others, such as beliefs, intentions, emotions, perceptions, even thoughts. Foundation Models and Large Language Models (LLMs) significantly enhance human-machine interaction, but face limitations in logical reasoning. Progress in integrating neuro-symbolic methods aims at addressing these limitations by enriching symbolic reasoning capabilities within LLMs.
Cognitive AI will help intelligent systems engage more smoothly in social interactions, accomplish collaborative tasks, and in general broaden their intelligence and the spectrum of problems they can tackle. At the same time, it will increase trust in the use of AI in various contexts. Such a human-machine collaboration can impact various domains, including education, healthcare, business, society, and economics, through appropriately implemented and deployed smart assistants.
Additionally, the group will discuss the strengths, limitations, relevant application areas, and agendas of Europe and Japan on the topic, suggesting ways to harness the opportunity for collaboration and to bring about the conditions for a partnership that capitalizes on the competences of each side.
Topics of interest include but are not limited to:
Cognitive Robotics, Social Robotics, Multi-Modal Human-Robot Interaction, Argumentative Dialectical Communication, Human-Machine Collaboration, Symbiotic Interaction, Large Language Models, Never-Ending Learning, Scene Understanding, Commonsense Modeling, Knowledge Representation and Reasoning, Knowledge Transfer, Explainability.
#2: Extracting Actionable Knowledge in the Presence of Uncertainty
Akira UCHIYAMA, Nicolas Spyratos
In many applications today, users with a variety of backgrounds seek to extract high-level information from big datasets. The typical scenario is as follows: a user integrates a number of datasets, computes some statistics, and then seeks an explanation for the results, such as: Why are these data values outliers? Why are they not outliers? Why are two graphs similar (or dissimilar)? Why are the features and/or signal values so high (or low)?
However, since the dataset contains data from different sources, it usually has missing values and inconsistent data. This is especially true when the collected data come not only from web sources but also from physical measurements and are merged into a single table. In the area of databases, inconsistencies are mainly due to key constraint violations, whereas in the area of signal processing such constraints can be regarded as those of sensing accuracies that are inconsistent among modalities. In such settings, it is difficult for users to know the constraints imposed on the data in each source of provenance and to check their validity in the integrated dataset. Therefore, the only option for a user is to declare a set of constraints that they consider appropriate in a given application and then, using some query tool, extract from the dataset those data that are consistent with the declared constraints. In the areas of machine learning and signal processing, such constraints are often represented as regularization terms.
Three main issues (among others) arise in the areas of databases, signal processing, machine learning and AI when querying possibly inconsistent datasets:
- how to extract consistent answers to queries addressed to possibly inconsistent datasets;
- how to help users explain the expected presence or absence of certain information items in the answer to a query;
- how to define easily computable measures for assessing the quality of a dataset or of a query answer.
Therefore, a major challenge for research in databases, signal processing, machine learning and AI is to develop tools to assist users in performing the following main tasks:
Consistent query answering: An inconsistent dataset may contain consistent parts (i.e. some useful information) which, ideally, could be extracted through queries. This kind of query answering, known as consistent query answering, has attracted a lot of attention since the 1990s, and its importance is still growing today. However, consistent query answering is not possible with existing querying methods and tools, hence the need for research aiming to develop such methods and tools. In addition, in the area of signal processing we may need data completion and restoration methods. Data restoration has long been studied in signal processing and has been successful in restoring multimedia signals such as image and audio signals. However, it needs further development, since data now come from various sources, including both physical and virtual measurements, that go beyond multimedia. Studying data restoration from a mixture of consistent and inconsistent signals is an important topic.
Explanation of query answers (i.e. explaining observed query outputs): With the growing popularity of big data, signal processing, machine learning and AI, many users with a variety of backgrounds seek to extract high-level information from possibly inconsistent datasets collected from various sources and combined using data integration techniques. Even when consistent query answering methods and tools become available, users will still need help in explaining the presence or absence of certain information items in query answers. In deep-learning-related techniques, “explainable” and “interpretable” algorithms are especially needed, since huge trained models have many parameters whose individual contributions to performance are hard to discern. Mathematical modeling could help in understanding such huge models. Hence the need for research in this area aiming to help users understand and explain the consistent answers.
Quality of query answers: In the area of databases, the information items of a consistent answer to a query addressed to a possibly inconsistent dataset are of two kinds, depending on their “provenance”: (a) items whose ancestors in the dataset are consistent and (b) items whose ancestors in the dataset are inconsistent. Roughly speaking, the higher the percentage of consistent items in the dataset, the higher the quality of (consistent) query answers. In the area of signal processing, accuracy is measured using the mean squared error, but there is also a need to consider “no-reference” quality metrics. Hence, it is crucial to define easy-to-compute measures for characterizing the quality of data in a dataset as well as the quality of query answers.
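To make the first and third tasks concrete, the following is a minimal sketch in Python, assuming a single key constraint (one room per sensor) on a small made-up table. The table, the attribute names, and the helper functions `split_by_key` and `consistency_ratio` are hypothetical and chosen only for illustration; this is a simplification of consistent query answering, not a method proposed by the workshop organizers.

```python
from collections import defaultdict


def split_by_key(rows, key, attrs):
    """Group rows by `key`; a key value is consistent if all of its rows
    agree on the attributes in `attrs` (i.e. the key constraint holds)."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[key]].add(tuple(row[a] for a in attrs))
    consistent = {
        k: dict(zip(attrs, next(iter(vals))))
        for k, vals in groups.items()
        if len(vals) == 1
    }
    inconsistent = {k for k, vals in groups.items() if len(vals) > 1}
    return consistent, inconsistent


def consistency_ratio(rows, key, attrs):
    """An easy-to-compute quality measure: the fraction of key values
    that do not violate the key constraint."""
    consistent, inconsistent = split_by_key(rows, key, attrs)
    total = len(consistent) + len(inconsistent)
    return len(consistent) / total if total else 1.0


# Hypothetical integrated table: sensor s2 is placed in two different rooms
# by two sources, which violates the key constraint sensor -> room.
rows = [
    {"sensor": "s1", "room": "A", "source": "web"},
    {"sensor": "s1", "room": "A", "source": "device"},
    {"sensor": "s2", "room": "B", "source": "web"},
    {"sensor": "s2", "room": "C", "source": "device"},
    {"sensor": "s3", "room": "C", "source": "device"},
]

consistent, inconsistent = split_by_key(rows, "sensor", ["room"])

# Query "which sensors are in room C?", answered from consistent data only.
answer = sorted(k for k, v in consistent.items() if v["room"] == "C")
ratio = consistency_ratio(rows, "sensor", ["room"])
```

In this sketch, sensor s2 is excluded from the answer because its room is contested between sources, and the ratio gives a rough indication of how much of the dataset can be trusted for this constraint.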
#3: Trust in Data-driven Research: The role of Actors, Research Infrastructures and Processes
Hisashi Kashima, Andreas Rauber
Background
Research in virtually all disciplines is increasingly based on data collected from a variety of sources, pre-processed and analyzed in complex processes by a large number of stakeholders. The complexity of the processing pipelines, the number of actors and steps involved, as well as the amount of code re-used from diverse sources, give rise to numerous challenges with respect to the amount of trust we can have in the correctness of whatever processing we apply. Ultimately, we need to ask ourselves: how much trust can we have in our own research outputs (and those of others that we re-use), given that we are unable to verify every single data instance, piece of code and processing step applied?
Thus, questions arise as to what constitutes trust in data and code quality and in the processing applied; whether and what differences exist between different scientific disciplines and cultures; what understanding of trust is necessary to make it measurable; what activities are needed not only to build trust but also to maintain it in the long term; and to what extent research infrastructures that meet specific criteria of trustworthiness can assist with it. Answers to these questions are essential for determining the trust we can have in the respective research outcomes, and thus the degree of accountability we can accept for our findings, i.e. the quality of our research.
Discussion Topics
In the first workshop we explored trust via a simplified research process consisting of the definition of Research Questions driving a loop of data and code feeding into processing that produces an output, which ultimately leads to insights or knowledge. This process is embedded in the contexts of the principles and expectations of Communities of Practice, humans as stakeholders, the organizations in which they are embedded, and the (technical) infrastructures within which the research is being performed.
For each of these elements, an extensive (but far from exhaustive) set of attributes or facets influencing perceptions of trust can be identified. “Quality” is interpreted as “fitness for purpose” in a given setting. This implies that no absolute indicators of quality, justifying trustworthiness, can be provided for any element in this setting. Any quantification of trustworthiness needs to be done with respect to the activity being performed. In the time leading up to the next workshop, as well as during the workshop, we plan to expand the trust diagram, extending the list of trust indicators. We furthermore plan to explore differences in cultures, disciplines, and seniority levels with respect to such trust indicators. Understanding these will help us to design infrastructures, research processes, documentation and result presentation to signal trustworthiness, as well as to have better guidance on when to use and integrate building blocks, such as data or code, while still being able to have confidence in the results we obtain. This becomes even more important given recent developments in the field of AI, where accountability (and thus, ultimately, trust in the results and systems produced) is increasingly mandated by supervising authorities.
(1) Trust in data and code
- What information do we need to understand whether we can have sufficient trust in data / code / … to use it as a basis for our research?
- What kind of provenance information on data and code is required and how can it be captured?
- What kind of testing / quality assurance has to be applied before data / code can be re-used? What kind of assertions on data / code quality have to be available to ensure it can be (re-)used in a specific research activity?
- Is the data collected in a technically, legally and ethically appropriate manner?
- In the case of simulated / synthetic data generated e.g. by Digital Twins or Generative AI: how can we establish the degree to which it mimics real data for the task given?
(2) Trust in the data and research infrastructures
- What information do we need to have on the infrastructure to allow us to trust its viability as a source and destination of our research process (data, code, models, …)?
- Is the infrastructure securely and professionally managed, free of bias or secondary interests?
- What information do we need on the infrastructure's availability, long-term business model, and assertions on proper (i.e. technically, legally and ethically correct) handling of data, code, and results?
- What are quality/trust indicators of institutions beyond personal knowledge of the institution?
(3) Trust in the processes and their outcomes
- Is the processing in line with the law and ethics?
- Have the results of the data analysis been derived by correct procedures?
- What kinds of assertions / tests have been made to allow us to trust in the findings being reported?
- Are the results of the processing reproducible?
- What other mechanisms may we need to establish trust in findings, processing, …?
How can we determine to which degree we are able to have trust in - and thus accept accountability for - complex models being produced?
We plan to take a look at these and related aspects, with a particular focus on different perceptions with respect to:
1. disciplinary context
2. cultural context
3. data type context (raw/derived data, code, processes, research outputs, interpreted insights)
We aim to understand to what extent perceptions, and thus the approaches required, differ across these categories, and how high-quality research can be achieved given the massive complexity of an increasingly interdisciplinary, globalized research environment where traditional concepts of ensuring quality and establishing trust (e.g. by end-to-end testing and scrutiny, personal acquaintance with persons and labs, etc.) start to fail.
#4: Infrastructure and Service Resilience for Smart Society
Takuro YONEZAWA, Chrysostomos Stylios
We have experienced many dramatic changes as a result of the recent frequent disasters and infectious diseases. In addition, cybersecurity risks to digital infrastructures and services have become more severe. As we look towards the future, envisioning smart and connected communities and realizing Society 5.0, it becomes paramount to imbue these future infrastructures with resilience, making them highly reliable, robust, and fault-tolerant, thus enabling us to cope with and recover from setbacks [1]. Resilience is crucial for delivering human-centered digital services and applications that prioritize safety and security.
Visions of a smart society built on information technology have been discussed over the past decade. The development and proliferation of AI, IoT, and mobility technologies have been accelerating at an ever-increasing pace. In particular, large language models such as ChatGPT have garnered over 100 million users in just a few months, signifying the arrival of a new era in information technology. Based on recent research issues and outcomes, and on the social background of the past few years, it is necessary to reconsider which use cases and services are feasible for building a society that is resilient to physical and cyber disasters. It is also necessary to consider the grand challenges in realizing the infrastructure to support these services, including identifying the risks and threats to the infrastructure and its services.
In light of these imperatives, this group is dedicated to investigating existing challenges and new research missions to achieve integrated and harmonized resilience using state-of-the-art technologies such as AI, Robotics, Big Data, HI, and Internet of Things (IoT). Another aim is to tackle emerging ethical, legal, and social issues (ELSI) and responsible research and innovation (RRI) challenges.
A wide range of topics can be studied in the context of this research effort, including, but not limited to:
- Computational Social Science: Utilizing computational methods and models to gain insights into social phenomena and human behaviour, thus informing more resilient digital services,
- Use Cases, Service Models and their Risks and Threats: Deriving new use cases integrating recent and expected AI, IoT, and mobility technologies, and identifying new risks and threats underlying the services,
- Security and Privacy: Developing robust measures to safeguard digital infrastructures and protect sensitive information,
- Quality of Data: Ensuring the accuracy, reliability, and integrity of utilized data,
- Multi-Algorithm Networks: Exploring the potential of employing diverse algorithms to enhance network resilience and adaptability from edge to cloud networks,
- Real-world Robotics and autonomous mobilities: Advancing the capabilities of robotic systems to navigate and operate in complex, real-world environments effectively,
- AI (Federated Learning) Algorithm/Platform: Investigating distributed learning approaches that allow multiple entities to collaboratively train AI models while preserving data privacy,
- Infrastructure Operations/Management: Developing strategies and frameworks for the efficient operation and management of digital infrastructures, ensuring their resilience and continuity.
Specifically, issues related to:
1 Computational Social Science
1.1 Developing methods to understand manifest and latent social phenomena by analysing various online/offline social signals.
1.2 Developing forecasting mechanisms of social phenomena by analysing past and current social contexts.
2 Use Cases, Service Models and their Risks and Threats
2.1 Developing new use cases and service models that address social problems and natural disasters.
2.2 Identifying the security and privacy risks and threats of those use cases and service models.
3 Security and Privacy
3.1 Developing advanced authentication mechanisms that ensure secure access to digital services; efficient encryption techniques to protect sensitive data; intelligent AI-based systems capable of detecting and mitigating cyber threats in real-time; and protocols that integrate security measures at every layer of the digital infrastructure, mitigate vulnerabilities and minimize the impact of potential breaches.
4 Quality of Data
4.1 Developing anomaly detection methods and data cleansing techniques.
4.2 Integrating data from diverse sources to create a unified and coherent dataset, enabling more accurate analysis and decision-making.
4.3 Establishing mechanisms to trace the origin and history of data, ensuring transparency, accountability, and trustworthiness in the digital ecosystem.
5 Multi-Algorithm Networks
5.1 Designing adaptive and self-healing network architectures that can dynamically reconfigure themselves to withstand disruptions, failures, and attacks.
5.2 Addressing the challenges of efficiently managing and scaling multi-algorithm networks, ensuring that computational resources are utilized effectively while maintaining the desired levels of resilience and performance.
6 Real-world Robotics and Autonomous Mobilities
6.1 Enhancing robotic systems' perception capabilities to accurately understand and interpret their surrounding environment, enabling them to make informed decisions and adapt to dynamic situations.
7 Federated Learning Algorithm/Platform
7.1 Designing techniques to protect user data privacy during the training process of AI models, allowing for collaborative learning without the need to aggregate sensitive information centrally.
7.2 Exploring approaches for aggregating and generalizing models learned from multiple entities.
The issues listed above represent only a fraction of the group's interests.
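As a concrete illustration of the aggregation step mentioned under Federated Learning, the sketch below shows sample-size-weighted averaging of locally trained model parameters, in the style of federated averaging. It is a minimal sketch under stated assumptions: the function name `federated_average`, the two-client setup, and all numbers are hypothetical, and real platforms add many further steps (secure aggregation, client selection, multiple rounds) that are omitted here.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate per-client parameter vectors into one global vector.

    Each client contributes in proportion to its number of local samples.
    Only model parameters are shared; the raw data stay with the clients.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: two clients with locally trained 2-parameter models.
# The second client holds three times as much data as the first.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]

global_weights = federated_average(client_weights, client_sizes)
```

Weighting by sample count is one common design choice; it gives larger local datasets more influence on the global model, which may or may not be desirable depending on how the data are distributed across entities.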