New standard to enhance trustworthiness of artificial intelligence systems

Artificial intelligence (AI) systems are used broadly by academia and governments, as well as across diverse industries such as healthcare, manufacturing, transport and retail, to enhance products, services and user experience.

Autonomous vehicles use sensors and cameras to perceive the environment surrounding them.

For instance, manufacturers use machine learning to predict when machines need to be maintained with high accuracy; algorithms help healthcare providers diagnose patients and find the best treatments available, while investment firms use AI-powered financial search engines to mine masses of data and generate actionable reports, which saves time, human resources and costs.

IEC and ISO develop international standards for information and communication technologies in over 22 areas, through their joint technical committee (JTC 1). As part of its activities, the committee for AI (SC 42) develops international publications and provides guidance to IEC, ISO and other JTC 1 committees on applications of AI. SC 42 has just published ISO/IEC Technical Report 24029-1, Artificial intelligence – Assessment of the robustness of neural networks – Part 1: Overview.

“Technologies such as artificial intelligence are fuelling the digital transformation of industry”, said Wael William Diab, Chair of SC 42. “SC 42 has been looking at the entire AI ecosystem, which includes novel approaches to address emerging issues such as trustworthiness from the start and thus enable broad adoption. The robustness series complements the portfolio of trustworthy and ethical standards the committee is developing.”

“Robustness is an important high-level characteristic of trustworthy AI systems. This series reacts to industry demand across application areas, to be able to demonstrate robustness in AI systems based on neural networks”, said David Filip, Convenor of SC 42/Working Group 3 which developed ISO/IEC 24029-1 and is working on ISO/IEC 24029-2.

e-tech spoke with Arnault Ioualalen, editor of the Technical Report, to find out how it will contribute towards ensuring that products and services using AI systems are safe.

What does the Technical Report provide?

We wanted to document existing practices and add to the knowledge of the people who validate AI systems. The Technical Report provides an overview of the approaches and methods available to assess issues and risks tied to the robustness of AI systems, with a particular focus on neural networks: what these methods do, how they work and how they can be used.

For an evaluator, these methods answer different questions about the systems they validate. There are three types of methods:

  • Statistical approaches usually rely upon a mathematical testing process on some datasets and help ensure a certain level of confidence in the results. They allow the evaluator to answer questions related to a desired target performance threshold; for instance, what was the false positive or false negative rate when predicting a material fault, and is that rate acceptable?
  • Formal methods rely on sound formal proof, which enables the evaluator to check if the properties are provable over a domain of use; for example, does the system always operate within some specified safety boundaries?
  • Empirical methods rely on experimentation, observation and expert judgement. They enable the evaluator to assess the degree to which the system’s properties hold true in the scenario tested. In other words, is the observed behaviour satisfactory?
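As a minimal sketch of the statistical style of assessment described above, the following checks whether a fault-detection model's false positive rate stays below a target threshold with a margin of statistical confidence. The threshold, the normal-approximation interval and the toy prediction data are my own illustrative assumptions, not prescriptions from the Technical Report.

```python
import math

def false_positive_rate(predictions, labels):
    """Fraction of actual negatives (label 0) that were predicted positive."""
    negatives = [p for p, y in zip(predictions, labels) if y == 0]
    if not negatives:
        return 0.0
    return sum(negatives) / len(negatives)

def meets_threshold(rate, n, threshold, z=1.96):
    """Check the rate plus a 95% normal-approximation margin stays under the threshold."""
    margin = z * math.sqrt(rate * (1 - rate) / n)
    return rate + margin <= threshold

# Hypothetical fault-detection results: 1 = fault predicted, 0 = no fault
preds  = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
labels = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]  # one false positive among 8 true negatives

fpr = false_positive_rate(preds, labels)
print(fpr)  # 0.125
print(meets_threshold(fpr, n=8, threshold=0.5))
```

With only eight negative samples the confidence margin is wide, which is exactly the kind of question a statistical assessment surfaces: is the observed rate, with its uncertainty, acceptable for the target performance threshold?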

The idea behind these robustness assessment methods is to evaluate the extent to which these properties hold when circumstances change.
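To illustrate the formal style of check mentioned above, here is a sketch of interval bound propagation, one common verification technique for neural networks (my choice of illustration, not a method mandated by the report). It pushes a whole box of possible inputs through a tiny one-layer network to prove the output can never leave a safety bound; the weights and bounds are hypothetical.

```python
def interval_affine(lo, hi, weights, bias):
    """Propagate an input box [lo, hi] through y = W·x + b, one bound pair per output."""
    out = []
    for w_row, b in zip(weights, bias):
        # Negative weights swap which end of the input interval drives each bound.
        y_lo = b + sum(w * (lo[i] if w >= 0 else hi[i]) for i, w in enumerate(w_row))
        y_hi = b + sum(w * (hi[i] if w >= 0 else lo[i]) for i, w in enumerate(w_row))
        out.append((y_lo, y_hi))
    return out

def relu_interval(bounds):
    """Apply the ReLU activation to each interval."""
    return [(max(0.0, lo), max(0.0, hi)) for lo, hi in bounds]

# Hypothetical single-layer network
weights = [[1.0, -2.0]]
bias = [0.5]
pre = interval_affine([0.0, 0.0], [1.0, 1.0], weights, bias)
post = relu_interval(pre)
print(post)  # [(0.0, 1.5)]: provable output bounds for every input in the box
```

Unlike a statistical test on sampled inputs, this kind of analysis covers every input in the domain at once, which is what lets a formal method "prove" rather than merely estimate that the system stays within specified safety boundaries.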

What is robustness?

Neural networks must be properly validated for several aspects, such as robustness, resiliency, reliability, accuracy, safety, security and privacy. This Technical Report considers robustness, in other words how to assess, or even prove, in the case of formal methods, that your system will still operate normally under various conditions.

When you perform an in-house validation, you can control the conditions around you, such as light, visibility, temperature and movement around the system being tested. When you deploy an AI system in real-life circumstances, outside of the test lab, sometimes the conditions will vary and sometimes there will be problems.

Robustness is the ability to say whether or not your system will continue to function properly in the live situation. Most AI systems are designed to have contact with the real world. They are usually linked to a sensor or camera, where there are few constraints on the environment parameters.

For example, in a factory or a company set-up, you can control everything in the environment, such as temperature, or who has access to a certain area. You also build in environmental constraints to ensure safety-by-design, so that you are sure about how your system will interact with its environment.

In contrast, if you take the example of an AI system used for self-driving cars, anything can happen on the road. People, animals or objects can suddenly cross the vehicle’s path, so there will be more challenges for your AI system to adapt to in real-life situations.

Why is robustness important?

You can look at robustness from different perspectives.

In the short-term, it is a matter of efficiency. When you design a system, you go through several steps: R&D, prototype, integration and testing. At each phase, your system could have a problem and you would have to backtrack to find the issue, make improvements and ensure quality all the way along the lifecycle. Robustness will have to be checked at each step, which will save some of the backtracking. This is key for industry because it saves time and money.

In the long-term, it is about ensuring there are mechanisms to build trust in AI systems so that people will have the confidence to adopt this technology in everyday life.

What’s next?

We are developing Part 2 of the series, which takes up the formal assessment methods listed in ISO/IEC TR 24029-1 and recommends methodologies to follow for assessing robustness properties of neural networks in specific situations or under specific constraints.

Part 2 follows the lifecycle of an AI system, from inception to retirement, and for each step, illustrates how each criterion and method can be used to prove particular aspects of the robustness of neural networks.

Find out about IEC and ISO AI standardization activities and how you can get involved.