As digitalization continues to develop rapidly, people and devices generate vast amounts of data every day. According to Statista, the total amount of data created globally is forecast to reach 175 zettabytes (175 billion terabytes, or 175 trillion gigabytes) in 2025.
In addition, authorities, businesses and industry are implementing data analytics to enhance their products and services in areas such as healthcare, smart manufacturing and city infrastructure, among many others.
Nonetheless, there are concerns around the quality and management of data, as well as how it is generated, used, stored and protected. Questions remain regarding how much of our personal data already exists, where it is, who can access it, who is using it and for what purpose.
e-tech talked with Ian Oppermann, who leads the work of Advisory Group 9 (AG 9) on data usage, to hear about the latest standardization work in this area. AG 9 is part of the IEC and ISO Joint Technical Committee (ISO/IEC JTC 1), which develops international standards for ICT. Oppermann is President of the ISO/IEC JTC 1 Strategic Advisory Committee in Australia and President of the Australian Computer Society (ACS).
What key areas are being considered by the advisory group?
We’re considering the entire lifecycle of data, including collection, transfer, storage, analytics, revising insights, long-term storage, data sharing agreements, data sharing networks and data usage networks. There are big gaps in the way data is treated throughout the process, which makes it difficult to nail down a data framework. Essentially there are three main areas of work: data sharing and use frameworks, data sensitivities and data quality.
What is the personal information factor (PIF) and how is it relevant to data frameworks?
As more and more data is generated by people and devices, there is growing concern not only about data privacy but also about the unintended consequences of data use. A framework of controls will rely on the ability to determine the level of sensitivity of the personal information (PI) captured in the data, known as the PI factor. The framework also needs to consider at what stage of the lifecycle the data is being used, such as collection, analysis or output.
Some of the issues around data sensitivities include:
How can standards improve the data lifecycle?
There are several issues, including the lack of a clearly defined language and structure. We need to define what data sharing is so that we can have consistent use cases. We also need to make sure that the metadata is clear so that we can build proper data for use by algorithms.
Advisory Group 9 has considered many aspects of data usage and come up with seven recommendations to address these issues through the development of international standards. This would entail working with other IEC and ISO subcommittees (SC) and technical committees (TC) that cover different aspects of ICT.
1) Frameworks for data sharing agreements – JTC 1/SC 38: Cloud computing and distributed platforms
We have recognized a gap when framed against the data value chain and recommend that SC 38 consider addressing this gap in ISO/IEC 23751, Information technology - Cloud computing and distributed platforms - Data sharing agreement (DSA) framework, the standard currently being developed.
2) Decision to share – JTC 1/SC 40: IT Service management and IT governance
We recommend that a new work item proposal (NWIP) be created on the topic of Guidance for data usage, with the aim of providing a coherent set of considerations and guidance that will help organizations develop their decision-making process for the sharing, exchange or exploitation of data.
3) Data quality – ISO/TC 184: Automation systems and integration
We note that JTC 1/SC 42, Artificial intelligence, has an Advisory Group already investigating data quality and reviewing ISO 8000 to determine if these standards meet the needs for big data. We recommend that SC 42 continue the review of aspects of data quality and include sharing and analytics. We would then review the feedback on whether and how these standards meet, or do not meet, the needs for data sharing and analytics.
4) Appropriate use of analytics
Appropriate use of analytics is ultimately a subjective matter and is supported by appropriate governance frameworks. Such frameworks address controls in response to different risk elements and sensitivities. Metadata on data collection, provenance, quality, subject and use would support the governance frameworks for appropriate use of analytics.
We recognize the work of ISO/TC 69: Applications of statistical methods, JTC 1/SC 32: Data management and interchange, and SC 42 and will make reference to it. We also recognize a gap when framed against the data value chain.
Thus, we recommend that SC 32 be requested to address this gap considering our thematic focus areas of data frameworks and sensitivities.
5) Terminology – use cases
We recommend that new work be started on the topic of Terminology and use cases for data usage. It should establish terminology for data usage and describe related use cases. The output can be used in the development of other standards and in support of communications among diverse, interested parties/stakeholders. This document would be applicable to all types of organizations (including commercial enterprises, government agencies, not-for-profit organizations).
6) Utility of metadata – SC 32
We recommend requesting SC 32 to investigate formally defining (e.g., standardizing) various kinds of metadata, models of all the defined kinds of metadata, and metamodels of repositories for those kinds of metadata.
7) Proposal for a new Working Group
Finally, given the recommendations for Guidance for data usage and for Terminology and use cases, and the need for additional guidance in the area of data usage, we recommend the establishment of a Working Group on Data usage which would: