Introduction: the basics of data integrity
Data integrity means the certainty that data is complete, correct, accurate and up-to-date throughout its lifecycle. A data integrity policy takes into consideration the data lifecycle and any unintentional or deliberate harm that might be caused to data.
Many official documents refer to the notion of data integrity. The following are a few of the definitions offered:
- According to the ISO/IEC 27000:2016 standard, “integrity is the property of accuracy and completeness.”
- France’s National guidelines on public archives (Référentiel général de gestion des archives) define integrity as “the quality of a document or data item that has not been compromised. Digitally, documents or data are trustworthy if their imprint (or digital fingerprint) at time t+1 is identical to their imprint at time t.”
- According to the French general inter-ministerial directive no. 1300, it is the “property providing assurance that data or processing has not been subject to unauthorised changes or deletion.”
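The "imprint at time t versus time t+1" idea in the second definition can be illustrated with a cryptographic hash. The following is a minimal sketch rather than a prescribed method: SHA-256 is an assumed choice of algorithm, and the function names `fingerprint` and `is_unchanged` are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data_now: bytes, fingerprint_then: str) -> bool:
    """Compare the imprint computed now with the imprint recorded earlier."""
    return fingerprint(data_now) == fingerprint_then


# Time t: record the imprint alongside (or separately from) the document.
document = b"Batch 42 released on 2024-01-15"
imprint_at_t = fingerprint(document)

# Time t+1: recompute the imprint and compare it with the recorded one.
print(is_unchanged(document, imprint_at_t))
print(is_unchanged(b"Batch 43 released on 2024-01-15", imprint_at_t))
```

The comparison only proves something if the recorded imprint is itself stored where it cannot be silently rewritten together with the document.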
Ensuring data integrity therefore means that data stored or transmitted will always be reliable, consistent, accurate and verifiable no matter the length of time it is kept or the use to which it is put. It forms one aspect of data governance.
In practical terms, a number of questions lie behind this notion of data integrity: Do you know who has been making changes to your data? Are you in a position to identify all the changes that have affected your data? How can you prove to third parties that your data has not been compromised? How do you guarantee that your data is reliable and valid at any given point in time?
This post will deal with the integrity of data stored on digital storage media. However, data integrity can also apply to paper copies, for example. Similarly, the focus will be on database integrity rather than physical integrity.
Database integrity is usually broken down into four types: entity integrity, referential (or relationship) integrity, domain integrity, and user-defined integrity (i.e. specific rules that the other three types of integrity do not cover).
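To make the four types concrete, here is a small sketch using SQLite through Python's standard `sqlite3` module. The `customer` and `purchase` tables, their columns and the `violates` helper are all hypothetical and serve only as illustration. Each constraint is commented with the type of integrity it enforces.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            id    INTEGER PRIMARY KEY,           -- entity integrity: unique row identity
            email TEXT NOT NULL UNIQUE
        );
        CREATE TABLE purchase (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL
                        REFERENCES customer(id), -- referential integrity
            quantity    INTEGER NOT NULL
                        CHECK (quantity > 0),    -- domain integrity: allowed values
            ordered_on  TEXT NOT NULL,
            shipped_on  TEXT,
            -- user-defined integrity: a business rule spanning two columns
            CHECK (shipped_on IS NULL OR shipped_on >= ordered_on)
        );
    """)


def violates(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn = sqlite3.connect(":memory:")
create_schema(conn)
violates(conn, "INSERT INTO customer (id, email) VALUES (1, 'a@example.com')")
# Rejected: customer 99 does not exist, so the reference would dangle.
print(violates(conn, "INSERT INTO purchase (customer_id, quantity, ordered_on) "
                     "VALUES (99, 1, '2024-01-01')"))
```

Declaring the rules in the schema means the database rejects bad rows regardless of which application writes them.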
What types of digital data are involved?
Integrity is a matter for all types of digital data. Generally speaking, there are two main categories of data involved:
- Files: desktop documents, videos, images, etc. but also files linked to IS operation, such as event logs and configuration files, and indeed computer programs.
- Business and technical data held in databases
Focus on data integrity in the pharmaceutical quality system
Data integrity is a matter that affects all industries. The requirements are naturally all the more stringent when an industry generates critical data with a financial, human or strategic impact. The pharmaceuticals industry is therefore one of the areas most affected. Data integrity is of the utmost legal importance in pharmaceuticals.
Data integrity is considered, quite rightly, as a significant component in the pharmaceutical industry’s responsibility for ensuring the safety, effectiveness and quality of medicinal products, and the ability of health bodies to protect public health.
In France, Annex 11 (on computerised systems) of the Guide to Good Manufacturing Practice produced by the National Agency for the Safety of Medicines and Health Products (ANSM) addresses data integrity through several requirements: built-in checks to ensure the accuracy and security of data entries and processing, checks on data accuracy, the integrity of backed-up and archived data, the existence of an audit trail of changes, etc. It clearly states that risk management must encompass data integrity.
In the United States
In the United States, the Food and Drug Administration (FDA) also attaches great importance to data integrity. It is a crucial requirement of the pharmaceutical quality system described in its Good Manufacturing Practices (GMP). The acronym ALCOA for Attributable, Legible, Contemporaneous, Original and Accurate is used to define the five qualities needed to maintain data quality. The initial ALCOA principles have been supplemented by ALCOA+ which adds Complete, Consistent, Enduring and Available. Data consequently needs to show all nine attributes to be trustworthy.
What is a data integrity failure? What causes one, and what are the consequences?
An integrity failure occurs when data is destroyed or compromised. The impact might or might not be immediate. In some situations a failure might have no impact, for example when it affects a document retained only for legal archiving purposes. In other situations the consequences can be very serious.
Common causes of data compromise
- Attempted internal fraud, external perpetrators (e.g. cybercrime), computer viruses
- Technical flaws in the information system (e.g. a bug in an application that deletes the wrong data, inadequate data validation, etc.)
- Errors during data transfer or replication
- Human errors in data entry, use or manipulation
- Hardware hazards (fire, mechanical faults, etc.)
The consequences of data integrity failures
Such causes can result in inaccurate or incomplete data records, backdated data, inconsistent entries, deleted data, damaging changes, etc.
These failures ultimately pose a risk to the business:
- Strategic decisions taken based on erroneous data
- Lost productivity and time and money wasted on correcting errors, identifying causes, etc.
- Legal penalties
- Harm to brand image
How can data integrity be ensured? What are the best practices to follow?
Protecting data integrity means the ability to identify irregular or anomalous changes to data and, if necessary, to revert to a previous version of the data. Another aspect is the ability to prove that data has not been changed.
Data security is certainly one dimension to ensuring data integrity (data protection), but it is only one dimension. Security and integrity should not be conflated. Information systems management and fraud detection mechanisms are just as important.
Ensuring data integrity therefore requires:
- Reliable data gathering. All data entries must be checked and validated, and be consistent with the data dictionary.
- Checks on permissions and rights to access and edit data.
- Centralised databases with guaranteed uniqueness. Data integrity also requires that the data being used is the right data.
- Traceability of all changes made to data (additions, deletions and changes) and the availability of a complete, tamper-proof history.
- Confidence that data is backed up and can be restored.
- Periodic generation and review of audit trails covering data deletion, job failures, compliance tests, backdating, changes, etc.
- Staff training, preparation and involvement. People are often the weakest link. Procedures must be documented, and rules and obligations set out, to ensure the issues are understood. Suppliers and partners also have a role to play in the data integrity chain. Responsibilities and checks to be carried out should be determined with the relevant people, together with communication procedures and how they apply to IT systems.
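The first requirement in the list above, checking entries against the data dictionary, might look like the following sketch. The dictionary contents, field names and the `validate` function are hypothetical. A real data dictionary would normally live in a repository or schema rather than in code.

```python
from datetime import date

# Hypothetical data dictionary: field -> (expected type, required?, extra rule)
DATA_DICTIONARY = {
    "batch_id": (str, True, lambda v: len(v) > 0),
    "quantity": (int, True, lambda v: v >= 0),
    "recorded_on": (date, True, lambda v: v <= date.today()),  # no post-dating
    "comment": (str, False, lambda v: len(v) <= 200),
}


def validate(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is accepted."""
    errors = []
    # Fields that the dictionary does not know about are rejected.
    for field in record:
        if field not in DATA_DICTIONARY:
            errors.append(f"{field}: not in data dictionary")
    for field, (expected_type, required, rule) in DATA_DICTIONARY.items():
        value = record.get(field)
        if value is None:
            if required:
                errors.append(f"{field}: missing")
            continue
        # Strict type match, so that e.g. True is not accepted as an int.
        if type(value) is not expected_type:
            errors.append(f"{field}: expected {expected_type.__name__}")
        elif not rule(value):
            errors.append(f"{field}: value {value!r} breaks the rule")
    return errors


entry = {"batch_id": "B-001", "quantity": "10", "operator": "jdoe"}
for problem in validate(entry):
    print(problem)
```

Returning every problem at once, rather than stopping at the first, gives the person entering the data a complete picture of what needs correcting.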
Data governance to ensure data integrity
Data integrity is a state as well as a process. At Blueway, we firmly believe that it is based on fitting the various aspects of data interchange together to provide data governance, including master data repositories, processes and the circulation of data between internal applications and those external to the IS. Adherence to information system procedures is not optional if data accuracy, traceability and changes are to be checked throughout the data lifecycle.
We also attach great importance to the people factor. Gaps between what users require and what IT systems provide can be a source of integrity failures (use of workaround solutions, security issues, etc.).
In addition, traceability and non-repudiation of changes are not always enough to ensure the integrity of data within the database. It is always possible for someone with the necessary permissions to go directly into the database to delete or edit the data it holds. Tamper-proofing is also essential, which is why Blueway has taken the innovative step of using blockchain in its Master Data Management.
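The core idea behind blockchain-style tamper-proofing is that each history entry carries a hash of the previous one, so a direct edit or deletion breaks the chain. The sketch below illustrates that general principle only. It is not Blueway's implementation, and the entry layout and the names `append_entry` and `verify_chain` are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list, user: str, action: str, record_id: int) -> None:
    """Append a change record linked to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    payload = {"user": user, "action": action, "record_id": record_id}
    log.append({"prev": prev_hash, "payload": payload,
                "hash": _entry_hash(prev_hash, payload)})


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited, removed or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


history = []
append_entry(history, "alice", "create", 1)
append_entry(history, "bob", "update", 1)
print(verify_chain(history))

# Someone with database access rewrites the first entry directly.
history[0]["payload"]["user"] = "mallory"
print(verify_chain(history))
```

A chain held in a single place can still be rebuilt from scratch by whoever controls it. The latest hash therefore has to be anchored somewhere that person cannot rewrite, and a distributed ledger is one way of providing that anchor.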