Data Management and Governance in Artificial Intelligence
Artificial intelligence (AI) and data-driven decision-making are two of the most discussed topics in the pharmaceutical and medical device industries. AI has the potential to revolutionise both internal operations and external applications, while data-driven decisions make project and business management more efficient.
Both AI and data-driven processes rely on vast amounts of data. The quality, integrity, and governance of this data are critical to ensuring that AI models operate effectively and generate reliable results. The pharmaceutical industry must adopt stringent measures to obtain, manage, and govern data throughout its entire lifecycle.
Key Considerations for Data Management in GxP and Medical Device AI Algorithms
To develop AI models that meet regulatory and operational requirements, organisations must ensure the following:
Reliable Data Sources: Datasets must be sourced from credible and validated origins to maintain accuracy and compliance.
Balanced Data Representation: Data must cover the entire expected range of use for the AI algorithm. If parts of that range are underrepresented in the dataset, data augmentation techniques can be used to help fill the gaps.
Secure and Unchanged Data Storage: A copy of the original dataset should be stored securely and remain unaltered to ensure traceability.
Data Cleansing, Homogenisation & Annotation: Data preparation activities should be fully traceable, and any modification must be based on a documented scientific rationale.
Training and Verification Data Separation: Data used for training AI models must not be reused for verification purposes. Instead, a separate test dataset must be created and kept isolated to ensure objective evaluation.
Robust Security Measures: Preventing unauthorised access is essential, particularly for data protected under regulations such as GDPR. Stringent security protocols must be implemented to safeguard sensitive information.
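Two of the points above, keeping an unaltered copy of the original data and isolating the test dataset, lend themselves to a short illustration. The following is a minimal Python sketch, not a validated implementation: the function names, the record ID format, and the 20% test fraction are assumptions made for the example. It records a SHA-256 fingerprint of the original dataset so that later alterations can be detected, and it assigns each record to the training or test set from a hash of its ID, so that a given record always lands in the same set.

```python
import hashlib


def fingerprint(raw_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the original dataset contents.

    Storing this value alongside the archived copy makes it possible to
    check later that the copy has not been altered.
    """
    return hashlib.sha256(raw_bytes).hexdigest()


def assign_split(record_id: str, test_fraction: float = 0.2) -> str:
    """Deterministically assign a record to 'train' or 'test' from its ID.

    The ID is hashed and mapped to a number in [0, 1); records falling
    below test_fraction go to the test set. The same ID always gets the
    same answer, so a record cannot drift from one set to the other
    when the split is recomputed.
    """
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


def split_records(record_ids, test_fraction: float = 0.2):
    """Split a collection of record IDs into (train, test) lists."""
    train, test = [], []
    for record_id in record_ids:
        if assign_split(record_id, test_fraction) == "test":
            test.append(record_id)
        else:
            train.append(record_id)
    return train, test
```

In practice the fingerprint would be computed over the archived file and kept in a controlled record, and the split would be keyed on whatever unit must stay together (for example a patient or batch identifier rather than an individual measurement), so that related records do not end up on both sides.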
Standard Operating Procedures (SOPs) for AI Data Management
Before engaging in AI-driven projects, companies should establish clear SOPs to define data governance processes. These procedures should include:
Guidelines for data acquisition, storage, and management.
Protocols for updating datasets, including the addition of new information and the handling of outdated or invalidated data.
Policies addressing the impact of dataset changes on AI models that have already been trained.
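As an illustration of the last point, one possible approach is to record which dataset version each model was trained on and to flag models whenever the dataset changes. The sketch below is written in Python under stated assumptions: the class names, fields, and example values are hypothetical and do not come from any specific SOP or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetVersion:
    """One released version of a dataset (hypothetical record layout)."""
    version: str
    sha256: str       # fingerprint of the dataset contents
    change_note: str  # rationale for the change, e.g. data invalidated


@dataclass
class ModelRecord:
    """A trained model and the dataset fingerprint it was trained on."""
    name: str
    trained_on: str   # sha256 of the dataset version used for training


def models_needing_review(models, current: DatasetVersion):
    """Return names of models trained on anything but the current dataset."""
    return [m.name for m in models if m.trained_on != current.sha256]
```

A model flagged this way is not necessarily invalid. The flag only indicates that the change should be assessed according to the procedure the SOP defines, for instance by deciding whether retraining or re-verification is required.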
How Rephine Can Help
At Rephine, we have extensive experience in ensuring proper data and AI system management, from individual projects to company-wide policies.
If your organisation is looking to implement AI in a GxP or medical device environment, we can support you in navigating this complex landscape to ensure compliance, efficiency, and data integrity.
Ready to Find Out More?
Contact us to learn more about how we can assist you in managing AI and data governance effectively.

About the Author
This blog was authored by Sergi Arcas, Pharmaceutical Consultant at Rephine. For more insights on regulatory compliance and AI integration in the pharmaceutical industry, contact us at Rephine.