
Welcome to the sDHT Adoption Library, featuring NaVi
NaVi is a closed-environment AI research assistant that draws on a carefully curated library of more than 300 vetted documents, including FDA guidance and industry best practices. NaVi helps you search and explore content across the sDHT Adoption Library and Roadmap using natural language questions.
The Library is intended to serve as a living resource. Content is added periodically as new guidance, standards, and peer-reviewed research are released.
Meet NaVi: Your AI-Powered Research Assistant
Library scope and selection
To ensure high-quality, relevant results, the Library follows a predefined scoping approach:
- Inclusions: FDA guidance, non-commercial standards, and peer-reviewed research (2018–Present) focused on sDHTs being used as measurement tools for medical products in U.S.-based clinical trials.
- Exclusions: Materials from single commercial entities, non-U.S. regulatory bodies (except select EMA guidances with direct U.S. cross-relevance), and conference proceedings.
Inclusion in the Library does not imply endorsement, completeness, or regulatory acceptability.
Library scope
Resources in the sDHT Adoption Library are identified using a predefined scoping approach and include publicly available FDA guidance, non-commercial standards and guidance, and peer-reviewed research relevant to sDHT use in U.S.-based clinical trials. Materials from single commercial entities, non-U.S. regulatory bodies, conference proceedings, and studies conducted exclusively outside the United States are excluded; inclusion does not imply endorsement or regulatory acceptability.
Last updated 2026: Library content is reviewed and updated on a periodic basis as new eligible materials become available.
A Hierarchical Framework for Selecting Reference Measures for the Analytical Validation of Sensor-Based Digital Health Technologies
The quality of evidence for the analytical validation of sensor-based digital health technologies (sDHTs), which is the evaluation of algorithms converting sensor data into a clinically interpretable measure, is often inconsistent and insufficient. The existing V3+ framework codifies the overall evaluation process, which includes verification, usability validation, analytical validation, and clinical validation. To improve the scientific rigor of analytical validation, a hierarchical framework for selecting reference measures is needed because not all potential reference measures are of equal quality. The framework classifies reference measures based on attributes that contribute to reduced measurement variability, with defining and principal measures being the most rigorous due to objective data acquisition and the ability to retain source data.
Recommendations
The proposed framework sequentially moves the investigator through four steps: (1) Compile preliminary information, including the digital clinical measure, context of use (COU), algorithm requirements, and sensor verification evidence. (2) Select an existing reference measure, develop a novel comparator, or identify a set of anchor measures, prioritizing measures with the highest scientific rigor (defining → principal → manual → reported). (3) Consider the impact of the data collection environment to determine whether the analytical validation study can be conducted in the intended use environment with the highest-order measure or whether in-lab validation is necessary, ensuring the results are generalizable. (4) Describe the rationale for key study design decisions to encourage transparency for evaluators, regulators, and payers. Investigators must justify passing over a higher-ranked reference measure, which is generally acceptable only if the higher-ranked measure poses unacceptable risk or is not applicable to the context of use.
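The ranked selection in step (2) can be sketched in a few lines of Python. This is a minimal illustration, not the framework authors' tooling: the `Rank`, `Candidate`, and `select_reference_measure` names and the example measures are all hypothetical, and the two boolean flags simply encode the two acceptable reasons the summary gives for passing over a higher-ranked measure.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rank(IntEnum):
    """Reference-measure categories; a lower value means higher rigor."""
    DEFINING = 1
    PRINCIPAL = 2
    MANUAL = 3
    REPORTED = 4


@dataclass
class Candidate:
    """Hypothetical record for one candidate reference measure."""
    name: str
    rank: Rank
    applicable_to_cou: bool   # applicable to the context of use?
    unacceptable_risk: bool   # would using it pose unacceptable risk?


def select_reference_measure(candidates):
    """Pick the highest-ranked usable measure and record why others were skipped."""
    justifications = []
    for c in sorted(candidates, key=lambda c: c.rank):
        if c.applicable_to_cou and not c.unacceptable_risk:
            return c, justifications
        reason = ("unacceptable risk" if c.unacceptable_risk
                  else "not applicable to the context of use")
        justifications.append(
            f"Passed over {c.name} ({c.rank.name.lower()}): {reason}")
    # No usable existing measure: a novel comparator or anchor measures are needed.
    return None, justifications


# Illustrative, made-up candidates
candidates = [
    Candidate("hypothetical measure C", Rank.REPORTED, True, False),
    Candidate("hypothetical measure A", Rank.DEFINING, True, True),
    Candidate("hypothetical measure B", Rank.MANUAL, True, False),
]
chosen, notes = select_reference_measure(candidates)
print(chosen.name)
for note in notes:
    print(note)
```

Returning the list of justifications alongside the choice mirrors step (4): every higher-ranked measure that was skipped leaves a written rationale.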
Regulatory Considerations
The principles of the framework for analytical validation apply regardless of the regulatory status of the sDHT (regulated medical device, low-risk general wellness app, or research product) or its intended use (clinical care or clinical research). The framework is intended to help investigators support the most rigorous claims regarding sDHT performance, which is important for acceptance by evaluators, peer reviewers, regulators, and payers. Whether the digital clinical measure is categorized as a digital biomarker or an electronic clinical outcome assessment also does not change the framework's applicability.
Some summaries are generated with the help of a large language model; always view the linked primary source of a resource you are interested in.
Advancing the Integration of Digital Health Technologies in the Drug Development Ecosystem
Findings
The rapid advancement of sensor technology and connectivity has enabled high-frequency, longitudinal monitoring of physiological processes, yet the infrastructure for large-scale deployment remains resource-intensive. Current challenges include a lack of standardized terminology for digital decision-making tools and significant variability in environmental factors that affect sensor performance. Proprietary algorithms and device-specific barriers often hinder the verification and validation processes necessary for regulatory approval. Additionally, there is a distinct gap between granular digital features and their clinical relevance or meaningfulness to patients. Ethical concerns are emerging around data management, patient anxiety in psychiatric contexts, and the responsibility for addressing adverse events detected by remote monitoring.
Recommendations
Stakeholders should develop consensus-driven frameworks for standardized device performance reporting and environmental testing to streamline evaluations for specific contexts of use. The community should adopt a modular approach to data standards that bins requirements by concept of interest and disease-specific needs. Collaborative efforts between patients and developers are essential to bridge the gap between technical metrics and meaningful aspects of health. It is recommended to implement "bring-your-own-device" (BYOD) frameworks that ensure data reliability while supporting the inevitable evolution of technology during long-term studies. Researchers and clinicians must be trained in the ethical, legal, and social implications of digital health technology use, particularly regarding data privacy and the management of remote-detected safety signals.
Regulatory Considerations
Digital health technologies used to collect endpoints must meet high evidentiary requirements for validation, with complexity increasing when multiple sensors or complex software are bundled. Regulatory agencies like the FDA and EMA have established pathways for the qualification of drug development tools, including biomarkers and clinical outcome assessments. Integration of new draft guidance on remote health monitoring with existing regulatory workflows is necessary to reduce uncertainty in trial evaluations. While many digital health technologies do not qualify as medical devices unless they have a specific medical purpose, synergies between device risk assessments and drug trial data integrity frameworks should be explored. Early engagement with regulators remains a critical step for obtaining feedback on novel digital endpoints and ensuring the suitability of evidentiary support.
Artificial Intelligence in Software as a Medical Device
The traditional medical device regulatory paradigm is not designed for the adaptive nature of AI/ML technologies, which can learn and change after they are on the market. A key benefit of AI/ML is its ability to improve performance by learning from real-world data, but this also presents a unique regulatory challenge. To ensure patient safety and device effectiveness, a new, flexible regulatory framework is required that can accommodate these iterative improvements. Transparency and robust monitoring are essential to manage the risks associated with evolving algorithms.
Recommendations
The FDA proposes a "Predetermined Change Control Plan" (PCCP) to be included in premarket submissions. This plan would specify the anticipated modifications to the device (the "what") and the methodology for implementing and validating those changes (the "how"). The development of "Good Machine Learning Practice" (GMLP) is encouraged to ensure that AI/ML algorithms are developed and validated using best practices. Manufacturers should implement robust real-world performance monitoring to ensure that their devices remain safe and effective after deployment.
Regulatory Considerations
The FDA is developing a new regulatory framework tailored to the unique aspects of AI/ML-based SaMD, which will leverage a total product lifecycle (TPLC) approach. The agency has issued an "AI/ML SaMD Action Plan" that outlines its multi-pronged approach, including issuing draft guidance on PCCPs and promoting the harmonization of GMLP. The FDA is actively collaborating with stakeholders to foster innovation while ensuring patient safety. The agency maintains a public list of authorized AI/ML-enabled medical devices to enhance transparency.
Biomarker Qualification Program
The traditional process of evaluating biomarkers within the context of a single drug development program is inefficient and creates uncertainty for sponsors. This case-by-case approach leads to redundant efforts, slows down the development of novel therapies, and hinders the broad adoption of promising scientific tools. There is a clear need for a centralized, collaborative pathway to formally validate biomarkers, which can de-risk drug development, encourage innovation, and make the process more predictable and cost-effective for all stakeholders.
Recommendations
Drug developers, academic researchers, and other stakeholders should proactively engage with the FDA through the formal Biomarker Qualification Program to validate biomarkers for specific contexts of use. It is recommended to form public-private partnerships and other collaborations to pool resources and data, which strengthens the evidence package for a biomarker's utility. Developers should use the qualification process to establish a biomarker's value early, making it a publicly available and reliable tool that can accelerate the development of multiple drug products.
Regulatory Considerations
The Biomarker Qualification Program provides a distinct regulatory pathway for establishing a biomarker's validity for a specific Context of Use (COU), separate from an individual Investigational New Drug (IND) or New Drug Application (NDA). The process involves a three-stage submission and review cycle: the Letter of Intent, the Qualification Plan, and the Full Qualification Package. Once qualified, a biomarker is publicly listed and can be incorporated into multiple drug development programs without the need for sponsors to re-submit and re-justify the validation data for that specific COU, streamlining subsequent regulatory reviews.
Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products, Draft, 2025 (FDA)
The document introduces a risk-based credibility assessment framework for establishing and evaluating the credibility of an Artificial Intelligence (AI) model's output when used to support regulatory decisions regarding drug safety, effectiveness, or quality. The framework outlines a 7-step process beginning with defining the question of interest and the Context of Use (COU). Credibility is defined as trust, established through evidence, in the AI model's performance for a particular COU. The credibility assessment is tailored to the AI model risk, which is a combination of model influence (the AI model's evidence contribution relative to other evidence) and decision consequence (the significance of an adverse outcome from an incorrect decision). The document highlights challenges with AI use, including variability in development datasets (training/tuning), the need for methodological transparency due to model complexity, difficulty in quantifying and interpreting uncertainty in model output, and the potential for performance change over time (data drift), which necessitates life cycle maintenance.
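To make the idea of model risk as a combination of model influence and decision consequence concrete, here is a minimal Python sketch. The scoring rule is purely an illustrative assumption: it adds the two ordinal levels and buckets the sum. It is not the matrix in the FDA draft guidance, and the `model_risk` function name is hypothetical.

```python
LEVELS = ("low", "medium", "high")


def model_risk(model_influence: str, decision_consequence: str) -> str:
    """Combine two ordinal ratings into one risk level.

    ASSUMPTION: the additive rule below is illustrative only and is not
    taken from the guidance; consult the primary source for the actual
    risk determination.
    """
    # LEVELS.index raises ValueError for anything other than low/medium/high.
    score = LEVELS.index(model_influence) + LEVELS.index(decision_consequence)
    if score <= 1:
        return "low"
    if score == 2:
        return "medium"
    return "high"


for influence in LEVELS:
    for consequence in LEVELS:
        print(f"{influence:>6} influence, {consequence:>6} consequence -> "
              f"{model_risk(influence, consequence)}")
```

Whatever rule is used in practice, the point of the combination is that the depth of the credibility assessment scales with the resulting level.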
Recommendations
Sponsors and interested parties should define the question of interest and clearly define the COU, detailing the AI model's specific role and scope and whether other information will be used. They should assess the AI model risk (low, medium, or high) to ensure that subsequent credibility assessment activities (Step 4) are commensurate with that risk and tailored to the COU. For Step 4, the credibility assessment plan should include a description of the model, model development process (including inputs, architecture, feature selection, and rationale), and data used (training and tuning data). Development data must be deemed fit for use (relevant and reliable) to mitigate issues like algorithmic bias. The plan should also detail the model evaluation process using independent test data and include performance metrics with confidence intervals, an estimate of uncertainty, and a description of model limitations. Early engagement with the FDA is strongly encouraged to discuss model risk and the adequacy of the credibility assessment plan.
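As one example of reporting a performance metric with a confidence interval, the sketch below computes a Wilson score interval for a proportion-type metric (such as sensitivity or accuracy) measured on independent test data. The choice of the Wilson method and the counts used are assumptions made for illustration; the guidance summary does not prescribe a particular interval method.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximately 95% two-sided interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical result: 90 correct predictions out of 100 held-out test cases
low, high = wilson_interval(90, 100)
print(f"accuracy = 0.90, 95% CI = ({low:.3f}, {high:.3f})")
```

The interval width shrinks as the test set grows, which is one way to show whether the amount of independent test data is adequate for the model's risk level.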
Regulatory Considerations
The risk-based credibility assessment framework is intended to help organize and document information for regulatory submissions. The required stringency of assessment activities and the level of documentation should be commensurate with the AI model risk. For AI models whose performance can change over time (e.g., in pharmaceutical manufacturing or postmarketing), sponsors must implement life cycle maintenance plans to monitor performance and manage changes in a risk-based manner. Changes to AI models should be evaluated through the manufacturer's change management system and may require re-execution of parts of the credibility assessment plan. Early engagement can be facilitated through formal meetings (e.g., Pre-IND) or other specialized programs listed in the guidance, such as the Center for Clinical Trial Innovation (C3TI), the Model-Informed Drug Development (MIDD) Paired Meeting Program, the Emerging Technology Program (ETP), or the CBER Advanced Technologies Team (CATT).
Digital Measures: De-risking Cytokine Release Syndrome (CRS)
Cytokine Release Syndrome (CRS) is a common and potentially life-threatening adverse event of immunotherapies, particularly in oncology, complicating patient care and increasing healthcare costs. Standard-of-care inpatient monitoring for CRS is manual, intermittent, costly, and restrictive, providing an incomplete view of the syndrome’s development and progression. The use of Digital Health Technologies (DHTs) for continuous, remote monitoring of vital signs (such as heart rate, respiratory rate, skin temperature, SpO2, and activity) can capture early indicators of CRS up to two hours earlier than standard episodic monitoring. This ability to collect multivariate continuous data is valuable for informing robust model development for CRS risk prediction.
Recommendations
Investigators should deploy DHTs available today to monitor vital signs and symptoms currently observed in the hospital setting, but in an outpatient or home environment. The goal is to develop Early Warning Products that assess the probability of developing CRS, providing clinical decision support. Product developers should follow a strategic roadmap that outlines milestones for building products that are clinically relevant and commercially viable. Researchers should use a common set of digital clinical measures to gather high-quality datasets and ensure comparability across studies to build more robust predictive models. Predictive algorithms should be built on a robust reference measure for analytical validation and be clinically validated with sufficient data.
Regulatory Considerations
The resources are designed to help developers build products that are clinically appropriate, acceptable to regulators, and commercially viable. Future regulatory submissions for CRS de-risking products will benefit from aligning with this industry-wide dialogue, which is being built in collaboration with the FDA. Developing a robust CRS safety biomarker could enhance the safety profile of clinical trials, increase trial access, and streamline regulatory decision-making, possibly through a qualification pathway. Products that aim for a higher level of autonomy, such as a Diagnostic that redefines current CRS grading classes, may require a very high level of clinical evidence and will likely face stringent regulatory review.
List of qualified DDTs
The database provides a transparent and accessible way for the public to track the progress of various Drug Development Tools (DDTs) through the FDA's qualification pipeline. This includes biomarkers, clinical outcome assessments, and animal models. The information available, such as submission status and supporting documentation, offers insight into the types of tools being developed and the evidence required for their qualification. The platform reveals that a wide range of tools are in development across numerous therapeutic areas, highlighting active areas of research and innovation in drug development.
Recommendations
Stakeholders in the drug development ecosystem are encouraged to utilize this database to inform their research and development strategies. By reviewing the status of existing DDT submissions, sponsors can identify opportunities for collaboration, avoid duplicative efforts, and better understand the evidentiary requirements for tool qualification. Prospective tool developers should use the database to learn from successful submissions and to align their own development plans with FDA expectations.
Regulatory Considerations
This database is a direct implementation of the transparency provisions of the 21st Century Cures Act. The public availability of this information is intended to foster trust and collaboration in the DDT qualification process. By providing a clear view of the regulatory journey of various tools, the FDA aims to standardize the qualification process and encourage the development and use of novel, validated tools in drug development. Users of the database should be aware that the information reflects the status of a DDT at a particular point in time and that the qualification process is an iterative one.
Medical Device Development Tool (MDDT) Summary of Evidence and Basis of Qualification – Apple Atrial Fibrillation History Feature
Clinically Acceptable Performance: A clinical study demonstrated that the weekly AFib burden estimates from the Apple AFib History Feature were in close agreement with a reference ECG patch, with an average difference of just 0.67%. The vast majority of measurements had paired differences within ±10% of the reference device.
Generalizable Across Subgroups: The device's accuracy was similar across various subgroups, including different sexes, races, ages, and skin tones.
Performance Post-Ablation is Uncertain: In a small subgroup of patients with a prior cardiac ablation, the device's performance, while still strong, showed slightly more variability and exceeded a pre-specified acceptance criterion. The study was not designed or powered to demonstrate equivalent performance in this specific group.
Technical Limitations Exist: The feature only provides a retrospective weekly estimate and does not give specific timestamps or durations of AFib episodes. It also does not detect other atrial tachyarrhythmias, like atrial flutter.
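The two agreement statistics mentioned in the findings above (average paired difference and the share of paired differences within ±10%) can be computed as in the following Python sketch. The burden values are made-up illustrative numbers, not study data; the `agreement_summary` name is hypothetical; and the sketch assumes the ±10% tolerance is expressed in percentage points of weekly AFib burden.

```python
from statistics import mean


def agreement_summary(device, reference, tolerance=10.0):
    """Summarize paired agreement between device and reference estimates.

    Inputs are weekly AFib burden values in percent. ASSUMPTION: the
    tolerance is in percentage points of burden.
    """
    if len(device) != len(reference) or not device:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [d - r for d, r in zip(device, reference)]
    within = sum(abs(x) <= tolerance for x in diffs) / len(diffs)
    return {"mean_difference": mean(diffs),
            "fraction_within_tolerance": within}


# Hypothetical paired weekly burden estimates (percent); not real study data
device_burden = [12.0, 30.5, 0.0, 55.0]
reference_burden = [10.0, 31.0, 1.5, 40.0]
summary = agreement_summary(device_burden, reference_burden)
print(summary)
```

Reporting both numbers matters because a small mean difference can hide large individual disagreements that cancel out, which the within-tolerance fraction exposes.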
Recommendations
Appropriate Use: The document implicitly recommends using the tool precisely within its qualified context of use—as a secondary, not primary, endpoint for comparing AFib burden between study arms in cardiac ablation device trials.
Supplemental Data Collection: For studies involving patients who have had a prior ablation, it would be beneficial to assess the tool alongside other methods of determining AFib burden to better characterize its performance in this population.
Define Study-Specific Endpoints: Investigators using the tool are responsible for defining and justifying their specific study designs and what constitutes a clinically significant reduction in AFib burden.
Regulatory Considerations
MDDT Qualification: The Apple AFib History Feature is officially qualified by the FDA as a Medical Device Development Tool (MDDT), which reduces the burden on device developers, as they no longer need to independently justify its methodology for collecting weekly AFib burden estimates in their clinical studies.
Secondary Endpoint Only: A key limitation for its regulatory use is its qualification only as a secondary endpoint. It cannot, by itself, be used to evaluate the primary safety and effectiveness of cardiac ablation devices. This is partly because FDA typically requires the inclusion of any atrial tachyarrhythmia (not just AFib) for defining ablation success in pivotal studies.
Not a Replacement for Primary Endpoints: The tool's utility is intended to provide supplemental data and help better understand post-treatment AFib burden; it is not meant to replace more clinically well-defined primary endpoints.
Medical Device Development Tools (MDDT)
The development and evaluation of medical devices require scientifically plausible and reliable tools for collecting data to support regulatory submissions. A lack of standardized, pre-vetted tools can lead to inefficiencies and unpredictability in the device development and review process. The qualification of development tools can be applied across a wide range of device areas, including cardiovascular, neurology, imaging, and cybersecurity. The evidence required for tool qualification must be robust enough to support its intended context of use.
Recommendations
Tool developers, medical device sponsors, research organizations, and academic institutions are encouraged to voluntarily submit proposals to the MDDT program to qualify their tools. Submissions should include a detailed description of the tool, a clearly defined context of use (COU), specific performance criteria, and a comprehensive plan for collecting evidence to validate the tool's performance and scientific plausibility. Collaboration in developing tools and supporting evidence is recommended to pool resources and increase the acceptance of qualified tools.
Regulatory Considerations
The MDDT program is a formal regulatory mechanism for the FDA to qualify tools that can be used to support assessments of medical device safety, effectiveness, or performance. Once a tool is qualified for a specific context of use, the FDA accepts assessments from that tool in support of regulatory submissions without needing to re-evaluate the tool's suitability. The program recognizes four main categories of tools: Non-clinical Assessment Models (NAM), Biomarker Tests (BT), Clinical Outcome Assessments (COA), and an "Other" category for tools that do not fit the primary classifications.
Patient-Focused Drug Development: Selecting, Developing, or Modifying Fit-for-Purpose Clinical Outcome Assessments
The guidance applies to four types of Clinical Outcome Assessments (COAs): Patient-Reported Outcomes (PROs), Observer-Reported Outcomes (ObsROs), Clinician-Reported Outcomes (ClinROs), and Performance Outcomes (PerfOs). A COA is considered fit-for-purpose when the validation evidence is sufficient to support its context of use (COU). To determine if a COA is fit-for-purpose, sponsors must clearly describe the Concept of Interest (COI) and the COU, and present sufficient evidence to support a clear rationale for the COA's proposed interpretation and use. The rationale for using a COA should include up to eight components, such as justification for the COA type, capturing the important parts of the COI, appropriate administration and scoring, minimal influence from irrelevant factors or measurement error, and correspondence with the Meaningful Aspect of Health (MAH). The most direct assessment of how a patient feels or functions (MAH) should be used as the COI whenever possible.
Recommendations
Sponsors should use the Roadmap to Patient-Focused Outcome Measurement to guide the selection, modification, or development of a COA. The process begins with understanding the disease/condition (including patient perspectives) and conceptualizing clinical benefits and risks (defining the MAH, COI, and COU). When feasible, existing COAs are generally preferred, especially for well-established COIs, as this approach is often the least burdensome. If an existing COA is modified or used in a different context, additional evidence (e.g., cognitive interviews, psychometric studies) must be collected to justify its fitness for the new context of use. For new COA development, sponsors should involve patients, document all steps, and generally avoid using the new COA for the first time in a registration (pivotal) trial; a standalone observational study or early phase trial is recommended for evaluation.
Regulatory Considerations
Sponsors are encouraged to interact early and throughout medical product development with the relevant FDA review division to ensure COAs are appropriate for the intended COU. Sponsors should communicate their proposed COA-based endpoint approach, including the MAH, COI, COA type/name/score, and the final COA-based endpoint, ideally using the suggested format. The type and amount of evidence required to support the rationale for a COA's use is weighed against the degree of uncertainty regarding that part of the rationale. For ClinROs, it is recommended to use an assessor masked to treatment assignment and study visit for primary endpoints, if feasible. FDA strongly discourages proxy-reported measures for concepts known only to the patient (e.g., pain) and recommends using an ObsRO to measure observable behaviors instead when the patient cannot self-report.
Recommendations
Clearly define the concept of interest and its context of use to ensure COAs align with trial objectives.
Use conceptual and measurement frameworks to communicate how COAs measure patient experiences and generate interpretable scores.
Leverage existing COAs where possible, modifying them only when justified, and document all modifications rigorously.
Ensure COAs are accessible and inclusive, incorporating features like large fonts, touch interfaces, or audio assistance for diverse populations.
Conduct early engagement with FDA to discuss COA selection, development, and validation plans.
Regulatory Considerations
Fit-for-purpose validation requires evidence of conceptual alignment, scoring reliability, and sensitivity to clinically meaningful changes.
Digital health technologies used for COAs must comply with FDA’s guidance on data integrity, usability, and technical performance.
COAs intended for regulatory submissions must be developed and validated before pivotal trials to avoid jeopardizing trial outcomes.
Modifications to COAs or scoring methods during trials necessitate justification and revalidation.
Sponsors should submit comprehensive documentation on COA development, including scoring algorithms and item tracking matrices.
Regulatory considerations for successful implementation of digital endpoints in clinical trials for drug development
Regulatory Acceptance is Complex: Gaining regulatory acceptance for endpoints derived from Digital Health Technologies (DHTs) is a lengthy, multifaceted, and costly process that requires a global strategy and early health authority consultation.
"Fit-for-Purpose" is Key: A DHT's clearance or approval as a medical device does not automatically ensure it is fit-for-purpose in a clinical trial; its intended use must align with the specific context of use (COU) in the study.
Meaningfulness is a Hurdle: Demonstrating the clinical meaningfulness of novel digital endpoints, especially for abstract concepts like cognitive decline in Alzheimer's Disease, remains a significant challenge for regulatory acceptance.
International Harmonization is Lacking: Differences in regulatory requirements for DHT validation between major health authorities can delay or prevent the successful implementation of digital measures in global clinical trials.
Technology Changes Pose Risks: Software and hardware updates to DHTs during a clinical trial can have significant implications, potentially invalidating study results if not managed through a predetermined change-control plan.
Recommendations
Engage Health Authorities Early and Often: Sponsors should conduct multiple consultations with major health authorities (e.g., FDA, EMA) early in the development process to align on the Concept of Interest (COI), COU, and the validation roadmap.
Develop a Comprehensive Regulatory Strategy: A global regulatory strategy should be an integral part of the overall development plan, tailored to the program's objectives and endpoint hierarchy.
Establish "Fit-for-Purpose" Criteria: Before selecting a DHT, sponsors should establish the minimum technical and performance specifications required for the specific COU to guide the selection of a fit-for-purpose device.
Create a Conceptual Framework: For novel endpoints, sponsors should develop a conceptual framework that visualizes how the DHT-derived measure relates to meaningful health concepts and patient experiences.
Plan for Change and Missing Data: Sponsors should establish predetermined change-control plans with manufacturers to manage DHT updates and create risk management plans to minimize and handle missing data from remote acquisition.
Regulatory Considerations
Distinct Pathways in US vs. EU: The US FDA uses a risk-based approach for DHTs that are medical devices, while in Europe, CE marking for the intended COU is generally expected by the EMA.
Qualification is an Option, Not a Requirement: Both the FDA and EMA offer voluntary qualification programs for Drug Development Tools (DDTs), which can validate a DHT for a specific COU across multiple drug programs, though the process is resource-intensive.
Scientific Advice for Individual Programs: For DHTs used within a single drug development program, engaging with health authorities through scientific advice meetings is a more targeted and confidential pathway for gaining feedback and agreement.
Data Privacy and Security are Paramount: Sponsors must ensure that the collection, transfer, and storage of personal data via DHTs comply with all applicable regulations, such as GDPR in the EU, including cybersecurity and data transfer measures.
Using Artificial Intelligence & Machine Learning in the Development of Drug & Biological Products: Discussion Paper and Request for Feedback, 2025 (FDA)
Artificial Intelligence (AI) and Machine Learning (ML) are being applied to a broad range of drug development activities, with the potential to accelerate the process and make clinical trials safer and more efficient. In regulatory submissions, AI/ML appears most often in the clinical development/research phase. Concerns exist that AI/ML algorithms could amplify errors and preexisting biases in underlying data sources, which raises issues related to generalizability and ethical considerations. Other challenges include limited explainability due to model complexity and proprietary reasons, as well as managing risks related to data quality, reliability, and representativeness. The FDA recognizes that a careful, risk-based assessment of the specific context of use (COU) is needed when evaluating AI/ML.
Recommendations
Stakeholders should adhere to practices in three key areas: human-led governance, accountability, and transparency; quality, reliability, and representativeness of data; and model development, performance, monitoring, and validation. A risk management plan should be applied to identify and mitigate risks based on the COU, guiding the level of documentation and transparency. Practices are needed to ensure the integrity of AI/ML and address issues like bias and missing data. For models, developers should use pre-specification steps and clear documentation for development and assessment criteria. Models must be monitored over time for reliability and consistency, and Real-World Data (RWD) performance can provide valuable feedback, including for potential re-training.
Regulatory Considerations
The FDA encourages early engagement through mechanisms like the Critical Path Innovation Meetings (CPIM), ISTAND Pilot Program, and Emerging Technology Program to discuss relevant AI/ML methodologies or technologies. The Verification and Validation (V&V 40) risk-informed credibility assessment framework and the principles for Good Machine Learning Practices (GMLP), while not specific to drug development, are helpful guides for evaluating models. The industry is exploring the use of a Predetermined Change Control Plan (PCCP) mechanism for AI/ML-based devices to proactively specify and manage modifications, enhancing adaptability. In general, a risk-based approach should guide the level of evidence and record keeping needed for the verification and validation of AI/ML models for a specific COU.