Summary of Short Course on Unleashing the Power of Machine Learning and Deep Learning to Accelerate Clinical Development at the 2024 Regulatory-Industry Statistics Workshop
Li Wang (AbbVie), Sheng Zhong (AbbVie), Yunzhao Xing (AbbVie)
With the rapid advancement of machine learning (ML) and deep learning (DL) over the past decades, Artificial Intelligence (AI) has profoundly transformed numerous industries and many aspects of daily life, as seen in the development of self-driving cars and large language models such as ChatGPT. However, the adoption of ML/DL in the pharmaceutical industry, particularly in clinical development, has been relatively slow. This is partly due to the heavily regulated environment, limited structured data sources, and a mindset in trial design and analysis that is often driven by statistical inference.
Today, as modern technology advances and diverse structured and unstructured data sources—such as medical imaging, electronic medical records (EMR), sensors, and wearables—emerge to aid clinical development, prediction-driven ML and DL methodologies are becoming increasingly powerful tools for deriving insights from large datasets. The landscape of drug development has changed substantially over the past decade, and drug development processes continue to evolve. Statisticians must evolve and embrace the new era as well. As the Statistical Innovation Group leading and driving innovation in clinical development at AbbVie, we have pioneered and piloted several use cases using ML and DL, witnessing firsthand the power and potential of AI/ML to expedite and revolutionize clinical development.
To raise innovation awareness in the pharmaceutical industry and bridge the gap between traditional statistical training and modern data science, we proposed and conducted a short course at the 2024 RISW to highlight the similarities and unique differences between traditional statistics and ML/DL. This course aimed to provide necessary upskilling for industry and regulatory statisticians and to encourage the increased adoption of ML/DL in clinical development. The course began with an overview of the evolution of ML/DL methodologies and key concepts (e.g., backpropagation, hyperparameter tuning), providing a solid foundation in basic ML/DL knowledge. This foundation led into the more intriguing and complex areas, where the latest developments in image processing and natural language processing were introduced, along with their novel applications in pharmaceutical development, as evidenced by our recent projects and published papers.
Here’s a recap of each of the three parts of this short course:
Part I Machine Learning (ML) and Deep Learning (DL) Basics: This section provided an insightful overview of the similarities and differences between traditional statistics and ML/DL, along with fundamental background on neural network architectures, emphasizing their implementation in the deep learning domain. We began by examining how input data are transformed within neural networks, highlighting the conversion of text, speech, and images into numerical formats such as vectors, matrices, and tensors. The discussion detailed the workings of neurons as fundamental computational units, explaining concepts like weights, biases, and activation functions. Key elements such as feedforward data flow, loss functions, and backpropagation were elucidated, using both equations and numeric examples to demonstrate how neural networks learn and improve over time. The section also covered critical optimization techniques, notably gradient descent, used to improve model performance. By exploring these foundational concepts, we laid the groundwork for understanding and building more complex neural network models for diverse applications.
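The feedforward, backpropagation, and gradient descent cycle described above can be sketched for a single neuron. This is a minimal illustrative example in plain Python (all values, including the weight, bias, learning rate, and training example, are made up for illustration and are not from the course materials):

```python
import math

def sigmoid(z):
    # Activation function: squashes any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# One neuron: a weighted sum of the input plus a bias, passed through an activation.
w, b = 0.5, 0.1       # parameters: weight and bias (illustrative initial values)
x, y = 2.0, 1.0       # a single training example: input x, target y
lr = 0.5              # learning rate
losses = []

for step in range(100):
    # Feedforward: input flows through the neuron to a prediction
    y_hat = sigmoid(w * x + b)
    losses.append(0.5 * (y_hat - y) ** 2)   # squared-error loss

    # Backpropagation: chain rule from the loss back to each parameter
    delta = (y_hat - y) * y_hat * (1 - y_hat)   # dLoss/dz via the sigmoid derivative
    grad_w, grad_b = delta * x, delta

    # Gradient descent: step the parameters against the gradient
    w -= lr * grad_w
    b -= lr * grad_b
```

Running the loop, the recorded loss shrinks step by step, which is exactly the "learning" behavior the numeric examples in the course demonstrated.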
Part II Deep Convolutional Neural Networks for Computer Vision and Applications: This section provided a comprehensive exploration of the pivotal role that Deep Convolutional Neural Networks (DCNNs) play in computer vision. It began with an introduction to the foundational concepts and publicly available image datasets. The course progressed through essential operations in DCNNs and highlighted significant architectures such as VGGNet. It further delved into object detection frameworks, examining the evolution from traditional two-stage models like Faster R-CNN to more integrated approaches such as YOLO. Additionally, the section covered image segmentation, focusing on techniques like Mask R-CNN and U-Net, and discussed the application of Generative Adversarial Networks for creating realistic data. A practical code session was included, demonstrating the application of U-Net in medical imaging, enriching the theoretical insights with hands-on experience.
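To give a flavor of the essential operations covered in this section, the convolution at the heart of a DCNN can be sketched in plain Python. This is a minimal valid-mode (no padding, stride 1) example with a hypothetical vertical-edge-detecting kernel, not code from the course session:

```python
def conv2d(image, kernel):
    # Valid-mode 2-D convolution (cross-correlation, as in most DL frameworks)
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Slide the kernel over the image and take the weighted sum
            out[i][j] = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)
            )
    return out

# A 4x4 image with a vertical edge (left half 0, right half 1)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A simple vertical-edge-detecting kernel
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
print(conv2d(image, kernel))  # → [[-3, -3], [-3, -3]]
```

A DCNN learns the kernel weights from data rather than hand-coding them as above; stacking many such learned filters with pooling and nonlinearities yields architectures like VGGNet.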
Part III Natural Language Processing (NLP) and Applications: To motivate all the concepts covered, we started this part by showing the pipeline workflow of a case study applying NLP to detect adverse drug events using data from the X platform (formerly Twitter). We then introduced one of the most basic and important concepts in NLP, word representations or embeddings, discussing the Word2Vec skip-gram model of Mikolov et al. (2013). The core content of this part was language models, which we presented as a historical tour of successive developments in language modeling. We introduced the following language models in chronological order: traditional n-gram statistical models, the fixed-window neural network approach to language modeling, basic recurrent neural networks (RNNs), RNNs based on Long Short-Term Memory (LSTM) units, RNN variations (bidirectionality and multiple layers), sequence-to-sequence RNNs with the attention mechanism, transformer-based large language models (LLMs) built on self-attention, and finally very large language models and prompt engineering. At each step, we discussed the inefficiencies and limitations of the earlier model that motivated the next development. After all the concepts and methods were introduced, we returned to the motivating example and showed our audience how to apply these concepts and LLMs to predict adverse drug events in X platform posts. We ended this part by sharing results from the different models, discussing observations about the performance of the different LLMs, followed by a brief code review for the case study.
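As a flavor of the first stop on that historical tour, a maximum-likelihood bigram (n = 2) language model can be sketched in a few lines of Python. The tiny corpus below is invented for illustration and is not from the case study:

```python
from collections import Counter

def train_bigram_lm(corpus):
    # Count unigrams and bigrams over a tokenized corpus with sentence boundaries
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    # Maximum-likelihood estimate: P(w2 | w1) = count(w1 w2) / count(w1)
    return lambda w1, w2: bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

corpus = [
    "the patient reported nausea",
    "the patient reported headache",
]
p = train_bigram_lm(corpus)
print(p("the", "patient"))      # → 1.0 (every "the" is followed by "patient")
print(p("reported", "nausea"))  # → 0.5
```

The model assigns probability zero to any bigram unseen in training, and the counts explode as n grows; these sparsity limitations are part of what motivated the neural language models that came next in the tour.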
It was truly great to see strong attendance from both industry and FDA statisticians at the short course. We hope it has sparked greater enthusiasm for AI and its potential in clinical development, and that more statisticians will evolve to embrace the era of big data and AI.
Photos from the short course: