# Three-line Summary #
- Deep learning algorithms are being applied to biological images and are transforming the analysis and interpretation of imaging data.
- We review the intersection between deep learning and cellular image analysis and provide an overview of both the mathematical mechanics and the programming frameworks of deep learning that are pertinent to life scientists.
- We relay our labs' experience with three key aspects of implementing deep learning in the laboratory: annotating training data, selecting and training a range of neural network architectures, and deploying solutions.
# Detail Review #
1. Introduction
- Progress in optics has yielded microscopes capable of imaging over a range of spatial scales, from single molecules to entire organisms.
- Concurrently, improvements in fluorescent probes have enhanced the brightness, photostability, and spectral range of fluorescent proteins and of small-molecule dyes.
- Combined, these advances allow for a variety of dynamic measurements in living cells, from long-term imaging of single molecules to simultaneous measurements of multiple biosensors, to observations of the development of entire organisms.
- Concurrent with these technological advances has been an increasing demand in the biosciences for image analysis (tools and ecosystems in use: MATLAB, Python, SuperSegger, CellProfiler, ImageJ, etc.).
- Given the central role that observation (and therefore imaging) plays in the biological sciences, deep learning has the potential to revolutionize understanding of the inner workings of living systems.
- Since a deep learning-based method won the 2012 ImageNet Large Scale Visual Recognition Challenge, there has been a major increase in the variety of problems that can be solved with deep learning.
- The barriers to spreading deep learning throughout biology labs are both cultural and technical.
- Specifically, the need for annotated data means that data and software must be jointly developed, and the amount of data and the computational resources required for deep learning constitute a significant barrier to adoption.
- By focusing on use cases that are common in quantitative cell biology, this Review serves as a practical introduction to deep learning for the analysis of biological images.
- It builds on prior reviews of the intersection of deep learning and the life sciences by incorporating a discussion of our labs' joint experiences in applying these methods to cellular imaging data and is meant to make these methods less opaque to new adopters.
2. The practical mechanics of deep learning
- We believe that there are three essential components to the successful application of deep learning to biological image analysis:
- (1) construction of a pertinent and annotated training dataset, (2) effective training of deep learning models on that dataset, (3) deployment of trained models on new data
- Training data are critical to successful applications of deep learning; assembling sufficient high-quality data often takes as much
- When training data are limited, computational approaches can help: image normalization (reduces variation between images), data augmentation (increases image diversity), and transfer learning (fine-tunes general image features learned elsewhere on a small dataset).
- Recent strategies have incorporated a search throughout the space of potential architectures to identify the most effective model architecture.
- Python is the most popular language for deep learning; existing frameworks include TensorFlow/Keras and PyTorch. All of them provide (1) a computation graph, (2) automatic differentiation, (3) a gateway to GPUs, and (4) implementations of common mathematical objects, optimization algorithms, hyperparameter settings, and performance metrics.
- In our experience, the choice of architectural features often comes down to a tradeoff between overfitting (models perform well on a training dataset but poorly on a validation dataset) and underfitting (models perform poorly even on training data because they are unable to capture the feature information).
- Models with high capacity perform well on large datasets but are prone to overfitting, so transfer learning is advisable when only a small dataset is available.
- Once trained, deep learning models must be deployed to process new data.
- One effective approach is to use the deployment tools built into several frameworks (TensorFlow, PyTorch), which allow models to be shared beyond the original user.
- (1) Containerization tools such as Docker have been essential for the creation of reproducible environments for deploying deep learning models.
- (2) The need for GPUs has been a barrier, as a considerable amount of Unix system administration experience is necessary to ensure that all the requisite drivers and software packages are operational.
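The normalization and augmentation steps mentioned in this section for limited training data can be sketched in a few lines of NumPy. This is a minimal illustration, not the review authors' pipeline: the function names `normalize_image` and `augment`, the choice of per-image standardization, and the restriction to rotations and flips are all assumptions made for the example.

```python
import numpy as np


def normalize_image(image, eps=1e-8):
    """Standardize one image to zero mean and unit variance.

    Per-image standardization is only one possible normalization
    (an assumption here); it reduces brightness/contrast differences
    between acquisitions.
    """
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + eps)


def augment(image, rng):
    """Return a randomly rotated/flipped copy of a 2D image.

    Rotations by multiples of 90 degrees and mirror flips keep every
    pixel value unchanged, so they add diversity without interpolation.
    """
    k = int(rng.integers(0, 4))       # number of 90-degree rotations
    out = np.rot90(image, k)
    if rng.random() < 0.5:
        out = np.fliplr(out)          # horizontal mirror
    if rng.random() < 0.5:
        out = np.flipud(out)          # vertical mirror
    return np.ascontiguousarray(out)


# Hypothetical usage on a made-up 4x4 "image".
rng = np.random.default_rng(0)
raw = np.arange(16, dtype=np.float32).reshape(4, 4)
prepared = augment(normalize_image(raw), rng)
```

When the annotation is itself an image (for example a segmentation mask), the same random rotation and flips would need to be applied to the annotation so that image and label stay aligned.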
3. Biological applications of deep learning
- Image classification
- Image classification, the task of assigning a meaningful label to an image, was one of the first high-profile successes of deep learning.
- In a recent study, scientists used a fluorescent marker of differentiation to establish a ground truth and then trained a classifier to identify differentiated cells directly from bright-field images.
- Deep learning has also been used to classify spatial patterns in fluorescence images and to determine protein localization in large datasets from yeast and humans.
- Image segmentation
- Image segmentation is the task of partitioning an image into several parts to identify meaningful objects or features.
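To make the output of segmentation concrete, the sketch below partitions a small synthetic image into labeled objects. It uses classical thresholding plus connected-component labeling from `scipy.ndimage` as a stand-in, not a deep learning model; the synthetic image, the 0.5 threshold, and the function name `segment_by_threshold` are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage


def segment_by_threshold(image, threshold=0.5):
    """Split an image into background (0) and numbered objects (1..n).

    Pixels brighter than `threshold` are treated as foreground, and
    touching foreground pixels are grouped into one object.
    """
    foreground = image > threshold
    labels, n_objects = ndimage.label(foreground)
    return labels, n_objects


# Made-up 8x8 image containing two bright, non-touching blobs.
image = np.zeros((8, 8), dtype=np.float32)
image[1:3, 1:3] = 1.0
image[5:7, 4:7] = 0.8

labels, n_objects = segment_by_threshold(image)
print(n_objects)   # number of separate objects found
print(labels)      # integer mask: 0 = background, k = object k
```

The resulting integer mask, with one id per object, is the kind of output a segmentation method is expected to produce, whatever model generates it.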
- Object tracking
- Object tracking is the task of following objects through a series of time-lapse images; one biological application is the tracking of single cells in live-cell imaging movies.
- Object tracking consists of two tasks: object detection and object linkage.
- (1) Object detection: the optimal approach to object detection varies depending on the data. (2) Object linkage: in linear-programming approaches, multiple cues such as object centroids, intensity, and morphology are combined into a similarity score to link objects between frames.
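The linkage step described above can be sketched as an assignment problem between two consecutive frames. This is a simplified illustration under stated assumptions: each detected object is summarized by a hypothetical feature row `[x, y, mean_intensity, area]`, the cues are combined into a cost (a dissimilarity, the inverse of a similarity score) with made-up equal weights, and the one-to-one matching is solved with `scipy.optimize.linear_sum_assignment`. Cell division, death, and objects entering or leaving the field of view are not handled.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def linkage_cost(prev, curr, w_centroid=1.0, w_intensity=1.0, w_area=1.0):
    """Pairwise cost between objects in two frames.

    prev, curr: arrays of shape (n, 4) with columns
    [x, y, mean_intensity, area] (a hypothetical feature layout).
    Lower cost means the two objects are more alike.
    """
    d_centroid = np.linalg.norm(prev[:, None, :2] - curr[None, :, :2], axis=-1)
    d_intensity = np.abs(prev[:, None, 2] - curr[None, :, 2])
    d_area = np.abs(prev[:, None, 3] - curr[None, :, 3])
    return w_centroid * d_centroid + w_intensity * d_intensity + w_area * d_area


def link_frames(prev, curr, **weights):
    """Return (prev_index, curr_index) pairs minimizing the total cost."""
    cost = linkage_cost(prev, curr, **weights)
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist()))


# Made-up detections: the two cells appear in swapped order in frame t+1.
frame_t = np.array([[10.0, 10.0, 0.9, 50.0],
                    [40.0, 40.0, 0.5, 80.0]])
frame_t1 = np.array([[41.0, 39.0, 0.5, 82.0],
                     [11.0, 10.0, 0.9, 49.0]])

links = link_frames(frame_t, frame_t1)
print(links)
```

In practice the weights would need tuning, since centroid distance (in pixels), intensity, and area live on very different scales.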
- Augmented microscopy
- Augmented microscopy is the extraction of latent information from biological images, such as the identification of the locations of cellular nuclei in bright-field images.
- Fluorescence images of biological structures serve as the ground truth, and the task is to predict this ground truth directly from the bright-field images.
- Each approach used so far has compared spatially synchronized transmitted light images with images from other modalities to uncover meaningful relationships among the corresponding images.
* Reference: Moen, Erick, et al. "Deep learning for cellular image analysis." Nature Methods 16.12 (2019): 1233-1246.