Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation #21
Abstract: Despite the recent advances in automatically describing image contents, their applications have been mostly limited to image caption datasets containing natural images (e.g., Flickr 30k, MSCOCO). In this paper, we present a deep learning model to efficiently detect a disease from an image and annotate its contexts (e.g., location, severity, and the affected organs). We employ a publicly available radiology …
Background: Comprehensive image understanding requires more than single-object classification. There have been many advances in the automatic generation of image captions to describe image contents, which is closer to complete image understanding than classifying an image into a single object class. Our work is inspired by many of the recent advances in image caption generation [44, 54, 36, …
Related Work: This work was initially inspired by early work in image caption generation [39, 17, 16], where we take up more recently introduced ideas of using CNNs and RNNs [44, 54, 36, 14, 61, 15, 6, 62, 31] to combine recent advances in computer vision and machine translation. We also exploit the concept of leveraging mid-level RNN representations to infer image labels from the annotations …
Data: We use a publicly available radiology dataset of chest x-rays and reports that is a subset of the OpenI [2] open-source literature and biomedical image collections. It contains 3,955 radiology reports from the Indiana Network for Patient Care and 7,470 associated chest x-rays from the hospitals' picture archiving systems. The entire dataset has been fully anonymized via an aggressive anonymization scheme …
Disease Labels: The CNN-RNN based image caption generation approaches [44, 54, 36, 14, 61, 15, 6, 62, 31] require a well-trained CNN to encode input images effectively. Unlike natural images, which can simply be encoded by ImageNet-trained CNNs, chest x-rays differ significantly from ImageNet images. In order to train CNNs on chest x-ray images, we sample some frequent annotation patterns with …
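The "frequent annotation patterns" idea — grouping reports by their full annotation and keeping only patterns common enough to train on — can be sketched as below. This is a minimal illustration, not the paper's pipeline; the `min_count` threshold and the toy annotations are made up, not taken from the OpenI dataset.

```python
from collections import Counter

def frequent_patterns(annotations, min_count=2):
    """Count full annotation patterns (as tuples) and keep the ones
    seen at least min_count times as candidate disease labels.
    min_count is a hypothetical threshold for illustration."""
    counts = Counter(tuple(a) for a in annotations)
    return {pat: n for pat, n in counts.items() if n >= min_count}

# Toy annotations (invented for illustration only):
reports = [
    ["calcified granuloma"],
    ["calcified granuloma"],
    ["opacity", "lung", "base", "left"],  # rare pattern, dropped
    ["normal"],
    ["normal"],
    ["normal"],
]
print(frequent_patterns(reports, min_count=2))
```

Rare patterns fall below the threshold and are excluded, which keeps the CNN's label set small and well-populated.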
Image Classification with CNNs: We use the aforementioned 17 unique disease annotation patterns (in Table 1, plus scoliosis, osteophyte, spondylosis, and fractures/bone) to label the images and train CNNs. For …
Regularization by Batch Normalization and Data Dropout: Even when we balance the dataset by augmenting many diseased samples, it is difficult for a CNN to learn a good model that distinguishes the many diseased cases from normal cases, which have many variations on their original samples. It was shown in [27] that normalizing via mini-batch statistics during training can serve as an effective regularization technique to improve the performance of a CNN model. …
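The mini-batch normalization from [27] that the excerpt refers to can be written out in a few lines of NumPy. This is a bare sketch of the forward pass for a fully-connected layer (the paper applies it inside a CNN, and a real implementation also tracks running statistics for inference); `gamma` and `beta` are the learned scale and shift parameters.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature with mini-batch statistics, then apply a
    learned scale (gamma) and shift (beta).
    x: (batch, features) activations for one mini-batch."""
    mu = x.mean(axis=0)          # per-feature mean over the batch
    var = x.var(axis=0)          # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = batch_norm(x, gamma=np.ones(2), beta=np.zeros(2))
# each column of y now has (approximately) zero mean and unit variance
```

Because `mu` and `var` change with each mini-batch's composition, the normalization injects noise that acts as the regularizer described above.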
5.2. Effect of Model Complexity: We also validate whether the dataset can benefit from a more complex GoogLeNet [58], which is arguably the current state-of-the-art CNN architecture. We apply both batch normalization and data dropout, and follow the recommendations suggested in [27] (where human accuracy on …
Annotation Generation with RNNs: We use recurrent neural networks (RNNs) to learn the annotation sequence given the input image's CNN embedding. We test both Long Short-Term Memory (LSTM) [24] and Gated Recurrent Unit (GRU) [7] implementations of RNNs. Simplified illustrations of LSTM and GRU are shown in Figure 2, and the details of both RNN implementations are briefly introduced below.
6.1. Recurrent Neural Network Implementations
6.2. Training: The number of MeSH terms describing diseases ranges from 1 to 8 (except normal, which is one word), with a mean of 2.56 and a standard deviation of 1.36. The majority of descriptions contain up to five words. Since only 9 cases have images with descriptions longer than 6 words, we ignore these by constraining the RNNs to unroll up to 5 time steps. We zero-pad annotations with fewer than five words with the end-of-sentence token to fill in the five-word space. The parameters of the gates in LSTM and GRU decide whether to update the current state h to the new candidate state h̃, where these are learned from the previous input sequences. Further details about LSTM can be found in [24, 15, 14, 61], and about GRU and its comparisons to LSTM in [7, 30, 9, 8, 32]. We set the initial state of the RNNs …
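The fixed-length unrolling described above (truncate to 5 time steps, pad shorter annotations with the end-of-sentence token) is a simple preprocessing step, sketched here. The `<eos>` token string is an assumption; the paper only says an end-of-sentence token is used.

```python
def pad_annotation(tokens, max_len=5, eos="<eos>"):
    """Truncate or pad a MeSH annotation to exactly max_len tokens,
    filling the remainder with end-of-sentence tokens so the RNN can
    always unroll a fixed number of steps (5, per the paper)."""
    tokens = tokens[:max_len]
    return tokens + [eos] * (max_len - len(tokens))

print(pad_annotation(["calcified", "granuloma"]))
# → ['calcified', 'granuloma', '<eos>', '<eos>', '<eos>']
```

Fixed-length sequences let every training example share the same unrolled computation graph, which simplifies batching.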
7. Recurrent Cascade Model for Image Labeling with Joint Image/Text Context: In Section 5, our CNN models are trained with disease labels only, where the context of the diseases is not considered. For instance, the same calcified granuloma label is assigned to all image cases that may actually describe the disease differently at a finer semantic level, such as "calcified granuloma …
7.1. Evaluation: The final evaluated BLEU scores are provided in Table 5. We achieve better overall BLEU scores than those in Table 4, obtained before using the joint image/text context. It is noticeable that higher BLEU-N (N > 1) scores are achieved compared to Table 4, indicating that more comprehensive image contexts are taken into account in the CNN/RNN training.
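For readers unfamiliar with the BLEU-N metric used in the evaluation above, its core ingredient is the modified (clipped) n-gram precision between a generated annotation and the reference. A minimal sketch, omitting BLEU's geometric mean over n and brevity penalty; the example word sequences are invented:

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Modified n-gram precision, the core of BLEU-N: each candidate
    n-gram's count is clipped by its count in the reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    clipped = sum(min(c, ref[g]) for g, c in cand.items())
    return clipped / max(sum(cand.values()), 1)

cand = ["calcified", "granuloma", "lung"]
ref = ["calcified", "granuloma", "right", "lung"]
print(ngram_precision(cand, ref, 1))  # → 1.0 (all unigrams appear in ref)
print(ngram_precision(cand, ref, 2))  # → 0.5 (only one of two bigrams matches)
```

This illustrates why BLEU-N for N > 1 is the harder target: word *order* must match, which is why the higher-order gains in Table 5 suggest the joint context helps.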
8. Conclusion: We present an effective framework to learn from, detect diseases in, and describe their contexts from patients' chest x-rays and their accompanying radiology reports with Medical Subject Headings (MeSH) annotations. Furthermore, we introduce an approach to mine joint contexts from a collection …
Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation
Paper: https://arxiv.org/pdf/1603.08486v1.pdf
Code:
Interleaved text/image deep mining on a very large-scale radiology database
http://www.cs.jhu.edu/~lelu/publication/cvpr15_0371.pdf
Interleaved Text/Image Deep Mining on a Large-Scale Radiology Database for Automated Image Interpretation
https://arxiv.org/abs/1505.00670
Learning the Correlation Between Images and Disease Labels Using Ambiguous Learning
http://link.springer.com/chapter/10.1007%2F978-3-319-24571-3_23
High-Throughput Classification of Radiographs Using Deep Convolutional Neural Networks
http://link.springer.com/article/10.1007/s10278-016-9914-9
Design of artificial intelligence techniques applied to medical X-ray images for the detection of anatomical structures of the lungs and their alterations
https://riunet.upv.es/handle/10251/70103
Open-i℠: imaging, informatics, natural language processing, and multi-modal information retrieval – research and development
https://pdfs.semanticscholar.org/117c/58682be513a5dfe9fe36f638a3f796207076.pdf