Proceedings ArticleDOI

Container-code recognition system based on computer vision and deep neural networks

18 Apr 2018-Vol. 1955, Iss: 1, pp 040118
TL;DR: An automatic container-code recognition system based on computer vision and deep neural networks is proposed; by combining both approaches, it handles more situations and produces better detection results while avoiding the drawbacks of either method alone.
Abstract: Automatic container-code recognition has become a crucial requirement for the ship transportation industry in recent years. In this paper, an automatic container-code recognition system based on computer vision and deep neural networks is proposed. The system consists of two modules: a detection module and a recognition module. The detection module applies both computer-vision-based algorithms and neural networks, combining their outputs to produce a better detection result and avoid the drawbacks of either method alone. The combined detection results are also collected for online training of the neural networks. The recognition module exploits both character segmentation and end-to-end recognition, and outputs the recognition result that passes verification. When the recognition module produces a false recognition, the result is corrected and collected for online training of the end-to-end recognition sub-module. By combining several algorithms, the system can handle more situations, and the online training mechanism improves the performance of the neural networks at runtime. The proposed system achieves 93% overall recognition accuracy.
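The abstract does not say how recognition results are verified, but container codes follow the ISO 6346 standard, whose check digit provides a natural verification rule. A minimal sketch in plain Python (the function name is illustrative, not from the paper):

```python
# ISO 6346 letter values: A=10 upward, skipping multiples of 11 (11, 22, 33).
LETTER_VALUES = dict(zip(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    [10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24,
     25, 26, 27, 28, 29, 30, 31, 32, 34, 35, 36, 37, 38],
))

def is_valid_container_code(code: str) -> bool:
    """Verify an 11-character container code against its ISO 6346 check digit."""
    if len(code) != 11:
        return False
    values = [LETTER_VALUES[c] if c.isalpha() else int(c) for c in code[:10]]
    # Position i is weighted by 2**i; the check digit is (sum mod 11) mod 10.
    checksum = sum(v * 2**i for i, v in enumerate(values)) % 11 % 10
    return code[10].isdigit() and checksum == int(code[10])
```

For example, the standard ISO 6346 example code CSQU3054383 passes this check, while any single-digit alteration of its check digit fails.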
Citations
Journal ArticleDOI
TL;DR: An adaptive deep learning framework for shipping container code localization and recognition is proposed; experiments demonstrate better localization performance than prior models and 93.33% recognition accuracy.
Abstract: Shipping containers play an important role in global transportation. As container codes are the unique identifiers for shipping containers, recognizing these codes is an essential step in managing the containers and logistics. Conventional code localization methods are easily disturbed by various kinds of noise and cannot identify the best regions for code recognition. In this article, we propose an adaptive deep learning framework for shipping container code localization and recognition. In the framework, noisy text regions are removed by an adaptive score aggregation (ASA) algorithm. The code region boundaries are identified by the average-to-maximum suppression range (AMSR) algorithm. Thus, the predicted locations can be adjusted within this range to fit the code recognition model and achieve higher accuracy. The experimental results of a comparative study with state-of-the-art models, including EAST, PSENet, GCRNN, and MaskTextSpotter, demonstrated that the proposed framework achieved better localization performance and obtained 93.33% recognition accuracy. The processing speed reaches 1.13 frames/s, which is sufficient to meet operational requirements. Thus, the proposed solution will facilitate the digital transformation of shipping container management and logistics at ports.

8 citations


Cites methods from "Container-code recognition system b..."

  • [25] combined a connectionist text proposal network and maximally stable extremal regions for code localization and used both character segmentation and CRNN for code recognition.


Journal ArticleDOI
TL;DR: A container color detection model is developed to predict the color of the container being unloaded, and the predictions drive two crane operator alarm methods; one alerts the crane operator when the detected color of a container does not match the correct container color.
Abstract: To reduce the extra work, the operation cost, and the risk of cargo delay induced by the unloading of wrong containers, this study first develops a container color detection model to predict the color of the container being unloaded. The prediction results are then used to develop two crane operator alarm methods. Method 1 alerts the crane operator if the detected color of a container does not comply with the correct container color. Method 2 constructs a decision problem to decide whether to alert the operator. The results of numerical experiments show that methods 1 and 2 are better than the benchmark. Specifically, method 1 reduces the expected annual total cost by about 82%, while method 2 reduces it by about 85%. Extensive sensitivity analysis is also conducted to verify the methods' performance and robustness.
Book ChapterDOI
01 Jan 2020
TL;DR: An approach to solving key issues in two important facets of the supply chain is presented: predicting the date of restitution for a cargo container, and using an optical character recognition (OCR)-centred pipeline to extrapolate data from containers.
Abstract: The shipping industry is a multifaceted trading system, enabling the flow of goods between countries. Often the standards, methods and technologies become interspersed and difficult to plan, disrupting the supply chain. Data is not always accurate, necessitating further analysis. This paper presents an approach to solving key issues in two important facets of the supply chain, predicting the date of restitution for a cargo container and using an optical character recognition (OCR)-centred pipeline to extrapolate data from containers. Both approaches use long short-term memory (LSTM) models, including bi-directional LSTMs. These methods leverage state-of-the-art text recognition architecture and advanced algorithm ensembling, giving significant improvements in data quality. The experimental results illustrate that these methods vastly outperform current industry practices and more recent approaches.
Journal ArticleDOI
TL;DR: The design and implementation of a BIC Code recognition system using an open-source OCR engine, a deep learning object detection algorithm, and a text detector model is proposed.
Abstract: The BIC (Bureau International des Containers et du Transport Intermodal) Code is the identification code for ocean shipping containers and is crucial for logistics, transportation, and security. Accurate recognition of the container BIC Code is essential for efficient import and export processes, for authorities to intercept illegal goods, and for safe transportation. Nevertheless, the current practice of employees recognizing and manually entering container BIC Codes is inefficient and prone to error. Although automated recognition efforts have been made, challenges remain due to the aging of containers, manufacturing differences between companies, and the mixing of letters and numbers in the 11-digit combination. In this paper, we propose the design and implementation of a BIC Code recognition system using an open-source OCR engine, a deep learning object detection algorithm, and a text detector model. In the logistics industry, various attempts are being made to seamlessly link the data required at each stage of transportation between these systems. Securing stable and consistent BIC Code recognition that can be used in the field would help overcome the instability caused by false positives.
References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
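The residual reformulation can be illustrated with a toy fully connected block in plain Python (names and shapes are illustrative, not from the paper): the weight layers learn only a residual function F(x), while an identity shortcut adds the input back unchanged.

```python
def relu(v):
    return [max(x, 0.0) for x in v]

def matvec(W, v):
    # Matrix-vector product: one output per row of W.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def residual_block(x, W1, W2):
    # y = F(x; W1, W2) + x: the two weight layers learn only the residual F,
    # while the identity shortcut carries x through unchanged.
    fx = matvec(W2, relu(matvec(W1, x)))
    return [f + xi for f, xi in zip(fx, x)]
```

With all-zero weights the block reduces to the identity mapping, which is exactly why stacking such blocks cannot hurt representational power and eases optimization at depth.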

123,388 citations

Journal ArticleDOI
TL;DR: A novel neural network architecture that integrates feature extraction, sequence modeling, and transcription into a unified framework is proposed, achieving remarkable performance in both lexicon-free and lexicon-based scene text recognition tasks.
Abstract: Image-based sequence recognition has been a long-standing research topic in computer vision. In this paper, we investigate the problem of scene text recognition, which is among the most important and challenging tasks in image-based sequence recognition. A novel neural network architecture, which integrates feature extraction, sequence modeling and transcription into a unified framework, is proposed. Compared with previous systems for scene text recognition, the proposed architecture possesses four distinctive properties: (1) It is end-to-end trainable, in contrast to most of the existing algorithms whose components are separately trained and tuned. (2) It naturally handles sequences of arbitrary length, involving no character segmentation or horizontal scale normalization. (3) It is not confined to any predefined lexicon and achieves remarkable performance in both lexicon-free and lexicon-based scene text recognition tasks. (4) It generates an effective yet much smaller model, which is more practical for real-world application scenarios. The experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets, demonstrate the superiority of the proposed algorithm over the prior arts. Moreover, the proposed algorithm performs well in the task of image-based music score recognition, evidently verifying its generality.
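The transcription stage of such end-to-end sequence models is commonly trained with CTC; at inference time, decoding can be as simple as best-path (greedy) decoding, sketched here in plain Python (the function name and alphabet are illustrative assumptions, not from the paper):

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Best-path CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    chars, prev = [], blank
    for k in best_path:
        if k != prev and k != blank:
            chars.append(alphabet[k - 1])  # class 0 is the CTC blank symbol
        prev = k
    return "".join(chars)
```

Because repeats are collapsed, a doubled letter such as "AA" can only be emitted when a blank frame separates the two occurrences, which is what makes the blank symbol essential.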

2,184 citations

Proceedings ArticleDOI
13 Jun 2010
TL;DR: A novel image operator is presented that seeks to find the value of stroke width for each image pixel, and its use on the task of text detection in natural images is demonstrated.
Abstract: We present a novel image operator that seeks to find the value of stroke width for each image pixel, and demonstrate its use on the task of text detection in natural images. The suggested operator is local and data dependent, which makes it fast and robust enough to eliminate the need for multi-scale computation or scanning windows. Extensive testing shows that the suggested scheme outperforms the latest published algorithms. Its simplicity allows the algorithm to detect texts in many fonts and languages.
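The full operator shoots rays along image gradients, but the core idea can be shown with a toy 1-D sketch (the function name is illustrative, not from the paper): each foreground pixel on a scan line is labelled with the length of the contiguous run containing it, so pixels belonging to strokes of consistent width receive similar values.

```python
def run_length_widths(row):
    """Toy 1-D stroke width: label each foreground pixel (1) with its run length."""
    widths, i, n = [0] * len(row), 0, len(row)
    while i < n:
        if row[i] == 1:
            j = i
            while j < n and row[j] == 1:
                j += 1
            for k in range(i, j):  # every pixel in the run gets the run's width
                widths[k] = j - i
            i = j
        else:
            i += 1
    return widths
```

Text detection then amounts to grouping pixels whose assigned widths have low variance, since letters are drawn with near-constant stroke width while background clutter typically is not.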

1,531 citations

Journal ArticleDOI
TL;DR: An end-to-end system for text spotting (localising and recognising text in natural scene images) and text-based image retrieval is presented, along with a real-world application that makes thousands of hours of news footage instantly searchable via a text query.
Abstract: In this work we present an end-to-end system for text spotting--localising and recognising text in natural scene images--and text based image retrieval. This system is based on a region proposal mechanism for detection and deep convolutional neural networks for recognition. Our pipeline uses a novel combination of complementary proposal generation techniques to ensure high recall, and a fast subsequent filtering stage for improving precision. For the recognition and ranking of proposals, we train very large convolutional neural networks to perform word recognition on the whole proposal region at the same time, departing from the character classifier based systems of the past. These networks are trained solely on data produced by a synthetic text generation engine, requiring no human labelled data. Analysing the stages of our pipeline, we show state-of-the-art performance throughout. We perform rigorous experiments across a number of standard end-to-end text spotting benchmarks and text-based image retrieval datasets, showing a large improvement over all previous methods. Finally, we demonstrate a real-world application of our text spotting system to allow thousands of hours of news footage to be instantly searchable via a text query.

1,054 citations

Proceedings Article
01 Nov 2012
TL;DR: This paper combines the representational power of large, multilayer neural networks with recent developments in unsupervised feature learning, allowing a common framework to train highly accurate text detector and character recognizer modules.
Abstract: Full end-to-end text recognition in natural images is a challenging problem that has received much attention recently. Traditional systems in this area have relied on elaborate models incorporating carefully hand-engineered features or large amounts of prior knowledge. In this paper, we take a different route and combine the representational power of large, multilayer neural networks together with recent developments in unsupervised feature learning, which allows us to use a common framework to train highly-accurate text detector and character recognizer modules. Then, using only simple off-the-shelf methods, we integrate these two modules into a full end-to-end, lexicon-driven, scene text recognition system that achieves state-of-the-art performance on standard benchmarks, namely Street View Text and ICDAR 2003.

900 citations