What do the authors plan to work on in the future?

In the future, the authors will focus on two directions. The first is data augmentation technology, motivated by the expensive manual annotations in detection data sets.

What types of defects are found in hot-rolled steel plates?

There are six types of defects from hot-rolled steel plates, including crazing, inclusion, patches, pitted surface, rolled-in scales, and scratches.

What is the way to solve the defect classification task?

To solve it, the simple and direct way is to perform defect localization before defect classification, so that the inspection task classifies regions of defects instead of a whole defect image; this is the defect detection task.

How many images are used for fine-tuning the network?

The training set contains 1260 images used for fine-tuning the network introduced in Section IV-B, and the test set contains 540 images.

What is the way to improve the recall of a defect detection method?

Increasing the number of proposals can get a promising recall, but this will greatly increase the runtime of the detection [38], and what is worse, low-quality proposals would be involved in the process of detection, leading to failure of defect detection in some cases.

(Open Access) An End-to-End Steel Surface Defect Detection Approach via Fusing Multiple Hierarchical Features (2020) | Yu He

Q: What contributions have the authors mentioned in the paper "An end-to-end steel surface defect detection approach via fusing multiple hierarchical features" ?

In this paper, the authors proposed a novel defect detection system based on deep learning and focused on a practical industrial application: steel plate defect inspection. In order to achieve strong classification ability, this system employs a baseline convolution neural network (CNN) to generate feature maps at each stage, and then the proposed multilevel feature fusion network (MFN) combines multiple hierarchical features into one feature, which can include more location details of defects. In addition, by using only 50 proposals, their method can detect at 20 ft/s on a single GPU and reach 92% of the above performance, hence the potential for real-time detection.

Q: What is the importance of depth of networks?

The early successful networks are based on the sequential pipeline architecture [25], which establish the basic structure of CNN and prove the importance of depth of networks.

Q: What is the importance of pretraining on the ImageNet data set?

The authors note that pretraining on the ImageNet data set is important for achieving competitive performance; the pretrained model can then be fine-tuned on a relatively small defect data set.

Q: How do the authors fine tune DDN using top-300 region proposals?

In the following, the authors fine-tune DDN using top-300 region proposals owing to the extracted quality region proposals, but reduce this number to accelerate the detection speed without harming accuracy at test-time.

An End-to-End Steel Surface Defect Detection Approach via Fusing Multiple Hierarchical Features

Yu He, Kechen Song , Qinggang Meng , and Yunhui Yan

Abstract— A complete defect detection task aims to achieve

the specific class and precise location of each defect in an image, which makes the task challenging to apply in practice. Defect detection is a composite task of classification and localization, so related methods often struggle to balance the accuracy of both. The implementation of defect

detection depends on a special detection data set that contains

expensive manual annotations. In this paper, we proposed a novel

defect detection system based on deep learning and focused on a

practical industrial application: steel plate defect inspection. In

order to achieve strong classiﬁcation ability, this system employs a

baseline convolution neural network (CNN) to generate feature

maps at each stage, and then the proposed multilevel feature

fusion network (MFN) combines multiple hierarchical features

into one feature, which can include more location details of

defects. Based on these multilevel features, a region proposal

network (RPN) is adopted to generate regions of interest (ROIs).

For each ROI, a detector, consisting of a classiﬁer and a bounding

box regressor, produces the ﬁnal detection results. Finally, we set

up a defect detection data set NEU-DET for training and

evaluating our method. On the NEU-DET, our method achieves

74.8/82.3 mAP with baseline networks ResNet34/50 by using

300 proposals. In addition, by using only 50 proposals, our

method can detect at 20 ft/s on a single GPU and reach 92% of the

above performance, hence the potential for real-time detection.

Index Terms— Automated defect inspection (ADI), defect

detection dataset (NEU-DET), defect detection network (DDN),

multilevel-feature fusion network (MFN).

I. INTRODUCTION

DEFECT inspection is a crucial step to guarantee the

quality of industrial production, especially for steel

plates. However, this process is usually performed manually

This work was supported in part by the National Natural Science

Foundation of China under Grant 51805078 and Grant 51374063, in part

by the National Key Research and Development Program of China under

Grant 2017YFB0304200, in part by the Fundamental Research Funds for the

Central Universities under Grant N170304014 and Grant N150308001, and in

part by the China Scholarship Council under Grant 201806085007. The

Associate Editor coordinating the review process was Emanuele Zappa.

(Corresponding authors: Kechen Song; Yunhui Yan.)

Y. He, K. Song, and Y. Yan are with the School of Mechanical Engineering

and Automation, Northeastern University, Shenyang 110819, China, and

also with the Key Laboratory of Vibration and Control of Aero-Propulsion

Systems, Ministry of Education of China, Northeastern University, Shenyang

110819, China (e-mail: heyu142616@gmail.com; songkc@me.neu.edu.cn;

yanyh@mail.neu.edu.cn).

Q. Meng is with the Department of Computer Science, Loughborough

University, Loughborough LE11 3TU, U.K. (e-mail: q.meng@lboro.ac.uk).

Fig. 1. Defect classiﬁcation and defect detection task. (a) Defect classiﬁcation

task aims to “What,” only outputting a defect class score. (b) Defect detection

task aims to “What” and “Where,” outputting a bounding box with a defect

class score.

Fig. 2. Complicated defects. (a) Multiple defects. The yellow boxes indicate

the defects belong to an identical class. (b) Multiclass defects. The red and

blue boxes indicate the defects of different classes. (c) Overlapping defects.

The pink box surrounds an overlapping region of defects of different classes.

in industry, which is unreliable and time-consuming. In order

to replace the manual work, it is desirable to allow a machine

to automatically inspect surface defects from steel plates with

the use of computer vision technologies.

The founder of computer vision, British neurophysiologist Marr, considers that a vision task can be defined as “What is Where,” that is, the process of discovering what is present in an image and where it is [1]. Therefore, object classification

and detection are the most fundamental problems in the ﬁeld

of computer vision research [2]. Similarly, the automated

defect inspection (ADI) can also be divided into two types:

defect classiﬁcation and defect detection. Given a defect

image, the defect classiﬁcation task is to solve if this image

contains some class of defect [Fig. 1(a)], and the defect

detection task is to solve where a defect exists in this image,

represented by a bounding box with a class score [Fig. 1(b)].

Therefore, a complete defect detection task consists of two

parts: defect classiﬁcation, determining speciﬁc categories of

defects, and defect localization, obtaining detailed regions of

defects. For defect inspection on steel plates, the detection task has clear advantages for complicated defects, e.g., multiple

defects [Fig. 2(a)], multiclass defects [Fig. 2(b)], and overlap-

ping defects [Fig. 2(c)]. The classiﬁcation task can only ﬁnd

Fig. 3. Different styles of obtaining a defect region. (a) Many previous

detectors based on hand-craft features directly combine related spatial cells

into a block through various special approaches. The block is regarded as a

detection region, which is a coarse box without reﬁning. (b) Detectors based

on DL mainly use regression methods to reﬁne a predicting box. Through a

large amount of iterative learning, the predicting box is gradually close to the

groundtruth box. Finally, the reﬁned box is regarded as the bounding box of

the defect, which can represent the precise location information of the defect.

the defect with the highest category confidence in an image; it cannot determine the number of defects shown in Fig. 2(a), the classes of defects shown in Fig. 2(b), or the presence of an overlapping defect shown in Fig. 2(c). However, for the follow-up quality assessment system, the quantity, category, and complexity of defects serve as the chief indicators for evaluating the quality of a steel plate. It is apparent that defect detection reflects the condition of a steel plate surface more comprehensively.

The previous ADI methods have two common problems:

the first is the unclear usage of hand-crafted features [3]–[5]. The

determination of features is too subjective, and thereby human

experience usually plays a decisive role in it. The other prob-

lem is imprecise defect localization [Fig. 3(a)]. Most methods

only perform defect classiﬁcation [6]–[8] or an incomplete

defect detection. For example, some methods perform binary

classiﬁcation to ﬁnd the regions of defects [9], [10] or only

provide a coarse region of a defect [11], [12]. The recently developed deep learning (DL) technology can overcome the drawbacks of traditional ADI methods and has achieved significant results on many vision tasks. DL can extract discriminative representations through a deep network [e.g., a convolution neural network (CNN)]. These representations can reach a high level of abstraction and therefore have strong representation ability. Hand-crafted features, by contrast, are merely combinations of low-level features [16]. Moreover, DL can train on location-annotated samples to obtain precise location information.

At present, some studies have already applied DL for ADI.

However, most methods can only perform defect classiﬁcation

due to the lack of special data sets [18]–[21]. Defect classification alone is oversimplified and cannot provide location information. Other methods use a combination of DL and traditional image processing to perform defect detection or segmentation [17]. These methods always use a DL classifier in parallel with a detector or a segmenter that is based on traditional image processing. This approach eliminates the need for special training data sets but damages the end-to-end characteristic of a DL system and loses intelligence and generalization to some extent. Unlike the

above-mentioned methods, we attempt to establish an end-

to-end defect detection system for ADI, which can provide

a bounding box with a class score for precisely classifying

and locating a defect [Fig. 3(b)]. A DL-based segmenter like

Mask R-CNN [13] seems to be better for showing the shape

of a defect. However, this kind of segmenter consumes huge amounts of computational resources, which cannot meet the real-time demand of industrial inspection. Furthermore, it is highly impracticable for the industry to build a large instance-level defect segmentation data set, and thereby this kind of segmenter is almost impossible to apply. Therefore, performing defect detection is the best tradeoff for ADI at present.

This paper mainly addresses three challenges. First, the

detection system needs strong classification ability. The common classification problems such as interclass similarity, intraclass difference, and background interference are also present in ADI [9], [11]. Therefore, we equip the system with a deep network, ResNet, as the backbone [23]. As shown by current research in transfer learning [15], the key to driving large networks is pretraining on ImageNet [22]. The detection system can gain strong classification power by training ResNet on enough data.

Second, the challenge of performing defect localization

using CNN features in DL-based methods remains. As we know, the convolutional layers of a CNN can be regarded as filters, so some location details are gradually lost as an image flows through the CNN. Usually, DL-based methods perform localization based on the last convolutional feature map [14], [28], [34]. Our method is to fuse multiple feature maps, because the feature maps exhibit diverse characteristics at each stage of CNNs: the shallow features have rich information but are not discriminative enough, and the deep features are semantically robust but lose too many details. In other fields [34], HyperNet also uses more features, but they are mainly selected from the latter part of the network. The proposed multilevel-feature fusion network (MFN) combines multiple features covering all stages. We address the detection from the industrial perspective. Since gray images have less information than color images, the MFN must include lower level features that are discarded by HyperNet. Furthermore, the MFN unifies the size of multiple features before fusion, which can not only preserve more image details but also use fewer model parameters.

Third, in defect detection, data annotation is expensive,

because one has to draw a defect’s bounding box and assign a

class label to it. Recent progress in this ﬁeld can be attributed

to two factors: 1) ImageNet pretrained models and 2) large

baseline CNNs, which made great progress in DL-based defect

classification [18]–[20]. However, the limited data and expensive annotation still limit the development of defect detection.

In this paper, we open a defect detection data set NEU-DET

for ﬁne-tuning models. When the DL models have ﬁnished

training on a special data set, they can be used to perform the

defect detection task.

This paper establishes an end-to-end ADI system, called

defect detection network (DDN), in an attempt to overcome the above-mentioned challenges. The DDN 1) adopts a strong ResNet for defect classification; 2) proposes the MFN to assemble more location details; and 3) sets up a defect detection data set for fine-tuning and reports improvements on it. In more detail, first, we pretrain the ResNet on ImageNet and fine-tune all the models on NEU-DET. The MFN can fuse the selected features into a multilevel feature, which has characteristics covering all the stages of the ResNet. Next, a region proposal network (RPN) is adopted to generate proposals based on the multilevel features, and then the DDN can output the class scores and the coordinates of bounding boxes. Finally, we evaluate the proposed method on NEU-DET, and the results demonstrate a clear superiority over other ADI methods.

To summarize, the main contributions of this paper are as

follows.

1) The introduction of the end-to-end defect detection

pipeline DDN that integrates the ResNet and the RPN

for precise defect classiﬁcation and localization.

2) The proposed MFN for fusing multilevel features. Compared with other fusing methods, MFN can combine the lower level and higher level features, which gives the multilevel features more comprehensive characteristics.

3) A defect detection data set NEU-DET for ﬁne-tuning

networks and a demonstration that the proposed DDN

has a very competitive performance on this data set.

II. RELATED WORK

A. Defect Inspection

Generally, a defect classification method includes two parts: a feature extractor and a classifier. The classic feature extractor obtains hand-crafted features such as HOG and LBP, and it is always followed by a classifier, e.g., SVM. Therefore, the combination of different feature extractors and classifiers produces a variety of defect classification methods. For instance, Song and Yan [3] improve the LBP to resist noise and adopt NNC and SVM to classify defects. Ghorai et al. [9] rely on a small set of wavelet features and use SVM to perform defect classification. Different from the above-mentioned two methods, Chu et al. [8] employ a general feature extractor and enhance SVM. From the perspective of computer vision, the defect classification task is essentially defect image classification, which struggles with complicated defect images. To solve it, the simple and direct way is to perform defect localization before defect classification, so that the inspection task classifies regions of defects instead of a whole defect image; this is the defect detection task. For example, the defect detectors in [11] and [12] first perform a 0–1 classification to judge whether features belong to a defect class or a nondefect class, then find defect regions based on the boundary of defect-class features, and finally perform different classification methods to determine the specific class of a defect. In addition, there is another simplified detector for the requirement of quick detection, which only focuses on regions of defects regardless of whether the defects are in different categories [10].

However, the DL-based methods differ radically from the above methods. A hand-crafted feature extractor locally analyzes a single image and extracts features, whereas a CNN constructs the representation of all the input data through a large amount of learning. CNNs have fine generalization and transferability, so there are some defect inspection methods based on CNNs. For example, Chen and Ho [21] demonstrate that an object detector like Overfeat [24] can be transferred to be a defect detector by some means. Similar to [18] and [19], they demonstrate that using a sequential CNN to extract features can improve classification accuracy on defect inspection. Similarly, based on a sequential CNN, Ren et al. [17] perform an extra defect segmentation task on classification results to define the boundary of a defect. Moreover, Natarajan et al. [20] employ a deeper neural network, VGG19, for defect classification. With the increased depth of the CNN, the defect classification accuracy has been further improved.

B. Baseline Networks

There are three popular CNN architectures at present, which

are used as baseline networks for pretraining. The early successful networks are based on the sequential pipeline architecture [25], which established the basic structure of CNNs and proved the importance of network depth. Subsequently, the inception networks employed modular units, which increase both the depth and width of a network without increasing the computational cost [26]. The third type is ResNet, which uses residual blocks to make networks deeper without overfitting [23]. ResNet is widely applied in various vision tasks, achieving competitive results with few parameters.

Choosing a proper baseline network is the key to gaining good results for DL methods. A large network has strong representation ability for input data, and hence its extracted features are at a high level of abstraction, but it demands a great amount of training data.

C. CNN Detectors

The CNN detectors aim to classify and locate each target

with a bounding box. They are mainly divided into two meth-

ods: one is the region-based method and another is the direct

regression method. The most famous region-based detectors

are the “R-CNN family” [27], [28], [14]. In this framework,

thousands of class-independent region proposals are employed

for detection. Region-based methods are superior in precision

but require slightly more computation. The representative

direct regression methods are YOLO [29] and SSD [30].

They directly divide an image into small grids and then for each grid predict bounding boxes, which are then regressed to the groundtruth boxes. The direct regression method is fast at detection but struggles with small instances.

III. DEFECT DETECTION NETWORK

In this section, the DDN is described in detail (see Fig. 4).

A single-scale image of an arbitrary size is processed by a

CNN, and the convolutional feature maps at each stage of

the ConvNet are produced (ConvNet represents the convolutional part of a CNN). We extract multiple feature maps

and then aggregate them in the same dimension by using

a lightweight MFN. In this way, MFN features have the

characteristics from several hierarchical levels of ConvNet.

Next, RPN [14] is employed to generate region proposals

Fig. 4. DDN. In a single pass, we extract features from each stage of the Baseline ConvNet, which are then fused into a multilevel feature by MFN. RPN is

adopted to generate ROIs based on the multilevel feature. For each ROI, the corresponding multilevel feature is transformed into a ﬁxed-length feature through

the ROI pooling and the GAP layers. Two fc layers process each ﬁxed-length feature and feed into output layers producing two results: a one-of-(C + 1)

defect class prediction (cls) and a reﬁned bounding box coordinate (loc).

[regions of interest (ROIs)] over the MFN feature. Finally,

the MFN feature corresponding to each ROI is transformed

into a ﬁxed-length feature through the ROI pooling [28]

and the global average pooling (GAP) layers. The feature

is fed into two fully connected (fc) layers. One is a one-of-

(C + 1) defect classiﬁcation layer (“cls”) and the other is a

bounding-box regression layer (“loc”).

The rest of this section introduces the details of DDN and

motivates why we need to design MFN into the network for

the defect detection task.

A. Baseline ConvNet Architecture

As we know, pretraining on the ImageNet data set is important for achieving competitive performance, and the pretrained model can then be fine-tuned on a relatively small defect data set. In this paper, we select the recent successful baseline

network ResNet as the backbone. ResNet presents several

attractive advantages as follows.

1) ResNet can achieve the state-of-the-art precision with

extremely few parameters, in comparison with the CNN

of sequential pipeline architecture of the same magnitude (ResNet50 vs. VGG16, roughly 25.6 M vs. 138 M parameters). It implies that ResNet has lower computational cost and less probability of overfitting.

2) ResNet uses GAP to process the final convolutional feature map instead of the dual stacked fc layers, which helps preserve more comprehensive location information of defects in the image.

3) ResNet has a modularized ConvNet, which is easy to

integrate.

In this paper, we select ResNet34 and ResNet50 as base-

line networks. The detailed structures of both networks

are shown in Table I, and residual blocks are denoted as

{R2, R3, R4, R5}.

B. Produce Multilevel Features

TABLE I

ARCHITECTURE OF BASELINE NETWORKS

Previous excellent approaches only utilize high-level features to extract region proposals (for example, the faster R-CNN extracts proposals from the last convolutional feature maps). In order to obtain quality region proposals, single-level features should be extended to multilevel features. Obviously, the simplest way is to assemble feature maps from multiple layers [31].

Therefore, now comes the question, which layers should be

combined? There are two essential conditions: nonadjacent,

because adjacent layers have highly local correlation [32], and

coverage, including features from low level to high level. For

a ResNet, the most intuitive way is to combine the last layers

in each residual block.

To fuse features at different levels, the proposed network

MFN is appended on the pretrained model. MFN has four

branches, denoted as {B2, B3, B4, B5}, and each branch

is a small network. B2, B3, B4, and B5 are sequentially

connected to the last layer of R2, R3, R4, and R5. When

an image flows through the baseline ConvNet, the Ri features are produced in order. The Ri feature is the feature map output from the last layer of the residual block Ri, i = 2,...,5. Similarly, the Bi feature is the feature map produced from the last layer of the MFN branch Bi, i = 2,...,5. Each Ri feature is fed to the corresponding branch in MFN, producing the Bi feature. Finally, multilevel features are obtained by concatenating the B2, B3, B4, and B5 features, which come from different stages of a CNN.
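The fusion step can be illustrated with a minimal sketch. It assumes that the four branch outputs differ in resolution by factors of two (as the standard ResNet stage strides suggest) and that they are brought to the B4 resolution before channel-wise concatenation. The resizing operators used here (max pooling and nearest-neighbor upsampling) and the function names are illustrative assumptions, not the branch design of the paper.

```python
def max_pool(fmap, k):
    """Downsample a [C][H][W] map by k with non-overlapping k x k max pooling."""
    return [[[max(ch[i * k + di][j * k + dj] for di in range(k) for dj in range(k))
              for j in range(len(ch[0]) // k)]
             for i in range(len(ch) // k)]
            for ch in fmap]


def upsample(fmap, k):
    """Upsample a [C][H][W] map by k with nearest-neighbor repetition."""
    return [[[ch[i // k][j // k] for j in range(len(ch[0]) * k)]
             for i in range(len(ch) * k)]
            for ch in fmap]


def fuse(b2, b3, b4, b5):
    """Bring four stage features to the b4 resolution and concatenate channels.

    Assumption: b2 and b3 are 4x and 2x larger than b4, and b5 is 2x smaller.
    """
    resized = [max_pool(b2, 4), max_pool(b3, 2), b4, upsample(b5, 2)]
    h, w = len(b4[0]), len(b4[0][0])
    # Every resized map must share the spatial size of b4 before concatenation.
    assert all(len(f[0]) == h and len(f[0][0]) == w for f in resized)
    return [channel for f in resized for channel in f]
```

The fused feature has as many channels as the four inputs combined, so the channel count of each branch directly controls the size of the layers that follow.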

As a ﬁnal note, MFN is efﬁcient in computation and strong

in generalization. MFN can reduce required parameters via

modifying the number of ﬁlters of 1 × 1 conv. This operation

may hurt accuracy but helps prevent overfitting in the case of

insufﬁcient training data.
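To make the parameter saving concrete, the following sketch counts the learnable parameters of a convolution layer. The function name and the filter counts in the usage note are hypothetical and chosen only for illustration; the 2048 input channels correspond to the ResNet50 feature depth mentioned in this paper.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Number of learnable parameters in a kernel x kernel convolution layer."""
    weights = kernel * kernel * c_in * c_out
    return weights + (c_out if bias else 0)
```

For example, a 1 × 1 conv on a 2048-channel input needs conv_params(1, 2048, 256) = 524544 parameters with 256 filters, but only conv_params(1, 2048, 128) = 262272 with 128 filters, so halving the number of filters roughly halves the cost of the layer.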

C. Extract Region Proposals

The RPN is employed to extract region proposals by sliding

on the multilevel feature maps. RPN takes an image of

arbitrary size as input and outputs anchor boxes (candidate

boxes), each with a score representing whether it is a defect

or not. The originality of RPN is the “anchor” scheme that makes anchor boxes in multiple scales and aspect ratios. Then, anchor boxes are hierarchically mapped to the input image so that region proposals of multiple scales and aspect ratios are produced. Owing to the resolution of the MFN feature, the RPN can be considered as sliding on the R4 feature. Following [14], we set three aspect ratios {1:1, 1:2, 2:1}. Considering multiple sizes of defects, we set four scales $\{64^2, 128^2, 256^2, 512^2\}$. Therefore, RPN produces 12 anchor boxes at each sliding location.
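The anchor count follows from 3 aspect ratios × 4 scales = 12. A minimal sketch of generating these anchors at one sliding location is given below. It assumes that each scale denotes the side length of a square of equal area and that a ratio is interpreted as height over width; both conventions, as well as the function name, are assumptions made for illustration.

```python
import math


def make_anchors(cx, cy, scales=(64, 128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Return (x1, y1, x2, y2) anchor boxes centered at (cx, cy).

    Each scale s yields boxes of area s * s; each ratio r is height / width
    (an assumed convention for the ratios 1:1, 1:2, and 2:1).
    """
    anchors = []
    for s in scales:
        area = float(s * s)
        for r in ratios:
            w = math.sqrt(area / r)  # width that keeps the area fixed
            h = w * r
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors
```

Calling make_anchors once per sliding location gives the 12 candidate boxes that RPN scores and refines.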

The region proposal extractor always ends with an ROI pooling layer. This layer performs a max-pooling operation over a feature map inside each ROI to convert it into a small feature cube (512-d for ResNet34 and 2048-d for ResNet50) with a fixed size of W × H (in this paper, 7 × 7). At last, based on these small cubes, the network calculates the offset of each region proposal from an adjacent groundtruth box and the probability that a defect exists.
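The ROI pooling operation can be sketched for a single channel as follows. The sketch assumes that the ROI is given in integer feature-map cells with exclusive end coordinates and that bin boundaries are obtained by floor and ceiling division; the function names and these conventions are illustrative assumptions.

```python
def _ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(fmap, roi, out_h=7, out_w=7):
    """Max-pool the region roi = (x1, y1, x2, y2) of a 2-D map into out_h x out_w bins.

    Coordinates are feature-map cells, with x2 and y2 exclusive.
    The ROI must cover at least one cell in each direction.
    """
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Row range of bin i; every bin covers at least one cell.
        r0 = y1 + (i * roi_h) // out_h
        r1 = max(r0 + 1, y1 + _ceil_div((i + 1) * roi_h, out_h))
        row = []
        for j in range(out_w):
            c0 = x1 + (j * roi_w) // out_w
            c1 = max(c0 + 1, x1 + _ceil_div((j + 1) * roi_w, out_w))
            row.append(max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

Applying the same pooling to every channel of the feature map gives the fixed 7 × 7 output regardless of the ROI size.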

For a single image, RPN may extract thousands of region

proposals. To deal with the redundant information, the greedy

nonmaximum suppression (NMS) is often applied for eliminating high-overlap region proposals. We set the intersection

over union (IOU) threshold for NMS at 0.7, which can discard

a majority of region proposals. After NMS, the top-K ranked

region proposals are selected from the rest. In the following,

we ﬁne-tune DDN using top-300 region proposals owing to

the extracted quality region proposals, but reduce this number

to accelerate the detection speed without harming accuracy at

test-time.
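The proposal filtering described above can be sketched as greedy NMS followed by top-K selection. The default IOU threshold of 0.7 and K = 300 follow the text; the function names and the (x1, y1, x2, y2) box format are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_top_k(boxes, scores, iou_threshold=0.7, top_k=300):
    """Greedy NMS; returns indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if len(keep) == top_k:
                break
    return keep
```

Because boxes are visited in descending score order, stopping once top_k boxes are kept gives the same result as running NMS to completion and then selecting the top-K survivors. Lowering top_k at test time, for example to 50, reduces the number of ROIs passed to the detector.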

IV. TRAINING

A. Multitask Loss Function

The defect detection task can be divided into two subtasks, hence DDN has two output layers. The cls layer outputs a discrete probability distribution, $k = (k_0, \ldots, k_C)$, for each ROI over C + 1 categories (C defect categories plus one background category). As usual, k is computed by a softmax function. The cls loss $L_{cls}$ is a log loss over two classes (defect or not defect). $L_{cls}(k, k^*) = -\log k_{k^*}$ where $k^*$ is the groundtruth class. The loc layer outputs bounding box regression offsets, $t = (t_x, t_y, t_w, t_h)$, for each of the C defect categories. As in [28], the loc loss $L_{loc}$ is a smooth L1 loss function. $L_{loc}(t, t^*) = \mathrm{SmoothL1}(t - t^*)$ where $t^*$ is the groundtruth box associated with a positive sample. For bounding box regression, we adopt the parameterizations of $t$ and $t^*$ given in [27]

$$
\begin{aligned}
t_x &= (x - x_a)/w_a, & t_y &= (y - y_a)/h_a \\
t_w &= \log(w/w_a), & t_h &= \log(h/h_a) \\
t_x^* &= (x^* - x_a)/w_a, & t_y^* &= (y^* - y_a)/h_a \\
t_w^* &= \log(w^*/w_a), & t_h^* &= \log(h^*/h_a)
\end{aligned}
\tag{1}
$$

where the subscripts x, y, w, and h denote each box’s center coordinates and its width and height. The variables $x$, $x_a$, and $x^*$ separately represent the predicted box, anchor box, and groundtruth box (the same rules for y, w, and h).

With these definitions, we minimize a multitask loss function, which is defined as

$$
L(k, k^*, t, t^*) = L_{cls}(k, k^*) + \lambda p^* L_{loc}(t, t^*) \tag{2}
$$
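The box parameterization of (1) and the multitask loss of (2) can be sketched for a single ROI as follows. The sketch assumes that class 0 is the background, that p* equals 1 for a defect sample and 0 for background, and that λ defaults to 1; these choices and all function names are assumptions made for illustration. The smooth L1 form follows the standard definition used in [28].

```python
import math


def encode_box(box, anchor):
    """Offsets (tx, ty, tw, th) of a (cx, cy, w, h) box relative to an anchor, as in (1)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


def smooth_l1(d):
    """Smooth L1 of one residual: quadratic near zero, linear elsewhere."""
    return 0.5 * d * d if abs(d) < 1.0 else abs(d) - 0.5


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(logits, true_class, t, t_star, lam=1.0):
    """L = L_cls + lam * p_star * L_loc for one ROI, following the form of (2).

    Assumptions: class 0 is background, and p_star is 1 only for defect samples.
    """
    k = softmax(logits)
    l_cls = -math.log(k[true_class])
    p_star = 1.0 if true_class > 0 else 0.0
    l_loc = sum(smooth_l1(a - b) for a, b in zip(t, t_star))
    return l_cls + lam * p_star * l_loc
```

With this form, a background ROI contributes only the classification term, while a defect ROI is also penalized for the distance between its predicted and groundtruth offsets.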
