
04 Jan 2012 - Journal of Multimedia - Vol. 7, Iss. 2, pp. 132-144


From the Digitization of Cultural Artifacts to the Web Publishing of Digital 3D Collections: an Automatic Pipeline for Knowledge Sharing

Frédéric Larue, Marco Di Benedetto, Matteo Dellepiane and Roberto Scopigno
ISTI-CNR, Pisa, Italy
Abstract: In this paper, we introduce a novel approach intended to simplify the production of multimedia content from real objects for the purpose of knowledge sharing, which is particularly appropriate to the cultural heritage field. It consists of a pipeline that covers all steps from the digitization of the objects up to the Web publishing of the resulting digital copies. During a first stage, the digitization is performed by a high-speed 3D scanner that recovers the object's geometry. A second stage then extracts from the recovered data a color texture as well as a texture of details, in order to enrich the acquired geometry in a more realistic way. Finally, a third stage converts these data so that they are compatible with the recent WebGL paradigm, thus providing 3D multimedia content directly exploitable by end-users through standard Internet browsers.
The pipeline design is centered on automation and speed, so that it can be used by non-expert users to produce multimedia content from potentially large collections of objects, as is often the case in cultural heritage. The choice of a high-speed scanner is particularly well suited to such a design, since this kind of device has the advantage of being fast and intuitive. The processing stages that follow the digitization are both completely automatic and "seamless", in the sense that it is not incumbent upon the user to perform tasks manually, nor to use external software that generally requires additional operations to solve compatibility issues.
I. INTRODUCTION
In the field of cultural heritage (CH), knowledge sharing is one of the most essential aspects of communication between museum institutions, which conserve and take care of cultural collections, and the public. Among other things, these activities include education, research and study as well as entertainment, all of which are valuable for the spread of culture. However, the public is not the only one to benefit from knowledge sharing: it is also important for promotion and advertisement purposes, which are of high interest to the museum institutions themselves with regard to visibility, development and long-term sustainability.
In order to preserve the integrity of cultural goods, knowledge sharing generally makes use of surrogates, so as to avoid directly exposing the real artifacts to potential risks of deterioration. For this purpose, multimedia technologies are becoming more and more widespread in the CH field, where these surrogates take the form of digital copies. This popularity can be explained by at least two reasons. On the one hand, computing tools clearly ease data storage, indexing, browsing and sharing, thanks to existing network facilities and to the new Web technologies. On the other hand, recent advances in 3D scanning make it possible to create multimedia content from real artifacts, producing faithful digital imprints and avoiding the tedious and time-consuming task of manual modeling with CAD software.
In particular, new high-speed systems like in-hand scanners offer significant advantages for CH. Firstly, they are able to acquire digital copies in a few minutes, which is really important when the multimedia content must be produced from huge collections in a reasonable time, or when several fragments must be scanned in order to plan the restoration of destroyed pieces. Moreover, they can be manipulated by non-expert users as well, since they provide interactive feedback and rely on the temporal coherency of the high-speed acquisition to get rid of the traditional alignment problems that generally need to be solved manually during a tedious post-processing phase.
Despite the availability of these technologies and their increasing popularity, there is still a lack of global, automatic solutions covering the whole processing chain that ranges from content creation to content publishing.
The first weak point of this chain occurs during the acquisition itself: for CH applications, a good representation of the geometry is not sufficient to produce faithful surrogates, since interactive visualization requires synthetic images as close as possible to the real appearance of the depicted object. In that case, the geometry needs to be paired with an accurate representation of the surface appearance (color, small shape details, reflection characteristics). Unfortunately, commercial scanning systems have mostly focused on shape measurement, putting aside until recently the problem of recovering quality textures from real objects. This has led to a lack of efficient and automatic processing tools for color acquisition and reconstruction.
The second problem is that the tasks of creating and publishing multimedia content are generally completely decoupled from each other. From a practical point of view, this means that a different software tool must be used for each of them. Work to convert the various inputs/outputs into compatible formats is then necessary, and generally consists of a manual task that is incumbent upon the user.
132
JOURNAL OF MULTIMEDIA, VOL. 7, NO. 2, MAY 2012
© 2012 ACADEMY PUBLISHER
doi:10.4304/jmm.7.2.132-144

Figure 1. Overview of the presented framework, which covers the whole chain from the 3D digitization of real CH artifacts up to the Web publishing of the resulting digital 3D copies for archiving, browsing and visualization over the Internet.
In this paper, we present a complete system that makes it possible to create colored 3D digital copies of existing artifacts and to publish them directly on the Internet, through an interactive visualization based on WebGL technology. This system, outlined in Figure 1, consists of three stages.
During the first one, acquisition is performed directly by the user in an intuitive manner thanks to an in-hand digitization device performing 3D scanning in real time. The data provided by the scanner, as well as some properties specific to this kind of device, are then exploited to automatically produce a diffuse color texture for the 3D model. This texture is free of the usual visual artifacts that may appear due to the presence, in the input pictures, of shadows, specular highlights, lighting inconsistencies or calibration inaccuracies. Moreover, since high-speed scanning systems are often less accurate than more classic digitization technologies, we also estimate a normal texture from these data, once again in an automatic manner. This texture captures the finest geometric details that may be missed during the 3D acquisition, and can then be used afterwards to enrich the original geometry during visualization.
Once geometry and texture information have been processed, the third and last stage of the production pipeline performs an optimization phase aimed at producing a compact and Web-friendly version of the data. The output of this stage is used for real-time visualization on commodity platforms. One of our main goals is the archival and deployment of digital copies using standard, well-established and widely accessible technologies: in this view, we use a standard Web server as our data provider, and WebGL technology to visualize and integrate the digital copy into standard Web pages.
The contributions proposed in this paper can be summarized as follows:
• a complete and almost fully automatic pipeline for the production of 3D multimedia content for Internet applications, covering a chain ranging from the digitization of real artifacts to the Web publishing of the produced digital copies;
• a texturing method specifically designed for real-time scanning systems, which accounts for specific properties of this kind of device in order to enrich the acquired 3D model with a good-quality color texture, free of cracks and illumination-related visual artifacts, as well as a normal texture capturing the finest geometric features;
• the coupling of intuitive acquisition techniques with the recent paradigms proposed by WebGL technology for Web publishing. Hence, the archival and sharing of vast item collections also becomes possible and easy for non-expert users.
The remainder of this paper is organized as follows. Section II reviews the related work on software approaches and complete systems for color acquisition, texture reconstruction and real-time visualization on the Web platform. Section III presents the first two stages of our system, namely the in-hand scanner used for the acquisition, as well as our processing step for generating a digital copy from the acquired data. The third and last stage, dedicated to the preparation of the digital copy for Web publishing, is then presented in Section IV. Finally, Section V shows the results achieved and Section VI draws the conclusions.
II. RELATED WORK
A. Real-time 3D scanning
A full overview of 3D scanning and stereo reconstruction goes well beyond the scope of this paper. We will mainly focus on systems for real-time, in-hand acquisition of geometry and/or color. Their main issues are the availability of the technology and the problem of aligning data in a very fast way.
Concerning the first point, 3D acquisition can be based on stereo techniques or on active optical scanning solutions. Among the latter, the most robust approach is based on the use of fast structured-light scanners [1], where a high-speed camera and a projector are used to recover range maps in real time. The alignment problem is usually solved with smart implementations of the ICP algorithm [2], [3], where the most difficult aspect to solve

is related to the loop closure during registration.
In the last few years, some in-hand scanning solutions have been proposed [2], [4], [5]: they essentially differ in the way projection patterns are handled and in the implementation of ICP. None of the proposed systems takes the acquisition of color into account, although the one proposed by Weise et al. [5] also contains a color camera (see the next section for a detailed description). This is essentially due to the low resolution of the cameras, and to the difficulty of handling the peculiar illumination provided by the projector. Other systems have been proposed that also take color into account, but they are not able to achieve real-time performance [6] or to reconstruct the geometry accurately [7].
B. Color acquisition and visualization on 3D models
Adding color information to an acquired 3D model is a complex task. The most flexible approach starts from a set of images acquired either in a second stage with respect to the geometry acquisition, or simultaneously but using different devices. Image-to-geometry registration, which can be solved by automatic [8]–[10] or semi-automatic [11] approaches, is then necessary. In our case, this registration step is not required, because the in-hand scanning system provides images that are already aligned to the 3D model.
Once alignment is performed, it is necessary to extract information about the surface material appearance and transfer it onto the geometry. The most accurate way to represent the material properties of an object is to describe them through a reflection function (e.g. a BRDF), which attempts to model the observed scattering behavior of a class of real surfaces. A detailed presentation of its theory and applications can be found in Dorsey [12]. Unfortunately, state-of-the-art BRDF acquisition approaches rely on complex and controlled illumination setups, making them difficult to apply in more general cases, or when fast or unconstrained acquisition is needed.
A less accurate but more robust solution is the direct use of images to transfer the color to the 3D model. In this case, the apparent color value is mapped onto the digital object's surface by applying an inverse projection. In addition to other important issues, there are numerous difficulties in selecting the correct color when multiple candidates come from different images.
To solve these problems, a first group of methods selects, for each surface part, a portion of a representative image following a specific criterion (in most cases, the orthogonality between the surface and the view direction) [13], [14]. However, due to the lack of consistency from one image to another, artifacts are visible at the junctions between surface areas receiving color from different images. They can be partially removed by working on these junctions [13]–[15].
Another group of methods "blends" all image contributions by assigning a weight to each one or to each input pixel, and by selecting the final surface color as the weighted average of the input data, as in Pulli et al. [16]. The weight is usually a combination of various quality metrics [17]–[19]. In particular, Callieri et al. [20] presented a flexible weighting system that can be extended to accommodate additional criteria. These methods provide better visual results and their implementations can handle very complex datasets, i.e. hundreds of images and very dense 3D models. Nevertheless, undesirable ghosting effects may be produced when the starting set of calibrated images is not perfectly aligned. This problem can be solved, for example, by applying a local warping using optical flow [21], [22].
Another issue, common to all the cited methods, is the projection of lighting artifacts onto the model, i.e. shadows, highlights, and peculiar BRDF effects, since the lighting environment is usually not known in advance. In order to correct (or avoid projecting) lighting artifacts, two possible approaches are the estimation of the lighting environment [23] and the use of easily controllable lighting setups [24].
C. 3D graphics on the Web platform
Since the birth of the Internet, the content of Web documents has included several types of media, ranging from plain text to images, audio and video streams. When personal computers started being equipped with fast enough graphics acceleration hardware, 3D content began in its turn to play an important role in the multimedia sphere. The first tools aimed at visualizing 3D models in Web pages were based on embedded software components, such as Java applets or ActiveX controls [25]. Several proprietary plug-ins and extensions for Web browsers were developed, evidencing the lack of standardization for this new content type. Besides the developer fragmentation arising from this wide variety of available tools and from their incompatibilities, the burden incumbent upon the user of installing additional software components prevented a wide adoption of online 3D content.
Steps toward standardization were taken with the introduction of the Virtual Reality Modeling Language (VRML) [26] in 1995 and X3D [27] in 2007. However, even though they have been well accepted by the community, 3D scene visualization was still delegated to external software components.
The fundamental change happened in 2009 with the introduction of the WebGL standard [28], promoted by the Khronos Group [29]. With minor restrictions related to security issues, the WebGL API is a one-to-one mapping of the OpenGL|ES 2.0 specification [30] to JavaScript. This implies that modern Web browsers, like Google Chrome or Mozilla Firefox, are able to natively access the graphics hardware without needing additional plug-ins or extensions. WebGL being a low-level API, a series of higher-level libraries have been developed on top of it. They differ from each other in the programming paradigm they use, ranging from scene-graph-based interfaces, like Scene.js [31] and GLGE [32], to procedural paradigms, like SpiderGL [33] and WebGLU [34].

Figure 2. The in-hand scanning device used during the first step of the presented workflow, producing the data flow required for the generation of digital copies of cultural artifacts.
In our pipeline, as will be shown, we use SpiderGL as the rendering library for the real-time visualization of the acquired digital copies.
III. DIGITIZATION AND PROCESSING OF CH ARTIFACTS FOR GENERATING DIGITAL COPIES
Cultural heritage has been a privileged field of application for 3D scanning since the beginning of its evolution. This is due to the enormous variety and variability of the types of objects that can be acquired. Moreover, archival and preservation are extremely important issues as well. Although 3D scanning can now be considered a "mature" technology, the acquisition of a large number of objects can be expensive both in terms of hardware and of time needed for data processing. Very good results can be achieved by customizing solutions for collections where objects are almost of the same size and material, but this can be expensive [35] or hard to extend to generic cases [36]. Although some low-cost and/or hand-held devices are available, they usually need the placement of markers on the object, which is hard to do on CH artifacts. Conversely, the presented method uses only an affordable scanning system and does not make any particular assumption about the measured objects (except that they can be manipulated by hand), neither for the scanning session itself nor for the post-processing steps.
This section describes the first two stages of our workflow: how and with which technology real artifacts can be easily digitized by the user (Section III-A), and how the resulting data are exploited to automatically recover a color texture (Section III-B) and a texture of details (Section III-C) to enrich the 3D model provided by the acquisition.
A. Acquisition by in-hand scanning
The first stage of our workflow, namely the one producing all the data required for the creation of digital copies of cultural artifacts, is based on an in-hand scanner whose hardware configuration is shown in Figure 2. This scanner, like most high-speed digitization systems, is based on structured light. Shape measurement is performed by phase shifting, using three different sinusoidal patterns to establish correspondences (and then to perform optical triangulation) between the projector and the two black-and-white video cameras. The phase unwrapping, namely how the different signal periods are demodulated, is achieved by a GPU stereo matching between both cameras (see [3], [5] for more details). The whole process produces one range map in 14 ms. Simultaneously, a color video flow is captured by the third camera. During an acquisition, the only light source in the scene is the scanner projector itself, whose position is always perfectly known.
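To illustrate the phase-shifting principle mentioned above: with three sinusoidal patterns shifted by 120 degrees, the wrapped phase at each pixel follows from a standard closed-form arctangent. The sketch below is a textbook simplification on synthetic images (no unwrapping, no GPU stereo matching) and not the scanner's actual implementation:

```python
import numpy as np

def wrapped_phase(i1, i2, i3):
    """Wrapped phase from three sinusoidal patterns shifted by 120 degrees.

    i1, i2, i3: images captured under the three projected patterns
    (NumPy arrays of identical shape). Returns phase in (-pi, pi].
    """
    return np.arctan2(np.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)

# Synthetic check: recover a known phase from simulated pattern images.
phi = np.linspace(-3.0, 3.0, 7)                  # ground-truth phase per pixel
shifts = [-2.0 * np.pi / 3.0, 0.0, 2.0 * np.pi / 3.0]
i1, i2, i3 = (0.5 + 0.4 * np.cos(phi + s) for s in shifts)
print(np.allclose(wrapped_phase(i1, i2, i3), phi, atol=1e-6))
```

Real systems must still unwrap this phase across signal periods, which is precisely the step the scanner delegates to GPU stereo matching.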
The scanning can be performed in two different ways. If the object color is neither red nor brown, it can be done by holding the object directly by hand. In this case, occlusions are detected by a hue analysis which produces, for each video frame, a map of skin presence probability. Otherwise, a black glove must be used. Although much less convenient for the scanning itself, it makes the occlusion detection trivial: dark regions in the input pictures are simply ignored.
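As an illustration of the hue-based occlusion test, a minimal skin mask can be sketched by converting each pixel to hue and saturation and thresholding: skin tones cluster at reddish hues with moderate saturation. The thresholds below are illustrative assumptions, not the scanner's actual classifier:

```python
import numpy as np

def skin_probability(rgb):
    """Crude per-pixel skin likelihood from hue analysis.

    rgb: float array of shape (..., 3) with values in [0, 1].
    Returns 1.0 where a pixel looks skin-like (to be discarded as an
    occlusion), 0.0 elsewhere. Thresholds are illustrative only.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx, mn = rgb.max(axis=-1), rgb.min(axis=-1)
    delta = np.where(mx > mn, mx - mn, 1.0)          # avoid division by zero
    hue = np.zeros_like(mx)                          # hue in degrees
    hue = np.where(mx == r, (60.0 * (g - b) / delta) % 360.0, hue)
    hue = np.where(mx == g, (60.0 * (b - r) / delta + 120.0) % 360.0, hue)
    hue = np.where(mx == b, (60.0 * (r - g) / delta + 240.0) % 360.0, hue)
    sat = (mx - mn) / np.where(mx > 0, mx, 1.0)
    reddish = (hue < 50.0) | (hue > 340.0)           # skin tones cluster here
    return np.where(reddish & (sat > 0.2) & (mx > 0.2), 1.0, 0.0)

reddish_patch = np.full((2, 2, 3), [0.8, 0.5, 0.4])  # skin-like color
blue_patch = np.full((2, 2, 3), [0.2, 0.3, 0.8])     # clearly not skin
print(skin_probability(reddish_patch).mean(), skin_probability(blue_patch).mean())
```

This hue-only test also explains the restriction quoted above: red or brown objects would be indistinguishable from skin, hence the black glove.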
Each scanning session then produces a 3D mesh and a color video flow. For each frame of this video, the viewpoint and the position of the light (i.e. the scanner projector) are given, as well as the skin probability map in the case of a digitization performed by hand. These data are then used in the next stage of the workflow to produce the color texture and the texture of details, as explained hereafter in Section III-B.
Even if this stage requires user intervention, the choice of a real-time scanner to perform the digitization is particularly appropriate for non-expert operators. Indeed, for the reasons already discussed in the introduction, its usage is particularly easy and intuitive, and does not require technical knowledge or manual post-processing. Finally, it must be noted that our method does not work only with the presented scanner, but can be implemented for any high-speed digitization device based on the same principle.
B. Recovery of a diffuse color texture
Our texturing method extends the work proposed in [20] so as to adapt it to the data flow produced by the scanning system presented above. The idea, summarized in Figure 3, is to weight each input picture by a mask (typically a gray-scale image) which represents a per-pixel confidence value. The final color of a given surface point is then computed as the weighted average of all the color contributions coming from the pictures in which this point is visible. Masks are built by the composition of multiple elementary masks, which are themselves computed by image processing applied either to the input image or to a rendering of the mesh performed from the same viewpoint.
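The weighted averaging described above can be sketched as follows. The function and data layout are hypothetical simplifications (visibility resolution and texture parametrization are omitted); a mask value of zero simply means the point is not visible in that frame:

```python
import numpy as np

def blend_colors(colors, masks):
    """Mask-weighted average of per-frame color contributions.

    colors: (n_frames, n_points, 3) color sampled from each frame
    masks:  (n_frames, n_points) per-pixel confidence weights
            (0 where the point is not visible in that frame)
    Returns (n_points, 3) blended colors.
    """
    w = masks[..., None]                       # broadcast weights over RGB
    total = w.sum(axis=0)
    total = np.where(total > 0, total, 1.0)    # points seen by no frame stay black
    return (w * colors).sum(axis=0) / total

# Two frames observing one point: the high-confidence frame dominates.
colors = np.array([[[1.0, 0.0, 0.0]], [[0.0, 0.0, 1.0]]])
masks = np.array([[0.75], [0.25]])
print(blend_colors(colors, masks))   # much closer to red than to blue
```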
Figure 3. The diffuse color texture is computed as the weighted average of all video frames. Weights are obtained by the composition of multiple elementary masks, each one corresponding to a particular criterion related to viewing, lighting or scanning conditions.
In the original paper, three criteria related to viewing conditions have been considered for the mask computation: the distance to the camera, the orientation with respect to the viewpoint, and the proximity to a step discontinuity. These criteria have been chosen so as to penalize image regions that are known to lack accuracy, in order to deal with data redundancy from one image to another in a way that ensures seamless color transitions. More details about these masks can be found in [20].
Although sufficient to avoid texture cracks, these masks cannot handle self-projected shadows or specular highlights, since knowledge about the lighting is necessary. In our case, the positions of both the viewpoint and the light (the projector lamp) are always exactly known. Moreover, the light moves with the scanner, which means that highlights, shadows and the illumination direction are different for each frame. We then define the three following additional masks, which aim to give prevailing weight to image parts free of illumination effects:
• Shadows. Since the complete geometric configuration of the scene is known, we can use a simple shadow mapping algorithm to estimate shadowed areas, to which a null weight is assigned.
• Specular highlights. Unlike shadows, highlights partially depend on the object material, which is unknown. For this reason, we use a multi-pass algorithm to detect them. The first pass computes the object texture without accounting for highlights. Due to the high data redundancy, the averaging tends to reduce their visual impact. During the subsequent passes, highlights are identified by computing the luminosity difference between the texture obtained at the previous pass and the input picture. This difference corresponds to our highlight removal mask. In practice, only two passes are sufficient.
• Projector illumination. This mask aims at avoiding luminosity loss during the averaging by giving more influence to surface parts facing the light source. It corresponds to the dot product between the surface normal and the line of sight.
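The multi-pass highlight detection above can be sketched as: blend once without a highlight mask, then down-weight pixels that are much brighter than the first-pass estimate. The function name and the exponential luminance-to-weight mapping are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def highlight_mask(frame_lum, blended_lum, softness=0.1):
    """Per-pixel weight suppressing likely specular highlights.

    frame_lum:   luminance of the input frame at each surface sample
    blended_lum: luminance predicted by the first-pass blended texture
    Samples much brighter than the first-pass estimate are treated as
    highlights and receive a weight close to 0.
    """
    excess = np.maximum(frame_lum - blended_lum, 0.0)
    return np.exp(-excess / softness)

frame = np.array([0.30, 0.32, 0.95])     # last sample: specular spike
blended = np.array([0.30, 0.30, 0.35])   # averaging already tamed the spike
print(highlight_mask(frame, blended))    # third weight drops near zero
```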
We also introduce two other masks to cope with the occlusions that are inherent to in-hand scanning. Indeed, if they are ignored, picture regions corresponding to the operator's hand may be introduced into the computation, leading to visible artifacts in the final texture. Thus, when digitization is performed with the dark glove, an occlusion mask is simply computed by thresholding pixel intensities. In the case of a digitization made by hand, the mask corresponds to the aforementioned skin probability map produced by the scanner.
Each elementary mask contains values in the range ]0, 1], zero being excluded in order to ensure that texturing is guaranteed for every surface point that is visible in at least one picture. The masks are all multiplied together to produce the final mask that selectively weights the pixels of the corresponding picture. During this operation, each elementary mask can obviously be applied more than once. The influence of each criterion can thus be tuned independently, although we empirically determined default values that work quite well in most cases.
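The composition step can be sketched as a per-pixel product of elementary masks, each raised to a tunable exponent, so that applying a mask "more than once" simply strengthens its criterion. Mask names and exponents below are illustrative placeholders:

```python
import numpy as np

def compose_masks(masks, exponents):
    """Multiply elementary masks, each raised to its own exponent.

    masks: dict name -> array of per-pixel values in (0, 1]
    exponents: dict name -> float; larger values strengthen a criterion.
    The product of values in (0, 1] stays strictly positive, so every
    point visible in at least one frame still receives a color.
    """
    out = np.ones_like(next(iter(masks.values())))
    for name, m in masks.items():
        out *= m ** exponents.get(name, 1.0)
    return out

masks = {
    "orientation": np.array([0.9, 0.5]),
    "distance": np.array([1.0, 0.8]),
    "projector": np.array([0.7, 0.7]),
}
weights = compose_masks(masks, {"orientation": 2.0})  # stress view orientation
print(weights)
```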
C. Recovery of a texture of details
Despite the fact that in-hand scanning is a really convenient technology, it often leads to a loss of accuracy with respect to traditional scanning devices, thus preventing the acquisition of the finest geometric details. Nevertheless, since we know the light position for each video frame, it is possible to partially recover them by using a photometric stereo approach.
Photometric stereo consists in computing high-quality normal/range maps by taking several photographs from the same viewpoint but with different illumination directions [37]–[39], or by moving the object in front of a camera and a light source that are fixed with respect to each other [40], [41]. We use here a similar approach for extracting a normal map from the video flow produced by the in-hand scanner.
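Classic Lambertian photometric stereo, on which the approach above builds, reduces to a small least-squares problem per pixel: given known light directions and the intensities observed under them, the albedo-scaled normal solves a linear system. The sketch below is a minimal, idealized version (fixed viewpoint, no shadows or highlights), not the paper's full video-based method:

```python
import numpy as np

def estimate_normal(intensities, light_dirs):
    """Least-squares Lambertian photometric stereo at one surface point.

    intensities: (n,) observed intensities under each light
    light_dirs:  (n, 3) unit light directions
    Returns (unit normal, albedo).
    """
    # Solve light_dirs @ g = intensities for g = albedo * normal.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)
    albedo = np.linalg.norm(g)
    return g / albedo, albedo

# Synthetic check: recover a known normal from simulated observations.
n_true = np.array([0.0, 0.6, 0.8])
lights = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0], [0.577, 0.577, 0.577]])
c = 0.5 * lights @ n_true            # albedo 0.5, noise-free observations
n_est, rho = estimate_normal(c, lights)
print(np.allclose(n_est, n_true, atol=1e-6))
```

With the in-hand scanner the light moves with the device, so each frame contributes one such observation per surface point once visibility and occlusions are accounted for.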
In the following, vectors are assumed to be column vectors. Let {F_i} be the set of frames corresponding to the acquisition sequence. A light position l_i, corresponding to the scanner projector's location, is associated with each frame F_i. Assuming that the object surface is Lambertian, the color c_i observed at a given surface point p in F_i is

Citations
More filters

Journal ArticleDOI
TL;DR: The results indicated that students had a positive attitude toward the use of an interactive virtual museum in cultural heritage education, and confirmed the views of experts regarding the importance and the value of virtual museums as a method for effective learning about cultural heritage.
Abstract: The goal of this study was to investigate students’ views of the interactive Virtual Museum of Al Hassa Cultural Heritage. In this context, a study was carried out during the second semester of the 2014–2015 school year among sixth-grade elementary school students in Al Hassa, Saudi Arabia. After participating in an interactive virtual museum, 118 students answered a questionnaire after the teaching intervention. SPSS v.21 was used to analyze the data. The results indicated that students had a positive attitude toward the use of an interactive virtual museum in cultural heritage education. The results support the inclusion of cultural heritage in the social studies curricula in K–12 education in Saudi Arabia in order to raise awareness and knowledge of national heritage. The results also confirmed the views of experts regarding the importance and the value of virtual museums as a method for effective learning about cultural heritage.

8 citations


Cites background from "From the Digitization of Cultural A..."

  • ...The importance of cultural heritage education has been underlined in the literature review [2-4, 30]....

    [...]

  • ...A number of studies have claimed that museums support and enhance cultural heritage education [1-4]....

    [...]

  • ...In the field of cultural heritage, knowledge sharing is an essential aspect of communication that preserves and maintains cultural collections [2]....

    [...]

  • ...Multimedia technologies such as VMs are increasingly prevalent in cultural heritage [2]....

    [...]


Journal ArticleDOI
TL;DR: A new local shape descriptor for 3D surfaces, called the histogram of spherical orientations (HoSO), is developed, which is used in combination with a bag-of-words approach to compute visual similarity between3D surfaces.
Abstract: We address the problem of the statistical description of 3D surfaces with the purpose of automatic classification and retrieval of archaeological potsherds. These are particularly interesting problems in archaeology, as pottery comprises a great volume of findings in archaeological excavations. Indeed, the analysis of potsherds brings relevant cues for understanding the culture of ancient groups. In particular, we develop a new local shape descriptor for 3D surfaces, called the histogram of spherical orientations (HoSO), which we use in combination with a bag-of-words approach to compute visual similarity between 3D surfaces. Given a point of interest on a 3D surface, its local shape descriptor (HoSO) captures the distribution of the spherical orientations of its neighboring points. In turn, those spherical orientations are computed with respect to the point of interest itself, both in the azimuth and the zenith axis. The proposed HoSO is invariant to scale transformations and highly robust to rotation and noise. In addition, it is efficient, as it only exploits the information of the position of the 3D points and disregards other types of information like faces or normals. We performed experiments on a set of 3D surfaces representing potsherds from the Teotihuacan civilization and further validations on a set of 3D models of generic objects. Our results show that our methodology is effective for describing 3D models and that it improves classification performance with respect to previous local descriptors.

8 citations


Cites background from "From the Digitization of Cultural A..."

  • ...…the cultural value and fragility of objects compel professionals to search for better ways to guarantee long-term archiving [Razdan et al. 2001; Larue et al. 2012], as well as to expedite sharing of...

    [...]

  • ...In recent years, 3D digitization has become a standard technique to document shape of artifacts, especially in fields like archaeology [Maiza and Gaildrat 2005; Karasik and Smilanski 2008], where the cultural value and fragility of objects compel professionals to search for better ways to guarantee long-term archiving [Razdan et al. 2001; Larue et al. 2012], as well as to expedite sharing of...

    [...]


DOI
01 Jan 2013
TL;DR: This paper focuses on content-based image retrieval, which involves clustering, sparse coding, and histogram of orientations in the context of Maya civilization.
Abstract: Keywords: content-based image retrieval; shape descriptor; histogram of orientations; clustering; sparse coding; image detection; cultural heritage; Maya civilization; hieroglyphs. Thesis, École polytechnique fédérale de Lausanne (EPFL), no. 5616 (2013). Doctoral program in Electrical Engineering, Faculté des sciences et techniques de l'ingénieur, Institut de génie électrique et électronique, Laboratoire de l'IDIAP. Jury: S. Süsstrunk (president), S. Marchand-Maillet, J.-Ph. Thiran, C. Wang. Public defense: 2013-02-27. Reference doi: 10.5075/epfl-thesis-5616. Print copy in library catalog. Record created on 2013-02-20, modified on 2017-05-10.

7 citations


Journal ArticleDOI
TL;DR: The research presented here focuses on the contribution of 3D documentation and subsequent analysis of the well complex for understanding social aspects related to and reflected by the architectural remains.
Abstract: The paper presents the 3D investigation of the architectonic remains at the well complex of the archaeological site of Santa Cristina, located near the town of Paulilatino, in the province of Oristano, on the island of Sardinia, Italy. The remains visible today combine original fragments of the initial structure, built some 3,000 years ago, with a modern reconstruction conducted almost half a century ago. Although the site has been excavated and its remains investigated for more than 50 years, no publications detailing the archaeological finds are available. The research presented here focuses on the contribution of 3D documentation and subsequent analysis of the well complex for understanding social aspects related to and reflected by the architectural remains.

6 citations


Book ChapterDOI
01 Nov 2014
TL;DR: The Tepalcatl project, an ongoing bi-disciplinary effort conducted by archaeologists and computer vision researchers, which focuses on developing statistical methods for the automatic categorization of potsherds from ancient Mexico including the Teotihuacan and Aztec civilizations, is introduced.
Abstract: We introduce the Tepalcatl project, an ongoing bi-disciplinary effort conducted by archaeologists and computer vision researchers, which focuses on developing statistical methods for the automatic categorization of potsherds; more precisely, potsherds from ancient Mexico including the Teotihuacan and Aztec civilizations. We captured 3D models of several potsherds, and annotated them using seven taxonomic criteria appropriate for categorization. Our first task consisted in exploiting the descriptive power of two state-of-the-art 3D descriptors. Then, we evaluated their retrieval and classification performance. Finally, we investigated the effects of dimensionality reduction for categorization of our data. Our results are promising and demonstrate the potential of computer vision techniques for archaeological classification of potsherds.

6 citations


Cites background from "From the Digitization of Cultural A..."

  • ...The extraction of 3D digital data has also brought extra benefits, such as the possibility to undertake new types of content analyses, as well as an easier sharing of information among professionals, the design of better ceramic documentation and archiving systems [6, 7], and the performance of virtual reconstruction of vessels [4, 5]....

    [...]

  • ...This is especially true with regard to the creation and use of digital 3D models, which enable capabilities that would not be available using the original artifacts, such as automatic and semi-automatic content analysis [2, 3], virtual reconstructions [4, 5], more efficient archiving [6, 7], sharing documentation online [1, 7], training of novel scholars, etc....

    [...]


References

01 Jan 1967
Abstract: The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give partitions which are reasonably efficient in the sense of within-class variance. That is, if p is the probability mass function for the population, $S = \{S_1, S_2, \ldots, S_k\}$ is a partition of $E^N$, and $u_i$, $i = 1, 2, \ldots, k$, is the conditional mean of p over the set $S_i$, then $W^2(S) = \sum_{i=1}^{k} \int_{S_i} |z - u_i|^2 \, dp(z)$ tends to be low for the partitions S generated by the method. We say 'tends to be low,' primarily because of intuitive considerations, corroborated to some extent by mathematical analysis and practical computational experience. Also, the k-means procedure is easily programmed and is computationally economical, so that it is feasible to process very large samples on a digital computer. Possible applications include methods for similarity grouping, nonlinear prediction, approximating multivariate distributions, and nonparametric tests for independence among several variables. In addition to suggesting practical classification methods, the study of k-means has proved to be theoretically interesting. The k-means concept represents a generalization of the ordinary sample mean, and one is naturally led to study the pertinent asymptotic behavior, the object being to establish some sort of law of large numbers for the k-means. This problem is sufficiently interesting, in fact, for us to devote a good portion of this paper to it. The k-means are defined in section 2.1, and the main results which have been obtained on the asymptotic behavior are given there. The rest of section 2 is devoted to the proofs of these results. Section 3 describes several specific possible applications, and reports some preliminary results from computer experiments conducted to explore the possibilities inherent in the k-means idea. The extension to general metric spaces is indicated briefly in section 4.
The original point of departure for the work described here was a series of problems in optimal classification (MacQueen [9]) which represented special
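The partitioning process described in the abstract is easy to sketch. The following is a minimal batch (Lloyd-style) iteration, which alternates nearest-centroid assignment with conditional-mean updates; note that MacQueen's original procedure is the sequential-update variant, so this is an illustration of the k-means idea rather than the paper's exact algorithm:

```python
import numpy as np

def k_means(points, k, iters=20, seed=0):
    """Minimal Lloyd-style k-means: alternate assignment and mean update."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct sample points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the conditional mean of its set S_i,
        # which lowers the within-class variance W^2(S).
        for i in range(k):
            if np.any(labels == i):
                centroids[i] = points[labels == i].mean(axis=0)
    return centroids, labels
```

On well-separated data, a few iterations suffice for the centroids to settle on the cluster means.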

22,533 citations


"From the Digitization of Cultural A..." refers background or methods in this paper

  • ...Photometric stereo consists in computing high quality normal/range maps by taking several photographs from the same viewpoint but with different illumination directions [37]–[39], or by moving the object in front of a camera and a light source that are fixed with respect to each other [40], [41]....

    [...]

  • ...This is not a one-off fortuitous occurrence, but generally true (see additional examples at [37])....

    [...]

  • ...The figures for other values of d are at [37]....

    [...]

  • ...EXPERIMENTAL RESULTS We have designed all experiments such that they are reproducible, and as such, all data and code are freely available at [37]....

    [...]
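The photometric stereo technique mentioned in the snippets above recovers per-pixel surface normals from several photographs taken under known, differing illumination directions. A minimal least-squares sketch under a Lambertian model (illustrative only; names and data are hypothetical):

```python
import numpy as np

def photometric_stereo(intensities, light_dirs):
    """Recover per-pixel surface normals and albedo from a Lambertian
    image stack by solving I = L @ (albedo * n) in least squares.

    intensities: (m, h, w) stack of images under m known lights
    light_dirs:  (m, 3) unit light directions
    """
    m, h, w = intensities.shape
    I = intensities.reshape(m, -1)                        # (m, h*w)
    g, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)    # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)                    # |g| = albedo
    normals = g / np.maximum(albedo, 1e-12)               # unit normals
    return normals.reshape(3, h, w), albedo.reshape(h, w)
```

With at least three non-coplanar lights the system is determined per pixel; more lights make the least-squares estimate robust to noise.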


Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,266 citations


"From the Digitization of Cultural A..." refers background or methods in this paper

  • ...They differ from each other by the programming paradigm they use, ranging from scene-graph-based interfaces, like Scene.js [31] and GLGE [32], to procedural paradigms, like SpiderGL [33] and WebGLU [34]....

    [...]

  • ...do consider “Finding Motifs in a Database of Shapes” [34]....

    [...]

  • ...In 2007, Xi et al.[34] proposed a method to find image motifs or the most similar pair of images in the image database....

    [...]

  • ...js [31] and GLGE [32], to procedural paradigms, like SpiderGL [33] and WebGLU [34]....

    [...]


Proceedings ArticleDOI
Sivic, Zisserman
13 Oct 2003
TL;DR: An approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video, represented by a set of viewpoint invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination and partial occlusion.
Abstract: We describe an approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video. The object is represented by a set of viewpoint invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination and partial occlusion. The temporal continuity of the video within a shot is used to track the regions in order to reject unstable regions and reduce the effects of noise in the descriptors. The analogy with text retrieval is in the implementation where matches on descriptors are pre-computed (using vector quantization), and inverted file systems and document rankings are used. The result is that retrieval is immediate, returning a ranked list of key frames/shots in the manner of Google. The method is illustrated for matching in two full length feature films.
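The text-retrieval analogy at the heart of the abstract above (visual words, inverted files, ranked results) can be sketched generically; the tf-idf weighting below is the standard formulation, not necessarily the paper's exact variant, and all names are illustrative:

```python
import math
from collections import defaultdict

def build_inverted_index(docs):
    """docs: mapping doc_id -> list of quantized visual-word ids.
    Returns (index, idf), where index maps word -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc_id, words in docs.items():
        for w in words:
            index[w][doc_id] = index[w].get(doc_id, 0) + 1
    n = len(docs)
    # Rare words are more discriminative: idf = log(N / doc frequency).
    idf = {w: math.log(n / len(postings)) for w, postings in index.items()}
    return index, idf

def query(q_words, index, idf):
    """Score only documents sharing at least one word with the query,
    via the inverted file; return (doc_id, score) pairs, best first."""
    scores = defaultdict(float)
    for w in set(q_words):
        for doc_id, tf in index.get(w, {}).items():
            scores[doc_id] += tf * idf.get(w, 0.0)
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

Because only postings lists of the query's words are touched, retrieval time depends on the query, not on the total database size, which is what makes the response effectively immediate.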

6,757 citations


Proceedings Article
01 Jan 2004
TL;DR: This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches and shows that it is simple, computationally efficient and intrinsically invariant.
Abstract: We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naive Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for simultaneously classifying seven semantic visual categories. These results clearly demonstrate that the method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.
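The core "bag of keypoints" step described above, vector-quantizing local descriptors against a codebook and forming a normalized occurrence histogram, can be sketched as follows (in practice the codebook itself comes from clustering training descriptors, e.g. with k-means):

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantize each local descriptor to its nearest codeword and
    return a normalized 'bag of keypoints' histogram for the image."""
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = d.argmin(axis=1)                       # nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

The resulting fixed-length histogram discards geometric layout, which is why the representation is robust to background clutter, and it can be fed directly to a Naive Bayes or SVM classifier.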

4,911 citations


"From the Digitization of Cultural A..." refers background or methods in this paper

  • ...West: A Monograph of the British Desmidiaceae [32], which is still referenced in modern scientific texts, and some of them are vanity publications by “gentlemen scholars”....

    [...]

  • ...They differ from each other by the programming paradigm they use, ranging from scene-graph-based interfaces, like Scene.js [31] and GLGE [32], to procedural paradigms, like SpiderGL [33] and WebGLU [34]....

    [...]

  • ...js [31] and GLGE [32], to procedural paradigms, like SpiderGL [33] and WebGLU [34]....

    [...]

  • ...They are typical examples from the perhaps hundreds of books on Diatoms published during the Victorian era [23][28][32]....

    [...]


Journal ArticleDOI
TL;DR: A comparative evaluation of different detectors is presented and it is shown that the proposed approach for detecting interest points invariant to scale and affine transformations provides better results than existing methods.
Abstract: In this paper we propose a novel approach for detecting interest points invariant to scale and affine transformations. Our scale and affine invariant detectors are based on the following recent results: (1) Interest points extracted with the Harris detector can be adapted to affine transformations and give repeatable results (geometrically stable). (2) The characteristic scale of a local structure is indicated by a local extremum over scale of normalized derivatives (the Laplacian). (3) The affine shape of a point neighborhood is estimated based on the second moment matrix. Our scale invariant detector computes a multi-scale representation for the Harris interest point detector and then selects points at which a local measure (the Laplacian) is maximal over scales. This provides a set of distinctive points which are invariant to scale, rotation and translation as well as robust to illumination changes and limited changes of viewpoint. The characteristic scale determines a scale invariant region for each point. We extend the scale invariant detector to affine invariance by estimating the affine shape of a point neighborhood. An iterative algorithm modifies location, scale and neighborhood of each point and converges to affine invariant points. This method can deal with significant affine transformations including large scale changes. The characteristic scale and the affine shape of neighborhood determine an affine invariant region for each point. We present a comparative evaluation of different detectors and show that our approach provides better results than existing methods. The performance of our detector is also confirmed by excellent matching results; the image is described by a set of scale/affine invariant descriptors computed on the regions associated with our points.
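The characteristic-scale selection in point (2) above can be sketched numerically: smooth the image at a range of scales and pick the scale where the scale-normalized Laplacian response is extremal. This simplified sketch omits the Harris point detection and affine adaptation steps, and the function names are illustrative:

```python
import numpy as np

def gaussian_kernel(sigma):
    """Truncated, normalized 1D Gaussian kernel."""
    r = int(3 * sigma) + 1
    x = np.arange(-r, r + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth2d(img, sigma):
    """Separable Gaussian smoothing via two passes of 1D convolution."""
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda row: np.convolve(row, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda col: np.convolve(col, k, mode="same"), 0, tmp)

def characteristic_scale(image, y, x, sigmas):
    """Pick the scale maximizing |sigma^2 * Laplacian| at pixel (y, x):
    the extremum over scale of the normalized Laplacian marks the
    characteristic scale of the local structure."""
    responses = []
    for s in sigmas:
        g = smooth2d(image, s)
        lap = (np.roll(g, 1, 0) + np.roll(g, -1, 0) +
               np.roll(g, 1, 1) + np.roll(g, -1, 1) - 4.0 * g)  # 5-point stencil
        responses.append(abs(s**2 * lap[y, x]))
    return sigmas[int(np.argmax(responses))]
```

For a Gaussian blob of standard deviation sigma_b, the continuous analysis predicts the normalized response peaks at sigma = sigma_b, so the selected scale tracks the size of the structure.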

3,971 citations


"From the Digitization of Cultural A..." refers background in this paper

  • ...Photometric stereo consists in computing high quality normal/range maps by taking several photographs from the same viewpoint but with different illumination directions [37]–[39], or by moving the object in front of a camera and a light source that are fixed with respect to each other [40], [41]....

    [...]


Performance Metrics
No. of citations received by the Paper in previous years

Year | Citations
2020 | 1
2017 | 1
2016 | 2
2015 | 2
2014 | 3
2013 | 2