|
| VISAPP 2008 Abstracts |
|
|
|
Area 1 - Image Formation and Processing
|
Paper Nr.: |
18
|
Title: |
BACKGROUND SEGMENTATION IN MICROSCOPY IMAGES
|
Author(s): |
J.J. Charles, L.I. Kuncheva, B. Wells and I.S. Lim |
Abstract: |
In
many applications it is necessary to segment the foreground of an image
from the background. However, images from microscope slides illuminated
using transmitted light have uneven background light levels. The
non-uniform illumination makes segmentation difficult. We propose to
fit a set of parabolas in order to segment the image into background
and foreground. Parabolas are fitted separately on horizontal and
vertical stripes of the grey level intensity image. A pixel is labelled
as background or foreground based on the two corresponding parabolas.
The proposed method outperforms the following four standard
segmentation techniques: (1) thresholding determined manually or by
fitting a mixture of Gaussians, (2) clustering in the RGB space, (3)
fitting a two-argument quadratic function on the whole image and (4)
using the morphological closure method. |
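
As a rough illustration of the stripe-wise parabola idea, the sketch below fits one least-squares parabola per image row and per image column and marks a pixel as foreground when it is darker than both fitted background curves. Several details are assumptions made only for this example and are not taken from the paper: each stripe is a single row or column, foreground is assumed darker than background, and the `margin` parameter and the "both parabolas" decision rule are hypothetical.

```python
def fit_parabola(values):
    """Least-squares fit of v = a*x^2 + b*x + c to samples at x = 0..n-1 (n >= 3)."""
    xs = range(len(values))
    # Power sums for the normal equations of the basis (x^2, x, 1).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(v * x ** k for x, v in zip(xs, values)) for k in range(3)]
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [p - factor * q for p, q in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def evaluate(coeffs, x):
    a, b, c = coeffs
    return a * x * x + b * x + c


def segment(image, margin=20.0):
    """Return a boolean mask (True = foreground) for a grey-level image (list of rows).

    Assumed rule: a pixel is foreground if it lies more than `margin` grey
    levels below BOTH its row parabola and its column parabola.
    """
    rows, cols = len(image), len(image[0])
    row_fits = [fit_parabola(row) for row in image]
    col_fits = [fit_parabola([image[r][c] for r in range(rows)]) for c in range(cols)]
    return [
        [
            image[r][c] < evaluate(row_fits[r], c) - margin
            and image[r][c] < evaluate(col_fits[c], r) - margin
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

Calling `segment(image)` on a list of equal-length rows returns a mask of the same shape. A robust fit (for example, refitting after discarding dark outliers) would reduce the pull of large foreground objects on the parabolas.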
|
Paper Nr.: |
53
|
Title: |
MULTIPLE VIEW GEOMETRY FOR MIXED DIMENSIONAL CAMERAS
|
Author(s): |
Kazuki Kozuka and Jun Sato |
Abstract: |
In
this paper, we analyze multiple view geometry for the case where
imaging sensors of different dimensions are used together. Although
multiple view geometry has been studied extensively and extended
to more general situations, all existing multiple view geometries
assume that the scene is observed by imaging sensors of the same
dimension, such as 2D cameras. In this paper, we show that there exist
multilinear constraints on image coordinates, even if the dimensions of
the camera images differ from each other. The new multilinear constraints
can be used for describing the geometric relationships between 1D line
sensors, 2D cameras, 3D range sensors etc., and for calibrating mixed
sensor systems. |
|
Paper Nr.: |
85
|
Title: |
ACCURACY IMPROVEMENTS AND ARTIFACTS REMOVAL IN EDGE BASED IMAGE INTERPOLATION
|
Author(s): |
Nicola Asuni and Andrea Giachetti |
Abstract: |
In this paper we analyse the problem of general purpose image upscaling that preserves edge features and
natural appearance and we present the results of subjective and objective evaluation of images interpolated
using different algorithms. In particular, we consider the well-known NEDI (New Edge Directed Interpolation,
Li and Orchard, 2001) method, showing that by modifying it in order to reduce numerical instability and
making the region used to estimate the low resolution covariance adaptive, it is possible to obtain relevant
improvements in the interpolation quality. The implementation of the new algorithm (iNEDI, improved New
Edge Directed Interpolation), even if computationally heavy (like Li and Orchard’s method), obtained, in
both subjective and objective tests, quality scores that are notably higher than those obtained with NEDI and
other methods presented in the literature. |
|
Paper Nr.: |
117
|
Title: |
IMAGE INPAINTING CONSIDERING BRIGHTNESS CHANGE AND SPATIAL LOCALITY OF TEXTURES
|
Author(s): |
Norihiko Kawai, Tomokazu Sato and Naokazu Yokoya |
Abstract: |
Image
inpainting techniques have been widely investigated to remove undesired
visual objects in images such as damaged portions of photographs and
people who have accidentally entered into pictures. Conventionally, the
missing parts of an image are completed by optimizing the objective
function which is defined based on pattern similarity between the
missing region and the rest of the image. However, unnatural textures
are easily generated due to two factors: (1) the available samples in the
data region are quite limited, and (2) pattern similarity is one of the
required conditions but is not sufficient for reproducing natural
textures. In this paper, in order to improve the image quality of
completed texture, the objective function is extended by allowing
brightness changes of sample textures (for (1)) and introducing spatial
locality as an additional constraint (for (2)). The effectiveness of
these extensions is successfully demonstrated by applying the proposed
method to one hundred images and comparing the results with those
obtained by the conventional methods. |
|
Paper Nr.: |
131
|
Title: |
CONSTRAIN PROPAGATION FOR GHOST REMOVAL IN HIGH DYNAMIC RANGE IMAGES
|
Author(s): |
Matteo Pedone and Janne Heikkilä |
Abstract: |
Creating
high dynamic range images of non-static scenes is currently a
challenging task. Carefully preventing strong camera shakes during
shooting and performing image-registration before combining the
exposures cannot ensure that the resulting HDR
image is consistent. This is ultimately due to the presence of moving
objects in the scene, which cause the so-called ghosting artifacts.
Different approaches have been developed so far in order to reduce the
visible effects of ghosts in HDR images. Our iterative method
propagates the influences of pixels that have low chances to belong to
the static part of the scene through an image-guided energy
minimization approach. Results produced with our technique show a
significant reduction or total removal of ghosting artifacts. |
|
Paper Nr.: |
140
|
Title: |
DATA EVALUATION FOR DEPTH CALIBRATION OF A CUSTOMARY PMD RANGE IMAGING SENSOR CONSIDERING OBJECTS WITH DIFFERENT ALBEDO
|
Author(s): |
Jochen Radmer, Alexander Sabov and Jörg Krüger |
Abstract: |
For various applications, such as object recognition or tracking and especially
when the object is partly occluded or articulated, 3D information is crucial
for the robustness of the application. A recently developed sensor to acquire distance
information is based on the Photo Mixer Device (PMD) technique.
Lateral and depth calibration has
been carried out on a modified research sensor without considering the
reflectivity of the objects. For a customary sensor of this type, data evaluation
and depth calibration have not been carried out yet. For that reason, this paper
focuses on data evaluation and depth calibration of a customary sensor. In
addition, the dependence of the distance measurement on the reflectivity of the
considered objects is incorporated, which had not been considered before. |
|
Paper Nr.: |
142
|
Title: |
ANISOTROPIC DIFFUSION BY QUADRATIC REGULARIZATION
|
Author(s): |
Marcus Hund and Bärbel Mertsching |
Abstract: |
Based
on a regularization formulation of the problem, we present a novel
approach to anisotropic diffusion that brings up a clear and
easy-to-implement theory containing a problem formulation with
existence and uniqueness of the solution.
Unlike many iterative applications, we present a clear condition for
the step size
ensuring the convergence of the algorithm.
The capability of our approach is demonstrated on a variety of well
known test images. |
|
Paper Nr.: |
144
|
Title: |
ACCELERATED SKELETONIZATION ALGORITHM FOR TUBULAR STRUCTURES IN LARGE DATASETS BY RANDOMIZED EROSION
|
Author(s): |
Gerald Zwettler, Franz Pfeifer, Roland Swoboda and Werner Backfrieder |
Abstract: |
Skeletonization
is an important procedure in morphological analysis of
three-dimensional objects. A simplified object geometry allows easy
semantic interpretation at the cost of high computational effort. This
paper introduces a fast morphological thinning approach for
skeletonization of tubular structures and objects of arbitrary shape.
With minimized constraints for erosions at the surface, hit-ratio is
increased allowing high performance thinning with large datasets. Time
consuming neighbourhood checking is solved by use of fast indexing
lookup tables. The novel algorithm homogeneously erodes the object’s
surface, resulting in an accurate extraction of the centerline, even
when the medial axis lies between voxel-grid positions. The thinning
algorithm is applied for vessel tree analysis in the field of
computer-based medical diagnostics, thus meeting high robustness and
performance requirements. |
|
Paper Nr.: |
169
|
Title: |
HISTORICAL DOCUMENT IMAGE BINARIZATION
|
Author(s): |
Carlos A. B. Mello, Adriano L. I. Oliveira and Ángel Sánchez |
Abstract: |
Preserving
and publishing historical documents is an important issue which has
gained more and more interest over the years. Digital media has been
used to store digital versions of the documents as image files.
However, these digital images need huge storage space, as the
documents are usually digitized at high resolution and in true colour for
preservation purposes. To make access to the images easier,
they can be converted into bi-level images. We present in this work a
new method composed of two algorithms for binarization of historical
document images based on Tsallis entropy. The new method was compared
to several other well-known threshold algorithms and it achieved the
best quantitative results when compared to the gold standard images of
the documents measuring the values of precision, recall, accuracy,
specificity, peak signal-to-noise ratio and mean square error. |
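
The evaluation measures named above can be computed from a pixel-wise comparison of a binarized image with its gold standard. The following is a minimal sketch, assuming both images are flattened to sequences of 0/1 values with 1 as the foreground (ink) class and a peak value of 1 for the PSNR; the function name and these conventions are illustrative choices, not the authors' evaluation code.

```python
import math


def binarization_scores(predicted, truth):
    """Compare two equally sized flat sequences of 0/1 pixels (1 = foreground)."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)

    def ratio(num, den):
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    # For 0/1 pixels the squared error is 1 exactly where the images differ.
    mse = ratio(fp + fn, len(pairs))
    # PSNR with peak value 1; infinite when the images are identical.
    psnr = math.inf if mse == 0 else 10 * math.log10(1.0 / mse)
    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
        "specificity": ratio(tn, tn + fp),
        "mse": mse,
        "psnr": psnr,
    }
```

For 8-bit images stored as 0/255, the same counts apply after thresholding, and the PSNR peak value would be 255 instead of 1.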
|
Paper Nr.: |
170
|
Title: |
A NEW RELIABILITY MEASURE FOR ESSENTIAL MATRICES SUITABLE IN MULTIPLE VIEW CALIBRATION
|
Author(s): |
Jaume Vergés-Llahí, Daniel Moldovan and Toshikazu Wada |
Abstract: |
This
paper presents a new technique to recover structure and motion from a
large number of images acquired by an intrinsically calibrated
perspective camera. We describe a method for computing reliable camera
motion parameters that combines (1) a camera dependency graph and (2)
an algorithm for computing the weights on the edges. A new criterion
for evaluating the reliability of epipolar constraint is introduced. It
is composed of unreliability of Kanatani's renormalization process and
the decomposition error between the estimated matrix encoding the
epipolar constraint and the decomposed motion parameters. Experimental
results show that there exists a clear correlation between the proposed
criterion and the error in the estimation of motion parameters. The
performance of the proposed method is demonstrated on a long sequence
of short base-line images. |
|
Paper Nr.: |
207
|
Title: |
SELF-CALIBRATION OF CENTRAL CAMERAS BY MINIMIZING ANGULAR ERROR
|
Author(s): |
Juho Kannala, Sami S. Brandt and Janne Heikkilä |
Abstract: |
This
paper proposes a generic self-calibration method for central cameras.
The method requires two-view point correspondences and estimates both
the internal and external camera parameters by minimizing angular
error. In the minimization, we use a generic camera model which is
suitable for central cameras with different kinds of radial distortion
models. The proposed method can hence be applied to a large range of
cameras from narrow-angle to fish-eye lenses and catadioptric
cameras. Here the camera parameters are estimated by minimizing the
angular error which does not depend on the 3D coordinates of the point
correspondences. However, the error still has several local minima and
in order to avoid these we propose a multi-step optimization approach.
This strategy also has the advantage that it can be used together with
RANSAC to provide robustness against false matches. We demonstrate our
method in experiments with synthetic and real data. |
|
Paper Nr.: |
224
|
Title: |
HIGH-SPEED IMAGE FEATURE DETECTION USING FPGA IMPLEMENTATION OF FAST ALGORITHM
|
Author(s): |
Marek Kraft, Adam Schmidt and Andrzej Kasínski |
Abstract: |
Many
contemporary computer and machine vision applications require
finding corresponding points across multiple images. To that end,
among many features, the most commonly used are corner points. Corners
are formed by two or more edges, and mark the boundaries of objects or
boundaries between distinctive object parts. This makes corners the
feature points that are used in a wide range of tasks. Therefore, numerous
corner detectors with different properties have been developed.
In this paper, we present a complete FPGA architecture implementing
corner detection. This architecture is based on the FAST algorithm. The
proposed solution is capable of processing the incoming image data at
a speed of hundreds of frames per second for a 512 x 512, 8-bit
gray-scale image. The speed is comparable to the results achieved by
top-of-the-line general purpose processors. However, the use of
an inexpensive FPGA makes it possible to cut costs and power consumption and to reduce
the footprint of a complete system solution. The paper also includes a
brief description of the implemented algorithm, resource usage summary,
resulting images, as well as block diagrams of the described
architecture. |
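
For readers unfamiliar with FAST, the core of the detector is a segment test on a circle of 16 pixels around the candidate. The software sketch below shows that test only; it is not the FPGA design described in the abstract. The threshold of 20 grey levels and the arc length `n = 9` are assumed defaults for illustration (the abstract does not state which variant was implemented), and the function name is hypothetical.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, listed in circular order.
CIRCLE = [
    (0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3),
]


def is_fast_corner(image, x, y, threshold=20, n=9):
    """Segment test: True if at least n contiguous circle pixels are all brighter
    than centre + threshold or all darker than centre - threshold.

    `image` is a list of rows of grey levels; (x, y) must be at least
    3 pixels away from every border.
    """
    center = image[y][x]
    ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # +1 tests "brighter", -1 tests "darker"
        flags = [sign * (p - center) > threshold for p in ring]
        run = 0
        for flag in flags + flags:  # doubled list handles arcs that wrap around
            run = run + 1 if flag else 0
            if run >= n:
                return True
    return False
```

A full detector would apply this test to every interior pixel and usually follow it with non-maximum suppression, which is omitted here.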
|
Paper Nr.: |
235
|
Title: |
FILLING-IN GAPS IN TEXTURED IMAGES USING BIT-PLANE STATISTICS
|
Author(s): |
E. Ardizzone, H. Dindo and G. Mazzola |
Abstract: |
In
this paper we propose a novel approach for the texture
analysis-synthesis problem, with the purpose to restore missing zones
in greyscale images. Bit-plane decomposition is used, and a dictionary
is built with bit-block statistics for each plane. Gaps are
reconstructed with a conditional stochastic process, to propagate
texture global features into the damaged area, using information stored
in the dictionary. Our restoration method is simple, easy and fast,
with very good results for a large set of textured images. Results are
compared with a state-of-the-art restoration algorithm. |
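
The first two steps mentioned above, bit-plane decomposition and a per-plane dictionary of bit-block statistics, can be sketched as follows. The block size, the use of overlapping blocks and the choice of raw occurrence counts as the "statistics" are assumptions made for this example; the abstract does not specify them, and the conditional stochastic reconstruction itself is not shown.

```python
def bit_planes(image, bits=8):
    """Split a greyscale image (list of rows of ints) into `bits` binary planes.

    Plane 0 holds the least significant bit of every pixel.
    """
    return [[[(p >> b) & 1 for p in row] for row in image] for b in range(bits)]


def merge_planes(planes):
    """Inverse of bit_planes: rebuild the grey levels from the binary planes."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [
        [sum(planes[b][r][c] << b for b in range(len(planes))) for c in range(cols)]
        for r in range(rows)
    ]


def block_histogram(plane, size=2):
    """Count how often each size x size bit pattern occurs in one plane.

    Blocks overlap and are keyed by their bits in row-major order.
    """
    counts = {}
    for r in range(len(plane) - size + 1):
        for c in range(len(plane[0]) - size + 1):
            block = tuple(plane[r + i][c + j] for i in range(size) for j in range(size))
            counts[block] = counts.get(block, 0) + 1
    return counts
```

A dictionary for the whole image is then one histogram per plane, for example `[block_histogram(p) for p in bit_planes(image)]`.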
|
Paper Nr.: |
241
|
Title: |
BINARY MORPHOLOGY AND RELATED OPERATIONS ON RUN-LENGTH REPRESENTATIONS
|
Author(s): |
Thomas M. Breuel |
Abstract: |
Binary morphology on large images is compute-intensive, in particular for large structuring
elements. Run-length encoding is a compact and space-saving technique for representing images.
This paper describes how to implement binary morphology directly on run-length encoded binary
images for rectangular structuring elements. In addition, it describes efficient algorithms for
transposing and rotating run-length encoded images. The paper evaluates and compares run-length
morphological processing on page images from the UW3 database with an efficient and mature bit
blit-based implementation and shows that the run-length approach is several times faster than
bit blit-based implementations for large images and masks. The experiments also show that
complexity decreases for larger mask sizes. The paper also demonstrates running times on a simple
morphology-based layout analysis algorithm on the UW3 database and shows that replacing bit
blit morphology with run-length based morphology speeds up performance approximately two-fold.
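
To make the idea concrete, here is a minimal sketch of one-dimensional dilation and erosion performed directly on the runs of a single image row, for a centred horizontal structuring element of width 2k + 1. The half-open `(start, end)` run representation, the function names and the convention that pixels outside the row count as background are assumptions for illustration and are not taken from the paper.

```python
def encode_row(bits):
    """Run-length encode a row of 0/1 pixels as half-open (start, end) foreground runs."""
    runs, start = [], None
    for i, b in enumerate(bits):
        if b and start is None:
            start = i
        elif not b and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(bits)))
    return runs


def dilate_row(runs, k, width):
    """Dilate by a horizontal element of width 2k+1: grow each run by k, clip, merge."""
    out = []
    for s, e in runs:
        s, e = max(0, s - k), min(width, e + k)
        if out and s <= out[-1][1]:
            # The grown run touches or overlaps the previous one: merge them.
            out[-1] = (out[-1][0], max(out[-1][1], e))
        else:
            out.append((s, e))
    return out


def erode_row(runs, k):
    """Erode by a horizontal element of width 2k+1: shrink each run by k on both
    sides and drop runs that vanish. Pixels outside the row count as background."""
    return [(s + k, e - k) for s, e in runs if e - s > 2 * k]
```

Both operations cost time proportional to the number of runs rather than the number of pixels and do not depend on k. For a rectangular element, the vertical pass can apply the same row operations to the transposed image.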
|
|
Paper Nr.: |
274
|
Title: |
A STUDY ON ILLUMINATION NORMALIZATION FOR 2D FACE VERIFICATION
|
Author(s): |
Qian Tao and Raymond Veldhuis |
Abstract: |
Illumination normalization is very important for 2D face
verification. This study examines the state-of-the-art illumination
normalization methods, and proposes two solutions, namely horizontal
Gaussian derivative filters and local binary patterns. Experiments
show that our methods significantly improve the generalization
capability, while maintaining good discrimination capability of a
face verification system. The proposed illumination normalization
methods have low requirements on image acquisition and low
computational complexity, and are very suitable for low-end 2D face
verification systems. |
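
Local binary patterns are attractive for illumination normalization because each code depends only on the ordering of grey levels in a small neighbourhood. The sketch below shows the basic 3 x 3 operator; the clockwise bit ordering starting at the top-left neighbour and the `>=` comparison are conventions assumed for this example and may differ from the variant used in the study.

```python
# Neighbour offsets (d_row, d_col), clockwise starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of the interior pixel at row r, column c.

    Bit i is set when the i-th neighbour is at least as bright as the centre.
    """
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_image(image):
    """LBP codes for all interior pixels (the one-pixel border is dropped)."""
    return [
        [lbp_code(image, r, c) for c in range(1, len(image[0]) - 1)]
        for r in range(1, len(image) - 1)
    ]
```

Because only comparisons are used, any strictly increasing change of brightness (for example a gain and offset applied to the whole image) leaves every code unchanged.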
|
Paper Nr.: |
275
|
Title: |
FACE HALLUCINATION USING PCA IN WAVELET DOMAIN
|
Author(s): |
Abdu Rahiman V. and Jiji C. V. |
Abstract: |
The
term face hallucination stands for recognition based super resolution
of face images to improve the spatial resolution. In this paper, we
propose two face hallucination algorithms based on principal component
analysis (PCA) in the wavelet transform domain. In spatial-domain
PCA-based super-resolution algorithms, a low resolution (LR)
observation is represented as a linear combination of the LR images in an
image database. The super-resolved image is obtained as the linear
combination of the corresponding high resolution (HR) images in the database. In the
first approach proposed in this paper, a PCA-based hallucination
algorithm is applied to the wavelet coefficients of the face image. The
hallucinated face image is reconstructed from the super-resolved
wavelet coefficients. In the second method, the face image is split into four
sub-images and the first method is applied separately to the three textured
regions. The fourth region, which is relatively smooth, is interpolated
using standard interpolation techniques. We compare the performance of
the two proposed algorithms with their spatial-domain counterparts.
The proposed methods show significant improvement over the spatial
domain approaches. |
|
Paper Nr.: |
295
|
Title: |
FREE-VIEW POINT TV WATERMARKING EVALUATED ON GENERATED ARBITRARY VIEWS
|
Author(s): |
Evlambios E. Apostolidis and Georgios A. Triantafyllidis |
Abstract: |
The
recent advances in Image Based Rendering (IBR) have pioneered a new
technology, free view point television, in which TV viewers freely
select the viewing position and angle by the application of IBR to the
transmitted multi-view video. In this paper, exhaustive tests were
carried out to determine the best possible free view-point TV
watermarking evaluated on arbitrary views. The watermark should not
only be extracted from a generated arbitrary view, it should also be
resistant to common video processing and multi-view video processing
operations. |
|
Paper Nr.: |
298
|
Title: |
MULTI-ERROR CORRECTION OF IMAGE FORMING SYSTEMS BY TRAINING SAMPLES MAINTAINING COLORS
|
Author(s): |
Gerald Krell and Bernd Michaelis |
Abstract: |
Optical and electronic components of image forming devices degrade objective and subjective quality of the
acquired or reproduced images. Classical restoration techniques usually require an explicit estimation or
measurement of parameters for each error source. We propose to derive restoration parameters in a training
phase with suitable test patterns for a particular system to be corrected. Space varying properties of different
classes of image degradations are considered simultaneously. It is shown how training is performed in such a
way that colors are reproduced correctly independently of the used test patterns. |
|
Paper Nr.: |
307
|
Title: |
PROGRESSIVE DCT BASED IMAGE CODEC USING STATISTICAL PARAMETERS
|
Author(s): |
Pooneh Bagheri Zadeh, Tom Buggy and Akbar Sheikh Akbari |
Abstract: |
This
paper presents a novel progressive statistical and discrete cosine
transform based image-coding scheme. The proposed coding scheme divides
the input image into a number of non-overlapping pixel blocks. The
coefficients in each block are then decorrelated into their spatial
frequencies using a discrete cosine transform. Coefficients with the
same spatial frequency at different blocks are put together to generate
a number of matrices, where each matrix contains coefficients of a
particular spatial frequency. The matrix containing DC coefficients is
losslessly coded to preserve visually important information. Matrices,
which consist of high frequency coefficients, are coded using a novel
statistical encoder developed in this paper. Perceptual weights are
used to regulate the threshold value required in the coding process of
the high frequency matrices. The coded matrices generate a number of
bitstreams, which are used for progressive image transmission. The
proposed coding scheme, JPEG and JPEG2000 were applied to a number of
test images. Results show that the proposed coding scheme outperforms
JPEG and JPEG2000 subjectively and objectively at low compression
ratios. Results also indicate that the decoded images using the
proposed codec have superior subjective quality at high compression
ratios compared to that of JPEG, while offering comparable results to
that of JPEG2000. |
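
The block transform and the regrouping of coefficients by spatial frequency described above can be sketched as follows. This is an illustration only: the orthonormal DCT-II is written out directly for clarity rather than speed, the block size `n` is a parameter because the abstract does not state it, and the lossless DC coding, the statistical encoder and the perceptual weighting are not shown.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    return [
        [
            alpha(u) * alpha(v) * sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n)
                for y in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def frequency_matrices(image, n=8):
    """Transform each non-overlapping n x n block and regroup the coefficients.

    Returns a dict mapping (u, v) to a matrix with one entry per block, so
    bands[(0, 0)] collects the DC coefficients of all blocks. Image sides are
    assumed to be multiples of n; any remainder is ignored.
    """
    rows, cols = len(image) // n, len(image[0]) // n
    bands = {
        (u, v): [[0.0] * cols for _ in range(rows)]
        for u in range(n)
        for v in range(n)
    }
    for br in range(rows):
        for bc in range(cols):
            block = [row[bc * n:(bc + 1) * n] for row in image[br * n:(br + 1) * n]]
            coeffs = dct_2d(block)
            for u in range(n):
                for v in range(n):
                    bands[(u, v)][br][bc] = coeffs[u][v]
    return bands
```

Each resulting matrix can then be coded as a separate bitstream, which is what makes progressive transmission by frequency band possible.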
|
Paper Nr.: |
316
|
Title: |
EVOLVING ROI CODING IN H.264 SVC
|
Author(s): |
Syeda Shamikha F. Shah and Eran A. Edirisinge |
Abstract: |
Region-of-Interest (ROI) based coding is an integral feature of most image/video coding
techniques/standards and has important applications in content based video coding, storage and
transmission. However, in the latest scalable extension of H.264 AVC video coding standard, i.e. H.264
SVC, motion estimation across the slice group boundaries does not preserve the coding quality and
compression rate of the ROI. In this paper novel enhancements to the ROI based coding for H.264 SVC
have been proposed to constrain the inter frame prediction across slice group boundaries. We show that the
proposed algorithms do not negatively affect the rate-distortion performance of the coded video, but provide
useful additional functionality that enables the extended use of the standard in many new application
domains. Further, we propose a method for supporting the coding of moving ROI in the scalable video
coding domain, by adaptively changing the shape, size and position of the slice groups. We show that this
additional functionality is particularly useful in video surveillance applications to effectively compress and
transmit the ROI and reduce the storage and transmission requirements without any quality degradation of
the ROI. |
|
Paper Nr.: |
325
|
Title: |
APPROXIMATE POINT-TO-SURFACE REGISTRATION WITH A SINGLE CHARACTERISTIC POINT
|
Author(s): |
Darko Dimitrov, Christian Knauer, Klaus Kriegel and Fabian Stehn |
Abstract: |
We
present approximation algorithms for point-to-surface registration
problems which have applications in medical navigation systems. One of
the central tasks of such a system is to determine a "good" mapping
(the registration transformation or registration for short) of the
coordinate system of the operation theatre onto the coordinate system
of a 3D model M of a patient, generated from CR- or MRT scans.
The registration q is computed by matching a 3D point set P measured on
the skin of the patient to the 3D model M. It is chosen from a class R
of admissible transformations (e.g., rigid motions) so that it
approximately minimizes a suitable error function e (such as the
directed Hausdorff or mean squared error distance) between q(P) and M,
i.e., q = arg min_{q' ∈ R} e(q'(P), M). A common technique to
support the registration process is to determine either automatically
or manually so-called characteristic points or landmarks, which are
corresponding points on the model and in the point set. Since
corresponding characteristic points are supposed to be mapped onto (or
close to) each other, this reduces the number of degrees of freedom of
the matching problem.
We provide approximation algorithms which compute a rigid motion
registration in the most difficult setting of only a single
characteristic point. |
|
Paper Nr.: |
336
|
Title: |
USE OF SPATIAL ADAPTATION FOR IMAGE RENDERING BASED ON AN EXTENSION OF THE CIECAM02
|
Author(s): |
Olivier Tulet, Mohamed-Chaker Larabi and Christine Maloigne Fernandez |
Abstract: |
With
the development and the multiplicity of imaging devices, the color
quality and portability have become a very challenging problem.
Moreover, a color is perceived with regard to its environment. In
order to take into account the variation of visual perception as a
function of the environment, the CIE (Commission Internationale de
l'éclairage) has standardized tools named color appearance models
(CIECAM97*, CIECAM02). These models are able to take into account many
phenomena related to human color vision and can predict the color of
a stimulus as a function of its viewing conditions. However, these
models do not deal with the influence of spatial frequencies, which can
have a big impact on our perception. In this paper, an extended version
of CIECAM02 is presented. This new version integrates a spatial
model correcting the color in relation to its spatial frequency and its
environment. Moreover, a study on the influence of the background’s
chromaticity has also been performed. The obtained results are sound
and demonstrate the efficiency of the proposed extension. |
|
Paper Nr.: |
341
|
Title: |
LIMITED ANGLE IMAGE RECONSTRUCTION USING FOUR HIGH RESOLUTION PROJECTION AXES AT CO-PRIME RATIO VIEW ANGLES
|
Author(s): |
Anastasios L. Kesidis |
Abstract: |
This
paper proposes a sequential image reconstruction algorithm for the
exact reconstruction of an image from a limited number of projection
angles. Specifically, four projection axes oriented at coprime ratio
view angles are used. The set of proper values for the view angles as
well as the overall number of samples on the projection axis are
explicitly defined and are related only to the dimensions of the image.
The slopes of the four projection axes are calculated according to the
chosen view angle and are symmetrically oriented with respect to the
horizontal and the vertical axis. The reconstruction is a
non-iterative, one pass process based on a decomposition sequence which
defines the order in which the image pixels are restored. Several
simulation results are provided that demonstrate the feasibility of the
proposed method. |
|
Paper Nr.: |
361
|
Title: |
AN AUTOMATED VISUAL EVENT DETECTION SYSTEM FOR CABLED OBSERVATORY VIDEO
|
Author(s): |
Danelle E. Cline, Duane R. Edgington and Jérôme Mariette |
Abstract: |
This
paper presents an overview of a system for processing video streams
from underwater cabled observatory systems based on the Automated
Visual Event Detection (AVED) software. This system identifies
potentially interesting visual events using a neuromorphic vision
algorithm and tracks events frame-by-frame. The events can later be
previewed or edited in a graphical user interface for false detections,
and subsequently imported into a database, or used in an object
classification system. We present a scalable processing system that
can be used on a single computer, a Beowulf cluster, or a pool of
computers, using the Condor workflow management system. |
|
Paper Nr.: |
370
|
Title: |
EDGE-PRESERVING SMOOTHING OF NATURAL IMAGES BASED ON GEODESIC TIME FUNCTIONS
|
Author(s): |
Jacopo Grazzini and Pierre Soille |
Abstract: |
In this paper, we address the problem of edge-preserving smoothing of
natural images.
We introduce a novel adaptive approach derived from mathematical morphology
as a preprocessing stage in
feature extraction and/or image segmentation.
Like other filtering methods, it assumes that the local
neighbourhood of a pixel contains the essential information required for
the estimation of local properties. It performs a weighted averaging by
combining both spatial and tonal information in a single similarity
measure based on the local calculation of geodesic time functions from
the estimated pixel.
By designing relevant geodesic masks, it can deal with specific situations
and types of images.
We describe two possible strategies and show their
capabilities at smoothing heterogeneous areas while preserving relevant
structures in natural greyscale and color images displaying different
features.
|
|
Paper Nr.: |
382
|
Title: |
COLOR QUANTIZATION BY MORPHOLOGICAL HISTOGRAM PROCESSING
|
Author(s): |
Franklin César Flores, Leonardo Bespalhuk Facci and Roberto de Alencar Lotufo |
Abstract: |
In
a previous paper, a graylevel quantization method based on
morphological histogram processing was proposed. This paper introduces the extension
of that quantization method to color images. Considering an image under
the RGB color space model, this extension reduces the number of colors
in the image by partitioning a 3-D histogram, similar to the RGB color
space, into rectangular parallelepiped regions, through an iterative
process. Such partitioning is done, in each iteration, by application
of the graylevel quantization method to the longest dimension of the
current region which has the greatest volume. The final classified
color space is used to quantize the image. This paper also shows the
comparison of the proposed method to the classical median cut one. |
|
Paper Nr.: |
427
|
Title: |
FINGERPRINT IMAGE SEGMENTATION BASED ON BOUNDARY VALUES
|
Author(s): |
M. Usman Akram, Anam Tariq, Shahida Jabeen and Shoab A. Khan |
Abstract: |
Fingerprint image segmentation strongly influences the performance
of an Automatic Fingerprint Identification System
(AFIS). We propose a new enhanced segmentation
technique based on unwanted boundary area gray-level values.
The objective of fingerprint segmentation is to extract
the region of interest (ROI) which contains the desired fingerprint
impression. We present in this paper a Modified
Gradient Based Method to extract the ROI. The distinct feature
of our technique is that it gives a highly accurate segmentation
percentage even in the case of low quality
fingerprint images. The proposed algorithm is applied
to the FVC2004 database. Experimental results demonstrate
the improved performance of the proposed scheme. |
|
Paper Nr.: |
547
|
Title: |
NEWBORN’S BIOMETRIC IDENTIFICATION: CAN IT BE DONE?
|
Author(s): |
Daniel Weingaertner, Olga Regina Pereira Bellon, Luciano Silva and Mônica Nunes Lima Cat |
Abstract: |
In
this article we propose a novel biometric identification method for
newborn babies using their palmprints. A new high resolution optical
sensor was developed, which obtains images with enough ridge minutiae
to uniquely identify the baby. The palm and footprint images of 106
newborns were analysed, leading to the conclusion that palmprints yield
more detailed images than footprints. Fingerprint experts from the
Identification Institute of Paraná State performed two matching tests,
resulting in correct identification rates of 63.3% and 67.7%,
more than three times higher than those obtained in similar experiments
described in the literature. The proposed image acquisition method also
opens the perspective for the creation of an automatic identification
system for newborns. |
|
|
|
Area 2 - Image Analysis
|
Paper Nr.: |
3
|
Title: |
SEMI-SUPERVISED DIMENSIONALITY REDUCTION USING PAIRWISE EQUIVALENCE CONSTRAINTS
|
Author(s): |
Hakan Cevikalp, Jakob Verbeek, Frédéric Jurie and Alexander Kläser |
Abstract: |
To
deal with the problem of insufficient labeled data, usually side
information - given in the form of pairwise equivalence constraints
between points - is used to discover groups within data. However,
existing methods using side information typically fail in cases with
high-dimensional spaces. In this paper, we address the problem of
learning from side information for high-dimensional data. To this end,
we propose a semi-supervised dimensionality reduction scheme that
incorporates pairwise equivalence constraints for finding a better
embedding space, which improves the performance of subsequent
clustering and classification phases. Our method builds on the
assumption that points in a sufficiently small neighborhood tend to
have the same label.
Equivalence constraints are employed to modify the neighborhoods and to
increase the separability of different classes. Experimental results on
high-dimensional image data sets show that integrating side information
into the dimensionality reduction improves the clustering and
classification performance. |
|
Paper Nr.: |
13
|
Title: |
CLUSTERED CELL SEGMENTATION - Based on Iterative Voting and the Level Set Method
|
Author(s): |
Arjan Kuijper, Yayun Zhou and Bettina Heise |
Abstract: |
In
this paper we deal with images in which the cells cluster together and
the boundaries of the cells are ambiguous. Combining the outcome of an
automatic point detector with the multiphase level set method, the
centre of each cell is detected and used as the "seed", in other words,
the initial condition for the level set method.
Then, by choosing an appropriate level set equation, the fronts of the
seeds propagate and finally stop near the boundaries of the cells. This
method solves the cluster problem and can distinguish individual cells
properly; it is therefore useful in cell segmentation. Using this
method, we can count the cells and calculate the area of
each cell. Furthermore, this information can be used to obtain the
histogram of the cell image. |
|
Paper Nr.: |
15
|
Title: |
CORNER DETECTION WITH MINIMAL EFFORT ON MULTIPLE SCALES
|
Author(s): |
Ernst D. Dickmanns |
Abstract: |
Based on results of fitting linearly shaded blobs to rectangular image
regions, a new corner detector has been developed. Theoretical results
for a least-squares plane fit to the intensity distribution within a
mask of four elements of the same rectangular shape and size, each
holding an averaged intensity value, allow very efficient simultaneous
computation of pyramid levels and of a new corner criterion at the
center of the masks on these
levels. The method is intended for real-time application and has thus
been designed for minimal computing effort. It nicely fits into the
‘Unified Blob-edge-corner Method’ (UBM) developed recently. Results are
given for road scenes. |
|
Paper Nr.: |
17
|
Title: |
A MODEL-BASED APPROACH TO SHAPE FROM FOCUS
|
Author(s): |
R. R. Sahay and A. N. Rajagopalan |
Abstract: |
Shape
from focus (SFF) estimates the structure of a 3D object using the
degree of focus as a cue in a sequence of observations. The estimate of
the depth profile is, however, vulnerable to a lack of sufficient scene
texture.
In this paper, we propose a method to improve the estimate of the
structure of the object by exploiting neighbourhood dependencies. A
degradation model is used to describe the formation of space-variantly
blurred observations in SFF. The shape of the object is modeled as a
Markov random field and a suitably derived objective function is
minimized to arrive at the final estimate of the shape. |
|
Paper Nr.: |
44
|
Title: |
A NOVEL CHAOTIC CODING SYSTEM FOR LOSSY IMAGE COMPRESSION
|
Author(s): |
Sebastiano Battiato and Francesco Rundo |
Abstract: |
In
this paper, a novel image compression pipeline that makes use of a
controlled chaotic system is proposed. Chaos is a particular dynamic
generated by nonlinear systems. Under certain conditions it is possible
to manage the chaotic dynamics properly, obtaining feasible and
powerful working instruments. In the proposed compression pipeline, a
linear feedback control strategy is used to stabilize the chaotic
dynamics used to track the 1D signal generated by the input image. The
pipeline is closed by an entropy encoder. Preliminary experiments and
comparisons with a standard JPEG engine confirm
the effectiveness of the proposed chaotic coding system for both
natural and graphic images. The overall performance in terms of
rate-distortion capabilities is also promising. |
|
Paper Nr.: |
46
|
Title: |
ROBUST ESTIMATION OF THE PAN-ZOOM PARAMETERS FROM A BACKGROUND AREA IN CASE OF A CRISS-CROSSING FOREGROUND OBJECT
|
Author(s): |
J. Bruijns |
Abstract: |
In
the field of video processing, a model of the background motion has
application in deriving depth from motion. The pan-zoom parameters of
our background model are estimated from the motion vectors of parts
which are a priori likely to belong to the background, such as the top
and side borders ("the background area"). This fails when a foreground
object obscures the greater part of this background area. We have
developed a method to extract a set of pan-zoom parameters for each
different part of the background area.
Using the pan-zoom parameters of the previous frame, we compute from
these sets the pan-zoom parameters most likely to correspond to the
proper background parts. For shots in which the greater part of the
background area is obscured by one or more foreground objects, this
background area partition method gives more accurate pan parameters
than using the entire background area. |
|
Paper Nr.: |
52
|
Title: |
BINARY IMAGE SKELETON - Continuous Approach
|
Author(s): |
Leonid Mestetskiy and Andrey Semenov |
Abstract: |
In
this paper we propose a technique for building a correct model of the
continuous skeleton of a discrete binary image. Our approach is based on
approximating each connected figure in the image by a polygonal figure.
The figure boundary consists of minimal-perimeter closed paths which
separate points of foreground and background. The figure skeleton is
constructed as the locus of centers of maximal inscribed circles. A
so-called skeletal base is built from the figure skeleton by cutting off
essential noise. It is shown that the constructed continuous skeleton
exists and is unique for each binary image. The derived continuous
skeleton has the following advantages: strict mathematical
description, stability to noise, and broad capabilities for form
transformation and shape comparison of objects. The proposed approach
also offers a substantial speed advantage in skeleton construction
compared with discrete methods, including those that use parallel
computation. This advantage is demonstrated on large real images. |
|
Paper Nr.: |
54
|
Title: |
EFFECT OF FACIAL EXPRESSIONS ON FEATURE-BASED LANDMARK LOCALIZATION IN STATIC GREY SCALE IMAGES
|
Author(s): |
Yulia Gizatdinova and Veikko Surakka |
Abstract: |
The
present aim was to examine the effect of facial expressions on
feature-based landmark localization in static grey scale images. In the
method, local oriented edges were extracted and edge maps of the image
were constructed at two levels of resolution. The landmark candidates
resulting from this step were further verified by matching against the
edge orientation model. The method was tested on a large database of
expressive faces coded in terms of action units. Action units
represented single or conjoint facial muscle activations in the upper and
lower face. As the results demonstrated, eye regions were localized with
high rates in both neutral and expressive datasets. Nose and mouth
localization was more attenuated by variations in facial expressions.
The present results specified some of the critical facial behaviours
which should be taken into consideration while improving automatic
landmark detectors which rely on the low-level edge and intensity
information. |
|
Paper Nr.: |
79
|
Title: |
4D WARPING FOR ANALYSING MORPHOLOGICAL CHANGES IN SEED DEVELOPMENT OF BARLEY GRAINS
|
Author(s): |
Rainer Pielot, Udo Seiffert, Bertram Manz, Diana Weier, Frank Volke and Winfriede Weschke |
Abstract: |
NMR
imaging makes it possible to obtain 3D images of biological structures
non-invasively. In this study, intensity-based warping is
evaluated by comparing it to landmark-based warping for a
four-dimensional analysis of morphological changes in seed development
of barley. The datasets of barley grains are obtained at certain
development stages by NMR. Warping algorithms reconstruct intermediate
physically non-measured stages. The landmark-based procedure consists
of automatic definition of landmarks and subsequent distance-weighted
warping. The intensity-based approach uses iterative intensity-based
warping for definition of the displacement vector field and
distance-weighted volume warping for generation of the virtual
intermediate dataset. The approaches were tested with four datasets of
barley at different development stages. As a result, the
intensity-based approach is highly applicable for analysis of
morphological changes in NMR datasets and serves as a tool for an
extensive 4D analysis of seed development in barley grains. |
|
Paper Nr.: |
84
|
Title: |
A NORMALIZED PARAMETRIC DOMAIN FOR THE ANALYSIS OF THE LEFT VENTRICULAR FUNCTION
|
Author(s): |
Jaume Garcia-Barnes, Debora Gil, Sandra Pujadas, Francesc Carreras and Manel Ballester |
Abstract: |
Impairment of Left Ventricular (LV) contractility due to
cardiovascular diseases is reflected in LV motion patterns. The
mechanics of any muscle strongly depends on the spatial
orientation of its muscular fibers since the motion that the
muscle undergoes mainly takes place along the fiber. The helical
ventricular myocardial band concept describes the myocardial
muscle as a unique muscular band that twists in space in a
non-homogeneous way. The 3D anisotropy of the ventricular band fibers
suggests a regional analysis of the heart motion. Computation of
normality models of such motion can help in the diagnosis of any
cardiac disorder. In this paper we introduce, for the first time,
a normalized parametric domain that allows comparison of the left
ventricle motion across patients. We address both the extraction of
the LV motion from Tagged Magnetic Resonance images and the
definition of a mapping of the LV to a common normalized domain.
Extraction of normality motion patterns from 17 healthy
volunteers shows the clinical potential of our LV parametrization.
|
|
Paper Nr.: |
92
|
Title: |
FAST AND ROBUST LOCALIZATION OF THE HEART IN CARDIAC MRI SERIES - A Cascade of Operations for Automatically Detecting the Heart in Cine MRI Series
|
Author(s): |
Sebastian Zambal, Andreas Schöllhuber, Katja Bühler and Jiří Hladůvka |
Abstract: |
This
work presents a robust approach for fast initialization of an Active
Appearance Model for subsequent segmentation of cardiac MRI data. The
method automatically determines AAM initialization parameters:
position, orientation, and scaling of the model. Four steps are carried
out: (1) variance images over time are calculated to find a bounding
box that roughly defines the heart region; (2) circle
Hough-transformation adapted to gray values is performed to detect the
left ventricle; (3) thresholding is carried out to determine the
orientation of the heart; (4) the optimal initialization is selected
using a mean texture model.
The method was evaluated on 42 MRI short axis studies coming from two
MRI scanners of two different vendors. Automatic initializations are
compared to manual ones. It is shown that the proposed automatic method
is much faster than and achieves results qualitatively equal to manual
initialization. |
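Step (1) above can be illustrated with a minimal sketch. It assumes the series is a list of equally sized 2D grey-value frames and that a fixed variance threshold separates moving from static pixels; the function names and the threshold value are hypothetical and are not taken from the paper.

```python
def variance_image(frames):
    """Per-pixel variance over time for a list of equally sized 2D frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    var = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [frame[r][c] for frame in frames]
            mean = sum(values) / n
            var[r][c] = sum((v - mean) ** 2 for v in values) / n
    return var


def bounding_box(var, threshold):
    """Smallest (top, left, bottom, right) box around pixels with variance > threshold."""
    hits = [(r, c) for r, row in enumerate(var)
            for c, v in enumerate(row) if v > threshold]
    if not hits:
        return None
    row_idx = [r for r, _ in hits]
    col_idx = [c for _, c in hits]
    return (min(row_idx), min(col_idx), max(row_idx), max(col_idx))


# Toy two-frame series: only the central 2x2 block changes over time.
frames = [
    [[10, 10, 10, 10], [10, 20, 30, 10], [10, 40, 50, 10], [10, 10, 10, 10]],
    [[10, 10, 10, 10], [10, 60, 70, 10], [10, 80, 90, 10], [10, 10, 10, 10]],
]
box = bounding_box(variance_image(frames), threshold=1.0)
print(box)
```

Static tissue has near-zero temporal variance, so the box closes around the region that moves during the cardiac cycle. A real series would need a data-dependent threshold and some noise suppression.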
|
Paper Nr.: |
106
|
Title: |
CONTENT-BASED SHAPE RETRIEVAL USING DIFFERENT AFFINE SHAPE DESCRIPTORS
|
Author(s): |
Fatma Chaker, Faouzi Ghorbel and Mohamed Tarak Bannour |
Abstract: |
Shape representation is a fundamental issue in newly emerging multimedia applications. In Content
Based Image Retrieval (CBIR), shape is an important low-level image feature. Many shape representations
have been proposed. However, for CBIR, a shape representation should satisfy several properties such as
affine invariance, robustness, compactness, low computational complexity and perceptual similarity
measurement. With these properties in mind, in this paper we study and compare three shape
descriptors: two derived from the Fourier method and the Affine Curvature Scale Space Descriptor (ACSSD).
We build a retrieval framework to compare the descriptors in terms of robustness and retrieval
performance. The retrieval performance of the different descriptors is compared using two standard shape
databases. Retrieval results are given to show the comparison. |
|
Paper Nr.: |
107
|
Title: |
MODEL BASED GLOBAL IMAGE REGISTRATION
|
Author(s): |
Niloofar Gheissari, Mostafa Kamali, Parisa Mirshams and Zohreh Sharafi |
Abstract: |
In
this paper, we propose a model-based image registration method capable
of detecting the true transformation model between two images. We
incorporate a statistical model selection criterion to choose the true
underlying transformation model. Therefore, the proposed algorithm is
robust to degeneracy as any degeneracy is detected by the model
selection component. In addition, the algorithm is robust to noise and
outliers since any corresponding pair that does not undergo the chosen
model is rejected by a robust fitting method adapted from the
literature. Another important contribution of this paper is the evaluation
of a number of different model selection criteria for the image registration
task. We evaluated all criteria at different levels of
noise. We conclude that CAIC and GBIC slightly outperform the other criteria
for this application. The next choices are GIC, SSD and MDL. Finally we
create panorama images using our registration algorithm. The panorama
images show the success of this algorithm. |
|
Paper Nr.: |
115
|
Title: |
DISPLAY REGISTRATION FOR DEVICE INTERACTION - A Proof of Principle Prototype
|
Author(s): |
Nick Pears, Patrick Olivier and Daniel Jackson |
Abstract: |
A
method is proposed to facilitate visually-driven interactions between
two devices: the client, such as a mobile phone or
personal digital assistant (PDA), which must be equipped with a camera,
and the server, such as a personal computer (PC) or intelligent
display. The technique that we describe here requires a camera on the
client to view the display on the server, such that either the client
or the server (or both) can compute exactly which part of the server
display is being viewed. The server display and the client's image of
the server display, which can be written onto (part of) the client's
display, are then registered. This basic principle, which we call
"display registration", supports a very broad range of interactions
(depending on the context in which the system is operating) and it will
make these interactions significantly quicker, easier and more
intuitive for the user to initiate and control. In addition, either the
client or the server (or both) can compute the six degree-of-freedom (6
DOF) position of the client camera with respect to the server display.
We have built a prototype which proves the principle and usefulness of
display registration. This system employs markers on the server display
for fast registration and it has been used to demonstrate a variety of
operations, such as selecting and zooming into images. |
|
Paper Nr.: |
129
|
Title: |
NONRIGID OBJECT SEGMENTATION AND OCCLUSION DETECTION IN IMAGE SEQUENCES
|
Author(s): |
Ketut Fundana, Niels Chr. Overgaard, Anders Heyden, David Gustavsson and Mads Nielsen |
Abstract: |
We address the problem of nonrigid
object segmentation in image sequences in the presence of occlusion.
The proposed variational segmentation method is based on a
region-based active contour of the Chan-Vese model augmented with a
frame-to-frame interaction term as a shape prior. The interaction
term is constructed to be pose-invariant by minimizing over a group
of transformations and to allow moderate deformation in the shape of
the contour. The segmentation method is then coupled with a novel
variational contour matching formulation between two consecutive
contours which gives a mapping of the intensities from the interior
of the previous contour to the next. With this information
occlusions can be detected from deviations from predicted
intensities and the missing intensities in the occluded areas can be
reconstructed. Experimental results on synthetic and real image
sequences are shown. |
|
Paper Nr.: |
130
|
Title: |
ESTIMATION OF FACIAL EXPRESSION INTENSITY BASED ON THE BELIEF THEORY
|
Author(s): |
Khadoudja Ghanem, Alice Caplier and Sébastien Stillittano |
Abstract: |
This
article presents a new method to estimate the intensity of a human
facial expression. Supposing an expression occurring on a face has been
recognized among the six universal emotions (joy, disgust,
surprise, sadness, anger, fear), the estimation of the expression's
intensity is based on the determination of the degree of geometrical
deformation of some facial features and on the analysis of several
distances computed on skeletons of expressions. These skeletons are
the result of a contour
segmentation of permanent facial features (eyes, brows, mouth). The
proposed method uses the belief theory for data fusion. The intensity
of the recognized expression is scored on a three-point ordinal scale:
"low intensity", "medium intensity" or "high intensity". Experiments
on a great number of images validate our method and give good
estimation for facial expression intensity. We have implemented and
tested the method on the following three expressions: joy, surprise and
disgust. |
|
Paper Nr.: |
132
|
Title: |
ENTROPY-BASED SALIENCY COMPUTATION IN LOG-POLAR IMAGES
|
Author(s): |
Nadia Tamayo and V. Javier Traver |
Abstract: |
Visual
saliency provides a filtering mechanism to focus on a set of
interesting areas in the scene, but these mechanisms often overload the
computational resources of many computer vision tasks. In order to
reduce such an overload and improve the computational performance, we
propose to exploit the advantages of log-polar vision to detect salient
regions with economy of computational resources and quite stable
results. In particular, in this paper we study the application of
entropy-based saliency to log-polar images. Some interesting
considerations are presented concerning the concept of "scale"
and the effects of space-variant sampling on scale selection. We also
propose a necessary border extension to detect objects present in
peripheral areas. The original entropy-based saliency algorithm can be
used on log-polar images, but the results show that our adaptations
detect salient forms in log-polar images more precisely because
they take into account the information redundancy of space-variant sampling.
Compared with Cartesian images, log-polar images allow a significant
saving of computational resources.
|
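As a rough illustration of the quantity behind entropy-based saliency, the sketch below computes the Shannon entropy of the grey-level histogram in a square window. It assumes a plain Cartesian image stored as a list of rows; the function name, the square window and the clipping at the image border are illustrative choices, not the log-polar adaptation or border extension proposed in the paper.

```python
import math
from collections import Counter


def patch_entropy(img, r, c, radius):
    """Shannon entropy (in bits) of grey levels in the window centred at (r, c).

    The window is clipped at the image border.
    """
    rows, cols = len(img), len(img[0])
    values = [img[i][j]
              for i in range(max(0, r - radius), min(rows, r + radius + 1))
              for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    n = len(values)
    counts = Counter(values)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


flat = [[7, 7], [7, 7]]    # uniform patch: no uncertainty
mixed = [[0, 1], [1, 0]]   # two grey levels, equally frequent
print(patch_entropy(flat, 0, 0, 1), patch_entropy(mixed, 0, 0, 1))
```

Evaluating this measure over several window radii gives one notion of "scale"; under space-variant sampling the same radius covers very different areas of the scene, which is why scale selection needs care in log-polar images.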
|
Paper Nr.: |
133
|
Title: |
LEARNING A WARPED SUBSPACE MODEL OF FACES WITH IMAGES OF UNKNOWN POSE AND ILLUMINATION
|
Author(s): |
Jihun Hamm and Daniel D. Lee |
Abstract: |
In this paper we tackle the problem of learning the appearances of a person's face
from images with both unknown pose and illumination.
The unknown, simultaneous change in pose and illumination makes it difficult
to learn 3D face models from data without manual labeling and tracking of features.
In comparison, image-based models do not require geometric knowledge of faces
but only the statistics of data itself, and therefore are easier to train with images with such variations.
We take an image-based approach to the problem and
propose a generative model of a warped illumination subspace.
Image variations due to illumination change are accounted for by
a low-dimensional linear subspace, whereas variations due to pose change
are approximated by a geometric warping of images in the subspace.
We demonstrate that this model can be efficiently learned via MAP estimation and
multiscale registration techniques.
With this learned warped subspace we can jointly estimate the pose and the lighting conditions
of test images and improve recognition of faces under novel poses and illuminations.
We test our algorithm with synthetic faces and real images from the
CMU PIE and Yale face databases.
The results show improvements in prediction and recognition performance compared
to other standard methods. |
|
Paper Nr.: |
136
|
Title: |
ADDING COLOR TO GEODESIC INVARIANT FEATURES
|
Author(s): |
Pier Paolo Campari, Matteo Matteucci and Davide Migliore |
Abstract: |
Geodesic
invariant features were originally proposed to build a new local
feature descriptor invariant not only to affine transformations, but
also to general deformations. The aim of this paper is to investigate
the possible improvements given by the use of color information in this
kind of descriptor. We introduced color information both in geodesic
feature construction and description. At feature construction level, we
extended the fast marching algorithm to use color information; at
description level, we tested several color spaces on real data and
identified the opponent color space as a useful complement to intensity
information. The experiments used to validate our approach are based on
publicly available data and show an improvement, in precision and
recall, with respect to the original intensity-based geodesic features.
We also compared these features, under affine and non-affine
transformations, with SIFT, steerable filters, moment invariants, spin
images and GIH. |
|
Paper Nr.: |
145
|
Title: |
POISSON LOCAL COLOR CORRECTION FOR IMAGE STITCHING
|
Author(s): |
Mohammad Amin Sadeghi, Seyyed Mohammad Mohsen Hejrati and Niloofar Gheissari |
Abstract: |
A
new method for seamless image stitching is presented. The proposed
algorithm is a hybrid method which uses optimal seam methods and
smoothes the intensity transition between two images by color
correction. A dynamic programming algorithm that finds an optimal seam
along which gradient disparities are minimized is used. A modification
of Poisson image editing is utilized to correct color differences
between two images.
Different boundary conditions for the Poisson equation were
investigated and tested, and mixed boundary conditions generated the
most accurate results. To evaluate and compare the proposed method with
competing ones, a large image database consisting of more than two
hundred image pairs was created. The test image pairs were taken under
different lighting conditions, scene geometries and camera positions.
On this database the proposed approach compared favorably with
standard methods and was shown to be very effective in producing
visually acceptable images. |
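The optimal-seam step can be sketched with a generic dynamic programming routine. The sketch assumes the overlap region is summarised by a hypothetical per-pixel cost matrix (for example, a gradient disparity between the two images) and that the seam runs top to bottom, moving at most one column per row; the paper's actual cost and seam constraints are not given in the abstract.

```python
def optimal_seam(cost):
    """Return (seam, total): one column index per row, minimising summed cost.

    The seam may shift by at most one column between consecutive rows.
    """
    rows, cols = len(cost), len(cost[0])
    acc = [list(cost[0])]  # acc[r][c] = cheapest path cost ending at (r, c)
    for r in range(1, rows):
        prev = acc[-1]
        row = []
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols - 1, c + 1)
            row.append(cost[r][c] + min(prev[lo:hi + 1]))
        acc.append(row)

    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: acc[-1][j])
    total = acc[-1][c]
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(0, c - 1), min(cols - 1, c + 1)
        c = min(range(lo, hi + 1), key=lambda j: acc[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam, total


cost = [[3, 1, 4],
        [2, 9, 1],
        [5, 1, 6]]
seam, total = optimal_seam(cost)
print(seam, total)
```

The accumulation pass is linear in the number of pixels in the overlap, which is what makes seam search cheap compared with the subsequent Poisson-based color correction.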
|
Paper Nr.: |
147
|
Title: |
A FRAMEWORK FOR ANALYZING TEXTURE DESCRIPTORS
|
Author(s): |
Timo Ahonen and Matti Pietikäinen |
Abstract: |
This
paper presents a new unified framework for texture descriptors such as
Local Binary Patterns (LBP) and Maximum Response 8 (MR8) that are based
on histograms of local pixel neighborhood properties. This framework is
enabled by a novel filter based approach to the LBP operator which
shows that it can be seen as a special filter based texture operator.
Using the proposed framework, the filters to implement LBP are shown to
be both simpler and more descriptive than MR8 or Gabor filters in the
texture categorization task. It is also shown that when the filter
responses are quantized for histogram computation, codebook based
vector quantization yields slightly better results than threshold based
binning at the cost of higher computational complexity. |
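For orientation, the basic 3x3 LBP operator as it is commonly defined can be sketched as follows: each neighbour is compared with the centre pixel, the eight binary outcomes are packed into a code, and the codes are collected in a histogram. The neighbour ordering and function names are illustrative assumptions; this is not the filter-bank formulation developed in the paper.

```python
from collections import Counter

# The 8 neighbours of a pixel, clockwise from the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of the interior pixel (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # The sign of (neighbour - centre) is a thresholded difference-filter response.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels."""
    return Counter(lbp_code(img, r, c)
                   for r in range(1, len(img) - 1)
                   for c in range(1, len(img[0]) - 1))


img = [[5, 5, 5, 5],
       [5, 9, 5, 5],
       [5, 5, 5, 5]]
print(lbp_histogram(img))
```

Each comparison is equivalent to thresholding the response of a simple difference filter at zero, which is the link to filter-based texture operators that the abstract refers to.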
|
Paper Nr.: |
148
|
Title: |
INNER LIP SEGMENTATION BY COMBINING ACTIVE CONTOURS AND PARAMETRIC MODELS
|
Author(s): |
Sebastien Stillittano and Alice Caplier |
Abstract: |
Lip
reading applications require accurate information about lip movement
and shape, and both outer and inner contours are useful. In this paper,
we introduce a new method for inner lip segmentation. From the outer
lip contour given by a preexisting algorithm, we use some key points to
initialize an active contour called "jumping snake". Guided by
luminance and chrominance gradient information, this active
contour fits the position of two parametric models: the first,
composed of two cubic curves and a broken line, for a closed
mouth, and the second, composed of four cubic curves, for an
open mouth. These parametric models give a flexible and accurate final
inner lip contour. Finally, we present several experimental results
demonstrating the effectiveness of the proposed algorithm. |
|
Paper Nr.: |
168
|
Title: |
RECOGNITION OF DYNAMIC VIDEO CONTENTS BASED ON MOTION TEXTURE STATISTICAL MODELS
|
Author(s): |
Tomas Crivelli, Bruno Cernushi-Frias, Patrick Bouthemy and Jian-feng Yao |
Abstract: |
The
aim of this work is to model, learn and recognize dynamic contents in
video sequences, displayed mostly by natural scene elements, such as
rivers, smoke, moving foliage, fire, etc. We adopt the mixed-state
Markov random fields modeling recently introduced to represent the
so-called motion textures. The approach consists in describing the
spatial distribution of some motion measurements which exhibit values
of two types: a discrete component related to the absence of motion and
a continuous part for measurements different from zero. Based on this,
we present a method for recognition and classification of real motion
textures using the generative statistical models that can be learned
for each motion texture class. Experiments on sequences from the DynTex
dynamic texture database demonstrate the performance of this novel
approach. |
|
Paper Nr.: |
177
|
Title: |
A FAST AND ROBUST METHOD FOR VOLUMETRIC MRI BRAIN EXTRACTION
|
Author(s): |
Sami Bourouis and Kamel Hamrouni |
Abstract: |
This paper presents a method for
magnetic resonance imaging (MRI) segmentation and the extraction of
main brain tissues. The method uses an image processing technique
based on a level-set approach and the EM algorithm. The paper describes
the main features of the method, and presents experimental results
with real volumetric images in order to evaluate the performance of
the method. |
|
Paper Nr.: |
178
|
Title: |
MULTIRESOLUTION MESH SEGMENTATION OF MRI BRAIN USING CLASSIFICATION AND DISCRETE CURVATURE
|
Author(s): |
Sami Bourouis, Kamel Hamrouni and Mounir Dhibi |
Abstract: |
This paper presents a method for brain tissue segmentation and
characterization of magnetic resonance imaging (MRI) scans. It is
based on statistical classification, differential geometry, and
multiresolution representation. The Expectation Maximization
algorithm and k-means clustering are applied to generate an initial
mask of the tissue classes of the data volume. Then, we generate a
hierarchical multiresolution representation of each object. The idea
is that the low-resolution description is used to determine
constraints for the segmentation at the higher resolutions. Thus,
our contribution is the design of a pipeline procedure for brain
characterization/labeling by using discrete curvature and
multiresolution representation. We have tested our method on several
MRI data sets.
|
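The k-means part of the initial classification can be sketched on scalar intensities. The example assumes three intensity classes with hand-picked starting centres and made-up voxel values; it is a generic Lloyd iteration and does not reproduce the EM step, the curvature analysis or the multiresolution pipeline described in the abstract.

```python
def nearest(value, centers):
    """Index of the centre closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def kmeans_1d(values, centers, iterations=20):
    """Lloyd's k-means on scalar intensities; returns (centers, labels)."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            clusters[nearest(v, centers)].append(v)
        # An empty cluster keeps its previous centre.
        updated = [sum(cl) / len(cl) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    labels = [nearest(v, centers) for v in values]
    return centers, labels


# Hypothetical voxel intensities forming three groups.
intensities = [10, 12, 11, 100, 102, 98, 200, 205]
centers, labels = kmeans_1d(intensities, centers=[0, 100, 255])
print(centers, labels)
```

The resulting labels play the role of an initial tissue mask; in practice the starting centres would come from the data rather than being fixed by hand.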
|
Paper Nr.: |
182
|
Title: |
WAVELET TRANSFORM FOR PARTIAL SHAPE RECOGNITION USING SUB-MATRIX MATCHING
|
Author(s): |
El-hadi Zahzah |
Abstract: |
In
this paper, we propose a method for 2D partial shape recognition
under affine transform using the translation-invariant discrete dyadic
wavelet transform, well known as the Stationary Wavelet
Transform (SWT). The main problem of this type of transform is
its dependence on the signal starting point, since the same signal
may have several representations depending on the starting point.
The choice of the starting point is then necessary to match two
descriptors. Moreover, the contours must be closed, which is not
realistic, generally because of the image quality and the methods
of contour extraction. Recently, we proposed a 2D shape recognition
method based on the Discrete Wavelet Transform. This method was applied
to contours represented by closed curves. The method we propose in this
paper addresses partial shape matching based on contour representation using
the wavelet transform.
A technique of sub-matrix matching is then used to match partial
shapes. |
|
Paper Nr.: |
197
|
Title: |
INDEX, MIDDLE, AND RING FINGER EXTRACTION AND IDENTIFICATION BY INDEX, MIDDLE, AND RING FINGER OUTLINES
|
Author(s): |
Ching-Liang Su |
Abstract: |
In
this study, a new technique is used to extract the index, middle and
ring finger outlines. The orientations and geometrical features of
these outlines are calculated and compared to identify different
individuals. The techniques of database SQL searching and manipulation,
image dilation, object position locating, image shifting, rotation, and
interpolation are used to recognize different individuals. The hand was
fixed each time when a photograph was taken, and one can assume that
each time when a hand was acquired, the image was the same as the
previous one. Since the photographs are the same, after the index,
middle or ring fingers have been extracted from the hand image, the
acquired images can be used to identify different persons. |
|
Paper Nr.: |
202
|
Title: |
IMAGE PROCESSING IN MATERIAL ANALYSES OF ARTWORKS
|
Author(s): |
Miroslav Beněs, Barbara Zitová, Janka Hradilová and David Hradil |
Abstract: |
In this paper we present Nephele, a system for processing, describing and archiving material analyses used during art
restoration. The aim of the material analyses of painting layers is to identify inorganic and organic
compounds using microanalytical methods, and to describe the stratigraphy and morphology of the layers. The results
are used to interpret the applied painting technique. The Nephele system is a database system for material
analysis reports, extended with image preprocessing modules and an image retrieval facility. The implemented
digital image processing methods are image registration, layer segmentation, and grain segmentation.
In the archiving part of Nephele, in addition to the traditional database functions, we have incorporated
image-based retrieval methods into the developed system. They are based on feature descriptions such as
Haralick descriptors of co-occurrence matrices or features computed using the wavelet decomposition of the
images. The presented examples of achieved results show the applicability of the system. |
|
Paper Nr.: |
211
|
Title: |
CONTENT-BASED IMAGE RETRIEVAL USING GENERIC FOURIER DESCRIPTOR AND GABOR FILTERS
|
Author(s): |
Quan He, ZhengQiao Ji and Q. M. Jonathan Wu |
Abstract: |
Content-based
image retrieval (CBIR) is an important research area with applications
to large image databases and multimedia information. CBIR relies on
three general kinds of visual content: color, texture and shape. The
focus of this paper is on the problem of shape and texture feature
extraction and representation for CBIR. We apply the Generic Fourier
Descriptor (GFD) for shape feature extraction and Gabor Filters (GF)
for texture feature extraction, and we successfully combine GFD and GF
for joint shape and texture feature extraction. Experimental results
show that the proposed GFD+GF is robust across all the test databases and
achieves the best retrieval rate. |
|
Paper Nr.: |
232
|
Title: |
ON THE IMPROVEMENT OF THE TOPOLOGICAL ACTIVE VOLUMES MODEL - A Tetrahedral Approach
|
Author(s): |
N. Barreira, M. G. Penedo, M. Ortega and J. Rouco |
Abstract: |
The
Topological Active Volumes model is a 3D active model focused on
segmentation and reconstruction tasks. The segmentation process is
based on the adjustment of a 3D mesh composed of polyhedra. This
adjustment is guided by the minimisation of several energy functions
related to the mesh. Even though the original cubic mesh achieves good
segmentation results, it has difficulties in some cases due to its
shape. This paper proposes a new topology for the TAV mesh based on
tetrahedra that overcomes the cubic mesh difficulties. Also, the paper
explains an improvement in the tetrahedral topology to increase the
accuracy of the results as well as the efficiency of the overall
process. |
|
Paper Nr.: |
242
|
Title: |
MULTIREGION GRAPH CUT IMAGE SEGMENTATION
|
Author(s): |
Mohamed Ben Salah, Ismail Ben Ayed and Amar Mitiche |
Abstract: |
Graph cut image segmentation, which solves a
labeling problem by combinatorial optimization, has been applied successfully
to a variety of images. For common objective functions, graph cut methods run
significantly faster than level set methods. However, because they assign a
grey level label to each pixel from the set of all possible grey levels, they
lead to an implicit partition of the image domain which is generally an
oversegmentation. The purpose of our study is two-fold: (1) to investigate an
image segmentation method which combines parametric modeling of the image
data and graph cut combinatorial optimization, and (2) to use a prior which
allows the number of labels/regions to decrease when the number of regions is
not known and the algorithm is initialized with a larger number. Experimental
verification shows that the method results in good segmentations and runs
faster than conventional graph cut methods. |
|
Paper Nr.: |
243
|
Title: |
ACTIVE APPEARANCE MODEL (AAM) - From Theory to Implementation
|
Author(s): |
Aleksandra Pizurica, Nikzad Babaii Rizvandi and Wilfried Philips |
Abstract: |
Active
Appearance Model (AAM) is a powerful object modeling technique and one
of the best available ones in computer vision and computer graphics.
This approach is however quite complex and various parts of its
implementation were addressed separately by different researchers in
several recent works. In this paper, we present systematically a full
implementation of the AAM model with pseudo codes for the crucial steps
in the construction of this model. |
|
Paper Nr.: |
245
|
Title: |
ENHANCED PHASE–BASED DISPLACEMENT ESTIMATION - An Application to Facial Feature Extraction and Tracking
|
Author(s): |
Mohamed Dahmane and Jean Meunier |
Abstract: |
In
this work, we develop a multi-scale approach for automatic facial
feature detection and tracking. The method is based on a coarse to fine
paradigm to characterize a set of facial fiducial points using a bank
of Gabor filters that have interesting properties such as
directionality, scalability and hierarchy. When the first face image is
captured, a trained grid is used on the coarsest level to estimate a
rough position for each facial feature. Afterward, a refinement stage
is performed from the coarsest to the finest (original) image level to
get accurate positions. These are then tracked over the subsequent
frames using a modification of a fast phase–based technique.
Experimental results show that facial features can be localized with
high accuracy and that their tracking can be kept during long periods
of free head motion. |
|
Paper Nr.: |
257
|
Title: |
NOVEL TECHNIQUES FOR AUTOMATICALLY ENHANCED VISUALIZATION OF CORONARY ARTERIES IN MSCT DATA AND FOR DRAWING DIRECT COMPARISONS TO CONVENTIONAL ANGIOGRAPHY
|
Author(s): |
Marion Jähne, Christina Lacalli and Stefan Wesarg |
Abstract: |
The
new generation of multi-slice computed tomography (MSCT) scanners
enables the radiologist to assess the coronary arteries in a
non-invasive way. The question of particular interest is whether the
quality of the findings based on MSCT data can compete with the gold
standard - the coronary angiography.
In this work we present novel automated methods for a reliable
visualization of coronary arteries and
for drawing direct visual side-by-side comparisons to conventional
angiograms. Our approach comprises a new method for automatically
extracting the heart from cardiac CT data and an advanced masking
method for eliminating large cardiac cavities to obtain a better
visibility of the coronary arteries in the rendered CT data. For
drawing direct side-by-side comparisons we present a novel approach for
simulating the conventional coronary angiography in an easy-to-handle
manner.
The new methods have been developed for and tested with
contrast-enhanced cardiac CT datasets. |
|
Paper Nr.: |
296
|
Title: |
DEPTH-BASED DETECTION OF SALIENT MOVING OBJECTS IN SONIFIED VIDEOS FOR BLIND USERS
|
Author(s): |
Benoît Deville, Guido Bologna, Michel Vinckenbosch and Thierry Pun |
Abstract: |
The
context of this work is the development of a mobility aid for visually
impaired persons. We present here an original approach for a real time
alerting system, based on the use of feature maps for detecting visual
salient parts in images. In order to improve the quality of this
method, we propose here to benefit from a new feature map constructed
from the depth gradient. A specific distance function is described,
which takes into account both stereoscopic camera limitations and
user's choices. We demonstrate here that this additional depth-based
feature map allows the system to detect the salient regions with good
accuracy in most situations, even with noisy disparity maps. |
|
Paper Nr.: |
294
|
Title: |
REDUCING THE EFFECT OF PARTIAL OCCLUSIONS
|
Author(s): |
Meryem Erbilek and Önsen Toygar |
Abstract: |
The
difficulty in the process of human identification by iris recognition
is that the iris images captured may have occlusions by the eyelids and
eyelashes. In that case, recognition of occluded iris patterns becomes
hard and the corresponding person may not be correctly recognized. In
order to reduce the effect of eyelid or eyelash occlusion on the
recognition of human beings by their iris patterns, we propose a
simple and efficient method for iris recognition using specific
regions on the iris images without using the traditional preprocessing
approach before applying the feature extraction method to recognize the
irises. First of all, these regions are evaluated individually and
then the outputs of each region are combined using a multiple
classifier combination method with the feature extraction method
Principal Component Analysis (PCA). The experiments on the iris images,
with and without occlusions, demonstrate that the proposed approach
achieves better recognition rates compared to the recognition rates of
the holistic approaches. |
|
Paper Nr.: |
305
|
Title: |
EVALUATION OF LOCAL ORIENTATION FOR TEXTURE CLASSIFICATION
|
Author(s): |
Dana Elena Ilea, Ovidiu Ghita and Paul F. Whelan |
Abstract: |
The
aim of this paper is to present a study where we evaluated the optimal
inclusion of the texture orientation in the classification process. In
this paper the orientation for each pixel in the image is extracted
using the partial derivatives of the Gaussian function and the main
focus of our work is centred on the evaluation of the local dominant
orientation (which is calculated by combining the magnitude and local
orientations) on the classification results. While the dominant
orientation of the texture depends strongly on the observation scale,
in this paper we propose to evaluate the macro-texture by calculating
the distribution of the dominant orientations for all pixels in the
image that sample the texture at micro-level. The experimental results
were conducted on standard texture databases and the results indicate
that the dominant orientation calculated at micro-level is an
appropriate measure for texture description. |
|
Paper Nr.: |
306
|
Title: |
AUTOMATIC SHOT BOUNDARY DETECTION USING GAUSSIAN MIXTURE MODEL
|
Author(s): |
A. Adhipathi Reddy and Sridhar Varadharajan |
Abstract: |
The
basic step for video analysis is the detection of shots in a given
video. A shot is a sequence of frames captured in a single continuous
action in time and space using a single camera. The boundary between
two adjacent shots may be an abrupt change (hard cut) or gradual
change. In the literature, many shot boundary detection algorithms have
been proposed for detecting the hard cut or gradual changes like
fadein/out and dissolve. The performance of these algorithms degrades
with zooming, lighting change conditions, and fast moving type of
videos. In this paper, a novel algorithm based on Gaussian Mixture
Model (GMM) is developed for shot boundary detection. The behavior of
GMM with abrupt and gradual change is used for detection of hard cut,
fadein/out and dissolve. Experimental results show the credibility of the
proposed algorithm with zooming, lighting change conditions, and fast
moving type of videos. |
|
Paper Nr.: |
313
|
Title: |
MEAN SHIFT SEGMENTATION - Evaluation of Optimization Techniques
|
Author(s): |
Jens N. Kaftan, André A. Bell and Til Aach |
Abstract: |
The mean shift algorithm is a powerful clustering technique, which is based on an iterative scheme to detect
modes in a probability density function. It has been utilized for image segmentation by seeking the modes in
a feature space composed of spatial and color information.
Although the modes of the feature space can be efficiently calculated in that scheme, different optimization
techniques have been investigated to further improve the calculation speed. Besides those techniques that
improve the efficiency using specialized data structures, there are other ones, which take advantage of some
heuristics, and therefore affect the accuracy of the algorithm output.
In this paper we discuss and evaluate different optimization strategies for mean shift based image segmentation.
These optimization techniques are quantitatively evaluated based on different real world images. We
compare segmentation results of heuristic-based, performance-optimized implementations with the segmentation
result of the original mean shift algorithm as a gold standard. Towards this end, we utilize different
partition distance measures, by identifying corresponding regions and analyzing the thus revealed differences. |
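
The mode-seeking iteration described in this abstract can be illustrated with a minimal one-dimensional sketch. It assumes a Gaussian kernel and a hand-picked bandwidth; the function name `mean_shift_mode`, the sample values and the bandwidth are illustrative choices, not taken from the paper, and none of the evaluated optimization techniques are shown.

```python
import math

def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=500):
    """Follow the mean shift iteration from `start` towards a mode of the
    Gaussian kernel density estimate of the 1D samples in `points`."""
    x = float(start)
    for _ in range(max_iter):
        # Gaussian kernel weight of every sample relative to the current position
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        total = sum(weights)
        if total == 0.0:
            # all samples are too far away to contribute; stay where we are
            break
        # the new position is the kernel-weighted mean of the samples
        new_x = sum(w * p for w, p in zip(weights, points)) / total
        converged = abs(new_x - x) < tol
        x = new_x
        if converged:
            break
    return x

# Two well-separated clusters (hypothetical data)
samples = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
low_mode = mean_shift_mode(samples, start=0.5, bandwidth=0.5)
high_mode = mean_shift_mode(samples, start=5.5, bandwidth=0.5)
```

For image segmentation the same iteration would run in a joint spatial and color feature space, with pixels that converge to the same mode grouped into one region.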
|
Paper Nr.: |
315
|
Title: |
A ROBUST AND EFFICIENT METHOD FOR TOPOLOGY ADAPTATIONS IN DEFORMABLE MODELS
|
Author(s): |
Jochen Abhau |
Abstract: |
In
this paper, we present a novel algorithm for calculating topological
adaptations in explicit evolutions of surface meshes in 3D. Our
topological adaptation system consists of two main ingredients: A
spatial hashing technique is used to detect mesh self-collisions during
the evolution. Its expected running time is linear with respect to the
number of vertices. A database consisting of possible topology changes
is developed in the mathematical framework of homology theory. This
database allows for fast and robust topology adaptation during a mesh
evolution. The algorithm works without mesh reparametrizations, global
mesh smoothness assumptions or vertex sampling density conditions,
making it suitable for robust, near real-time application. Furthermore,
it can be integrated into existing mesh evolutions easily. Numerical
examples from medical imaging are given. |
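
To illustrate why spatial hashing gives an expected linear running time, the following sketch hashes mesh vertices into a uniform grid and compares each vertex only with vertices in its own and neighbouring cells. This is a simplified stand-in, assuming vertex proximity as the collision test; the paper's actual self-collision test on surface meshes is not described in the abstract, and `close_vertex_pairs` and `cell_size` are hypothetical names.

```python
import math
from collections import defaultdict

def close_vertex_pairs(vertices, cell_size):
    """Return sorted index pairs (i, j), i < j, of 3D vertices that lie
    closer than `cell_size` to each other, using a uniform grid hash."""
    grid = defaultdict(list)
    for index, (x, y, z) in enumerate(vertices):
        key = (math.floor(x / cell_size),
               math.floor(y / cell_size),
               math.floor(z / cell_size))
        grid[key].append(index)

    # a vertex can only be close to vertices in its own cell or the 26 neighbours
    offsets = [(dx, dy, dz)
               for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    pairs = set()
    for (cx, cy, cz), members in grid.items():
        for dx, dy, dz in offsets:
            neighbours = grid.get((cx + dx, cy + dy, cz + dz), [])
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(vertices[i], vertices[j]) < cell_size:
                        pairs.add((i, j))
    return sorted(pairs)

# Hypothetical vertex positions: two nearly touching pairs, far apart from each other
vertices = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (1.0, 1.0, 1.0), (0.99, 1.0, 1.0)]
collisions = close_vertex_pairs(vertices, cell_size=0.1)
```

When the vertices are spread so that each cell holds a bounded number of them, every vertex is compared against a constant number of candidates, which is where the linear expected cost comes from.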
|
Paper Nr.: |
319
|
Title: |
ESTIMATING CAMERA ROTATION PARAMETERS FROM A BLURRED IMAGE
|
Author(s): |
Giacomo Boracchi, Vincenzo Caglioti and Alberto Danese |
Abstract: |
A
fast rotation of the camera during the image acquisition results in a
blurred image, which typically shows curved smears. We propose an
algorithm for estimating both the camera rotation axis and the camera
angular speed from a single blurred image. The algorithm is based on
local analysis of the blur smears. Contrary to the existing methods in
the literature, we treat the more general case where the rotation axis
need not be orthogonal to the image plane, taking into account the
perspective effects that in such a case affect the smears.
The algorithm is validated in experiments with synthetic and real
blurred images, providing accurate estimates in both cases. |
|
Paper Nr.: |
332
|
Title: |
LATTICE EXTRACTION BASED ON SYMMETRY ANALYSIS
|
Author(s): |
Manuel Agustí-Melchor, Jose-Miguel Valiente-González and Ángel Rodas-Jordá |
Abstract: |
In
many computer tasks it is necessary to structurally describe the
contents of images for further processing, for example, in regular
images produced in industrial processes such as textiles or ceramics.
After reviewing the different approaches found in the literature, this
work redefines the problem of periodicity in terms of the existence of
local symmetries.
Phase symmetry analysis is chosen to obtain these symmetries because of
its robustness when dealing with image contrast and noise. Also, the
multiresolution nature of the technique offers independence from using
fixed thresholds to segment the image. Our adaptation of the original
technique, based on lattice constraints, has resulted in a parameter-free
algorithm for determining the lattice. It offers a significant increase
in computational speed with respect to the original proposal. Given
that there is no set of images for assessing this type of technique,
various sets of images have been used, and a proposal to create more
images for evaluating algorithms related to this task is presented. A
measure to enable the evaluation of results is also introduced, so that
each calculated lattice can be tagged with an index regarding its
correctness. The experiments show that using this statistic, good
results are reported from image collections. Possible applications of
the lattice extraction are suggested. |
|
Paper Nr.: |
334
|
Title: |
BUILDING A NORMALITY SPACE OF EVENTS - A PCA Approach to Event Detection
|
Author(s): |
Angelo Cenedese, Ruggero Frezza, Enrico Campana, Giambattista Gennari and Giorgio Raccanelli |
Abstract: |
The detection of events in video streams is a central task in the automatic vision paradigm, and spans heterogeneous
fields of application from the surveillance of the environment, to the analysis of scientific data.
Actually, although well captured by intuition, the definition itself of event is somewhat hazy and depends
on the specific application of interest. In this work, the approach to the problem of event detection is different
in nature. Instead of defining the event and searching for it within the data, a normality space of the scene is
built from a chosen learning sequence (which represents the only input from the human operator). The event
detection algorithm works by projecting any newly acquired image onto the normality space so as to calculate
a distance from it that represents the innovation of the new frame, and define the metric for triggering an event
alert. The algorithm has been validated in real life situations, in indoor and outdoor environments, and presents
appealing features in terms of robustness to natural motions and weather conditions. |
|
Paper Nr.: |
337
|
Title: |
A SUBJECTIVE SURFACES BASED SEGMENTATION FOR THE RECONSTRUCTION OF BIOLOGICAL CELL SHAPE
|
Author(s): |
Matteo Campana, Cecilia Zanella, Barbara Rizzi, Paul Bourgine, Nadine Peyriéras and Alessandro Sarti |
Abstract: |
Confocal
laser scanning microscopy provides nondestructive in vivo imaging to
capture specific structures that have been fluorescently labeled, such
as cell nuclei and membranes, throughout early Zebrafish embryogenesis.
With this strategy we aim to reconstruct in time and space the
biological structures of the embryo during the organogenesis. In this
paper we propose a method to extract bounding surfaces at the
cellular-organization level from microscopy images. The shape
reconstruction of membranes and nuclei is obtained first with an
automatic identification of the cell center and then a subjective
surfaces based segmentation is used to extract the bounding surfaces. |
|
Paper Nr.: |
340
|
Title: |
CHARACTERISATION AND AUTOMATIC DETECTION OF LYMPH NODES ON MR COLORECTAL IMAGES
|
Author(s): |
Jeong-Gyoo Kim and J. Michael Brady |
Abstract: |
Colorectal
cancer is the second most common cause of death in Western countries.
It is often curable by chemoradiotherapy and/or surgery; however,
accurate staging has a significant impact on patient management and
outcome. Numerous clinical reports attest to the fact that staging is
not currently satisfactory, and so more precise methods are required
for effective treatment. The three major components of disease staging
are tumour size; whether or not there is distal metastatic spread; and
the extent of lymph node involvement. Of these, the latter is currently
by far the hardest to quantify, and it is the subject of this paper.
Lymph nodes are distributed throughout the mesorectal fascia that
envelops the colorectum. In practice, they are detected and assessed by
clinicians using properties such as their size and shape. We are not
aware of any previous image analysis approach for colorectal images
that makes this subjective approach more scientific. To aid precise
staging and surgery, we have developed methods that characterise lymph
nodes by extracting implicit properties as computed from magnetic
resonance colorectal images. We first learn the probability density
function (PDF) of the intensities of the mesorectal fascia and find
that it closely approximates a Gaussian distribution. The parameters of
a Gaussian, fitted to the PDF, were estimated and the mean intensity of
a lymph node candidate was compared with it. The fitting provides an
explicit criterion for a region to be classed as a lymph node: namely,
it is an outlier of the Gaussian distribution. As a key part of this
process, we need to segment the boundaries of the mesorectal fascia,
which is enclosed by two closed contours. Clinicians recognise the
outer contour as thin edges. Since the thin edges are often ambiguous
and disconnected, differentiating them from neighbouring tissues is a
non-trivial problem; the surrounding tissues have no significant
difference from the mesorectal fascia in both intensity and texture. We
employed a level set method to segment three sets of objects: the
mesorectal fascia, the colorectum, and lymph node candidates. Our
segmentation results led us to build a PDF and to use it for the
criterion that we propose. The whole process of implementation of our
methods is automatic including the lookup of lymph node candidates. The
results of clinical cases are summarised in the paper. |
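
The outlier criterion described above can be sketched as follows: fit a Gaussian to the intensities of the mesorectal fascia, then flag a candidate region whose mean intensity lies far from the fitted mean. The threshold of three standard deviations (`k=3.0`), the function names and the intensity values are assumptions made for illustration; the abstract does not state the cut-off the authors use.

```python
from statistics import mean, stdev

def fit_gaussian(intensities):
    """Estimate the mean and sample standard deviation of background intensities."""
    return mean(intensities), stdev(intensities)

def is_outlier(candidate_mean, mu, sigma, k=3.0):
    """Flag a candidate whose mean intensity is more than k standard
    deviations away from the fitted Gaussian mean (k is an assumed value)."""
    return abs(candidate_mean - mu) > k * sigma

# Hypothetical fascia intensities and two candidate regions
fascia = [100, 102, 98, 101, 99, 100, 103, 97]
mu, sigma = fit_gaussian(fascia)
bright_candidate = is_outlier(110, mu, sigma)
ordinary_region = is_outlier(103, mu, sigma)
```

In this toy data the fitted mean is 100 with a standard deviation of 2, so a region with mean intensity 110 is flagged while one with mean 103 is not.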
|
Paper Nr.: |
355
|
Title: |
SPECKLE MODELIZATION IN OCT IMAGES FOR SKIN LAYERS SEGMENTATION
|
Author(s): |
Ali Mcheik, Clovis Tauber, Hadj Batatia, Jerome George and Jean-Michel Lagarde |
Abstract: |
In
dermatology, optical coherence tomography (OCT) is used to
visualize the skin to a depth of a few millimetres. These images are
affected by speckle, which can alter the interpretation, but which also
carries information that characterizes locally the visualized tissue. In
this paper, we present a statistical study of the speckle distribution
in OCT images. The capability of three probability density functions
(pdf) (Rayleigh, Lognormal, and Nakagami) to differentiate the speckle
distribution according to the skin layer is analysed. For each pdf, the
vector of parameters, estimated over several images which are annotated
by experts, is mapped onto a parameter space. Quantitative results
over 30 images are compared to the manual delineations of 5 experts.
Results confirm the potential of the method for the segmentation of the
layers of the skin. |
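
As an illustration of the parameter estimation step for one of the three distributions, the sketch below computes the maximum-likelihood scale parameter of a Rayleigh distribution from a set of positive speckle amplitudes. The estimator is the standard closed form for the Rayleigh density; whether the authors use maximum likelihood is an assumption, and the sample values and function names are hypothetical.

```python
import math

def rayleigh_sigma_mle(samples):
    """Maximum-likelihood estimate of the Rayleigh scale parameter:
    sigma = sqrt(sum(x^2) / (2 n))."""
    if not samples:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(x * x for x in samples) / (2.0 * len(samples)))

def rayleigh_log_likelihood(samples, sigma):
    """Log-likelihood of positive samples under a Rayleigh density
    f(x) = x / sigma^2 * exp(-x^2 / (2 sigma^2))."""
    return sum(math.log(x) - 2.0 * math.log(sigma) - x * x / (2.0 * sigma * sigma)
               for x in samples)

# Hypothetical amplitudes taken from one annotated skin layer
region = [3.0, 4.0]
sigma_hat = rayleigh_sigma_mle(region)
```

Repeating the estimate for each annotated layer gives one point per layer in the parameter space, and the separation between those points indicates how well the distribution discriminates the layers.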
|
Paper Nr.: |
362
|
Title: |
INCORPORATING A NEW RELATIONAL FEATURE IN ARABIC ONLINE HANDWRITTEN CHARACTER RECOGNITION
|
Author(s): |
Sara Izadi and Ching Y. Suen |
Abstract: |
Artificial neural networks have shown good performance in classification tasks. However, models used for
learning in pattern classification are challenged when the differences between the patterns of the training set
are small. Therefore, the choice of effective features is mandatory for obtaining good performance. Statistical
and geometrical features alone are not suitable for recognition of hand printed characters due to variations in
writing styles that may result in deformations of character shapes. We address this problem by using a relational
context feature combined with a local descriptor for training a neural network-based recognition system
in a user-independent online character recognition application. Our feature extraction approach provides a
rich representation of the global shape characteristics, in a considerably compact form. This new relational
feature generally provides higher distinctiveness, which increases robustness with respect to character deformations
and potentially increases the recognition rate in a user-independent system. While enhancing the
recognition accuracy, the feature extraction is computationally simple. We show that the ability to discriminate
in Arabic handwriting characters is increased by adopting this mechanism which provides input to the
feed forward neural network architecture. Our experiments on Arabic character recognition show comparable
results with the state-of-the-art methods for online recognition of these characters. |
|
Paper Nr.: |
368
|
Title: |
PROJECTIVE IMAGE ALIGNMENT BY USING ECC MAXIMIZATION
|
Author(s): |
Georgios D. Evangelidis and Emmanouil Z. Psarakis |
Abstract: |
Nonlinear
projective transformation provides the exact number of desired
parameters to account for all possible camera motions, which makes it
a natural choice in problems where the
objective is the alignment of two or more image profiles.
Moreover, the ability of an alignment
algorithm to quickly and accurately estimate the parameter values of
the geometric transformation even in cases of
over-modelling of the warping process constitutes a basic requirement
for many computer vision applications. In this paper the appropriateness
of the Enhanced Correlation Coefficient (ECC) function as a performance
criterion in the projective image
registration problem is investigated. Since this measure is a highly
nonlinear function of the warp parameters, its maximization is achieved
by using an iterative technique. The main theoretical results
concerning the nonlinear optimization problem and an
efficient approximation that leads to an optimal closed form solution
(per iteration) are presented. The performance of the iterative
algorithm is compared against the well known Lucas-Kanade algorithm
with the help of a series of experiments involving strong or weak
geometric deformations, ideal and noisy conditions and even
over-modelling of the warping process. In all cases the ECC-based algorithm
exhibits a better behavior in speed, as well as in the probability of
convergence as compared to the Lucas-Kanade scheme. |
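
A minimal sketch of the kind of measure being maximized, assuming the criterion is the correlation coefficient between the zero-mean versions of the reference and warped intensity vectors. Only the measure is shown; the iterative maximization over the warp parameters and the closed-form update per iteration are not reproduced, and the function name `ecc` and the test profiles are illustrative.

```python
import math

def ecc(reference, warped):
    """Correlation coefficient between the zero-mean versions of two
    intensity vectors of equal length."""
    if len(reference) != len(warped):
        raise ValueError("vectors must have the same length")
    mean_r = sum(reference) / len(reference)
    mean_w = sum(warped) / len(warped)
    zero_mean_r = [v - mean_r for v in reference]
    zero_mean_w = [v - mean_w for v in warped]
    norm_r = math.sqrt(sum(v * v for v in zero_mean_r))
    norm_w = math.sqrt(sum(v * v for v in zero_mean_w))
    if norm_r == 0.0 or norm_w == 0.0:
        raise ValueError("the coefficient is undefined for constant vectors")
    return sum(a * b for a, b in zip(zero_mean_r, zero_mean_w)) / (norm_r * norm_w)

# Hypothetical intensity profiles
reference = [1.0, 2.0, 3.0, 4.0]
brighter = [2.0 * v + 10.0 for v in reference]   # gain and bias change only
reversed_profile = list(reversed(reference))      # geometrically misaligned
```

Because the vectors are centred and normalized, a change of gain or bias leaves the coefficient at its maximum of 1, so the measure reacts to geometric misalignment rather than to such photometric changes.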
|
Paper Nr.: |
369
|
Title: |
PERFORMANCE EVALUATION OF ROBUST MATCHING MEASURES
|
Author(s): |
Federico Tombari, Luigi Di Stefano, Stefano Mattoccia and Angelo Galanti |
Abstract: |
This
paper is aimed at evaluating the performances of different measures
which have been proposed in the literature for robust matching. In
particular, classical matching metrics typically employed for this task
are considered together with specific approaches aiming at achieving
robustness. The main aspects assessed by the proposed evaluation are
robustness with respect to photometric distortions, noise and occluded
patterns. Specific datasets have been used for testing, which provide a
very challenging framework for what concerns the considered disturbance
factors and can also serve as testbed for evaluation of future robust
visual correspondence measures. |
|
Paper Nr.: |
373
|
Title: |
HEAD POSE ESTIMATION IN FACE RECOGNITION ACROSS POSE SCENARIOS
|
Author(s): |
M. Saquib Sarfraz and Olaf Hellwich |
Abstract: |
We
present a robust front-end pose classification/estimation procedure to
be used in face recognition scenarios. A novel discriminative feature
description that encodes underlying shape well and is insensitive to
illumination and other common variations in facial appearance, such as
skin colour etc., is proposed. Using such features we generate a pose
similarity feature space (PSFS) that turns the multi-class problem into
two-class by using inter-pose and intra-pose similarities. A new
classification procedure is laid down which models this feature space
and copes well with discriminating between nearest poses. For a test
image it outputs a measure of confidence or so called posterior
probability for all poses without explicitly estimating underlying
densities. The pose estimation system is evaluated using CMU Pose,
Illumination and Expression (PIE) database. |
|
Paper Nr.: |
379
|
Title: |
IMAGE RE-SEGMENTATION - A New Approach Applied to Urban Imagery
|
Author(s): |
Thales Sehn Korting, Leila Maria Garcia Fonseca, Luciano Vieira Dutra and Felipe Castro da Silva |
Abstract: |
This
article presents a new approach for Image Segmentation, applied to
Urban Scenes of Remote Sensing Data. Our method is called
re-segmentation, since it uses the results of a previous over-segmented
image, using well-known algorithms like Region Growing or Watershed.
Resultant objects of the first segmentation are connected through a
weighted Region Adjacency Graph, and by analyzing the connections, we
look for regular shapes, i.e. rectangles and circles, within the
connected nodes. The objects, or graph vertices, whose union forms more
regular objects, are merged resulting in new regions with shape
characteristics adequate to the urban case.
|
|
Paper Nr.: |
384
|
Title: |
SURFACE DEFECTS DETECTION ON ROLLED STEEL STRIPS BY GABOR FILTERS
|
Author(s): |
Roberto Medina, Fernando Gayubo, Luis M. González, David Olmedo, Jaime Gómez, Eduardo Zalama and José R. Perán |
Abstract: |
Product
material integrity and surface appearance, in steel flat products
manufacturing and processing, are important attributes that will affect
product operation, reliability and customer confidence. Automated
visual inspection has to be envisaged, but five major problems have to
be overcome: (i) The variable nature of the defects, (ii) The highly
reflective nature of the metallic surfaces, (iii) The presence of oil,
(iv) The huge amount of visual data to be acquired and processed, and
(v) The high speed in the section where inspections are performed. We
have developed an automated cellular visual inspection system of flat
products in a flat steel cutting factory. Among the approaches that the
system uses to detect defects, we have included the two-dimensional
Gabor filters. In this paper a detection procedure of defects in flat
steel products based on Gabor filters is presented. The traditional
methods based on the study of the grey-level histogram and shape
analysis have shown quite good results, but they are not good enough
to achieve the level of success required. Experimental results show
that a greater number of defects can be readily detected using the
proposed approach. |
|
Paper Nr.: |
428
|
Title: |
CORE POINT DETECTION USING FINE ORIENTATION FIELD ESTIMATION
|
Author(s): |
M. Usman Akram, Rabia Arshad, Rabia Anwar, Shoab A. Khan and Sarwat Nasir |
Abstract: |
Performance of an Automatic Fingerprint Identification System
(AFIS) is greatly influenced by the detection of the core
point. Extraction of the best Region of Interest (ROI) from the image
can play a vital role for core point detection. In this
paper, we present an improved technique for fine orientation
field estimation and core point detection. The distinct
feature of our technique is that it gives high detection percentage
of core point even in case of low quality fingerprint
images. The proposed algorithm is applied on FVC2004
database. Results of experiments demonstrate improved
performance for detecting core point. |
|
Paper Nr.: |
429
|
Title: |
FACIAL EXPRESSION RECOGNITION BASED ON FUZZY LOGIC
|
Author(s): |
M. Usman Akram, Irfan Zafar, Wasim Siddique Khan and Zohaib Mushtaq |
Abstract: |
We present a novel scheme for facial expression
recognition from facial features using a Mamdani-type fuzzy system.
Facial expression recognition is of prime importance in
human-computer interaction systems (HCI). HCI has gained
importance in web information systems and e-commerce and
certainly has the potential to reshape the IT landscape towards
value driven perspectives. We present a novel algorithm for
facial region extraction from a static image. These extracted facial
regions are used for facial feature extraction. Facial features
are fed to a Mamdani-type fuzzy rule based system for facial
expression recognition. Linguistic models employed for facial
features provide an additional insight into how the rules combine
to form the ultimate expression output. Another distinct feature
of our system is the membership function model of expression
output which is based on different psychological studies and
surveys. The validation of the model is further supported by
the high expression recognition percentage. |
|
|
|
Area 3 - Image Understanding
|
Paper Nr.: |
11
|
Title: |
A CORRECTIVE FRAMEWORK FOR FACIAL FEATURE DETECTION AND TRACKING
|
Author(s): |
Hussein Hamshari, Steven Beauchemin, Denis Laurendeau and Normand Teasdale |
Abstract: |
Epidemiological
studies indicate that automobile drivers from varying demographics are
confronted by difficult driving contexts such as negotiating
intersections, yielding, merging and overtaking. This research is based
on the hypothesis that visual search patterns of at-risk drivers
provide vital information required for assessing driving abilities and
improving the skills of such drivers under varying conditions. We aim
to detect and track the face and eyes of the driver during several
driving scenarios, allowing for further processing of a driver's visual
search pattern behavior. Traditionally, detection and tracking of
objects in visual media has been performed using specific techniques.
These techniques vary in terms of their robustness and computational
cost. This research proposes a framework that is built upon a
foundation synonymous to boosting. The idea of an integrated framework
employing multiple trackers is advantageous in forming a globally
strong tracking methodology. In order to model the effectiveness of
trackers, a confidence parameter is introduced to help minimize the
errors produced by incorrect matches and allow more effective trackers
with a higher confidence value to correct the perceived position of the
target. |
|
Paper Nr.: |
31
|
Title: |
DIFFUSION FILTERING FOR ILLUMINATION INVARIANT FACE RECOGNITION - Illumination Approximation with Diffusion Filters within Retinex Context
|
Author(s): |
Peter Dunker and Melanie Keller |
Abstract: |
Face recognition has become a very important technology in recent years for a wide variety of applications. One
major problem of most state-of-the-art algorithms is varying lighting conditions, which can decrease
recognition rates dramatically. To reduce the influence of illumination in the recognition process, normalization
methods can be used. In this paper we introduce illumination normalization algorithms based on diffusion
filters. Further, we compare our approach with retinex algorithms inspired by human perception. Finally,
we present the evaluation results of our experiments with well-known face recognition techniques such as
principal component analysis (PCA). The results show that the diffusion filter approaches outperform known
retinex algorithms, which demonstrates the capabilities of the diffusion filter technology for illumination normalization.
|
|
Paper Nr.: |
39
|
Title: |
LOSS-WEIGHTED DECODING FOR ERROR-CORRECTING OUTPUT CODING
|
Author(s): |
Sergio Escalera, Oriol Pujol and Petia Radeva |
Abstract: |
The
multi-class classification is a challenging problem for several
applications in Computer Vision. Error Correcting Output Codes
technique (ECOC) represents a general framework capable of extending any
binary classification process to the multi-class case. In this work, we
present a novel decoding strategy that takes advantage of the ECOC
coding to outperform existing decoding strategies. The
novel decoding strategy is applied to the state-of-the-art coding
designs, extensively tested on the UCI Machine Learning repository
database and in two real vision applications: tissue characterization
in medical images and traffic sign categorization. The results show
that the presented methodology considerably increases the performance
of the traditional ECOC strategies and the state-of-the-art
multi-classifiers. |
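
For context, the sketch below shows the conventional Hamming decoding that ECOC schemes commonly use as a baseline, not the loss-weighted decoding proposed in the paper. Each class has a binary codeword, each bit is predicted by one binary classifier, and the class whose codeword is nearest to the predicted bits wins. The code matrix, class labels and function name are hypothetical.

```python
def hamming_decode(code_matrix, predictions):
    """Return the class whose codeword has the smallest Hamming distance
    to the binary classifier outputs, together with all distances."""
    distances = {
        label: sum(bit != predicted for bit, predicted in zip(codeword, predictions))
        for label, codeword in code_matrix.items()
    }
    best = min(distances, key=distances.get)
    return best, distances

# Hypothetical 7-bit code for four classes; any two codewords differ in 4 bits,
# so a single wrong binary classifier can be corrected.
code_matrix = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}

# Codeword of class C with the first bit flipped by a classifier error
noisy_outputs = (1, 0, 1, 1, 0, 0, 1)
label, distances = hamming_decode(code_matrix, noisy_outputs)
```

Hamming decoding treats every bit as equally reliable; decoding strategies such as the one proposed here differ in how the individual classifier outputs are weighted.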
|
Paper Nr.: |
41
|
Title: |
HARMONIC DEFORMATION MODEL FOR EDGE BASED TEMPLATE MATCHING
|
Author(s): |
Andreas Hofhauser, Carsten Steger and Nassir Navab |
Abstract: |
The paper presents an approach to the detection of deformable objects in single images.
To this end we propose a robust match metric that preserves the relative
edge point neighborhood, but allows significant shape changes. Similar metrics
have been used for the detection of rigid objects.
To the best of our knowledge this adaptation to deformable objects is new.
In addition, we present a fast algorithm for model deformation.
In contrast to the widely used thin-plate spline,
it is efficient even for several thousand points.
For arbitrary deformations, a forward-backward
interpolation scheme is utilized. It is based on harmonic inpainting, i.e. it
regularizes the displacement in order to obtain smooth deformations.
Similar to optical flow,
we obtain a dense deformation field, though the template contains only a sparse
set of model points. Using a coarse-to-fine representation for the distortion of the
template further increases efficiency.
We show in a number of experiments that the presented approach is not only fast,
but also very robust in detecting deformable objects. |
|
Paper Nr.: |
45
|
Title: |
A NEW FACE RECOGNITION SYSTEM - Using HMMs Along with SVD Coefficients
|
Author(s): |
Pooya Davari and Hossein Miar Naimi |
Abstract: |
In
this paper, a new Hidden Markov Model (HMM)-based face recognition
system is proposed. As a first novel point, in contrast to the five-state
HMMs used in previous research, we use a 7-state HMM to cover more
details. As a second novel point, we use a small number of quantized
Singular Value Decomposition (SVD) coefficients as features describing
blocks of face images. This makes the system very fast. To further
reduce computational complexity and memory consumption (in hardware
implementation), the images are resized to JPEG format. First, an
order-statistic filter is applied as a preprocessing operation. Then a
top-down sequence of overlapping sub-image blocks is considered. Using
the quantized SVD coefficients of these blocks, each face is treated as
a numerical sequence that can be easily modeled by an HMM. The system
has been evaluated on the Olivetti Research Laboratory (ORL) face
database. The experiments showed a recognition rate of 99%, using half
of the images for training. Our system has also been evaluated on the
YALE database. Using five and six training images, we obtained 97.78%
and 100% recognition rates respectively, a record in the literature.
The proposed method is compared with the best methods reported in the
literature. The results show that the proposed method is the fastest
one, with a recognition rate of approximately 100%. |
|
Paper Nr.: |
49
|
Title: |
NEW INVARIANT DESCRIPTORS BASED ON THE MELLIN TRANSFORM
|
Author(s): |
S. Metari and François Deschênes |
Abstract: |
In
this paper we introduce two new classes of radiometric and combined
radiometric-geometric invariant descriptors. The first class includes
two types of radiometric descriptors. The first type is based on the Mellin
transform and the second one is based on central moments. Both
descriptors are invariant to contrast changes and to convolution
with any kernel having a symmetric form with respect to the diagonals.
The second class contains two subclasses of combined descriptors. The
first subclass includes central-moment based descriptors invariant
simultaneously to translations, to uniform and anisotropic scaling, to
stretching, to contrast changes and to convolution. The second subclass
includes central-complex-moment based descriptors invariant
simultaneously to similarity transformation and to contrast changes. We
apply those invariants to the matching of geometrically transformed and/or
blurred images. |
|
Paper Nr.: |
59
|
Title: |
EYE DETECTION USING LINE EDGE MAP TEMPLATE
|
Author(s): |
Mihir Jain, Suman K. Mitra and Naresh Jotwani |
Abstract: |
The location
of the eyes is an important visual cue for processes such as scaling and
orientation correction, which are precursors to face recognition. This
paper presents a robust algorithm for eye detection which makes use of
edge information and distinctive features of eyes, starting from a
roughly localized face image. Potential region pairs are generated, and
then template matching is applied to match these region pairs with a
generated eye line edge map template using primary line segment
Hausdorff distance to get an estimation of the centers of two eyes.
This result is then refined to get iris centers and also eye centers.
Experimental results demonstrate the excellent performance of the
proposed algorithm. |
|
Paper Nr.: |
62
|
Title: |
VIDEO EVENT CLASSIFICATION AND DETECTION USING 2D TRAJECTORIES
|
Author(s): |
Alexandre Hervieu, Patrick Bouthemy and Jean-Pierre Le Cadre |
Abstract: |
This
paper describes an original statistical trajectory-based approach which
can address several issues related to dynamic video content
understanding: unsupervised clustering of events, recognition of events
corresponding to learnt classes of dynamic video contents, and
detection of unexpected events. Appropriate local differential features
combining curvature and motion magnitude are robustly computed on the
trajectories. They are invariant to image translation, in-the-plane
rotation and scale transformation. The temporal causality of these
features is then captured by hidden Markov models whose states are
properly quantized values, and similarity between trajectories is
expressed by exploiting the HMM framework. We report experiments on two
sets of data, a first one composed of typical classes of synthetic
(noisy) trajectories (such as parabola or clothoid), and a second one
formed with trajectories computed in sports videos. We have also
favorably compared our method to other ones, including feature
histogram comparison, use of the longest common subsequence (LCSS)
distance and SVM-based classification. |
|
Paper Nr.: |
63
|
Title: |
FACE AND FACIAL FEATURE DETECTION EVALUATION - Performance Evaluation of Public Domain Haar Detectors for Face and Facial Feature Detection
|
Author(s): |
M. Castrillón-Santana, O. Déniz-Suárez, L. Antón-Canalís and J. Lorenzo-Navarro |
Abstract: |
Fast
and reliable face and facial feature detection is a required ability
for any Human Computer Interaction approach based on Vision. Since the
publication of the Viola-Jones object detection framework and
the more recent open source implementation, an increasing number of
applications have appeared in the context of facial processing. In this
sense, the OpenCV community shares a collection of public domain
classifiers for this scenario. However, as far as we know, these
classifiers have rarely been compared. In this paper we first analyze
the individual performance of all
those public classifiers, identifying the best performer for each target.
These results can serve as a baseline for future approaches.
Additionally we propose a simple hierarchical combination of those
classifiers to increase facial feature detection and reduce false
facial detection.
|
|
Paper Nr.: |
64
|
Title: |
ROBUST FACE ALIGNMENT USING CONVOLUTIONAL NEURAL NETWORKS
|
Author(s): |
Stefan Duffner and Christophe Garcia |
Abstract: |
Face
recognition in real-world images mostly relies on three successive
steps: face detection, alignment and identification. The second step of
face alignment is crucial as the bounding boxes produced by robust face
detection algorithms are still too imprecise for most face recognition
techniques, i.e. they show slight variations in position, orientation
and scale.
We present a novel technique based on a specific neural architecture
which, without localizing any facial feature points, precisely aligns
face images extracted from bounding boxes coming from a face detector.
The neural network processes face images cropped using misaligned
bounding boxes and is trained to simultaneously produce several
geometric parameters characterizing the global misalignment.
After having been trained, the neural network is able to robustly and
precisely correct translations of up to ±13% of the bounding box
width, in-plane rotations of up to ±30 degrees and variations in scale
from 90% to 110%.
Experimental results show that 94% of the face images of the BioID
database and 80% of the images of a complex test set extracted from the
internet are aligned with an error of less than 10% of the face
bounding box width.
|
|
Paper Nr.: |
65
|
Title: |
INVARIANT FACE RECOGNITION IN A NETWORK OF CORTICAL COLUMNS
|
Author(s): |
Philipp Wolfrum, Jörg Lücke and Christoph von der Malsburg |
Abstract: |
We
describe a neural network for invariant object recognition. The network
is generative in the sense that it explicitly represents both the
recognized object and the extrinsic properties to which it is invariant
(especially object position). The model is biologically plausible,
being formulated as a neuronal system composed of cortical columns and
dynamic links. At the same time it has competitive face recognition
performance. |
|
Paper Nr.: |
82
|
Title: |
IMAGE RETRIEVAL USING KRAWTCHOUK CHROMATICITY DISTRIBUTION MOMENTS
|
Author(s): |
E. Tziola, K. Konstantinidis, L. Kotoulas and I. Andreadis |
Abstract: |
In
this paper a set of Krawtchouk Chromaticity Distribution Moments
(KCDMs) for the effective representation of image color content is
introduced. The proposed method describes chromaticity through a set of
KCDMs applied on the associated chromaticity distribution function in
the L*a*b* color space. Using only a small fixed number of KCDMs the
method achieves satisfactory retrieval rates. The computational
requirements of this approach are relatively small compared to other
methods addressing image retrieval using color features. This has
a direct impact on the time required to index an image database.
Furthermore, due to the short length of the KCDM feature vector, there
is a direct reduction in the time needed to search the whole database.
Compared to previous related work, KCDMs provide a more accurate
representation of the L*a*b* chromaticity distribution functions, since
no numerical approximation is involved in deriving the moments.
Furthermore, unlike other orthogonal moments, Krawtchouk moments can be
employed to extract local features of a chromaticity diagram. This
property makes them more analytical near the centre of mass of the
chromaticity distribution. The theoretical framework is validated by
experiments which demonstrate the superior performance of KCDMs over
other methods. |
|
Paper Nr.: |
112
|
Title: |
AUTOMATED OBJECT SHAPE MODELLING BY CLUSTERING OF WEB IMAGES
|
Author(s): |
Giuseppe Scardino, Ignazio Infantino and Salvatore Gaglio |
Abstract: |
The paper deals with the description
of a framework to create shape models of an object using images
from the web. Results obtained from different image search engines
using simple keywords are filtered, so that it is possible to select
images showing a single object with a well-defined contour. In
order to have a large set of valid images, the implemented system
uses lexical web databases (e.g. WordNet) or free web
encyclopedias (e.g. Wikipedia) to get more keywords correlated to
the given object. The shapes extracted from the selected images are
represented by Fourier descriptors and are grouped by the K-means
algorithm. Finally, the most representative shapes of the main
clusters are considered as prototypical contours of the object,
and they can be used to search for the same object in images showing a
more complex structure. Preliminary experimental results are
illustrated to show the effectiveness of the proposed approach. |
|
Paper Nr.: |
119
|
Title: |
SPATIAL NEIGHBORING HISTOGRAM FOR SHAPE-BASED IMAGE RETRIEVAL
|
Author(s): |
Noramiza Hashim, Patrice Boursier and Hong Tat Ewe |
Abstract: |
The
integration of cameras has become standard in mobile
phones. Recognizing man-made objects such as buildings in images taken
with such devices requires a fast and efficient approach in practical
applications. Our work focuses on recognizing buildings based on a novel
shape-based two dimensional histogram descriptor. It combines both the
low level feature (i.e. edge orientation) and the middle level feature
(i.e. spatial neighborhood pattern). The neighborhood pattern is coded
in a 4-bit binary representation which offers a simple and efficient
way to incorporate local spatial data into the histogram. We find that
the proposed method increases the retrieval precision by approximately
12% compared to other similar shape-based histogram methods. |
|
Paper Nr.: |
121
|
Title: |
TEXTURE BASED DESCRIPTION OF MOVEMENTS FOR ACTIVITY ANALYSIS
|
Author(s): |
Kellokumpu Vili, Zhao Guoying and Pietikäinen Matti |
Abstract: |
Human
motion can be seen as a type of moving texture pattern. In this paper,
we propose a novel approach for activity analysis by describing human
activities with texture features. Our approach extracts spatially
enhanced local binary pattern (LBP) histograms from temporal templates
(Motion History Images and Motion Energy Images) and models their
temporal behavior with hidden Markov models. The description is useful
for action modeling and is suitable for detecting and recognizing
various kinds of activities. The method is computationally simple. We
perform tests on two published databases and clearly show the good
performance of our approach in classification and detection tasks.
Furthermore, experimental results show that the approach performs
robustly against irregularities in data, such as limping and walking
with a dog, partial occlusions and low video quality. |
|
Paper Nr.: |
124
|
Title: |
IMAGE COMPLETION USING A DIFFUSION DRIVEN MEAN CURVATURE FLOW IN A SUB-RIEMANNIAN SPACE
|
Author(s): |
Gonzalo Sanguinetti, Giovanna Citti and Alessandro Sarti |
Abstract: |
In this paper we present an implementation of a perceptual
completion model performed in the three dimensional space of
position and orientation of level lines of an image. We show that
the space is equipped with a natural sub-Riemannian metric. This
model allows disocclusion to be performed while representing both the occluding
and occluded objects simultaneously in the space. The completion is
accomplished by computing minimal surfaces with respect to the
non-Euclidean metric of the space. The minimality is achieved via
diffusion driven mean curvature flow. Results are presented in a
number of cognitively relevant cases. |
|
Paper Nr.: |
126
|
Title: |
AN AUTOMATIC WELDING DEFECTS CLASSIFIER SYSTEM
|
Author(s): |
Juan Zapata, Ramón Ruiz and Rafael Vilar |
Abstract: |
Radiographic
inspection is a well-established testing method to detect weld defects.
However, interpretation of radiographic films is a difficult task. The
reliability of such interpretation and the expense of training suitable
experts have motivated efforts towards automation in
this field. In this paper, we describe an automatic detection system to
recognise welding defects in radiographic images. In a first stage,
image processing techniques, including noise reduction, contrast
enhancement, thresholding and labelling, were implemented to help in
the recognition of weld regions and the detection of weld defects. In a
second stage, a set of geometrical features which characterise the
defect shape and orientation was proposed and extracted from the defect
candidates. In a third stage, an artificial neural network for weld
defect classification was used under three regularisation processes with
different architectures. For the input layer, principal component
analysis was used to reduce the number of feature
variables; for the hidden layer, different numbers of neurons were
used with the aim of achieving better performance for defect classification in
both cases. The proposed classification consists of detecting the four
main types of weld defects met in practice plus the non-defect type. |
|
Paper Nr.: |
128
|
Title: |
INVARIANT CODES FOR SIMILAR TRANSFORMATION AND ITS APPLICATION TO SHAPE MATCHING
|
Author(s): |
Eiji Yoshida and Seiichi Mita |
Abstract: |
In
this paper, we propose a new method for the measurement of shape
similarity. Our proposed method encodes the contour of an object by
using the curvature of the object. If two objects are similar (under
translation, rotation, and scaling) in shape, these codes themselves or
their cyclic shift have the same values. We have compared our method
with other methods such as Fourier descriptor, CSS (curvature scale
space) and shape context. We have shown that the computational cost of
our method is about one-hundredth that of CSS, and the recognition rate
of our method is 90.40% for the scaling robustness test using
MPEG7_CE-Shape1 and 81.82% for the similarity-based retrieval test
using Kimia’s silhouette. These values are slightly better than those
of CSS. |
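The cyclic-shift comparison mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the integer sequences stand in for the curvature-based contour codes, whose actual construction is not given in the abstract, and `equal_up_to_cyclic_shift` is a hypothetical helper name.

```python
from typing import Sequence


def equal_up_to_cyclic_shift(a: Sequence[int], b: Sequence[int]) -> bool:
    """Return True if code b equals code a or one of a's cyclic shifts."""
    if len(a) != len(b):
        return False
    if not a:
        return True
    n = len(a)
    target = list(b)
    # Every cyclic shift of a appears as a length-n window of a concatenated with itself.
    doubled = list(a) + list(a)
    return any(doubled[start:start + n] == target for start in range(n))


if __name__ == "__main__":
    contour_code = [0, 1, 2, 1, 0, 3]   # hypothetical code of a reference shape
    same_shape = [2, 1, 0, 3, 0, 1]     # same code read from another start point
    other_shape = [0, 1, 2, 3, 0, 1]    # same symbols, different cyclic order
    print(equal_up_to_cyclic_shift(contour_code, same_shape))
    print(equal_up_to_cyclic_shift(contour_code, other_shape))
```

The check is quadratic in the code length in the worst case, which is usually acceptable for short contour codes; exact matching is used here, whereas a practical matcher would likely tolerate small differences between codes.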
|
Paper Nr.: |
141
|
Title: |
DRIVING WARNING SYSTEM BASED ON VISUAL PERCEPTION OF ROAD SIGNS
|
Author(s): |
Juan Pablo Carrasco, Arturo de la Escalera and José María Armingol |
Abstract: |
Advanced
Driver Assistance Systems are used to increase the safety of
vehicles. Computer Vision is one of the main technologies used for this
aim. Lane mark recognition, pedestrian detection, driver drowsiness
detection, and road sign detection and recognition are examples of these
systems. The last one is the goal of this paper. We present a system
that detects and recognizes road signs based on color and shape
features. The paper focuses on detection, especially on the color space
used, and investigates the case of road signs under shadows. The
system also tracks the road sign once it has been detected. It warns
the driver in case of anomalous speed for the recognized road sign
using the information from a GPS. |
|
Paper Nr.: |
146
|
Title: |
FAST WIREFRAME-VISIBILITY ALGORITHM
|
Author(s): |
Ezgi Gunaydin Kiper |
Abstract: |
In
this paper, a fast wireframe-visibility algorithm is introduced. The
algorithm’s inputs are the 3D wireframe model of the object and the
internal and external camera calibration parameters. The algorithm
outputs the 2D image of the object with only the visible lines and
surfaces. The 2D image of the object is constructed by using a camera
model with the given camera calibration parameters and the 3D wireframe
model. The idea behind the algorithm is to find the intersection points
of all lines in the 2D image of the object. These intersection points
are called critical points, and the lines containing them are called
critical lines. Lines without any critical points are called normal
lines. Critical lines are split into smaller lines at their critical
points, and a depth calculation is performed for the middle points of
these smaller lines. For a normal line, the depth of its middle point
is calculated to determine whether the line is visible or not.
Therefore, the algorithm minimizes the number of points whose depth
must be calculated. Moreover, this approach is much faster
because it avoids the resolution and memory problems of
well-known image-space scan-line and z-buffering algorithms. |
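The splitting step described above can be sketched in 2D as follows. This is a minimal illustration, not the author's implementation: the function names are hypothetical, only the image-plane geometry is shown, and the depth test that decides visibility at each returned point is omitted.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def intersection_param(p: Point, q: Point, r: Point, s: Point) -> Optional[float]:
    """Parameter t in (0, 1) along p->q where segment pq crosses segment rs, else None."""
    dx1, dy1 = q[0] - p[0], q[1] - p[1]
    dx2, dy2 = s[0] - r[0], s[1] - r[1]
    denom = dx1 * dy2 - dy1 * dx2
    if denom == 0:
        return None  # parallel or collinear segments: no single critical point
    t = ((r[0] - p[0]) * dy2 - (r[1] - p[1]) * dx2) / denom
    u = ((r[0] - p[0]) * dy1 - (r[1] - p[1]) * dx1) / denom
    if 0 < t < 1 and 0 <= u <= 1:
        return t
    return None


def sample_midpoints(segment: Segment, others: Sequence[Segment]) -> List[Point]:
    """Split a line at its critical points and return the midpoint of each piece."""
    p, q = segment
    cuts = [0.0, 1.0]
    for r, s in others:
        t = intersection_param(p, q, r, s)
        if t is not None:
            cuts.append(t)
    cuts.sort()
    midpoints = []
    for a, b in zip(cuts, cuts[1:]):
        m = (a + b) / 2.0
        midpoints.append((p[0] + m * (q[0] - p[0]), p[1] + m * (q[1] - p[1])))
    return midpoints


if __name__ == "__main__":
    line = ((0.0, 0.0), (4.0, 0.0))
    crossing = ((2.0, -1.0), (2.0, 1.0))
    print(sample_midpoints(line, [crossing]))  # critical line: one midpoint per piece
    print(sample_midpoints(line, []))          # normal line: a single midpoint
```

A normal line yields exactly one sample point and a critical line yields one per piece, which matches the abstract's claim that only a small number of depth evaluations is needed.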
|
Paper Nr.: |
149
|
Title: |
A MULTI-SCALE LAYOUT DESCRIPTOR BASED ON DELAUNAY TRIANGULATION FOR IMAGE RETRIEVAL
|
Author(s): |
Agnés Borràs Angosto and Josep Lladós Canet |
Abstract: |
Working
with large collections of videos and images requires effective and
flexible techniques for retrieval and browsing. Beyond the classical
color histogram approaches, layout information has proven to be a
very descriptive cue for image
description. We have developed a descriptor that encodes the layout of
an image using a histogram-based representation. The descriptor uses a
multi-layer representation that captures the saliency of the image
parts. Furthermore, it encodes their relative positions using the
properties of a Delaunay triangulation. The descriptor is a compact
feature vector whose content is normalized. Its properties make it
suitable for image retrieval and indexing applications. Finally, we have
applied it to a video browsing application that detects characteristic
scenes of a news program. |
|
Paper Nr.: |
150
|
Title: |
A SIGNAL-SYMBOL LOOP MECHANISM FOR ENHANCED EDGE EXTRACTION
|
Author(s): |
Sinan Kalkan, Florentin Wörgötter, Shi Yan, Volker Krüger and Norbert Krüger |
Abstract: |
The transition to symbolic information from images
involves in general the loss or misclassification of information. One way to deal with this
missing or wrong information is to get feedback from concrete hypotheses derived at a symbolic level
to the sub-symbolic (signal) stage to amplify weak information or correct misclassifications.
This paper proposes such a feedback mechanism between the symbolic level
and the signal level, which we call the signal-symbol loop. We apply this framework to
the detection of low contrast edges, making use of predictions based on Rigid Body Motion.
Once the Rigid Body Motion is known, the location and the properties
of edges at a later frame can be predicted. We use these predictions as feedback to the
signal level at a later frame to improve the detection of low contrast edges. We demonstrate
our mechanism on a real example, and evaluate the results using an artificial scene, where the ground
truth data is available. |
|
Paper Nr.: |
160
|
Title: |
CLASSIFIER SELECTION FOR FACE RECOGNITION ALGORITHM BASED ON ACTIVE SHAPE MODEL
|
Author(s): |
Andrzej Florek and Maciej Król |
Abstract: |
In
this paper, experimental results from face contour classification tests
are shown. The presented approach is intended for a face recognition
algorithm based on the Active Shape Model. The results were obtained
from experiments carried out on the set of 2700 images taken from 100
persons. Manually fitted contours (194 samples for 8 components of one
face contour) were classified after feature space decomposition carried
out by Linear Discriminant Analysis or by Support Vector Machines
algorithms. |
|
Paper Nr.: |
167
|
Title: |
ON THE CONTRIBUTION OF COMPRESSION TO VISUAL PATTERN RECOGNITION
|
Author(s): |
Gunther Heidemann and Helge Ritter |
Abstract: |
Most
pattern recognition problems are solved by highly task-specific
algorithms. However, all recognition and classification architectures
are related in at least one aspect: They rely on compressed
representations of the input. It is therefore an interesting question
how much compression itself contributes to the pattern recognition
process. The question has been answered by Benedetto et al. (2002) for
the domain of text, where a common compression program (gzip) is
capable of language recognition and authorship attribution. The
underlying principle is estimating the mutual information from the
obtained compression factor. Here we show that compression achieves
astonishingly high recognition rates even for far more complex tasks:
Visual object recognition, texture classification, and image retrieval.
Though, naturally, specialized recognition algorithms still outperform
compressors, our results are remarkable, since none of the applied
compression programs (gzip, bzip2) was ever designed to solve this type
of task. Compression is the only known method that solves such a wide
variety of tasks without any modification, data preprocessing, feature
extraction, or even parametrization. We conclude that compression
can be seen as the "core" of a yet-to-be-developed theory of unified
pattern recognition.
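As a rough illustration of compression-based recognition, the sketch below uses the normalized compression distance (NCD) with Python's `zlib`. This is a swapped-in, commonly used similarity measure and an assumption on our part; the abstract does not state the exact estimator the authors derive from the compression factor. The byte strings are toy stand-ins for raw image data, and the function names are hypothetical.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when x and y share structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: bytes, references: Mapping[str, bytes]) -> str:
    """Assign the query to the reference it compresses best together with."""
    return min(references, key=lambda label: ncd(query, references[label]))


if __name__ == "__main__":
    # Toy "textures": a two-symbol stripe pattern and a grey-level ramp.
    references = {
        "stripes": b"ab" * 200,
        "ramp": bytes(range(256)) * 2,
    }
    print(classify(b"ab" * 180, references))
```

No features are extracted and no parameters are tuned, which is the property the abstract emphasises; the only choice made here is the compressor.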
|
|
Paper Nr.: |
171
|
Title: |
POSE INVARIANT FACE RECOGNITION USING IMAGE HISTOGRAMS
|
Author(s): |
Hasan Demirel and Gholamreza Anbarjafari |
Abstract: |
Faces
with changing poses show significant variations in the local
details of the facial features. However, the global pixel statistics,
represented by the image histograms, of the same subject with pose
variations, are highly correlated. The image histograms are very robust
features that capture the global pixel statistics of faces. In this
paper, histograms of the intensity images are used as the feature
vectors for the recognition of the faces of different poses. Histogram
matching, based on the cross correlation of the image histograms, is
used as the measure of similarity in the classification process. The
recognition rate of the proposed face recognition system reaches
98.80% on the HP face database, with 10 poses incorporating up to ±90°
of horizontal pose variations. |
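The histogram-matching idea can be sketched as below. It is a minimal illustration under assumptions: the normalized (Pearson) correlation coefficient is used as the cross-correlation score, images are flat lists of 8-bit grey levels, and the tiny gallery is made up for the example; none of these details come from the paper.

```python
import math
from typing import Dict, List, Sequence


def histogram(pixels: Sequence[int], bins: int = 256) -> List[int]:
    """Grey-level histogram of an image given as a flat sequence of values in [0, bins)."""
    counts = [0] * bins
    for value in pixels:
        counts[value] += 1
    return counts


def correlation(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Normalized correlation coefficient between two histograms of equal length."""
    n = len(h1)
    m1, m2 = sum(h1) / n, sum(h2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1) * sum((b - m2) ** 2 for b in h2))
    return num / den if den else 0.0


def recognise(query: Sequence[int], gallery: Dict[str, Sequence[int]]) -> str:
    """Return the gallery identity whose histogram correlates best with the query."""
    q = histogram(query)
    return max(gallery, key=lambda name: correlation(q, histogram(gallery[name])))


if __name__ == "__main__":
    # Hypothetical gallery: two "subjects" with different global pixel statistics.
    gallery = {
        "subject_dark": [10] * 50 + [20] * 50,
        "subject_bright": [200] * 50 + [220] * 50,
    }
    query = [10] * 40 + [20] * 60  # same grey levels as subject_dark, other proportions
    print(recognise(query, gallery))
```

Because the histogram discards pixel positions, the score depends only on the global grey-level statistics, which is the robustness to pose that the abstract relies on.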
|
Paper Nr.: |
189
|
Title: |
SUBJECT RECOGNITION USING A NEW APPROACH FOR FEATURE SELECTION
|
Author(s): |
Àgata Lapedriza, David Masip and Jordi Vitria |
Abstract: |
In this paper we propose a feature selection method that uses the
mutual information (MI) measure on a Principal Component Analysis
(PCA) based decomposition. PCA finds a linear projection of the
data in a non-supervised way, which preserves the larger variance
components of the data under the reconstruction error criterion.
Previous work suggests that using the MI between the PCA-projected
data and the class labels applied to feature selection can add the
missing discriminability criterion to the optimal reconstruction
feature set. Our proposal goes one step further, defining a global
framework to add independent selection criteria in order to filter
misleading PCA components while the optimal variables for
classification are preserved. We apply this approach to a face
recognition problem using the AR Face data set. Notice that, in
this problem, PCA projection vectors strongly related to
illumination changes and occlusions are usually preserved given
their high variance. Our additional selection tasks are able to
discard this type of features while the relevant features to
perform the subject recognition classification are kept. The
experiments performed show an improved feature selection process
using our combined criterion. |
|
Paper Nr.: |
196
|
Title: |
EFFICIENT OBJECT DETECTION USING PCA MODELING AND REDUCED SET SVDD
|
Author(s): |
Venkataramana Kini and Rudra Hota |
Abstract: |
The object detection problem is traditionally tackled as a two-class
problem in which the non-object classes are not precisely defined.
In this paper we propose a cascade of principal component modeling
with associated test statistics and reduced set support vector data
description for efficient object detection, both of which hinge
mainly on modeling of object class training data. The PCA modeling
enables quick rejection of comparatively obvious non-objects in the
initial stage of the cascade to gain a computational advantage. The
reduced set SVDD is applied in the later stages of the cascade to classify
relatively difficult images. This combination of PCA modeling and
reduced set support vector data description leads to good object
detection with simple pixel features.
|
|
Paper Nr.: |
200
|
Title: |
RELEVANCE FEEDBACK WITH MAX-MIN POSTERIOR PSEUDO-PROBABILITY FOR IMAGE RETRIEVAL
|
Author(s): |
Yuan Deng, Xiabi Liu and Yunde Jia |
Abstract: |
In
this paper, a new relevance feedback method for image retrieval based
on the max-min posterior pseudo-probabilities (MMP) framework is proposed
to learn the user’s intention during feedback. We assume that the feature
vectors extracted from the relevant images follow a
Gaussian mixture model (GMM) distribution. The posterior pseudo-probability function
for the relevant images is used as the user intention model and serves to classify
images into two categories: relevant and irrelevant. During feedback,
relevant and irrelevant images labelled by the user are taken as the
training data of the user intention model. The optimum parameter set of the
model is learned from the training data using the MMP criterion.
Experimental results on the Corel database show the effectiveness of the
proposed approach. |
|
Paper Nr.: |
209
|
Title: |
TEXT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS
|
Author(s): |
Manolis Delakis and Christophe Garcia |
Abstract: |
Text
detection is an important preliminary step before text can be
recognized in unconstrained image environments. We present an approach
based on convolutional neural networks to detect and localize
horizontal text lines from raw color pixels. The network learns to
extract and combine its own set of features through learning instead of
using hand-crafted ones. Learning was also used in order to precisely
localize the text lines by simply training the network to reject
badly-cut text and without any use of tedious knowledge-based
post-processing. Although the network was trained with synthetic
examples, experimental results demonstrated that it can outperform
other methods on the real-world test set of ICDAR'03. |
|
Paper Nr.: |
210
|
Title: |
FAST TEMPLATE MATCHING FOR MEASURING VISIT FREQUENCIES OF DYNAMIC WEB ADVERTISEMENTS
|
Author(s): |
Dániel Szolgay, Csaba Benedek and Tamás Szirányi |
Abstract: |
In
this paper an on-line method is proposed for statistical evaluation of
dynamic web advertisements via measuring their visit frequencies. To
minimize the required user-interaction, the eye movements are tracked
by a special eye camera, and the hits on advertisements are
automatically recognized. The detection step is mapped to a 2D template
matching problem, and novel algorithms are developed to significantly
decrease the processing time by quickly excluding most of the false
hit-candidates. We show that, due to these improvements, the method runs in
real time in the context of the selected application. The solution has
been validated on real test data and quantitative results have been
provided to show the gain in recognition rate and processing time
versus previous approaches. |
|
Paper Nr.: |
226
|
Title: |
OBJECTIVE EVALUATION OF SEAM PUCKER USING AN ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM
|
Author(s): |
K. L. Mak and Wei Li |
Abstract: |
Seam
pucker evaluation plays a very important role in the garment
manufacturing industry. At present, seam puckers are usually evaluated
by human inspectors, which is subjective, unreliable and
time-consuming. With the development of image processing and pattern
recognition technologies, an automatic vision-based seam pucker
evaluation system becomes possible. This paper presents a new approach
based on an adaptive neuro-fuzzy inference system (ANFIS) to establish the
relationship between seam pucker grades and textural features of seam
pucker images. The evaluation procedure is performed in two stages:
feature extraction with the co-occurrence matrix approach, and
classification with ANFIS. Experimental results demonstrate the
validity and effectiveness of the proposed ANFIS-based method. |
|
Paper Nr.: |
248
|
Title: |
A BAYESIAN APPROACH TO 3D OBJECT RECOGNITION USING LINEAR COMBINATION OF 2D VIEWS
|
Author(s): |
Vasileios Zografos and Bernard F. Buxton |
Abstract: |
In
this work, we introduce a Bayesian approach for pose-invariant
recognition of the images of 3D objects modelled by a small number of
stored 2D intensity images
taken from nearby but otherwise arbitrary viewpoints. A linear
combination of views approach is used to combine images from two
viewpoints of a 3D object and synthesise novel views of that object.
Recognition is performed by matching a target, scene image to such a
synthesised, novel view using an optimisation algorithm, constrained by
construction of Bayes prior distributions on the linear combination. We
have experimented with both a direct search and an evolutionary
optimisation method on a real-image, public database. The Bayes priors
effectively regularised the posterior distribution so that all
algorithms were able to find good solutions close to the optimum.
Further exploration of the parameter space has been carried out using
Markov-Chain Monte-Carlo sampling. |
|
Paper Nr.: |
253
|
Title: |
IMAGE ANNOTATION WITH RELEVANCE FEEDBACK USING A SEMI-SUPERVISED AND HIERARCHICAL APPROACH
|
Author(s): |
Cheng-Chieh Chiang, Ming-Wei Hung, Yi-Ping Hung and Wee Kheng Leow |
Abstract: |
This
paper presents a novel approach for image annotation with relevance
feedback that interactively employs semi-supervised learning to build
hierarchical classifiers associated with annotation labels. We
construct individual hierarchical classifiers, each corresponding to one
semantic label that is used for describing the semantic contents of the
images. The proposed semi-supervised and hierarchical approach is
embedded in an interactive relevance feedback scheme to assist the
user in annotating images. Our semi-supervised approach for learning
classifiers reduces the need for training images by using both labeled
and unlabeled images. We adopt a hierarchical approach for classifiers to
divide the whole semantic concept associated with a label into several
parts such that the complex contents in images can be simplified. We
also describe some experiments to show the performance of the proposed
approach. |
|
Paper Nr.: |
264
|
Title: |
MULTI-DISCRIMINANT CLASSIFICATION ALGORITHM FOR FACE VERIFICATION
|
Author(s): |
Cheng-Ho Huang and Jhing-Fa Wang |
Abstract: |
Linear
discriminant analysis (LDA) is a common method used for face
verification. To process the large amounts of data collected for a
given face verification system, we propose a multi-discriminant
classification algorithm to classify and verify voluminous facial
images. In the training phase, it indexes all discriminant features of
the training data and groups them into the clients’ individual discriminant
sets. To verify whether a claimant is a client, we only check
the client’s discriminant set to determine the result: acceptance or
rejection. The results of comparative experiments demonstrate that our
algorithm achieves an encouraging improvement in performance for
large-volume face verification. |
|
Paper Nr.: |
263
|
Title: |
TOWARDS EMBEDDED WASTE SORTING - Using Constellations of Visual Words
|
Author(s): |
Toon Goedemé |
Abstract: |
In this paper, we present a method for fast and robust object recognition,
especially developed for implementation on an embedded platform. As an
example, the method is applied to the automatic sorting of consumer waste. Out
of a stream of different thrown-away food packages, specific items (in this case
beverage cartons) can be visually recognised and sorted out. To facilitate and
optimise the implementation of this algorithm on an embedded platform containing
parallel hardware, we developed a voting scheme for constellations of visual
words, i.e. clustered local features (SURF in this case). On top of easy implementation
and robust and fast performance, even with large databases, an extra
advantage is that this method can handle multiple identical visual features in one
model. |
|
Paper Nr.: |
270
|
Title: |
INTRODUCING 3D VISION AND COMPUTER GRAPHICS TO ARCHAEOLOGICAL WORKFLOW - An Applicable Framework
|
Author(s): |
Hubert Mara, Andreas Monitzer and Julian Stöttinger |
Abstract: |
Cataloging drawings of ancient vessels and sherds is
still the most time consuming task in the typical archaeological workflow. The
properties of these findings like profile, volume, and wall thickness have
always been estimated and drawn by hand. Through archiving, classifying and
exhibiting these ancient artifacts we wish to gather as precise information as
possible. Within seconds, today's 3D-scanners provide surface meshes of ancient
vessels which are more precise than any manual estimation which may take up to
several hours.
We propose a semi-automated, applicable framework for dealing with large 3D-meshes
of ancient findings from scanning the vessels for publication. In
this interactive environment we estimate the axis of vessels, estimate their
profile lines and render real time visualizations using state-of-the-art 3D-hardware
techniques. The results can be printed in their real size for direct
use in archaeological literature. Further, these methods will give the ability
to publish 3D-meshes of ancient vessels for archaeological research.
Recent extended tests have been carried out on archaeological sites in Peru and
Austria. These experiments showed under real life circumstances the improvement
of using this system in both precision and time efficiency.
|
|
Paper Nr.: |
272
|
Title: |
LOW-LEVEL FUSION OF AUDIO AND VIDEO FEATURE FOR MULTI-MODAL EMOTION RECOGNITION
|
Author(s): |
Matthias Wimmer, Björn Schuller, Dejan Arsic, Gerhard Rigoll and Bernd Radig |
Abstract: |
Bimodal
emotion recognition through audiovisual feature fusion has been shown
to be superior to each individual modality in the past. Still,
synchronization of the two streams is a challenge, as many vision
approaches work on a frame basis, whereas audio is processed on a turn or chunk basis.
Therefore, late fusion schemes such as simple logic or voting
strategies are commonly used for the overall estimation of underlying
affect. However, early fusion is known to be more effective in many
other multimodal recognition tasks.
We therefore suggest a combined analysis by descriptive statistics of
audio and video Low-Level-Descriptors for subsequent static SVM
Classification. This strategy also allows for a combined feature-space
optimization which will be discussed herein. The high effectiveness of
this approach is shown on a database of 11.5h containing six emotional
situations in an airplane scenario. |
|
Paper Nr.: |
280
|
Title: |
DETERMINATION OF THE VISUAL FIELD OF PERSONS IN A SCENE
|
Author(s): |
Adel Lablack, Frédéric Maquet and Chabane Djeraba |
Abstract: |
The
determination of the visual field for multiple persons in a scene is an
important problem with many applications in human behavior
understanding for security and customized marketing. One such
application, addressed in this paper, is to capture the visual field of
persons in a scene. We obtained the head pose in the image sequence
manually in order to determine exactly the visual field of persons in
the monitored scene. We use knowledge about human vision,
trigonometric relations to calculate the length and the height of
the visual field, and a quaternion approach to perform the changes of
reference frame. We demonstrate this technique using a realistic data
set of videos taken by surveillance cameras in shops. |
|
Paper Nr.: |
282
|
Title: |
BUILDING DETECTION IN IKONOS IMAGES FROM DISPARITY OF EDGES
|
Author(s): |
Charles Beumier |
Abstract: |
The
availability of very high resolution satellite images has enabled the
automatic remote detection of man-made structures for applications such
as damage assessment or change detection. In particular, stereo pairs
of Ikonos or Quickbird images allow for the estimation of the third
dimension so distinctive for buildings. Since the areas to be studied
may be quite large we propose a simple, fast and possibly accurate
approach for building detection. This approach consists of a three-step
procedure which first detects linear segments independently in the left
and right images, then matches segments according to their mutual
coverage, orientation and plausible disparity, and finally identifies
building areas thanks to the presence of elevated segments. The
solution is fast as only pixels of high gradient connected into linear
segments are considered. Modelling object parts with linear segments is
valid for the vast majority of man-made objects and allows for rapid
segment pairing for disparity computation with possible sub-pixel
accuracy. This approach has been applied to an Ikonos pair for the
detection of large buildings in the context of risk assessment within
GMOSS, a European Network of Excellence. |
|
Paper Nr.: |
285
|
Title: |
TOWARDS THE ESTIMATION OF CONSPICUITY WITH VISUAL
|
Author(s): |
Ludovic Simon, Jean-Philippe Tarel and Roland Brémond |
Abstract: |
The
estimation of conspicuity is of importance for engineers who aim at
making traffic signs conspicuous enough to attract drivers' attention.
Unfortunately, conspicuity remains a poorly understood attribute due to
the relatively limited - although growing - knowledge about the human
visual processing system. Our goal is to develop a system which
estimates the conspicuity of a traffic sign based on the processing of
images acquired with a camera onboard a vehicle, in order to be able to
make a diagnosis regarding their conspicuity. Aside from specific
features known to be of importance for road signs, there is currently no
complete model for conspicuity. The previously proposed attentional
conspicuity model, which is based on vision science knowledge of the
low levels of the human visual processing system, was shown to be
unsuitable for sign detection tasks. We thus propose a new paradigm for
conspicuity estimation in search tasks based on statistical learning of
the features of the searched object. |
|
Paper Nr.: |
290
|
Title: |
EFFICIENT OBJECT DETECTION ROBUST TO RST WITH MINIMAL SET OF EXAMPLES
|
Author(s): |
Sebastien Onis, Henri Sanson, Christophe Garcia and Jean-Luc Dugelay |
Abstract: |
In
this paper, we present an object detection approach based on a
similarity measure combining cross-correlation and affine deformation.
Current object detection systems provide good results, at the expense
of requiring a large training database. The use of correlation enables
object detection with a very small training set but is not robust to
luminosity changes and RST (Rotation, Scale, Translation)
transformations. This paper presents a detection system that first
searches the likely positions and scales of the object using image
preprocessing and a cross-correlation method and, secondly, uses a
similarity measure based on affine deformation to confirm or reject the
predetection. We apply our system to face detection and show the
improvement in results due to the image preprocessing and the affine
deformation. |
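The correlation-based predetection step can be illustrated with a small sketch. It uses zero-mean normalized cross-correlation, one common way to reduce sensitivity to brightness changes; this is an assumption made for illustration and not necessarily the preprocessing used by the authors. The function names `normalized_cross_correlation` and `best_match` are hypothetical, and the affine confirmation stage is not covered.

```python
import math


def normalized_cross_correlation(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized grey-level patches.

    Both arguments are lists of rows. The score lies in [-1, 1]; a flat patch
    (zero variance) yields 0.0.
    """
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0


def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the best NCC."""
    th, tw = len(template), len(template[0])
    best = (-2.0, -1, -1)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = normalized_cross_correlation(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best
```

Because the score is computed on mean-subtracted, variance-normalized values, a copy of the template with a different gain and offset still scores close to 1. Rotation and scale changes are not handled, which is the gap the abstract's affine deformation measure is meant to address.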
|
Paper Nr.: |
291
|
Title: |
RELATIONS BETWEEN RECONSTRUCTED 3D ENTITIES
|
Author(s): |
Nicolas Pugeault, Sinan Kalkan, Florentin Wörgötter, Emre Baseski and Norbert Krüger |
Abstract: |
In this paper, we first propose an analytic formulation for
the position and orientation uncertainty of local 3D line
descriptors reconstructed by stereo.
We evaluate these predicted uncertainties with Monte Carlo simulations,
and study their dependency on different parameters (position and orientation).
In a second part, we use this definition to derive a new formulation for
inter-feature distance and coplanarity.
These new formulations take into account the predicted uncertainty,
allowing for better robustness.
We demonstrate the positive effect of the modified definitions
on some simple scenarios. |
|
Paper Nr.: |
353
|
Title: |
RECOGNITION OF TEXT WITH KNOWN GEOMETRIC AND GRAMMATICAL STRUCTURE
|
Author(s): |
Jan Rathouský, Martin Urban and Vojtech Franc |
Abstract: |
The
optical character recognition (OCR) module is a fundamental part of each
automated text processing system. The OCR module translates an input image
with a text line into a string of symbols. In many applications (e.g.
license plate recognition) the text has some a priori known geometric and
grammatical structure. This article proposes an OCR method exploiting this
knowledge, which restricts the set of possible strings to a limited set of
feasible combinations. The recognition task is formulated as maximization
of a similarity function which uses character templates as reference. These
templates are estimated by a support vector machine method from a set of
examples. In contrast to the common approach, the proposed method performs
character segmentation and recognition simultaneously. The method was
successfully evaluated in a car license plate recognition system. |
|
Paper Nr.: |
357
|
Title: |
REPRESENTATION AND RECOGNITION OF HUMAN ACTIONS - A New Approach based on an Optimal Control Motor Model
|
Author(s): |
Sumitra Ganesh and Ruzena Bajcsy |
Abstract: |
We
present a novel approach to the problem of representation and
recognition of human actions, that uses an optimal control based model
to connect the high-level goals of a human subject to the low-level
movement trajectories captured by a computer vision system. These
models quantify the high-level goals as a performance criterion or cost
function which the human sensorimotor system optimizes by picking the
control strategy that achieves the best possible performance. We show
that the human body can be modeled as a hybrid linear system that can
operate in one of several possible modes, where each mode corresponds
to a particular high-level goal or cost function. The problem of action
recognition is then to infer the current mode of the system from
observations of the movement trajectory. We demonstrate our approach on
3D visual data of human arm motion. |
|
Paper Nr.: |
371
|
Title: |
FACE MODEL FITTING WITH GENERIC, GROUP-SPECIFIC, AND PERSON-SPECIFIC OBJECTIVE FUNCTIONS
|
Author(s): |
Sylvia Pietzsch, Matthias Wimmer, Freek Stulp and Bernd Radig |
Abstract: |
In model-based fitting, the model parameters that best fit the image are determined by searching for the optimum
of an objective function. Often, this function is designed manually, based on implicit and domain-dependent
knowledge. We acquire more robust objective functions by learning them from annotated images, in
which many critical decisions are automated, and the remaining manual steps do not require domain knowledge.
Still, the trade-off between generality and accuracy remains. General functions can be applied to a large
range of objects, whereas specific functions describe a subset of objects more accurately. (Gross et al., 2005)
have demonstrated this principle by comparing generic to person-specific Active Appearance Models. As it
is impossible to learn a person-specific objective function for the entire human population, we automatically
partition the training images and then learn partition-specific functions. The number of groups influences the
specificity of the learned functions. We automatically determine the optimal partitioning given the number of
groups, by minimizing the expected fitting error.
Our empirical evaluation demonstrates that the group-specific objective functions more accurately describe
the images of the corresponding group. The results of this paper are especially relevant to face model tracking,
as individual faces will not change throughout an image sequence. |
|
Paper Nr.: |
372
|
Title: |
SIMILARITY MEASURES FUSION USING SVM CLASSIFIER FOR FACE AUTHENTICATION
|
Author(s): |
Mohammad T. Sadeghi, Masoumeh Samiei, Seyed Mohammad T. Almodarresi and Josef Kittler |
Abstract: |
In
this paper, the problems of measuring similarity in LDA face space
using different metrics and fusing the associated classifiers are
considered. A few similarity measures used in different pattern
recognition applications, including the recently proposed Gradient
Direction (GD) metric are reviewed. An automatic parameter selection
algorithm is then proposed for optimising the GD metric. In extensive
experimentation on the BANCA database, we show that the optimised GD
metric outperforms the other metrics in various conditions. Moreover,
we demonstrate that by combining the GD metric and seven other metrics
at the decision level using Support Vector Machines, the performance of
the resulting decision making scheme consistently improves.
|
|
Paper Nr.: |
375
|
Title: |
HIERARCHICAL EVALUATION MODEL FOR 3D FACE RECOGNITION
|
Author(s): |
Sídnei A. Drovetto Jr., Luciano Silva and Olga R. P. Bellon |
Abstract: |
In
this paper we propose a 3D face matching method based on alignments obtained
using the Simulated Annealing global optimization algorithm guided by
the Mean Squared Error with M-estimator Sample Consensus and the
Surface Interpenetration Measure (SIM). The matching score is obtained
by the calculation of the SIM after the registration process. Since the
SIM is a sensitive measure, it needs a good alignment to give relevance
to its value. Our registration approach tends to reach a near global
solution and, therefore, produces the necessary precise alignments. By
analyzing the matching score, the system can identify if the input
images come from the same subject or not. In a verification scenario we
use a hierarchical evaluation model which maximizes the results and
reduces the computing time. Extensive experiments were performed on the
well-known FRGC v2.0 3D face database using five different facial
regions: three regions of the nose; the region of the eyes; and the
face itself. Compared to state-of-the-art works, our approach has
achieved a high rank-one recognition rate and, also, a high
verification rate. |
|
Paper Nr.: |
378
|
Title: |
SINGLE-IMAGE 3D RECONSTRUCTION OF BALL VELOCITY AND SPIN FROM MOTION BLUR - An Experiment in Motion-from-Blur
|
Author(s): |
Giacomo Boracchi, Vincenzo Caglioti and Alessandro Giusti |
Abstract: |
We
present an algorithm for analyzing a single calibrated image of a ball
and reconstructing its instantaneous motion (3D velocity and spin) by
exploiting motion blur. We use several state-of-the-art image
processing techniques for extracting information from space-variant
blur, then robustly integrate such information in a geometrical model
of the 3D motion. We initially handle the simpler case in which the
ball's apparent translation is negligible w.r.t. its spin, then extend
the technique to handle the full motion.
We show extensive experimental results both on synthetic and camera
images.
In a broader scenario, we exploit this specific problem for discussing
motivations, advantages and limits of reconstructing motion from motion
blur. |
|
Paper Nr.: |
381
|
Title: |
COMPLETE AND STABLE PROJECTIVE HARMONIC
|
Author(s): |
Faten Chaieb and Faouzi Ghorbel |
Abstract: |
Planar
shape recognition is an important problem in computer vision and
pattern recognition. We deal with planar shape contour views that
differ by a general projective transformation. One method for solving
such a problem is to use projective invariants. In this work, we propose
a framework for generating projective and parametrization invariants based
on harmonic analysis theory. Invariance to
reparameterization is obtained by a projective arc length curve
reparameterization process. Then, a complete and stable set of
projective harmonic invariants is constructed from the Fourier
coefficients computed on the reparameterized contours. We test
this set of descriptors on analytic images in order to recognize
projectively similar contours.
|
|
Paper Nr.: |
402
|
Title: |
FACIAL EXPRESSION RECOGNITION USING ACTIVE APPEARANCE MODELS
|
Author(s): |
Pedro Martins, Joana Sampaio and Jorge Batista |
Abstract: |
A
framework for automatic facial expression recognition combining an Active
Appearance Model (AAM) and Linear Discriminant Analysis (LDA) is proposed.
Seven different expressions of several subjects, representing the
neutral face and the facial emotions of happiness, sadness, surprise,
anger, fear and disgust, were analysed. The proposed solution starts by
describing the human face with an AAM, then projects the appearance
results to a Fisherspace using LDA to emphasize the different
expression categories. Finally, classification is based on the
Mahalanobis distance.
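The final classification step can be sketched as nearest-class-mean assignment under the Mahalanobis distance. The sketch below assumes two-dimensional feature vectors (for example, two Fisherspace coordinates) and a pooled within-class covariance; both are illustrative assumptions, and the function names are hypothetical. The AAM fitting and LDA projection are not shown.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def pooled_covariance_2d(classes):
    """Pooled within-class covariance of 2D samples, returned as (sxx, sxy, syy)."""
    sxx = sxy = syy = 0.0
    total = 0
    for samples in classes.values():
        mx, my = mean_vector(samples)
        for x, y in samples:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
        total += len(samples)
    dof = total - len(classes)  # degrees of freedom of the pooled estimate
    return sxx / dof, sxy / dof, syy / dof


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance in 2D, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def classify(point, classes):
    """Return the label whose class mean is closest in Mahalanobis distance."""
    cov = pooled_covariance_2d(classes)
    return min(
        classes,
        key=lambda label: mahalanobis_sq(point, mean_vector(classes[label]), cov),
    )
```

Unlike the Euclidean distance, this measure scales each direction by the spread of the training data, so a small offset along a low-variance axis counts for more than a large offset along a high-variance one.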
|
|
Paper Nr.: |
442
|
Title: |
MPEG-7 DESCRIPTORS BASED CLASSIFIER FOR FACE/NON FACE DETECTION
|
Author(s): |
Malek Nadil, Abdenour Labed and Feryel Souami |
Abstract: |
In
this paper we present a high-level Face/Non-face classifier which can
be integrated into a content-based image retrieval system. It helps
to extract semantics from images prior to their retrieval. This
two-step retrieval reduces the effects of the semantic gap on the
performance of existing systems. To construct our classifier, we
exploit a standardized MPEG-7 low-level descriptor. Experiments
performed on images taken from two databases show that our
technique outperforms, in many cases, others presented in the
literature. |
|
|
|
Area 4 - Motion, Tracking and Stereo Vision
|
Paper Nr.: |
12
|
Title: |
TRAFFIC SURVEILLANCE USING GABOR FILTER BANK AND KALMAN PREDICTOR
|
Author(s): |
Mehmet Celenk, James Graham and Santosh Singh |
Abstract: |
This
paper describes a non-linear scene prediction method for use with
traffic surveillance video. A Gabor-filter bank is selected as a
primary detector for any changes in a given image sequence. The
detected ROI (region of interest) in arbitrary motion is fed to a
non-linear Kalman filter for predicting the next scene in time-varying
video, which is subject to prediction error correction. Potential
applications of this research are mainly in the areas of traffic
control and monitoring, accident detection, traffic flow surveillance,
and MPEG video-compression. Experimental results reported herein show
that non-linear Kalman filtering based scene prediction is quite
effective in the estimation of future frames in visual-band intensity
driven sensing. The least mean square error (LMSE) in predicting future
frames is relatively low, about 2 to 3% on average, supporting the
effectiveness of the approach for traffic-motion control and management. |
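The predict-then-correct cycle behind Kalman-based prediction can be shown with a deliberately simplified sketch. It is a linear, scalar filter with a random-walk state model, not the non-linear filter described in the abstract; the function `kalman_step` and the noise parameters `q` and `r` are hypothetical names chosen for illustration.

```python
def kalman_step(x, p, z, q, r):
    """One predict/correct cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new measurement (e.g. one coordinate of a tracked region)
    q, r : process and measurement noise variances (assumed known)
    Returns (x_pred, x_new, p_new).
    """
    # Predict: a random-walk model keeps the state and inflates the variance.
    x_pred = x
    p_pred = p + q
    # Correct: blend the prediction and the measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_pred, x_new, p_new


def run_filter(measurements, x0=0.0, p0=1.0, q=0.01, r=1.0):
    """Filter a sequence and return the list of corrected estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        _, x, p = kalman_step(x, p, z, q, r)
        estimates.append(x)
    return estimates
```

The difference between `x_pred` and the measurement `z` is the prediction error that the correction step feeds back, which mirrors the "prediction error correction" mentioned in the abstract.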
|
Paper Nr.: |
19
|
Title: |
AUTOMATIC INITIALIZATION FOR BODY TRACKING - Using Appearance to Learn a Model for Tracking Human Upper Body Motions
|
Author(s): |
Joachim Schmidt and Modesto Castrillón-Santana |
Abstract: |
Social robots require the ability to communicate and recognize the
intention of a human interaction partner. Humans commonly make use
of gestures during everyday life for interactive purposes. For a
social robot, recognition of gestures is therefore a necessary
skill. As a common intermediate step, the pose of an individual is
tracked over time making use of a body model. For a system based on
such a communication scenario, self-starting tracking is a favored
characteristic. The acquisition of a suitable body model, however,
is a complex task.
This paper presents an approach to facilitate the acquisition of the
body model during interaction. Taking advantage of a robust face
detection algorithm provides the opportunity for automatic and
markerless acquisition of a 3D body model using a monocular color
camera.
For the given human robot interaction scenario, a prototype has been
developed for a single user configuration. It provides automatic
initialization and failure recovery of a 3D body tracker based on
head and hand detection information, delivering promising results.
|
|
Paper Nr.: |
28
|
Title: |
POSE ESTIMATION FROM LINES BASED ON THE DUAL-NUMBER METHODS
|
Author(s): |
Caixia Zhang, Zhanyi Hu and Fengmei Sun |
Abstract: |
It
is a classical problem to estimate the camera pose from a calibrated
image of 3D entities (points or lines) in computer vision,
photogrammetry and even in mathematics. Although lines provide a more
stable image feature to match, and point features are often
missing from consecutive images in a series of camera pose
determinations, most papers use only point features, and
line features have appeared only occasionally in the literature. In this
paper, based on the dual-number methods, we present a new method for
pose estimation from lines, and introduce a similar formula with a
general rigid transformation, thus, we set up a unified framework of
coordinate transformations for lines and points. Then, according to the
coplanarity of the corresponding image line and space line, a new group
of constraints is introduced. Although they are not independent of each
other, redundant constraints may be used to improve the estimation
precision for all practical applications where noise in the data cannot
be avoided. Different from the existing methods based on lines, we do
not use an isolated point on either the space line or the image line,
but the whole line data. Thus, corner detection, along with
the corresponding propagated error, can be avoided. Simulations and tests on
real images confirm the validity and usefulness of our method. |
|
Paper Nr.: |
29
|
Title: |
MODEL-FREE MARKERLESS TRACKING FOR REMOTE SUPPORT IN UNKNOWN ENVIRONMENTS
|
Author(s): |
Alexander Ladikos, Selim Benhimane, Nassir Navab and Mirko Appel |
Abstract: |
We
propose a complete system that performs real-time markerless tracking
for Augmented Reality-based remote user support in a priori unknown
environments. In contrast to existing systems, which require a prior
setup and/or knowledge about the scene, our system can be used without
preparation. This is due to our tracking algorithm which does not need
a 3D-model of the scene or a learning-phase for the initialization.
This allows us to perform fast and robust markerless tracking of the
objects which are to be augmented. The proposed solution does not
require artificial markers or special lighting conditions. The only
requirement is the presence of locally planar objects in the scene,
which is true for almost every man-made structure and in particular
technical installations. The augmentations are chosen by a remote
expert who is connected to the user over a network and receives a live
stream of the scene. |
|
Paper Nr.: |
35
|
Title: |
IMAGE SEQUENCE STABILIZATION USING FUZZY KALMAN FILTERING AND LOG-POLAR TRANSFORMATION
|
Author(s): |
Nikolaos Kyriakoulis, Antonios Gasteratos and Angelos Amanatiadis |
Abstract: |
Digital
image stabilization (DIS) is the process that compensates the undesired
fluctuations of a frame’s position in an image sequence by means of
digital image processing techniques. DIS techniques usually comprise
two successive units. The first one estimates the motion and the
successive one compensates it. In this paper, a novel digital image
stabilization technique is proposed, which is featured with a fuzzy
Kalman estimation of the global motion vector in the log-polar plane.
The global motion vector is extracted using four local motion vectors
computed on respective sub-images in the log-polar plane. The proposed
technique exploits both the advantages of the fuzzy Kalman system and
the log-polar plane. The compensation is based on the motion estimation
in the log-polar domain, filtered by the fuzzy Kalman system. The
described technique outperforms in terms of response times, the output
quality and the level of compensation. |
|
Paper Nr.: |
38
|
Title: |
OPTICAL-FLOW FOR 3D ATMOSPHERIC MOTION ESTIMATION
|
Author(s): |
Patrick Héas and Etienne Mémin |
Abstract: |
In
this paper, we address the problem of estimating three-dimensional
motions of a stratified atmosphere from satellite image sequences. The
complexity of three-dimensional atmospheric fluid flows, combined with the
incomplete observation of atmospheric layers due to the sparsity of
cloud systems, makes the estimation of a dense atmospheric
motion field from satellite image sequences very difficult. The recovery of the
vertical component of fluid motion from a monocular sequence of image
observations is a very challenging problem for which no solution exists
in the literature. Based on a physically sound vertical decomposition
of the atmosphere into layers of different altitudes, we propose here a
dense motion estimator dedicated to the extraction of three-dimensional
wind fields characterizing the dynamics of a layered atmosphere. Wind
estimation is performed over the complete three-dimensional space using
a multi-layer model describing a stack of dynamic horizontal layers of
evolving thickness, interacting at their boundaries via vertical winds.
The efficiency of our approach is demonstrated on synthetic and real
sequences.
|
|
Paper Nr.: |
40
|
Title: |
GLOBAL DEPTH ESTIMATION FOR MULTI-VIEW VIDEO CODING USING CAMERA PARAMETERS
|
Author(s): |
Xiaoyun Zhang, Weile Zhu and George Yang |
Abstract: |
JVT
decides to focus on multi-view video plus depth (MVD) data format for
Multi-view Video Coding (MVC), in order to support rendering a wide
range continuum of views at the decoder for advanced 3DV and FVV
systems. Thus, it is important to study global depth to minimize rate
for depth side information and to improve depth search efficiency. In
this paper, we propose a global depth estimation algorithm from
multi-view images using camera parameters. First, an initial depth
value is obtained from the convergent point of the camera system by
solving a set of linear equations. Then, the global depth is searched
to minimize the absolute difference between the synthesized view and
the practical view. Because the initial depth can provide appropriate
depth search range and step size, the global depth can be estimated
efficiently and quickly with less computation. Experimental results
verify the algorithm performance. |
|
Paper Nr.: |
51
|
Title: |
TOWARDS EUCLIDEAN RECONSTRUCTION FROM VIDEO SEQUENCES
|
Author(s): |
Dimitri Bulatov |
Abstract: |
This
paper presents two algorithms needed to perform a dense
3D-reconstruction from video streams recorded with uncalibrated
cameras. Our algorithm for camera self-calibration makes extensive use
of the constant focal length. Furthermore, a fast dense reconstruction
can be performed by fusion of tessellations obtained from different
sub-sequences (LIFT). Moreover, we will present our system for
performing the reconstruction in a projective coordinate system. Since
critical motions are common in the majority of practical situations,
care has been taken to recognize and deal with them. |
|
Paper Nr.: |
57
|
Title: |
BACKGROUND SUBTRACTION WITH ADAPTIVE SPATIO-TEMPORAL NEIGHBORHOOD ANALYSIS
|
Author(s): |
Marco Cristani and Vittorio Murino |
Abstract: |
In the literature, visual surveillance methods based on joint pixel and region analysis for
background subtraction are proven to be effective in discovering foreground objects in cluttered scenes.
Typically, per-pixel foreground detection is contextualized in a local neighborhood region in order
to limit false alarms. However, such methods have a heavy computational cost, depending on the
size of the surrounding region considered for each pixel. In this paper, we propose an original and
efficient joint pixel-region analysis technique able to automatically select the sampling rate with
which pixels in different areas are checked out, while adapting the size of the neighborhood region
considered. The algorithm has been validated on standard videos with benchmark tests, proving
the goodness of the approach, especially in terms of quality of the detection with respect to the
frame rate achieved. |
|
Paper Nr.: |
60
|
Title: |
SIMPLE BUT EFFECTIVE TREE STRUCTURES FOR DYNAMIC PROGRAMMING-BASED STEREO MATCHING
|
Author(s): |
Michael Bleyer and Margrit Gelautz |
Abstract: |
This
work describes a fast method for computing dense stereo correspondences
that is capable of generating results close to the state-of-the-art. We
propose running a separate disparity computation process in each image
pixel. The idea is to root a tree graph on the pixel whose disparity
needs to be reconstructed. The tree thereby forms an individual
approximation of the standard four-connected grid for this specific
pixel. An exact optimum of a predefined energy function on the applied
tree structure is determined via dynamic programming (DP), and the root
pixel is assigned to the disparity of optimal costs. We present two
simple tree structures that allow for the efficient calculation of all
trees' optima with only four scanline-based DP passes. These simple
trees are designed to capture all pixels of the reference frame and
incorporate horizontal and vertical smoothness edges in order to weaken
the scanline streaking problem inherent in DP-based approaches. We
evaluate our results using the Middlebury test set. Our algorithm
currently ranks at the eighth position of approximately 30 algorithms
in the Middlebury database. More importantly, it is the currently
best-performing method that does not use image segmentation and is
significantly faster than most competing algorithms. Our method needs
less than a second to determine the disparity map for typical stereo
pairs. |
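The dynamic programming idea can be illustrated on the simplest possible tree, a single scanline treated as a chain. The sketch minimises a data cost plus a linear smoothness penalty between neighbouring pixels; the energy form, the function name `scanline_dp` and the parameter `smooth_weight` are assumptions for illustration, and the tree structures proposed in the abstract are not reproduced here.

```python
def scanline_dp(data_cost, smooth_weight):
    """Exact minimum of a chain energy via dynamic programming.

    data_cost[i][d] is the matching cost of pixel i at disparity d.
    The energy is the sum of data costs plus smooth_weight * |d_i - d_(i+1)|
    over neighbouring pixels. Returns (disparities, total_energy).
    """
    n = len(data_cost)
    levels = len(data_cost[0])
    cost = [list(data_cost[0])]  # cost[i][d]: best energy of pixels 0..i ending at d
    back = []                    # back[i-1][d]: best predecessor disparity for pixel i
    for i in range(1, n):
        row, ptr = [], []
        for d in range(levels):
            best_prev = min(
                range(levels),
                key=lambda e: cost[-1][e] + smooth_weight * abs(d - e),
            )
            row.append(
                data_cost[i][d]
                + cost[-1][best_prev]
                + smooth_weight * abs(d - best_prev)
            )
            ptr.append(best_prev)
        cost.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final disparity.
    d = min(range(levels), key=lambda e: cost[-1][e])
    total = cost[-1][d]
    path = [d]
    for ptr in reversed(back):
        d = ptr[d]
        path.append(d)
    path.reverse()
    return path, total
```

On an isolated chain each scanline is optimised independently, which is the source of the streaking problem the abstract mentions; adding vertical smoothness edges to the tree is the authors' way of weakening it.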
|
Paper Nr.: |
66
|
Title: |
A FAST POST-PROCESSING TECHNIQUE FOR REAL-TIME STEREO CORRESPONDENCE
|
Author(s): |
Georgios-Tsampikos Michailidis, Leonidas Kotoulas and Ioannis Andreadis |
Abstract: |
In
computer vision, the extraction of dense and accurate disparity maps is
a computationally expensive and challenging problem, and high quality
results typically require from several seconds to several minutes to be
obtained. In this paper, we present a new post-processing technique,
which detects the incorrectly reconstructed pixels after the initial
matching process and replaces them with correct disparity values.
Experimental results with Middlebury data sets show that our approach
can process images of up to 3MPixels in less than 3.3 msec, producing
at the same time semi-dense (up to 99%) and accurate (up to 94%)
disparity maps. We also propose a way to adaptively change, in real
time, the density and the accuracy of the extracted disparity maps. In
addition, the matching and post-processing procedures are calculated
without using any multiplication, which makes the algorithm very fast,
while its reduced complexity simplifies its implementation. Finally, we
present the hardware implementation of the proposed algorithm. |
|
Paper Nr.: |
71
|
Title: |
TOUCH-LESS PALM PRINT BIOMETRIC SYSTEM
|
Author(s): |
Michael Goh Kah Ong, Connie Tee and Andrew Teoh Beng Jin |
Abstract: |
In
this research, we propose an innovative touch-less palm print
recognition system. This project is motivated by the public’s demand
for non-invasive and hygienic biometric technology. For various
reasons, users are concerned about touching the biometric scanners.
Therefore, we propose to use a low-resolution web camera to capture the
user’s hand at a distance for recognition. The users do not need to
touch any device for their palm print to be extracted for analysis. A
novel hand tracking and palm print region of interest (ROI) extraction
technique is used to track and capture the user’s palm in real-time
video streams. The discriminative palm print features are extracted
with a new method that applies the local binary pattern (LBP) texture
descriptor to the palm print directional gradient responses.
Experiments show promising results using the proposed method.
Performance can be further improved when a modified probabilistic
neural network (PNN) is used for feature matching. |
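As a rough illustration of the local binary pattern (LBP) descriptor mentioned in this abstract, the following minimal sketch computes the basic 8-neighbour LBP code and its histogram on a plain grey-level grid. It is not the authors' implementation: the abstract applies LBP to directional gradient responses, whereas this sketch assumes a raw intensity image given as a list of rows, and the function names `lbp_code` and `lbp_histogram` are hypothetical.

```python
def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the pixel at (r, c).

    `image` is a list of rows of grey levels; (r, c) must not lie on
    the image border, because all eight neighbours are read.
    """
    center = image[r][c]
    # Neighbours are visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

A histogram of this kind is what would typically be passed to a matcher; the bit ordering is an arbitrary choice of this sketch and only needs to be consistent between enrolment and recognition.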
|
Paper Nr.: |
76
|
Title: |
USING LOW-LEVEL MOTION TO ESTIMATE GAIT PHASE
|
Author(s): |
Ben Daubney, David Gibson and Neill Campbell |
Abstract: |
This paper presents a method that is
capable of robustly estimating gait phase of a human walking using
the motion of a sparse cloud of feature points extracted using a
standard feature tracker. We first learn statistical motion models
of the trajectories we would expect to observe for each of the main
limbs. By comparing the motion of the tracked features to our models
and integrating over all features we create a state probability
matrix that represents the likelihood of being at a particular phase
as a function of time. By using dynamic programming and allowing
only likely phase transitions to occur between consecutive frames an
optimal solution can be found that estimates the gait phase for each
frame. This work demonstrates that despite the sparsity and noise
contained in the tracking data the information encapsulated in the
motion of these points is sufficient to extract gait phase to a high
level of accuracy. Presented results demonstrate our system is
robust to changes in height of the walker, gait frequency and
individual gait characteristics. |
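The dynamic programming step described in this abstract can be sketched as a Viterbi-style search over the state probability matrix. This is a minimal sketch under stated assumptions, not the authors' method: the phases are assumed to be discrete and cyclic, a phase may stay put or advance by at most `max_step` between consecutive frames, and the name `estimate_phase_sequence` is hypothetical.

```python
import math


def estimate_phase_sequence(prob, max_step=1):
    """Return the most likely phase index for every frame.

    prob[t][s] is the likelihood of phase s at frame t. Phases are
    cyclic, and between consecutive frames the phase may advance by
    0..max_step positions (an assumption of this sketch).
    """
    n_phases = len(prob[0])
    tiny = 1e-12  # keeps log() finite for zero likelihoods
    score = [math.log(p + tiny) for p in prob[0]]
    back = []  # back[t-1][s] = best predecessor of phase s at frame t
    for t in range(1, len(prob)):
        new_score, pointers = [], []
        for s in range(n_phases):
            # Phases from which s can be reached in a single frame.
            candidates = [(s - step) % n_phases
                          for step in range(max_step + 1)]
            best_prev = max(candidates, key=lambda k: score[k])
            new_score.append(score[best_prev] + math.log(prob[t][s] + tiny))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Backtrack from the best final phase.
    state = max(range(n_phases), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With the per-frame likelihoods `[[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]`, picking the best phase independently in each frame would give the sequence 0, 1, 0, 2, which contains a backward jump; the constrained search instead returns 0, 1, 1, 2.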
|
Paper Nr.: |
77
|
Title: |
EXPERIMENTAL EVALUATION OF RELATIVE POSE ESTIMATION ALGORITHMS
|
Author(s): |
Marcel Brückner, Ferid Bajramovic and Joachim Denzler
|
Abstract: |
We
give an extensive experimental comparison of four popular relative pose
(epipolar geometry) estimation algorithms: the eight, seven, six and
five point algorithms.
We focus on the practically important case that only a single solution
may be returned by automatically selecting one of the solution
candidates, and investigate the choice of error measure for the
selection. We show that the five point algorithm gives very good
results with automatic selection. As sometimes the eight point
algorithm is better, we propose a combination algorithm which selects
from the solutions of both algorithms and thus combines their
strengths. We further investigate the behavior in the presence of
outliers by using adaptive RANSAC, and give practical recommendations
for the choice of the RANSAC parameters. Finally, we verify the
simulation results on real data. |
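To make the trade-off between the minimal sample sizes concrete, the standard RANSAC iteration bound can be evaluated for five to eight point samples. The sketch below uses the textbook formula N = log(1 - p) / log(1 - w^s), where w is the inlier ratio, s the sample size and p the desired confidence; it does not reproduce the parameter recommendations of the paper, and the function name `ransac_iterations` is hypothetical.

```python
import math


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Number of random samples needed so that, with probability
    `confidence`, at least one sample contains only inliers."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_good_sample = inlier_ratio ** sample_size
    if p_good_sample >= 1.0:
        return 1  # every sample is outlier-free
    # log1p keeps the denominator accurate for very small p_good_sample.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p_good_sample))


# Compare the sample sizes of the five to eight point algorithms
# at 50% inliers.
for s in (5, 6, 7, 8):
    print(s, ransac_iterations(0.5, s))
```

In an adaptive scheme the bound is typically recomputed whenever a hypothesis with a larger inlier set is found, so the loop can stop early; at 50% inliers the five point sample needs 146 draws while the eight point sample needs 1177.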
|
Paper Nr.: |
87
|
Title: |
EXPERIMENTAL EVALUATION OF RELATIVE POSE ESTIMATION ALGORITHMS
|
Author(s): |
Marcel Brückner, Ferid Bajramovic and Joachim Denzler |
Abstract: |
Recently,
much work has been devoted to multiple object tracking on the one hand
and to appearance model adaptation for a single object tracker on
the other. In this paper, we perform both tracking of multiple objects
(faces of people) in a meeting scenario and on-line learning to
incrementally update the models of the tracked objects to account for
appearance changes during tracking. Additionally, we automatically
initialize and terminate tracking of individual objects based on
low-level features, i.e. face color, face size, and object movement.
For tracking, a particle filter is incorporated to propagate sample
distributions over time. Numerous experiments on meeting data
demonstrate the capabilities of our tracking approach. Additionally, we
provide an empirical verification of appearance model learning during
tracking on an indoor and an outdoor scene, which supports more robust
tracking. |
|
Paper Nr.: |
89
|
Title: |
A SLAG TEMPERATURE AND FLOW MONITORING SYSTEM
|
Author(s): |
Jean-Philippe Andreu |
Abstract: |
Quality
assessment of steel processing essentially relies on the continuous
monitoring and control of the steel temperature and the flow patterns
of the molten material. Among the various sensors developed to control
that process, CCD sensors emerge as a good alternative to more
classical measuring devices like thermocouple probes and pyrometers.
While thermographic infrared cameras are often discarded as an option
because of their high cost, multi-spectral imaging systems based on
cameras working in the visible spectrum offer a viable alternative.
This paper presents a slag monitoring system based on dual wavelength
thermographic cameras. The system allows real-time, contactless
monitoring of the slag temperature and, as an added value from the
continuous video monitoring, it provides the flow patterns of the ingot
slag topping in order to assess the quality of the steel processing. |
|
Paper Nr.: |
90
|
Title: |
THE ACCURACY OF SCENE RECONSTRUCTION FROM IR IMAGES BASED ON KNOWN CAMERA POSITIONS - An Evaluation with the Aid of LiDAR Data
|
Author(s): |
Stefan Lang, Marcus Hebel and Michael Kirchhof |
Abstract: |
In
this work a system for 3D scene reconstruction from aerial infrared
imagery by means of known pose and position information of the sensor
is presented. Detected 2D image features are tracked and triangulated
afterwards. Each estimated 3D point is assessed by means of its
covariance matrix which is associated with the respective uncertainty.
Finally a non-linear optimization (Gauss-Newton iteration) of 3D points
yields the resulting point cloud.
The obtained results are evaluated with the aid of LiDAR data. For that
purpose we present a novel approach which quantifies the error of a
reconstructed scene by means of a 3D point cloud acquired by a laser
scanner. The evaluation procedure takes into account that the main
uncertainty of a Structure from Motion (SfM) system is in the direction of
the line of sight. Results of both the SfM system and the evaluation
are presented. |
|
Paper Nr.: |
101
|
Title: |
IMPLEMENTATION OF REAL-TIME VISUAL TRACKING SYSTEM FOR AIRBORNE TARGETS
|
Author(s): |
Muhammad Asif Memon, Furqan Muhammad Khan, Farrukh H. Khan, Rana Muhammad Anees and Omair Abdul Rahman |
Abstract: |
A
real-time visual tracking system is presented for tracking airborne
targets. The algorithm is based on the intensity difference between
the background and the target in a gray-scale frame. As the background is
uniform for aerial videos, the decision is made on the contrast between
the tracking gate boundary and the target inside that gate. The algorithm
is embedded on a DSP Starter Kit (DSK) 6713 and a 586 embedded controller
is used for servo control and processing. A personal computer (PC)
provides the user interface for the system. The performance of the
system is verified with different airborne targets from birds to
helicopters and its reliability and constraints are observed. |
|
Paper Nr.: |
110
|
Title: |
REAL-TIME OBJECT DETECTION AND TRACKING FOR INDUSTRIAL APPLICATIONS
|
Author(s): |
Selim Benhimane, Hesam Najafi, Matthias Grundmann, Yakup Genc, Nassir Navab and Ezio Malis |
Abstract: |
This
paper deals with the fundamental problem of simultaneously tracking
complex objects and accurately estimating the 3D displacement of the
camera. It targets applications in industrial environments. We
adapted recently proposed methods in order to overcome most of the
limitations that the community is facing in markerless applications.
The proposed algorithm makes it possible to detect and track complex industrial
machines that have poor textures in order to provide the user with
information virtually overlaid on the images acquired. It is tailored
such that most recent advances in computer vision and in pattern
recognition are combined to realize a solution able to cope with real
industrial environments. |
|
Paper Nr.: |
120
|
Title: |
3D ARTICULATED HAND TRACKING BY NONPARAMETRIC BELIEF PROPAGATION ON FEASIBLE CONFIGURATION SPACE
|
Author(s): |
Tangli Liu, Wei Liang and Yunde Jia |
Abstract: |
An
efficient articulated hand tracking method based on a 3D graphical
model from monocular image sequences is proposed in this paper. Because
inaccurate dependences among the components of the human hand led
to distorted estimates in previous work, we design a pertinence
graphical model combined with domain-specific heuristics among the
components of the human hand, describing the hand’s 3D structure,
kinematics, and dynamics. The proposed model decomposes multivariate
joint distributions into a set of local interactions among small
subsets. The modular structure provides an intuitive language for
expressing domain-specific knowledge about the variable relationships,
and facilitates tracking each hand component independently. We then
provide a novel belief propagation algorithm for inference in the hand
graphical model. The algorithm can accommodate an extremely broad class
of potential functions besides the potentials appropriate for our
model. The experimental results show the robustness and efficiency of
tracking each hand component. |
|
Paper Nr.: |
122
|
Title: |
A NOVEL EVOLUTIONARY FRAMEWORK FOR FEATURE MATCHING
|
Author(s): |
Biao Wang and Chaoying Tang |
Abstract: |
The
paper presents a new feature matching scheme based on the Queen-bee
Evolution for two uncalibrated images. Matching features requires an
exhaustive search in a vast space, for which evolutionary algorithms
have recently been recommended. This paper proposes a simple and effective
algorithm. We intuitively encode a string of integer numbers assigned
to the features as chromosomes and develop a novel crossover operator
that preserves the position information without any
disruption. We also tailor a swap mutation operator to prevent
premature convergence and invalid solutions. As a result, the proposed
algorithm, in cooperation with linear ranking selection and elitist
replacement, can quickly reach the global or near-global optimal
solution. Meanwhile, it is a more general framework for matching
various types of features. The experimental results illustrate the
performance of the proposed approach. |
|
Paper Nr.: |
123
|
Title: |
TRACK AND CUT: SIMULTANEOUS TRACKING AND SEGMENTATION OF MULTIPLE OBJECTS WITH GRAPH CUTS
|
Author(s): |
Aurelie Bugeau and Patrick Pérez |
Abstract: |
This
paper presents a new method to both track and segment multiple objects
in videos using min-cut/max-flow optimizations. We introduce objective
functions that combine low-level pixel-wise measures (color, motion),
high-level observations obtained via an independent detection module
(connected components of foreground detection masks in the
experiments), motion prediction and contrast-sensitive contextual
regularization. One novelty is that external observations are used
without adding any association step. The minimization of these cost
functions simultaneously allows "detection-before-track" tracking
(track-to-observation assignment and automatic initialization of new
tracks) and segmentation of tracked objects. When several tracked
objects get mixed up by the detection module (e.g., single foreground
detection mask for objects close to each other), a second stage of
minimization allows the proper tracking and segmentation of these
individual entities despite the observation confusion. Experiments on
sequences from PETS 2006 corpus demonstrate the ability of the method
to detect, track and precisely segment persons as they enter and
traverse the field of view, even in cases of occlusions (partial or
total), temporary grouping and frame dropping. |
|
Paper Nr.: |
153
|
Title: |
DETECTING, TRACKING AND COUNTING FISH IN LOW QUALITY UNCONSTRAINED UNDERWATER VIDEOS
|
Author(s): |
Yun-Heh Chen-Burger, Gayathri Nadarajan and Robert B. Fisher |
Abstract: |
In
this work a machine vision system capable of analysing underwater
videos for detecting, tracking and counting fish is presented. The
real-time videos, collected near the Ken-Ding sub-tropical coral reef
waters, are managed by EcoGrid, Taiwan, and are barely analysed by marine
biologists. The video processing system consists of three
subsystems: the video texture analysis, fish detection and tracking
modules. The fish detection is based on two algorithms computed
independently, whose results are combined in order to obtain a more
accurate outcome. The tracking was carried out by the application of
the CamShift algorithm that enables the tracking of objects whose
numbers may vary over time. Unlike existing fish-counting methods, our
approach provides a reliable method in which the fish number is
computed in unconstrained environments and under several scenarios
(murky water, algae on camera lens, moving plants, low contrast, etc.).
The proposed approach was tested with 20 underwater videos, achieving
an overall accuracy as high as 85%. |
|
Paper Nr.: |
164
|
Title: |
MULTI-LANE VISUAL PERCEPTION FOR LANE DEPARTURE WARNING SYSTEMS
|
Author(s): |
Juan M. Collado, Cristina Hilario, Arturo de la Escalera and Jose M. Armingol |
Abstract: |
This
paper presents a Road Detection and Tracking algorithm for Lane
Departure Warning Systems. An inverse perspective transformation gives
a bird's-eye view of the road, where longitudinal road markings are
detected by exploration of horizontal gradient, looking for a road
marking model. Next, a parabolic lane model is fitted to road markings
and tracked through a particle filter. The right and left lane
boundaries are classified into three types (solid, broken or merge lane
boundaries), through a Fourier analysis, and adjacent lanes are
searched when broken or merge lines are detected. This gives the system
the ability to automatically detect the number and type of road lanes.
This ability makes it possible to tell the difference between allowed and
forbidden manoeuvres, such as crossing a solid line, and it is used by
the lane departure warning system. Despite its importance, lane
boundary classification has been seldom considered in previous works. A
Lane Departure Warning System launches an acoustic signal when a lane
departure is detected. Warnings are suppressed when the blinkers are
enabled, or when the vehicle is crossing a solid line regardless of the
state of the blinkers. |
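The parabolic lane model fit mentioned in this abstract can be illustrated with an ordinary least-squares fit. This is a minimal sketch, not the authors' implementation: it assumes road-marking points in the bird's-eye view are given as (x, y) pairs, that the lane is modelled as x = a*y^2 + b*y + c (lateral position as a function of distance along the road), and the function name `fit_parabola` is hypothetical.

```python
def fit_parabola(points):
    """Least-squares fit of x = a*y**2 + b*y + c to (x, y) points.

    Returns the coefficients (a, b, c). At least three points with
    distinct y values are required.
    """
    if len(points) < 3:
        raise ValueError("need at least three points")
    # Build the augmented normal equations [A^T A | A^T x],
    # where each row of A is [y^2, y, 1].
    m = [[0.0] * 4 for _ in range(3)]
    for x, y in points:
        row = (y * y, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += row[i] * row[j]
            m[i][3] += row[i] * x
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                for k in range(col, 4):
                    m[r][k] -= factor * m[col][k]
    return tuple(m[i][3] / m[i][i] for i in range(3))
```

In a tracking setting, coefficients such as these would form the state that a particle filter propagates from frame to frame; the fit above only covers the single-frame estimate.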
|
Paper Nr.: |
175
|
Title: |
A FEATURE GUIDED PARTICLE FILTER FOR ROBUST HAND TRACKING
|
Author(s): |
Matti-Antero Okkonen, Janne Heikkilä and Matti Pietikäinen |
Abstract: |
Particle
filtering offers an interesting framework for visual tracking. Unlike
the Kalman filter, particle filters can deal with non-linear and
non-Gaussian problems, which makes them suitable for visual tracking in
presence of real-life disturbance factors, such as background clutter
and movement, fast and unpredictable object movement and unideal
illumination conditions. This paper presents a robust hand tracking particle
filter algorithm which exploits the principle of importance sampling
with a novel proposal distribution. The proposal distribution is based
on effectively calculated color blob features, propagating the
particles robustly through time even in unideal conditions. In
addition, a novel method for conditional color model adaptation is
proposed. The experiments show that using these methods in the particle
filtering framework enables hand
tracking with fast movements under real world conditions. |
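The importance sampling principle this abstract refers to can be sketched generically: when particles are drawn from a proposal distribution rather than from the motion prior, each weight is the likelihood times the prior density divided by the proposal density. The sketch below is a generic illustration and does not implement the colour blob proposal of the paper; the function names and the explicit `offset` argument (normally a uniform random number in [0, 1)) are assumptions made so the example is deterministic.

```python
def importance_weights(likelihood, prior, proposal):
    """Normalised importance weights w_i ~ likelihood_i * prior_i / proposal_i.

    Each argument is a list with one density value per particle.
    """
    raw = [l * p / q for l, p, q in zip(likelihood, prior, proposal)]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("all particle weights are zero")
    return [w / total for w in raw]


def systematic_resample(weights, offset):
    """Return indices of the particles kept by systematic resampling.

    `weights` must sum to 1; `offset` is a single draw from [0, 1).
    """
    n = len(weights)
    indices = []
    i = 0
    cumulative = weights[0]
    for k in range(n):
        position = (offset + k) / n  # evenly spaced sampling positions
        while position > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices
```

For example, with weights 0.1, 0.2 and 0.7 and an offset of 0.5, the resampled set keeps particle 1 once and particle 2 twice, so the low-weight particle 0 is dropped.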
|
Paper Nr.: |
181
|
Title: |
BIOLOGICALLY INSPIRED ATTENTIVE MOTION ANALYSIS FOR VIDEO SURVEILLANCE
|
Author(s): |
Florian Raudies and Heiko Neumann |
Abstract: |
Recently proposed algorithms in the
field of vision-based video surveillance are built upon
directionally consistent flow, or statistics of
foreground and background. Here, we present a novel
approach which utilizes an attention mechanism to focus processing
on (highly) suspicious image regions. The attention signal is
generated from temporal integration of localized image features from
monocular image sequences. This model uses biologically inspired
mechanisms, like feature extraction and grouping to analyze
spatio-temporal patterns aiming at defining scene signatures. Main
parts of the model are the construction of a motion streak image,
the estimation of image flow, and the incorporation of information
from both parts for the computation of an attention signal. This
incorporation of information is motivated by feature binding,
assumed to exist at various stages in biologically plausible systems.
We compare our model with an existing approach for the task of video
surveillance with a receiver operating characteristic (ROC) analysis.
In conclusion our model is shown to yield results which are comparable
with existing approaches. |
|
Paper Nr.: |
188
|
Title: |
DEPTH PREDICTION AT HOMOGENEOUS IMAGE STRUCTURES
|
Author(s): |
Sinan Kalkan, Florentin Wörgötter and Norbert Krüger |
Abstract: |
This paper proposes a voting-based model that predicts depth at weakly-structured image
areas from the depth that is extracted using a feature-based stereo method. We provide results,
on both real and artificial scenes, that show the accuracy and robustness of our approach.
Moreover, we compare our method to different dense
stereo algorithms to investigate the effect of texture on performance of the two different approaches.
The results confirm the expectation that dense stereo methods are suited better
for textured image areas and our method for weakly-textured image areas. |
|
Paper Nr.: |
193
|
Title: |
CORRELATION ICP ALGORITHM FOR POSE ESTIMATION BASED ON LOCAL AND GLOBAL FEATURES
|
Author(s): |
Marco A. Chavarria and Gerald Sommer |
Abstract: |
In
this paper we present a new variant of the ICP (iterative closest
point) algorithm based on local feature correlation. Our approach
combines global and local feature information to find better
correspondence sets and to use them to compute the 3D pose of the
object model, even for the case of large displacements between model
and image data. For such cases, we propose a 2D alignment in the image
plane (rotation plus translation) before the feature extraction
process. This has some advantages over the classical methods, such as
better convergence and robustness. Furthermore, it avoids the need for
a normal pre-alignment step in 3D. Our approach was tested on synthetic
and real-world data to compare the convergence behavior and performance
against other versions of the ICP algorithm combined with a classical
pre-alignment approach. |
|
Paper Nr.: |
212
|
Title: |
A NEW SET OF FEATURES FOR ROBUST CHANGE DETECTION
|
Author(s): |
José Sigut, Sid-Ahmed Ould Sidha, Juan Díaz and Carina González |
Abstract: |
A
new set of features for robust change detection is proposed. These
features are obtained from a transformation of the thresholded
intensity difference image. Their performance is tested on two video
sequences acquired in a human-machine interaction scenario under very
different illumination conditions. Several performance measures are
computed and a comparison with other well-known classical change
detection methods is made. The performed experiments show the
effectiveness and robustness of our proposal. |
|
Paper Nr.: |
215
|
Title: |
HAND GESTURE TRACKING FOR WEARABLE COMPUTING SYSTEMS
|
Author(s): |
Xiujuan Chai, Kongqiao Wang, Luosi Wei and Hao Wang |
Abstract: |
Wearable
computing has been a hot research field in recent years. Because of its
important role in wearable computing systems, hand gesture tracking
attracts the interest of many researchers. This paper proposes a simple
but effective temporal-differencing-based hand motion tracking scheme,
which is used to build an augmented drumming system. In our method,
accurate motion information is obtained by a fine-coarse-fine strategy.
Once the motion region candidates are found, a skin detector based on a
skin colour histogram is used to determine which region is the hand of
interest. In the tracking procedure, a motion direction constraint is
also adopted in order to get a robust result. Unlike traditional skin
detection over the whole image frame, our hand detection is combined
with motion region detection and is therefore no longer affected by
skin-like background. Experimental results show that our presented hand
gesture tracking is robust and fast. We also adopt it into an augmented
drumming system to show the good performance and powerful potential of
our method in wearable computing system. |
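Temporal differencing, the basis of the scheme in this abstract, amounts to thresholding the absolute difference between consecutive frames and then locating the changed region. The following minimal sketch shows only that basic step; it does not reproduce the fine-coarse-fine strategy or the skin colour test, frames are assumed to be grey-level lists of rows, and the names `motion_mask`, `bounding_box` and the default threshold are hypothetical.

```python
def motion_mask(prev_frame, curr_frame, threshold=20):
    """Binary mask with 1 where the grey level changed by more than
    `threshold` between two equally sized frames."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the moving pixels,
    or None when nothing moved."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return (min(rows), min(cols), max(rows), max(cols))
```

A candidate region found this way could then be passed to a colour-based test, which is cheaper than running skin detection over the whole frame.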
|
Paper Nr.: |
217
|
Title: |
EXACT VISUAL HULL FROM MARCHING CUBES
|
Author(s): |
Chen Liang and Kwan-Yee K. Wong |
Abstract: |
The
marching cubes algorithm has been widely adopted for extracting a
surface mesh from a volumetric description of the visual hull
reconstructed from silhouettes. However, typical volumetric
descriptions, such as an octree, provide only a binary description
about the visual hull. The lack of interpolation information along each
voxel edge, which is required by the marching cubes algorithm, usually
results in an inaccurate and bumpy surface mesh. In this paper, we propose
a novel method to efficiently estimate the exact intersections between
voxel edges and the visual hull boundary, which replace the missing
interpolation information. The method improves both the visual quality
and accuracy of the estimated visual hull mesh, while retaining the
simplicity and robustness of the volumetric approach. To verify this
claim, we present both synthetic and real-world experiments, as well as
comparisons with existing volumetric approaches and other approaches
targeting an exact visual hull reconstruction.
|
|
Paper Nr.: |
218
|
Title: |
ROBUST MULTI-TARGET TRACKING USING MEAN SHIFT AND PARTICLE FILTER WITH TARGET MODEL UPDATE
|
Author(s): |
Hong Liu, Jintao Li, Yueliang Qian and Qun Liu |
Abstract: |
We
propose a novel multiple-target tracking algorithm combining Mean
Shift and Particle Filter, and enhance the performance with a target
model update process. Mean Shift has a low complexity, but is weak in
dealing with multi-modal probability density functions (pdfs). Particle
Filter is robust to partial occlusion and can deal with multi-modal
pdfs. In real applications, illumination conditions, the visual angle
and object occlusion can change target appearance and thus influence
the quality of Particle Filter. For the multi-target tracking task, the
mutual occlusion of targets and computational complexity are important
problems for the tracking system. In this paper, the Mean Shift algorithm is
embedded into the Particle Filter framework to get stable tracking and
reduce computational load. To overcome the target appearance changes
caused by illumination changes and object occlusion, target models are
updated adaptively during tracking. Experimental results show that our
tracking system can robustly track multiple targets with mutual
occlusion and correctly maintain their identities with a smaller number
of particles than Particle Filter. |
|
Paper Nr.: |
222
|
Title: |
A PDES METHOD PRESERVING BOUNDARIES ON DENSE DISPARITY MAP RECONSTRUCTION
|
Author(s): |
Ji Liu, Junjian Peng, Yuechao Wang and Yandong Tang |
Abstract: |
Over-smoothing
restricts the application of PDEs in the field of dense
disparity map reconstruction, because disparity map reconstruction
usually requires preserving discontinuities in some areas, such as the
boundaries of objects. To preserve disparity discontinuities, this
paper adopts two strategies. Firstly, ground control points (GCPs) are
introduced as a soft constraint. Secondly, this paper designs a
structure for the smoothness part of the energy functional that preserves
discontinuities effectively. Moreover, the adjustable parameters in
the smoothness part improve its robustness. In experiments, we compare
the proposed method with the graph cuts method and show that PDEs are also a
useful solution for disparity map reconstruction, with an advantage
in dealing with smooth images. |
|
Paper Nr.: |
228
|
Title: |
PRINCIPLED DETECTION-BY-CLASSIFICATION FROM MULTIPLE VIEWS
|
Author(s): |
Jérôme Berclaz, François Fleuret and Pascal Fua |
Abstract: |
Machine-learning based classification techniques have been shown to be effective
at detecting objects in complex scenes. However, the final results are often
obtained from the alarms produced by the classifiers through a post-processing
which typically relies on ad hoc heuristics. Spatially close alarms are
assumed to be triggered by the same target and grouped together.
Here we replace those heuristics by a principled Bayesian approach, which uses
knowledge about both the classifier response model and the scene geometry to
combine multiple classification answers. We demonstrate its effectiveness for
multi-view pedestrian detection.
We estimate the marginal probabilities of presence of people at any location in
a scene, given the responses of classifiers evaluated in each view. Our approach
naturally takes into account both the occlusions and the very low metric
accuracy of the classifiers due to their invariance to translation and scale.
Results show our method produces one order of magnitude fewer false positives
than a method that is representative of typical state-of-the-art
approaches. Moreover, the framework we propose is generic and could be applied
to any detection-by-classification task. |
|
Paper Nr.: |
229
|
Title: |
STRUCTURE FROM OMNIDIRECTIONAL STEREO RIG MOTION FOR CITY MODELING
|
Author(s): |
Michal Havlena, Tomáš Pajdla and Kurt Cornelis |
Abstract: |
This
paper deals with a step towards a 3D reconstruction system for city
modeling from omnidirectional video sequences using structure from
motion together with stereo constraints. We concentrate on two issues.
First, we show how the tracking and reconstruction paradigm was
adapted to use omnidirectional images taken by lenses with a 180-degree
field of view. This concerns mainly camera calibration transforming the
pixel locations into rays and solving the minimal problem for 3D-to-2D
matches using RANSAC. Secondly, we compare the results of the
reconstruction using additional stereo constraints to the results when
these constraints are not used and show that they are needed to make
the reconstruction stable. Performance of the system is demonstrated on
a sequence of 870 images acquired while driving in a city. |
|
Paper Nr.: |
230
|
Title: |
3D HUMAN FACE MODELLING FROM UNCALIBRATED IMAGES USING SPLINE BASED DEFORMATION
|
Author(s): |
Nikos Barbalios, Nikos Nikolaidis and Ioannis Pitas |
Abstract: |
Accurate
and plausible 3D face reconstruction remains a difficult problem up to
this day, despite the tremendous advances in computer technology and
the continuous growth of the applications utilizing 3D face models
(e.g. biometrics, movies, gaming). In this paper, a two-step technique
for efficient 3D face reconstruction from a set of face images acquired
using an uncalibrated camera is presented. Initially, a robust
structure from motion (SfM) algorithm is applied over a set of manually
selected salient image features to retrieve an estimate of their 3D
coordinates. These estimates are further utilized to deform a generic
3D face model, using smoothing splines, and adapt it to the
characteristics of a human face. |
|
Paper Nr.: |
266
|
Title: |
KLT TRACKING USING INTRINSIC AND EXTRINSIC CAMERA PARAMETERS IN CONSIDERATION OF UNCERTAINTY
|
Author(s): |
Michael Trummer, Joachim Denzler and Christoph Munkelt |
Abstract: |
Feature tracking is an important task in computer vision, especially for 3D reconstruction applications. Such
procedures can be run in environments with a controlled sensor, e.g. a robot arm with camera. This yields
the camera parameters as special knowledge that should be used during all steps of the application to improve
the results. As a first step, KLT (Kanade-Lucas-Tomasi) tracking (and its variants) is an approach widely
accepted and used to track image point features. So, it is straightforward to adapt KLT tracking in a way
that camera parameters are used to improve the feature tracking results. The contribution of this work is an
explicit formulation of the KLT tracking procedure incorporating known camera parameters. Since practical
applications do not run without noise, the uncertainty of the camera parameters is regarded and modeled within
the procedure. Comparative practical experiments have been performed and the results are presented. |
|
Paper Nr.: |
265
|
Title: |
FEATURE SETS FOR PEOPLE AND LUGGAGE RECOGNITION IN AIRPORT SURVEILLANCE UNDER REAL-TIME CONSTRAINTS
|
Author(s): |
J. Rosell-Ortega, G. Andreu-García, A. Rodas-Jordà, V. Atienza-Vanacloig and J. Valiente-González |
Abstract: |
We study two different sets of features with the aim of classifying
objects from videos taken in the halls and corridors of an airport.
Objects are classified as being one of three different classes: single
person, group of people, and luggage. We have used two different
feature sets, one set based on classical geometric features, and
another based on dividing the blob into several cells and calculating
the density of foreground pixels in each cell. In both cases, easily
computed features were selected because our system must run under
real-time constraints. During the development of the algorithms, we
also studied whether shadows affect the classification rate of objects. We
achieved this by applying two shadow removal algorithms to estimate the
usefulness of such techniques under real-time constraints. |
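The second feature set in this abstract, obtained by dividing the blob into cells and measuring the foreground density of each, can be sketched as follows. The sketch assumes the blob is available as a binary mask cropped to its bounding box and reads "density" as the fraction of foreground pixels per cell; the grid size and the function name `cell_densities` are hypothetical and not taken from the paper.

```python
def cell_densities(mask, n_rows, n_cols):
    """Split a binary blob mask into n_rows x n_cols cells and return
    the fraction of foreground pixels in each cell, in row-major order.

    `mask` is a list of rows containing 0 (background) or 1 (foreground).
    """
    height, width = len(mask), len(mask[0])
    features = []
    for i in range(n_rows):
        r0, r1 = i * height // n_rows, (i + 1) * height // n_rows
        for j in range(n_cols):
            c0, c1 = j * width // n_cols, (j + 1) * width // n_cols
            area = (r1 - r0) * (c1 - c0)
            foreground = sum(mask[r][c]
                             for r in range(r0, r1)
                             for c in range(c0, c1))
            # Empty cells (mask smaller than the grid) get density 0.
            features.append(foreground / area if area else 0.0)
    return features
```

Each cell costs one pass over its pixels, so the whole feature vector is linear in the blob area, which is consistent with the real-time constraint the abstract emphasises.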
|
Paper Nr.: |
269
|
Title: |
CALIBRATION-FREE EYE GAZE DIRECTION DETECTION WITH GAUSSIAN PROCESSES
|
Author(s): |
Basilio Noris, Karim Benmachiche and Aude G. Billard |
Abstract: |
In
this paper we present a solution for eye gaze detection from a wireless
head mounted camera designed for children aged between 6 months and 18
months. Due to the constraints of working with very young children, the
system does not seek to be as accurate as other state-of-the-art eye
trackers; however, it requires no calibration process from the wearer.
Gaussian Process Regression and Support Vector Machines are used to
analyse the raw pixel data from the video input and return an estimate
of the child's gaze direction. A confidence map is used to determine
the accuracy the system can expect for each coordinate on the image. |
|
Paper Nr.: |
281
|
Title: |
A MAXIMUM LIKELIHOOD SURFACE NORMAL ESTIMATION ALGORITHM FOR HELMHOLTZ STEREOPSIS
|
Author(s): |
Jean-Yves Guillemaut, Ondřej Drbohlav, John Illingworth and Radim Šára |
Abstract: |
Helmholtz
stereopsis is a relatively recent reconstruction technique which is
able to reconstruct scenes with arbitrary and unknown surface
reflectance properties. Conventional implementations of the method
estimate surface normal direction at each surface point via an
eigenanalysis, thereby optimising an algebraic distance. We develop a
more physically meaningful radiometric distance whose minimisation is
shown to yield a Maximum Likelihood surface normal estimate. The
proposed method produces more accurate results than algebraic methods
on synthetic imagery and yields excellent reconstruction results on
real data. Our analysis explains why, for some imaging configurations,
a sub-optimal algebraic distance can yield good results. |
|
Paper Nr.: |
283
|
Title: |
LUCAS-KANADE INVERSE COMPOSITIONAL USING MULTIPLE BRIGHTNESS AND GRADIENT CONSTRAINTS
|
Author(s): |
Ahmed Fahad and Tim Morris |
Abstract: |
A
recently proposed fast image alignment algorithm is the inverse
compositional algorithm based on Lucas-Kanade. In this paper, we
present an overview of different brightness and gradient constraints
used with the inverse compositional algorithm. We also propose an
efficient and robust data constraint for the estimation of global
motion from image sequences. The constraint combines brightness and
gradient constraints under multiple quadratic errors. The method can
accommodate various motion models. We concentrate on the global
efficiency of the constraint in capturing the global motion for image
alignment. We have applied the algorithm to various test sequences with
ground truth. From the experimental results we conclude that the new
constraint provides reduced motion error at the expense of extra
computations. |
|
Paper Nr.: |
287
|
Title: |
VIEW-BASED ROBOT LOCALIZATION USING ILLUMINATION-INVARIANT SPHERICAL HARMONICS DESCRIPTORS
|
Author(s): |
Holger Friedrich, David Dederscheck, Martin Mutz and Rudolf Mester |
Abstract: |
In
this work we present a view-based approach for robot self-localization
using a hemispherical camera system. We use view descriptors that are
based upon Spherical Harmonics as orthonormal basis functions on the
sphere. The resulting compact representation of the image signal
enables us to efficiently compare the views taken at different
locations. With the view descriptors stored in a database, we compute a
similarity map for the current view by means of a suitable distance
metric. Advanced statistical models based upon PCA, introduced into that
distance metric, also allow us to deal with even severe illumination
changes, which extends our method to real-world applications. |
|
Paper Nr.: |
289
|
Title: |
MEASUREMENT NOISE IN PHOTOMETRIC STEREO BASED SURFACE RECONSTRUCTION
|
Author(s): |
Toni Kuparinen, Ville Kyrki and Pekka Toivanen |
Abstract: |
In this paper, surface reconstruction techniques for surfaces with
high frequency height variation are studied. Such surfaces are important
for many industrial settings, for example, in paper and textile
manufacturing. Traditionally, photometric stereo methods have been
developed and evaluated on large objects with strong additive Gaussian
noise. The paper derives the effect of white image
noise on gradient fields and proposes denoising the
gradient fields using a Wiener filter. Several known surface
reconstruction methods are evaluated experimentally, with respect to
the effect of the noise, and the boundary conditions of the
reconstruction. The experimental results validate that the proposed
approach improves the surface reconstruction on surfaces with high
frequency height variation. |
|
Paper Nr.: |
293
|
Title: |
AN EFFICIENT SENSOR FOR TRAFFIC MONITORING AND TRACKING APPLICATIONS
|
Author(s): |
Nikolaos Zournis-Karouzos, Alexandra Koutsia, Kosmas Dimitropoulos and Nikos Grammalidis |
Abstract: |
We
propose a novel video sensor for real-time motion detection at specific
user-defined regions of interest, designed primarily for traffic
monitoring, surveillance and tracking applications. The ultimate goal
is to extend the capabilities and to alleviate shortcomings of embedded
motion detection video sensors (like Autoscope®) for target tracking
and surveillance applications, including road traffic monitoring or
Advanced Surface Movement, Guidance and Control Systems (A-SMGCS) at
airports. Specifically, the new sensor a) supports virtual detectors
with a generalized (polygonal) shape, thus providing additional
flexibility in the design of detector configurations, b) is based on
fast implementations of recent state-of-the-art background extraction
and update techniques and c) constitutes a generic, inexpensive
software solution, which can be used with any video camera. First
experimental results confirm that the new video sensor meets the
expectations in terms of real-time performance and demonstrates the
additional functionalities, according to which it was designed. The
final goal is to use this new sensor as an alternative, improved
version of the Autoscope video sensors for the targeted applications. |
|
Paper Nr.: |
304
|
Title: |
OMNIDIRECTIONAL CAMERA MOTION ESTIMATION
|
Author(s): |
Akihiko Torii and Tomáš Pajdla |
Abstract: |
We present an automatic technique for computing relative camera motion
and simultaneous omnidirectional image matching. Our technique works
for small as well as large motions, tolerates multiple moving objects
and very large occlusions in the scene. We combine three principles
and obtain a practical algorithm which improves the state of the art.
First, we show that the correct motion is found much sooner if the
tentative matches are sampled after ordering them by the similarity of
their descriptors. Secondly, we show that the correct camera motion
can be better found by soft voting for the direction of the motion
than by selecting the motion that is supported by the largest set of
matches. Finally, we show that it is useful to filter out the
epipolar geometries which are not generated by points reconstructed in
front of cameras. We demonstrate the performance of the technique in
an experiment with 189 image pairs acquired in a city and in a park.
All camera motions were recovered with a motion direction error
smaller than 8 degrees, which is 4% of the 183 degree
field of view, w.r.t. the ground truth.
|
|
Paper Nr.: |
317
|
Title: |
PROBABILISTIC APPEARANCE-BASED NAVIGATION OF A MOBILE ROBOT
|
Author(s): |
Luis Payá, Oscar Reinoso, Arturo Gil, M. Asuncion Vicente and Jose L. Aznar |
Abstract: |
This
work presents an appearance-based approach to route following in
multi-robot systems, using the information captured by a conventional
forward-looking camera. In the teaching phase, the most relevant
information along the route is stored using incremental Principal
Components Analysis (PCA). Thanks to this approach, the follower robot
can begin the route while the leader is still recording it and follow
it at a distance in either time or space. The follower robot performs a
self-localization process, comparing the current view with the
information stored in the database, using a probabilistic approach that
takes into account the current sensory input and the previous position.
Then, a fuzzy controller is in charge of calculating the speed and
turning to follow the route. The inputs of this controller are also
obtained from the visual information. Experimental results have shown
the robustness of the algorithms in an office environment. |
|
Paper Nr.: |
322
|
Title: |
AUTONOMOUS MODEL-BASED OBJECT IDENTIFICATION & CAMERA POSITION ESTIMATION WITH APPLICATION TO AIRPORT LIGHTING QUALITY CONTROL
|
Author(s): |
James H. Niblock, Jian-Xun Peng, Karen R. McMenemy and George W. Irwin |
Abstract: |
The
development of an autonomous system for the accurate measurement of the
quality of aerodrome ground lighting (AGL) in accordance with current
standards and recommendations is presented. The system is composed of
an imager which is placed inside the cockpit of an aircraft to record
images of the AGL during a normal descent to an aerodrome. Before the
performance of the AGL is assessed, it is first necessary to uniquely
identify each luminaire within the image and track it through the
complete image sequence. A model-based (MB) methodology is used to
ascertain the optimum match between a template of the AGL and the
actual image data. Projective geometry, in addition to the image and
real world location of the extracted luminaires, is then used to
calculate the position of the camera at the instant the image was
acquired. Algorithms are also presented which model the distortion
apparent within the sensor's optical system and average the camera's
intrinsic parameters over multiple frames, so as to minimise the
effects of noise on the acquired image data and hence make the camera's
estimated position and orientation more accurate. The positional
information is validated using actual approach image data. |
|
Paper Nr.: |
328
|
Title: |
MULTI-CAMERA DETECTION AND MULTI-TARGET TRACKING - Traffic Surveillance Applications
|
Author(s): |
R. Reulke, S. Bauer, T. Döring and R. Spangenberg |
Abstract: |
Non-intrusive
video-detection for traffic flow observation and surveillance is the
primary alternative to conventional inductive loop detectors. Video
Image Detection Systems (VIDS) can derive traffic parameters by means
of image processing and pattern recognition methods. Existing VIDS
emulate the inductive loops. We propose a trajectory based recognition
algorithm to expand the common approach and to obtain new types of
information (e.g. queue length or erratic movements). Different views
of the same area by more than one camera sensor are necessary, because
of the typical limitations of single camera systems, resulting from
occlusions by other cars, trees and traffic signs. A distributed
cooperative multi-camera system enables a significant enlargement of
the observation area. The trajectories are derived from multi-target
tracking. The fusion of object data from different cameras will be done
by a tracking approach. This approach opens up opportunities to
identify and specify traffic objects, their location, speed and other
characteristic object information. The system creates new derived and
consolidated information of traffic participants. Thus, descriptions
of individual traffic participants are also possible. |
|
Paper Nr.: |
347
|
Title: |
RANDOM FOREST CLASSIFIERS FOR REAL-TIME OPTICAL MARKERLESS TRACKING
|
Author(s): |
Iñigo Barandiaran, Charlotte Cottez, Céline Paloc and Manuel Graña |
Abstract: |
Augmented
reality (AR) is a very promising technology that can be applied in many
areas such as healthcare, broadcasting or manufacturing industries. One
of the bottlenecks of such applications is a robust real-time optical
markerless tracking strategy. In this paper we focus on the development
of tracking by detection for plane homography estimation. Feature or
keypoint matching is a critical task in such an approach. We propose to
apply machine learning techniques to solve this problem. We present an
evaluation of an optical tracking implementation based on a Random Forest
classifier. The implementation has been successfully applied to indoor
and outdoor augmented reality design review applications. |
|
Paper Nr.: |
359
|
Title: |
CAMERA MOTION ESTIMATION USING PARTICLE FILTERS
|
Author(s): |
Symeon Nikitidis, Stefanos Zafeiriou and Ioannis Pitas |
Abstract: |
In this paper a novel algorithm for estimating the parametric form of
the camera motion is proposed. In particular, a novel stochastic vector
field model is proposed which can handle smooth motion patterns derived
from long periods of stable camera movement and can also cope with rapid
motion changes and periods where the camera remains still. A set of rules for
robust and online updating of the model parameters is also proposed,
based on the Expectation Maximization algorithm. Finally, we fit this
model into a particle filter framework, in order to predict the future camera
motion based on current and prior knowledge. Extensive experimental
results verify the usefulness of the proposed scheme in camera motion
pattern classification and in accurate estimation of the camera 2D affine
transform parameters. |
|
Paper Nr.: |
364
|
Title: |
ITERATIVE RIGID BODY TRANSFORMATION ESTIMATION FOR VISUAL 3-D OBJECT TRACKING
|
Author(s): |
Micha Hersch, Thomas Reichert and Aude Billard |
Abstract: |
We present a novel yet simple 3D stereo vision tracking
algorithm which computes the position and orientation of an object from the
location of markers attached to the object. The novelty of this algorithm is
that it does not assume that the markers are tracked synchronously. This
provides higher robustness to noise in the data, missing points and
outliers. The principle of the algorithm is to perform a simple gradient
descent on the rigid body transformation describing the object position and
orientation. This is proved to converge to the correct solution and is
illustrated in a simple experimental setup involving two USB cameras. |
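The principle stated in this abstract, a gradient descent on the rigid body transformation that uses one marker observation at a time, can be sketched as follows. This is a deliberately simplified 2D analogue (one rotation angle plus a translation) rather than the 3D algorithm of the paper; the function names, the step size `rate`, the marker layout and the noise-free observations are all assumptions made for illustration.

```python
import math


def transform(pose, marker):
    """Apply the 2D rigid transform pose = (theta, tx, ty) to a marker."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    return (c * marker[0] - s * marker[1] + tx,
            s * marker[0] + c * marker[1] + ty)


def update_pose(pose, marker, observed, rate=0.1):
    """One gradient step on the squared error of a single marker."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    mx, my = marker
    px, py = transform(pose, marker)
    # Residual between the predicted and the observed marker position.
    rx, ry = px - observed[0], py - observed[1]
    # Derivative of the rotated marker with respect to theta.
    dx, dy = -s * mx - c * my, c * mx - s * my
    return (theta - rate * 2 * (rx * dx + ry * dy),
            tx - rate * 2 * rx,
            ty - rate * 2 * ry)


# Hypothetical marker layout in the object frame and a ground-truth pose.
markers = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
true_pose = (0.5, 2.0, -1.0)

pose = (0.0, 0.0, 0.0)
for step in range(2000):
    # Markers arrive one at a time, so no synchronous tracking is needed.
    m = markers[step % len(markers)]
    pose = update_pose(pose, m, transform(true_pose, m))
```

Because each update needs only a single marker observation, a missing marker simply means that no update is made for it in that frame.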
|
Paper Nr.: |
374
|
Title: |
ANOMALY DETECTION WITH LOW-LEVEL PROCESSES IN VIDEOS
|
Author(s): |
Ákos Utasi and László Czúni |
Abstract: |
In
our paper we deal with the problem of low-level motion modeling and
unusual event detection in urban surveillance videos. We model the
direction of optical flow vectors at image pixels. We implemented and
tested probability based approaches such as probability estimation,
Mixture of Gaussians modeling, and spatial averaging (with Mean-shift
segmentation). We propose a Markovian prior to get reliable
spatio-temporal support. We tested the techniques on synthetic and
real video sequences. |
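A minimal sketch of the simplest of the approaches listed above, direct probability estimation of flow direction at a single pixel, is given below. The class name `DirectionModel`, the number of bins, the Laplace smoothing and the threshold are assumptions chosen for illustration; the Mixture of Gaussians model, the Mean-shift averaging and the Markovian prior of the paper are not reproduced here.

```python
import math


class DirectionModel:
    """Histogram of optical-flow directions observed at one pixel."""

    def __init__(self, bins=8):
        self.bins = bins
        self.counts = [0] * bins
        self.total = 0

    def _bin(self, dx, dy):
        # Map the flow direction to an angle in [0, 2*pi) and quantise it.
        angle = math.atan2(dy, dx) % (2 * math.pi)
        return min(int(angle / (2 * math.pi) * self.bins), self.bins - 1)

    def update(self, dx, dy):
        self.counts[self._bin(dx, dy)] += 1
        self.total += 1

    def probability(self, dx, dy):
        # Laplace smoothing keeps unseen directions at a small non-zero value.
        return (self.counts[self._bin(dx, dy)] + 1) / (self.total + self.bins)

    def is_unusual(self, dx, dy, threshold=0.05):
        return self.probability(dx, dy) < threshold


# Train on hypothetical flow vectors that all point roughly to the right.
model = DirectionModel()
for _ in range(100):
    model.update(1.0, 0.1)

usual = model.is_unusual(1.0, 0.1)      # same direction as the training data
unusual = model.is_unusual(-1.0, 0.2)   # motion against the learned direction
```

In a full system one such model would be kept per pixel (or per block), and the per-pixel decisions would then be regularised spatially and temporally.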
|
Paper Nr.: |
380
|
Title: |
ESTIMATING VEHICLE VELOCITY USING RECTIFIED IMAGES
|
Author(s): |
Cristina Maduro, Katherine Batista, Paulo Peixoto and Jorge Batista |
Abstract: |
In this paper we propose a technique to estimate vehicle velocity, using rectified images that represent a top
view of the highway. To rectify image sequences captured by uncalibrated cameras, this method automatically
estimates two vanishing points using lines from the image plane. This approach requires two known lengths
on the ground plane and can be applied to highways that are fairly straight near the surveillance camera. Once
the background image is rectified it is possible to locate the stripes and boundaries of the highway lanes. This
process may also be used to count vehicles, estimate their velocities and the mean velocity associated with each
of the previously identified highway lanes. |
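Once the image is a metric top view, velocity estimation reduces to simple arithmetic, which the following sketch illustrates. It assumes that a ground scale in metres per pixel has already been derived from the known lengths on the ground plane; the function names, the scale of 0.1 m per pixel and the frame rate are hypothetical values, not figures from the paper.

```python
import math


def speed_kmh(p0, p1, metres_per_pixel, frames_elapsed, fps):
    """Speed of a vehicle from two positions in a rectified (top-view) image."""
    pixels = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    seconds = frames_elapsed / fps
    return pixels * metres_per_pixel / seconds * 3.6  # m/s to km/h


def mean_lane_speed(speeds):
    """Mean velocity of the vehicles observed in one lane."""
    return sum(speeds) / len(speeds)


# A vehicle moves 50 pixels along the lane in 5 frames at 25 frames per second.
v = speed_kmh((100, 400), (100, 350), metres_per_pixel=0.1,
              frames_elapsed=5, fps=25)
lane_mean = mean_lane_speed([v, 80.0, 100.0])
```

With these assumed numbers the vehicle covers 5 m in 0.2 s, which corresponds to about 90 km/h.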
|
Paper Nr.: |
401
|
Title: |
LONG-TERM VS. GREEDY ACTION PLANNING FOR COLOR LEARNING ON A MOBILE ROBOT
|
Author(s): |
Mohan Sridharan and Peter Stone |
Abstract: |
A major challenge in the path of widespread use of mobile robots is the ability to function autonomously,
learning useful models for environmental features, and adapting these models in accordance with environmental
changes. In this paper, we address an important subtask of robot vision, namely color modeling/learning. We
present and analyze the performance of two algorithms that enable a mobile robot to plan an action sequence to
facilitate color learning: local heuristic planning, and global action selection. We show that global planning,
which maximizes color learning opportunities while minimizing localization errors, provides better performance.
Our approach is fully implemented and tested on the Sony AIBO robots. |
|
Paper Nr.: |
433
|
Title: |
AN ARTICULATED MODEL WITH A KALMAN FILTER FOR REAL TIME VISUAL TRACKING - Application to the Tracking of Pedestrians with a Monocular Camera
|
Author(s): |
Youssef Rouchdy |
Abstract: |
This
work presents a method for the visual tracking of articulated targets
in image sequences in real time. Each part of the target object is
considered as a region of interest and tracked by a parametric
transformation. Prior geometric and dynamic information about the
target is introduced with a Kalman filter to guide the evolution of the
tracking process of regions. An articulated model with two areas is
proposed and applied to track pedestrians in urban image sequences. |
|
|
|
Bayesian Approach for Inverse Problems in Computer Vision
|
Paper Nr.: |
486
|
Title: |
USING LOGARITHMIC OPINION POOLING TECHNIQUES IN BAYESIAN BLIND MULTI-CHANNEL RESTORATION
|
Author(s): |
Bruno Amizic, Aggelos K. Katsaggelos and Rafael Molina |
Abstract: |
In
this paper we examine the use of logarithmic opinion pooling techniques
to combine two observation models that are normally used in
multi-channel image restoration techniques. The combined observation
model is used together with simultaneous autoregression prior models
for the image and blurs to define the joint distribution of image,
blurs and observations. Assuming that all the unknown parameters have
been estimated previously, we use variational techniques to approximate
the posterior distribution
of the real underlying image and the unknown blurs. We will examine the
use of two approximations of the posterior distribution. Experimental
results are used to validate the proposed approach. |
|
Paper Nr.: |
499
|
Title: |
VARIATIONAL BAYES WITH GAUSS-MARKOV-POTTS PRIOR MODELS FOR JOINT IMAGE RESTORATION AND SEGMENTATION
|
Author(s): |
Hacheme Ayasso and Ali Mohammad-Djafari |
Abstract: |
In
this paper, we propose a family of non-homogeneous Gauss-Markov fields
with Potts region labels model for images to be used in a Bayesian
estimation framework, in order to jointly restore and segment images
degraded by a known point spread function and additive noise. The joint
posterior law of all the unknowns (the unknown image, its segmentation
hidden variable and all the hyperparameters) is approximated by
separable probability laws via the variational Bayes technique. This
approximation makes it possible to obtain a practically implementable
joint restoration and segmentation algorithm. We will present some
preliminary results and a comparison with an MCMC Gibbs sampling based
algorithm. |
|
Paper Nr.: |
511
|
Title: |
A MINIMUM ENTROPY IMAGE DENOISING ALGORITHM - Minimizing Conditional Entropy in a New Adaptive Weighted K-th Nearest Neighbor Framework for Image Denoising
|
Author(s): |
Cesario Vincenzo Angelino, Eric Debreuve and Michel Barlaud |
Abstract: |
In this paper we address the image restoration problem in the variational framework. The focus is on
denoising applications. Natural image statistics are consistent with a Markov random field (MRF) model for
the image structure; thus, in a restoration process, attention must be paid to the spatial correlation between
adjacent pixels. The proposed approach minimizes the conditional entropy of a pixel given its neighborhood.
The statistical properties of the image are estimated in a new adaptive weighted k-th
nearest neighbor (AWkNN) framework. Experimental results show the interest of such an approach. Image
quality is evaluated by means of the RMSE measure and the SSIM index, the latter being better adapted to the human visual
system. |
|
|
|
Online Pattern Recognition and Machine Learning Techniques for Computer-Vision Applications
|
Paper Nr.: |
192
|
Title: |
MULTITASK LEARNING - An Application to Incremental Face Recognition
|
Author(s): |
David Masip, Àgata Lapedriza and Jordi Vitrià |
Abstract: |
Face
classification applications usually suffer from two important problems:
the number of training samples from each class is small, and the
final system must usually be extended to recognize new
people. In this paper we introduce a face recognition method that
extends a previous boosting-based classifier by adding new classes while
avoiding the need to retrain the system each time a new person joins
it. The classifier is trained using the multitask learning
principle: multiple verification tasks are trained together, sharing
the same feature space. The new classes are added by taking advantage of
the previously learned structure, so adding new classes is not
computationally demanding. Our experiments with two different data sets
show that the performance does not decrease drastically even when the
number of classes of the base problem is multiplied by a factor of 8. |
|
Paper Nr.: |
418
|
Title: |
AN ONLINE SELF-BALANCING BINARY SEARCH TREE FOR HIERARCHICAL SHAPE MATCHING
|
Author(s): |
N. Tsapanos, A. Tefas and I. Pitas |
Abstract: |
In
this paper we propose a self-balanced binary search tree data structure
for shape matching. This was originally developed as a fast method of
silhouette matching in videos recorded from IR cameras by firemen
during rescue operations. We introduce a similarity measure with which
we can make decisions on how to traverse the tree and backtrack to find
more possible matches. Then we describe every basic operation a binary
search tree can perform, adapted to a tree of shapes. Because the
tree is a self-balanced binary search tree, all operations can be performed in O(log n) time
and are therefore fast and efficient. Finally, we present experimental data
evaluating the performance of our proposed data structure. |
|
Paper Nr.: |
444
|
Title: |
CONTINUOUS LEARNING OF SIMPLE VISUAL CONCEPTS USING INCREMENTAL KERNEL DENSITY ESTIMATION
|
Author(s): |
Danijel Skočaj, Matej Kristan and Aleš Leonardis |
Abstract: |
In this paper we propose a method for continuous learning of simple
visual concepts. The method continuously associates words
describing observed scenes with automatically extracted visual
features. Since in our setting every sample is labelled with
multiple concept labels, and there are no negative examples,
reconstructive representations of the incoming data are used. The
associated features are modelled with kernel density probability
distribution estimates, which are built incrementally. The proposed
approach is applied to the learning of object properties and spatial
relations. |
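The idea of associating words with features through incrementally built kernel density estimates can be sketched in a strongly simplified form. The sketch assumes a single one-dimensional feature (for example a hue value), Gaussian kernels of fixed bandwidth, and one kernel stored per sample; the class and function names are hypothetical, and the feature selection and model compression that a practical incremental method would need are omitted.

```python
import math


class IncrementalKDE:
    """1D Gaussian kernel density estimate that grows one sample at a time."""

    def __init__(self, bandwidth=0.1):
        self.bandwidth = bandwidth
        self.samples = []

    def add(self, x):
        self.samples.append(x)

    def density(self, x):
        if not self.samples:
            return 0.0
        h = self.bandwidth
        norm = 1.0 / (len(self.samples) * h * math.sqrt(2 * math.pi))
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2)
                          for s in self.samples)


concepts = {}


def observe(feature, words):
    """Update the model of every word that labels this sample."""
    for word in words:
        concepts.setdefault(word, IncrementalKDE()).add(feature)


def best_word(feature):
    """Return the word whose density is highest at the given feature value."""
    return max(concepts, key=lambda w: concepts[w].density(feature))


# Hypothetical hue values paired with the words used to describe the scene.
observe(0.02, ["red"])
observe(0.04, ["red"])
observe(0.33, ["green"])
observe(0.36, ["green"])
```

Only positive examples are used: each word is modelled by the density of the features seen with it, so no negative examples are required.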
|
Paper Nr.: |
447
|
Title: |
ONLINE LEARNING OF GAUSSIAN MIXTURE MODELS - A Two-Level Approach
|
Author(s): |
Arnaud Declercq and Justus H. Piater |
Abstract: |
We
present a method for incrementally learning mixture models that avoids
the necessity to keep all data points around. It contains a single
user-settable parameter that controls via a novel statistical criterion
the trade-off between the number of mixture components and the accuracy
of representing the data. A key idea is that each component of the
(non-overfitting) mixture is in turn represented by an underlying
mixture that represents the data very precisely (without regard to
overfitting); this allows the model to be refined without sacrificing
accuracy. |
|
Paper Nr.: |
454
|
Title: |
TIME DEPENDENT ON-LINE BOOSTING FOR ROBUST BACKGROUND MODELING
|
Author(s): |
Helmut Grabner, Christian Leistner and Horst Bischof |
Abstract: |
In
modern video surveillance systems, change and outlier detection are of
the highest interest. Most of these systems are based on standard
pixel-by-pixel background modeling approaches. In this paper, we
propose a novel robust block-based background model that is suitable
for outlier detection, using an extension to on-line boosting for
feature selection. In order to be robust and still easy to operate, our
system incorporates several novelties in both previously proposed on-line
boosting algorithms and classifier-based background modeling systems.
We introduce time-dependency and control into on-line boosting. Our
system allows for automatically adjusting its temporal behavior to the
underlying scene by using a control system which regulates the model
parameters. The benefits of our approach are illustrated in several
experiments on challenging standard datasets. |
|
|
|
VISAPP International Workshop on Robotic Perception (VISAPP-RoboPerc08)
|
Paper Nr.: |
412
|
Title: |
Comparing Two Action Planning Approaches for Color Learning on a Mobile Robot
|
Author(s): |
Mohan Sridharan and Peter Stone |
Abstract: |
A
major challenge to the deployment of mobile robots in a wide range of
tasks is the ability to function autonomously, learning appropriate
models for environmental features and adapting those models, over time,
in accordance with environmental changes. Such autonomous operation is
feasible iff the robot is able to autonomously select/plan an action
sequence that facilitates learning and adaptation. In this paper, we
focus on the task of color modeling/learning, and present and analyze
two algorithms that enable a mobile robot to plan action sequences that
facilitate color learning. We propose a long-term action selection
approach that maximizes color learning opportunities while minimizing
localization errors over an entire action sequence, and compare it with
a greedy/heuristic action selection approach that plans incrementally,
one step at a time, to maximize the benefits based on the current state
of the world. We show (experimentally) that long-term action selection
results in a more principled solution that requires minimal human
supervision, and that better failure recovery can be achieved by
incorporating some features of the greedy planning approach as well.
All algorithms are fully implemented and tested on the Sony AIBO robots. |
|
Paper Nr.: |
505
|
Title: |
Implementation of an Intentional Vision System to support Cognitive Architectures
|
Author(s): |
Ignazio Infantino, Carmelo Lodato, Salvatore Lopes and Filippo Vella |
Abstract: |
An
effective cognitive architecture has to be able to model, recognize and
interpret user wills. The aim of the proposed framework is the
development of an intentional vision system oriented to man-machine
interaction. Such a system will be able to recognize user faces and to
recognize and track human postures using video cameras. It could be
integrated in a cognitive software architecture, and could be tested
in several demonstrative scenarios such as domotics, entertainment
robotics, and so on. The described framework is organized in two
modules mapped on the corresponding outputs to obtain: intentional
perception of faces; intentional perception of human body movements.
Moreover, a possible integration of the intentional vision module in a
complete cognitive architecture is proposed. |
|
Paper Nr.: |
512
|
Title: |
Data Fusion by Uncertain Projective Geometry in 6DoF Visual SLAM
|
Author(s): |
Daniele Marzorati, Matteo Matteucci, Davide Migliore and Domenico G. Sorrenti |
Abstract: |
In
this paper we face the issue of fusing 3D data from different sensors
in a seamless way, using the unifying framework of uncertain projective
geometry. Within this framework it is possible to describe, combine,
and estimate various types of geometric elements (2D and 3D points, 2D
and 3D lines, and 3D planes) taking their uncertainty into account.
Because of the size of the data involved in this process, the
integration process and thus the SLAM algorithm turns out to be very
slow. For this reason, in this work, we propose the use of an R*-Tree data
structure to speed up the whole process, managing in an efficient way
both the estimated map and the 3D point clouds coming out from the
stereo camera.
The experimental section shows that the use of uncertain projective
geometry and the R*-Tree data structure improves the mapping and the
pose estimation. |
|
Paper Nr.: |
513
|
Title: |
Mutual Calibration of a Camera and a Laser Rangefinder
|
Author(s): |
Vincenzo Caglioti, Alessandro Giusti and Davide Migliore |
Abstract: |
We present a novel geometrical method for mutually calibrating a
camera and a laser rangefinder by exploiting the image of the laser dot in relation
to the rangefinder reading.
Our method simultaneously estimates all intrinsic parameters of a pinhole natural
camera, its position and orientation w.r.t. the rangefinder axis, and four parame-
ters of a very generic rangefinder model with one rotational degree of freedom.
The calibration technique uses data from at least 5 different rangefinder rota-
tions: for each rotation, at least 3 different observations of the laser dot and the
respective rangefinder reading are needed. Data collection is simply performed
by generically moving the rangefinder-camera system, and does not require any
calibration target, nor any knowledge of the environment or motion.
We investigate the theoretical limits of the technique as well as its practical ap-
plication; we also show extensions for using more data than strictly necessary or
exploit a priori knowledge of some parameters.
|
|
Paper Nr.: |
514
|
Title: |
Integration of Tracked and Recognized Features for Locally and Globally Robust Structure from Motion
|
Author(s): |
Chris Engels, Friedrich Fraundorfer and David Nistér |
Abstract: |
We
present a novel approach to structure from motion that integrates wide
baseline local features with tracked features to rapidly and robustly
reconstruct scenes from image sequences. Rather than assume we can
create and maintain a consistent and drift-free reconstructed map over
an arbitrarily long sequence, we instead create small, independent
submaps generated over short periods of time and attempt to link the
submaps together via recognized features. The tracked features provide
accurate pose estimates frame to frame, while the recognizable local
features stabilize the estimate over larger baselines and provide a
context for linking submaps together. As each frame in the submap is
inserted, we apply real-time bundle adjustment to maintain a high
accuracy for the submaps. Recent advances in feature-based object
recognition enable us to efficiently localize and link new submaps into
a reconstructed map within a localization and mapping context. Because
our recognition system can operate efficiently on many more features
than previous systems, our approach easily scales to larger maps. We
provide results that show that accurate structure and motion estimates
can be produced from a handheld camera under shaky camera motion. |
|
Paper Nr.: |
515
|
Title: |
Pose Clustering From Stereo Data
|
Author(s): |
Ulrich Hillenbrand |
Abstract: |
This
article describes an algorithm for pose or motion estimation based on
clustering of parameters in the 6-dimensional pose space. The parameter
samples are computed from data samples randomly drawn from stereo data
points. The estimator is global and robust, performing matches to parts
of a scene without prior pose information. It is general, in that it
does not require any particular object features. Empirical object
models can be built largely automatically. An implemented application
from the service robotic domain and a quantitative performance study on
real data are presented. |
|
|
|
The First International Workshop on Metadata Mining for Image Understanding (MMIU 2008)
|
Paper Nr.: |
411
|
Title: |
Combining Visual and Text Features for Learning in Multimedia Direct Marketing Domain
|
Author(s): |
Sebastiano Battiato, Giovanni Maria Farinella, Giovanni Giuffrida, Catarina Sismeiro and Giuseppe Tribulato |
Abstract: |
Direct marketing companies systematically dispatch the offers under
consideration to a limited sample of potential buyers, rank them with respect to
their performance and, based on this ranking, decide which offers to send to the
wider population. Though this pre-testing process is simple and widely used, recently
the direct marketing industry has been under increased pressure to further
optimize learning, in particular when facing severe time and space constraints.
Taking into account the multimedia nature of offers, which typically comprise
both a visual and text component, we propose a two-phase learning strategy based
on a cascade of regression methods. This proposed approach takes advantage of
visual and text features to improve and accelerate the learning process. Experiments
in the domain of a commercial Multimedia Messaging Service (MMS)
show the effectiveness of the proposed methods that improve on classical learning
techniques. |
|
Paper Nr.: |
417
|
Title: |
Automatic Image Annotation using Visual Content and Folksonomies
|
Author(s): |
Roland Mörzinger, Robert Sorschag, Georg Thallinger and Stefanie Lindstaedt |
Abstract: |
Automatic
image annotation is an important and challenging task in content-based
image retrieval.
This paper describes techniques for automatic image annotation by
taking advantage of collaboratively annotated image databases,
so-called visual folksonomies. Our approach includes a classification and
tag propagation system using content-based image analysis.
Classification annotates images with a controlled vocabulary while tag
propagation uses user-generated, folksonomic annotations and is
therefore capable of dealing with an unlimited vocabulary. Experiments
with a pool of Flickr images demonstrate the high accuracy and
efficiency of the proposed methods in the task of automatic image
annotation.
|
|
Paper Nr.: |
422
|
Title: |
Computational
Linguistics for Metadata Building (CLiMB) Text Mining for the Automatic
Extraction of Subject Terms for Image Metadata
|
Author(s): |
Judith L. Klavans, Tandeep Sidhu, Carolyn Sheffield, Dagobert Soergel,
Jimmy Lin, Eileen Abels and Rebecca Passonneau |
Abstract: |
In
this paper, we present a fully-implemented system using computational
linguistic techniques to apply automatic text mining for the extraction
of metadata for image access. We describe the implementation of a
workbench created for, and evaluated by, image catalogers. We discuss
the current functionality and future goals for this image catalogers’
toolkit, developed under the Computational Linguistics for Metadata
Building (CLiMB) research project. Our primary user group for initial
phases of the project is the cataloger expert; in future work we
address applications for end users. |
|
Paper Nr.: |
425
|
Title: |
Travel
Blog Assistant System (TBAS) - An Example Scenario of how to Enrich
Text with Images and Images with Text using Online Multimedia
Repositories
|
Author(s): |
Marco Bressan, Gabriela Csurka, Yves Hoppenot and Jean-Michel Renders |
Abstract: |
In
this paper we present a Travel Blog Assistant System that facilitates
travel blog writing by automatically selecting, for each blog
paragraph written by the user, the most relevant images from an uploaded
image set. In order to do this, the system first automatically adds
metadata to the traveler's photos based both on a Generic Visual
Categorizer (visual keywords) and by exploiting cross-content web
repositories (textual keywords). For a given paragraph, the system
ranks the uploaded images according to the similarity between the
extracted metadata and the paragraph. The technology developed and
presented here has potential beyond travel blogs, which served just as
an illustrative example. Clearly, the same methodology can be used by
professional users in the fields of multimedia document generation and
automatic illustration and captioning.
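
The ranking step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each image's metadata is a plain list of keywords and that similarity is the cosine between bag-of-words vectors; the abstract does not state which similarity measure the system actually uses, and the names `bag_of_words`, `cosine` and `rank_images` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(paragraph, image_keywords):
    """Return image names ordered by decreasing similarity to the paragraph.

    image_keywords maps an image name to its list of metadata keywords
    (an assumed representation). Ties are broken by image name.
    """
    query = bag_of_words(paragraph)
    scored = [
        (cosine(query, bag_of_words(" ".join(keywords))), name)
        for name, keywords in image_keywords.items()
    ]
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical metadata for three uploaded photos.
photos = {
    "img1.jpg": ["beach", "sea", "sunset"],
    "img2.jpg": ["mountain", "snow", "hiking"],
    "img3.jpg": ["market", "food"],
}
ranking = rank_images(
    "We spent the afternoon hiking up the mountain through fresh snow.", photos
)
```

Here `ranking[0]` is the image whose keywords overlap most with the paragraph; a real system would combine visual and textual keywords and likely weight rare terms more heavily.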
|
|
Paper Nr.: |
437
|
Title: |
Describing the Where – Improving Image Annotation and Search through Geography
|
Author(s): |
Ross S. Purves, Alistair Edwardes and Mark Sanderson |
Abstract: |
Image
retrieval, using either content or text-based techniques, does not
match up to the current quality of standard text retrieval. One
possible reason for this mismatch is the semantic gap – the terms by
which images are indexed do not accord with those imagined by users
querying image databases. In this paper we set out to describe how
geography might help to index the where facet of the Panofsky-Shatford
matrix, which has previously been shown to accord well with the types
of queries users make. We illustrate these ideas with existing (e.g.
identifying place names associated with a set of coordinates) and novel
(e.g. describing images using land cover data) techniques to describe
images and contend that such methods will become central as increasing
numbers of images become georeferenced. |
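
As a rough illustration of the "existing" technique mentioned above (identifying a place name associated with a set of coordinates), the sketch below picks the nearest entry from a small gazetteer using great-circle distance. The gazetteer contents and the function names are assumptions made for this example and are not taken from the paper.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_place(lat, lon, gazetteer):
    """Return the gazetteer name closest to the given coordinates."""
    return min(gazetteer, key=lambda name: haversine_km(lat, lon, *gazetteer[name]))


# Hypothetical gazetteer: place name -> (latitude, longitude) in degrees.
GAZETTEER = {
    "Zurich": (47.3769, 8.5417),
    "Sheffield": (53.3811, -1.4701),
    "Edinburgh": (55.9533, -3.1883),
}

place = nearest_place(47.4, 8.5, GAZETTEER)
```

A linear scan is enough for a toy gazetteer; a realistic one would need a spatial index and some notion of place extent rather than a single point per name.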
|
Paper Nr.: |
440
|
Title: |
Which Strategy to combine Face Identification Tools with Clothing Similarity - Contesting or Reinforcing?
|
Author(s): |
Saïd Kharbouche and Michel Plu |
Abstract: |
This
paper describes a novel and efficient approach that integrates clothing
similarity into the face identification process for personal photos. The
information extracted from people's clothes is helpful when the clothes
are dissimilar; however, it can introduce errors and noise
when several people wear similar clothes. To resolve this problem,
we propose a new methodology that exploits
clothing similarity. The main idea is summarized as follows: if a
person is well identified in a detected face, then instead of reinforcing this
person in every face (in other photos) with similar clothes, we contest
her/him in every face with dissimilar clothes. The weight and
influence that the information extracted from a face in one photo has on
another face depend on the spatiotemporal distance between the photos, the
degree of similarity between the clothes and the level of uncertainty about
the real identities. We use belief function theory to
manage imprecision and uncertainty efficiently. The
results obtained demonstrate the usefulness of our approach.
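
To make the "contesting" idea concrete, here is a minimal sketch of combining two pieces of evidence with Dempster's rule of combination, the standard combination rule in belief function theory. The abstract does not say which combination rule or mass values the authors use, so the rule choice, the two-person frame and all numbers below are assumptions for illustration only.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Products whose focal sets do not intersect count as conflict,
    and the remaining masses are renormalised by (1 - conflict).
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = (
                    combined.get(intersection, 0.0) + mass_a * mass_b
                )
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}


ALICE = frozenset({"alice"})
BOB = frozenset({"bob"})
EITHER = ALICE | BOB  # total ignorance between the two candidates

# Assumed evidence from the face recogniser: moderately in favour of Alice.
m_face = {ALICE: 0.6, EITHER: 0.4}
# Assumed evidence from clothing: the clothes differ from those of a face
# already identified as Alice in a nearby photo, so Alice is contested.
m_clothes = {BOB: 0.5, EITHER: 0.5}

m_joint = combine(m_face, m_clothes)
```

With these numbers the conflict is 0.3, and after normalisation the mass on Alice drops from 0.6 to 3/7, which shows how dissimilar clothing weakens an identification instead of similar clothing strengthening one.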
|
|
Paper Nr.: |
450
|
Title: |
Improved Image Retrieval using Visual Sorting and Semi-Automatic Semantic Categorization of Images
|
Author(s): |
Kai Uwe Barthel, Sebastian Richter, Anuj Goyal and Andreas Follmann |
Abstract: |
The
increasing use of digital images has led to the growing problem of how
to organize these images efficiently for search and retrieval.
Interpretation of what we see in images is hard to characterize, and
even harder to teach to a machine so that automated organization
becomes possible. As a result, neither keyword-based Internet image search
systems nor content-based image retrieval systems are capable of
searching images according to the high-level human semantics of images.
In this paper we propose a new image search system using keyword
annotations, low-level visual metadata and semantic inter-image
relationships. The semantic relationships are learned exclusively from
the human users’ interaction with the image search system. Our system
can be used to search huge (web-based) image sets more efficiently.
However, the most important advantage of the new system is that it can
be used to semi-automatically generate semantic relationships between
the images. |
|
Paper Nr.: |
453
|
Title: |
Can Feature Information Interaction help for Information Fusion in Multimedia Problems?
|
Author(s): |
Jana Kludas, Eric Bruno and Stephane Marchand-Maillet |
Abstract: |
... |
|
Paper Nr.: |
455
|
Title: |
Extracting Semantic Meaning from Photographic Annotations using a Hybrid Approach
|
Author(s): |
Rodrigo Carvalho, Sam Chapman and Fabio Ciravegna |
Abstract: |
This
paper evaluates singular then hybrid methodologies for extracting
semantics considered to be relevant to users in cataloguing and
searching of personal photographs. This work concentrates upon
extraction of meaningful concepts within textual annotations focusing
around geographical identification, together with references to people
and objects concerning each image. Extraction considers a number of
approaches to achieve this goal: machine learning, rule-based
approaches, and a novel hybrid approach combining both previous
techniques. This evaluation identifies the strengths of the singular
approaches and defines rules best suited to differing extractions
providing a higher performing hybrid method.
|
|
Paper Nr.: |
488
|
Title: |
Functional Semantic Categories for Art History Text - Human Labeling and Preliminary Machine Learning
|
Author(s): |
Rebecca J. Passonneau, Tae Yano, Tom Lippincott and Judith Klavans |
Abstract: |
Descriptive metadata for indexing images of works of art can be classified
by function, for example descriptions of the depicted work versus
discussions of the art historical impact of the work ([7]). Similarly, illustrated art history
survey texts address multiple topics pertaining to a given work. We report on an
effort to develop a set of functional semantic categories to classify text extracts
from art history survey texts, for use in locating specific classes of descriptive
metadata. Each category specifies a distinct relation between the depicted work
and the text, one that indicates the expository purpose the text serves. In a series
of pilot studies, we found that the ability of humans to label text consistently using
our categories varied widely, depending on a wide range of factors such as
the labeler’s area of expertise, the image-text pair under consideration, the constraints
placed on the labeling task, and the method used to introduce labelers to
the categories. Based on these studies, we implemented a labeling interface which
we have used to collect the first 10% to 20% of a large dataset of text that will be
used in training and testing a machine learner. Initial machine learning results on
our pilot data indicate the three most relevant categories are machine learnable. |
|
|
|
The First International Workshop on Image Mining. Theory and Applications (IMTA 2008)
|
Paper Nr.: |
435
|
Title: |
Text-Dependent Speaker Identification using Spectrograms based on Conditional Quantization
|
Author(s): |
Tridibesh Dutta |
Abstract: |
The
goal of this paper is to study a new approach to text-dependent speaker
identification using spectrograms. This mainly revolves around
capturing the complex patterns of variation in frequency and amplitude
over time while an individual utters a given word, through spectrogram
segmentation. These optimally segmented spectrograms are used as a
database to successfully identify the unknown individual from his/her
voice. The identification methodology relies on classification of
spectrograms (of speech signals), based on clustering of the quantized
frequency-time domain features of the database spectrogram samples and
the unknown speech sample. The performance of this novel approach on a
sample collected from 40 speakers shows that this methodology can be
effectively used to produce a desirable success rate. |
|
Paper Nr.: |
439
|
Title: |
Fast Multi-View Evaluation of Data Represented by Symmetric Clusters
|
Author(s): |
Alexander Vinogradov |
Abstract: |
A
new framework is proposed for the fast calculation of linear scalings
posed on structured data. Several widely used types of data
representation based on clusters with intrinsic features of local
symmetry are taken into account. The paper presents some Image Mining
technologies that are used to improve abstract data multi-view
evaluation procedures. |
|
Paper Nr.: |
441
|
Title: |
Search Algorithm and the Distortion Analysis of Fine Details of Real Images
|
Author(s): |
Sai S.V. and Sorokin N.Yu. |
Abstract: |
This
work describes a search algorithm and a method for analysing distortions
of fine details in real images based on objective criteria. |
|
Paper Nr.: |
448
|
Title: |
Elements of a Gestalt Algebra: Steps towards understanding Images and Scenes
|
Author(s): |
Eckart Michaelsen, Michael Arens and Leo Doktorski |
Abstract: |
A
mathematical structure is sketched that is meant to capture the
regularities and hierarchies in the structure of images. The approach
is motivated by difficulties arising from aerial image analysis of
urban terrain. It is not feasible to list and model all possibilities
for things such as buildings that occur in such data. Emanating from
the Gestalt theory of perception, an abstract algebra of operations on
image objects is defined and the formal properties are discussed. It is
intended to build future software systems on such formalisms that will
realize only those gestalt models that are evident from the data and
can build and recognize structures of previously unseen and unexpected
form. |
|
Paper Nr.: |
449
|
Title: |
A Proposal for Automatic Inference of Pressure Ulcers Grade Based on Wound Images and Patient Data
|
Author(s): |
Rinaldo de S. Neves, Simônia F. Silva, Edvar F. Rocha Jr.,
Levy A. Santana, Renato Guadagnin and Edílson Ferneda |
Abstract: |
Pressure
ulcers (PU) occur in a significant number of patients who cannot move
for long periods. Data from patients concerning both their individual
features and wound origin are collected. PU images and medical
diagnoses of PU grade can be stored. Such sets of information can be
submitted to data mining procedures in order to detect
relations between the data. It also seems possible to computationally
generate a PU grade inference that will help medical experts to
carry out therapeutic procedures. The present proposal thus aims to support
the PU diagnosis process and so accelerate the healing process, bringing
important benefits in terms of better quality of life for patients and lower
medical assistance costs. |
|
Paper Nr.: |
484
|
Title: |
Descriptive Approach to Medical Image Analysis - Substantiation and Interpretation
|
Author(s): |
I. Gurevich, V. Yashina, H. Niemann and O. Salvetti |
Abstract: |
The
paper is devoted to the development and formal representation of the
descriptive model of an information technology for automating the morphologic
analysis of cytological specimens (lymphatic system tumors). The main
contributions are a detailed description of the algebraic constructions used
to create a mathematical model of the information technology and its
specification in the form of an algorithmic scheme based on Descriptive
Image Algebras. The descriptive model of an image
recognition task and the stage of reducing an image to a recognizable
form are specified. The theoretical base of the model is the Descriptive Approach to
Image Analysis and its main mathematical tools. A
practical application of the algebraic tools of the Descriptive Approach to
Image Analysis is demonstrated, and an algorithmic scheme of a technology
implementing the apparatus of Descriptive Image Algebras is presented. |
|
Paper Nr.: |
485
|
Title: |
Descriptive Analysis of Image Data: Basic Models
|
Author(s): |
I. Gurevich and V. Yashina |
Abstract: |
The
paper is devoted to the foundations, general methodology, axiomatics
and formal structures of the Descriptive Theory for Image Analysis
(DTIA), which provides a methodology and mathematical and computational
techniques for the automation of image analysis and estimation (IAE). The
main purpose of the theoretical apparatus of the DTIA is to structure the
variety of methods, operations and representations used in IAE.
The final goal of the DTIA is automated image mining: a) automated
selection of techniques and algorithms for image recognition,
estimation, and understanding; b) automated testing of the raw data
quality and its suitability for solving the image recognition problem.
The DTIA provides mathematical fundamentals for image mining. The
axiomatics and formal structures of the DTIA
provide the ways and means to represent and describe images
for their analysis and estimation. The main contributions of the axiomatics
are Descriptive Image Models: their definitions, classification,
properties, interrelations, and conditions of generation. |
|
Paper Nr.: |
495
|
Title: |
Geo-located Image Categorization and Location Recognition
|
Author(s): |
Marco Cristani, Alessandro Perina, Umberto Castellani and Vittorio Murino |
Abstract: |
Image categorization is undoubtedly one of the most recent and
challenging problems faced in Computer Vision. The scientific
literature is full of methods, more or less efficient, each dedicated
to a specific class of images; further, commercial systems are also
beginning to be advertised in the market. Nowadays, additional data can
also be attached to images, enriching their semantic
interpretation beyond pure appearance. This is the case for
geo-location data, which contain information about the geographical
place where an image has been acquired. These data allow, if not
require, a different management of the images, for instance, for the
purpose of easy retrieval from a repository, or of identifying the
geographical place of an unknown picture, given a geo-referenced
image repository. This paper constitutes a first step in this sense,
presenting a method for geo-referenced image categorization, and for
the recognition of the geographical location of an image without
such information available. The solutions presented are based on
robust pattern recognition techniques, such as the probabilistic
Latent Semantic Analysis, the Mean Shift clustering and the Support
Vector Machines. Experiments have been carried out on a couple of
geographical image databases: results are actually very promising,
opening new interesting challenges and applications in this research
field. |
|
Paper Nr.: |
500
|
Title: |
Pearling: Stroke segmentation with crusted pearl strings
|
Author(s): |
B. Whited, J. Rossignac, G. Slabaugh, T. Fang and G. Unal |
Abstract: |
We introduce a novel segmentation technique, called Pearling, for the
semi-automatic extraction of idealized models of networks of strokes (variable
width curves) in images. These networks may for example represent roads in an
aerial photograph, vessels in a medical scan, or strokes in a drawing. The operator
seeds the process by selecting representative areas of good (stroke interior) and
bad colors. Then, the operator may either provide a rough trace through a particular
path in the stroke graph or simply pick a starting point (seed) on a stroke and a
direction of growth. Pearling computes in realtime the centerlines of the strokes,
the bifurcations, and the thickness function along each stroke, hence producing a
purified medial axis transform of a desired portion of the stroke graph. No prior
segmentation or thresholding is required. Simple gestures may be used to trim
or extend the selection or to add branches. The realtime performance and reliability
of Pearling result from a novel disk-sampling approach, which traces the
strokes by optimizing the positions and radii of a discrete series of disks (pearls)
along the stroke. A continuous model is defined through subdivision. By design,
the idealized pearl string model is slightly wider than necessary to ensure that it
contains the stroke boundary. A narrower core model that fits inside the stroke
is computed simultaneously. The difference between the pearl string and its core
contains the boundary of the stroke and may be used to capture, compress, visualize,
or analyze the raw image data along the stroke boundary. |
|
Paper Nr.: |
503
|
Title: |
Automatic Target Retrieval in a Video-Surveillance Task
|
Author(s): |
Davide Moroni and Gabriele Pieri |
Abstract: |
In this paper we face the automatic target search problem. While performing
an object tracking task, we address the problem of identifying a previously
selected target when it is lost due to masking, occlusions, or quick and unexpected
movements. First, a candidate target is identified in the scene through
motion detection techniques; subsequently, using semantic categorization and
content-based image retrieval techniques, the candidate target is checked to determine whether
it is the correct one (i.e. the previously lost target) or not. Content-Based Image Retrieval
serves as support for the search problem and is performed using a reference
database which was populated a priori. |
|
Paper Nr.: |
504
|
Title: |
Learning Probabilistic Models for Recognizing Faces under Pose Variations
|
Author(s): |
M. Saquib Sarfraz and Olaf Hellwich |
Abstract: |
Recognizing
a face from a novel view point poses major challenges for automatic
face recognition. Recent methods address this problem by trying to
model the subject specific appearance change across pose. For this,
however, almost all of the existing methods require a perfect alignment
between a gallery and a probe image. In this paper we present a pose
invariant face recognition method centered on modeling joint appearance
of gallery and probe images across pose in a probabilistic framework.
We propose novel extensions in this direction by introducing a
more robust feature description as opposed to pixel-based appearances.
Using such features we propose to synthesize frontal views from the
non-frontal ones. Furthermore, using local kernel density estimation, instead of
the commonly used normal density assumption, is proposed to derive the
prior models. Our method does not require any strict alignment between
gallery and probe images which makes it particularly attractive as
compared to the existing state of the art methods. Improved recognition
across a wide range of poses has been achieved using these extensions. |
|
Paper Nr.: |
506
|
Title: |
Shape Modeling for the Analysis of Heart Deformation Patterns
|
Author(s): |
Davide Moroni, Sara Colantonio, Ovidio Salvetti and Mario Salvetti |
Abstract: |
In this paper, we present an approach to the description of
time-varying anatomical structures. The main goal is to compactly
but faithfully describe the whole heart cycle in such a way as to allow
for deformation pattern characterization and assessment. Using such
an encoding, a reference database can be built, thus permitting
similarity searches or data mining procedures. |
|
Paper Nr.: |
508
|
Title: |
Media Analysis and the Algorithm Ontology
|
Author(s): |
Patrizia Asirelli, Sara Colantonio, Suzanne Little,
Massimo Martinelli and Ovidio Salvetti |
Abstract: |
Media analysis algorithms are used for a variety of purposes. They
may improve media facets such as contrast or signal-to-noise ratio or extract low-level
details such as MPEG-7 features to be used in data mining and other higher-level
processing. However, algorithms are difficult to manage, understand and
apply, in particular for non-expert users. Therefore we are developing an algorithm
ontology to support identification, aggregation and recording of algorithms
for media analysis. This is especially useful for domains with high volumes of
complex media objects to investigate and integrate. Algorithms for media analysis
may be applied at multiple points within a typical multimedia lifecycle. This
article discusses a proposed algorithm ontology to support identification, retrieval
and application of multimedia analysis processes and its application to metadata
management and multimedia interoperability. |
|
Paper Nr.: |
510
|
Title: |
An Image Mining Medical Warehouse
|
Author(s): |
Sara Colantonio, Igor B. Gurevich, Ovidio Salvetti and Yulia Trusova |
Abstract: |
Advances in medical imaging technologies have assured the availability
of more and more precise and detailed images whose analysis has become a
necessary step in the diagnostic, prognostic and monitoring processes of main
pathologies. Such development has stressed the need for advanced systems that
are not limited to storage and management but include intelligent representation
and retrieval of images. In this paper, we report current results of a medical
warehouse we are developing for mining medical images, thus offering medical
experts and researchers the possibility of storing, retrieving, analyzing and investigating
biomedical images to discover novel knowledge relevant to diagnostic
processes. |
|
|
|
|