VISAPP 2010 Abstracts


Area 1 - Image Formation and Processing

Full Papers
Paper Nr: 16
Title:

NEURAL IMAGE RESTORATION FOR DECODING 1-D BARCODES USING COMMON CAMERA PHONES

Authors:

A. Zamberletti, I. Gallo, M. Carullo and E. Binaghi

Abstract: Existing open-source libraries for 1-D barcode recognition cannot recognize codes in images acquired with simple devices that lack autofocus or a macro function. In this article we present an improvement of an existing algorithm for recognizing 1-D barcodes using camera phones with and without autofocus. A multilayer feedforward neural network trained with the backpropagation algorithm is used for image restoration in order to improve the selected algorithm. The performance of the proposed algorithm was compared with that of available open-source libraries. The results show that our method makes it possible to decode barcodes from images captured by mobile phones without autofocus.

Paper Nr: 43
Title:

ESTIMATION OF CURVATURES IN POINT SETS BASED ON GEOMETRIC ALGEBRA

Authors:

Helmut Seibert, Dietmar Hildenbrand, Meike Becker and Arjan Kuijper

Abstract: For applications like segmentation, feature extraction and classification of point sets it is essential to know the principal curvatures and the corresponding principal directions. For the purpose of curvature estimation conformal geometric algebra promises to be a natural mathematical language: Local curvatures can be described with the help of osculating circles or spheres. On one hand, conformal geometric algebra is able to directly compute with these geometric objects, as well as with lines and planes needed for the description of vanishing curvature. On the other hand, distance measures for fitting these objects into point sets can be handled in a linear way, leading to efficient algorithms. In this paper we use conformal geometric algebra advantageously in order to locally compute continuous curvatures as well as principal curvatures of point sets without the need of costly pre-processing of raw data. We show results on artificial and real data. Numerical verification on artificial data shows the accuracy of our approach. Furthermore, the results are obtained in a fast manner and are also visually satisfactory.

Paper Nr: 63
Title:

REAL-TIME VISUAL ODOMETRY FOR GROUND MOVING ROBOTS USING GPUS

Authors:

Michael Schweitzer, Alois Unterholzner and Hans-Joachim Wuensche

Abstract: This paper introduces a novel visual odometry framework for ground moving robots. Recent work showed that assuming non-holonomic motion can simplify the ego motion estimation task to one yaw and one scale parameter. Furthermore, a very efficient way of computing image frame to frame correspondences for those robots was presented by skipping rotational invariance and optimizing keypoint extraction and matching for massive parallelism on a GPU. Here, we combine both contributions into a closed framework. Long term correspondences are preserved, classified and stabilized by motion prediction, building up and keeping a trusted map of depth-registered keypoints. We also allow other ground moving objects. From this map, the ego motion is inferred, extended by constrained rotational perturbations in pitch and roll. A persistent focus is on keeping algorithms suitable for parallelization and thus achieving up to one hundred frames per second. Experiments are carried out to compare against ground-truth given by DGPS and IMU data.

Paper Nr: 64
Title:

SHAPE FROM SHADINGS UNDER PERSPECTIVE PROJECTION AND TURNTABLE MOTION

Authors:

Miaomiao Liu and Kwan-Yee K. Wong

Abstract: Two-Frame-Theory is a recently proposed method for 3D shape recovery. It estimates shape by solving a first order quasi-linear partial differential equation through the method of characteristics. One major drawback of this method is that it assumes an orthographic camera which limits its application. This paper re-examines the basic idea of the Two-Frame-Theory under the assumption of a perspective camera, and derives a first order quasi-linear partial differential equation for shape recovery under turntable motion. Dynamic programming is used here to provide the Dirichlet boundary condition. The proposed method is tested against synthetic and real data. Experimental results show that perspective projection can be used in the framework of Two-Frame-Theory, and competitive results can be achieved.

Paper Nr: 116
Title:

SHAPE AND SIZE FROM THE MIST - A Deformable Model for Particle Characterization

Authors:

Anders Dahl, Thomas Martini Jørgensen, Phanindra Gundu and Rasmus Larsen

Abstract: Process optimization often depends on the correct estimation of particle size, their shape and their concentration. In case of the backlight microscopic system, which we investigate here, particle images suffer from out-of-focus blur. This gives a bias towards overestimating the particle size when particles are behind or in front of the focus plane. In most applications only in-focus particles get analyzed, but this weakens the statistical basis and requires either particle sampling over longer time or results in uncertain predictions. We propose a new method for estimating the size and the shape of the particles, which includes out-of-focus particles. We employ particle simulations for training an inference model predicting the true size of particles from image observations. This also provides depth information, which can be used in concentration predictions. Our model shows promising results on real data with ground truth depth, shape and size information. The outcome of our approach is a reliable particle analysis obtained from shorter sampling time.

Paper Nr: 124
Title:

SKELETON REPRESENTATION BASED ON COMPOUND BEZIER CURVES

Authors:

Leonid Mestetskiy

Abstract: A new method to describe the skeleton of a polygonal figure is presented. The skeleton is represented as a planar graph, whose edges are linear and quadratic Bezier curves. The description of a radial function in Bezier splines form is given. An algorithm to calculate control polygons of Bezier curves is proposed. Also, we introduce a new representation of skeleton as a straight planar control graph of a compound Bezier curve. We show that such skeleton representation allows simple visualization and easy-to-use skeleton processing techniques for image processing.

Paper Nr: 127
Title:

ADAPTIVE PATCH-BASED INPAINTING FOR IMAGE BLOCK RECOVERY

Authors:

Yunqiang Liu, Jin Wang and Huanhuan Zhang

Abstract: This paper presents an adaptive patch-based inpainting algorithm for image block recovery in block-based coding image transmission. The proposed approach is based on prior information - patch similarity within the image. By taking advantage of the information, we recover the lost pixels by copying pixel values from the source based on a similarity criterion to keep local continuity. The pixel recovery is performed in a sequential fashion in which the recovered pixels can be used in the recovery process afterwards. In order to alleviate the error propagation with sequential recovery, we proposed an adaptive combination strategy which merges different directional recovered pixels according to the confidence of the estimated recovery performance. Experimental results show that the proposed method provides significant gains in both subjective and objective measurements.

Paper Nr: 152
Title:

DESIGN OF A CUSTOMIZED PATTERN FOR IMPROVING COLOR CONSTANCY ACROSS CAMERA AND ILLUMINATION CHANGES

Authors:

Hazem Wannous, Sylvie Treuillet, Yves Lucas, Alamin Mansouri and Yvon Voisin

Abstract: This paper addresses the problem of color constancy on a large image database acquired with varying digital cameras and lighting conditions. Automatic white balance control proposed by an available commercial camera is not sufficient to provide reproducible color classification. A device-independent color representation may be obtained by applying a chromatic adaptation transform, from a calibrated color checker pattern included in the field of view. Instead of using the standard Macbeth color checker, we suggest selecting judicious colors to design a customized pattern from contextual information. A comparative study demonstrates that this approach ensures a stronger constancy of the interesting colors before the vision control.

Paper Nr: 190
Title:

FAST DUAL MINIMIZATION OF WEIGHTED TV + L1-NORM FOR SALT AND PEPPER NOISE REMOVAL

Authors:

S. Jehan-Besson and Jonas Koko

Abstract: In this paper, the minimization of a weighted total variation regularization term (denoted TVg) with L1 norm as the data fidelity term is addressed using the Uzawa block relaxation method. Numerical experiments show the applicability of our algorithm to salt and pepper noise removal and its robustness against the choice of the penalty parameter. This last property is useful to attain convergence in a reduced number of iterations, leading to efficient numerical schemes. The specific role of the function g in the weighted total variation term is also investigated and we show that an appropriate choice leads to a significant improvement of the final denoising results. Using this function, we propose a complete algorithm for salt and pepper noise removal (UBR-EDGE) that is able to handle high noise levels at a low computational cost.

Paper Nr: 202
Title:

CAN ANISOTROPIC IMAGES BE UPSAMPLED?

Authors:

Mads F. Hansen, Thomas H. Mosbech, Hildur Ólafsdóttir, Michael S. Hansen and Rasmus Larsen

Abstract: This paper presents a novel method for upsampling anisotropic medical gray-scale images. The resolution is increased by fitting an image function, modeled by cubic B-splines, to the slices. The method simulates the observed slices with an image function and iteratively updates the function by comparing the simulated slices with observed slices. The approach handles partial voluming by modeling the thickness of the slices. The formulation is ill-posed, and thus a prior needs to be included. Correspondences between adjacent slices are established using a symmetric registration method with a free-form deformation model. The correspondences are then converted into a prior that penalizes gradients along lines of correspondence. Tests on the Shepp-Logan phantom show promising results, and the approach performs better than methods such as cubic interpolation and one-way registration-based interpolation.

Short Papers
Paper Nr: 62
Title:

COMPUTATIONALLY EFFICIENT SERIAL COMBINATION OF ROTATION-INVARIANT AND ROTATION COMPENSATING IRIS RECOGNITION ALGORITHMS

Authors:

Mario Konrad, Herbert Stögner, Andreas Uhl and Peter Wild

Abstract: Rotation compensation is one of the computational bottlenecks in large scale iris-based identification schemes, since a significant amount of Hamming distance computations is required in a single match due to the necessary shifting of the iris codes to compensate for eye tilt. To cope with this problem, a serial classifier combination approach is proposed for iris-based identification, combining rotation-invariant pre-selection with a traditional rotation compensating iris code-based scheme. The primary aim, a reduction of computational complexity, can easily be met - at comparable recognition accuracy, the computational effort required is reduced to 20% or even less of the fully fledged iris code based scheme. As a by-product, the recognition accuracy is shown to be additionally improved in open-set scenarios.
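The cost of rotation compensation described in this abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that an iris code is a flat list of bits and that eye tilt corresponds to a circular shift of that list; the function names and the `max_shift` parameter are hypothetical and are not taken from the paper.

```python
def hamming_distance(code_a, code_b):
    """Fractional Hamming distance between two equal-length bit sequences."""
    if len(code_a) != len(code_b) or not code_a:
        raise ValueError("iris codes must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(code_a, code_b))
    return differing / len(code_a)


def rotation_compensated_distance(code_a, code_b, max_shift):
    """Minimum distance over circular shifts of code_b in [-max_shift, max_shift].

    Each candidate shift costs one full Hamming distance computation,
    so a single match needs 2 * max_shift + 1 of them.
    """
    best = 1.0
    for shift in range(-max_shift, max_shift + 1):
        k = shift % len(code_b)
        shifted = code_b[k:] + code_b[:k]  # circular shift by k positions
        best = min(best, hamming_distance(code_a, shifted))
    return best


if __name__ == "__main__":
    reference = [1, 0, 1, 1, 0, 0, 1, 0]
    tilted = reference[2:] + reference[:2]  # same code, rotated by two positions
    print(hamming_distance(reference, tilted))
    print(rotation_compensated_distance(reference, tilted, max_shift=2))
```

The loop over shifts is the bottleneck the abstract refers to; a rotation-invariant pre-selection stage would reduce how many gallery entries reach this loop.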

Paper Nr: 68
Title:

A NOVEL PERFORMANCE METRIC FOR GREY-SCALE EDGE DETECTION

Authors:

Ian Williams, David Svoboda and Nicholas Bowring

Abstract: This paper will discuss grey-scale edge detection evaluation techniques. It will introduce three of the most common edge comparison methods and assess their suitability for grey-scale edge detection evaluation. This suitability evaluation will include Pratt’s Figure Of Merit (FOM), Bowyer’s Closest Distance Metric (CDM), and Prieto and Allen’s Pixel Correspondence Metric. The relative merits of each method will be discussed alongside the inconsistencies inherent to each technique. Finally, a novel performance criterion for grey-scale edge comparison, the Grey-scale Figure Of Merit (GFOM) will be introduced which overcomes some of the evaluation faults discussed. Furthermore, a new technique for assessing the relative connectivity of detected edges will be described and evaluated. Overall this will allow a robust and objective method of gauging edge detector performance.
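For orientation, the standard binary form of Pratt's Figure Of Merit mentioned in this abstract can be sketched as follows. This is the classical measure, not the proposed GFOM; the representation of edges as lists of (row, column) tuples is an assumption made for the example, and the scaling constant of 1/9 is the commonly used default.

```python
def pratt_fom(ideal_edges, detected_edges, alpha=1.0 / 9.0):
    """Pratt's Figure Of Merit for two sets of edge pixel coordinates.

    ideal_edges, detected_edges: lists of (row, col) tuples.
    Returns a value in (0, 1]; 1.0 means the detected edges match exactly.
    """
    if not ideal_edges or not detected_edges:
        return 0.0
    total = 0.0
    for r, c in detected_edges:
        # Squared distance from this detected pixel to the nearest ideal edge pixel.
        d2 = min((r - ri) ** 2 + (c - ci) ** 2 for ri, ci in ideal_edges)
        total += 1.0 / (1.0 + alpha * d2)
    # Normalising by the larger count penalises missing and spurious edges.
    return total / max(len(ideal_edges), len(detected_edges))


if __name__ == "__main__":
    ideal = [(0, 0), (1, 0), (2, 0)]
    shifted = [(0, 1), (1, 1), (2, 1)]  # same edge, displaced by one pixel
    print(pratt_fom(ideal, ideal))
    print(pratt_fom(ideal, shifted))
```

This form treats every edge pixel as present or absent, so it carries no information about edge strength.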

Paper Nr: 137
Title:

COMBINED MACHINE LEARNING WITH MULTI-VIEW MODELING FOR ROBUST WOUND TISSUE ASSESSMENT

Authors:

Hazem Wannous, Yves Lucas and Sylvie Treuillet

Abstract: From colour images acquired with a hand held digital camera, an innovative tool for assessing chronic wounds has been developed. It combines both types of assessment, colour analysis and dimensional measurement of injured tissues in a user-friendly system. Colour and texture descriptors have been extracted and selected from a sample database of wound tissues, before the learning stage of a support vector machine classifier with perceptron kernel on four categories of tissues. Relying on a triangulated 3D model captured using uncalibrated vision techniques applied to a stereoscopic image pair, a fusion algorithm elaborates new tissue labels on each model triangle from each view. The results of 2D classification are merged and directly mapped on the mesh surface of the 3D wound model. The result is a significant improvement in the robustness of the classification. Real tissue areas can be computed by retro projection of identified regions on the 3D model.

Paper Nr: 150
Title:

BIDIRECTIONAL HIERARCHICAL NEURAL NETWORKS - Hebbian Learning Improves Generalization

Authors:

Mohammad Saifullah, Rita Kovordanyi and Chandan Roy

Abstract: Visual pattern recognition is a complex problem, and it has proven difficult to achieve satisfactorily in standard three-layer feed-forward artificial neural networks. For this reason, an increasing number of researchers are using networks whose architecture resembles the human visual system. These biologically-based networks are bidirectionally connected, use receptive fields, and have a hierarchical structure, with the input layer being the largest layer, and consecutive layers getting increasingly smaller. These networks are large and complex, and therefore run a risk of getting overfitted during learning, especially if small training sets are used, and if the input patterns are noisy. Many data sets, such as, for example, handwritten characters, are intrinsically noisy. The problem of overfitting is aggravated by the tendency of error-driven learning in large networks to treat all variations in the noisy input as significant. However, there is one way to balance off this tendency to overfit, and that is to use a mixture of learning algorithms. In this study, we ran systematic tests on handwritten character recognition, where we compared generalization performance using a mixture of Hebbian learning and error-driven learning with generalization performance using pure error-driven learning. Our results indicate that injecting even a small amount of Hebbian learning, 0.01 %, significantly improves the generalization performance of the network.

Paper Nr: 156
Title:

USING ASSOCIATION RULES AND SPATIAL WEIGHTING FOR AN EFFECTIVE CONTENT BASED-IMAGE RETRIEVAL

Authors:

Ismail Elsayad, Jean Martinet, Thierry Urruty, Taner Danisman, Haidar Sharif and Chabane Djeraba

Abstract: Given the huge number of digital images now available, effective methods for accessing the desired images are essential. The aim of this paper is to build a meaningful mid-level representation of visual documents to be used later for matching between the query image and other images in the desired database. The approach is based firstly on constructing different visual words using local patch extraction and fusion of descriptors. Then, we represent the spatial constitution of an image as a mixture of n Gaussians in the feature space. Finally, we extract different association rules between frequent visual words in the local context of the image to construct visual phrases. Experimental results show that our approach outperforms the results of traditional image retrieval techniques.

Paper Nr: 185
Title:

RERANKING WITH CONTEXTUAL DISSIMILARITY MEASURES FROM REPRESENTATIONAL BREGMAN K-MEANS

Authors:

Olivier Schwander and Frank Nielsen

Abstract: We present a novel reranking framework for Content Based Image Retrieval (CBIR) systems based on contextual dissimilarity measures. Our work revisits and extends the method of Perronnin et al. (Perronnin et al., 2009), which introduces a way to build contexts used in turn to design contextual dissimilarity measures for reranking. Instead of using truncated rank lists from a CBIR engine as contexts, we use a clustering algorithm to group similar images from the rank list. We introduce the representational Bregman divergences and further generalize the Bregman k-means clustering by considering an embedding representation. These representation functions allow one to interpret α-divergences/projections as Bregman divergences/projections on α-representations. Finally, we validate our approach by presenting some experimental results on ranking performances on the INRIA Holidays database.

Paper Nr: 203
Title:

FACE LOG GENERATION FOR SUPER RESOLUTION USING LOCAL MAXIMA IN THE QUALITY CURVE

Authors:

Kamal Nasrollahi and Thomas B. Moeslund

Abstract: Faces of small size and low quality in surveillance videos are almost impossible to use without a super resolution algorithm for their enhancement. But these algorithms themselves rely on assumptions, such as only slight motion between low resolution observations, which do not hold in real situations. Thus a very fast and reliable method based on face quality assessment is proposed in this paper for choosing low resolution observations for any super resolution algorithm. The proposed method has been tested using real video sequences.

Paper Nr: 204
Title:

ROBUST GRAYSCALE CONVERSION FOR VISION-SUBSTITUTION SYSTEMS

Authors:

Codruta Orniana Ancuti, Cosmin Ancuti and Philippe Bekaert

Abstract: Substitution systems have shown significant potential in mobility assistance for visually disabled persons. In particular, proficient users of auditory-vision substitution are able to identify and reconstruct visual targets. The content of the non-visual image is simplified in order to minimize the cognitive process for recognition and also to reduce the duration of the sound patterns. Motivated by these facts, many of the existing substitution systems discard the color information by dealing with grayscale images. This paper presents a robust and effective method of color-to-gray transformation that preserves both the original color contrast of the initial images and the original saliency. The study builds on the hypothesis that visually salient areas are tightly connected with visual attention. We show that an appropriate translation allows a more accurate rendering of the important image regions and also creates a better mental representation of the environment.

Paper Nr: 248
Title:

APPLICATION OF SELF-QUOTIENT ε-FILTER TO IMPULSE NOISE CORRUPTED IMAGE

Authors:

Mitsuharu Matsumoto

Abstract: This paper introduces an application of self-quotient ε-filter (SQEF) to impulse noise corrupted images. SQEF is an improved self-quotient filter (SQF) to extract the image feature from noise corrupted image. Although SQF is a simple nonlinear filter to extract the feature from the image, it cannot extract the feature from the noise corrupted image. On the other hand, SQEF can extract the feature not only when it is applied to the clear image but also when it is applied to the noise corrupted image. In this paper, we especially focus on feature extraction from impulse noise corrupted image, and investigate the effectiveness of self-quotient ε-filter to impulse noise corrupted images.

Posters
Paper Nr: 51
Title:

RADIOMETRIC RANGE IMAGE FILTERING FOR TIME-OF-FLIGHT CAMERAS

Authors:

Faisal Mufti and Robert Mahony

Abstract: Time-of-Flight (TOF) imaging devices provide distance measurements between the sensor and an observed target over a full image array at video frame rate. An essential step in the development of these devices is an understanding of the reliability of noisy range image data. This paper provides a unified framework for TOF camera measurement and a radiometric reflectance model. A statistical analysis of the radiometric model is used to develop a range pixel reliability criterion to identify range errors. The radiometric model is verified using real data and the proposed range criterion is experimentally verified.

Paper Nr: 71
Title:

TIME-WEIGHTED EVALUATION OF IMAGE SEGMENTATION WITH A GENETIC ALGORITHM

Authors:

Hassan Almuhairi, Martin Fleury and Adrian F. Clark

Abstract: The performance of a segmentation algorithm can be evaluated by systematic comparison with hand-segmented ground-truth images. When evaluation extends over an algorithm's parameter space, the search for satisfactory settings has a considerable cost in time. This paper considers applying a genetic algorithm (GA) to avoid an exhaustive search. To further reduce evaluation time and subsequent image batch-processing times, this paper introduces a time factor into the GA cost function. This procedure, while preserving the GA solution (selection of parameters to minimize the fit to hand-segmented images), also improves interpretation and parameter selection.

Paper Nr: 114
Title:

AUTOMATIC VIDEO ZOOMING FOR SPORT TEAM VIDEO BROADCASTING ON SMART PHONES

Authors:

Fabien Lavigne, Fan Chen and Xavier Desurmont

Abstract: This paper presents a general framework to adapt the size of a sport team video extracted from TV to a small device screen. We use a soccer game context to describe the four main steps of our video processing framework: (1) A view type detector helps to decide whether the current frame of the video has to be resized or not. (2) If the camera point of view is far, a ball detector localizes the interesting area of the scene. (3) Then, the current frame is resized and centred on the ball, taking into account some parameters, such as the ball position and its speed. (4) At the end of the process, the score banner is detected and removed by an inpainting method.
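Step (3) above, centring a reduced window on the ball, can be sketched under simplifying assumptions. The function name, the pixel-coordinate convention, and the clamping rule are hypothetical; the sketch uses only the ball position and ignores the ball speed that the abstract also mentions.

```python
def crop_window(frame_w, frame_h, crop_w, crop_h, ball_x, ball_y):
    """Return (left, top, right, bottom) of a crop centred on the ball.

    The window is clamped so that it never leaves the frame.
    Assumes crop_w <= frame_w and crop_h <= frame_h, all in pixels.
    """
    left = min(max(ball_x - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(ball_y - crop_h // 2, 0), frame_h - crop_h)
    return left, top, left + crop_w, top + crop_h


if __name__ == "__main__":
    # Ball in the middle of a 1920x1080 frame, 640x360 output window.
    print(crop_window(1920, 1080, 640, 360, 960, 540))
    # Ball near the top-left corner: the window is pushed back inside the frame.
    print(crop_window(1920, 1080, 640, 360, 10, 10))
```

A practical system would likely smooth the window position over time to avoid jitter between frames.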

Paper Nr: 117
Title:

FREE SPACE COMPUTATION FROM STOCHASTIC OCCUPANCY GRIDS BASED ON ICONIC KALMAN FILTERED DISPARITY MAPS

Authors:

Carsten Høilund, Thomas B. Moeslund, Claus B. Madsen and Mohan M. Trivedi

Abstract: This paper presents a method for determining the free space in a scene as viewed by a vehicle-mounted camera. Using disparity maps from a stereo camera and known camera motion, the disparity maps are first filtered by an iconic Kalman filter, operating on each pixel individually, thereby reducing variance and increasing the density of the filtered disparity map. Then, a stochastic occupancy grid is calculated from the filtered disparity map, providing a top-down view of the scene where the uncertainty of disparity measurements is taken into account. These occupancy grids are segmented to indicate a maximum depth free of obstacles, enabling the marking of free space in the accompanying intensity image. The test shows successful marking of free space in the evaluated scenarios in addition to significant improvement in disparity map quality.

Paper Nr: 120
Title:

EVALUATION OF DENOISING METHODS WITH RAW IMAGES AND PERCEPTUAL MEASURES

Authors:

Matteo Pedone, Janne Heikkilä, Jarno Nikkanen, Leena Lepistö and Timo Kaikumaa

Abstract: In this paper we present a performance evaluation of different state-of-the-art denoising methods applied to RAW images in Bayer pattern format. Several measures for assessing objective quality are considered. We also propose a novel and straightforward extension to the SSIM-Index that handles color information. The evaluation is divided into two stages: first, an entire set of images is artificially degraded and then restored with the considered denoising/demosaicking methods. The second stage involves a subjective evaluation with real noisy RAW images. We observed that the resulting qualities of the considered denoising methods are in agreement between the two different evaluation stages, and the best performing algorithms are easily identified. Moreover, the proposed extension of the SSIM-Index proved to behave more consistently with respect to the artifacts introduced by the denoising algorithms, and its outcome was always in fair accordance with the subjective perceived quality.

Paper Nr: 142
Title:

IMPLEMENTATION ANALYSIS FOR A HYBRID PARTICLE FILTER ON AN FPGA BASED SMART CAMERA

Authors:

I. Zuriarrain, N. Arana and F. Lerasle

Abstract: Design and development of embedded devices which perform computer vision related tasks present many challenges, many of which stem from attempting to fit the complexity of many higher level vision algorithms into the constraints presented by programmable embedded devices. In this paper, we follow a simulation-based methodology in order to develop an architecture which will allow us to implement a mixed Particle Filter/Markov Chain Monte Carlo tracking algorithm in an FPGA-based smart camera, using tools such as SystemC and Transaction Level Modeling (TLM). Use of these tools has allowed us to make some preliminary predictions as to the memory usage and performance of the system, which will be compared to the results of more detailed simulations obtained on the way towards implementing this system.

Paper Nr: 167
Title:

TOWARDS AUTOMATED CROP YIELD ESTIMATION - Detection and 3D Reconstruction of Pineapples in Video Sequences

Authors:

Supawadee Chaivivatrakul, Jednipat Moonrinta and Matthew N. Dailey

Abstract: Towards automation of crop yield estimation for pineapple fields, we present a method for detection and 3D reconstruction of pineapples from a video sequence acquired, for example, by a mobile field robot. The detection process incorporates the Harris corner detector, the SIFT keypoint descriptor, and keypoint classification using a SVM. The 3D reconstruction process incorporates structure from motion to obtain a 3D point cloud representing patches of the fruit's surface followed by least squares estimation of the quadric (in this case an ellipsoid) best fitting the 3D point cloud. We performed three experiments to establish the feasibility of the method. Experiments 1 and 2 tested the performance of the Harris, SIFT, and SVM method on indoor and outdoor data. The method achieved a keypoint classification accuracy of 87.79% on indoor data and 76.81% on outdoor data, against base rates of 81.42% and 53.83%, respectively. In Experiment 3, we performed 3D reconstruction from indoor data. The method achieved an average of 34.96% error estimating the ratio of the fruits' major axis to short axis length. Future work will focus on increasing the robustness and accuracy of the 3D reconstruction method as well as resolving the 3D scale ambiguity.

Paper Nr: 181
Title:

CONTENT BASED IMAGE RETRIEVAL USING SPATIAL RELATIONSHIPS BETWEEN DOMINANT COLOURS OF IMAGE SEGMENTS

Authors:

Hasitha Bimsara Ariyaratne and Koichi Harada

Abstract: Content Based Image Retrieval (CBIR) is a quickly evolving area in computer vision and image processing due to the ever increasing number of digital images. Therefore efficient indexing is a vital part of image retrieval systems. Since the ultimate goal of any CBIR system is to simulate the human visual system (HVS), applying some of the fundamental concepts used in HVS for identifying images, such as colour, position, size and shape, could greatly help enhance the accuracy. Therefore, this research proposes a simple yet effective text based indexing scheme that relies on spatial relationships among dominant colours of image segments. A new connected component labelling approach along with an efficient graph based image segmentation algorithm is used for segment identification. The indexing scheme is capable of identifying both complete and partial image matches. Experiments carried out using different sets of images have yielded promising results, validating the concept’s viability for Content Based Image Retrieval.

Paper Nr: 256
Title:

MULTIPLE-VIEWPOINT IMAGE STITCHING

Authors:

Kai-Chi Chan and Yiu-Sang Moon

Abstract: A wide view image can be generated from a collection of images. Its field of view can be expanded so far as to capture a 360° scene. Common approaches, such as panoramas and mosaics, assume all source images are taken at the same camera center by pure rotation. However, this assumption limits the quality and feasibility of the generated images. In this paper, the problem of generating a wide view image from multiple viewpoint images is formulated. A simple and novel way is proposed to loosen the single viewpoint constraint. A wide view image is generated by 1) transforming images from different viewpoints into a unified viewpoint using SIFT feature matching, etc.; 2) stitching the transformed images together by overlapping. Test results demonstrate that the proposed method is an efficient way of stitching images from different viewpoints.

Area 2 - Image Analysis

Full Papers
Paper Nr: 23
Title:

HYPERACCURATE ELLIPSE FITTING WITHOUT ITERATIONS

Authors:

Kenichi Kanatani and Prasanna Rangarajan

Abstract: This paper presents a new method for fitting an ellipse to a point sequence extracted from images. It is widely known that the best fit is obtained by maximum likelihood. However, it requires iterations, which may not converge in the presence of large noise. Our approach is algebraic distance minimization; no iterations are required. Exploiting the fact that the solution depends on the way the scale is normalized, we analyze the accuracy to high order error terms with the scale normalization weight unspecified and determine it so that the bias is zero up to the second order. We demonstrate by experiments that our method is superior to the Taubin method, also algebraic and known to be highly accurate.

Paper Nr: 32
Title:

TREE-STRUCTURED TEMPORAL INFORMATION FOR FAST HISTOGRAM COMPUTATION

Authors:

Séverine Dubuisson

Abstract: In this paper we present a new method for fast histogram computation. Based on the known tree-representation histogram of a region, also called the reference histogram, we want to compute that of another region. The idea consists in computing the spatial differences between these two regions and encoding them to update the histogram. We never need to store complete histograms, except that of the reference image (as a preprocessing step). We compare our approach with the well-known integral histogram, and obtain better results in terms of processing time while reducing the memory footprint. We show theoretically and with experimental results the superiority of our approach in many cases. Finally, we demonstrate the advantage of this method on a visual tracking application using a particle filter by improving its computation time.
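For context, the integral histogram used as the comparison baseline in this abstract can be sketched as follows. This illustrates the baseline only, not the tree-structured method the paper proposes; the image is assumed to be a list of rows of already quantised bin indices, and the function names are hypothetical.

```python
def integral_histogram(image, n_bins):
    """Build a table ih where ih[r][c] is the histogram of image[0:r][0:c].

    image: list of rows, each entry an integer bin index in [0, n_bins).
    The table stores (rows + 1) * (cols + 1) full histograms, which is
    where the memory footprint of this approach comes from.
    """
    rows, cols = len(image), len(image[0])
    ih = [[[0] * n_bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            cell = ih[r + 1][c + 1]
            up, left, diag = ih[r][c + 1], ih[r + 1][c], ih[r][c]
            for b in range(n_bins):
                cell[b] = up[b] + left[b] - diag[b]
            cell[image[r][c]] += 1  # add the current pixel's bin
    return ih


def region_histogram(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1."""
    a, b, c, d = ih[bottom][right], ih[top][right], ih[bottom][left], ih[top][left]
    return [a[k] - b[k] - c[k] + d[k] for k in range(len(a))]


if __name__ == "__main__":
    img = [[0, 1, 1],
           [2, 0, 1],
           [2, 2, 0]]
    ih = integral_histogram(img, n_bins=3)
    print(region_histogram(ih, 0, 0, 3, 3))  # whole image
    print(region_histogram(ih, 1, 1, 3, 3))  # bottom-right 2x2 block
```

Any rectangular region then costs four histogram lookups, at the price of storing one histogram per pixel.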

Paper Nr: 48
Title:

CIRCLE DETECTION USING THE IMAGE RAY TRANSFORM - A Novel Technique for using a Ray Analogy to Extract Circular Features

Authors:

Alastair H. Cummings, Mark S. Nixon and John N. Carter

Abstract: Physical analogies are an exciting paradigm for creating techniques for image feature extraction. A transform using an analogy to light rays has been developed for the detection of circular and tubular features. It uses a 2D ray tracing algorithm to follow rays through an image, interacting at a low level, to emphasise higher level features. It has been empirically tested as a pre-processor to aid circle detection with the Hough Transform and has been shown to provide a clear improvement over standard techniques. The transform was also used on natural images and we show its ability to highlight circles even in complex scenes. We also show the flexibility available to the technique through adjustment of parameters.

Paper Nr: 60
Title:

MINIMUM SPANNING TREE FUSING MULTI-SALIENT POINTS HIERARCHICALLY FOR MULTI-MODALITY IMAGE REGISTRATION

Authors:

Shaomin Zhang, Lijia Zhi, Dazhe Zhao and Hong Zhao

Abstract: In this paper, we propose a novel registration algorithm based on the minimum spanning tree (MST). There are two novel aspects of the new method. First, instead of a single type of feature point, we extract corner-like as well as edge-like points from the image, and also add a few random points to cover the low-contrast regions. Second, a hierarchical mechanism that fuses multi-salient points is used to drive the registration procedure. The new algorithm addresses the low robustness caused by the instability of feature point extraction and the speed bottleneck that arises when using the MST to estimate the Rényi entropy. Experimental results show that on simulated and real brain datasets, the algorithm achieves better robustness while maintaining good registration accuracy.

Paper Nr: 103
Title:

MULTISPECTRAL TEXTURE ANALYSIS USING LOCAL BINARY PATTERN ON TOTALLY ORDERED VECTORIAL SPACES

Authors:

Vincent Barra

Abstract: Texture is an important feature when considering image segmentation. Since more and more image segmentation problems involve multi- and hyperspectral data, including color images, it becomes necessary to define multispectral texture features. In this article, we propose LMBP, an extension of the classical Local Binary Pattern (LBP) operator to the case of multispectral images. The LMBP operator is based on the definition of total orderings in the image space and on an extension of the standard univariate LBP. It allows the computation of both a multispectral texture structure coefficient and a multispectral contrast parameter for each spatial location, that serve as an input to an unsupervised clustering algorithm. Results are demonstrated in the case of the segmentation of brain tissues from multispectral MR images, and compared to other multispectral texture features.
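
For readers unfamiliar with the operator being extended, the following is a minimal sketch of the classical 8-neighbour LBP code for a single pixel. It is not the LMBP operator itself. The function name `lbp_code` and the bit order are our own choices. Because the comparison only needs a total order, the same sketch also runs on pixels stored as tuples, which Python compares lexicographically; that is just one possible total ordering and the abstract does not say which orderings the paper uses.

```python
def lbp_code(image, r, c):
    """8-neighbour LBP code of the interior pixel at (r, c).

    Pixels may be numbers (classical LBP) or tuples compared
    lexicographically (an assumed, illustrative total ordering).
    """
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbour is not smaller than the centre.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code
```

A histogram of these codes over a neighbourhood is the usual texture feature built from the operator.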

Paper Nr: 111
Title:

FACE DETECTION AND TRACKING WITH 3D PGA CLM

Authors:

Meng Yu and Bernard Tiddeman

Abstract: In this paper we describe a system for facial feature detection and tracking using a 3D extension of the Constrained Local Model (CLM) (Cristinacce and Cootes, 2006) algorithm. The use of a 3D shape model allows improved tracking through large head rotations. CLM uses a shape and texture appearance model to generate a set of region template detectors. A search is then performed in the global pose / shape space using these detectors. The proposed extension uses multiple appearance models from different viewpoints and a single 3D shape model built using Principal Geodesic Analysis (PGA) (Fletcher et al., 2004) instead of direct Principal Components Analysis (PCA). During fitting or tracking the current estimate of pose is used to select the appropriate appearance model. We demonstrate our results by fitting the model to image sequences with large head rotations. The results show that the proposed multi-view 3D CLM algorithm using PGA improves the performance of the algorithm using PCA for tracking faces in videos with large out-of-plane head rotations.

Paper Nr: 112
Title:

MOTION SEGMENTATION OF ARTICULATED STRUCTURES BY INTEGRATION OF VISUAL PERCEPTION CRITERIA

Authors:

Hildegard Kuehne and Annika Woerner

Abstract: The correct segmentation of articulated motion is an important factor in extracting and understanding the functional structures of complex, articulated objects. Segmenting such body motion without additional appearance information is still a challenging task, because articulated objects such as the human body are mainly based on fine, connected structures. The proposed approach combines consensus-based motion segmentation with biologically inspired visual perception criteria. This allows the grouping of sparse, dependently moving feature points into several clusters, representing the rigid elements of an articulated structure. It is shown how geometric and time-based feature properties can be used to improve the result of motion segmentation in this context. We evaluated our algorithm on artificial as well as natural video sequences in order to segment the motion of human body elements. The results of the evaluation of parameter influences and also the practical evaluation show that good motion segmentation can be achieved by this approach.

Paper Nr: 119
Title:

SETTING GRAPH CUT WEIGHTS FOR AUTOMATIC FOREGROUND EXTRACTION IN WOOD LOG IMAGES

Authors:

Enrico Gutzeit, Stephan Ohl, Arjan Kuijper, Joerg Voskamp and Bodo Urban

Abstract: The automatic extraction of foreground objects from the background is a well known problem. Much research has been done to solve the foreground/background segmentation with graph cuts. The major challenge is to determine the weights of the graph in order to obtain a good segmentation. In this paper we address this problem with a focus on the automatic segmentation of wood logs. We introduce a new solution to get information about foreground and background. This information is used to set the weights of the graph cut method. We compare four different methods to set these weights and show that the best results are obtained with our novel method, which is based on density estimation.

Paper Nr: 136
Title:

GRAPH CUTS AND APPROXIMATION OF THE EUCLIDEAN METRIC ON ANISOTROPIC GRIDS

Authors:

Ondřej Daněk and Pavel Matula

Abstract: Graph cuts can be used to find globally minimal contours and surfaces in 2D and 3D space, respectively. To achieve this, weights of the edges in the graph are set so that the capacity of the cut approximates the contour length or surface area under a chosen metric. Formulas giving a good approximation in the case of the Euclidean metric are known; however, they assume isotropic resolution of the underlying grid of pixels or voxels. Anisotropy has to be simulated using more general Riemannian metrics. In this paper we show how to circumvent this and obtain a good approximation of the Euclidean metric on anisotropic grids directly by exploiting the well-known Cauchy-Crofton formulas and Voronoi diagram theory. Furthermore, we show that our approach yields much smaller metrication errors and, most interestingly, is in particular situations better even in the isotropic case due to its invariance to mirroring. Finally, we demonstrate an application of the derived formulas to biomedical image segmentation.
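
For reference, the standard 2D Cauchy-Crofton formula that the abstract alludes to relates the Euclidean length of a curve to the number of times straight lines cross it. The notation below is ours, not the paper's: $C$ is a rectifiable plane curve, lines are parameterised by their normal angle $\theta$ and signed distance $\rho$ from the origin, and $n_C(\rho, \theta)$ counts the intersections of that line with $C$.

```latex
|C| \;=\; \frac{1}{2} \int_{0}^{\pi} \int_{-\infty}^{\infty} n_C(\rho, \theta) \, d\rho \, d\theta
```

Graph-cut length approximations discretise this integral over the finite set of edge directions in the neighbourhood system. How the anisotropic grid spacing and the Voronoi diagrams enter the resulting edge weights is not stated in the abstract and is left open here.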

Paper Nr: 194
Title:

CONTOUR SEGMENT ANALYSIS FOR HUMAN SILHOUETTE PRE-SEGMENTATION

Authors:

Cyrille Migniot, Pascal Bertolino and Jean-Marc Chassery

Abstract: Human detection and segmentation is a challenging task owing to variations in human pose and clothing. The combination of Histograms of Oriented Gradients based descriptors and a Support Vector Machine classifier is a classic and efficient method for human detection in images. However, as is often the case in detection, accurate segmentation of these persons is not performed, although many applications need it. This paper tackles the problem of producing information that will guide the final segmentation step. It presents a method which uses the combination mentioned above to assign to each contour segment a likelihood of being part of a human silhouette. Thus, data previously computed in detection are reused in the pre-segmentation. A human silhouette database was created for learning.

Paper Nr: 215
Title:

ROAD CRACK EXTRACTION WITH ADAPTED FILTERING AND MARKOV MODEL-BASED SEGMENTATION - Introduction and Validation

Authors:

S. Chambon, C. Gourraud, J.-M. Moliard and P. Nicolle

Abstract: The automatic detection of road cracks is important in many countries to quantify the quality of road surfaces and to determine which national roads have to be improved. Many methods have been proposed to automatically detect defects of the road surface and, in particular, cracks, using tools of mathematical morphology, neural networks or multiscale filters. The last of these are the most appropriate, and our work concerns the validation of a wavelet decomposition which is used as the initialisation of a segmentation based on Markovian modelling. Currently, there is no tool to compare and precisely evaluate the performance and the advantages of all the existing methods and to qualify the efficiency of a method compared to the state of the art. Consequently, the aim of this work is to validate our method and to describe how to set the parameters.

Paper Nr: 216
Title:

SEED–GROWING HEART SEGMENTATION IN HUMAN ANGIOGRAMS

Authors:

Antonio Bravo, José Clemente and Rubén Medina

Abstract: In this paper an image segmentation scheme that is based on combinations of a non–parametric technique and a seed based clustering algorithm is reported. The method has been applied to clinical unsubtracted angiograms of the human heart. The first step of the method consists in applying a mean shift–based filter in order to improve the left ventricle cavity information in angiographic images. Second, the initial seed is semi–automatically generated from the manual localization of the aortic valve by a specialist. Third, each angiographic image is segmented using a clustering algorithm that begins with the seed, which is grown until the image pixels associated with the left ventricle cavity are clustered. A validation is performed by comparing the estimated contours with contours manually traced by a cardiologist. From this validation stage the maximum of the average contour error considering six angiographic sequences (a total of 178 images) is 7.30%.

Paper Nr: 217
Title:

ADAPTIVE SEGMENTATION OF CELLS AND PARTICLES IN FLUORESCENT MICROSCOPE IMAGES

Authors:

Birgit Möller, Oliver Greß, Nadine Stöhr, Stefan Hüttelmaier and Stefan Posch

Abstract: Microscope imaging is an indispensable tool in modern systems biology. In combination with fully automatic image analysis it allows for valuable insights into biological processes on the sub-cellular level and fosters understanding of biological systems. In this paper we present two new techniques for automatic segmentation of cell areas and included sub-cellular particles. A new cascaded and intensity-adaptive segmentation scheme based on coupled active contours is used to segment cell areas. Structures on the sub-cellular level, i.e. stress granules and processing bodies, are detected by applying a scale-adaptive wavelet-based detection technique. Combining these results allows for complementary analysis of biological processes. It yields new insights into interactions between different particles and distributions of particles among different cells. Our experimental evaluations based on ground-truth data demonstrate the high quality of our segmentation results regarding these aims and open perspectives towards deeper insights into biological systems on the sub-cellular level.

Paper Nr: 219
Title:

A HYBRID BOUNDARY–REGION LEFT VENTRICLE SEGMENTATION IN COMPUTED TOMOGRAPHY

Authors:

Antonio Bravo, José Clemente, Miguel Vera, José Avila and Rubén Medina

Abstract: An automatic approach based on the generalized Hough transform (GHT) and an unsupervised clustering technique to obtain the endocardial surface is proposed. The approach is applied to multi slice computerized tomography (MSCT) images of the heart. The first step is the initialization, where a GHT–based segmentation algorithm is used to detect the endocardial contour in one MSCT slice. The centroid of this contour is used as a seed point for initializing a clustering algorithm. A two stage segmentation algorithm is used for segmenting the three–dimensional MSCT database. First, the complete database is filtered using mathematical morphology operators in order to improve the left ventricle cavity information in these images. The second stage is based on a region growing method. A seed point located inside the cardiac cavity is used as input for the clustering algorithm. This seed point is propagated along the image sequence to obtain the left ventricle surfaces for all instants of the cardiac cycle. The method is validated by comparing the estimated surfaces with left ventricle shapes drawn by a cardiologist. The average error obtained was 1.52 mm.

Short Papers
Paper Nr: 46
Title:

BELIEF PROPAGATION IN SPATIOTEMPORAL GRAPH TOPOLOGIES FOR THE ANALYSIS OF IMAGE SEQUENCES

Authors:

Volker Willert and Julian Eggert

Abstract: Belief Propagation (BP) is an efficient approximate inference technique both for Markov Random Fields (MRF) and Dynamic Bayesian Networks (DBN). 2D MRFs provide a unified framework for early vision problems that are based on static image observations. 3D MRFs are suggested to cope with dynamic image data. In contrast, DBNs are far less used for dynamic low level vision problems even though they represent sequences of state variables and hence are suitable to process image sequences with temporally changing visual information. In this paper, we propose a 3D DBN topology for dynamic visual processing with a product of potentials as transition probabilities. We derive an efficient update rule for this 3D DBN topology that unrolls loopy BP for a 2D MRF over time and compare it to update rules for conventional 3D MRF topologies. The advantages of the 3D DBN are discussed in terms of memory consumption, costs, convergence and online applicability. To evaluate the performance of inferring visual information from dynamic visual observations, we show examples for image sequence denoising that achieve MRF-like accuracy on real world data.

Paper Nr: 49
Title:

DIRECT SURFACE FITTING

Authors:

Nils Einecke, Sven Rebhan, Julian Eggert and Volker Willert

Abstract: In this paper, we propose a new method for estimating the shape of a surface from visual input. Assuming a parametric model of a surface, the parameters best explaining the perspective changes of the surface between different views are estimated. This is in contrast to the usual approach of fitting a model into a 3-D point cloud, generated by some previously calculated local correspondence matching method. The main ingredients of our approach are formulas for a perspective mapping of parametric 3-D surface models between different camera views. Model parameters are estimated using the Hooke-Jeeves optimization method, which works without the derivative of the objective function. We demonstrate our approach with models of a plane, a sphere and a cylinder and show that the parameters are accurately estimated.
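
Since the abstract relies on the Hooke-Jeeves method because it needs no derivatives, a compact sketch of that pattern search may help. This is a simplified textbook variant with one pattern move per iteration, not the authors' implementation. The function name `hooke_jeeves`, its parameters and the quadratic test objective are all illustrative assumptions; in the paper's setting the objective would measure how well the surface parameters explain the view change.

```python
def hooke_jeeves(f, x0, step=1.0, shrink=0.5, tol=1e-6, max_iter=10000):
    """Minimise f without derivatives using a basic Hooke-Jeeves pattern search."""

    def explore(start, f_start, h):
        # Probe +h and -h along each coordinate, keeping any improvement.
        x, fx = list(start), f_start
        for i in range(len(x)):
            for delta in (h, -h):
                trial = list(x)
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx = trial, f_trial
                    break
        return x, fx

    base = [float(v) for v in x0]
    f_base = f(base)
    for _ in range(max_iter):
        if step < tol:
            break
        x, fx = explore(base, f_base, step)
        if fx < f_base:
            # Pattern move: extrapolate along the direction that just helped.
            pattern = [2.0 * xi - bi for xi, bi in zip(x, base)]
            base, f_base = x, fx
            px, f_px = explore(pattern, f(pattern), step)
            if f_px < f_base:
                base, f_base = px, f_px
        else:
            step *= shrink  # no improvement: refine the mesh
    return base, f_base
```

Calling `hooke_jeeves(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2, [0, 0])` searches from the origin towards the minimiser at (1, -2) using function evaluations only.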

Paper Nr: 72
Title:

UNSUPERVISED IMAGE SEGMENTATION BASED ON THE MULTI-RESOLUTION INTEGRATION OF ADAPTIVE LOCAL TEXTURE DESCRIPTORS

Authors:

Dana E. Ilea, Paul F. Whelan and Ovidiu Ghita

Abstract: The major aim of this paper consists of a comprehensive quantitative evaluation of adaptive texture descriptors when integrated into an unsupervised image segmentation framework. The techniques involved in this evaluation are: the standard and rotation invariant Local Binary Pattern (LBP) operators, multi-channel texture decomposition based on Gabor filters and a recently proposed technique that analyses the distribution of dominant image orientations at both micro and macro levels. The motivation to investigate these texture analysis approaches is twofold: (a) they evaluate the texture information at micro-level in small neighborhoods and (b) the distributions of the local features calculated from texture units describe the texture at macro-level. This adaptive scenario facilitates the integration of the texture descriptors into an unsupervised clustering based segmentation scheme that embeds a multi-resolution approach. The conducted experiments evaluate the performance of these techniques and also analyse the influence of important parameters (such as scale, frequency and orientation) upon the segmentation results.

Paper Nr: 75
Title:

MULTI-RESOLUTION APPROACH FOR FINE STRUCTURE EXTRACTION - Application and Validation on Road Images

Authors:

Nicolas Coudray, Argyro Karathanou and Sylvie Chambon

Abstract: In the context of fine structure extraction, this paper presents a new method based on multi-resolution segmentation applied to the detection of road cracks. A method already developed to detect low-contrast biological membranes has been adapted to detect cracks in images: crack features are defined as heterogeneities rather than transitions of closed regions characterizing the membranes. This new methodology is quantitatively validated on reference segmentations and compared to an adapted filtering and Markovian modelling algorithm.

Paper Nr: 78
Title:

SHADOW MODELING AND DETECTION FOR ROBUST FOREGROUND SEGMENTATION IN HIGHWAY SCENARIOS

Authors:

Katherine Batista, Rui Caseiro and Jorge Batista

Abstract: This paper presents a method to automatically model and detect shadows in highway surveillance scenarios. This approach uses a cascade of two classifiers. The first stage of this method uses a weak classifier to ascertain the color information of possibly shadowed pixels, which will be used by the second stage of this method (strong classifier). The weak classifier estimates the Color Normalized Cross-Correlation (CNCC), and the color information of the pixels identified as shadow is used to build or update multi-layered statistical shadow models of the RGB appearance of shadow. These models are then used by the strong classifier to correctly distinguish shadow. To prevent misclassifications from corrupting the results of both classifiers, spatial dependencies are also taken into account. For this purpose, nonparametric kernel density estimators in a pyramidal decomposition (PKDE) as well as Markov Random Fields (MRF) were independently employed. This technique is being used in a real outdoor traffic surveillance system in order to minimize the effects of cast vehicle shadows as well as shadows induced by illumination changes. Several results are presented in this paper to show its effectiveness and the advantages of applying spatial contextualization methods to the weak and strong classifiers.

Paper Nr: 95
Title:

3D INSPECTION SYSTEM IN CERAMIC TILES SURFACES WITH RANGE IMAGES

Authors:

G. Pabón Rodríguez, G. Andreu-García, A. Rodas-Jordá, J. Valiente-González and F. Acebrón-Linuesa

Abstract: In this paper we propose a system to characterize 3D defects in range images, which can be combined with traditional surface inspection methods in an industrial environment for ceramic tile inspection. Our application has the advantage of learning the geometric features of the ceramic pieces, creating a unique 3D model against which we compare the test pieces. In addition to this, the system includes a robust learning phase, which discards tiles with defects impossible for a human expert to see, and a more stringent inspection in areas with low uncertainty. Experiments with real data were performed. Our data consist of ceramic tiles of different types, shapes and silk-screen patterns. Results are promising for tiles with a straight orientation: over 99% of defects are correctly classified.

Paper Nr: 96
Title:

WATERSHED FROM PROPAGATED MARKERS IMPROVED BY THE COMBINATION OF SPATIO-TEMPORAL GRADIENT AND BINDING OF MARKERS HEURISTICS

Authors:

Franklin César Flores and Roberto de Alencar Lotufo

Abstract: This paper presents an improvement of the watershed from propagated markers, a generic method for interactive segmentation of objects in image sequences, by the inclusion of a temporal gradient in the segmentation framework. Segmentation is done by applying the watershed from markers to a gradient image extracted from the temporal gradient sequence and using markers provided by the binding of markers heuristics. The performance of the improved method is demonstrated by application of a benchmark that supports a quantitative evaluation of assisted segmentation of objects in image sequences. Experimental results provided by the combination of temporal gradient with the binding of markers heuristics show that the proposed improvement can decrease the number of human interventions and the time required to process the sequences.

Paper Nr: 99
Title:

ROUTE PLANNING FOR THE BEST VALUE FUEL

Authors:

Alva Sheehy and Kenneth Dawson-Howe

Abstract: The system described in this paper is an extension to the standard satellite navigation systems used in vehicles. A new route planning algorithm is developed using both shortest path and simplest path criteria and then this algorithm is extended to allow the incorporation of a visit to a petrol station along the route. The choice of petrol station is based on a combination of the relative location of the petrol stations and the cost of petrol at those stations. The price of petrol at each petrol station is constantly updated on a central server which is provided with prices by all vehicles which are using the system as they pass by. The price of petrol is determined using a camera mounted on the dashboard of the car, the images from which are processed with reasonably standard OCR software. When a vehicle requires fuel the central server provides prices for petrol stations in the vicinity of the planned route.

Paper Nr: 101
Title:

TWO DOF CAMERA POSE ESTIMATION WITH A PLANAR STOCHASTIC REFERENCE GRID

Authors:

Giovanni Gherdovich and Xavier Descombes

Abstract: Determining the pose of the camera is required by many higher level computer vision tasks. We assume a set of features to be distributed on a planar surface (the world plane) as a Poisson point process, and to know their positions in the image plane. Then we propose an algorithm to recover the pose of the camera, in the case of two degrees of freedom (slant angle and distance from the ground). The algorithm is based on the observation that cell areas of the Voronoi tessellation generated by the points in the image plane represent a reliable sampling of the Jacobian determinant of the perspective transformation up to a scaling factor, the density of points in the world plane, which we require as input. In the process, we develop a transformation of our input data (areas of Voronoi cells) so that they show almost constant variances among the locations, and analytically find a correcting factor to considerably reduce the bias of our estimates. We perform intensive synthetic simulations and show that with a few hundred random points our errors on angle and distance are not more than a few percent.

Paper Nr: 121
Title:

A SHAPE DESCRIPTOR BASED ON SCALE-INVARIANT MULTISCALE FRACTAL DIMENSION

Authors:

Vítor Baccetti Garcia and Ricardo da S. Torres

Abstract: This paper proposes a new scale-invariant shape descriptor based on the Multiscale Fractal Dimension (MFD). The MFD is a curve that describes boundary complexity and self-affinity characteristics by obtaining fractal dimension values as a function of Euclidean morphological dilation radii. Using this concept, which guarantees rotation and translation invariance, we introduce a new scale-invariant descriptor that is obtained by selecting a relevant fragment of this curve using a sliding window. The novel shape descriptor is compared with the Multiscale Fractal Dimension and four other shape descriptors. Experimental results demonstrate that the new descriptor is scale-invariant and yields very good results in terms of effectiveness when compared with well-known shape descriptors.

Paper Nr: 147
Title:

SEGMENTING COLOR IMAGE OF PLANTS WITH A SPATIO-COLORIMETRIC APPROACH

Authors:

Cindy Torres, Alain Clément and Bertrand Vigouroux

Abstract: An unsupervised vectorial segmentation method using both spatial and color information is presented. To overcome the problem of memory space, this method is based on a multidimensional compact histogram and an original compact spatial neighborhood probability matrix (SNPM). The multidimensional compact histogram allows a drastic reduction of memory space without any data loss. Building on the compact histogram, an SNPM is computed. It contains all non-negative probabilities of spatial connectivity between pixel colors. In an unsupervised histogram analysis classification process, two phases are classically distinguished: (i) a learning process during which histogram modes are identified and (ii) a second step called the decision step in which a full partition of the colorimetric space is carried out according to the previously defined classes. During the second step of a standard colorimetric approach, a colorimetric distance such as the Euclidean or Mahalanobis distance is used. We insert here a spatio-colorimetric distance defined as a weighted mixture of a colorimetric distance and the spatial distance calculated from the SNPM. The vectorial classification method is based on previously presented principles, achieving a hierarchical analysis of the color histogram by means of a 3D-connected components labeling. The method is applied to color images of plants to separate plantlets and loam.

Paper Nr: 161
Title:

SHAPE RETRIEVAL USING CONTOUR FEATURES AND DISTANCE OPTIMIZATION

Authors:

Daniel Carlos Guimarães Pedronette and Ricardo da S. Torres

Abstract: This paper presents a shape descriptor based on a set of features computed for each point of an object contour. We also present an algorithm for distance optimization based on the similarity among ranked lists. Experiments were conducted on two well-known data sets: MPEG-7 and Kimia. Experimental results demonstrate that the combination of the two methods is very effective and yields better results than recently proposed shape descriptors.

Paper Nr: 188
Title:

EVALUATING THE POTENTIAL OF TEXTURE AND COLOR DESCRIPTORS FOR REMOTE SENSING IMAGE RETRIEVAL AND CLASSIFICATION

Authors:

Jefersson A. dos Santos, Otávio A. B. Penatti and Ricardo da S. Torres

Abstract: Classifying Remote Sensing Images (RSI) is a hard task. There are automatic approaches whose results normally need to be revised. The identification and polygon extraction tasks usually rely on applying classification strategies that exploit visual aspects related to spectral and texture patterns identified in RSI regions. There are a lot of image descriptors proposed in the literature for content-based image retrieval purposes that may be useful for RSI classification. This paper presents a comparative study to evaluate the potential of using successful color and texture image descriptors for remote sensing retrieval and classification. Seven descriptors that encode texture information and twelve color descriptors that can be used to encode spectral information were selected. We perform experiments to evaluate the effectiveness of these descriptors, considering image retrieval and classification tasks. To evaluate descriptors in classification tasks, we also propose a methodology based on KNN classifier. Experiments demonstrate that Joint Auto-Correlogram (JAC), Color Bitmap, Invariant Steerable Pyramid Decomposition (SID) and Quantized Compound Change Histogram (QCCH) yield the best results.

Paper Nr: 192
Title:

INTRODUCING SHAPE CONSTRAINT VIA LEGENDRE MOMENTS IN A VARIATIONAL FRAMEWORK FOR CARDIAC SEGMENTATION ON NON-CONTRAST CT IMAGES

Authors:

Julien Wojak, Elsa D. Angelini and Isabelle Bloch

Abstract: In thoracic radiotherapy, some organs should be considered with care and protected from undesirable radiation. Among these organs, the heart is one of the most critical to protect. Its segmentation from routine CT scans provides valuable information to assess its position and shape. In this paper, we present a novel variational segmentation method for extracting the heart on non-contrast CT images. To handle the low image contrast around the cardiac borders, we propose to integrate shape constraints using Legendre moments and to add an energy term to the functional to be optimized. Results for whole heart segmentation in non-contrast CT images are presented and comparisons are performed with manual segmentations.

Paper Nr: 195
Title:

LOCAL SEGMENTATION BY LARGE SCALE HYPOTHESIS TESTING - Segmentation as Outlier Detection

Authors:

Sune Darkner, Anders B. Dahl, Rasmus Larsen, Arnold Skimminge, Ellen Garde and Gunhild Waldemar

Abstract: We propose a novel and efficient way of performing local image segmentation. For many applications a threshold of pixel intensities is sufficient. However, determining the appropriate threshold value poses a challenge. In cases with large global intensity variation the threshold value has to be adapted locally. We propose a method based on large scale hypothesis testing with a consistent method for selecting an appropriate threshold for the given data. By estimating the prominent distribution we characterize the segment of interest as a set of outliers or the distribution itself. Thus, we can calculate a probability based on the estimated densities of outliers actually being outliers using the false discovery rate (FDR). Because the method relies on local information it is very robust to changes in lighting conditions and shadowing effects. The method is applied to endoscopic images of small particles submerged in fluid captured through a microscope and we show how the method can handle transparent particles with significant glare points. The method generalizes to other problems. This is illustrated by applying the method to camera calibration images and MRI of the midsagittal plane for gray and white matter separation and segmentation of the corpus callosum. Comparing this segmentation method with manual corpus callosum segmentation, an average Dice score of 0.88 is obtained across 40 images.
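
The Dice score quoted at the end of this abstract is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between two segmentations. A minimal sketch follows, assuming both segmentations are given as equally sized binary masks stored as nested lists. The function name `dice_score` and the convention of returning 1.0 for two empty masks are our own choices.

```python
def dice_score(mask_a, mask_b):
    """Dice coefficient 2|A n B| / (|A| + |B|) of two binary masks."""
    a = [bool(v) for row in mask_a for v in row]
    b = [bool(v) for row in mask_b for v in row]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two empty masks agree perfectly; this convention is an assumption.
    return 1.0 if total == 0 else 2.0 * overlap / total
```

A score of 1 means the automatic and manual masks coincide exactly and 0 means they share no pixels.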

Paper Nr: 213
Title:

A NEW APPROACH FOR DETECTING LOCAL FEATURES

Authors:

Giang Phuong Nguyen and Hans Jørgen Andersen

Abstract: Local features have so far mostly been understood as interest points. A patch around each point is formed to compute descriptors or feature vectors. Therefore, in order to achieve invariance to imaging conditions such as scale and viewpoint, an input image is often represented in a scale-space, i.e. the sizes of the patches are defined by their corresponding scales. Our proposed technique for detecting local features is different: no scale-space is required, and the given image is instead divided into a number of triangles whose sizes depend on the content of the image at the location of each triangle. In this paper, we demonstrate that the triangular representation of images provides invariant features of the image. Experiments using these features show higher retrieval performance than existing methods.

Paper Nr: 214
Title:

FPGA-BASED NORMALIZATION FOR MODIFIED GRAM-SCHMIDT ORTHOGONALIZATION

Authors:

I. Sajid, Sotirios G. Ziavras and M. M. Ahmed

Abstract: Eigenvalue evaluation is an integral but computation-intensive part of many image and signal processing applications. Modified Gram-Schmidt Orthogonalization (MGSO) is an efficient method for evaluating the eigenvalues in face recognition algorithms. MGSO applies normalization of vectors in its iterative orthogonalization process and its accuracy depends on the accuracy of normalization. In software, floating-point data types and floating-point operations are applied to minimize rounding and truncation effects. Hardware support for floating-point operations may be very costly in execution time per operation and also may increase power consumption. In contrast, lower-cost fixed-point arithmetic reduces execution times and lowers the power consumption but slightly reduces the precision. Normalization involves square root operations in addition to other arithmetic operations. Hardware realization of the floating-point square root operation may be prohibitively expensive because of its complexity. This paper presents three architectures, namely ppc405, ppc_ip and pc_pci, that employ fixed-point hardware for the efficient implementation of normalization on an FPGA. We evaluate the suitability of these architectures based on the needed frequency of normalization. The proposed architectures produce an error rate of less than 10^-3 compared with their software-driven counterpart implementing floating-point operations. Furthermore, four popular databases of faces are used to benchmark the proposed architectures.
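
To show where the normalization step sits inside the algorithm, here is a floating-point software sketch of modified Gram-Schmidt orthogonalization. It corresponds to the kind of software baseline the abstract mentions, not to the fixed-point FPGA architectures, whose design the abstract does not describe. The function name `modified_gram_schmidt` is hypothetical, and the zero check only guards against an exactly zero residual.

```python
import math


def modified_gram_schmidt(vectors):
    """Orthonormalise linearly independent vectors given as lists of floats."""
    basis = []
    for v in vectors:
        w = list(v)
        # Subtract projections one at a time, always using the updated w;
        # this ordering is what distinguishes the modified variant.
        for q in basis:
            coeff = sum(qi * wi for qi, wi in zip(q, w))
            w = [wi - coeff * qi for qi, wi in zip(q, w)]
        # Normalisation: one square root and one division per component.
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm == 0.0:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis
```

Each output vector requires exactly one square root, which is the operation the abstract identifies as expensive to realize in floating-point hardware.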

Posters
Paper Nr: 34
Title:

OBSTACLE DETECTION AND AVOIDANCE ON SIDEWALKS

Authors:

D. Castells, J. M. F. Rodrigues and J. M. H. du Buf

Abstract: We present part of a vision system for blind and visually impaired people. It detects obstacles on sidewalks and provides guidance to avoid them. Obstacles are trees, light poles, trash cans, holes, branches, stones and other objects at a distance of 3 to 5 meters from the camera position. The system first detects the sidewalk borders, using edge information in combination with a tracking mask, to obtain straight lines with their slopes and the vanishing point. Once the borders are found, a rectangular window is defined within which two obstacle detection methods are applied. The first determines the variation of the maxima and minima of the gray levels of the pixels. The second uses the binary edge image and searches in the vertical and horizontal histograms for discrepancies in the number of edge points. Together, these methods allow the detection of possible obstacles with their position and size, such that the user can be alerted and informed about the best way to avoid them. The system works in real time and complements normal navigation with the cane.

Paper Nr: 40
Title:

A RELIABLE HYBRID TECHNIQUE FOR HUMAN FACE DETECTION

Authors:

Ayesha Hakim, Stephen Marsland and Hans W. Guesgen

Abstract: The progress of computer vision technology has opened new doors for interactive and friendly computer interfaces. Human face detection is an essential step of various human-related computer applications, including face recognition, emotion recognition, lip reading, and several intelligent human computer interfaces. Since it is the basic step in such applications, it must be reliable enough to support further steps. Several approaches to detecting human faces have been proposed so far, but none of them can detect faces in all different conditions such as varying lighting conditions; frontal, profile, tilted and rotated faces; occlusions by glasses, hijab, facial hair; and noise. We propose a more reliable hybrid approach that is able to detect human faces in multiple circumstances. Moreover, a brief, but comprehensive, review of the literature is presented that may be useful to evaluate any face detection system. Our proposed approach gives up to 97% accuracy on 600 images (both simple and complicated), which is the highest accuracy rate reported to date to our knowledge.

Paper Nr: 44
Title:

FRACTAL ANALYSIS TOOLS FOR CHARACTERIZING THE COLORIMETRIC ORGANIZATION OF DIGITAL IMAGES - Case Study using Natural and Synthetic Images

Authors:

Julien Chauveau, David Rousseau, Paul Richard and François Chapeau-Blondeau

Abstract: The colorimetric organization of RGB color images is analyzed through the computation of algorithms which can characterize fractal organizations in the support and population of their three-dimensional color histogram. These algorithms have shown that complex organizations across scales exist in the colorimetric domain for natural images with often non-integer fractal dimension over a certain range of scale. In this paper, we apply this method of colorimetric characterization to synthetic images produced by rendering techniques of increasing sophistication. We show that the fractal or scale invariant signatures are more pronounced when the realism of the synthetic images increases. Such results could have interesting applications to improve the colorimetric realism of synthetic images. This also may contribute to progress in classification and vision, in using fractal colorimetric properties to differentiate natural and synthetic images.

Paper Nr: 110
Title:

GRAPH MATCHING USING SIFT DESCRIPTORS - An Application to Pose Recovery of a Mobile Robot

Authors:

Gerard Sanromà, René Alquézar and Francesc Serratosa

Abstract: Image-feature matching based on Local Invariant Feature Extraction (LIFE) methods has proven to be successful, and SIFT is one of the most effective. SIFT matching uses only local texture information to compute the correspondences. A number of approaches have been presented that aim at enhancing the image-feature matches computed using only local information such as SIFT. What most of these approaches have in common is that they use higher-level information, such as the spatial arrangement of the feature points, to reject a subset of outliers. The main limitation of the outlier rejectors is that they are not able to enhance the configuration of matches by adding new useful ones. In the present work we propose a graph matching algorithm aimed not only at rejecting erroneous matches but also at selecting additional useful ones. We use both the graph structure to encode the geometrical information and the SIFT descriptors in the nodes' attributes to provide local texture information. This algorithm is an ensemble of successful ideas previously reported by other researchers. We demonstrate the effectiveness of our algorithm in a pose recovery application.

Paper Nr: 118
Title:

SHAPE FEATURES FOR MASS DIAGNOSIS IN MAMMOGRAPHIC IMAGES

Authors:

Ali Cherif Chaabani, Atef Boujelben, Adel Mahfoudhi and Mohamed Abid

Abstract: Mammography is the most efficient method for early mass detection and diagnosis. This paper deals with the problem of shape feature extraction in digital mammograms for mass diagnosis. We propose to combine region and boundary features in order to improve the diagnosis quality. For boundary analysis we propose to improve the RDM method by using an extended approach denoted XRDM. We also define a new feature (IA) based on angle calculation. Based on the literature, we exploit a set of region features that are the most used and the simplest for mass description. For experiments, we use the DDSM database and classifiers such as Multilayer Perceptron (MLP) and K-Nearest Neighbours (KNN). Using KNN classifiers, we obtained a sensitivity (percentage of pathological ROIs correctly classified) of 97.1%. The specificity (percentage of non-pathological ROIs correctly classified) reached about 95.63% using the MLP classifier.

Paper Nr: 222
Title:

TOWARDS LOW-COST ROBUST AND STABLE HAND TRACKING FOR EXERCISE MONITORING

Authors:

Rui Liu and Burkhard Wuensche

Abstract: Applications for home-based care are rapidly increasing in importance due to spiraling health care and elderly care costs. An important aspect of home-based care is exercises for rehabilitation and improving general health. However, without caregivers supervising these exercises it is difficult to monitor them, i.e., to determine whether the exercises have been performed correctly and for the prescribed duration. In this paper we present the first steps toward a computer-based tool for monitoring hand exercises. Hand exercises are important for various diseases such as Parkinson's disease. While many algorithms exist for gesture recognition, most of them require special set-ups and are difficult to use for very inexperienced users in home-based environments. In this paper we present a robust hand region segmentation method which represents the first step toward a hand-tracking algorithm. Our solution requires no calibration and is easily set up. We evaluate its robustness with regard to complex backgrounds, changes in illumination, and different hand colours. Our results indicate that the robust hand region segmentation provides a solid foundation for monitoring hand exercises.

Paper Nr: 244
Title:

DETECTING PERSONS USING HOUGH CIRCLE TRANSFORM IN SURVEILLANCE VIDEO

Authors:

Hong Liu, Yueliang Qian and Shouxun Lin

Abstract: Robust person detection in real-world images is interesting and important for a variety of applications, such as visual surveillance. In this paper we address the task of detecting persons in elevator surveillance scenes. To capture as many passengers in the lift car as possible, the camera is usually installed at a corner of the ceiling. However, the height and space of the lift car are limited, so persons occlude each other or some body parts are invisible in the captured images. In this paper, we propose a novel approach to detect head contours, which includes three main steps: pre-processing, head contour detection and post-processing. The Hough circle transform is adopted in the second stage, which is robust to discontinuous boundaries in circle detection. The proposed pre-processing and post-processing methods efficiently remove false alarms on the background or on body parts. Experimental results show that our proposed approach is time saving and gives better person detection results than some other methods.
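
To illustrate why a Hough circle transform tolerates discontinuous boundaries, the sketch below implements a basic voting scheme in plain Python. It is a generic textbook formulation, not the authors' detector: the function name, the angle sampling (`n_angles`) and the synthetic contour are assumptions for this example.

```python
import math
from collections import Counter


def hough_circles(edge_points, radii, n_angles=72):
    """Accumulate votes for circle cells (cx, cy, r) from edge points."""
    votes = Counter()
    for x, y in edge_points:
        for r in radii:
            cells = set()  # each edge point votes at most once per cell
            for k in range(n_angles):
                theta = 2.0 * math.pi * k / n_angles
                cells.add((round(x - r * math.cos(theta)),
                           round(y - r * math.sin(theta)), r))
            votes.update(cells)
    return votes


# Synthetic "head contour": a circle centred at (20, 15) with radius 8 whose
# last quarter is missing, i.e. a discontinuous boundary.
points = [(20 + 8 * math.cos(2.0 * math.pi * k / 72),
           15 + 8 * math.sin(2.0 * math.pi * k / 72)) for k in range(54)]

votes = hough_circles(points, radii=range(6, 11))
best_cell, best_votes = votes.most_common(1)[0]
```

Every remaining boundary point still votes for the true centre and radius, so the peak in the accumulator survives even though a quarter of the contour is absent. Real edge maps are noisier, which is where pre-processing and post-processing steps such as those described in the abstract come in.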

Area 3 - Image Understanding

Full Papers
Paper Nr: 26
Title:

EVALUATION OF PREYS / PREDATORS SYSTEMS FOR VISUAL ATTENTION SIMULATION

Authors:

M. Perreira Da Silva, V. Courboulay, A. Prigent and P. Estraillier

Abstract: This article evaluates different improvements of Laurent Itti’s (Itti et al., 1998) visual attention model. Sixteen persons have participated in a qualitative evaluation protocol on a database of 48 images. Six different methods were evaluated, including a random fixations generation model. A real time conspicuity maps generation algorithm is also described. Evaluations show that this algorithm allows fast maps generation while improving saliency maps accuracy. The results of this study reveal that preys / predators systems can help modelling visual attention. The relatively good performances of our centrally biased random model also show the importance of the central preference in attentional models.

Paper Nr: 42
Title:

FROM AERIAL IMAGES TO A DESCRIPTION OF REAL PROPERTIES - A Framework

Authors:

Philipp Meixner and Franz Leberl

Abstract: We automate the characterization of real property and propose a processing framework for this task. Information is being extracted from aerial photography and various data products derived from that photography in the form of a true orthophoto, a dense digital surface model and digital terrain model, and a classification of land cover. To define a real property, one has available a map of cadastral property boundaries. Our goal is to develop a table for each property with descriptive numbers about the buildings, their dimensions, number of floors, number of windows, roof shapes, impervious surfaces, garages, sheds, vegetation, the presence of a basement floor etc.

Paper Nr: 47
Title:

OBJECT RETRIEVAL BASED ON USER-DRAWN SKETCHES

Authors:

Sang Min Yoon and Arjan Kuijper

Abstract: Sketches drawn by users are one of the most intuitive forms of Human Computer Interaction. Users can easily express their intention by sketching simple hand-drawn lines. In this paper, we consider the problem of target object detection and retrieval from a query by a sketch which is not in the database. Our novel approach consists of three steps: (1) preprocessing to extract the skeletal features from a sketched query using size normalization, labelling, and binarization, (2) skeletal feature extraction of query and data images in the space of diffusion tensor fields, and (3) similarity measurement using tensorial information between the sketched query and the database to retrieve the most similar target object in the database. Experiments are conducted to evaluate the performance of our methodology, which proves to be an efficient and mature retrieval system.

Paper Nr: 163
Title:

HOLISTIC AND FEATURE-BASED INFORMATION TOWARDS DYNAMIC MULTI-EXPRESSIONS RECOGNITION

Authors:

Zakia Hammal and Corentin Massot

Abstract: Holistic and feature-based processing have both been shown to be involved differently in the analysis of facial expression by human observers. The current paper proposes a novel method based on the combination of both approaches for the segmentation of “emotional segments” and the dynamic recognition of the corresponding facial expressions. The proposed model is a new advancement of a previously proposed feature-based model for static facial expression recognition (Hammal et al., 2007). First, a new spatial filtering method is introduced for the holistic processing of the face towards the automatic segmentation of “emotional segments”. Second, the new filtering-based method is applied as a feature-based processing for the automatic and precise segmentation of the transient facial features and estimation of their orientation. Third, a dynamic and progressive fusion process of the permanent and transient facial feature deformations is made inside each “emotional segment” for a temporal recognition of the corresponding facial expression. Experimental results show the robustness of the holistic and feature-based analysis, notably for the analysis of multi-expression sequences. Moreover, compared to the static facial expression classification, the obtained performances increase by 12% and compare favorably to human observers’ performances.

Short Papers
Paper Nr: 22
Title:

AN INFANT FACIAL EXPRESSION RECOGNITION SYSTEM BASED ON MOMENT FEATURE EXTRACTION

Authors:

C. Y. Fang, H. W. Lin and S. W. Chen

Abstract: This paper presents a vision-based infant surveillance system utilizing infant facial expression recognition software. In this study, the video camera is set above the crib to capture the infant expression sequences, which are then sent to the surveillance system. The infant face region is segmented based on the skin colour information. Three types of moments, namely Hu, R, and Zernike are then calculated based on the information available from the infant face regions. Since each type of moment in turn contains several different moments, given a single fifteen-frame sequence, the correlation coefficients between two moments of the same type can form the attribute vector of facial expressions. Fifteen infant facial expression classes have been defined in this study. Three decision trees corresponding to each type of moment have been constructed in order to classify these facial expressions. The experimental results show that the proposed method is robust and efficient. The properties of the different types of moments have also been analyzed and discussed.

Paper Nr: 41
Title:

SEMI-SUPERVISED ESTIMATION OF PERCEIVED AGE FROM FACE IMAGES

Authors:

Kazuya Ueki, Masashi Sugiyama and Yasuyuki Ihara

Abstract: We address the problem of perceived age estimation from face images and propose a new semi-supervised age prediction method that involves two novel aspects. The first novelty is an efficient active learning strategy for reducing the cost of labeling face samples. Given a large number of unlabeled face samples, we reveal the cluster structure of the data and propose to label cluster representative samples for covering as many clusters as possible. This simple sampling strategy allows us to boost the performance of a manifold-based semi-supervised learning method with only a relatively small number of labeled samples. The second contribution is to take the heterogeneous characteristics of human age perception into account. It is rare to misjudge the age of a 5-year-old child as 15 years old, but the age of a 35-year-old person is often misjudged as 45 years old. Thus, the magnitude of the error differs depending on the subject’s age. We carried out a large-scale questionnaire survey for quantifying human age perception characteristics and propose to encode the quantified characteristics by weighted regression. Consequently, our proposed method is expressed in the form of weighted least-squares with a manifold regularizer, which is scalable to massive datasets. Through real-world age estimation experiments, we demonstrate the usefulness of the proposed method.
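
The weighted regression idea can be illustrated with a minimal closed-form weighted least-squares line fit. This sketch deliberately omits the manifold regularizer and the active learning step of the paper; the one-dimensional feature, the inverse-age weights and the function name are hypothetical choices for this example only.

```python
def weighted_line_fit(xs, ys, ws):
    """Fit y = slope * x + intercept by minimizing sum(w * (y - prediction)^2)."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    denom = sw * swxx - swx * swx
    if denom == 0:
        raise ValueError("degenerate input: all weighted x values coincide")
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return slope, intercept


# Hypothetical data: one scalar feature per face and the labelled age.
features = [0.0, 1.0, 2.0, 3.0]
ages = [5.0, 15.0, 25.0, 35.0]
# Assumed weighting: errors on young subjects count more than on older ones.
weights = [1.0 / age for age in ages]

slope, intercept = weighted_line_fit(features, ages, weights)
```

Because the sample data lie exactly on a line, any positive weighting recovers the same fit; with noisy labels, larger weights pull the fit towards the samples where perceptual errors matter most.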

Paper Nr: 53
Title:

SCENE CLASSIFICATION USING SPATIAL RELATIONSHIP BETWEEN LOCAL POSTERIOR PROBABILITIES

Authors:

Tetsu Matsukawa and Takio Kurita

Abstract: This paper presents scene classification methods using the spatial relationship between local posterior probabilities of each category. Recently, the authors proposed the probability higher-order local autocorrelations (PHLAC) feature. This method uses autocorrelations of local posterior probabilities to capture spatial distributions of local posterior probabilities of a category. Although PHLAC achieves good recognition accuracies for scene classification, we can improve the performance further by using crosscorrelation between categories. We extend PHLAC features to crosscorrelations of posterior probabilities of other categories. Also, we introduce the subtraction operator for describing another spatial relationship of local posterior probabilities, and present vertical/horizontal mask patterns for the spatial layout of auto/crosscorrelations. Since the number of category index combinations is large, we compress the proposed features by two-dimensional principal component analysis. We confirmed the effectiveness of the proposed methods using the Scene-15 dataset, and our method exhibited performance competitive with recent methods without using spatial grid information and even when using linear classifiers.

Paper Nr: 56
Title:

IMPROVING PERSON DETECTION IN VIDEOS BY AUTOMATIC SCENE ADAPTATION

Authors:

Roland Mörzinger and Marcus Thaler

Abstract: The task of object detection in videos can be improved by taking advantage of the continuity in the data stream, e.g. by object tracking. If tracking is not possible due to missing motion features, low frame rate, severe occlusions or rapid appearance changes, then a detector is typically applied in each frame of the video separately. In this case the run-time performance is impaired by exhaustively searching each frame at numerous locations and multiple scales. However, it is still possible to significantly improve the detector's performance if a static camera and a single planar ground plane can be assumed, which is the case in many surveillance scenarios. Our work addresses this issue by automatically adapting a detector to the specific yet unknown planar scene. In particular, during the adaptation phase robust statistics about few detections are used for estimating the appropriate scales of the detection windows at each location. Experiments with an existing person detector based on histograms of oriented gradients show that the scene adaptation leads to an improvement of both computational performance and detection accuracy. For scene specific person detection, changes to the implementation of the existing detector were made. The code is available for download. Results on benchmark datasets (9 videos from i-LIDS and PETS) demonstrate the applicability of our approach.

Paper Nr: 58
Title:

FACE RECOGNITION WITH HISTOGRAMS OF ORIENTED GRADIENTS

Authors:

Oscar Déniz, Gloria Bueno, Jesus Salido and Fernando de la Torre

Abstract: Histograms of Oriented Gradients (HOG) have recently been used as discriminating features for face recognition. In this paper we improve on that work in a number of aspects. As a first contribution, we identify the necessity of performing feature selection or transformation, especially if HOG features are extracted from overlapping cells. Second, the use of four different face databases allowed us to conclude that, if HOG features are extracted from facial landmarks, the error of landmark localization plays a crucial role in the absolute recognition rates achievable. This implies that the recognition rates can be lower for easier databases if landmark localization is not well adapted to them. This prompted us to extract the features from a regular grid covering the whole image. Overall, these considerations yield a significant recognition rate increase (up to 10% in some subsets) on the standard FERET database with respect to previous work.

Paper Nr: 65
Title:

THE STRUCTURAL FORM IN IMAGE CATEGORIZATION

Authors:

Juha Hanni, Esa Rahtu and Janne Heikkilä

Abstract: In this paper we present an unsupervised approach to finding the most natural organization of images. Previous methods proposed to discover the underlying categories or topics of visual objects either create no structure or use a structure, usually tree-shaped, that is defined in advance. This causes a problem since the most relevant structure of the data is not always known. It is worthwhile to consider a generic way to find the most suitable structure of images. For this, we apply the model of finding the structural form (among eight natural forms) to automatically discover the best organization of objects in the visual domain. The model simultaneously finds the structural form and an instance of that form that best explains the data. In addition, we present a generic structural form, the so-called meta structure, which can result in even more natural connections between clusters of images. We show that the categorization results are competitive with the state-of-the-art methods while giving more generic insight into the connections between different categories.

Paper Nr: 66
Title:

REAL-TIME ROAD SCENE CLASSIFICATION USING INFRARED IMAGES

Authors:

David Forslund, Per Cronvall and Jacob Roll

Abstract: This paper aims at applying real-time scene classification to the two-class problem of separating city and rural scenes in images constructed from an infrared sensor that is mounted at the front of a vehicle. The 'Bag of Words' algorithm for image representation has been evaluated and compared to two low-level methods, 'Edge Direction Histograms' and 'Invariant Moments'. A method for fast scene classification using the Bag of Words algorithm is proposed using a grey patch based algorithm for image element representation and a modified floating search for visual word selection. It is also shown empirically that floating search for visual word selection outperforms the currently popular k-means clustering for small vocabulary sizes.

Paper Nr: 76
Title:

REAL-TIME GENDER RECOGNITION FOR UNCONTROLLED ENVIRONMENT OF REAL-LIFE IMAGES

Authors:

Duan-Yu Chen and Kuan-Yi Lin

Abstract: Gender recognition is a challenging task in real-life images and surveillance videos owing to their relatively low resolution, uncontrolled environments and varying viewing angles of the human subject. Therefore, in this paper, a system of real-time gender recognition for real-life images is proposed. The contribution of this work is fourfold. A skin-color filter is first developed to filter out non-face noise. In order to make the system robust, a decision-making mechanism based on the combination of surrounding face detection, context-region enhancement and confidence-based weighting assignment is designed. Experimental results obtained using an extensive dataset show that our system is effective and efficient in recognizing gender in real-life images from uncontrolled environments.

Paper Nr: 80
Title:

TOWARDS GENERIC FITTING USING DISCRIMINATIVE ACTIVE APPEARANCE MODELS EMBEDDED ON A RIEMANNIAN MANIFOLD

Authors:

Pedro Martins and Jorge Batista

Abstract: A solution for Discriminative Active Appearance Models is proposed. The model consists of a set of descriptors which are covariances of multiple features evaluated over the neighborhood of the landmarks whose locations are governed by a Point Distribution Model (PDM). The covariance matrices are a special set of tensors that lie on a Riemannian manifold, which makes it possible to measure their dissimilarity and to update them, imposing temporal appearance consistency. The discriminative fitting method produces patch response maps found by convolution around the current landmark position. Since the minimum of the response map isn't always the correct solution due to detection ambiguities, our method finds candidate solutions based on a mean-shift algorithm, followed by an unsupervised clustering technique used to locate and group the candidates. A Mahalanobis-based metric is used to select the best solution that is consistent with the PDM. Finally, the global PDM optimization step is performed using a weighted least-squares warp update, based on the Lucas and Kanade framework. The weights were extracted from landmark matching score statistics. The effectiveness of the proposed approach was evaluated on unseen data on the challenging Talking Face video sequence, demonstrating the improvement in performance.

Paper Nr: 87
Title:

REAL-TIME ENHANCEMENT OF IMAGE AND VIDEO SALIENCY USING SEMANTIC DEPTH OF FIELD

Authors:

Zhaolin Su and Shigeo Takahashi

Abstract: In this paper, we propose a method for automatically directing viewers' visual attention to important regions of images and videos in low-level vision. Inspired by the modern model of visual attention, the importance map of an input scene is automatically calculated by the combination of low-level features such as intensity and color, which are extracted using spatial filters in different spatial frequencies, together with a set of temporal features extracted using a temporal filter in case of dynamic scenes. A variable-kernel-convolution based on the importance map is then performed on the input scene, in order to make semantic depth of field effects in a way that important regions remain focused while others are blurred. The pipeline of our method is efficient enough to be executed in real time on modern low-end machines, and the associated experiment demonstrates that the proposed system can be complementary to the human visual system.

Paper Nr: 106
Title:

MEASURING ATMOSPHERIC SCATTERING FROM DIGITAL IMAGE SEQUENCES

Authors:

Tarek El-Gaaly and Joshua Gluckman

Abstract: Current environmental monitoring devices are limited in their capability of measuring atmospheric particulate matter (PM) over large areas. Quantifying the visual degrading effects of atmospheric scattering in digital images of urban scenery and correlating these effects to PM levels is a vital step in more practically monitoring our environment. Currently, image haze removal (or dehazing) techniques exist which remove all the haze from a scene for the sole purpose of enhancing vision. This paper presents an extension to existing dehazing algorithms to use sequences of images captured over time and enforce a constant depth constraint. An experimental comparison of dehazing algorithms is then presented in the context of measuring atmospheric scattering and depth recovery using both simulation and depth measurements from real data.

Paper Nr: 126
Title:

FACE RECOGNITION USING MARGIN-ENHANCED CLASSIFIER IN GRAPH-BASED SPACE

Authors:

Ju-Chin Chen, Shang-You Shi and Jenn-Jier James Lien

Abstract: In this paper, we develop a face recognition system with a derived subspace learning method, i.e. a classifier-concerning subspace, where not only can the discriminant structure of the data be preserved but the classification ability can also be explicitly considered by introducing the Mahalanobis distance metric in the subspace. Most graph-based subspace learning methods find a subspace that preserves certain geometric and discriminant structure of the data but do not explicitly include the classification information from the classifier. Via the distance metric, which is constrained by the k-NN classification rule, the pairwise distance relation can be locally adjusted and thus the projected data in the classifier-concerning subspace are more suitable for the k-NN classifier. In addition, an iterative procedure is derived to avoid the overfitting problem. Experimental results show that the proposed system yields promising recognition results under various lighting, pose and expression conditions.

Paper Nr: 132
Title:

UNDERSTANDING OBJECT RELATIONS IN TRAFFIC SCENES

Authors:

Irina Hensel, Alexander Bachmann, Britta Hummel and Quan Tran

Abstract: An autonomous vehicle has to be able to perceive and understand its environment. At perception level objects are detected and classified using raw sensory data, while at situation interpretation level high-level object knowledge, like object relations, is required. In order to make a step towards bridging this gap between low-level perception and scene understanding we combine computer vision models with the probabilistic logic formalism Markov logic. The proposed approach allows for joint inference of object relations between all object pairs observed in a traffic scene, explicitly taking into account the scene context. Experimental results based on simulated data as well as on automatically segmented traffic videos from an on-board stereo camera platform are provided.

Paper Nr: 143
Title:

OBJECT DETECTION USING PICTORIAL STRUCTURE OF GABOR TEMPLATE

Authors:

Babak Saleh and Mohammad Rastegari

Abstract: Object detection methods are divided into two main branches: In the global approach one extracts low level features and uses machine learning techniques. In the part-based approach one uses deformable templates. We present a hybrid approach for constructing a deformable template for modeling and detection. Initially one applies Gabor wavelet filters to extract low level features and constructs graphs which resemble shock graphs. A minimum spanning tree (MST) is extracted and is called the pictorial graph. It is used for matching. The pictorial graph is suitable for preserving the visual appearance of the shape of the object and for accommodating shape variances. In this hybrid approach we maintain the generality of the global approach and the efficiency of part-based approaches. Our algorithm has been applied to a set of test cases and the results show improved performance as compared to standard object detection methods that do not rely on human intervention.
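
The step of extracting a minimum spanning tree from a feature graph can be sketched with Prim's algorithm. The abstract does not say which MST algorithm the authors use, so this is only one standard option; the node indices and edge weights below are hypothetical stand-ins for Gabor feature locations and their pairwise costs.

```python
import heapq


def minimum_spanning_tree(n_nodes, edges):
    """Prim's algorithm on an undirected graph given as (u, v, weight) tuples.

    Returns the tree edges as (u, v, weight), starting the growth from node 0.
    """
    adjacency = {i: [] for i in range(n_nodes)}
    for u, v, w in edges:
        adjacency[u].append((w, u, v))
        adjacency[v].append((w, v, u))

    visited = {0}
    heap = list(adjacency[0])
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < n_nodes:
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        tree.append((u, v, w))
        for item in adjacency[v]:
            if item[2] not in visited:
                heapq.heappush(heap, item)
    return tree


# Hypothetical feature graph with four nodes.
graph_edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 2.5), (2, 3, 1.5), (1, 3, 4.0)]
pictorial_graph = minimum_spanning_tree(4, graph_edges)
total_weight = sum(w for _, _, w in pictorial_graph)
```

A tree over the feature nodes keeps exactly one path between any two parts, which is what makes matching on such a structure tractable compared with matching on the full graph.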

Paper Nr: 149
Title:

CLASSIFICATION OF CHALLENGING MARINE IMAGERY

Authors:

Piyanuch Silapachote, Frank R. Stolle, Allen R. Hanson and Cynthia H. Pilskaln

Abstract: Covering over 70% of the Earth’s surface and containing over 95% of the planet’s water, the aquatic ecosystem has a great influence on many environmental functions. An indicator of the health of a marine habitat is its populations, estimated by taking underwater images and labeling various species. Designing an automated algorithm for this task is quite a challenge. Image quality tends to be low due to the dynamics of the water body. The diversity of shapes and motions among living plankton and non-living detritus is remarkable. We have applied two very different techniques from computer vision to the automatic labeling of tiny planktonic organisms. One is a common approach involving segmentation and calculations of statistical features. The other is inspired by the sophisticated visual processing in primates. Both achieved competitively high accuracies, comparable to general agreement among expert marine scientists. We found that a relatively simple biologically motivated system can be as effective as a more complicated classical schema in this domain.

Paper Nr: 168
Title:

AUTOMATIC FACIAL FEATURE DETECTION FOR FACIAL EXPRESSION RECOGNITION

Authors:

Taner Danisman, Marius Bilasco, Nacim Ihaddadene and Chabane Djeraba

Abstract: This paper presents a real-time automatic facial feature point detection method for facial expression recognition. The system is capable of detecting seven facial feature points (eyebrows, pupils, nose, and corners of mouth) in grayscale images extracted from a given video. The extracted feature points are then used for facial expression recognition. Neutral, happiness and surprise emotions have been studied on the Bosphorus dataset and tested on the FG-NET video dataset using OpenCV. We compared our results with previous studies on this dataset. Our experiments showed that the proposed method has the advantage of locating facial feature points automatically and accurately in real time.

Paper Nr: 171
Title:

A METHOD FOR SEGMENTING AND RECOGNIZING A VEHICLE LICENCE PLATE FROM A ROAD IMAGE

Authors:

Abdelhalim Boutarfa, Mahfoud Hamada and Emptoz Hubert

Abstract: To solve the problems of heavy traffic caused by the increase in the number of vehicles, modern cities need to establish effective automatic systems for traffic monitoring and management. One of the most useful systems is the License-Plate Recognition System, which captures images of vehicles and reads the plates’ registration numbers automatically. This paper presents a robust algorithm for segmenting and recognizing a vehicle license plate area from a road image. As preprocessing steps, we statistically analyze the features of some sample plate images and compute thresholds for each feature to decide whether a pixel is inside a plate or whether this cannot be decided. Our methodology starts by constructing the binary version of a road image according to the thresholds. Then, we select at most three strong candidate areas by searching the binary image with a moving window. The plate area is selected among the candidates with simple heuristics. Our algorithm is stable and robust against plate transformation and/or decolorization. The experimental results show 98.05% successful plate recognition for 256 input images.

Paper Nr: 175
Title:

HARDWARE ARCHITECTURE FOR OBJECT DETECTION BASED ON ADABOOST ALGORITHM

Authors:

Hui Xu, Feng Zhao and Ran Ju

Abstract: This paper implements a hardware architecture for object detection based on the AdaBoost learning algorithm and Haar-like features. To increase detection speed and reduce hardware consumption, an integral image calculation array with pipelined feature data flow is introduced. Input images are scanned by sub-windows and detected by cascade classifiers. Moreover, a special design is made to enhance the parallelism of the architecture. In comparison with the original design, detection speed is improved by a factor of three, with only a 5% increase in hardware consumption. The final hardware detection system, implemented on the Xilinx V2pro FPGA platform, reaches a detection speed of 80 fps and consumes 91% of the platform's resources.
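As a software illustration of the integral image idea mentioned in this abstract, the following minimal sketch builds a summed-area table and evaluates a two-rectangle Haar-like feature with a handful of table lookups. It is not the authors' hardware design; the function names (integral_image, rect_sum, two_rect_haar) and the tiny test image are assumptions made for illustration.

```python
def integral_image(img):
    """Return a summed-area table padded with one leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum above this row plus the running sum of the current row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar(ii, top, left, height, width):
    """Left-half minus right-half Haar-like feature (width is assumed even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Hypothetical 3x4 grayscale patch.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
table = integral_image(image)
print(rect_sum(table, 0, 0, 3, 4))
print(two_rect_haar(table, 0, 0, 3, 4))
```

Because every rectangle sum costs the same four lookups regardless of its size, this structure lends itself to the kind of pipelined evaluation the abstract describes.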

Paper Nr: 179
Title:

UNDERSTANDING PHOTOGRAPHIC COMPOSITION THROUGH DATA-DRIVEN APPROACHES

Authors:

Dansheng Mao, Ramakrishna Kakarala, Deepu Rajan and Shannon Lee Castleman

Abstract: Many elements contribute to a photograph's aesthetic value, including context, emotion, color, lightness, and composition. Of those elements, composition, which is how the arrangement of subjects, background, and features works together, is both highly challenging and yet amenable to understanding with computer vision techniques. Choosing famous monochrome photographs for which the composition is the dominant aesthetic contributor, we have developed data-driven approaches to understand composition. We obtain two novel results. The first shows relationships between the composition styles of master photographers based on their works, as obtained by analyzing extracted SIFT features. The second result, which relies on data obtained from eye-tracking equipment on both expert photographers and novices, shows that there are significant differences between them in what is salient in a photograph's composition.

Paper Nr: 205
Title:

STEREO VISION BASED VEHICLE DETECTION

Authors:

Benjamin Kormann, Antje Neve, Gudrun Klinker and Walter Stechele

Abstract: This paper describes a vehicle detection method using 3D data derived from a disparity map available in real time. The integration of a flat road model reduces the search space in all dimensions. Inclination changes are considered for the road model update. The vehicles, modeled as cuboids, are detected in an iterative refinement process for hypothesis generation on the 3D data. In a first step, the detection of a vehicle is performed by mean-shift clustering of plane-fitted segments that potentially belong together. In the second step, a u/v-disparity approach generates vehicle hypotheses covering vehicles of different appearance. The system was evaluated in real traffic scenes using a GPS system.

Paper Nr: 218
Title:

A GENDER RECOGNITION EXPERIMENT ON THE CASIA GAIT DATABASE DEALING WITH ITS IMBALANCED NATURE

Authors:

Raúl Martín Félez, Ramón A. Mollineda and J. Salvador Sánchez

Abstract: The CASIA Gait Database is one of the most used benchmarks for gait analysis among the few non-small-size datasets available. It is composed of gait sequences of 124 subjects, which are unequally distributed, comprising 31 women and 93 men. This imbalanced situation could correspond to some real contexts where men are in the majority, for example, a sports stadium or a factory. Learning from imbalanced scenarios usually requires suitable methodologies and performance metrics capable of managing and explaining biased results. Nevertheless, most of the reported experiments using the CASIA Gait Database in gender recognition tasks limit their analysis to global results obtained from reduced subsets, thus avoiding having to deal with the original setting. This paper uses a methodology to gain an insight into the discriminative capacity of the whole CASIA Gait Database for gender recognition under its imbalanced condition. The classification results are expected to be more reliable than those reported in previous papers.
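To illustrate why global accuracy can mislead under the 31/93 class split stated in this abstract, the sketch below compares plain accuracy with balanced accuracy (the mean of per-class recalls). The choice of balanced accuracy and the always-predict-majority classifier are assumptions made for illustration; the abstract does not say which metrics the paper uses.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def per_class_recall(y_true, y_pred, label):
    """Fraction of samples of a given true class that were predicted correctly."""
    predictions = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(1 for p in predictions if p == label) / len(predictions)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    labels = sorted(set(y_true))
    return sum(per_class_recall(y_true, y_pred, c) for c in labels) / len(labels)


# Class sizes taken from the abstract: 31 women ("F") and 93 men ("M").
y_true = ["F"] * 31 + ["M"] * 93
# Hypothetical degenerate classifier that always predicts the majority class.
y_pred = ["M"] * 124

print(accuracy(y_true, y_pred))           # 0.75
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

A classifier that never recognizes a woman still reaches 75% accuracy here, while the balanced figure exposes that it performs at chance level.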

Paper Nr: 225
Title:

AUTOMATIC LICENSE PLATE DETECTION IN COMPLEX CONDITIONS OF ACQUISITION

Authors:

L. A. D'Amore and M. Marengoni

Abstract: The work presented here shows a robust method for license plate detection. The term robust in this work is directly related to the efficiency of the system as an automated locator of license plates without human intervention, considering specific characteristics of image acquisition and license plate features. The proposed method is based on the thickness of the characters and digits found on Brazilian license plates. Although the method was designed for the Brazilian license plate pattern, it can be easily adjusted to other patterns. The results obtained using the proposed method showed a better performance even when compared to commercial systems.

Paper Nr: 231
Title:

BUILDING AND ROAD EXTRACTION ON URBAN VHR IMAGES USING SVM COMBINATIONS AND MEAN SHIFT SEGMENTATION

Authors:

Christophe Simler and Charles Beumier

Abstract: A method is proposed for building and road detection on very high spatial resolution multispectral aerial images of dense urban areas. First, objects are extracted with a segmentation algorithm in order to use both spectral and spatial information. Second, a spectral-spatial object-level pattern is formed, and then classification is performed using a 3-class SVM classifier, followed by post-processing using contextual information to handle conflicts. However, in the particular case where many building roofs are grey like the roads and have similar geometry, classification accuracy is inevitably limited. In order to overcome this limitation, different classifiers are combined and different patterns used, improving the accuracy by 10%.

Paper Nr: 232
Title:

ROBUST MULTIMODAL BIOMETRIC SYSTEM USING MARKOV CHAIN BASED RANK LEVEL FUSION

Authors:

Maruf Monwar and Marina Gavrilova

Abstract: Multimodal biometrics is an emerging area of pattern recognition research that aims at increasing the reliability of biometric systems by utilizing more than one biometric in the decision-making process. An effective fusion scheme is necessary for combining information from various sources. Such information can be integrated at several distinct levels, such as sensor level, feature level, match score level, rank level and decision level. In this research, we develop a multimodal biometric system utilizing face, iris and ear features through a rank level fusion method. We apply the Fisherimage technique on face and ear image databases for recognition, and Hough transform and Hamming distance techniques for iris image recognition. We introduce a Markov chain approach for biometric rank aggregation. We investigate various rank fusion techniques and observe that the Markov chain approach gives the best result. This approach also satisfies the Condorcet criterion, which is essential in any fair rank aggregation system. The system can be effectively used by security and intelligence services for controlling access to prohibited areas and protecting important national or public information.
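The following minimal sketch shows one well-known Markov chain rank aggregation scheme, usually called MC4 in the rank aggregation literature: from the current identity, a candidate is drawn uniformly at random, and the chain moves to it only if a majority of the rankers place it higher. The abstract does not state which chain the authors use, so this is an illustrative assumption, as are the damping (teleport) term added for ergodicity, the function names, and the three toy rank lists standing in for face, iris and ear matchers.

```python
def mc4_transition(rankings, candidates, damping=0.95):
    """Build an MC4-style transition matrix with a small uniform teleport term."""
    n = len(candidates)
    positions = [{c: r.index(c) for c in candidates} for r in rankings]
    P = [[0.0] * n for _ in range(n)]
    for i, current in enumerate(candidates):
        for j, other in enumerate(candidates):
            if i == j:
                continue
            # Count the rankers that place `other` above `current`.
            wins = sum(1 for pos in positions if pos[other] < pos[current])
            if 2 * wins > len(rankings):
                P[i][j] = 1.0 / n
        P[i][i] = 1.0 - sum(P[i])  # otherwise stay in the current state
    return [[damping * P[i][j] + (1.0 - damping) / n for j in range(n)]
            for i in range(n)]


def stationary(P, iterations=500):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def aggregate(rankings):
    """Order candidates by decreasing stationary probability."""
    candidates = sorted(rankings[0])
    pi = stationary(mc4_transition(rankings, candidates))
    return sorted(candidates, key=lambda c: -pi[candidates.index(c)])


# Hypothetical rank lists (best match first) from three matchers.
face = ["A", "B", "C"]
iris = ["B", "A", "C"]
ear = ["A", "C", "B"]
fused = aggregate([face, iris, ear])
print(fused)
```

In this toy input, identity A beats every other identity in a pairwise majority vote, so it is the Condorcet winner, and the chain concentrates its stationary mass on A.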

Paper Nr: 247
Title:

HIERARCHICAL CONDITIONAL RANDOM FIELD FOR MULTI-CLASS IMAGE CLASSIFICATION

Authors:

Michael Ying Yang, Wolfgang Förstner and Martin Drauschke

Abstract: Multi-class image classification has made significant advances in recent years through the combination of local and global features. This paper proposes a novel approach called hierarchical conditional random field (HCRF) that explicitly models the region adjacency graph and region hierarchy graph structure of an image. This makes it possible to set up a joint and hierarchical model of local and global discriminative methods that augments the conditional random field to a multi-layer model. The region hierarchy graph is based on a multi-scale watershed segmentation.

Posters
Paper Nr: 9
Title:

AN IMPROVED ILLUMINATION NORMALIZATION APPROACH BASED ON WAVELET TRANSFORM FOR FACE RECOGNITION FROM SINGLE TRAINING IMAGE PER PERSON

Authors:

Chun-Nian Fan and Fu-Yan Zhang

Abstract: Recent research on face recognition shows that illumination change is one of the key issues remaining to be addressed. To recognize faces under varying illumination with a single training image per person, we propose an improved wavelet-based normalization method. We use the wavelet transform to decompose an image into its low frequency and high frequency components. Then, we apply histogram equalization to the low frequency coefficients and de-noise the high frequency coefficients adaptively. Lastly, the high frequency coefficients are accentuated by multiplying by a scalar so as to enhance edges. A normalized image is obtained from the modified coefficients by the inverse wavelet transform. Among others, the proposed method has the following advantages: (1) it does not need any prior information about 3D shape or light sources, and it aims at addressing the illumination issue for face recognition from only one training image per person; (2) due to the multiscale nature of the wavelet transform, it has better edge-preserving ability in low frequency illumination fields; and (3) it is computationally feasible and fast. We use the PCA method to recognize normalized images with only one training image. The experimental results obtained by testing on the Yale face database B demonstrate the effectiveness of our method, with significant improvement in the face recognition system.

Paper Nr: 29
Title:

PATTERN RECOGNITION FOR FAULT DIAGNOSIS OF SOLAR POWER INVERTER BY TRAJECTORY IMAGE UNDERSTANDING

Authors:

JaeHo Hwang, Nanhwa Kim, Naejoung Kwak and WonPyo Hong

Abstract: This paper presents a pattern recognition approach to detecting and diagnosing faults of a solar power inverter by understanding its fault trajectory image. The drive system for simulation is modeled using Matlab Simulink toolboxes. The solar power device uses a control/filter structure to connect the pulse width modulation (PWM) inverter. Multistage diagnosis factors are calculated from the fault patterning procedure. It is based on the analysis of the vector trajectory and of the space syntax in faulty image mode.

Paper Nr: 33
Title:

THE POTENTIAL OF CONTOUR GROUPING FOR IMAGE CLASSIFICATION

Authors:

Christoph Rasche

Abstract: An image classification system is introduced that is predominantly based on a description of contours and their relations. A contour is described by geometric parameters characterizing its global aspects (arc or alternating) and its local aspects (degree of curvature, edginess, symmetry). To express the relation between contours, we use a multi-dimensional vector, whose parameters describe distances between contour points and the contours’ local aspects. This allows comparing, for instance, L features or parallel contours with a simple distance measure. The approach has been evaluated on two image collections (Caltech 101 and Corel) and shows a reasonable categorization performance, yet its future lies in exploiting the preprocessing to understand ’parts’ of the image.

Paper Nr: 84
Title:

VISUAL PITCH CLASS PROFILE - A Video-based Method for Real-time Guitar Chord Identification

Authors:

M. Cicconet, P. Carvalho, L. Velho and M. Gattass

Abstract: We propose a video-based method for real-time guitar chord identification which is analogous to the state-of-the-art audio-based method. While the method based on audio data uses the Pitch Class Profile feature and supervised Machine Learning techniques to "teach" the machine about the chord "shape", we use as a feature the approximate positions of fingertips on the guitar fretboard (what we call the Visual Pitch Class Profile), captured using special hardware. We show that visual- and audio-based methods have similar classification performance, but the former outperforms the latter with respect to immunity to noise caused by strumming.

Paper Nr: 123
Title:

USING SR-TREE IN A CONTENT-BASED AND LOCATION-BASED IMAGE RETRIEVAL SYSTEM

Authors:

Hien Phuong Lai, Nhu Van Nguyen, Alain Boucher and Jean-Marc Ogier

Abstract: This paper presents an approach for combining content-based and location-based information in an image retrieval system. Because of its performance for nearest neighbour queries on multidimensional data and for spatial data structuring, the SR-tree (Katayama and Satoh, 1997) structure is chosen for structuring the images simultaneously in location space and visual content space. The proposed approach also uses the SR-tree structure to organize various geographic objects of a Geographic Information System (GIS). We then apply this approach to a decision-aid system in a post-natural-disaster situation in which images describe different disasters and geographic objects are monuments registered in GIS data in the form of polygons. The proposed system aims at finding emergencies in the city after a natural disaster and giving them an emergency level. Some scenarios showing the interest of using content-based and location-based search in different ways are also presented and tested in the developed system.

Paper Nr: 139
Title:

NON-PARAMETRIC BAYESIAN ALIGNMENT AND RECOVERY OF OCCLUDED FACE USING DIRECT COMBINED MODEL

Authors:

Ching-Ting Tu and Jenn-jier James Lien

Abstract: This paper focuses on the problem of recovering an occluded facial image automatically with the aid of domain-specific prior knowledge, so that no manual face alignment or user-specified occlusion region is needed. The robust alignment and occlusion recovery are solved sequentially by a novel recovery scheme called the direct combined model (DCM). Local occluded facial patches are recovered by utilizing the information propagated from other non-occluded patches and are further constrained by a global facial geometry. The error residue between the recovered result and the geometric constraint is then used for updating the parameter of the alignment function for the next iteration. Within this recovery framework, DCM efficiently and robustly updates the results of recovery and alignment based on a compact statistical model representing the prior updating knowledge. Our extensive experimental results demonstrate that the recovered images are quantitatively closer to the ground truth with no manual alignment and occlusion detection.

Paper Nr: 151
Title:

FACIAL POSE ESTIMATION USING ACTIVE APPEARANCE MODELS AND A GENERIC FACE MODEL

Authors:

Thorsten Gernoth, Katerina Alonso Martínez, André Gooßen and Rolf-Rainer Grigat

Abstract: The complexity in face recognition emerges from the variability of the appearance of human faces. While the identity is preserved, the appearance of a face may change due to factors such as illumination, facial pose or facial expression. Reliable biometric identification relies on an appropriate response to these factors. In this paper we address the estimation of the facial pose as a first step to deal with pose changes. We present a method for pose estimation from two-dimensional images captured under active infrared illumination using a statistical model of facial appearance. An active appearance model is fitted to the target image to find facial features. We formulate the fitting algorithm using a smooth warp function, namely thin plate splines. The presented algorithm requires only a coarse and generic three-dimensional model of the face to estimate the pose from the detected feature locations. The desired field of application requires the algorithm to work with many different faces, including faces of subjects not seen during the training stage. A special focus is therefore on the evaluation of the generalization performance of the algorithm, which is one weakness of the classic active appearance model algorithm.

Paper Nr: 173
Title:

FAST NON-LINEAR NORMALIZATION ALGORITHM FOR IRIS RECOGNITION

Authors:

Wen-Shiung Chen, Jen-Chih Li, Ren-He Jeng, Lili Hsieh and Sheng-Wen Shih

Abstract: In biometrics, human iris recognition provides a high level of security. However, the size of the pupil always varies with different illumination, resulting in deformation of the iris texture. Thus, how to precisely predict the degree of iris deformation is an important issue. A fast algorithm that simply uses the law of cosines is proposed to make Yuan and Shi’s non-linear normalization model used in iris recognition suitable for real-time personal authentication applications.

Paper Nr: 191
Title:

MIXTURES OF GAUSSIAN DISTRIBUTIONS UNDER LINEAR DIMENSIONALITY REDUCTION

Authors:

Ahmed Otoom, Oscar Perez Concha and Massimo Piccardi

Abstract: High dimensional spaces pose a serious challenge to the learning process. It is a combination of limited number of samples and high dimensions that positions many problems under the “curse of dimensionality”, which restricts severely the practical application of density estimation. Many techniques have been proposed in the past to discover embedded, locally-linear manifolds of lower dimensionality, including the mixture of Principal Component Analyzers, the mixture of Probabilistic Principal Component Analyzers and the mixture of Factor Analyzers. In this paper, we present a mixture model for reducing dimensionality based on a linear transformation which is not restricted to be orthogonal. Two methods are proposed for the learning of all the transformations and mixture parameters: the first method is based on an iterative maximum-likelihood approach and the second is based on random transformations and fixed (non iterative) probability functions. For experimental validation, we have used the proposed model for maximum-likelihood classification of five “hard” data sets including data sets from the UCI repository and the authors’ own. Moreover, we compared the classification performance of the proposed method with that of other popular classifiers including the mixture of Probabilistic Principal Component Analyzers and the Gaussian mixture model. In all cases but one, the accuracy achieved by the proposed method proved the highest, with increases with respect to the runner-up ranging from 0.2% to 5.2%.

Paper Nr: 193
Title:

AN EVALUATION OF LOCAL IMAGE FEATURES FOR OBJECT CLASS RECOGNITION

Authors:

Saiful Islam and Andrzej Sluzek

Abstract: The use of local image features (LIF) for object class recognition is becoming increasingly popular. To better understand the suitability and power of existing LIFs for object class recognition, a simple but useful method is proposed for evaluating such features. We have compared the performance of eight frequently used LIFs by the proposed method on two popular databases. We have used the F-measure criterion for this evaluation. It is found that the individual performance of SURF and SIFT features is better than that of the global features on the ETH-80 database with a considerably lower number of training objects. However, it may not be good enough for more challenging object class recognition problems (e.g. Caltech-101). The evaluation of LIFs suggests the need for further investigation of more complementary LIFs.
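For readers unfamiliar with the F-measure criterion named in this abstract, the sketch below computes it from true positive, false positive and false negative counts. The beta parameter, the function name and the example counts are assumptions made for illustration; the abstract does not say which variant or data the paper uses.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 correct detections, 2 false alarms, 4 misses.
score = f_measure(tp=8, fp=2, fn=4)
print(round(score, 4))
```

With beta equal to 1 the value equals 2*tp / (2*tp + fp + fn), which makes it easy to check by hand.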

Paper Nr: 196
Title:

TOWARDS DETECTING PEOPLE CARRYING OBJECTS - A Periodicity Dependency Pattern Approach

Authors:

Tobias Senst, Rubén Heras Evangelio, Volker Eiselein, Michael Pätzold and Thomas Sikora

Abstract: Detecting people carrying objects is a commonly formulated problem whose results can be used as a first step in monitoring interactions between people and objects in computer vision applications. In this paper we propose a novel method for this task. By using gray-value information instead of the contours obtained by a segmentation process, we build a system that is robust against segmentation errors. Experimental results show the validity of the method.

Paper Nr: 212
Title:

A GENERIC CONCEPT FOR OBJECT-BASED IMAGE ANALYSIS

Authors:

André Homeyer, Michael Schwier and Horst K. Hahn

Abstract: Object-based image analysis enables the recognition of complex image structures that are intractable to conventional pixel-based methods. To date, there is no generally accepted approach for the object-based processing of images, thus making it difficult to transfer developments. In this paper, we propose a generic concept for object-based image analysis that is broadly applicable and founded on established methodologies, such as the attributed relational graph, the relational data model and statistical classifiers. We also describe a reference implementation of the concept as part of the MeVisLab image processing platform.

Paper Nr: 236
Title:

HIERARCHICAL OBJECT CLASSIFICATION USING IMAGENET DOMAIN ONTOLOGIES

Authors:

Haider Ali

Abstract: We present a binary tree based object classification method in this paper. The binary tree builds a group of classes using ImageNet domain ontologies. A binary decision function is introduced in the root node of the decision tree using the positive samples of the first group for training. The decision function continues dividing the groups into subsequent groups when approaching the leaf nodes and provides positive and negative samples for multi-class problems. We have tested our method on the PASCAL Visual Object Classes Challenge 2006 (VOC2006) dataset and have achieved comparable accuracy for group classification. The results show that the proposed method is a powerful class binarization technique for hierarchical object group classification.

Paper Nr: 239
Title:

SUPPRESSION OF UNCERTAINTIES AT EMOTIONAL TRANSITIONS - Facial Mimics Recognition in Video with 3-D Model

Authors:

Gerald Krell, Robert Niese, Ayoub Al-Hamadi and Bernd Michaelis

Abstract: Facial expression is of increasing importance for man-machine communication. It is expected that future human-computer interaction systems will even include the emotions of the user. In this work we present an associative approach based on a multi-channel deconvolution for processing facial expression data derived from video sequences, supported by a 3-D facial model generated with stereo support. Photogrammetric techniques are applied to determine real world geometric measures and to create a feature vector. Standard classification is used to discriminate between a limited number of mimics, but often fails at transitions from one detected emotion state to another. The proposed associative approach reduces ambiguities at the transitions between different classified emotions. In this way, typical patterns of facial expression change are considered.

Paper Nr: 253
Title:

DETECTION OF EXIT NUMBER FOR THE BLIND AT THE SUBWAY STATION

Authors:

Ho-Sub Yoon, Jae Yeon Lee and Eun-Mi Ji

Abstract: This paper presents an approach for detecting the exit number to enhance the safety and mobility of blind people while walking around a subway station. It is extremely important for a blind person to know whether the area ahead is the correct exit or not. At the crossing of the exit roads in Taejon subway station, the exit number is usually painted in black on a white background inside a blue circular contour. An image-based technique has been developed to detect the isolated number pattern at the crossing roads. The presence of exit numbers is inferred by careful analysis of numeral width, height, rate, number of numerals, as well as bandwidth trend. If there are several numeral candidates, we apply an OCR function. Experimental evaluation of the proposed approach was conducted using several real images with and without exit roads. It was found that the proposed technique performed with good accuracy.

Area 4 - Motion, Tracking and Stereo Vision

Full Papers
Paper Nr: 24
Title:

IMPROVED MULTISTAGE LEARNING FOR MULTIBODY MOTION SEGMENTATION

Authors:

Yasuyuki Sugaya and Kenichi Kanatani

Abstract: We present an improved version of the MSL method of Sugaya and Kanatani for multibody motion segmentation. We replace their initial segmentation based on heuristic clustering by an analytical computation based on GPCA, fitting two 2-D affine spaces in 3-D by the Taubin method. This initial segmentation alone can segment most of the motions in natural scenes fairly correctly, and the result is successively optimized by the EM algorithm in 3-D, 5-D, and 7-D. Using simulated and real videos, we demonstrate that our method outperforms the previous MSL and other existing methods. We also illustrate its mechanism by our visualization technique.

Paper Nr: 79
Title:

DYNAMIC GLOBAL OPTIMIZATION FRAMEWORK FOR REAL-TIME TRACKING

Authors:

João F. Henriques, Rui Caseiro and Jorge Batista

Abstract: Tracking is a crucial task in the context of visual surveillance. There are roughly three classes of trackers: the classical greedy algorithms (based on sequential modeling of targets, such as particle filters), Multiple Hypothesis Tracking (MHT) and its variants, and global optimizers (based on optimal matching algorithms from linear programming). We point out the shortcomings of all approaches, and set out to solve the only gaping deficiency of global optimization trackers, which is their inability to work with streamed video, in continual operation. We present an extension to the new Dynamic Hungarian Algorithm that achieves this effect, and show tracking results in such different conditions as the tracking of humans and vehicles, in different scenes, using the same set of parameters for our tracker.
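To make the contrast between greedy and globally optimal matching in this abstract concrete, the sketch below solves a small track-to-detection assignment problem in two ways. The optimal solver is a brute-force search over permutations that stands in for the Hungarian algorithm (it returns the same minimum-cost assignment but is only practical for tiny inputs), and it is not the dynamic extension the authors propose. The cost matrix and function names are assumptions made for illustration.

```python
from itertools import permutations


def total_cost(cost, assignment):
    """Sum of the costs of assigning track i to detection assignment[i]."""
    return sum(cost[i][j] for i, j in enumerate(assignment))


def greedy_assignment(cost):
    """Each track in turn takes its cheapest still-unassigned detection."""
    taken, result = set(), []
    for row in cost:
        free = [j for j in range(len(row)) if j not in taken]
        best = min(free, key=lambda j: row[j])
        taken.add(best)
        result.append(best)
    return result


def optimal_assignment(cost):
    """Exhaustive minimum-cost assignment (stand-in for the Hungarian algorithm)."""
    n = len(cost)
    return list(min(permutations(range(n)), key=lambda p: total_cost(cost, p)))


# Hypothetical costs: rows are tracks, columns are detections.
cost = [
    [1.0, 2.0, 9.0],
    [1.5, 10.0, 9.0],
    [9.0, 9.0, 3.0],
]
greedy = greedy_assignment(cost)
optimal = optimal_assignment(cost)
print(greedy, total_cost(cost, greedy))
print(optimal, total_cost(cost, optimal))
```

In this example the greedy choice for the first track blocks the only cheap option for the second, which is the kind of error a global optimizer avoids.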

Paper Nr: 86
Title:

A VISION-BASED HYBRID SYSTEM FOR REAL-TIME ACCURATE LOCALIZATION IN AN INDOOR ENVIRONMENT

Authors:

Vincent Gay-Bellile, Mohamed Tamaazousti, Romain Dupont and Sylvie Naudet Collette

Abstract: This paper presents an indoor vision-based system using a single camera for human localization. Without a priori knowledge of the operating environment, a map has to be built on-line to estimate the relative positions of the camera. When a model is known a priori, only the camera poses are computed. This results in distinct algorithms, each with assets and drawbacks. Localization in an unknown environment is much more flexible but subject to drift, while localization in a known environment is almost drift-free but suffers from recognition failures. We propose a new approach to localize a camera in an indoor environment. It combines both techniques described above, benefiting from the knowledge of georeferencing information to reduce the drift (compared to localization in an unknown environment) while preventing the user from being lost during long time intervals. Experimental results show the efficiency of our method.

Paper Nr: 155
Title:

VIEW-BASED APPEARANCE MODEL ONLINE LEARNING FOR 3D DEFORMABLE FACE TRACKING

Authors:

Stéphanie Lefèvre and Jean-Marc Odobez

Abstract: In this paper we address the issue of joint estimation of head pose and facial actions. We propose a method that can robustly track both subtle and extreme movements by combining two types of features: structural features observed at characteristic points of the face, and intensity features sampled from the facial texture. To handle the processing of extreme poses, we propose two innovations. The first one is to extend the deformable 3D face model Candide so that we can collect appearance information from the head sides as well as from the face. The second and main one is to exploit a set of view-based templates learned online to model the head appearance. This allows us to handle the appearance variation problem, inherent to intensity features and accentuated by the coarse geometry of our 3D head model. Experiments on the Boston University Face Tracking dataset show that the method can track common head movements with an accuracy of 3.2°, outperforming some state-of-the-art methods. More importantly, the ability of the system to robustly track natural/faked facial actions and challenging head movements is demonstrated on several long video sequences.

Paper Nr: 160
Title:

ROBUST KEY FRAME EXTRACTION FOR 3D RECONSTRUCTION FROM VIDEO STREAMS

Authors:

Mirza Tahir Ahmed, Matthew N. Dailey, Jose Luis Landabaso and Nicolas Herrero

Abstract: Automatic reconstruction of 3D models from video sequences requires selection of appropriate video frames for performing the reconstruction. We introduce a complete method for key frame selection that automatically avoids degeneracies and is robust to inaccurate correspondences caused by motion blur. Our method combines selection criteria based on the number of frame-to-frame point correspondences, Torr’s geometrical robust information criterion (GRIC) scores for the frame-to-frame homography and fundamental matrix, and the point-to-epipolar line cost for the frame-to-frame point correspondence set. In a series of experiments with real and synthetic data sets, we show that our method achieves robust 3D reconstruction in the presence of noise and degenerate motion.
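One of the criteria listed in this abstract, the point-to-epipolar line cost, can be sketched as the mean distance between each matched point in the second frame and the epipolar line induced by its partner in the first frame. The exact cost definition used by the authors is not given in the abstract, so the one-sided mean distance below, the function names, and the example fundamental matrix (a pure horizontal camera translation, where epipolar lines are image rows) are assumptions made for illustration.

```python
import math


def epipolar_line(F, point):
    """Line l' = F x in the second image for a point x = (u, v) in the first."""
    u, v = point
    return tuple(F[r][0] * u + F[r][1] * v + F[r][2] for r in range(3))


def point_line_distance(line, point):
    """Perpendicular distance from a 2D point to the line a*x + b*y + c = 0."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c) / math.hypot(a, b)


def mean_epipolar_cost(F, matches):
    """Average distance of second-frame points to their epipolar lines."""
    distances = [point_line_distance(epipolar_line(F, x1), x2) for x1, x2 in matches]
    return sum(distances) / len(distances)


# Fundamental matrix of a purely horizontal translation: the epipolar line of
# (u, v) is the row y = v in the second image.
F = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]
# Hypothetical correspondences ((u1, v1), (u2, v2)); the second is off by 2 rows.
matches = [((10.0, 20.0), (15.0, 20.0)), ((30.0, 40.0), (33.0, 42.0))]
cost = mean_epipolar_cost(F, matches)
print(cost)
```

A low value indicates that the correspondence set agrees with the estimated epipolar geometry, which is why such a cost can help reject frame pairs with unreliable matches.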

Paper Nr: 165
Title:

A THREE-LEVEL ARCHITECTURE FOR MODEL–FREE DETECTION AND TRACKING OF INDEPENDENTLY MOVING OBJECTS

Authors:

Nicolas Pugeault, Karl Pauwels, Mark M. Van Hulle, Florian Pilz and Norbert Krüger

Abstract: We present a three-level architecture for detection and tracking of independently moving objects (IMOs) in sequences recorded from a moving vehicle. At the first level, image pixels with an optical flow that is not entirely induced by the car’s motion are detected by combining dense optical flow, egomotion extracted from this optical flow, and dense stereo. These pixels are segmented and an attention mechanism is used to process them at finer resolution at the second level, making use of sparse 2D and 3D edge descriptors. Based on the rich and precise information at the second level, the full rigid motion for the environment and for each IMO is computed. This motion information is then used for tracking, filtering and the building of a 3D model of the street structure as well as the IMO. This multi-level architecture allows us to combine the strengths of both dense and sparse processing methods in terms of precision and computational complexity, and to dedicate more processing capacity to the important parts of the scene (the IMOs).

Paper Nr: 187
Title:

MULTI-CAMERA TOPOLOGY RECOVERY USING LINES

Authors:

Sang Ly, Cédric Demonceaux and Pascal Vasseur

Abstract: We present a topology estimation approach for a system of single view point (SVP) cameras using lines. Images captured by SVP cameras such as perspective, central catadioptric or fisheye cameras are mapped to spherical images using the unified projection model. We recover the topology of a multiple central camera setup by rotation and translation decoupling. The camera rotations are first recovered from vanishing points of parallel lines. Next, the translations are estimated from known rotations and line projections in spherical images. The proposed algorithm has been validated on simulated data and real images from perspective and fisheye cameras. This vision-based approach can be used to initialize an extrinsic calibration of a hybrid camera network.

Paper Nr: 206
Title:

4D MAP MRI IMAGE RECONSTRUCTION

Authors:

Jacob Hinkle, Ganesh Adluru, Eugene Kholmovski, Edward DiBella and Sarang Joshi

Abstract: Conventional MRI reconstruction techniques are susceptible to artifacts when imaging moving organs. In this paper, a reconstruction algorithm is developed that accommodates respiratory motion instead of using only navigator-gated data. The maximum a posteriori (MAP) algorithm uses the raw k-space time-stamped data and the 1D diaphragm navigator signal to reconstruct the images and estimate deformations in anatomy simultaneously. The algorithm eliminates blurring due to binning the data and increases signal-to-noise ratio (SNR) by using all of the collected data. The algorithm is tested in a simulated torso phantom and is shown to increase image quality by dramatically reducing motion artifacts.

Short Papers
Paper Nr: 6
Title:

IMPROVED KERNEL BASED TRACKING FOR FAST MOVING OBJECT

Authors:

Dang Xiaoyan, Yao Anbang, Wang Wei, Zhang Ya, Wang Zhuo and Wang Zhihua

Abstract: A novel approach combining discriminative object representation and multiple-kernel tracking is proposed. We first employ a discriminative object representation, which introduces a foreground and background modelling ingredient to select the most discriminative features from a set of candidates via a classification procedure. In the context of kernel-based tracking, a multiple-kernel strategy is employed to handle the difficulties resulting from fast motion by refining the ill-initialized position with a pre-refinement method. Extensive experiments demonstrate that the proposed tracker works better than Camshift and the traditional kernel tracker.

Paper Nr: 27
Title:

LANDMARK CONSTELLATION MATCHING FOR PLANETARY LANDER ABSOLUTE LOCALIZATION

Authors:

Bach Van Pham, Simon Lacroix, Michel Devy, Marc Drieux and Thomas Voirin

Abstract: A precise landing position is required for future planetary exploration missions in order to avoid obstacles on the surface or to get close to scientifically interesting areas. Nevertheless, the current Entry, Descent and Landing (EDL) technologies are still far from this capability, as the landing point is predicted with a dispersion of several kilometres. Therefore, research has been conducted to solve this absolute localization problem (also called "pinpoint landing"), which allows the spacecraft to localize itself within a known reference, namely orbital imagery. We propose an approach (nicknamed "Landstel") which relies on Landmark Constellation matching, offers an alternative to the current solutions and avoids the drawbacks of existing algorithms. The fusion of the inertial sensor relative motion estimation and the Landstel global position estimation yields a better global position estimate and higher system robustness. Position estimation results obtained both with standalone Landstel and with the fusion of INS-Landstel via a simulator are shown and analysed.

Paper Nr: 28
Title:

AUTOMATIC CONSTRUCTION OF HIERARCHICAL HIDDEN MARKOV MODEL STRUCTURE FOR DISCOVERING SEMANTIC PATTERNS IN MOTION DATA

Authors:

O. Samko, A. D. Marshall and P. L. Rosin

Abstract: The objective of this paper is to automatically build a Hierarchical Hidden Markov Model (HHMM) (Fine et al., 1998) structure to detect semantic patterns from data with an unknown structure by exploring the natural hierarchical decomposition embedded in the data. The problem is important for effective motion data representation and analysis in a variety of applications: film and game making, military, entertainment, sport and medicine. We propose to represent the patterns of the data as an HHMM built utilising a two-stage learning algorithm. The novelty of our method is that it is the first fully automated approach to build an HHMM structure for motion data. Experimental results on different motion features (3D and angular pose coordinates, silhouettes extracted from the video sequence) demonstrate that the approach is effective at automatically constructing an efficient HHMM whose structure naturally represents the underlying motion and allows for accurate modelling of the data for applications such as tracking and motion resynthesis.

Paper Nr: 50
Title:

TRACKING OF FACIAL FEATURE POINTS BY COMBINING SINGULAR TRACKING RESULTS WITH A 3D ACTIVE SHAPE MODEL

Authors:

Moritz Kaiser, Dejan Arsić, Shamik Sural and Gerhard Rigoll

Abstract: Accurate 3D tracking of facial feature points from one monocular video sequence is appealing for many applications in human-machine interaction. In this work facial feature points are tracked with a Kanade-Lucas-Tomasi (KLT) feature tracker and the tracking results are linked with a 3D Active Shape Model (ASM). Thus, the efficient Gauss-Newton method does not solve for the shift of each facial feature point separately but for the 3D position, rotation and the 3D ASM parameters, which are the same for all feature points. Thereby, not only are the facial feature points tracked more robustly, but the 3D position and the 3D ASM parameters can also be extracted. The Jacobian matrix for the Gauss-Newton optimization is split via the chain rule and the computations per frame are further reduced. The algorithm is evaluated on the basis of three hand-labeled video sequences and it outperforms the KLT feature tracker. The results are also comparable to two other tracking algorithms presented recently, whereas the method proposed in this work is computationally less intensive.

Paper Nr: 55
Title:

SPEEDED UP IMAGE MATCHING USING SPLIT AND EXTENDED SIFT FEATURES

Authors:

Faraj Alhwarin, Danijela Ristić-Durrant and Axel Gräser

Abstract: Matching feature points between images is one of the most fundamental issues in computer vision tasks. As the number of feature points increases, feature matching rapidly becomes a bottleneck. In this paper, a novel method is presented to accelerate feature matching by two modifications of the popular SIFT algorithm. The first modification is based on splitting the SIFT features into two types, Maxima- and Minima-SIFT features, and making comparisons only between features of the same type, which reduces the matching time to 50% with respect to the original SIFT. In the second modification, the SIFT feature is extended by a new attribute, which is an angle between two independent orientations. Based on this angle, SIFT features are divided into subsets and only features whose angles differ by less than a pre-set threshold value are compared. The performance of the proposed methods was tested on two groups of images: real-world stereo images and standard dataset images. The presented experimental results show that the feature matching step can be accelerated 18 times with respect to exhaustive search without losing a noticeable portion of correct matches.
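
The first modification described above (comparing only features of the same type) can be illustrated with a minimal sketch. This is not the authors' implementation: the `(kind, descriptor)` tuple layout, the function name `match_same_type`, the toy 2-D descriptors and the plain nearest-neighbour rule are all assumptions made here for illustration.

```python
def match_same_type(features_a, features_b):
    """Nearest-neighbour matching restricted to features of the same kind.

    Each feature is a hypothetical (kind, descriptor) pair, where kind is
    "max" or "min" and descriptor is a tuple of floats.
    Returns (matches, comparisons), where matches is a list of index pairs.
    """
    matches = []
    comparisons = 0
    for i, (kind_a, desc_a) in enumerate(features_a):
        best_j, best_dist = None, float("inf")
        for j, (kind_b, desc_b) in enumerate(features_b):
            if kind_a != kind_b:
                continue  # skip cross-type pairs entirely
            comparisons += 1
            # Squared Euclidean distance between descriptors.
            dist = sum((x - y) ** 2 for x, y in zip(desc_a, desc_b))
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None:
            matches.append((i, best_j))
    return matches, comparisons


# Toy data: two features in image A, four in image B, evenly split by kind.
image_a = [("max", (0.0, 0.0)), ("min", (1.0, 1.0))]
image_b = [("max", (0.0, 0.1)), ("min", (1.0, 1.1)),
           ("max", (5.0, 5.0)), ("min", (6.0, 6.0))]
matches, comparisons = match_same_type(image_a, image_b)
print(matches, comparisons)
```

With the kinds evenly split, the sketch performs 4 descriptor comparisons instead of the 8 an exhaustive search would need, which mirrors the halving mentioned in the abstract.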

Paper Nr: 59
Title:

MULTI-OBJECT TRACKING BASED ON SOFT ASSIGNMENT OF DETECTION RESPONSES

Authors:

Sami Huttunen and Janne Heikkilä

Abstract: We introduce a new detection-based method that is able to track multiple objects from a single camera. The method is built upon an approach that combines Kalman filtering and the Expectation Maximization (EM) algorithm. The benefit of this approach is that soft assignment of the detections to corresponding objects can be performed automatically using their a posteriori probabilities. This is a general approach for detection-based multi-object tracking, and there are various ways to detect the objects. In this paper, we demonstrate the applicability of the approach for tracking multiple pedestrians and faces using a basic cascade detector.
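
The soft-assignment idea can be sketched as the computation of posterior probabilities for one detection. The sketch below is only an illustration under stated assumptions, not the method of the paper: it assumes equal prior weights for all objects, an isotropic Gaussian likelihood with a hypothetical standard deviation `sigma` around each object's predicted 2-D position, and the function name `soft_assignments` is invented here.

```python
import math


def soft_assignments(detection, predictions, sigma=1.0):
    """Posterior probability that a 2-D detection belongs to each object.

    Assumes equal priors and an isotropic Gaussian likelihood centred on
    each predicted position (for example, a Kalman filter prediction).
    """
    log_likelihoods = [
        -((detection[0] - px) ** 2 + (detection[1] - py) ** 2) / (2 * sigma ** 2)
        for px, py in predictions
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_likelihoods)
    weights = [math.exp(value - peak) for value in log_likelihoods]
    total = sum(weights)
    return [w / total for w in weights]


# A detection halfway between two predicted positions is shared equally.
print(soft_assignments((5.0, 0.0), [(0.0, 0.0), (10.0, 0.0)]))
```

Each detection then contributes to every object's update in proportion to its posterior, rather than being committed to a single object.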

Paper Nr: 61
Title:

TIME-OF-FLIGHT BASED SCENE RECONSTRUCTION WITH A MESH PROCESSING TOOL FOR MODEL BASED CAMERA TRACKING

Authors:

Svenja Kahn, Harald Wuest and Dieter W. Fellner

Abstract: The most challenging algorithmic task for markerless Augmented Reality applications is the robust estimation of the camera pose. With a given 3D model of a scene the camera pose can be estimated via model-based camera tracking without the need to manipulate the scene with fiducial markers. Up to now, the bottleneck of model-based camera tracking has been the availability of such a 3D model. Recently, time-of-flight cameras were developed which acquire depth images in real time. With a sensor fusion approach combining the color data of a 2D color camera and the 3D measurements of a time-of-flight camera we acquire a textured 3D model of a scene. We propose a semi-manual reconstruction step in which the alignment of several submeshes with a mesh processing tool is supervised by the user to ensure a correct alignment. The evaluation of our approach shows its applicability for reconstructing a 3D model which is suitable for model-based camera tracking even for objects which are difficult to measure reliably with a time-of-flight camera due to their demanding surface characteristics.

Paper Nr: 67
Title:

DISPARITY MAPS FOR FREE PATH DETECTION

Authors:

Nuria Ortigosa, Samuel Morillas, Guillermo Peris-Fajarnés and Larisa Dunai

Abstract: In this paper we introduce a method to detect free paths in real-time using disparity maps from a pair of rectified stereo images. Disparity maps are obtained by processing the disparities between left and right rectified images from a stereo-vision system. The proposed algorithm is based on the fact that disparity values decrease linearly from the bottom of the image to the top. By applying least-squares fitting over groups of image columns to a linear model, free paths are detected. Only those pixels that fulfil the matching requirements are identified as free path. Results from outdoor scenarios are also presented.
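
The linear-disparity idea above can be sketched as follows. This is a minimal illustration rather than the authors' algorithm: the function names, the column-group width, the tolerance `tol`, and the choice to fit the line on the bottom half of the image only (on the assumption that the nearest ground is visible there) are all assumptions made here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def free_path_mask(disparity, group_width=2, tol=1.0):
    """Mark pixels whose disparity follows the fitted linear ground model.

    disparity is a list of rows (row 0 is the top of the image).
    """
    rows, cols = len(disparity), len(disparity[0])
    mask = [[False] * cols for _ in range(rows)]
    fit_rows = list(range(rows // 2, rows))  # assumed ground-only region
    for start in range(0, cols, group_width):
        group = range(start, min(start + group_width, cols))
        # Average the disparities of the column group row by row.
        profile = [sum(disparity[r][c] for c in group) / len(group)
                   for r in fit_rows]
        slope, intercept = fit_line(fit_rows, profile)
        for r in range(rows):
            expected = slope * r + intercept
            for c in group:
                mask[r][c] = abs(disparity[r][c] - expected) <= tol
    return mask


# Synthetic 8x4 map: ground disparity grows towards the bottom row,
# with an obstacle of constant disparity in the two right-hand columns.
scene = [[2.0 * r for _ in range(4)] for r in range(8)]
for r in (1, 2, 3):
    for c in (2, 3):
        scene[r][c] = 10.0
for row in free_path_mask(scene):
    print("".join("." if free else "#" for free in row))
```

In the printed mask, `.` marks pixels accepted as free path and `#` marks pixels rejected because their disparity departs from the linear model.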

Paper Nr: 90
Title:

RELIABLE LOCALIZATION AND MAP BUILDING BASED ON VISUAL ODOMETRY AND EGO MOTION MODEL IN DYNAMIC ENVIRONMENT

Authors:

Pangyu Jeong and Sergiu Nedevschi

Abstract: This paper presents a robust method for localization and map building in dynamic environments. The proposed approaches are general and can be used in both indoor and outdoor environments. The proposed localization is based on the global position relative to the initial departure position. In order to provide reliable positioning information, Visual Odometry (VO) is used instead of the ego robot’s encoder. Unlike general VO-based localization, the proposed VO does not use iterative refinement to select inliers. Instead, it uses an ego motion model based on the motion control. The rotation and translation values of tracked features are guided by the estimated rotation and translation values obtained from motion control; that is, the estimated motion provides upper and lower limits on the motion variation of the VO. This estimated boundary of motion variation helps to reject outliers among the tracked features. The rejected outliers represent tracked features of objects moving fast or slowly relative to the ego robot’s movement. The map is built along the ego robot’s path. In order to obtain rich 3D points in each frame, a temporal filter method based on an accumulated dense map is adapted.

Paper Nr: 91
Title:

APPARENT MOTION ESTIMATION USING PLANAR CONTOURS AND FOURIER DESCRIPTORS

Authors:

Fatma Chaker and Faouzi Ghorbel

Abstract: In this paper, we present a Fourier-based method for global apparent motion estimation. We apply this method to the estimation of the 2D affine transform linking two planar and closed curves. The originality of the method lies in estimating the parameters not in the original space but in the transformed space, that is, Fourier space. This technique does not require explicit point-to-point correspondences; in fact, such point correspondences are a by-product of the proposed algorithm. Experimental results and applications validate the use of our technique.

Paper Nr: 107
Title:

COMPUTATIONAL MODEL OF DEPTH PERCEPTION BASED ON FIXATIONAL EYE MOVEMENTS

Authors:

Norio Tagawa and Todorka Alexandrova

Abstract: The small vibration of the eyeball that occurs when we fix our gaze on an object is called "fixational eye movement." It has been reported that this function also works as a clue to monocular depth perception. Moreover, research on depth recovery methods using camera motions based on an analogy with fixational eye movement is in progress. We suppose that depth perception with fixational eye movement is carried out first, and that this depth information is subsequently used as a supplement for binocular stereopsis. In this study, using camera motions corresponding to the smallest type of fixational eye movement, called "tremor," we construct a depth perception algorithm which models camera motion as an irregular perturbation, and confirm its effectiveness.

Paper Nr: 125
Title:

COMPLEMENTARITY OF FEATURE POINT DETECTORS

Authors:

Guillaume Gales, Alain Crouzil and Sylvie Chambon

Abstract: The goal of this paper is to provide a study on the complementarity of feature point detectors. Many studies have been proposed on these detectors but none deals with complementarity in detail. We introduce an evaluation of eleven well-known detectors based on new criteria used to characterize complementarity. The complementarity is computed with spatial distribution and contribution measures as well as the repeatability and distribution gains of the association of two detectors.

Paper Nr: 133
Title:

ON THE POTENTIAL OF ACTIVITY RELATED RECOGNITION

Authors:

A. Drosou, K. Moustakas, D. Ioannidis and D. Tzovaras

Abstract: This paper proposes an innovative activity related authentication method for ambient intelligence environments, based on Hidden Markov Models (HMM). The biometric signature of the user is extracted during the performance of a couple of common, everyday office activities. Specifically, the behavioral response of the user to stimuli related to an office scenario, such as a phone conversation and the interaction with a keyboard panel, is examined. The motion based, activity related biometric features that correspond to the dynamic interaction with objects in the surrounding environment are extracted in the enrollment phase and are used to train an HMM. The authentication potential of the proposed biometric features has been seen to be very high in the performed experiments. Moreover, the combination of the results of these two activities further increases the authentication rate. Extensive experiments carried out on the proprietary ACTIBIO-database verify this potential of activity related authentication within the proposed scheme.

Paper Nr: 135
Title:

DETECTION OF ROAD CRACKS WITH MULTIPLE IMAGES

Authors:

Sylvie Chambon

Abstract: Extracting the defects of the road pavement in images is difficult and, most of the time, one image is used alone. The difficulties of this task are illumination changes, objects on the road, and artefacts due to the dynamic acquisition. In this work, we try to solve some of these problems by using acquisitions from different points of view. Consequently, we present a new methodology based on these steps: the detection of defects in each image, the matching of the images and the merging of the different extractions. We show the increase in performance and, more particularly, how false detections are reduced.

Paper Nr: 140
Title:

TOWARDS REAL-TIME NEURONAL DISPARITY MAP ESTIMATION

Authors:

Nadia Baha and Slimane Larabi

Abstract: We propose in this paper a new approach for fast disparity map estimation from a pair of stereo images. The disparity map computation is divided into two main steps. The first one computes the initial disparity map using a neuronal DSI (Disparity Space Image) method. The second one is a simple and fast method to refine the initial disparity map. New strategies and improvements are introduced so that an accurate and fast result can be obtained. In order to reduce the computing time, we implemented some steps of the proposed algorithm on FPGA. Experiments on real data sets were conducted to evaluate the proposed solutions, and a comparative evaluation of our method with two other methods is presented.

Paper Nr: 157
Title:

STRUCTURE FROM MOTION OF LONG VIDEO SEQUENCES

Authors:

Siyuan Fang and Neill Campbell

Abstract: In this paper we introduce an approach for "Structure from Motion" (SFM) from long video sequences. Our approach starts from an initialization on the first several frames and adopts an incremental strategy to allow more frames to be added into the SFM system. The main contribution is an update propagation that modifies the entire SFM system to accommodate changes brought by the local bundle adjustment applied to newly added frames. With this step, our approach gains a significant accuracy improvement at the cost of a relatively small extra computation overhead.

Paper Nr: 207
Title:

A MULTI-VIEW STEREO SYSTEM FOR ARTICULATED MOTION ANALYSIS

Authors:

Francesco Setti, Mariolino De Cecco and Alessio Del Bue

Abstract: In this paper we present a system for the motion segmentation of a human arm and the determination of its internal joint characteristics (position and degrees of freedom). In particular, we are interested in the segmentation of a set of 3D points lying over a pair of non-rigid bodies (arm and forearm) connected through a rotational joint (elbow). The complexity of the problem resides in the non-rigidity of the motion given by the human articulations and the soft tissues of the body (e.g. skin and muscles). In this work we address the aspects of 3D reconstruction by multi-stereo vision, frame-by-frame matching of the feature points, motion segmentation and the joint characteristics determination.

Paper Nr: 221
Title:

MULTIPLE-CUE FACE TRACKING USING PARTICLE FILTER EMBEDDED IN INCREMENTAL DISCRIMINANT MODELS

Authors:

Zi-Yang Liu, Ju-Chin Chen and Jenn-jier James Lien

Abstract: This paper presents a multi-feature integrated algorithm incorporating a particle filter and incremental linear discriminant models for face tracking purposes. To solve the drift problem, the discriminant models are constructed for the colour and orientation features to separate the face from the background clutter. The colour and orientation features are described in the form of part-wise concatenated histograms such that the global information and local geometry can be preserved. Additionally, the proposed adaptive confidence value for each feature is fused with the corresponding likelihood probability in a particle filter. To render the face tracking system more robust toward variations in the facial appearance and background scene, the LDA model for each feature is updated on a frame-by-frame basis by using the discriminant feature vectors selected in accordance with a co-training approach. The experimental results show that the proposed system deals successfully with face appearance variations (including out-of-plane rotations), partial occlusions, varying illumination conditions, multiple scales and viewpoints, and cluttered background scenes.

Paper Nr: 226
Title:

REAL-TIME CAMERA POSE ESTIMATION USING CORRESPONDENCES WITH HIGH OUTLIER RATIOS - Solving the Perspective n-Point Problem using Prior Probability

Authors:

Tobias Nöll, Alain Pagani and Didier Stricker

Abstract: We present PPnP, an algorithm capable of estimating a robust camera pose in real-time, even when provided with large sets of correspondences containing high ratios of outliers. For these situations, standard pose estimation algorithms using RANSAC are often unable to provide a solution, or at least not in the required time frame. PPnP is provided with a probability distribution function which describes all valid possible camera pose estimates. By checking whether the correspondences are compatible with the prior probability, it can be decided effectively at a very early stage which correspondences can be treated as outliers. This allows a considerably more effective selection of hypothetical inliers than in RANSAC. Although PPnP is based on a technique called BlindPnP, which is not intended for real-time computing, a number of changes in PPnP allow it to estimate a camera pose with the same high quality as BlindPnP while being considerably faster.

Paper Nr: 228
Title:

ROBUST DETECTION AND IDENTIFICATION OF PARTIALLY OCCLUDED CIRCULAR MARKERS

Authors:

Johannes Koehler, Alain Pagani and Didier Stricker

Abstract: In this paper we present a pipeline for the robust detection of partially occluded circular markers. Compared to square markers, occluded circular tags can be tracked in a more robust way, since the camera pose is in this case computed from the whole contour instead of only the four corners. We introduce a new ellipse detection technique based on a constrained RANSAC algorithm and pre-ellipse fit outlier removal to detect tag candidates with damaged borders. Digital codes are used to identify the actual markers afterwards, since correlation based marker identification approaches are not capable of handling occlusion. The key to error detection and correction is a suitable Reed-Solomon code together with a proper code layout on the marker. We show that markers covered up to 30% can be detected; moreover, our tracker has a very low risk of false positive marker detection.

Posters
Paper Nr: 15
Title:

HANDLING REPEATED SOLUTIONS TO THE PERSPECTIVE THREE-POINT POSE PROBLEM

Authors:

Michael Q. Rieck

Abstract: In the Perspective 3-Point Pose Problem (P3P), when the three reference points are equidistant from each other, this distance may be assumed to be one unit in length. A repeated solution to the problem then occurs when and only when $1 + R_1 R_2 + R_2 R_3 + R_3 R_1 - R_1^2 - R_2^2 - R_3^2 = 0$, where $R_1$, $R_2$ and $R_3$ are the squared distances from the camera’s focal point to the reference points. When the setup only approximately satisfies this equation, two nearly equal solutions can introduce substantial calculation errors. To better handle this circumstance, it may be preferable to behave as though the above equation holds precisely, and then invert a certain two-dimensional transformation to obtain the repeated solution. The inversion involves only a few basic arithmetic operations and square roots. This approach is more efficient, and more reliable, than the standard quartic equation approach to solving P3P, at least in this special case.
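
The repeated-solution condition stated above can be checked numerically. The sketch below only evaluates the left-hand side of that equation, reading the garbled terms as the squares of R1, R2 and R3; it does not implement the inversion described in the abstract. The function names and the two camera placements are illustrative assumptions: one camera sits on the circular cylinder through the three reference points (where, by direct calculation for this unit equilateral triangle, the expression vanishes), the other on the triangle's axis.

```python
import math


def repeated_solution_residual(r1, r2, r3):
    """Left-hand side of the condition; r1, r2, r3 are squared distances."""
    return 1 + r1 * r2 + r2 * r3 + r3 * r1 - r1 ** 2 - r2 ** 2 - r3 ** 2


def squared_distances(camera, points):
    """Squared Euclidean distances from the camera to each reference point."""
    return [sum((a - b) ** 2 for a, b in zip(camera, p)) for p in points]


# Unit equilateral triangle in the plane z = 0, centred at the origin.
RADIUS = 1 / math.sqrt(3)  # circumradius of a triangle with side length 1
TRIANGLE = [
    (RADIUS * math.cos(angle), RADIUS * math.sin(angle), 0.0)
    for angle in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)
]

# Assumed camera on the cylinder through the three points, height 2.5.
on_cylinder = (RADIUS * math.cos(0.7), RADIUS * math.sin(0.7), 2.5)
# Assumed camera on the triangle's axis, same height.
on_axis = (0.0, 0.0, 2.5)

print(repeated_solution_residual(*squared_distances(on_cylinder, TRIANGLE)))
print(repeated_solution_residual(*squared_distances(on_axis, TRIANGLE)))
```

A residual close to zero signals the near-degenerate case the abstract is concerned with; on the axis the three squared distances are equal and the expression reduces to 1.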

Paper Nr: 35
Title:

ITERATIVE DENSE CORRESPONDENCE CORRECTION THROUGH BUNDLE ADJUSTMENT FEEDBACK-BASED ERROR DETECTION

Authors:

Mauricio Hess-Flores, Mark A. Duchaineau, Michael J. Goldman and Kenneth I. Joy

Abstract: A novel method to detect and correct inaccuracies in a set of unconstrained dense correspondences between two images is presented. Starting with a robust, general-purpose dense correspondence algorithm, an initial pose estimate and dense 3D scene reconstruction are obtained and bundle-adjusted. Reprojection errors are then computed for each correspondence pair and used as a metric to distinguish high-error from low-error correspondences. An affine neighborhood-based coarse-to-fine iterative search algorithm is then applied only to the high-error correspondences to correct their positions. Such an error detection and correction mechanism is novel for unconstrained dense correspondences, for example those not obtained through epipolar geometry-based guided matching. Results indicate that correspondences in regions with issues such as occlusions, repetitive patterns and moving objects can be identified and corrected, such that a more accurate set of dense correspondences results from the feedback-based process, as shown by more accurate pose and structure estimates.

Paper Nr: 36
Title:

REAL-TIME HAND LOCATING BY MONOCULAR VISION

Authors:

Li Ding, Jiaxin Wang, Christophe Chaillou and Chunhong Pan

Abstract: Real-time hand locating by monocular vision faces a considerable challenge: tracking hands correctly under occlusion. This paper proposes a robust hand locating method which generates a possibility support map (PSM) by integrating information from a color model, a position model and a motion model. For better accuracy, hands are modeled as ellipses. The PSM depends on both previous model information and the relationship between models. Hand pattern search is then performed on the generated map in two steps: the first locates the center position of the hand, and the second determines its size and orientation. Our experimental results show that the proposed method is effective when one hand is occluded by the other. Our current prototype system processes images at 10 to 14 frames per second.

Paper Nr: 38
Title:

REAL-TIME MOVING OBJECT DETECTION IN VIDEO SEQUENCES USING SPATIO-TEMPORAL ADAPTIVE GAUSSIAN MIXTURE MODELS

Authors:

Katharina Quast, Matthias Obermann and André Kaup

Abstract: In this paper we present a background subtraction method for moving object detection based on Gaussian mixture models which performs in real-time. Our method improves the traditional Gaussian mixture model (GMM) technique in several ways. It takes into account spatial and temporal dependencies, as well as a limitation of the standard deviation leading to a faster update of the model and a smoother object mask. A shadow detection method which is able to remove the umbra as well as the penumbra in one single processing step is further used to get a mask that fits the object outline even better. Using the computational power of parallel computing we further speed up the object detection process.

Paper Nr: 73
Title:

BACKGROUND MODELING WITH MOTION CRITERION AND MULTI-MODAL SUPPORT

Authors:

Juan Rosell-Ortega, Gabriela Andreu-García, Fernando López-García and Vicente Atienza-Vanacloig

Abstract: In this paper we introduce an algorithm that creates a background model with multimodal support and associates a confidence value with the obtained model. Our algorithm creates the model based on a criterion of motion, pixel behavior and pixel similarity with the scene's background. This method uses only three frames to create a first model, without restrictions on the frame content. The model is adapted over time to reflect new situations and illumination changes in the scene. An approach to detecting a corrupt model is also described. The goal of the confidence value is to quantify the quality of the model after a number of frames have been used to build it. Quantitative experimental results are obtained with a well-known benchmark and compared to a classical background modelling algorithm, showing the benefits of our approach.

Paper Nr: 74
Title:

CATADIOPTRIC MULTIVIEW POSE ESTIMATION FOR ROBOTIC PICK AND PLACE

Authors:

Markus Heber, Matthias Rüther and Horst Bischof

Abstract: Robotic handling of objects requires exact knowledge of the object pose. In this work, we propose a novel vision system, allowing robust and accurate pose estimation of objects, which are grasped and held in unknown pose by an industrial manipulator. For superior robustness, we solely rely on object contour as a visual cue. We address the apparent problems of object symmetry and ambiguous perspective by acquiring multiple views of the object cheaply and accurately, through a mirror system. Self-calibration of the mirror setup allows us to model the mirror geometry and perform metric multiview contour matching with a known 3D model.

Paper Nr: 77
Title:

PERFORMANCE EVALUATION OF POINT MATCHING METHODS IN VIDEO SEQUENCES WITH ABRUPT MOTIONS

Authors:

Wael Elloumi, Sylvie Treuillet, Remy Leconge and Aïcha Fonte

Abstract: In this paper, we compare the performance of matching algorithms in terms of efficiency, robustness, and computation time. Our evaluation uses the number of inliers as the criterion for efficiency and robustness and is carried out on different video sequences with abrupt motions (translation, rotation, combined). We compare SIFT, SURF, cross-correlation with the Harris detector, and cross-correlation with the SURF detector. Our experiments show that abrupt movements strongly perturb the matching process. They also show that SURF is the most disturbed by such motions and even fails in cases with a large rotation, unlike the other methods such as SIFT and cross-correlation.

Paper Nr: 81
Title:

APPLICATION OF A HIGH-SPEED STEREOVISION SENSOR TO 3D SHAPE ACQUISITION OF FLYING BATS

Authors:

Yijun Xiao and Robert B. Fisher

Abstract: 3D shape acquisition of fast-moving objects is an emerging area with many potential applications. This paper presents a novel application of 3D acquisition for studying the dynamic external morphology of live bats in flight. The 3D acquisition technique is based on binocular stereovision. Two high-speed (500 fps) calibrated machine vision cameras are employed to capture intensity images from the bats simultaneously, and 3D shape information of the bats is derived from the stereo video recording. Since the high-speed stereovision system and the bat dynamic morphology study application are both novel, it was unknown to what extent the system could perform 3D acquisition of the bats’ shapes. We carried out experiments to evaluate the performance of the system using artificial objects in various controlled conditions, and the knowledge gained helped us deploy the system in the on-site data acquisition. Our analysis of the real data demonstrates the feasibility of gathering 3D dynamic measurements on bats’ bodies from a few selected feature points and the possibility of recovering dense 3D shapes of bat heads from the stereo video data acquired. Issues are revealed in the 3D shape recovery, most notably related to motion blur and occlusion.

Paper Nr: 85
Title:

ON-LINE PLANAR AREA SEGMENTATION FROM SEQUENCE OF MONOCULAR MONOCHROME IMAGES FOR VISUAL NAVIGATION OF AUTONOMOUS ROBOT

Authors:

Naoya Ohnishi, Yoshihiko Mochizuki, Atsushi Imiya and Tomoya Sakai

Abstract: We introduce an on-line segmentation of a planar area from a sequence of images for visual navigation of a robot. We assume that the robot moves autonomously in a man-made environment without any stored map in memory or any markers in the environment. Since the robot moves in a man-made environment, we can assume that the robot workspace is a collection of spatial plane segments. The robot needs to separate a ground plane from one or more images captured by an imaging system mounted on the robot. The ground plane defines a collision-free space for navigation. We develop a strategy for computing the navigation direction using a hierarchical expression of plane segments in the workspace. The robot is required to extract a spatial hierarchy of plane segments from images. We propose an algorithm for plane segmentation using an optical flow field captured by an uncalibrated moving camera.

Paper Nr: 92
Title:

GPU OPTIMIZER: A 3D RECONSTRUCTION ON THE GPU USING MONTE CARLO SIMULATIONS - How to Get Real Time without Sacrificing Precision

Authors:

Jairo R. Sánchez, Hugo Álvarez and Diego Borro

Abstract: The reconstruction of a 3D map is the key point of any SLAM algorithm. Traditionally these maps are built using non-linear minimization techniques, which need a lot of computational resources. In this paper we present a highly parallelizable stochastic approach that fits very well on graphics hardware. It can achieve the same precision as non-linear optimization methods without losing real-time performance. Results are compared against the well-known Levenberg-Marquardt algorithm using real video sequences.

Paper Nr: 104
Title:

SMART VISION SENSOR FOR VELOCITY ESTIMATION USING A MULTI-RESOLUTION ARCHITECTURE

Authors:

Mickael Quelin, Abdesselam Bouzerdoum and Son Lam Phung

Abstract: This paper presents a velocity estimator based on a digital version of the so-called Elementary Motion Detector (EMD). Inspired by insect vision, this model benefits from a low complexity motion detection algorithm and is able to estimate velocities in four directions. It can handle noisy images with a pre-filtering step which highlights the important features to be detected. Using a specific velocity-tuned detector called the Elementary Velocity Detector (EVD), applied to different resolutions of the same input, it gains time efficiency by estimating different speeds in parallel. The responses of the different EVDs are then combined at the input resolution size.

Paper Nr: 113
Title:

MR COMPATIBLE OPTICAL MOTION TRACKING - Building an Optical Tracking System for Head Motion Compensation in MRI

Authors:

Martin Hoßbach

Abstract: Magnetic Resonance Imaging (MRI), in spite of its potential in medical diagnosis, has one major drawback: image acquisition is a slow process, requiring the patient to not move for several minutes. This renders MRI useless in a number of cases. In the case of MR imaging of the head, optical motion tracking can be used for motion compensation, thereby greatly improving image quality. In this paper, an MR-compatible approach to tracking the patient’s head is presented which does not require his or her cooperation, based on stereo-optical marker tracking. It is adapted to work in the MRI scanner, does not influence the MR image acquisition and is easily integrated into clinical routine.

Paper Nr: 129
Title:

LANE DETECTION BASED ON GUIDED RANSAC

Authors:

Yi Hu, You-Sun Kim, Kuang-Wook Lee and Sung-Jea Ko

Abstract: In this paper, a robust and real-time lane detection method is proposed. The method consists of two steps: lane-marking detection and lane model fitting. After detecting the lane markings with the intensity-bump algorithm, we apply post-filters that constrain the parallelism of the lane boundaries. Then, a novel model fitting algorithm called Guided RANSAC is presented. Guided RANSAC searches for lanes starting from initial lane segments, and the extrapolation of lane segments is used as guiding information to elongate the segments recursively. With the proposed method, the accuracy of the model fitting is greatly increased while the computational cost is reduced. Both theoretical and experimental analysis results are given to show the efficiency.
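For context, the baseline that Guided RANSAC builds on can be sketched as follows. This is plain RANSAC line fitting with uniformly random sampling, not the guided variant proposed in the paper. The function name `ransac_line`, the threshold, the iteration count and the fixed seed are all assumptions made for illustration.

```python
import numpy as np


def ransac_line(points, n_iterations=200, inlier_threshold=0.5, seed=0):
    """Fit y = slope * x + intercept to 2D points with plain RANSAC.

    Returns (slope, intercept, inlier_mask). Assumes the dominant line is
    not vertical, since the final refit regresses y on x.
    """
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(pts), dtype=bool)

    for _ in range(n_iterations):
        # Hypothesis: the line through two randomly chosen points.
        i, j = rng.choice(len(pts), size=2, replace=False)
        direction = pts[j] - pts[i]
        norm = np.linalg.norm(direction)
        if norm == 0.0:
            continue  # coincident points define no line
        normal = np.array([-direction[1], direction[0]]) / norm
        # Perpendicular distance of every point to the hypothesised line.
        distances = np.abs((pts - pts[i]) @ normal)
        mask = distances < inlier_threshold
        if mask.sum() > best_mask.sum():
            best_mask = mask

    if best_mask.sum() < 2:
        raise ValueError("no line hypothesis gathered at least two inliers")

    # Least-squares refit on the consensus set only.
    slope, intercept = np.polyfit(pts[best_mask, 0], pts[best_mask, 1], deg=1)
    return slope, intercept, best_mask


if __name__ == "__main__":
    xs = np.arange(20.0)
    lane = np.column_stack([xs, 2.0 * xs + 1.0])  # synthetic "lane" points
    clutter = np.array([[1, 30], [5, -20], [10, 60], [15, -5], [18, 90]], float)
    slope, intercept, mask = ransac_line(np.vstack([lane, clutter]))
    print(slope, intercept, int(mask.sum()))
```

In this baseline every hypothesis is drawn blindly from all points. As the abstract describes it, the guided variant instead starts from initial lane segments and uses their extrapolation to decide where to search next.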

Paper Nr: 144
Title:

RESOLVING DATA-ASSOCIATION UNCERTAINTY - In Multi-object Tracking through Qualitative Modules

Authors:

Saira Saleem Pathan, Ayoub Al-Hamadi, Gerald Krell and Bernd Michaelis

Abstract: In real-time tracking, a crucial challenge is to efficiently build associations among the objects. However, real-time interferences (e.g. occlusion) cause errors in data association. In this paper, the uncertainties in data association that arise when discrete information is incomplete during occlusion are handled through qualitative reasoning modules. The formulation of the qualitative modules is based on exploiting human tracking abilities (i.e. common sense), which are integrated with a data association technique. Each detected object is described as a node in space with a unique identity and status tag, whereas association weights are computed using CWHI and the Bhattacharyya coefficient. These weights are input to the qualitative modules, which interpret the appropriate status of the objects while satisfying the fundamental constraints of object continuity during tracking. The results are linked with a Kalman filter to estimate the trajectories of the objects. The proposed approach has shown promising results, illustrating its contribution when tested on a set of videos representing various challenges.
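The Bhattacharyya coefficient mentioned above has a standard definition: for two normalized histograms p and q it is the sum over bins of sqrt(p_i * q_i). A minimal sketch follows. The function name is an assumption, and the way the paper combines this coefficient with CWHI is not described in the abstract and is not reproduced here.

```python
import numpy as np


def bhattacharyya_coefficient(hist_p, hist_q):
    """Similarity in [0, 1] between two histograms with the same binning.

    Inputs may be raw counts; each is normalized to sum to 1 first.
    A value of 1 means identical distributions, 0 means no shared bins.
    """
    p = np.asarray(hist_p, dtype=float)
    q = np.asarray(hist_q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("histograms must have the same number of bins")
    if p.sum() <= 0 or q.sum() <= 0:
        raise ValueError("histograms must contain at least one count")
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))


if __name__ == "__main__":
    # Hypothetical colour histograms of a tracked object in two frames.
    previous = [12, 30, 8, 0]
    current = [10, 28, 11, 1]
    print(bhattacharyya_coefficient(previous, current))
```

Because the inputs are normalized first, the coefficient depends only on the shape of the histograms and not on the size of the image patch they were computed from.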

Paper Nr: 158
Title:

INCREMENTAL LEARNING AND VALIDATION OF SEQUENTIAL PREDICTORS IN VIDEO BROWSING APPLICATION

Authors:

David Hurych and Tomáš Svoboda

Abstract: Loss-of-track detection (tracking validation) and automatic tracker adaptation to new object appearances are attractive topics in computer vision. We apply very efficient learnable sequential predictors in order to address both issues. Validation is done by clustering the sequential predictor responses. No additional object model for validation is needed. The paper also proposes an incremental learning procedure that accommodates changing object appearance, which mainly improves the recall of the tracker/detector. Exemplars for the incremental learning are collected automatically; no user interaction is required. The additional training examples are selected automatically using the tracker stability computed for each potential additional training example. Coupled with a sparsely applied SIFT- or SURF-based detector, the method is employed for object localization in videos. Our Matlab implementation scans video sequences up to eight times faster than the actual frame rate. A standard-length movie can thus be searched in a matter of minutes.

Paper Nr: 186
Title:

A FRAMEWORK TO IMPROVE MATCHING RESULTS OF WIDELY SEPARATED VIEWS

Authors:

Cosmin Ancuti, Codruta Orniana Ancuti and Philippe Bekaert

Abstract: Matching images is a crucial step in many computer vision applications. In this paper we present an alternative strategy built on the SIFT operator to solve the problem of wide-baseline matching. We first show how to add color information to the SIFT descriptors of extracted keypoints. In practice, the SIFT descriptor vector is blended with the main parameters (contrast, correlation and energy) of the color co-occurrence histogram computed in the same image patch. Afterward, in order to further improve the matching results for images taken under large variations of the camera viewpoint angle, the valid matches obtained by the previous strategy are employed to estimate the geometry between patches of corresponding keypoints. This overcomes the lack of affine invariance of the existing operators (including SIFT), allowing the use of a more appropriate region shape in which descriptors are calculated for better precision. In our experiments the proposed method shows a substantial improvement of the matching results compared with the results obtained by the original local operator.

Paper Nr: 220
Title:

INCREMENTAL DETECTION AND TRACKING OF MOVING OBJECTS BY OPTICAL FLOW AND A CONTRARIO METHOD

Authors:

Dora Luz Almanza-Ojeda, Michel Devy and Ariane Herbulot

Abstract: This paper concerns the detection and tracking of moving objects based on the a contrario theory and on a Kalman filtering process. Only visual information is acquired, from a B&W camera embedded on a mobile robot. KLT and the a contrario theory are used to initially detect and cluster moving points. Then, each detected group of moving points is tracked as a moving object using a Kalman filter. The detection-clustering-tracking process is executed iteratively to deal with some of the challenges of real robot navigation. Furthermore, the area in which a moving obstacle is detected is enlarged over time until it reaches its real limits: clusters are fused with already detected objects by considering the similarity of their respective velocities and positions. Experimental results on real dynamic images acquired from a camera mounted on a moving robot are presented and discussed.

Paper Nr: 227
Title:

3D RECONSTRUCTION USING PHOTO CONSISTENCY FROM UNCALIBRATED MULTIPLE VIEWS

Authors:

Heewon Lee and Alper Yilmaz

Abstract: This paper presents a new 3D object shape reconstruction approach, which exploits the homography transform and photo consistency between multiple images. The proposed method eliminates the requirement of dense feature correspondences, camera calibration, and pose estimation. Using planar homography, we generate a set of planes slicing the object into a set of parallel cross-sections in the 3D object space. For each object slice, we check photo consistency based on color observation. This approach in turn gives us the capability to express both convex and concave parts of the object. We show that applying our approach to a standard multiple-view dataset achieves comparatively better performance than a competing silhouette-based method.
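The planar homography that this approach relies on is a 3x3 matrix acting on homogeneous image coordinates. The sketch below shows only that basic mapping step. The function name `apply_homography` and the example matrix are assumptions, and the paper's slicing and photo-consistency procedure is not reproduced.

```python
import numpy as np


def apply_homography(H, points):
    """Map 2D points through a 3x3 homography H.

    points: array of shape (N, 2). Each point (x, y) is lifted to
    (x, y, 1), multiplied by H, and divided by the resulting third
    coordinate to return to pixel coordinates.
    """
    H = np.asarray(H, dtype=float)
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((pts.shape[0], 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


if __name__ == "__main__":
    # Hypothetical homography: a pure translation by (2, 3) pixels.
    H = np.array([[1.0, 0.0, 2.0],
                  [0.0, 1.0, 3.0],
                  [0.0, 0.0, 1.0]])
    print(apply_homography(H, [[1.0, 1.0], [0.0, 0.0]]))
```

A homography is defined only up to scale, so multiplying H by any non-zero constant yields the same mapped points.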

Paper Nr: 257
Title:

AUTOMATIC HEADLAMP SWITCHING SYSTEM USING ACCELEROMETERS

Authors:

Kai-Chi Chan and Yiu-Sang Moon

Abstract: This paper presents a two-sensor method to enhance nighttime driving safety. It consists of two accelerometers and an array of auxiliary swiveling headlamps. An alpha-beta filter is proposed to stabilize the readings of the accelerometers. With the kinematics of a turning car, the car’s turning path is predicted based on the steering angle measured by the accelerometers so that the relevant auxiliary swiveling headlamps will be switched on accordingly. In this paper, we study the performance of the alpha-beta filter. Test results demonstrate that our angular measurement method is an efficient way to provide proper road illumination along curved paths.
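An alpha-beta filter tracks a value and its rate of change with two fixed gains. The sketch below is the textbook form applied to a scalar sequence. The gains, the time step and the sample readings are assumptions for illustration, not the values used in the paper.

```python
def alpha_beta_filter(measurements, dt=1.0, alpha=0.85, beta=0.005):
    """Smooth scalar readings with a textbook alpha-beta filter.

    State: an estimate of the value and of its rate of change.
    Each step predicts the value forward by dt, then corrects both
    state variables in proportion to the prediction residual.
    """
    readings = list(measurements)
    if not readings:
        return []
    estimate = float(readings[0])  # initialise on the first reading
    rate = 0.0                     # assume no initial rate of change
    smoothed = []
    for z in readings:
        predicted = estimate + dt * rate
        residual = z - predicted
        estimate = predicted + alpha * residual
        rate = rate + (beta / dt) * residual
        smoothed.append(estimate)
    return smoothed


if __name__ == "__main__":
    # Hypothetical noisy angle readings in degrees.
    noisy_angles = [10.0, 10.4, 9.7, 10.2, 9.9, 10.1]
    print(alpha_beta_filter(noisy_angles, alpha=0.5, beta=0.1))
```

A smaller alpha trusts the prediction more and smooths harder, at the cost of a slower response to real changes in the measured quantity.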

Paper Nr: 258
Title:

3D POSE ESTIMATION FROM SILHOUETTES IN CYCLIC ACTIVITIES ENCODED BY A DENSE GAUSSIANS MIXTURE MODEL

Authors:

S. Amin Dadgar, Jean-Christophe Nebel and Dimitrios Makris

Abstract: This paper presents a system for 3D pose estimation of cyclic activities (e.g. walking, jogging). Principal Component Analysis is used to compress the high-dimensional space of poses. Human activities are encoded by Hidden Markov Models, overlaid on Gaussian Mixture Models. A generative approach based on the Annealed Particle Filter is used to estimate poses from silhouettes derived from a monocular camera. Experimental results indicate the value of the proposed Dense Gaussian Mixture Model when initialised by a gait cycle.