VISAPP 2009 Abstracts


Area 1 - Image Formation and Processing

Full Papers
Paper Nr: 24
Title:

GENERATING QUALITY TETRAHEDRAL MESHES FROM BINARY VOLUMES

Authors:

Mads Fogtmann Hansen, Jakob Andreas Bærentzen and Rasmus Larsen

Abstract: This paper presents two new quality measures for tetrahedra which are smooth and well-suited for gradient based optimization. Both measures are formulated as a distance from the regular tetrahedron and utilize the fact that the covariance of the vertices of a regular tetrahedron is isotropic. We use these measures to generate high quality meshes from signed distance maps. This paper also describes an approach for computing (smooth) signed distance maps from binary volumes, as volumetric data in many cases originate from segmentation of objects from imaging techniques such as CT, MRI, etc. The mesh generation is split into two stages: a candidate mesh generation stage and a compression stage, where the surface of the candidate mesh is moved to the zero iso-surface of the signed distance maps, while one of the quality measures ensures that the quality remains high. We apply the mesh generation algorithm on four examples (torus, Stanford dragon, brain mask, and pig back) and report the dihedral angle, aspect ratio and radius-edge ratio. Even though the algorithm incorporates none of the mentioned quality measures in the compression stage, it receives a good score for all these measures. In none of the examples is the minimum dihedral angle smaller than 15°.
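The isotropy property mentioned in the abstract above can be checked numerically. The following is a minimal sketch, not the authors' quality measure: the vertex coordinates (alternating corners of a cube) and the helper names `covariance` and `is_isotropic` are illustrative assumptions.

```python
# Regular tetrahedron placed on alternating corners of a cube
# (an illustrative choice of coordinates, not taken from the paper).
VERTICES = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0), (-1.0, -1.0, 1.0)]


def covariance(points):
    """Return the 3x3 population covariance matrix of a list of 3D points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n for j in range(3)]
        for i in range(3)
    ]


def is_isotropic(cov, tol=1e-12):
    """True if cov is a scalar multiple of the identity matrix (within tol)."""
    scale = cov[0][0]
    return all(
        abs(cov[i][j] - (scale if i == j else 0.0)) <= tol
        for i in range(3)
        for j in range(3)
    )


# Covariance of the regular tetrahedron: expected to be isotropic.
REGULAR_COV = covariance(VERTICES)

# Squashing the tetrahedron along z breaks the isotropy.
SQUASHED_COV = covariance([(x, y, 0.5 * z) for x, y, z in VERTICES])

if __name__ == "__main__":
    print(is_isotropic(REGULAR_COV), is_isotropic(SQUASHED_COV))
```

A deviation of the covariance from a scaled identity is one plausible way to express "distance from the regular tetrahedron"; the exact measures are defined in the paper itself.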

Paper Nr: 41
Title:

ESTIMATION OF INERTIAL SENSOR TO CAMERA ROTATION FROM SINGLE AXIS MOTION

Authors:

Lorenzo Sorgi

Abstract: The aim of the present work is to define a calibration framework to estimate the relative orientation between a camera and an inertial orientation sensor AHRS (Attitude Heading Reference System). Many applications in computer vision and in mixed reality frequently work in cooperation with this class of inertial sensors, in order to increase the accuracy and the reliability of their results. In this context the heterogeneous measurements must be represented in a unique common reference frame (rf.) in order to carry out a joint processing. The basic framework is given by the estimation of the vertical direction, defined by a 3D vector expressed in the camera rf. as well as in the AHRS rf. In this paper a new approach has been adopted to retrieve such direction by using different geometrical entities which may be inferred from the analysis of single axis motion projective geometry. Their performances have been evaluated on simulated data as well as on real data.

Paper Nr: 55
Title:

RELATIVE DISTANCE METHOD FOR LOSSLESS IMAGE COMPRESSION ON PARALLEL ARCHITECTURES - A New Approach for Lossless Image Compression on GPU

Authors:

Luca Bianchi, Riccardo Gatti, Luca Lombardi and Luigi Cinque

Abstract: Computer graphics and digital imaging have finally reached the goal of photorealism. This comes, however, at a huge cost in terms of memory and CPU needs. In this paper we present a lossless method for image compression using relative distances between pixel values belonging to separate and independent blocks. In our approach we try to reach a good balance between execution time and image compression rate. In a second step, considering the parallel characteristics of this algorithm (and, not least, the trend toward multi-core processors), a parallel version of this algorithm was implemented using the Nvidia CUDA architecture.

Paper Nr: 78
Title:

COMBINING TEXTURE SYNTHESIS AND DIFFUSION FOR IMAGE INPAINTING

Authors:

Aurélie Bugeau and Marcelo Bertalmío

Abstract: Image inpainting or image completion consists in filling in the missing data of an image in a visually plausible way. Many works on this subject have been proposed in recent years. They can mainly be decomposed into two groups: geometric methods and texture synthesis methods. Texture synthesis methods work best with images containing only textures, while geometric approaches are limited to smooth images containing strong edges. In this paper, we first present an extended state of the art. Then a new algorithm dedicated to both types of images is introduced. The basic idea is to decompose the original image into a structure and a texture image. Each of them is then filled in with some extensions of one of the best methods from the literature. A comparison with some existing methods on different natural images shows the strength of the proposed approach.

Paper Nr: 93
Title:

A MULTISCALE OPERATOR FOR DOCUMENT IMAGE BINARIZATION

Authors:

Neucimar Jerônimo Leite and Leyza Baldo Dorini

Abstract: Document image binarization consists of the segmentation of scanned gray level images into text and background, and is a basic preprocessing stage in many image analysis systems. It is essential to threshold the document image reliably in order to extract useful information and enable further processing such as character recognition and feature extraction. The main difficulties arise when dealing with poor quality document images containing, for example, nonuniform illumination, shadows and smudges. This paper presents an efficient morphological-based document image binarization technique that is able to cope with these problems. We evaluate the proposed approach for different classes of images, such as historical and machine-printed documents, obtaining promising results.

Paper Nr: 142
Title:

PHOTO REPAIR AND 3D STRUCTURE FROM FLATBED SCANNERS

Authors:

Ruggero Pintus, Thomas Malzbender, Oliver Wang, Ruth Bergman, Hila Nachlieli and Gitit Ruckenstein

Abstract: We introduce a technique that allows 3D information to be captured from a conventional flatbed scanner. The technique requires no hardware modification and allows untrained users to easily capture 3D datasets. Once captured, these datasets can be used for interactive relighting and enhancement of surface detail on physical objects. We have also found that the method can be used to scan and repair damaged photographs. Since the only 3D structure on these photographs will typically be surface tears and creases, our method provides an accurate procedure for automatically detecting these flaws without any user intervention. Once detected, automatic techniques, such as infilling and texture synthesis, can be leveraged to seamlessly repair such damaged areas. We first present a method that is able to repair damaged photographs with minimal user interaction and then show how we can achieve similar results using a fully automatic process.

Paper Nr: 263
Title:

MOIRÉ PATTERNS FROM A CCD CAMERA - Are They Annoying Artifacts or Can They be Useful?

Authors:

Tong Tu and Wooi-Boon Goh

Abstract: When repetitive high frequency patterns appear in the view of a charge-coupled device (CCD) camera, annoying low frequency Moiré patterns are often observed. This paper demonstrates that such Moiré patterns can be useful in measuring surface deformation and displacement. What is required, in our case, is that the surface in question is textured with appropriately aligned black and white line gratings and that this surface is imaged using a grey-scale CCD camera. The characteristics of the observed Moiré patterns are described along with a spatial domain model-fitting algorithm that is able to extract dense camera-to-surface displacement measures. The experimental results discuss the reconstruction of planar inclined and curved surfaces using only coarse 33 lines per inch line grating patterns printed from a 600 dpi printer.
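Why a fine grating shows up as a coarse pattern on a sampled sensor can be illustrated with the standard aliasing relation for a uniformly sampled sinusoid. This is a simplified 1D sketch under stated assumptions, not the model-fitting algorithm of the paper; the function names and the example frequencies are hypothetical.

```python
def alias_frequency(signal_freq, sampling_freq):
    """Apparent frequency of a sinusoid of signal_freq after uniform sampling.

    Standard aliasing relation: the observed frequency is the distance from
    signal_freq to the nearest integer multiple of sampling_freq.
    """
    nearest_multiple = round(signal_freq / sampling_freq) * sampling_freq
    return abs(signal_freq - nearest_multiple)


def moire_period(signal_freq, sampling_freq):
    """Period of the aliased (low frequency) pattern, in sample units."""
    f_alias = alias_frequency(signal_freq, sampling_freq)
    if f_alias == 0.0:
        return float("inf")  # pattern aligns exactly with the sampling grid
    return 1.0 / f_alias


if __name__ == "__main__":
    # Hypothetical numbers: a grating at 0.95 cycles per pixel, sampled at
    # 1 sample per pixel, appears as a slow pattern of 0.05 cycles per pixel.
    print(alias_frequency(0.95, 1.0), moire_period(0.95, 1.0))
```

In this simplified picture, a small change in the imaged grating frequency (for instance from a change in camera-to-surface distance) produces a large change in the period of the aliased pattern, which is what makes such patterns sensitive measurement cues.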

Short Papers
Paper Nr: 40
Title:

GRAYTONE IMAGE METAMORPHOSIS USING 3D INTERPOLATION FUNCTION

Authors:

Marcin Iwanowski

Abstract: The image metamorphosis process produces a deformation sequence which transforms one input image into another. The method described in the paper applies a morphological approach to achieve this goal. It is based on morphological interpolation, which makes use of interpolation functions produced from geodesic distance functions. The described method allows this approach to be applied to graytone images via their 3D umbra. It produces a 3D interpolation function. Thresholding this function at a given level, followed by the inverse umbra transform, yields a frame of the interpolated sequence.

Paper Nr: 44
Title:

INDOOR PTZ CAMERA CALIBRATION WITH CONCURRENT PT AXES

Authors:

Jordi Sanchez-Riera, Jordi Salvador and Josep R. Casas

Abstract: The introduction of active (pan-tilt-zoom or PTZ) cameras in Smart Rooms, in addition to fixed static cameras, makes it possible to improve resolution in volumetric reconstruction, adding the capability to track smaller objects with higher precision in actual 3D world coordinates. To accomplish this goal, precise camera calibration data should be available for any pan, tilt, and zoom settings of each PTZ camera. The PTZ calibration method proposed in this paper introduces a novel solution to the problem of computing extrinsic and intrinsic parameters for active cameras. We first determine the rotation center of the camera expressed under an arbitrary world coordinate origin. Then, we obtain an equation relating any rotation of the camera with the movement of the principal point to define extrinsic parameters for any value of pan and tilt. Once this position is determined, we compute how intrinsic parameters change as a function of zoom. We validate our method by evaluating the re-projection error and its stability for points inside and outside the calibration set.

Paper Nr: 56
Title:

PARALLEL LOSSY COMPRESSION FOR HD IMAGES - A New Fast Image Magnification Algorithm for Lossy HD Video Decompression Over Commodity GPU

Authors:

Luca Bianchi, Riccardo Gatti, Luca Lombardi and Luigi Cinque

Abstract: Today High Definition (HD) for video contents is one of the biggest challenges in computer vision. The 1080i standard defines the minimum image resolution required to be classified as HD mode. At the same time bandwidth constraints and latency don’t allow the transmission of uncompressed, high resolution images. Often lossy compression algorithms are involved in the process of providing HD video streams, because of their high compression rate capabilities. The main issue with these methods, while processing frames, is that high-frequency components in the image are neither conserved nor reconstructed. Our approach uses a simple downsampling algorithm for compression, but a new, very accurate method for decompression which is capable of restoring high frequencies. Our solution is also highly parallelizable and can be efficiently implemented on a commodity parallel computing architecture, such as a GPU, obtaining extremely fast performance.

Paper Nr: 57
Title:

A NOVEL APPROACH FOR NOISE REDUCTION IN THE GABOR TIME-FREQUENCY DOMAIN

Authors:

Behnaz Pourebrahimi and Jan C. A. van der Lubbe

Abstract: In this paper, a noise reduction technique is introduced based on the Gabor time-frequency transform. In the proposed approach, noise is removed using low pass filters locally in the transform domain. Finding the cut-off frequency for the low pass filters in such a way that the image does not lose its features is an important issue. The optimal cut-off frequencies of the low pass filters are computed by an iterative method for each sub-block of the image. Besides showing good performance in removing noise, the approach also performs well in preserving image features.

Paper Nr: 99
Title:

HSV-DOMAIN ENHANCEMENT OF HIGH-CONTRAST IMAGES - Power Laws and Unsharp Masking for Bounded and Circular Signals

Authors:

Alfredo Restrepo Palacios, Stefano Marsi and Giovanni Ramponi

Abstract: We present techniques for the amplification of small contrast of bounded signals; one is based on gamma correction and another is of an unsharp-masking type; the one of the unsharp-masking type is suitably modified for its application on circular signals as well. We enhance the saturation and luminance components of high dynamic range images on the basis of a segmentation of the image into light and dark regions.
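The gamma-correction idea mentioned in the abstract above can be illustrated with a generic power law on a signal bounded in [0, 1]. This is a minimal sketch of the textbook operation, not the authors' technique; the function name and the sample values are assumptions chosen for illustration.

```python
def gamma_correct(value, gamma):
    """Apply the power law y = x ** gamma to a value bounded in [0, 1].

    The output stays in [0, 1]. With gamma < 1, differences between small
    (dark) values are amplified; with gamma > 1, differences between large
    (bright) values are amplified.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    return value ** gamma


if __name__ == "__main__":
    # Two dark samples that differ by 0.05 before correction.
    dark_a, dark_b = 0.04, 0.09
    out_a = gamma_correct(dark_a, 0.5)
    out_b = gamma_correct(dark_b, 0.5)
    # After a square-root law the same pair is separated by about 0.1.
    print(out_b - out_a)
```

Because the mapping keeps the endpoints 0 and 1 fixed, it amplifies small contrast in one part of the range without pushing the signal outside its bounds, which is the property that matters for bounded components such as saturation and luminance.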

Paper Nr: 113
Title:

A SVD BASED IMAGE COMPLEXITY MEASURE

Authors:

David Gustavsson, Kim Steenstrup Pedersen and Mads Nielsen

Abstract: Images are composed of geometric structures and texture, and different image processing tools - such as denoising, segmentation and registration - are suitable for different types of image contents. Characterization of the image content in terms of geometric structure and texture is an important problem that one is often faced with. We propose a patch based complexity measure, based on how well the patch can be approximated using singular value decomposition. As such the image complexity is determined by the complexity of the patches. The concept is demonstrated on sequences from the newly collected DIKU Multi-Scale image database.

Paper Nr: 118
Title:

IMAGE RETARGETING USING STABLE PATHS

Authors:

Hélder P. Oliveira and Jaime S. Cardoso

Abstract: Media content adaptation is the action of transforming media files to adapt to device capabilities, usually related to mobile devices that require special handling because of their limited computational power, small screen size and constrained keyboard functionality. Image retargeting is one such adaptation, transforming an image into another of a different size. Tools allowing the author to create imagery once and automatically retarget that imagery for a variety of different display devices are therefore of great interest. The performance of these algorithms is directly related to the preservation of the most important regions and features of the image. In this work, we introduce an algorithm for automatically retargeting images. We explore and extend a recently proposed algorithm in the literature. The central contribution is the introduction of stable paths for image resizing, improving both the computational performance and the overall quality of the resulting image. The experimental results confirm the potential of the proposed algorithm.

Paper Nr: 134
Title:

SHOPPING BY EXAMPLE - A New Shopping Paradigm in Next Generation Retail Stores

Authors:

Ashish Khare, Hiranmay Ghosh and Jaideep Shankar Jagannathan

Abstract: In this paper, we present a new example based approach to search for a particular product based on its visual properties. A user can take a photo of a product package with a cell-phone or webcam and submit it to an online shopping portal to find the product details. We search a product image database for the distinctive visual features of the query image to locate the desired product. We use PCA-SIFT features for robust retrieval, to account for possible imperfections in the query image due to the uncontrolled user environment. We use Oracle Java R-Tree to index image features to realize a scalable system. We establish the robustness and scalability of our approach by conducting several experiments on fairly large prototype implementations.

Paper Nr: 148
Title:

HIERARCHICAL ONLINE IMAGE REPRESENTATION BASED ON 3D CAMERA GEOMETRY

Authors:

Sang Min Yoon and Holger Graf

Abstract: In this paper, we present a hierarchical online image representation method based on 3D camera position to efficiently summarize and classify images on the web. The framework of our proposed hierarchical online image representation methodology is composed of multiple layers: at the lowest layer in the hierarchical structure, the relationship between multiple images is represented by their 3D camera parameters, recovered by automatic feature detection and matching. At the upper layers, images are classified using constrained agglomerative hierarchical image clustering techniques, in which the feature space established at the lowest layer consists of the camera’s 3D position. The constrained agglomerative hierarchical online image clustering method efficiently balances the hierarchical layers regardless of how many images a cluster contains. Our proposed hierarchical online image representation method can be used to classify online images within large image repositories by their camera view position and orientation. It provides a convenient way to browse, navigate and categorize online images that have various view points, illumination, and partial occlusion.

Paper Nr: 185
Title:

INSCRIBED CONVEX SETS AND DISTANCE MAPS - Application to Shape Classification and Spatially Adaptive Image Filtering

Authors:

Frédérique Robert-Inacio

Abstract: This paper presents two original applications related to discrete distance maps. Based on the relation linking inscribed convex sets and discrete distance maps, the first application is a spatially adaptive filtering method which is set up for both grey-level and color images. This spatially adaptive filter is efficient in both performance and computation time. Furthermore, a new means of computing the Asplund distance as well as a method for determining the degree of similarity between shapes are also presented. The similarity parameter enables a quantitative shape classification with respect to a set of reference shapes.

Paper Nr: 192
Title:

LINEAR IMAGE REPRESENTATION UNDER CLOSE LIGHTING FOR SHAPE RECONSTRUCTION

Authors:

Yoshiyasu Fujita, Fumihiko Sakaue and Jun Sato

Abstract: In this paper, we propose a method for representing intensity images of objects illuminated by near point light sources. Our image representation model is a linear model, and thus the 3D shape of objects can be recovered linearly from intensity images taken under near point light sources. Since our method does not require the integration of surface normals to recover 3D shapes, the 3D shapes can be recovered even if they are not smooth, unlike with the standard shape from shading methods. The experimental results support the efficiency of the proposed method.

Paper Nr: 201
Title:

DEWARPING AND DESKEWING OF A DOCUMENT USING AFFINE TRANSFORMATION

Authors:

Honey Kansal, Sudip Sanyal and Deepali Gupta

Abstract: An approach based on affine transformations is applied to solve the problem of dewarping of scanned text images. The technique is script independent and does not make any assumptions about the nature of the text image or the nature of warping. The attendant problems of deskewing and deshadowing are also dealt with, using a vertical projection technique and a filtering technique respectively. Experiments were performed on scanned text images with varying font sizes and shapes, from various scripts, and with varying degrees of warp, skew and shadow. The proposed method was found to give good results on all the text images, thus demonstrating the effectiveness of the approach.

Paper Nr: 273
Title:

CONSIDERING THE WAVELET TYPE AND CONTENTS ON THE COMPRESSION-DECOMPRESSION ASSOCIATED WITH IMPROVEMENT OF BLURRED IMAGES

Authors:

Aura Conci, Marcello Santos Fonseca, Carlos S. Kubrusly and Thomas Walter Raubert

Abstract: Uncompressed multimedia data such as high resolution images, audio and video require considerable storage capacity and transmission bandwidth on telecommunications systems. Despite the development of storage technology and the high performance of digital communication systems, the demand for huge files is higher than the available capacity. Moreover, the growth of image data in database applications calls for more efficient ways to encode images. Image compression is therefore more important than ever. One of the most used techniques is compression by wavelet, specified in the JPEG 2000 standard and also recommended for medical image DICOM databases. This work seeks to investigate the wavelet image compression-denoising technique in relation to the wavelet family bases used (Haar, Daubechies, Biorthogonal, Coiflets and Symlets), database content and noise level. The target of the work is to define which combinations present the best and the worst compression quality, through quality evaluation by quantitative functions: Root Mean Square Error (RMSE), Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR).
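The three quality functions named in the abstract above have widely used textbook definitions, sketched below for flat lists of pixel values. This is an illustration under stated assumptions: the paper may use variant definitions, and the function names and the default peak value of 255 (8-bit images) are choices made here.

```python
import math


def rmse(original, reconstructed):
    """Root Mean Square Error between two equally sized pixel sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(original, reconstructed)]
    return math.sqrt(sum(squared) / len(squared))


def snr_db(original, reconstructed):
    """Signal-to-Noise Ratio in dB: signal energy over error energy."""
    signal_energy = sum(a * a for a in original)
    noise_energy = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    if noise_energy == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(signal_energy / noise_energy)


def psnr_db(original, reconstructed, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB for a given peak pixel value."""
    error = rmse(original, reconstructed)
    if error == 0:
        return float("inf")  # identical images
    return 20.0 * math.log10(peak / error)


if __name__ == "__main__":
    # Hypothetical 4-pixel example: two pixels are off by 2 gray levels.
    orig = [10, 20, 30, 40]
    rec = [12, 18, 30, 40]
    print(rmse(orig, rec), snr_db(orig, rec), psnr_db(orig, rec))
```

Lower RMSE and higher SNR or PSNR indicate a reconstruction closer to the original, which is how such functions can rank combinations of wavelet family, content and noise level.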

Posters
Paper Nr: 21
Title:

A NOISE REMOVAL MODEL WITH ANISOTROPIC DIFFUSION BASED ON VISUAL GRADIENT

Authors:

Li Shi-Fei, Wang Ping and Shen Zhen-Kang

Abstract: In recent years a considerable amount of research has been devoted to anisotropic diffusion methods, leading to a series of important developments. However, the human visual system, which perceives and interprets images, has received little attention in these models. In this paper, we define a visual gradient, which can be regarded as a generalization of the image gradient. We then substitute the visual gradient for the image gradient in the anisotropic diffusion model, which for the first time keeps the model consistent, to some extent, with the human visual system. Finally, numerical results show the performance of the proposed method.

Paper Nr: 76
Title:

A YARP-BASED ARCHITECTURAL FRAMEWORK FOR ROBOTIC VISION APPLICATIONS

Authors:

Stefán Freyr Stefánsson, Björn Þór Jónsson and Kristinn R. Thórisson

Abstract: The complexity of advanced robot vision systems calls for an architectural framework with great flexibility with regard to sensory, hardware, processing, and communications requirements. We are currently developing a system that uses time-of-flight and a regular video stream for mobile robot vision applications. We present an architectural framework based on YARP, and evaluate its efficiency. Overall, we have found YARP to be easy to use, and our experiments show that the overhead is a reasonable tradeoff for the convenience.

Paper Nr: 87
Title:

COLOR-PRESERVING DEFOG METHOD FOR FOGGY OR HAZY SCENES

Authors:

Dongbin Xu, Chuangbai Xiao and Jing Yu

Abstract: Bad weather, such as fog and haze, can significantly degrade imaging quality, which becomes a major problem for many applications of computer vision. In this paper, we propose a novel color-preserving defog method based on the Retinex theory, using a single image as input without user interaction. In the proposed method, we apply the Retinex theory to fog/haze removal from foggy/hazy images, and conceive a new strategy of fog/haze estimation. Experimental results demonstrate that the proposed method can not only remove fog or haze present in foggy or hazy images, but also restore the real color of clear-day counterparts, without color distortion. In addition, the proposed method has a very fast implementation.

Paper Nr: 114
Title:

ASSIGNING AUTOMATIC REGULARIZATION PARAMETERS IN IMAGE RESTORATION

Authors:

Ignazio Gallo and Elisabetta Binaghi

Abstract: This work aims to define and experimentally evaluate an adaptive strategy based on neural learning to select an appropriate regularization parameter within a regularized restoration process. The appropriate setting of the regularization parameter within the restoration process is a difficult task, attempting to achieve an optimal balance between removing edge ringing effects and suppressing additive noise. In this context, in an attempt to overcome the limitations of trial and error and curve fitting procedures, we propose the construction of the regularization parameter function through a training concept using a Multilayer Perceptron neural network. The proposed solution is conceived to be independent of a specific restoration algorithm and can be included within a general local restoration procedure. The proposed algorithm was experimentally evaluated and compared using test images with different levels of degradation. The results obtained prove the generalization capability of the method, which can be applied successfully to heterogeneous images never seen during training.

Paper Nr: 139
Title:

ANALYTICAL APPROXIMATIONS FOR NONLINEAR DIFFUSION TIME IN MULTISCALE EDGE ENHANCEMENT

Authors:

C. Platero, J. Sanguino, M. C. Tobar, J. M. Poncela and G. Asensio

Abstract: The image simplification, noise elimination and edge enhancement steps are all fundamental to segmentation tasks. These processing techniques usually require the tuning of their control parameters; a procedure known to be incompatible with automatic segmentation. The aim of this paper is to adopt a procedure, based on nonlinear diffusion, that is capable of auto tuning by means of analytical expressions that relate diffusion times to the gradient module. The numerical method and experimental results are shown in 1D, 2D and 3D.

Paper Nr: 154
Title:

IMAGE RECTIFICATION - Evaluation of Various Projections for Omnidirectional Vision Sensors using the Pixel Density

Authors:

Christian Scharfenberger, Georg Faerber and Florian Boehm

Abstract: Omnidirectional vision sensors provide a large field of view for numerous technical applications. But the original images of these sensors are distorted, not simply interpretable and not easy to use with normal image processing routines. A transformation of the original images into panoramic images is therefore necessary, using various projections such as cylindrical, spherical and conical projection. But which projection is best for a specific application? In this paper, we present a novel method to evaluate different projections regarding their applicability in a specific application using a novel variable, the pixel density. The pixel density makes it possible to determine the resolution of a panoramic image depending on the chosen projection. To obtain the pixel density, first the camera model is determined based on the gathered calibration data. Secondly, a projection matrix is calculated to map each pixel of the original image into the chosen projection area for image transformation. The pixel density is calculated based on this projection matrix in a final step. The theory is verified and discussed in experiments with simulated and real image data. We also demonstrate that the common cylindrical projection is not always the best projection to rectify images from omnidirectional vision sensors.

Paper Nr: 157
Title:

EFFICIENT PLANAR CAMERA CALIBRATION VIA AUTOMATIC IMAGE SELECTION

Authors:

Brendan P. Byrne, John Mallon and Paul F. Whelan

Abstract: This paper details a novel approach to automatically selecting images which improve camera calibration results. An algorithm is presented which identifies calibration images that inherently improve camera parameter estimates based on their geometric configuration or image network geometry. Analysing images in a more intuitive geometric framework allows image networks to be formed based on the relationship between their world to image homographies. Geometrically, it is equivalent to enforcing maximum independence between calibration images; this ensures accuracy and stability when solving the planar calibration equations. A webcam application using the proposed strategy is presented. This demonstrates that careful consideration of image network geometry, which has largely been neglected within the community, can yield more accurate parameter estimates with fewer images.

Paper Nr: 206
Title:

EIGENVECTOR ANALYSIS FOR OPTIMAL FILTERING UNDER DIFFERENT LIGHT SOURCES

Authors:

Juha Lehtonen, Jussi Parkkinen, Timo Jaaskelainen and Alexei Kamshilin

Abstract: Eigenvectors from the Standard Object Colour Spectra (SOCS) set were used with several other spectra sets to find the optimal sampling intervals for the optimal number of eigenvectors. The sampling intervals were calculated for each eigenvector separately. The analysis was applied not only to different sets of reflectance spectra, but also to spectra sets under different real light sources and standard illuminations. It is shown that a 20 nm sampling interval for eigenvectors from the SOCS set can be used for reflectance data and for data under light sources whose spectrum is smooth. However, data under peaky real fluorescent light sources and the standard F-illuminant require an accurate 5 nm or even narrower sampling interval for the first few eigenvectors, while the interval can be wider for some of the others. These eigenvectors from the SOCS set are shown to be applicable to the other data sets. The results give guidelines for the required accuracy of eigenvectors under different light sources that can be considered e.g. in eigenvector-based filter design.

Paper Nr: 247
Title:

RAPID VISION APPLICATION DEVELOPMENT USING HIVE - A Modular and Scaleable Approach to Vision System Engineering

Authors:

Gregor Miller, Amir Afrah and Sidney Fels

Abstract: In this paper we demonstrate the use of Hive as a novel basis for creating multi-sensor vision systems. Hive is a framework in which reusable modules called drones are defined and connected together to create larger systems. Drones are simple to implement, perform a specific task and using the powerful interface of Hive can be combined to create sophisticated vision pipelines. We present a set of drones defined within Hive and a suite of applications built using these drones which utilize the input from multiple cameras and a variety of sensors. Results demonstrate the flexibility of approaches possible with Hive as well as the real-time performance of the Hive applications.

Paper Nr: 276
Title:

MULTI-LAYERED CONTENTS GENERATION FROM REAL WORLD SCENE BY THREE-DIMENSIONAL MEASUREMENT

Authors:

M. K. Kim, Y. Nakajima, T. Takeshita, S. Onogi, M. Mitsuishi and Yoichiro Matsumoto

Abstract: In this paper, we propose a method to automatically create multi-layered contents from a real world scene based on Depth from Focus and Spatio-Temporal Image Analysis. Since the contents are generated by layer representation directly from the real world, the point of view can be changed freely, and the labor and cost of creating three-dimensional (3-D) contents using Computer Graphics are reduced. To extract layers from the real images, Depth from Focus is used for stationary objects and Spatio-Temporal Image Analysis is used for moving objects. We selected these two methods for the stability of the system: Depth from Focus does not need to search for corresponding points, and Spatio-Temporal Image Analysis also has a relatively simple computing algorithm. We performed an experiment to automatically extract layer contents from stationary and moving objects, and the feasibility of the method was confirmed.

Area 2 - Image Analysis

Full Papers
Paper Nr: 20
Title:

TRANSFORM CODING OF RGB-HISTOGRAMS

Authors:

Reiner Lenz and Pedro Latorre Carmona

Abstract: In this paper we introduce the representation theory of the symmetric group $SPG$ as a tool to investigate the structure of the space of $RGB$-histograms. We show that the theory reveals that typical histogram spaces are highly structured and that these structures originate partly in group theoretically defined symmetries. The algorithms exploit this structure and construct a PCA like decomposition without the need to construct correlation or covariance matrices and their eigenvectors. We implemented these algorithms and investigate their properties with the help of two real-world databases (one from an image provider and one from an image search engine company) containing over one million images.

Paper Nr: 51
Title:

FROM INDIVIDUAL INTENSITY VOXEL DATA TO INTER-INDIVIDUAL PROBABILISTIC ATLASES OF BIOLOGICAL OBJECTS BY AN INTERLEAVED REGISTRATION-SEGMENTATION APPROACH

Authors:

Felix Bollenbeck, Diana Weier, Wolfram Schoor and Udo Seiffert

Abstract: In this paper we describe the automated processing of plant serial section data for high-resolution 3-D models of internal structures. The processing pipeline includes standardization and registration of large image stacks as well as multiple tissue recognition by a joint registration-segmentation approach. By integrating segmented data from multiple individuals in a common reference, a statistical three-dimensional description is used to represent the inherent biodiversity among specimens. Inter-individual 3-D models are a novelty in the context of plant microscopy, and along with meaningful visualisation they deliver new insights into growth and development as well as provide a framework for the integration of functional data.

Paper Nr: 61
Title:

ACTIVE APPEARANCE MODEL FITTING UNDER OCCLUSION USING FAST-ROBUST PCA

Authors:

Markus Storer, Peter M. Roth, Martin Urschler, Horst Bischof and Josef A. Birchbauer

Abstract: The Active Appearance Model (AAM) is a widely used method for model-based vision showing excellent results. One major drawback, however, is that the method is not robust against occlusions. Thus, if parts of the image are occluded, the method converges to local minima and the obtained results are unreliable. To overcome this problem we propose a robust AAM fitting strategy. The main idea is to apply a robust PCA model to reconstruct the missing feature information and to use the image thus obtained as input for the standard AAM fitting process. Since existing methods for robust PCA reconstruction are computationally too expensive for real-time processing, we developed a more efficient method: fast robust PCA (FR-PCA). By using our FR-PCA the computational effort is drastically reduced. Moreover, more accurate reconstructions are obtained. In the experiments, we evaluated both the fast robust PCA model on the publicly available ALOI database and the whole robust AAM fitting chain on facial images. The results clearly show the benefits of our approach in terms of accuracy and speed when processing disturbed data (i.e., images containing occlusions).

Paper Nr: 71
Title:

A NOVEL APPROACH TO ORTHOGONAL DISTANCE LEAST SQUARES FITTING OF GENERAL CONICS

Authors:

Sudanthi Wijewickrema, Charles Esson and Andrew Papliński

Abstract: Fitting of conics to a set of points is a well researched area and is used in many fields of science and engineering. Least squares methods are among the most popular techniques available for conic fitting and, among these, orthogonal distance fitting has been acknowledged as the ’best’ least squares method. Although the accuracy of orthogonal distance fitting is unarguably superior, the problem so far has been in finding the orthogonal distance between a point and a general conic. This has led to the development of conic specific algorithms which take the characteristics of the type of conic as additional constraints, or in the case of a general conic, the use of an unstable closed form solution or a non-linear iterative procedure. Using conic specific constraints produces inaccurate fits if the data does not correspond to the type of conic being fitted, and in iterative solutions, too, the accuracy is compromised. The method discussed in this paper aims at overcoming all these problems by introducing a direct calculation of the orthogonal distance, thereby eliminating the need for conic specific information and iterative solutions. We use the orthogonal distances in a fitting algorithm that identifies which type of conic best fits the data. We then show that this algorithm requires less accurate initializations, uses simpler calculations and produces more accurate results.
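
As a small illustration of what "type of conic" means here, the sketch below classifies a general conic a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 from the sign of its discriminant b^2 - 4ac. This is the textbook classification, not the fitting algorithm of the paper; the function name, the tolerance and the restriction to non-degenerate conics are assumptions made for the example.

```python
def conic_type(a, b, c, tol=1e-12):
    """Classify a non-degenerate conic
        a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0
    from the sign of the discriminant b^2 - 4ac.

    Hypothetical helper for illustration only: degenerate cases
    (line pairs, single points, empty sets) are not detected, and
    `tol` is an arbitrary tolerance for the parabola case.
    """
    disc = b * b - 4.0 * a * c
    if abs(disc) <= tol:
        return "parabola"
    return "ellipse" if disc < 0 else "hyperbola"


print(conic_type(1.0, 0.0, 1.0))   # x^2 + y^2 - 1 = 0
print(conic_type(1.0, 0.0, -1.0))  # x^2 - y^2 - 1 = 0
print(conic_type(1.0, 0.0, 0.0))   # x^2 - y = 0
```

Only the quadratic coefficients enter the test, which is why the linear and constant terms are not parameters of the helper.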

Paper Nr: 88
Title:

WELDING INSPECTION USING NOVEL SPECULARITY FEATURES AND A ONE-CLASS SVM

Authors:

Fabian Timm, Sascha Klement, Erhardt Barth and Thomas Martinetz

Abstract: We present a framework for automatic inspection of welding seams based on specular reflections. To this end, we introduce a novel feature set -- called specularity features (SPECs) -- describing statistical properties of specular reflections. For classification we use a one-class support-vector approach. The SPECs significantly outperform statistical geometric features and raw pixel intensities, since they capture more complex characteristics and dependencies of shape and geometry. We obtain an error rate of 9%, which corresponds to the level of human performance.

Paper Nr: 103
Title:

GEOMETRY CLOSURE FOR HEMODYNAMICS SIMULATIONS

Authors:

J. Bruijns and R. Hermans

Abstract: Physicians may treat an aneurysm by injecting coils through a catheter into the aneurysm, or by anchoring a stent as a flow diverter. Since such an intervention is risky, a patient is only treated when the probability of aneurysm rupture is relatively high. Hemodynamic properties of aneurysmal blood flow, extracted by computational fluid dynamics calculations, are hypothesized to be relevant for predicting this rupture. Since hemodynamics simulations require a closed vessel section with defined inflow and outflow points, and since the user can easily overlook small side branches, we have developed an algorithm for fully-automatic geometry closure of an open vessel section. Since X-ray based flow returns an indication of the length needed to have a developed flow inside the geometry, we have also developed an algorithm to create a geometry closure around an aneurysm based on a length criterion. Both geometry closure algorithms have been tested extensively, and the practicability of the hemodynamics workstation is currently being tested.

Paper Nr: 104
Title:

A TOP DOWN CONSTRUCTION SCHEME FOR IRREGULAR PYRAMIDS

Authors:

Romain Goffe, Luc Brun and Guillaume Damiand

Abstract: Hierarchical data structures such as irregular pyramids are used by many applications related to image processing and segmentation. The construction scheme of such pyramids is bottom-up. Such a scheme forbids the definition of a level according to more global information defined at upper levels in the hierarchy. Moreover, the base of the pyramid has to encode any single pixel of the initial image in order to allow the definition of regions of any shape at higher levels. This last constraint raises major issues of memory usage and processing costs when irregular pyramids are applied to large images. The objective of this paper is to define a top-down construction scheme for irregular pyramids. Each level of such a pyramid is encoded by a combinatorial map associated to an explicit encoding of the geometry and the inclusion relationships of the corresponding partition. The resulting structure is a stack of finer and finer partitions obtained by successive splitting operations and is called a top-down pyramid.

Paper Nr: 108
Title:

ROBUSTNESS OF DIFFERENT FEATURES FOR ONE-CLASS CLASSIFICATION AND ANOMALY DETECTION IN WIRE ROPES

Authors:

Esther-Sabrina Platzer, Joachim Denzler, Herbert Süße, Josef Nägele and Karl-Heinz Wehking

Abstract: Automatic visual inspection of wire ropes is an important but challenging task. Anomalies in wire ropes are usually unobtrusive and their detection is a difficult job. Reliable anomaly detection is nevertheless essential to assure the safety of the ropes. A one-class classification approach for the automatic detection of anomalies in wire ropes is presented. Different well-established features from the field of textural defect detection are compared to context-sensitive features extracted by linear prediction. They are used to learn a Gaussian mixture model which represents the faultless rope structure. Outliers are regarded as anomalies. To evaluate the robustness of the method, a training set containing intentionally added, defective samples is used. The generalization ability of the learned model, which is important in practice, is assessed by testing the model on different data sets from identically constructed ropes. All experiments were performed on real-life rope data. The results prove a high generalization ability, as well as a good robustness to outliers in the training set. The presented approach can exclude up to 90 percent of the rope as faultless without missing one single defect.

Paper Nr: 167
Title:

CURL-GRADIENT IMAGE WARPING - Introducing Deformation Potentials for Medical Image Registration using Helmholtz Decomposition

Authors:

Michael Sass Hansen, Rasmus Larsen and Niels Vorgaard Christensen

Abstract: Image registration is becoming an increasingly important tool in medical image analysis, and the need to understand deformations within and between subjects often requires analysis of obtained deformation fields. The current paper presents a novel representation of the deformation field based on the Helmholtz decomposition of vector fields. The two decomposed potential fields form a curl free field and a divergence free field. The representation has already proven its worth in fluid modelling and electrostatics, and we show it also has desirable features in image registration and morphometry in particular. The potentials are shown to offer a decoupling of the two potential fields in both elastic and fluid image registration. For morphometry applications, we show that when decomposing the deformation field in symmetric and antisymmetric parts, the vector potential alone describes the vorticity, and the scalar gradient potential gives a first-order approximation to the determinant of the Jacobian. We provide some insight into the behavior of curl and divergence representation of the warp field by constructed examples and by a demonstration on real medical image data. Our theoretical findings are readily observable in our empirical experiment, which further illustrates the benefit of the parametrization.
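
For orientation, the relations mentioned in this abstract can be written out in the standard textbook form of the Helmholtz decomposition. The symbols are assumptions chosen for this note and are not taken from the paper: $u$ is the displacement field of the warp $x \mapsto x + u(x)$, $\phi$ the scalar potential, $A$ the vector potential, and the gauge condition $\nabla \cdot A = 0$ is assumed.

```latex
\begin{align*}
u &= \nabla \phi + \nabla \times A, & \nabla \cdot A &= 0, \\
\nabla \cdot u &= \nabla^{2} \phi, & \nabla \times u &= -\nabla^{2} A, \\
\det\left(I + \nabla u\right) &\approx 1 + \nabla \cdot u = 1 + \nabla^{2} \phi .
\end{align*}
```

The second line follows from $\nabla \cdot (\nabla \times A) = 0$, $\nabla \times \nabla \phi = 0$ and $\nabla \times (\nabla \times A) = \nabla(\nabla \cdot A) - \nabla^{2} A$. The last line keeps only the first-order term of the determinant, $\det(I + M) = 1 + \operatorname{tr}(M) + O(\|M\|^{2})$, which is why the scalar potential alone enters the approximation.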

Paper Nr: 173
Title:

INTERACTIVE IMAGE SEGMENTATION WITH INTEGRATED USE OF THE MARKERS AND THE HIERARCHICAL WATERSHED APPROACHES

Authors:

Bruno Klava and Nina Sumiko Tomita Hirata

Abstract: The watershed transform is a well-known approach for image segmentation. Watershed from markers and hierarchical watershed are derived from the watershed transform and are suitable for interactive image segmentation: in the former, the user can edit markers and control the segmentation result; in the latter, the user can select an image partition from a nested set of partitions. We investigate and propose ways to transition from one approach to the other. Such transitions can be used to integrate both approaches in a way that allows us to make full use of the strengths of both. We present examples that illustrate the use of the proposed transitions in conjunction with several interaction possibilities from both approaches.

Paper Nr: 194
Title:

ON ANALYZING SYMMETRY OF OBJECTS USING ELASTIC DEFORMATIONS

Authors:

Chafik Samir, Anuj Srivastava, Mohamed Daoudi and Sebastian Kurtek

Abstract: We introduce a framework for analyzing symmetry of 2D and 3D objects using elastic deformations of their boundaries. The basic idea is to define spaces of elastic shapes and to compute shortest (geodesic) paths between the objects and their reflections using a Riemannian structure. Elastic matching, based on optimal (nonlinear) re-parameterizations of curves, provides a better registration of points across shapes, as compared to the previously-used linear registrations. A crucial step of orientation alignment, akin to finding planes of symmetry, is performed as a search for shortest geodesic paths. This framework is fully automatic and provides: a measure of asymmetry, the nearest symmetric shape, the optimal deformation to make an object symmetric, and the plane of symmetry for a given object.

Paper Nr: 217
Title:

EXTRACTING PRINTED DESIGNS AND WOVEN PATTERNS FROM TEXTILE IMAGES

Authors:

Jia Wei, Stephen J. McKenna and Annette A. Ward

Abstract: The extraction of printed designs and woven patterns from textiles is formulated as a pixel labelling problem. Algorithms based on Markov random field (MRF) optimisation and reestimation are described and evaluated on images from an historical fabric archive. A method for quantitative evaluation is presented and used to compare the performance of MRF models optimised using $\alpha$-expansion and iterated conditional modes, both with and without parameter reestimation. Results are promising for potential application to content-based indexing and browsing.

Paper Nr: 224
Title:

A COMPLETE SYSTEM FOR DETECTION AND RECOGNITION OF TEXT IN GRAPHICAL DOCUMENTS USING BACKGROUND INFORMATION

Authors:

Partha Pratim Roy, Josep Lladós and Umapada Pal

Abstract: Automatic retrieval of text and symbols in graphical documents (maps, engineering drawings) involves many challenges because text lines are usually not parallel to each other. They are multi-oriented and curved, since they annotate curved graphical lines and hence follow a curvilinear path too. Text and symbols also frequently touch or overlap graphical components (rivers, streets, border lines), which aggravates the problem. For OCR of such documents we need to extract individual text lines and their corresponding words/characters. In this paper, we propose a methodology to extract individual text lines and an approach for recognition of the extracted text characters from such complex graphical documents. The methodology is based on the foreground and background information of the text components. To take care of background information, the water reservoir concept and the convex hull are used. For recognition of multi-font, multi-scale and multi-oriented characters, a Support Vector Machine (SVM) based classifier is applied. Circular rings and the convex hull are used along with angular information of the contour pixels of the characters to make the features rotation and scale invariant.

Paper Nr: 248
Title:

TEXTURED IMAGE SEGMENTATION BASED ON LOCAL SPECTRAL HISTOGRAM AND ACTIVE CONTOUR

Authors:

Xianghua Xie

Abstract: In this paper, we propose a novel level set based active contour model to segment textured images. The proposed method is based on the assumption that local histograms of filtering responses between foreground and background regions are statistically separable. In order to be able to handle texture non-uniformities, which often occur in real world images, we use rotation invariant filtering features and local spectral histograms as image features to drive the snake segmentation. Automatic histogram bin size selection is carried out so that the underlying distribution can be best represented. Experiments on both synthetic and real data show promising results and significant improvements compared to direct modeling of filtering responses.

Short Papers
Paper Nr: 15
Title:

SKY DETECTION IN CSC-SEGMENTED COLOR IMAGES

Authors:

Frank Schmitt and Lutz Priese

Abstract: We present a novel algorithm for detection of sky areas in outdoor color images. In contrast to sky detectors in the literature that detect only blue, cloudless sky, we intend to detect all sorts of sky, i.e. blue, clouded and partially clouded sky. Our approach is based on the analysis of color, position, and shape properties of color homogeneous spatially connected regions detected by the CSC. An evaluation on a set of images acquired under different weather conditions proves the quality of the proposed system.

Paper Nr: 34
Title:

COLOR FEATURES FOR VISION-BASED TRAFFIC SIGN CANDIDATE DETECTION

Authors:

Steffen Görmer, Anton Kummert and Stefan Müller-Schneiders

Abstract: A common approach in traffic sign detection and recognition algorithms is to use shape-based features together with color features. Color information is especially helpful for distinguishing between speed-limit and end-of-speed-limit signs, since the outer border of speed signs is a strong red. This paper focuses on the color features of speed-limit and no-overtaking signs. The apparent color in the captured image varies considerably with illumination conditions, sign surface condition and viewing angle. Therefore, the color distribution in the HSV color space of a sufficient number of signs under different illumination conditions and states of aging has been collected and examined, and a matching mathematical model has been developed to describe the subregion in the corresponding color space. Once the color region of traffic signs is known, two kinds of traffic sign segmentation algorithms are developed and evaluated, focusing explicitly on color features only, to preselect subregions in the image where (red-bordered) traffic signs are likely to be.

Paper Nr: 36
Title:

IMAGE UNDERSTANDING USING SELF-SIMILAR SIFT FEATURES

Authors:

Nils Hering, Frank Schmitt and Lutz Priese

Abstract: In this paper we present a new method to group self-similar SIFT features in images. The aim is to automatically build groups of all SIFT features with the same semantics in an image. To achieve this, a new distance between SIFT feature vectors that takes into account their orientation and scale is introduced. The methods are presented in the context of recognition of buildings. A first evaluation shows promising results.

Paper Nr: 45
Title:

STRUCTURE, SCALE-SPACE AND DECAY OF OTSU’S THRESHOLD IN IMAGES FOR FOREGROUND/BACKGROUND DISCRIMINATION

Authors:

Rahul Walia and Ray Jarvis

Abstract: A method for gauging the appropriate scale for foreground-background discrimination in Scale-Space theory is presented. Otsu’s Threshold (OT) is a statistical parameter generated from the first two moments of a histogram of a signal / image. In the current work a set of OT values is derived from histograms of derivatives of an image having a Scale-Space representation. This set of OT values, when plotted against the corresponding scale, generates a Threshold Graph (TG). The TG undergoes an exponential decay in the absence of foreground and exhibits inflection(s) in the presence of foreground. It is demonstrated, using synthetic and natural images, that the maxima of inflection indicate the scale and threshold (OT) appropriate to interface edges. The edges identified by thresholding at the scale and threshold given by the inflection of OT correspond to foreground-background interface edges. The histogram inherently embeds the TG with underlying image signal parameters like background intensity range, pattern frequency, foreground-background intensity gradient, foreground size, etc., making the method adaptable and deployable for unsupervised machine vision applications. Commutative, separable and symmetric properties of the Scale-Space representation of an image and its derivatives are preserved, and computationally efficient implementations are available.
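
To make the building block concrete, the following is a minimal sketch of the classical Otsu threshold computed from a single histogram. It does not reproduce the Threshold Graph or the scale-space analysis of the paper; the function name and the plain-list histogram layout are assumptions made for the example.

```python
def otsu_threshold(hist):
    """Return the bin index t that maximizes the between-class variance.

    hist[i] is the number of samples in bin i. Samples in bins <= t form
    the lower class, samples in bins > t the upper class.
    (Hypothetical helper: classical Otsu on one histogram only.)
    """
    total = sum(hist)
    weighted_total = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0      # number of samples in the lower class
    sum0 = 0.0  # sum of bin indices, weighted by count, in the lower class
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so there is no split to score
        mu0 = sum0 / w0
        mu1 = (weighted_total - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Two well separated intensity clusters around bins 50 and 200.
hist = [0] * 256
hist[50] = 100
hist[200] = 100
t = otsu_threshold(hist)
```

Repeating this computation on derivative histograms at each scale would give one OT value per scale, which is the kind of series the abstract plots as a Threshold Graph.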

Paper Nr: 47
Title:

WEAKENED WATERSHED ASSEMBLY FOR REMOTE SENSING IMAGE SEGMENTATION AND CHANGE DETECTION

Authors:

Olivier Debeir, Hussein Atoui, Christophe Simler, Nadine Warzée and Eléonore Wolff

Abstract: Marked watershed transform can be seen as a classification in which connected pixels are grouped into components included in the marks' catchment basins. The weakened classifier assembly paradigm has shown its ability to give better results than its best member, while generalization and robustness to the noise present in the dataset are increased. We promote in this paper the use of the weakened watershed assembly for remotely sensed image segmentation followed by a consensus (vote) of the segmentation results. This approach allows, but is not restricted to, the introduction of previously existing borders (e.g. for map updates) in order to constrain the segmentation. We show how the method parameters influence the resulting segmentation and which choices the practitioner can make with respect to the problem at hand. A validation of the obtained segmentation is done by comparing it with a manual segmentation of the image.

Paper Nr: 59
Title:

A SOLID TEXTURE DATABASE FOR SEGMENTATION AND CLASSIFICATION EXPERIMENTS

Authors:

Ludovic Paulhac, Pascal Makris and Jean-Yves Ramel

Abstract: This paper describes the methods of construction and the main characteristics of a solid texture database freely available for texture classification experiments. The purpose is to propose a solid texture database with many classes of different solid textures to allow an evaluation of the properties and performance of analysis methods. Each image is described by an XML file made according to a DTD which is available on our web site. Using this formalism, it is even possible for researchers to propose their own images or creation methods to complete this solid texture database. Finally, we discuss different ways to exploit the database by reviewing some methods used to evaluate the performance of classification and segmentation algorithms.

Paper Nr: 81
Title:

USE OF ADAPTIVE BOOSTING IN FEATURE SELECTION FOR VEHICLE MAKE & MODEL RECOGNITION

Authors:

I. Zafar, B. S. Acar and E. A. Edirisinghe

Abstract: Vehicle Make and Model Recognition (Vehicle MMR) systems that are capable of improving the trustworthiness of automatic number plate recognition systems have received the attention of the research community in the recent past. Out of a number of algorithms that have been proposed in the literature, the use of Scale Invariant Feature Transforms (SIFT) in particular has been able to demonstrate the ability to perform vehicle MMR invariant to scale, rotation and translation, which form typical challenges of the application domain. In this paper we propose a novel approach to SIFT based vehicle MMR in which SIFT features are initially investigated for their relevance in representing the uniqueness of the make and model of a given vehicle class based on Adaptive Boosting. We provide experimental results to show that the proposed selection of SIFT features significantly reduces the computational cost associated with classification at negligible loss of system accuracy. We further prove that the use of more appropriate feature matching algorithms enables significant gains in the accuracy of the algorithm. Experimental results prove that a 91% accuracy rate has been achieved on a publicly available database of car frontal views.

Paper Nr: 97
Title:

BINARIZATION OF PHASE CONTRAST VOLUME IMAGES OF FIBROUS MATERIALS - A Case Study

Authors:

Filip Malmberg, Catherine Östlund and Gunilla Borgefors

Abstract: In this paper, we present a method for segmenting phase contrast volume images of fibrous materials into fibre and background. The method is based on graph cut segmentation, and is tested on high resolution X-ray microtomography volume images of wood fibres in paper and composites. The new method produces better results than a standard method based on edge-preserving smoothing and hysteresis thresholding. The most important improvement is that the proposed method handles thick and collapsed fibres more accurately than previous methods.

Paper Nr: 126
Title:

LIVER SEGMENTATION USING LEVEL SETS AND GENETIC ALGORITHMS

Authors:

Dário A. B. Oliveira, Raul Q. Feitosa and Mauro M. Correia

Abstract: This paper presents a method based on level sets to segment the liver using Computed Tomography (CT) images. Initially, the liver boundary is manually set in one slice as an initial solution, and then the method automatically segments the liver in all other slices, sequentially. In each iteration step it fits a Gaussian curve to the liver histogram to model the speed image in which the level sets propagate. The parameters of our method were estimated using Genetic Algorithms (GA) and a database of reference segmentations. The method was tested using 20 different exams and five different measures of performance, and the results obtained confirm the potential of the method. The cases in which the method presented a poor performance are also discussed in order to instigate further research.
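
One way to picture the Gaussian-based speed image is sketched below. It is an illustration under stated assumptions, not the method of the paper: the Gaussian is obtained by simple moment matching rather than curve fitting, the speed is taken to be the unnormalized Gaussian itself, and both function names are hypothetical.

```python
import math


def gaussian_from_histogram(hist):
    """Estimate (mean, std) of the intensities, with hist[i] the count at
    intensity i. Moment matching stands in for a proper curve fit here."""
    total = sum(hist)
    mean = sum(i * h for i, h in enumerate(hist)) / total
    var = sum(h * (i - mean) ** 2 for i, h in enumerate(hist)) / total
    return mean, math.sqrt(var)


def speed(intensity, mean, std):
    """Assumed speed term in (0, 1]: close to 1 for intensities near the
    liver mean and decaying towards 0 away from it. Requires std > 0."""
    return math.exp(-((intensity - mean) ** 2) / (2.0 * std ** 2))


# Toy histogram of intensities inside the current liver region.
hist = [0] * 256
hist[99], hist[100], hist[101] = 1, 2, 1
mean, std = gaussian_from_histogram(hist)
speed_image_row = [speed(v, mean, std) for v in (100, 101, 180)]
```

Evaluating `speed` at every pixel of a slice gives a speed image in which a front moves freely through liver-like intensities and slows down elsewhere.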

Paper Nr: 130
Title:

TEMPORAL VIDEO COMPRESSION USING MODE FACTOR AND POLYNOMIAL FITTING ON WAVELET COEFFICIENTS

Authors:

T. Nithyaletchumy Devi, W. K. Lim, W. N. Tan, Y. F. Tan, H. T. Teng and Y. F. Chang

Abstract: The core idea of this study is to build an algorithm that compresses video sequences. The mode value at every pixel along the temporal direction is calculated. If the frequency of the mode value satisfies a predetermined frequency, then the intensity values for all entries at that particular pixel position are changed to the mode value. Wavelet techniques are applied to the pixels that do not satisfy the predetermined frequency, followed by a polynomial fitting method. For the purpose of compression, only the polynomial coefficients for pixels that do not satisfy the predetermined frequency, the mode values for pixels that satisfy the predetermined frequency and the corresponding pixel positions are stored. To decompress, wavelet coefficients are estimated by the respective polynomials. The intensity values at the intended pixel position are obtained by inverse wavelet transform for pixels that do not satisfy the predetermined frequency. On the other hand, the stored mode values are used to represent the intensity values throughout the time interval. This method shows promise for achieving an acceptable decompressed video quality and compression ratio.
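
The first stage described above, the split of pixels by temporal mode, can be sketched as follows. This is a minimal illustration that assumes the "predetermined frequency" is a fraction of the frames; the function name and data layout are hypothetical, and the wavelet and polynomial fitting stages are left out.

```python
from collections import Counter


def split_by_temporal_mode(frames, min_fraction):
    """frames: list of T frames, each a list of rows of intensities.

    Returns (mode_pixels, residual_pixels):
      mode_pixels     maps (row, col) -> mode value, for pixels whose mode
                      occurs in at least min_fraction of the frames
      residual_pixels maps (row, col) -> full temporal series, for the rest
    (Assumption: the threshold is expressed as a fraction of the frames.)
    """
    n_frames = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    mode_pixels, residual_pixels = {}, {}
    for r in range(rows):
        for c in range(cols):
            series = [frame[r][c] for frame in frames]
            value, count = Counter(series).most_common(1)[0]
            if count / n_frames >= min_fraction:
                mode_pixels[(r, c)] = value       # store one value only
            else:
                residual_pixels[(r, c)] = series  # left for later stages
    return mode_pixels, residual_pixels


# Four 1x2 frames: the left pixel is nearly static, the right one changes.
frames = [[[10, 1]], [[10, 2]], [[10, 3]], [[11, 4]]]
mode_pixels, residual_pixels = split_by_temporal_mode(frames, 0.75)
```

In this toy input the left pixel is represented by a single stored value, while the right pixel keeps its whole series for the transform-based stage.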

Paper Nr: 144
Title:

VARIATIONAL REGION GROWING

Authors:

Rose Jean-Loic, Revol-Muller Chantal, Odet Christophe and Christian Reichert

Abstract: Region growing is one of the most popular image segmentation methods. The concept of region growing is easily understandable but sometimes criticized for its lack of theoretical background. In order to overcome this weakness, we propose to describe region growing in a new framework, the variational approach. A variational approach is commonly used in image segmentation methods such as active contours or level sets, but is quite original in the context of region growing. We call this method Variational Region Growing (VRG). First, we define a region-based criterion. A discrete derivation is applied to this criterion in order to get an evolution rule for the evolving region. The aim of this equation is to guide the evolving region towards a minimum of the criterion. Then, we formalize the iterative process of region growing in the proposed framework. Furthermore, we highlight the relevance of VRG for integrating shape priors. We apply VRG to synthetic and 3D-biomedical images. Results illustrate the improvements of VRG compared to classical methods.

Paper Nr: 149
Title:

ASSESSING THE VARIABILITY OF INTERNAL BRAIN STRUCTURES USING PCA ON SAMPLED SURFACE POINTS

Authors:

Darwin Martínez, Isabelle Bloch and Tiberio Hernández

Abstract: In this paper we propose to analyze the variability of brain structures using principal component analysis (PCA). We rely on a database of registered and segmented 3D MRI images of normal subjects. We propose to use as input to PCA sampled points on the surface of the considered objects, selected using uniformity criteria or based on mean and Gaussian curvatures. Results are shown on the lateral ventricles. The main variation tendencies are observed in the orthogonal eigenvector space. Dimensionality reduction can be achieved and the variability of each landmark point is accurately described using the first three components.

Paper Nr: 151
Title:

AN ANALYSIS OF SAMPLING FOR FILTER-BASED FEATURE EXTRACTION AND ADABOOST LEARNING

Authors:

Anselm Haselhoff and Anton Kummert

Abstract: In this work a sampling scheme for filter-based feature extraction in the field of appearance-based object detection is analyzed. Optimized sampling radically reduces the number of features during the AdaBoost training process and better classification performance is achieved. The signal energy is used to determine an appropriate sampling resolution which then is used to determine the positions at which the features are calculated. The advantage is that these positions are distributed according to the signal properties of the training images. The approach is verified using an AdaBoost algorithm with Haar-like features for vehicle detection. Tests of classifiers, trained with different resolutions and a sampling scheme, are performed and the results are presented.

Paper Nr: 197
Title:

EAR SEGMENTATION USING TOPOGRAPHIC LABELS

Authors:

Milad Lankarany and Alireza Ahmadyfard

Abstract: Ear segmentation is considered the first step of all ear biometrics systems, and the objective in separating the ear from its surrounding background is to improve the capability of automatic systems used for ear recognition. To meet this objective in the context of ear biometrics, a new automatic algorithm based on topographic labels is presented here. The proposed algorithm contains four stages. First we extract topographic labels from the ear image. Then, using the map of regions for three topographic labels, namely ridge, convex hill and convex saddle hill, we build a composed set of labels. Thresholding this labelled image provides a connected component with the maximum number of pixels, which represents the outer boundary of the ear. As well as offering faster implementation and brightness insensitivity, the technique is validated by completely successful ear segmentation on the “USTB” database, which contains 308 profile view images of the ear and its surrounding background.

Paper Nr: 210
Title:

LOCMAX SIFT - Non-Statistical Dimension Reduction on Invariant Descriptors

Authors:

Dávid Losteiner, László Havasi and Tamás Szirányi

Abstract: The descriptors used for image indexing - e.g. Scale Invariant Feature Transform (SIFT) - are generally parameterized in very high dimensional spaces which guarantee invariance to different light conditions, orientation and scale. The number of dimensions limits the performance of search techniques in terms of computational speed. That is why dimension reduction of descriptors plays an important role in real life applications. In the paper we present a modified version of the most popular algorithm, SIFT. The motivation was to speed up searching on large feature databases in video surveillance systems. Our method is based on the standard SIFT algorithm using a structural property: the local maxima of these high dimensional descriptors. The weighted local positions are aligned with a dynamic programming algorithm (DTW) and its error is calculated as a new kind of measure between descriptors. In our approach we do not use a training set, pre-computed statistics or any parameters when finding the matches, which is very important for an online video indexing application.
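
The alignment step named in this abstract relies on dynamic time warping. The sketch below shows the classical DTW recurrence on two plain numeric sequences; it does not reproduce how the paper derives or weights the local maxima of SIFT descriptors, and the function name and the absolute-difference local cost are assumptions.

```python
def dtw_distance(a, b):
    """Classical dynamic time warping cost between two numeric sequences.

    Hypothetical helper: the local cost is |a[i] - b[j]| and no window
    constraint or normalization is applied.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats a match
                                 cost[i][j - 1],      # b[j-1] repeats a match
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because one element may be matched to several elements of the other sequence, sequences of different lengths can be compared directly, and the accumulated cost serves as the dissimilarity measure.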

Paper Nr: 212
Title:

FAST MULTI-CLASS IMAGE ANNOTATION WITH RANDOM SUBWINDOWS AND MULTIPLE OUTPUT RANDOMIZED TREES

Authors:

Marie Dumont, Raphaël Marée, Louis Wehenkel and Pierre Geurts

Abstract: This paper addresses image annotation, i.e. labelling pixels of an image with a class among a finite set of predefined classes. We propose a new method which extracts a sample of subwindows from a set of annotated images in order to train a subwindow annotation model by using the extremely randomized trees ensemble method appropriately extended to handle high-dimensional output spaces. The annotation of a pixel of an unseen image is done by aggregating the annotations of the subwindows containing this pixel. The proposed method is compared to a more basic approach predicting the class of a pixel from a single window centered on that pixel and to other state-of-the-art image annotation methods. In terms of accuracy, the proposed method significantly outperforms the basic method and shows good performance with respect to the state-of-the-art, while being more generic, conceptually simpler, and of higher computational efficiency than the latter.

Paper Nr: 225
Title:

COREST: A MEASURE OF COLOR AND SPACE STABILITY TO DETECT SALIENT REGIONS ACCORDING TO HUMAN CRITERIA

Authors:

Agnés Borràs and Josep Lladós

Abstract: In this paper we present a novel method to obtain regions of interest in color images. The strategy consists in the evaluation of the stability of a region according to its properties of color and spatial arrangement. We propose a fusion of classical color image segmentation with scale-space analysis. An image can be decomposed into a set of regions that describe the whole image content. Using a set of manually labelled images we have evaluated the properties of the detector according to human perception. The proposed region detector has a potential application in the field of content based image retrieval by sketch.

Paper Nr: 238
Title:

EVALUATION AND IMPROVEMENTS OF THE LEVEL SET METHOD FOR RM IMAGES SEGMENTATION

Authors:

Donatello Conte, Pasquale Foggia, Francesco Tufano and Mario Vento

Abstract: We present a novel algorithm for the segmentation of bony tissues in MR images. Our approach is based on the level set algorithm. We introduce some pre-processing phases that improve image quality and segmentation performance. The technique requires no training and operates semi-automatically, requiring only the entry of a single seed point within the tissue to be segmented. The proposed approach is more robust than the other approaches in the literature with respect to the position of the initial seed point. The quantitative analysis of the results on a significant number of images demonstrates the effectiveness of our approach.

Paper Nr: 250
Title:

AUTOMATIC KEY-FRAME EXTRACTION FROM BROADCAST SOCCER VIDEOS

Authors:

Nielsen C. Simões, Neucimar J. Leite and Beatriz Marcotegui

Abstract: This paper presents a new approach for broadcast soccer video navigation and summarization based on specific representative images of the video. It also takes into account some soccer video features to better describe these videos. This work considers a special color reduction based on an HSV subquantization and a shot classification approach for soccer videos by exploring the dominant color related to the playground area.
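
As a rough illustration of the dominant-color idea mentioned in this abstract, the following sketch quantizes pixels into coarse HSV bins and reports the most populated bin together with its share of the frame. The bin counts (8 hue, 2 saturation, 2 value levels) and the function names `quantize_hsv` and `dominant_color_ratio` are assumptions made for this example only; the abstract does not specify the authors' actual subquantization or classification rule.

```python
import colorsys
from collections import Counter


def quantize_hsv(rgb, h_bins=8, s_bins=2, v_bins=2):
    """Map an (R, G, B) pixel with 0-255 channels to a coarse HSV bin triple.

    The bin counts are hypothetical defaults, not values from the paper.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # each component lies in [0, 1]
    return (
        min(int(h * h_bins), h_bins - 1),
        min(int(s * s_bins), s_bins - 1),
        min(int(v * v_bins), v_bins - 1),
    )


def dominant_color_ratio(pixels, **bins):
    """Return (dominant_bin, fraction_of_pixels_in_that_bin) for a frame."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    histogram = Counter(quantize_hsv(p, **bins) for p in pixels)
    bin_id, count = histogram.most_common(1)[0]
    return bin_id, count / len(pixels)
```

A frame whose dominant bin is a grass-like green and covers a large share of the pixels could then be treated as a wide playground shot, while a low share could suggest a close-up or out-of-field shot; any such threshold would have to be chosen empirically.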

Paper Nr: 251
Title:

APPLICATION OF SCALE ANALYSIS ON LEVEL SETS FOR COOPERATIVE IMAGE SEGMENTATION

Authors:

M. Y. Benzian and N. Benamrane

Abstract: Image segmentation has been addressed by many approaches and techniques in artificial vision, but none of them has proved completely successful for every image or object type. We propose in this paper a segmentation approach based on level sets which incorporates low-scale cooperative analysis of both image and curve. The image at a low resolution level provides information on coarse variation of grey level intensity. From the same perspective, the curve at a low resolution scale provides a coarser curvature value. The purpose of the image scale cooperative approach is to avoid stopping the curve evolution at local minima of images. This method is tested on a sample 2D abdomen image, and can be applied to other image types. The results obtained are satisfactory and show good precision of the method.

Paper Nr: 257
Title:

DETECTING RECTANGULAR OBJECTS IN URBAN IMAGERY - A Re-Segmentation Approach

Authors:

Thales Sehn Korting, Luciano Vieira Dutra and Leila Maria Garcia Fonseca

Abstract: Image segmentation is a broad area, which covers strategies for splitting one input image into its components. This paper presents a re-segmentation approach applied to urban imagery, where the elements of interest (house roofs) are considered to have a rectangular shape. Our technique finds and generates rectangular objects, leaving the remaining objects as background. Starting from an over-segmented image, we connect adjacent objects in a graph structure known as a Region Adjacency Graph (RAG). We then traverse the graph, searching for the best cuts that may result in more rectangular segments, in a relaxation-like approach. The graph search considers information about object class, through a pre-classification stage using the Self-Organizing Maps algorithm. Results show that the method was able to find rectangular elements according to user-defined parameters, such as maximum levels of graph searching and minimum degree of rectangularity for objects of interest.

Paper Nr: 262
Title:

SHAPE COMPARISON BASED ON SKELETON ISOMORPHISM

Authors:

L. Domakhina and A. Okhlopkov

Abstract: A new approach to the shape comparison problem is presented in this work. The approach is based on skeleton isomorphism. We propose a shape metric construction instrument which is based on finding close shapes having isomorphic continuous skeletons. We propose several metrics based on this instrument that can be used for shape comparison. The main advantage over existing approaches is that the shape metrics are defined in a mathematically correct way via the Hausdorff distance. The efficiency of the proposed approach is confirmed on the shape recognition problem.

Paper Nr: 264
Title:

SCORING OF BREAST TISSUE MICROARRAY SPOTS THROUGH ORDINAL REGRESSION

Authors:

Telmo Amaral, Stephen McKenna, Katherine Robertson and Alastair Thompson

Abstract: Breast tissue microarrays (TMAs) facilitate the study of very large numbers of breast tumours in a single histological section, but their scoring by pathologists is time consuming, typically highly quantised, and not without error. This paper compares the results of different classification and ordinal regression algorithms trained to predict the scores of immunostained breast TMA spots, based on spot features obtained in previous work by the authors. Despite certain theoretical advantages, Gaussian process ordinal regression failed to achieve any clear performance gain over classification using a multi-layer perceptron. The use of the entropy of the posterior probability distribution over class labels for avoiding uncertain decisions is demonstrated.
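
The entropy-based rejection of uncertain decisions mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the `max_entropy` threshold are assumptions introduced here, and the threshold would need to be tuned on validation data.

```python
import math


def posterior_entropy(probs):
    """Shannon entropy (in nats) of a discrete posterior over class labels."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predict_or_reject(probs, max_entropy):
    """Return the index of the most probable class, or None when the
    posterior entropy exceeds max_entropy (decision considered too uncertain).

    max_entropy is a hypothetical, user-chosen threshold.
    """
    if posterior_entropy(probs) > max_entropy:
        return None
    return max(range(len(probs)), key=lambda k: probs[k])
```

A peaked posterior has low entropy and is accepted, while a near-uniform posterior over the score classes has entropy close to the logarithm of the number of classes and is rejected, so the corresponding spot could be referred to a pathologist.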

Paper Nr: 267
Title:

BREADER: A MODULAR FRAMEWORK FOR VISION RECOGNITION OF MATHEMATICAL-LOGICAL STRUCTURES

Authors:

Celia Salmim Rafael and Jorge Simao

Abstract: We describe a system that uses image processing and computer vision techniques to discover and recognize mathematical, logical, geometric, and other structures and symbols from bit-map images. The system uses a modular architecture to allow easy incorporation of new kinds of object recognizers. The system uses a “blackboard” data structure to retain the list of objects that have been recognized. Particular object recognizers check this list to discover new objects. Initially, objects are simple pixel clusters resulting from image-processing and segmentation operations. First-level object recognizers include symbol/character recognizers and basic geometric elements. Higher-level object recognizers collect lower-level objects and build more complex objects. These include mathematical-logical expressions and complex geometric elements such as polylines, graphs, and others. The recognized objects and structures can be exported to a variety of vector graphic languages and type-setting systems, such as SVG and LaTeX.

Posters
Paper Nr: 17
Title:

ACCELERATION OF THE EXPECTATION-MAXIMIZATION ALGORITHM FOR A TWOFOLD GAUSSIAN MIXTURE MODEL BY USING THE HISTOGRAM OF THE OBSERVATIONS INSTEAD OF THE OBSERVATIONS - Evaluation of its Accuracy by Generated Histograms

Authors:

J. Bruijns

Abstract: Volume representations of blood vessels acquired by 3D rotational angiography are very suitable for diagnosing a stenosis or an aneurysm. For optimal treatment, physicians need to know the shape of the diseased vessel parts. Binary segmentation by thresholding is the first step in our shape extraction procedure. Assuming a twofold Gaussian mixture model, the model parameters (and thus the threshold for binary segmentation) can be extracted from the observations (i.e. the gray values) by the Expectation-Maximization (EM) algorithm. Since the EM algorithm requires a number of iterations through the observations, and because of the large number of observations, the EM algorithm is very time-consuming. Therefore, we developed a method to apply the EM algorithm to the histogram of the observations, requiring a single pass through the observations and a number of iterations through the much smaller histogram. This variant gives almost the same results as the original EM algorithm, at least for our clinical volumes. We have used this variant for an evaluation of the accuracy of the EM algorithm: the maximum relative error in the mixing coefficients was less than 7%, the maximum relative error in the parameters of the two Gaussian components was less than 2.5%.
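
The idea of running EM on the histogram rather than on the raw observations can be sketched as below: each bin contributes its responsibilities weighted by its count, so one iteration costs a pass over the bins instead of a pass over all voxels. This is a minimal sketch under stated assumptions, not the author's code; the function names, the initialization (means placed one standard deviation either side of the global mean) and the fixed iteration count are choices made for this example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_on_histogram(values, counts, iterations=200, min_var=1e-6):
    """Fit a two-component Gaussian mixture to a histogram.

    values[i] is the gray value of bin i and counts[i] its number of
    observations. Returns (weights, means, variances), two entries each.
    """
    total = float(sum(counts))
    mean_all = sum(c * v for v, c in zip(values, counts)) / total
    var_all = sum(c * (v - mean_all) ** 2 for v, c in zip(values, counts)) / total
    var_all = max(var_all, min_var)
    spread = math.sqrt(var_all)

    # Assumed initialization: symmetric around the global mean.
    weights = [0.5, 0.5]
    means = [mean_all - spread, mean_all + spread]
    variances = [var_all, var_all]

    for _ in range(iterations):
        n_k = [0.0, 0.0]
        sum_v = [0.0, 0.0]
        sum_v2 = [0.0, 0.0]
        # E-step, evaluated once per bin and weighted by the bin count.
        for v, c in zip(values, counts):
            if c == 0:
                continue
            dens = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
            norm = dens[0] + dens[1]
            resp = [0.5, 0.5] if norm <= 0.0 else [dens[0] / norm, dens[1] / norm]
            for k in range(2):
                w = c * resp[k]
                n_k[k] += w
                sum_v[k] += w * v
                sum_v2[k] += w * v * v
        # M-step: count-weighted updates of the mixture parameters.
        for k in range(2):
            if n_k[k] <= 0.0:
                continue
            weights[k] = n_k[k] / total
            means[k] = sum_v[k] / n_k[k]
            variances[k] = max(sum_v2[k] / n_k[k] - means[k] ** 2, min_var)
    return weights, means, variances
```

Because all observations that fall into the same bin share the same gray value, the count-weighted updates are the standard EM updates applied to the binned data; the only approximation relative to the original observations comes from the binning itself.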

Paper Nr: 37
Title:

SEGMENTATION OF MULTISPECTRAL IMAGES USING MATHEMATICAL MORPHOLOGY AND AUTOMATIC CLASSIFICATION - Application to Microscopic Medical Images

Authors:

Sarah Ghandour, Eric Gonneau and Guy Flouzat

Abstract: In this paper, a new color segmentation scheme for microscopic color images is proposed. The approach combines a region growing method and a clustering method. Each channel plane of the color images is represented by a set of regions using a watershed algorithm. Those regions are represented and modeled by a Region Adjacency Graph (RAG). A novel method is introduced to simplify the RAG by merging candidate regions until a stopping criterion for aggregation is violated; this criterion is determined using a statistical method which combines the generalized likelihood ratio (GLR) and the Bayesian information criterion (BIC). From the resulting segmented and simplified images, the RGB image is computed. Structural features such as cell area, shape indicator and cell color are extracted using the simplified graph and then stored in a database in order to support meaningful queries. A regularization step based on automatic classification then takes place. Results show that our method, which does not involve any a priori knowledge, is suitable for several types of cytology images.

Paper Nr: 52
Title:

MULTIMODAL REGISTRATION OF NMR-VOLUMES AND HISTOLOGICAL CROSS-SECTIONS OF BARLEY GRAINS ON THE CELL BROADBAND ENGINE PROCESSOR

Authors:

Rainer Pielot, Udo Seiffert, Bertram Manz, Diana Weier, Frank Volke, Falk Schreiber and Winfriede Weschke

Abstract: Representation of developmental gradients in biological structures requires visualization of storage compounds, metabolites or mRNA hybridization patterns in a 3D morphological framework. NMR imaging can generate such a 3D framework by non-invasive scanning of living structures. Histology provides the distribution of developmental markers as 2D cross-sections. Multimodal alignment tries to put such different image modalities into correspondence. Here we compare different methods for rigid registration of 3D NMR datasets and 2D cross-sections of developing barley grains. Mutual information, cross correlation and overlap index are used as similarity metrics. In addition, different filters are applied to the images before the alignment. The algorithms are parallelized, partially vectorized and implemented on the Cell Broadband Engine processor in a Playstation® 3. Evaluation is done by comparing the results to a manually defined gold standard of an NMR dataset and a corresponding 2D cross-section of the same grain. The results show that the best alignment is achieved by applying mutual information to Sobel-filtered images and that, compared to the implementation on a standard single-core CPU, the computation is accelerated by a factor of up to 1.95.

Paper Nr: 98
Title:

LESION BOUNDARY SEGMENTATION USING LEVEL SET METHODS

Authors:

Elizabeth M. Massey, James A. Lowell, Andrew Hunter and David Steel

Abstract: This paper addresses the issue of accurate lesion segmentation in retinal imagery, using level set methods and a novel stopping mechanism - an elementary features scheme. Specifically, the curve propagation is guided by a gradient map built using a combination of histogram equalization and robust statistics. The stopping mechanism uses elementary features gathered as the curve deforms over time, and then using a lesionness measure, defined herein, ‘looks back in time’ to find the point at which the curve best fits the real object. We implement the level set using a fast upwind scheme and compare the proposed method against five other segmentation algorithms performed on 50 randomly selected images of exudates with a database of clinician marked-up boundaries as ground truth.

Paper Nr: 110
Title:

A NOVEL FEATURE EXTRACTION AND SELECTION METHOD FOR STEEL SHEET DEFECTS CLASSIFICATION

Authors:

Navid Rabbani, Mohammad Alamdari, Mohammad Rohollah Yazdani and Farhad Imanpour

Abstract: This paper presents a novel approach for detection and classification of steel sheet defects. A defects database with enough samples and good imaging conditions is introduced. A set of new features is proposed to extract the appropriate textural characteristics from defect images. This is followed by the selection of important features using the SFFS algorithm. Modifications to the SFFS feature selection method are presented to meet the real-time requirements of the application. The proposed scheme decreases computational complexity at the cost of a small decrease in classification accuracy.

Paper Nr: 160
Title:

3D PHASE CORRELATION USING NON-UNIFORM CYLINDRICAL SAMPLING

Authors:

Jakub Bican and Jan Flusser

Abstract: The Phase Correlation Method (PCM) is a well-known and effective strategy for 2D image registration. Earlier we presented a derived method called the Cylindrical Phase Correlation Method (CPCM), which is one of many improvements and applications of PCM, the others having been published by other authors. CPCM utilizes the effective and robust approach of PCM for a 3D image rigid registration task in an iterative optimization procedure. In this paper, the improvement to the rotation estimation step based on non-uniform sampling in the cylindrical coordinate system is described in detail. Experimental results are provided both for the original and improved version of the rotation estimation algorithm as well as the results of the final method and its comparison to reference methods.

Paper Nr: 175
Title:

PATTERN ANALYSIS FOR COMPUTER-AIDED DRIVING

Authors:

Frédérique Robert-Inacio, Damien Outré, Mohamadou-Falilou Diop and Franck Bertrand

Abstract: This study is based on the development of software for computer-aided driving. A video is acquired through the windscreen while driving, showing the scene observed by the driver. The purpose is to extract characteristic elements from each image of the video sequence in order to interpret them and help the driver make a decision. In this way, the road width is estimated. In addition, road signs are extracted from the video and the information they contain is interpreted. The presented work is based on a preliminary study giving draft software, and experimental results are shown on several examples.

Paper Nr: 193
Title:

A FEATURE-BASED DENSE LOCAL REGISTRATION OF PAIRS OF RETINAL IMAGES

Authors:

M. Fernandes, Y. Gavet and J. C. Pinoli

Abstract: A method for spatially registering pairs of digital images of the retina is presented, using intrinsic feature points (landmarks) and dense local transformation. First, landmarks, i.e. blood vessel bifurcations, are extracted from both retinal images using filtering followed by thinning and branch point analysis. Correspondences are found by topological and structural comparisons between both retinal networks. From this set of matching points, a displacement field is computed and, finally, one of the two images is transformed. Due to the complexity of the retinal registration problem, the presented transformation is dense, local and adaptive. Experimental results established the effectiveness and the interest of the dense registration method.

Paper Nr: 199
Title:

NOISE REMOVAL IN CRACK DETECTION ALGORITHM ON ASPHALT SURFACE IMAGES

Authors:

Siwaporn Sorncharean and Suebskul Phiphobmongkol

Abstract: This paper presents an image processing technique for noise removal in the intermediate stage of a crack detection algorithm. Unlike noise in other domains, noise in this kind of image is unique in terms of size and dispersal. This technique is based on Newton’s theory of universal gravitation. The technique highlights noise within an image by giving low values to noise objects while giving high values to cracks, thus making it simple to label an object as noise or as a crack. This method gave good results in removing noise from the output of a crack segmentation algorithm.

Paper Nr: 202
Title:

ROBUST AUTOMATIC SEGMENTATION OF ANCIENT COINS

Authors:

Sebastian Zambanini and Martin Kampel

Abstract: Nowadays, ancient coins are becoming subject to a very large illicit trade. Thus, the interest in reliable automatic coin recognition systems within cultural heritage and law enforcement institutions is rising rapidly. A central component in the permanent identification and traceability of coins is the underlying image recognition technology. Prior to any analysis a coin image has to be segmented into two areas: the area depicting the coin and the area belonging to the background. In this paper, we focus on the segmentation task as a preprocessing step for any automated coin recognition system. The objective is a robust segmentation procedure for a large variety of coin image styles. We present a simple and fast method for coin segmentation, based on local entropy and gray value range. Results of the developed algorithm are shown for an image database of ancient coins and demonstrate the benefits of our approach.

Paper Nr: 227
Title:

A NEW NON-REDUNDANT SCALE INVARIANT INTEREST POINT DETECTOR

Authors:

Luis Ferraz and Xavier Binefa

Abstract: In this paper we present a novel scale invariant interest point detector of blobs which incorporates the idea of blob movement along the scales. This trajectory of the blobs through the scale space is shown to be valuable information for estimating the most stable locations and scales of the interest points. Our detector evaluates interest points in terms of their own trajectory along the scales and its evolution, avoiding redundant detections. Moreover, in this paper we present a differential geometry view to understand how interest points can be detected. We propose to analyze the Gaussian curvature to classify image regions as elliptical (blobs) or hyperbolic (corners or saddles). Our interest point detector has been compared with Harris-Laplace and Hessian-Laplace detectors on infrared (IR) images, outperforming them in terms of the number and precision of interest points detected.

Paper Nr: 239
Title:

CELL TRACKING AND DATA ANALYSIS OF IN VITRO TUMOUR CELLS FROM TIME-LAPSE IMAGE SEQUENCES

Authors:

Kuan Yan, Fons Verbeek, Sylvia Le Dévédec and Bob van de Water

Abstract: In this paper, we address the problem of the analysis of cellular phenotype from time-lapse image sequences using object tracking algorithms and feature extraction and classification. We discuss the application of an object tracking algorithm in the analysis of high-content cell-migration time-lapse image sequences of extremely motile cells; these cells are captured at low time-resolution. The small size of the objects and significant deformation of the objects during the process render the tracking a non-trivial problem. To that end, the ‘KDE Mean Shift’, a real-time tracking solution, is adapted for our research. We show in a simulation experiment with artificial objects that our algorithm can achieve an accuracy of over 90%. Based on the tracking result, we propose several morphology and motility based measurements for the analysis of cell behaviour. Our analysis requires only initial manual intervention; the majority of the processing is automated.

Area 3 - Image Understanding

Full Papers
Paper Nr: 19
Title:

BAYESIAN SCENE SEGMENTATION INCORPORATING MOTION CONSTRAINTS AND CATEGORY-SPECIFIC INFORMATION

Authors:

Alexander Bachmann and Irina Lulcheva

Abstract: In this paper we address the problem of detecting objects from a moving camera by jointly considering low-level image features and high-level object information. The proposed method partitions an image sequence into independently moving regions with similar 3-dimensional (3D) motion and distance to the observer. In the recognition stage category-specific information is integrated into the partitioning process. An object category is represented by a set of descriptors expressing the local appearance of salient object parts. To account for the geometric relationships among object parts a structural prior over part configurations is designed. This prior structure expresses the spatial dependencies of object parts observed in a training data set. To achieve global consistency in the recognition process, information about the scene is extracted from the entire image based on a set of global image features. These features are used to predict the scene context of the image from which characteristic spatial distributions and properties of an object category are derived. The scene context helps to resolve local ambiguities and achieves locally and globally consistent image segmentation. Our expectations on spatial continuity of objects are expressed in a Markov Random Field (MRF) model. Segmentation results are presented based on real image sequences.

Paper Nr: 43
Title:

EFFICIENT CLASSIFICATION FOR LARGE-SCALE PROBLEMS BY MULTIPLE LDA SUBSPACES

Authors:

Martina Uray, Peter M. Roth and Horst Bischof

Abstract: In this paper we consider the limitations of Linear Discriminative Analysis (LDA) when applying it to large-scale problems. Since LDA was originally developed for two-class problems, the obtained transformation is sub-optimal if multiple classes are considered. In fact, the separability between the classes is reduced, which decreases the classification power. To overcome this problem several approaches including weighting strategies and mixture models were proposed. But these approaches are complex and computationally expensive. Moreover, they were only tested for a small number of classes. In contrast, our approach can handle a huge number of classes, showing excellent classification performance at low computational costs. The main idea is to split the original data into multiple sub-sets and to compute a single LDA space for each sub-set. Thus, the separability in the obtained subspaces is increased and the overall classification power is improved. Moreover, since smaller matrices have to be handled, the computational complexity is reduced for both training and classification. These benefits are demonstrated on different publicly available datasets. In particular, we consider the task of object recognition, where we can handle up to 1000 classes.

Paper Nr: 50
Title:

A VLSI-ORIENTED AND POWER-EFFICIENT APPROACH FOR DYNAMIC TEXTURE RECOGNITION APPLIED TO SMOKE DETECTION

Authors:

Jorge Fernandez-Berni, Ricardo Carmona-Galán and Luis Carranza-González

Abstract: The recognition of dynamic textures is fundamental in processing image sequences as they are very common in natural scenes. The computation of the optic flow is the most popular method to detect, segment and analyse dynamic textures. For weak dynamic textures, this method is especially adequate. However, for strong dynamic textures, it implies a heavy computational load and therefore significant energy consumption. In this paper, we propose a novel approach intended to be implemented by very low-power integrated vision devices. It is based on a simple and flexible computation at the focal plane implemented by power-efficient hardware. The first stages of the processing are dedicated to removing redundant spatial information in order to obtain a simplified representation of the original scene. This simplified representation can be used by subsequent digital processing stages to finally decide about the presence and evolution of a certain dynamic texture in the scene. As an application of the proposed approach, we present the preliminary results of smoke detection for the development of a forest fire detection system based on a wireless vision sensor network.

Paper Nr: 62
Title:

SEMI-SUPERVISED DISTANCE METRIC LEARNING FOR VISUAL OBJECT CLASSIFICATION

Authors:

Hakan Cevikalp and Roberto Paredes

Abstract: This paper describes a semi-supervised distance metric learning algorithm which uses pairwise equivalence (similarity and dissimilarity) constraints to discover the desired groups within high-dimensional data. As opposed to the traditional full rank distance metric learning algorithms, the proposed method can learn nonsquare projection matrices that yield low rank distance metrics. This brings additional benefits such as visualization of data samples and reduced storage cost, and it is more robust to overfitting since the number of estimated parameters is greatly reduced. Our method works in both the input space and the kernel-induced feature space, and the distance metric is found by a gradient descent procedure that involves an eigen-decomposition in each step. Experimental results on high-dimensional visual object classification problems show that the computed distance metric improves the performance of the subsequent clustering algorithm.

Paper Nr: 90
Title:

MULTI-CLASS FROM BINARY - Divide to conquer

Authors:

Anderson Rocha and Siome Goldenstein

Abstract: Several researchers have proposed effective approaches for binary classification in recent years. We can easily extend some of those techniques to multi-class. Notwithstanding, some other powerful classifiers (e.g., SVMs) are hard to extend to multi-class. In such cases, the usual approach is to reduce the multi-class problem to simpler binary classification problems (divide-and-conquer). In this paper, we address the multi-class problem by introducing the concept of affine relations among binary classifiers (dichotomies), and present a principled way to find groups of highly correlated base learners. Finally, we devise a strategy to reduce the number of required dichotomies in the overall multi-class process.

Paper Nr: 129
Title:

FAST APPROXIMATE NEAREST NEIGHBORS WITH AUTOMATIC ALGORITHM CONFIGURATION

Authors:

Marius Muja and David G. Lowe

Abstract: For many computer vision problems, the most time consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high-dimensional problems that are faster than linear search. Approximate algorithms are known to provide large speedups with only minor loss in accuracy, but many such algorithms have been published with only minimal guidance on selecting an algorithm and its parameters for any given problem. In this paper, we describe a system that answers the question, “What is the fastest approximate nearest-neighbor algorithm for my data?” Our system will take any given dataset and desired degree of precision and use these to automatically determine the best algorithm and parameter values. We also describe a new algorithm that applies priority search on hierarchical k-means trees, which we have found to provide the best known performance on many datasets. After testing a range of alternatives, we have found that multiple randomized k-d trees provide the best performance for other datasets. We are releasing public domain code that implements these approaches. This library provides about one order of magnitude improvement in query time over the best previously available software and provides fully automated parameter selection.

Paper Nr: 150
Title:

A KERNEL MAXIMUM UNCERTAINTY DISCRIMINANT ANALYSIS AND ITS APPLICATION TO FACE RECOGNITION

Authors:

Carlos Eduardo Thomaz and Gilson Antonio Giraldi

Abstract: In this paper, we extend the Maximum uncertainty Linear Discriminant Analysis (MLDA), proposed recently for limited sample size problems, to its kernel version. The new Kernel Maximum uncertainty Discriminant Analysis (KMDA) is a two-stage method composed of Kernel Principal Component Analysis (KPCA) followed by the standard MLDA. In order to evaluate its effectiveness, experiments on face recognition using the well-known ORL and FERET face databases were carried out and compared with other existing kernel discriminant methods, such as Generalized Discriminant Analysis (GDA) and Regularized Kernel Discriminant Analysis (RKDA). The classification results indicate that KMDA performs as well as GDA and RKDA, with the advantage of being a straightforward stabilization approach for the within-class scatter matrix that uses higher-order features for further classification improvements.

Paper Nr: 162
Title:

CLASSIFYING AND COMPARING REGULAR TEXTURES FOR RETRIEVAL USING TEXEL GEOMETRY

Authors:

Junwei Han and Stephen J. McKenna

Abstract: Regular textures can be modelled as consisting of periodic patterns where a fundamental unit, or texel, occurs repeatedly. This paper explores the use of a representation of texel geometry for classification and comparison of regular texture images. Texels are automatically extracted from images and the distribution of texel shape and orientation is modelled. The application of this model to image retrieval and browsing is discussed using examples from a database of art and textile images.

Paper Nr: 166
Title:

SIGNIFICANCE OF THE WEIBULL DISTRIBUTION AND ITS SUB-MODELS IN NATURAL IMAGE STATISTICS

Authors:

Victoria Yanulevskaya and Jan-Mark Geusebroek

Abstract: The contrast statistics of natural images can be adequately characterized by a two-parameter Weibull distribution. Here we show how distinct regimes of this Weibull distribution lead to various classes of visual content. These regimes can be determined using model selection techniques from information theory. We experimentally explore the occurrence of the content classes, as related to the global statistics, local statistics, and to human attended regions. As such, we explicitly link local image statistics and visual content.

Paper Nr: 265
Title:

TEXT SEGMENTATION FROM WEB IMAGES USING TWO-LEVEL VARIANCE MAPS

Authors:

Insook Jung and Il-Seok Oh

Abstract: Variance maps can be used to detect and distinguish text from the background in images. However, previous variance maps work at a single level, which limits their ability to deal with the diverse size, slant, orientation, translation and color of text. In particular, they have difficulties in locating large text or text with severe color gradation because of their fixed mask sizes. We present a method of robustly segmenting text regions in complex web color images using two-level variance maps. The two-level variance maps work hierarchically. The first level finds the approximate locations of text regions using global horizontal and vertical color variances with specific mask sizes. Then the second level segments each text region using intensity variation with a new local mask size, which is determined adaptively. After the second level, backgrounds tend to disappear in each region and the segmentation can be accurate. Highly promising experimental results have been obtained using our method on 400 web images.

Short Papers
Paper Nr: 39
Title:

ADVANCED PLAYER ACTIVITY RECOGNITION BY INTEGRATING BODY POSTURE AND MOTION INFORMATION

Authors:

Marco Leo, Tiziana D’Orazio, Paolo Spagnolo and Pier Luigi Mazzeo

Abstract: Human action recognition is an important research area in the field of computer vision with a great number of real-world applications. This paper presents a multi-view action recognition framework able to extract human silhouette clues from different synchronized static cameras and then to validate them by introducing advanced reasoning about scene dynamics. Two different algorithmic procedures have been introduced: the first one performs, in each acquired image, the neural recognition of the human body configuration by using a novel mathematical tool named the Contourlet transform. The second procedure performs, instead, 3D ball and player motion analysis. The outcomes of both procedures are then properly merged to accomplish the final player activity recognition task. Experiments were carried out on several image sequences acquired during some matches of the Italian Serie A soccer championship.

Paper Nr: 60
Title:

FOCUS OF ATTENTION AND REGION SEGREGATION BY LOW-LEVEL GEOMETRY

Authors:

J. A. Martins, J. A. Rodrigues and J. M. H. du Buf

Abstract: Research has shown that regions with conspicuous colours are very effective in attracting attention, and that regions with different textures also play an important role. We present a biologically plausible model to obtain a saliency map for Focus-of-Attention (FoA), based on colour and texture boundaries. By applying grouping cells which are devoted to low-level geometry, boundary information can be completed such that segregated regions are obtained. Furthermore, we show that low-level geometry, in addition to rendering filled regions, provides important local cues like corners, bars and blobs for region categorisation. The integration of FoA, region segregation and categorisation is important for developing fast gist vision, i.e., which types of objects are about where in a scene.

Paper Nr: 68
Title:

CHARACTER RECOGNITION IN NATURAL IMAGES

Authors:

Teófilo E. de Campos, Bodla Rakesh Babu and Manik Varma

Abstract: This paper tackles the problem of recognizing characters in images of natural scenes. In particular, we focus on recognizing characters in situations that would traditionally not be handled well by OCR techniques. We present an annotated database of images containing English and Kannada characters. The database comprises images of street scenes taken in Bangalore, India using a standard camera. The problem is addressed in an object categorization framework based on a bag-of-visual-words representation. We assess the performance of various features based on nearest neighbour and SVM classification. It is demonstrated that the performance of the proposed method, using as few as 15 training images, can be far superior to that of commercial OCR systems. Furthermore, the method can benefit from synthetically generated training data, obviating the need for expensive data collection and annotation.

Paper Nr: 72
Title:

CLASSIFIERS SENSITIVITY FOR BOUNDARY CASE TESTING SET IN THE FACE RECOGNITION ALGORITHM BASED ON THE ACTIVE SHAPE MODEL

Authors:

Andrzej Florek and Maciej Król

Abstract: In this paper, experimental results from the face contour classification tests are shown. The presented approach is dedicated to a face recognition algorithm based on the Active Shape Model method. The results were obtained from experiments carried out on the set of 3300 images taken from 100 persons. Automatically fitted contours (as 194 ordered face contour points vector, where the contour consisted of eight components) were classified by Nearest Neighbourhood Classifier and Support Vector Machines classifier, after feature space decomposition, carried out by the Linear Discriminant Analysis method. Feature subspace size reduction and classification sensitivity analysis for boundary case testing set are presented.

Paper Nr: 77
Title:

USING PHYLLOTAXIS FOR DATE PALM TREE 3D RECONSTRUCTION FROM A SINGLE IMAGE

Authors:

Ran Dror and Ilan Shimshoni

Abstract: Phyllotaxis is the study of the morphological order of plants. Remarkably, in spite of the overwhelming diversity of plant morphology, there are common patterns that link a wide variety of species. The date palm, having a phyllotactic order, possesses a simple, repetitive model. Only a small number of parameters are needed to represent the phyllotactic order of the date palm. This a priori knowledge we have on the date palm can help in the 3D reconstruction of the tree and can even make it possible to reconstruct a 3D model from only one image. The proposed algorithm receives as input a single image of the date palm. Upon image acquisition, the algorithm proceeds to search for, and locate, the trunk followed by a few prominent leaves. From the location of the prominent leaves the algorithm proceeds to calculate tree model parameters, which can then be used to search for additional, neighboring, leaves. Complete 3D reconstruction is achieved by utilizing the calculated tree model parameters and by the known location of the leaves on the 2D image.

Paper Nr: 84
Title:

COMBINING SHAPE DESCRIPTORS AND COMPONENT-TREE FOR RECOGNITION OF ANCIENT GRAPHICAL DROP CAPS

Authors:

Benoît Naegel and Laurent Wendling

Abstract: The component-tree structure allows the connected components of the threshold sets of an image to be analysed by means of various criteria. In this paper we propose to extend the component-tree structure by associating robust shape descriptors with its nodes. This allows an efficient shape-based classification of the connected components of the image. Based on this strategy, an original and generic methodology for object recognition is presented. This methodology has been applied to segment and recognize ancient graphical drop caps.

Paper Nr: 85
Title:

PRE-FIGHT DETECTION - Classification of Fighting Situations using Hierarchical AdaBoost

Authors:

Scott J. Blunsden and Robert B. Fisher

Abstract: This paper investigates the detection and classification of fighting and pre and post fighting events when viewed from a video camera. Specifically we investigate normal, pre, post and actual fighting sequences and classify them. A hierarchical AdaBoost classifier is described and results using this approach are presented. We show it is possible to classify pre-fighting situations using such an approach and demonstrate how it can be used in the general case of continuous sequences.

Paper Nr: 112
Title:

SOFT CATEGORIZATION AND ANNOTATION OF IMAGES WITH RADIAL BASIS FUNCTION NETWORKS

Authors:

Moreno Carullo, Elisabetta Binaghi and Ignazio Gallo

Abstract: This work focuses on fast approaches for image retrieval and classification by employing simple features to build image signatures. For this purpose a neural model for soft classification and automatic image annotation is proposed. The salient aspects of this solution are: a) the employment of a Radial Basis Function Network built on top of an image retrieval distance metric b) a soft learning strategy for annotation handling. Experiments have been conducted on a subset of the Corel image dataset for evaluation and comparative analysis.

Paper Nr: 127
Title:

PROJECTION PEAK ANALYSIS FOR RAPID EYE LOCALIZATION

Authors:

Jingwen Dai, Dan Liu and Jianbo Su

Abstract: This paper presents a new method of projection peak analysis for rapid eye localization. First, the eye region is segmented from the face image by setting an appropriate candidate window. Then, a threshold is obtained by histogram analysis of the eye region image to binarize and segment the eyes out of the eye region. A series of projection peaks is then derived from the vertical and horizontal gray projection curves of the binary image and used to confirm the positions of the eyes. The proposed eye-localization method does not need any a priori knowledge or training process. Experiments on three face databases show that this method is effective, accurate and rapid in eye localization, which makes it suitable for real-time face recognition systems.
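
As an illustration only, the binarize-then-project idea described above can be sketched in a few lines of Python. This is not the authors' implementation: the threshold is passed in by hand rather than derived from a histogram, and the helper names (`binarize`, `projections`, `locate_eyes`) and the "one eye per image half" rule are assumptions made for this sketch.

```python
def binarize(gray, threshold):
    """Mark pixels darker than the threshold as 1 (candidate eye pixels)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def projections(binary):
    """Return (row_sums, col_sums): horizontal and vertical projection curves."""
    row_sums = [sum(row) for row in binary]
    col_sums = [sum(col) for col in zip(*binary)]
    return row_sums, col_sums


def locate_eyes(gray, threshold):
    """Pick the strongest row peak and one column peak per image half.

    Assumption for this sketch: the eye region contains exactly two eyes,
    one in the left half and one in the right half of the window.
    """
    binary = binarize(gray, threshold)
    row_sums, col_sums = projections(binary)
    eye_row = max(range(len(row_sums)), key=row_sums.__getitem__)
    mid = len(col_sums) // 2
    left_col = max(range(mid), key=col_sums.__getitem__)
    right_col = max(range(mid, len(col_sums)), key=col_sums.__getitem__)
    return (eye_row, left_col), (eye_row, right_col)
```

Calling `locate_eyes(eye_region, threshold)` on a list-of-lists grayscale window returns two `(row, column)` pairs; a real system would still need the histogram-based threshold selection that the abstract mentions.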

Paper Nr: 140
Title:

GENERIC 3D OBJECT RECOGNITION FROM TIME-OF-FLIGHT IMAGES USING BOOSTED COMBINED SHAPE FEATURES

Authors:

Doaa Hegazy and Joachim Denzler

Abstract: Very little research has been done on the problem of generic object recognition from range images. With the emerging technology of Time-of-Flight (TOF) cameras, for example the PMD cameras, range images can be acquired in real time, and the recorded range data can thus be used for generic object recognition. This paper presents a model for generic recognition of 3D objects from TOF images. The main challenges are the low spatial resolution and the noise level of the data, which make careful feature selection and a robust classifier necessary. Our approach describes the objects as a set of local shape-specific features. These features are computed from interest regions detected and extracted using a suitable interest point detector. Learning is performed in a weakly supervised manner using the RealAdaBoost algorithm. The main idea of our approach has previously been applied to 2D images but, to our knowledge, has never been applied to range images for the task of generic object recognition. As a second contribution, a new 3D object category database is introduced which provides 2D intensity as well as 3D range data about its members. Experimental evaluation of the performance of the proposed recognition model is carried out using the new database, and promising results are obtained.

Paper Nr: 164
Title:

HUMAN GAIT RECOGNITION USING DIFFERENCE BETWEEN FRAMES

Authors:

Mazaher Karami and Alireza Ahmadyfard

Abstract: In this paper, we address the problem of human identification using gait, building on the recent work of Lee et al. (Lee et al., 2007) on gait recognition. First we introduce the algorithm proposed by Lee et al. This method has two main steps: (1) extract key frames to define the gait cycle pattern, and (2) compute Shape Variation-based frieze patterns. These patterns are then used to classify and perform the gait identification. We modify the features used in this approach. We omit redundant features based on the effect of each feature on the recognition rate and, in the next step, improve the performance of the approach by changing the way features are extracted. Finally, we use the statistical characteristics of the employed features instead of applying the remaining features directly. We test the proposed method on the CASIA database. The experimental results are used to compare the proposed method with the method of Lee et al.

Paper Nr: 165
Title:

LOCAL HISTOGRAM BASED DESCRIPTORS FOR RECOGNITION

Authors:

Oskar Linde and Lars Bretzner

Abstract: This paper proposes a set of new image descriptors based on local histograms of basic operators. These descriptors are intended to serve in a first-level stage of a hierarchical representation of image structures. For reasons of efficiency and scalability, we argue that descriptors suitable for this purpose should be able to capture and separate invariant and variant properties. Unsupervised clustering of the image descriptors from training data gives a visual vocabulary, which allows for compact representations. We demonstrate the representational power of the proposed descriptors and vocabularies on image categorization tasks using well-known datasets. We use image representations via statistics in the form of global histograms of the underlying visual words, and compare our results to earlier reported work.

Paper Nr: 168
Title:

VISUAL FACIAL AGEING USING PLS - Visual Ageing of Human Faces in Three Dimensions using Morphable Models and Projection to Latent Structures

Authors:

David W. Hunter and B. P. Tiddeman

Abstract: We present an approach to synthesising the effects of ageing on human face images using three-dimensional modelling. We extract a set of three-dimensional face models from a set of two-dimensional face images by fitting a Morphable Model. We propose a method to age these face models using Partial Least Squares to extract from the data-set those factors most related to ageing. These ageing related factors are used to train an individually weighted linear model. We show that this is an effective means of producing an aged face image and compare this method to two other linear ageing methods for ageing face models. This is demonstrated both quantitatively and with perceptual evaluation using human raters.

Paper Nr: 198
Title:

READING STREET SIGNS USING A GENERIC STRUCTURED OBJECT DETECTION AND SIGNATURE RECOGNITION APPROACH

Authors:

Sobhan Naderi Parizi, Alireza Tavakoli Targhi, Omid Aghazadeh and Jan-Olof Eklundh

Abstract: In this paper we address the applied problem of detecting and recognizing street name plates in urban images by a generic approach to structural object detection and recognition. A structured object is detected using a boosting approach and false positives are filtered using a specific method called the texture transform. In a second step the subregion containing the key information, here the text, is segmented out. Text is in this case characterized as texture and a texton-based technique is applied. Finally the texts are recognized by using Dynamic Time Warping on signatures created from the identified regions. The recognition method is general and only requires text in some form, e.g. a list of printed words, but no image models of the plates for learning. Therefore, it can be shown to scale to rather large data sets. Moreover, due to its generality it applies to other cases, such as logo and sign recognition. On the other hand, the critical part of the method lies in the detection step. Here it relies on knowledge about the appearance of street signs. However, the boosting approach also applies to other cases as long as the target region is structured in some way. The particular scenario considered deals with urban navigation and map indexing by mobile users, e.g. when the images are acquired by a mobile phone.
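
For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the textbook algorithm applied to one-dimensional signatures. It assumes signatures are plain lists of numbers and that the lexicon is a dictionary mapping each word to a reference signature; the names `dtw_distance` and `recognize` are hypothetical, and how the paper builds its signatures is not specified here.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier element of b
                cost[i][j - 1],      # b[j-1] also matches an earlier element of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def recognize(signature, lexicon):
    """Return the lexicon word whose reference signature is closest under DTW."""
    return min(lexicon, key=lambda word: dtw_distance(signature, lexicon[word]))
```

Because DTW tolerates local stretching, a signature that repeats or skips samples relative to its reference can still match at low cost, which is the property that makes it attractive for comparing text signatures of varying width.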

Paper Nr: 220
Title:

EVALUATION OF A ROAD SIGN PRE-DETECTION SYSTEM BY IMAGE ANALYSIS

Authors:

Philippe Foucher, Pierre Charbonnier and Houssem Kebbous

Abstract: In this paper, we introduce a pre-detection algorithm dedicated to French danger-warning and prohibitory road signs. The proposed method combines color, shape, location and symmetry features to select, among large image databases, a small subset of pictures that probably contain road signs. We report the results of a systematic experimental assessment that we performed on five image databases, comprising more than 26,000 images, covering 176 km and containing 371 traffic signs, among which a non-negligible proportion (about 5% on average) is damaged. The experiments show that about 10% of the images in the sequences are selected and more than 87% of the traffic signs are detected. The missed objects always correspond to dirty, worn-out or badly oriented signs that would be difficult to detect even for a human operator.

Paper Nr: 272
Title:

3D OBJECT RECONSTRUCTION FROM SWISSRANGER SENSOR DATA USING A SPRING-MASS MODEL

Authors:

Guillem Alenya, Sergi Foix, Carme Torras and Babette Dellen

Abstract: We register close-range depth images of objects using a Swissranger sensor and apply a spring-mass model for 3D object reconstruction. The Swissranger sensor delivers depth images in real time which have, compared with other types of sensors, such as laser scanners, a lower resolution and are afflicted with larger uncertainties. To reduce noise and remove outliers in the data, we treat the point cloud as a system of interacting masses connected via elastic forces. We investigate two models, one with and one without a surface-topology preserving interaction strength. The algorithm is applied to synthetic and real Swissranger sensor data, demonstrating the feasibility of the approach. This method represents a preliminary step before fitting higher-level surface descriptors to the data, which will be required to define object-action complexes (OACS) for robot applications.

Posters
Paper Nr: 48
Title:

FEATURE EXTRACTION FOR LOCALIZED CBIR - What You Click is What you Get

Authors:

Steven Verstockt, Peter Lambert and Rik Van De Walle

Abstract: This paper addresses the problem of localized content-based image retrieval. Contrary to classic CBIR systems, which rely upon a global view of the image, localized CBIR focuses only on the portion of the image in which the user is interested, i.e. the relevant content. Using the proposed algorithm, it is possible to recognize an object by clicking on it. The algorithm starts with an automatic gamma correction and bilateral filtering. These pre-processing steps simplify the image segmentation. The segmentation itself uses dynamic region growing, starting from the click position. Contrary to the majority of segmentation techniques, region growing focuses only on the part of the image that contains the object. The remainder of the image is not investigated. This simplifies the recognition process, speeds up the segmentation, and increases the quality of the outcome. Following the region growing, the algorithm starts the recognition process, i.e., feature extraction and matching. Based on our requirements and the robustness reported in many state-of-the-art papers, the Scale Invariant Feature Transform (SIFT) approach is used. Extensive experimentation with our algorithm on three different datasets achieved a retrieval efficiency of approximately 80%.
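
The click-seeded segmentation step can be illustrated with a basic region-growing sketch. It assumes a grayscale image stored as a list of lists, 4-connectivity, and a fixed intensity tolerance relative to the clicked pixel; the "dynamic" growing criterion used in the paper is not described in the abstract, so this fixed rule and the name `grow_region` are stand-ins.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Breadth-first region growing from a clicked pixel.

    A pixel joins the region if it is 4-connected to it and its intensity
    differs from the seed intensity by at most `tolerance` (an assumed rule).
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Only pixels reachable from the click are ever visited, which reflects the point made above that the remainder of the image is not investigated.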

Paper Nr: 49
Title:

WEIGHT ESTIMATION AND CLASSIFICATION OF MILLED RICE USING SUPPORT VECTOR MACHINES

Authors:

Oliver C. Agustin and Byung-Joo Oh

Abstract: This paper presents a method for weight estimation and classification of milled rice kernels using support vector machines. Shape descriptors are used as input features for determining the grade factors based on physical shapes such as headrice, broken kernel, and brewer. A colour histogram is extracted from the milled rice image to obtain 24 colour features in the RGB and CIELAB colour spaces. We built a support vector regression (SVR) model for estimating rice kernel weight and a support vector classifier (SVC) for rice defectives. Results showed that on real data, the performance of SVR is better than that of linear regression (LR), with a mean square error (MSE), mean absolute error (MAE) and correlation coefficient of 78.35 × 10⁻³, 0.206 and 0.9943, respectively. In determining grade factors based on colour appearance (rice defectives), SVC outperforms the generalized regression neural network (GRNN) with an accuracy of 98.86%.
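
The three regression scores quoted above have standard definitions, sketched below for reference. The function names are illustrative, the correlation coefficient is assumed to be Pearson's r, and the code makes no claim about how the authors computed their figures.

```python
import math


def mse(y_true, y_pred):
    """Mean square error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

MSE penalises large residuals more heavily than MAE, which is why the two are usually reported together alongside the correlation between predicted and measured values.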

Paper Nr: 53
Title:

MAKING STRUCTURAL PATTERN RECOGNITION TRACTABLE BY LOCAL INHIBITION

Authors:

E. Michaelsen, L. Doktorski and M. Arens

Abstract: Declarative knowledge and control decisions on the sequence of interpretation acts are separated in a structural pattern recognition system. The control can be optimized leaving the knowledge fixed. A simple production system is used as declarative example knowledge. It is tailored to recognize and locate rectangles in images – where object primitives are several thousand very short contour segments. Different control strategies can be realized: (i) a simple quality-driven bottom-up control; (ii) a heuristic strategy punishing object instances which have been partners in an already performed reduction; and (iii) a new psychologically inspired strategy that combines local inhibition with less local excitation. These strategies are compared quantitatively on synthetic data and qualitatively on a real aerial image.

Paper Nr: 58
Title:

HUMAN VISION SIMULATION IN THE BUILT ENVIRONMENT

Authors:

Qunli Chen, Chengyu Sun and Bauke de Vries

Abstract: This paper first presents a brief review of visual perception in the built environment and of the Standard Feature Model of visual cortex (SFM). It then presents experiments on architectural cue recognition (door, wall and doorway) using an SFM feature-based model. Based on the findings of these experiments, we conclude that the visual differences between architectural cues are too subtle for the SFM to realistically simulate human vision.

Paper Nr: 69
Title:

OBJECT RECOGNITION USING MULTIPLE THRESHOLDING AND LOCAL BINARY SHAPE FEATURES

Authors:

Sameer Singh and Tom Warsop

Abstract: Traditionally, image thresholding is applied to segmentation, allowing foreground objects to be segmented. However, selection of thresholds in such schemes can prove difficult. We propose a solution by applying multiple thresholds. The task of object recognition then becomes that of matching binary objects, for which we present a new method based on local shape features. We embed our recognition method in a system which reduces the computational increase caused by using multiple thresholding. Experimental results show that our method and system work well despite using only a single example of each object class for matching.

Paper Nr: 121
Title:

WINDOW DETECTION FROM TERRESTRIAL LASER SCANNER DATA - A Statistical Approach

Authors:

Haider Ali, Robert Sablatnig and Gerhard Paar

Abstract: This paper proposes a window detection system using applied statistics and image-based methods on data from Terrestrial Laser Scanners, which can be used for direct application in a deformation measurement system. It exploits the laser distance information either directly in the laser scanner spherical coordinate space images, or on segmented planar facade patches, both with the assumption that the laser beam penetrates windows. The applied statistical method uses basic local features on local distance variations and decides on an adaptive threshold on the basis of the 1-Sigma percentile upper limit with P90 90% and P10 10% produced sample quartiles of the data for the laser spherical coordinate system image and Q3 -Sigma for the ortho images of segmented 3D facade planes as a location in the order statistics. For window detection the image is binarized and morphological closing is performed using the derived adaptive threshold. Thereafter we perform contour analysis and obtain the bounding rectangle positions that directly form the window segments in the image. We compare the window detection results on the laser spherical coordinate system image with those on ortho images of segmented 3D facades. The system provides a window detection rate of more than 85% with a processing time of less than a minute on a typical 360 degree laser scan image.

Paper Nr: 159
Title:

SMILE DETECTION USING LOCAL BINARY PATTERNS AND SUPPORT VECTOR MACHINES

Authors:

D. Freire, M. Castrillón and O. Déniz

Abstract: Facial expression recognition has been the subject of much research in recent years within the Computer Vision community. The detection of smiles, however, has received less attention. Its distinctive configuration may pose fewer problems than other, at times subtle, expressions. On the other hand, smiles can still be very useful as a measure of happiness, enjoyment or even approval. Geometrical or local-based detection approaches, like the use of lip edges, may not be robust enough, and thus researchers have focused on applying machine learning to appearance-based descriptors. This work presents an extensive experimental study of smile detection, testing Local Binary Patterns (LBP) as the main descriptors of the image, along with the powerful Support Vector Machines classifier. The results show that error rates can be acceptable, although there is still room for improvement.

Paper Nr: 170
Title:

ESTIMATION OF ASYMMETRY IN 3D FACE MODELS

Authors:

Natalya Dyshkant and Leonid Mestetskiy

Abstract: This paper proposes a new estimate of facial asymmetry in 3D human face models and an algorithm to compute it. We consider models obtained by 3D scanning. Each model is given as a cloud of points in 3D space and can be considered as a discrete single-valued function of two variables. We present an approach for constructing a disparity measure between the original face model and its reflected model. The main stages of the proposed algorithm are the construction of Delaunay triangulations of the two models and of a general Delaunay triangulation, function interpolation based on localizing the triangulations in each other, and comparison of the functions on the separate triangles of the general triangulation. Then, using elementary manipulations of the reflected model, the algorithm searches for the position at which the two models match best, so that the corresponding disparity measure is minimal. We carry out computational experiments on a database of about 200 face models. These experiments indicate that the proposed estimate is stable across different models of the same person.

Paper Nr: 181
Title:

THE EMERGENT STRUCTURE OF THE DROSOPHILA WING - A Dynamic Model Generator

Authors:

Alberto Silletti, Angelo Cenedese and Alessandro Abate

Abstract: Drosophila melanogaster is a model organism in genetics thanks to the compactness of its genome and its relative simplicity. Recently, certain developmental patterns in Drosophila have been studied by mathematical models, with the aim of gaining deeper and quantitative insight into the morphogenesis of this insect. There is a need for accurate dynamical models of the epithelial cell structure and organization within the fly wing, to further the understanding of a phenomenon known as planar cell polarity. The present study tackles the problem of retrieving such a salient structure using classical tools of dynamical system theory combined with network and graph concepts. One goal is to provide a visual detection and representation of the cell packaging that is accurate and fine. Particular care is also taken to obtain a model of this structure whose main features are compactness and simplicity.

Paper Nr: 237
Title:

3D SHAPE RECONSTRUCTION ENDOSCOPE USING SHAPE FROM FOCUS

Authors:

T. Takeshita, Y. Nakajima, M. K. Kim, S. Onogi, M. Mitsuishi and Y. Matsumoto

Abstract: In this paper, we propose a three-dimensional (3-D) shape reconstruction endoscope using shape from focus/defocus (SFF/SFD). 3-D shape measurement based on the endoscope image sequence can measure both shape and texture at the same time. This has advantages such as lesion location analysis that integrates shape and texture, and it allows the shape and texture seen through the endoscope to be recorded quantitatively. To obtain 3-D information, shape measurement methods using stereo cameras are often used. In narrow spaces, however, 3-D reconstruction using focus information such as SFF/SFD is more appropriate in terms of apparatus size. Therefore, we apply the SFF method to an endoscope for shape reconstruction and, as a first step, conduct two basic experiments with a general camera to confirm the feasibility of the system. First, to estimate the accuracy of the system, we measured objects of known shape; the error of the system was about 1 to 5 mm. Next, to confirm that a biological inner wall can be measured, we measured the inner wall of a pig stomach and reconstructed its shape.

Paper Nr: 249
Title:

APPEARANCE-BASED AND ACTIVE 3D OBJECT RECOGNITION USING VISION

Authors:

F. Trujillo-Romero and M. Devy

Abstract: This paper concerns 3D object recognition from vision. In our robotics context, an object must be recognized and localized in order to be grasped by a mobile robot equipped with a manipulator arm; several cameras are mounted on this robot, on a static mast or on the wrist of the arm. Using such a robot makes active strategies for object recognition possible. The system must be able to place the sensor in different positions around the object, first to learn discriminant features of every object to be recognized, and then to recognize these objects before a grasping task. Our method exploits Mutual Information to actively acquire visual data until recognition is achieved, as proposed in (Denzler and Brown, 2000) and (Denzler et al., 2001): color histogram, shape context, shape signature, Harris or SIFT point descriptors are learnt from different viewpoints around every object in order to make the system more robust and efficient.

Area 4 - Motion, Tracking and Stereo Vision

Full Papers
Paper Nr: 23
Title:

TOWARD LARGE SCALE MODEL CONSTRUCTION FOR VISION-BASED GLOBAL LOCALISATION

Authors:

P. Lothe, S. Bourgeois, F. Dekeyser, E. Royer and M. Dhome

Abstract: Advances in monocular SLAM reconstruction algorithms enable their integration in various applications: trajectometry, 3D model reconstruction, etc. However, the methods proposed so far still suffer from drift when applied to large-scale sequences. In this paper, we propose a post-processing algorithm which exploits a CAD model to correct SLAM reconstructions. The presented method is based on a specific model of deformable transformations and then on an adapted non-rigid ICP between the reconstructed 3D point cloud and the known CAD model. Experimental results on both synthetic and real sequences show that the 3D scene geometry regains its consistency and that the camera trajectory is improved: the mean distance between the reconstructed cameras and the ground truth is less than 1 meter over several hundred meters.

Paper Nr: 42
Title:

GPU-BASED REAL-TIME DISCRETE EUCLIDEAN DISTANCE TRANSFORMS WITH PRECISE ERROR BOUNDS

Authors:

Jens Schneider, Martin Kraus and Rüdiger Westermann

Abstract: We present a discrete distance transform in the style of the vector propagation algorithm by Danielsson. Like other vector propagation algorithms, the proposed method is close to exact, i.e., the error can be strictly bounded from above and is significantly smaller than one pixel. Our contribution is that the algorithm runs entirely on consumer-class graphics hardware, thereby achieving a throughput of up to 96 Mpixels/s. This allows the proposed method to be used in a wide range of applications that rely on both high speed and high quality.
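
To make the idea of vector propagation concrete, here is a small sequential CPU sketch in Python. It is not the GPU algorithm of the paper: the two-pass raster scan order and neighbour masks are one common textbook variant chosen for this illustration, and the function name `vector_propagation_edt` is hypothetical. Each pixel stores the offset vector to the nearest seed found so far and adopts a neighbour's vector whenever that yields a shorter distance.

```python
import math


def vector_propagation_edt(seeds, rows, cols):
    """Approximate Euclidean distance map via two-pass vector propagation.

    `seeds` is a non-empty list of (row, col) feature pixels.
    """
    far = (10**6, 10**6)  # placeholder offset meaning "no seed found yet"
    vec = [[far] * cols for _ in range(rows)]
    for r, c in seeds:
        vec[r][c] = (0, 0)

    def sq(v):
        return v[0] * v[0] + v[1] * v[1]

    def relax(r, c, dr, dc):
        # Try to reach the neighbour's seed: step to the neighbour, then follow its vector.
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and vec[nr][nc] != far:
            cand = (vec[nr][nc][0] + dr, vec[nr][nc][1] + dc)
            if sq(cand) < sq(vec[r][c]):
                vec[r][c] = cand

    for r in range(rows):  # forward pass, top to bottom
        for c in range(cols):
            for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                relax(r, c, dr, dc)
        for c in range(cols - 1, -1, -1):
            relax(r, c, 0, 1)
    for r in range(rows - 1, -1, -1):  # backward pass, bottom to top
        for c in range(cols - 1, -1, -1):
            for dr, dc in ((1, 1), (1, 0), (1, -1), (0, 1)):
                relax(r, c, dr, dc)
        for c in range(cols):
            relax(r, c, 0, -1)
    return [[math.sqrt(sq(v)) for v in row] for row in vec]
```

Because every stored vector always points at a real seed, the result can never underestimate the true distance; any error comes from a pixel occasionally missing its true nearest seed, which is the kind of error that the bounds mentioned in the abstract address.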

Paper Nr: 64
Title:

SPATIAL RECONSTRUCTION OF LOCALLY SYMMETRIC OBJECTS BASED ON STEREO MATE IMAGES

Authors:

Leonid Mestetskiy and Archil Tsiskaridze

Abstract: This paper proposes a method for recovering the characteristics of spatial objects with locally symmetric elements. The approach is based on a model of a spatial flexible object defined as a family of spheres whose centres lie on a graph with a tree-like structure. A method for real-time identification of such objects using the stereo mate images of their silhouettes is introduced. Image processing comprises the construction of continuous skeletons of the silhouettes. Application to real-time gesture recognition is considered.

Paper Nr: 92
Title:

REPAIRING PEOPLE TRAJECTORIES BASED ON POINT CLUSTERING

Authors:

Chau Duc Phu, François Brémond, Etienne Corvée and Monique Thonnat

Abstract: This paper presents a method for improving any object tracking algorithm based on machine learning. During the training phase, important trajectory features are extracted and then used to calculate a confidence value for each trajectory. The positions at which objects are usually lost and found are clustered in order to construct the set of ‘lost zones’ and ‘found zones’ in the scene. Using these zones, we construct a set of zone triplets, each consisting of three zones: an In/Out zone (a zone where an object can enter or exit the scene), a ‘lost zone’ and a ‘found zone’. Thanks to these triplets, during the testing phase, we can repair erroneous trajectories according to the triplet to which they are most likely to belong. The advantages of our approach over existing state-of-the-art approaches are that (i) the method does not depend on a predefined contextual scene, (ii) we exploit the semantics of the scene and (iii) we propose a method to filter out noisy trajectories based on their confidence value.

Paper Nr: 94
Title:

REAL-TIME MULTI-OBJECT TRACKING WITH FEW PARTICLES - A Parallel Extension of MCMC Algorithm

Authors:

François Bardet, Thierry Chateau and Datta Ramadasan

Abstract: This paper addresses real-time automatic tracking and labeling of a variable number of generic objects, using one or more static cameras. The multi-object configuration is tracked through a Markov Chain Monte-Carlo Particle Filter (MCMC PF) method. As this method processes particles sequentially, it cannot be sped up by the parallel computing that multi-core processing units allow. As our main contribution, we propose an extended MCMC PF algorithm that benefits from parallel computing, and we show that this strategy improves tracking operation. This paper also addresses object tracking involving occlusions and large changes in scale and appearance: we propose a global observation function that tracks distant objects as fairly as close ones. Experimental results are shown and discussed on pedestrian and vehicle tracking sequences.

Paper Nr: 136
Title:

RAO-BLACKWELLIZED RESAMPLING PARTICLE FILTER FOR REAL-TIME PLAYER TRACKING IN SPORTS

Authors:

Nicolai v. Hoyningen-Huene and Michael Beetz

Abstract: Tracking multiple targets with similar appearance is a common task in computer vision applications, especially in sports games. We propose a Rao-Blackwellized Resampling Particle Filter (RBRPF) as an implementable real-time continuation of a state-of-the-art multi-target tracking method. Target configurations are tracked by sampling associations and solving single-target tracking problems by Kalman filters. As an advantage of the new method, the independence assumption between data associations is relaxed to increase the robustness in the sports domain. Smart resampling and memoization are introduced to equip the tracking method with real-time capabilities in the first place. The probabilistic framework allows for consideration of appearance models and the fusion of different sensors. We demonstrate its applicability to real-world applications by tracking soccer players captured by multiple cameras through occlusions in real time.

Paper Nr: 152
Title:

IMAGE SIGNAL PROCESSING FOR VISUAL DOOR PASSING WITH AN OMNIDIRECTIONAL CAMERA

Authors:

Luis Felipe Posada, Thomas Nierobisch, Frank Hoffmann and Torsten Bertram

Abstract: This paper proposes a novel framework for vision-based door traversal that contributes to the ultimate goal of purely vision-based mobile robot navigation. Door detection, door tracking and door traversal are accomplished by processing omnidirectional images. In door detection, candidate line segments detected in the image are grouped and matched with prototypical door patterns. In door localisation and tracking, a Kalman filter aggregates the visual information with the robot's odometry. Door traversal is accomplished by a 2D visual servoing approach. The feasibility and robustness of the scheme are confirmed and validated in several robotic experiments in an office environment.

Paper Nr: 172
Title:

EXACT ALGEBRAIC METHOD OF OPTICAL FLOW DETECTION VIA MODULATED INTEGRAL IMAGING - Theoretical Formulation and Real-time Implementation using Correlation Image Sensor

Authors:

Shigeru Ando, Toru Kurihara and Dabi Wei

Abstract: A novel mathematical method and a sensing system that detect the velocity vector distribution on an optical image with pixel-wise spatial resolution and frame-wise temporal resolution are proposed. They rely on complex sinusoidally-modulated imaging using the three-phase correlation image sensor (3PCIS) and on an exact algebraic inversion method based on the optical flow identity (OFI) satisfied by an intensity image and a complex-sinusoidally modulated image captured by the 3PCIS. Since the OFI is free from time derivatives, any limitations on the object velocity and inaccuracies due to approximated time derivatives are thoroughly avoided. An experimental system was constructed with a 320×256 pixel 3PCIS device and a standard PC for inversion operations and display. Several experimental results are shown, including the dense motion capture of face and gesture and the particle image velocimetry of water vortices.

Paper Nr: 180
Title:

LABELING HUMAN MOTION SEQUENCES USING GRAPHICAL MODELS

Authors:

José I. Gómez, Manuel J. Marín-Jiménez and Nicolas Pérez de la Blanca

Abstract: Graphical models have proved to be very efficient models for labeling image data. In particular, they have been used to label data samples from human body images. In this paper, the use of graphical models is studied for human-body landmark localization. Here a new algorithm based on the Branch&Bound methodology, improving the state of the art, is presented. The initialization stage is defined as a local optimum labeling of the sample data. An iterative improvement is given on the labeling space in order to reach new graphs with a lower cost than the current best one. Two branch prune strategies are suggested under a B&B approach in order to speed up the search: a) the use of heuristics; and b) the use of a node dominance criterion. Experimental results on human motion databases show that our proposed algorithm behaves better than the classical Dynamic Programming based approach.

Paper Nr: 184
Title:

ROAD INTERPRETATION FOR DRIVER ASSISTANCE BASED ON AN EARLY COGNITIVE VISION SYSTEM

Authors:

Emre Başeski, Lars Baunegaard With Jensen, Nicolas Pugeault, Florian Pilz, Karl Pauwels, Marc M. Van Hulle, Florentin Wörgötter and Norbert Krüger

Abstract: In this work, we address the problem of road interpretation for driver assistance based on an early cognitive vision system. The structure of a road and the relevant traffic are interpreted in terms of ego-motion estimation of the car, independently moving objects on the road, lane markers and large-scale maps of the road. We make use of temporal and spatial disambiguation mechanisms to increase the reliability of visually extracted 2D and 3D information. This information is then used to interpret the layout of the road by using lane markers that are detected via Bayesian reasoning. We also estimate the ego-motion of the car, which is used to create large-scale maps of the road and to detect independently moving objects. Sample results for the presented algorithms are shown on a stereo image sequence that was collected on a structured road.

Paper Nr: 189
Title:

AN UNSUPERVISED LEARNING BASED APPROACH FOR UNEXPECTED EVENT DETECTION

Authors:

Bertrand Luvison, Thierry Chateau, Patrick Sayd, Quoc-Cuong Pham and Jean-Thierry Lapresté

Abstract: This paper presents a generic unsupervised-learning-based solution to unexpected event detection from a static uncalibrated camera. The system can be represented in a probabilistic framework in which detection is achieved by a likelihood-based decision. We propose an original method to approximate the likelihood function using a sparse vector machine based model. This model is then used to efficiently detect unexpected events online. The features used are based on optical flow orientation within image blocks. The resulting application is able to automatically learn expected optical flow orientations from training video sequences and to detect unexpected orientations (corresponding to unexpected events) at a near real-time frame rate. Experiments show that the algorithm can be used in various applications such as crowd or traffic event detection.

Paper Nr: 213
Title:

A TUNING STRATEGY FOR FACE RECOGNITION IN ROBOTIC APPLICATION

Authors:

Thierry Germa, Romain Rioux, Michel Devy and Frédéric Lerasle

Abstract: This paper deals with video-based face recognition and tracking from a camera mounted on a mobile robot companion. All persons must be logically identified before being authorized to interact with the robot, while continuous tracking is compulsory in order to estimate the position of this person. A first contribution relates to experiments with still-image-based face recognition methods in order to check which combinations of image projection and classifier lead to the highest performance on the face database acquired from our robot. Our approach, based on Principal Component Analysis (PCA) and Support Vector Machines (SVM) improved by genetic algorithm optimization of the free parameters, is found to outperform conventional appearance-based holistic classifiers (eigenface and Fisherface), which are used as benchmarks. The integration of face recognition, dedicated to the previously identified person, as intermittent features in the particle filtering framework is well-suited to this context, as it facilitates the fusion of different measurement sources by positioning the particles according to face classification probabilities in the importance function. Evaluations on key sequences acquired by the mobile robot in crowded and continuously changing indoor environments demonstrate the tracker's robustness in such natural settings. The paper closes with a discussion of possible extensions.

Paper Nr: 231
Title:

A SINGLE PAN AND TILT CAMERA ARCHITECTURE FOR INDOOR POSITIONING AND TRACKING

Authors:

T. Gaspar and P. Oliveira

Abstract: A new architecture for indoor positioning and tracking is proposed, based on a single low-cost pan and tilt camera, in which three main modules can be identified: one related to the interface with the camera, supported by parameter estimation techniques; another responsible for isolating and identifying the target, based on advanced image processing techniques; and a third that, resorting to suboptimal state estimation techniques for nonlinear dynamic systems, tracks the target and estimates its position and its linear and angular velocities. To assess the performance of the proposed methods and this new architecture, a software package was developed. An accuracy of 20 cm was obtained in a series of indoor experimental tests, for an operating range of up to ten meters, under realistic real-time conditions.

Paper Nr: 240
Title:

ARTICULATED HUMAN MOTION TRACKING WITH HPSO

Authors:

Vijay John, Spela Ivekovic and Emanuele Trucco

Abstract: In this paper, we address full-body articulated human motion tracking from multi-view video sequences acquired in a studio environment. The tracking is formulated as a multi-dimensional nonlinear optimisation and solved using particle swarm optimisation (PSO), a swarm-intelligence algorithm which has gained popularity in recent years due to its ability to solve difficult nonlinear optimisation problems. Our tracking approach is designed to address the limits of particle filtering approaches: it initialises automatically, removes the need for a sequence-specific motion model and recovers from temporary tracking divergence through the use of a powerful hierarchical search algorithm (HPSO). We quantitatively compare the performance of HPSO with that of the particle filter (PF) and annealed particle filter (APF). Our test results, obtained using the framework proposed by (Balan et al., 2005) to compare articulated body tracking algorithms, show that HPSO’s pose estimation accuracy and consistency are better than those of the PF and compare favourably with those of the APF, outperforming it in sequences with sudden and fast motion.
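
As an illustration of the optimiser named in the abstract above, the following is a minimal sketch of plain (non-hierarchical) particle swarm optimisation on a toy cost function. It is not the authors' HPSO: the function name `pso_minimize`, the inertia and acceleration parameters (`w`, `c1`, `c2`), the swarm size and the quadratic test function are all assumptions chosen for illustration.

```python
import random


def pso_minimize(f, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise f over a box using basic global-best PSO (illustrative only)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Toy cost with its minimum at (1, -2); a stand-in for a pose-fitting cost.
def toy_cost(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best, best_val = pso_minimize(toy_cost, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a tracking setting the cost would compare a projected body model against image observations; here a simple quadratic stands in for it.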

Paper Nr: 242
Title:

RECOVERY OF THE RESPONSE CURVE OF A DIGITAL IMAGING PROCESS BY DATA-CENTRIC REGULARIZATION

Authors:

Johannes Herwig and Josef Pauli

Abstract: A method is presented that fuses multiple differently exposed images of the same static real-world scene into a single high dynamic range radiance map. First, the response function of the imaging device is recovered; it maps irradiating light at the imaging sensor to gray values and is usually not linear for 8-bit images. This nonlinearity affects image processing algorithms that assume a linear model of light. With the response function known, this compression can be reversed. For reliable recovery, the whole set of images is segmented in a single step, and regions of roughly constant radiance in the scene are labeled. Under- and overexposed parts in one image are segmented without loss of detail throughout the scene. From these segments and a parametrization of digital film, the slope of the response curve is estimated, whereby various noise sources of an imaging sensor have been modeled. From its slope the response function is recovered and the images are fused. The dynamic range of outdoor environments cannot be captured by a single image, and valuable information is lost because of under- or overexposure. A radiance map overcomes this problem and makes object recognition or visual self-localisation of robots easier.

Paper Nr: 256
Title:

COMPLEXITY REDUCTION OF REAL-TIME DEPTH SCANNING ON GRAPHICS HARDWARE

Authors:

Sammy Rogmans, Maarten Dumont, Tom Cuypers, Gauthier Lafruit and Philippe Bekaert

Abstract: This paper presents an intelligent control loop add-on to reduce the total number of hardware operations, and therefore the resulting execution time, of a real-time depth scanning algorithm. The analysis module of the control loop predicts redundant brute-force operations and dynamically adjusts the input parameters of the algorithm to avoid scanning in a space that lacks the presence of objects. This approach therefore reduces the algorithmic complexity in proportion to the amount of void within the scanned volume, while remaining fully compliant with stream-centric paradigms such as CUDA and Brook+.

Paper Nr: 278
Title:

STEREO VISION USING HETEROGENEOUS SENSORS FOR COMPLEX SCENE MONITORING

Authors:

Sanjeev Kumar and Claudio Piciarelli

Abstract: The intelligent monitoring of complex scenes usually requires the adoption of different sensors depending on the type of application (e.g. radar, sonar, chemical, etc.). Over the past few years, monitoring has mainly been represented by visual surveillance. In this field, research has brought great innovation, improving surveillance from standard CCTV to modern systems that are now able to infer behaviors in limited contexts. However, when environments allow the creation of complex scenes (e.g. crowds, clutter, etc.), robust solutions are still far from available. In particular, one of the major problems is represented by occlusions, which often limit the performance of the algorithms. As a matter of fact, the majority of the proposed visual surveillance solutions process the data flow generated by a single camera. These methods fail to correctly localize an occluded object in the real environment. Stereo vision can be introduced to overcome this limit, but the number of sensors needed would double. Thus, to obtain the benefits of stereo vision while avoiding some of its drawbacks, a novel framework in stereo vision is proposed that adopts the sensors available in common visual surveillance networks. In particular, we focus on the analysis of a stereo vision system which is built from a pair of heterogeneous sensors, i.e., static and PTZ cameras, with the task of locating objects accurately.

Paper Nr: 279
Title:

EXPLOITING HUMAN BIPEDAL MOTION CONSTRAINTS FOR 3D POSE RECOVERY FROM A SINGLE UNCALIBRATED CAMERA

Authors:

Paul Kuo, Thibault Ammar, Michal Lewandowski, Dimitrios Makris and Jean-Christophe Nebel

Abstract: A new method is proposed for recovering 3D human poses in video sequences taken from a single uncalibrated camera. This is achieved by exploiting two important constraints observed in human bipedal motion: coplanarity of body key points during the mid-stance position and the presence of a foot on the ground (i.e. a static foot) during most activities. Assuming 2D joint locations have been extracted from a video sequence, the algorithm is able to perform camera auto-calibration on specific frames in which the human body adopts particular postures. Then, a simplified pin-hole camera model is used to perform 3D pose reconstruction on the calibrated frames. Finally, the static foot constraint, which is found in most human bipedal motions, is applied to infer body postures for non-calibrated frames. We compared our method with (1) the “orthographic reconstruction” method and (2) reconstruction using manually calibrated data. The results validate the assumptions made for the simplified pin-hole camera model, and the reconstruction results reveal a significant improvement over the orthographic reconstruction method.

Short Papers
Paper Nr: 9
Title:

PROJECTOR CALIBRATION USING A MARKERLESS PLANE

Authors:

Jamil Draréni, Sébastien Roy and Peter Sturm

Abstract: In this paper we address the problem of geometric video projector calibration using a markerless planar surface (wall) and a partially calibrated camera. Instead of using control points to infer the camera-wall orientation, we find this relation by efficiently sampling the hemisphere of possible orientations. This process is so fast that even the focal length of the camera can be estimated during the sampling process. Hence, physical grids and full knowledge of camera parameters are no longer necessary to calibrate a video projector.

Paper Nr: 16
Title:

VIEW-INDEPENDENT VIDEO SYNCHRONIZATION FROM TEMPORAL SELF-SIMILARITIES

Authors:

Emilie Dexter, Patrick Pérez, Ivan Laptev and Imran N. Junejo

Abstract: This paper deals with the temporal synchronization of videos representing the same dynamic event from different viewpoints. We propose a novel approach to automatically synchronize such videos based on temporal self-similarities of sequences. We explore video descriptors which capture the structure of video similarity over time and remain stable under viewpoint changes. We achieve temporal synchronization of videos by aligning such descriptors by Dynamic Time Warping. Our approach is simple and does not require point correspondences between views while being able to handle strong view changes. The method is validated on two public datasets with controlled view settings as well as on other videos with challenging motions and large view variations.
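
The alignment step mentioned in the abstract above, Dynamic Time Warping, can be sketched as follows. This is a generic textbook DTW, not the authors' implementation: the function name `dtw_path`, the scalar per-frame "descriptors" and the absolute-difference distance are assumptions made for illustration; real descriptors would be vectors compared with a suitable distance passed in as `dist`.

```python
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Align sequences a and b by DTW; return (total_cost, list of (i, j) pairs)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],       # a advances, b repeats
                                 cost[i][j - 1],       # b advances, a repeats
                                 cost[i - 1][j - 1])   # both advance
    # Backtrack from the end to recover which frames were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Two toy descriptor sequences; the second one starts one frame late.
video_a = [0, 1, 2, 3, 2, 1, 0]
video_b = [0, 0, 1, 2, 3, 2, 1, 0]
total_cost, alignment = dtw_path(video_a, video_b)
```

Each `(i, j)` pair in `alignment` says that frame `i` of the first sequence corresponds to frame `j` of the second, which is the kind of temporal mapping a synchronization method needs.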

Paper Nr: 28
Title:

ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL

Authors:

Maria Sagrebin, Daniel Caparròs Lorca, Daniel Stroh and Josef Pauli

Abstract: Although robust object tracking has a wide variety of applications ranging from video surveillance to recognition from motion, it is not completely solved. Difficulties in tracking objects arise due to abrupt object motion, changing appearance of the object, or partial and full object occlusions. To resolve these problems, assumptions are usually made concerning the motion or appearance of an object. However, in most applications no models of object motion or appearance are available in advance. This paper presents an approach which improves the performance of a tracking algorithm through simultaneous online model generation of a tracked object. The achieved results testify to the stability and robustness of this approach.

Paper Nr: 29
Title:

A REAL-TIME TRACKING SYSTEM FOR TAILGATING BEHAVIOR DETECTION

Authors:

Yingxiang Zhang, Qiang Chen and Yuncai Liu

Abstract: It is a challenging problem to detect humans and recognize their behaviors in video sequences due to the variations of background and the uncertainty of pose, appearance and motion. In this paper, we propose a systematic method to detect the behavior of tailgating. Firstly, in order to make the tracking process robust in complex situations, we propose an improved Gaussian Mixture Model (IGMM) for the background and combine the Deterministic Nonmodel-Based approach with a Gaussian Mixture Shadow Model (GMSM) to remove shadows. Secondly, we have developed an object tracking algorithm by establishing a tracking strategy and computing the similarity of color histograms. Given the known door position in the scene, we specify a definition of tailgating behavior to detect tailgaters. Experiments show that our system is robust in complex environments, computationally cost-effective and practical in real-time applications.
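
The abstract above mentions computing the similarity of color histograms but does not say which measure is used. The sketch below assumes one common choice, the Bhattacharyya coefficient between normalized joint RGB histograms; the function names, the number of bins and the toy pixel lists are hypothetical.

```python
import math


def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of an iterable of (r, g, b) values in 0..255."""
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
        count += 1
    if count == 0:
        return hist
    return [h / count for h in hist]


def bhattacharyya_coefficient(p, q):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


# Toy "objects": a reddish blob, a bluish blob, and a half-red, half-blue blob.
red = color_histogram([(250, 10, 10)] * 10)
blue = color_histogram([(10, 10, 250)] * 10)
mixed = color_histogram([(250, 10, 10)] * 5 + [(10, 10, 250)] * 5)

same = bhattacharyya_coefficient(red, red)
different = bhattacharyya_coefficient(red, blue)
partial = bhattacharyya_coefficient(red, mixed)
```

A tracker could associate a detection with the existing track whose histogram gives the highest coefficient, subject to a threshold; both the measure and that matching rule are illustrative assumptions here.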

Paper Nr: 35
Title:

COMPARISON OF RECONSTRUCTION AND TEXTURING OF 3D URBAN TERRAIN BY L1 SPLINES, CONVENTIONAL SPLINES AND ALPHA SHAPES

Authors:

Dimitri Bulatov and John E. Lavery

Abstract: We compare computational results for three procedures for reconstruction and texturing of 3D urban terrain. One procedure is based on recently developed “L1 splines”, another on conventional splines and a third on “α-shapes”. Computational results generated from optical images of a model house and of the Gottesaue Palace in Karlsruhe, Germany are presented. These comparisons indicate that the L1-spline-based procedure produces textured reconstructions that are superior to those produced by the conventional-spline-based procedure and the α-shapes-based procedure.

Paper Nr: 66
Title:

FUSION OF MOTION SEGMENTATION WITH ONLINE ADAPTIVE NEURAL CLASSIFIER FOR ROBUST TRACKING

Authors:

Sławomir Bąk, Sundaram Suresh, François Brémond and Monique Thonnat

Abstract: This paper presents a method to fuse the information from motion segmentation with an online adaptive neural classifier for robust object tracking. Motion segmentation with object classification identifies new objects present in the video sequence. This information is used to initialize the online adaptive neural classifier, which is trained to differentiate the object from its local background. The neural classifier can adapt to illumination variations and changes in appearance. Initialized objects are tracked in subsequent frames using the fusion of their neural classifiers with the feedback from the motion segmentation. Fusion is used to avoid drifting problems due to similar appearance in the local background region. We demonstrate the approach in several experiments using benchmark video sequences with different levels of complexity.

Paper Nr: 74
Title:

REAL-TIME DENSE DISPARITY ESTIMATION USING CUDA’S API

Authors:

Mourad Boufarguine, Malek Baklouti, Vincent Guitteny and Serge Couvet

Abstract: In this paper, we present a real-time dense disparity map estimation based on the belief propagation inference algorithm. While being real-time, our implementation generates high quality disparity maps. Despite the high complexity of the calculations that belief propagation involves, our implementation on a graphics processor using the CUDA API achieves a speedup of more than 100 times compared to a CPU implementation. We evaluated our results on the Middlebury benchmark and obtained good results among the real-time algorithms. We use several programming techniques to reduce the number of iterations to convergence and the memory usage in order to maintain real-time performance.

Paper Nr: 75
Title:

ONE-SHOT 3D SURFACE RECONSTRUCTION FROM INSTANTANEOUS FREQUENCIES - Solutions to Ambiguity Problems

Authors:

F. van der Heijden, L. J. Spreeuwers and A. C. Nijmeijer

Abstract: Phase-measuring profilometry is a well known technique for 3D surface reconstruction based on a sinusoidal pattern that is projected on a scene. If the surface is partly occluded by, for instance, other objects, then the depth shows abrupt transitions at the edges of these occlusions. This causes ambiguities in the phase and, consequently, also in the reconstruction. This paper introduces a reconstruction method that is based on the instantaneous frequency instead of phase. Using these instantaneous frequencies we present a method to recover from ambiguities caused by occlusion. The recovery works under the condition that some surface patches can be found that are planar. This ability is demonstrated in a simple example.

Paper Nr: 101
Title:

ATTENTION MODELS FOR VERGENCE MOVEMENTS BASED ON THE JAMF FRAMEWORK AND THE POPEYE ROBOT

Authors:

Niklas Wilming, Felix Wolfsteller, Peter König, Rui Caseiro, João Xavier and Helder Araújo

Abstract: In this work we describe a novel setup for the implementation and development of stereo vision attention models in a realistic embodied setting. We introduce a stereo vision robot head, called POPEYE, that provides degrees of freedom comparable to a human head. We describe the geometry of the robot as well as the characteristics that make it a good candidate for studying models of visual attention. Attentional robot control is implemented with JAMF, a graphical modeling framework which allows current state-of-the-art saliency models to be implemented easily. We give a brief overview of JAMF and show implementations of four exemplary attention models that can control the robot head.

Paper Nr: 105
Title:

FEATURE-BASED ANNEALING PARTICLE FILTER FOR ROBUST BODY POSE ESTIMATION

Authors:

Adolfo López and Josep R. Casas

Abstract: This paper presents a new annealing method for particle filtering in the context of body pose estimation. The feature-based annealing is inferred from the weighting functions obtained with common image features used for the likelihood approximation. We introduce a complementary weighting function based on the foreground extraction and we balance the different measures through the annealing layers in order to improve the posterior estimate. This technique is applied to estimate the upper body pose of a subject in a realistic multi-view environment. Comparative results between the proposed method and the common annealing strategy are presented to assess the robustness of the algorithm.

Paper Nr: 106
Title:

A VIRTUAL REALITY SIMULATOR FOR ACTIVE STEREO VISION SYSTEMS

Authors:

Manuela Chessa, Fabio Solari and Silvio P. Sabatini

Abstract: Virtual reality is a powerful tool for simulating the behavior of physical systems. The visual system of a robot and its interplay with the 3D environment can be modeled and simulated through the geometrical relationships between the virtual stereo cameras and the virtual 3D world. The novelty of our approach lies in the use of virtual reality as a tool to simulate the behavior of active vision systems. Conventionally, virtual reality is used for the perceptual rendering of visual information exploitable by a human user. In the proposed approach, a virtual world is rendered to simulate the actual projections on the cameras of a robotic system, so that the mechanisms of active vision are quantitatively validated using the available ground truth data.

Paper Nr: 116
Title:

GENERIC MOTION BASED OBJECT SEGMENTATION FOR ASSISTED NAVIGATION

Authors:

Sion Hannuna, Xianghua Xie, Majid Mirmehdi and Neill Campbell

Abstract: We propose a robust approach to annotating independently moving objects captured by head-mounted stereo cameras that are worn by an ambulatory (and visually impaired) user. Initially, sparse optical flow is extracted from a single image stream, in tandem with dense depth maps. Then, using the assumption that apparent movement generated by camera egomotion is dominant, flow corresponding to independently moving objects (IMOs) is robustly segmented using MLESAC. Next, the mode depth of the feature points defining this flow (the foreground) is obtained by aligning them with the depth maps. Finally, a bounding box is scaled proportionally to this mode depth and robustly fit to the foreground points such that the number of inliers is maximised.

Paper Nr: 119
Title:

EXPERIMENTAL COMPARISON OF WIDE BASELINE CORRESPONDENCE ALGORITHMS FOR MULTI CAMERA CALIBRATION

Authors:

Ferid Bajramovic, Michael Koch and Joachim Denzler

Abstract: The quality of point correspondences is crucial for the successful application of multi camera self-calibration procedures. There are several interest point detectors, local descriptors and matching algorithms, which can be combined almost arbitrarily. In this paper, we compare the point correspondences produced by several such combinations. In contrast to previous comparisons, we evaluate the correspondences based on the accuracy of relative pose estimation and multi camera calibration.

Paper Nr: 123
Title:

VOXEL OCCUPANCY WITH VIEWING LINE INCONSISTENCY ANALYSIS AND SPATIAL REGULARIZATION

Authors:

Marcel Alcoverro and Montse Pardàs

Abstract: In this paper we review the main techniques for volume reconstruction from a set of views using Shape from Silhouette techniques and we propose a new method that adapts the inconsistencies analysis shown in (Landabaso et al., 2008) to the graph cuts framework (Snow et al., 2000) which allows the introduction of spatial regularization. For this aim we use a new viewing line based inconsistency analysis within a probabilistic framework. Our method adds robustness to errors by projecting back to the views the volume occupancy obtained from 2D foreground detections intersection, and analysing this projection. The final voxel occupancy of the scene is set following a maximum a posteriori (MAP) estimate. We have evaluated a sample of techniques and the new method proposed to have an objective measure of the robustness to errors in real environments.

Paper Nr: 141
Title:

TRACKING MULTIPLE TARGETS BASED ON STEREO VISION

Authors:

Ali Ganoun, Thomas Veit and Didier Aubert

Abstract: This paper deals with the problem of tracking multiple objects in outdoor scenarios from the perspective of intelligent vehicles. The input of the proposed algorithm is the result of a stereovision obstacle detection algorithm. The aim is to establish the correspondence between the detected objects in consecutive frames and to reconstruct the trajectory of each individual object. For this purpose, an object model based on its scene position and its intensity characteristic is defined. A track management strategy including track initiation, track termination and track continuation is also proposed. This strategy makes it possible to deal with issues such as object appearance, disappearance, occlusion and detection failure. An adaptive model update technique is applied in order to take into account appearance variations of the tracked object over time. Experiments were carried out in the context of pedestrian detection. Results on urban scenarios illustrate the performance of the proposed method.

Paper Nr: 143
Title:

A NOVEL APPROACH TO ACHIEVE ROBUSTNESS AGAINST MARKER OCCLUSION

Authors:

Hugo Álvarez and Diego Borro

Abstract: This paper introduces a novel estimation technique to compute camera translation and rotation (only about the axis that is perpendicular to the image plane) when a marker is partially occluded. The approach has two main advantages: 1) only one marker is necessary; and 2) it has a low computational cost. As a result of the second feature, this proposal is ideal for mobile devices. Our method is implemented in the ARToolkitPlus library, but it could be implemented in any other marker-tracking library that uses square markers. Only a little extra image processing is needed, taking advantage of temporal coherence. Results show that the user experience is realistic enough for this technique to be applied in some applications.

Paper Nr: 161
Title:

A NOVEL SEGMENTATION METHOD FOR CROWDED SCENES

Authors:

Domenico Bloisi, Luca Iocchi, Dorothy N. Monekosso and Paolo Remagnino

Abstract: Video surveillance is one of the most studied applications in Computer Vision. We propose a novel method to identify and track people in a complex environment with stereo cameras. It uses two stereo cameras to deal with occlusions, two different background models that handle shadows and illumination changes, and a new segmentation algorithm that is effective in crowded environments. The algorithm is able to work in real time, and results demonstrating the effectiveness of the approach are shown.

Paper Nr: 174
Title:

TOWARDS REAL-TIME AND ACCURATE VOXEL COLORING FRAMEWORK

Authors:

Oussama Moslah, Arnaud Debeugny, Vincent Guitteny, Serge Couvet and Sylvie Philipp-Foliguet

Abstract: This paper presents algorithms and techniques towards a real-time and accurate Voxel Coloring framework. We combine Visual Hull, Voxel Coloring and Marching Cubes techniques to derive an accurate 3D model from a set of calibrated photographs. First, we adapted the Visual Hull algorithm for the computation of the bounding box from image silhouettes. Then, we improved the accuracy of the Voxel Coloring algorithm using both colorimetric and geometric criteria. The calculation time is reduced using an Octree data structure. Next, the Marching Cubes algorithm is used to obtain a polygonal mesh from the voxel reconstruction. Finally, we propose a practical way to speed up the whole process using graphics hardware capabilities.

Paper Nr: 183
Title:

A HIERARCHICAL 3D CIRCLE DETECTION ALGORITHM APPLIED IN A GRASPING SCENARIO

Authors:

Emre Baseski, Dirk Kraft and Norbert Kruger

Abstract: In this work, we address the problem of 3D circle detection in a hierarchical representation which contains 2D and 3D information in the form of multi-modal primitives and their perceptual organizations in terms of contours. Semantic reasoning on higher levels leads to hypotheses that then become verified on lower levels by feedback mechanisms. The effects of uncertainties in visually extracted 3D information can be minimized by detecting a shape in 2D and calculating its dimensions and location in 3D. Therefore, we use the fact that the perspective projection of a circle on the image plane is an ellipse and we create 3D circle hypotheses from 2D ellipses and the planes that they lie on. Afterwards, these hypotheses are verified in 2D, where the orientation and location information is more reliable than in 3D. For evaluation purposes, the algorithm is applied in a robotics application for grasping cylindrical objects.

Paper Nr: 191
Title:

IMPLICIT TRACKING OF MULTIPLE OBJECTS BASED ON BAYESIAN REGION LABEL ASSIGNMENT

Authors:

Masaya Ikeda, Kan Okubo and Norio Tagawa

Abstract: For tracking objects, various template matching methods are usually used. However, these cannot completely cope with apparent changes of a target object in images. On the other hand, to discriminate multiple objects in still images, label assignment based on MAP estimation using the objects' features is convenient. In this study, we propose a method that makes it possible to track multiple objects stably without explicit tracking by extending the above MAP assignment in the temporal direction. We propose two techniques: information on the target position and size detected in the previous frame is propagated to the current frame as a prior probability of the target region, and distribution properties of the target’s feature values in a feature space are adaptively updated based on the detection results at each frame. Since the proposed method is based on label assignment rather than on explicit tracking based on target appearance in images, it is especially robust to occlusion.

Paper Nr: 204
Title:

REGISTRATION OF DSM AND RANGE IMAGES FOR 3-D POSE ESTIMATION OF AN UNMANNED GROUND VEHICLE

Authors:

Sung-In Choi, Soon-Yong Park, Jaekyoung Moon, Jun Kim and Yong-Woon Park

Abstract: In this paper, we propose a new approach that registers a range image acquired from a 3-D range sensor to a DSM in order to estimate the 3-D pose of an unmanned ground vehicle. Generally, 3-D registration is divided into two steps, coarse registration and refinement. For the coarse registration to be precise and fast, a proper feature matching technique between the DSM and the range image is required. We generate signatures from the DSM and the range images using shape parameterization, and obtain a 3-D rigid transformation by matching them so as to minimize the registration error.

Paper Nr: 214
Title:

USING THE DISCRETE HADAMARD TRANSFORM TO DETECT MOVING OBJECTS IN SURVEILLANCE VIDEO

Authors:

Chanyul Kim and Noel E. O'Connor

Abstract: In this paper we present an approach to object detection in surveillance video based on detecting moving edges using the Hadamard transform. The proposed method is characterized by robustness to illumination changes and ghosting effects and provides high speed detection, making it particularly suitable for surveillance applications. In addition to presenting an approach to moving edge detection using the Hadamard transform, we introduce two measures to track edge history, Pixel Bit Mask Difference (PBMD) and History Update Value (HUV) that help reduce the false detections commonly experienced by approaches based on moving edges. Experimental results show that the proposed algorithm overcomes the traditional drawbacks of frame differencing and outperforms existing edge-based approaches in terms of both detection results and computational complexity.
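
As an illustration of why the Hadamard transform suits high-speed detection, the following minimal sketch computes an unnormalized fast Walsh-Hadamard transform using only additions and subtractions. The function name `fwht` and the use of a 1D list are assumptions made for illustration; the abstract does not specify how the authors apply the transform to image blocks.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural Hadamard order).

    Hypothetical helper for illustration; the input length must be a
    power of two.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine entries that are h positions apart
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = data[i], data[i + h]
                data[i], data[i + h] = x + y, x - y
        h *= 2
    return data


if __name__ == "__main__":
    signal = [1, 0, 1, 0]
    print(fwht(signal))
    # Applying the transform twice returns the input scaled by its length
    print(fwht(fwht(signal)))
```

Because the transform involves no multiplications, it needs n log2(n) additions or subtractions for n samples.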

Paper Nr: 218
Title:

CONTINUOUS EDGE GRADIENT-BASED TEMPLATE MATCHING FOR ARTICULATED OBJECTS

Authors:

Daniel Mohr and Gabriel Zachmann

Abstract: In this paper, we propose a novel edge gradient based template matching method for object detection. In contrast to other methods, ours does not perform any binarization or discretization during the online matching. This is facilitated by a new continuous edge gradient similarity measure. Its main components are a novel edge gradient operator, which is applied to query and template images, and the formulation as a convolution, which can be computed very efficiently in Fourier space. We compared our method to a state-of-the-art chamfer based matching method. The results demonstrate that our method is much more robust against weak edge response and yields much better confidence maps with fewer maxima that are also more significant. In addition, our method lends itself well to efficient implementation on GPUs: at a query image resolution of 320×256 and a template resolution of 80×80 we can generate about 330 confidence maps per second.

Paper Nr: 219
Title:

SIMULATING DYNAMICAL SYSTEMS FOR EARLY VISION

Authors:

Babette Dellen and Florentin Wörgötter

Abstract: We propose a novel algorithm for stereo matching using a dynamical systems approach. The stereo correspondence problem is first formulated as an energy minimization problem. From the energy function, we derive a system of differential equations describing the corresponding dynamical system of interacting elements, which we solve using numerical integration. Optimization is introduced by means of a damping term and a noise term, an idea similar to simulated annealing. The algorithm is tested on the Middlebury stereo benchmark.

Paper Nr: 226
Title:

MULTIPLE CUE DATA FUSION USING MARKOV RANDOM FIELDS FOR MOTION DETECTION

Authors:

Marc Vivet, Brais Martínez and Xavier Binefa

Abstract: We propose a new method for motion detection with a stationary camera, in which the outputs of several motion detectors that are not robust but are computationally light (which we call weak motion detectors (WMD)) are merged using a spatio-temporal Markov Random Field to improve the results. We place the emphasis on the fusion of their information rather than on the weak motion detectors themselves. The main contribution is to show how the MRF can be modeled to obtain a robust result. Experimental results show the improvement and good performance of the proposed method.

Paper Nr: 228
Title:

SCALE ROBUST ADAPTIVE FEATURE DENSITY APPROXIMATION FOR VISUAL OBJECT REPRESENTATION AND TRACKING

Authors:

C. Liu, N. H. C. Yung and R. G. Fang

Abstract: Visual object appearance representation based on feature density approximation (FDA) is emerging as an effective method for object tracking, but it is challenged by complex object motion (e.g. scaling, rotation) and the resulting variation in object appearance. Traditional adaptive FDA methods extract features at fixed scales, ignoring the object's scale variation, and update the FDA by sequential Maximum Likelihood estimation, which lacks robustness for sparse data. To address these challenges, this paper proposes a robust multi-scale adaptive FDA object representation method for tracking, together with a robust FDA updating method. The FDA achieves robustness by extracting features at the selected scale and estimating the feature density with a new likelihood function defined by both the feature set and the features' effectiveness probability. In FDA updating, robustness is achieved by updating the FDA in a Bayesian way with the MAP-EM algorithm, using prior knowledge of the density extracted from the historical density. Complex object motion (e.g. scaling and rotation) is handled by correlating object appearance with its spatial alignment. Experimental results show that this method is efficient under complex motion and robust in adapting to object appearance variation caused by changes in scale, illumination, pose and viewing angle.

Paper Nr: 230
Title:

LASER RANGE DATA REGISTRATION USING SPIN IMAGES

Authors:

Xavier Mateo Prous and Xavier Binefa Valls

Abstract: Registration of laser range data acquired from different scanner positions is still an active topic in the literature. In this paper we explore solving it with spin images, which create a 2D image for every 3D vertex in the scans. Matching spin images allows the estimation of an initial rigid transformation between the scans, which can later be refined with an ICP process to achieve a more accurate registration.
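
To make the idea of a spin image concrete, here is a minimal sketch of the commonly used construction: every scan point is described relative to an oriented point (position `p`, unit normal `n`) by its radial distance from the normal line (`alpha`) and its signed height along the normal (`beta`), and these pairs are accumulated in a 2D histogram. The function names, the bin layout and the parameters `bin_size` and `size` are assumptions for illustration, not details taken from the paper.

```python
import math


def spin_coordinates(p, n, x):
    """Return (alpha, beta) of point x relative to oriented point (p, n).

    n is assumed to be a unit-length normal.
    """
    d = [xi - pi for xi, pi in zip(x, p)]
    beta = sum(ni * di for ni, di in zip(n, d))   # height along the normal
    sq = sum(di * di for di in d) - beta * beta   # squared radial distance
    alpha = math.sqrt(max(sq, 0.0))               # clamp rounding noise
    return alpha, beta


def spin_image(p, n, points, bin_size, size):
    """Accumulate a size x size histogram of (alpha, beta) pairs."""
    image = [[0] * size for _ in range(size)]
    half = size * bin_size / 2.0  # beta range covered: [-half, +half]
    for x in points:
        alpha, beta = spin_coordinates(p, n, x)
        col = int(alpha // bin_size)
        row = int((half - beta) // bin_size)
        if 0 <= row < size and 0 <= col < size:
            image[row][col] += 1
    return image


if __name__ == "__main__":
    cloud = [(3, 4, 2), (1, 0, 0), (0, 0, -1)]
    for line in spin_image((0, 0, 0), (0, 0, 1), cloud, 1.0, 8):
        print(line)
```

Because `alpha` and `beta` do not depend on the rotation about the normal, the resulting image is unchanged by rigid motion of the scan, which is what makes matching between scans possible.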

Paper Nr: 232
Title:

A NEW APPLICATION FOR 3D-SNAKES - Modelling Electrical Discharges

Authors:

Gilmario Barbosa dos Santos, Sidney Pinto da Cunha and Clesio Luiz Tozzi

Abstract: A new approach for modelling electrical discharges is proposed. For this purpose, we use an active contour named 3D-snake, geometrically represented by a B-spline that evolves in 3D space constrained by internal and external energies. More specifically, the external energy comes from a pair of images. This new model is much less dependent on the determination of homologous points than the approaches found in the literature for recovering the 3D geometry of electrical discharges. In addition, the proposal discussed here is capable of tracking the evolution of the electrical discharge, taking into account the time dependence between consecutive pairs of frames in two videos.

Paper Nr: 233
Title:

MULTI-FEATURE STEREO VISION SYSTEM FOR ROAD TRAFFIC ANALYSIS

Authors:

Quentin Houben, Juan Carlos Tocino Diaz, Nadine Warzée, Olivier Debeir and Jacek Czyz

Abstract: This paper presents a method for counting and classifying vehicles on motorways. The system is based on a multi-camera system fixed over the road. Different features (maximum phase congruency and edges) are detected in the two images and matched with a local matching algorithm. The resulting 3D point cloud is processed by a maximum spanning tree clustering algorithm to group the points into vehicle objects. Bounding boxes are defined for each detected object, giving an approximation of the vehicles' 3D sizes. A complementary 2D quadrilateral detector has been developed to increase the probability of matching features on vehicles exhibiting little texture, such as long vehicles. The algorithm presented here was validated manually and yields a 90% correct detection rate.

Paper Nr: 241
Title:

ROBUST OCCLUSION HANDLING WITH MULTIPLE CAMERAS USING A HOMOGRAPHY CONSTRAINT

Authors:

Anastasios L. Kesidis and Dimitrios I. Kosmopoulos

Abstract: The problem of human detection in crowded scenes where people may occlude each other has been tackled recently using the planar homography constraint in a multiple view framework. The foreground objects detected in each view are projected onto a common plane in an accumulated fashion, and the maxima of this accumulation are then matched to the moving targets. However, the superposition of the foreground objects' projections on a common plane may create artifacts that can seriously mislead a human detector by creating false positives. In this work we present a method which eliminates those artifacts using only geometrical information, thus contributing to robust human detection from multiple views. The presented experimental results validate the proposed approach.

Paper Nr: 245
Title:

ColEnViSon: Color Enhanced Visual Sonifier - A Polyphonic Audio Texture and Salient Scene Analysis

Authors:

Codruta Orniana Ancuti, Cosmin Ancuti and Philippe Bekaert

Abstract: In this work we introduce a color-based image-audio system that enhances the perception of visually impaired users. Traditional sound-vision substitution systems mainly translate gray scale images into corresponding audio frequencies. However, these algorithms deprive the user of color information, a critical factor in object recognition and in attracting visual attention. We propose an algorithm that translates the scene into sound based on some classical computer vision algorithms. The most salient visual regions are extracted by a hybrid approach that blends the computed saliency map with the segmented image. The selected image region is simplified based on a reference color map dictionary. The centroids of the color space are translated into audio by different musical instruments. We chose to encode the audio file as a polyphonic music composition, reasoning that humans are able to distinguish more than one instrument at the same time, and also in order to reduce the playing duration. Tests of the prototype demonstrate that non-proficient blindfolded participants can easily interpret sequences of colored patterns and can also distinguish, for example, the quantity of a specific color contained in a given image.

Paper Nr: 254
Title:

SELF-CALIBRATION CONSTRAINTS ON EUCLIDEAN BUNDLE ADJUSTMENT PARAMETERIZATION - Application to the 2 Views Case

Authors:

Guillaume Gelabert, Michel Devy and Frédéric Lerasle

Abstract: Over the last two decades, many contributions have been proposed on 3D reconstruction from image sequences. Nevertheless, few practical applications exist, especially using vision. We are concerned with the analysis of image sequences acquired during crash tests. In such tests, 3D measurements of the motion of objects, generally identified by specific markings, must be extracted. With digital cameras it is quite simple to acquire video sequences, but it is very difficult to obtain the camera parameters and relative positions of a multi-camera system from the operators in charge of these acquisitions. In this paper, we are interested in the simplest situation, two cameras observing the motion of an object of interest; the challenge consists in reconstructing the 3D model of this object while simultaneously estimating the intrinsic and extrinsic parameters of the cameras. This paper therefore deals with 3D Euclidean reconstruction from uncalibrated cameras: we recall some theoretical results in order to evaluate which estimations are possible when using only two images acquired by two distinct perspective cameras. Typically, these will be the first two images of our sequences. We present several state-of-the-art contributions on these topics, followed by results obtained from synthetic data, from which we assess the advantages and drawbacks of several parameter estimation strategies based on Sparse Bundle Adjustment and on Levenberg-Marquardt optimization.

Paper Nr: 277
Title:

INTEGRATION OF INTENSITY EDGE INFORMATION INTO THE REACTION-DIFFUSION STEREO ALGORITHM

Authors:

Atsushi Nomura, Makoto Ichikawa, Koichi Okada and Hidetoshi Miike

Abstract: The present paper proposes a visual integration algorithm that integrates intensity edge information into a stereo algorithm. The stereo algorithm assumes two constraints of continuity and uniqueness on disparity distribution. Since depth discontinuity around object boundaries does not satisfy the continuity constraint, it causes numerous errors in stereo disparity detection. In order to reduce the errors due to the depth discontinuity, we propose a new algorithm that integrates intensity edge information into the stereo algorithm. The stereo algorithm utilizes reaction-diffusion equations, in which diffusion coefficients control the continuity constraint. Thus, we introduce anisotropic diffusion fields into the reaction-diffusion equations; that is, we modulate the diffusion coefficients according to results of edge detection applied to image intensity distribution. We demonstrate how the proposed algorithm works around areas having depth discontinuity and confirm quantitative performance of the algorithm in comparison to other stereo algorithms.

Posters
Paper Nr: 33
Title:

A FAST AND ROBUST HAND-DRIVEN 3D MOUSE

Authors:

Andrea Bottino and Matteo De Simone

Abstract: The development of new interaction paradigms requires a natural interaction. This means that people should be able to interact with technology using the same models they use in everyday life, that is, through gestures, expressions and voice. Following this idea, in this paper we propose a non-intrusive vision-based tracking system able to capture hand motion and simple hand gestures. The proposed device allows the hand to be used as a “natural” 3D mouse, where the forefinger tip or the palm centre is used to identify a 3D marker and hand gestures can be used to simulate the mouse buttons. The approach is based on a monoscopic tracking algorithm which is computationally fast and robust against noise and cluttered backgrounds. Two image streams are processed in parallel, exploiting multi-core architectures, and their results are combined to obtain a constrained stereoscopic problem. The system has been implemented and thoroughly tested in an experimental environment where the 3D hand mouse has been used to interact with objects in a virtual reality application. We also provide results on the performance of the tracker, which demonstrate the precision and robustness of the proposed system.

Paper Nr: 38
Title:

ANGLES ESTIMATION OF ROTATING CAMERA

Authors:

Samira Ait Kaci Azzou, Slimane Larabi and Chabane Djeraba

Abstract: We address the problem of estimating camera motion from point and line correspondences across multiple views. We first investigate the mathematical relation between the slopes of lines in the different images acquired after a rotation of the camera. Assuming that lines are tracked across successive images, this relation is used to estimate the rotation angles of the camera. Experiments are conducted on real images, and the results obtained are presented and discussed.

Paper Nr: 120
Title:

MOTION-BASED FEATURE CLUSTERING FOR ARTICULATED BODY TRACKING

Authors:

Hildegard Kuehne and Annika Woerner

Abstract: The recovery of three dimensional structures from moving elements is one of the main abilities of the human perception system. It is mainly based on particularities of how we interpret moving features, especially on the enforcement of geometrical grouping and definition of relation between features. In this paper we evaluate how the human abilities of motion based feature clustering can be transferred to an algorithmic approach to determine the structure of a rigid or articulated body in an image sequence. It shows how to group sparse 3D motion features to structural clusters, describing the rigid elements of articulated body structures. The location and motion properties of sparse feature point clouds have been analyzed and it is shown that moving features can be clustered by their local and temporal properties without any additional image information. The assembly of these structural groups could allow the detection of a human body in an image as well as its pose estimation. So, such a clustering can establish a basis for a markerless reconstruction of articulated body structures as well as for human motion recognition by moving features.

Paper Nr: 128
Title:

OBJECT DETECTION AND TRACKING USING KALMAN FILTER AND FAST MEAN SHIFT ALGORITHM

Authors:

Ali Ahmed

Abstract: Object detection in videos involves verifying the presence of an object in image sequences and possibly locating it precisely for recognition. Object tracking monitors an object's spatial and temporal changes during a video sequence, including its presence, position, size, shape, etc. These two processes are closely related because tracking usually starts with detecting objects, while detecting an object repeatedly in subsequent images is often necessary to support and verify tracking. In this paper, a novel approach to detecting and tracking objects is presented. It combines a Kalman filter with a fast mean shift algorithm. The Kalman prediction follows the measurements and may therefore be misled by a wrong measurement. To handle this, the fast mean shift algorithm is used to locate density extrema, which indicate whether the Kalman prediction is right or has been misled by a wrong measurement. A wrong prediction is corrected with the help of the density extrema in the scene. The proposed approach can robustly track a moving object through consecutive frames under difficulties such as rapid appearance changes caused by image noise, illumination changes, and cluttered background.

Paper Nr: 153
Title:

TRACKING DEFORMABLE OBJECTS AND DEALING WITH SAME CLASS OBJECT OCCLUSION

Authors:

Rene Alquezar, Nicolas Amezquita and Francesc Serratosa

Abstract: This paper presents an extension of a previously reported method for object tracking in video sequences to handle the problems of object crossing and occlusion by other objects of the same class as the one being followed. The proposed solution is embedded in a system that integrates recognition and tracking in a probabilistic framework. In a recent work, a method to approach the object occlusion problem was proposed that failed when the object crossed or was occluded by another object of the same class. Here we present an attempt to overcome this limitation and show some promising results. The method is based on the assumption that when two objects cross each other there is no abrupt change in their trajectories. Our system uses object recognition results provided by a neural net, computed from colour features of image regions for each frame. The location of tracked objects is represented through probability images that are updated dynamically using both recognition and tracking results. From these probabilities and a prediction of the motion of the object in the image, a binary decision is made for each pixel and object.

Paper Nr: 158
Title:

SOLVING ILL-POSED PROBLEMS USING DATA ASSIMILATION - Application to Optical Flow Estimation

Authors:

Dominique Béréziat and Isabelle Herlin

Abstract: Data Assimilation is a mathematical framework used in environmental sciences to improve forecasts performed by meteorological, oceanographic or air quality simulation models. Data Assimilation techniques require solving a system with three components: one describing the temporal evolution of a state vector, one coupling the observations and the state vector, and one defining the initial condition. In this article, we use this framework to study a class of ill-posed Image Processing problems, usually solved by spatial and temporal regularization techniques. A generic approach is defined to express an ill-posed Image Processing problem as a Data Assimilation system. This method is illustrated on the determination of optical flow from a sequence of images. The resulting software has two advantages: a quality criterion on the input data is used to weight their contribution to the computation of the solution, and a dynamic model is proposed to ensure significant temporal regularity of the solution.

Paper Nr: 169
Title:

A NEW LIKELIHOOD FUNCTION FOR STEREO MATCHING - How to Achieve Invariance to Unknown Texture, Gains and Offsets?

Authors:

Ferdinand van der Heijden, Luuk J. Spreeuwers and Sanja Damjanovic

Abstract: We introduce a new likelihood function for window-based stereo matching. This likelihood can cope with unknown textures, uncertain gain factors, uncertain offsets, and correlated noise. The method can be fine-tuned to the uncertainty ranges of the gains and offsets, rather than applying a full, blunt normalization as in NCC (normalized cross correlation). The likelihood is based on a sound probabilistic model. As such, it can be used directly within a probabilistic framework. We demonstrate this by embedding the likelihood in an HMM (hidden Markov model) formulation of the 3D reconstruction problem and applying this to a test scene. We compare the reconstruction results with those obtained when the similarity measure is the NCC, and we show that our likelihood fits better within the probabilistic framework for stereo matching than NCC.
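
For reference, the "full, blunt normalization" of NCC can be sketched as follows: the score subtracts each window's mean and divides by the product of the window norms, so any positive gain and any offset applied to one window leave the score unchanged. This is a minimal illustration of standard NCC only, not of the likelihood function proposed in the paper; the function name `ncc` and the flat-list window representation are assumptions.

```python
import math


def ncc(a, b):
    """Normalized cross correlation of two equally sized windows.

    Windows are given as flat sequences of intensities. Returns 0.0
    when either window is constant (zero variance).
    """
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and equally sized")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]   # removing the mean cancels any offset
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0  # dividing by norms cancels gain


if __name__ == "__main__":
    left = [1.0, 4.0, 2.0, 8.0]
    right = [3.0 * v + 10.0 for v in left]  # gain 3, offset 10
    print(ncc(left, right))
```

The normalization removes gain and offset entirely regardless of how large they are, which is the behaviour the abstract contrasts with a likelihood tuned to known uncertainty ranges.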

Paper Nr: 190
Title:

REAL TIME FOREGROUND OBJECT DETECTION USING PTZ CAMERA

Authors:

Lionel Robinault, Serge Miguet and Stéphane Bres

Abstract: Considerable research is devoted to exploiting the characteristics of PTZ cameras. Being motorized, these cameras can cover a wide field of view. A classic application of these cameras is image mosaicing, but they can also be used to track moving objects. In this paper, we present an original approach to registration, adapted to the case of central projection, and a background subtraction algorithm for these cameras. The background image is updated iteratively, and only in the part "seen" by the camera. We have experimented with different segmentation algorithms using our background modeling technique, and this approach makes real-time object tracking possible for PTZ cameras.

Paper Nr: 215
Title:

COMBINED 3D AND MULTISPECTRAL FRESCO DOCUMENTATION OF THE VILLA OPLONTIS, POMPEI - High-Resolution and High-Performance Digitization of Cultural Heritage

Authors:

Bernd Breuckmann, Hubert Mara and Zsófia Végvári

Abstract: Motivated by applications in cultural heritage, industry and medicine, we are developing 3D scanners and post-processing systems for rapid and precise documentation of curved surfaces. By constantly increasing the resolution and accuracy of our system, we can document small deviations even on flat surfaces such as frescos. This enables the documentation of features that are important for restoration, such as small fractures, or the topology of paint strokes for scientific research. The 3D documentation can be done in situ, radiation-free and contact-free, using a structured (coded) light source and a digital camera. Using light to document colourfully painted surfaces led to the integration of colour-filtering techniques to "see through" the first layer(s) of paint. This approach, typically known from photography, is used to reveal underdrawings of paintings. While photographs suffer from lens distortion and lack a precise scale, we can provide the height of paint layers in µm on a properly calibrated scale. This method has already been successfully tested on synthetic data and on medieval paintings and statues, which do not cover all painting techniques known to art historians. Therefore we conducted experiments in Pompei to determine the capabilities of our system for fresco paintings. Results shown in this report cover traditional close-range 3D acquisition for larger fields of view (m²) and multi-spectral 3D acquisition for paint layers with a field of view of ~600 cm². Regarding performance, which matters given the tremendous number of frescos, we could show that 3D acquisition can be done in ~15 minutes per m². Multi-spectral 3D acquisition can be applied in a similarly fast manner by using expert knowledge to narrow down the areas of interest.

Paper Nr: 235
Title:

DIGITAL IMAGE STABILIZATION IN A VIDEO-STREAM - Stabilization of (Undesirable) Image Movements in a Video-Stream

Authors:

Martin Drahansky and Filip Orsag

Abstract: This paper deals with image stabilization for video-based tracking systems. We begin with an introduction to image stabilization. A short description of known image stabilization algorithms follows, including our solution, which is based on these methods with some optimizations. Finally, we present a suitable hardware platform, which we developed and constructed and which uses a DSP, an FPGA and SDRAM. The combination of our software and our hardware is new and very promising.

Paper Nr: 236
Title:

A CAMERA AUTO-CALIBRATION ALGORITHM FOR REALTIME ROAD TRAFFIC ANALYSIS

Authors:

Juan Carlos Tocino Diaz, Quentin Houben, Jacek Czyz, Olivier Debeir and Nadine Warzée

Abstract: This paper presents a new mono-camera system for traffic surveillance. It uses an original algorithm to automatically obtain a calibration pattern from road lane markings. Movement detection is done with a Σ-Δ background estimation, a non-linear method of background subtraction based on comparison and elementary increment/decrement. The foreground and calibration data obtained allow vehicle speed to be determined efficiently. Finally, a new method to estimate the height of vehicles is presented.
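
The "comparison and elementary increment/decrement" principle can be illustrated with a minimal sketch of a Σ-Δ update in its commonly described form: the background estimate moves one grey level toward each new pixel value, and a per-pixel variance estimate moves one step toward a multiple of the absolute difference. The function name, the amplification factor `n` and the bounds `v_min` and `v_max` are assumptions for illustration; the abstract does not give the authors' exact variant or parameters.

```python
def sigma_delta_step(background, variance, frame, n=2, v_min=1, v_max=255):
    """Update background and variance in place; return a foreground mask.

    All three images are flat lists of integer grey levels of equal
    length. A pixel is foreground when its absolute difference from the
    background exceeds the variance estimate.
    """
    mask = []
    for i, pixel in enumerate(frame):
        # Move the background one grey level toward the current pixel
        if background[i] < pixel:
            background[i] += 1
        elif background[i] > pixel:
            background[i] -= 1
        diff = abs(pixel - background[i])
        # Move the variance one step toward n * diff (only where diff != 0)
        if diff != 0:
            target = n * diff
            if variance[i] < target:
                variance[i] = min(variance[i] + 1, v_max)
            elif variance[i] > target:
                variance[i] = max(variance[i] - 1, v_min)
        mask.append(diff > variance[i])
    return mask


if __name__ == "__main__":
    background = [100, 100]
    variance = [2, 2]
    print(sigma_delta_step(background, variance, [100, 200]))
    print(background, variance)
```

Since each update is a comparison followed by an increment or decrement, the per-pixel cost is constant and needs no multiplication beyond the scaling by `n`.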

Paper Nr: 252
Title:

MULTIPLE VEHICLE TRACKING USING GABOR FILTER BANK PREDICTOR

Authors:

James Graham, Mehmet Celenk, John Willis, Tom Conley and Haluk Eren

Abstract: This paper presents a time-varying Gabor filter bank predictor for use with vehicle tracking via surveillance video. A frame-based 2D Gabor filter bank is selected as the primary detector of changes in a given video frame sequence. Detected changes are localized in each frame by fitting a bounding box to the silhouette of the vehicle in the region of interest (ROI). The arbitrary motion of each vehicle is fed to a non-linear directional predictor along the time axis to estimate the location of the tracked vehicle in the next frame of the video sequence. Real-time traffic-video experiments show that the cone Gabor filter structure is able to tune itself to a selected target and trace it accordingly. This property is highly desirable for fast and accurate tracking of moving vehicles or targets in range- and intensity-driven sensing.

Paper Nr: 253
Title:

ON-LINE FACE TRACKING UNDER LARGE LIGHTING CONDITION VARIATIONS USING INCREMENTAL LEARNING

Authors:

Lyes Hamoudi, Khaled Boukharouba, Jacques Boonaert and Stéphane Lecoeuche

Abstract: To be efficient outdoors, automated video surveillance systems should recognize and monitor human activities under varying amounts of light. In this paper, we present a human face tracking system based on the classification of skin pixels using colour and texture properties. The originality of this work lies in the use of a specific dynamical classifier. An incremental SVM algorithm, equipped with dynamic learning and unlearning rules, is designed to track the variation of the skin-pixel distribution. This adaptive skin classification system is able to detect and track a face under large variations in lighting conditions.