VISAPP 2006 Abstracts
Conference
Area 1 - Image Formation and Processing
Area 2 - Image Analysis
Area 3 - Image Understanding
Area 4 - Motion, Tracking and Stereo Vision

Area 1 - Image Formation and Processing

Title:
AN EFFICIENT CATADIOPTRIC SENSOR CALIBRATION BASED ON A LOW-COST TEST-PATTERN
Author(s):
Nicolas Ragot, Jean-Yves Ertaud, Xavier Savatier and Belahcène Mazari
Abstract:
This article presents an innovative calibration method for a panoramic vision sensor dedicated to the three-dimensional reconstruction of an environment with no prior knowledge. We begin with a detailed presentation of the sensor architecture. We review the general features of central catadioptric sensors and clarify the fixed viewpoint constraint. Next, an extensive description of previous panoramic calibration techniques is given. We state the postulates that led us to the calibration method presented in this paper. The low-cost calibration test pattern is then described, the algorithmic approach is detailed, and the results obtained are presented. Finally, the last part reviews the results.

Title:
JOINT PRIOR MODELS OF MUMFORD-SHAH REGULARIZATION FOR BLUR IDENTIFICATION AND SEGMENTATION IN VIDEO SEQUENCES
Author(s):
Hongwei Zheng and Olaf Hellwich
Abstract:
We study a regularized Mumford-Shah functional in the context of joint prior models for blur identification and segmentation. For the ill-posed regularization problem, it is hard to find a good initial value that ensures the soundness of the converged value. A newly introduced prior solution space of point spread functions in a double regularized Bayesian estimation can satisfy such demands. The Mumford-Shah functional is formulated using a $\Gamma$-convergence approximation and is minimized by projecting iterations onto an alternating minimization within Neumann conditions. The pre-estimated priors help the Mumford-Shah functional to reduce the computational complexity and improve the restoration results simultaneously. Moreover, segmentation of blurred objects is more difficult. A graph-theoretic approach is used to group edges which are derived from the Mumford-Shah functional. Blurred objects with lower gradients and objects with stronger gradients are grouped separately. Numerical experiments show that the proposed algorithm is robust and efficient in that it can handle images that are formed in different environments with different types and amounts of blur and noise.

Title:
RENDERING (COMPLEX) ALGEBRAIC SURFACES
Author(s):
J. F. Sanjuan-Estrada, L. G. Casado and I. García
Abstract:
The traditional ray-tracing technique based on a ray-surface intersection is reduced to a developable surface-surface intersection problem. At the core of every ray-tracing program is the fundamental question of detecting the intersecting point(s) of a ray and a surface. Usually, these applications involve computation and manipulation of non-linear algebraic primitives, where these primitives are represented using real numbers and polynomial equations. But the fast algorithms used for real polynomial surfaces are not useful to render complex polynomials. In this paper, we propose to extend the traditional ray-tracing technique to detect the intersecting points of a ray and complex polynomials. A polynomial equation with some complex coefficients, that is, coefficients with real and imaginary parts, is called a complex polynomial. We use a root finder algorithm based on interval arithmetic which computes verified enclosures of the roots of a complex polynomial by enclosing the zeros in narrow bounds. We also propose a new procedure to render real or complex polynomials in real and complex space. To render a surface in complex space, the algorithm must detect all real and complex roots. The color of a pixel is calculated from the roots whose argument lies inside the chosen complex space, taking the one with minimum modulus.

Title:
CONSIDERATIONS ON THE FFT VARIANTS FOR AN EFFICIENT STREAM IMPLEMENTATION ON GPU
Author(s):
José G. Marichal-Hernández, Fernando Rosa and José M. Rodríguez-Ramos
Abstract:
In this article, the different variants of the fast Fourier transform algorithm are revisited and analysed in terms of the cost of implementing them on graphics processing units. We describe the key factors in the selection of an efficient algorithm that takes advantage of this hardware and, with the stream model language BrookGPU, we implement efficient versions of unidimensional and bidimensional FFT. These implementations allow the computation of unidimensional transforms of sequences of 262k complex numbers in under 13 ms and bidimensional transforms of sequences of size 1024x1024 in under 59 ms on a G70 GPU, which is almost 3.4 times faster than FFTW on a high-end CPU.
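
As an illustration of the kind of algorithm whose variants are being compared, the following is a minimal sketch of one textbook variant, the recursive radix-2 decimation-in-time FFT, written for the CPU in plain Python. It is not the authors' BrookGPU implementation; the function names `fft_radix2` and `dft_naive` are hypothetical and chosen only for this example.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch).

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft_naive(x):
    """Direct O(n^2) DFT, used only as a reference for checking the FFT."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
```

The butterfly loop is the part that stream implementations typically map onto parallel hardware; the recursion here is only the simplest way to express the data dependencies.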

Title:
CALCULATION OF OPTIMAL TRAJECTORY IN 3-D STRUCTURED ENVIRONMENT BY USING GEODESY AND MATHEMATICAL MORPHOLOGY
Author(s):
Santiago T. Puente, Fernando Torres, Francisco Ortiz and Pablo Gil
Abstract:
A new method for obtaining the optimal path to disassemble an object in a 3-D structure is presented in this paper. To obtain the optimal path, we use an extension of mathematical morphology and the geodesic distance to 3-D sets. The disassembly algorithm is based on the search for a path of minimum cost by using the wave-front of the geodesic distance. Cost is considered to be the number of changes in trajectory required to be able to remove the object. The new method will be applied to disassemble objects in several 3-D environments. The results of the disassembly of an object in a concrete 3-D set will be shown.
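
The wave-front idea can be sketched as a breadth-first propagation through the free voxels of a 3-D occupancy grid. The sketch below is an assumption-laden simplification: it uses 6-connectivity and counts unit steps, whereas the cost described above is the number of trajectory changes. The name `geodesic_distance` and the nested-list grid layout `[z][y][x]` are hypothetical.

```python
from collections import deque


def geodesic_distance(free, start):
    """Breadth-first wave-front over a 3-D grid (illustrative sketch).

    free  -- nested lists of booleans indexed [z][y][x]; True marks free space
    start -- (z, y, x) tuple of a free voxel
    Returns a dict mapping every reachable voxel to its step count from start.
    """
    nz, ny, nx = len(free), len(free[0]), len(free[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    dist = {start: 0}
    queue = deque([start])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in steps:
            p = (z + dz, y + dy, x + dx)
            inside = 0 <= p[0] < nz and 0 <= p[1] < ny and 0 <= p[2] < nx
            # Only propagate into free voxels that the front has not reached yet.
            if inside and free[p[0]][p[1]][p[2]] and p not in dist:
                dist[p] = dist[(z, y, x)] + 1
                queue.append(p)
    return dist
```

Because the front cannot pass through occupied voxels, the resulting distance follows the free space around obstacles rather than the straight line between two points.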

Title:
A SIMPLE THREE–PARAMETER SURFACE FITTING SCHEME FOR IMAGE COMPRESSION
Author(s):
Salah Ameer and Otman Basir
Abstract:
This paper describes a simple scheme to compress images through surface fitting. The scheme can achieve better than 60:1 compression ratio with acceptable image quality degradation. The results are superior to those of JPEG at comparable ratios. Another advantage is that no multiplications or divisions are required, making the implementation suitable for online or progressive compression. Blocking effects were reduced (up to 0.5dB of PSNR improvement) through simple line fitting on block boundaries.

Title:
A NOVEL COPYRIGHT PROTECTION FOR DIGITAL IMAGES USING EXTRA SCRAMBLED INFORMATION
Author(s):
Jin-Wook Shin, Jucheng Yang, Dong-Sun Park and Sook Yoon
Abstract:
Both watermarking and fingerprinting techniques can be used for protecting digital content with different properties. A watermarking system may degrade the fidelity of the digital content by embedding watermark messages, while a fingerprinting system may have high computational complexity to generate unique features for digital content. In this paper, we propose a novel copyright protection technique that combines positive features of both techniques. The proposed technique can distribute digital images without embedding messages related to them, and saves extra scrambled information on simple fingerprints stored in a certified database. Experimental results show that the proposed method outperforms an existing method for various signal processing attacks. The proposed technique is also flexible and fast, so that it can be used for real-time applications.

Title:
CFA DEMOSAICKING BY ADAPTIVE ORDER OF APPROXIMATION
Author(s):
J.S. Jimmy Li and Sharmil Randhawa
Abstract:
Colour filter array (CFA) demosaicking refers to determining the missing colour values at each pixel when a single-sensor digital camera is used for colour image capture. It has recently been shown that missing colour values can be interpolated or extrapolated using Taylor series. The accuracy of approximation depends on the number of high order derivative terms included in the Taylor series. For a smooth region of an image, the higher the order, the higher the accuracy in the approximation of the missing colour values. However, the estimation of high order derivative terms requires pixel values from a wider neighbourhood. When an image contains features closely spaced together, extrapolation using pixels from a smaller neighbourhood is preferred and a low order of approximation should be applied. In order to achieve more accurate results, we propose an algorithm using an adaptive order of approximation depending on the colour smoothness of the image. It has been shown that our algorithm outperforms other techniques for various images, and in particular for images with the above-mentioned characteristics.

Title:
QUANTITATIVE COMPARISON OF TOLERANCE-BASED FEATURE TRANSFORMS
Author(s):
Dennie Reniers and Alexandru Telea
Abstract:
Tolerance-based feature transforms (TFTs) assign to each pixel in an image not only the nearest feature pixels on the boundary (origins), but all origins from the minimum distance up to a user-defined tolerance. In this paper, we compare four simple-to-implement methods for computing TFTs for binary images. Of these, two are novel methods and two extend existing distance transform algorithms. We quantitatively and qualitatively compare all algorithms on speed and accuracy of both distance and origin results. Our analysis is aimed at helping practitioners in the field to choose the right method for given accuracy and performance constraints.
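
To make the definition concrete, here is a brute-force sketch that follows it literally: for each pixel it keeps every feature pixel whose Euclidean distance is within the tolerance of the minimum distance. It is not one of the four methods compared in the paper; the function name `tolerance_feature_transform` and its argument layout are hypothetical, and the feature list is assumed to be non-empty.

```python
import math


def tolerance_feature_transform(features, shape, tol):
    """Brute-force TFT (illustrative sketch, O(pixels * features)).

    features -- non-empty list of (row, col) boundary pixels
    shape    -- (rows, cols) of the image
    tol      -- user-defined tolerance, in pixels
    Returns {pixel: (min_distance, sorted list of origins within min + tol)}.
    """
    rows, cols = shape
    result = {}
    for r in range(rows):
        for c in range(cols):
            dists = [(math.hypot(r - fr, c - fc), (fr, fc)) for fr, fc in features]
            d_min = min(d for d, _ in dists)
            # Keep all origins from the minimum distance up to the tolerance.
            origins = sorted(f for d, f in dists if d <= d_min + tol)
            result[(r, c)] = (d_min, origins)
    return result
```

With `tol = 0` this reduces to an ordinary feature transform that still reports ties, which is a convenient reference when checking the accuracy of faster methods.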

Title:
VISIBILITY BASED DETECTION AND REMOVAL OF SEMI-TRANSPARENT BLOTCHES ON ARCHIVED DOCUMENTS
Author(s):
Vittoria Bruni, Andrew Crawford, Domenico Vitulano and Filippo Stanco
Abstract:
This paper focuses on a novel model for digital suppression of semi-transparent blotches, caused by the contact between water and paper on antique documents. The proposed model is based on laws regulating the human visual system and provides a fast and automatic algorithm for both detection and restoration. Experimental results show the great potential of the proposed model, even in critical situations.

Title:
EXEMPLAR-BASED INPAINTING WITH ROTATION INVARIANT PATCH MATCHING
Author(s):
Jiri Boldys and Bernard Besserer
Abstract:
In this paper, we propose a novel approach to patch matching in exemplar-based inpainting. Our field of concern is movie restoration, particularly scratch concealment. Here we want to focus on single-frame (still image) inpainting. The exemplar-based approach uses patches from the known areas and copies their content to the damaged area. In the case of irregular texture, there might be no patches available that would make the result visually acceptable. One way to increase the number of available patches is to rotate them. In most exemplar-based approaches, a target patch is not complete and a source patch has to be rotated and compared at every single angle. We overcome this inefficiency using a clue image, which comes from previous processing stages. We use moments of patches from this clue image, normalized to rotation, to reject apparently dissimilar patches, and to calculate the approximate angle of rotation, which has to be performed only once. In this paper, we provide justification for this simplification. We have no ambitions to provide a complete inpainting algorithm here.

Title:
ROBUST CALIBRATION OF A RECONFIGURABLE CAMERA ARRAY FOR MACHINE VISION INSPECTION (RAMVI): Using Rule-Based Colour Recognition
Author(s):
Patrick Spicer, Kristin Bohl, Gil Abramovich and Jacob Barhak
Abstract:
This paper describes a Reconfigurable Array for Machine Vision Inspection (RAMVI) that is able to produce spatially-accurate images combining information obtained from several cameras. Automatic camera calibration is essential for minimizing the changeover time required to reconfigure the array. This paper describes an automatic calibration method that uses a colour coded calibration grid (CCG) to determine the field of view of each camera relative to the other cameras. Since colour is integral to the calibration process, robust colour recognition is essential, particularly since several cameras are involved. Hence, a rule-based colour recognition methodology is described. Results are presented demonstrating the effectiveness of this approach under varying lighting conditions.

Title:
REAL-TIME FPGA-BASED IMAGE RECTIFICATION SYSTEM
Author(s):
Cristian Vancea, Sergiu Nedevschi, Mihai Negru and Stefan Mathe
Abstract:
Image rectification is the process of transforming stereo-images as if they were captured using a canonical stereo-system. Computationally intensive tasks, like dense stereo matching, are greatly simplified if performed on rectified images. We developed an efficient pipeline hardware machine which performs real-time image rectification. The design was implemented using VHDL, thus allowing portability on many hardware platforms. The architecture was highly optimized, both in terms of time and resources needed. To increase its flexibility, the design was described based on generics, which allow reconfiguring different characteristics and behaviour, such as: image size, number of precision bits, memory cache complexity. We also analyze the performance of the implemented solution on a VirtexE600 FPGA device.

Title:
AN UNIFIED THEORY FOR STEERABLE AND QUADRATURE FILTERS
Author(s):
Kai Krajsek and Rudolf Mester
Abstract:
In this paper, a complete theory of steerable filters is presented which shows that quadrature filters are only a special case of steerable filters. Although there has been a large number of approaches dealing with the theory of steerable filters, none of these gives a complete theory with respect to the transformation groups which deform the filter kernel. Michaelis and Sommer and Hel-Or and Teo were the first to give a theoretical justification for steerability based on Lie group theory. But the approach of Michaelis and Sommer considers only Abelian Lie groups. Although the approach of Hel-Or and Teo considers all Lie groups, their method for generating the basis functions may fail, as shown in this paper. We extend these steerable approaches to arbitrary Lie groups, like the important case of the rotation group $SO(3)$ in three dimensions. Quadrature filters serve for computing the local energy and local phase of a signal. Whereas for the one dimensional case quadrature filters are theoretically well founded, this is not the case for higher dimensional signal spaces. The monogenic signal based on the Riesz transformation has been shown to be a rotationally invariant generalization of the analytic signal. A further generalization of the monogenic signal, the 2D rotationally invariant quadrature filter, has been shown to capture richer structures in images than the monogenic signal. We present a generalization of the rotationally invariant quadrature filter based on our steerable theory. Our approach includes the important case of 3D rotationally invariant quadrature filters, but it is not limited to any signal dimension and includes all transformation groups that possess a unitary group representation.

Title:
RESTORATION OF DEGRADED MOVING IMAGE FOR PREDICTING A MOVING OBJECT
Author(s):
Kei Akiyama, Zhi-wei Luo, Masaki Onishi and Shigeyuki Hosoe
Abstract:
Iterative optimal calculation methods have been proposed for the restoration of degraded static images based on wavelet multiresolution decomposition. However, it is quite difficult to apply these methods to moving images due to the high computation cost. In this paper, we propose an effective restoration method for degraded moving images by modeling the motion of a moving object and predicting the future object position. We verified our method by computer simulations and experiments, which show that it can reduce the computation time.

Title:
SYNTHESIZING FACE IMAGES BY IRIS REPLACEMENT - Strabismus Simulation
Author(s):
Xiaoyi Jiang, Swenja Rothaus, Kai Rothaus and Daniel Mojon
Abstract:
In this paper we consider a class of face image processing operations in which we change the position of the iris. In particular, we present a novel technique for synthesizing strabismic face images from a normal frontal face image. This image synthesis is needed for conducting studies on the psychosocial and vocational implications of strabismus and strabismus surgery, and we are not aware of any previous work for this purpose. The experimental results demonstrate the potential of our approach. The algorithm presented in this paper provides the basis for two related tasks: correction of strabismic face images and of gaze direction.

Title:
ACHIEVING HIGH-RESOLUTION VIDEO USING SCALABLE CAPTURE, PROCESSING, AND DISPLAY
Author(s):
Donald Tanguay, H. Harlyn Baker and Dan Gelb
Abstract:
New video applications are becoming possible with the advent of several enabling technologies: multicamera capture, increased PC bus bandwidth, multicore processors, and advanced graphics cards. We present a commercially-available multicamera system and a software architecture that, coupled with industry trends, create a situation in which video capture, processing, and display are all increasingly scalable in the number of video streams. Leveraging this end-to-end scalability, we introduce a novel method of generating high-resolution, panoramic video. While traditional point-based mosaicking requires significant image overlap, we gain significant advantage by calibrating using shared observations of lines to constrain the placement of images. Two non-overlapping cameras do not share any scene points; however, seeing different parts of the same line does constrain their spatial alignment. Using lines allows us to reduce overlap in the source images, thereby maximizing final mosaic resolution. We show results of synthesizing a 6 megapixel video camera from 18 smaller cameras, all on a single PC and at 30 Hz.

Title:
A NOVEL APPROACH TO PLANAR CAMERA CALIBRATION
Author(s):
Ashutosh Morde, Mourad Bouzit and Lawrence Rabiner
Abstract:
Camera calibration is an important step in 3D reconstruction of scenes. Many natural and man-made objects are circular and form good candidates as calibration objects. We present a linear calibration algorithm to estimate the intrinsic camera parameters using at least three images of concentric circles of unknown radii. Novel methods are proposed to determine the projected center of concentric circles of unknown radii using the projective invariant, cross ratio, and to calculate the vanishing line of the circle. The circular calibration pattern can be easily and accurately created. The calibration algorithm does not require any measurements of the scene or the homography between the images. Once the camera is fully calibrated, the focal length of zooming cameras can be estimated from a single image. The algorithm was tested with real and synthetic images with different noise levels.

Title:
A STATISTICAL BASED APPROACH FOR REMOVING HEAVY TAIL NOISE FROM IMAGES
Author(s):
Mohammed El Hassouni and Hocine Cherifi
Abstract:
In this paper, we propose to use a class of filters based on fractional lower order statistics (FLOS) for still image restoration in the presence of $\alpha$-stable noise. For this purpose, we present a family of 2-D finite-impulse response (FIR) adaptive filters optimized by the least mean $l_p$-norm (LMP) algorithm. Experiments performed on natural images prove that the proposed algorithms provide superior performance in impulsive noise environments compared to LMS and Weighted Myriad filters.

Title:
SCAN-LINE QUALITY INSPECTION OF STRIP MATERIALS USING 1-D RADIAL BASIS FUNCTION NETWORK
Author(s):
Afşar Saranlı
Abstract:
There exists a variety of manufacturing quality inspection tasks that involve the inspection of a continuous strip of material using a scan-line camera. Here the image is very short in one dimension but unlimited in the other dimension. In this study, a method of image event detection for this class of applications based on adaptive radial-basis function networks is presented. The architecture of the system and the adaptation methodology are presented in detail together with a detailed discussion on parameter selection. Promising detection results are illustrated for an application to a ground glass edge inspection problem.

Title:
ADAPTIVE STACK FILTERS IN SPECKLED IMAGERY
Author(s):
María E. Buemi, Marta E. Mejail, Julio C. Jacobo and María J. Gambini
Abstract:
Stack filters are a special case of non-linear filters. They perform well when filtering images with different types of noise while preserving edges and details. A stack filter decomposes an input image into several binary images according to a set of thresholds. Each binary image is filtered by using a boolean function. Adaptive stack filters are optimized filters that compute a boolean function by using a corrupted image and an ideal image without noise. In this work the behaviour of an adaptive stack filter is evaluated for the classification of synthetic aperture radar (SAR) images, which are affected by speckle noise. To this end, a Monte Carlo experiment is carried out in which simulated images are generated and then filtered with a stack filter trained with one of them. The results of their maximum likelihood classification are evaluated and then compared with the results of classifying the images without previous filtering.
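
The threshold decomposition described above can be sketched in one dimension as follows. This is a fixed (non-adaptive) stack filter under stated assumptions: the signal is a non-empty list of non-negative integers, borders are replicated, and the boolean function is supplied by the caller; the training step that learns the boolean function from a corrupted/ideal pair is not shown. The names `stack_filter` and `majority` are hypothetical.

```python
def stack_filter(signal, boolean_fn, window=3):
    """1-D stack filter via threshold decomposition (illustrative sketch).

    signal     -- non-empty list of non-negative integers
    boolean_fn -- maps a list of `window` bits to 0 or 1 (should be a
                  positive boolean function for the stacking property to hold)
    window     -- odd window length
    """
    half = window // 2
    # Replicate the border samples so every output has a full window.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    out = [0] * len(signal)
    for t in range(1, max(signal) + 1):
        # Binary slice of the signal at threshold t.
        binary = [1 if v >= t else 0 for v in padded]
        for i in range(len(signal)):
            # Filter the slice and stack (sum) the binary results.
            out[i] += boolean_fn(binary[i:i + window])
    return out


def majority(bits):
    """Majority vote; with this function the stack filter is a median filter."""
    return 1 if 2 * sum(bits) > len(bits) else 0
```

Using `majority` as the boolean function reproduces the running median, which is a handy sanity check before substituting a learned function.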

Title:
A NEW TECHNIQUE FOR COLOR IMAGE QUANTIZATION
Author(s):
Wafae Sabbar and Abdelkrim Bekkhoucha
Abstract:
In this paper, we introduce a new technique for color image quantization. It is carried out in two processing steps. In the first, we decrease the number of colors using a multi-thresholding, by intervals, of the three marginal histograms of the image. In the second step, the colors determined in the first step are reduced by color fusion based on mean square error minimization. The algorithm is simple to implement and produces a high quality result.

Title:
THE USE OF DYNAMICS IN GRAYLEVEL QUANTIZATION BY MORPHOLOGICAL HISTOGRAM PROCESSING
Author(s):
Franklin Flores, Leonardo Facci and Roberto Lotufo
Abstract:
In a previous paper, a method was proposed for image simplification in terms of graylevel and flat zone reduction, by histogram classification via morphological processing. In this method, it is possible to reduce the number of graylevels of an image to n graylevels by selecting n regional maxima in the processed histogram and discarding the remaining ones, in order to classify the histogram via application of the watershed operator. In the previous paper, the choice of the n highest regional maxima was proposed. This is far from the best criterion for choosing the regional maxima, and other criteria have been tested in order to obtain a better histogram classification. In this paper we propose the selection of the regional maxima via application of dynamics, a measurement of contrast usually applied to find markers for morphological segmentation.

Area 2 - Image Analysis

Title:
A NEUROBIOLOGICALLY INSPIRED VOWEL RECOGNIZER USING HOUGH-TRANSFORM - A novel approach to auditory image processing
Author(s):
Tamás Harczos, Frank Klefenz and András Kátai
Abstract:
Many pattern recognition problems can be solved by mapping the input data into an n-dimensional feature space in which a vector indicates a set of attributes. One powerful pattern recognition method is the Hough-transform, which is usually applied to detect specific curves or shapes in digital pictures. In this paper the Hough-transform is applied to the time series data of neurotransmitter vesicle releases of an auditory model. Practical vowel recognition of different speakers with the help of this transform is investigated and the findings are discussed.
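
For readers unfamiliar with the transform itself, the following is a minimal sketch of the classical Hough voting scheme for straight lines in the normal form rho = x cos(theta) + y sin(theta). It illustrates only the generic accumulator idea on 2-D points; it is not the authors' auditory pipeline, and the function name `hough_lines` and its parameters are hypothetical.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Classical line Hough transform (illustrative sketch).

    points   -- iterable of (x, y) coordinates
    n_theta  -- number of angle bins covering [0, pi)
    rho_step -- width of a rho bin
    Returns a Counter mapping (rho_bin, theta_bin) to its vote count.
    """
    acc = Counter()
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Each point votes for every line that could pass through it.
            acc[(round(rho / rho_step), k)] += 1
    return acc
```

Bins that collect many votes correspond to lines supported by many points; neighbouring bins can tie with the true peak, so practical detectors usually add some form of non-maximum suppression.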

Title:
ELLIPSE DETECTION IN DIGITAL IMAGE DATA USING GEOMETRIC FEATURES
Author(s):
Lars Libuda, Ingo Grothues and Karl-Friedrich Kraiss
Abstract:
Ellipse detection is an important task in vision based systems because many real world objects can be described by this primitive. This paper presents a fast data driven four stage filtering process which uses geometric features in each stage to synthesize ellipses from binary image data with the help of lines, arcs, and extended arcs. It can cope with partially occluded and overlapping ellipses, works quickly and accurately, and keeps memory consumption to a minimum.

Title:
NEW WAVELETS BASED FEATURES FOR NATURAL SURFACE INDEXING
Author(s):
Hugo Alexandre and João Caldas Pinto
Abstract:
Indexing natural surfaces based on their visual appearance is an important industrial issue, for example in inspection and automatic goods retrieval problems. However, due to the presence of a high number of randomly distributed colors and their subjective evaluation by human experts, the problem remains practically unsolved. In this paper, some new features are introduced, derived from a wavelet decomposition of the original image represented in different color spaces. Different wavelet families and resolution levels were used. It will be shown that promising results on marble surface indexing can be obtained with a suitable combination of those parameters and using our proposed new features for indexing with very simple Euclidean distances.

Title:
EFLAM: A MODEL TO LEVEL-LINE JUNCTION EXTRACTION
Author(s):
Nikom Suvonvorn and Bertrand Zavidovique
Abstract:
This paper describes an efficient approach for the detection of level-line junctions in images. Potential junctions are exhibited independently of noise by their consistent local level-variation. Then, level-lines are tracked through junctions by descending the level-line flow. Flow junctions are extracted as image primitives to support matching in many applications. The primitive is robust against contrast changes and noise. It is easily made rotation invariant. As far as the image content allows, the spread of junctions can be controlled for even spatial distribution. We show some results and compare with the Harris detector.

Title:
NONPARAMETRIC STATISTICAL LEVEL SET SNAKE BASED ON THE MINIMIZATION OF THE STOCHASTIC COMPLEXITY
Author(s):
Pascal Martin, Philippe Réfrégier, Frederic Galland and Frédéric Guérault
Abstract:
In this paper, we focus on the segmentation of objects that are not necessarily simply connected using level set snakes, and we present a nonparametric statistical approach based on the minimization of the stochastic complexity (Minimum Description Length principle). This approach yields a criterion to optimize with no free parameter to be tuned by the user. We thus propose to estimate the probability law of the gray levels of the object and the background of the image with a step function whose order is automatically determined. We show that coupling the probability law estimation and the segmentation steps leads to good results on various types of images. We illustrate the robustness of the proposed nonparametric statistical snake on different examples, and we show on synthetic images that the segmentation results are equivalent to those obtained with a parametric statistical technique, although the technique is nonparametric and without any ad hoc parameter in the optimized criterion.

Title:
IMPROVED SEGMENTATION OF MR BRAIN IMAGES INCLUDING BIAS FIELD CORRECTION BASED ON 3D-CSC
Author(s):
Haojun Wang, Patrick Sturm, Frank Schmitt and Lutz Priese
Abstract:
The 3D Cell Structure Code (3D-CSC) is a fast region growing technique. However, directly adapted for segmentation of magnetic resonance (MR) brain images, it has some limitations due to the variability of brain anatomical structure and the degradation of MR images by intensity inhomogeneities and noise. In this paper an improved approach is proposed. It starts with a preprocessing step which contains a 3D Kuwahara filter to reduce noise and a bias correction method to compensate for intensity inhomogeneities. Next the 3D-CSC is applied, where a required similarity threshold is chosen automatically. In order to recognize gray and white matter, a histogram-based classification is applied. Morphological operations are used to break small bridges connecting non-brain tissues of similar gray value with the gray matter. Eight real and ten simulated T1-weighted MR images were evaluated to validate the performance of our method.

Title:
IMPROVED RECONSTRUCTION OF IMAGES DISTORTED BY WATER WAVES
Author(s):
Arturo Donate and Eraldo Ribeiro
Abstract:
This paper describes a new method for removing geometric distortion in images of submerged objects observed from outside shallow water. We focus on the problem of analyzing video sequences when the water surface is disturbed by waves. The water waves will affect the appearance of the individual video frames such that no single frame is completely free of geometric distortion. This suggests that, in principle, it is possible to select a set of low distortion sub-regions from each video frame and combine them to form a single undistorted image of the observed object. The novel contribution in this paper is to use a multistage clustering algorithm combined with frequency domain measurements that allow us to select the best set of undistorted sub-regions of each frame in the video sequence. We evaluate the new algorithm on video sequences created both in our laboratory and in natural environments. Results show that our algorithm is effective in removing distortion caused by water motion.

Title:
ANALYSIS OF AN EXTENDED PMART FOR CT IMAGE RECONSTRUCTION AS A NONLINEAR DYNAMICAL SYSTEM
Author(s):
Tetsuya Yoshinaga
Abstract:
Among iterative image reconstruction algorithms for computed tomography (CT), it is known that the power multiplicative algebraic reconstruction technique (PMART) has good properties with respect to convergence speed and maximization of entropy. In this paper, we investigate an extended PMART, which is a dynamical class for accelerating the convergence. The convergence process of the state in the neighborhood of the true reconstructed image can be reduced to the property of a fixed point observed in the dynamical system. For investigating convergence speed, we present a computational method of obtaining parameter sets in which a given real or absolute value of the characteristic multiplier is equal. The advantage of the extended PMART is verified by comparing it with the standard multiplicative algebraic reconstruction technique (MART) using numerical experiments.

Title:
A SPACE- AND TIME-EFFICIENT MOSAIC-BASED ICONIC MEMORY FOR INTERACTIVE SYSTEMS
Author(s):
Birgit Möller and Stefan Posch
Abstract:
One basic capability of interactive and mobile systems to cope with unknown situations and environments is active, sequence-based visual scene analysis. Image sequences provide static as well as dynamic and also 2D as well as 3D information about a certain scene. However, at the same time they require efficient mechanisms to handle their large data volumes. In this paper we introduce a new concept of a visual scene memory for interactive mobile systems that supports these systems with a space- and time-efficient data structure for representing iconic information. The memory is based on mosaic images and allows efficient storage and processing of sequences from stationary rotating and zooming cameras. Its key features are polytopial reference coordinate frames and an online data processing strategy. The polytopes provide Euclidean coordinates and thus allow the application of standard image analysis algorithms directly to the data, yielding easy access and analysis, while online data processing preserves system interactivity. Additionally, mechanisms are included to properly handle multi-resolution data and to deal with dynamic scenes. The concept has been implemented in terms of an integrated system that can easily be included as an additional module in the architecture of interactive and mobile systems. As one prototypical example of possible fields of application, the integration of the memory into the architecture of an interactive multi-modal robot is discussed, emphasizing the practical relevance of the new concept.

Title:
A NOVEL ASYMMETRIC VARIANCE-BASED HYPOTHESIS TEST FOR A DIFFICULT SURVEILLANCE PROBLEM
Author(s):
Dalton Rosario
Abstract:
Local anomaly detectors have become quite popular for applications requiring hyperspectral (HS) target detection in natural clutter background assisted by an image analyst. Their popularity may be attributed to the simplicity of the algorithms designed to function as such. A disadvantage of using such detectors, however, is that they often produce an intolerably high number of detections per scene, which, according to image analysts, becomes a nuisance rather than an aiding tool. We present an effective local anomaly detector for HS data. The new detector exploits a notion of indirect comparison between two sets of samples and is free from distribution assumptions. The notion led us to derive a compact solution for a variance test, in which, under the null hypothesis, the detector’s performance converges to a known distribution. Let X and Y denote two random samples, and let Z = X U Y, where U denotes the union. X can be indirectly compared to Y by comparing, instead, Z to Y. Implementation of this simple idea has shown the desirable outcome of preserving what is often characterized by image analysts as meaningful detections, and significantly reducing the number of meaningless detections. Experimental results using both simulated multivariate data and real HS data are presented to illustrate the effectiveness of this detector over five known alternative techniques.
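
The indirect comparison described above can be illustrated with a minimal sketch. It is not the author's test statistic, which the abstract does not give; the function name `indirect_variance_ratio` and the use of a plain sample-variance ratio are assumptions made only to show how Z = X U Y is compared to Y instead of comparing X to Y directly.

```python
from statistics import variance


def indirect_variance_ratio(x, y):
    """Hypothetical illustration: compare X to Y indirectly via Z = X U Y.

    Returns var(Z) / var(Y). If X resembles Y, pooling X into Z barely
    changes the variance and the ratio stays near 1; if X differs from Y,
    the pooled variance grows and the ratio becomes large.
    """
    z = list(x) + list(y)  # the union of the two samples, taken as a pooled sample
    return variance(z) / variance(y)


background = [1, 2, 3, 4, 5]
similar = [1, 2, 3, 4, 5]
anomalous = [11, 12, 13, 14, 15]

ratio_similar = indirect_variance_ratio(similar, background)
ratio_anomalous = indirect_variance_ratio(anomalous, background)
```

A threshold on such a ratio would play the role of the decision rule; the abstract's detector instead relies on a statistic with a known null distribution.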

Title:
AUTOMATIC BRAIN MR IMAGE SEGMENTATION BY RELATIVE THRESHOLDING AND MORPHOLOGICAL IMAGE ANALYSIS
Author(s):
Kai Li, Allen D. Malony and Don M. Tucker
Abstract:
We present an automatic method for segmentation of white matter, gray matter and cerebrospinal fluid in T1-weighted brain MR images. Instead of modeling images with a form of statistical distribution on the image intensities, whose solutions are often trapped in local optima, we model images in terms of spatial relationships between voxels, considering structural, geometrical and radiological prior knowledge expressed in first-order logic. Brain tissue segmentation is first performed with relative thresholding, a new segmentation mechanism which compares two voxel intensities against a relative threshold. Relative thresholding makes intensity inhomogeneity transparent, avoids using any form of regularization, and enables global searching for optimal solutions as usually performed in traditional thresholding. Results from relative thresholding are improved by a series of morphological operations. The most important of these is what we call skeleton-based opening, designed to robustly remove unwanted structures from binary objects.

Title:
A DETECTION METHOD OF INTERSECTIONS FOR DETERMINING OVERLAPPING USING ACTIVE VISION
Author(s):
Pablo Gil, Fernando Torres and Oscar Reinoso
Abstract:
Sometimes, the presence of objects hinders the observation of other neighboring objects. This is because part of the surface of one object partially occludes the surface of another, increasing the complexity of the recognition process. Therefore, the information acquired from the scene to describe the objects is often incomplete and depends a great deal on the viewpoint of the observation. Thus, when any real scene is observed, the regions and boundaries which delimit objects and separate them from one another are not easily perceived. In this paper, a method is presented to discern objects from one another by delimiting where the surface of each object begins and ends. Specifically, we seek to detect the overlapping and occlusion zones of two or more objects which interact with each other in the same scene. This is very useful, on the one hand, to distinguish objects from one another when features such as texture, colour and geometric form are not sufficient to separate them with a segmentation process. On the other hand, it is also important to identify occluded zones without prior knowledge of the type of objects to be recognized. The proposed approach is based on the detection of occluded zones by means of structured light patterns projected on the object surfaces in a scene. These light patterns exhibit discontinuities in the beam projections when they hit the surfaces and become deformed. Such discontinuities are taken as boundary zones of candidate occlusion regions.

Title:
DISTANCE HISTOGRAM TO CENTROID AS A UNIQUE FEATURE TO RECOGNIZE OBJECTS
Author(s):
Pilar Arques, Rafael Molina, Mar Pujol and Ramon Rizo
Abstract:
The shape of objects plays an essential role among the different aspects of visual information. A 2D silhouette often conveys enough information to allow the correct recognition of the original 3D object. Distance Histogram to Centroid will be used as the unique feature to totally describe an object and to distinguish it from all the other objects in the scene. The proposed system has proved robust in discriminating between classes in a given set of objects. The main advantages are the elimination of the feature selection process and the avoidance of the dimensionality problem.
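
As a rough illustration of a distance-histogram-to-centroid feature, the sketch below computes a normalised histogram of the distances from silhouette points to their centroid. The function name, the number of bins, and the normalisation by the maximum distance (for scale invariance) are assumptions; the abstract does not specify these details.

```python
import math


def distance_histogram_to_centroid(points, bins=8):
    """Hypothetical sketch: histogram of point-to-centroid distances.

    `points` is a list of (x, y) silhouette or contour coordinates.
    Distances are divided by the largest distance (an assumption) so the
    feature does not change when the shape is uniformly scaled.
    """
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    dmax = max(dists)
    hist = [0] * bins
    for d in dists:
        # Map the normalised distance in [0, 1] to a bin index.
        idx = min(int(d / dmax * bins), bins - 1) if dmax > 0 else 0
        hist[idx] += 1
    return [h / n for h in hist]  # relative frequencies sum to 1


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
big_square = [(0, 0), (20, 0), (20, 20), (0, 20)]
h_small = distance_histogram_to_centroid(square)
h_big = distance_histogram_to_centroid(big_square)
```

Two such histograms can then be compared with any histogram distance to decide whether two silhouettes belong to the same class.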

Title:
STATIC FOREGROUND ANALYSIS TO DETECT ABANDONED OR REMOVED OBJECTS
Author(s):
Andrea Caroppo, Tommaso Martiriggiano, Marco Leo, Paolo Spagnolo and Tiziana D'Orazio
Abstract:
In this paper, a new method is presented to robustly and efficiently analyse video sequences, both to extract foreground objects and to classify the static foreground regions as abandoned or removed objects (ghosts). As a first step, the moving regions in the scene are detected by subtracting a continuously adapted reference model from the current frame. Then, a shadow removal algorithm is used to find the real shape of the detected objects, and a homographic transformation is used to localize them in the scene while avoiding perspective distortions. Finally, moving objects are classified as abandoned or removed by analysing the boundaries of static foreground regions. The method was successfully tested on real image sequences, and it runs at about 7 fps at a size of 480x640 on a 2,33 GB Pentium IV machine.
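
The first step, subtracting a continuously adapted reference model from the current frame, can be sketched as follows. The abstract does not state the adaptation rule, so the exponential running average, the function names, and the parameters `alpha` and `threshold` are assumptions chosen for illustration only.

```python
def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose absolute difference from the model exceeds a threshold."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]


def update_background(background, frame, alpha=0.05):
    """Assumed adaptation rule: exponential running average of past frames."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(background, frame)]


# Frames are flattened lists of grey levels for brevity.
model = [100.0, 100.0, 100.0]
frame = [100.0, 200.0, 105.0]

mask = foreground_mask(model, frame)
model_next = update_background(model, frame, alpha=0.5)
```

In practice the update would typically be applied only to pixels classified as background, so that static foreground objects are not absorbed into the model too quickly.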

Title:
EXCLUDING THE REMAINING RIDGES OF FINGERPRINT IMAGE
Author(s):
En Zhu, Jianping Yin, Chunfeng Hu, Guomin Zhang and Jianming Zhang
Abstract:
Fingerprint segmentation usually aims to identify non-ridge regions and unrecoverable low quality ridge regions and exclude them as background, so as to reduce the time of image processing and avoid detecting false features. In ridge regions, of both high and low quality, there are often some remaining ridges which are the afterimage of the previously scanned finger and are expected to be excluded from the foreground. However, existing segmentation methods do not take this case into consideration, and the remaining ridge regions are often falsely taken as foreground. This paper proposes two steps for fingerprint segmentation, aiming to exclude the remaining ridge region from the foreground. The non-ridge regions and unrecoverable low quality ridge regions are removed as background in the first step, and then the foreground produced by the first step is further analyzed so as to remove the remaining ridge region. The proposed method proves effective in avoiding the detection of false ridges and in improving minutiae detection.

Title:
STATISTICAL TECHNIQUES FOR EDGE DETECTION IN HISTOLOGICAL IMAGES
Author(s):
David Svoboda, Ian Williams, Nicholas Bowring and Elizabeth Guest
Abstract:
A review of the statistical techniques available for performing edge detection on histological images is presented. The tests under review include the Student T Test, the Fisher test, the Chi Square test, the Kolmogorov Smirnov test, and the Mann Whitney U test. All utilize a novel two sample edge detector to compare the statistical properties of two image regions surrounding a central pixel. The performance of the statistical tests is compared using histological biomedical images on which traditional gradient based techniques fail, therefore giving an overall review of the methods and results. Comparisons are also made to the more traditional Canny and Sobel edge detection filters. The results show that, in the presence of noise and clutter in histological images, both parametric and non-parametric statistical tests compare well, robustly extracting edge information on a series of images.
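
A two-sample test applied to the regions on either side of a pixel can be sketched with the pooled-variance Student t statistic. The region layout, the function name, and the handling of zero variance are assumptions; the abstract does not describe the detector's mask geometry.

```python
import math
from statistics import mean, variance


def student_t(region_a, region_b):
    """Pooled-variance two-sample t statistic for two lists of pixel values."""
    na, nb = len(region_a), len(region_b)
    pooled = ((na - 1) * variance(region_a) + (nb - 1) * variance(region_b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    diff = mean(region_a) - mean(region_b)
    if se == 0:
        # Both regions are constant: no edge if equal, a perfect step otherwise.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


left = [10, 12, 11, 13]          # pixels on one side of the central pixel
right_edge = [50, 52, 51, 53]    # a clearly different region
right_flat = [11, 13, 10, 12]    # same statistics as `left`

t_edge = student_t(left, right_edge)
t_flat = student_t(left, right_flat)
```

A large absolute value of the statistic suggests that the central pixel lies on an edge between the two regions.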

Title:
NONLINEAR PRIMARY CORTICAL IMAGE REPRESENTATION FOR JPEG 2000 - Applying natural image statistics and visual perception to image compression
Author(s):
Roberto Valerio and Rafael Navarro
Abstract:
In this paper, we present a nonlinear image representation scheme based on a statistically-derived divisive normalization model of the information processing in the visual cortex. The input image is first decomposed into a set of subbands at multiple scales and orientations using the Daubechies (9, 7) floating point filter bank. This is followed by a nonlinear “divisive normalization” stage, in which each linear coefficient is squared and then divided by a value computed from a small set of neighboring coefficients in space, orientation and scale. This neighborhood is chosen to allow this nonlinear operation to be efficiently inverted. The parameters of the normalization operation are optimized in order to maximize the statistical independence of the normalized responses for natural images. Divisive normalization not only can be used to describe the nonlinear response properties of neurons in visual cortex, but also yields image descriptors more independent and relevant from a perceptual point of view. The resulting multiscale nonlinear image representation permits an efficient coding of natural images and can be easily implemented in a lossy JPEG 2000 codec. In fact, the nonlinear image representation implements in an automatic way a more general version of the point-wise extended masking approach proposed as an extension for visual optimisation in JPEG 2000 Part 2. Compression results show that the nonlinear image representation yields a better rate-distortion performance than the wavelet transform alone.
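
The divisive normalization stage can be illustrated in one dimension: each coefficient is squared and divided by a constant plus a weighted sum of squared neighbours. The symmetric spatial neighbourhood, the uniform `weight`, and the constant `sigma2` below are assumptions; the scheme in the abstract uses optimized parameters and a neighbourhood in space, orientation and scale chosen so that the operation can be inverted efficiently.

```python
def divisive_normalization(coeffs, sigma2=1.0, weight=0.5, radius=1):
    """Hypothetical 1-D sketch: r_i = c_i^2 / (sigma2 + sum_j weight * c_j^2).

    The sum runs over neighbours j within `radius` of i, excluding i itself.
    """
    n = len(coeffs)
    out = []
    for i, c in enumerate(coeffs):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        pool = sum(weight * coeffs[j] ** 2 for j in range(lo, hi) if j != i)
        out.append(c * c / (sigma2 + pool))
    return out


isolated = divisive_normalization([0.0, 2.0, 0.0])
crowded = divisive_normalization([2.0, 2.0, 2.0])
```

The same coefficient value yields a smaller response when its neighbours are also large, which is the masking-like behaviour the representation exploits.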

Title:
CONSTRAINED GENERALISED PRINCIPAL COMPONENT ANALYSIS
Author(s):
Wojciech Chojnacki, Anton van den Hengel and Michael J. Brooks
Abstract:
Generalised Principal Component Analysis (GPCA) is a recently devised technique for fitting a multi-component, piecewise-linear structure to data, which has found strong utility in computer vision. Unlike other methods which intertwine the processes of estimating structure components and segmenting data points into clusters associated with putative components, GPCA estimates a multi-component structure with no recourse to data clustering. The standard GPCA algorithm searches for an estimate by minimising an appropriate misfit function. The underlying constraints on the model parameters are ignored. Here we promote a variant of GPCA that incorporates the parameter constraints and exploits constrained rather than unconstrained minimisation of the error function. The output of any GPCA algorithm hardly ever perfectly satisfies the parameter constraints. The new version of GPCA greatly facilitates the final correction of the algorithm output to satisfy perfectly the constraints, making this step less prone to error in the presence of noise. The method is applied to the example problem of fitting a pair of lines to noisy image points, but has potential for use in more general multi-component structure fitting in computer vision.

Title:
MODEL-BASED CAVITY SHAPE ESTIMATION IN A GAS-LIQUID SYSTEM WITH NONUNIFORM IMAGE SAMPLING
Author(s):
Magnus Evestedt and Alexander Medvedev
Abstract:
A water model is studied to simulate physical phenomena in the Linz-Donawitz steel converter. The depression in the liquid, due to the impinging gas jet, is measured by means of a video camera. Image processing tools are used to extract the edge of the surface indentation. The measured edge, sampled in a special way, is used together with a nonlinear mathematical model to obtain a description of the cavity profile. The parameters of the mathematical model are optimized to match the registered cavity edge in the image at a set of sampled points. Three ways of choosing sampling points for the optimization are proposed and compared on simulated as well as experimental data. An approach involving an observer decreases the computation time with an acceptable loss of accuracy of the estimates.

Title:
PERCEPTUAL ORGANIZATION OF DIRECTIONAL PRIMITIVES USING A PSEUDOCOLOR FUZZY HOUGH TRANSFORM FOR ARC DETECTION
Author(s):
Marta Penas, Manuel G. Penedo, Noelia Barreira and María José Carreira
Abstract:
This paper describes a computational framework for extracting the low-level directional primitives present in an image and organizing them into circular arcs. The system is divided into three stages: extraction of the directional features through an efficient implementation of the Gabor wavelet decomposition, reduction of the high dimensional Gabor results by means of growing cell structures and detection of the circular arcs by means of a pseudo-color Fuzzy Hough Transform.

Title:
INTERPOLATION SNAKES FOR BORDER DETECTION IN ULTRASOUND IMAGES
Author(s):
Silviu Minut and George Stockman
Abstract:
Ultrasound images present major challenges to just about any segmentation algorithm, including active contour techniques, due to increased specularity, non-uniform edges along the boundaries of interest, and incomplete and misleading visual support. Active contours that depend on a vector of parameters (e.g. B-splines) have been proposed in the literature, and have the advantage over traditional snakes and level-set snakes that smoothness is built-in, which is a sine qua non requirement in border detection in medical images. We propose in this paper the use of interpolation splines as active contours for border detection in ultrasound images, which we term interpolation snakes. We argue that interpolation snakes are better suited for ultrasound than other snakes, because the control points (parameters which control the shape of the snake) are on the curve. This allows for an initial arclength parameterization of the snake. In conjunction with interpolation snakes we define a new energy (measure of fit) which incorporates a term intended to maintain arclength parameterization of the snake throughout the minimization process. A shape prior can also be introduced naturally, as a distribution on the control points.

Title:
LOCAL ENERGY MINIMISATIONS - An Optimisation for the Topological Active Volumes Model
Author(s):
N. Barreira, M. G. Penedo and M. Penas
Abstract:
The Topological Active Volumes (TAV) model \cite{barreira05} is a general active model focused on 3D segmentation tasks. It can also be used for the surface reconstruction and the topological analysis of the inner side of the detected objects. Like any other deformable model, it defines a mesh and several energy functions. The minimisation of the energy functions moves the mesh towards the objects in the scene. The breaking of connections causes topological changes directed to the achievement of specific adjustments. This way, as well as improving the adjustment, the model is able to find several objects in the image and delimit holes in the structures detected. The TAV model achieves accurate results but the computational cost of the segmentation procedure is high. To reduce it, this paper proposes an optimisation of the model. It consists of performing local energy minimisations after the connection breaking process. This way, the execution times are reduced and the accuracy of the results is increased.

Title:
A NEW MULTISCALE, CURVATURE-BASED SHAPE REPRESENTATION TECHNIQUE FOR CONTENT-BASED IMAGE RETRIEVAL
Author(s):
JanKees van der Poel, Leonardo Batista and Carlos Almeida
Abstract:
This work presents a new multiscale, curvature-based shape representation technique for planar curves. One limitation of the well-known Curvature Scale Space (CSS) method is that it uses only curvature zero-crossings to characterize shapes and thus there is no CSS descriptor for convex shapes. The proposed method, on the other hand, uses bidimensional -> unidimensional -> bidimensional transformations together with resampling techniques to retain the full curvature information for shape characterization. It also employs the correlation coefficient as a measure of similarity. In the evaluation tests, the proposed method achieved a high correct classification rate (CCR), even when the shapes were severely corrupted by noise. Results clearly showed that the proposed method is more robust to noise than CSS.
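
The use of the correlation coefficient as a similarity measure can be sketched for two equal-length descriptor vectors. The function name and the toy signatures are hypothetical; the abstract does not describe how the curvature descriptors are laid out before they are compared.

```python
import math


def correlation_coefficient(a, b):
    """Pearson correlation between two equal-length descriptor vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


signature = [1.0, 2.0, 3.0, 4.0]
scaled = [2.0, 4.0, 6.0, 8.0]      # same shape of signal, different amplitude
reversed_sig = [4.0, 3.0, 2.0, 1.0]

sim_scaled = correlation_coefficient(signature, scaled)
sim_reversed = correlation_coefficient(signature, reversed_sig)
```

Because the coefficient is invariant to offset and positive scaling of either vector, resampling both descriptors to a common length is enough to make them comparable.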

Title:
LOCAL KERNEL COLOR HISTOGRAMS FOR BACKGROUND SUBTRACTION
Author(s):
Philippe Noriega, Benedicte Bascle and Olivier Bernier
Abstract:
In addition to being invariant to image rotation and translation, histograms have the advantage of being easy to compute. These advantages make histograms very popular in computer vision. However, without data quantization to reduce size, histograms are generally not suitable for realtime applications. Moreover, they are sensitive to quantization errors and lack any spatial information. This paper presents a way to keep the advantages of histograms while avoiding their inherent drawbacks by using local kernel histograms. This approach is tested for background subtraction using indoor and outdoor sequences.

Title:
TEXT LOCALIZATION IN COLOR DOCUMENTS
Author(s):
Nikos Papamarkos, Nikos Nikolaou, Euthimios Badekas and Charalambos Strouthopoulos
Abstract:
A new method for text localization in cover color pages and general color document images is presented. The colors of the document image are reduced to a small number using a color reduction technique based on a Kohonen Self Organized Map (KSOM) neural network. Each color defines a color plane in which the connected components (CCs) are extracted. In each color plane a CC filtering procedure is applied, followed by a local grouping procedure. At the end of this stage, groups of CCs are constructed, which are next refined by obtaining the Direction Of Connection (DOC) property for each CC. Using the DOC property, the groups of CCs are classified as text or non-text regions. Finally, text regions identified in the different color planes are superimposed and the final text localization of the entire document is achieved. The proposed technique was extensively tested with a large number of color documents.

Title:
A NEURAL NETWORK APPROACH TO BAYESIAN BACKGROUND MODELING FOR VIDEO OBJECT SEGMENTATION
Author(s):
Dubravko Culibrk, Oge Marques, Daniel Socek, Hari Kalva and Borko Furht
Abstract:
Object segmentation from a video stream is an essential task in video processing and forms the foundation of scene understanding, object-based video encoding (e.g. MPEG4), and various surveillance and 2D-to-pseudo-3D conversion applications. The task is difficult and exacerbated by the advances in video capture and storage. Increased resolution of the sequences requires development of new, more efficient algorithms for object detection and segmentation. The paper presents a novel neural network based approach to background modelling for motion based object segmentation in video sequences. The proposed approach is designed to enable efficient, highly-parallelized hardware implementation. Such a system would be able to achieve real time segmentation of high-resolution sequences.

Title:
COLOR SEGMENTATION OF COMPLEX DOCUMENT IMAGES
Author(s):
Nikos Papamarkos and Nikos Nikolaou
Abstract:
In this paper we present a new method for color segmentation of complex document images which can be used as a preprocessing step of a text information extraction application. From the edge map of an image, we choose a representative set of samples of the input color image and build a 3D histogram of the RGB color space. These samples are used to locate a relatively large number of proper points in the 3D color space, which are then used to initially reduce the colors. From this step, an oversegmented image is produced which usually has no more than 100 colors. To extract the final result, a mean shift procedure starts from the calculated points and locates the final color clusters of the RGB color distribution. Also, to overcome noise problems, a new edge preserving smoothing filter is used to enhance the quality of the image. Experimental results showed the method’s capability of producing correctly segmented complex color documents while removing background noise or low contrast objects, which is very desirable in text information extraction applications. Additionally, our method has the ability to cluster randomly shaped distributions.

Title:
NONPLANARITY AND EFFICIENT MULTIPLE FEATURE EXTRACTION
Author(s):
Ernst D. Dickmanns and Hans-Joachim Wuensche
Abstract:
A stripe-based image evaluation scheme has been developed allowing efficient detection of the following classes of features: 1. a ‘nonplanarity’ feature for separating image regions treatable by planar shading models from the rest containing textured regions and corners; 2. edges; 3. smoothly shaded regions between edges; and 4. corners for stable 2-D feature tracking. All these features are detected by evaluating receptive fields (masks) with four mask elements shifted through stripes, both in row and column direction. Efficiency stems from re-use of intermediate results in mask elements in neighboring stripes and from coordinated use of these results in different feature extractors. Application to road scenes is discussed.

Title:
A COMPARISON OF WAVELET-BASED AND RIDGELET-BASED TEXTURE CLASSIFICATION OF TISSUES IN COMPUTED TOMOGRAPHY
Author(s):
Lindsay Semler and Lucia Dettori
Abstract:
The research presented in this article is aimed at developing an automated imaging system for classification of tissues in medical images obtained from CT scans. The article focuses on using multi-resolution texture analysis. The approach consists of two steps: automatic extraction of the most discriminative texture features of regions of interest and creation of a classifier that automatically identifies the various tissues. Four forms of multi-resolution analysis were carried out, including the Haar wavelet, Daubechies wavelet, Coiflet wavelet, and the ridgelet. The classification step is implemented through a decision tree classifier based on the cross-validation Classification and Regression Tree approach. Preliminary results indicate that the Haar wavelet outperforms Daubechies and Coiflet. Further investigation shows the ridgelet-based texture features have greater discriminating power than all other multi-resolution feature vectors.

Title:
AUTOMATIC EXTRACTION OF CLOSED CONTOURS IN THE PORTUGUESE CADASTRAL MAPS
Author(s):
Tiago Candeias, Filipe Tomaz and Hamid Shahbazkia
Abstract:
The automatic extraction of closed contours is the most important and difficult problem in the automatic recognition of the Portuguese cadastral maps. Many difficulties have to be solved, such as gaps in contours, elements connected to contours, crossing lines and the association of each entity with its contour. In the literature there are very few studies on the recognition of cadastral maps, and the maps already studied are different from ours. Therefore our research mainly focused on appropriate computer vision algorithms that yield acceptable results. In this paper we present a sequence of algorithms to solve various problems in the contour extraction. The algorithms are completely different and each one tries to solve one specific problem of the analysis. The methods used were the Block-Fill algorithm, Lohmann's algorithm, the Seed-Segment algorithm and the Rosin-West vectorization algorithm. The architecture of our system is presented and the results are shown at the end of the paper.

Title:
COMPUTER VISION BASED SORTING OF ATLANTIC SALMON (SALMO SALAR) ACCORDING TO SIZE AND SHAPE
Author(s):
Ekrem Misimi, John R. Mathiassen, Ulf Erikson and Amund Skavhaug
Abstract:
Intensive use of manual labour is necessary in the majority of operations in today’s fish processing plants, incurring high labour costs and human mistakes in processing, evaluation and assessment. Automation of processing line operations is therefore a necessity for faster, low-cost processing. In this paper, we present a computer vision system for sorting Atlantic salmon according to size and shape. Sorting is done into two grading classes of salmon: “Production Grade” and “Superior/Ordinary Grade”. Images of salmon were segmented into binary images, and then feature extraction was performed on the geometrical parameters to ensure separability between the two grading classes. The classification algorithm was a threshold type classifier. We show that our computer vision system can be used to evaluate and sort salmon by shape and deformities in a fast and non-destructive manner. Today, the low cost of implementing advanced computer vision solutions makes this a real possibility for replacing manual labour in fish processing plants.

Title:
LARGE SCALE IMAGE-BASED ADULT-CONTENT FILTERING
Author(s):
Henry A. Rowley, Yushi Jing and Shumeet Baluja
Abstract:
As more people start using the Internet and more content is placed on-line, the chances that individuals will encounter inappropriate or unwanted adult-oriented content increase. This paper presents a practical and scalable method to efficiently detect many adult-content images, specifically pornographic images. We currently use this system in a search engine that covers a large fraction of the images on the WWW. For each image, face detection is applied and a number of summary features are computed; the results are then fed to a support vector machine for classification. The results show that a significant fraction of adult-content images can be detected.

Title:
FACIAL IMAGE FEATURE EXTRACTION USING SUPPORT VECTOR MACHINES
Author(s):
Hamid Abrishami Moghaddam and Mehdi Ghayoumi
Abstract:
In this paper, we present an approach that unifies sub-space feature extraction and support vector classification for face recognition. Linear discriminant, independent component and principal component analyses are used for dimensionality reduction prior to introducing feature vectors to a support vector machine. The performance of the developed methods in reducing classification error and providing better generalization for high dimensional face recognition application is demonstrated.

Title:
SPEEDING UP SNAKES
Author(s):
Enrico Kienel, Marek Vanco and Guido Brunnett
Abstract:
In this paper we summarize new and existing approaches for semiautomatic image segmentation based on active contour models. We developed a user interface in order to replace the manual segmentation of images in the medical research of the Center of Anatomy at the Georg August University of Göttingen. Due to the huge images (sometimes bigger than 100 megapixels) the research deals with, an efficient implementation is essential. We use a multiresolution model to achieve a fast convergence at coarse scales. The subdivision of an active contour into multiple segments and their treatment as open snakes allows us to exclude from the calculation those parts of the contour which have already aligned with the desired curve. In addition, the band structure of the iteration matrices can be used to set up an O(n) algorithm for the computation of one single deformation step. Finally, we achieved an acceleration of the initial computation of the Edge Map and the Gradient Vector Flow by the use of contemporary CPU architectures. Furthermore, the storage of huge images alongside additional data structures, such as the gradient vector flow, requires a lot of memory. We show a possibility to save memory by a lossy scaling of the traditional potential image forces.
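
How a band structure leads to an O(n) solve can be illustrated with the Thomas algorithm for a tridiagonal system. This is a simplified stand-in: the iteration matrices of snakes with a bending term are typically pentadiagonal, and the abstract does not state which solver the authors use. The function name and argument layout are assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) with the Thomas algorithm.

    All arguments are lists of length n. lower[i] is the entry left of the
    diagonal in row i (lower[0] is ignored); upper[i] is the entry right of
    the diagonal in row i (upper[n-1] is ignored). No pivoting is performed,
    so the matrix should be diagonally dominant or positive definite.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# System [[2,-1,0],[-1,2,-1],[0,-1,2]] x = [0,0,4], whose solution is [1,2,3].
solution = solve_tridiagonal(
    [0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0]
)
```

Each row is visited a constant number of times, so the cost of one deformation step grows linearly with the number of contour points instead of cubically as with a dense solve.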

Title:
ICR DETECTION IN FILLED FORM & FORM REMOVAL
Author(s):
Abhishek Agarwal, Pramod Kumar and Sorabh Kumar
Abstract:
This paper presents methods to enhance accuracy rates of ICR detection in structured form processing. Forms are printed by different vendors using a variety of printers and at different settings. Every printer has its own scaling algorithm, so the final printed forms, though visibly similar to the naked eye, contain considerable shift, expansion or shrinkage. This poses problems when data zones are close together, as the template reference points refer to the neighbouring identical zones, impeding data extraction accuracy. Moreover, these transformational defects result in inaccurate form removal, leaving behind line residues and noise that further degrade the extraction accuracy. Our proposed algorithm works on filled forms, thereby eliminating the problem of differences between the template and the actual form. Template data can also be provided as an input to our algorithm to increase speed and accuracy. The algorithm has been tested on a variety of forms and the results have been very promising.

Title:
ROBUST CLASSIFICATION BASED ON PRIOR OF LOCAL DIFFERENCE PROBABILITY FOR THE UNMANNED GROUND VEHICLES
Author(s):
Pangyu Jeong and Sergiu Nedevschi
Abstract:
The aim of this paper is to propose a new classification method for unmanned ground vehicles based on a noise-tolerant LDP (Local Difference Probability) prior-based discriminator. The proposed classification has three characteristics: a probability feature space instead of a gray-intensity feature space, a bimodal Gaussian discriminator (noise-tolerant discriminator), and classification based on a single class cluster center (road class only). Based on these components, the classification ability and classification time cost are better than those of generic classification methods: K-Mean, Fuzzy K-Mean, Contiguity K-Mean, and K-Mean applied to texture features obtained from GMRF and from a Gabor filter bank. The core of the proposed classification is a discriminator (prior density), which is obtained from the mean of the distances of Local Difference Probabilities (LDPs) in a randomly selected road area. The road area is randomly selected in front of the ego vehicle, and the initial class cluster center is placed inside the sampled road area. The road features are classified outward from the single cluster center to the entire image space.

Title:
REGISTRATION OF 3D - PATTERNS AND SHAPES WITH CHARACTERISTIC POINTS
Author(s):
Darko Dimitrov, Christian Knauer and Klaus Kriegel
Abstract:
We study approximation algorithms for a matching problem that is motivated by medical applications. Given a small set of points $P \subset {\mathbb R}^{3}$ and a surface $S$, the optimal matching of $P$ with $S$ is represented by a rigid transformation which maps $P$ as `close as possible' to $S$. Previous solutions either require polynomial runtime of high degree or make use of heuristic techniques which could be trapped in some local minimum. We propose a modification of the problem setting by introducing small subsets of so-called characteristic points $P_{c} \subseteq P$ and $S_{c} \subseteq S$, and assuming that points from $P_{c}$ must be matched with points from $S_{c}$. We focus our attention on the first nontrivial case that occurs if $|P_{c}| = 2$, and show that this restriction results in new fast and reliable algorithms for the matching problem. Experimental results are provided for surfaces reconstructed from real and synthetic data.

Title:
DETECTION OF ISOLATED NEMATODES IN CLUTTER ENVIRONMENTS USING SHAPE FEATURE HISTOGRAMS
Author(s):
Daniel Ochoa, Sidharta Gautama and Boris Vintimilla
Abstract:
We present an approach for the detection of isolated Caenorhabditis elegans nematodes in cluttered environments. The method is based on shape feature histograms which describe the distribution of features of second-order derivative responses of linear image structures. The shape features are able to distinguish isolated nematodes from overlapping nematodes and clutter, thereby improving the automated image analysis of nematode populations where accurate assessment of shape is needed. An evaluation is performed on a database of manually segmented images. Shape continuity features prove to have the highest discriminative power. This is consistent with the morphological structure of this kind of organism. Our experiments suggest that similar techniques can be used for identification of other linear shaped biological objects.

Title:
SEGMENTATION ALGORITHMS FOR EXTRACTION OF IDENTIFIER CODES IN CONTAINERS
Author(s):
Juan A. Rosell Ortega, Alberto J. Pérez Jiménez and Gabriela Andreu García
Abstract:
In this paper we present a study of four segmentation algorithms with the aim of extracting characters from containers. We compare their performance using images acquired under real conditions and using the results of human operators as a model to check their capabilities. We modified the algorithms to adapt them to our needs. Our aim is to obtain a segmentation of the image which contains all, or as many as possible, of the characters of the container's code, no matter how many other non-relevant objects may appear, since irrelevant objects may be filtered out afterwards by applying other techniques. This work is part of a higher-order project whose aim is the automation of the entrance gate of a port.

Title:
A MULTIRESOLUTION FEATURE BASED METHOD FOR AUTOMATIC REGISTRATION OF SATELLITE IMAGERY BASED ON DIGITAL MAPS
Author(s):
Farhad Samadzadegan, Sara Saeedi and Mohammad Hosseini
Abstract:
The registration of satellite images based on object information such as digital maps is one of the key tasks in most remote sensing applications. Due to the complexity of the natural scenes appearing in satellite imagery and the different structures of image and vector map, fully automatic registration processes have faced serious obstacles, and thus a reliable result is normally expected only in a relatively simple imaging environment. In the procedure proposed in this paper, genetic algorithms (GAs) are used to detect and match the corresponding key features in the satellite image and object data, based on a multi-resolution representation of information and mathematical models. The present approach is designed to be completely independent of the sensor type and of any a priori information on the exterior orientation. A first successful application of the proposed approach is demonstrated for automatic registration of IKONOS imagery and a GIS map.

Title:
IMAGE “GROUP-REGISTRATION” BASED ON REPRESENTATION THEORY
Author(s):
Lamia Ben Youssef and Faouzi Ghorbel
Abstract:
The general principle of a matching algorithm is to optimize a criterion that furnishes a measure of the similarity between two images for a given space of geometrical transformations. In this work, we propose a methodology/framework based on a similarity measure -- the generalized correlation -- built in a systematic way from the links between a feature space and a group of transforms modeled by a group action. Using results from representation theory, we can extend the correlation transform to any homogeneous space with a transitively acting group. When the generalized Fourier transform exists, the group correlation can be expressed in a spectral space and it becomes possible to implement fast algorithms for its computation. Two important examples in image processing are then detailed: the similarity group (rotation and scaling) on gray-level shapes from 2D images and the 3D rigid motion group (rotation and translation) followed by a planar projection.

Title:
ROBUST VIDEO MOSAICING FOR BENTHIC HABITAT MAPPING
Author(s):
Hiêp Luong, Wilfried Philips and Anneleen Foubert
Abstract:
Nowadays remotely operated vehicles (ROVs) have become a popular tool among biologists and geologists to examine and map the seafloor. For analytical purposes, mosaics have to be created from a large number of recorded video sequences. Existing mosaicing techniques fail in non-uniformly illuminated environments, due to the presence of a spotlight mounted on the ROV. Also, traditional image blending techniques suffer from ghosting artifacts in the presence of moving objects. We propose a general observation model and a robust mosaicing algorithm which tackle these major problems. Results show an improvement in visual quality: noise and ghosting artifacts are removed.

Title:
HIERARCHICAL ESTIMATION OF IMAGE FEATURES WITH COMPENSATION OF MODEL APPROXIMATION ERRORS
Author(s):
Stefano Casadei
Abstract:
The efficient and accurate estimation of complex image features, such as corners and junctions, requires the combination of a hierarchical approach with model-based techniques. Towards this goal, we propose a formalism to decompose a complex feature into simpler approximating features and an algorithm to fuse estimates of the local features into an estimate of the complex feature. The algorithm contains a training stage to calculate and store in a memory the discrepancies in the estimated feature parameter that arise when the complex feature model is approximated by simpler ones. The algorithm is shown to give the correct result for the case of noise-free feature instances. One stage of an edge detector based on this methodology is described and some results are presented.

Title:
AN IMAGE REGISTRATION TECHNIQUE FOR DETECTION OF CRACK AND RUST GROWTH
Author(s):
Norihiko Itoh
Abstract:
To detect changes in cracks and fine rust from images, it is necessary to know the difference in camera angle between a past image and a present image. This paper proposes a technique for rectifying the discrepancy in camera angle using the two images. The proposed image registration technique can estimate the camera angle parameter with high precision using the frequency spectrum of the whole images. The technique is also applicable to images of a concrete wall surface that contain few features. The validity of the technique was checked by experiments.

Title:
GRAIN SIZE MEASUREMENT IN IMAGES OF SANDS
Author(s):
Fátima Cristina Lira and Pedro Pina
Abstract:
Different sand deposits exhibit different size distributions, and measuring the size of their grains provides important information about these deposits and consequently allows correlations between them to be established. This paper presents a new method for the characterization of sand grain size based on image analysis. Size distributions are obtained with successive morphological openings parameterized by structuring elements of increasing size. The results obtained from image analysis and sieving are compared by transforming the area measured in the images to weight, assuming some simplifications. Although some bias is introduced in relation to sieving, the global characteristics of the sediments are preserved, which allows us to conclude that image analysis is an alternative technique for measuring the size of sand grains.
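Editorial illustration (not taken from the paper): the idea of obtaining a size distribution from successive morphological openings can be sketched as follows. The sketch assumes a binary image stored as nested lists and square structuring elements; the function names `erode`, `dilate` and `granulometry` are hypothetical, and the authors' actual structuring elements and area-to-weight conversion are not reproduced here.

```python
def erode(img, k):
    """Binary erosion by a k x k square anchored at its top-left corner."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h - k + 1):
        for x in range(w - k + 1):
            # Keep the anchor only if the whole square lies in the foreground.
            if all(img[y + dy][x + dx] for dy in range(k) for dx in range(k)):
                out[y][x] = 1
    return out


def dilate(img, k):
    """Binary dilation that paints a k x k square at every anchor pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy in range(k):
                    for dx in range(k):
                        if y + dy < h and x + dx < w:
                            out[y + dy][x + dx] = 1
    return out


def granulometry(img, max_size):
    """Foreground area remaining after opening with squares of side 1..max_size."""
    areas = []
    for k in range(1, max_size + 1):
        opened = dilate(erode(img, k), k)  # opening = erosion followed by dilation
        areas.append(sum(map(sum, opened)))
    return areas
```

The drop in area between two consecutive sizes indicates how much foreground belongs to grains that are too small to contain the larger square, which is the quantity a size distribution is built from.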

Title:
ENHANCING IMPACT CRATER CONTOURS TO INCREASE RECOGNITION RATES
Author(s):
Lourenço P.C. Bandeira, José Saraiva and Pedro Pina
Abstract:
This paper introduces an enhancement to the edge detection procedures that are part of a general methodology which aims at increasing the robustness of the automatic recognition of impact craters on planetary surfaces. It is demonstrated that the proposed improvement contributes substantially to increasing the recognition rates while simultaneously diminishing the rates of false positives. Its performance is evaluated through a comparison with other classic edge detectors, which are applied to a set of images of the surface of Mars acquired by the MOC instrument aboard Mars Global Surveyor, a probe currently orbiting the planet.

Title:
SEGMENTATION AND MODELLING OF FULL HUMAN BODY SHAPE FROM 3D SCAN DATA: A SURVEY
Author(s):
Naoufel Werghi
Abstract:
The recent advances in full human body imaging technology, illustrated by the 3D human body scanner (HBS), a device delivering full human body shape data, have opened up broad prospects for the deployment of this technology in various fields (e.g. clothing industry, anthropology, entertainment). Yet this advance has brought challenges on how to process and interpret the data delivered by the HBS in order to bridge the gap between this technology and potential applications. This paper surveys the literature on methods for human body scan data segmentation and modelling that have attempted to overcome these challenges. It also discusses and evaluates the different approaches with respect to several requirements.

Title:
NEIGHBORHOOD HYPERGRAPH PARTITIONING FOR IMAGE SEGMENTATION
Author(s):
Soufiane Rital, Hocine Cherifi and Serge Miguet
Abstract:
The aim of this paper is to introduce a multilevel neighborhood hypergraph partitioning for image segmentation. Our proposed approach uses the image neighborhood hypergraph model introduced in our previous work and the multilevel hypergraph partitioning algorithm introduced by George Karypis. To evaluate the algorithm's performance, experiments were carried out on a group of gray-scale images. The results show that the proposed segmentation approach finds the regions in the images properly when compared with an image segmentation algorithm using the normalized cut criterion.

Title:
A SIMPLE SCHEME FOR CONTOUR DETECTION
Author(s):
Gopal Datt Joshi and Jayanthi Sivaswamy
Abstract:
We present a simple and general-purpose scheme for the detection of all salient object contours and region boundaries in real images. The scheme is inspired by the mechanism of centre-surround interaction that is exhibited by 80% of neurons in the primary visual cortex of primates. It is based on the observation that the local context of a contour significantly affects the global saliency of the contour. The proposed scheme consists of two steps: first, find the edge response at all points in an image; second, modulate the response at a point by the response in its surround. In this paper, we present the results of a low-cost implementation of this scheme using a Sobel gradient operator and a mask operation for the surround influence. The successful results of testing this scheme on a wide range of images show that the proposed scheme can be used as a general preprocessing step for high-level tasks such as shape-based recognition and image retrieval.

Title:
PROBABILITY ANALYSIS IN ART CONSERVATION
Author(s):
Vassiliki Kokla, Alexandra Psarrou and Vassilis Konstantinou
Abstract:
Semi-transparent pigments are very difficult to discriminate because of the influence of the support on which they are found; their examination therefore often relies on destructive techniques of analysis. In the case of old manuscript inks, which are semi-transparent pigments, it is frequently impossible to apply destructive techniques because of the historical and cultural value of the manuscripts. However, ink analysis is important because it gives information on the authenticity and the dating of manuscripts. Probability analysis offers a good opportunity for developing effective solutions for the non-destructive characterization of manuscript inks. In this paper we present a novel method for ink recognition that is based on optical ink information, representing inks through a mixture of Gaussian functions so that ink classification using Bayes' decision rule becomes feasible.

Title:
COMPARING YEAST CELLS SEGMENTATION THROUGH HIERARCHICAL TREES
Author(s):
Marco Antonio Garcia de Carvalho and Tiago Willian Pinto
Abstract:
Image filtering and segmentation consist of separating an image into regions according to some criteria and to the purpose of the application. Recent publications in the image processing domain make use of a segmentation strategy called multiscale or hierarchical segmentation. Multiscale segmentation provides a family of partitions of an image, presenting it at several levels of resolution. This work studies a multiscale image representation called the Tree of the Critical Lakes (TCL), which provides a set of nested partitions of an image. The Tree of the Critical Lakes is defined from the Watershed Transform, the traditional tool of Mathematical Morphology in image segmentation operations. Moreover, we implement a comparison between the TCL and another image representation, called the Component Tree (CT). The CT consists of a set of cross-section images and their connected components, linked by the inclusion relation. We show image segmentation experiments, based on TCLs and CTs, for a group of yeast cell images.

Title:
LEARNING NONLINEAR MANIFOLDS OF DYNAMIC TEXTURES
Author(s):
Ishan Awasthi and Ahmed Elgammal
Abstract:
Dynamic textures are sequences of images of moving scenes that show stationarity properties in time, e.g. waves, flames and fountains. Recent attempts at generating potentially infinitely long sequences model the dynamic texture as a Linear Dynamic System. This assumes a linear correlation in the input sequence. Most real-world sequences, however, exhibit nonlinear correlation between frames. In this paper, we propose a technique for generating dynamic textures using a low-dimensional model that preserves the nonlinear correlation. We use nonlinear dimensionality reduction to create an embedding of the input sequence. Using this embedding, a nonlinear mapping is learnt from the embedded space into the input image space. Any input is represented by a linear combination of nonlinear basis functions centered along the manifold in the embedded space. A spline is used to move along the input manifold in this embedded space as a similar manifold is created for the output. The nonlinear mapping learnt on the input is used to map this new manifold into a sequence in the image space. Output sequences thus created contain images never present in the original sequence and are very realistic.

Title:
MINIMAL DISTORTION MAPPINGS OF SURFACES FOR MEDICAL IMAGES
Author(s):
Eli Appleboim, Emil Saucan and Yehoshua Y. Zeevi
Abstract:
In this paper we present a simple method for minimal distortion development of triangulated surfaces for mapping and imaging. The method is based on classical results of F. Gehring and Y. Väisälä regarding the existence of quasi-conformal and quasi-isometric mappings between Riemannian manifolds. Both random and curvature-based variations of the algorithm are presented. In addition, the algorithm enables the user to compute and control the maximal distortion. Moreover, the algorithm makes no use of derivatives, hence it is suitable for analysis of noisy data. The algorithm is tested both on synthetic images and on data obtained from real CT images of the human colon.

Area 3 - Image Understanding

Title:
SURFACE REGISTRATION USING LOCAL SURFACE EXTENDED POLAR MAP
Author(s):
Elsayed Hemayed
Abstract:
In this paper, we present a new surface signature-based representation that is orientation-independent and can be used to match and align surfaces under rigid transformation. The proposed scheme represents the surface patches in terms of their signatures. The surface signatures are formed as extended polar maps using the neighbours of each surface patch. Correlation of the maps is used to establish point correspondences between two views; from these correspondences a rigid transformation that aligns the views is calculated. The effectiveness of the proposed scheme is demonstrated through several registration experiments.

Title:
DISTANCE MAPS: A ROBUST ILLUMINATION PREPROCESSING FOR ACTIVE APPEARANCE MODELS
Author(s):
Sylvain Le Gallou, Gaspard Breton, Christophe Garcia and Renaud Séguier
Abstract:
Deformable appearance model methods are useful for realistically modelling shapes and textures of visual objects for reconstruction. A first application is the fine analysis of face gestures and expressions from videos, as deformable appearance models make it possible to automatically and robustly locate several points of interest in face images. This opens up prospects for developing technologies in many applications, such as video coding of faces for video telephony, animation of synthetic faces, visual word recognition, analysis of expressions and emotions, and tracking and recognition of faces. However, these methods are not very robust to variations in illumination, which are to be expected in unconstrained conditions. This article describes a robust preprocessing method designed to enhance the performance of deformable model methods in the case of lighting variations. The proposed preprocessing is applied to Active Appearance Models (AAMs). More precisely, the contribution consists in replacing texture images (pixels) with distance maps as input to the deformable appearance model methods. The distance maps are images containing information about the distance between edges in the original object images, which enhances the robustness of AAMs against lighting variations.

Title:
SIMPLIFIED REPRESENTATION OF LARGE RANGE DATASET
Author(s):
Hongchuan Yu and Mohammed Bennamoun
Abstract:
In this paper, we consider two approaches to simplifying medium- and large-sized range datasets to a compact data point set, based on Radial Basis Function (RBF) approximation. The first algorithm uses a Pseudo-Inverse Approach for the case of given basis functions, and the second uses an SVD-Based Approach for the case of unknown basis functions. The novelty of this paper is a partition-based SVD algorithm for a symmetric square matrix, which can effectively reduce the dimension of a matrix for a given partition. Furthermore, this algorithm is combined with a standard clustering algorithm to form our SVD-Based Approach, which can then seek an appropriate partition automatically for dataset simplification. Experimental results indicate that the presented Pseudo-Inverse Approach requires a uniformly sampled control point set, and can obtain an optimal least-squares solution when the control point set is given. In the unknown control point case, by contrast, the presented SVD-Based Approach can seek an appropriate control point set automatically, and the resulting surface preserves more of the essential details and is less prone to distortion.

Title:
CORTICAL OBJECT SEGREGATION AND CATEGORIZATION BY MULTI-SCALE LINE AND EDGE CODING
Author(s):
João Rodrigues and J. M. Hans du Buf
Abstract:
In this paper we present an improved scheme for line and edge detection in cortical area V1, based on responses of simple and complex cells, truly multi-scale with no free parameters. We illustrate the multi-scale representation for visual reconstruction, and show how object segregation can be achieved with coarse-to-fine-scale groupings. A two-level object categorization scenario is tested in which pre-categorization is based on coarse scales only, and final categorization on coarse plus fine scales. Processing schemes are discussed in the framework of a complete cortical architecture.

Title:
POSE ESTIMATION USING STRUCTURED LIGHT AND HARMONIC SHAPE CONTEXTS
Author(s):
Thomas B. Moeslund and Jakob Kirkegaard
Abstract:
In this work we address the general bin-picking problem where a CAD model of the object to be picked is available beforehand. Structured light, in the form of Time Multiplexed Binary Stripes, is used together with a calibrated camera to obtain 3D data of the objects in the bin. The 3D data is then segmented into points of interest, and for each a regional feature vector is extracted. The features are the Harmonic Shape Contexts, which are rotationally invariant and can model any free-form object. These features are matched against similar features found in the CAD model, allowing for a pose estimation of the objects in the bin. Tests show the method to be capable of estimating the pose of partially occluded objects; however, the method is also found to be sensitive to the resolution of the structured light system and to noise in the data.

Title:
AN AUTOMATIC APPROACH FOR PARAMETER SELECTION IN SELF-ADAPTIVE TRACKING
Author(s):
Daniela Hall, Rémi Emonet and James L. Crowley
Abstract:
In this article we propose an automatic approach for parameter selection of a tracking system. We show that such a self-adaptive tracking system achieves better tracking performance than a system with manually tuned parameters. Our approach requires little supervision by a user, which makes it ideally suited for commercial applications. The self-adaptive component makes the system less sensitive to changing environmental conditions. Components for tracking, auto-critical evaluation and automatic parameter regulation serve to detect performance drops that trigger the parameter regulation process. The self-adaptive components require a quality measure based on a statistical scene reference model. We propose an automatic approach for the generation of such a reference model and compare several learning approaches. The experiments show that the auto-regulation of parameters significantly enhances the performance of the tracking system.

Title:
A FAST ALGORITHM FOR ND POLYHEDRAL SCENE PERCEPTION FROM A SINGLE 2D LINE DRAWING
Author(s):
Hongbo Li and Lei Huang
Abstract:
In this paper, we study the problem of reconstructing the polyhedral structures of a general $n$D polyhedral scene from its single 2D line drawing. With the idea of local construction and propagation, we propose a number of powerful techniques for general face identification. Our reconstruction algorithm, called ``$n$DView'', is tested on all the 3D examples we found in the literature, plus a number of 4D and 5D examples we devised. Our algorithm does not require the dimension $n$ of the object or the dimension $m$ of its surrounding space to be given, and allows the object to be a non-manifold in which neighboring faces can be coplanar. Another striking feature is its efficiency: our algorithm can handle 3D solids of over 10,000 faces, at a speed 100 times that of the fastest existing algorithms for 2D polyhedral manifold reconstruction.

Title:
EBGM VS SUBSPACE PROJECTION FOR FACE RECOGNITION
Author(s):
Andreas Stergiou, Aristodemos Pnevmatikakis and Lazaros Polymenakos
Abstract:
Autonomic human-machine interfaces need to determine the user of the machine in a non-obtrusive way. The identification of the user can be done in many ways, using RFID tags, the audio stream or the video stream, to name a few. In this paper we focus on the identification of faces from the video stream. In particular, we compare two different approaches: linear subspace projection from the appearance-based methods and Elastic Bunch Graph Matching from the feature-based methods. Since the intended application is restricted to indoor multi-camera setups with collaborative users, the deployment scenarios of the face recognizer are easily identified. The comparison is done using a common test-bed for both methods. The test-bed is exhaustive for the deployment scenarios we need to consider, leading to the identification of deployment scenarios for which each method is preferable.

Title:
FINGERCODE FOR FINGERPRINT RECOGNITION IN WAVELET TRANSFORM DOMAIN
Author(s):
JuCheng Yang, JinWook Shin, BungJun Min, Bin Yu and DongSun Park
Abstract:
FingerCode has been shown to be an effective representation for capturing both the local and global information in a fingerprint with respect to a reference point in the fingerprint. The wavelet transform is a powerful tool for fingerprint enhancement and feature extraction. In this paper, a novel method for fingerprint recognition using FingerCode in the wavelet transform domain is proposed. A new reference point detection method in the wavelet sub-images is also proposed. By adopting this method, many conventional preprocessing steps such as smoothing, binarization, thinning and restoration are not necessary, and computation time is reduced as well. Experiments show that our proposed method is more accurate and reliable than a traditional FingerCode method.

Title:
APPEARANCE BASED PAINTINGS RECOGNITION FOR A MOBILE MUSEUM GUIDE
Author(s):
Claudio Andreatta and Fabrizio Leonardi
Abstract:
This paper presents a prototype of a visual recognition system for a handheld interactive museum guide. The user can obtain contextualized information about museum drawings, without any knowledge of how the system works, by simply pointing a palmtop camera towards the painting and taking a shot. The system was tested and its performance was found to be satisfactory in challenging environmental conditions.

Title:
CONTENT-BASED TEXTURE IMAGE RETRIEVAL USING THE LEMPEL-ZIV-WELCH ALGORITHM
Author(s):
Leonardo Vidal Batista, Moab Mariz Meira and Nicomedes L. Cavalcanti Júnior
Abstract:
This paper presents a method for content-based texture image retrieval using the Lempel-Ziv-Welch (LZW) compression algorithm. Each texture image in the database is processed by a global histogram equalization filter, and then an LZW dictionary is constructed for the filtered texture and stored in the database. The LZW dictionaries thus constructed constitute a statistical model of the texture. In the query stage, each texture sample to be searched is processed by the histogram equalization filter and successively encoded by the LZW algorithm in static mode, using the stored dictionaries. The system retrieves a ranked list of images, sorted according to the coding rate achieved with each stored dictionary. Empirical results with textures from the Brodatz album show that the method achieves retrieval accuracy close to 100%.
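Editorial illustration (not taken from the paper): the retrieval idea described above, building one LZW dictionary per stored texture and then ranking stored textures by how compactly a query is encoded with each frozen dictionary, can be sketched as follows. The sketch assumes textures are given as byte strings; histogram equalization is omitted, the number of emitted codes is used as a stand-in for the coding rate, and the names `build_lzw_dictionary`, `static_code_count`, `rank_textures` and the `max_size` limit are hypothetical.

```python
def build_lzw_dictionary(data: bytes, max_size: int = 4096) -> set:
    """Run LZW over `data` and return the set of phrases it learns."""
    phrases = {bytes([b]) for b in range(256)}  # all single bytes to start
    current = b""
    for b in data:
        candidate = current + bytes([b])
        if candidate in phrases:
            current = candidate
        else:
            if len(phrases) < max_size:
                phrases.add(candidate)  # learn the new phrase
            current = bytes([b])
    return phrases


def static_code_count(data: bytes, phrases: set) -> int:
    """Encode `data` with a frozen dictionary and count the codes emitted.

    LZW dictionaries are prefix-closed, so extending the current phrase
    byte by byte finds the longest match at each position.
    """
    codes = 0
    current = b""
    for b in data:
        candidate = current + bytes([b])
        if candidate in phrases:
            current = candidate
        else:
            codes += 1  # emit the code for `current`
            current = bytes([b])
    if current:
        codes += 1  # flush the last phrase
    return codes


def rank_textures(query: bytes, database: dict) -> list:
    """Return stored texture names, best match (fewest codes) first."""
    scores = {name: static_code_count(query, phrases)
              for name, phrases in database.items()}
    return sorted(scores, key=scores.get)
```

Here `database` maps a texture name to the dictionary built from that texture; a query drawn from a similar texture tends to reuse long stored phrases and therefore needs fewer codes.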

Title:
ROBUST HUMAN SKIN DETECTION IN COMPLEX ENVIRONMENTS
Author(s):
Ehsan Fazl Ersi and John Zelek
Abstract:
Skin detection has applications in people retrieval, face detection/tracking, hand detection/tracking and, more recently, face recognition. However, most of the currently available methods are not robust enough to deal with some real-world conditions, such as illumination variation and background noise. This paper describes a novel technique for skin detection that is capable of achieving high performance in complex environments with real-world conditions. The three main contributions of our work are: (i) processing each pixel at different brightness levels to handle the problem of illumination variation, (ii) proposing a fast and simple method for incorporating neighborhood information in the processing of each pixel, and (iii) presenting a comparative study on thresholding the skin likelihood map, and employing a local entropy technique for binarizing our skin likelihood map. Experiments on a set of real-world images and a comparison with some state-of-the-art methods validate the robustness of our method.

Title:
USING DEFICITS OF CONVEXITY TO RECOGNIZE HAND GESTURES FROM SILHOUETTES
Author(s):
Ed Lawson and Zoran Duric
Abstract:
We describe a method of recognizing hand gestures from hand silhouettes. Given the silhouette of a hand, we compute its convex hull and extract the deficits of convexity corresponding to the differences between the hull and the silhouette. The deficits of convexity are normalized by rotating them around the edges shared with the hull. To learn a gesture, the deficits from a number of examples are extracted and normalized. The deficits are grouped using k-means clustering, with similarity measured by relative overlap. Each cluster is assigned a symbol and represented by a template. Gestures are represented by strings of symbols corresponding to the nearest neighbors of the deficits. Distinct sequences of symbols corresponding to a given gesture are stored in a dictionary. Given an unknown gesture, its deficits of convexity are extracted and assigned the corresponding sequence of symbols. This sequence is compared with the dictionary of known gestures and assigned to the class to which the best matching string belongs. We used our method to design a gesture interface to control a web browser. We tested our method on five different subjects and achieved a recognition rate of 92% - 99%.

Title:
AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES
Author(s):
Tsang-Long Pao and Wen-Yuan Liao
Abstract:
For the past several decades, multimedia signal processing has been an increasingly attractive research topic for overcoming certain problems of audio-only recognition. In recent years, many automatic speech-reading systems have been proposed that combine audio and visual speech features. The objective of all such audio-visual speech recognizers is to improve recognition accuracy, particularly in difficult conditions. In this paper, we focus on our new audio-visual database and on the visual feature extraction for audio-visual recognition. Contrary to other existing corpora, our database was recorded in two languages: English and Mandarin. The audio-visual recognition consists of two main steps: feature extraction and recognition. In the proposed approach, we extract the visual motion feature of the lips in the front-end processing. In the post-processing, the Hidden Markov Model (HMM) is used for the audio-visual speech recognition. We describe the new audio-visual database and use it in our proposed system, with some preliminary experiments.

Title:
COMPARING FACES: A COMPUTATIONAL AND PERCEPTUAL STUDY
Author(s):
L. Brodo, M. Bicego, G. Brelstaff, A. Lagorio, M. Tistarelli and E. Grosso
Abstract:
The problem of extracting distinctive parts from a face is addressed. Rather than examining a priori specified features such as nose, eyes, mouth or others, the aim here is to extract from a face the most distinguishing or dissimilar parts with respect to another given face, i.e. finding differences between faces. A computational approach, based on log polar patch sampling and evaluation, has been compared with results obtained from a newly designed perceptual test involving 45 people. The results of the comparison confirm the potential of the proposed computational method.

Title:
COGNITIVE VISION AND PERCEPTUAL GROUPING BY PRODUCTION SYSTEMS WITH BLACKBOARD CONTROL - An example for high-resolution SAR-image
Author(s):
Eckart Michaelsen, Wolfgang Middelmann and Uwe Sörgel
Abstract:
The laws of gestalt-perception play an important role in human vision. Psychological studies identified similarity, good continuation, proximity and symmetry as important inter-object relations that distinguish perceptive gestalts from arbitrary sets of clutter objects. In particular, symmetry and continuation possess a high potential in detection, identification, and reconstruction of man-made objects. This contribution focuses on coding this principle in a fully automatic production system. Such systems capture declarative knowledge. The procedural details are defined as a control strategy for an interpreter. Often an exact solution is not feasible, while approximately correct interpretations of the data with the production system are sufficient. Given input data and a given production system, the control acts accumulatively instead of reductively. The approach is assessment-driven, features any-time capability and fits well into the recently discussed paradigms of cognitive vision. An example from the automatic extraction of groupings and symmetry in man-made structures from high-resolution SAR image data is given.

Title:
SPATIAL STATISTICS OF TEXTONS
Author(s):
Gary Dahme, Eraldo Ribeiro and Mark Bush
Abstract:
Current texture classification methods based on learned textons rely on similarity measurements of frequency histograms of texton maps. A problem with this representation is the loss of spatial information among neighboring textons. In this paper we propose the use of spatial statistics on texton maps that differ only in the spatial arrangement of textons as a means to improve classification. We achieve this by directly calculating spatial statistics on the texton maps using co-occurrence measurements. We demonstrate our method on both Brodatz textures and natural textures from a tropical pollen database. Our results show that the inclusion of spatial statistics on the texton maps helps improve the classification of certain types of textures that cannot be correctly classified using texton histogram-based methods.

Title:
EVALUATING THE POTENTIAL OF CLUSTERING TECHNIQUES FOR 3D OBJECT EXTRACTION FROM LIDAR DATA
Author(s):
Farhad Samadzadegan, Mehdi Maboodi, Sara Saeedi and Ahmad Javaheri
Abstract:
During the last decade airborne laser scanning (LIDAR) has become a mature technology which is now widely accepted for 3D data collection. Nevertheless, these systems have the disadvantage of not representing the desirable bare terrain, but the visible surface including vegetation and buildings. To generate high quality bare terrain using LIDAR data, the most important and difficult step is filtering, where non-terrain 3D objects such as buildings and trees are eliminated while keeping terrain points for quality digital terrain modelling. The main goal of this paper is to investigate and compare the potential of procedures for clustering of LIDAR data for 3D object extraction. The study aims at a comparison of K-Means clustering, SOM and Fuzzy C-Means clustering applied on range laser images. For evaluating the potential of each technique, the confusion matrix concept is employed and the accuracy evaluation is done qualitatively and quantitatively.
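
As a rough illustration of the simplest of the compared techniques, the sketch below runs K-Means on scalar height values to separate low (terrain-like) from high (object-like) samples. Clustering on height alone, the choice of two clusters, the even initialisation and the sample heights are all assumptions for illustration; the paper applies clustering to range laser images and does not specify these details here.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Lloyd's k-means on scalars; centres start evenly spread over the data range (k >= 2)."""
    lo, hi = min(values), max(values)
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres

# Hypothetical heights in metres: four ground samples, three roof samples.
heights = [0.1, 0.2, 0.0, 0.3, 5.0, 5.5, 6.0]
labels, centres = kmeans_1d(heights)
```

Comparing such labels with reference labels is what would populate a confusion matrix of the kind the abstract uses for evaluation.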

Title:
REPRESENTING DIRECTIONS FOR HOUGH TRANSFORMS
Author(s):
Fabian Wenzel and Rolf-Rainer Grigat
Abstract:
Many algorithms in computer vision operate with minimal parametrizations of directions, i.e. with representations of 3D-points by ignoring their distance to the origin. Even though minimal parametrizations may contain singularities, they can enhance convergence in optimization algorithms and are required e.g. for accumulator spaces in Hough transforms. There are numerous possibilities for parameterizing directions. However, many do not account for numerical stability when dealing with noisy data. This paper gives an overview of different parametrizations and shows their sensitivity with respect to noise. In addition to standard approaches in the field of computer vision, representations originating from the field of cartography are introduced. Experiments demonstrate their superior performance in computer vision applications.

Title:
MULTIDIRECTIONAL FACE TRACKING WITH 3D FACE MODEL AND LEARNING HALF-FACE TEMPLATE
Author(s):
Jun’ya Matsuyama and Kuniaki Uehara
Abstract:
In this paper, we present an algorithm to detect and track both frontal and side faces in video clips. By learning Haar-like features of human faces and boosting the learning accuracy with the InfoBoost algorithm, our algorithm can detect frontal faces in video clips. We map these Haar-like features to a 3D model to create a classifier that can detect both frontal and side faces. Since it is costly to detect and track faces using the 3D model, we project Haar-like features from the 3D model to a 2D space in order to generate various face orientations. By using them, we can detect even side faces in real time without learning frontal faces and side faces separately.

Title:
PHOTOGENIC FACIAL EXPRESSION DISCRIMINATION
Author(s):
Luana Bezerra Batista, Herman Martins Gomes and João Marques de Carvalho
Abstract:
Facial Expression Recognition Systems (FERS) are usually applied to human-machine interfaces, enabling the utilization of services that require a good identification of the emotional state of the user. This paper presents a new view of the facial expression recognition problem, by addressing the question of whether or not it is possible to classify previously labeled photogenic and non-photogenic face images, based on their appearance. A Multi-Layer Perceptron (MLP) is trained with PCA representations of the face images to learn the relationships between facial expressions and the concept of a good photograph of the face of a person. In the experiments, the generalization performances using MLP and Support Vector Machines (SVM) were analyzed. The results have shown that PCA representations combined with MLP represent a promising approach to the problem.

Title:
FACIAL EXPRESSION RECOGNITION BASED ON FACIAL MUSCLES BEHAVIOR ESTIMATION
Author(s):
Saki Morita and Kuniaki Uehara
Abstract:
Recent development in multimedia urges the need for an engineering study of the human face in communication media and man-machine interface. In this paper, we introduce a method not only for recognizing facial expression and human emotion, but for extracting rules from them as well. Facial data can be obtained by considering the relative position of each feature point in time series. Our approach estimates the behavior of muscles of facial expression from these data, and evaluates it to recognize facial expressions. In the recognition process, essential parameters that cause visible change of the face are extracted by estimating the force vectors of points on the face. The force vectors are calculated from displacements of points on the face by using FEM (Finite Element Method). To compare the multi-streams of force vectors of each facial expression effectively, a new similarity metric AMSS (Angular Metrics for Shape Similarity) is proposed. Finally, experiments of recognition of facial expressions show that usable results are achieved even with few testees in our approach and variable rule corresponding AUs can be detected.

Title:
ON COLOUR SPACES AND ON COLOUR PERCEPTION - Independence between uniques and chromatic circularity
Author(s):
Alfredo Restrepo Palacios
Abstract:
The colour space one uses has a bearing on the type of colour image processing tasks one does. As we approach the stage of colour processing in image processing, new colour spaces may be needed: spaces that model properties of our perception of colour. We propose two nonlinear tridimensional transformations of the variables of the RGB colour space. In the proposed spaces, which are based on the RGB space, “pure red” and “pure green” do not imply the presence of yellow. In one of the spaces, as the wavelength variable sweeps the visible spectrum, a circle is obtained, making explicit a circularity of chromaticity for spectral colours. Since there is evidence of S input into the parvo system, we use a dimension called violet minus green.

Title:
DYNAMIC FACIAL EXPRESSION UNDERSTANDING BASED ON TEMPORAL MODELLING OF TRANSFERABLE BELIEF MODEL
Author(s):
Zakia Hammal
Abstract:
In this paper we present a novel approach for dynamic facial expression classification. This work continues our previous work on static facial expression classification based on the Transferable Belief Model (TBM). The system is able to recognize pure facial expressions as well as mixtures of them (Joy, Surprise, Disgust and Neutral) and to deal with all facial feature configurations which do not correspond to any of the cited expressions (Unknown expressions). Here we present a major improvement of this former work consisting in the introduction of the temporal evolution of the facial feature behavior during a facial expression sequence. The temporal information is introduced first to improve the robustness of the frame-by-frame classification by correcting errors due to the automatic segmentation process. Secondly, a facial expression is the result of a dynamic and progressive combination of facial feature behaviors which is not always synchronous, so a frame-by-frame classification is not sufficient. Introducing the temporal information inside the TBM fusion framework allows this problem to be tackled. The recognition is accomplished by combining all facial feature behaviors between the beginning and the end of an expression sequence independently of their chronological order; the final decision is then taken on the whole sequence. Consequently the recognition becomes more robust and accurate. Experimental results on our database demonstrate the improvement on the frame-by-frame facial expression classification and the ability to recognize entire facial expression sequences. Finally, the system is able to automatically display rich and detailed information on the facial feature behaviors during an expression sequence.

Title:
3D REGISTRATION AND MODELLING FOR FACE RECOGNITION
Author(s):
Li Bai and Yi Song
Abstract:
This paper presents a new approach to automatic model-based face recognition from three-dimensional (3D) unstructured point clouds. By applying a non-iterative registration technique, we transform each point cloud to a canonical position. Unlike the iterative ICP algorithm, our non-iterative registration process is scale invariant. An efficient B-spline surface-fitting technique is developed to represent 3D faces in a way that allows comparison. A novel knot vector standardisation algorithm developed allows one-to-one mapping from the object space to a parameter space. Consequently, correspondence between objects is established based on shape descriptors, which can be incorporated into recognition algorithms. We demonstrate the use of these descriptors in the implementation of a distance metric based face recognition system.

Title:
SCENE CATEGORIZATION USING LOW-LEVEL VISUAL FEATURES
Author(s):
Ioannis Pratikakis, Basilios Gatos and Stelios C.A. Thomopoulos
Abstract:
In this paper, we have built two binary classifiers for indoor/outdoor and city/landscape categories, respectively. The proposed classifiers consist of robust visual feature extraction that feeds a support vector classification. In the case of indoor/outdoor classification, we combine color and texture information using the first three moments of RGB color space components and the low order statistics of the energy wavelet coefficients from a two-level wavelet pyramid. In the case of city/landscape classification, we combine the first three moments of L*a*b color space components and structural information (line segment orientation). Experimental results show that a high classification accuracy is achieved.
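
The colour part of the feature vector can be sketched as follows. The abstract does not define the three moments, so the sketch assumes a common choice: the mean, the standard deviation and the signed cube root of the third central moment of each channel. The function name and the sample pixels are hypothetical.

```python
import math

def colour_moments(pixels):
    """Return [mean, std, skew] for each channel of a list of 3-channel pixels."""
    features = []
    n = len(pixels)
    for channel in zip(*pixels):
        mean = sum(channel) / n
        variance = sum((v - mean) ** 2 for v in channel) / n
        third = sum((v - mean) ** 3 for v in channel) / n
        # Signed cube root keeps the skew term in the same units as the data.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, math.sqrt(variance), skew])
    return features

# Four hypothetical pixels; the channels could be RGB or L*a*b values.
pixels = [(0, 10, 5), (0, 10, 5), (0, 10, 5), (4, 10, 1)]
features = colour_moments(pixels)  # 9 values: 3 moments x 3 channels
```

The resulting nine-value vector would be concatenated with the texture or line-orientation features before being passed to a support vector classifier.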

Title:
FACIAL PARTS RECOGNITION USING LIFTING WAVELET FILTERS LEARNED BY KURTOSIS-MINIMIZATION
Author(s):
Koichi Niijima
Abstract:
We propose a method for recognizing facial parts using the lifting wavelet filters learned by kurtosis-minimization. This method is based on the following three features of kurtosis: If a random variable has a Gaussian distribution, its kurtosis is zero. If the kurtosis is positive, the respective distribution is supergaussian. The value of kurtosis is bounded below. It is known that the histogram of wavelet coefficients for a natural image behaves like a supergaussian distribution. Exploiting these properties, free parameters included in the lifting wavelet filter are learned so that the kurtosis of lifting wavelet coefficients for the target facial part is minimized. Since this minimization problem is an ill-posed problem, it is solved by employing the regularization method. Facial parts recognition is accomplished by extracting a facial part similar to the target facial part. In simulation, a lifting wavelet filter is learned using the narrow eyes of a female, and the learned lifting filter is applied to facial images of 10 females and 10 males, whose expressions are neutral, smile, anger, and scream, to recognize the eye part.
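
The three kurtosis properties listed in the abstract can be checked numerically with a small sketch. It assumes the excess-kurtosis definition (fourth standardised moment minus 3), which is the definition under which a Gaussian scores zero; the sample sizes, the seed and the Laplacian used as a supergaussian example are choices made for illustration.

```python
import random

def excess_kurtosis(samples):
    """Fourth standardised moment minus 3, so a Gaussian scores about zero."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2) - 3.0

rng = random.Random(0)
# Gaussian samples: excess kurtosis close to zero.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
# Laplacian samples (difference of two exponentials): supergaussian, positive kurtosis.
laplacian = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(20000)]
# Symmetric two-point data attains the lower bound of -2.
two_point = [-1.0, 1.0] * 100

k_gauss = excess_kurtosis(gaussian)
k_laplace = excess_kurtosis(laplacian)
k_bound = excess_kurtosis(two_point)
```

Because the measure is bounded below, minimising it over filter parameters is a well-defined objective, although the abstract notes that regularization is still needed.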

Title:
HEAD ORIENTATION AND GAZE DETECTION FROM A SINGLE IMAGE
Author(s):
Jeremy Yirmeyahu Kaminski, Adi Shavit, Dotan Knaan and Mina Teicher
Abstract:
Head orientation is an important part of many advanced human-machine interaction systems. We present a single image based head pose computation algorithm. It is deduced from anthropometric data. This approach allows us to use a single camera and requires no cooperation from the user. Using a single image avoids the complexities associated with a multi-camera system. Evaluation tests show that our approach is accurate, fast and can be used in a variety of contexts. Application to gaze detection, with a working system, is also demonstrated.

Title:
HAND POSTURE DATASET CREATION FOR GESTURE RECOGNITION
Author(s):
Luis Anton-Canalis and Elena Sanchez-Nielsen
Abstract:
This paper introduces a fast and feasible method for the collection of hand gesture samples. Currently, there are no solid reference databases or standards for the evaluation and comparison of developed algorithms in hand posture recognition, and more generally in gesture recognition. These are two important issues that should be solved in order to improve research results. Unlike previous hand image datasets, whose creation usually involves many different people, sceneries and light conditions, we propose a simplified method that requires just a single person's hand being recorded in a controlled light environment. Our method allows the generation of thousands of heterogeneous samples within hours, thus saving time and people's efforts. The resulting dataset has been tested with a cascade classifier, although it may be used by most pattern recognition systems, and compared with a classical dataset obtaining similar results.

Title:
OCCLUSION INVARIANT FACE RECOGNITION USING TWO-DIMENSIONAL PCA
Author(s):
Tae Young Kim, Kyoung Mu Lee and Sang Uk Lee
Abstract:
Subspace analysis methods such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are widely used feature extraction methods for face recognition. However, most of them employ a holistic basis, so that local parts cannot be efficiently represented in the subspace. Therefore, they cannot cope with the occlusion problem. In this paper, we propose a new method using two-dimensional principal component analysis (2D PCA) for occlusion invariant face recognition. In contrast to PCA, 2D PCA is performed by projecting the 2D image directly onto the 2D PCA subspace, and each row of the feature matrix represents the distribution of the corresponding row of the image. Therefore, by classifying each row of the feature matrix independently, we can easily identify the locally occluded parts in the face image. The proposed occlusion invariant face recognition system consists of two steps: occlusion detection and partial matching. To detect occluded regions, we apply a new combined k-NN and 1-NN classifier to each row or block of the feature matrix of the test face. For partial matching, similarity between feature matrices is evaluated after removing the rows identified as the occluded parts. The experimental results on the AR face database demonstrate that the proposed algorithm outperforms other existing approaches.

Area 4 - Motion, Tracking and Stereo Vision

Title:
FACE TRACKING ALGORITHM ROBUST TO POSE, ILLUMINATION AND FACE EXPRESSION CHANGES: A 3D PARAMETRIC MODEL APPROACH
Author(s):
Marco Anisetti, Valerio Bellandi, Luigi Arnone and Fabrizio Beverina
Abstract:
This paper presents a method for tracking a face in a video sequence by recovering the full motion and the expression deformation of the head using a 3D expressive face model. Taking advantage of a 3D triangle-based face model, we are able to deal with any kind of illumination changes and face expression movements. In this parametric model any change can be defined as a linear combination of a set of weighted bases that can easily be included in a minimization algorithm using a classical Newton optimization approach. The 3D model of the face is created using some characteristic face points given on the first frame. Using a gradient descent approach, the algorithm is able to extract simultaneously the parameters related to the face expression, 3D posture and the virtual illumination conditions. The algorithm has been tested on the Kanade-Cohn database (Kanade et al., 2000) for expression estimation, and its precision has been compared with a standard multicamera system for 3D tracking (ELITE2002 System). For the illumination tests, we use both synthetic movies created using standard 3D-mesh animation tools and real experimental videos recorded in very extreme illumination conditions. The results in all cases are promising even with large head movements and changes in expression and illumination conditions. The proposed approach has a twofold application: as part of a facial expression analysis system and as preprocessing for an identification system (expression, pose and illumination normalization).

Title:
DETECTION THRESHOLDING USING MUTUAL INFORMATION
Author(s):
Ciarán Ó Conaire, Noel O'Connor, Eddie Cooke and Alan Smeaton
Abstract:
In this paper, we introduce a novel non-parametric thresholding method that we term 'Mutual-Information Thresholding'. In our approach, we choose the two detection thresholds for two input signals such that the mutual information between the thresholded signals is maximised. Two efficient algorithms implementing our idea are presented: one using dynamic programming to fully explore the quantised search space and the other method using the Simplex algorithm to perform gradient ascent to significantly speed up the search, under the assumption of surface convexity. We demonstrate the effectiveness of our approach in foreground detection (using multi-modal data) and as a component in a person detection system.
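
The core idea, choosing a threshold pair that maximises the mutual information between the two thresholded signals, can be sketched with a plain exhaustive search. This brute-force loop is a stand-in chosen for brevity; it is not the dynamic programming or Simplex-based search described in the abstract. The function names, the use of observed signal values as candidate thresholds and the sample signals are assumptions.

```python
import math

def mutual_information(a_bits, b_bits):
    """Mutual information (in bits) between two equal-length binary sequences."""
    n = len(a_bits)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            joint = sum(1 for x, y in zip(a_bits, b_bits) if x == a and y == b) / n
            pa = sum(1 for x in a_bits if x == a) / n
            pb = sum(1 for y in b_bits if y == b) / n
            if joint > 0:
                mi += joint * math.log2(joint / (pa * pb))
    return mi

def mi_thresholds(signal_a, signal_b):
    """Try every pair of observed values as thresholds; keep the pair with the highest MI."""
    best = (None, None, -1.0)
    for ta in sorted(set(signal_a)):
        a_bits = [1 if v > ta else 0 for v in signal_a]
        for tb in sorted(set(signal_b)):
            b_bits = [1 if v > tb else 0 for v in signal_b]
            mi = mutual_information(a_bits, b_bits)
            if mi > best[2]:
                best = (ta, tb, mi)
    return best

# Hypothetical detection scores from two modalities over the same eight pixels.
signal_a = [0.1, 0.2, 0.15, 0.9, 0.8, 0.85, 0.1, 0.95]
signal_b = [1.0, 2.0, 1.5, 7.0, 8.0, 9.0, 2.0, 8.0]
threshold_a, threshold_b, best_mi = mi_thresholds(signal_a, signal_b)
```

The search cost grows with the product of the numbers of candidate thresholds, which is the motivation for the faster search strategies the abstract mentions.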

Title:
A BACKGROUND MODELLING ALGORITHM BASED ON ENERGY EVALUATION
Author(s):
Paolo Spagnolo, Tiziana D’Orazio, Marco Leo, Nicola Mosca and Massimiliano Nitti
Abstract:
Detecting moving objects is very important in many application contexts such as people detection, visual surveillance, automatic generation of video effects, and so on. The first and fundamental step of all motion detection algorithms is background modeling. The goal of the methodology proposed here is to create a background model substantially independent of any hypothesis about the training phase, such as the presence of moving persons, moving background objects, and changing (sudden or gradual) light conditions. We propose an unsupervised approach that combines the results of temporal analysis of pixel intensity with a sliding window procedure to preserve the model from the presence of foreground moving objects during the building phase. Moreover, a multilayered approach has been implemented to handle small movements in background objects. The algorithm has been tested in many different contexts, in both indoor and outdoor environments. Finally, it has been tested even on the CAVIAR 2005 dataset.

Title:
DEVELOPMENT OF A COMPUTER PLATFORM FOR OBJECT 3D RECONSTRUCTION USING COMPUTER VISION TECHNIQUES
Author(s):
Teresa Azevedo, João Manuel R. S. Tavares and Mário A. Vaz
Abstract:
In this paper we intend to describe the development of a Computer Platform whose goal is to recover the three-dimensional (3D) structure of a scene or the shape of an object, using Structure From Motion (SFM) techniques. SFM is an Active Computer Vision technique, which needs no contact or energy projection. The main goal of this project is to recover the 3D shape of an object or scene using the movement of the camera(s) or of the object, without imposing any kind of restriction on it. Starting with an uncalibrated sequence of images, the referred movement is extracted, as well as the camera(s) calibration, and finally, the 3D geometry of the object or scene is inferred. In brief, the first section of this paper defines the goals; the second presents the computer platform, as well as some experimental results; the third and last section draws the conclusions of the study and work done and, finally, gives some perspectives on future work.

Title:
RECONSTRUCTION OF ELLIPSOIDS ON ROLLERS FROM STEREO IMAGES USING OCCLUDING CONTOURS
Author(s):
Sudanthi Wijewickrema, Andrew Paplinski and Charles Esson
Abstract:
We describe the reconstruction of quadric surfaces, with special attention to ellipsoids, using two different views from calibrated cameras, given that they rest on known objects in space. The technique proposed focuses on speed and efficiency and is suitable for use in resource-constrained environments in real time. We model the quadric in dual space and introduce a method of including application-specific information in the reconstruction. We also discuss a novel and fast way of adjusting the occluding contours to fit the epipolar tangency constraints before the reconstruction. We further apply this to a real-life application where ellipsoidal fruit are modelled in 3D. Then, we analyze the error of fit for the reconstructed quadrics. Although this paper focuses on ellipsoids, it can be easily extended to incorporate the modelling of other non-degenerate quadrics using two occluding contours in dual space.

Title:
COMPUTER VISION BASED INTERFACES FOR INTERACTIVE SIMULATIONS
Author(s):
Ben Ward and Anthony Dick
Abstract:
3D environments are commonplace in applications for simulation, gaming and design. However, interaction with these environments has traditionally been limited by the use of 2D interface devices. This paper explores the use of computer vision to capture the 3D motion of a handheld object by tracking known features. Captured motion is translated into control of an object onscreen, allowing 3D interaction with a rendered environment. Objects are tracked in real-time in video from a single webcam. The technique is demonstrated using two real-time interactive applications.

Title:
IMAGE MATCHING BY RANSAC USING MULTIPLE NON-UNIFORM DISTRIBUTIONS COMPUTED FROM IMAGES
Author(s):
Yasushi Kanazawa and Yoshihiro Ito
Abstract:
We propose an accurate method for establishing point correspondences between two images taken by an uncalibrated stereo system. We explore the case of a scene with multiple planes and detect the homographies of the planes by using a RANSAC-like algorithm. For random sampling in RANSAC, we define three nonuniform sampling weights that are computed from feature points in the images. By introducing these weights, our method can detect more accurate matches than the usual methods. Furthermore, our method can establish the correspondences stably irrespective of whether the scene is far away or not. We demonstrate the effectiveness of our method with real image examples.

Title:
INVESTIGATING THE POTENTIAL COMBINATION OF GPS AND SCALE INVARIANT VISUAL LANDMARKS FOR ROBUST OUTDOOR CROSS-COUNTRY NAVIGATION
Author(s):
Hans J. Andersen, T. L. Dideriksen, C. Madsen and M. B. Holte
Abstract:
Safe, robust operation of an autonomous vehicle in cross-country environments relies on sensing of the surroundings. Thanks to the reduced cost of vision hardware, and increasing computational power, computer vision has become an attractive alternative for this task. This paper concentrates on the use of stereo vision for navigation in cross-country environments. For visual navigation the Scale Invariant Feature Transform, SIFT, is used to locate interest points that are matched between successive stereo image pairs. In this way the ego-motion of an autonomous platform may be estimated by least squares estimation of the interest points in the current and previous frames. The paper investigates the situation where GPS becomes unreliable due to occlusion from, for example, trees. In this case, however, SIFT based navigation has the advantage that it is possible to locate sufficient interest points close to the robot platform for robust estimation of its ego-motion. In contrast, GPS may provide very stable navigation in an open cross-country environment where the interest points from the visual based navigation are sparse and located far from the robot and hence give a very uncertain position estimate. As a result, the paper demonstrates that a combination of the two methods is a way forward for development of robust navigation of robots in a cross-country environment.

Title:
NON-INTRUSIVE TRACKING OF MULTIPLE USERS IN A SPATIALLY IMMERSIVE DISPLAY
Author(s):
Jiyoung Park, Seon-Min Rhee and Myoung-Hee Kim
Abstract:
We present a novel vision-based system for tracking multiple users in a spatially immersive display. Without requiring them to wear any markers or other devices, we can detect and track the heads of several participants. In a projection-based display environment, the lighting conditions make it difficult to extract silhouettes or shape features from acquired images. Using a separate IR lighting and stereo camera system solves the problem, and makes background subtraction simple and fast. We start by finding the general location of the users’ heads in each image, from the silhouettes and projection histogram of the foreground regions. These points are used to create search areas, one in each image of a stereo pair. By cross-correlation between the search areas, corresponding points in each image are identified, and these are used to determine an accurate 3D location on the head. Finally, the search areas in consecutive frames are correlated to maintain the identification of the users over time. Experimental results demonstrate the viability of the proposed system.

Title:
MOTION SEGMENTATION THROUGH FACTORIZATION - Application to Night Driving Assistance
Author(s):
Carme Julià, Joan Serrat, Antonio López, Felipe Lumbreras, Dani Ponsa and Thorsten Graf
Abstract:
Intelligent vehicles are those equipped with sensors and information control systems that can assist human driving. In this context, we address the problem of detecting vehicles at night. The aim is to distinguish vehicles from lamp posts and traffic sign reflections by grouping the blob trajectories according to their apparent motion. We have adapted two factorization techniques, originally designed to estimate the scene structure from motion: the Costeira--Kanade and the Han--Kanade, named after their authors. Results on both vehicle existence in the field of view and motion segmentation are reported.

Title:
3D TRACKING USING 2D-3D LINE SEGMENT CORRESPONDENCE AND 2D POINT MOTION
Author(s):
Woobum Kang and Shigeru Eiho
Abstract:
In this paper, we propose a 3D tracking method which integrates 2D feature tracking. Our tracker searches the 2D-3D correspondences used to estimate the camera pose on the next frame from detected straight edges and the projected 3D-CAD model on the current frame, and tracks the corresponding edges on the consecutive frames. By tracking those edges, our tracker can keep correct correspondences even when large camera motion occurs. Furthermore, when the estimated pose seems incorrect, our tracker reverts to the correspondences of the previous frame and continues tracking the corresponding edges. Then, our tracker estimates the pose on the next frame from those correspondences and can recover the correct pose. Our tracker also detects and tracks corners on the image as 2D feature points, and estimates the camera pose from 2D-3D line segment correspondences and the motions of feature points on the consecutive frames. As a result, our tracker can suppress the influence of incorrect 2D-3D correspondences and can estimate the pose even when the number of detected correspondences is not enough. We also propose an approach which estimates both the camera pose and the correspondences. With this approach, our tracker can estimate the pose and the correspondences on the initial frame of the tracking automatically. From the experimental results, we confirmed that our tracker can work in real time with enough accuracy for various applications even with a less accurate CAD model.

Title:
SURVEILLANCE OF OUTDOOR MOVING TARGETS - Matching Targets using Five Features
Author(s):
Nalin Pradeep S. and Mayur D. Jain
Abstract:
The proposed video surveillance method comprises segmentation of moving targets and tracking of the detected objects through five features of the target object. We introduce moving object segmentation based on a mean and variance background learning model, and subtraction using both color and edge information. The cognitive fusion of color and edge information helps identify foreground objects. The combination of the five features (spatial position, LBW, compactness, orientation and color histogram) through a particle filter approach tracks the segmented objects. These five features help in matching the target tracks during occlusions, merging of targets, and stop-and-go motion in the very challenging environmental (rainy and snowy) conditions shown in the results. Our proposed method provides a solution to common problems related to matching of target tracks. We provide encouraging experimental results calculated on synthetic and real-world sequences to demonstrate the algorithm's performance.
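
A mean and variance background model of the general kind mentioned above can be sketched per pixel as follows. This is only an intensity-based illustration: the class name, the running (Welford) update, the threshold factor `k` and the `min_std` floor are assumptions, and the colour and edge fusion described in the abstract is not modelled.

```python
import math

class PixelBackgroundModel:
    """Running per-pixel mean and variance, updated with Welford's method."""

    def __init__(self, num_pixels):
        self.count = 0
        self.mean = [0.0] * num_pixels
        self.m2 = [0.0] * num_pixels  # running sum of squared deviations

    def learn(self, frame):
        """Fold one background frame (flat list of intensities) into the model."""
        self.count += 1
        for i, value in enumerate(frame):
            delta = value - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (value - self.mean[i])

    def foreground_mask(self, frame, k=3.0, min_std=1.0):
        """Mark pixels deviating from the mean by more than k standard deviations.

        Requires at least one call to learn() beforehand.
        """
        mask = []
        for i, value in enumerate(frame):
            std = max(math.sqrt(self.m2[i] / self.count), min_std)
            mask.append(abs(value - self.mean[i]) > k * std)
        return mask

# Hypothetical three-pixel frames used for background learning.
model = PixelBackgroundModel(num_pixels=3)
for frame in ([100, 50, 200], [102, 50, 198], [98, 50, 202], [100, 50, 200]):
    model.learn(frame)
mask = model.foreground_mask([101, 120, 199])
```

The `min_std` floor keeps pixels that never varied during learning from being flagged by tiny sensor noise.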

Title:
IMPROVING APPEARANCE-BASED 3D FACE TRACKING USING SPARSE STEREO DATA
Author(s):
Fadi Dornaika and Angel D. Sappa
Abstract:
Recently, researchers proposed deterministic and statistical appearance-based 3D head tracking methods which can successfully tackle the image variability and drift problems. However, appearance-based methods dedicated to 3D head tracking may suffer from inaccuracies since these methods are not very sensitive to out-of-plane motion variations. On the other hand, the use of dense 3D facial data provided by a stereo rig or a range sensor can provide very accurate 3D head motions/poses. However, this paradigm requires either an accurate facial feature extraction or a computationally expensive registration technique (e.g., the Iterative Closest Point algorithm). In this paper, we improve our appearance-based 3D face tracker by combining an adaptive appearance model with a robust 3D-to-3D registration technique that uses sparse stereo data. The resulting 3D face tracker combines the advantages of both appearance-based trackers and 3D data-based trackers while keeping the CPU time very close to that required by real-time trackers. We provide experiments and performance evaluation which show the feasibility and usefulness of the proposed approach.

Title:
GROWING AGGREGATION ALGORITHM FOR DENSE TWO-FRAME STEREO CORRESPONDENCE
Author(s):
Elisabetta Binaghi, Ignazio Gallo, Chiara Fornasier and Mario Raspanti
Abstract:
This work aims at defining a new method for matching correspondences in stereoscopic image analysis. The salient aspects of the method are an explicit representation of occlusions driving the overall matching process and the use of a neural adaptive technique in disparity computation. In particular, based on the taxonomy proposed by Scharstein and Szeliski, the dense stereo matching process has been divided into three tasks: matching cost computation, aggregation of local evidence and computation of disparity values. Within the second phase a new strategy has been introduced in an attempt to improve reliability in computing disparity. An experiment was conducted to evaluate the solutions proposed. The experiment is based on an analysis of test images including data with a ground truth disparity map.

Title:
REAL-TIME TRACKING FOR VIRTUAL ENVIRONMENTS USING SCAAT KALMAN FILTERING AND UNSYNCHRONISED CAMERAS
Author(s):
Niels Tjørnly Rasmussen, Moritz Störring, Thomas B. Moeslund and Erik Granum
Abstract:
This paper presents a real-time outside-in camera-based tracking system for wireless 3D pose tracking of a user's head and hand in a virtual environment. The system uses four unsynchronised cameras as sensors and passive retroreflective markers arranged in rigid bodies as targets. In order to achieve high update rates and to cope with the unsynchronised data a single-constraint-at-a-time (SCAAT) Extended Kalman Filtering approach is used that recursively integrates measurements as soon as they are available one-at-a-time. Tests show that this approach is more robust to occlusions and provides less noisy pose estimates with a higher update rate than a conventional stereo triangulation approach.

Title:
PEOPLE COUNTING SYSTEM
Author(s):
Raul Feitosa and Priscila Dias
Abstract:
Demand for security and surveillance systems is getting bigger day after day. This work proposes a method that counts people and detects suspicious attitudes via video sequences of areas with moderate people access. A typical application is the security of warehouses during the night, on weekends or at any time when people access is allowed but no load movement is admissible. Specifically it focuses on detecting when a person passing by the environment carries any object belonging to the background away or leaves any object in the background, while only people movement is allowed in the area. In addition, it estimates the number of people on scene. The method consists of performing four main tasks on video sequences: a) background and foreground separation, b) background estimative dynamic update, c) people location and counting, and d) suspicious attitudes detection. The proposed background and foreground separation and background estimative update algorithms deal with illumination fluctuation and shade effects. People location and counting explores colour information and motion coherence. A prototype implementing the proposed method was built for evaluation purpose. Experiments on simulated and real video sequences are reported showing the effectiveness of the proposed approach.

Title:
HUMAN BODY TRACKING FOR PHYSIOTHERAPY VIRTUAL TRAINING
Author(s):
Sara Shafaei and Mohammad Rahmati
Abstract:
In this paper, we introduce a system for patients who have been prescribed physiotherapy treatment. In this personal virtual training system, we employ several markers attached to various points of the human body. The system presents a physiotherapy session to the user; once the user has repeated the session, the video image sequence captured by the system is analyzed and the results are displayed to the user for further instructions. Our design consists of three general stages: detection, tracking, and verification. In the detection stage, our aim is to process the first frame of the image sequence to detect the locations of the markers. To reduce the computational complexity of this stage, detection is performed at a lower scale of a Gaussian pyramid representation. The second stage tracks the markers detected in the first stage. A prediction algorithm is applied in this stage to limit the search for the markers in subsequent frames to the predicted directions. In the verification stage, the trajectory of the markers is compared with the information in the model. Trajectory matching is performed by computing the difference between the smoothed zero-crossing potentials of the captured trajectory and those of the model.

Title:
HUMAN BODY TRACKING BASED ON PROBABILITY EVOLUTIONARY ALGORITHM
Author(s):
Shuhan Shenc and Weirong Chen
Abstract:
A novel evolutionary algorithm called the Probability Evolutionary Algorithm (PEA) and a PEA-based method for visual tracking of the human body are presented. PEA is inspired by quantum computation and the Quantum-inspired Evolutionary Algorithm, and it achieves a good balance between exploration and exploitation with very fast computation. An individual in PEA is encoded by the probabilistic compound bit, defined as the smallest unit of information, to give a probabilistic representation. The observation step is used in PEA to obtain the observed states of the individual, and the update operator is used to evolve the individual. In the PEA-based human tracking framework, tracking is treated as a function optimization problem: the aim is to optimize the matching function between the model and the image observation, and PEA is used to perform this optimization. Experiments on synthetic and real image sequences of human motion demonstrate the effectiveness, significance and computational efficiency of the proposed human tracking method.

Title:
MULTILIGHTTRACKER: VISION BASED MULTI OBJECT TRACKING ON SEMI-TRANSPARENT SURFACES
Author(s):
Jesper Nielsen and Kaj Grønbæk
Abstract:
This paper describes MultiLightTracker - a simple and robust system for simultaneous tracking of multiple objects on 2D semi-transparent surfaces. We describe how the system facilitates object tracking on a semi-transparent surface which can be back projected, allowing direct single- or multi-user interaction with the projected content. The system is vision based and runs in both 4:3 and 16:9 picture formats. MultiLightTracker currently tracks four different objects simultaneously in real time (~100 ms); the aim is to increase this number, although performance also depends on the number of tracked objects. In controlled environments such as meeting rooms, MultiLightTracker is sufficiently robust for everyday collaborative use. Thus MultiLightTracker is superior to existing multi-object tracking surfaces with regard to its easy availability, simplicity and comparatively low cost.

Title:
3D RECONSTRUCTION METHODS, A SURVEY
Author(s):
Julius Butime, Dr. Iñigo Gutierrez, Luis Galo Corzo and Carlos Flores Espronceda
Abstract:
3D reconstruction technologies have evolved over the years. In this paper we highlight the evolution of the scanning technologies. The idea of a survey arose from our decision to look at how 3D reconstruction methods have developed. Little has been written about the methods as a whole, yet many developments have taken place in this area. This survey should prove useful for those intending to embark on research in 3D reconstruction technologies. It covers the major reconstruction methods: laser triangulation, stereoscopy, conoscopic holography and interferometry. A review of the major producers of scanning technology for 3D reconstruction is also carried out.

Title:
REAL-TIME LIPTRACKING FOR SYNTHETIC FACE ANIMATION WITH FEEDBACK LOOP
Author(s):
Franck Luthon and Brice Beaumesnil
Abstract:
This article deals with facial segmentation and liptracking with feedback control for real-time animation of a synthetic 3D face model. Straightforward approaches consist of two successive steps: video analysis, then synthesis. Our approach departs from these in that we build a global analysis/synthesis processing loop, where the image analysis needs the 3D synthesis and conversely. A first facial segmentation is computed, according to which the 3D face model is positioned. Then the feedback loop, implemented from the 3D animated model back to the input pixel segmentation algorithm, helps to correct the few bad segmentation points, which are detected by measuring the distance between lip contour points and the corresponding 3D face model points. When the measured distance is too large, we re-enter the initial segmentation process and zoom in on a few regions of interest, where the algorithm is run again with a new set of tuning parameters better suited to the neighborhood context. In this way, the face segmentation is refined in order to extract more precise parameters. This approach is inspired by control systems theory with feedback loops. The contribution of the paper is to use simple image processing techniques while improving segmentation through the feedback loop. Results show that real-time and robust performance is achievable under real-world conditions, which are two key issues for face and lip tracking applications.

Title:
HUMAN POSTURE TRACKING AND CLASSIFICATION THROUGH STEREO VISION
Author(s):
Stefano Pellegrini and Luca Iocchi
Abstract:
The ability to detect human postures is highly relevant for applications related to the analysis of human behaviour. Techniques for posture detection and classification can thus be very important in several fields, such as ambient intelligence, surveillance and elderly care. This problem has been studied in recent years in the computer vision community, but proposed solutions still suffer from limitations due to the difficulty of dealing with complex scenes (e.g., occlusions and different viewpoints). In this paper we present a system for posture tracking and classification that uses a stereo vision sensor, which provides both a robust way to segment and track people in the scene and 3D information about the tracked people. The presented method uses a 3D model of the human body, performs model matching through a variant of the ICP algorithm, and then uses a Hidden Markov Model to model posture transitions. Experimental results show the effectiveness of the system in determining human postures in the presence of partial occlusions and from different viewpoints.

Title:
STEREO VISION-BASED DETECTION OF MOVING OBJECTS UNDER STRONG CAMERA MOTION
Author(s):
Hernán Badino, Uwe Franke, Clemens Rabe and Stefan Gehrig
Abstract:
The visual perception of independent 3D motion from a moving observer is one of the most challenging tasks in computer vision. This paper presents a powerful fusion of depth and motion information for image sequences. For a large number of points, 3D position and 3D motion are estimated simultaneously by means of Kalman filters. The necessary ego-motion is computed from the points that are identified as static. The result is a real-time system that is able to detect independently moving objects even if the observer's own motion is far from planar. The information provided by this system is suited as input to high-level perception systems that carry out cognitive processes such as autonomous navigation or collision avoidance.

Title:
SUBPIXEL VISUAL TRACKING BASED ON ADAPTIVE STRATEGIES
Author(s):
Héctor Barrón, Janeth Cruz and Leopoldo Altamirano
Abstract:
Several visual tracking applications need better accuracy to perform a more reliable analysis of the objects in the scene. However, they face environments with varying atmospheric conditions, and object dynamics can affect tracking over time. In this work, a tracking method with subpixel measurements was developed, which enhances the quality of the state estimate of the object. The proposed scheme is robust in scenes with occlusions and changes in the appearance of the target. The target model is adapted to size changes of the object, avoiding the aperture problem and the integration of false information. The dynamic state of the object is estimated, and it is also possible to estimate the object's aspect over time. Each pixel is modeled by a random variable, because the set of pixels represents the non-observable surface of the target and each pixel can be affected by noise. This assumption allows the design of a gradual scheme for model updating. Subpixel precision in tracking is based on an iterative method that uses the similarity surface between the target model and the current image of the object.

Title:
REAL-TIME LABEL INSERTION IN LIVE VIDEO THROUGH ONLINE TRIFOCAL TENSOR ESTIMATION
Author(s):
Robert Laganière and Johan Gottin
Abstract:
We present an augmented reality application that can supplement a live video sequence with virtual labels associated with the scene content captured by an agile video camera moving inside an explored environment. The proposed method is composed of two main phases: first, a matching phase in which reference images are successively compared with the captured images; and second, a tracking phase that aims at maintaining the correspondence between a successfully matched reference image and each frame of a captured sequence. Label insertion is based on projective transfer using the trifocal tensor, which is estimated and continuously updated as the camera moves inside the scene.

Title:
STRATEGIES FOR FAST TRUE MOTION BLOCK MATCHING
Author(s):
Hendrik van der Heijden, Fabian Wenzel and Rolf-Rainer Grigat
Abstract:
Block matching is a widely used method for fast motion estimation. Although it uses a very simple motion model that does not fit most real-world video material, many motion-compensating video compression algorithms use block matching because of its speed. Applications based on true motion vector estimates often use an optical flow algorithm instead, because they need higher accuracy and accept the increased computing time. This paper presents a modified block matching algorithm suitable for true motion applications. A modified full search is used on a cost function consisting of the sum of absolute differences (SAD) and a vector field smoothing term. Several strategies, such as search center prediction, spiral search, early search termination and multilevel successive elimination, are implemented to keep the computational demand low. In this way, high-quality estimates can be computed in real time.
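
As a rough illustration of a cost function that combines SAD with a smoothing term, the sketch below runs a plain full search over a small window. It is a minimal example under stated assumptions, not the authors' algorithm: the smoothing term is taken to be an L1 penalty on the deviation from a predicted vector `pred`, weighted by `lam`, and none of the speed-up strategies listed in the abstract are implemented. The function names and parameters are hypothetical.

```python
def sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block in `curr` and the
    block displaced by (dy, dx) in `prev` (images are lists of rows)."""
    return sum(
        abs(curr[top + r][left + c] - prev[top + dy + r][left + dx + c])
        for r in range(size)
        for c in range(size)
    )


def match_block(prev, curr, top, left, size=4, radius=3, pred=(0, 0), lam=2.0):
    """Full search for the displacement minimising SAD + lam * |v - pred|_1.

    `pred` stands in for a vector predicted from neighbouring blocks and
    `lam` weights the smoothing penalty; both are illustrative assumptions.
    """
    height, width = len(prev), len(prev[0])
    best_vec, best_cost = None, float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates whose block would fall outside the frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            smooth = lam * (abs(dy - pred[0]) + abs(dx - pred[1]))
            cost = sad(prev, curr, top, left, dy, dx, size) + smooth
            if cost < best_cost:
                best_vec, best_cost = (dy, dx), cost
    return best_vec, best_cost
```

Raising `lam` pulls the result towards the predicted vector, which trades matching error for a smoother vector field. That trade-off is what distinguishes true-motion estimation from purely SAD-driven matching.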

Title:
PERFORMANCE OF ADAPTIVE TRACKING ALGORITHMS
Author(s):
Janeth Cruz, Leopoldo Altamirano and Josué Pedroza
Abstract:
This paper compares the performance of adaptive trackers based on multiple algorithms. The aim of using multiple algorithms is to increase the robustness of the trackers under varying conditions. We apply two estimation algorithms, UKF and IMM, to measure tracking performance on outdoor scenes with occlusions. The purpose of this paper is to measure and evaluate the reliability of a tracker in determining the position of a target. The performance is evaluated using metrics related to the truth track. We give a positional evaluation and statistical values of the performance of visual tracking systems that adapt to changing environments.

Title:
MOTION TRACKING WITH REFLECTIONS - 3D pointing device with self-calibrating mirror system
Author(s):
Shinichi Fukushige and Hiromasa Suzuki
Abstract:
We propose a system that uses a camera and a mirror to input the behaviour of a pointer in 3D space. Using direct and reflected images of the pointer obtained from a single directional camera input, the system computes the 3D positions and the normal vector of the mirror simultaneously. Although the system can only input the "relative positions" of the pointer, that is, 3D locations without a scale factor, calibration of the mirror orientation is not needed. Thus, the system presents a very simple and inexpensive way of implementing an interaction device.

Title:
COMPARISON OF MATCHING STRATEGIES FOR COLOUR IMAGES
Author(s):
Bogusław Cyganek and Łukasz Socha
Abstract:
The paper addresses the ubiquitous problem of matching color images. Color plays a very important role in the human visual system, and the question arises of how it can influence image matching in computer-based vision systems. In this paper, area-based matching methods are investigated. Several matching cost functions and different color spaces (RGB, HSI, YCrCb) are examined. The results obtained for color are compared with monochromatic methods. The quality of dense disparity maps was verified in two ways: by the number of points rejected after cross-checking, and by the PSNR value between the original reference image and its reconstruction from the second reference image and the disparity map. The main objective of the research was to weigh the benefits and drawbacks of using color information for matching against the inevitable cost of processing a greater amount of data.
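
The two quality measures named above can be sketched in a few lines. This is an illustrative example only, assuming 8-bit greyscale images stored as lists of rows, and a disparity convention in which a left-image pixel at column x with disparity d matches the right-image pixel at column x - d. The function names and the tolerance parameter `tol` are hypothetical.

```python
import math


def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared = [
        (a - b) ** 2
        for row_a, row_b in zip(reference, reconstruction)
        for a, b in zip(row_a, row_b)
    ]
    mse = sum(squared) / len(squared)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def cross_check_row(disp_left, disp_right, tol=0):
    """Count pixels on one scanline whose left-to-right and right-to-left
    disparities disagree by more than `tol`."""
    rejected = 0
    for x, d in enumerate(disp_left):
        xr = x - d  # matching column in the right image (assumed convention)
        if xr < 0 or xr >= len(disp_right) or abs(disp_right[xr] - d) > tol:
            rejected += 1
    return rejected
```

A lower rejection count and a higher PSNR both indicate a more consistent disparity map, so the two figures can be read side by side when comparing color spaces or cost functions.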

Title:
REALTIME LOCALIZATION OF A CENTRAL CATADIOPTRIC CAMERA USING VERTICAL LINES
Author(s):
Bertrand Vandeportaele, Michel Cattoen and Philippe Marthon
Abstract:
Catadioptric sensors are used in mobile robot localization because of their panoramic field of view. However, most existing systems require a constant camera orientation and planar motion, and thus localization cannot be achieved for persons. In this paper, we use the images of the vertical lines of an indoor environment to determine, in realtime, the orientation and 2D position of a central catadioptric camera. The pose detection is done in two steps. First, the two-axis absolute rotation that brings the vertical line images into a vertical position on the viewing sphere is computed. Then the 2D pose is estimated using a 2D map of the site.

Title:
VISION-BASED TRACKING SYSTEM FOR HEAD MOTION CORRECTION IN FMRI IMAGES
Author(s):
Tali Lerner, Moshe Gur and Ehud Rivlin
Abstract:
This paper presents a new vision-based system for motion correction in functional-MRI experiments. fMRI is a popular technique for studying brain functionality by utilizing MRI technology. In an fMRI experiment a subject is required to perform a task while his brain is scanned by an MRI scanner. In order to achieve a high quality analysis the fMRI slices should be aligned. Hence, the subject is requested to avoid head movements during the entire experiment. However, due to the long duration of such experiments head motion is practically unavoidable. Most of the previous work in this field addresses this problem by extracting the head motion parameters from the acquired MRI data. Therefore, these works are limited to relatively small movements and may confuse head motion with brain activities. In the present work the head movements are detected by a system comprised of two cameras that monitor a specially designed device worn on the subject's head. The system does not depend on the acquired MRI data and therefore can overcome large head movements. Additionally, the system can be extended to cope with inter-block motion and can be integrated into the MRI scanner for real-time update of the scan-planes. The performance of the proposed system was tested in a laboratory environment and in fMRI experiments. It was found that high accuracy is obtained even when facing large head movements.

Title:
OPTICAL FLOW TO ANALYSE STABILISED IMAGES OF THE BEATING HEART
Author(s):
Martin Gröger and Gerd Hirzinger
Abstract:
An optical flow method is developed to analyse the motion of the beating heart surface and the efficacy of strategies to stabilise this motion. Although reduced by mechanical stabilisers, residual tissue motion still makes safe surgery difficult and time consuming. Compensation of this movement is therefore highly desirable. Images of the heart surface, viewed by a video laparoscope, can be further stabilised based on motion information gained from tracking natural landmarks in realtime. The remaining motion on the heart surface is assessed by a specially developed optical flow approach: it estimates the image velocities based on a robust region-based strategy and provides a reliable measure of the motion field of the heart. The analysis shows that tissue motion can be further reduced by a global motion correction strategy while local motion differences remain.

Title:
SWARMTRACK: A PARTICLE SWARM APPROACH TO VISUAL TRACKING
Author(s):
Luis Antón-Canalís, Elena Sánchez-Nielsen and Mario Hernández-Tejera
Abstract:
A new approach to solving the object tracking problem is proposed using a Swarm Intelligence metaphor. It is based on a prey-predator scheme with a swarm of predator particles defined to track a herd of prey pixels using the intensity of its flavours. The method is described, including the definition of predator particles’ behaviour as a set of rules in a Boids fashion. Object tracking behaviour emerges from the interaction of individual particles. The paper includes experimental evaluations with video streams that illustrate the robustness and efficiency of the method for real-time vision-based tasks on a general-purpose computer.

Title:
ROBUST CAMERA MOTION ESTIMATION IN VIDEO SEQUENCES
Author(s):
Xiaobo An, Xueying Qin, Guofeng Zhang, Wei Chen and Hujun Bao
Abstract:
Camera motion estimation of video sequences requires robust recovery of camera parameters and is a cumbersome task concerning arbitrarily complex scenes in video sequences. In this paper, we present a novel algorithm for robust and accurate estimation of camera motion. We insert a virtual frame between each pair of consecutive frames, through which the in-between camera motion is decomposed into two separate components, i.e., pure rotation and pure translation. Given matched feature points between two frames, one point set corresponding to the far scene is chosen, which is used to estimate initial camera motion. We further refine it recursively by a non-linear optimizer, yielding the final camera motion parameters. Our approach achieves accurate estimation of camera motion and avoids instability of camera tracking. We demonstrate high stability, accuracy and performance of our algorithm with a set of augmented reality applications based on acquired video sequences.

Title:
LOCAL MINIMUM DISTANCE FOR THE DENSE DISPARITY ESTIMATION
Author(s):
Eric Alvernhe, Philippe Montesinos, Stefan Janaqi and Min Tang
Abstract:
This paper presents a new algorithm to solve the problem of dense disparity map estimation in stereo vision. Our method is an iterative process inspired by partial differential equations. A new criterion, based on the distance to a local minimum of a similarity measure, is used as the attachment term. Our iterative process is heuristic. Nevertheless, we are able to interpret this algorithm as presenting both combinatorial and continuous characteristics. The quality and precision of the results obtained by our method, both on image benchmarks and on real data, clearly demonstrate the validity of this approach.

Title:
A DIFFERENTIAL GEOMETRIC APPROACH FOR VISUAL NAVIGATION IN INDOOR SCENES
Author(s):
Luis Fuentes, Margarita Gonzalo-Tasis, G. Bermudez and Javier Finat
Abstract:
Visual perception of the environment provides a detailed scene representation that helps to improve motion planning and obstacle-avoidance navigation for wheelchairs in non-structured indoor scenes. In this work we develop a mobile representation of the scene based on perspective maps for automatic navigation in the absence of prior information about the scene. Images are captured with a passive low-cost video camera. The main feature for visual navigation in this work is a map of quadrilaterals with apparent motion. From this mobile map, perspective maps are updated following hierarchical grouping in quadrilateral maps given by pencils of perspective lines through vanishing points. Egomotion is interpreted in terms of maps of mobile quadrilaterals. The main contributions of this paper are the introduction of Lie expansion/contraction operators for quadrilaterals/cuboids and the adaptation of Kalman filtering for moving quadrilaterals to estimate and predict the egomotion of a mobile platform. Our approach is sufficiently modular and flexible to adapt to indoor and outdoor scenes, provided that at least four homologous cuboids are present in the scene between each pair of sampled views of a video sequence.