Digital Image Processing Projects – ElysiumPro


Image Processing Projects

CSE Projects, ECE Projects
Description
Image processing means processing images using mathematical algorithms. ElysiumPro provides a comprehensive set of reference-standard algorithms and workflow processes for students to implement image segmentation, image enhancement, geometric transformation, and 3D image processing for research.
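
For a flavor of the three operations just named, here is a minimal Python sketch using scikit-image (an illustrative example on a stock test image, not code from any listed project):

    # Minimal sketch: segmentation, enhancement, and a geometric
    # transformation with scikit-image. Illustrative only.
    from skimage import data, exposure, filters, transform

    image = data.camera()                              # stock grayscale image

    # Image enhancement: histogram equalization for better contrast
    enhanced = exposure.equalize_hist(image)

    # Image segmentation: global Otsu threshold -> binary foreground mask
    mask = enhanced > filters.threshold_otsu(enhanced)

    # Geometric transformation: rotate the image by 30 degrees
    rotated = transform.rotate(enhanced, angle=30)

    print(image.shape, mask.mean(), rotated.shape)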
Download Project List

Quality Factor

  • 100% Assured Results
  • Best Project Explanation
  • Tons of References
  • Cost Optimized
  • Control Panel Access


1. A Multi-Classifier System for Automatic Mitosis Detection in Breast Histopathology Images Using Deep Belief Networks
Mitotic count is an important diagnostic factor in breast cancer grading and prognosis. Detection of mitosis in breast histopathology images is very challenging, mainly due to diffused intensities along object boundaries and shape variation across the different stages of mitosis. This paper demonstrates an accurate technique for detecting mitotic cells in Hematoxylin and Eosin stained images by step-by-step refinement of the segmentation and classification stages. A Krill Herd Algorithm-based localized active contour model precisely segments cell nuclei from the background stroma. A deep belief network-based multi-classifier system then classifies the labeled cells into mitotic and non-mitotic groups. The proposed method has been evaluated on the MITOS data set provided for the MITOS-ATYPIA contest 2014 and on clinical images obtained from the Regional Cancer Centre (RCC), Thiruvananthapuram, a pioneer institute for cancer diagnosis and research in India. The algorithm provides improved performance compared with other state-of-the-art techniques, with an average F-score of 84.29% for the MITOS data set and 75% for the clinical data set from RCC.

2. Deep Representation-Based Feature Extraction and Recovering for Finger-Vein Verification

Finger-vein biometrics has been extensively investigated for personal verification. Despite recent advances in finger-vein verification, current solutions completely depend on domain knowledge and still lack the robustness to extract finger-vein features from raw images. This paper proposes a deep learning model to extract and recover vein features using limited a priori knowledge. First, based on a combination of the known state-of-the-art handcrafted finger-vein image segmentation techniques, we automatically identify two regions: a clear region with high separability between finger-vein patterns and background, and an ambiguous region with low separability between them. The first is associated with pixels on which all the above-mentioned segmentation techniques assign the same segmentation label (either foreground or background), while the second corresponds to all the remaining pixels. This scheme is used to automatically discard the ambiguous region and to label the pixels of the clear region as foreground or background. A training data set is constructed based on the patches centered on the labeled pixels. Second, a convolutional neural network (CNN) is trained on the resulting data set to predict the probability of each pixel of being foreground (i.e., vein pixel), given a patch centered on it. The CNN learns what a finger-vein pattern is by learning the difference between vein patterns and background ones. The pixels in any region of a test image can then be classified effectively. Third, we propose another new and original contribution by developing and investigating a fully convolutional network to recover missing finger-vein patterns in the segmented image. The experimental results on two public finger-vein databases show a significant improvement in terms of finger-vein verification accuracy.
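
The agreement-based labeling scheme described above is easy to sketch (a simplified illustration with hypothetical inputs, not the paper's code): pixels on which all handcrafted segmenters agree form the clear region and supply training labels, and all remaining pixels are marked ambiguous and discarded.

    import numpy as np

    # seg_maps: list of boolean maps (H, W), one per handcrafted
    # finger-vein segmentation technique (hypothetical inputs).
    def label_clear_region(seg_maps):
        stack = np.stack(seg_maps, axis=0)
        all_fg = stack.all(axis=0)             # every method says vein
        all_bg = (~stack).all(axis=0)          # every method says background
        labels = np.full(stack.shape[1:], -1, dtype=np.int8)  # -1 = ambiguous
        labels[all_fg] = 1                     # clear foreground pixels
        labels[all_bg] = 0                     # clear background pixels
        return labels

Training patches for the CNN would then be cropped around pixels labeled 0 or 1.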


3. Modified Classification and Regression Tree for Facial Expression Recognition Using Difference Expression Images

This study presents a modified classification and regression tree (M-CRT) framework based on difference expression images to address the facial expression recognition (FER) problem. The authors first obtain facial expressional details by calculating the difference between images of basic expressions and images of the neutral expression, which reflects information irrelevant to identity. Local binary patterns and the supervised descent method are used to obtain the global and local features, respectively, from the difference expression images. The M-CRT model is developed for FER, using recursive segmentation to find the best classification decision according to the attributes of the global and local features. Compared with traditional methods, M-CRT can simultaneously maximise intra-class purity and inter-class distance, which improves the discriminating power of the classification. Experimental results on the Japanese Female Facial Expression and CK+ databases verify the effectiveness of the method.


4. Feature Sensitive Label Fusion with Random Walker for Atlas-Based Image Segmentation

In this paper, a novel label fusion method is proposed for brain magnetic resonance image segmentation. The label fusion is formulated on a graph, which embraces both label priors from atlases and anatomical priors from the target image. To represent a pixel comprehensively, three kinds of feature vectors are generated: intensity, gradient, and structural signature. To select candidate atlas nodes for fusion, a randomized k-d tree with a spatial constraint is introduced as an efficient approximation for high-dimensional feature matching, rather than exact searching. A feature sensitive label prior (FSLP), which takes both the consistency and the variety of different features into consideration, is proposed to gather atlas priors. As FSLP is a non-convex problem, a heuristic approach is further designed to solve it efficiently. Moreover, based on anatomical knowledge, some of the target pixels are employed as graph seeds to assist the label fusion process, and an iterative strategy is utilized to gradually update the label map. Comprehensive experiments carried out on two publicly available databases demonstrate that the proposed method obtains better segmentation quality.


5. Ship Detection from Optical Satellite Images Based on Saliency Segmentation and Structure-LBP Feature

Automatic ship detection from optical satellite imagery is a challenging task due to cluttered scenes and variability in ship sizes. This letter proposes a detection algorithm based on saliency segmentation and the local binary pattern (LBP) descriptor combined with ship structure. First, we present a novel saliency segmentation framework with flexible integration of multiple visual cues to extract candidate regions from different sea surfaces. Then, simple shape analysis is adopted to eliminate obviously false targets. Finally, a structure-LBP feature that characterizes the inherent topology structure of ships is applied to discriminate true ship targets. Experimental results on numerous panchromatic satellite images validate that our proposed scheme outperforms other state-of-the-art methods in terms of both detection time and detection accuracy.
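
For reference, the plain LBP descriptor on which the structure-LBP feature builds can be computed with scikit-image (a generic sketch; the letter's structure-aware variant is more involved):

    import numpy as np
    from skimage.feature import local_binary_pattern

    def lbp_histogram(patch, P=8, R=1.0):
        # 'uniform' LBP produces P + 2 distinct codes per pixel
        codes = local_binary_pattern(patch, P, R, method='uniform')
        hist, _ = np.histogram(codes, bins=np.arange(P + 3), density=True)
        return hist   # normalized code histogram as the texture feature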


6. Disjunctive Normal Parametric Level Set With Application to Image Segmentation

Level set methods are widely used for image segmentation because of their convenient shape representation for numerical computations and capability to handle topological changes. However, in spite of the numerous works in the literature, the use of level set methods in image segmentation still has several drawbacks. These shortcomings include formation of irregularities of the signed distance function, sensitivity to initialization, lack of locality, and expensive computational cost, which increases dramatically as the number of objects to be simultaneously segmented grows. In this paper, we propose a novel parametric level set method called disjunctive normal level set (DNLS), and apply it to both two-phase (single object) and multiphase (multiobject) image segmentations. DNLS is a differentiable model formed by the union of polytopes, which themselves are created by intersections of half-spaces. We formulate the segmentation algorithm in a Bayesian framework and use a variational approach to minimize the energy with respect to the parameters of the model. The proposed DNLS can be considered as an open framework that allows the use of different appearance models and shape priors. Compared with the conventional level sets available in the literature, the proposed DNLS has the following major advantages: it requires significantly less computational time and memory, it naturally keeps the level set function regular during the evolution, it is more suitable for multiphase and local region-based image segmentations, and it is less sensitive to noise and initialization. The experimental results show the potential of the proposed method.
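
The "union of polytopes" representation admits a standard differentiable disjunctive-normal-form relaxation; one common form (our reading of the construction, not a formula quoted from the paper) is

    f(\mathbf{x}) = 1 - \prod_{i=1}^{N} \Big( 1 - \prod_{j=1}^{M} \sigma\big(\mathbf{w}_{ij}^{\top}\mathbf{x} + b_{ij}\big) \Big),
    \qquad \sigma(t) = \frac{1}{1 + e^{-t}},

where each inner product of sigmoids softly approximates an intersection of half-spaces (a polytope) and the outer De Morgan form approximates their union; a level set of f then plays the role of the evolving contour, with the weights and biases as the model parameters.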


7. Segmentation-Based Fine Registration of Very High Resolution Multitemporal Images

In this paper, a segmentation-based approach to fine registration of multispectral and multitemporal very high resolution (VHR) images is proposed. The approach aims at estimating and correcting the residual local misalignment [also referred to as registration noise (RN)] that often affects multitemporal VHR images even after standard registration. The method automatically extracts a set of object representative points associated with regions with homogeneous spectral properties (i.e., objects in the scene). Such points turn out to be distributed all over the considered scene and account for the high spatial correlation of pixels in VHR images. The method then estimates the amount and direction of residual local misalignment for each object representative point by exploiting residual local misalignment properties in a multiple displacement analysis framework. To this end, a multiscale differential analysis of the multispectral difference image is employed to model the statistical distribution of pixels affected by residual misalignment (i.e., RN pixels) and to detect them. The RN is used to perform a segmentation-based fine registration based on both temporal and spatial correlation. Accordingly, the method is particularly suitable for images with a large number of border regions, like VHR images of urban scenes. Experimental results obtained on both simulated and real multitemporal VHR images confirm the effectiveness of the proposed method.


8. Scene Text Detection and Segmentation Based on Cascaded Convolution Neural Networks

Scene text detection and segmentation are two important and challenging research problems in the field of computer vision. This paper proposes a novel method for scene text detection and segmentation based on cascaded convolution neural networks (CNNs). In this method, a CNN-based text-aware candidate text region (CTR) extraction model (named detection network, DNet) is designed and trained using both the edges and the whole regions of text, with which coarse CTRs are detected. A CNN-based CTR refinement model (named segmentation network, SNet) is then constructed to precisely segment the coarse CTRs into text and obtain the refined CTRs. With DNet and SNet, far fewer CTRs are extracted than with traditional approaches, while more true text regions are kept. The refined CTRs are finally classified using a CNN-based CTR classification model (named classification network, CNet) to get the final text regions. All of these CNN-based models are modified from VGGNet-16. Extensive experiments on three benchmark data sets demonstrate that the proposed method achieves state-of-the-art performance and greatly outperforms other scene text detection and segmentation approaches.


9. Weighted Level Set Evolution Based on Local Edge Features for Medical Image Segmentation

Level set methods have been widely used to implement active contours for image segmentation applications due to their good boundary detection accuracy. In the context of medical image segmentation, weak edges and inhomogeneities remain important issues that may hinder the accuracy of any segmentation method based on active contours implemented using level set methods. This paper proposes a method based on active contours, implemented using level set methods, for segmentation of such medical images. The proposed method uses a level set evolution based on the minimization of an objective energy functional whose energy terms are weighted according to their relative importance in detecting boundaries. This relative importance is computed from local edge features collected from the adjacent regions located inside and outside the evolving contour. The local edge features employed are the edge intensity and the degree of alignment between the image's gradient vector flow field and the evolving contour's normal. We evaluate the proposed method for segmentation of various regions in real MRI and CT slices, X-ray images, and ultrasound images. Evaluation results confirm the advantage of weighting energy forces using local edge features to reduce leakage. The results also show that the proposed method leads to more accurate boundary detection than state-of-the-art edge-based level set segmentation methods, particularly around weak edges.


10. Spatiotemporal Strategies for Joint Segmentation and Motion Tracking From Cardiac Image Sequences

Although accurate and robust estimations of the deforming cardiac geometry and kinematics from cine tomographic medical image sequences remain a technical challenge, they have significant clinical value. Traditionally, boundary or volumetric segmentation and motion estimation problems are considered as two sequential steps, even though the order of these processes can be different. In this paper, we present an integrated, spatiotemporal strategy for the simultaneous joint recovery of these two ill-posed problems. We use a mesh-free Galerkin formulation as the representation and computation platform, and adopt iterative procedures to solve the governing equations. Specifically, for each nodal point, the external driving forces are individually constructed through the integration of data-driven edginess measures, prior spatial distributions of myocardial tissues, temporal coherence of image-derived salient features, imaging/image-derived Eulerian velocity information, and cyclic motion model of myocardial behavior. The proposed strategy is accurate and very promising application results are shown from synthetic data, magnetic resonance (MR) phase contrast, tagging image sequences, and gradient echo cine MR image sequences.


11. Hierarchical Image Segmentation Based on Iterative Contraction and Merging

In this paper, we propose a new framework for hierarchical image segmentation based on iterative contraction and merging. In the proposed framework, we treat the hierarchical image segmentation problem as a sequence of optimization problems, with each optimization process realized by a contraction-and-merging step that identifies and merges the most similar data pairs at the current resolution. At the beginning, we perform pixel-based contraction and merging to quickly combine image pixels into initial region-elements with visually indistinguishable intra-region color difference. After that, we iteratively perform region-based contraction and merging to group adjacent regions into larger ones, progressively forming a segmentation dendrogram for hierarchical segmentation. Compared with state-of-the-art techniques, the proposed algorithm not only produces high-quality segmentation results more efficiently but also preserves many boundary details in the segmentation results.


12. Semantic Segmentation of Remote Sensing Imagery Using an Object-Based Markov Random Field Model with Auxiliary Label Fields

The Markov random field (MRF) model has attracted great attention in the field of image segmentation. However, most MRF-based methods fail to resolve segmentation misclassification problems for high spatial resolution remote sensing images because they make insufficient use of hierarchical semantic information. To solve this problem, this paper proposes an object-based MRF model with auxiliary label fields that captures both macro-level and detailed information, and applies it to the semantic segmentation of high spatial resolution remote sensing images. Specifically, apart from the label field, two auxiliary label fields are first introduced into the proposed model for interpreting remote sensing images from different perspectives, implemented by setting different numbers of auxiliary classes. Then, the multilevel logistic model is used to describe the interactions within each label field, and a conditional probability distribution is developed to model the interactions between label fields. A net context structure is established among them to model the interactions of classes within and between label fields. A principled probabilistic inference is suggested to solve the proposed model by iteratively renewing the label field and auxiliary label fields, so that different information from the auxiliary label fields can be integrated into the label field during the iterations. Experiments on different remote sensing images demonstrate that our model produces more accurate segmentation than state-of-the-art MRF-based methods. If some prior information is added, the proposed model can produce accurate results even in complex areas.


13. Superpixel-Based Difference Representation Learning for Change Detection in Multispectral Remote Sensing Images

With the rapid technological development of various satellite sensors, high-resolution remotely sensed imagery has become an important source of data for change detection in land cover transition. However, it is still a challenging problem to effectively exploit the available spectral information to highlight changes. In this paper, we present a novel change detection framework for high-resolution remote sensing images, which incorporates superpixel-based change feature extraction and hierarchical difference representation learning by neural networks. First, highly homogeneous and compact image superpixels are generated using superpixel segmentation, which makes these image blocks adhere well to image boundaries. Second, change features are extracted to represent the difference information using the spectral, textural, and spatial features of corresponding superpixels. Third, motivated by the fact that deep neural networks can learn from data sets with few labeled samples, we use one to learn the semantic difference between changed and unchanged pixels. The labeled data can be selected from the bitemporal multispectral images via a preclassification map generated in advance. A neural network is then built to learn the difference and classify uncertain samples as changed or unchanged. Finally, a robust and high-contrast change detection result is obtained from the network. Experimental results on real data sets demonstrate the effectiveness, feasibility, and superiority of the proposed technique.
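
A compressed sketch of the first two steps (superpixels plus per-superpixel difference features), assuming two co-registered image arrays and using scikit-image's SLIC as a stand-in for the authors' superpixel method:

    import numpy as np
    from skimage.segmentation import slic

    def superpixel_difference(img_t1, img_t2, n_segments=500):
        # Segment the first date; reuse the same labels on both dates so
        # that features are compared over identical regions.
        labels = slic(img_t1, n_segments=n_segments, compactness=10)
        feats = []
        for lab in np.unique(labels):
            m = labels == lab
            f1 = img_t1[m].mean(axis=0)    # mean spectrum, date 1
            f2 = img_t2[m].mean(axis=0)    # mean spectrum, date 2
            feats.append(np.abs(f1 - f2))  # difference feature per superpixel
        return labels, np.array(feats)

The resulting difference features would then feed the preclassification map and the difference representation network.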


14. Unsupervised Linking of Visual Features to Textual Descriptions in Long Manipulation Activities

We present a novel unsupervised framework, which links continuous visual features and symbolic textual descriptions of manipulation activity videos. First, we extract the semantic representation of visually observed manipulations by applying a bottom-up approach to the continuous image streams. We then employ rule-based reasoning to link visual and linguistic inputs. The proposed framework allows robots 1) to autonomously parse, classify, and label sequentially and/or concurrently performed atomic manipulations (e.g., “cutting” or “stirring”), 2) to simultaneously categorize and identify manipulated objects without using any standard feature-based recognition techniques, and 3) to generate textual descriptions for long activities, e.g., “breakfast preparation.” We evaluated the framework using a dataset of 120 atomic manipulations and 20 long activities.


15. A Region-Wised Medium Transmission Based Image Dehazing Method

Image dehazing is a technique to enhance images acquired in poor weather conditions, such as fog and haze. Existing image dehazing methods are mainly based on the dark channel prior. Since the dark channel prior is not valid for sky regions, a sky segmentation and region-wised medium transmission based image dehazing method is proposed in this paper. First, sky regions are segmented by quad-tree-splitting-based feature pixel detection. Then, a medium transmission estimation method for sky regions is proposed based on observation of the color characteristics of sky regions. The medium transmission is then filtered by an edge-preserving guided filter. Finally, based on the estimated medium transmission, the hazy images are restored. Experimental results demonstrate that the performance of the proposed method is better than that of existing methods; the restored image is more natural, especially in the sky regions.
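
As background, the classical dark channel prior baseline that the paper refines can be sketched as follows (textbook formulation with assumed parameter values; the paper's contribution, the region-wised transmission estimate for sky, is not reproduced here):

    import numpy as np
    from scipy.ndimage import minimum_filter

    def dehaze_dark_channel(img, patch=15, omega=0.95, t0=0.1):
        # img: float RGB image scaled to [0, 1]
        dark = minimum_filter(img.min(axis=2), size=patch)   # dark channel
        # atmospheric light A: mean color of the brightest dark-channel pixels
        n = max(1, dark.size // 1000)
        idx = np.unravel_index(np.argsort(dark, axis=None)[-n:], dark.shape)
        A = img[idx].mean(axis=0)
        # transmission estimate, clipped away from zero, then recovery
        t = 1.0 - omega * minimum_filter((img / A).min(axis=2), size=patch)
        t = np.clip(t, t0, 1.0)[..., None]
        return np.clip((img - A) / t + A, 0.0, 1.0)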


16. Adaptive Solitary Pulmonary Nodule Segmentation for Digital Radiography Images Based on Random Walks and Sequential Filter

Solitary pulmonary nodules (SPNs) in digital radiography (DR) images often have unclear contours and infiltration, which makes it challenging for traditional segmentation models to obtain satisfactory results. To overcome this challenge, this paper proposes an adaptive SPN segmentation model for DR images based on random walks segmentation and a sequential filter. First, the SPN image is decomposed to get the cartoon component, which is used to acquire a set of seeds. Second, a seed selection tactic is employed to optimize the scope of walking pixels and reduce the number of seeds, which lowers the computational cost. Finally, we incorporate the sequential filter and construct new representations of the weight and probability matrices. Using a data set of 724 SPN cases, the proposed method was tested and compared with four different models, and five kinds of evaluation indicators were used to assess the segmentation. Experimental results indicate that the proposed method performs well on blurred edges, where it obtains relatively accurate results.


17. Object-Based Multiple Foreground Segmentation in RGBD Video

We present an RGB and Depth (RGBD) video segmentation method that takes advantage of depth data and can extract multiple foregrounds in the scene. This video segmentation is addressed as an object proposal selection problem formulated in a fully-connected graph, where a flexible number of foregrounds may be chosen. In our graph, each node represents a proposal, and the edges model intra-frame and inter-frame constraints on the solution. The proposals are selected based on an RGBD video saliency map in which depth-based features are utilized to enhance the identification of foregrounds. Experiments show that the proposed multiple foreground segmentation method outperforms related techniques, and the depth cue serves as a helpful complement to RGB features. Moreover, our method provides performance comparable to the state-of-the-art RGB video segmentation techniques on regular RGB videos with estimated depth maps.


18. Depth Map Reconstruction for Underwater Kinect Camera Using Inpainting and Local Image Mode Filtering

Underwater optical cameras are widely used for security monitoring in the ocean, for example in earthquake prediction and tsunami warning. Optical cameras recognize objects for autonomous underwater vehicles and provide security protection for sea-floor networks. However, underwater optical imaging faces many issues, such as forward and backward scattering, light absorption, and sea snow. Many underwater image processing techniques have been proposed to overcome these issues; among them, the depth map provides important information for many post-processing applications. In this paper, we propose a Kinect-based underwater depth map estimation method that starts from a coarse depth map captured by Kinect with missing depth information. To overcome the low accuracy of coarse depth maps, we propose a reconstruction architecture that uses an underwater dual-channel-prior dehazing model, weighted enhanced image mode filtering, and inpainting. Our proposed method considers the influence of mud sediments in water and performs better than traditional methods. The experimental results demonstrate that, after inpainting, dehazing, and interpolation, our proposed method can create high-accuracy depth maps.
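
The hole-filling step can be approximated with OpenCV's generic inpainting (a stand-in for the paper's pipeline, which adds dual-channel-prior dehazing and weighted mode filtering):

    import numpy as np
    import cv2

    def fill_depth_holes(depth_u8):
        # depth_u8: 8-bit depth map in which 0 marks missing measurements
        mask = (depth_u8 == 0).astype(np.uint8)
        return cv2.inpaint(depth_u8, mask, inpaintRadius=5,
                           flags=cv2.INPAINT_TELEA)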


19. A Survey of Dictionary Learning Algorithms for Face Recognition

During the past several years, as one of the most successful applications of sparse coding and dictionary learning, dictionary-based face recognition has received significant attention. Although some surveys of sparse coding and dictionary learning have been reported, there is no specialized survey concerning dictionary learning algorithms for face recognition. This paper provides such a survey. To give a comprehensive overview, we not only categorize existing dictionary learning algorithms for face recognition but also present the details of each category. Since the number of atoms has an important impact on classification performance, we also review algorithms for selecting the number of atoms, and we select six typical dictionary learning algorithms with different numbers of atoms for experiments on face databases. In summary, this paper provides a broad view of dictionary learning algorithms for face recognition and advances study in this field, helping readers understand the profile of the subject and grasp its theoretical rationale and potential as well as its applicability to different cases of face recognition.


20. Multi-focus Image Fusion Based on Extreme Learning Machine and Human Visual System



21. Joint Dictionary Learning for Multispectral Change Detection

Change detection is one of the most important applications of remote sensing technology. It is a challenging task due to the obvious variations in the radiometric value of the spectral signature and the limited capability of utilizing spectral information. In this paper, an improved sparse coding method for change detection is proposed. The intuition of the proposed method is that unchanged pixels in different images can be well reconstructed by a joint dictionary, which corresponds to knowledge of unchanged pixels, while changed pixels cannot. First, a query image pair is projected onto the joint dictionary to constitute the knowledge of unchanged pixels. The reconstruction error is then used to discriminate between changed and unchanged pixels in the different images. To select proper thresholds for determining changed regions, an automatic threshold selection strategy is presented that minimizes the reconstruction errors of the changed pixels. Extensive experiments on multispectral data have been conducted, and the results, compared with state-of-the-art methods, prove the superiority of the proposed method. The contributions of the proposed method can be summarized as follows: 1) joint dictionary learning is proposed to explore the intrinsic information of different images for change detection, so that change detection can be cast as a sparse representation problem; to the authors' knowledge, few publications utilize joint dictionary learning in change detection; 2) an automatic threshold selection strategy is presented, which minimizes the reconstruction errors of the changed pixels without any prior assumption on the spectral signature, so the threshold value provided by the proposed method can adapt to different data thanks to the characteristics of joint dictionary learning; and 3) the proposed method makes no prior assumption on the modeling and handling of the spectral signature, and so can be adapted to different data.
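
A stripped-down sketch of the reconstruction-error test (illustrative only: the joint stacking, dictionary training set, and automatic threshold search are all simplified; function and parameter names are ours):

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

    def change_map(X1, X2, n_atoms=64, k=5):
        # X1, X2: (n_pixels, n_bands) spectra from the two dates
        X = np.hstack([X1, X2])          # joint (stacked) representation
        D = MiniBatchDictionaryLearning(n_components=n_atoms).fit(X).components_
        codes = sparse_encode(X, D, algorithm='omp', n_nonzero_coefs=k)
        err = np.linalg.norm(X - codes @ D, axis=1)   # reconstruction error
        # crude percentile threshold standing in for the automatic strategy
        return err > np.percentile(err, 95)

Pixels that the joint dictionary reconstructs poorly are declared changed.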


22. Robust Non-Rigid Point Set Registration Using Spatially Constrained Gaussian Fields

Estimating transformations from degraded point sets is necessary for many computer vision and pattern recognition applications. In this paper, we propose a robust non-rigid point set registration method based on spatially constrained context-aware Gaussian fields. We first construct a context-aware representation (e.g., shape context) for assignment initialization. Then, we use graph-Laplacian-regularized Gaussian fields to estimate the underlying transformation from the likely correspondences. On the one hand, the intrinsic manifold is considered and used to preserve the geometrical structure, and a priori knowledge of the point set is extracted. On the other hand, by using deterministic annealing, the presented method is extended to a projected high-dimensional feature space, i.e., a reproducing kernel Hilbert space, through a kernel trick to solve the transformation, in which the local structure is propagated by a coarse-to-fine scaling strategy. In this way, the proposed method gradually recovers many more correct correspondences and then estimates the transformation parameters accurately and robustly in the face of degradations. Experimental results on 2D and 3D synthetic and real data (point sets) demonstrate that the proposed method achieves better performance than the state-of-the-art algorithms.


23. Robust Web Image Annotation via Exploring Multi-facet and Structural Knowledge

Driven by the rapid development of Internet and digital technologies, we have witnessed the explosive growth of Web images in recent years. Seeing that labels can reflect the semantic contents of the images, automatic image annotation, which can further facilitate the procedure of image semantic indexing, retrieval and other image management tasks, has become one of the most crucial research directions in multimedia. Most of the existing annotation methods heavily rely on well-labeled training data (expensive to collect) and/or single view of visual features (insufficient representative power). In this paper, inspired by the promising advance of feature engineering (e.g., CNN feature and SIFT feature) and inexhaustible image data (associated with noisy and incomplete labels) on the Web, we propose an effective and robust scheme, termed Robust Multi-view Semi-supervised Learning (RMSL), for facilitating image annotation task. Specifically, we exploit both labeled images and unlabeled images to uncover the intrinsic data structural information. Meanwhile, to comprehensively describe an individual datum, we take advantage of the correlated and complemental information derived from multiple facets of image data (i.e. multiple views or features). We devise a robust pair-wise constraint on outcomes of different views to achieve annotation consistency. Furthermore, we integrate a robust classifier learning component via l2,p loss, which can provide effective noise identification power during the learning process. Finally, we devise an efficient iterative algorithm to solve the optimization problem in RMSL. We conduct comprehensive experiments on three different datasets, and the results illustrate that our proposed approach is promising for automatic image annotation.


24. Face Hallucination using Linear Models of Coupled Sparse Support

Most face super-resolution methods assume that low- and high-resolution manifolds have similar local geometrical structure, hence learn local models on the low-resolution manifold (e.g. sparse or locally linear embedding models), which are then applied on the high- resolution manifold. However, the low-resolution manifold is distorted by the one-to-many relationship between low- and high- resolution patches. This paper presents the Linear Model of Coupled Sparse Support (LM-CSS) method which learns linear models based on the local geometrical structure on the high-resolution manifold rather than on the low-resolution manifold. For this, in a first step, the low-resolution patch is used to derive a globally optimal estimate of the high-resolution patch. The approximated solution is shown to be close in Euclidean space to the ground-truth but is generally smooth and lacks the texture details needed by state-of-the-art face recognizers. Unlike existing methods, the sparse support that best estimates the first approximated solution is found on the high-resolution manifold. The derived support is then used to extract the atoms from the coupled low- and high-resolution dictionaries that are most suitable to learn an up-scaling function for every facial region. The proposed solution was also extended to compute face super-resolution of non-frontal images.


25. Scalable Multi-View Semi-Supervised Classification via Adaptive Regression

With the advent of multi-view data, multi-view learning has become an important research direction in machine learning and image processing. Considering the difficulty of obtaining labeled data in many machine learning applications, we focus on the multi-view semi-supervised classification problem. In this paper, we propose an algorithm named Multi-View Semi-Supervised Classification via Adaptive Regression (MVAR) to address this problem. Specifically, regression based loss functions with l2,1 matrix norm are adopted for each view and the final objective function is formulated as the linear weighted combination of all the loss functions. An efficient algorithm with proved convergence is developed to solve the non-smooth l2,1-norm minimization problem. Regressing to class labels directly makes the proposed algorithm efficient in calculation and can be applied to large-scale datasets. The adaptively optimized weight coefficients balance the contributions of different views automatically, which makes the performance robust against the existence of low-quality views. With the learned projection matrices and bias vectors, predictions for out-of-sample data can be easily made. To validate the effectiveness of MVAR, comparisons are made with some benchmark methods on real-world datasets and in the scene classification scenario as well. The experimental results demonstrate the effectiveness of our proposed algorithm.
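
The l2,1 matrix norm used in the loss functions is the sum of the row-wise Euclidean norms, which drives entire rows toward zero; a one-line reference implementation:

    import numpy as np

    def l21_norm(W):
        # ||W||_{2,1}: l2 norm of each row, summed over rows
        return np.linalg.norm(W, axis=1).sum()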


26. Quality Assessment of Perceptual Crosstalk on Two-View Auto-stereoscopic Displays

Crosstalk is one of the most severe factors affecting the perceived quality of stereoscopic 3D (S3D) images. It arises from a leakage of light intensity between multiple views, as in auto-stereoscopic displays. Well-known determinants of crosstalk include the co-location contrast and disparity of the left and right images, which have been dealt with in prior studies. However, when a natural stereo image that contains complex naturalistic spatial characteristics is viewed on an auto-stereoscopic display, other factors may also play an important role in the perception of crosstalk. Here, we describe a new way of predicting the perceived severity of crosstalk, which we call the Binocular Perceptual Crosstalk Predictor (BPCP). BPCP uses measurements of three complementary 3D image properties (texture, structural duplication and binocular summation) in combination with two well-known factors (co-location contrast and disparity) to make predictions of crosstalk on two-view auto-stereoscopic displays. The new BPCP model includes two masking algorithms and a binocular pooling method. We explore a new masking phenomenon that we call duplicated structure masking, which arises from structural correlations between the original and distorted objects. We also utilize an advanced binocular summation model to develop a binocular pooling algorithm. Our experimental results indicate that BPCP achieves high correlations against subjective test results, improving upon those delivered by previous crosstalk prediction models.


27. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

Discriminative model learning for image denoising has recently attracted considerable attention due to its favorable denoising performance. In this paper, we take one step forward by investigating the construction of feed-forward denoising convolutional neural networks (DnCNNs), bringing progress in very deep architectures, learning algorithms, and regularization methods into image denoising. Specifically, residual learning and batch normalization are utilized to speed up the training process as well as boost denoising performance. Different from existing discriminative denoising models, which usually train a specific model for additive white Gaussian noise at a certain noise level, our DnCNN model is able to handle Gaussian denoising with unknown noise level (i.e., blind Gaussian denoising). With the residual learning strategy, DnCNN implicitly removes the latent clean image in the hidden layers. This property motivates us to train a single DnCNN model to tackle several general image denoising tasks, such as Gaussian denoising, single image super-resolution, and JPEG image deblocking. Our extensive experiments demonstrate that our DnCNN model is not only highly effective in several general image denoising tasks but can also be efficiently implemented by benefiting from GPU computing.
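
A skeletal PyTorch rendering of the residual-learning idea (depth and width are placeholder values; the published DnCNN uses 17 to 20 layers):

    import torch
    import torch.nn as nn

    class TinyDnCNN(nn.Module):
        def __init__(self, depth=7, channels=64):
            super().__init__()
            layers = [nn.Conv2d(1, channels, 3, padding=1),
                      nn.ReLU(inplace=True)]
            for _ in range(depth - 2):
                layers += [nn.Conv2d(channels, channels, 3, padding=1),
                           nn.BatchNorm2d(channels),
                           nn.ReLU(inplace=True)]
            layers += [nn.Conv2d(channels, 1, 3, padding=1)]
            self.body = nn.Sequential(*layers)

        def forward(self, x):
            # residual learning: the stack predicts the noise, which is
            # subtracted from the input to yield the denoised image
            return x - self.body(x)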


28. A Novel Fast Tensor-Based Preconditioner for Image Restoration

Image restoration is one of the main parts of image processing. Mathematically, this problem can be modeled as a large-scale structured ill-posed linear system. The ill-posedness of this problem results in a low convergence rate of iterative solvers, so preconditioning is usually used to speed up convergence. Unlike existing preconditioners for image restoration, which are constructed from approximations of the blurring matrix, in this paper we propose a novel preconditioner from a different viewpoint. We show that the image restoration problem can be modeled as a tensor contractive linear equation. This modeling enables us to propose a new preconditioner based on an approximation of the blurring tensor operator. Due to the particular structure of the blurring tensor for zero boundaries, we show that the truncated higher-order singular value decomposition (HOSVD) of the blurring tensor can be obtained very quickly and thus used as a preconditioner. Experimental results confirm the efficiency of this new preconditioner in image restoration and its superior performance compared with other well-known preconditioners.


29. Fast segmentation from blurred data in 3D fluorescence microscopy

We develop a fast algorithm for segmenting 3D images from linear measurements based on the Potts model (or piecewise constant Mumford-Shah model). To that end, we first derive suitable space discretizations of the 3D Potts model which are capable of dealing with 3D images defined on non-cubic grids. Our discretization allows us to utilize a specific splitting approach which results in decoupled subproblems of moderate size. The crucial point in the 3D setup is that the number of independent subproblems is so large that we can reasonably exploit the parallel processing capabilities of graphics processing units (GPUs). Our GPU implementation is up to 18 times faster than the sequential CPU version, which allows even large volumes to be processed in acceptable runtimes. As a further contribution, we extend the algorithm to deal with non-negativity constraints. We demonstrate the efficiency of our method for combined image deconvolution and segmentation on simulated data and on real 3D widefield fluorescence microscopy data.


30. Blind Facial Image Quality Enhancement Using Non-Rigid Semantic Patches

We propose a new way to solve a very general blind inverse problem of multiple simultaneous degradations, such as blur, resolution reduction, noise, and contrast changes, without explicitly estimating the degradation. The proposed concept is based on combining semantic non-rigid patches, problem-specific high-quality prior data, and non-rigid registration tools. We show how a significant quality enhancement can be achieved, both visually and quantitatively, in the case of facial images. The method is demonstrated on the problem of cellular photography quality enhancement of dark facial images for different identities, expressions, and poses, and is compared with the state-of-the-art denoising, deblurring, super-resolution, and color-correction methods.


31. Sparse image reconstruction on the sphere: analysis and synthesis

We develop techniques to solve ill-posed inverse problems on the sphere by sparse regularisation, exploiting sparsity in both axisymmetric and directional scale-discretised wavelet space. Denoising, inpainting, and deconvolution problems, and combinations thereof, are considered as examples. Inverse problems are solved in both the analysis and synthesis settings, with a number of different sampling schemes. The most effective approach is that with the most restricted solution-space, which depends on the interplay between the adopted sampling scheme, the selection of the analysis/synthesis problem, and any weighting of the l1 norm appearing in the regularisation problem. More efficient sampling schemes on the sphere improve reconstruction fidelity by restricting the solution-space and also by improving sparsity in wavelet space. We apply the technique to denoise Planck 353 GHz observations, improving the ability to extract the structure of Galactic dust emission, which is important for studying Galactic magnetism.


32. Piecewise linear approximation of vector-valued images and curves via 2nd-order variational model

Variational models are known to work well for image restoration/regularization problems. However, most of the methods proposed in the literature are defined for scalar inputs and are applied to multiband images (such as RGB or multispectral imagery) by composing simple band-wise processing, which gives suboptimal results and may introduce artifacts. Only in a few cases have variational models been extended to vector-valued inputs, and the known implementations are restricted to 1st-order models, while 2nd-order models are never considered; thus, typical problems of 1st-order models, such as the staircasing effect, cannot be overcome. This paper considers the 2nd-order functional model for function approximation with free discontinuities given by Blake-Zisserman (BZ) and proposes an efficient minimization algorithm for the case of vector-valued inputs. In the BZ model, the Hessian of the solution is penalized outside a set of finite length, so the solution is forced to be piecewise linear; moreover, the model allows the formation of free discontinuities and free gradient discontinuities. The proposed algorithm is applied to difficult color image restoration/regularization problems and to piecewise linear approximation of curves in space.


33. Deep Convolutional Neural Network for Inverse Problems in Imaging

In this paper, we propose a novel deep convolutional neural network (CNN)-based algorithm for solving ill-posed inverse problems. Regularized iterative algorithms have emerged as the standard approach to ill-posed inverse problems in the past few decades. These methods produce excellent results, but can be challenging to deploy in practice due to factors including the high computational cost of the forward and adjoint operators and the difficulty of hyperparameter selection. The starting point of our work is the observation that unrolled iterative methods have the form of a CNN (filtering followed by point-wise nonlinearity) when the normal operator (H*H, where H* is the adjoint of the forward imaging operator H) of the forward model is a convolution. Based on this observation, we propose using direct inversion followed by a CNN to solve normal-convolutional inverse problems. The direct inversion encapsulates the physical model of the system, but leads to artifacts when the problem is ill-posed; the CNN combines multiresolution decomposition and residual learning in order to learn to remove these artifacts while preserving image structure. We demonstrate the performance of the proposed network in sparse-view reconstruction (down to 50 views) on parallel beam X-ray computed tomography in synthetic phantoms as well as in real experimental sinograms. The proposed network outperforms total variation-regularized iterative reconstruction for the more realistic phantoms and requires less than a second to reconstruct a 512 x 512 image on the GPU.


34. Robust Face Recognition with Kernelized Locality-Sensitive Group Sparsity Representation

In this paper, a novel joint sparse representation method is proposed for robust face recognition. We embed both group sparsity and kernelized locality-sensitive constraints into the framework of sparse representation. The group sparsity constraint is designed to utilize the grouped structure information in the training data. The local similarity between test and training data is measured in the kernel space instead of the Euclidean space. As a result, the embedded nonlinear information can be effectively captured, leading to a more discriminative representation. We show that, by integrating the kernelized locality-sensitive constraint and the group sparsity constraint, the embedded structure information can be better explored and significant performance improvement can be achieved. Experiments on the ORL, AR, Extended Yale B, and LFW datasets verify the superiority of our method, and experiments on two unconstrained datasets, LFW and IJB-A, show that the utilization of sparsity can improve recognition performance, especially on datasets with large pose variation.


35. Large-Scale Crowdsourced Study for Tone-Mapped HDR Pictures

Measuring digital picture quality, as perceived by human observers, is increasingly important in many applications in which humans are the ultimate consumers of visual information. Standard dynamic range (SDR) images provide 8 bits/color/pixel. High dynamic range (HDR) images, usually created from multiple exposures of the same scene, can provide 16 or 32 bits/color/pixel, but need to be tone-mapped to SDR for display on standard monitors. Multi-exposure fusion (MEF) techniques bypass HDR creation by fusing an exposure stack directly to SDR images to achieve aesthetically pleasing luminance and color distributions. Many HDR and MEF databases have a relatively small number of images and human opinion scores, obtained under stringently controlled conditions, thereby limiting realistic viewing. Moreover, many of these databases are intended to compare tone-mapping algorithms rather than being specialized for developing and comparing image quality assessment (IQA) models. To overcome these challenges, we conducted a massively crowdsourced online subjective study. The primary contributions described in this paper are (1) the new ESPL-LIVE HDR Image Database that we created, containing diverse images obtained by tone-mapping operators (TMOs) and MEF algorithms, with and without post-processing; (2) a large-scale subjective study that we conducted using a crowdsourced platform to gather more than 300,000 opinion scores on 1,811 images from over 5,000 unique observers; and (3) a detailed study of the correlation performance of state-of-the-art no-reference image quality assessment algorithms against human opinion scores of these images. The database is available at: http://signal.ece.utexas.edu/%7Edebarati/HDRDatabase.zip.


36. Direct Pattern Control Halftoning of Neugebauer Primaries

Halftoning is a key stage of any printing image processing pipeline. With colorant-channel approaches, a key challenge for matrix-based halftoning is the co-optimization of the matrices used for individual colorants, which becomes increasingly complex and over-constrained as the number of colorants increases. Both the choice of screen angles (in clustered-dot cases) or structures, and control over how individual matrices relate to each other and result in over- versus side-by-side printing of the colorants, impose challenging restrictions. The solution presented in this paper relies on the benefits of a Halftone Area Neugebauer Separation (HANS) pipeline, where local Neugebauer Primary use is specified at each pixel and where halftoning can be performed using a single matrix, regardless of the number of colorants. The provably complete plane-dependence of the resulting halftones is presented among the solution's benefits.


37. MuLoG, or How to Apply Gaussian Denoisers to Multi-Channel SAR Speckle Reduction?

Speckle reduction is a longstanding topic in synthetic aperture radar (SAR) imaging. Since most current and planned SAR imaging satellites operate in polarimetric, interferometric, or tomographic modes, SAR images are multi-channel, and speckle reduction techniques must jointly process all channels to recover polarimetric and interferometric information. The distinctive nature of the SAR signal (complex-valued, corrupted by multiplicative fluctuations) calls for specialized speckle reduction methods. Image denoising is a very active topic in image processing, with a wide variety of approaches and many denoising algorithms available, almost always designed for additive Gaussian noise suppression. This paper proposes a general scheme, called MuLoG (MUlti-channel LOgarithm with Gaussian denoising), for embedding such Gaussian denoisers within a multi-channel SAR speckle reduction technique. A new family of speckle reduction algorithms can thus be obtained, benefiting from the ongoing progress in Gaussian denoising; the resulting methods often display method-specific artifacts that can be dismissed by comparing results across denoisers.
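
Reduced to a single intensity channel, the core homomorphic idea looks like this (a gross simplification of MuLoG, which handles complex covariance matrices and iterates within a plug-and-play scheme; the TV denoiser is just one possible Gaussian denoiser):

    import numpy as np
    from skimage.restoration import denoise_tv_chambolle

    def homomorphic_despeckle(intensity, weight=0.3):
        # multiplicative speckle becomes roughly additive in the log domain
        log_img = np.log(np.maximum(intensity, 1e-8))
        denoised = denoise_tv_chambolle(log_img, weight=weight)
        return np.exp(denoised)   # back to intensity (debiasing omitted)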


38. Multimodal Similarity Gaussian Process Latent Variable Model

Data from real applications involve multiple modalities representing content with the same semantics from complementary aspects. However, relations among heterogeneous modalities are simply treated as observation-to-fit by existing work, and the parameterized modality-specific mapping functions lack flexibility in directly adapting to the content divergence and semantic complexity of multimodal data. In this paper, we build on the Gaussian process latent variable model (GPLVM) to learn non-parametric mapping functions and transform heterogeneous modalities into a shared latent space. We propose the multimodal Similarity Gaussian Process latent variable model (m-SimGP), which learns the mapping functions between intra-modal similarities and the latent representation. We further propose the multimodal distance-preserved similarity GPLVM (m-DSimGP) to preserve the intra-modal global similarity structure, and the multimodal regularized similarity GPLVM (m-RSimGP), which encourages similar/dissimilar points to be similar/dissimilar in the latent space. We also propose m-DRSimGP, which combines the distance preservation of m-DSimGP and the semantic preservation of m-RSimGP to learn the latent representation. The overall objective functions of the four models are solved by simple and scalable gradient descent techniques. They can be applied to various tasks to discover the nonlinear correlations and to obtain comparable low-dimensional representations for heterogeneous modalities. On five widely used real-world data sets, our approaches outperform existing models on cross-modal content retrieval and multimodal classification.


39. Ocular Recognition for Blinking Eyes

Ocular recognition is expected to provide higher flexibility in handling practical applications, as opposed to iris recognition, which only works in the ideal open-eye case. However, the accuracy of recent efforts is still far from satisfactory under uncontrolled conditions such as eye blinking, which can involve arbitrary eye poses. To address these issues, skin texture, eyelids, and additional geometrical features are employed. In addition, to achieve higher accuracy, sequential forward floating selection (SFFS) is utilized to select the best feature combinations. Finally, a non-linear SVM is applied for identification. Experimental results demonstrate that the proposed algorithm achieves the best accuracy for both open-eye and blinking-eye scenarios. As a result, it offers greater flexibility for prospective subjects during recognition as well as higher reliability for security.


40. Unsupervised Sequential Outlier Detection with Deep Architectures

Unsupervised outlier detection is a vital task with high impact on a wide variety of application domains, such as image analysis and video surveillance. It has gained longstanding attention and has been extensively studied in multiple research areas. Detecting and acting on outliers as quickly as possible is imperative in order to protect networks and related stakeholders and to maintain the reliability of critical systems. However, outlier detection is difficult due to its one-class nature and the challenges in feature construction. Sequential anomaly detection is even harder, with additional challenges from temporal correlation in the data, as well as the presence of noise and high dimensionality. In this paper, we introduce a novel deep structured framework to solve the challenging sequential outlier detection problem. We use autoencoder models to capture the intrinsic difference between outliers and normal instances and integrate the models into recurrent neural networks (RNNs), which allow the learning to make use of previous context and make the learners more robust to warping along the time axis. Furthermore, we propose a layer-wise training procedure, which significantly simplifies training and hence helps achieve efficient and scalable training. In addition, we investigate a fine-tuning step to update all parameter sets by incorporating the temporal correlation in the sequence. We apply our proposed models in systematic experiments on five real-world benchmark datasets. Experimental results demonstrate the effectiveness of our model compared with other state-of-the-art approaches.
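
The reconstruction-error criterion at the heart of the framework, in a minimal scikit-learn form (no RNN, no layer-wise pretraining; sizes are arbitrary):

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def outlier_scores(X):
        # train a small autoencoder-like regressor to reproduce its input;
        # normal instances reconstruct well, outliers do not
        ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                          random_state=0)
        ae.fit(X, X)
        err = ((X - ae.predict(X)) ** 2).mean(axis=1)
        return err   # higher error = more likely an outlier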




41Single and Multiple Illuminant Estimation Using Convolutional Neural Networks

In this paper, we present a three-stage method for estimating the color of the illuminant in RAW images. The first stage uses a convolutional neural network specially designed to produce multiple local estimates of the illuminant. The second stage, given the local estimates, determines the number of illuminants in the scene. Finally, the local illuminant estimates are refined by nonlinear local aggregation, resulting in a global estimate in the case of a single illuminant. An extensive comparison with state-of-the-art local and global illuminant estimation methods, on standard datasets with single and multiple illuminants, proves the effectiveness of our method.
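
The first two stages can be mimicked with classical tools to make the pipeline concrete: patch-wise gray-world estimates stand in for the CNN's local estimates, and clustering decides between one and two illuminants. The patch size and the cluster-separation threshold below are assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    def local_estimates(img, patch=32):
        # one gray-world illuminant estimate per non-overlapping patch
        H, W, _ = img.shape
        ests = []
        for y in range(0, H - patch + 1, patch):
            for x in range(0, W - patch + 1, patch):
                m = img[y:y + patch, x:x + patch].reshape(-1, 3).mean(0)
                ests.append(m / (np.linalg.norm(m) + 1e-8))
        return np.array(ests)

    def count_illuminants(ests, thresh=0.05):
        # two well-separated clusters of local estimates => two illuminants
        c = KMeans(n_clusters=2, n_init=10).fit(ests).cluster_centers_
        return 2 if np.linalg.norm(c[0] - c[1]) > thresh else 1

    img = np.random.rand(128, 128, 3)          # stand-in for a linear RAW image
    ests = local_estimates(img)
    if count_illuminants(ests) == 1:
        global_estimate = ests.mean(0)         # aggregate only when single-illuminant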




42Illumination Decomposition for Photograph with Multiple Light Sources

Illumination decomposition for a single photograph is an important and challenging problem in image editing. In this paper, we present a novel coarse-to-fine strategy to perform illumination decomposition for photographs with multiple light sources. We first reconstruct the lighting environment of the image using the estimated geometric structure of the scene. Given the positions of the lights, we detect the shadow regions as well as the highlights in the projected image for each light. Then, using the illumination cues from shadows, we estimate a coarse decomposition of the image into the illumination contributed by each light source. Finally, we present a light-aware illumination optimization model that efficiently produces finer illumination decomposition results, as well as recovering texture detail under the shadows. We validate our approach on a number of examples, and our method effectively decomposes the input image into multiple components corresponding to different light sources.




43Piecewise-stationary motion modeling and iterative smoothing to track heterogeneous particle motions in dense environments

One of the major challenges in multiple particle tracking is the capture of extremely heterogeneous movements of objects in crowded scenes. The presence of numerous assignment candidates in the expected range of particle motion makes the tracking ambiguous and induces false positives. Lowering the ambiguity by reducing the search range, on the other hand, is not an option, as this would increase the rate of false negatives. We propose here a piecewise-stationary motion model (PMM) for the particle transport, along with an iterative smoother that exploits recursive tracking in multiple rounds in forward and backward temporal directions. By fusing past and future information, our method, termed PMMS, can recover fast transitions from free or confined diffusive motion to directed motion with linear time complexity. To avoid false positives, we complement recursive tracking with a robust inline estimator of the search radius for assignment (a.k.a. gating), in which past and future information are exploited using only two frames at each optimization step. We demonstrate the improvement of our technique on simulated data, especially the impact of density, variation in frame-to-frame displacements, and motion switching probability. We evaluated our technique on the 2D particle tracking challenge dataset published by Chenouard et al. in 2014. Using high SNR to focus on motion modeling challenges, we show superior performance at high particle density. In biological applications, our algorithm allows us to quantify the extremely small percentage of motor-driven movements of fluorescent particles along microtubules in a dense field of unbound, diffusing particles. We also show with virus imaging that our algorithm can cope with a strong reduction in recording frame rate while keeping the same performance relative to methods relying on fast sampling.
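
The gating idea, estimating the assignment search radius robustly from only two frames, can be illustrated with a median/MAD rule on tentative nearest-neighbour displacements; this is a stand-in for the PMMS estimator, and the point sets and noise level are synthetic.

    import numpy as np
    from scipy.spatial import cKDTree

    def gate_radius(pts_t, pts_t1, k=3.0):
        # robust search radius: median NN displacement + k robust deviations
        d, _ = cKDTree(pts_t1).query(pts_t)         # tentative frame-to-frame matches
        med = np.median(d)
        mad = 1.4826 * np.median(np.abs(d - med))   # robust std estimate
        return med + k * mad

    pts_t = np.random.rand(200, 2) * 100            # hypothetical detections, frame t
    pts_t1 = pts_t + 1.5 * np.random.randn(200, 2)  # frame t+1, small random motion
    print(gate_radius(pts_t, pts_t1))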




44DeepFix: A Fully Convolutional Neural Network for Predicting Human Eye Fixations

Understanding and predicting the human visual attention mechanism is an active area of research in the fields of neuroscience and computer vision. In this work, we propose DeepFix, a fully convolutional neural network that models the bottom-up mechanism of visual attention via saliency prediction. Unlike classical works that characterize the saliency map using various hand-crafted features, our model automatically learns features in a hierarchical fashion and predicts the saliency map in an end-to-end manner. DeepFix is designed to capture semantics at multiple scales while taking global context into account, by using network layers with very large receptive fields. Generally, fully convolutional nets are spatially invariant, which prevents them from modeling location-dependent patterns (e.g., centre-bias). Our network handles this by incorporating a novel Location Biased Convolutional layer. We evaluate our model on multiple challenging saliency datasets and show that it achieves state-of-the-art results.
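
The key architectural idea, breaking spatial invariance by feeding location information into a convolution, can be sketched in PyTorch by concatenating location-dependent maps as extra input channels; the maps are learned in this sketch and all layer sizes are arbitrary assumptions, so this illustrates the mechanism rather than the DeepFix layer itself.

    import torch
    import torch.nn as nn

    class LocationBiasedConv(nn.Module):
        # convolution whose input is augmented with location-dependent maps,
        # letting the otherwise shift-invariant layer model centre-bias
        def __init__(self, in_ch, out_ch, size, n_bias=8):
            super().__init__()
            self.bias_maps = nn.Parameter(0.01 * torch.randn(n_bias, size, size))
            self.conv = nn.Conv2d(in_ch + n_bias, out_ch, kernel_size=3, padding=1)

        def forward(self, x):                       # x: (B, C, size, size)
            b = self.bias_maps.unsqueeze(0).expand(x.size(0), -1, -1, -1)
            return self.conv(torch.cat([x, b], dim=1))

    layer = LocationBiasedConv(in_ch=64, out_ch=64, size=28)
    out = layer(torch.randn(2, 64, 28, 28))         # -> (2, 64, 28, 28)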




45Coherent Semantic-visual Indexing for Large-scale Image Retrieval in the Cloud

The rapidly increasing number of images on the internet has further increased the need for efficient indexing to support image search over large databases. The design of a cloud service that provides highly efficient yet compact image indexing remains challenging, partly due to the well-known semantic gap between user queries and the rich semantics of large-scale data sets. In this paper, we construct a novel joint semantic-visual space by leveraging visual descriptors and semantic attributes, which narrows the semantic gap by combining attributes and indexing in a single framework. Such a joint space embraces the flexibility of coherent semantic-visual indexing, which employs binary codes to boost retrieval speed while maintaining accuracy. To solve the proposed model, we make the following contributions. First, we propose an iterative optimization method to find the joint semantic and visual descriptor space. Second, we prove convergence of our optimization algorithm, which guarantees a good solution after a certain number of iterations. Third, we integrate the semantic-visual joint space with spectral hashing, enabling efficient search over billion-scale data sets. Finally, we design an online cloud service to provide more efficient multimedia retrieval. Experiments on two standard retrieval datasets (i.e., Holidays1M and Oxford5K) show that the proposed method is promising compared with the current state of the art and that the cloud system significantly improves performance.
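
The retrieval-side mechanics, binary codes compared by Hamming distance, can be sketched as below; sign-of-PCA-projection thresholding is used as a simplified stand-in for spectral hashing, and the "joint semantic-visual" vectors are random placeholders.

    import numpy as np

    def binary_codes(X, n_bits=32):
        # project onto top PCA directions, threshold each bit at its median
        Xc = X - X.mean(0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        proj = Xc @ Vt[:n_bits].T
        return (proj > np.median(proj, axis=0)).astype(np.uint8)

    def hamming_search(codes, query, top_k=5):
        # rank database items by Hamming distance to the query code
        return np.argsort((codes != query).sum(axis=1))[:top_k]

    db = np.random.rand(10000, 128)           # placeholder joint-space vectors
    codes = binary_codes(db)
    print(hamming_search(codes, codes[42]))   # item 42 ranks itself first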




46300 FPS Salient Object Detection via Minimum Directional Contrast

Global contrast considers the color difference between a target region or pixel and the rest of the image, and is frequently used to measure the saliency of that region or pixel. In previous global contrast-based methods, saliency is usually measured by the sum of contrast over the entire image. We find that the spatial distribution of contrast is an important saliency cue neglected by previous works. A foreground pixel usually has high contrast in all directions, since it is surrounded by the background; a background pixel often shows low contrast in at least one direction, since it must connect to the rest of the background. Motivated by this intuition, we first compute directional contrast from different directions for each pixel, and propose minimum directional contrast (MDC) as the raw saliency metric. We then propose an O(1) computation of MDC using integral images, which takes only 1.5 ms for an input image at QVGA resolution. In saliency post-processing, we use the marker-based watershed algorithm to estimate each pixel as foreground or background, followed by a linear function to highlight or suppress its saliency. Performance evaluation is carried out on four public data sets. The proposed method significantly outperforms other global contrast-based methods, and achieves comparable or better performance than the state-of-the-art methods while running at 300 FPS, a six-fold runtime improvement over the state of the art.
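
The O(1) evaluation rests on the identity sum_q ||p - q||^2 = N*||p||^2 - 2*p.S + SS, where S and SS are the region's color-sum and squared-sum, both readable from integral images. Below is a NumPy sketch of MDC over the four corner regions (post-processing omitted; the exact region layout in the paper may differ).

    import numpy as np

    def sat(a):
        # summed-area table padded with a zero top row / left column
        return np.pad(a.cumsum(0).cumsum(1), ((1, 0), (1, 0)))

    def mdc_saliency(img):                     # img: float array (H, W, 3)
        H, W, C = img.shape
        S = np.stack([sat(img[..., c]) for c in range(C)], axis=-1)
        SS = sat((img ** 2).sum(-1))
        p2 = (img ** 2).sum(-1)
        ys, xs = np.mgrid[0:H, 0:W]
        y1, x1 = ys + 1, xs + 1
        z = np.zeros_like(ys)
        Hc, Wc = np.full_like(ys, H), np.full_like(xs, W)

        def contrast(r0, r1, c0, c1):
            # sum over rows [r0,r1) x cols [c0,c1) of ||p - q||^2, in O(1)
            s = S[r1, c1] - S[r0, c1] - S[r1, c0] + S[r0, c0]
            ss = SS[r1, c1] - SS[r0, c1] - SS[r1, c0] + SS[r0, c0]
            n = (r1 - r0) * (c1 - c0)
            return n * p2 - 2 * (img * s).sum(-1) + ss

        mdc = np.stack([contrast(z, y1, z, x1),    # top-left region
                        contrast(z, y1, xs, Wc),   # top-right
                        contrast(ys, Hc, z, x1),   # bottom-left
                        contrast(ys, Hc, xs, Wc)]).min(0)
        return mdc / (mdc.max() + 1e-12)

    sal = mdc_saliency(np.random.rand(120, 160, 3))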




47Fast Domain Decomposition for Global Image Smoothing

Edge-preserving smoothing (EPS) can be formulated as minimizing an objective function that consists of data and regularization terms. At the price of high computational cost, this global EPS approach is more robust and versatile than a local one, which typically has the form of weighted averaging. In this paper, we introduce an efficient decomposition-based method for global EPS that minimizes an objective function of an L2 data term and (possibly non-smooth and non-convex) regularization terms in linear time. Different from previous decomposition-based methods, which require solving a large linear system, our approach solves an equivalent constrained optimization problem, resulting in a sequence of 1-D subproblems. This enables applying fast linear-time solvers to weighted-least-squares and L1 smoothing problems. An alternating direction method of multipliers (ADMM) algorithm is adopted to guarantee fast convergence. Our method is fully parallelizable, and its runtime is even comparable to state-of-the-art local EPS approaches. We also propose a family of fast majorization-minimization algorithms that minimize an objective with non-convex regularization terms. Experimental results demonstrate the effectiveness and flexibility of our approach in a range of image processing and computational photography applications.
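
The 1-D subproblem at the heart of such a decomposition, weighted-least-squares smoothing along a single row or column, reduces to a tridiagonal linear system; the sketch below solves min_x ||x - y||^2 + lam * sum_i w_i (x_{i+1} - x_i)^2, with assumed edge-aware weights and regularization strength.

    import numpy as np
    from scipy.sparse import diags
    from scipy.sparse.linalg import spsolve

    def wls_smooth_1d(y, w, lam=10.0):
        # normal equations (I + lam * D^T W D) x = y, D = forward difference
        n = len(y)
        main = np.ones(n)
        main[:-1] += lam * w            # w_i contribution to row i
        main[1:] += lam * w             # w_{i-1} contribution to row i
        off = -lam * w
        A = diags([off, main, off], offsets=[-1, 0, 1], format="csc")
        return spsolve(A, y)

    # noisy step signal; small weights across the edge preserve it
    y = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * np.random.randn(100)
    w = np.exp(-(np.abs(np.diff(y)) / 0.2) ** 2)    # edge-aware weights
    x = wls_smooth_1d(y, w)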




48Learning the Personalized Intransitive Preferences of Images

Most previous studies of user preferences assume that there is a personal transitive preference ranking over consumable media such as images. For example, the transitivity of preferences is one of the most important assumptions in recommender system research. However, intransitive relations have also been widely observed, such as win/loss relations in online video games, in sports matches, and even in rock-paper-scissors games. Different subjects have also been found to exhibit personalized intransitive preferences in pairwise comparisons between applicants for college admission. Since the intransitivity of preferences on images has barely been studied before, and has a large impact on research in personalized image search and recommendation, a method to predict the personalized intransitive preferences of images is needed. In this paper, we propose novel Multi-Criterion preference (MuCri) models to predict intransitive relations in image preferences. The MuCri models utilize different kinds of image content features as well as latent features of users and images. Meanwhile, a new data set is constructed in this paper to evaluate the performance of the MuCri models. The experimental evaluation shows that the MuCri models outperform all the baselines. Given the interdisciplinary nature of this topic, we believe it will attract the attention of researchers in the image processing community as well as in others, such as machine learning, multimedia, and recommender systems.
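
One standard way to let latent features express intransitivity (illustrative only, not the MuCri formulation) is to give each item two latent vectors and score comparisons asymmetrically, so that cycles such as a > b > c > a become representable:

    import numpy as np

    rng = np.random.default_rng(0)
    n_items, d = 30, 4
    blade = 0.1 * rng.standard_normal((n_items, d))   # item's "attacking" vector
    chest = 0.1 * rng.standard_normal((n_items, d))   # item's "defending" vector

    def p_prefer(a, b):
        # asymmetric score => preferences need not be transitive
        s = blade[a] @ chest[b] - blade[b] @ chest[a]
        return 1.0 / (1.0 + np.exp(-s))

    def sgd_step(a, b, y, lr=0.1):
        # one logistic-loss step on an observed comparison (y = 1: a preferred)
        g = p_prefer(a, b) - y
        ga, gb = g * chest[b], -g * chest[a]     # grads for blade[a], blade[b]
        ca, cb = -g * blade[b], g * blade[a]     # grads for chest[a], chest[b]
        blade[a] -= lr * ga; blade[b] -= lr * gb
        chest[a] -= lr * ca; chest[b] -= lr * cb

    # hypothetical pairwise preference observations
    for _ in range(5000):
        a, b = rng.integers(n_items, size=2)
        if a != b:
            sgd_step(a, b, y=int(rng.integers(2)))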




49SuperPatchMatch: An Algorithm for Robust Correspondences Using Superpixel Patches

Superpixels have become very popular in many computer vision applications. Nevertheless, they remain underexploited, since superpixel decompositions may produce irregular and unstable segmentation results owing to their dependency on image content. In this paper, we first introduce a novel structure, a superpixel-based patch called the SuperPatch. The proposed structure, based on the superpixel neighborhood, leads to a robust descriptor, since spatial information is naturally included. The generalization of the PatchMatch method to SuperPatches, named SuperPatchMatch, is then introduced. Finally, we propose a framework to perform fast segmentation and labeling from an image database, and demonstrate the potential of our approach by outperforming state-of-the-art methods, in terms of computational cost and accuracy, on both face labeling and medical image segmentation.
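
A much-simplified sketch of the SuperPatch idea, describing each superpixel by its own statistics together with those of its spatial neighbours and then matching descriptors across images, is given below; SLIC, mean-colour features, and brute-force nearest-neighbour search (in place of PatchMatch-style randomized search) are all simplifying assumptions.

    import numpy as np
    from scipy.spatial import cKDTree
    from skimage.segmentation import slic

    def superpatch_descriptors(img, n_segments=200):
        # descriptor = superpixel mean colour + mean colour of its neighbours
        labels = slic(img, n_segments=n_segments, start_label=0)
        n = labels.max() + 1
        means = np.array([img[labels == i].mean(0) for i in range(n)])
        cents = np.array([np.argwhere(labels == i).mean(0) for i in range(n)])
        _, nn = cKDTree(cents).query(cents, k=5)   # 4 nearest superpixels + self
        return labels, np.hstack([means, means[nn[:, 1:]].mean(1)])

    imgA, imgB = np.random.rand(100, 100, 3), np.random.rand(100, 100, 3)
    _, dA = superpatch_descriptors(imgA)
    _, dB = superpatch_descriptors(imgB)
    match = cKDTree(dB).query(dA)[1]    # for each SuperPatch in A, best match in B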




50Weakly Supervised Part Proposal Segmentation from Multiple Images

Weakly supervised local part segmentation is challenging due to the difficulty of modeling multiple local parts from image-level priors. In this paper, we propose a new weakly supervised local part proposal segmentation method based on the observation that local parts stay fixed as the object pose varies; hence, a local part can be segmented by capturing object pose variations. Based on this observation, a new local part proposal segmentation model is proposed that considers three aspects: shape similarity-based cosegmentation, shape matching-based part detection and segmentation, and graph matching-based part assignment. A part segmentation energy function is first proposed, containing four terms: an MRF-based single-image segmentation term, a shape feature-based foreground consistency term, an NCuts-based part segmentation term, and a second-order graph matching-based part consistency term. Then, an energy minimization method based on three sub-minimizations is proposed to obtain an approximate solution. Finally, we verify our method on three image data sets (the PASCAL VOC 2008 Part data set, the UCB Bird data set, and the Cat-Dog data set) and one video data set (UCF Sports). The experimental results demonstrate better segmentation performance compared with existing object cosegmentation and part proposal generation methods.



