Monday, April 19, 2010
1107/1111 Kim Engineering Building
University of Maryland, College Park, MD 20742
Sponsored by the Washington Chapter of the IEEE Signal Processing Society.
Poster registration is closed.
Signal Processing Night consists of two 75-minute poster sessions. Contributors are responsible for printing and bringing their own posters. Poster easels and a limited number of hard foam boards (on which to mount posters) will be provided. We recommend a poster width between 30 and 42 inches. Posters can be printed at low rates at stores such as Costco. Limited subsidies are available by request to cover a portion of printing costs.
Monday, April 19, 2010
1107/1111 Kim Engineering Building
University of Maryland, College Park, MD 20742
Free parking is available nearby in Lots XX1, XX2, I, 9, and 11 after 4 pm. Paid visitor parking is available in the Paint Branch Drive Visitor Lot adjacent to the Kim Engineering Building.
Venue and parking locations: Google Maps, Official Campus Map
More details: Visitor's Guide
For questions, please contact Prof. Min Wu (minwu AT eng.umd.edu) or Steve Tjoa (kiemyang AT umd dot edu).
In this poster we present a study on the automatic identification of acquisition devices when only the output speech recordings are available. A statistical characterization of the frequency response of the device, contextualized by the speech content, is proposed. In particular, the intrinsic characteristics of the device are captured by a template constructed by appending together the means of a Gaussian mixture trained on the device's speech recordings. This study focuses on two classes of acquisition devices, namely landline telephone handsets and microphones. Three publicly available databases are used to assess the performance of linear- and mel-scaled cepstral coefficients. A support vector machine classifier was used to perform closed-set identification experiments. The results show classification accuracies higher than 90 percent among the eight telephone handsets and eight microphones tested.
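As a rough illustration of this pipeline, the sketch below builds a mean-supervector template from a Gaussian mixture fitted to a recording's cepstral frames and feeds the templates to an SVM; the feature extraction, number of mixture components, and kernel choice are assumptions for the example, not the poster's exact settings.

```python
# Hypothetical sketch: GMM mean-supervector templates + SVM for device identification.
# Cepstral feature extraction and data loading are assumed to exist elsewhere.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def device_template(cepstra, n_components=16, seed=0):
    """Fit a GMM to one recording's cepstral frames and stack the component means."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          random_state=seed).fit(cepstra)
    return gmm.means_.ravel()          # the "template": appended Gaussian means

def train_identifier(cepstra_per_recording, labels):
    """cepstra_per_recording: list of (n_frames, n_coeffs) arrays; labels: device IDs."""
    X = np.vstack([device_template(c) for c in cepstra_per_recording])
    return SVC(kernel="linear").fit(X, labels)   # closed-set identification
```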
The latest communication technologies invariably consist of modules with dynamic behavior. A number of design tools for communication systems are founded on dataflow modeling semantics. These tools must not only support the functional specification of dynamic communication modules and subsystems but also provide accurate estimation of resource requirements for efficient simulation and implementation. We explore this trade-off --- between flexible specification of dynamic behavior and accurate estimation of resource requirements --- using a representative application employing an adaptive modulation scheme. We propose an approach for precise modeling of such applications based on a recently introduced form of dynamic dataflow called core functional dataflow. From our proposed modeling approach, we show how parameterized looped schedules can be generated and analyzed to simulate applications with low run-time overhead as well as guaranteed bounded-memory execution. We demonstrate our approach using the Advanced Design System from Agilent Technologies, Inc., a commercial tool for the design and simulation of communication systems.
Speaker recognition systems classify a test signal as a speaker or an imposter by evaluating a matching score between input and reference signals. We propose a new information theoretic approach for computation of the matching score using the Renyi entropy. The proposed entropic distance, the Kernelized Renyi distance (KRD), is formulated in a non-parametric way and the resulting measure is efficiently evaluated in a parallelized fashion on a graphical processor. The distance is then adapted as a scoring function and its performance compared with other popular scoring approaches in a speaker identification and speaker verification framework.
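For intuition, here is a minimal sketch of one common kernel-based Renyi-type score (a Cauchy-Schwarz-style divergence built from mean pairwise Gaussian-kernel values); the poster's exact KRD formulation and its GPU-parallelized evaluation may differ, and the kernel width below is an assumed parameter.

```python
# Minimal sketch of a kernelized Renyi-type matching score, assuming a Gaussian kernel
# and a Cauchy-Schwarz-style divergence; x, y are (n, d) arrays of feature vectors.
import numpy as np

def _ip(a, b, sigma):
    """Mean pairwise Gaussian-kernel value (an "information potential")."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2)).mean()

def renyi_score(x, y, sigma=1.0):
    """Smaller value = more similar; used here as the matching score."""
    vxy, vx, vy = _ip(x, y, sigma), _ip(x, x, sigma), _ip(y, y, sigma)
    return -np.log(vxy ** 2 / (vx * vy))      # Cauchy-Schwarz divergence form
```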
We study the role of contextual information in detecting objects from the visual scene. We present two applications where we detect human faces using the information of a supporting torso, and localize lane markings from the adjoining road information.
Iris images acquired from a partially cooperating subject often suffer from blur, occlusion due to eyelids, and specular reflections. The performance of existing iris recognition systems degrades significantly on these images. Hence it is essential to select good images from the incoming iris video stream before they are input to the recognition algorithm. In this paper, we propose a sparsity-based algorithm for the selection of good iris images and their subsequent recognition. Unlike most existing algorithms for iris image selection, our method can handle segmentation errors and a wider range of acquisition artifacts common in iris image capture. We perform selection and recognition in a single step, which is more efficient than devising separate specialized algorithms for the two. Recognition from partially cooperating users is a significant step towards deploying iris systems in a wide variety of applications.
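As a hedged illustration of how selection and recognition can share one sparse representation, the sketch below codes a probe over a gallery dictionary, identifies by class-wise residual, and scores image quality with a sparsity-concentration index; the dictionary layout, the l1 solver, and the rejection rule are assumptions rather than the poster's exact algorithm.

```python
# Sketch of sparse-representation recognition with a quality/selection score,
# loosely following SRC-style classification; not the poster's exact method.
import numpy as np
from sklearn.linear_model import Lasso

def src_identify(y, D, labels, alpha=0.01):
    """y: probe feature vector; D: gallery features as columns; labels: class of each column."""
    coef = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000).fit(D, y).coef_
    classes = np.unique(labels)
    resid = [np.linalg.norm(y - D[:, labels == c] @ coef[labels == c]) for c in classes]
    # Sparsity concentration: how much coefficient energy falls on a single class.
    sci = (classes.size * max(np.abs(coef[labels == c]).sum() for c in classes)
           / max(np.abs(coef).sum(), 1e-12) - 1) / (classes.size - 1)
    return classes[int(np.argmin(resid))], sci   # identify; reject poor images by thresholding sci
```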
YouTube and similar web services have revolutionized content sharing and online social networking by providing an easy-to-use platform for users to post and share video. At the same time, content owners have raised serious concerns about unauthorized uploads of copyrighted movies and TV shows to these websites, as witnessed by high-profile lawsuits filed against YouTube and Google. In order to deter copyright violation and, more importantly, to help keep online communities alive legally, "content fingerprinting" technologies are deployed to compute a short string of bits that captures unique characteristics of each video and to use it to determine whether an uploaded video belongs to a set of copyrighted content. Content fingerprints are also used by applications such as Shazam on the iPhone, which use recordings of short audio clips to identify a song and provide information about the artist, the album, and where to buy it. This poster summarizes the research on content identification by the Media and Security Team (MAST) at the University of Maryland.
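Purely as a toy illustration of the fingerprint-and-match idea (not MAST's actual features), the sketch below derives a binary fingerprint from coarse block-energy differences between video frames and declares a match when the normalized Hamming distance to a reference fingerprint is small; the block grid and threshold are invented for the example.

```python
# Toy content fingerprint: sign pattern of block-energy differences between frames,
# matched by normalized Hamming distance.  All parameters here are illustrative.
import numpy as np

def fingerprint(frames, grid=(4, 4)):
    """frames: (T, H, W) grayscale video with H, W divisible by the grid size."""
    T, H, W = frames.shape
    blocks = frames.reshape(T, grid[0], H // grid[0], grid[1], W // grid[1]).mean(axis=(2, 4))
    diff = np.diff(blocks, axis=0)            # temporal energy change per block
    return diff.reshape(T - 1, -1) > 0        # compact bit pattern per frame transition

def matches(fp_query, fp_reference, thresh=0.15):
    """Declare a match if the normalized Hamming distance is small."""
    n = min(len(fp_query), len(fp_reference))
    return float(np.mean(fp_query[:n] != fp_reference[:n])) < thresh
```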
Users in a multimedia social network have different objectives and influence each other's decisions and performance. Therefore, the complex behavior dynamics among users in multimedia social networks, such as cheating, bargaining, and making agreements, impact multimedia systems. We analyze and model human behaviors in multimedia social networks to understand the importance and impact of human factors on multimedia system design.
This poster describes our research on digital image forensics, which aims at answering various forensic questions that can be classified into 1) identifying the components within a digital capture device that produced a given image, and 2) discovering the processing history a digital image has gone through. For the former, we explain how the algorithms and parameters associated with a certain component within a capture device can be estimated from a given image, and suggest that such estimation can be used to determine the authentic source of the digital image. For the latter, we propose a universal method that can detect whether a digital image has been manipulated after capture, and show that different manipulations can actually be distinguished.
The advancement of information technology is rapidly integrating the physical world where we live and the online world where we retrieve and share information. One immediate example of such integration is the increasing popularity of storing and managing personal data using third-party web services, as part of the emerging trend of cloud computing. Secure management of sensitive data stored online is becoming one of the critical research issues in cloud computing and online privacy protection. We propose techniques to achieve content based multimedia retrieval over encrypted databases, which can be used for online management of multimedia data while preserving data privacy. We propose two types of secure retrieval schemes by combining cryptographic techniques, such as order preserving encryption and randomized hash functions, with image processing and information retrieval techniques, such as visual words representation, inverted index, and min-hash. The first type of retrieval schemes scramble visual features extracted from images and allow similarity comparison of the features in their encrypted forms. The second type of schemes encrypt the state-of-the-art search indexes without significantly affecting their search capability. Retrieval results on an encrypted color image database and security analysis under different attack models show that retrieval performance comparable to conventional plaintext retrieval schemes can be achieved over encrypted databases while ensuring data confidentiality.
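To make the min-hash piece concrete, here is a minimal sketch, assuming each image is already summarized as a set of visual-word IDs: word IDs are obscured with a keyed hash before leaving the client, min-hash signatures are computed over the obscured tokens, and signature agreement estimates Jaccard similarity. The key handling, the order-preserving encryption component, and the inverted index are not shown.

```python
# Min-hash over keyed-hashed visual words (illustrative only; not the deployed scheme).
import hashlib
import numpy as np

def keyed_token(word_id, key):
    """Randomized hash of a visual word so the server never sees plaintext word IDs."""
    return hashlib.sha256(f"{key}:{word_id}".encode()).hexdigest()

def minhash_signature(tokens, n_hashes=64, seed=7):
    """One min-hash value per salted hash function, computed over an image's token set."""
    salts = np.random.RandomState(seed).randint(0, 2**31, size=n_hashes)
    return np.array([min(int(hashlib.sha256(f"{s}:{t}".encode()).hexdigest(), 16)
                         for t in tokens) for s in salts])

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing min-hashes estimates the Jaccard similarity of two images."""
    return float(np.mean(sig_a == sig_b))
```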
How to efficiently and fairly allocate data rate among different users is a key problem in the field of multiuser multimedia communication. However, most of the existing optimization-based methods, such as minimizing the weighted sum of the distortions or maximizing the weighted sum of the peak signal-to-noise ratios (PSNRs), have their weights heuristically determined. Moreover, those approaches mainly focus on the efficiency issue while there is no notion of fairness. In this paper, we address this problem by proposing a game-theoretic framework, in which the utility/payoff function of each user/player is jointly determined by the characteristics of the transmitted video sequence and the allocated bit-rate. We show that a unique Nash equilibrium (NE), which is proportionally fair in terms of both utility and PSNR, can be obtained, according to which the controller can efficiently and fairly allocate the available network bandwidth to the users. Moreover, we propose a distributed cheat-proof rate allocation scheme for the users to converge to the optimal NE using an alternative ascending clock auction. We also show that the traditional optimization-based approach that maximizes the weighted sum of the PSNRs is a special case of the game-theoretic framework with the utility function defined as an exponential function of PSNR. Finally, we show several experimental results on real video data to demonstrate the efficiency and effectiveness of the proposed method.
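For a feel of the proportional-fairness objective, the toy sketch below allocates a fixed total rate by maximizing the sum of log-utilities, using the observation that when the utility is an exponential function of PSNR the log-utility is simply the PSNR; the logarithmic PSNR-rate model, the constants, and the total rate are assumptions, and the cheat-proof auction mechanism is not reproduced.

```python
# Toy proportionally fair rate allocation, assuming PSNR_i(r) ~ a_i + b_i*log(r).
import numpy as np
from scipy.optimize import minimize

a = np.array([30.0, 28.0, 32.0])     # hypothetical sequence-dependent constants
b = np.array([3.0, 5.0, 2.0])
R_total = 6.0                         # total available rate (e.g., Mbps)

def neg_sum_log_utility(r):
    psnr = a + b * np.log(r)
    return -np.sum(psnr)              # utility = exp(PSNR)  =>  log-utility = PSNR

res = minimize(neg_sum_log_utility, x0=np.full(3, R_total / 3),
               bounds=[(1e-3, R_total)] * 3,
               constraints=[{"type": "ineq", "fun": lambda r: R_total - r.sum()}])
print("proportionally fair rates:", res.x)
```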
Multimedia piracy is a growing concern for the film industry, resulting in annual losses of several billion dollars. A promising technique to deter piracy and trace pirates is "digital fingerprinting", wherein special signals that can uniquely identify the recipient are inserted into each legally distributed copy. When an unauthorized copy is discovered, the embedded signal can be used to determine the person responsible for the leak. Digital fingerprinting can also be used in security-related applications for traitor tracing. This poster summarizes the fingerprinting research by the Media and Security Team (MAST) at UMD.
The recent convergence of music, mathematics, and computation has given birth to the field of music information retrieval (MIR) -- an interdisciplinary technological area that attracts artists, scientists, and engineers alike. By analyzing spectral and temporal patterns in acoustic signals, MIR helps close the semantic gap between humans and digital music. In doing so, MIR provides convenient methods for browsing, searching, and organizing music from large databases using high-level musical queries. In our work, we relate recent mathematical developments in sparse and nonnegative factorization to the problem of musical scene analysis. These mathematical tools enable us to easily decompose an acoustic signal into musical notes or beats. This decomposition is not only more intuitive to humans, but it also facilitates other tasks in MIR such as music transcription, genre classification, and instrument recognition.
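As a small sketch of the factorization step, the code below applies nonnegative matrix factorization to a magnitude spectrogram so that the columns of W act as note-like spectral templates and the rows of H as their activations over time; the spectrogram computation, number of components, and any sparsity penalties used in the actual work are assumed.

```python
# Decompose a magnitude spectrogram into note-like templates and activations via NMF.
import numpy as np
from sklearn.decomposition import NMF

def decompose_notes(V, n_notes=8):
    """V: (n_freq_bins, n_time_frames) nonnegative magnitude spectrogram."""
    model = NMF(n_components=n_notes, init="nndsvd", max_iter=500)
    W = model.fit_transform(V)        # spectral template of each note/event
    H = model.components_             # when (and how strongly) each template is active
    return W, H                       # V is approximately W @ H
```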
Dynamic spectrum access (DSA), enabled by cognitive radio technologies, has become a promising approach to improve efficiency in spectrum utilization, and the spectrum auction is one important DSA approach, in which secondary users lease unused bands from primary users. However, spectrum auctions differ from the auctions traditionally studied by economists, because spectrum resources are interference-limited rather than quantity-limited, and it is possible to award one band to multiple secondary users with negligible mutual interference. To accommodate this special feature of wireless communications, we present a novel multi-winner spectrum auction game that has not previously been studied in the auction literature. As secondary users may be selfish in nature and tend to be dishonest in pursuit of higher profits, we develop effective mechanisms to suppress their dishonest/collusive behaviors when secondary users distort their valuations about spectrum resources and interference relationships. Moreover, in order to make the proposed game scalable as the size of the problem grows, semi-definite programming (SDP) relaxation is applied to reduce the complexity significantly. Finally, simulation results are presented to evaluate the proposed auction mechanisms and demonstrate the complexity reduction.
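Purely as an illustration of how an SDP relaxation can enter winner determination (and not the poster's exact formulation), the sketch below relaxes a toy binary quadratic program in which non-interfering secondary users may win the band simultaneously; the bids, conflict pairs, lifting, and rounding are all assumptions made for the example.

```python
# Toy SDP relaxation of multi-winner determination: maximize reported valuations
# subject to "no two strongly interfering users both win".  Illustrative only.
import numpy as np
import cvxpy as cp

v = np.array([4.0, 3.0, 5.0, 2.0])        # reported valuations of secondary users
conflicts = [(0, 1), (2, 3)]               # pairs that cannot share the band

n = len(v)
Y = cp.Variable((n + 1, n + 1), symmetric=True)   # lifted matrix [[1, x^T], [x, X]]
x = Y[0, 1:]
cons = [Y >> 0, Y[0, 0] == 1]
cons += [Y[i + 1, i + 1] == Y[0, i + 1] for i in range(n)]   # X_ii = x_i (binary relaxation)
cons += [Y[i + 1, j + 1] == 0 for i, j in conflicts]          # interfering pairs excluded
cons += [Y[0, i + 1] >= 0 for i in range(n)]
prob = cp.Problem(cp.Maximize(v @ x), cons)
prob.solve()
print("relaxed allocation:", np.round(np.asarray(x.value), 2))
```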
We show that, via temporal modulation, one can observe a high-speed periodic event well beyond the abilities of a low frame-rate camera. By strobing the exposure with unique sequences within the integration time of each frame, we take coded projections of dynamic events. From a sequence of such frames, we reconstruct a high-speed video of the high-frequency periodic process. Strobing is used in entertainment, medical imaging, and industrial inspection to generate lower beat frequencies, but this is limited to scenes with a single detectable dominant frequency and requires high-intensity lighting. In this paper, we address the problem of sub-Nyquist sampling of periodic signals and show designs to capture and reconstruct such signals. The key result is that for such signals the Nyquist rate constraint can be imposed on the strobe rate rather than the sensor rate. The technique is based on intentional aliasing of the frequency components of the periodic signal, while the reconstruction algorithm exploits recent advances in sparse representations and compressive sensing. We exploit the sparsity of periodic signals in the Fourier domain to develop reconstruction algorithms inspired by compressive sensing.
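A toy sketch of the recovery idea, under assumed sizes and codes: a signal sparse in a frequency-like basis is observed through random binary strobe codes integrated per frame, and the sparse coefficients are recovered by l1 regression. A real DCT basis stands in for the Fourier basis to keep everything real-valued; none of these choices are claimed to match the paper's design.

```python
# Toy coded-strobing reconstruction: sparse recovery of a signal from binary coded projections.
import numpy as np
from scipy.fftpack import idct
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
N, M = 256, 64                               # fine time samples vs. captured frames
Psi = idct(np.eye(N), norm="ortho", axis=0)  # sparsifying basis (DCT atoms as columns)

x_coeff = np.zeros(N); x_coeff[[3, 17, 40]] = [1.0, 0.6, 0.3]   # sparse "spectrum"
signal = Psi @ x_coeff                       # the high-speed periodic waveform

Phi = rng.randint(0, 2, size=(M, N)).astype(float)   # per-frame binary strobe codes
y = Phi @ signal                             # what the slow camera actually integrates

coef = Lasso(alpha=1e-3, fit_intercept=False, max_iter=10000).fit(Phi @ Psi, y).coef_
recovered = Psi @ coef                       # reconstructed high-speed waveform
```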
In recent years, the availability of high-quality digital cameras coupled with the rise of the Internet as a means of information delivery has caused digital content to become prevalent throughout society. This proves to be problematic, as the rise of digital media has coincided with the widespread availability of digital editing software. Accordingly, there is a great need for digital image forensic techniques capable of detecting image manipulations and forgeries. In this poster, we present several techniques capable of identifying digital image manipulation by detecting the unique statistical fingerprints left in an image’s pixel value histogram by contrast enhancement mappings. Specifically, we present methods to detect global and local contrast enhancement, and describe how these methods can be used to detect cut-and-paste image forgeries. Additionally, we present an iterative algorithm capable of jointly estimating the contrast enhancement mapping used to modify an image as well as the image’s pixel value histogram before the application of contrast enhancement.
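As a hedged sketch of the histogram-fingerprint intuition, the code below scores an image by the fraction of high-frequency energy in the DFT of its pixel-value histogram, since contrast-enhancement mappings tend to introduce peaks and gaps there; the windowing and decision threshold are illustrative assumptions, not the detector described in the poster.

```python
# Score peak/gap artifacts in the pixel-value histogram as a contrast-enhancement cue.
import numpy as np

def enhancement_score(image_u8):
    """image_u8: 2-D array of uint8 pixel values.  Larger score => more peak/gap artifacts."""
    h, _ = np.histogram(image_u8, bins=256, range=(0, 256))
    pinch = np.minimum(np.arange(256), 255 - np.arange(256)) / 128.0   # de-emphasize saturated ends
    H = np.abs(np.fft.fft(h * pinch))
    return H[64:129].sum() / max(H.sum(), 1e-12)    # fraction of energy at high frequencies

def looks_enhanced(image_u8, threshold=0.05):
    return enhancement_score(image_u8) > threshold   # hypothetical decision threshold
```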
Many sensor network applications require precise knowledge of the locations of constituent nodes. In these applications, it is desirable for the wireless nodes to be able to autonomously determine their locations before they start sensing and transmitting data. Most localization algorithms rely on anchor nodes, whose locations are known, to determine the positions of the remaining nodes. In an adversarial scenario, some of these anchor nodes could be compromised and used to transmit misleading information aimed at preventing the accurate localization of the remaining sensors. In this paper, a computationally efficient localization algorithm that can resist such attacks is described. The proposed algorithm combines gradient descent with selective pruning of inconsistent measurements to achieve good localization accuracy. Simulation results show that the proposed algorithm has performance comparable to existing schemes while requiring fewer computational resources.
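A minimal sketch of the robust-localization idea, under assumed notation: anchors at known positions report (possibly corrupted) ranges to the unknown node, gradient descent is run on the squared range residuals, and the most inconsistent anchors are pruned along the way; the step size, iteration count, and pruning rule are illustrative, not the algorithm's actual parameters.

```python
# Gradient-descent range-based localization with pruning of inconsistent anchors.
import numpy as np

def localize(anchors, dists, n_iter=300, step=0.01, keep_fraction=0.7):
    """anchors: (K, 2) known positions; dists: (K,) measured ranges to the unknown node."""
    x = anchors.mean(axis=0)                       # initial guess: centroid of anchors
    use = np.ones(len(dists), dtype=bool)
    for it in range(n_iter):
        diff = x - anchors[use]
        rng_est = np.linalg.norm(diff, axis=1) + 1e-12
        resid = rng_est - dists[use]
        x -= step * (resid[:, None] * diff / rng_est[:, None]).mean(axis=0)   # gradient step
        if it == n_iter // 2:                      # prune once, mid-way: drop worst residuals
            scores = np.abs(np.linalg.norm(x - anchors, axis=1) - dists)
            use = scores <= np.quantile(scores, keep_fraction)
    return x
```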
In this work our aim is to significantly reduce the rate of false negatives (i.e., abnormal cases classified as normal) in dense mammograms. A very low false negative rate is important because classifying an abnormal mammogram as normal is a critical and risky error. Reducing the false positive rate also matters, but the main consequence of misclassifying a normal mammogram as abnormal is that the patient must undergo a biopsy, a time-consuming and stressful procedure. Normal detection is therefore more challenging than tumor detection, since it focuses on increasing the true negative rate (classifying normal cases as normal) while decreasing the false negative rate. The idea of characterizing normal mammograms rather than abnormal ones has been investigated only in recent years, and relatively little work has been done on it. Other studies have found that the relative risk of developing breast cancer in women with dense breasts is 400% higher than in women with fatty, non-dense breast tissue; hence, we focus on the "hard to classify" dense breast cases. In future work, an overall CAD system will be designed to identify the features of the two types of breast tissue, fatty and dense. The objective of this CAD system will be to achieve the highest possible sensitivity in detecting abnormal cases together with the highest possible specificity in classifying normal cases correctly.
We introduce a new Synthetic Aperture Radar (SAR) imaging modality that provides a high-resolution map of the spatial distribution of targets and terrain based on a significant reduction in the number of transmitted and/or received electromagnetic waveforms. This new imaging scheme, which requires no new hardware components, allows the aperture to be compressed and offers many important applications and advantages, including resolving ambiguities, strong resistance to countermeasures and interception, and reduced on-board storage constraints.