Exploiting Structure in Data
Research in graph signal processing (GSP) aims to develop tools for processing such data by providing a framework for the analysis of high-dimensional data defined on irregular graph domains. GSP extends fundamental signal processing concepts to data supported on graphs, which we refer to as graph signals. In this work, we study two fraternal problems: (1) sampling and (2) reconstruction of signals on graphs. Both problems have been prominent topics in signal processing over the past decades and have meaningful implications for many real-world applications. Sampling is the task of choosing or measuring a representative subset of the signal such that we can interpolate and recover the whole signal. In many settings, acquiring samples is expensive, and it is desirable to approximate the signal as efficiently as possible. Signal reconstruction refers to the task of recovering the true graph signal from noisy, incomplete, or corrupted measurements. These two problems share a duality and are intimately tied to the underlying graph structure.
In graph signal processing, a common assumption is that the graph signal is smooth with respect to the graph, that is, the signal coefficients do not vary much over local neighborhoods of the graph. However, this characterization is insufficient for many real-world signals that exhibit spatially inhomogeneous levels of smoothness over the graph. In social networks, for example, users’ profiles tend to be homogeneous within a given community or social circle, while a different social circle will have different, yet still internally homogeneous, values. Consequently, the signal is often characterized by large variations between regions and small variations within regions, so that there are localized discontinuities and patterns in the signal. As a result, it is necessary to develop representations and algorithms to process and analyze such {\em piecewise smooth} graph signals. In this body of work, we propose to study the tasks of sampling and reconstruction for two complementary classes of graph signals that broadly capture the characteristics of most graph-structured data: (1) smooth signals that are characterized by slow transitions with respect to the graph, and (2) piecewise-smooth signals that are characterized by localized behavior, abrupt discontinuities, or fast transitions over the underlying graph. We examine why a one-size-fits-all approach may not always be appropriate and consequently design algorithms and frameworks specific to each graph signal model. Product graphs are a generative model that allows us to decompose a large real-world graph into smaller factor graphs. We exploit this structure to alleviate the computational complexity and achieve better performance for both sampling and reconstruction.
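To illustrate why product graphs make large problems tractable, the following is a minimal NumPy sketch of the standard fact that the Laplacian of a Cartesian product graph is the Kronecker sum of the factor Laplacians, so its spectrum is determined entirely by the much smaller factor spectra. The helper `path_laplacian` and the choice of small path-graph factors are ours, for illustration only.

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian of a path graph on n vertices (illustrative helper)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

# Cartesian product of two factor graphs: the Laplacian of the product is the
# Kronecker sum  L = L1 (x) I + I (x) L2,  so we never need to form or
# eigendecompose the large graph directly.
L1, L2 = path_laplacian(4), path_laplacian(3)
L_prod = np.kron(L1, np.eye(3)) + np.kron(np.eye(4), L2)

# Eigenvalues of the product Laplacian are all pairwise sums of factor eigenvalues.
w1, w2 = np.linalg.eigvalsh(L1), np.linalg.eigvalsh(L2)
w_prod = np.linalg.eigvalsh(L_prod)
assert np.allclose(np.sort(w_prod), np.sort(np.add.outer(w1, w2).ravel()))
```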
We consider two sampling paradigms: (1) passive sampling, and (2) active sampling. Passive sampling refers to the setting where we are constrained to strategies that are blind to any samples of the signal. That is, we design strategies by considering only the underlying graph structure and any modeling assumptions we have made. In contrast, active or adaptive sampling strategies are able to choose samples in an online fashion: the decision of where to sample next depends on all the observations made previously. We study the fundamental limits of sampling these two graph signal models under passive and active settings. We present a framework for the reconstruction or estimation of graph signals and investigate the limitations of these methods.
The irregularity of the underlying data domain, in contrast to the aforementioned regularity of classical signal processing, makes studying these problems on graphs both challenging and compelling. A key theme throughout this work is the interplay between the graph structure and the signal that lies on it. We study how structural properties of both the graph and the signal on the graph inform not only how well we can perform these two tasks but also the design of algorithms and strategies to perform them efficiently.
- Sampling Smooth Graph Signals: We present the analog of classical sampling theory for bandlimited signals on graphs and show extensions for efficient sampling and recovery on product graphs (a minimal recovery sketch appears after this list). We then study the fundamental statistical limits of sampling smooth graph signals and establish how the structure of the graph drives the optimal sampling strategy, how well we can expect to do, and the performance of the resulting algorithms. Product graphs allow us to decompose a large real-world graph into smaller factor graphs; we develop frameworks for (randomized) sampling that both lessen the computational complexity and achieve better performance by exploiting this inherent structure. In the semi-supervised learning setting, it is often of interest to provide a confidence score or a level of uncertainty in addition to our decision. Towards this goal, we introduce a Bayesian treatment of semi-supervised learning on graphs via sampling theory.
- Reconstruction/Denoising Smooth Graph Signals: Broadly, there are two frameworks for signal reconstruction or estimation. The first, which we refer to as the synthesis framework, generally consists of constructing an appropriate basis or dictionary over the graph for the class of signals, and then regressing the observed signal y over this basis. The graph Fourier basis promotes smoothness in the sense that smooth signals are approximately bandlimited with respect to it. A straightforward way to recover the signal is therefore to choose a suitable bandwidth K and project the signal onto the corresponding bandlimited space. We also study graph signal reconstruction based on the analysis framework, where we formulate an optimization problem that augments the data-fidelity term with a smoothness-promoting penalty function; a minimal instance of such an estimator is sketched after this list. We extend a general form of graph signal recovery to a flexible optimization formulation that accounts for outliers and multiple signals. We further show how we can recover smooth signals on product graphs by structuring the signal as a low-rank tensor. Tensor factorization is a decomposition method for high-dimensional data that is used to estimate the prominent factors in a signal. Similarly to matrix factorization and PCA, and to graph signal processing transforms such as spectral graph wavelets and the graph Fourier transform, tensor decomposition allows us to detect latent structure in graph data. Typical tasks built on top of this include denoising, inpainting, and anomaly detection. We study the spectral decomposition of such product graphs and show how an extension of the smoothness assumption to signals on product graphs leads to the corresponding tensor Y possessing a low-dimensional structure.
- Sampling Piecewise-Smooth Graph Signals: The discontinuities in piecewise-smooth signals make the task of efficiently sampling them substantially more challenging. Unlike smooth signals, which have no discontinuities, piecewise-smooth signals contain localized discontinuities whose detection is inherently decoupled from the global features of the graph signal. It follows that passive sampling of piecewise-smooth graph signals is a significantly harder, or even futile, task. Consequently, we study the active sampling of piecewise-smooth signals and design adaptive algorithms and sampling strategies for it. As before, we seek to understand how the underlying graph structure influences both the fundamental limits and the performance of these algorithms.
- Reconstruction/Denoising Piecewise-Smooth Graph Signals: We discuss the reconstruction of piecewise-smooth graph signals from noisy or corrupted measurements and present two approaches. The first approach is based on approximating the signal with respect to a graph wavelet basis or dictionary, while the second solves an optimization problem via the graph trend filtering (GTF) framework, which enforces a sparsity constraint on (higher-order) discrete graph differences. In particular, we compare wavelet soft-thresholding on graphs and graph-based total variation denoising for the estimation of piecewise-smooth signals on graphs. Within our graph trend filtering framework, we introduce a family of non-convex sparsity penalties, including SCAD and MCP, that exhibit superior reconstruction performance over $\ell_1$ minimization for the denoising of piecewise-smooth graph signals. Through theoretical analysis and empirical results, we demonstrate that the use of non-convex penalties improves the performance of GTF in terms of both reduced reconstruction error and improved support recovery, i.e., how accurately we can localize the discontinuities of the piecewise-smooth signal. While the non-convexity means that, in general, we may not always find the global optimum of the objective, it affords us many other advantages. SCAD and MCP both taper off to a constant value and hence apply less shrinkage to larger coefficients; as a result, they mitigate the bias of $\ell_1$ shrinkage while still promoting sparsity (the thresholding sketch after this list illustrates this behavior).
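As a concrete companion to the first item above (Sampling Smooth Graph Signals), the following is a minimal NumPy sketch of recovering an (approximately) K-bandlimited signal from vertex samples by least squares in the span of the leading Laplacian eigenvectors. The function name and interface are illustrative and not the notation used later in this work.

```python
import numpy as np

def recover_bandlimited(L, y, sample_idx, K):
    """Least-squares recovery of a K-bandlimited graph signal from vertex samples.

    L          : (n, n) combinatorial graph Laplacian
    y          : observed samples, y = x[sample_idx] (possibly noisy)
    sample_idx : indices of the sampled vertices
    K          : bandwidth, i.e. number of leading graph Fourier modes kept
    """
    # Graph Fourier basis: Laplacian eigenvectors ordered by increasing frequency.
    _, V = np.linalg.eigh(L)
    V_K = V[:, :K]            # the K smoothest modes span the bandlimited space

    # Fit the observed samples within the bandlimited subspace; recovery is exact
    # when the signal is exactly K-bandlimited and the sampled rows of V_K have
    # full column rank (at least K well-chosen vertices).
    alpha, *_ = np.linalg.lstsq(V_K[sample_idx, :], y, rcond=None)
    return V_K @ alpha
```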
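For the analysis framework in the second item (Reconstruction/Denoising Smooth Graph Signals), one standard choice of smoothness penalty is the Laplacian quadratic form $x^\top L x$. The sketch below is a minimal instance under that choice, not our full recovery formulation, and the helper name is hypothetical.

```python
import numpy as np

def laplacian_tikhonov_denoise(L, y, lam):
    """Analysis-style denoising with a graph Laplacian quadratic penalty.

    Solves   min_x ||y - x||_2^2 + lam * x^T L x,
    whose closed-form solution is  x* = (I + lam * L)^{-1} y.
    Larger lam forces the estimate to vary less across edges of the graph.
    """
    n = L.shape[0]
    return np.linalg.solve(np.eye(n) + lam * L, y)
```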
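Finally, to illustrate the shrinkage behavior described in the last item, the sketch below contrasts the $\ell_1$ proximal operator (soft-thresholding) with the MCP proximal operator (firm thresholding); the parameter choice gamma = 3 and the function names are illustrative only.

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal operator of the l1 penalty: every nonzero coefficient is shrunk by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def mcp_threshold(z, lam, gamma=3.0):
    """Proximal operator of the MCP penalty (gamma > 1), a.k.a. firm thresholding.

    Small coefficients are zeroed out just as with soft-thresholding, but
    coefficients larger than gamma * lam are left untouched, so large
    discontinuities are not biased toward zero.
    """
    z = np.asarray(z, dtype=float)
    shrunk = np.sign(z) * (np.abs(z) - lam) / (1.0 - 1.0 / gamma)
    out = np.where(np.abs(z) <= lam, 0.0, shrunk)
    return np.where(np.abs(z) > gamma * lam, z, out)

# The l1 prox shrinks even a large coefficient, while MCP leaves it untouched.
z = np.array([0.3, 1.2, 5.0])
print(soft_threshold(z, lam=1.0))   # 0.0, 0.2, 4.0  (large value shrunk by lam)
print(mcp_threshold(z, lam=1.0))    # 0.0, 0.3, 5.0  (large value untouched)
```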