Graph Semi-Supervised Learning through bridgeness biased random walks

Graph Semi-Supervised Learning (gSSL) is a classification paradigm that has received a great deal of attention because it exploits the structure of unlabeled data together with expert-labeled data to build classifiers. Moreover, gSSL admits an interpretation in terms of random walks that propagate the labeled data through the graph structure, and this mechanism is the source of its strength [1].
On the other hand, network science has produced a wealth of metrics for mining the structure of real networks: some, such as centralities, capture node properties and quantify the role of each node, while others, more generally, uncover the organization of the network itself.
The contribution of the present work sits at the intersection of the two: we build on these network tools and incorporate them into gSSL through the design of biased random walks. This approach gives us the flexibility to choose which node and network features inform the random walk and, because the method is cast as a random walk by construction, it also yields a scalable algorithm.
The framework being general, we focus here on bridgeness centrality [2].
This measure quantifies how much a node lies on paths connecting pairs of nodes within the network, while discarding the contributions of "local" paths that either start or end in the immediate neighborhood of the node.
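One way to write this description down, as a sketch and not necessarily with the normalization used in [2], is as a betweenness-like sum restricted to non-local pairs. The symbols are assumptions introduced here: \sigma_{ik} is the number of shortest paths between nodes i and k, \sigma_{ik}(j) is the number of those paths passing through j, and N(j) is the set of immediate neighbors of j.

```latex
\mathrm{Bri}(j) \;=\; \sum_{\substack{i \neq j \neq k \\ i \notin N(j),\; k \notin N(j)}}
\frac{\sigma_{ik}(j)}{\sigma_{ik}}
```

Dropping the two conditions i \notin N(j) and k \notin N(j) recovers the usual betweenness centrality, so the restriction removes exactly the paths that start or end next to j.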
Bridgeness can hence be leveraged for classification purposes: a source of misclassification can be the "leakage" from one class to the other through the links connecting the two. We therefore use bridgeness to penalize transitions towards nodes whose bridgeness score is high: in practice, high-bridgeness nodes are regarded as a proxy for the "community edge", and the penalization acts as a barrier.
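The penalization described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exponential penalty exp(-beta * score), the parameters beta, alpha and n_iter, the random walk with restart used for classification, and the function names are all assumptions made for illustration. The node_score vector stands in for precomputed bridgeness values.

```python
import numpy as np


def biased_transition_matrix(adjacency, node_score, beta=1.0):
    """Row-stochastic P with P[i, j] proportional to A[i, j] * exp(-beta * score[j]).

    The exponential form of the penalty is an assumption of this sketch.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    node_score = np.asarray(node_score, dtype=float)
    # Down-weight every edge that points at a high-score node.
    weights = adjacency * np.exp(-beta * node_score)[np.newaxis, :]
    row_sums = weights.sum(axis=1, keepdims=True)
    # Isolated nodes keep an all-zero row instead of triggering a division by zero.
    return np.divide(weights, row_sums, out=np.zeros_like(weights), where=row_sums > 0)


def classify(adjacency, node_score, seeds, beta=1.0, alpha=0.85, n_iter=200):
    """Assign each node to the class whose biased walk visits it most.

    seeds maps a class label to the list of labeled node indices (hypothetical format).
    """
    P = biased_transition_matrix(adjacency, node_score, beta)
    n = P.shape[0]
    labels = sorted(seeds)
    scores = np.zeros((len(labels), n))
    for c, label in enumerate(labels):
        restart = np.zeros(n)
        restart[seeds[label]] = 1.0 / len(seeds[label])
        x = restart.copy()
        for _ in range(n_iter):
            # Random walk with restart on the labeled nodes of this class.
            x = alpha * (x @ P) + (1.0 - alpha) * restart
        scores[c] = x
    return [labels[c] for c in scores.argmax(axis=0)]
```

With beta = 0 the walk reduces to an unbiased random walk with restart, so the same function can be used to compare the biased and the classical behavior on a given graph.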
The efficacy of the method is then put to the test in cases, such as the one shown in the figure, where links between the classes skew the classification result of classical methods such as PageRank. Our method proves robust to those structural pitfalls, resulting in better classification performance [3].

Session:

Information and Communication Technologies

Authors:

Sarah de Nigris and Matteo Morini

Room:

Date:

Monday, September 24, 2018 - 18:45 to 19:00
