Learning on Big Graph: Label Inference and Regularization with Anchor Hierarchy
Abstract: Several models have been proposed to cope with the rapidly increasing size of data, such as Anchor Graph Regularization (AGR). The AGR approach significantly accelerates graph-based learning by exploring a set of anchors. However, when a dataset becomes much larger, AGR still faces a big graph, which brings dramatically increasing computational costs. To overcome this issue, we propose a novel Hierarchical Anchor Graph Regularization (HAGR) approach that explores multiple layers of anchors arranged in a pyramid-style structure. In HAGR, the labels of datapoints are inferred from the coarsest anchors, layer by layer, in a coarse-to-fine manner. The label smoothness regularization is performed on all datapoints, and we demonstrate that the optimization process involves only a small reduced Laplacian matrix. We also introduce a fast approach to constructing our hierarchical anchor graph based on an approximate nearest neighbor search technique. Experiments on million-scale datasets demonstrate the effectiveness and efficiency of the proposed HAGR approach over existing methods. Results show that HAGR is even able to achieve good performance within 3 minutes on an 8-million-example classification task.
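To make the coarse-to-fine label inference concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each layer is linked to the next coarser layer by a row-normalized weight matrix, so that datapoint label scores are the product of these matrices with the label scores of the coarsest anchors. The names `local_weights`, `infer_labels`, `s`, and `sigma` are hypothetical; the anchors are random subsets rather than learned centers; the nearest-anchor search is brute force, where an approximate nearest neighbor index would be used at scale; and the coarsest-anchor scores `A` are random placeholders for what the regularization step would learn.

```python
import numpy as np

def local_weights(points, anchors, s=2, sigma=1.0):
    """Row-normalized Gaussian weights from each point to its s nearest anchors."""
    # Brute-force squared distances (assumption: small data; use ANN search at scale).
    d2 = ((points[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :s]
    rows = np.arange(points.shape[0])[:, None]
    d2_sel = d2[rows, nearest]
    # Shift by the smallest distance per row for numerical stability;
    # the shift cancels when the row is normalized.
    w = np.exp(-(d2_sel - d2_sel.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    Z = np.zeros_like(d2)
    Z[rows, nearest] = w
    return Z / Z.sum(axis=1, keepdims=True)

def infer_labels(Z_layers, A):
    """Propagate coarsest-anchor label scores A down to the datapoints."""
    F = A
    # Z_layers is ordered fine to coarse, so walk it in reverse.
    for Z in reversed(Z_layers):
        F = Z @ F
    return F

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                                   # datapoints
anchors_fine = X[rng.choice(200, size=20, replace=False)]       # finer anchor layer
anchors_coarse = anchors_fine[rng.choice(20, size=5, replace=False)]  # coarsest layer

Z_layers = [
    local_weights(X, anchors_fine),               # datapoints -> fine anchors
    local_weights(anchors_fine, anchors_coarse),  # fine anchors -> coarse anchors
]

# Placeholder label scores for 3 classes on the coarsest anchors.
A = rng.random((5, 3))
A /= A.sum(axis=1, keepdims=True)

F = infer_labels(Z_layers, A)
predicted = F.argmax(axis=1)
```

In this sketch only `A`, whose row count equals the number of coarsest anchors, would need to be optimized, which illustrates why the problem size is governed by the coarsest layer rather than by the number of datapoints.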