DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING | DWIGHT LOOK COLLEGE OF ENGINEERING | TEXAS A&M UNIVERSITY
Motif listing in large graphs
Motifs are small subgraphs (e.g., triangles, four-cycles) that appear much more frequently in natural graphs than in classical random graphs. Their discovery (enumeration or listing) plays an important role in various fields. This project aims to analyze the complexity of the algorithms involved and to design techniques that can handle trillion-edge graphs with limited RAM.
The next four undirected graphs are used in the ICDM 2016 paper. Each consists of two files -- (source node, degree) pairs and all adjacency lists dumped one after the other. All node IDs and degrees are unsigned 4-byte integers, LSB order. The IDs are sequential, with no gaps. Source nodes and neighbor lists are sorted ascending.

IRLbot domain (14 GB): degree, edges
IRLbot host (53 GB): degree, edges
IRLbot IP (6 GB): degree, edges
Full ClueWeb09 webgraph (358 GB): degree, edges

Trigon source code (can be used for PCF as well)
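
As an illustration of the two-file layout described above, here is a minimal Python sketch that walks a degree file and an edge file in lockstep and yields one adjacency list at a time. It rests on two assumptions not stated on this page: that "LSB order" means little-endian, and that the edge file holds only neighbor IDs back to back, with no per-list header or separator. The function name `read_adjacency` is hypothetical and is not part of Trigon.

```python
import struct
from typing import BinaryIO, Iterator, List, Tuple


def read_adjacency(
    degree_file: BinaryIO, edge_file: BinaryIO
) -> Iterator[Tuple[int, List[int]]]:
    """Yield (node, neighbors) pairs by reading both files sequentially.

    Assumed layout (see the lead-in): the degree file is a run of
    little-endian uint32 pairs (node, degree); the edge file is a run of
    little-endian uint32 neighbor IDs, one list after another.
    """
    while True:
        header = degree_file.read(8)
        if not header:
            return  # clean end of the degree file
        if len(header) != 8:
            raise ValueError("truncated (node, degree) record")
        node, degree = struct.unpack("<II", header)

        # Read exactly `degree` neighbor IDs for this node.
        raw = edge_file.read(4 * degree)
        if len(raw) != 4 * degree:
            raise ValueError(f"truncated adjacency list for node {node}")
        yield node, list(struct.unpack(f"<{degree}I", raw))
```

Because only one adjacency list is held in memory at a time, the same loop can stream files much larger than RAM when the two arguments are opened with `open(path, "rb")`; the actual file names depend on the download and are not assumed here.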