[month] [year]

Vani Sancheti

Vani Sancheti supervised by Prof. Vikram Pudi  received her Master of Science – Dual Degree  in Computer Science and Engineering (CSE). Here’s a summary of her research work on Partition-based Frequent Subgraph Mining using MCS:

 In spite of the vast work in the area of Maximal Frequent Subgraph (MFS) mining, current algorithms do not scale to large databases with numerous graphs. Even the smallest search space required by current MFS algorithms for large databases is too huge to work within the main memory alone, making the algorithms I/O bound. Added to this is the expensive cost of subgraph isomorphism operation, which is commonly used for frequency computation. This thesis explores partition-based secondary memory algorithms and proposes Partition based Frequent Subgraph Mining using Maximum Common Subgraph (PMCS), that can make maximal frequent subgraph mining for large databases viable. The naive application of the partition-based approach is inefficient due to the need to recompute subgraph frequency and the possibility of storing intermediate subgraphs on the disk. In PMCS, we overcome these drawbacks by intelligently avoiding redundant computations and performing frequency computation operations optimally using the Maximum Common Subgraph operation. Our intermediate results are also small enough to fit in the main memory. While existing algorithms fail to scale beyond 300k, our results demonstrate that PMCS scales to databases of up to 1000k graphs with an average of 25-30 edges per graph.  

April 2025