Prasha Srivastava, supervised by Dr. Zia Abbas, received her Master of Science in Electronics and Communication Engineering (ECE). Here’s a summary of her research work on Exploring Synthetic Data Generation Techniques to Enhance Machine Learning Applications in VLSI Circuit Design:
Machine learning (ML) has emerged as a powerful tool for automating and optimizing Very-Large-Scale Integration (VLSI) design. Its effectiveness, however, is often limited by a critical challenge: data scarcity. As VLSI systems grow in complexity, high-quality training datasets become increasingly vital for developing robust and accurate ML models. This thesis addresses the problem by exploring data augmentation with generative models, specifically Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion models.
The first part of this research investigates the use of GANs to generate supplementary circuit data from simulations performed in established design environments, including Cadence Virtuoso, HSPICE, and Microcap. The aim is to synthesize artificial circuit data that accurately reflects the characteristics of real data, thereby enlarging the training sets available to ML algorithms. A comparative analysis evaluates GANs against VAEs, highlighting their respective strengths and weaknesses for data augmentation in VLSI applications.
The thesis then turns to diffusion models, which have recently gained prominence for generating high-fidelity synthetic data. Using these techniques, we demonstrate the potential to produce artificial datasets that closely resemble real electronic circuit behavior, addressing the limitations of traditional data collection methods.
Through simulations conducted in the HSPICE design environment, we establish the quality and reliability of the generated synthetic data, validating its applicability for improving ML model performance. A key focus of this research is the rigorous evaluation of the authenticity of the synthetic data and its impact on the predictive accuracy of ML models. The results reveal a significant reduction in prediction errors for circuit performance assessments when models are trained on augmented datasets. This improvement highlights the effectiveness of the proposed methodologies and underscores the importance of data-driven approaches in advancing electronic design automation.
In addition to its main contributions, this thesis sheds light on the challenges encountered when implementing generative techniques. It critically examines issues such as the calibration of model parameters, the validation of synthetic data fidelity, and the integration of synthetic data into existing design workflows. We also report lessons learned from experiments that did not yield successful results, offering useful perspectives for future research in this field.
By addressing data scarcity through generative modeling, this thesis contributes to the ongoing dialogue within the VLSI community on the future of ML applications in electronic design automation. The findings pave the way for more effective ML solutions in VLSI design and provide a foundation for future advances in electronic design.
May 2025