My research focuses on AI/LLM-driven software engineering and security, particularly leveraging large language models to enhance program analysis, vulnerability detection, and mobile ecosystem security.
Theme 1: LLM-driven Automated Software Engineering with Reliability Guarantees
This research theme aims to establish the foundations of reliable LLM-driven automated software engineering. It explores how large language models can reason about program semantics, support critical software engineering tasks with correctness guarantees, and be systematically evaluated in terms of effectiveness, robustness, and efficiency. By addressing challenges in reasoning, reliability, and evaluation, this work seeks to enable trustworthy integration of LLMs into real-world software development pipelines.
Program Reasoning Capability
Developing LLM-based techniques for precise semantic understanding of programs, enabling accurate reasoning over complex software systems.
Focus areas:
- Program Analysis Reasoning
- Cross-representation/layer Semantic Reasoning
Representative papers:
- Artemis: LLM-Assisted inter-procedural path-sensitive taint analysis (OOPSLA 2025, CCS 2023) Top
- LLM-CompDroid: Repairing configuration compatibility bugs (TOSEM 2025)
- KEENHash: Large-scale binary code similarity analysis (ISSTA 2025) Top
Reliability in SE Tasks
Designing LLM-driven approaches to automate core software engineering tasks with a focus on correctness, robustness, and practical effectiveness.
Focus areas:
- Quality Assessment of LLM-generated Code
- Reliable Test Generation and Validation
- Automated Debugging and Program Repair
Representative papers:
- Test augmentation (OOPSLA 2026) Top
- Coverage goal selection (TSE 2024)
- Low-code programming using traditional vs. LLM support (JSS 2025)
- ChatGPT vs. SBST (TSE 2024)
- Unearthing Gas-Wasting Code Smells in Smart Contracts (TSE 2024)
- Assessing the Quality of Code Generation by ChatGPT (TSE 2024)
Evaluation and Efficiency
Developing principled frameworks to systematically evaluate LLM-based software engineering techniques in terms of effectiveness, reliability, and computational efficiency.
Focus areas:
- Benchmark Design
- Evaluation Metrics
- Cost-Aware Inference
- Carbon Footprint
Theme 2: Mobile Security and Android Ecosystem Analysis
Understanding security risks, privacy risks, and malicious behaviors in large-scale mobile ecosystems through systematic analysis of applications, system mechanisms, and software supply chains.
Apps & Android OS Security
Representative papers:
- Unauthorized encrypted private data transmission (ICSE 2026) Top
- Mobile Sharing Service Abuse (WWW 2022) Top
- App Link Attack (FSE 2020) Top
- Resource Race Attack (SANER 2020)
- Diehard Android Apps (ASE 2020) Top
Malware Detection and Adversarial Analysis
Representative papers:
- Fine-grained malicious component detection (ASE 2023) Top
- Adversarial attacks on deep learning apps (JSEP 2023)
App Ecosystem and Supply Chain Security
Representative papers:
- Android app bundle analysis (TSE 2025)
- Third-party library and dependency analysis (ASE 2019, TSE 2021)
- Repackaged Apps Detection (SANER 2019)
- App Debloat (TSE 2022)
Theme 3: Empirical Software Engineering and User-Centric Analysis
Conducting large-scale empirical studies to understand software quality, developer behavior, and user feedback.
Bug Analysis and Software Quality
Representative papers:
- Bug characterization in Jupyter systems (EASE 2025)
- Defect prediction and software quality studies (SCP 2025, IJSEKE 2023, ICPADS 2021, WCMC 2021, TReli 2021, SAC 2021, QRS 2020, IST 2020, QRS 2019, ISSRE 2019, JCST 2019, JSS 2019a, JSS 2019b, IST 2018, ICPC 2018)
User Review and Feedback Mining
- User-review-based bug localization (TSE 2022)
- Feedback Analysis in SPL Forked Developments (SPLC 2025)