Experience
-
Research Assistant
Jan 2025 – Present
Q-Lab, UCSD
Advisors: Prof. Lianhui Qin and Matthew Ho
- Contributed to ArcMemo by leading the design of a program-synthesis-style memory ontology, developing many-to-one puzzle-to-feature mappings and manually curating concept parameterizations to enable concept-level reasoning.
- Engineered a reasoning-based retrieval mechanism (System-2 exploration) to resolve embedding failures, achieving a 7.5% relative gain on ARC-AGI-1 (59.33% official score).
- Built a complete concept dataset generation pipeline transforming hand-written concepts into validated helper puzzles through multi-stage LLM-based generation, code synthesis, and automated testing.
- Extended the framework to AIME math problems by designing a metacognitive self-assessment pipeline, improving accuracy by 9.3% via self-reflective memory usage.
LLM Agents, Program Synthesis, Concept-Level Memory, Python
-
Research Assistant
Jun 2025 – Present
Wang Lab, UCSD
Advisors: Prof. Wei Wang and Young Su Ko
- Developing TrustPPI: domain-specific trust signals for protein-protein interaction prediction; showed that deformation stability achieves 0.70–0.80 AUROC versus ~0.50 for generic confidence.
- Architected a heterogeneous Mixture-of-Experts system for chemical reaction prediction on USPTO datasets (1M+ reactions), integrating four specialized expert models with learned routing mechanisms.
- Implemented graph neural network encoders using directed message-passing architectures with shortest-path positional encodings, enabling permutation-invariant molecular representations for stereochemistry-sensitive reaction modeling.
- Engineered training pipelines with teacher-forcing, load-balancing losses, and router warmup; designed evaluation frameworks for top-k accuracy and per-expert ablation.
Protein Biology, PyTorch Geometric, MoE, GNNs
-
Student Researcher
Sep 2025 – Present
Halicioglu Data Science Institute, UCSD
Advisor: Prof. Hao Zhang (Senior Capstone)
- Built an end-to-end distributed training stack for a 1.8B-parameter language model on 8 NVIDIA B200 GPUs, implementing pipeline parallelism and scaling analysis.
- Developing TPU-optimized speculative decoding to reduce latency for test-time reasoning systems.
Distributed Training, Speculative Decoding, B200 GPUs
-
Student Researcher
Sep 2025 – Present
Halicioglu Data Science Institute, UCSD
Advisor: Prof. Yian Ma (Senior Capstone)
- Developing AR-Bench, an interactive reasoning benchmark where agents decide when to ask clarifying questions versus commit to an answer.
- Designing mutual-information-based uncertainty signals to detect epistemic uncertainty for risk-controlled stopping decisions.
Uncertainty Quantification, Mutual Information, Benchmarking
-
Research Assistant
Sep 2023 – Jun 2024
- Built forecasting models (ARIMA, exponential smoothing) on 20+ years of naturalization data to project immigration trends for 2024 election policy briefs.
- Developed automated data pipelines and interactive Tableau dashboards to communicate findings to non-technical stakeholders.
R, ARIMA, Tableau
Teaching
-
Tutor
Summer 2025
DSC 40A: Theoretical Foundations of Data Science I
Conducted tutoring sessions covering set theory, probability, and algorithmic thinking. Led review sessions before exams and supervised oral examinations.
Probability, Combinatorics
-
Grader
Fall 2024 – Fall 2025
MATH Department, UCSD
Graded for probability and statistics courses across 6 terms, including MATH 180A (Probability), MATH 180C (Stochastic Processes), MATH 181A (Mathematical Statistics), and MATH 185 (Computational Statistics). Provided detailed feedback on proof-based problems.
Probability, Statistics, Stochastic Processes