Brynn (Yibo) Peng

彭艺博

Master's Student in Artificial Intelligence Engineering

Carnegie Mellon University

Intro

I recently graduated from Carnegie Mellon University with a Master's degree in Artificial Intelligence Engineering. Currently, I work as a Research Assistant at NeuLab, where I'm fortunate to be advised by Professor Graham Neubig. I also collaborate closely with Professor Daniel Fried and PhD candidate Zora (Zhiruo) Wang at CMU.

I am deeply interested in multi-modal (e.g., vision, language) LLMs and agents, with applications to health science and code generation. My recent research focuses on retrieval-based methods, which I believe play a fundamental role in next-generation language models by improving their factuality, adaptability, and trustworthiness.

I am actively seeking a PhD position for Fall 2026 and potential collaboration opportunities. Feel free to reach out and review my CV for more details.

Research

2024

Can Long-Context Language Models Solve Repository-Level Code Generation? (in submission)

• Implemented an experimental framework to compare long-context models with Retrieval-Augmented Generation (RAG) for code generation, using Unlimiformer to extend models' input context lengths.

• Conducted experiments adjusting context lengths to explore their impact on code generation quality, identifying optimal ranges and revealing trade-offs between context length and noise.

• Demonstrated that RAG maintained superior performance over long-context models, highlighting its effectiveness in organizing and utilizing large-scale information even when extensive context is available; a minimal sketch of the retrieval side appears below.
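
The retrieval side of this comparison can be illustrated with a minimal sketch: chunk repository files into fixed-size snippets, score them against the unfinished code, and prepend the top-k hits to the prompt. The function names, the Jaccard scorer, and the window sizes below are illustrative assumptions, not the paper's actual framework.

```python
# Illustrative sketch of repository-level retrieval for RAG prompting.
from pathlib import Path

def chunk_repo(repo_dir: str, window: int = 20, stride: int = 10):
    """Slide a fixed-size line window over every Python file in the repo."""
    chunks = []
    for path in Path(repo_dir).rglob("*.py"):
        lines = path.read_text(errors="ignore").splitlines()
        for start in range(0, max(len(lines) - window, 0) + 1, stride):
            chunks.append("\n".join(lines[start:start + window]))
    return chunks

def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity, a stand-in for a learned retriever."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def build_rag_prompt(query_context: str, chunks, k: int = 4) -> str:
    """Prepend the k most similar snippets to the unfinished code."""
    top = sorted(chunks, key=lambda c: jaccard(query_context, c), reverse=True)[:k]
    retrieved = "\n\n".join(f"# retrieved snippet\n{c}" for c in top)
    return f"{retrieved}\n\n{query_context}"

if __name__ == "__main__":
    snippets = chunk_repo(".")
    print(build_rag_prompt("def load_config(path):", snippets)[:500])
```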

2022

Time Series Augmentation Based on GAN

• Combined unsupervised and supervised learning.

• Captured both dynamic and static features.

• Used a denoising autoencoder as the generator.

• Used a gated recurrent unit for imputation (GRUI) network; a minimal generator sketch appears below the demo link.

[Demo]
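
A minimal PyTorch sketch of the generator described above, assuming a denoising autoencoder built from GRU layers; the decay mechanism that GRUI adds for irregularly sampled series is omitted, and all layer sizes and names are illustrative.

```python
# Illustrative denoising-autoencoder generator with GRU encoder/decoder.
import torch
import torch.nn as nn

class DenoisingGRUGenerator(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, noise_std: float = 0.1):
        super().__init__()
        self.noise_std = noise_std
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Corrupt the input, then reconstruct it: the denoising objective.
        noisy = x + self.noise_std * torch.randn_like(x)
        _, h = self.encoder(noisy)                  # summarize the sequence
        # Broadcast the summary over the time axis to drive the decoder.
        z = h.transpose(0, 1).expand(-1, x.size(1), -1).contiguous()
        out, _ = self.decoder(z)
        return self.proj(out)                       # synthetic time series

if __name__ == "__main__":
    gen = DenoisingGRUGenerator(n_features=5)
    fake = gen(torch.randn(8, 30, 5))  # (batch, time, features)
    print(fake.shape)                  # torch.Size([8, 30, 5])
```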

Project

2024

Unlimiformer: Long-Range Transformers with Unlimited Length Input

• Reproduced Unlimiformer to extend input length in code generation tasks, using k-nearest-neighbors (kNN) retrieval to handle long-range inputs without increasing computational complexity.

• Integrated the RepoCoder RAG method for retrieval, tailoring it to code snippets for code generation.

• Improved model performance with Unlimiformer, achieving significant gains in EM (Exact Match) and ES (Edit Similarity) when handling input sequences beyond the original context length limit; the kNN attention idea is sketched below.
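
The core idea behind Unlimiformer's retrieval can be sketched as follows: at each decoding step, rather than cross-attending over every encoder state of a very long input, retrieve only the top-k states by nearest-neighbor search and attend over those. The dot-product lookup below stands in for the real implementation's index (e.g., FAISS); names and shapes are illustrative.

```python
# Illustrative kNN-restricted cross-attention over long encoder inputs.
import torch
import torch.nn.functional as F

def knn_cross_attention(query: torch.Tensor,
                        encoder_states: torch.Tensor,
                        k: int = 16) -> torch.Tensor:
    """query: (d,); encoder_states: (n, d) with n possibly huge."""
    # Retrieval step: dot-product scores stand in for an index lookup.
    scores = encoder_states @ query                 # (n,)
    top_scores, top_idx = scores.topk(k)            # keep only k neighbors
    # Attention step: softmax over just the retrieved states.
    weights = F.softmax(top_scores / query.size(0) ** 0.5, dim=0)
    return weights @ encoder_states[top_idx]        # (d,) attended context

if __name__ == "__main__":
    states = torch.randn(100_000, 64)   # a very long input's encoder states
    q = torch.randn(64)
    ctx = knn_cross_attention(q, states)
    print(ctx.shape)  # torch.Size([64])
```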

LLM Speculative Sampling

• Implemented the Speculative Decoding algorithm, improving inference speed for large Transformer models through parallel computation, achieving 2-3x acceleration in practical tests.

• Designed a KV-cache optimization to reduce memory-bandwidth bottlenecks and improve inference efficiency.

• Applied the acceleration technique to code generation tasks, conducting experimental validation using the Salesforce CodeGen model series (ranging from 350M to 6B parameters); the draft-and-verify loop is sketched below.
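
A minimal sketch of the speculative decoding loop: a small draft model proposes several tokens, the large target model scores all of them, and each proposal is accepted with probability min(1, p/q), with a residual resample on rejection. The toy uniform distributions below stand in for real language models; all names are illustrative.

```python
# Illustrative draft-then-verify round of speculative decoding.
import torch

def speculative_step(target_probs, draft_probs, prefix, gamma=4):
    """One draft-then-verify round; returns the extended prefix."""
    # 1) Draft model proposes gamma tokens autoregressively (cheap).
    drafted, q = [], []
    ctx = list(prefix)
    for _ in range(gamma):
        dist = draft_probs(ctx)
        tok = torch.multinomial(dist, 1).item()
        drafted.append(tok); q.append(dist); ctx.append(tok)
    # 2) Target distributions at every position (one batched forward pass
    #    with a real model; sequential calls here for clarity).
    p = [target_probs(list(prefix) + drafted[:i]) for i in range(gamma + 1)]
    # 3) Accept each drafted token with probability min(1, p/q).
    out = list(prefix)
    for i, tok in enumerate(drafted):
        if torch.rand(1).item() < min(1.0, (p[i][tok] / q[i][tok]).item()):
            out.append(tok)
        else:
            # Rejected: resample from the normalized residual (p - q)^+.
            resid = (p[i] - q[i]).clamp(min=0)
            out.append(torch.multinomial(resid / resid.sum(), 1).item())
            return out
    # All accepted: take one bonus token from the target's last distribution.
    out.append(torch.multinomial(p[gamma], 1).item())
    return out

if __name__ == "__main__":
    V = 50
    toy = lambda ctx: torch.ones(V) / V   # uniform stand-in for both models
    print(speculative_step(toy, toy, prefix=[1, 2, 3]))
```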

RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation

• Set up an experimental environment for testing, including line-, API-, and function-level code completion tasks.

• Enhanced the retrieval strategy to analyze code generation performance on state-of-the-art LLMs.

• Optimized the retrieval-generation pipeline by increasing the prompt length, which improved retrieval quality and raised the EM (Exact Match) score by over 5% compared to the original baseline; the iterative retrieval-generation loop is sketched below.
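
The iterative retrieval-generation loop behind RepoCoder can be sketched as follows: the model's own draft completion is appended to the retrieval query so the next round can find snippets that resemble the code being written. `retrieve` and `generate` below are placeholders for a real retriever and LLM; all names are illustrative.

```python
# Illustrative iterative retrieve-then-generate loop (RepoCoder-style).
def iterative_completion(unfinished_code: str,
                         retrieve,      # (query, k) -> list[str]
                         generate,      # (prompt)   -> str
                         rounds: int = 2,
                         k: int = 4) -> str:
    query = unfinished_code
    draft = ""
    for _ in range(rounds):
        snippets = retrieve(query, k)
        prompt = "\n\n".join(snippets) + "\n\n" + unfinished_code
        draft = generate(prompt)
        # Key idea: retrieve the next round with the draft included, since
        # generated code often resembles the ground-truth continuation.
        query = unfinished_code + "\n" + draft
    return draft

if __name__ == "__main__":
    fake_retrieve = lambda q, k: [f"# snippet matching: {q.splitlines()[0]}"] * k
    fake_generate = lambda p: "    return load_yaml(path)"
    print(iterative_completion("def load_config(path):",
                               fake_retrieve, fake_generate))
```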