Brynn (Yibo) Peng

彭艺博

Master's Student in Artificial Intelligence Engineering

Carnegie Mellon University

Intro

I recently graduated from Carnegie Mellon University with a Master's degree in Artificial Intelligence Engineering. Currently, I work as a Research Assistant at NeuLab, where I'm fortunate to be advised by Professor Graham Neubig. I also collaborate closely with Professor Daniel Fried and PhD candidate Zora (Zhiruo) Wang at CMU.

I am deeply interested in multi-modal (e.g., vision, language) LLMs and agents, with applications to health science and code generation. My recent research focuses on retrieval-based methods, which I believe play a fundamental role in next-generation language models by improving their factuality, adaptability, and trustworthiness.

I am actively seeking a PhD position for Fall 2026 and potential collaboration opportunities. Feel free to reach out and review my CV for more details.

Research

2024

Can Long-Context Language Models Solve Repository-Level Code Generation? (in submission)

• Implemented an experimental framework to compare long-context models with Retrieval-Augmented Generation (RAG) for code generation, using Unlimiformer to extend models' input context lengths.

• Conducted experiments adjusting context lengths to explore their impact on code generation quality, identifying optimal ranges and revealing trade-offs between context length and noise.

• Demonstrated that RAG maintained superior performance over long-context models, highlighting its effectiveness in organizing and utilizing large-scale information even when extensive context is available; a minimal sketch of the retrieval side appears below.
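
The retrieval side of this comparison can be illustrated with a minimal sketch: chunk repository files into fixed-size snippets, score them against the unfinished code, and prepend the top-k hits to the prompt. The function names, the Jaccard scorer, and the window sizes below are illustrative assumptions, not the paper's actual framework.

```python
# Illustrative sketch of repository-level retrieval for RAG prompting.
from pathlib import Path

def chunk_repo(repo_dir: str, window: int = 20, stride: int = 10):
    """Slide a fixed-size line window over every Python file in the repo."""
    chunks = []
    for path in Path(repo_dir).rglob("*.py"):
        lines = path.read_text(errors="ignore").splitlines()
        for start in range(0, max(len(lines) - window, 0) + 1, stride):
            chunks.append("\n".join(lines[start:start + window]))
    return chunks

def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity, a stand-in for a learned retriever."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def build_rag_prompt(query_context: str, chunks, k: int = 4) -> str:
    """Prepend the k most similar snippets to the unfinished code."""
    top = sorted(chunks, key=lambda c: jaccard(query_context, c), reverse=True)[:k]
    retrieved = "\n\n".join(f"# retrieved snippet\n{c}" for c in top)
    return f"{retrieved}\n\n{query_context}"

if __name__ == "__main__":
    snippets = chunk_repo(".")
    print(build_rag_prompt("def load_config(path):", snippets)[:500])
```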

2022

Time Series Augmentation Based on GAN

• Combined unsupervised and supervised learning.

• Captured both dynamic and static features.

• Used a denoising autoencoder as the generator.

• Used a gated recurrent unit for imputation (GRUI) network; a minimal generator sketch appears below the demo link.

[Demo]
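
A minimal PyTorch sketch of the generator described above, assuming a denoising autoencoder built from GRU layers; the decay mechanism that GRUI adds for irregularly sampled series is omitted, and all layer sizes and names are illustrative.

```python
# Illustrative denoising-autoencoder generator with GRU encoder/decoder.
import torch
import torch.nn as nn

class DenoisingGRUGenerator(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, noise_std: float = 0.1):
        super().__init__()
        self.noise_std = noise_std
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Corrupt the input, then reconstruct it: the denoising objective.
        noisy = x + self.noise_std * torch.randn_like(x)
        _, h = self.encoder(noisy)                  # summarize the sequence
        # Broadcast the summary over the time axis to drive the decoder.
        z = h.transpose(0, 1).expand(-1, x.size(1), -1).contiguous()
        out, _ = self.decoder(z)
        return self.proj(out)                       # synthetic time series

if __name__ == "__main__":
    gen = DenoisingGRUGenerator(n_features=5)
    fake = gen(torch.randn(8, 30, 5))  # (batch, time, features)
    print(fake.shape)                  # torch.Size([8, 30, 5])
```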

Project

2024

Unlimiformer: Long-Range Transformers with Unlimited Length Input

• Reproduced Unlimiformer to extend input length in code generation tasks, using k-nearest-neighbors (kNN) retrieval to handle long-range inputs without increasing computational complexity.

• Integrated the RepoCoder RAG method for retrieval, tailoring it to code snippets for code generation.

• Improved model performance with Unlimiformer, achieving significant gains in EM (Exact Match) and ES (Edit Similarity) when handling input sequences beyond the original context length limit; the kNN attention idea is sketched below.
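
The core idea behind Unlimiformer's retrieval can be sketched as follows: at each decoding step, rather than cross-attending over every encoder state of a very long input, retrieve only the top-k states by nearest-neighbor search and attend over those. The dot-product lookup below stands in for the real implementation's index (e.g., FAISS); names and shapes are illustrative.

```python
# Illustrative kNN-restricted cross-attention over long encoder inputs.
import torch
import torch.nn.functional as F

def knn_cross_attention(query: torch.Tensor,
                        encoder_states: torch.Tensor,
                        k: int = 16) -> torch.Tensor:
    """query: (d,); encoder_states: (n, d) with n possibly huge."""
    # Retrieval step: dot-product scores stand in for an index lookup.
    scores = encoder_states @ query                 # (n,)
    top_scores, top_idx = scores.topk(k)            # keep only k neighbors
    # Attention step: softmax over just the retrieved states.
    weights = F.softmax(top_scores / query.size(0) ** 0.5, dim=0)
    return weights @ encoder_states[top_idx]        # (d,) attended context

if __name__ == "__main__":
    states = torch.randn(100_000, 64)   # a very long input's encoder states
    q = torch.randn(64)
    ctx = knn_cross_attention(q, states)
    print(ctx.shape)  # torch.Size([64])
```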

LLM Speculative Sampling

• Implemented the Speculative Decoding algorithm, improving inference speed for large Transformer models through parallel computation, achieving 2-3x acceleration in practical tests.

• Designed a KV-cache optimization to reduce memory-bandwidth bottlenecks and improve inference efficiency.

• Applied the acceleration technique to code generation tasks, conducting experimental validation using the Salesforce CodeGen model series (ranging from 350M to 6B parameters); the draft-and-verify loop is sketched below.
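
A minimal sketch of the speculative decoding loop: a small draft model proposes several tokens, the large target model scores all of them, and each proposal is accepted with probability min(1, p/q), with a residual resample on rejection. The toy uniform distributions below stand in for real language models; all names are illustrative.

```python
# Illustrative draft-then-verify round of speculative decoding.
import torch

def speculative_step(target_probs, draft_probs, prefix, gamma=4):
    """One draft-then-verify round; returns the extended prefix."""
    # 1) Draft model proposes gamma tokens autoregressively (cheap).
    drafted, q = [], []
    ctx = list(prefix)
    for _ in range(gamma):
        dist = draft_probs(ctx)
        tok = torch.multinomial(dist, 1).item()
        drafted.append(tok); q.append(dist); ctx.append(tok)
    # 2) Target distributions at every position (one batched forward pass
    #    with a real model; sequential calls here for clarity).
    p = [target_probs(list(prefix) + drafted[:i]) for i in range(gamma + 1)]
    # 3) Accept each drafted token with probability min(1, p/q).
    out = list(prefix)
    for i, tok in enumerate(drafted):
        if torch.rand(1).item() < min(1.0, (p[i][tok] / q[i][tok]).item()):
            out.append(tok)
        else:
            # Rejected: resample from the normalized residual (p - q)^+.
            resid = (p[i] - q[i]).clamp(min=0)
            out.append(torch.multinomial(resid / resid.sum(), 1).item())
            return out
    # All accepted: take one bonus token from the target's last distribution.
    out.append(torch.multinomial(p[gamma], 1).item())
    return out

if __name__ == "__main__":
    V = 50
    toy = lambda ctx: torch.ones(V) / V   # uniform stand-in for both models
    print(speculative_step(toy, toy, prefix=[1, 2, 3]))
```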

RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation

• Set up an experimental environment for testing, including line-, API-, and function-level code completion tasks.

• Enhanced the retrieval strategy to analyze code generation performance on state-of-the-art LLMs.

• Optimized the retrieval-generation pipeline by increasing the prompt length, which improved retrieval quality and raised the EM (Exact Match) score by over 5% compared to the original baseline; the iterative retrieval-generation loop is sketched below.
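
The iterative retrieval-generation loop behind RepoCoder can be sketched as follows: the model's own draft completion is appended to the retrieval query so the next round can find snippets that resemble the code being written. `retrieve` and `generate` below are placeholders for a real retriever and LLM; all names are illustrative.

```python
# Illustrative iterative retrieve-then-generate loop (RepoCoder-style).
def iterative_completion(unfinished_code: str,
                         retrieve,      # (query, k) -> list[str]
                         generate,      # (prompt)   -> str
                         rounds: int = 2,
                         k: int = 4) -> str:
    query = unfinished_code
    draft = ""
    for _ in range(rounds):
        snippets = retrieve(query, k)
        prompt = "\n\n".join(snippets) + "\n\n" + unfinished_code
        draft = generate(prompt)
        # Key idea: retrieve the next round with the draft included, since
        # generated code often resembles the ground-truth continuation.
        query = unfinished_code + "\n" + draft
    return draft

if __name__ == "__main__":
    fake_retrieve = lambda q, k: [f"# snippet matching: {q.splitlines()[0]}"] * k
    fake_generate = lambda p: "    return load_yaml(path)"
    print(iterative_completion("def load_config(path):",
                               fake_retrieve, fake_generate))
```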