My Projects

Here are some of the key projects I've worked on, showcasing my skills in software development, machine learning, and systems programming.

SGLang

March 2025 - Present

A high-performance serving framework for Large Language Models and Multi-Modal Language Models.

Key Contributions:

  • Contributed as a core developer to SGLang
  • Co-lead the multi-modal language model (MLLM) team to integrate various MLLMs into the SGLang framework
  • Investigated and resolved critical issues for the Qwen-VL model series, enhancing its stability within SGLang
  • Refactored the multi-modal processing pipeline, improving robustness and maintainability
  • Collaborated with the VeRL team to enable SGLang as a rollout backend for MLLMs
  • Added support for new models, including Qwen, Gemma, and Mistral
  • Refactored the OpenAI-compatible server for improved robustness and maintainability
  • Established and maintained the MLLM benchmarking suite
  • Maintained the Continuous Integration/Continuous Deployment (CI/CD) pipeline

Technologies Used:

Python, PyTorch, Docker, Git, CUDA

ServerlessLLM

Jun 2024 - Present

An open-source serving system for cost-effective multi-LLM deployment on resource-constrained GPU environments.

Key Contributions:

  • Contributed as a core developer to ServerlessLLM
  • Implemented a distributed profiling component for Ray workers to monitor and optimize performance
  • Containerized the project using Docker to streamline cross-platform deployment
  • Enhanced the auto-scaling component to enable elastic model instance scaling and efficient GPU multiplexing
  • Developed a command-line interface (CLI) and comprehensive test suites to ensure system reliability and usability

Technologies Used:

Python, C, Ray, Docker, Git, GPU Computing

SER using Self-Supervised Learning and LLM

Sep 2024 - Dec 2024

A state-of-the-art Speech Emotion Recognition system using self-supervised models.

Key Contributions:

  • Developed a state-of-the-art Speech Emotion Recognition (SER) system by transitioning from traditional ML methods to fine-tuning self-supervised models
  • Fine-tuned a cross-lingually pre-trained model to achieve state-of-the-art performance
  • Implemented extensive data augmentation and hyperparameter optimization techniques to enhance model robustness and generalization

Technologies Used:

Python, PyTorch, scikit-learn

Virtual Memory & Cache Simulator

Sep 2023 - Dec 2023

A C-based simulator that merges cache systems with virtual memory management.

Key Contributions:

  • Engineered a C-based simulator that combines a cache hierarchy with virtual memory management, including a TLB and page tables, to simulate virtual-to-physical address translation
  • Added configurable cache sizes, TLB entry counts, and page replacement policies, so the simulator can model different system configurations and show how performance varies across them
  • Added comprehensive error handling that detects, reports, and resolves simulation issues, keeping runs reliable and results accurate
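
To illustrate the translation path described above, here is a minimal sketch. It is written in Python for brevity and is not the project's C code. The class name Translator, the 4 KiB page size, the LRU policy for the TLB, and the FIFO page replacement policy are all assumptions chosen for the example.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class Translator:
    """Toy virtual-to-physical translator: LRU TLB, FIFO page replacement."""

    def __init__(self, tlb_entries=4, num_frames=8):
        self.tlb_entries = tlb_entries
        self.num_frames = num_frames
        self.tlb = OrderedDict()         # vpn -> frame, ordered by recency of use
        self.page_table = OrderedDict()  # vpn -> frame, ordered by load time
        self.stats = {"tlb_hits": 0, "tlb_misses": 0, "page_faults": 0}

    def _load_page(self, vpn):
        """Handle a page fault: use a free frame or evict the oldest page."""
        if len(self.page_table) < self.num_frames:
            frame = len(self.page_table)
        else:
            victim, frame = self.page_table.popitem(last=False)
            self.tlb.pop(victim, None)   # drop any stale TLB entry for the victim
        self.page_table[vpn] = frame
        return frame

    def translate(self, vaddr):
        """Return the physical address for a virtual address."""
        if vaddr < 0:
            raise ValueError("virtual address must be non-negative")
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.stats["tlb_hits"] += 1
            self.tlb.move_to_end(vpn)    # mark as most recently used
            frame = self.tlb[vpn]
        else:
            self.stats["tlb_misses"] += 1
            if vpn in self.page_table:
                frame = self.page_table[vpn]
            else:
                self.stats["page_faults"] += 1
                frame = self._load_page(vpn)
            if len(self.tlb) >= self.tlb_entries:
                self.tlb.popitem(last=False)  # evict least recently used entry
            self.tlb[vpn] = frame
        return frame * PAGE_SIZE + offset
```

Creating Translator(tlb_entries=2, num_frames=2) and calling translate on a few addresses, then reading the stats dictionary, shows how the hit and fault counts change with the TLB size and frame count.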

Technologies Used:

C, Assembly, Git

More Projects

I'm constantly working on new projects. Check out my GitHub for the latest updates on what I'm building.

Interested in collaborating?

I'm always open to discussing new projects, creative ideas, or opportunities to be part of your team.

Get in Touch