Adversarial Paralinguistic Benchmark
28k+ examples with mismatched text and audio cues to diagnose transcript over-reliance in speech LLMs (work in progress).
I build multimodal and multi-agent LLM systems with applications in affective computing and social intelligence.
I am currently a Master's student in CS (AI) at USC Viterbi (graduating in December 2025).
I am also working as a research assistant at the
Intelligent Human Perception Lab, USC ICT
with Prof. Mohammad Soleymani and
Ashutosh Chaubey, where I focus on building and evaluating multimodal LLMs for paralinguistic understanding.
I am also working at the Melady Lab at USC Viterbi
with Prof. Yan Liu and Wei Yang
on multi-agent collaboration and reinforcement learning.
Encoder-decoder captioning model with audio-textual fusion, improving match rate by 20% using X-Norm fusion over early/late fusion baselines.
Detects mislabeled or low-confidence examples by exposing model overfitting to them, with candidates identified through majority voting among model judges (a minimal sketch follows below).
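As a rough illustration of that last project, here is a minimal sketch of majority voting among model judges to flag suspect labels. The function name, judge interface, and agreement threshold are illustrative assumptions, not the actual pipeline.

```python
# Hypothetical sketch: flag likely-mislabeled examples via majority voting
# among independent model "judges". Names and threshold are illustrative.
from collections import Counter

def flag_mislabeled(dataset, judges, min_agreement=2):
    """Return indices where a majority of judges disagree with the given label.

    dataset: iterable of (example, label) pairs
    judges:  list of callables mapping an example to a predicted label
    """
    flagged = []
    for idx, (example, label) in enumerate(dataset):
        votes = Counter(judge(example) for judge in judges)
        consensus, count = votes.most_common(1)[0]
        # Flag when the judges' consensus label differs from the annotation
        # and enough judges agree on that alternative.
        if consensus != label and count >= min_agreement:
            flagged.append(idx)
    return flagged
```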
More details in my CV...
More under double-blind review at ICLR.