Directly Optimizing Evaluation Metrics to Improve Text to Motion (2023)
Yili Wang
There is a long-standing discrepancy between the training and testing processes of most generative models, including both text-to-text models like machine translation (MT) and multi-modal models like image captioning and text-to-motion generation. These models are usually trained to optimize a specific objective, such as the log-likelihood (MLE) in Seq2Seq models or the KL-divergence in variational autoencoder (VAE) models, but they are tested with different evaluation metrics such as the BLEU score and Fréchet Inception Distance (FID). This thesis aims to address that discrepancy in text-to-motion generation models by developing algorithms that directly optimize the target metric during training. We explore three major techniques, namely reinforcement learning, contrastive learning, and differentiable metrics, which were originally applied in natural language processing, and adapt them to the language-and-motion domain.
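The central idea in the abstract, treating a non-differentiable test-time metric as a reward signal, is commonly implemented with REINFORCE-style policy gradients. Below is a minimal sketch of that idea, not code from the thesis: the policy_gradient_loss function, the toy tensors, and the random stand-in rewards are all hypothetical, and a real setup would score sampled motion sequences with BLEU or a FID-style metric.

# A minimal sketch (not from the thesis) of directly optimizing a
# sequence-level evaluation metric with REINFORCE-style policy gradients.
import torch
import torch.nn.functional as F

def policy_gradient_loss(logits, sampled_ids, rewards, baseline=None):
    """REINFORCE loss: -E[(reward - baseline) * log p(sample)].

    logits:      (batch, seq_len, vocab) raw generator scores
    sampled_ids: (batch, seq_len) tokens sampled from the model
    rewards:     (batch,) sequence-level metric scores (e.g., BLEU, -FID)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Log-probability of each sampled token, summed over the sequence.
    tok_logp = log_probs.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    seq_logp = tok_logp.sum(dim=-1)                      # (batch,)
    advantage = rewards - (baseline if baseline is not None else rewards.mean())
    # Negative sign: maximizing expected reward = minimizing this loss.
    return -(advantage.detach() * seq_logp).mean()

# Toy usage with random tensors standing in for a real text-to-motion model.
batch, seq_len, vocab = 4, 8, 100
logits = torch.randn(batch, seq_len, vocab, requires_grad=True)
samples = torch.distributions.Categorical(logits=logits).sample()  # (batch, seq_len)
rewards = torch.rand(batch)   # hypothetical metric score per sample
loss = policy_gradient_loss(logits, samples, rewards)
loss.backward()

Subtracting a baseline (here the batch-mean reward) is the standard variance-reduction trick; self-critical sequence training, widely used in image captioning, instead uses the greedy decode's score as the baseline.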
View: PDF
Citation:
Master's Thesis, Department of Computer Science, UT Austin, May 2023.
Bibtex:
@mastersthesis{wang:msthesis,
  title  = {Directly Optimizing Evaluation Metrics to Improve Text to Motion},
  author = {Yili Wang},
  month  = {May},
  year   = {2023},
  school = {Department of Computer Science, UT Austin},
  url    = {http://www.cs.utexas.edu/users/ai-labpub-view.php?PubID=128015}
}
People
Yili Wang
Masters Alumni
ywang98 [at] utexas edu
Areas of Interest
Connecting Language and Perception
Deep Learning
Labs
Machine Learning