FAMO: Fast Adaptive Multitask Optimization.
Bo Liu, Yihao Feng, Peter Stone, and Qiang Liu.
In Neural Information Processing Systems Foundation,
July 2023.
One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL). However, in practice, applying gradient descent (GD) on the average loss across all tasks may yield poor multitask performance due to severe under-optimization of certain tasks. Previous approaches that manipulate task gradients for a more balanced loss decrease require storing and computing all task gradients (O(k) space and time where k is the number of tasks), limiting their use in large-scale scenarios. In this work, we introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way using O(1) space and time. We conduct an extensive set of experiments covering multi-task supervised and reinforcement learning problems. Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient manipulation techniques while offering significant improvements in space and computational efficiency. Code is available at https://github.com/Cranial-XIX/FAMO.
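The abstract's key point is that task losses can be balanced dynamically without storing k full gradient vectors. Below is a minimal PyTorch sketch of that general idea: keep one logit per task, form softmax task weights, do a single backward pass on the weighted loss, and adjust the logits from observed per-task loss changes. This is an illustration under stated assumptions, not the paper's exact update rule; the names (TinyMTLNet, train_step, the toy data and learning rates) are invented for the example, and the authors' actual implementation is in the repository linked above.

import torch
import torch.nn as nn


class TinyMTLNet(nn.Module):
    """Shared trunk with k task heads (toy regression tasks)."""

    def __init__(self, in_dim=8, hidden=32, k=3):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(k)])

    def forward(self, x):
        h = self.trunk(x)
        return [head(h) for head in self.heads]


def train_step(model, opt, xi, xi_lr, x, ys, prev_losses):
    """One update using softmax task weights derived from the logits `xi`."""
    w = torch.softmax(xi, dim=0)                 # k scalar weights, no per-task gradients stored
    preds = model(x)
    losses = torch.stack([nn.functional.mse_loss(p.squeeze(-1), y)
                          for p, y in zip(preds, ys)])
    opt.zero_grad()
    (w.detach() * losses).sum().backward()       # single backward pass on the weighted loss
    opt.step()

    # Adjust the logits from observed loss changes: tasks whose loss decreased
    # the least gain weight at the next step (balanced-decrease heuristic).
    with torch.no_grad():
        if prev_losses is not None:
            improvement = prev_losses - losses
            xi -= xi_lr * (improvement - improvement.mean())
    return losses.detach()


if __name__ == "__main__":
    torch.manual_seed(0)
    k = 3
    model = TinyMTLNet(k=k)
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    xi = torch.zeros(k)                          # one logit per task: O(k) scalars of extra state
    prev = None
    for step in range(200):
        x = torch.randn(64, 8)
        ys = [x.sum(dim=1) * (i + 1) for i in range(k)]  # toy targets at different scales
        prev = train_step(model, opt, xi, 1e-2, x, ys, prev)
    print("final per-task losses:", prev.tolist())

The memory saving relative to gradient-manipulation methods is that only the weighted loss is backpropagated once per step; the extra state is a length-k vector of logits rather than k copies of the model's gradient.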
@InProceedings{bo_liu_neurips_2023,
  author    = {Bo Liu and Yihao Feng and Peter Stone and Qiang Liu},
  title     = {{FAMO}: Fast Adaptive Multitask Optimization},
  booktitle = {Neural Information Processing Systems Foundation},
  year      = {2023},
  month     = {July},
  location  = {New Orleans, United States},
  abstract  = {One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL). However, in practice, applying gradient descent (GD) on the average loss across all tasks may yield poor multitask performance due to severe under-optimization of certain tasks. Previous approaches that manipulate task gradients for a more balanced loss decrease require storing and computing all task gradients (O(k) space and time where k is the number of tasks), limiting their use in large-scale scenarios. In this work, we introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way using O(1) space and time. We conduct an extensive set of experiments covering multi-task supervised and reinforcement learning problems. Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient manipulation techniques while offering significant improvements in space and computational efficiency. Code is available at https://github.com/Cranial-XIX/FAMO.},
}