This website is the archive for past Forum for Artificial Intelligence talks. Please click this link to navigate to the list of current talks. FAI meets every other week (or so) to discuss scientific, philosophical, and cultural issues in artificial intelligence. Both technical research topics and broader interdisciplinary aspects of AI are covered, and all are welcome to attend! If you would like to be added to the FAI mailing list, subscribe here. If you have any questions or comments, please send email to Catherine Andersson.
Friday, September 21, 2018, 11:00AM
Soft Autonomy: the Road towards Increasingly Intelligent Robots
Yu Gu [homepage]
Human designers' ability to foresee uncertainties for robots, and to write software programs accordingly, is severely limited by their mental simulation capabilities. This predictive approach to robot programming grows quickly in complexity and often fails to handle the infinite possibilities presented by the physical world. As a result, robots today are task-specific or “rigid”; i.e., they have difficulty handling tasks or conditions that were not planned for. In this talk, the speaker will present the vision of “soft autonomy” for making future robots adaptive, flexible, and resilient. He will draw lessons from over a decade of UAV flight-testing research and the development of a sample return robot that won NASA’s Centennial Challenge, and identify several research directions for making future robots more intelligent.
About the speaker: Dr. Yu Gu is an Associate Professor in the Department of Mechanical and Aerospace Engineering at West Virginia University (WVU). His main research interest is to improve robots’ ability to function in increasingly complex environments and situations. Dr. Gu has designed over a dozen UAVs and ground robots and conducted numerous experiments. He was the leader of WVU Team Mountaineers, which won NASA’s Sample Return Robot Centennial Challenge in 2014, 2015, and 2016 (total prize: $855,000). Dr. Gu is currently working on a precision robotic pollinator, an autonomous planetary rover, and cooperative exploration of underground tunnels with ground and aerial robots.
Friday, September 21, 2018, 1:00PM
Battling Demons in Peer Review
Nihar Shah [homepage]
Peer review is the backbone of scholarly research. It is, however, faced with a number of challenges (or demons) such as subjectivity, bias/miscalibration, noise, and strategic behavior. The growing number of submissions in many areas of research, such as machine learning, has significantly increased the scale of these demons. This talk will present some principled and practical approaches to battle these demons in peer review: (1) Subjectivity: How to ensure that all papers are judged by the same yardstick? (2) Bias/miscalibration: How to use ratings in the presence of arbitrary or adversarial miscalibration? (3) Noise: How to assign reviewers to papers to simultaneously ensure fair and accurate evaluations in the presence of review noise? (4) Strategic behavior: How to insulate peer review from the strategic behavior of author-reviewers? The work uses tools from social choice theory, statistics and learning theory, information theory, game theory, and decision theory. (No prior knowledge of these topics will be assumed.)
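As a toy illustration of demon (3), reviewer assignment can be cast as an optimization problem over a reviewer-paper similarity matrix. The sketch below is our construction with made-up similarities, not the speaker's method: it uses the Hungarian algorithm under a one-reviewer-per-paper simplification, whereas the work discussed in the talk studies richer objectives, such as fairness to the most disadvantaged paper.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # similarity[i, j]: how well reviewer i matches paper j (hypothetical values)
    similarity = np.array([[0.9, 0.1, 0.4],
                           [0.2, 0.8, 0.3],
                           [0.5, 0.6, 0.7]])

    # Hungarian algorithm minimizes cost, so negate to maximize total similarity
    rows, cols = linear_sum_assignment(-similarity)
    for r, p in zip(rows, cols):
        print(f"reviewer {r} -> paper {p} (similarity {similarity[r, p]:.1f})")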
About the speaker: Nihar B. Shah is an Assistant Professor in the Machine Learning and Computer Science departments at CMU. He is a recipient of the 2017 David J. Sakrison Memorial Prize from EECS Berkeley for a "truly outstanding and innovative PhD thesis", the Microsoft Research PhD Fellowship 2014-16, the Berkeley Fellowship 2011-13, the IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011/2012, and the SVC Aiya Medal 2010. His research interests include statistics, machine learning, and game theory, with a current focus on applications to learning from people.
Friday, September 28, 2018, 11:00AM
Off-policy Estimation in Reinforcement Learning: Algorithms and Applications
Lihong Li [homepage]
In many real-world applications of reinforcement learning (RL), such as healthcare, dialogue systems, and robotics, running a new policy on humans or robots can be costly or risky. This gives rise to the critical need for off-policy estimation: estimating the average reward of a target policy given data previously collected by another policy. In this talk, we will review some of the key techniques, such as inverse propensity score and doubly robust methods, as well as a few important applications. We will then describe recent work that, for the first time, makes off-policy estimation practical for long- or even infinite-horizon RL problems. (Joint work with Qiang Liu, Ziyang Tang, Denny Zhou and many others.)
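For readers unfamiliar with the estimators mentioned above, here is a minimal sketch of the inverse propensity score (IPS) and doubly robust (DR) estimators in their simplest one-step (contextual bandit) form; the work described in the talk extends off-policy estimation well beyond this setting, to long- and infinite-horizon problems.

    import numpy as np

    def ips_estimate(rewards, target_probs, behavior_probs):
        # IPS: reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action than the behavior
        # policy was; the mean is an unbiased estimate of the target's value.
        weights = target_probs / behavior_probs
        return np.mean(weights * rewards)

    def dr_estimate(rewards, target_probs, behavior_probs, q_logged, v_model):
        # Doubly robust: start from a reward model's predicted value of the
        # target policy in each context (v_model) and add an IPS correction on
        # the model's residuals for the logged actions (q_logged). The estimate
        # is unbiased if either the propensities or the reward model are correct.
        weights = target_probs / behavior_probs
        return np.mean(v_model + weights * (rewards - q_logged))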
About the speaker: Lihong Li is a research scientist at Google. He obtained a PhD in Computer Science from Rutgers University and has since held research positions at Yahoo! Research and Microsoft Research. His main research interests are in reinforcement learning, including contextual bandits, and other related problems in AI. His work has found applications in recommendation, advertising, Web search, and conversation systems, and has won best paper awards at ICML, AISTATS, and WSDM. He serves as an area chair or senior program committee member at major AI/ML conferences such as AAAI, ICML, IJCAI, and NIPS.
Friday, October 5, 2018, 11:00AM
Integrating Symbolic Planning with Reinforcement Learning for Interpretable, Data-Efficient and Robust Decision Making
Fangkai Yang [homepage]
Reinforcement learning and symbolic planning have both been used to build intelligent autonomous agents. Reinforcement learning relies on learning from interactions with the real world, which often requires an infeasibly large amount of experience, and deep reinforcement learning approaches are criticized for their lack of interpretability. Symbolic planning relies on manually crafted symbolic knowledge, which may not be robust to domain uncertainties and changes. In this talk I explore several ways to integrate symbolic planning with hierarchical reinforcement learning to cope with decision-making in dynamic environments with uncertainty. Symbolic plans are used to guide the agent's task execution and learning, and the learned experience is fed back into the symbolic knowledge to improve planning. The method is evaluated on benchmark reinforcement learning problems and leads to data-efficient policy search, robust symbolic plans in complex domains, and improved task-level interpretability.
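A runnable toy sketch of the plan-guided learning loop described above (our illustration, not the speaker's system): a fixed symbolic plan decomposes a corridor navigation task into subtasks, each subtask is learned with tabular Q-learning, and measured execution costs are fed back into the planner's cost model, which a real planner could use to compare alternative plans.

    import random
    from collections import defaultdict

    N = 10                                          # corridor cells 0..N-1
    PLAN = [("reach_door", 4), ("reach_goal", 9)]   # (symbolic action, landmark)

    def run_subtask(Q, start, target, eps=0.2, alpha=0.5, gamma=0.95):
        """One learning episode for a subtask; returns (steps taken, end state)."""
        s, steps = start, 0
        while s != target and steps < 200:
            if random.random() < eps:
                a = random.choice((-1, 1))
            else:
                a = max((-1, 1), key=lambda u: Q[(s, u)])
            s2 = min(N - 1, max(0, s + a))
            r = 0.0 if s2 == target else -1.0       # step cost until landmark
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, -1)], Q[(s2, 1)]) - Q[(s, a)])
            s, steps = s2, steps + 1
        return steps, s

    Q_tables = {name: defaultdict(float) for name, _ in PLAN}
    planner_cost = {name: 100.0 for name, _ in PLAN}  # symbolic action costs

    for episode in range(300):
        s = 0
        for name, target in PLAN:                   # execute the symbolic plan
            steps, s = run_subtask(Q_tables[name], s, target)
            # feedback to the symbolic level: refine this action's cost estimate
            planner_cost[name] = 0.9 * planner_cost[name] + 0.1 * steps

    print(planner_cost)   # costs shrink toward the true subtask lengths (4 and 5)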
About the speaker: Dr. Fangkai Yang is a senior research scientist at Maana Inc. in Bellevue, WA. He obtained his Ph.D. in computer science from UT-Austin in 2014 under the supervision of Prof. Vladimir Lifschitz and in close collaboration with Prof. Peter Stone, studying theoretical foundations of representing, reasoning, and planning with actions in logic formalisms, with applications to task planning and learning for mobile intelligent robots. From 2014 to 2017 he was a research engineer at Schlumberger in Houston, where he worked on task planning and execution for the next generation of autonomous drilling rigs and on several other projects applying answer set programming to industrial planning, scheduling, and optimization. Since 2017 he has worked at Maana Inc., focusing on integrating symbolic planning and reinforcement learning to build an AI-powered decision-making support platform. His research has been published in major AI conferences and journals such as IJCAI, KR, ICLP, ICAPS, LPNMR, TPLP, and IJRR.
Friday, October 12, 2018, 11:00AM
Human Allied Statistical Relational AI
Sriraam Natarajan [homepage]
Statistical Relational AI (StaRAI) models combine the powerful formalisms of probability theory and first-order logic to handle uncertainty in large, complex problems. While they provide a very effective representation paradigm due to their succinctness and parameter sharing, efficient learning remains a significant challenge in these models. First, I will discuss a state-of-the-art learning method based on boosting that is representation-independent. Our results demonstrate that learning many weak models can lead to a dramatic improvement in accuracy and efficiency. One of the key attractive properties of StaRAI models is that their rich representation of the domain potentially allows for seamless human interaction. However, in current StaRAI research, the human is restricted to being either a mere labeler or an oracle who provides the entire model. I will present recent progress toward more reasonable human interaction, where human input is taken as “advice” and the learning algorithm combines this advice with data. Finally, I will discuss more recent work on soliciting advice from humans as needed, which allows for seamless interaction with the human expert.
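The boosting idea can be sketched as functional gradient ascent on the log-likelihood: the model is a sum of weak regressors, each fit to the pointwise gradient y - p. The propositional stand-in below is only illustrative; in the relational setting of this work, the weak learners are relational regression trees rather than ordinary trees on a feature matrix.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def boost(X, y, n_rounds=20, lr=1.0, max_depth=2):
        psi = np.zeros(len(y))                  # additive potential function
        models = []
        for _ in range(n_rounds):
            p = 1.0 / (1.0 + np.exp(-psi))      # current P(y = 1 | x)
            grad = y - p                        # functional gradient of log-likelihood
            tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, grad)
            models.append(tree)
            psi += lr * tree.predict(X)         # take a step in function space
        return models

    def predict_proba(models, X, lr=1.0):
        psi = lr * sum(m.predict(X) for m in models)
        return 1.0 / (1.0 + np.exp(-psi))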
About the speaker: Sriraam Natarajan is an Associate Professor in the Department of Computer Science at the University of Texas at Dallas. He was previously an Associate Professor, and earlier an Assistant Professor, at Indiana University and Wake Forest School of Medicine; he was a post-doctoral research associate at the University of Wisconsin-Madison and earned his PhD from Oregon State University. His research interests lie in the field of Artificial Intelligence, with emphasis on Machine Learning, Statistical Relational Learning and AI, Reinforcement Learning, Graphical Models, and Biomedical Applications. He has received the Young Investigator award from the US Army Research Office, an Amazon Faculty Research Award, an Intel Faculty Award, a XEROX Faculty Award, and the IU Trustees Teaching Award from Indiana University. He is a co-editor-in-chief of the machine learning section of the Frontiers in Big Data journal, an editorial board member of the MLJ, JAIR, and DAMI journals, and the electronic publishing editor of JAIR. He is an organizer of key workshops in the field of Statistical Relational Learning, having co-organized the AAAI 2010, UAI 2012, AAAI 2013, AAAI 2014, and UAI 2015 workshops on Statistical Relational AI (StarAI), the ICML 2012 Workshop on Statistical Relational Learning, and the ECML PKDD 2011 and 2012 workshops on Collective Learning and Inference on Structured Data (Co-LISD). He was also co-chair of the AAAI student abstracts and posters at AAAI 2014 and AAAI 2015 and chair of the AAAI student outreach at AAAI 2016 and 2017.
Friday, October 19, 2018, 11:00AM
Towards Open-domain Generation of Programs from Natural Language
Graham Neubig [homepage]
Code generation from natural language is the task of generating programs written in a programming language (e.g. Python) given a command in natural language (e.g. English). For example, if the input is "sort list x in reverse order", then the system would be required to output "x.sort(reverse=True)" in Python. In this talk, I will discuss (1) machine learning models to perform this code generation, (2) methods for mining data from programming web sites such as Stack Overflow, and (3) methods for semi-supervised learning that allow the model to learn from either English or Python on its own, without corresponding parallel data.
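As a flavor of the mining step in (2), the sketch below (a simplification we provide, not the speaker's pipeline) pairs a question title with the code blocks of an answer to produce candidate (English, Python) training examples; real pipelines must further filter and align such noisy pairs, which is where much of the research effort lies.

    import re

    CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)

    def mine_pairs(title, answer_html):
        """Return candidate (natural language, code) pairs from one post."""
        snippets = CODE_BLOCK.findall(answer_html)
        return [(title, s.strip()) for s in snippets if s.strip()]

    pairs = mine_pairs("sort list x in reverse order",
                       "<p>Use:</p><code>x.sort(reverse=True)</code>")
    print(pairs)  # [('sort list x in reverse order', 'x.sort(reverse=True)')]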
About the speaker: Graham Neubig is an assistant professor at the Language Technologies Institute of Carnegie Mellon University. His work focuses on natural language processing, specifically multi-lingual models that work in many different languages, and natural language interfaces that allow humans to communicate with computers in their own language. Much of this work relies on machine learning to create these systems from data, and he is also active in developing methods and algorithms for machine learning over natural language data. He publishes regularly in the top venues in natural language processing, machine learning, and speech, and his work has won awards such as best paper at EMNLP, EACL, and WNMT. He is also active in developing open-source software, and is the main developer of the DyNet neural network toolkit.
Friday, October 26, 2018, 11:00AM
Towards Globally Beneficial AI
Stefano Ermon [homepage]
Recent technological developments are creating new spatio-temporal data streams that contain a wealth of information relevant to sustainable development goals. Modern AI techniques have the potential to yield accurate, inexpensive, and highly scalable models to inform research and policy. A key challenge, however, is the lack of the large quantities of labeled data that often characterize successful machine learning applications. In this talk, I will present new approaches for learning useful spatio-temporal models in contexts where labeled training data is scarce or not available at all. I will show applications to predict and map poverty in developing countries, monitor agricultural productivity and food security outcomes, and map infrastructure access in Africa. Our methods can reliably predict economic well-being using only high-resolution satellite imagery. Because images are passively collected in every corner of the world, our methods can provide timely and accurate measurements in a very scalable and economical way. Finally, I will discuss opportunities and challenges for using these predictions to support decision making, including techniques for calibration and for inferring human preferences from data.
About the speaker: Stefano Ermon is an Assistant Professor of Computer Science in the CS Department at Stanford University, where he is affiliated with the Artificial Intelligence Laboratory, and a fellow of the Woods Institute for the Environment. His research is centered on techniques for probabilistic modeling of data, inference, and optimization, and is motivated by a range of applications, in particular ones in the emerging field of computational sustainability. He has won several awards, including four Best Paper Awards (AAAI, UAI and CP), an NSF CAREER Award, an ONR Young Investigator Award, a Sony Faculty Innovation Award, an AWS Machine Learning Award, a Hellman Faculty Fellowship, and the IJCAI Computers and Thought Award. Stefano earned his Ph.D. in Computer Science at Cornell University in 2015.
Friday, November 9, 2018, 11:00AM
Anytime Probabilistic Inference
Alex Ihler [homepage]
Probabilistic graphical models are a powerful tool for representing systems with complex interdependence and uncertainty. Flexible languages such as probabilistic logic have made these models easier to express and more accessible to non-experts. However, exact inference is intractable, and while approximate methods can be effective in practice, it is often difficult to know which method will be effective or to have confidence in the quality of its results. I will describe a framework that combines strengths from the three major approximation paradigms (variational bounds, search, and sampling) to provide guaranteed confidence intervals and to smoothly trade off computational effort against quality. The resulting algorithms give fast and accurate results for a wide variety of models by automatically adapting to the properties of a given problem instance.
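To give a flavor of how sampling can yield probabilistic guarantees on inference quantities, the self-contained example below (ours, not the speaker's algorithm) bounds the partition function Z of a small Ising chain: the importance weight (unnormalized probability divided by proposal probability) has expectation Z under the proposal, and since the weight is nonnegative, Markov's inequality makes delta times the sample mean a lower bound on Z with probability at least 1 - delta. The methods in the talk tighten such guarantees by combining variational proposals with search.

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)

    n, theta = 8, 0.5                     # Ising chain with n spins
    def log_p_tilde(x):                   # unnormalized log-probability
        return theta * np.sum(x[:-1] * x[1:])

    # exact Z by enumeration, for reference only (infeasible for large n)
    states = itertools.product([-1, 1], repeat=n)
    Z = sum(np.exp(log_p_tilde(np.array(s))) for s in states)

    m, delta = 10000, 0.05
    samples = rng.choice([-1, 1], size=(m, n))        # uniform proposal q
    log_q = -n * np.log(2.0)
    weights = np.exp([log_p_tilde(x) - log_q for x in samples])
    Z_hat = weights.mean()                            # unbiased estimate of Z

    print(f"exact Z = {Z:.1f}, estimate = {Z_hat:.1f}, "
          f"lower bound (prob >= {1 - delta:.2f}) = {delta * Z_hat:.1f}")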
About the speaker: Alexander Ihler is an Associate Professor in the Department of Computer Science at the University of California, Irvine, and Director of UCI's Center for Machine Learning and Intelligent Systems. He received his Ph.D. in Electrical Engineering and Computer Science from MIT in 2005 and a B.S. with honors from Caltech in 1998. His research focuses on machine learning, including probabilistic graphical models and deep learning, with applications to areas such as sensor networks, computer vision, data mining, biology, and physics. He is the recipient of an NSF CAREER award and several best paper awards at conferences including NIPS, IPSN, and AISTATS.
Monday, November 19, 2018, 2:00PM
Diversity-promoting and Large-scale Machine Learning for Healthcare
Pengtao Xie [homepage]
In healthcare, a tsunami of medical data has emerged, including electronic health records, images, literature, etc. These data can be heterogeneous and noisy, which renders clinical decision-making time-consuming, error-prone, and suboptimal. In this thesis, we develop machine learning (ML) models and systems for distilling high-value patterns from unstructured clinical data and making informed, real-time medical predictions and recommendations, to aid physicians in improving the efficiency of workflows and the quality of patient care. When developing these models, we encounter several challenges: (1) How to better capture infrequent clinical patterns, such as rare subtypes of diseases? (2) How to make the models generalize well to unseen patients? (3) How to promote the interpretability of the decisions? (4) How to improve the timeliness of decision-making without sacrificing its quality? (5) How to efficiently discover massive clinical patterns from large-scale data? To address challenges (1-4), we systematically study diversity-promoting learning, which encourages the components in ML models (1) to diversely spread out to give infrequent patterns broader coverage, (2) to obey structured constraints for better generalization performance, (3) to be mutually complementary for a more compact representation of information, and (4) to be less redundant for better interpretation. The study is performed in the context of both frequentist and Bayesian statistics. In the former, we develop diversity-promoting regularizers that are empirically effective, theoretically analyzable, and computationally efficient. In the latter, we develop Bayesian priors that effectively entail an inductive bias of "diversity" among a finite or infinite number of components and facilitate the development of efficient posterior inference algorithms. To address challenge (5), we study large-scale learning. Specifically, we design efficient distributed ML systems by exploiting a system-algorithm co-design approach. Inspired by a sufficient factor property of many ML models, we design a peer-to-peer system -- Orpheus -- that significantly reduces communication and fault tolerance costs. We apply the proposed diversity-promoting learning (DPL) techniques and distributed ML systems to address several critical issues in healthcare, including discharge medication prediction, automatic ICD code filling, automatic generation of medical-imaging reports, similar-patient retrieval, hierarchical multi-label tagging of medical images, and large-scale medical-topic discovery. Evaluations on various clinical datasets demonstrate the effectiveness of the DPL methods and the efficiency of the Orpheus system.
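As a concrete, purely illustrative instance of a diversity-promoting regularizer, one can penalize the pairwise cosine similarity among a model's component vectors (e.g., the rows of a weight matrix) so that components spread out and infrequent patterns get coverage. The regularizers developed in the thesis differ in form, but the sketch below conveys the mechanism.

    import numpy as np

    def diversity_penalty(W, eps=1e-8):
        """Mean squared pairwise cosine similarity of the rows of W."""
        Wn = W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)
        G = Wn @ Wn.T                        # gram matrix of cosine similarities
        off_diag = G - np.diag(np.diag(G))   # ignore each row's self-similarity
        k = W.shape[0]
        return np.sum(off_diag ** 2) / (k * (k - 1))

    W = np.random.randn(5, 16)               # 5 components in 16 dimensions
    # total_loss = data_loss + lam * diversity_penalty(W)   # lam: trade-off weight
    print(diversity_penalty(W))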
About the speaker: Pengtao Xie is a research scientist at Petuum Inc., leading research and product development in machine learning for multiple vertical domains, including healthcare, manufacturing, and finance. During his PhD in the Machine Learning Department at Carnegie Mellon University, he worked on latent space models and distributed machine learning, with applications to clinical decision-making. He has published about thirty papers at top-tier machine learning, natural language processing, computer vision, and data mining venues, including ICML, JMLR, ACL, ICCV, KDD, UAI, IJCAI, and AAAI, and serves as a program committee member or reviewer for about twenty renowned conferences and journals. He won the 2018 Innovator Award presented by the Pittsburgh Business Times. He was recognized as a Siebel Scholar and was a recipient of the Goldman Sachs Global Leader Scholarship and the National Scholarship of China. He received MS degrees from Carnegie Mellon University and Tsinghua University and a BS from Sichuan University.
Monday, November 26, 2018, 11:00AM
The Blocks World Redux
Martha Palmer [homepage]
About the speaker: Martha Palmer is the Helen & Hubert Croft Endowed Professor of Engineering in the Computer Science Department, and an Arts & Sciences Professor of Distinction in the Linguistics Department, at the University of Colorado, with a split appointment. She is also an Institute of Cognitive Science Faculty Fellow, a co-Director of CLEAR, and an Association for Computational Linguistics (ACL) Fellow. She won an Outstanding Graduate Advisor Award in 2014 and a Boulder Faculty Assembly Research Award in 2010, and was the Director of the 2011 Linguistics Institute in Boulder, CO. Her research is focused on capturing elements of the meanings of words that can comprise automatic representations of complex sentences and documents in English, Chinese, Arabic, Hindi, and Urdu, funded by DARPA and NSF. A more recent focus is the application of these methods to biomedical journal articles and clinical notes, funded by NIH, and the geo- and bio-sciences, funded by NSF. She co-edits LiLT, Linguistic Issues in Language Technology, has been a co-editor of the Journal of Natural Language Engineering, and has served on the CLJ Editorial Board. She is a past President of ACL, past Chair of SIGLEX, was the Founding Chair of SIGHAN, and has well over 200 peer-reviewed publications.
Monday, November 26, 2018, 2:00PM
Building a Robotics Intelligence Architecture for Understanding and Acting
Ethan Stump [homepage]
Enabling robots to act as teammates rather than tools has been the core focus of a 10-year Army Research Laboratory program that is currently wrapping up. We will focus on our consortium’s development of an Intelligence Architecture that seeks to bridge natural language understanding, reasoning, planning, and control in order to enable robot teams to participate in complex, dynamic missions guided by human dialogue. The integrated system thus represents a wide swath of robotics research and gives us a structure for thinking about the interaction between probabilistic, symbolic, and metric techniques in pursuit of useful AI. We will highlight some of the successes and the many open questions that remain.
About the speaker: Ethan A. Stump is a researcher within the U.S. Army Research Laboratory’s Computational and Information Sciences Directorate, where he works on online machine learning applied to robotics with a focus on human-guided reinforcement learning. Dr. Stump is currently the Government Lead for Distributed Intelligence in the Distributed Collaborative Intelligent Systems Technologies (DCIST) Consortium, the Government Lead for Intelligence in the Robotics Consortium, and co-PI of an internal initiative on Human-in-the-Loop Reinforcement Learning. During his time at ARL, he has worked on diverse robotics-related topics, including implementing mapping and navigation technologies to enable baseline autonomous capabilities for teams of ground robots, and developing controller synthesis for managing the deployment of multi-robot teams performing repeating tasks, such as persistent surveillance, by tying them to formal task specifications.
Friday, November 30, 2018, 11:00AM
Grounding Reinforcement Learning with Real-world Dialog Applications
Zhou Yu [homepage]
Recently, with the wide spread of conversational devices, more and more people have started to realize the importance of dialog research. However, some of them are still living in a simulated world, using simulated data such as Facebook bAbI. In this talk, we emphasize that dialog research needs to be grounded in the real needs of users. We introduce three user-centered, task-oriented dialog systems that utilize reinforcement learning. The first is a dialog system that uses reinforcement learning to interleave social conversation and task conversation to promote movies more effectively. The second is a sentiment-adaptive bus information search system. It uses sentiment as an immediate reward to help the end-to-end RL dialog framework converge faster and better. The trained dialog policy also has a user-friendly effect: it adapts to the user’s sentiment when choosing dialog action templates. For example, the policy will pick a template that provides more detailed instructions when the user is being negative. This is extremely useful for customer service dialog systems, where users frequently get angry. The third is a task-oriented visual dialog system. It uses hierarchical reinforcement learning to track multimodal dialog states and to decide among subtasks, such as whether to ask for more information or just give an answer. Such a system can complete the task more successfully and effectively. We are conducting a further experiment to deploy the system as a shopping assistant.
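The sentiment-as-reward idea in the second system can be sketched in a few lines: an estimate of user sentiment from the latest utterance is added, with a trade-off weight, to the sparse task reward, giving the policy immediate feedback. The toy lexicon scorer below is our stand-in for the learned sentiment estimator used in the actual system.

    # Toy word lists; the real system learns sentiment from data.
    NEGATIVE = {"angry", "useless", "wrong", "slow"}
    POSITIVE = {"thanks", "great", "helpful", "perfect"}

    def sentiment(utterance):
        words = set(utterance.lower().split())
        return len(words & POSITIVE) - len(words & NEGATIVE)

    def shaped_reward(task_reward, utterance, lam=0.3):
        """Immediate reward = task reward + weighted user sentiment."""
        return task_reward + lam * sentiment(utterance)

    print(shaped_reward(0.0, "this is wrong and useless"))   # -0.6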
About the speaker: Zhou Yu is an Assistant Professor in the Computer Science Department at UC Davis. She received her PhD from the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University. She was recently featured in Forbes' 2018 30 Under 30 in Science, and was a recipient of Rising Stars in EECS in 2015. Dr. Yu received a B.S. in Computer Science and a B.A. in Linguistics from Zhejiang University in 2011. Zhou's team was recently selected as one of the 8 groups to compete in the Amazon Alexa Prize Challenge, with $250,000 in funding (https://developer.amazon.com/alexaprize). Zhou's group has also received research awards and gifts from various companies, such as Intel, Tencent, Cisco, and Bosch.
Monday, March 11, 2019, 10:30AM
Scalable Methods for Computing State Similarity in Deterministic Markov Decision Processes
Pablo Castro [homepage]
We present new algorithms for computing and approximating bisimulation metrics in Markov Decision Processes (MDPs). Bisimulation metrics are an elegant formalism that captures behavioral equivalence between states and provides strong theoretical guarantees. Unfortunately, their computation is expensive and requires a tabular representation of the states, which has thus far rendered them impractical for large problems. In this work we present two new algorithms for approximating bisimulation metrics in large, deterministic MDPs. The first does so via sampling and is guaranteed to converge to the true metric. The second is a differentiable loss that allows us to learn an approximation even for continuous-state MDPs, which prior to this work was not possible.
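For reference, the bisimulation metric for a deterministic MDP is the fixed point of d(s, t) = max_a [ |R(s,a) - R(t,a)| + gamma * d(T(s,a), T(t,a)) ], which the tabular sketch below computes directly by fixed-point iteration; the talk's contribution is approximating this metric when such a computation is infeasible.

    import numpy as np

    def bisim_metric(R, T, gamma=0.9, iters=200):
        """R[s, a]: rewards; T[s, a]: deterministic next states (int indices)."""
        n = R.shape[0]
        d = np.zeros((n, n))
        for _ in range(iters):
            reward_gap = np.abs(R[:, None, :] - R[None, :, :])   # |R(s,a) - R(t,a)|
            next_gap = d[T[:, None, :], T[None, :, :]]           # d(T(s,a), T(t,a))
            d = np.max(reward_gap + gamma * next_gap, axis=2)    # max over actions
        return d

    R = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])   # 3 states, 2 actions
    T = np.array([[1, 2], [1, 2], [0, 0]])
    print(bisim_metric(R, T))   # states 0 and 1 behave identically: d(0, 1) = 0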
About the speaker: Pablo was born and raised in Quito, Ecuador, and moved to Montreal after high school to study at McGill. He stayed in Montreal for the next 10 years, finished his bachelor's, worked at a flight simulator company, and then eventually obtained his master's and PhD at McGill, focusing on Reinforcement Learning. After his PhD, Pablo did a 10-month postdoc in Paris before moving to Pittsburgh to join Google. He has worked at Google for almost 7 years and is currently a research software developer at Google Brain in Montreal, focusing on fundamental Reinforcement Learning research, as well as Machine Learning and Creativity. Aside from his interest in coding/AI/math, Pablo is an active musician (https://www.psctrio.com/), loves running (5 marathons so far, including Boston!), and enjoys discussing politics and activism.
Tuesday, April 16, 2019, 10:00AM
The Joy of Finding Out: Adventures in Teaching People How to Do Online Research
Dan Russell [homepage]
I've been teaching people how to be more effective online researchers for the past decade. In that time, I've taught thousands of people (think students, librarians, professional researchers, and just plain folks) how to find what they seek through Google (and many other online resources and tools). Just as importantly, I also teach when a researcher should switch from online to offline content (and back). This talk covers my experiences in learning how to teach these skills, and what I've learned from direct interactions with my students and from various studies I've run in the lab and with live search traffic. I'll discuss my MOOC (PowerSearchingWithGoogle.com), which has had over 4M students, my live classes, and various publications in paper, book, and video formats. I can tell you which methods work best, and why. I'll also talk about the mental models people have of search systems (and the ways in which those are often incorrect).
About the speaker: Daniel Russell is Google's Senior Research Scientist for Search Quality and User Happiness in Mountain View. He earned his PhD in computer science, specializing in Artificial Intelligence. These days he realizes that amplifying human intelligence is his real passion. His day job is understanding how people search for information, and the ways they come to learn about the world through Google. Dan's current research aims to understand how human intelligence and artificial intelligence can work together to be better than either alone. His 20% job is teaching the world to search more effectively. His MOOC, PowerSearchingWithGoogle.com, currently hosts over 3,000 learners per week. In the past 3 years, 4 million students have attended his online search classes, augmenting their intelligence with AI. His instructional YouTube videos have a cumulative runtime of over 350 years (24 hours/day; 7 days/week; 365 days/year).
Friday, May 3, 2019, 11:00AM
Graphical Models and Approximate Counting Problems
Nicholas Ruozzi [homepage]
Markov random fields provide a unified framework for representing probability distributions that factorize over a given graph. As exact inference in these models is generally intractable, approximate methods such as belief propagation (BP) are often used in practice. However, theoretically characterizing the quality of the BP approximation has proved challenging. Recent work has demonstrated that these types of approximations can yield provable lower bounds on a variety of combinatorial counting problems, including the matrix permanent and the partition function of the ferromagnetic Ising model. I’ll describe the known results and a new conjectured lower bound on the number of graph homomorphisms from a bipartite graph G into an arbitrary graph H.
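The Bethe approximation computed by BP is the object the talk analyzes; as a simpler self-contained illustration of a provable variational lower bound on a partition function, the example below runs naive mean field (a related but distinct approximation) on a small ferromagnetic Ising model, using the classical bound log Z >= E_q[log p_tilde] + H(q) for any fully factorized q.

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)
    n = 6
    J = np.abs(rng.normal(size=(n, n))) * 0.2      # ferromagnetic couplings
    J = np.triu(J, 1)
    J = J + J.T                                    # symmetric, zero diagonal
    h = rng.normal(size=n) * 0.1

    def log_p_tilde(x):
        return 0.5 * x @ J @ x + h @ x

    logZ = np.log(sum(np.exp(log_p_tilde(np.array(s)))
                      for s in itertools.product([-1, 1], repeat=n)))

    mu = np.zeros(n) + 0.01                        # mean-field marginals E[x_i]
    for _ in range(200):                           # coordinate-ascent updates
        for i in range(n):
            mu[i] = np.tanh(h[i] + J[i] @ mu)

    def entropy(m):
        p = np.clip((1 + m) / 2, 1e-12, 1 - 1e-12)
        return -(p * np.log(p) + (1 - p) * np.log(1 - p)).sum()

    bound = 0.5 * mu @ J @ mu + h @ mu + entropy(mu)
    print(f"exact log Z = {logZ:.4f} >= mean-field bound = {bound:.4f}")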
About the speaker: Nicholas Ruozzi is currently an Assistant Professor of Computer Science at the University of Texas at Dallas. His research interests include graphical models, approximate inference, variational methods, and explainable machine learning. Before joining UTD, he was a postdoctoral researcher at Columbia University, and he completed his Ph.D. at Yale University.
Friday, May 3, 2019, 3:00PM
State Abstraction in Reinforcement Learning
David Abel [homepage]
Reinforcement Learning presents a challenging problem: agents must generalize experiences, efficiently explore their world, and learn from feedback that is sparse and delayed, all under a limited computational budget. Abstraction can help make each of these endeavors more tractable. Through abstraction, agents form concise models of both their surroundings and behavior, enabling effective decision making in diverse and complex environments. In this talk I explore abstraction's role in reinforcement learning, with a focus on state abstraction. I present classes of state abstractions that can (1) preserve near-optimal behavior, (2) be transferred across similar tasks, and (3) induce a trade-off between performance and learning or planning time. Collectively, these results provide a partial path toward abstractions that minimize the complexity of decision making while retaining near-optimality.
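One abstraction class from this line of work aggregates states whose optimal Q-values agree within epsilon on every action; known results (e.g. Abel et al., ICML 2016) bound the resulting value loss by O(epsilon / (1 - gamma)^2), so epsilon trades performance against the size of the abstract state space. The greedy-clustering sketch below is our illustration, not the paper's exact construction.

    import numpy as np

    def q_epsilon_abstraction(Q, eps):
        """Q[s, a]: optimal Q-values. Returns phi mapping each ground state
        to an abstract state via greedy clustering on Q-value similarity."""
        n = Q.shape[0]
        phi = -np.ones(n, dtype=int)
        representatives = []                 # one ground state per cluster
        for s in range(n):
            for k, rep in enumerate(representatives):
                if np.max(np.abs(Q[s] - Q[rep])) <= eps:
                    phi[s] = k               # join an existing cluster
                    break
            else:
                phi[s] = len(representatives)
                representatives.append(s)    # start a new cluster
        return phi

    Q = np.array([[1.0, 0.5], [1.02, 0.49], [0.2, 0.9]])
    print(q_epsilon_abstraction(Q, eps=0.05))   # -> [0 0 1]: states 0, 1 merge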
About the speaker: David Abel is a fourth-year PhD candidate in CS at Brown University, advised by Michael Littman. His work focuses on the theory of reinforcement learning, with occasional ventures into philosophy and computational sustainability. Before Brown, he received his bachelor's in CS and philosophy from Carleton College.