<!DOCTYPE html>
<html lang="en">
<head>
<title>
UCLA NLP Seminar Series
</title>
<link rel="apple-touch-icon" sizes="180x180" href="https://uclanlp.github.io/nlp-seminar/icons/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="https://uclanlp.github.io/nlp-seminar/icons/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="https://uclanlp.github.io/nlp-seminar/icons/favicon-16x16.png">
<link rel="icon" type="image/x-icon" href="https://uclanlp.github.io/nlp-seminar/icons/favicon.ico">
<!-- Next line is for the nice mobile view -->
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://uclanlp.github.io/nlp-seminar/style.css">
<script src="https://uclanlp.github.io/nlp-seminar/script.js"></script>
</head>
<body>
<header>
<div class="header-content">
<div class="lab-logos">
<img src="https://web.cs.ucla.edu/~kwchang/img/uclanlp.png" alt="Kai-Wei Chang's Lab">
</div>
<div class="header-text">
<h1>UCLA NLP Seminar Series</h1>
<p>Welcome to our weekly seminar series.</p>
</div>
<div class="lab-logos">
<img src="https://web.cs.ucla.edu/~kwchang/img/uclanlp.png" alt="Kai-Wei Chang's Lab">
</div>
</div>
</header>
<nav class="nav-container">
<div class="nav-menu">
<ul class="menu-list">
<li class="menu-item"><a href="https://uclanlp.github.io/nlp-seminar/">Home</a></li>
<li class="menu-item"><a href="past_talks.html">Past Talks</a></li>
</ul>
</div>
</nav>
<main>
<h2>Talk Schedule for Winter 2025</h2>
<table class="seminar-schedule">
<thead>
<tr>
<th>Date</th>
<th>Speaker</th>
<th>Title</th>
</tr>
</thead>
<tbody>
<!-- <tr>
<td>Postponed</td>
<td><a href="https://swabhs.com/">Swabha Swayamdipta</a></td>
<td>Ensuring Safety and Accountability in LLMs, Pre- and Post Training</td>
</tr> -->
<!-- <tr>
<td>Postponed</td>
<td><a href="https://people.ischool.berkeley.edu/~dbamman/">David Bamman</a></td>
<td>Building Accountable NLP Models for Social Good</td>
</tr> -->
<tr>
<td>Jan 24</td>
<td><a href="https://wellecks.com/">Sean Welleck</a></td>
<td>Reasoning with Inference-Time Compute</td>
</tr>
<tr>
<td>Jan 31</td>
<td><a href="https://www.sarahooker.me/">Sara Hooker</a></td>
<td>Understanding the role of data, scale and capacity in recent breakthroughs </td>
</tr>
<tr>
<td>Feb 7</td>
<td><a href="https://natashajaques.ai/">Natasha Jaques</a></td>
<td>Social Reinforcement Learning for pluralistic alignment and human-AI interaction</td>
</tr>
<tr>
<td>Feb 14</td>
<td><a href="https://www.alexander-spangher.com/">Alexander Spangher</a></td>
<td>Planning in Creative Contexts</td>
</tr>
<tr>
<td>Feb 21</td>
<td><a href="https://izmailovpavel.github.io/">Pavel Izmailov</a></td>
<td>Weak to Strong Generalization</td>
</tr>
</tbody>
</table>
<h2>🚀 Upcoming Talks </h2>
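<!--
The talk cards below rely on a toggleDetails(talkElement) function that is defined in the
shared script.js (loaded in the head) and is not part of this file. As a hedged reference
only, the sketch below shows the minimal behavior this markup assumes: clicking a .talk
block toggles the visibility of its .talk-details panel. The actual implementation in
script.js may differ.

function toggleDetails(talk) {
  // Locate the collapsible details panel inside the clicked .talk block.
  var details = talk.querySelector('.talk-details');
  if (!details) return;
  // Toggle between hidden and visible.
  details.style.display = (details.style.display === 'block') ? 'none' : 'block';
}
-->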
<!-- Natasha Block -->
<div class="talk" onclick="toggleDetails(this)">
<div class="talk-summary">
<div class="date">
<div class="month">FEB</div>
<div class="day">7</div>
</div>
<div class="speaker-image">
<a href="https://natashajaques.ai/" target="_block">
<img src="https://natashajaques.ai/author/natasha-jaques/avatar_hu04b955ed351a495cc2c1096ede1d3b28_762133_270x270_fill_q75_lanczos_center.jpg" alt="Natasha Jaques" style="max-height: 300px;"/>
</a>
</div>
<div class="speaker-text">
<h3>Social Reinforcement Learning for pluralistic alignment and human-AI interaction</h3>
<p><img src="icons/person.png" alt="Person Icon" class="icon"><a href="https://natashajaques.ai/">Natasha Jaques</a></p>
<p><img src="icons/clock.png" alt="Clock Icon" class="icon">Feb 7, 2025, 2:00 PM</p>
<p><img src="icons/location.png" alt="Location Icon" class="icon">Virtual Talk</p>
</div>
<div class="footer">
<button class="more-details">More Details</button>
</div>
</div>
<div class="talk-details">
<p><strong>Speaker Bio:</strong> Natasha Jaques is an Assistant Professor of Computer Science and Engineering at the University of Washington, and a Senior Research Scientist at Google DeepMind. Her research focuses on Social Reinforcement Learning in multi-agent and human-AI interactions. During her PhD at MIT, she developed techniques for learning from human feedback signals to train language models which were later built on by OpenAI’s series of work on Reinforcement Learning from Human Feedback (RLHF). In the multi-agent space, she has developed techniques for improving coordination through social influence, and unsupervised environment design. Natasha’s work has received various awards, including Best Demo at NeurIPS, an honourable mention for Best Paper at ICML, and the Outstanding PhD Dissertation Award from the Association for the Advancement of Affective Computing. Her work has been featured in Science Magazine, MIT Technology Review, Quartz, IEEE Spectrum, Boston Magazine, and on CBC radio, among others. Natasha earned her master's degree from the University of British Columbia, undergraduate degrees in Computer Science and Psychology from the University of Regina, and was a postdoctoral fellow at UC Berkeley.</p>
<p><strong>Abstract:</strong> If AGI is right around the corner, why are AI agents still so bad at so many tasks? AI still fails to coordinate effectively with other agents, follow natural language instructions to complete embodied tasks, and generalize to circumstances not encountered during training. Even in pure language settings like dialog, AI still fails to adapt to the needs of individual users, instead aligning to a single set of values that may ignore the needs of minority groups. In this talk, I will argue that Social Learning is a key facet of intelligence that enables both humans and animals to easily adapt to new circumstances, coordinate with different people, and acquire complex behaviors. By improving the social intelligence of AI agents, we can get a step closer to adaptive, flexible, generalist agents which better align to diverse human values. This talk will overview recent work in the Social Reinforcement Learning lab, describing how to enable pluralistic alignment of large language models using human feedback, smooth coordination with diverse human partners, and improve social reasoning for understanding natural language commands.</p>
</div>
</div>
<!-- Alex block -->
<div class="talk" onclick="toggleDetails(this)">
<div class="talk-summary">
<div class="date">
<div class="month">FEB</div>
<div class="day">14</div>
</div>
<div class="speaker-image">
<a href="https://www.alexander-spangher.com" target="_block">
<img src="https://www.alexander-spangher.com/assets/img/isi-headshot-3.jpg" alt="Alexander Spangher" style="max-height: 300px;"/>
</a>
</div>
<div class="speaker-text">
<h3>Planning in Creative Contexts</h3>
<p><img src="icons/person.png" alt="Person Icon" class="icon"><a href="https://www.alexander-spangher.com/">Alexander Spangher</a></p>
<p><img src="icons/clock.png" alt="Clock Icon" class="icon">Feb 14, 2025, 2:00 PM</p>
<p><img src="icons/location.png" alt="Location Icon" class="icon">289, Engineering VI</p>
</div>
<div class="footer">
<button class="more-details">More Details</button>
</div>
</div>
<div class="talk-details">
<p><strong>Speaker Bio:</strong> Alexander Spangher is pursuing his PhD in computer science at the University of Southern California; he was formerly a writer and data scientist at the New York Times. He focuses on computational journalism and is advised by Jonathan May, Emilio Ferrara and Nanyun Peng. His research is broad, and he has pursued the following side directions: he has worked at Microsoft Research under the mentorship of Eric Horvitz to detect misinformation. He has collaborated with EleutherAI to build state-of-the-art symbolic music models. Finally, he has collaborated with the MIT Plasma Science and Fusion Center (PSFC) to model disruptions in nuclear fusion reactions. His work has received numerous awards: 2 Outstanding Paper Awards at EMNLP 2024, 1 Spotlight Award at ICML 2024, and an Outstanding Paper Award at NAACL 2022. He is fortunate to be supported by a 4-year Bloomberg PhD Fellowship.</p>
<p><strong>Abstract:</strong> Recent modeling innovations incorporate planning — or reasoning about actions (exhibited by models like GPT-o1 and Deepseek's R1) — and have demonstrated impressive performance gains in areas like mathematical problem-solving and computer coding. However, such domains are characterized by well-defined goals (or rewards). For many human-centered tasks in creative contexts, rewards are not as clearly defined and it is thus not clear how to make similar progress in these domains. In this talk, I will outline a research agenda that can enable us to make progress in these fundamentally human processes. I focus on tasks related to journalism, where there is a pressing need for technical innovation. Specifically, in this talk I will focus on the task of retrieving sources relevant to news stories: I will show how (1) we can make inferences about human actions based on environmental state-observations (a process known to cognitive psychologists as "end-state" or "ghost conditions", but as yet unexplored in machine learning); and, (2) how these inferences can help us learn human values and rewards. </p>
</div>
</div>
<!-- Pavel Block -->
<div class="talk" onclick="toggleDetails(this)">
<div class="talk-summary">
<div class="date">
<div class="month">Feb</div>
<div class="day">21</div>
</div>
<div class="speaker-image">
<a href="https://izmailovpavel.github.io/" target="_block">
<img src="https://izmailovpavel.github.io/imgs/pavel-alaska.jpeg" alt="Prof. Pavel Izmailov" width="500" height="600";/>
</a>
</div>
<div class="speaker-text">
<h3>Weak to Strong Generalization</h3>
<p><img src="icons/person.png" alt="Person Icon" class="icon"><a href="https://izmailovpavel.github.io/">Pavel Izmailov</a></p>
<p><img src="icons/clock.png" alt="Clock Icon" class="icon">Feb 21, 2025, 2:00 PM</p>
<p><img src="icons/location.png" alt="Location Icon" class="icon">289, Engineering VI</p>
</div>
<div class="footer">
<button class="more-details">More Details</button>
</div>
</div>
<div class="talk-details">
<p><strong>Speaker Bio:</strong> I am a Researcher at Anthropic. I am primarily interested in reasoning, AI for science and AI alignment. Previously, I worked on reasoning and problem solving in language models at OpenAI. I contributed to the recent OpenAI o1 models, a new state-of-the-art in LLM reasoning. I have also worked on weak-to-strong generalization on the superalignment team under Jeff Wu, Jan Leike and Ilya Sutskever. I also had a short stint at xAI, where I reported to Elon Musk. Starting in Fall 2025, I will be joining NYU as an Assistant Professor in the Tandon CSE department, and the Courant CS department by courtesy. I am also a member of the NYU CILVR Group. I defended my PhD in Computer Science at NYU in 2023.</p>
<p><strong>Abstract:</strong> Widely used alignment techniques, such as reinforcement learning from human feedback (RLHF), rely on the ability of humans to supervise model behavior—for example, to evaluate whether a model faithfully followed instructions or generated safe outputs. However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to weakly supervise superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model? We test this using a range of pretrained language models in the GPT-4 family on natural language processing (NLP), chess, and reward modeling tasks. We find that when we naively finetune strong pretrained models on labels generated by a weak model, they consistently perform better than their weak supervisors, a phenomenon we call weak-to-strong generalization. However, we are still far from recovering the full capabilities of strong models with naive finetuning alone, suggesting that techniques like RLHF may scale poorly to superhuman models without further work. We find that simple methods can often significantly improve weak-to-strong generalization: for example, when finetuning GPT-4 with a GPT-2-level supervisor and an auxiliary confidence loss, we can recover close to GPT-3.5-level performance on NLP tasks. Our results suggest that it is feasible to make empirical progress today on a fundamental challenge of aligning superhuman models. </p>
</div>
</div>
<h2>🚨 Past Talks </h2>
<!-- Sean Block -->
<div class="talk" onclick="toggleDetails(this)">
<div class="talk-summary">
<div class="date">
<div class="month">JAN</div>
<div class="day">24</div>
</div>
<div class="speaker-image">
<a href="https://wellecks.com/" target="_block">
<img src="https://uclanlp.github.io/nlp-seminar/icons/welleck_photo.png" alt="Sean Welleck" style="max-height: 300px;"/>
</a>
</div>
<div class="speaker-text">
<h3>Reasoning with Inference-Time Compute</h3>
<p><img src="icons/person.png" alt="Person Icon" class="icon"><a href="https://wellecks.com/">Sean Welleck</a></p>
<p><img src="icons/clock.png" alt="Clock Icon" class="icon">Jan 24, 2025, 2:00 PM</p>
<p><img src="icons/location.png" alt="Location Icon" class="icon">Virtual Talk</p>
</div>
<div class="footer">
<button class="more-details">More Details</button>
</div>
</div>
<div class="talk-details">
<p><strong>Speaker Bio:</strong> Sean Welleck is an Assistant Professor at Carnegie Mellon University, where he leads the Machine Learning, Language, and Logic (L3) Lab. His areas of focus include generative models, algorithms for large language models, and AI for code, science, and mathematics. Sean received a Ph.D. from New York University. He was a postdoctoral scholar at the University of Washington and the Allen Institute for Artificial Intelligence. He is a recipient of a NeurIPS 2021 Outstanding Paper Award, and two NVIDIA AI Pioneering Research Awards. </p>
<p><strong>Abstract:</strong> One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute at training time leads to better final results. However, there is another lesser-mentioned scaling phenomenon, where adopting more sophisticated methods and/or scaling compute at inference time can result in significantly better outputs from LLMs. In this talk, I will talk about our lab's recent work on using inference-time strategies to enable better reasoning. This includes training models to think prior to steps of formal mathematical proving, leveraging strong evaluation models to enable easy-to-hard generalization, and inference scaling laws that optimally balance cost and performance. Together, these advances point to a new paradigm of scaling compute at inference time. </p>
</div>
</div>
<!-- Sara Block -->
<div class="talk" onclick="toggleDetails(this)">
<div class="talk-summary">
<div class="date">
<div class="month">JAN</div>
<div class="day">31</div>
</div>
<div class="speaker-image">
<a href="https://www.sarahooker.me/" target="_block">
<img src="https://uclanlp.github.io/nlp-seminar/icons/sara-12.jpg" alt="Sara Hooker" style="max-height: 300px;"/>
</a>
</div>
<div class="speaker-text">
<h3>Understanding the role of data, scale and capacity in recent breakthroughs</h3>
<p><img src="icons/person.png" alt="Person Icon" class="icon"><a href="https://www.sarahooker.me/">Sara Hooker</a></p>
<p><img src="icons/clock.png" alt="Clock Icon" class="icon">Jan 31, 2025, 2:00 PM</p>
<p><img src="icons/location.png" alt="Location Icon" class="icon">289, Engineering VI</p>
</div>
<div class="footer">
<button class="more-details">More Details</button>
</div>
</div>
<div class="talk-details">
<p><strong>Speaker Bio:</strong> Sara Hooker leads Cohere For AI, the dedicated research arm of Cohere. Cohere For AI seeks to solve complex machine learning problems and supports fundamental research that explores the unknown. With a long track-record of impactful research at Google Brain, Sara brings a wealth of knowledge from across machine learning. Her work has focused on model efficiency training techniques and optimizing for models that fulfill multiple desired criteria -- interpretable, efficient, fair and robust. Sara leads a team of researchers and engineers working on making large language models more efficient, safe and grounded. Sara is currently on Kaggle's ML Advisory Research Board and serves on the World Economic Forum council on the Future of Artificial Intelligence. </p>
<!-- <p><strong>Abstract:</strong> INSERT TALK ABSTRACT HERE </p> -->
</div>
</div>
<!-- <div class="talk" onclick="toggleDetails(this)">
<div class="talk-summary">
<div class="date">
<div class="month">JAN</div>
<div class="day">10</div>
</div>
<div class="speaker-image">
<a href="https://swabhs.com/" target="_block">
<img src="Image Link" alt="Speaker Name" style="max-height: 300px;"/>
</a>
</div>
<div class="speaker-text">
<h3>Talk Title</h3>
<p><img src="icons/person.png" alt="Person Icon" class="icon"><a href="Speaker Website">Speaker Name</a></p>
<p><img src="icons/clock.png" alt="Clock Icon" class="icon">Jan 10, 2024, 2:00 PM</p>
<p><img src="icons/location.png" alt="Location Icon" class="icon">289, Engineering VI</p>
<p><img src="icons/zoom.png" alt="Zoom Icon" class="icon">To Be Announced</p>
</div>
<div class="footer">
<button class="more-details">More Details</button>
</div>
</div>
<div class="talk-details">
<p><strong>Speaker Bio:</strong> INSERT BIO </p>
<p><strong>Abstract:</strong> INSERT TALK ABSTRACT HERE </p>
</div>
</div> -->
<h2>Organizing Committee</h2>
<div class="committee-content">
<h3>Faculty</h3>
<div class="row">
<div class="speaker-image">
<a href="https://web.cs.ucla.edu/~kwchang/" target="_block">
<img src="https://web.cs.ucla.edu/~kwchang/img/myphoto.jpg" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Prof. Kai-Wei Chang</b></p>
</div>
<div class="speaker-image">
<a href="https://violetpeng.github.io/" target="_block">
<img src="https://violetpeng.github.io/photos/profile22.png" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Prof. Nanyun Peng </b></p>
</div>
<div class="speaker-image">
<a href="https://saadiagabriel.com/" target="_block">
<img src="https://saadiagabriel.com/website.png" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Prof. Saadia Gabriel </b></p>
</div>
<div class="speaker-image">
<a href="https://www.coalas-lab.com/elisakreiss" target="_block">
<img src="https://comm.ucla.edu/wp-content/uploads/2023/08/ElisaSept2022.png" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Prof. Elisa Kreiss </b></p>
</div>
</div>
<div style="clear: both; text-align: left;">
<h3>Students</h3>
</div>
<div class="row">
<div class="speaker-image">
<a href="https://tanmayparekh.github.io/" target="_block">
<img src="https://tanmayparekh.github.io/vertical-ucla-pic.jpeg" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Tanmay Parekh</b></p>
</div>
<div class="speaker-image">
<a href="https://yufeitian.github.io/" target="_block">
<img src="https://yufeitian.github.io/website/images/avatar.jpg" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Yufei Tian</b></p>
</div>
<div class="speaker-image">
<a href="https://asuvarna31.github.io" target="_block">
<img src="https://asuvarna31.github.io/images/IMG_0888.jpg" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Ashima Suvarna</b></p>
</div>
<div class="speaker-image">
<a href="https://evelinehong.github.io/" target="_block">
<img src="https://evelinehong.github.io/assets/images/credit_to_zeng.jpg" style="max-height: 200px; max-width: 200px;"/>
</a>
</figure>
<p><b>Yining Hong </b></p>
</div>
<div class="speaker-image">
<a href="https://salmanrahman.net/" target="_block">
<img src="https://salmanrahman.net/assets/salman-img.jpg" style="max-height: 200px; max-width: 150px;"/>
</a>
</figure>
<p><b>Salman Rahman </b></p>
</div>
</div>
</div>
</main>
</body>
</html>