Sudipta Kar

Senior Applied Scientist at Amazon AGI


Sudipta Kar is a Senior Applied Scientist at Amazon AGI. He received his Ph.D. in Computer Science from the University of Houston in 2020 under the supervision of Thamar Solorio. His doctoral research focused on creative text analysis. Currently, he works on developing intelligent systems to enable seamless proactivity in smart voice assistants such as Alexa. His research interests include computational systems for low-resource languages, language models, and information extraction. He has co-organized multiple natural language processing workshops and shared tasks, including BLP, CALCS, MultiCoNER, and SentiMix. Additionally, in 2023 he led the first NLP hackathon held in Bangladesh.

Tutorial title: The Power of Rewards - Reinforcing Language Models with Reinforcement Learning

Language models ranging from BERT to GPT have shown impressive performance on many natural language tasks, first through self-supervised pre-training on large text corpora and then through fine-tuning on downstream tasks, or even with zero-shot or few-shot approaches. These models are also very powerful at solving multiple tasks. However, their capabilities are still limited because they lack a built-in reward signal with which to directly optimize for the end-task objective or align with human preferences. Reinforcement Learning (RL), on the other hand, is an area of machine learning commonly applied to systems that improve through real-time feedback loops (such as games), because it provides a framework for optimizing goal-directed behavior through rewards. In this tutorial, we will explore how reinforcement learning came into play in language modeling and changed the game, reinforcing the representational power of large language models with the ability to more efficiently solve tasks requiring reasoning, planning, and knowledge. We will first provide background on contemporary language models and reinforcement learning fundamentals. Next, we will discuss techniques for applying policy gradient methods to fine-tune language models to maximize rewards from an environment. Attendees will learn how to practically apply RL to language tasks, understand the tradeoffs between different algorithms, and gain insight into state-of-the-art research in this emerging field.
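
As a concrete taste of the policy-gradient techniques the tutorial covers, below is a minimal REINFORCE sketch for fine-tuning a small causal language model: sample a continuation, score it with a reward, and raise the log-probability of highly rewarded samples. The Hugging Face transformers usage is standard, but the toy_reward function, model choice, and hyperparameters are illustrative assumptions rather than the tutorial's actual material; real pipelines (e.g., RLHF with PPO) add a learned reward model, a KL penalty against the base policy, and variance-reduction baselines.

```python
# Minimal REINFORCE sketch for fine-tuning a causal LM (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

def toy_reward(text: str) -> float:
    # Hypothetical reward: prefer continuations that end with a period.
    return 1.0 if text.strip().endswith(".") else -1.0

prompt = "The key idea of reinforcement learning is"
inputs = tokenizer(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

for step in range(3):
    # Sample a continuation from the current policy.
    with torch.no_grad():
        generated = model.generate(
            **inputs, do_sample=True, max_new_tokens=20,
            pad_token_id=tokenizer.eos_token_id,
        )
    continuation = generated[0, prompt_len:]
    reward = toy_reward(tokenizer.decode(continuation))

    # Recompute log-probs of the sampled tokens so gradients can flow.
    logits = model(generated).logits[0, :-1]  # position t predicts token t+1
    logprobs = torch.log_softmax(logits, dim=-1)
    token_lp = logprobs[torch.arange(generated.shape[1] - 1), generated[0, 1:]]
    continuation_lp = token_lp[prompt_len - 1:].sum()

    # REINFORCE: maximize reward-weighted log-likelihood of sampled actions.
    loss = -reward * continuation_lp
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```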

Danae Sánchez

Researcher at the University of Copenhagen


Danae Sánchez Villegas is a postdoctoral researcher at the University of Copenhagen. She holds a Ph.D. and a Master's degree in Computer Science from the University of Sheffield and a Bachelor's degree in Computer Engineering from the Instituto Tecnológico Autónomo de México. Her research interests include multilingual natural language understanding, vision and language modeling, and computational social science. Danae has worked as a Research Associate in the Natural Language Processing Group at the University of Sheffield and as an Applied Scientist Intern at Amazon Alexa.

Tutorial title: Exploring Transformers and Limitations in Language Modeling

This tutorial explores language modeling techniques in Natural Language Processing (NLP), covering key concepts from traditional approaches to Transformer architectures. Beginning with an introduction to NLP and language modeling, it delves into probabilistic language models and progresses to neural language models, emphasizing the significance of embeddings for semantic representation. Moving on to Transformer models, we will discuss key concepts such as multi-head attention mechanisms, masked language modeling, and encoder models. Additionally, the tutorial addresses the limitations of large language models, providing insights into challenges and considerations for leveraging these models effectively in practical applications.
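
As a small illustration of the masked language modeling objective mentioned above, the following sketch asks a pretrained encoder to fill in a masked token. It assumes the Hugging Face transformers library; the model choice is ours for illustration, not the tutorial's:

```python
# Masked language modeling with a pretrained encoder (illustrative sketch).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = f"The capital of France is {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
# Locate the [MASK] position in the input sequence.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits

# Top-5 candidate tokens for the masked position.
top5 = logits[0, mask_pos].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top5))  # e.g., ['paris', ...]
```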

Alham Fikri Aji

Assistant Professor at MBZUAI, UAE.


Alham Fikri Aji is an Assistant Professor at MBZUAI, holding a Ph.D. from the University of Edinburgh's Institute for Language, Cognition, and Computation. His doctoral research, supervised by Dr. Kenneth Heafield and Dr. Rico Sennrich, focused on enhancing the training and inference speed of machine translation. Dr. Aji's current research centers around multilingual, low-resource, and low-compute Natural Language Processing (NLP). His recent work has been in developing diverse multilingual large language models and multilingual NLP resources, particularly for underrepresented languages, with a specific emphasis on Indonesian. He has previously worked at Amazon, Google, and Apple.

Tutorial title: Training Lightweight Models via Knowledge Distillation and Parameter-Efficient Finetuning

Language models can be resource-hungry, making them prohibitive to train and deploy for the many people who lack access to adequate computing resources. In this tutorial, we explore how to build smaller, more lightweight models without sacrificing quality via knowledge distillation, and how to train models with a smaller memory footprint through parameter-efficient finetuning. The tutorial includes hands-on implementation and is ideal for those just entering the field of AI/NLP.
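
For a flavor of the distillation objective, here is one common formulation (in the style of Hinton et al.'s soft-target loss), in which the student matches temperature-softened teacher probabilities alongside the hard labels. The tensors and hyperparameters below are random stand-ins, not the tutorial's material:

```python
# Standard knowledge-distillation loss sketch: soft targets from the
# teacher plus ordinary cross-entropy on the ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL between softened teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2  # rescale so gradients match the hard-label term
    # Hard targets: cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(8, 10, requires_grad=True)  # 8 examples, 10 classes
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student_logits, teacher_logits, labels))
```

Parameter-efficient finetuning methods such as LoRA complement this by updating only small adapter matrices while the pretrained weights stay frozen, which is what keeps memory usage low.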

Víctor Mijangos

Professor at UNAM, Mexico.


Víctor Mijangos de la Cruz is a Full-Time Professor at the Faculty of Sciences of UNAM, where he teaches subjects related to Artificial Intelligence, Neural Networks, and Computational Linguistics. In addition to teaching, he develops research projects aimed at exploring inductive biases in neural networks and the application of deep learning models for the development of technologies in indigenous languages. His interests lie in the study of the capabilities and limitations of deep learning, computational linguistics, and the creation of technologies for indigenous languages, particularly the Otomí language.

Tutorial title: Introduction to Attention Mechanisms in Transformers

Attention layers are currently central mechanisms in language models. Transformers, which represent the state of the art in this field, rely on attention layers in combination with other strategies. Attention has also been used in models based on sequence-to-sequence recurrent networks, providing significant improvements in natural language processing tasks such as machine translation and text generation. Understanding how these mechanisms work is essential to comprehending current language models. This workshop aims to present a first approach to the attention mechanisms used in neural networks. First, the basic theoretical concepts needed to understand attention and its operation will be presented, other attention mechanisms, mainly sparse attention, will be reviewed, and the relationship of attention to autoencoding and autoregressive language models will be discussed, as will its relationship to other mechanisms such as convolutional layers and graph layers, highlighting their advantages and disadvantages. Second, the technical principles for implementing attention mechanisms in PyTorch and incorporating them into the Transformer architecture will be covered.
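
As a preview of the implementation part, below is a from-scratch sketch of scaled dot-product attention, the core operation the Transformer builds on; the shapes and optional boolean mask are illustrative (recent PyTorch versions also ship this operation as torch.nn.functional.scaled_dot_product_attention):

```python
# Scaled dot-product attention, written from scratch for clarity.
import math
import torch

def attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k); mask: broadcastable boolean tensor.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))  # block masked positions
    weights = torch.softmax(scores, dim=-1)  # attention distribution per query
    return weights @ v, weights

q = k = v = torch.randn(2, 5, 16)
out, w = attention(q, k, v)
print(out.shape, w.shape)  # torch.Size([2, 5, 16]) torch.Size([2, 5, 5])
```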

Diyi Yang

Assistant Professor at Stanford


Diyi Yang is an assistant professor in the Computer Science Department at Stanford University, also affiliated with the Stanford NLP Group, Stanford HCI Group, and Stanford Human Centered AI Institute. Her research focuses on human-centered natural language processing and computational social science. She is a recipient of IEEE “AI 10 to Watch” (2020), the Microsoft Research Faculty Fellowship (2021), an NSF CAREER Award (2022), an ONR Young Investigator Award (2023), and a Sloan Research Fellowship (2024). Her work has received multiple paper awards or nominations at top NLP and HCI conferences (e.g., Best Paper Honorable Mention at ICWSM 2016, Best Paper Honorable Mention at SIGCHI 2019, and Outstanding Paper at ACL 2022).

Keynote: Human Centered NLP for Positive Impact

Large language models have revolutionized the way humans interact with AI systems, transforming a wide range of fields and disciplines. However, there is growing evidence of and concern about the negative aspects of NLP systems, such as biases and the lack of input from users. How can we build NLP systems that are more aware of human factors? In this talk, we will present two case studies on how human-centered design can be leveraged to build responsible NLP applications. The first presents a participatory design approach to developing dialect-inclusive language tools and adaptation techniques for low-resource languages and dialects. The second looks at social skill training with LLMs, demonstrating how we use LLMs to teach conflict resolution skills through simulated practice. We conclude by discussing how human-AI interaction via LLMs can empower individuals and foster positive change.

Veronica Perez Rosas

Researcher at the University of Michigan


Veronica Perez Rosas obtained her Ph.D. in Computer Science and Engineering from the University of North Texas in 2014. She is a Level I Researcher recognized by the National System of Researchers. Currently, she is a researcher at the University of Michigan, where she is part of the Artificial Intelligence laboratory and the Language and Information Technologies research group in the Department of Computer Science. Her research interests include natural language processing (NLP), machine learning, computational linguistics, and multimodal representations. Her research focuses on NLP applications, including the automatic detection of misinformation, NLP in mental health, and the detection of human behaviors such as sentiment, deception, sarcasm, and affective response.

Alexis Palmer

Assistant Professor at the University of Colorado Boulder


Alexis Palmer is a computational linguist and a professor in the Department of Linguistics at the University of Colorado Boulder. She studied English literature at the University of Michigan, and later earned both an MA and a PhD in computational linguistics from the University of Texas at Austin. Her main research interests include computational semantics, computational discourse, and the development and application of computational methods to support language documentation and revitalization.

Umut Pajaro Velasquez

Information Systems Ph.D. student


Umut Pajaro Velasquez holds a BA in Communications and an MA in Cultural Studies and is now an Information Systems Ph.D. student. They are currently delving into digital rights and AI ethics, striving to rectify biases, particularly those affecting gender, race, and marginalized communities in technology, and are an active contributor and advocate in global dialogues on digital governance. As former Chair of the ISOC Gender Standing Group, they have spoken at events like the Internet Governance Forum and RightsCon. Their involvement includes roles such as Mozilla Festival Wrangler, Líderes LACNIC 2.0, and Open Life Science 7 Fellow. They are a published researcher in mass media, digital rights, and AI ethics, aim to transition into academia or diplomacy, and are also driven by a passion for photography and poetry.

Keynote: Towards Inclusive Design and Development: Open Data and Gender as Key Tools for Inclusive LLMs

We will explore the synergy between open data and collaborative construction with a gender lens, unraveling how this combination becomes a fundamental catalyst for the creation of more equitable and bias-free deep language models. We will take a close look at how the transparency and accessibility of open data, along with the active inclusion of gender perspectives in its construction, contribute significantly to mitigating the inherent biases in LLMs. We will see how practical strategies demonstrate that this approach not only addresses crucial challenges but also drives innovation towards a future where NLP systems and LLMs more accurately and fairly reflect the diversity of our society. In conclusion, we propose a path towards collaboratively building a more inclusive and equitable technological future.

Jocelyn Dunstan

Assistant Professor at the Pontifical Catholic University of Chile.


Jocelyn Dunstan is an Assistant Professor at the Pontifical Catholic University of Chile. She holds a Ph.D. in Applied Mathematics and Theoretical Physics from the University of Cambridge in the UK. She specializes in leveraging machine learning and natural language processing to address key challenges. Her research primarily revolves around clinical text mining and patient prioritization. In addition to her academic role at the Catholic University of Chile, she is actively engaged as a researcher at prominent institutions such as the Millennium Institute for Foundational Research on Data (IMFD) and the Advanced Center for Electrical and Electronic Engineering (AC3E). Further information about her group's work can be found on their webpage at pln.cmm.uchile.cl.

Manuel Montes y Gomez

Full Professor at INAOE, Mexico.


Manuel Montes-y-Gómez is a Full Professor at the National Institute of Astrophysics, Optics and Electronics (INAOE) of Mexico. His research is on automatic text processing. He is the author of more than 250 journal and conference papers in the fields of information retrieval, text mining, and authorship analysis.
He has been a visiting professor at the Polytechnic University of Valencia (Spain) and the University of Alabama (USA). He is also a member of the Mexican Academy of Sciences (AMC), and a founding member of the Mexican Academy of Computer Science (AMEXCOMP), the Mexican Association of Natural Language Processing (AMNLP), and the Language Technology Network of CONACYT. Within these organizations, he has organized the National Workshop on Language Technologies (from 2004 to 2016), the Mexican Workshop on Plagiarism Detection and Authorship Analysis (2016-2020), the Mexican Autumn School on Language Technologies (2015 and 2016), and a shared task on author profiling, aggressiveness analysis, and fake news detection in Mexican Spanish at IberLEF (2018-2021).

Luciana Benotti

Associate Professor at the National University of Córdoba and AI Researcher at CONICET, Argentina.


Luciana Benotti is an Associate Professor in Computer Science at the National University of Córdoba and a Researcher in Artificial Intelligence at CONICET, Argentina. Her research interests include different aspects of situated and interactive NLP, such as interpreting instructions in a dialogue, generating contextualized questions, and deciding when to speak in a dialogue system, among others. She is particularly interested in how linguistic and non-linguistic features contribute to the meaning conveyed during a conversation. These features include what the conversational participants are doing while they talk, the visual context, temporal aspects, etc. She has been a visiting scientist at the University of Trento (2019), Stanford University (2018), Roskilde University (2014), the University of Costa Rica (2012), and the University of Southern California (2010). She holds a joint MSc Erasmus Mundus from the Free University of Bolzano and the Polytechnic University of Madrid and a PhD from the Université de Lorraine. This year, she was chosen as the Latin American representative for the North American Association for Computational Linguistics (NAACL).