Recent advancements in natural language processing (NLP) have been driven by pre-trained large language models (LLMs). These LLMs serve as the foundation for the most cutting-edge models across a wide array of NLP tasks. With the availability of vast textual data and improvements in GPU parallelization, these models can now generate text fluently and follow instructions to accomplish a diverse range of tasks.

This course teaches the fundamentals of large language models, giving attendees a hands-on understanding of how they work and how to implement them. The course begins with the definition of language models and gradually increases in complexity, emphasizing adaptation techniques (e.g., in-context learning, few-shot learning, instruction learning) and methods to align models with human preferences. Additionally, advanced training techniques such as parallelism, selective architectures, and scaling laws will be covered. The course will also address bias and ethical concerns related to these models.

The course is part of the NLP master hosted by the Ixa NLP research group at the HiTZ research center of the University of the Basque Country (UPV/EHU).

Student profile

The course is addressed to professionals, researchers and students who want to understand and apply deep learning techniques to text. The practical part requires basic programming experience, a university-level course in computer science and experience in Python. Basic math skills (algebra or pre-calculus) are also needed.

Contents

Introduction to LLMs

Basics of Large Language Models
Modeling and architecture
Training techniques
LABORATORY: Basic prompting for text completion

Data, Bias, Harness

Data for training an LLM
Data contamination
Dangers of data:
. Bias
. Toxicity
Harness
LABORATORY: Measuring Bias on GPT2

Model Adaptation

Description of techniques for adapting models to specific tasks:
. Probing
. Prompt learning
. Reasoning (Chain-of-Thought, Self-consistency, ...)
. Instruction learning
LABORATORY: LLM adaptation for Question-Answering task

Human Alignment and Chatbots

Description of techniques to align LLMs with human preferences:
. Reinforcement Learning with Human Feedback (RLHF)
. Direct Preference Optimization (DPO)
LABORATORY: RAG based Chatbot

Parallelism and Scaling Laws

Description of techniques to train very large models using multiple GPUs:
Parallelism
Scaling Law:
. Modified Scaling Law
. Chinchilla
. Beyond Scaling Law
Parameter Efficient Fine-Tuning
LABORATORY: Comparison of PEFT techniques

Selective Architectures

Description of techniques for scaling LLMs:
Mixture-of-Experts
Switch Transformers
Merging of Models
LABORATORY: Language Agents and Tool LLMs

Instructors

Oier Lopez de Lacalle

Assistant Professor, member of Ixa and HiTZ

Practical details

General information

Part of the Language Analysis and Processing master program.
  • The classes will be broadcast live online. The practical labs will also be held online.
  • 5 theoretical sessions with interleaved hands-on labs (20 hours).
  • Scheduled from September 30th to October 4th 2024, 15:00-19:00 CET.
  • Teaching language: English.
  • Capacity: 60 attendees (first come, first served).
  • Cost: 270€ + 4€ insurance = 274€
    (If you are a UPV/EHU member or have already registered for another course, it is 270€).

Registration

Pre-registration is open
  • Please register by email to ixa.administratzailea@ehu.eus (subject "Registration to LLMs" and CC olatz.arregi@ehu.eus).
  • Use the same email address for any enquiries you might have.
  • After you receive the payment instructions, you will have three days to complete the payment.
  • The university provides official certificates (for an additional 27.96 euros). Please apply AFTER completing the course.
  • UPV/EHU can provide invoices addressed to universities or companies. More details are provided after registration is made.



Prerequisites
Basic Python programming experience.
The Deep Learning for Natural Language Processing course is not a requirement, but it is complementary and may give a better understanding of the underlying algorithms.