
Generative AI with LLMs and Reinforcement Learning
Design AI-driven systems using large language models and reinforcement learning to build Q&A bots, improve NER, optimize dialogue summarization, and reduce toxic content.
PBL
Tracks
Track 1
Building an AI-driven Contextual Q&A Bot with Large Language Models
Track 2
Evaluating Named Entity Recognition (NER) Responses Using Large Language Models
Track 3
Optimizing Dialogue Summarization with Generative AI: Fine-Tuning and Prompt Engineering with LLMs
Track 4
Leveraging Reinforcement Learning to Minimize Toxic Content in Language Models
Project
OpenAI Project
Location
Online
Duration
8 Weeks
Upcoming Sessions
Fall 2025
Outcomes
Integrate hate speech classifiers for safer language generation
Utilize LangChain for context-aware NER enhancements
Fine-tune pre-trained models for improved summarization
Apply Proximal Policy Optimization to enhance model behavior
Implement chunking and RAG for handling large documents
You Will Get
Industry Guidance
Work directly with our project leads—experts and top researchers—who bring their real-world insights and expertise straight to your learning experience.

Research Experience
Collaborate with teammates and the project lead in a multi-week project to pursue novel questions in your research field.

Peer Networks
Engage with our PBL participants from all over the world. Collaborate with new peers and learn about their own research endeavours.
A Strong Portfolio
Put your best foot forward in the PBL with a standout project and receive a PBL Evaluation Report that can be used as a recommendation letter for employers and grad schools.

Expert Guidance
Get personalized feedback to grow your research and innovation skills.

Deliverables
Real projects, lasting connections, and new opportunities beyond your program.
Project Deliverables
Your final presentation at the end of the 8 weeks can be a poster, a written report, or a slide deck, any of which can be expanded on afterward.
Research Extension
Utilize up to 5 additional meeting times with the project lead after the project’s conclusion to build your work out for publication or conference presentation.
Industry Network
Meet peers in your projects and participate in a global talent community, both online and in person.
Industry Application
OpenAI’s work aligns with this PBL by utilizing large language models and reinforcement learning for building advanced Q&A bots, improving NER accuracy, and reducing toxic content in AI-generated text. These technologies are widely applicable across industries like finance, healthcare, and tech, improving automation, content moderation, and customer service systems.
Popular Industry Positions
Machine Learning Engineer
Fine-tunes models for specific tasks like text summarization and NER.
AI Engineer
Develops AI-driven systems and enhances chatbots and NLP applications.
Data Scientist
Analyzes large datasets using machine learning and NLP for decision-making.
Tracks
Track 1
Building an AI-driven Contextual Q&A Bot with Large Language Models
Design and develop a contextual question-and-answer bot using advanced open-source language models.
Enable users to ask questions specific to the content of a given PDF document, ensuring responses align with the document’s context.
Leverage large language models to process and interpret PDF content, offering accurate answers while rejecting out-of-scope queries.
Write a Python program that integrates LLMs and innovative techniques like chunking and retrieval-augmented generation (RAG) to handle large documents.
Ensure the system includes mechanisms for managing token limits and denying responses to irrelevant questions.
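The chunking, retrieval, and out-of-scope rejection steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the function names (`chunk_text`, `relevance`, `retrieve`, `build_prompt`), the keyword-overlap score, the stopword list, and the 0.5 threshold are all hypothetical. A real system would typically extract text from the PDF first and rank chunks with embeddings rather than keyword overlap.

```python
import re

# Tiny illustrative stopword list (an assumption); real systems use fuller lists.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "in", "on", "to", "by",
             "for", "and", "what", "who", "how", "when", "where", "which", "do", "does"}


def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping word windows so each piece fits a size budget."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def keywords(text):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def relevance(question, chunk):
    """Fraction of question keywords found in the chunk (a toy relevance score)."""
    q = keywords(question)
    if not q:
        return 0.0
    c = set(keywords(chunk))
    return sum(t in c for t in q) / len(q)


def retrieve(question, chunks, top_k=2, min_score=0.5):
    """Return up to top_k chunks whose score clears the threshold, best first."""
    scored = sorted(((relevance(question, ch), ch) for ch in chunks),
                    key=lambda pair: pair[0], reverse=True)
    return [ch for s, ch in scored[:top_k] if s >= min_score]


def build_prompt(question, chunks, top_k=2, min_score=0.5):
    """Build a grounded prompt, or return None when nothing relevant was found."""
    context = retrieve(question, chunks, top_k, min_score)
    if not context:
        return None  # caller should decline to answer an out-of-scope question
    joined = "\n---\n".join(context)
    return ("Answer using only the context below. "
            "If the answer is not in the context, say you cannot answer.\n\n"
            f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:")
```

Limiting `top_k` and `chunk_size` is one simple way to keep the assembled prompt within a model's token limit; returning `None` gives the caller a clear signal to refuse instead of sending an ungrounded question to the model.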
Track 2
Evaluating Named Entity Recognition (NER) Responses Using Large Language Models
Delve into improving Named Entity Recognition (NER) accuracy by harnessing advanced large language models.
Mitigate false positives in NER outputs using context-aware filtering mechanisms provided by LLMs.
Build a Python program that processes NER responses and implements intelligent filtering to reduce errors.
Leverage tools like LangChain to enhance the precision of NER outputs through deep word- and context-level understanding.
Empower participants to refine NER systems, driving innovation in the field of natural language processing.
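One way to picture the context-aware filtering described above is a loop that asks a judge whether each NER hit makes sense in its surrounding text. The sketch below is hypothetical throughout: `Entity`, `build_check_prompt`, and `filter_entities` are illustrative names, and the `judge` argument is any callable that takes a prompt string and returns text. In practice that callable would wrap an LLM call (for example through LangChain), which is deliberately not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Entity:
    """One NER prediction: surface text, predicted label, and character span."""
    text: str
    label: str
    start: int
    end: int


def context_window(sentence: str, entity: Entity, width: int = 30) -> str:
    """Return the entity plus up to `width` characters on each side."""
    lo = max(0, entity.start - width)
    hi = min(len(sentence), entity.end + width)
    return sentence[lo:hi]


def build_check_prompt(sentence: str, entity: Entity) -> str:
    """Ask a yes/no question about one predicted entity in its local context."""
    return (f'Text: "{context_window(sentence, entity)}"\n'
            f'Is "{entity.text}" used here as a {entity.label}? Answer yes or no.')


def filter_entities(sentence: str, entities: List[Entity],
                    judge: Callable[[str], str]) -> List[Entity]:
    """Keep only the entities the judge confirms; drop likely false positives."""
    kept = []
    for ent in entities:
        verdict = judge(build_check_prompt(sentence, ent)).strip().lower()
        if verdict.startswith("yes"):
            kept.append(ent)
    return kept
```

Passing the judge in as a parameter keeps the filtering logic testable with a simple stand-in function before any model is connected.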
Track 3
Optimizing Dialogue Summarization with Generative AI: Fine-Tuning and Prompt Engineering with LLMs
Explore the potential of dialogue summarization by leveraging generative AI and large language models (LLMs).
Investigate how different prompt structures influence model output, using prompt engineering to create precise and contextually relevant summaries.
Experiment with zero-shot, one-shot, and few-shot learning to uncover the nuances of prompt engineering and its role in enhancing LLM performance.
Fine-tune a pre-existing LLM, such as FLAN-T5 from Hugging Face, to improve its dialogue summarization capabilities, measuring results with ROUGE metrics.
Explore Parameter Efficient Fine-Tuning (PEFT) to balance accuracy and efficiency, optimizing scalability and resource usage without compromising performance.
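To make the prompt-engineering and evaluation ideas above concrete, the sketch below shows how zero-shot, one-shot, and few-shot prompts differ only in the number of worked examples placed before the target dialogue, along with a simplified unigram ROUGE-1 score. Both functions are illustrative assumptions: the prompt template is hypothetical, and `rouge1` uses plain whitespace tokenization, so its numbers will not exactly match standard ROUGE packages, which apply their own tokenization and optional stemming.

```python
from collections import Counter


def build_prompt(dialogue, examples=()):
    """Build a summarization prompt.

    examples is a sequence of (dialogue, summary) pairs:
    none gives a zero-shot prompt, one gives one-shot, several give few-shot.
    """
    parts = []
    for ex_dialogue, ex_summary in examples:
        parts.append(f"Dialogue:\n{ex_dialogue}\n\nSummary:\n{ex_summary}\n")
    # The target dialogue comes last, with the summary left open for the model.
    parts.append(f"Dialogue:\n{dialogue}\n\nSummary:\n")
    return "\n".join(parts)


def rouge1(candidate, reference):
    """Simplified ROUGE-1: unigram overlap precision, recall, and F1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}
```

Comparing `rouge1` scores for summaries produced from zero-shot, one-shot, and few-shot prompts, and again after fine-tuning, gives a rough but consistent way to see which change actually helped.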
Track 4
Leveraging Reinforcement Learning to Minimize Toxic Content in Language Models
Dive into the refinement of the FLAN-T5 model, utilizing reinforcement learning to generate safer, non-toxic content.
Integrate Meta AI's hate speech reward model, a binary classifier that assesses text as “hate” or “not hate,” guiding the model toward safer language generation.
Employ Proximal Policy Optimization (PPO) to fine-tune the model by progressively rewarding non-toxic outputs, reducing harmful behavior.
Gain hands-on experience in advanced AI techniques, such as generative AI, reinforcement learning, and prompt engineering, while applying LLMs to real-world challenges.
Equip participants with the skills needed to build robust AI systems that mitigate biases and toxic content, enhancing applications in natural language processing, dialogue systems, and content filtering.
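The reward-and-update idea behind this track can be illustrated with the PPO clipped surrogate objective computed on plain Python numbers. This is a sketch under assumptions, not the track's training code: the function names are hypothetical, the position of the "not hate" logit in the classifier output is assumed, and the mean-reward baseline is a simplification. Full RLHF pipelines usually also add a learned value function and a KL penalty against the original model, neither of which is shown.

```python
import math


def not_hate_reward(logits, not_hate_index=0):
    """Use the classifier's 'not hate' logit as a scalar reward.

    The index of the 'not hate' class is an assumption; check the label
    mapping of whichever classifier is actually used.
    """
    return logits[not_hate_index]


def advantages_from_rewards(rewards):
    """Simplified advantage: reward minus the batch mean (no value function)."""
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]


def ppo_clipped_objective(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean of min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A) over samples."""
    if not (len(new_logprobs) == len(old_logprobs) == len(advantages)):
        raise ValueError("inputs must have the same length")
    if not advantages:
        raise ValueError("at least one sample is required")
    terms = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio new / old policy
        clipped = max(1 - clip_eps, min(1 + clip_eps, ratio))
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

The clipping is what makes the updates "progressive": a response with a high non-toxicity reward can only pull the policy a bounded distance per step, which helps keep the fine-tuned model from drifting too far from its starting behavior.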
PBL Journey
Online PBL Projects meet once a week for 8 weeks, and follow the research project format. Participants will meet the project lead, learn the conventions of the field and familiarize themselves with the tracks, then spend the middle portion of their time collaborating to develop their research.
At the end, participants will present their final project and receive feedback, with the opportunity to extend their timeline and develop the project in greater depth.
Project Team
Our Academic Team plays a vital role in your PBL journey at Blended Learning. We are dedicated to enhancing your learning experience and ensuring your academic success. Our team consists of three distinct roles, each with a specific focus to support your Research Guidance, Project Progress, and Personal Growth.

Project Lead
Providing Industry and Research Guidance
AI Scientist and Researcher in Reinforcement Learning
With over six years of research and industry experience, she specializes in optimization techniques, deep reinforcement learning, and human-in-the-loop reinforcement learning. Her expertise includes advanced natural language processing using TensorFlow and PyTorch. She is proficient in Python, R, Java, C++, and AWS technologies, excelling at simplifying complex concepts for clear communication. Her work stands out in fast-paced, deadline-driven environments, consistently delivering high-quality content and training. Her academic background in Operations Research and Applied Mathematics further strengthens her contributions to cutting-edge AI and machine learning projects.

Academic Advisor
Tracking Your Project Development
The Academic Advisor is dedicated to your project completion success. They manage the progress of your PBL, guiding team formation, facilitating group discussions, and resolving conflicts. Additionally, the Academic Advisor ensures team member contributions are on track and provides logistical support, including attendance tracking, hosting recitation sessions, managing research support requests, and conducting student evaluations at the end of the PBL.
From Our Students
"After a night spent debugging, I suddenly discovered the program running perfectly. In that triumphant moment, you realize your true capability and success. The exhaustion fades, replaced by the thrill of knowing your skills and persistence led to this achievement, reaffirming your potential."

Nicole Y.
National University of Singapore
B.S. Economics

FAQs
What is the learning format of a PBL?
All PBLs are offered in an 8-week online format that begins with an orientation, followed by an overview of the subject and the different tracks. The majority of the session time is dedicated to project development, with a final presentation at the culmination of the 8 weeks. Many PBLs are also offered bi-annually in an on-campus format that consists of daily in-person meetings.
How long does each PBL cohort last?
One round of the Online PBL cohort lasts 8 weeks, preceded by a pre-PBL orientation week. Each On-Campus PBL usually has 8 in-person meetings, with intensive classroom education and collaboration. This means the biggest difference between online and on-campus PBLs is the time participants have between meetings.
How can I be more academically prepared before the PBL starts?
Review the Blended Learning Insights sent by the Academic Advisor and familiarize yourself with the project topic and pre-learning materials. Ensure you have all the software and other resources needed for the PBL.
For each PBL cohort, will I work in teams? Are PBL team members self-selected or assigned?
Yes, you will work in teams for each round of the PBL Cohort. Each team has 3 to 6 participants, organized by the Academic Team. The Academic Advisor will organize groupings based on students' backgrounds, preferred track, and skills.
Can I work with the Project Lead on my project after the PBL ends?
Yes, with your AI + X Research Plan, you may request up to five PBL Research Extension meetings, where you work with the project lead to develop your project into a working manuscript. To schedule a PBL Research Extension meeting, talk to your Academic Advisor at the conclusion of your PBL.
What do I receive at the end of the PBL?
At the conclusion of the PBL cohort, you can request a PBL Evaluation Report, which summarizes the PBL content, the hours you spent, and the track you chose, and includes a recommendation letter from the Project Lead (for eligible participants who completed the project successfully).
Is attendance mandatory for PBL Live Sessions and Recitation Sessions?
Yes, attendance is mandatory for both PBL Live Sessions and Recitation Sessions. Participants with three or more unexcused absences forfeit their eligibility for a PBL Evaluation Report.
Do I need to have my camera on during online PBL Live Sessions?
Yes, you must have your camera on during online PBL Live Sessions. Participants with cameras off will be marked as absent. This is meant to encourage active engagement and participation in meetings.