Location
Online
Duration
8 Weeks
Upcoming Sessions
Fall 2025
Have more questions?

Generative AI with LLMs and Reinforcement Learning

Design AI-driven systems using large language models and reinforcement learning to build Q&A bots, improve NER, optimize dialogue summarization, and reduce toxic content.

PBL
Tracks

Track 1

Building an AI-driven Contextual Q&A Bot with Large Language Models

Track 2

Evaluating Named Entity Recognition (NER) Responses Using Large Language Models

Track 3

Optimizing Dialogue Summarization with Generative AI: Fine-Tuning and Prompt Engineering with LLMs

Track 4

Leveraging Reinforcement Learning to Minimize Toxic Content in Language Models

Project
OpenAI Project

Outcomes

Integrate hate speech classifiers for safer language generation

Utilize LangChain for context-aware NER enhancements

Fine-tune pre-trained models for improved summarization

Apply Proximal Policy Optimization to enhance model behavior

Implement chunking and RAG for handling large documents

You Will Get

Industry Guidance

Work directly with our project leads—experts and top researchers—who bring their real-world insights and expertise straight to your learning experience.

Research Experience

Collaborate with teammates and the project lead in a multi-week project to pursue novel questions in your research field.

Peer Networks

Engage with our PBL participants from all over the world. Collaborate with new peers and learn about their own research endeavours.

A Strong Portfolio

Put your best foot forward in the PBL with a standout project and receive a PBL Evaluation Report that can be used as a recommendation letter for employers and grad schools.

Expert Guidance

Get personalized feedback to grow your research and innovation skills.

Final Outcomes

Deliverables

Real projects, lasting connections, and new opportunities beyond your program.

Project Deliverables
The final presentation of your 8 weeks could be a poster, written report, or a slide deck, all of which can be expanded on.
Research Extension
Utilize up to 5 additional meeting times with the project lead after the project’s conclusion to build your work out for publication or conference presentation.
Industry Network
Meet peers in your projects and participate in a global talent community both online and in-person.
Industry Application

OpenAI’s work aligns with this PBL by utilizing large language models and reinforcement learning for building advanced Q&A bots, improving NER accuracy, and reducing toxic content in AI-generated text. These technologies are widely applicable across industries like finance, healthcare, and tech, improving automation, content moderation, and customer service systems.

Popular Industry Positions

Machine Learning Engineer

Fine-tunes models for specific tasks like text summarization and NER.

AI Engineer

Develops AI-driven systems, enhancing chatbots and NLP applications.

Data Scientist

Analyzes large datasets using machine learning and NLP for decision-making.

Tracks

Track 1

Building an AI-driven Contextual Q&A Bot with Large Language Models

Design and develop a contextual question-and-answer bot using advanced open-source language models.

  • Enable users to ask questions specific to the content of a given PDF document, ensuring responses align with the document’s context.

  • Leverage large language models to process and interpret PDF content, offering accurate answers while rejecting out-of-scope queries.

  • Write a Python program that integrates LLMs and innovative techniques like chunking and retrieval-augmented generation (RAG) to handle large documents.

  • Ensure the system includes mechanisms for managing token limits and denying responses to irrelevant questions.
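
As a rough illustration of the chunking and retrieval steps listed above, the following minimal sketch splits a document into overlapping word chunks and returns only the chunks relevant to a question. It assumes the PDF text has already been extracted to a string. The function names (`chunk_words`, `cosine`, `retrieve`) and the bag-of-words similarity are illustrative stand-ins, not the project's actual method; a real system would more likely use embedding-based retrieval and pass the retrieved chunks to an LLM.

```python
import math
import re
from collections import Counter


def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def _vector(text):
    """Bag-of-words term counts (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2, min_score=0.1):
    """Return up to top_k chunks scoring at least min_score, best first.

    An empty list signals that the question looks out of scope, so the
    caller can decline to answer instead of prompting the LLM.
    """
    q = _vector(question)
    scored = sorted(
        ((cosine(q, _vector(c)), c) for c in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [chunk for score, chunk in scored[:top_k] if score >= min_score]
```

Capping `top_k` and `chunk_size` is one simple way to keep the prompt within a model's token limit; the threshold `min_score` is an arbitrary value that would need tuning on real documents.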

Track 2

Evaluating Named Entity Recognition (NER) Responses Using Large Language Models

Delve into improving Named Entity Recognition (NER) accuracy by harnessing advanced large language models.

  • Mitigate false positives in NER outputs using context-aware filtering mechanisms provided by LLMs.

  • Build a Python program that processes NER responses and implements intelligent filtering to reduce errors.

  • Leverage tools like LangChain to enhance the precision of NER outputs through deep word- and context-level understanding.

  • Empower participants to refine NER systems, driving innovation in the field of natural language processing.
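
The filtering idea described above can be sketched as follows. This is a minimal sketch, assuming NER output arrives as a list of dictionaries with `text` and `label` keys and that `judge` is any callable that sends a prompt to an LLM and returns its reply. The prompt wording, `build_prompt`, and `filter_entities` are hypothetical names introduced for illustration; they are not part of LangChain or any specific NER library.

```python
from typing import Callable, Dict, List

# Hypothetical prompt wording; a real project would tune this.
PROMPT_TEMPLATE = (
    "Sentence: {sentence}\n"
    "Candidate entity: {text}\n"
    "Proposed label: {label}\n"
    "Is the candidate really a {label} in this sentence? Answer yes or no."
)


def build_prompt(sentence: str, entity: Dict[str, str]) -> str:
    """Format one yes/no verification prompt for a single NER candidate."""
    return PROMPT_TEMPLATE.format(
        sentence=sentence, text=entity["text"], label=entity["label"]
    )


def filter_entities(
    sentence: str,
    entities: List[Dict[str, str]],
    judge: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Keep only the entities that the judge confirms in context.

    `judge` is assumed to wrap an LLM call; any reply that does not
    start with "yes" is treated as a rejection (a likely false positive).
    """
    kept = []
    for entity in entities:
        answer = judge(build_prompt(sentence, entity)).strip().lower()
        if answer.startswith("yes"):
            kept.append(entity)
    return kept
```

Passing the judge in as a parameter keeps the filter independent of any one model provider and makes it easy to test with a stub before connecting a real LLM.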

Track 3

Optimizing Dialogue Summarization with Generative AI: Fine-Tuning and Prompt Engineering with LLMs

Explore the potential of dialogue summarization by leveraging generative AI and large language models (LLMs).

  • Investigate how different prompt structures influence model output, using prompt engineering to create precise and contextually relevant summaries.

  • Experiment with zero-shot, one-shot, and few-shot learning to uncover the nuances of prompt engineering and its role in enhancing LLM performance.

  • Fine-tune a pre-existing LLM, such as FLAN-T5 from Hugging Face, to improve its dialogue summarization capabilities, measuring results with ROUGE metrics.

  • Explore Parameter Efficient Fine-Tuning (PEFT) to balance accuracy and efficiency, optimizing scalability and resource usage without compromising performance.
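
To make the prompt-engineering and evaluation steps above more concrete, here is a minimal sketch of zero-, one-, and few-shot prompt construction together with a simplified ROUGE-1 F1 score. The prompt layout and the names `build_prompt` and `rouge1_f1` are assumptions for illustration. The score uses lowercased whitespace tokens with no stemming, so it will not match a full ROUGE implementation exactly.

```python
from collections import Counter


def build_prompt(dialogue, examples=()):
    """Build a summarization prompt.

    `examples` is a sequence of (dialogue, summary) pairs: zero pairs
    gives a zero-shot prompt, one pair a one-shot prompt, and several
    pairs a few-shot prompt.
    """
    parts = []
    for ex_dialogue, ex_summary in examples:
        parts.append(f"Dialogue:\n{ex_dialogue}\n\nSummary: {ex_summary}")
    # The final block is left open for the model to complete.
    parts.append(f"Dialogue:\n{dialogue}\n\nSummary:")
    return "\n\n".join(parts)


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: unigram overlap between two texts."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Comparing the score of a model's summary before and after fine-tuning, on the same held-out dialogues, is one straightforward way to measure whether fine-tuning helped.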

Track 4

Leveraging Reinforcement Learning to Minimize Toxic Content in Language Models

Dive into the refinement of the FLAN-T5 model, utilizing reinforcement learning to generate safer, non-toxic content.

  • Integrate Meta AI's hate speech reward model, a binary classifier that assesses text as “hate” or “not hate,” guiding the model toward safer language generation.

  • Employ Proximal Policy Optimization (PPO) to fine-tune the model by progressively rewarding non-toxic outputs, reducing harmful behavior.

  • Gain hands-on experience in advanced AI techniques, such as generative AI, reinforcement learning, and prompt engineering, while applying LLMs to real-world challenges.

  • Equip participants with the skills needed to build robust AI systems that mitigate biases and toxic content, enhancing applications in natural language processing, dialogue systems, and content filtering.
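
The two core quantities in this track, the reward from the hate speech classifier and the PPO update signal, can be sketched in a few lines. This is a simplified, single-sample illustration under stated assumptions: the classifier returns two logits, the position of the "not hate" class (`not_hate_index`) is a hypothetical default that must be checked against the classifier's actual label mapping, and the function names are introduced here for illustration. A real training loop would typically rely on a reinforcement learning library rather than hand-written code like this.

```python
import math


def not_hate_reward(logits, not_hate_index=0):
    """Softmax probability of the 'not hate' class, used as the reward.

    `not_hate_index` is an assumption; verify it against the label
    mapping of the classifier you actually use.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[not_hate_index] / sum(exps)


def ppo_clipped_objective(new_logprob, old_logprob, advantage, clip_eps=0.2):
    """PPO clipped surrogate objective for a single sampled action.

    The probability ratio between the updated and the old policy is
    clipped to [1 - clip_eps, 1 + clip_eps], which limits how far one
    update can move the policy.
    """
    ratio = math.exp(new_logprob - old_logprob)
    clipped = max(1 - clip_eps, min(ratio, 1 + clip_eps))
    return min(ratio * advantage, clipped * advantage)
```

With a positive advantage (a less toxic output than expected), the objective stops growing once the ratio exceeds `1 + clip_eps`, so the model gains nothing from pushing a single update further.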

PBL Journey

Online PBL Projects meet once a week for 8 weeks, and follow the research project format. Participants will meet the project lead, learn the conventions of the field and familiarize themselves with the tracks, then spend the middle portion of their time collaborating to develop their research. 

At the end, participants will present their final project and receive feedback, with the opportunity to extend their timeline and develop the project in greater depth.

Project Team

Our Academic Team plays a vital role in your PBL journey at Blended Learning. We are dedicated to enhancing your learning experience and ensuring your academic success. Our team consists of three distinct roles, each with a specific focus to support your Research Guidance, Project Progress, and Personal Growth.

Project Lead

Providing Industry and Research Guidance

AI Scientist and Researcher in Reinforcement Learning


With over six years of research and industry experience, she specializes in optimization techniques, deep reinforcement learning, and human-in-the-loop reinforcement learning. Her expertise includes advanced natural language processing using TensorFlow and PyTorch. She is proficient in Python, R, Java, C++, and AWS technologies, excelling at simplifying complex concepts for clear communication. Her work stands out in fast-paced, deadline-driven environments, consistently delivering high-quality content and training. Her academic background in Operations Research and Applied Mathematics further strengthens her contributions to cutting-edge AI and machine learning projects.

Academic Advisor

Tracking Your Project Development

The Academic Advisor is dedicated to your project completion success. They manage the progress of your PBL, guiding team formation, facilitating group discussions, and resolving conflicts. Additionally, the Academic Advisor ensures team member contributions are on track and provides logistical support, including attendance tracking, hosting recitation sessions, managing research support requests, and conducting student evaluations at the end of the PBL.

From Our Students

"After a night spent debugging, I suddenly discovered the program running perfectly. In that triumphant moment, you realize your true capability and success. The exhaustion fades, replaced by the thrill of knowing your skills and persistence led to this achievement, reaffirming your potential."

Nicole Y.

National University of Singapore
B.S. Economics

FAQs

What is the learning format of a PBL?

All PBLs are offered in an 8-week online format that begins with an orientation, followed by a subject setup and an overview of the different tracks. The majority of the session time is dedicated to project development, with a final presentation at the culmination of the 8 weeks. Many PBLs are also offered bi-annually in an on-campus format that consists of daily in-person meetings.

How long does each PBL cohort last?

One round of the Online PBL cohort lasts 8 weeks, preceded by a pre-PBL orientation week. Each On-Campus PBL usually has 8 in-person meetings, with intensive classroom education and collaboration. The biggest difference between online and on-campus PBLs is therefore the amount of time participants have between meetings.

How can I be more academically prepared before the PBL starts?

Review the Blended Learning Insights sent by the Academic Advisor and familiarize yourself with the project topic and pre-learning materials. Ensure you have all the software and other resources needed for the PBL.

For each PBL cohort, will I work in teams? Are PBL team members self-selected or assigned?

Yes, you will work in teams for each round of the PBL Cohort. Each team has 3 to 6 participants, organized by the Academic Team. The Academic Advisor will organize groupings based on students' backgrounds, preferred track, and skills. 

Can I work with the Project Lead on my project after the PBL ends?

Yes, with your AI + X Research Plan, you may request up to five PBL Research Extension meetings, where you work with the project lead to develop your project into a working manuscript. To schedule a PBL Research Extension meeting, talk to your Academic Advisor at the conclusion of your PBL.

What do I receive at the end of the PBL?

At the conclusion of the PBL cohort, you can request a PBL Evaluation Report which summarizes the PBL content, the hours you spent, the track you chose, and includes a recommendation letter from the Project Lead (for eligible participants who completed the project successfully).

Is attendance mandatory for PBL Live Sessions and Recitation Sessions?

Yes, attendance is mandatory for both PBL Live Sessions and Recitation Sessions. Participants with three or more unexcused absences forfeit their eligibility for a PBL Evaluation Report. 

Do I need to have my camera on during online PBL Live Sessions?

Yes, you must have your camera on during online PBL Live Sessions. Participants with cameras off will be marked as absent. This is meant to encourage active engagement and participation in meetings.
