How We Built a WhatsApp Bot That Watches & Summarises YouTube Videos So You Don't Have To — NioBot

If you're interested in creating a custom chatbot with unique capabilities or need assistance building one, don't hesitate to reach out to us.

In this blog, we'll guide you through creating a WhatsApp-integrated chatbot that not only answers your queries but also provides YouTube 🎬 video summaries with just a link. You'll see why we made certain design choices, the challenges we overcame, and the lessons we learned. By the end, you'll have all the resources to try it yourself.

🔗 You can find the source code here 🔗

We set out to solve a simple problem: how do you make a chatbot not just smart, but genuinely useful and accessible to everyone, everywhere? The answer was clear: integrate it with WhatsApp, the app billions of people rely on daily for communication. No extra apps, no new platforms to learn, just a WhatsApp message away.

With this in mind, we built a chatbot using Twilio that addresses real, everyday challenges:

  • Extracts and summarises YouTube 🎬 videos from links shared in chats.
  • Acts as an always-online assistant with real-time internet access for instant answers.

Who Is This For?

We built this chatbot for:

  • YouTube 🎬 viewers who want quick summaries without watching entire videos.
  • Students and researchers managing projects or assignments and needing concise information or real-time answers.
  • Professionals looking for fast insights or up-to-date information within their workflow.
  • Everyday WhatsApp users who prefer simple, no-hassle tools integrated into their favourite messaging app.

It’s designed to fit seamlessly into daily routines, offering solutions that are both practical and accessible.

Let’s look at the problem first.

The Problem

We’ve all been there:

  • You’re trying to quickly find information online, but switching between apps or opening multiple tabs slows you down.
  • You’ve just come across an interesting YouTube video and want to get the key points, but you don't have time to sit through the entire video. You either have to search for a transcript (which is often incomplete) or manually skim through the video—both of which can take more time than you’d like.

These might seem like small inconveniences, but when they happen daily, they add up, costing time, energy, and focus.

Despite the rise of AI-driven tools, most solutions don’t integrate seamlessly into what we already use. For instance, even though you can search the web for answers or use transcription tools for videos, these involve extra steps: logging into a new platform, learning how it works, or copying data across apps. It’s inefficient and takes you out of your workflow.

What Makes This Frustrating?

  1. Scattered Information: You either search for answers manually or wait for AI tools in isolated interfaces to respond, which isn’t always practical.
  2. YouTube Overload: Videos often take too long to deliver the exact insights you need, and skipping around doesn’t guarantee you’ll find the right points.
  3. Lack of Accessibility: Tools that address these issues aren’t built where you’re already spending your time—like WhatsApp, a platform you use every day.

That’s exactly why we built this chatbot.

With two simple modes:

  1. Normal Chat Mode with Internet Access: Ask questions or look up real-time information directly in the chat—no need to open a browser or switch apps.
  2. YouTube Video Summarisation: Paste a video URL, and the bot extracts and summarizes the key points for you.

It’s straightforward, practical, and fits naturally into your daily routine. Let’s now look at how it is implemented and how it works.

Step-by-Step Guide to Building the Chatbot

Architecture Diagram

This section will guide you through setting up Twilio for WhatsApp integration, exposing your local server using Ngrok, and implementing the chatbot’s functionality step by step. By the end, you’ll have a fully functional chatbot capable of summarizing YouTube videos and engaging in real-time conversations.

Setting Up Twilio & Ngrok

To integrate WhatsApp with your chatbot, follow these steps:

Step 1: Sign Up for Twilio

  1. Go to Twilio’s website and create an account.
  2. Verify your email address and phone number to activate your account.
  3. Once logged in, you’ll get access to the Twilio Console, your dashboard for managing APIs, numbers, and keys.

Step 2: Get Sandbox Access for WhatsApp

  1. Navigate to Messaging in the Twilio Console and select Try WhatsApp.
  2. Follow the steps to activate the sandbox, which will provide you with a test phone number.

Step 3: Get Your Twilio Credentials

  1. Go to the Account Settings in the Twilio Console.
  2. Note down your Account SID and Auth Token—you’ll need these for integration.
  3. In the WhatsApp section, copy your sandbox phone number (if testing) or production number (if live).

To test the chatbot during development, you’ll need a publicly accessible URL for Twilio to forward incoming WhatsApp messages to your Flask application. This is where Ngrok comes into play. It exposes your local server to the internet, making it reachable by Twilio’s webhook.

Here’s how you can configure this step:

Step 4: Install Ngrok

  1. Download and install Ngrok from the official website.
  2. Authenticate Ngrok using your account by running the command below (Ngrok v2 releases use ngrok authtoken YOUR_AUTH_TOKEN instead):
ngrok config add-authtoken YOUR_AUTH_TOKEN

Step 5: Start Ngrok

  1. Start Ngrok on the same port where your Flask application is running. For example, if Flask runs on port 4040, run:
ngrok http 4040
  2. Ngrok will generate a public URL (something like https://2d57-2401-...ngrok-free.app).

Step 6: Configure Twilio Webhook

  1. In the Twilio Console, go to Messaging > Sandbox for WhatsApp.
  2. Under When a message comes in, paste your Ngrok URL appended with /summary.
  3. Save the settings.
https://2d57-2401-4900-8821-87b6-fc63-a195-a18a-d23b.ngrok-free.app/summary

What’s Happening Behind the Scenes?

  1. Twilio Sandbox: Receives your WhatsApp message and forwards it to the webhook URL you specified (the Ngrok link).
  2. Ngrok: Acts as a secure tunnel, routing the message to your local Flask server.
  3. Flask App: Processes the message using the logic we implemented and sends a response back via Twilio.
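
To make step 3 concrete: the reply travels back to Twilio as a small XML document (TwiML) in the HTTP response body. The sketch below builds that document with Python's standard library only. The name `build_twiml_reply` is hypothetical, and in a Flask app the Twilio helper library's `MessagingResponse` class is the more usual way to produce the same markup.

```python
import xml.etree.ElementTree as ET


def build_twiml_reply(text: str) -> str:
    """Return a TwiML document that tells Twilio to send `text` back to the sender."""
    response = ET.Element("Response")
    message = ET.SubElement(response, "Message")
    message.text = text  # ElementTree escapes &, < and > when serialising
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


print(build_twiml_reply("Summary: glass & its uses"))
```

When a web framework returns this string, the response should carry an XML content type (for example `text/xml`) so Twilio parses it as TwiML rather than plain text.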

With this setup, you can test the full functionality of your WhatsApp-integrated chatbot in real time without deploying it to the cloud.

Implementation Steps

With Twilio and Ngrok set up, we can now implement the chatbot’s functionality.

Step 1: Install Dependencies

First, ensure you have all the required dependencies installed. Use the following command to install them:

pip install flask twilio langchain langchain-together langchain-community python-dotenv youtube-transcript-api

Step 2: Set Up Your Environment Variables

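Create a .env file to store your API keys and credentials, and keep it out of version control. Name the LLM key whatever your provider's LangChain integration expects (the langchain-together package installed above reads TOGETHER_API_KEY, for example):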

OPENAI_API_KEY=your_openai_api_key
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
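
A quick startup check can catch a missing or blank credential before the first WhatsApp message arrives. This is a minimal sketch, not part of the original project: `missing_settings` and `REQUIRED_SETTINGS` are hypothetical names, and it assumes the variables have already been loaded into the process environment (for instance by python-dotenv's `load_dotenv()`).

```python
import os
from typing import Iterable, List, Mapping

# The three names listed in the .env file above.
REQUIRED_SETTINGS = ("OPENAI_API_KEY", "TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN")


def missing_settings(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_SETTINGS
) -> List[str]:
    """Return the names in `required` that are absent or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


# In the app this would run after load_dotenv() has populated os.environ.
missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
```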

Step 3: Import Required Libraries

Here’s the list of libraries we used:

  • Flask for creating a web server to handle WhatsApp webhook events.
  • Twilio for WhatsApp integration.
  • LangChain for building the LLM pipeline for chat responses and video summarization.
  • YoutubeLoader to load and process video transcripts from YouTube links.

Step 4: Define Key Functions

YouTube Video Summarisation:

  1. This function uses YoutubeLoader to extract transcripts from video URLs.
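  2. The transcript is passed to an LLM (via LangChain) with a custom prompt to generate a concise summary and highlight key points.

from langchain.chains import LLMChain
from langchain_community.document_loaders import YoutubeLoader
from langchain_core.prompts import PromptTemplate

def summarise(video_url):
    # Fetch the transcript; add_video_info=False skips the extra metadata lookup.
    loader = YoutubeLoader.from_youtube_url(video_url, add_video_info=False)
    data = loader.load()
    summary_template = PromptTemplate(
        input_variables=["video_transcript"],
        template="..."  # Custom summarization prompt
    )
    # llm_chat is the chat model instance, defined elsewhere in the app.
    chain = LLMChain(llm=llm_chat, prompt=summary_template)
    summary = chain.invoke({"video_transcript": data[0].page_content})
    return summary['text']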

Chat Response in Internet Mode:

def chat_response(message):
    # The prompt takes the incoming WhatsApp text as user_message.
    response_template = PromptTemplate(
        input_variables=["user_message"],
        template="..."  # Chat-specific prompt
    )
    chain = LLMChain(llm=llm_chat, prompt=response_template)
    response = chain.invoke({"user_message": message})
    # LLMChain returns a dict; the generated reply is under 'text'.
    return response['text']

Step 5: WhatsApp Message Routing

  • The chatbot responds differently based on user input:
    • Chat Mode: General queries with real-time internet access.
    • Summary Mode: Processes YouTube links and provides a concise summary.
    • Invalid input prompts the user to either enter a valid URL or switch modes.

user_chat_mode = {}  # sender -> True while that sender is in chat mode

@app.route('/summary', methods=['POST'])
def summary():
    # Twilio posts the message text as 'Body' and the sender as 'From'.
    message = (request.form.get('Body') or '').strip()
    sender = request.form.get('From')
    if message.lower() == "/chat":
        user_chat_mode[sender] = True
        response = "You've entered /chat mode! Ask me anything..."
    elif message.lower() == "/summary":
        user_chat_mode[sender] = False
        response = "You've entered /summary mode! Send a valid YouTube link..."
    elif user_chat_mode.get(sender, False):  # new senders default to summary mode
        response = chat_response(message)
    elif is_youtube_url(message):
        response = summarise(message)
    else:
        response = "Please enter a valid YouTube video URL or type '/chat'."
    return respond(response)
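
The route calls `is_youtube_url`, which is not shown in this post. One plausible version, using only the standard library, is sketched below. The set of accepted hosts and the two URL shapes it recognises (`/watch?v=...` and `youtu.be/...`) are assumptions made for illustration, not the project's actual rules.

```python
from urllib.parse import parse_qs, urlparse

# Hosts treated as YouTube for this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(text: str) -> bool:
    """Return True if `text` looks like a link to a single YouTube video."""
    parsed = urlparse(text.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if host not in YOUTUBE_HOSTS:
        return False
    if host == "youtu.be":
        # Short links carry the video id as the path: https://youtu.be/<id>
        return len(parsed.path.strip("/")) > 0
    if parsed.path == "/watch":
        # Standard links carry the id in the query string: ?v=<id>
        return bool(parse_qs(parsed.query).get("v"))
    return False
```

Checking the parsed host rather than searching for the substring "youtube" keeps plain messages such as "hi" or "/chat" from ever reaching the transcript loader.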

Step 6: Run the Server

Start the Flask server to listen for incoming WhatsApp messages:

if __name__ == '__main__':
    app.run(port=4040)

How to Use the WhatsApp Chatbot

To enable the chatbot to communicate over WhatsApp, you'll need to connect your device to the Twilio WhatsApp Sandbox. Follow these steps to ensure a seamless connection and start interacting with the bot.

Step 1: Connecting to the Twilio Sandbox

  1. Open WhatsApp on your device.
  2. Send a message to Twilio’s Sandbox number: +1 4155238886
  3. Use the provided sandbox code, e.g., join smile-hung.

Note: The Sandbox number is shared across Twilio accounts, while the join code is unique to your account's Sandbox. Use the number and code shown in your own Twilio Console.

Step 2: What Happens If You’re Not Connected?

If your number isn’t connected to the Sandbox, Twilio will send a reminder:

Twilio Sandbox: ⚠️ Your number whatsapp:+916267702526 is not connected to a Sandbox. You need to connect it first by sending join <sandbox name>. Sandbox membership lasts for 72 hours. You can rejoin a Sandbox as many times as you want.

Make sure to complete the connection process to avoid encountering this error.

Step 3: Testing the Bot

Once connected, you can start interacting with the bot. Depending on the mode you’re in, the bot will behave differently:

/Summary Mode

Initial Interaction:

When you first message the bot, you’ll receive an introductory response:

NioBot: Hello! I'm NioBot, created by the developers at Ionio.ai. I can assist you with basic queries in /chat mode. You can also type /summary followed by a valid YouTube link to get a summary of the video!

Sample Transcript:

  1. User: /summary
    NioBot: You've entered summary mode. Please send a valid YouTube link, and I'll summarize the video for you!
  2. User: [YouTube Link]
    NioBot: Summary: The main topic of this video is the incredible durability and versatility of glass...
    Key Points:
    1. Glass is inherently brittle
    2. Gorilla Glass is made using ion exchange
    3. The earliest human-made glass was an accident
    4. Transparent glass was a game-changer
  3. User: hi
    NioBot: Please enter a valid YouTube video URL or type '/chat' to enter chat mode.

/Chat Mode

Initial Interaction:

Switching to chat mode enables a general conversation:

User: /chat

NioBot: You've entered /chat mode! Ask me anything, and I'll respond with a sprinkle of light sarcasm. If you want to switch back to /summary mode, just send me a YouTube link!

Sample Transcript:

User: What’s the meaning of life?

NioBot: Ah, the age-old question. Well, if you’re asking me, it’s all about coding, coffee, and conquering bugs! But seriously, there’s no one answer…

Key Features of Each Mode

  1. /Summary Mode:
    • Responds to YouTube links with concise, informative video summaries.
    • Guides users to provide valid YouTube URLs.
  2. /Chat Mode:
    • Allows casual, conversational interaction.
    • Injects light humor into responses for an engaging experience.

And there you have it! A fully functional WhatsApp chatbot that can summarise YouTube videos and chat in real time. By integrating with the platforms we already use, we've simplified how you can access useful information with minimal effort.

What I’d Do Differently If I Built NioBot Again

As with any project, there were key lessons along the way. If I could go back and rebuild NioBot, I would take a few different paths, both technical and personal, each focused on making it more robust and seamless.

Prioritize Scalability

During the initial build, scalability wasn’t top of mind. I focused on getting a working prototype, but as the bot grew, I started running into issues with performance under high loads, especially with multiple users interacting with it at once.

What I would do:

I’d design the system with scalability in mind from the beginning, optimizing for load balancing and introducing more sophisticated methods for handling concurrent users.
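
One way to keep the webhook responsive while several users wait on slow summaries is to hand the work to a small thread pool and reply out of band. The sketch below is an illustration under assumptions, not the approach NioBot uses: `handle_in_background` is a hypothetical name, `work` stands for a function such as `summarise` or `chat_response`, and `send` stands for a hypothetical wrapper that delivers a message through Twilio's REST API.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

# A small pool so slow summaries do not block the thread serving the webhook.
executor = ThreadPoolExecutor(max_workers=4)


def handle_in_background(
    sender: str,
    text: str,
    work: Callable[[str], str],
    send: Callable[[str, str], None],
) -> Future:
    """Run `work(text)` off the request thread, then deliver the result via `send`."""

    def job() -> None:
        try:
            reply = work(text)
        except Exception:
            # Always tell the user something, even if the LLM call fails.
            reply = "Sorry, something went wrong while processing your message."
        send(sender, reply)

    return executor.submit(job)
```

With this shape, the webhook can return a short acknowledgement immediately while the summary arrives as a separate message once it is ready.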

Automate More Testing

As much as I tested the bot during development, the testing process wasn’t as automated as it should have been. I spent a lot of time manually testing different features and scenarios, which wasn’t scalable as the project grew.

What I would do:

I’d set up automated testing frameworks to run unit and integration tests regularly, ensuring that every feature works as expected.
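
As a minimal sketch of what such tests could look like, the block below pulls the webhook's branching into a pure function and exercises it with stubs, so no LLM or Twilio call is needed. The function `route_message`, its parameters, and the shortened reply strings are hypothetical and only mirror the routing logic shown earlier.

```python
import unittest


def route_message(text, in_chat_mode, is_video_link, summarise, chat):
    """Pure version of the webhook's branching: returns (reply, new_chat_mode)."""
    command = text.strip().lower()
    if command == "/chat":
        return "You've entered /chat mode!", True
    if command == "/summary":
        return "You've entered /summary mode!", False
    if in_chat_mode:
        return chat(text), True
    if is_video_link(text):
        return summarise(text), False
    return "Please enter a valid YouTube video URL or type '/chat'.", False


class RouteMessageTests(unittest.TestCase):
    def setUp(self):
        # Stubs stand in for the LLM-backed functions so tests run offline.
        self.stubs = dict(
            is_video_link=lambda t: t.startswith("https://youtu.be/"),
            summarise=lambda url: "summary of " + url,
            chat=lambda msg: "chat: " + msg,
        )

    def test_chat_command_switches_mode(self):
        _, mode = route_message("/chat", False, **self.stubs)
        self.assertTrue(mode)

    def test_link_in_summary_mode_is_summarised(self):
        reply, _ = route_message("https://youtu.be/abc", False, **self.stubs)
        self.assertEqual(reply, "summary of https://youtu.be/abc")

    def test_plain_text_in_summary_mode_prompts_for_link(self):
        reply, _ = route_message("hi", False, **self.stubs)
        self.assertIn("valid YouTube video URL", reply)

    def test_chat_mode_forwards_to_chat(self):
        reply, _ = route_message("hello", True, **self.stubs)
        self.assertEqual(reply, "chat: hello")
```

Saved in a test module, these cases can be run with `python -m unittest` on every change, which covers the mode-switching behaviour that was previously checked by hand.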

What’s Next for NioBot? Future Features That Will Take It to the Next Level

I’ve been brainstorming what comes next to make NioBot even better. We’ve got some exciting ideas for future updates—ones that will make it more interactive and useful for everyone. These new features are all about pushing the bot’s limits and offering even more value to users.

1. Video RAG Integration: Finding That Perfect Moment in a YouTube Video

One of the things I find frustrating when watching YouTube videos is trying to find that one specific part. Sometimes I just want the answer to one question, but the video is too long to sit through. So, I started thinking—why not let NioBot do the heavy lifting for me?

It’s an idea that excites me a lot. I can already picture how much time this could save people. Whether you’re watching a tutorial or just trying to find a specific piece of info, NioBot would make it super easy.

2. Vision Capabilities: Letting NioBot See the World

The more I work with NioBot, the more I realize how cool it would be if it could actually see images, not just process text. For example, imagine you share a picture of a cityscape and ask, "What kind of architecture is this?" Or maybe you upload a photo of a circuit board and say, "Can you identify the components here?"

With vision abilities, NioBot could become even more interactive, allowing users to not only ask questions about text but also engage with visual content. Integrating with a vision model will make it more intuitive and versatile, opening up new possibilities for users to interact with it in creative ways.

Wrapping Up…

With NioBot, we’ve taken two powerful tasks, summarizing videos and providing real-time information, and made them easily accessible through WhatsApp. It’s designed to save time and add convenience, all while keeping things simple.

If you’re looking to create something unique for your business, let’s connect. We’re here to bring your ideas to life.

Behind the Blog 👀
Shivam Mitter
Writer

The guy on coffee who can do AI/ML.

Pranav Patel
Editor

Good boi. He is a good boi & does ML/AI. AI Lead.