Scrape YouTube Comments Using Python: A Step-by-Step Guide

Comments on YouTube videos are a rich source of audience feedback. In this tutorial, we’ll build a YouTube comment scraper in Python. You’ll learn how to extract the comments from any YouTube video, save them to a CSV file, and analyze them for opinions and sentiment. Whether you’re a developer, marketer, or researcher, this guide gives you the tools to work with YouTube’s vast comment ecosystem.

Why Scrape YouTube Comments?

YouTube comments are a treasure trove of information. Here’s why scraping them might be useful:

  • Audience Analysis: Understand what users think about your content or a competitor’s video.
  • Sentiment Analysis: Gauge the general tone (positive, negative, neutral) of audience feedback.
  • Content Ideas: Extract common questions and feedback to inspire future content.
  • Data Mining: Collect data for academic or market research.

Tools You’ll Need

To build a YouTube comment scraper in Python, you’ll use the following tools and libraries:

  1. YouTube Data API: Provided by Google, it allows programmatic access to YouTube data.
  2. Python Libraries:
    • googleapiclient to interact with the YouTube Data API.
    • pandas for organizing and analyzing the scraped data.
    • requests and json for handling HTTP requests and parsing responses if you extend the scraper (the code in this guide only needs the first two libraries).
  3. API Key: A YouTube Data API key from the Google Cloud Console.

Setting Up Your Environment

Step 1: Install Required Libraries

First, ensure you have Python installed. Then, install the required libraries:

pip install google-api-python-client pandas requests

Step 2: Obtain Your API Key

  • Go to the Google Cloud Console.
  • Create a new project and enable the “YouTube Data API v3.”
  • Generate an API key for accessing the API.


You can also read my Build a YouTube Scraper tutorial, which gives detailed instructions for creating a YouTube API key in the Google Cloud Console.

Step-by-Step Guide to Scraping YouTube Comments

Step 1: Import Libraries

Start by importing the necessary Python libraries:

from googleapiclient.discovery import build
import pandas as pd

Step 2: Initialize the YouTube API Client

Use your API key to create an API client:

api_key = "YOUR_API_KEY"
youtube = build("youtube", "v3", developerKey=api_key)
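Hard-coding the key works for a quick test, but it is easy to leak the key when you share the script. Here is a minimal sketch of one alternative, assuming the key is stored in an environment variable. The name YOUTUBE_API_KEY and the helper load_api_key are illustrative choices for this example, not something the API requires.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

You would then write api_key = load_api_key() and pass it to build() exactly as shown above.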

Step 3: Extract Video Comments

Define a function to fetch comments from a YouTube video:

def get_comments(video_id):
    """Return the author and text of every top-level comment on a video."""
    comments = []
    # First request: up to 100 comment threads, the maximum per page.
    request = youtube.commentThreads().list(
        part="snippet",
        videoId=video_id,
        maxResults=100
    )
    response = request.execute()

    while response:
        # Each item is a comment thread; we keep only its top-level comment.
        for item in response['items']:
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            author = item['snippet']['topLevelComment']['snippet']['authorDisplayName']
            comments.append({"Author": author, "Comment": comment})

        # A nextPageToken means more pages exist; request the next one.
        if 'nextPageToken' in response:
            request = youtube.commentThreads().list(
                part="snippet",
                videoId=video_id,
                pageToken=response['nextPageToken'],
                maxResults=100
            )
            response = request.execute()
        else:
            break
    return comments

The get_comments function fetches YouTube comments for a given video ID using the YouTube Data API. Here’s a concise explanation:

  1. Initialize: Creates an empty list comments to store comment data.
  2. API Request: Makes an initial API call to fetch up to 100 comments for the video.
  3. Extract Data: Loops through the response to extract comment text and author name, appending them to the comments list.
  4. Pagination: Checks for nextPageToken to fetch additional pages of comments if available.
  5. Return: Outputs the complete list of comments as dictionaries containing Author and Comment.

This function handles pagination and retrieves the top-level comments of a video. Replies to those comments are not collected, because only the snippet part of each comment thread is requested.
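As a variation on get_comments, the sketch below yields rows one at a time and keeps a few more fields from each comment. It relies on the client library's list_next() helper to build the follow-up request instead of repeating the parameters by hand. The names comment_to_row, iter_comments, and max_pages are made up for this example. max_pages is an optional cap, useful because every page is a separate API call that counts against your quota.

```python
def comment_to_row(item):
    """Flatten one commentThreads item into a dict of CSV-friendly fields."""
    top = item["snippet"]["topLevelComment"]["snippet"]
    return {
        "Author": top["authorDisplayName"],
        "Comment": top["textDisplay"],
        "Likes": top.get("likeCount", 0),
        "PublishedAt": top.get("publishedAt"),
        "Replies": item["snippet"].get("totalReplyCount", 0),
    }


def iter_comments(youtube, video_id, max_pages=None):
    """Yield one row per top-level comment, following pagination.

    `youtube` is the client created with build(). `max_pages` is a
    hypothetical safety cap added for this sketch (None means no cap).
    """
    resource = youtube.commentThreads()
    request = resource.list(part="snippet", videoId=video_id, maxResults=100)
    pages = 0
    while request is not None:
        if max_pages is not None and pages >= max_pages:
            break
        response = request.execute()
        pages += 1
        for item in response.get("items", []):
            yield comment_to_row(item)
        # list_next() returns None when the response has no nextPageToken.
        request = resource.list_next(request, response)
```

With the client from Step 2, pd.DataFrame(iter_comments(youtube, video_id, max_pages=5)) would build a DataFrame from at most five pages of comments.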

Step 4: Save Comments to a CSV File

Save the extracted comments into a CSV file for analysis:

video_id = "YOUR_VIDEO_ID"
comments = get_comments(video_id)
df = pd.DataFrame(comments)
df.to_csv("youtube_comments.csv", index=False)
print("Comments saved to youtube_comments.csv")

Here is what the CSV result looks like:

[Screenshot: the resulting youtube_comments.csv, with Author and Comment columns]
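The video_id used in Step 4 is the identifier found in the video's URL. If you would rather paste a full link, the sketch below pulls the ID out with the standard library. The helper name extract_video_id is made up for this example, and the URL shapes it recognizes (watch, youtu.be, shorts, embed, live) are the common ones rather than an exhaustive list.

```python
from urllib.parse import urlparse, parse_qs

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url_or_id):
    """Return the video ID from a YouTube URL, or the input unchanged
    when it does not look like a URL (assumed to be a bare ID)."""
    parsed = urlparse(url_or_id)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0]
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] in {"shorts", "embed", "live"}:
            return parts[1]
        return None
    # No recognizable host: treat the input as an ID already.
    return url_or_id
```

You could then call get_comments(extract_video_id(link)) with either a full link or a bare ID.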

Analyzing YouTube Comments

With the comments saved in a CSV file, you can analyze them using Python or a tool like Excel. For instance, you can use Python’s TextBlob library to perform sentiment analysis on the comments.
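As a quick first look at the file, the sketch below reads the CSV back using only the standard library and lists the most active commenters. It assumes the file name and the Author column from Step 4. The helper name top_commenters is made up for this example.

```python
import csv
from collections import Counter


def top_commenters(csv_path="youtube_comments.csv", n=5):
    """Return the n authors with the most top-level comments in the CSV."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        counts = Counter(row["Author"] for row in csv.DictReader(f))
    return counts.most_common(n)
```

The result is a list of (author, count) pairs, sorted from most to least active.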

Example: Sentiment Analysis

Install the textblob library and analyze the sentiment of each comment:

pip install textblob
 
from textblob import TextBlob

df['Sentiment'] = df['Comment'].apply(lambda x: TextBlob(x).sentiment.polarity)
print(df.head())
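TextBlob's polarity is a number between -1 and 1, which is harder to summarize than a label. Here is a minimal sketch that maps each score to positive, negative, or neutral. The helper name label_polarity and the 0.05 threshold are assumptions for illustration, so adjust the cut-off to suit your data.

```python
def label_polarity(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a coarse sentiment label.

    The threshold is an arbitrary cut-off chosen for this sketch.
    """
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

Applied to the DataFrame above, df['Label'] = df['Sentiment'].apply(label_polarity) adds a label column, and df['Label'].value_counts() shows how many comments fall into each group.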

Ethical Considerations

When scraping data from YouTube, ensure you adhere to ethical and legal guidelines:

  • Respect YouTube’s Terms of Service.
  • Use the data responsibly. Even though the comments are public, they include author names, so handle them with care.

Full Code

from googleapiclient.discovery import build
import pandas as pd

api_key = "YOUR_API_KEY"
youtube = build("youtube", "v3", developerKey=api_key)

def get_comments(video_id):
    comments = []
    request = youtube.commentThreads().list(
        part="snippet",
        videoId=video_id,
        maxResults=100
    )
    response = request.execute()

    while response:
        for item in response['items']:
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            author = item['snippet']['topLevelComment']['snippet']['authorDisplayName']
            comments.append({"Author": author, "Comment": comment})
        
        if 'nextPageToken' in response:
            request = youtube.commentThreads().list(
                part="snippet",
                videoId=video_id,
                pageToken=response['nextPageToken'],
                maxResults=100
            )
            response = request.execute()
        else:
            break
    return comments


video_id = "YOUR_VIDEO_ID"
comments = get_comments(video_id)
df = pd.DataFrame(comments)
df.to_csv("youtube_comments.csv", index=False)
print("Comments saved to youtube_comments.csv")

Conclusion

Building a YouTube comment scraper in Python is a straightforward and powerful way to gather insights from audience feedback. With the tools and steps provided, you can extract, save, and analyze YouTube comments to uncover trends, opinions, and actionable insights.

What will you do with your scraped YouTube comments? Share your thoughts or questions in the comments below!
