YouTube

YouTube, like Zillow, will allow us to use its data spigot without authentication, but we must create a secret ID or key as we did before:

If your client application does not use OAuth 2.0, then it must include an API key when it calls an API that's enabled within a Google Cloud Platform project. The application passes this key into all API requests as a key=API_key parameter. API keys do not grant access to any account information, and are not used for authorization.

Follow these steps to get set up:

  1. Get a Google account if you don't already have one.
  2. Go to the Google API Console and create a project. I call mine msds692-test or something like that.
  3. Enable the "YouTube Data API v3" API from your console.

Never store your API key in your code.
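One way to follow that rule is to hand the key to the program at run time. Here is a minimal sketch, assuming the key arrives either as the first command-line argument (as in the scripts in these notes) or in an environment variable. The helper name load_api_key and the variable name YOUTUBE_API_KEY are hypothetical, chosen only for illustration:

```python
import os
import sys

def load_api_key(argv=None, environ=None, env_name="YOUTUBE_API_KEY"):
    """Return the API key from the first command-line argument, else from
    the environment variable env_name (a hypothetical name; pick your own)."""
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ
    if len(argv) > 1:
        return argv[1]          # e.g., python search.py MYSECRETKEY cats
    key = environ.get(env_name)
    if not key:
        raise SystemExit("Pass the API key as an argument or set " + env_name)
    return key
```

Either way, the key never appears in a file that could end up in a repository.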

Familiarize yourself with the API documentation and install some Python code that will make our lives easier:

$ pip install --ignore-installed --upgrade google-api-python-client

(The --ignore-installed avoids errors on El Capitan OS X.)

First contact

The Python library we are using hides all of the direct URL access, so this code will look different from what we wrote for Zillow. Using our developer key, we build() an object that acts like a proxy, letting us communicate with YouTube:

# code from https://developers.google.com/youtube/v3/code_samples/python#search_by_keyword
import sys
from googleapiclient.discovery import build

DEVELOPER_KEY = sys.argv[1]
QUERY = sys.argv[2] # e.g., "cats and dogs"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, developerKey=DEVELOPER_KEY)

Given that object, we can make a search, retrieving a list of elements:

search_response = youtube.search().list(
    q=QUERY,            # search terms
    part="id,snippet",  # what we want back
    maxResults=20,      # how many results we want back
    type="video"        # only tell me about videos
).execute()

Exercise: The search_response variable is a dict whose items key holds the search results. Use the debugger or the API documentation to figure out what fields each individual search result contains. As usual, there appears to be a small discrepancy between the documentation and what I see in the actual dictionary. Figure out how to print the title of, and a link to, each video returned from the search. For example, running from the command line, we should get:

$ python search.py MYSECRETKEY cats
Funny Cats Compilation [Most See] Funny Cat Videos Ever Part 1 https://www.youtube.com/watch?v=tntOCGkgt98
Cats are just the funniest pets ever - Funny cat compilation https://www.youtube.com/watch?v=htOroIbxiFY
Funny Cats - A Funny Cat Videos Compilation 2016 || NEW HD https://www.youtube.com/watch?v=G8KpPw303PY
Funny Cats Compilation 2016  - Best Funny Cat Videos Ever || Funny Vines https://www.youtube.com/watch?v=njSyHmcEdkw
Cats Being Jerks Video Compilation || FailArmy https://www.youtube.com/watch?v=O1KW3ZkLtuo
Cats are super funny creatures that make us laugh - Funny cat & kitten compilation https://www.youtube.com/watch?v=Zwq98O42ta0
Funny Videos Of Funny Cats Compilation 2016 [BEST OF] https://www.youtube.com/watch?v=9nZMHBDw8os
Passive Aggressive Cats Video Compilation 2016 https://www.youtube.com/watch?v=lx3egn8v4Mg
Cats are simply funny, clumsy and cute! - Funny cat compilation https://www.youtube.com/watch?v=PK2939Jji3M
Startled Cats Compilation https://www.youtube.com/watch?v=6U_XREUMOAU
People Try Walking Their Cats https://www.youtube.com/watch?v=9C1leq--_wM
Cats Saying "No" to Bath - A Funny Cats In Water Compilation https://www.youtube.com/watch?v=Wmz0wGx5sq8
Bad Cats Video Compilation 2016 https://www.youtube.com/watch?v=MXOj8yVu1fA
11 Cats You Won’t Believe Actually Exist! https://www.youtube.com/watch?v=QtMmgzGYih0
Cuddly Cats Video Compilation 2016 https://www.youtube.com/watch?v=bw5WtZmU-i0
Cats, funniest creatures in animal kingdom - Funny cat compilation https://www.youtube.com/watch?v=qIDEC2h4dZo
Funny Cats Vine Compilation September 2015 https://www.youtube.com/watch?v=HxM46vRJMZs
Adam Ruins Everything - Why Going Outside is Bad for Cats https://www.youtube.com/watch?v=GpAFpwDVBJQ
Funny Cat Videos - Cat Vines Compilation https://www.youtube.com/watch?v=VJHnPUFffCU
Gangsta Cats Video Compilation 2016 https://www.youtube.com/watch?v=VS6UOyTb5eU
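As a starting point for the exercise, here is a minimal sketch that assumes each search result carries its title under item["snippet"]["title"] and its ID under item["id"]["videoId"], which is how the API documentation describes search results. Check that against the dictionary you actually get back. The sample_response below is a hand-built stand-in that reuses two results from the output above, and title_and_link is a hypothetical helper name:

```python
def title_and_link(item):
    """Format one search result as 'title URL'."""
    return "%s https://www.youtube.com/watch?v=%s" % (
        item["snippet"]["title"],
        item["id"]["videoId"],
    )

# Hand-built stand-in for search_response (assumed shape, not real API output)
sample_response = {
    "items": [
        {"id": {"kind": "youtube#video", "videoId": "6U_XREUMOAU"},
         "snippet": {"title": "Startled Cats Compilation"}},
        {"id": {"kind": "youtube#video", "videoId": "9C1leq--_wM"},
         "snippet": {"title": "People Try Walking Their Cats"}},
    ]
}

for item in sample_response["items"]:
    print(title_and_link(item))
```

With the real search_response in place of sample_response, the same loop prints one line per video.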

Getting all video comments

Now that we know how to perform a video search, let's learn how to extract comments for a particular video.

Exercise: We will extract comments by video ID, because that is what the API requires. Here are two sample video IDs:

videoId = "tntOCGkgt98"  # cat compilation
videoId = "gU_gYzwTbYQ"  # bonkers the cat

You need to call youtube.commentThreads().list(...) to get the comments. There is a bunch of sample code in the API documentation. Follow the code samples to extract the author and text of the top-level comments. Here's a sample session for the bonkers the cat video:

$ python comments_one_video.py MYSECRETKEY
Comment by Okeefe Niemann: The cutest kitty!!
Comment by Charles Siu: Cute kitty
Comment by Michael Schulze: Cute kitty. I <3 data science!
Comment by Zachary Barnes: LOLZ Bonkerz is the best
Comment by Daren Ma: Is this safe for class? LOL
Comment by samandchloesmommy: Oh my goodness, that's the cutest thing ever! What a funny boy he is.
...
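The sketch below shows one possible shape for this exercise. It assumes each thread stores the top-level comment under item["snippet"]["topLevelComment"]["snippet"], with authorDisplayName and textDisplay fields, as the API documentation describes; verify that against a real response. The function names fetch_comment_threads and format_comments are hypothetical, and youtube is the object returned by build():

```python
def fetch_comment_threads(youtube, video_id):
    """Ask the API for one page of top-level comment threads on a video."""
    return youtube.commentThreads().list(
        part="snippet",          # the thread snippet holds the top-level comment
        videoId=video_id,
        textFormat="plainText",  # plain text rather than HTML
    ).execute()

def format_comments(results):
    """Turn a commentThreads response into 'Comment by AUTHOR: TEXT' strings."""
    lines = []
    for item in results["items"]:
        top = item["snippet"]["topLevelComment"]["snippet"]
        lines.append("Comment by %s: %s" % (top["authorDisplayName"],
                                            top["textDisplay"]))
    return lines
```

Keeping the network call separate from the formatting makes it easy to try format_comments on a saved or hand-built dictionary before spending API quota.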

Please do not bring up videos that could be offensive to your fellow students!

Exercise: Now let's combine the search with the comment retrieval so that we find all comments associated with, say, 10 videos. Create a function that returns a list of video IDs (cut/paste from previous functionality):

def videoIDs(query):
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, ...)
    ...
    return ids

and a function that returns a dictionary mapping video ID to a list of comment strings:

def comments(query):
    ids = videoIDs(query)
    comments_by_video = {}  # map video ID to list of comment strings
    for vid in ids:
        ... previous code to pull comments ...
        comments_by_video[vid] = []
        for item in results["items"]:
            ...
            comments_by_video[vid].append("Comment by %s: %s" % (author, text))
    return comments_by_video

Note: You'll need to wrap the call that gets comment threads, youtube.commentThreads().list(...).execute(), in a try/except on HttpError, since it raises an exception if comments are disabled on a video. See the solution in comments.py.
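A minimal sketch of that try/except follows; it is not the solution in comments.py. HttpError lives in googleapiclient.errors; the fallback class only exists so the sketch still runs where the library is not installed. The function name safe_comment_threads is hypothetical:

```python
try:
    from googleapiclient.errors import HttpError
except ImportError:
    # Stand-in so this sketch still runs without the library installed
    class HttpError(Exception):
        pass

def safe_comment_threads(youtube, video_id):
    """Return the commentThreads response for a video, or None if the API
    call fails (for example, because comments are disabled)."""
    try:
        return youtube.commentThreads().list(
            part="snippet",
            videoId=video_id,
            textFormat="plainText",
        ).execute()
    except HttpError:
        print("Skipping %s: could not fetch comments" % video_id)
        return None
```

Inside the loop over video IDs, a None result can simply be skipped with continue so that one video with disabled comments does not stop the whole run.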

Then the main program can just print out a list of comments for each video, putting a tab in front of the comments so it's easier to see which video they are associated with.

allcomments = comments(QUERY)
for vid in list(allcomments)[:5]:  # just 5 videos
    video_comments = allcomments[vid]
    print("Video " + vid)
    print('\t' + '\n\t'.join(video_comments))  # tab before every comment

Sample output:

$ python comments.py SECRETKEY 'star wars'
...
Video O0SxVbK7Dmc
	Comment by Josh R: Rex Gregor in Cody!?!?!?!? You mean Wolfe good sir
	Comment by bini 28: We need a what if with Star Wars
	Comment by Clayton Catlin: I think, they had to do it like that. Otherwise, the original trilogy would be moot.
	Comment by Mando With a Plan: “Good Soldiers, Follow Orders..”
	Comment by Jake Subers: It's Commander Wolffe not Cody and also technically they weren't the 501st anymore when they left with Ahsoka.
	Comment by Austin Ryan: i feel like you forget order 66 is more than kill
...