
Content Analysis

Elias Dabbas edited this page Feb 2, 2018 · 2 revisions

Two main areas are available for content analysis, in relation to advertising:

Word Frequency:

Mainly a function that measures both the absolute and the weighted frequency of words in a piece of content, such as:

  • Tweets (or any social post)
  • Web pages (page titles, or URLs)
  • Keywords

The important point is that these pieces of content usually come with an associated metric, and the function uses this metric to compute both types of word frequency. An in-depth tutorial, with the code built from scratch, can be found here.
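The difference between the two frequencies can be shown with a minimal sketch. This is not the library's implementation: the function name `word_frequency_sketch`, the whitespace tokenization, and the sample page titles and page-view numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def word_frequency_sketch(text_list, num_list):
    """Hypothetical helper: map each word to (absolute, weighted) frequency.

    Absolute frequency counts how many times a word appears.
    Weighted frequency adds up the metric of every text the word appears in,
    once per appearance.
    """
    abs_freq = defaultdict(int)
    wtd_freq = defaultdict(float)
    for text, num in zip(text_list, num_list):
        # Assumption: lowercase and split on whitespace; no stop-word removal.
        for word in text.lower().split():
            abs_freq[word] += 1
            wtd_freq[word] += num
    return {word: (abs_freq[word], wtd_freq[word]) for word in abs_freq}


# Made-up page titles with made-up page views as the associated metric.
titles = ["blue shoes sale", "red shoes", "blue hats"]
pageviews = [100, 20, 5]

freq = word_frequency_sketch(titles, pageviews)
for word, (absolute, weighted) in sorted(freq.items(), key=lambda kv: -kv[1][1]):
    print(f"{word:6} abs={absolute} wtd={weighted}")
```

In this toy data "shoes" and "blue" both appear twice, yet "shoes" carries more weight because the pages that mention it attract more views. That gap is what the weighted frequency is meant to surface.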

Structured Content Extraction: (not yet implemented)

Using the word_frequency function, and knowing that certain structured entities carry different meanings, we can extract these entities and measure their absolute and weighted frequencies.

For each of the entities we measure:

  • number of occurrences
  • number of unique elements
  • absolute frequency
  • weighted frequency

The entities are:

  • #hashtags: these show which content / conversations you decide to include yourself in
  • @mentions (tagged users): a proxy for how conversational your account is, and who you mostly talk to online
  • URLs: a measure of how much you link, and where you link to. Finding where a link ends up is a bit tricky, as most social platforms shorten links, and resolving them requires a special service for each of the providers (although some APIs, like Twitter's, provide this information). It's still tricky if you are scraping the data.
  • emoji: an emoji is worth a thousand words! The unicodedata module provides the textual descriptions of emoji, so it becomes easy to figure them out
  • media: images / videos included in social posts
  • symbols (mainly Twitter): financial ticker symbols
  • arbitrary keywords: check whether, and how much, your content includes certain terms, phrases, or any keyword that you ought to be covering
  • feeling / activity (mainly Facebook): another structured indication of your announcements / moods
  • checkins
  • special mini apps: polls, for example
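Since this part is not yet implemented, the following is only a rough sketch of how the four measurements could be computed for hashtags and mentions, and how `unicodedata` can describe emoji. The regular expressions, the function names `entity_summary` and `emoji_names`, the use of the Unicode category "So" as a stand-in for emoji detection, and the sample posts and like counts are all assumptions, not the planned design.

```python
import re
import unicodedata
from collections import Counter

# Simplified patterns (assumption): real platforms have stricter rules.
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")


def entity_summary(posts, weights, pattern):
    """Hypothetical helper: the four measurements for one entity type."""
    abs_freq = Counter()
    wtd_freq = Counter()
    for post, weight in zip(posts, weights):
        for entity in pattern.findall(post.lower()):
            abs_freq[entity] += 1
            wtd_freq[entity] += weight
    return {
        "occurrences": sum(abs_freq.values()),  # total matches found
        "unique": len(abs_freq),                # distinct entities
        "abs_freq": dict(abs_freq),             # count per entity
        "wtd_freq": dict(wtd_freq),             # metric summed per entity
    }


def emoji_names(text):
    """Describe symbol characters; category 'So' is a rough emoji heuristic."""
    return [
        unicodedata.name(ch, "UNKNOWN")
        for ch in text
        if unicodedata.category(ch) == "So"
    ]


# Made-up posts with made-up like counts as the associated metric.
posts = [
    "Loving #python and #data, thanks @alice 🔥",
    "More #python tips from @alice and @bob",
]
likes = [10, 3]

hashtags = entity_summary(posts, likes, HASHTAG)
mentions = entity_summary(posts, likes, MENTION)
print(hashtags)
print(mentions)
print(emoji_names(posts[0]))
```

The "So" category also covers symbols that are not emoji and misses multi-character emoji sequences, so a dedicated emoji list would be more reliable for anything beyond a quick exploration.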