dscigametrics, or Data Science Google Analytics Metrics, is a Python package that provides a set of ready-made functions to help users with minimal coding skills digest and analyse advertising data obtained from Google Analytics. While Google Analytics lets users download data as a CSV file, the resulting spreadsheet is an intimidating and unintuitive block of dense information. Instead of trying to analyse it in Excel, users can load it into a Python script as a pandas DataFrame and let this package do the analysis work for them!
compute_metrics: summarises the general performance of a campaign based on four metrics.
stat_summary: summarises the variance of a campaign's performance based on four metrics.
daily_plot: visualises changes in a campaign's performance based on four metrics.
find_campaigns: identifies the best and worst performing campaigns based on a selected metric.
Because Google Analytics is so widely used, there are already a number of related Python packages, such as googleanalytics, which can be found on PyPI. However, most of these packages let developers interact with the Google Analytics API, which presupposes a fairly high level of technical skill. Our package is intended for users who are new to Python, and operates directly on downloaded GA data sets instead.
$ pip install dscigametrics
$ git clone git@github.com:UBC-MDS/Group_9_GA_Metrics.git
$ cd Group_9_GA_Metrics # Navigate to the cloned repository directory
$ conda env create -f environment.yml # Create Conda environment
$ conda activate ga_package # Activate the Conda environment
Ensure the Conda environment is activated. You should see (ga_package) in the terminal prompt.
$ poetry install # Install the package using Poetry
Here is a basic example of how to use this package:
import dscigametrics
import pandas as pd
data = pd.read_csv('where/is/your/data/saved.csv')
Choose your parameters:
campaign_id = 123851219
start_date = 20220801
end_date = 20220831
Compute metrics:
metrics_dictionary = compute_metrics(data, campaign_id, start_date, end_date)
conversion rate: 0.116
new to return rate: 0.88
total transaction revenue: $14548.0
average transaction revenue: $501.6551724137931
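To make the four values above more concrete, here is a minimal sketch of how metrics of this kind could be derived from a pandas DataFrame. It is not the package's implementation: the function name `sketch_metrics`, the column names (`campaign_id`, `date`, `sessions`, `transactions`, `new_users`, `returning_users`, `revenue`), the metric definitions, and the toy data are all assumptions made for illustration. The sample output above is at least consistent with the average being total revenue divided by the number of transactions (14548 / 29 ≈ 501.66).

```python
import pandas as pd


def sketch_metrics(df, campaign_id, start_date, end_date):
    """Hypothetical helper: NOT the dscigametrics implementation.

    Assumes one row per campaign per day, with dates stored as
    YYYYMMDD integers and the (assumed) columns used below.
    """
    # Keep only the rows for this campaign inside the inclusive date range.
    in_scope = (df["campaign_id"] == campaign_id) & df["date"].between(
        start_date, end_date
    )
    rows = df.loc[in_scope]

    sessions = rows["sessions"].sum()
    transactions = rows["transactions"].sum()
    new_users = rows["new_users"].sum()
    returning_users = rows["returning_users"].sum()
    revenue = rows["revenue"].sum()

    nan = float("nan")  # returned when a ratio has a zero denominator
    return {
        # Assumed definition: transactions per session.
        "conversion_rate": transactions / sessions if sessions else nan,
        # Assumed definition: new users per returning user.
        "new_to_return_rate": new_users / returning_users if returning_users else nan,
        "total_revenue": revenue,
        # Assumed definition: revenue per transaction.
        "average_revenue": revenue / transactions if transactions else nan,
    }


# Toy data, invented for this sketch (not real Google Analytics output).
toy = pd.DataFrame(
    {
        "campaign_id": [1, 1, 1, 2],
        "date": [20220801, 20220815, 20220901, 20220810],
        "sessions": [100, 100, 50, 80],
        "transactions": [10, 15, 5, 4],
        "new_users": [80, 60, 30, 50],
        "returning_users": [20, 40, 20, 30],
        "revenue": [1000.0, 2000.0, 500.0, 400.0],
    }
)

result = sketch_metrics(toy, campaign_id=1, start_date=20220801, end_date=20220831)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Only the two August rows for campaign 1 are counted here; the September row and the row for campaign 2 are filtered out before any totals are taken.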
Calculate Summary Statistics:
summary = stat_summary(data, campaign_id, start_date, end_date)
Create daily plot:
plot = daily_plot(data, campaign_id, start_date, end_date, width=300, height=800)
Find the best and worst performing campaigns:
campaign_ids = [219011657, 140569061, 215934049, 123851219]
metric = 'conversion_rate'
best_worst_campaigns = find_campaigns(
data=data,
start_date=start_date,
end_date=end_date,
campaign_ids=campaign_ids,
metric=metric
)
{'best_campaign': {'id': 123851219, 'value': 0.116}, 'worst_campaign': {'id': 219011657, 'value': 0.056}}
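The selection step behind a result of this shape can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: `pick_best_and_worst` is a hypothetical helper, and it assumes the chosen metric has already been computed for each campaign and that a higher value is better. Two of the values below are taken from the sample output above; the other two are made up for the example.

```python
def pick_best_and_worst(values_by_campaign):
    """Hypothetical helper: NOT part of dscigametrics.

    Takes a mapping of campaign id -> metric value and returns the
    campaigns with the highest and lowest value.
    """
    if not values_by_campaign:
        raise ValueError("at least one campaign is required")

    # max/min iterate over the ids and compare them by their metric value.
    best_id = max(values_by_campaign, key=values_by_campaign.get)
    worst_id = min(values_by_campaign, key=values_by_campaign.get)

    return {
        "best_campaign": {"id": best_id, "value": values_by_campaign[best_id]},
        "worst_campaign": {"id": worst_id, "value": values_by_campaign[worst_id]},
    }


conversion_rates = {
    219011657: 0.056,  # from the sample output above
    140569061: 0.080,  # made-up value for illustration
    215934049: 0.100,  # made-up value for illustration
    123851219: 0.116,  # from the sample output above
}

picked = pick_best_and_worst(conversion_rates)
print(picked)
```

For a metric where lower is better, the roles of `max` and `min` would simply be swapped.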
Documentation for all functions in the package, as well as a demonstration notebook, can be found on Read the Docs.
Beth Ou-Yang, Ian MacCarthy, Yili Tang, Weilin Han
Contributions are welcome and greatly appreciated! If you're interested in contributing to this project, take a look at the contributor guide.
dscigametrics was created by DSCI524 Cohort8 Group9. It is licensed under the terms of the MIT license.
dscigametrics was created with cookiecutter and the py-pkgs-cookiecutter template.