@Outlinevpn Process Daily usage statistics #870

Merged on Sep 13, 2024 (10 commits)

Conversation

koechkevin
Contributor

@koechkevin koechkevin commented Sep 4, 2024

Description

Currently, Outline VPN manager statistics are generated monthly by running this script:

dist/cfa_vpn.py/outline_manager.pex -e

We propose a solution that lets a user get these stats easily, without requiring intervention from the tech team's developers each time.

Here is the proposed implementation plan.

This PR introduces 2 API endpoints.

  1. POST /api/userStatistics: Queries the Outline VPN API for the statistics, calculates daily usage, then stores it in a local database for analysis. This endpoint will be invoked by a cron job.
  2. GET /api/userStatistics: Returns the statistics stored in the database, filtered by the query parameters sent.

Process Daily User Statistics.

This action is triggered by POST /api/userStatistics. Because the Outline VPN API returns only the userId and the total data transferred by each user over time, daily usage has to be calculated as the difference from the last fetch.

  1. Get data transferred from Outline VPN API.
  2. Get users from Outline API to get user data from user IDs.
  3. Calculate the data transferred by each user for the day. We do this by looking up the previously fetched and stored cumulative data transfer and subtracting it from the current cumulative figure obtained in step 1. The cron job will run at least once a day at around 11 PM, so each calculation is attributed to the day on which it runs.
  4. Store the computed data in the database. If a record for the current date and user ID already exists, update it; otherwise create a new entry.
  5. Return the processed records.
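
As a rough illustration of steps 3 and 4, here is a minimal sketch of the difference calculation and the update-or-create rule. It is not the code from this PR: the `DailyRecord` shape, the `processDailyUsage` name, and the in-memory `store` object (standing in for the database) are all assumptions. The previous cumulative value in the usage example is derived from the sample response shown in this description so that the numbers line up.

```typescript
// Hypothetical record shape; the PR's real types may differ.
interface DailyRecord {
  userId: string;
  date: string; // same date format as the sample response, e.g. "2024-9-3"
  usage: number; // bytes attributed to this date
  cumulativeData: number; // cumulative bytes reported by the VPN API
}

// `store` stands in for the database and is keyed by "userId|date",
// so a second run on the same date overwrites the entry instead of duplicating it.
function processDailyUsage(
  store: Record<string, DailyRecord>,
  previousCumulative: Record<string, number>, // userId -> cumulative bytes stored by the last run
  currentCumulative: Record<string, number>, // userId -> cumulative bytes fetched now
  date: string,
): DailyRecord[] {
  return Object.keys(currentCumulative).map((userId) => {
    const current = currentCumulative[userId];
    // Assumption: a user with no earlier record starts from zero.
    const previous = previousCumulative[userId] ?? 0;
    const record: DailyRecord = {
      userId,
      date,
      usage: current - previous,
      cumulativeData: current,
    };
    store[`${userId}|${date}`] = record; // update if present, else create
    return record;
  });
}

const store: Record<string, DailyRecord> = {};
const processed = processDailyUsage(
  store,
  { "107": 5948500895 },
  { "107": 5958400566 },
  "2024-9-3",
);
// processed[0].usage is 5958400566 - 5948500895 = 9899671
```

Keying on user ID plus date is what makes a repeated run on the same day idempotent in this sketch.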

Get statistics.

GET /api/userStatistics
This API allows users to fetch stored statistics from the database based on various filters. The records can be filtered by date, email, date range, user ID, and record ID. Results can also be grouped and ordered based on specific fields.

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| date | string | Filter results by a specific date in YYYY-MM-DD format. | ?date=2024-09-03 |
| email | string | Filter results by the user’s email address. | [email protected] |
| dateBetween.start | string | Start of the date range for filtering. Requires dateBetween.end. | ?dateBetween.start=2024-09-03 |
| dateBetween.end | string | End of the date range for filtering. Requires dateBetween.start. | ?dateBetween.end=2024-09-04 |
| ID | number | Filter results by a specific record ID. | ?ID=45 |
| userId | number | Filter results by a specific user ID. | ?userId=107 |
| orderBy | string | Order results by a specific field in either ascending (ASC) or descending (DESC) order. | ?orderBy=usage DESC |
| groupBy * | string | Group results by a specific field. Common options are email, date. | ?groupBy=email |
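
To show how these parameters might be combined in one request, here is a small sketch that builds a query string. The `buildStatsUrl` helper is hypothetical and not part of this PR; only the parameter names and example values come from the table above.

```typescript
// Hypothetical helper: turns a filter object into a GET /api/userStatistics URL.
function buildStatsUrl(base: string, filters: Record<string, string | number>): string {
  const query = Object.keys(filters)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(filters[key]))}`)
    .join("&");
  return query ? `${base}?${query}` : base;
}

const url = buildStatsUrl("/api/userStatistics", {
  "dateBetween.start": "2024-09-03",
  "dateBetween.end": "2024-09-04",
  orderBy: "usage DESC",
  groupBy: "email",
});
// url is "/api/userStatistics?dateBetween.start=2024-09-03&dateBetween.end=2024-09-04&orderBy=usage%20DESC&groupBy=email"
```

Note that the space in `usage DESC` has to be percent-encoded when the value is sent in a URL.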

The response returns records in the following format:

[
  {
    "ID": 103,
    "userId": "107",
    "usage": 9899671,
    "date": "2024-9-3",
    "cumulativeData": 5958400566,
    "email": "[email protected]",
    "createdAt": "2024-09-03T13:07:43.464Z"
  }
]

Grouped by Email

[
  {
    "email": "[email protected]",
    "userId": "107",
    "totalUsage": 9985821081
  }
]

Grouped by Date

[
  {
    "date": "2024-9-4",
    "totalUsage": 21785853001
  }
]
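
The grouped responses appear to be sums of the per-day `usage` values. The sketch below shows that relationship for grouping by email, under that assumption. The `groupByEmail` function, the type names, and the sample records (including the `example.org` addresses) are made up for illustration; in the PR the grouping presumably happens in the database query instead.

```typescript
interface UsageRecord {
  userId: string;
  email: string;
  date: string;
  usage: number;
}

interface EmailTotal {
  email: string;
  userId: string;
  totalUsage: number;
}

// Sums `usage` per email, mirroring the shape of the "Grouped by Email" response.
function groupByEmail(records: UsageRecord[]): EmailTotal[] {
  const totals: Record<string, EmailTotal> = {};
  for (const record of records) {
    const existing = totals[record.email];
    if (existing) {
      existing.totalUsage += record.usage;
    } else {
      totals[record.email] = {
        email: record.email,
        userId: record.userId,
        totalUsage: record.usage,
      };
    }
  }
  return Object.keys(totals).map((email) => totals[email]);
}

// Made-up sample data.
const sample: UsageRecord[] = [
  { userId: "107", email: "user@example.org", date: "2024-9-3", usage: 100 },
  { userId: "107", email: "user@example.org", date: "2024-9-4", usage: 250 },
  { userId: "108", email: "other@example.org", date: "2024-9-4", usage: 40 },
];
const grouped = groupByEmail(sample);
// grouped[0].totalUsage is 350 and grouped[1].totalUsage is 40
```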

Fixes #875

Type of change

  • New feature (non-breaking change which adds functionality)

  • This change requires a documentation update

Screenshots

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation

@koechkevin koechkevin self-assigned this Sep 4, 2024
@koechkevin koechkevin marked this pull request as draft September 4, 2024 13:05
@koechkevin koechkevin changed the title @Outlinevpn @Outlinevpn Process Daily usage statistics Sep 5, 2024
@koechkevin koechkevin marked this pull request as ready for review September 5, 2024 09:27
@koechkevin koechkevin requested a review from a team September 5, 2024 09:28
@koechkevin
Contributor Author

@m453h, refer to this when reviewing the implementation plan.

Member

@kilemensi kilemensi left a comment

👍🏽

--

Think about naming and structuring things, e.g. in lib, maybe there should be an outline.js file that handles pulling data from the VPN backend. The API endpoints could maybe be a bit more RESTful, i.e. /api/statics with PUT or POST updating the stats and GET returning the stats, etc.

Update 1
On APIs, I think the processGSheet is the biggest offender ...

Contributor

@m453h m453h left a comment

🚀

Thanks @koechkevin, I've gained further context after carefully reading through the implementation plan and it looks solid. I could be missing something, but looking at the table, I was thinking that UPSERT could simplify the whole operation of calculating and inserting the user stats if we keep userId and date as unique constraints.
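
For reference, the UPSERT idea could look roughly like the sketch below. Everything here is an assumption made for illustration: the `daily_usage` table name, the column names, the SQLite-style `?` placeholders, and the `ON CONFLICT ... DO UPDATE` syntax (supported by SQLite and PostgreSQL). It also presumes a `UNIQUE (userId, date)` constraint exists on the table, and it is not the query used in this PR.

```typescript
// Hypothetical row shape matching the fields discussed in this PR.
interface UsageRow {
  userId: string;
  date: string;
  usage: number;
  cumulativeData: number;
}

// Assumes a table `daily_usage` with a UNIQUE (userId, date) constraint.
// On a conflict the existing row is updated instead of a new one being inserted.
const UPSERT_SQL = `
INSERT INTO daily_usage (userId, date, usage, cumulativeData)
VALUES (?, ?, ?, ?)
ON CONFLICT (userId, date) DO UPDATE SET
  usage = excluded.usage,
  cumulativeData = excluded.cumulativeData
`;

// Returns the positional parameters in the same order as the column list above.
function upsertParams(row: UsageRow): [string, string, number, number] {
  return [row.userId, row.date, row.usage, row.cumulativeData];
}

const params = upsertParams({
  userId: "107",
  date: "2024-9-3",
  usage: 9899671,
  cumulativeData: 5958400566,
});
// params is ["107", "2024-9-3", 9899671, 5958400566]
```

With a statement like this, the separate "does a record for today exist?" lookup is no longer needed, since the database resolves it through the unique constraint.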

@kilemensi
Member

The key (accessUrl) you posted above was real and working, @koechkevin. I had to delete it for obvious reasons ... Hopefully you'll be able to create a new one.

I think let's remove accessUrl completely from all stats functionality. The only time we should access it is when we need to email it to the user AND it must never be stored or logged anywhere outside the VPN itself.

@koechkevin koechkevin requested review from m453h and removed request for m453h September 6, 2024 13:17
Contributor

github-actions bot commented Sep 9, 2024

Latest updated Preview URL

Name Review
codeforafrica-ui-pr-870 Visit

@koechkevin
Contributor Author

koechkevin commented Sep 9, 2024

The data usage API was hardcoded only to obtain data transferred in the last (30 * 24) hours creating a sliding window problem, which makes the daily computations we have proposed in this PR inaccurate.

We may have to open a PR for the above fix or fork the repository to deploy.
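
To make the sliding window problem concrete, here is a toy sketch. It uses a 3-day window instead of 30 days and made-up daily figures, so none of the numbers come from the real API. It shows that once the window is full, the difference between two consecutive window totals equals the new day's usage minus the usage of the day that just dropped out of the window, which is not the daily usage and can even be negative.

```typescript
// Made-up true daily usage (arbitrary units) and a short window for readability.
const daily = [5, 1, 2, 4, 3];
const WINDOW = 3;

// Total usage over the `window` days ending at `endIndex`, like a "last N days" metric.
function rollingTotal(values: number[], endIndex: number, window: number): number {
  let sum = 0;
  for (let i = Math.max(0, endIndex - window + 1); i <= endIndex; i++) {
    sum += values[i];
  }
  return sum;
}

// What a windowed API would report on each day.
const totals = daily.map((_, i) => rollingTotal(daily, i, WINDOW));
// totals is [5, 6, 8, 7, 9]

// What "current total minus previous total" yields on each day.
const diffs = totals.map((t, i) => (i === 0 ? t : t - totals[i - 1]));
// diffs is [5, 1, 2, -1, 2], while the true daily usage is [5, 1, 2, 4, 3]
```

The first three differences are correct only because the window has not yet filled; from the fourth day on they diverge from the true values.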

@kilemensi
Member

> The data usage API was hardcoded only to obtain data transferred in the last (30 * 24) hours creating a sliding window problem, which makes the daily computations we have proposed in this PR inaccurate.
>
> We may have to open a PR for the above fix or fork the repository to deploy.

Not sure I fully follow @koechkevin (or have full context) but:

  1. Haven't we known this all along, i.e. isn't this how the current script works, @thepsalmist?
  2. Why would we need to fork the repo or open another PR?

@kilemensi kilemensi added the enhancement New feature or request label Sep 11, 2024
@koechkevin
Contributor Author

@m453h / @kelvinkipruto

Contributor

@m453h m453h left a comment

LGTM 🚀

Just an additional thought: the Filters interface could have stricter types for orderBy, since we know the fields we would expect.
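
One possible shape for that, as a sketch only: the field list below is taken from the sample response in the PR description, while the type names, the `isOrderBy` guard, and the exact layout of `Filters` are assumptions and may not match the interface in the code. Template literal types require TypeScript 4.1 or later.

```typescript
// Fields assumed sortable, taken from the sample response in the PR description.
const ORDER_FIELDS = ["ID", "userId", "usage", "date", "cumulativeData", "email", "createdAt"] as const;
const ORDER_DIRECTIONS = ["ASC", "DESC"] as const;

type OrderField = (typeof ORDER_FIELDS)[number];
type OrderDirection = (typeof ORDER_DIRECTIONS)[number];
// Accepts only strings such as "usage DESC" or "date ASC".
type OrderBy = `${OrderField} ${OrderDirection}`;

// Hypothetical, trimmed-down version of the Filters interface.
interface Filters {
  orderBy?: OrderBy;
  groupBy?: string;
}

// Runtime check for values arriving as plain strings, e.g. from a query string.
function isOrderBy(value: string): value is OrderBy {
  const parts = value.split(" ");
  return (
    parts.length === 2 &&
    (ORDER_FIELDS as readonly string[]).indexOf(parts[0]) !== -1 &&
    (ORDER_DIRECTIONS as readonly string[]).indexOf(parts[1]) !== -1
  );
}

const requested = "usage DESC";
const filters: Filters = isOrderBy(requested) ? { orderBy: requested } : {};
// filters.orderBy is "usage DESC"
```

A guard like this also rejects arbitrary strings before they reach any query that interpolates orderBy.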

@koechkevin koechkevin added this pull request to the merge queue Sep 13, 2024
Merged via the queue into main with commit 1c58c3a Sep 13, 2024
6 checks passed
@koechkevin koechkevin deleted the outline-vpn-user-stats branch September 13, 2024 13:13
Labels
enhancement New feature or request
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

@vpnmanager Data transfer statistics
4 participants