Export Acteurs according to the license to apply to the source #1135

Merged 4 commits on Dec 13, 2024
Changes from 3 commits
61 changes: 61 additions & 0 deletions .github/actions/export_acteur/action.yml
@@ -0,0 +1,61 @@
name: "Export Displayed Acteur"
description: "Export displayed acteur and upload to S3"
inputs:
  SCALINGO_API_TOKEN:
    description: "Duplicate API token"
    required: true
  SCALINGO_APP:
    description: "Scalingo production app"
    required: true
  S3_HOST:
    description: "S3 host URL"
    required: true
  S3_BUCKET:
    description: "S3 production bucket"
    required: true
  AWS_ACCESS_KEY_ID:
    description: "AWS access key ID"
    required: true
  AWS_SECRET_ACCESS_KEY:
    description: "AWS secret access key"
    required: true
  FILE_NAME:
    description: "File name for the exported file"
    required: false
    default: "exported_displayedacteur.csv"
  SCALINGO_APP_REGION:
    description: "Scalingo region for the app"
    required: false
    default: "osc-fr1"

runs:
  using: "composite"
  steps:
    - name: Install Scalingo CLI
      uses: scalingo-community/[email protected]
      with:
        region: ${{ inputs.SCALINGO_APP_REGION }}
    - name: Login Scalingo CLI
      shell: bash
      run: |
        scalingo login --api-token ${{ inputs.SCALINGO_API_TOKEN }}
    - name: Generate timestamped file name
      id: generate_filename
      shell: bash
      run: |
        TIMESTAMP=$(date +'%Y%m%d_%H%M%S')
        echo "TIMESTAMPED_FILE_NAME=${TIMESTAMP}_${{ inputs.FILE_NAME }}" >> $GITHUB_ENV
    - name: Execute sync script in one-off container
      shell: bash
      run: |
        scalingo --app ${{ inputs.SCALINGO_APP }} run \
          python manage.py export_displayedacteur --file ${{ env.TIMESTAMPED_FILE_NAME }}
      continue-on-error: true
    - name: Get file from s3
      shell: bash
      run: |
        aws --endpoint-url ${{ inputs.S3_HOST }} s3 cp ${{ inputs.S3_BUCKET }}/exports/${{ env.TIMESTAMPED_FILE_NAME }} ${{ env.TIMESTAMPED_FILE_NAME }}
    - name: Save file to artefact
      uses: actions/upload-artifact@v4
      with:
        path: ${{ env.TIMESTAMPED_FILE_NAME }}
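The "Generate timestamped file name" step prefixes the requested file name with a `date +'%Y%m%d_%H%M%S'` stamp. A minimal sketch of the same naming rule in Python follows; the helper name `timestamped_file_name` is hypothetical and not part of this PR.

```python
from datetime import datetime
from typing import Optional


def timestamped_file_name(file_name: str, now: Optional[datetime] = None) -> str:
    """Prefix file_name with a YYYYMMDD_HHMMSS stamp, as the shell step does."""
    # Hypothetical helper for illustration only; the action does this in bash.
    now = now or datetime.now()
    return f"{now.strftime('%Y%m%d_%H%M%S')}_{file_name}"


if __name__ == "__main__":
    fixed = datetime(2024, 12, 13, 2, 0, 0)
    print(timestamped_file_name("exported_displayedacteur.csv", fixed))
```

The management command stores its output under `exports/`, which is why the "Get file from s3" step copies from the `exports/` prefix of the bucket using the same timestamped name.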
33 changes: 33 additions & 0 deletions .github/workflows/preprod_export_acteur_open_license.yml
@@ -0,0 +1,33 @@
name: "✅ Export des Acteurs sous licence ouverte en PREPROD"

on:
  workflow_dispatch:

env:
  DUPLICATE_API_TOKEN: ${{ secrets.DUPLICATE_API_TOKEN }}
  PREPROD_APP: ${{ secrets.SCALINGO_PREPROD_APP }}
  S3_HOST: https://cellar-c2.services.clever-cloud.com
  S3_PREPROD_BUCKET: ${{ secrets.LVAO_S3_PREPROD_BUCKET }}
  AWS_ACCESS_KEY_ID: ${{ secrets.LVAO_S3_ACCESS_KEY }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.LVAO_S3_SECRET_KEY }}

defaults:
  run:
    shell: bash

jobs:
  export_acteur_open_license_preprod:
    name: ✅ Export des Acteurs sous licence ouverte en PREPROD
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Export Displayed Acteur
        uses: ./.github/actions/export_acteur
        with:
          SCALINGO_API_TOKEN: ${{ env.DUPLICATE_API_TOKEN }}
          SCALINGO_APP: ${{ env.PREPROD_APP }}
          S3_HOST: ${{ env.S3_HOST }}
          S3_BUCKET: ${{ env.S3_PREPROD_BUCKET }}
          AWS_ACCESS_KEY_ID: ${{ env.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ env.AWS_SECRET_ACCESS_KEY }}
35 changes: 35 additions & 0 deletions .github/workflows/prod_export_acteur_open_license.yml
@@ -0,0 +1,35 @@
name: "🛑 Export des Acteurs sous licence ouverte en PROD"

on:
  workflow_dispatch:
  schedule:
    - cron: "0 2 * * SUN"

env:
  DUPLICATE_API_TOKEN: ${{ secrets.DUPLICATE_API_TOKEN }}
  PRODUCTION_APP: ${{ secrets.SCALINGO_PRODUCTION_APP }}
  S3_HOST: https://cellar-c2.services.clever-cloud.com
  S3_PRODUCTION_BUCKET: ${{ secrets.LVAO_S3_PRODUCTION_BUCKET }}
  AWS_ACCESS_KEY_ID: ${{ secrets.LVAO_S3_ACCESS_KEY }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.LVAO_S3_SECRET_KEY }}

defaults:
  run:
    shell: bash

jobs:
  export_acteur_open_license_preprod:
    name: Export des Acteurs sous licence ouverte en PROD
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Export Displayed Acteur
        uses: ./.github/actions/export_acteur
        with:
          SCALINGO_API_TOKEN: ${{ env.DUPLICATE_API_TOKEN }}
          SCALINGO_APP: ${{ env.PRODUCTION_APP }}
          S3_HOST: ${{ env.S3_HOST }}
          S3_BUCKET: ${{ env.S3_PRODUCTION_BUCKET }}
          AWS_ACCESS_KEY_ID: ${{ env.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ env.AWS_SECRET_ACCESS_KEY }}
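The production workflow is scheduled with `cron: "0 2 * * SUN"`, which GitHub Actions evaluates in UTC, so it fires every Sunday at 02:00 UTC. As a rough sketch of what that expression means, the hypothetical helper below (not part of this PR) computes the next matching instant for a timezone-aware `now`.

```python
from datetime import datetime, timedelta, timezone


def next_sunday_2am_utc(now: datetime) -> datetime:
    """Return the first Sunday 02:00 UTC strictly after `now` (must be tz-aware)."""
    candidate = now.astimezone(timezone.utc).replace(
        hour=2, minute=0, second=0, microsecond=0
    )
    # datetime.weekday(): Monday is 0, Sunday is 6
    candidate += timedelta(days=(6 - candidate.weekday()) % 7)
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    merged_day = datetime(2024, 12, 13, 12, 0, tzinfo=timezone.utc)
    print(next_sunday_2am_utc(merged_day).isoformat())
```

Scheduled runs on GitHub Actions can start later than the nominal time, so the computed instant is a lower bound rather than a guarantee.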
44 changes: 30 additions & 14 deletions qfdmo/admin/acteur.py
@@ -1,4 +1,4 @@
from typing import Any
from typing import Any, List

import orjson
from django import forms
@@ -566,20 +566,25 @@ class OpenSourceDisplayedActeurResource(resources.ModelResource):

    limit = 0
    offset = 0
    licenses = []

    def __init__(self, limit=0, offset=0, **kwargs):
    def __init__(
        self, limit: int = 0, offset: int = 0, licenses: List[str] = [], **kwargs
    ):
        self.limit = limit
        self.offset = offset
        self.licenses = licenses
        super().__init__(**kwargs)

    uuid = fields.Field(column_name="Identifiant", attribute="uuid", readonly=True)
    sources = fields.Field(
        column_name="Contributeurs", attribute="sources", readonly=True
    )
    sources = fields.Field(column_name="Paternité", attribute="sources", readonly=True)

    def dehydrate_sources(self, acteur):
        sources = ["Longue Vie Aux Objets", "ADEME"]
        sources.extend([f"{source.libelle}" for source in acteur.sources.all()])
        acteur_sources = acteur.sources.all()
        if self.licenses:
            acteur_sources = acteur_sources.filter(licence__in=self.licenses)
        sources.extend([f"{source.libelle}" for source in acteur_sources])
        seen = set()
        deduplicated_sources = []
        for source in sources:
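The hunk above restricts the attribution column to sources whose licence is in `self.licenses`, then de-duplicates the names while keeping their order. The sketch below reproduces that logic in plain Python under stated assumptions: `Source` and `SourceLabels` are hypothetical stand-ins for the Django model and resource, the licence strings are placeholders rather than real `DataLicense` values, and `dict.fromkeys` is swapped in for the `seen` set loop used in the PR. It also takes `None` as the default for `licenses`, a common way to avoid sharing one mutable default list between calls.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Source:
    # Hypothetical stand-in for the Django Source model
    libelle: str
    licence: str


class SourceLabels:
    # Hypothetical stand-in for the resource's dehydrate_sources logic
    def __init__(self, licenses: Optional[List[str]] = None):
        # None instead of [] so that no default list is shared across instances
        self.licenses = list(licenses) if licenses else []

    def labels(self, acteur_sources: List[Source]) -> List[str]:
        names = ["Longue Vie Aux Objets", "ADEME"]
        if self.licenses:
            # Keep only sources published under one of the requested licences
            acteur_sources = [s for s in acteur_sources if s.licence in self.licenses]
        names.extend(s.libelle for s in acteur_sources)
        # dict.fromkeys drops duplicates while keeping first-seen order
        return list(dict.fromkeys(names))


if __name__ == "__main__":
    sources = [Source("ADEME", "open"), Source("X", "open"), Source("Y", "closed")]
    print(SourceLabels(["open"]).labels(sources))
```

With an empty or missing licence list the filter is skipped, which matches the `if self.licenses:` guard in the diff.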
@@ -682,10 +687,22 @@ def dehydrate_propositions_services(self, acteur):
        )

    def get_queryset(self):

        queryset = super().get_queryset()

        queryset = queryset.prefetch_related(
            "sources",
            "labels",
            "proposition_services__sous_categories",
            "proposition_services__action",
        )

        # Only Actif
        queryset = queryset.filter(
            statut=ActeurStatus.ACTIF,
        ).exclude(
        )
        # Exclude acteurs only professionals
        queryset = queryset.exclude(
            public_accueilli__in=[
                ActeurPublicAccueilli.AUCUN,
                ActeurPublicAccueilli.PROFESSIONNELS,
@@ -695,16 +712,15 @@ def get_queryset(self):
        queryset = queryset.exclude(
            identifiant_unique__icontains="_reparation_",
        )

        queryset = queryset.prefetch_related(
            "sources",
            "labels",
            "proposition_services__sous_categories",
            "proposition_services__action",
        )
        # Export only acteurs with expected licenses
        if self.licenses:
            queryset = queryset.filter(sources__licence__in=self.licenses)
        queryset = queryset.distinct()
        queryset = queryset.order_by("uuid")

        if self.limit:
            return queryset[self.offset : self.offset + self.limit]

        return queryset

    class Meta:
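Filtering on `sources__licence__in` joins each acteur to its sources, so an acteur with several matching sources can appear more than once; `distinct()` and a stable `order_by("uuid")` make the later `offset`/`limit` slice return non-overlapping pages. A minimal plain-Python sketch of that sequence follows, assuming the join is represented as `(uuid, licence)` tuples; the function name `export_page` and the licence strings are hypothetical.

```python
from typing import List, Sequence, Tuple


def export_page(
    joined_rows: Sequence[Tuple[str, str]],
    licenses: Sequence[str],
    offset: int = 0,
    limit: int = 0,
) -> List[str]:
    """Return one page of acteur uuids from (uuid, source licence) join rows."""
    if licenses:
        # Equivalent of filter(sources__licence__in=licenses)
        joined_rows = [row for row in joined_rows if row[1] in licenses]
    # Equivalent of distinct() + order_by("uuid"): one entry per acteur, stable order
    uuids = sorted({uuid for uuid, _ in joined_rows})
    if limit:
        return uuids[offset : offset + limit]
    return uuids


if __name__ == "__main__":
    rows = [("b", "open"), ("a", "open"), ("a", "other"), ("c", "closed")]
    print(export_page(rows, ["open", "other"], offset=0, limit=1))
    print(export_page(rows, ["open", "other"], offset=1, limit=1))
```

Without the de-duplication step, acteur `a` would occupy both pages in this example and `b` would never be exported.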
53 changes: 44 additions & 9 deletions qfdmo/management/commands/export_displayedacteur.py
@@ -8,17 +8,51 @@
from django.core.management.base import BaseCommand

from qfdmo.admin import OpenSourceDisplayedActeurResource
from qfdmo.models.acteur import DataLicense

CHUNK = 1000


class Command(BaseCommand):
    help = "Export Ressources using CSV format"

    def add_arguments(self, parser):
        parser.add_argument(
            "--file",
            type=str,
            help="File to export to",
            default=(f"export_acteur_{datetime.now().strftime('%Y%m%d_%H%M%S')}.xlsx"),
        )
        parser.add_argument(
            "--licenses",
            nargs="+",
            action="extend",
            type=str,
            help=(
                f"Licenses to export, options : {DataLicense.values}, "
                f"default: '{DataLicense.OPEN_LICENSE.value}'"
            ),
        )

    def handle(self, *args, **options):
        self.stdout.write(f"Exporting Ressources, starting at {datetime.now()}")
        target_file = datetime.now().strftime(
            "exports/export_acteur_%Y%m%d_%H%M%S.xlsx"
        self.stdout.write(
            self.style.SUCCESS(f"Exporting Ressources, starting at {datetime.now()}")
        )
        target_file = "exports/" + options["file"]
        licenses = options["licenses"]
        if not licenses:
            licenses = [DataLicense.OPEN_LICENSE.value]
        if not all(license in DataLicense.values for license in licenses):
            self.stdout.write(
                self.style.ERROR(
                    f"Invalid licenses, options : {licenses}, "
                    f"Available values: '{DataLicense.values}'"
                )
            )
            return

        self.stdout.write(
            self.style.SUCCESS(f"Exporting DisplayedActeur using licenses: {licenses}")
        )

        with tempfile.NamedTemporaryFile(mode="w+b", suffix=".xlsx") as tmp_file:
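The `--licenses` option combines `nargs="+"` with `action="extend"`, so values can be given after one flag or spread over repeated flags; when the option is absent the parsed value is `None` and the command falls back to the open license. The sketch below shows that behaviour with the standard `argparse` module (Python 3.8 or later for `extend`). `ALLOWED` and `DEFAULT` are placeholder values, not the real `DataLicense` choices, and the sketch raises `ValueError` where the command writes an error message and returns.

```python
import argparse
from typing import List

# Placeholder values for illustration; the real ones come from DataLicense
ALLOWED = ["OPEN_LICENSE", "CC_BY_NC_SA", "NO_LICENSE"]
DEFAULT = "OPEN_LICENSE"


def parse_licenses(argv: List[str]) -> List[str]:
    parser = argparse.ArgumentParser()
    # nargs="+" takes one or more values per flag; "extend" merges repeated flags
    parser.add_argument("--licenses", nargs="+", action="extend", type=str)
    licenses = parser.parse_args(argv).licenses or [DEFAULT]
    invalid = [name for name in licenses if name not in ALLOWED]
    if invalid:
        raise ValueError(f"Invalid licenses: {invalid}, available values: {ALLOWED}")
    return licenses


if __name__ == "__main__":
    print(parse_licenses([]))
    print(parse_licenses(["--licenses", "OPEN_LICENSE", "--licenses", "CC_BY_NC_SA"]))
```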
@@ -31,26 +65,27 @@ def handle(self, *args, **options):

            offset = 0
            dataset = OpenSourceDisplayedActeurResource(
                limit=CHUNK, offset=offset
                limit=CHUNK, offset=offset, licenses=licenses
            ).export()
            sheet.append(dataset.headers)

            while dataset.dict:
                self.stdout.write(f"Exporting {offset} to {offset + CHUNK}")
                self.stdout.write(
                    self.style.SUCCESS(f"Exporting {offset} to {offset + CHUNK}")
                )
                dataset.headers = None

                for row in dataset.dict:
                    sheet.append(row)

                offset += CHUNK
                dataset = OpenSourceDisplayedActeurResource(
                    limit=CHUNK, offset=offset
                    limit=CHUNK, offset=offset, licenses=licenses
                ).export()

            self.stdout.write(f"Writing to {target_file}")
            self.stdout.write(self.style.SUCCESS(f"Writing to {target_file}"))

            workbook.save(tmp_file.name)
            tmp_file.seek(0)
            default_storage.save(target_file, ContentFile(tmp_file.read()))

        self.stdout.write(f"Ended at {datetime.now()}")
        self.stdout.write(self.style.SUCCESS(f"Ended at {datetime.now()}"))
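The export loop above requests `CHUNK` rows at a time, advancing `offset` until a request comes back empty. The pattern can be sketched independently of Django and the spreadsheet writer; `iter_chunks` and the `fetch` callback are hypothetical names, with `fetch(offset, limit)` standing in for building the resource and calling `.export()`.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, int]


def iter_chunks(fetch: Callable[[int, int], List[Row]], chunk: int) -> Iterator[List[Row]]:
    """Yield successive pages from fetch(offset, limit) until one is empty."""
    offset = 0
    rows = fetch(offset, chunk)
    while rows:
        yield rows
        offset += chunk
        rows = fetch(offset, chunk)


if __name__ == "__main__":
    data = [{"id": i} for i in range(5)]
    for page in iter_chunks(lambda offset, limit: data[offset : offset + limit], 2):
        print(len(page))
```

Because each page is a fresh query, the approach relies on the queryset having a stable ordering between calls, which the resource provides through `order_by("uuid")`.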