GitHub - MarletteFunding/etl-challenge

Problem:

You have a vendor that delivers 2 files (data.csv, headers.txt) every day at 6AM UTC.
Here you are provided the files in the root path as if they were already downloaded from an external source (e.g. S3 or SFTP).
One file contains the raw pipe-delimited data (data.csv) without a header row.
The other file (headers.txt) contains the headers in the same order as the data in data.csv, but each header is a separate row. E.g. the first column in data.csv is Order ID, the next is Order Date and so on.
Write a program that takes the data from both files, transforms into a single CSV file with both header row and data in a comma-separated format (so not the source pipe-delimiter), and finally loads it into a database using the provided function.
Use the class provided in src/main.py, start your code in the class's run function, and write as if this were production-ready code.
We are not using a real database here, so just pretend that the Database class's load_file function handles everything for us.

Note: use Python >= 3.7 to solve and do not use any other packages or dependencies (e.g. Pandas). The goal is to show us your coding style and see how you would approach the problem using plain Python. This should not take more than 1 hour to complete.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
src		src
README.md		README.md
data.csv		data.csv
headers.txt		headers.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Problem:

About

Releases

Packages

Languages

MarletteFunding/etl-challenge

Folders and files

Latest commit

History

Repository files navigation

Problem:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages