- You have a vendor that delivers 2 files (
data.csv
,headers.txt
) every day at 6AM UTC. - Here you are provided the files in the root path as if they were already downloaded from an external source (e.g. S3 or SFTP).
- One file contains the raw pipe-delimited data (
data.csv
) without a header row. - The other file (
headers.txt
) contains the headers in the same order as the data indata.csv
, but each header is a separate row. E.g. the first column indata.csv
isOrder ID
, the next isOrder Date
and so on. - Write a program that takes the data from both files, transforms into a single CSV file with both header row and data in a comma-separated format (so not the source pipe-delimiter), and finally loads it into a database using the provided function.
- Use the class provided in
src/main.py
, start your code in the class'srun
function, and write as if this were production-ready code. - We are not using a real database here, so just pretend that the Database class's
load_file
function handles everything for us.
Note: use Python >= 3.7 to solve and do not use any other packages or dependencies (e.g. Pandas). The goal is to show us your coding style and see how you would approach the problem using plain Python. This should not take more than 1 hour to complete.