- Translations
- About
- A brief description
- A detailed description
- Application behaviour by case
- Standardizantion of files and sharepoints path
- Instalation
- Commands
- References
The application finances_mail_invoices_and_bills was developed to automatise monthly email sent of invoices and bills for a company clients. It works with the company sharepoint's folders, emails elaboration with files attachment and reports elaboration of succesfull and non-successfull cases.
This application is made to non-developers operate it and can be used once a month or more according to the demand. It is only runned manually for this version.
It was originally developed for Windows OS.
This application uses Python's framework Django, the libs OpenPyXl, Pandas, Selenium as RPA, the SQLite3 database and a windows .bat file that runs the whole application. It also works with Pytest test cases.
Everyone may operate this application, but its configuration may be done only after reading A detailed description and Instalation .
This application is manually started by clicking twice the .bat file 'script_for_bat_file.bat' that can be placed anywhere in the user's computer. The user doesn't need to do anything else but he can occasionally open the terminal created by the .bat file during the process and read the messages displayed.
Bellow a brief description of how this application works:
finances_mail_invoices_and_bills application works with 3 different sharepoint's folders* (ordered by year and month):
1. Client's data
2. Client's monthly invoices and bills
3. Reports
*This folder's links bust be provided in advance for configuration to make the application work. Read Instalation 's .env file for more information.
The application first looks for 2 base spreadsheets (client's contact data (email) and other client's data (ID, invoice number, etc) respectively) of the current month in Client's data folder using Selenium. If found, these spreadsheets are downloaded in an application's specific folder called 'raw_table/'. After downloading it the application uses OpenPyXL to extract relevant data from specific columns of the downloaded spreadsheets. With these extractions a third spreadsheet is created also with some new columns and placed in another application folder called 'edited_table/'. From the edited_table's spreadsheet the application uses Pandas to insert it in a SQLite3 database.With a database created the Django* framework comes to scene. This framework will be responsible with the database lines' management. It extracts data from the database and does the following:
*For developers: The application finances_mail_invoices_and_bills doesn't use Django's server or any endpoint or url from it, but just its model and a single view instantiated out of its projects' folder.
Now it is time to use Selenium again. From every single database's line Django takes specific data and feeds Selenium. From this data Selenium goes back to the company's sharepoint in Client's monthly invoices and bills folder and looks for this data, which are pdfs and another spreadsheets.If found, they're selected, downloaded and moved to an application's folder called 'attachments/'. Then, an email is created using django's EmailMessage with this files attached to it and sent to the client.
A copy of this email may be sent to the company's email if configured in django's 'settings'.
If the email is succesfully sent the successful cases ('Sent') report is fed with the client's ID (brazilian's CNPJ) and its invoice's number (brazilian's NFE) and a message is displayed in .bat's terminal:
"Email successfully sent! Check inbox."
If not found or something else fails, the unsuccesful cases ('Not sent') report is fed with this same client's info and another message is displayed in .bat's terminal:
"No client found for {CNPJ}!"
or
"No nfe found for {NFE}!"
Or another error identified.
This process is made for every single database line.
At the end of it, the tables inside 'raw_table/' are deleted and a message is displayed in .bat's terminal:"Application finished its process succesfully!"
The successful and unsuccessful reports' info are extracted and placed in a third report called 'daily report' which is sent to sharepoint's Reports folder by Selenium.
With the 'raw_table/' folder emptied the application will look for the 2 base spreadsheets in Client's data folder again when manually restarted. These 2 may have been updated or have new data.
If the new base spreadsheets contain new data the application already have files to compare what is new and what is not and feed the 'edited_table/''s spreadsheet, which is not deleted at the end of a process*, with new data.
*edited_table/'s table contains data of sent and not sent emails. It is maintained to avoid email sent duplicity.
This application is designed to operate at the beggining and during the month. When it changes something else must be done:In the application's folder there is a file named "DELETE_ME_BEFORE_FIRST_MONTH_OPERATION.txt" that must be deleted before starting the first process of the month. Its deletion will delete all the application's tables and its reports.
For more information how this application behaves read Application behaviour by case.
For a more technical information of how this app works read the chapter below:
This chapter is made for developers and for everyone that really want to understand how the app works and may be configured.
The application 'finances_tables_to_db_and_mail' has 5 main directories:
- management_before_django/: stores modules and script responsible for manipulating the 2 base spreadsheets downloaded and SQLite3 database creation as well as the tables' folders.
- robot_sharepoint: stores the three Selenium modules and its auxiliary modules, the attachments and the reports folders.
- dj_project: stores the django project and its app filter_tables/.
- utils: stores auxiliary functions and variables.
- tests: stores, for now, tests for root files using pytest.
And everything is runned with the script './run_everything_here.py':
The whole application can be runned at the root directory by the script './run_everything_here.py' or by the .bat file 'script_for_bat_file.bat' that runs it.
This script runs all the application by running the module tables_to_db and the django view EmailAttachByTable's instance.
But firstly the script looks for the file "DELETE_ME_BEFORE_FIRST_MONTH_OPERATION.txt" in the root dir. If not found, the script deletes all the tables present in the application and recreates the DELETE_ME file.
After this the script searches for the 2 base spreadsheets in the directory './management_before_django/raw_table/'. If found the aplication follows up looking for attachments. If not, it uses the function 'robot_for_raw_table()' to search for the 2 spreadsheets on sharepoint's link for Client's data and downloads it to 'raw_table/'.
Everything from this directory is runned by the module '/table_managements/scripts/tables_to_db.py'. This 3 module instances- manages tables content (1 filter_table_column),
- inserts the result in a SQLite3 database (2 insert_table_to_db) and
- creates a Django module from this database (3 create_model_from_database).
- robot_for_contacts_downloads (for searching and downloading spreadsheet that has clients' contacts' data)
- robot_for_database_downloads (idem for spreadsheet that has clients' other data - e.g. due date, net amount to pay, etc)
- robot_for_attachments_downloads (idem for files that will be attached to client's email)
- robot_to_upload_files (for final reports)
All of these 4 'robots' need to login to sharepoint using email and password provided in .env, but they may have different approaches:
These functions are used to download a single element per process. The contacts one doesn't work to year and month periods/folders while database does. Both of them uses the download_directories_management module inside 'robot_utils' that is going to be treated afterwards. This one downloads multiple elements and works with year and month periods/folders. Furthermore, for finding the desired files it uses the client's ID* (brazilian CNPJ) and invoice number (brazilian NFE). The downloaded files may have spreadsheets and/or bill pdfs, but all may have invoices. The same for download_directories_management module.*The same client ID may have different invoice numbers. So a single client may have more than one email sent by him.
This 'robot' takes the final report made after the end of the app's process and uploads it to a specific sharepoint folder and works with year and month periods/folders.(At the time this application was developed it was not found any manner to download the selected files directly to the desired directories such as 'attachments/', but only to the default Windows' "Downloads". So the two functions described bellow served for remedying this issue.)
Because of the issue described above on Windows OS and Selenium's configuration for 'webdriver.Edge' the download_directories_management module was made. It has two functions:
-
empty_download_directories (before Selenium tools are used this function is called to delete previous files downloaded preventing system crashs)
-
moving_files_from_virtual_dir (the downloaded files are moved from the default download dir to the desired one. For instance, robot_for_attachments_downloads uses it to send downloaded files from 'Downloads' to 'attachments/' folder.)
The view EmailAttachByTable reads line by line from the models' Table. From each line it firstly reads the column 'STATUS' content.
If its content is set to "Not sent", the view will take the Table's row data and feed robot_for_attachments_downloads() with it, if made succesfully by finding the files and downloading them it will read the attachments/ folder and select the appropriate template for the email (the file that defines which one is selected is the 'bill').
After that it takes info from the files' titles to feed the template*. For instance, the year competency, the invoice number, etc. This new data extracted will be used in the email template.
*For more information read Standardizantion of files and sharepoints path.
If succesfully sent the view will update the Table 'status' column cell to "Sent" and keep reanding each row continuously.
This directory stores functions and variables that are used throughout the application. It has two directories: functions nad variables. There are only 3:-
deleting_elements (used to search and delete no more desired content)
-
path_length (used to check if there are one or more tables in the path selected)
-
do_we_have_model (used to solve issue of django model)
Function used in 'run_everything_here' to check if there are tables in raw_table/ or not and if not download them and also in 'filter_table_column' to check if the new third table created from 'raw_table/' needs to be compared to the 'edited_table/' one.(This function was created to solve an issue of django model: as at the time of this development sometimes the model was deleted during the process this function was created to replaced it in this case avoiding developer's msnuslsupport.)
Tests developed for testing the application's functionalities disposed by directories. Use of pytest.
Bellow a resumed list of the application behaviour by case:
ERRORS:
1. Interrupted process (lack of light energy, lack of internet, user closes terminal accidentally):
Procedure:
Process finished.
When restart process:
Spreadsheet downloaded remains;
Spreadsheet edited remains;
Sent elements report remains;
Not sent elements report deleted;
Final report recriated.
2. Interrupted process (internal error or voluntairly interruption of process on terminal (ex: end of working day)):
Procedure:
Final report sent to sharepoint;
Process finished.
When restart process:
Downloaded spreadsheet remains;
Edited spreadsheet remains;
Sent elements report remains;
Not sent elements report deleted;
Final report recriated.
IDEAL:
3. Process ended succesfully, but new invoices and bills will come during the month:
Procedure:
Final report sent to sharepoint;
Delete downloaded spreadsheet;
Edited spreadsheet remains;
Process finished.
When restart process:
spreadsheet remains;
Downloaded spreadsheet remains;
Edited spreadsheet remains;
Sent elements report remains;
Not sent elements report deleted;
Final report recriated.
4. Process ended (end of billings period - all billings sent OR MONTH TURN):
Procedure:
Delete Spreadsheets;
Delete reports;
Process finished.
When restart process:
Download spreadsheet;
Spreadsheet downloaded is created;
Spreadsheet edited is created;
Sent elements report is created;
Not sent elements report is created;
Final report recriated.
(for the client this application was originally developed)
SPREADSHEETS DOWNLOAD (The way sharepoints folders need to be disposed to make 'robot' finds and downloads spreadsheets):
CONTACTS SPREADSHEET (matr020 - CONTACTS.xlsx):
Obtained in sharepoint's path: Documents > 02 - BILLINGS > 14 - CLIENTS DATA
Lines:
1st line: Columns titles
Other lines: (content)
Columns used:
Column E (CNPJ/CPF): dd.ddd.ddd/dddd-dd (Ex: 11.222.333/4444-55) # digits with '.', '/' and '-'.
Column H (E-Mail): email standard (Ex: [email protected])
CLIENTS DATA SPREADSHEET (finr940.xlsx):
Obtained in sharepoint's path: Documents > 02 - BILLINGS > 14 - CLIENTS DATA > YEAR {year_number} > {xx} {month_name} # Ex: YEAR 2023 > 01 JANUARY
Obs: Both year and month are dinamically obtained by the application.
Lines:
1st line: Columns titles
Other lines: (content)
Columns used:
Column D (Client's name): text
Column E (CNPJ): dddddddddddddd (Ex: 01234567890123) # only digits for clients' id
Column G (Number): 0ddddd (Ex: 012345) # Invoice number must begin with 0
Column K (Due date): dd/mm/aaaa (Ex: 28/02/2024) # 'day-month-year' format
Column S (Net value): 00.000,00 (50.000,44) # '.' when there are thousands and ',' for cents
ATTACHMENT DOWNLOAD:
Obtained in sharepoint's path: 01 - MEASUREMENTS > YEAR {year_number} > {xx} {month_name} > {CNPJ - nome} > {number_nfe}
# Ex: YEAR 2023 > 01 JANUARY > 00111222333444455 - name > 012345
Obs: The values are dinamically obtained by the application.
Bills:
"'invoice' <invoice_number> <service_type> <year_competency>.pdf" # mind the 3 spaces!
Example: 'invoice 17758 FIX December24.pdf'.
Measurements:
"Measurement.xlsx"
Example: 'Measurement.xlsx'
Invoices:
"BILL <invoice_number>.pdf" # mind the space!
Example: 'BILL 17757.pdf'
REPORTS UPLOAD:
Obtained in sharepoint's path: Documents > 02 - BILLINGS > 15 - EMAIL SENDING REPORTS > YEAR {year_number} > {xx} {month_name}
# Ex: YEAR 2023 > 01 JANUARY
Obs: Both year and month are dinamically obtained by the application.
-
The code versioning Git,
-
The programming language Python,
-
A code editor, also known as IDE. For instance, Visual Studio Code (VSCode),
-
And versioning your directory to receive the aplication clone:
git init
git clone https://github.com/AndreKuratomi/finances_mail_invoices_and_bills.git
WINDOWS:
Obs: In case of any mistake similar to this one:
unable to access 'https://github.com/AndreKuratomi/finances_mail_invoices_and_bills.git': SSL certificate problem: self-signed certificate in certificate chain
Configure git to disable SSL certification:
git config --global http.sslVerify "false"
Enter the directory:
cd finances_tables_to_db_and_mail
LINUX:
python3 -m venv venv --upgrade-deps
WINDOWS:
py -m venv venv --upgrade-deps
In case an error like this one is returned just follow the command displayed:
The virtual environment was not created successfully because ensurepip is not
available. On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.
apt install python3.10-venv
You may need to use sudo with that command. After installing the python3-venv
package, recreate your virtual environment.
*It is a good practice to work with virtual enviroments because different projects may need different dependencies. A virtual enviroment is only a separated enviroment from the user machine. If not used, the user's machine may have lots of dependencies intalled that may only be used in a single project.
LINUX:
source/venv/bin/activate
WINDOWS:
On Windows operational system it is necessary to configure the Execution Policy at PowerShell:
Get-ExecutionPolicy # to check the Execution policy type
Set-ExecutionPolicy RemoteSigned # to change the type of policy if the command above shows 'Restricted'
Obs: It may often be necessary to open PowerShell as administrador for that.
.\env\Scripts\activate
pip install -r requirements.txt
WINDOWS:
In case any error similar to the one bellow be returned:
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\\Users\\andre.kuratomi\\OneDrive - Company\\Área de Trabalho\\tables_to_db_mail_for_finances\\tables_to_db_and_mail_finances\\env\\Lib\\site-packages\\jedi\\third_party\\django-stubs\\django-stubs\\contrib\\contenttypes\\management\\commands\\remove_stale_contenttypes.pyi'
HINT: This error might have occurred since this system does not have Windows Long Path support enabled. You can find information on how to enable this at https://pip.pypa.io/warnings/enable-long-paths
Run cmd as adminstrador with the following command:
reg.exe add HKLM\SYSTEM\CurrentControlSet\Control\FileSystem /v LongPathsEnabled /t REG_DWORD /d 1 /f
code .
./
touch .env
Inside it we need to put our enviroment variables taking as reference the given file .env.example:
# DJANGO:
SECRET_KEY=secret_key
# EMAIL VARIABLES:
EMAIL_HOST_USER=host_email
EMAIL_HOST_PASSWORD=host_password
# SHAREPOINT VARIABLES:
SHAREPOINT_FOR_UPLOAD_URL=sharepoint_for_upload_url
SHAREPOINT_BILLINGS_URL=billings
SHAREPOINT_MEASUREMENTS_URL=measurements
DOWNLOAD_DIRECTORY=download_directory
RAW_TABLE_DIRECTORY=raw_table_url
Obs: Do not share info from .env file. It is already mentioned in .gitignore for not being pushed to the repo.
For all the necessary procedures for running the aplication we may only run the command bellow in the root directory:
WINDOWS:
py run_everything_here.py
LINUX:
python3 run_everything_here.py
If you want to run the tests use this command in the same directory:
pytest