Sberbank Home Assignment (2018)

Position: Junior Data Scientist.

The task is to determine a company's industry.

Data

The data is stored in the data folder using Git LFS.
To download it, install Git LFS and run:

git lfs fetch
git lfs checkout

pays.csv - payments between companies:

  • hash_inn_kt - anonymized sender INN
  • hash_inn_dt - anonymized recipient INN
  • week - week in which the payments were made
  • count - number of payments per week
  • sum - total amount of payments per week, rounded to the nearest hundred

inn_info_public.csv - company information:

  • hash_inn - anonymized INN
  • okved2 - anonymized industry (target variable)
  • region - anonymized region
  • is_public - train / validation flag
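
As a rough illustration of how the two files relate, the sketch below aggregates outgoing payments per sender and joins them to the company table on the anonymized INN. It is not taken from solution.ipynb. The function names (read_rows, sender_features, build_dataset) and the feature names (n_payments, total_sum, active_weeks) are hypothetical, and it assumes both files are comma-separated with a header row, that count holds integers, and that sum is numeric. It uses only the Python standard library.

```python
import csv
import io
from collections import defaultdict


def read_rows(src):
    """Read a CSV with a header row from an open text file into a list of dicts."""
    return list(csv.DictReader(src))


def sender_features(pays_rows):
    """Aggregate outgoing payments per sender (hash_inn_kt)."""
    feats = defaultdict(lambda: {"n_payments": 0, "total_sum": 0.0, "weeks": set()})
    for row in pays_rows:
        f = feats[row["hash_inn_kt"]]
        f["n_payments"] += int(row["count"])   # assumes integer counts
        f["total_sum"] += float(row["sum"])    # assumes numeric sums
        f["weeks"].add(row["week"])
    return feats


def build_dataset(pays_rows, info_rows):
    """Left-join per-sender features onto the company table (hash_inn)."""
    feats = sender_features(pays_rows)
    dataset = []
    for row in info_rows:
        f = feats.get(row["hash_inn"])  # None for companies that never sent a payment
        dataset.append({
            **row,
            "n_payments": f["n_payments"] if f else 0,
            "total_sum": f["total_sum"] if f else 0.0,
            "active_weeks": len(f["weeks"]) if f else 0,
        })
    return dataset


# Usage, assuming the files sit in the data folder described above:
# with open("data/pays.csv", newline="") as p, \
#         open("data/inn_info_public.csv", newline="") as i:
#     dataset = build_dataset(read_rows(p), read_rows(i))
```

The same aggregation could be repeated on hash_inn_dt to describe incoming payments. Rows can then be split on is_public before fitting a model for okved2.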

Solution

My solution is in solution.ipynb.