` readings with more than 30 days in advance, in case they represent more than 1%. \n",
+ "4.\tStacked all invoices by customer with the mean of each variable (although in some cases other aggregation functions are used instead).\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Feature selection\n",
+ "\n",
+ "Explain here how the features used for the model were selected, and whether any feature selection methodology was applied.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "The feature selection methodology used was Forward Selection. Some variables from the Credit Risk model were also considered and added until the model reached its best performance. PCA and RFE have not produced better results so far.\n",
+ "\n",
+ "**Features used**\n",
+ "\n",
+ "List here the complete set of features used to run the model.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "The features used to train both the clustering model and the classifier are defined in the static variable `COLS_TO_TRAIN`, under `models/config`.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Modeling\n",
+ "\n",
+ "Explain here in more detail which algorithm was chosen and how the complete model works.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "To identify the clusters, the K-Means algorithm was chosen. A Random Forest Classifier from scikit-learn was then trained to predict the clusters for new customers and for customers whose variable values have been updated. This pipeline has performed well, running in under 5 minutes with satisfactory results.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Model selection\n",
+ "\n",
+ "State here how the algorithm in use was chosen and why the alternatives were not. If any future implementations are envisioned, mention them briefly as well.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "Although K-Medians normally performs better with outliers, K-Means showed more consistent results and was therefore chosen. K-Modes and K-Prototypes were also tested in order to include categorical variables, but they neither reached a good Silhouette Score nor divided the data in a meaningful way. Spectral Clustering gave good results on a sample dataset but performed poorly on the complete dataset, and still has to be studied for the next version. DBSCAN was considered, but it was discarded because it leaves outliers unassigned to any cluster.\n",
+ "\n",
+ "For the classification algorithm, Random Forest outperformed the Decision Tree and the XGBoost classifiers.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Model validation\n",
+ "\n",
+ "Give a step-by-step explanation of what was done to choose the model in use. The metrics used to select it should also be included in the steps.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "1.\tThe number of clusters was chosen based on a Silhouette Score of 0.85.\n",
+ "2.\tThe dataset was sampled to ensure clustering consistency.\n",
+ "3.\tThe clustered data was used as input for the Random Forest Classifier.\n",
+ "4.\tThe data was split into train and test sets (80/20).\n",
+ "5.\tThe classifier reached an Accuracy Score of 0.96.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Model optimization\n",
+ "\n",
+ "Explain here whether any kind of model optimization was performed.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "The hyperparameters `n_estimators` and `max_depth` were tuned manually for the classifier, although there is still room for improvement through grid search, genetic algorithms, etc.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Drifting and Retraining\n",
+ "\n",
+ "Explain here the steps needed to retrain the model whenever some kind of drift or underperformance is noticed.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "Whenever drift is noticed, lines 40, 41, 47, 56, 60 and 61 of `main.py` need to be uncommented before running the model. Once the run is done, they need to be commented out again. In future versions, this process should be automated.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Foreseen improvements\n",
+ "\n",
+ "If any future improvements are identified during any of the steps, they should be pointed out in this section.\n",
+ "\n",
+ "Example:\n",
+ "\n",
+ "- New features engineered in future versions could enhance clustering or recommendations. For example, it remains to be tested whether there is a type of customer that only pays back at a specific time of the month or week, and who therefore won't show up in the recommendations list when they don't pay.\n",
+ "\n",
+ "- Treating the (many) outliers with other techniques could improve predictions.\n",
+ "\n",
+ "- Gathering data and running the model automatically is something to work on in future versions."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": true,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/log/.gitkeep b/log/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/notebooks_dev/DE_00_extract.ipynb b/notebooks_dev/DE_00_extract.ipynb
new file mode 100644
index 0000000..bfda83b
--- /dev/null
+++ b/notebooks_dev/DE_00_extract.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp de__extract"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules used only inside this notebook, e.g. for documentation and nbdev-related tasks such as testing with assert (not exported to src)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "\n",
+ "#autoreload reloads code from other modules automatically inside the notebook\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages, since `notebooks_dev` is not at the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Engineering Extract\n",
+ "> Module containing data extracting functionalities to be used in the Data Engineering pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand a specific business rule, requested explanation from business team\n",
+ "- 21/02 - business rule is now clearly explained, foo should be run before bar, not the other way around"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a, b):\n",
+ "    '''\n",
+ "    a function that sums a and b\n",
+ "    '''\n",
+ "    return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ "    '''\n",
+ "    a class to apply a function through the apply method\n",
+ "    '''\n",
+ "    def __init__(self, func):\n",
+ "        assert callable(func), 'func should be callable type'\n",
+ "        self.func = func\n",
+ "\n",
+ "    def apply(self, *args, **kwargs):\n",
+ "        return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Section containing code written in src (.py files) and converted back to the notebook using the nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Section to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Section to write tests in the nbdev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a, b):\n",
+ "    '''\n",
+ "    tests if func is returning a sum\n",
+ "    '''\n",
+ "    return func(a, b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/DE_01_transform.ipynb b/notebooks_dev/DE_01_transform.ipynb
new file mode 100644
index 0000000..2e538af
--- /dev/null
+++ b/notebooks_dev/DE_01_transform.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp de__transform"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules used only inside this notebook, e.g. for documentation and nbdev-related tasks such as testing with assert (not exported to src)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "\n",
+ "#autoreload reloads code from other modules automatically inside the notebook\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages, since `notebooks_dev` is not at the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Engineering Transform\n",
+ "> Module containing data transformation functionalities to be used in the Data Engineering pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand a specific business rule, requested explanation from business team\n",
+ "- 21/02 - business rule is now clearly explained, foo should be run before bar, not the other way around"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a, b):\n",
+ "    '''\n",
+ "    a function that sums a and b\n",
+ "    '''\n",
+ "    return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ "    '''\n",
+ "    a class to apply a function through the apply method\n",
+ "    '''\n",
+ "    def __init__(self, func):\n",
+ "        assert callable(func), 'func should be callable type'\n",
+ "        self.func = func\n",
+ "\n",
+ "    def apply(self, *args, **kwargs):\n",
+ "        return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Section containing code written in src (.py files) and converted back to the notebook using the nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Section to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Section to write tests in the nbdev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a, b):\n",
+ "    '''\n",
+ "    tests if func is returning a sum\n",
+ "    '''\n",
+ "    return func(a, b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/DE_02_load.ipynb b/notebooks_dev/DE_02_load.ipynb
new file mode 100644
index 0000000..54d0d6f
--- /dev/null
+++ b/notebooks_dev/DE_02_load.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp de__load"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules used only inside this notebook, e.g. for documentation and nbdev-related tasks such as testing with assert (not exported to src)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "\n",
+ "#autoreload reloads code from other modules automatically inside the notebook\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages, since `notebooks_dev` is not at the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Engineering Load\n",
+ "> Module containing data loading functionalities to be used in the Data Engineering pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand a specific business rule, requested explanation from business team\n",
+ "- 21/02 - business rule is now clearly explained, foo should be run before bar, not the other way around"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a, b):\n",
+ "    '''\n",
+ "    a function that sums a and b\n",
+ "    '''\n",
+ "    return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ "    '''\n",
+ "    a class to apply a function through the apply method\n",
+ "    '''\n",
+ "    def __init__(self, func):\n",
+ "        assert callable(func), 'func should be callable type'\n",
+ "        self.func = func\n",
+ "\n",
+ "    def apply(self, *args, **kwargs):\n",
+ "        return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Section containing code written in src (.py files) and converted back to the notebook using the nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Section to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Section to write tests in the nbdev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a, b):\n",
+ "    '''\n",
+ "    tests if func is returning a sum\n",
+ "    '''\n",
+ "    return func(a, b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/DS_00_load.ipynb b/notebooks_dev/DS_00_load.ipynb
new file mode 100644
index 0000000..14fadc1
--- /dev/null
+++ b/notebooks_dev/DS_00_load.ipynb
@@ -0,0 +1,395 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp ds__load"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules used only inside this notebook, e.g. for documentation and nbdev-related tasks such as testing with assert (not exported to src)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "\n",
+ "#autoreload reloads code from other modules automatically inside the notebook\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages, since `notebooks_dev` is not at the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Science Load\n",
+ "> Module containing data loading functionalities to be used in the Data Science pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand a specific business rule, requested an explanation from the business team\n",
+ "- 21/02 - business rule is now clearly explained: foo should be run before bar, not the other way around"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to the notebook using the nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nb-dev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
+
diff --git a/notebooks_dev/DS_01_preprocess.ipynb b/notebooks_dev/DS_01_preprocess.ipynb
new file mode 100644
index 0000000..03821f3
--- /dev/null
+++ b/notebooks_dev/DS_01_preprocess.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp ds__preprocess"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules used only for documentation, nbdev tooling (such as testing with assert), and general work inside this notebook (not exported to src)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "#autoreload picks up changes to code in other modules without restarting the kernel\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages, since `notebooks_dev` is not in the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Science Preprocess\n",
+ "> Module containing data preprocessing functionalities to be used in the Data Science pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand a specific business rule, requested an explanation from the business team\n",
+ "- 21/02 - business rule is now clearly explained: foo should be run before bar, not the other way around"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to the notebook using the nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nb-dev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/DS_02_build_features.ipynb b/notebooks_dev/DS_02_build_features.ipynb
new file mode 100644
index 0000000..e68d308
--- /dev/null
+++ b/notebooks_dev/DS_02_build_features.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp ds__build_features"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules used only for documentation, nbdev tooling (such as testing with assert), and general work inside this notebook (not exported to src)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "#autoreload picks up changes to code in other modules without restarting the kernel\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages, since `notebooks_dev` is not in the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Science Build Features\n",
+ "> Module containing feature engineering functionalities to be used in the Data Science pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand a specific business rule, requested an explanation from the business team\n",
+ "- 21/02 - business rule is now clearly explained: foo should be run before bar, not the other way around"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to the notebook using the nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nb-dev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/DS_03_modelling.ipynb b/notebooks_dev/DS_03_modelling.ipynb
new file mode 100644
index 0000000..162e52f
--- /dev/null
+++ b/notebooks_dev/DS_03_modelling.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp ds__modelling"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules used only for documentation, nbdev tooling (such as testing with assert), and general work inside this notebook (not exported to src)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "#autoreload picks up changes to code in other modules without restarting the kernel\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages, since `notebooks_dev` is not in the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Science Modelling\n",
+ "> Module containing modelling/model functionalities and classes to be used in the Data Science pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand a specific business rule, requested an explanation from the business team\n",
+ "- 21/02 - business rule is now clearly explained: foo should be run before bar, not the other way around"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to the notebook using the nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nb-dev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/DS_04_validate.ipynb b/notebooks_dev/DS_04_validate.ipynb
new file mode 100644
index 0000000..82677d5
--- /dev/null
+++ b/notebooks_dev/DS_04_validate.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the module in which the `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp ds__validate"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules that are only used in documentation, nbdev related stuff like testing using assert and more generally inside this notebook (not going to src)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "#autoreload makes code edited in other modules get reloaded inside the notebook (kept on its own line: line magics do not strip trailing comments)\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages since `notebooks_dev` is not on the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Science Validate\n",
+ "> Module containing model validation functionalities to be used in the Data Science pipelines"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand specific business rule, requested explanation from business team\n",
+ "- 21/02 - business rule is now clearly explained, foo should be run before bar and not otherwise"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to notebook using nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nbdev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/DS_05_postprocess.ipynb b/notebooks_dev/DS_05_postprocess.ipynb
new file mode 100644
index 0000000..9097686
--- /dev/null
+++ b/notebooks_dev/DS_05_postprocess.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define module in which `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp ds__postprocess"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules that are only used in documentation, nbdev related stuff like testing using assert and more generally inside this notebook (not going to src)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "#autoreload makes code edited in other modules get reloaded inside the notebook (kept on its own line: line magics do not strip trailing comments)\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages since `notebooks_dev` is not on the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Data Science Postprocess\n",
+ "> Module intended for creating model output postprocessing routines, prior to serving or anything downstream of inference, e.g. designing A/B tests."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand specific business rule, requested explanation from business team\n",
+ "- 21/02 - business rule is now clearly explained, foo should be run before bar and not otherwise"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to notebook using nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nbdev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/MLE_00_pipeline_utils.ipynb b/notebooks_dev/MLE_00_pipeline_utils.ipynb
new file mode 100644
index 0000000..09fe9f0
--- /dev/null
+++ b/notebooks_dev/MLE_00_pipeline_utils.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define module in which `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp mle__pipeline_utils"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules that are only used in documentation, nbdev related stuff like testing using assert and more generally inside this notebook (not going to src)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "#autoreload makes code edited in other modules get reloaded inside the notebook (kept on its own line: line magics do not strip trailing comments)\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages since `notebooks_dev` is not on the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Machine Learning Engineering Pipeline Utils\n",
+ "> Module intended to be used by the Machine Learning Engineering team for building pipeline utils or connecting pieces from other DS and DE modules. Still, full pipelines are recommended to live under the pipeline-scripts folder."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand specific business rule, requested explanation from business team\n",
+ "- 21/02 - business rule is now clearly explained, foo should be run before bar and not otherwise"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to notebook using nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nbdev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/notebooks_dev/MLE_01_serving.ipynb b/notebooks_dev/MLE_01_serving.ipynb
new file mode 100644
index 0000000..2053928
--- /dev/null
+++ b/notebooks_dev/MLE_01_serving.ipynb
@@ -0,0 +1,394 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define module in which `#export` tag will save the code in `src`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#default_exp mle__serving"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import modules that are only used in documentation, nbdev related stuff like testing using assert and more generally inside this notebook (not going to src)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.showdoc import *\n",
+ "#autoreload makes code edited in other modules get reloaded inside the notebook (kept on its own line: line magics do not strip trailing comments)\n",
+ "%load_ext autoreload\n",
+ "%autoreload 2\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append('..') #appends project root to path in order to import project packages since `notebooks_dev` is not on the root\n",
+ "\n",
+ "#DO NOT EDIT"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "\n",
+ "#Internal Imports\n",
+ "#imports that are going to be used only during development and are not intended to be loaded inside the generated modules.\n",
+ "#for example: use imported modules to generate graphs for documentation, but lib is unused in actual package\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Machine Learning Engineering Serving\n",
+ "> Module intended to be used by the Machine Learning Engineering team for building serving functions, API helper functions, DB writing queries and everything related to serving models. Still, the final pipelines are recommended to live under the pipeline-scripts folder on the root."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Dev comments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### TODOs - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- [X] TODO: do something\n",
+ "- [ ] TODO: do something else"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notebook History - THIS IS TEMPLATE CONTENT -"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "THIS IS TEMPLATE CONTENT\n",
+ "- 16/02 - developed feature A as requested by business team\n",
+ "- 17/02 - couldn't quite understand specific business rule, requested explanation from business team\n",
+ "- 21/02 - business rule is now clearly explained, foo should be run before bar and not otherwise"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Code session"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### External imports\n",
+ "> imports that are intended to be loaded in the actual modules (going to src) e.g.: module dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "\n",
+ "#import ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < func > THIS IS TEMPLATE CONTENT- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "def func(a,b): \n",
+ " '''\n",
+ " a function that sums a and b\n",
+ " '''\n",
+ " return a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`func` comments and usage examples for documentation:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "func(1,2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### < Class > THIS IS TEMPLATE CONTENT-- "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "class Class:\n",
+ " '''\n",
+ " a class to apply a function through the apply method\n",
+ " '''\n",
+ " def __init__(self, func):\n",
+ " assert callable(func), 'func should be callable type'\n",
+ " self.func = func\n",
+ " \n",
+ " def apply(self, *args, **kwargs):\n",
+ " return self.func(*args, **kwargs)\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`Class` comments and usage examples for documentation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "cls_instance = Class(sum)\n",
+ "\n",
+ "cls_instance.apply([1,2,3,4])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Code created from src\n",
+ "> Session containing code generated in src (.py files) and converted back to notebook using nbdev_update_lib command"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#export\n",
+ "##############\n",
+ "#we highly recommend creating code from the notebook instead\n",
+ "#of creating it in src and running nbdev_update_lib. Still, if you want\n",
+ "#to proceed, please create your new code below this tag before running nbdev_update_lib\n",
+ "##############"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Experiment session\n",
+ "> Session to run the code and test functions and classes generated in this notebook. Helpful for documentation and experimental development.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tests\n",
+ "> Session to write tests in the nbdev fashion (using assert)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def test_1(a,b):\n",
+ " '''\n",
+ " tests if func is returning a sum\n",
+ " '''\n",
+ " return func(a,b) == a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Passed!\n"
+ ]
+ }
+ ],
+ "source": [
+ "assert test_1(1,2)\n",
+ "print('Passed!')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Export -"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#hide\n",
+ "from nbdev.export import notebook2script\n",
+ "notebook2script()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.6"
+ },
+ "latex_envs": {
+ "LaTeX_envs_menu_present": true,
+ "autoclose": false,
+ "autocomplete": true,
+ "bibliofile": "biblio.bib",
+ "cite_by": "apalike",
+ "current_citInitial": 1,
+ "eqLabelWithNumbers": true,
+ "eqNumInitial": 1,
+ "hotkeys": {
+ "equation": "Ctrl-E",
+ "itemize": "Ctrl-I"
+ },
+ "labels_anchors": false,
+ "latex_user_defs": false,
+ "report_style_numbering": false,
+ "user_envs_cfg": false
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {
+ "height": "calc(100% - 180px)",
+ "left": "10px",
+ "top": "150px",
+ "width": "250.319px"
+ },
+ "toc_section_display": true,
+ "toc_window_display": true
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/pipeline-scripts/Setup.py b/pipeline-scripts/Setup.py
new file mode 100644
index 0000000..e69de29
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..e69de29
diff --git a/settings.ini b/settings.ini
new file mode 100644
index 0000000..19aac76
--- /dev/null
+++ b/settings.ini
@@ -0,0 +1,77 @@
+[DEFAULT]
+#### All sections below are required unless otherwise specified
+host = github
+lib_name = src
+repo_name = {repo_name}
+user = {user}
+author = {author}
+author_email = {author_email}
+description = {description}
+copyright = {copyright}
+branch = {branch}
+version = 0.0.1
+min_python = 3.6
+audience = Developers
+language = English
+#### Anything shown as '%(...)s' is substituted with that setting automatically
+doc_host = https://%(user)s.github.io
+doc_baseurl = /%(repo_name)s/%(lib_name)s/
+lib_path = %(lib_name)s
+title = %(repo_name)s
+#For Enterprise Git pages use:
+#doc_host = https://pages.github.%(company_name)s.com.
+git_url = https://github.com/%(user)s/%(lib_name)s/tree/%(branch)s/
+
+
+#### Add licenses and see current list in `setup.py`
+license = apache2
+#### From 1-7: Planning Pre-Alpha Alpha Beta Production Mature Inactive
+status = 2
+#### Optional. Same format as setuptools requirements
+# requirements =
+#### Optional. Same format as setuptools console_scripts
+# console_scripts =
+#### Optional. Same format as setuptools dependency-links
+# dep_links =
+
+
+#### Change to, e.g. "nbs", to put your notebooks in nbs dir instead of repo root
+nbs_path = notebooks_dev
+doc_path = docs
+
+#doc_baseurl = /%(lib_name)s/
+# For Enterprise Github pages docs use:
+# doc_baseurl = /%(repo_name)s/%(lib_name)s/
+
+
+#### For Enterprise Git add variable repo_name and company name
+# company_name = nike
+keywords = some keywords
+
+#### Set to True if you want to create a fancier sidebar.json than the default
+custom_sidebar = False
+
+####
+# You probably won't need to change anything under here,
+# unless you have some special requirements
+####
+
+#### Whether to look for library notebooks recursively in the `nbs_path` dir
+recursive = True
+
+#### Anything shown as '%(...)s' is substituted with that setting automatically
+# For Enterprise Github use:
+#git_url = https://github.%(company_name)s.com/%(repo_name)s/%(lib_name)s/tree/%(branch)s/
+
+
+#Optional advanced parameters
+#Monospace docstrings: adds tags around the doc strings, preserving newlines/indentation.
+#monospace_docstrings = False
+#Test flags: introduce here the test flags you want to use separated by |
+#tst_flags =
+#Custom sidebar: customize sidebar.json yourself for advanced sidebars (False/True)
+#custom_sidebar =
+#Cell spacing: if you want cell blocks in code separated by more than one new line
+#cell_spacing =
+#Custom jekyll styles: if you want more jekyll styles than tip/important/warning, set them here
+#jekyll_styles = note,warning,tip,important
diff --git a/setup.py b/setup.py
new file mode 100644
index 0000000..de97dee
--- /dev/null
+++ b/setup.py
@@ -0,0 +1,47 @@
+from pkg_resources import parse_version
+from configparser import ConfigParser
+import setuptools
+assert parse_version(setuptools.__version__)>=parse_version('36.2')
+
+# note: all settings are in settings.ini; edit there, not here
+config = ConfigParser(delimiters=['='])
+config.read('settings.ini')
+cfg = config['DEFAULT']
+
+cfg_keys = 'version description keywords author author_email'.split()
+expected = cfg_keys + "lib_name user branch license status min_python audience language".split()
+for o in expected: assert o in cfg, "missing expected setting: {}".format(o)
+setup_cfg = {o:cfg[o] for o in cfg_keys}
+
+licenses = {
+ 'apache2': ('Apache Software License 2.0','OSI Approved :: Apache Software License'),
+}
+statuses = [ '1 - Planning', '2 - Pre-Alpha', '3 - Alpha',
+ '4 - Beta', '5 - Production/Stable', '6 - Mature', '7 - Inactive' ]
+py_versions = '2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8'.split()
+
+requirements = cfg.get('requirements','').split()
+lic = licenses[cfg['license']]
+min_python = cfg['min_python']
+
+setuptools.setup(
+ name = cfg['lib_name'],
+ license = lic[0],
+ classifiers = [
+        'Development Status :: ' + statuses[int(cfg['status']) - 1],  # status is 1-based (1-7)
+ 'Intended Audience :: ' + cfg['audience'].title(),
+ 'License :: ' + lic[1],
+ 'Natural Language :: ' + cfg['language'].title(),
+ ] + ['Programming Language :: Python :: '+o for o in py_versions[py_versions.index(min_python):]],
+ url = cfg['git_url'],
+ packages = setuptools.find_packages(),
+ include_package_data = True,
+ install_requires = requirements,
+ dependency_links = cfg.get('dep_links','').split(),
+ python_requires = '>=' + cfg['min_python'],
+ long_description = open('README.md').read(),
+ long_description_content_type = 'text/markdown',
+ zip_safe = False,
+ entry_points = { 'console_scripts': cfg.get('console_scripts','').split() },
+ **setup_cfg)
+
diff --git a/src/.gitkeep b/src/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/src/__init__.py b/src/__init__.py
new file mode 100644
index 0000000..f102a9c
--- /dev/null
+++ b/src/__init__.py
@@ -0,0 +1 @@
+__version__ = "0.0.1"
diff --git a/src/_nbdev.py b/src/_nbdev.py
new file mode 100644
index 0000000..52203cc
--- /dev/null
+++ b/src/_nbdev.py
@@ -0,0 +1,24 @@
+# AUTOGENERATED BY NBDEV! DO NOT EDIT!
+
+__all__ = ["index", "modules", "custom_doc_links", "git_url"]
+
+index = {"func": "MLE_01_serving.ipynb",
+ "Class": "MLE_01_serving.ipynb"}
+
+modules = ["de__extract.py",
+ "de__transform.py",
+ "de__load.py",
+ "ds__load.py",
+ "ds__preprocess.py",
+ "ds__build_features.py",
+ "ds__modelling.py",
+ "ds__validate.py",
+ "ds__postprocess.py",
+ "mle__pipeline_utils.py",
+ "mle__serving.py"]
+
+doc_url = "https://{user}.github.io/{repo_name}/src/"
+
+git_url = "https://github.com/{user}/src/tree/{branch}/"
+
+def custom_doc_links(name): return None
diff --git a/src/de__extract.py b/src/de__extract.py
new file mode 100644
index 0000000..8555636
--- /dev/null
+++ b/src/de__extract.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DE_00_extract.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/de__load.py b/src/de__load.py
new file mode 100644
index 0000000..153b172
--- /dev/null
+++ b/src/de__load.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DE_02_load.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/de__transform.py b/src/de__transform.py
new file mode 100644
index 0000000..694e9be
--- /dev/null
+++ b/src/de__transform.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DE_01_transform.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/ds__build_features.py b/src/ds__build_features.py
new file mode 100644
index 0000000..7d54b54
--- /dev/null
+++ b/src/ds__build_features.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DS_02_build_features.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/ds__load.py b/src/ds__load.py
new file mode 100644
index 0000000..4c358b9
--- /dev/null
+++ b/src/ds__load.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DS_00_load.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/ds__modelling.py b/src/ds__modelling.py
new file mode 100644
index 0000000..782ec69
--- /dev/null
+++ b/src/ds__modelling.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DS_03_modelling.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/ds__postprocess.py b/src/ds__postprocess.py
new file mode 100644
index 0000000..1924d99
--- /dev/null
+++ b/src/ds__postprocess.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DS_05_postprocess.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/ds__preprocess.py b/src/ds__preprocess.py
new file mode 100644
index 0000000..ebd5dd8
--- /dev/null
+++ b/src/ds__preprocess.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DS_01_preprocess.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/ds__validate.py b/src/ds__validate.py
new file mode 100644
index 0000000..7b6a283
--- /dev/null
+++ b/src/ds__validate.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/DS_04_validate.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/mle__pipeline_utils.py b/src/mle__pipeline_utils.py
new file mode 100644
index 0000000..24f2080
--- /dev/null
+++ b/src/mle__pipeline_utils.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/MLE_00_pipeline_utils.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/src/mle__serving.py b/src/mle__serving.py
new file mode 100644
index 0000000..5018042
--- /dev/null
+++ b/src/mle__serving.py
@@ -0,0 +1,34 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: notebooks_dev/MLE_01_serving.ipynb (unless otherwise specified).
+
+__all__ = ['func', 'Class']
+
+# Cell
+
+#import ...
+
+# Cell
+def func(a,b):
+    '''
+    a function that sums a and b
+    '''
+    return a + b
+
+# Cell
+class Class:
+    '''
+    a class to apply a function through the apply method
+    '''
+    def __init__(self, func):
+        assert callable(func), 'func should be callable type'
+        self.func = func
+
+    def apply(self, *args, **kwargs):
+        return self.func(*args, **kwargs)
+
+
+# Cell
+##############
+#we highly recommend creating code from the notebook instead
+#of creating it from src and running nbdev_update_lib. Still, if you want
+#to proceed, please create your new code below this tag before running nbdev_update_lib
+##############
\ No newline at end of file
diff --git a/utf8encode.py b/utf8encode.py
new file mode 100644
index 0000000..bbf8161
--- /dev/null
+++ b/utf8encode.py
@@ -0,0 +1,33 @@
+import glob
+
+def utf8encode(file_path):
+    '''
+    re-encodes a file as utf-8 in place, replacing undecodable bytes
+    '''
+    with open(file_path, 'r+', encoding='utf-8', errors='replace') as file:
+        content = file.read()
+        file.seek(0)  # rewind first, otherwise the content is appended to itself
+        file.write(content)
+        file.truncate()
+
+    print(f'Converted {file_path} to utf-8 (unicode)')
+
+def get_html_paths(roots):
+    '''
+    gets the paths of html files recursively, starting from each root in roots (a list of roots)
+    '''
+    assert isinstance(roots, (tuple, list, set))
+    all_files = []
+    for root in roots:
+        # glob.glob always returns a list, so it can be extended directly
+        all_files += glob.glob(f'{root}/**/*.html', recursive=True)
+
+    return all_files
+
+if __name__ == '__main__':
+    '''
+    converts .html files in /docs to utf-8 and unicode charset
+    '''
+    paths = get_html_paths(['docs'])
+    for path in paths:
+        utf8encode(path)