Phase#1: Input/Output
Goals: We have a single-page website where you can add a source node and a destination node to a pipeline and connect them together. In a source node you can upload a file as your dataset, and in a destination node you can choose the file format in which you want the final result to be provided. After creating the basic pipeline, you can execute it and download the result from the destination node.
Smaller Steps:
Upload a CSV file to our database, with its content correctly imported.
Execute the generated pipeline (and check the correctness of the result).
Download the exact imported file, in the same CSV format.
Testing the input/output process, either in this project or in a separate project, is needed. Now and then we will need to reuse this test for newer file formats and for checking the correctness of the output; a possible test sketch follows this list.
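A minimal sketch of what such a round-trip test could look like, assuming xUnit; CsvImporter and CsvExporter are hypothetical placeholder names for whichever components end up doing the import and export:

```csharp
using System.IO;
using Xunit;

public class CsvRoundTripTests
{
    [Fact]
    public void ImportedCsvCanBeDownloadedUnchanged()
    {
        // fixtures/sample.csv is an arbitrary test file checked into the repository.
        string original = File.ReadAllText("fixtures/sample.csv");

        // Hypothetical helpers: import the file into the database,
        // then export the resulting table back to CSV.
        string tableName = CsvImporter.Import(original);
        string exported  = CsvExporter.Export(tableName);

        // The downloaded file should match the uploaded one exactly.
        Assert.Equal(original, exported);
    }
}
```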
More Ideas:
You are free to add your own ideas, without considering whether they are possible or not. Just let your mind fly!
Add support for multiple outputs.
Add code documentation for architectural subjects.
Convert uploaded data to JSON and then to a DataTable (it seems the JSON representation includes the data types as well); see the sketch after this list.
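For the JSON-then-DataTable idea, Newtonsoft.Json already ships a DataTable converter, so a first experiment could be as small as the following sketch (the sample payload is made up):

```csharp
using System.Data;
using Newtonsoft.Json;

public static class JsonToDataTable
{
    public static DataTable Convert(string json)
    {
        // Newtonsoft's built-in DataTableConverter infers column types
        // from the JSON values, which matches the "data types included" note above.
        return JsonConvert.DeserializeObject<DataTable>(json);
    }
}

// Usage with a made-up payload:
// var table = JsonToDataTable.Convert("[{\"Name\":\"Ada\",\"Age\":36},{\"Name\":\"Alan\",\"Age\":41}]");
```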
Further Tasks (Nice To Have):
Handle errors and exceptions for all APIs, both on the frontend and the backend.
Support more file formats as input:
JSON
Text
XML
Support more file formats as output:
JSON
Text
XML
Review and reformat the code of our teammates.
Testing process: for each phase we can also have a testing issue, where we thoroughly test the processes in different situations and write down the scenarios as well. Like a to-do list (in Markdown), we can check off whether each test scenario is working or not.
Merging the results of the frontend and the backend for testing might need a person from each side to check correctness and failures.
Conclusion for first phase:
It would be nice to have a meeting to think about the pros and cons of the current phase and review the plans for the next phase.
Implementation
Interfaces and abstracts
Node
The general abstract class of all nodes. We can also use template methods for default functions. (A sketch of these abstractions follows this list.)
Execute(ExecutionType : enum, Nodes : Dictionary) => string : returns the string query of the node
Id : string
ProcessNode : Node
All functional nodes that manipulate the source data live here. Each one has its own execution method with some private properties.
previousNodes : list of Ids
IParser
This parser is used for converting raw source data into a single unified data type suitable for importing into the database.
The details of this method are not clear yet, though the functionality is. (The implementer might need to check and think more about it.)
IDatabase
RunQuery(queryString : string) => TempTable
CreateTable(name : string) => string
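A minimal C# sketch of how these abstractions could fit together. The names follow the list above; the namespaces, the exact signatures, and the Parse method on IParser are assumptions rather than settled design:

```csharp
using System.Collections.Generic;
using System.Data;
using System.IO;

// Possible execution modes; the exact members are an assumption.
public enum ExecutionType { Execute, Preview }

// General abstract class of all nodes.
public abstract class Node
{
    public string Id { get; set; }

    // Returns the string query of the node.
    public abstract string Execute(ExecutionType executionType, Dictionary<string, Node> nodes);
}

// Base class for all functional nodes that manipulate the source data.
public abstract class ProcessNode : Node
{
    // Ids of the nodes whose output this node consumes.
    public List<string> PreviousNodes { get; set; } = new List<string>();
}

// Converts raw source data into a single unified representation
// suitable for importing into the database; the signature is a guess.
public interface IParser
{
    DataTable Parse(Stream rawData);
}

// Database abstraction behind PostgresqlDatabase (see the Classes section).
public interface IDatabase
{
    TempTable RunQuery(string queryString);
    string CreateTable(string name);
}

// Placeholder for the temporary-table handle returned by RunQuery.
public class TempTable
{
    public string Name { get; set; }
}
```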
Classes
SourceNode : INode
tableName : string
DestinationNode : INode
previousNodes : list of Ids
tableName : string
Pipeline
After execution, the output result is stored in a table in the database. The table name is given by the destination node. (While creating a destination node, an API is called and a table is created; the table name is then returned in the response, to be stored in the newly created node on the frontend.)
Nodes : Dictionary
Execute(Nodes : Dictionary) : Dictionary (id -> output)
Preview()
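A rough sketch of Pipeline built on the abstractions above; the execution order and the return shape (node id mapped to the produced output) are assumptions. A real implementation would first resolve previousNodes into a topological order:

```csharp
using System.Collections.Generic;
using System.Linq;

public class Pipeline
{
    public Dictionary<string, Node> Nodes { get; set; } = new Dictionary<string, Node>();

    // Executes every node and maps each node id to the query/output it produced.
    public Dictionary<string, string> Execute(Dictionary<string, Node> nodes)
    {
        var outputs = new Dictionary<string, string>();
        foreach (var node in nodes.Values)
        {
            outputs[node.Id] = node.Execute(ExecutionType.Execute, nodes);
        }
        return outputs;
    }

    // Same idea as Execute, but asks each node for a preview instead of the full result.
    public Dictionary<string, string> Preview()
    {
        return Nodes.Values.ToDictionary(n => n.Id, n => n.Execute(ExecutionType.Preview, Nodes));
    }
}
```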
CSVParser : IParser
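A naive CSVParser sketch under the assumed IParser signature above: the first line is treated as headers, cells are comma-separated, and quoting/escaping is ignored, so a real implementation would likely use a proper CSV library:

```csharp
using System.Data;
using System.IO;

public class CSVParser : IParser
{
    public DataTable Parse(Stream rawData)
    {
        var table = new DataTable();
        using var reader = new StreamReader(rawData);

        string headerLine = reader.ReadLine();
        if (headerLine == null) return table;        // empty file -> empty table

        foreach (var column in headerLine.Split(','))
            table.Columns.Add(column.Trim());

        string line;
        while ((line = reader.ReadLine()) != null)
            table.Rows.Add(line.Split(','));          // every cell is kept as a string

        return table;
    }
}
```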
PostgresqlDatabase : IDatabase
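A hedged sketch of PostgresqlDatabase against the assumed IDatabase interface, using Npgsql; the connection handling, SQL, and the temporary-table naming are illustrative only:

```csharp
using Npgsql;

public class PostgresqlDatabase : IDatabase
{
    private readonly string _connectionString;

    public PostgresqlDatabase(string connectionString) => _connectionString = connectionString;

    public TempTable RunQuery(string queryString)
    {
        // A fuller implementation would materialize the query into a temporary table
        // and return its handle; here we only run the statement.
        using var connection = new NpgsqlConnection(_connectionString);
        connection.Open();
        using var command = new NpgsqlCommand(queryString, connection);
        command.ExecuteNonQuery();
        return new TempTable { Name = "temp_result" };  // placeholder name
    }

    public string CreateTable(string name)
    {
        using var connection = new NpgsqlConnection(_connectionString);
        connection.Open();
        // Table names cannot be parameterized, so `name` must be validated upstream.
        using var command = new NpgsqlCommand($"CREATE TABLE \"{name}\" ()", connection);
        command.ExecuteNonQuery();
        return name;
    }
}
```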
CustomDeserializer (json to Pipeline)
While deserializing the JSON received from the API, we need to convert it to the Pipeline class, and we might need such a class here, though it may not be needed if a simpler method exists. A possible sketch follows.
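One simple shape this could take, assuming the frontend sends the nodes as a JSON array with a type discriminator; the NodeDto wire format is made up, and SourceNode/DestinationNode are assumed to be concrete Node implementations with the properties listed above:

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical wire format for a node; the real JSON sent by the frontend may differ.
public class NodeDto
{
    public string Id { get; set; }
    public string Type { get; set; }            // e.g. "source" or "destination"
    public string TableName { get; set; }
    public List<string> PreviousNodes { get; set; }
}

public static class CustomDeserializer
{
    // Turns the received JSON into a Pipeline by mapping each DTO onto a concrete node type.
    public static Pipeline FromJson(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        var dtos = JsonSerializer.Deserialize<List<NodeDto>>(json, options) ?? new List<NodeDto>();

        var pipeline = new Pipeline();
        foreach (var dto in dtos)
        {
            Node node;
            if (dto.Type == "source")
                node = new SourceNode { Id = dto.Id, TableName = dto.TableName };
            else
                node = new DestinationNode { Id = dto.Id, TableName = dto.TableName, PreviousNodes = dto.PreviousNodes };

            pipeline.Nodes[node.Id] = node;
        }
        return pipeline;
    }
}
```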
Services and Controllers:
DataInventoryService: handles data-related requests for uploading data or connection strings.
The file content is attached to the request body for this API; the format and the filename are figured out by the server itself. For uploading a file and its API in Angular, see this link. This API returns the name of the database table where the uploaded data is stored; the source node must store this name for further use!
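A sketch of what the upload endpoint behind this service might look like in ASP.NET Core. The route, the multipart/IFormFile binding, and the IDataInventoryService contract are assumptions; the behaviour (upload a file, get back the table name for the source node to keep) is taken from the description above:

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/data-inventory")]
public class DataInventoryController : ControllerBase
{
    private readonly IDataInventoryService _service;

    public DataInventoryController(IDataInventoryService service) => _service = service;

    // The file content travels in the request body; the server works out
    // the format and the filename from the upload itself.
    [HttpPost("upload")]
    public async Task<ActionResult<string>> Upload(IFormFile file)
    {
        if (file == null || file.Length == 0)
            return BadRequest("No file was uploaded.");

        // Returns the name of the database table where the data was stored;
        // the frontend keeps this name inside the source node.
        string tableName = await _service.ImportAsync(file.OpenReadStream(), file.FileName);
        return Ok(tableName);
    }
}

// Assumed service contract; not part of the original spec.
public interface IDataInventoryService
{
    Task<string> ImportAsync(System.IO.Stream content, string fileName);
}
```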
Enums
API Calls: