phase#1: input/output #1

Open · 34 of 43 tasks
Soltaniant opened this issue Aug 20, 2022 · 0 comments
Labels: roadmap (steps to achieve a phase in details)

Soltaniant commented Aug 20, 2022
Phase#1: Input/Output

Goals: we have a single-page website where you can add a source node and a destination node to a pipeline and connect them together. In a source node, you upload a file as your dataset; in a destination node, you choose the file format in which you want the final result to be provided. After creating this basic pipeline, you can execute it and download the result from the destination node.

Smaller Steps:

  • Upload a CSV file to our database, with its content correctly imported.
  • Execute the generated pipeline (and check the correctness of the result).
  • Download the exact imported file, in the same CSV format.
  • Testing the input/output process, either in this project or in a separate one, is needed. Now and then we will reuse this test for newer file formats and for checking the correctness of the output; a round-trip sketch follows this list.
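
The first three steps amount to a round-trip guarantee: what goes in as CSV must come back out as the same CSV. A minimal xUnit sketch of that check, assuming a hypothetical CsvExporter as the inverse of IParser.Parse (neither signature is fixed yet):

```csharp
using System.Data;
using Xunit;

public class RoundTripTests
{
    [Fact]
    public void ImportedCsvDownloadsUnchanged()
    {
        var original = "id,name\n1,Alice\n2,Bob\n";

        // Parse into the common DataTable representation (IParser.Parse).
        DataTable table = new CSVParser().Parse(original);

        // CsvExporter is a hypothetical inverse of Parse; in the real
        // pipeline this round trip also passes through the database.
        string exported = new CsvExporter().Export(table);

        Assert.Equal(original, exported);
    }
}
```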

More Ideas:

You are free to add your ideas without considering whether they are possible or not. Just let your mind fly!

  • Add support for multiple outputs
  • Add code documentation for architectural topics
  • Convert uploaded data to JSON and then to a DataTable (it seems that a JSON file can include the data types as well); a sketch follows this list.
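
On that last idea: Json.NET can already deserialize a JSON array of records straight into a System.Data.DataTable, inferring column types from the values. A small sketch:

```csharp
using System.Data;
using Newtonsoft.Json;

class JsonToDataTableDemo
{
    static void Main()
    {
        // Each object becomes a row; property names become columns,
        // and Json.NET infers the column types from the values.
        string json = @"[
            { ""id"": 1, ""name"": ""Alice"", ""score"": 4.5 },
            { ""id"": 2, ""name"": ""Bob"",   ""score"": 3.9 }
        ]";

        DataTable table = JsonConvert.DeserializeObject<DataTable>(json);

        foreach (DataColumn col in table.Columns)
            System.Console.WriteLine($"{col.ColumnName}: {col.DataType}");
    }
}
```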

Further Tasks (Nice To Have):

  • Handle errors and exceptions for all APIs, both in the frontend and the backend.
  • Support more file formats as input:
    • JSON
    • Text
    • XML
  • Support more file formats as output:
    • JSON
    • Text
    • XML
  • Review and reformat our teammates' code.
  • Testing process: for each phase we can also open a testing issue, where we thoroughly test the processes in different situations and write down the scenarios as well. Like a todo list (in Markdown), we can check off whether each test scenario works or not.
  • Merging the frontend and backend results for testing might need one person from each side to check for correctness and failures.

Conclusion for first phase:

It would be nice to have a meeting to discuss the pros and cons of the current phase and review the plans for the next one.

Implementation

Interfaces and abstracts

  • Node

    General abstract class for all nodes. We can also use the template method pattern for default functions.

    • Execute(ExecutionType : enum, Nodes : Dictionary) => string : returns the node's query string
    • Id : string
  • ProcessNode : Node

    All functional nodes that manipulate the source data live here. Each one has its own execution method and some private properties.

    • previousNodes : list of Ids
  • IParser

    This parser converts raw source data into a single common data type suitable for importing into the database.

    • Parse(rawData) : DataTable
  • IDatabase
    • ImportDataTable(table : DataTable, tableName : string)

      The details of this method are not clear yet, though the functionality is. (The implementer might need to check and think more about it.)

    • RunQuery(queryString : string) => TempTable
    • CreateTable(name : string) => string
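
Spelled out in C#, the interfaces above might look roughly like this; a sketch in which only the listed names and signatures come from the spec (TempTable is still undefined):

```csharp
using System.Collections.Generic;
using System.Data;

public abstract class Node
{
    public string Id { get; set; }

    // Returns the query string this node contributes to the pipeline.
    public abstract string Execute(ExecutionType type, Dictionary<string, Node> nodes);
}

public abstract class ProcessNode : Node
{
    // Ids of the nodes feeding into this one.
    public List<string> PreviousNodes { get; set; } = new List<string>();
}

public interface IParser
{
    // Converts raw source data into a DataTable ready for import.
    DataTable Parse(string rawData);
}

public interface IDatabase
{
    void ImportDataTable(DataTable table, string tableName);
    TempTable RunQuery(string queryString); // TempTable is not yet specified
    string CreateTable(string name);
}
```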

Classes

  • SourceNode : Node

    • tableName : string
  • DestinationNode : Node

    • previousNodes : list of Ids
    • tableName
  • Pipeline

    After execution, the output result is stored in a database table whose name comes from the destination node. (While creating a destination node, an API is called and a table is created; the table name is then returned in the response and stored in the newly created node on the frontend.)

    • Nodes : Dictionary
    • Execute(Nodes : Dictionary) : Dictionary (id -> output)
    • Preview()
  • CSVParser : IParser (sketched after this list)

  • PostgresqlDatabase : IDatabase

  • CustomDeserializer (json to Pipeline)

    While deserializing the JSON received from the API, we need to convert it to the Pipeline class; we might need such a class here, though it might not be needed if a simpler method exists.
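
As one concrete starting point from the list above, CSVParser.Parse could begin as a naive line splitter. A sketch only: it ignores quoted fields and keeps every column as a string:

```csharp
using System;
using System.Data;

public class CSVParser : IParser
{
    public DataTable Parse(string rawData)
    {
        var table = new DataTable();
        var lines = rawData.Split(new[] { "\r\n", "\n" },
                                  StringSplitOptions.RemoveEmptyEntries);

        // The first line holds the column headers.
        foreach (var header in lines[0].Split(','))
            table.Columns.Add(header.Trim());

        // Remaining lines become rows; every value stays a string for now.
        for (int i = 1; i < lines.Length; i++)
            table.Rows.Add(lines[i].Split(','));

        return table;
    }
}
```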

Services and Controllers:

  • DataInventoryService: handles data-related requests for uploading data or connection strings
    • MapToParser(info) => IParser
    • AddSource(dataset : File, datasetName : string) => tableName : string
    • AddDestination(datasetName : string) => tableName : string
  • DataInventoryController: routes the requests and calls the appropriate functions (sketched after this list)
    • AddSource(dataset : File, datasetName : string)
    • AddDestination(datasetName : string)
  • PipelineService: handles all pipeline-related services, from executing to previewing
    • Preview(pipeline : json, id : string) => after/before : Tuple
    • Execute(pipeline : json) => Dictionary (id -> output result)
  • PipelineController: routes the requests and calls the appropriate functions
    • Preview(pipeline : json, id : string)
    • Execute(pipeline : json)
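
In ASP.NET Core terms these controllers would stay thin wrappers over the services. A sketch of DataInventoryController with the service injected; the routes and attributes here are assumptions, not decisions:

```csharp
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/[controller]")]
public class DataInventoryController : ControllerBase
{
    // Concrete service type from the Services list above (shape assumed).
    private readonly DataInventoryService _service;

    public DataInventoryController(DataInventoryService service) => _service = service;

    // Receives the uploaded dataset and returns the created table's name.
    [HttpPost("source")]
    public ActionResult<string> AddSource(IFormFile dataset, string datasetName)
        => _service.AddSource(dataset, datasetName);

    // Creates the destination table and returns its name.
    [HttpPost("destination")]
    public ActionResult<string> AddDestination([FromBody] string datasetName)
        => _service.AddDestination(datasetName);
}
```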

Enums

  • ExecutionType

    A pipeline can be executed under many different conditions, and we have a separate type for each. Here are some of them:

    • FullExecution
    • Heading
    • Preview
    • Validation
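
In code this is a plain enum, taken directly from the list above:

```csharp
// One member per execution condition; more can be added in later phases.
public enum ExecutionType
{
    FullExecution,
    Heading,
    Preview,
    Validation
}
```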

API Calls:

  • Execute([from body] pipeline : Pipeline) => Dictionary (id : string -> tablename : string)
  • AddSourceByFile() => tableName : string

    The file content is attached to the request body for this API; the format and the filename are figured out by the server itself. For uploading a file and its API in Angular, see this link. This API returns the name of the database table where the uploaded data is stored; the source node must store this name for further use!

  • AddDestination([from body] datasetName : string) => tableName : string
  • Download(tableName : string, fileFormat : string)
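
The Download call maps naturally onto a file-returning action. A sketch, assuming a hypothetical TableExporter that renders a stored table in the requested format (phase 1 only needs CSV):

```csharp
using Microsoft.AspNetCore.Mvc;

public class DownloadController : ControllerBase
{
    // Streams the stored result table back in the requested file format.
    [HttpGet("download")]
    public IActionResult Download(string tableName, string fileFormat)
    {
        // TableExporter is hypothetical; phase 1 only supports "csv".
        byte[] content = TableExporter.Export(tableName, fileFormat);
        return File(content, "text/csv", $"{tableName}.{fileFormat}");
    }
}
```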