[SETL-5] Implement Json Extract and Load #11

Open
wants to merge 1 commit into
base: develop

Conversation

furkanvarol
Collaborator

Implemented JsonExtract, which extracts JSON file(s) as org.apache.spark.sql.Rows, and JsonLoad, which loads org.apache.spark.sql.Rows as JSON-formatted file(s). Both mechanisms assume that each line contains (or will contain) one JSON object.

@msimav

msimav commented Aug 18, 2016

JSON Extract is a compound operation consisting of a file extract followed by a JSON transform, and JSON Load is a compound operation too. What are the advantages of having those operations as separate classes instead of just combining what we already have?

@furkanvarol
Collaborator Author

Do you mean we should implement this functionality with what we already have, or that we should not have implemented it at all?

@msimav

msimav commented Aug 18, 2016

I mean after #12 is merged we will have RowToJsonTransform and JsonToRowTransform classes which we can combine with FileExtract and FileLoad classes to achieve the behaviours of JsonExtract and JsonLoad classes.
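
To illustrate the idea, here is a minimal sketch of composing an extract with a transform instead of writing a dedicated class. Everything in it is an assumption for illustration: the `Extract` and `Transform` traits are simplified stand-ins (the project's real signatures may differ), `andThen` is a hypothetical combinator, `LinesExtract` stands in for FileExtract, and `ToyJsonToMap` stands in for JsonToRowTransform using a toy regex rather than a real JSON parser or Spark `Row`s.

```scala
// Simplified stand-in for the project's Extract trait (assumed shape).
trait Extract[A] { self =>
  def extract(): Seq[A]

  // Hypothetical combinator: chain a transform onto an extract.
  // The result is itself an Extract, so no new named class is needed.
  def andThen[B](t: Transform[A, B]): Extract[B] = new Extract[B] {
    def extract(): Seq[B] = t.transform(self.extract())
  }
}

// Simplified stand-in for the project's Transform trait (assumed shape).
trait Transform[A, B] {
  def transform(in: Seq[A]): Seq[B]
}

// Stand-in for FileExtract: yields lines from memory instead of from a file.
final case class LinesExtract(lines: Seq[String]) extends Extract[String] {
  def extract(): Seq[String] = lines
}

// Stand-in for JsonToRowTransform: handles only flat objects with
// string values, one object per line. Not a real JSON parser.
object ToyJsonToMap extends Transform[String, Map[String, String]] {
  private val Field = "\"(\\w+)\"\\s*:\\s*\"([^\"]*)\"".r

  def transform(in: Seq[String]): Seq[Map[String, String]] =
    in.map { line =>
      Field.findAllMatchIn(line).map(m => m.group(1) -> m.group(2)).toMap
    }
}
```

With these stand-ins, `LinesExtract(lines).andThen(ToyJsonToMap)` plays the role of JsonExtract without a dedicated class. The load side could be handled the same way, by applying a transform before a load.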

@furkanvarol
Collaborator Author

I see. Okay, I will change it after #12 is merged.

* @param path File path(s).
* In order to supply multiple files, you can use wildcards or give multiple paths separated by commas.
*/
final case class JsonExtract(path: String) extends Extract[Row] {
Those primitive operations must be composable without creating new classes. This class is not only meaningless but also an antipattern that we must avoid as library authors.
