Druid Query Clarification for Program Dashboard #26
Unanswered
Shakthieshwari
asked this question in
Documentation
Replies: 2 comments 2 replies
-
@sowmya-dixit @anandp504 can you please help us out here ? |
Beta Was this translation helpful? Give feedback.
1 reply
-
@sowmya-dixit @anandp504 We also tried postAggregations with type = expression
This Works in normal druid query, but DruidQueryProcessingModel doesnot supports this. Please help us out here Thanks |
Beta Was this translation helpful? Give feedback.
1 reply
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Hi Team,
As part of 5.0 release https://project-sunbird.atlassian.net/browse/SB-30327 We need to enable the survey status report on the program dashboard as csv.
We have a druid datasource for which data would be in this format
https://docs.google.com/spreadsheets/d/1LJ3iZLXkG5QHSK2u7dEeA8nVfBfibkKg2QRYdRhBysQ/edit?usp=sharing
The Requirement is, we wanted the current/max status of each submission id.
We can't give submission_status column in the dimension of druid query, as it would cause an issue by creating multiple rows for each submission id with difference status.(S1-> started, S1-> completed)
As we wanted max , we have calculated it via a aggregator. Attaching the druid query below what we tried ...
{ "id": "ekstep.analytics.dataset.add", "ver": "1.0", "ts": "2016-12-07T12:40:40+05:30", "params": { "msgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341" }, "request": {"dataset":"druid-dataset","datasetSubId":"ml-survey-status-detail-exhaust","datasetConfig":{"type":"ml-survey-status-detail-exhaust","params":{"programId":"program-1","solutionId":"solution-1"}},"datasetType":"druid","visibility":"private","version":"v1","authorizedRoles":["PROGRAM_MANAGER","PROGRAM_DESIGNER"],"validationJson":{"type":"object","properties":{"tag":{"id":"http://api.ekstep.org/dataexhaust/request/tag","type":"string"},"dataset":{"id":"http://api.ekstep.org/dataexhaust/request/dataset","type":"string"},"datasetSubId":{"id":"http://api.ekstep.org/dataexhaust/request/datasetSubId","type":"string"},"requestedBy":{"id":"http://api.ekstep.org/dataexhaust/request/requestedBy","type":"string"},"encryptionKey":{"id":"http://api.ekstep.org/dataexhaust/request/encryptionKey","type":"string"},"datasetConfig":{"id":"http://api.ekstep.org/dataexhaust/request/datasetConfig","type":"object"}},"required":["tag","dataset","datasetConfig"]},"druidQuery":{"id":"ml-survey-status-detail-exhaust","labels":{"user_id":"UUID","user_type":"User type","user_sub_type":"User sub type","state_name":"Declared State","district_name":"District","block_name":"Block","school_externalId":"School ID","school_name":"School Name","board_name":"Declared Board","organisation_name":"Org Name","program_name":"Program Name","program_externalId":"Program ID","survey_name":"Survey Name","survey_externalId":"Survey ID","survey_submission_id":"survey_submission_id","submission_date":"Submission date","total_content_plays_on_portal":"Status of Submission"},"dateRange":{"interval":{"startDate":"1901-01-01","endDate":"2101-01-01"},"granularity":"all","intervalSlider":0},"metrics":[{"metric":"total_content_plays_on_portal","label":"total_content_plays_on_portal","druidQuery":{"dataSource":"ml-survey-status-rollup","intervals":"1901-01-01T00:00+00:00/2101-01-01T00:00:00+00:00","granularity": "all","dimensions":[{"fieldName":"user_id","aliasName":"UUID"},{"fieldName":"user_type","aliasName":"User type"},{"fieldName":"user_sub_type","aliasName":"User sub type"},{"fieldName":"state_name","aliasName":"Declared State"},{"fieldName":"district_name","aliasName":"District"},{"fieldName":"block_name","aliasName":"Block"},{"fieldName":"school_externalId","aliasName":"School ID"},{"fieldName":"school_name","aliasName":"School Name"},{"fieldName":"board_name","aliasName":"Declared Board"},{"fieldName":"organisation_name","aliasName":"Org Name"},{"fieldName":"program_name","aliasName":"Program Name"},{"fieldName":"program_externalId","aliasName":"Program ID"},{"fieldName":"survey_name","aliasName":"Survey Name"},{"fieldName":"survey_externalId","aliasName":"Survey ID"},{"fieldName":"survey_submission_id","aliasName":"survey_submission_id"},{"fieldName":"submission_date","aliasName":"Submission date"}],"queryType":"groupBy","aggregations":[{"fieldName":"status_code","type":"longMax","name":"total_content_plays_on_portal"}]}}],"output":[{"zip":false,"label":"","dims":["date"],"fileParameters":["id","dims"],"metrics":["total_content_plays_on_portal"],"type":"csv"}],"queryType":"groupBy"},"supportedFormats":["csv","zip"],"exhaustType":"OnDemand"}}
Now we have maximum status_code (1,3,5,7), but we require a string (started,inProgress,ratingPending,completed) as the status column, we tried using postAggregation but it failed as it will not return a string. Please let me know how to get this ?
Attaching the csv below which we generated on the above druid query ....
Please do the needful at the earliest ... As it is part of 5.0 release our QA is blocked in the sign-off, so please help us out on high priority (we also tried to reach out via slack, but no reply)
We are also exploring on lookup and extraction function, but no luck
Waiting for your response back @sowmya-dixit @anandp504
Cc - @vijiurs @Ashwiniev95 @aks30 @kiranharidas187
Thanks
Shakthi
7C5AD7FC6C3537276AE61302475C017A_20220825.csv
Beta Was this translation helpful? Give feedback.
All reactions