Enclosed InputFormats do not work #83
Comments
Thanks for reporting this. I assume "version 2.0.0" refers to Spatial Framework for Hadoop. Please let us know the versions of Hive and Hadoop.
@randallwhitman Hadoop 2.8.5, Hive 2.3.4
Thanks for the details. We do not have Hive-2.3.4 (nor Hadoop-2.8.5) installed, and unfortunately the testing framework is not at the level of making it easy to paste a sample query into a test - Esri/spatial-framework-for-hadoop#163. Maybe it will reproduce with another version of Hive or with SparkSql.
I can confirm that both issues reproduce on Hadoop 2.8.3 and Hive 2.3.2.
I took a look at reading Enclosed Esri JSON, using 15 points from the JSON-MR mini-sample, and Hive-2.3.5 read the table data OK.
I guess that tests only reading, not writing.
Finally reproduced the reported issue.
create external table test15eej(rowid int, shape binary)
row format serde 'com.esri.hadoop.hive.serde.EsriJsonSerDe'
stored as inputformat 'com.esri.json.hadoop.EnclosedEsriJsonInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
location 'file:///tmp/test15eej';
The output file was in fact unenclosed -
create external table alt15uej(rowid int, shape binary)
row format serde 'com.esri.hadoop.hive.serde.EsriJsonSerDe'
stored as inputformat 'com.esri.json.hadoop.UnenclosedEsriJsonInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
location 'file:///tmp/write15eej';
With larger data, the output would be expected to span multiple files. In that case, it's not clear how the file[s] could be enclosed at all - maybe each file of the collection could have Enclosed format?
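To make the enclosed/unenclosed distinction concrete, here is a minimal Python sketch of wrapping one unenclosed output file into a single enclosed document. It assumes that "unenclosed" means JSON records written back to back, and that "enclosed" means those same records collected under a top-level "features" array; the helper name enclose and the sample records are hypothetical and are not part of Spatial Framework for Hadoop.

```python
import json


def enclose(unenclosed_text):
    """Wrap back-to-back JSON records in one {"features": [...]} object.

    Assumption: each record is a complete JSON object, and records are
    separated only by optional whitespace (for example, newlines).
    """
    decoder = json.JSONDecoder()
    features = []
    pos = 0
    length = len(unenclosed_text)
    while pos < length:
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < length and unenclosed_text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        record, pos = decoder.raw_decode(unenclosed_text, pos)
        features.append(record)
    return json.dumps({"features": features})


# Hypothetical sample: two point records as they might appear in one output file.
sample = (
    '{"attributes":{"rowid":1},"geometry":{"x":1.0,"y":2.0}}\n'
    '{"attributes":{"rowid":2},"geometry":{"x":3.0,"y":4.0}}\n'
)
enclosed = enclose(sample)
```

Applied per file, this would correspond to the "each file of the collection could have Enclosed format" idea; it says nothing about how the Hive output format itself would need to change.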
I am following the instructions in this tutorial, and I am able to create a table using the UnenclosedEsriJsonInputFormat.
However, I would like to use the enclosed format.
I have tried these two serdes:
Although I am able to create the table, and insert data, when I do a select the result is always empty:
select ST_AsGeoJSON(area), count from taxi_agg;
Changing EnclosedEsriJsonInputFormat to UnenclosedEsriJsonInputFormat, or EnclosedGeoJsonInputFormat to UnenclosedGeoJsonInputFormat, gives correct results.
Not sure if I am doing something wrong, or if there is a problem with the Enclosed Serde.
Version: 2.0.0