Releases · databricks/spark-xml
Version 0.8.0
New Features
- Support for validating XML rows against an XSD
- from_xml for parsing an existing column or string to a struct
- schema_of_xml for inferring the schema of XML in a string column
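A minimal Scala sketch of how these 0.8.0 additions can be exercised, assuming the XSD validation option is named rowValidationXSDPath and that from_xml / schema_of_xml are imported from com.databricks.spark.xml as in the project README; the paths, tag names, and sample payload are made up.

```scala
import org.apache.spark.sql.SparkSession
import com.databricks.spark.xml.functions.from_xml
import com.databricks.spark.xml.schema_of_xml

val spark = SparkSession.builder().appName("spark-xml-0.8.0-sketch").getOrCreate()
import spark.implicits._

// Validate each row against an XSD while reading; rows failing validation
// are treated as malformed. File paths and tag names are illustrative.
val books = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "book")
  .option("rowValidationXSDPath", "books.xsd")
  .load("books.xml")

// Parse XML held in a string column: infer the payload schema with
// schema_of_xml, then turn the column into a struct with from_xml.
val withPayload = Seq("""<note><to>Tove</to><from>Jani</from></note>""").toDF("payload")
val payloadSchema = schema_of_xml(withPayload.select("payload").as[String])
val parsed = withPayload.withColumn("parsed", from_xml($"payload", payloadSchema))
parsed.printSchema()
```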
Changes: https://github.com/databricks/spark-xml/milestone/5?closed=1
Version 0.7.0
Fixes
- Important fix to XML writing, which could cause newlines to be inserted in the wrong place in output (#417)
- Ignore XML processing instructions, which otherwise fail parsing (#412)
- Ignore text children in mixed text/element nodes, instead of parsing element incorrectly (#416)
Changes: https://github.com/databricks/spark-xml/milestone/4?closed=1
Version 0.6.0
Fixes:
- Fixed an error that could cause records to be dropped when uncompressed files are read and XML tags happen to span an input split boundary, but fit within the stream read buffer (#400)
- Fixed issue with nested tag names in attributes (#374)
Improvements:
- inferSchema can now be set to false during parsing to leave all values as string type (#393)
- Also treat empty values as null if nullValue is "" (#381)
- Log malformed records for debugging (#372)
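A short sketch of the two reader options mentioned above; the file path and rowTag value are illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-xml-0.6.0-sketch").getOrCreate()

// inferSchema=false leaves every parsed field as a string instead of casting
// to an inferred type; nullValue="" makes empty values come back as null.
val df = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "record")
  .option("inferSchema", "false")
  .option("nullValue", "")
  .load("records.xml")

df.printSchema() // all fields reported as string
```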
Changes: https://github.com/databricks/spark-xml/issues?utf8=%E2%9C%93&q=milestone%3A0.6.0+is%3Aclosed+
Version 0.5.0
Spark-xml 0.5.0 includes many bug fixes as well as the following
Improvements:
- Partial results support #358, #368 and #370
- XML self-closing tag support #352
- Scala 2.12 support #343
- Hadoop 2.9+ support #282
- Add an ignoreSurroundingSpaces option to trim spaces between values #237
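A sketch of the new ignoreSurroundingSpaces option; the path and rowTag are illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-xml-0.5.0-sketch").getOrCreate()

// With ignoreSurroundingSpaces enabled, "<name>  Ada  </name>" is read as "Ada".
val people = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "person")
  .option("ignoreSurroundingSpaces", "true")
  .load("people.xml")
```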
Removals, Behavior Changes and Deprecations
- Drop Scala 2.10 support #343
Issues Closed
https://github.com/databricks/spark-xml/milestone/1?closed=1
Version 0.4.1
Version 0.3.5
Version 0.4.0
Spark-xml 0.4.0 adds the following
Features:
- Support for PERMISSIVE/DROPMALFORMED mode and corrupt record option - #107
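A sketch of the parse modes, assuming the option names mirror the built-in JSON/CSV sources (mode and columnNameOfCorruptRecord); the path and rowTag are illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-xml-0.4.0-sketch").getOrCreate()

// PERMISSIVE keeps malformed rows and places the raw record in the configured
// corrupt-record column; DROPMALFORMED would drop such rows instead.
val orders = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "order")
  .option("mode", "PERMISSIVE")
  .option("columnNameOfCorruptRecord", "_corrupt_record")
  .load("orders.xml")
```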
Removals, Behavior Changes and Deprecations
- Deprecate saveAsXmlFile and promote the usage of write() - #150
- Deprecate xmlFile and promote the usage of read() - #150
- Drop 1.x compatibility from 0.4.0 - #150
- Drop support for UserDefinedType as it became private - #150
- Change default values for attributePrefix and valueTag to _ and _VALUE - #142
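A sketch of the promoted read()/write() style that replaces the deprecated xmlFile and saveAsXmlFile calls; paths and tag names are illustrative. With the new defaults, attributes appear as fields prefixed with _ and element text lands in the _VALUE field.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-xml-0.4.0-rw-sketch").getOrCreate()

// Reading via the standard DataFrameReader instead of the deprecated xmlFile.
val books = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "book")
  .load("books.xml")

// Writing via the standard DataFrameWriter instead of the deprecated saveAsXmlFile.
books.write
  .format("com.databricks.spark.xml")
  .option("rootTag", "books")
  .option("rowTag", "book")
  .save("books-out")
```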
Version 0.3.4
XML Data Source 0.3.4 adds the following
Improvements:
- Produces correct order of columns for nested rows when user specifies a schema - #125
- Fix arrayIndexOutOfBounds when there is no value in a nested struct - #121 by @lokm01
- Add compression as an alias for the codec option - #145 (see the sketch after this list)
- Remove dead code - #144
- Fix nested element with name of parent bug - #161 by @mattroberts297
- Do not allow empty strings for attributePrefix, valueTag and rowTag - #170
- Add a missing default case when parsing/inferring XML documents - #166
- Minor documentation changes - #159 by @mattroberts297 and #143 by @anastasia
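A sketch of the compression alias added in #145, assuming it accepts the same codec names as the existing codec option (for example gzip); paths and tags are illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-xml-0.3.4-sketch").getOrCreate()

val items = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "item")
  .load("items.xml")

// "compression" is now accepted as an alias for the "codec" write option.
items.write
  .format("com.databricks.spark.xml")
  .option("rowTag", "item")
  .option("compression", "gzip")
  .save("items-gzip")
```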
Version 0.3.3
XML Data Source 0.3.3 adds the following
Improvements:
- Parse elements in arrays that have attributes correctly
- Parse duplicated valueTag fields correctly in a few special cases
- Parse non-existing elements in an array as null
- Support parsing XML documents in which the same element holds both structural and non-structural data types
- Ignore comments
- Documentation improvements
Version 0.3.2
Spark-xml 0.3.2 adds the following
Improvements:
- Fix a bug in type inference for empty values in structural types
- Performance improvement
- Support for parsing correctly when structural data types are specified
- Parse long character sequences within tags
- Added some more tests
- Parse correctly even if some attributes exist sparsely
- Ignore namespaces
- Documentation improvements