issue #17 and issue #18 - format conversions like GeoJSON and new OGR Input component
justb4 committed Oct 2, 2014
1 parent 6779b3d commit 3858e25
Showing 31 changed files with 1,905 additions and 398 deletions.
17 changes: 12 additions & 5 deletions docs/using.rst
@@ -231,27 +231,34 @@ when constructing a Chain.

The following data types are currently symbolically defined in the :class:`stetl.packet.FORMAT` class:

- ``xml_line_stream`` - each Packet contains a line (string) from an XML file or string representation (DEPRECATED)
- ``any`` - 'catch-all' type, may be any of the types below.

- ``etree_doc`` - a complete in-memory XML DOM structure using the ``lxml`` etree

- ``etree_element_stream`` - each Packet contains a single DOM Element (usually a Feature) in ``lxml`` etree format

- ``etree_feature_array`` - each Packet contains an array of DOM Elements (usually Features) in ``lxml`` etree format

- ``xml_doc_as_string`` - a string representation of a complete XML document
- ``geojson_feature`` - as ``struct`` but following naming conventions for a single Feature according to the GeoJSON spec: http://geojson.org

- ``string`` - a general string
- ``geojson_collection`` - as ``struct`` but following naming conventions for a FeatureCollection according to the GeoJSON spec: http://geojson.org

- ``ogr_feature`` - a single Feature object from an OGR source (via Python SWIG wrapper)

- ``ogr_feature_array`` - a Python list (array) of Feature objects from an OGR source

- ``record`` - a Python ``dict`` (hashmap)

- ``record_array`` - a Python list (array) of ``dict``

- ``string`` - a general string

- ``struct`` - a JSON-like generic tree structure

- ``geojson_struct`` - as ``struct`` but following naming conventions according to the GeoJSON spec: http://geojson.org
- ``xml_doc_as_string`` - a string representation of a complete XML document

- ``xml_line_stream`` - each Packet contains a line (string) from an XML file or string representation (DEPRECATED)

- ``any`` - 'catch-all' type, may be any of the above.
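
As a rough illustration of the difference between ``geojson_feature`` and ``geojson_collection``, the sketch below builds both as plain Python dicts following the GeoJSON naming conventions. The variable names and the helper ``is_feature_collection`` are hypothetical and not part of Stetl; the sample values are taken from the ``cities.gml`` example in this commit.

```python
import json

# A single GeoJSON Feature: a dict with "type", "geometry" and "properties".
# This is the shape the ``geojson_feature`` format name refers to.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [4.632, 52.387]},
    "properties": {"CITY_NAME": "Haarlem", "CNTRY_NAME": "Netherlands"},
}

# A FeatureCollection wraps a list of Features.
# This is the shape the ``geojson_collection`` format name refers to.
collection = {"type": "FeatureCollection", "features": [feature]}


def is_feature_collection(obj):
    """Hypothetical helper: loosely check the FeatureCollection naming conventions."""
    return (
        isinstance(obj, dict)
        and obj.get("type") == "FeatureCollection"
        and all(f.get("type") == "Feature" for f in obj.get("features", []))
    )


# Both structures are ordinary ``struct``-like trees, so they serialize directly.
serialized = json.dumps(collection)
```

Because both are generic tree structures, anything that accepts ``struct`` data could in principle handle them; the more specific names only promise the GeoJSON key layout.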

Many components, in particular Filters, are able to transform data formats.
For example the `XmlElementStreamerFileInput` can produce an
4 changes: 2 additions & 2 deletions examples/basics/10_jinja2_templating/etl.cfg
@@ -49,15 +49,15 @@ file_path = output/cities.gml
class = inputs.fileinput.JsonFileInput
# file_path = input/cities-gjson.json
file_path = https://raw.githubusercontent.com/justb4/stetl/master/examples/basics/10_jinja2_templating/input/cities-gjson.json
output_format = geojson_struct
output_format = geojson_collection

# More advanced gml templating with globals for more or less static content
# and GeoJSON to GML geometry conversion
[filter_template_geojson2gml]
class = filters.templatingfilter.Jinja2TemplatingFilter
template_file = templates/cities-gjson2gml.jinja2
template_globals_path = input/globals.json,https://raw.githubusercontent.com/justb4/stetl/master/examples/basics/10_jinja2_templating/input/more-globals.json
input_format = geojson_struct
input_format = geojson_collection

[output_gml_file2]
class = outputs.fileoutput.FileOutput
35 changes: 33 additions & 2 deletions examples/basics/11_formatconvert/etl.cfg
@@ -1,8 +1,11 @@
# Trivial use of convert filter: convert XML DOM (etree_doc) to string and back.

[etl]
chains = input_xml_file|convert_xml_to_string|convert_string_to_xml|output_file
chains = input_xml_file|convert_xml_to_string|convert_string_to_xml|output_xml_file,
input_complex_xml_file|convert_to_json|output_json_file,
input_gml_sf_file|convert_to_geojson|output_geojson_file

# XML to String and back
[input_xml_file]
class = inputs.fileinput.XmlFileInput
file_path = input/cities.xml
@@ -17,7 +20,35 @@ class = filters.formatconverter.FormatConverter
input_format = string
output_format = etree_doc

[output_file]
[output_xml_file]
class = outputs.fileoutput.FileOutput
file_path = output/cities.xml

# XML to JSON
[input_complex_xml_file]
class = inputs.fileinput.XmlFileInput
file_path = input/building-dutch-bag.xml

[convert_to_json]
class = filters.formatconverter.FormatConverter
input_format = etree_doc
output_format = struct

[output_json_file]
class = outputs.fileoutput.FileOutput
file_path = output/building-dutch.json

# GML to GeoJSON
[input_gml_sf_file]
class = inputs.fileinput.XmlFileInput
file_path = input/cities.gml

[convert_to_geojson]
class = filters.formatconverter.FormatConverter
input_format = etree_doc
output_format = struct

[output_geojson_file]
class = outputs.fileoutput.FileOutput
file_path = output/cities-gjson.json
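
The new ``chains`` value lists three chains separated by commas, each a ``|``-separated sequence of section names. The sketch below shows one way such a value could be split into component names using Python's standard ``configparser``. It is an assumption-laden illustration, not Stetl's actual parsing code: ``parse_chains`` is a hypothetical helper, and the continuation lines are indented here because ``configparser`` requires that for multi-line values.

```python
import configparser

# The [etl] section from the new version of etl.cfg, with indented
# continuation lines so configparser treats them as one multi-line value.
CFG_TEXT = """
[etl]
chains = input_xml_file|convert_xml_to_string|convert_string_to_xml|output_xml_file,
    input_complex_xml_file|convert_to_json|output_json_file,
    input_gml_sf_file|convert_to_geojson|output_geojson_file
"""


def parse_chains(cfg_text):
    """Hypothetical helper: return a list of chains, each a list of section names."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get("etl", "chains")
    chains = []
    for chain in raw.split(","):
        # Each chain reads left to right: input, zero or more filters, output.
        names = [name.strip() for name in chain.split("|") if name.strip()]
        if names:
            chains.append(names)
    return chains


chains = parse_chains(CFG_TEXT)
```

Each resulting name corresponds to a section such as ``[input_gml_sf_file]`` that carries the ``class`` and format settings for that component.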

561 changes: 561 additions & 0 deletions examples/basics/11_formatconvert/input/building-dutch-bag.xml

Large diffs are not rendered by default.

104 changes: 104 additions & 0 deletions examples/basics/11_formatconvert/input/cities.gml
@@ -0,0 +1,104 @@
<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.opengis.net/gml http://schemas.opengis.net/gml/2.1.2/feature.xsd">
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
<feature:heronfeat xmlns:feature="http://heron-mc.org" fid="cities.1358">
<feature:geometry>
<gml:Point srsName="EPSG:4326">
<gml:coordinates decimal="." cs="," ts=" ">4.632,52.387</gml:coordinates>
</gml:Point>
</feature:geometry>
<feature:ObjectID>12386304</feature:ObjectID>
<feature:CITY_NAME>Haarlem</feature:CITY_NAME>
<feature:GMI_ADMIN>NLD-NHL</feature:GMI_ADMIN>
<feature:ADMIN_NAME>Noord-Holland</feature:ADMIN_NAME>
<feature:FIPS_CNTRY>NL</feature:FIPS_CNTRY>
<feature:CNTRY_NAME>Netherlands</feature:CNTRY_NAME>
<feature:STATUS>Provincial capital</feature:STATUS>
<feature:POP_RANK>5</feature:POP_RANK>
<feature:POP_CLASS>100,000 to 250,000</feature:POP_CLASS>
<feature:PORT_ID>0</feature:PORT_ID>
<feature:LABEL_FLAG>0</feature:LABEL_FLAG>
</feature:heronfeat>
</gml:featureMember>
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
<feature:heronfeat xmlns:feature="http://heron-mc.org" fid="cities.1359">
<feature:geometry>
<gml:Point srsName="EPSG:4326">
<gml:coordinates decimal="." cs="," ts=" ">4.89483636,52.37304545</gml:coordinates>
</gml:Point>
</feature:geometry>
<feature:ObjectID>12386305</feature:ObjectID>
<feature:CITY_NAME>Amsterdam</feature:CITY_NAME>
<feature:GMI_ADMIN>NLD-NHL</feature:GMI_ADMIN>
<feature:ADMIN_NAME>Noord-Holland</feature:ADMIN_NAME>
<feature:FIPS_CNTRY>NL</feature:FIPS_CNTRY>
<feature:CNTRY_NAME>Netherlands</feature:CNTRY_NAME>
<feature:STATUS>National capital</feature:STATUS>
<feature:POP_RANK>3</feature:POP_RANK>
<feature:POP_CLASS>500,000 to 1,000,000</feature:POP_CLASS>
<feature:PORT_ID>31060</feature:PORT_ID>
<feature:LABEL_FLAG>0</feature:LABEL_FLAG>
</feature:heronfeat>
</gml:featureMember>
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
<feature:heronfeat xmlns:feature="http://heron-mc.org" fid="cities.1360">
<feature:geometry>
<gml:Point srsName="EPSG:4326">
<gml:coordinates decimal="." cs="," ts=" ">5.112,52.1</gml:coordinates>
</gml:Point>
</feature:geometry>
<feature:ObjectID>12386306</feature:ObjectID>
<feature:CITY_NAME>Utrecht</feature:CITY_NAME>
<feature:GMI_ADMIN>NLD-UTR</feature:GMI_ADMIN>
<feature:ADMIN_NAME>Utrecht</feature:ADMIN_NAME>
<feature:FIPS_CNTRY>NL</feature:FIPS_CNTRY>
<feature:CNTRY_NAME>Netherlands</feature:CNTRY_NAME>
<feature:STATUS>Provincial capital</feature:STATUS>
<feature:POP_RANK>5</feature:POP_RANK>
<feature:POP_CLASS>100,000 to 250,000</feature:POP_CLASS>
<feature:PORT_ID>0</feature:PORT_ID>
<feature:LABEL_FLAG>1</feature:LABEL_FLAG>
</feature:heronfeat>
</gml:featureMember>
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
<feature:heronfeat xmlns:feature="http://heron-mc.org" fid="cities.1361">
<feature:geometry>
<gml:Point srsName="EPSG:4326">
<gml:coordinates decimal="." cs="," ts=" ">4.281,52.076</gml:coordinates>
</gml:Point>
</feature:geometry>
<feature:ObjectID>12386307</feature:ObjectID>
<feature:CITY_NAME>The Hague</feature:CITY_NAME>
<feature:GMI_ADMIN>NLD-ZHL</feature:GMI_ADMIN>
<feature:ADMIN_NAME>Zuid-Holland</feature:ADMIN_NAME>
<feature:FIPS_CNTRY>NL</feature:FIPS_CNTRY>
<feature:CNTRY_NAME>Netherlands</feature:CNTRY_NAME>
<feature:STATUS>Provincial capital</feature:STATUS>
<feature:POP_RANK>4</feature:POP_RANK>
<feature:POP_CLASS>250,000 to 500,000</feature:POP_CLASS>
<feature:PORT_ID>0</feature:PORT_ID>
<feature:LABEL_FLAG>0</feature:LABEL_FLAG>
</feature:heronfeat>
</gml:featureMember>
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
<feature:heronfeat xmlns:feature="http://heron-mc.org" fid="cities.1362">
<feature:geometry>
<gml:Point srsName="EPSG:4326">
<gml:coordinates decimal="." cs="," ts=" ">4.48515455,51.92559091</gml:coordinates>
</gml:Point>
</feature:geometry>
<feature:ObjectID>12386308</feature:ObjectID>
<feature:CITY_NAME>Rotterdam</feature:CITY_NAME>
<feature:GMI_ADMIN>NLD-ZHL</feature:GMI_ADMIN>
<feature:ADMIN_NAME>Zuid-Holland</feature:ADMIN_NAME>
<feature:FIPS_CNTRY>NL</feature:FIPS_CNTRY>
<feature:CNTRY_NAME>Netherlands</feature:CNTRY_NAME>
<feature:STATUS>Other</feature:STATUS>
<feature:POP_RANK>2</feature:POP_RANK>
<feature:POP_CLASS>1,000,000 to 5,000,000</feature:POP_CLASS>
<feature:PORT_ID>31140</feature:PORT_ID>
<feature:LABEL_FLAG>0</feature:LABEL_FLAG>
</feature:heronfeat>
</gml:featureMember>
</wfs:FeatureCollection>
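
To give a feel for what a GML to GeoJSON conversion of this file involves, the sketch below maps one ``gml:featureMember`` from the input above to a GeoJSON-style FeatureCollection using only the standard library. It is not the ``FormatConverter`` implementation from this commit: the function ``gml_to_geojson`` is hypothetical, it assumes Point geometries with a single ``x,y`` pair in ``gml:coordinates``, and it leaves all property values as strings.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

# A shortened copy of the first feature in cities.gml.
SAMPLE_GML = """<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs">
  <gml:featureMember xmlns:gml="http://www.opengis.net/gml">
    <feature:heronfeat xmlns:feature="http://heron-mc.org" fid="cities.1358">
      <feature:geometry>
        <gml:Point srsName="EPSG:4326">
          <gml:coordinates decimal="." cs="," ts=" ">4.632,52.387</gml:coordinates>
        </gml:Point>
      </feature:geometry>
      <feature:CITY_NAME>Haarlem</feature:CITY_NAME>
      <feature:STATUS>Provincial capital</feature:STATUS>
    </feature:heronfeat>
  </gml:featureMember>
</wfs:FeatureCollection>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def gml_to_geojson(gml_text):
    """Hypothetical sketch: convert simple Point features to a FeatureCollection dict."""
    root = ET.fromstring(gml_text)
    features = []
    for member in root.iter("{%s}featureMember" % GML_NS):
        for feat in member:
            geometry = None
            properties = {}
            for child in feat:
                coords = child.find(".//{%s}coordinates" % GML_NS)
                if coords is not None:
                    # Assumes a single "x,y" pair, as in the Point features above.
                    x, y = (float(v) for v in coords.text.strip().split(","))
                    geometry = {"type": "Point", "coordinates": [x, y]}
                else:
                    properties[local_name(child.tag)] = child.text
            features.append({
                "type": "Feature",
                "id": feat.get("fid"),
                "geometry": geometry,
                "properties": properties,
            })
    return {"type": "FeatureCollection", "features": features}


collection = gml_to_geojson(SAMPLE_GML)
```

Other geometry types, multiple coordinate tuples and non-default ``cs``/``ts`` separators would need additional handling that this sketch omits.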