Redesign and expend API
pszufe committed Jul 7, 2024
1 parent c26f690 commit 47c06de
Showing 9 changed files with 281 additions and 114 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -12,4 +12,5 @@
/docs/build/
/docs/site/
/.vscode

.ipynb_checkpoints/*
/.ipynb_checkpoints/*
6 changes: 3 additions & 3 deletions Project.toml
@@ -1,7 +1,7 @@
name = "OSMToolset"
uuid = "a1c25ae6-0f93-4b3a-bddf-c248cb99b9fa"
authors = ["pszufe <[email protected]> and contributors"]
version = "0.1.2"
version = "0.2.0"

[deps]
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
@@ -16,11 +16,11 @@ StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
[compat]
CSV = "0.10"
DataFrames = "1.1, 1.2, 1.3, 1.4, 1.5, 1.6"
EzXML = "1.0, 1.1"
EzXML = "1.0, 1.1, 1.2"
NamedTupleTools = "0.14"
OpenStreetMapX = "0.2, 0.3, 0.4"
Parsers = "2"
SpatialIndexing = "0.1.5"
SpatialIndexing = "0.1.5, 0.1.6"
StatsBase = "0.28, 0.29, 0.30, 0.31, 0.32, 0.33, 0.34"
julia = "1.6"

153 changes: 115 additions & 38 deletions README.md
@@ -6,7 +6,7 @@

`OSMToolset` package provides the tools for efficient extraction of [point-of-interest](https://en.wikipedia.org/wiki/Point_of_interest) from maps and building various custom [walkability](https://en.wikipedia.org/wiki/Walkability) indexes in [Julia](https://julialang.org/).

**Documentation**: [![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://pszufe.github.io/OSMToolset.jl/dev/)
**Documentation**: [![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://pszufe.github.io/OSMToolset.jl/dev/)
<br>
[![DOI](https://zenodo.org/badge/637564645.svg)](https://zenodo.org/doi/10.5281/zenodo.10016849)
<!-- [![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://pszufe.github.io/OSMToolset.jl/stable/) -->
@@ -53,66 +53,140 @@ julia> df1 = find_poi(file)
4 columns and 76 rows omitted
```
The default configuration file can be found at `OSMToolset.__builtin_config_path`. This configuration has metadata columns that appear in the results of the parsing process. You can create your own configuration based on it.
```
myconfig = ScrapePOIConfig{AttractivenessMetaPOI}(OSMToolset.__builtin_config_path)
df1 = find_poi(file;scrape_config=myconfig)
```

Suppose instead that you want to configure manually what is scraped. Perhaps we just want parking spaces,
which can be defined in an OSM file either as `amenity=parking` or with the `parking` key:
```
julia> config = DataFrame(key=["parking", "amenity"], values=["*", "parking"])
2×2 DataFrame
Row │ key values
│ String String
─────┼──────────────────
1 │ parking *
2 │ amenity parking
julia> config = ScrapePOIConfig("parking",("amenity","parking"))
ScrapePOIConfig{NoneMetaPOI} with 2 keys:
No │ key values
────┼──────────────────
1 │ amenity parking
2 │ parking *
```
Note that contrary to the previous example this time we do not have meta data columns and hence we will use the `NoneMetaPOI` configuration.

Note that the scraping configuration can be extracted to a data frame by executing `config |> DataFrame`. Such a data frame can also be used to create a new configuration by executing `ScrapePOIConfig{NoneMetaPOI}(DataFrame(key=["amenity","parking"],values=["parking","*"]))`.

Note that since we do not use meta data yet we use parameter: `NoneMetaPOI`.
Now this can be scraped as:
```
julia> df2 = find_poi(file; scrape_config=ScrapePOIConfig{NoneMetaPOI}(config))
julia> df2 = find_poi(file, config)
12×7 DataFrame
Row │ elemtype elemid nodeid lat lon key value
│ Symbol Int64 Int64 Float64 Float64 String String
─────┼───────────────────────────────────────────────────────────────────────
1 │ way 187565434 1982207088 42.3603 -71.0866 amenity parking
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
12 │ way 1052438049 9672086211 42.3624 -71.0878 parking surface
10 rows omitted
12 │ way 1052438049 9672086211 42.3624 -71.0878 parking surface 10 rows omitted
```

It is also possible to extract adjacent tags within the same node; this can be achieved via the `all_tags` option.
For example, we could get the metadata of the parking places.

```
find_poi(file, ScrapePOIConfig("parking",("amenity","parking")); all_tags=true)
25×7 DataFrame
Row │ elemtype elemid nodeid lat lon key value
│ Symbol Int64 Int64 Float64 Float64 String String
─────┼────────────────────────────────────────────────────────────────────────────────
1 │ way 187565434 1982207088 42.3603 -71.0866 amenity parking
2 │ way 187565434 1982207088 42.3603 -71.0866 access private
3 │ way 187565434 1982207088 42.3603 -71.0866 parking surface
4 │ way 187565434 1982207088 42.3603 -71.0866 surface asphalt
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
25 │ way 1052438049 9672086211 42.3624 -71.0878 parking surface
20 rows omitted
```
It can be seen that the same nodeid is repeated for different tags.

The data that we extract can be decorated with additional information, such as the range and influence of the POI.

```
julia> config2 = ScrapePOIConfig(("amenity","cafe")=>AttractivenessMetaPOI(:food,1,500), ("amenity","restaurant")=>AttractivenessMetaPOI(:food,2,1000), ("parking",("amenity","parking")) => AttractivenessMetaPOI(:car,1,500))
ScrapePOIConfig{AttractivenessMetaPOI} with 2 keys:
No │ key values group influence range
────┼───────────────────────────────────────────────
1 │ amenity cafe food 1.0 500.0
2 │ amenity restaurant food 2.0 1000.0
```
Here we assume that a restaurant is more important than a cafe and that people are willing to walk a longer distance to visit a restaurant.

```
This data can be further processed in many ways. For example [here](https://pszufe.github.io/OSMToolset.jl/dev/visualize/) is a sample code that performs POI visualization
julia> filter!(r->r.nodeid in [1884055322, 11173231405], # select two places
find_poi(file, config2, all_tags=true))
5×10 DataFrame
Row │ elemtype elemid nodeid lat lon key value group influence range
│ Symbol Int64 Int64 Float64 Float64 String String Symbol? Float64? Float64?
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ node 1884055322 1884055322 42.3617 -71.09 amenity cafe food 1.0 500.0
2 │ node 1884055322 1884055322 42.3617 -71.09 name Forbes Family Cafe missing missing missing
3 │ node 1884055322 1884055322 42.3617 -71.09 opening_hours Mo-Fr 11:00-15:00 missing missing missing
4 │ node 11173231405 11173231405 42.3622 -71.0864 amenity cafe food 1.0 500.0
5 │ node 11173231405 11173231405 42.3622 -71.0864 name Ripple Cafe missing missing missing
```


The data can be further processed in many ways. For example, [here](https://pszufe.github.io/OSMToolset.jl/dev/visualize/) is sample code that performs POI visualization.

## Spatial attractiveness processing

Suppose we have the `df1` data from the previous example. Now we can do a spatial attractiveness index in the following way:
Let us consider a more complex attractiveness configuration:
```
config3 = ScrapePOIConfig(("amenity","cafe")=>AttractivenessMetaPOI(:food,1,500), ("amenity","restaurant")=>AttractivenessMetaPOI(:food,2,1000), (["parking",("amenity","parking")] .=> Ref(AttractivenessMetaPOI(:car,1,500)))... )
ScrapePOIConfig{AttractivenessMetaPOI} with 4 keys:
No │ key values group influence range
────┼───────────────────────────────────────────────
1 │ amenity cafe food 1.0 500.0
2 │ amenity parking car 1.0 500.0
3 │ amenity restaurant food 2.0 1000.0
4 │ parking * car 1.0 500.0
```

Note that in this demo we assume an attractiveness configuration defined as `AttractivenessMetaPOI`. If you want a different structure of data for this index you need to create a subtype of `AbstractMetaPOI` and use it in the constructor.

We search for such locations:
```
ix = AttractivenessSpatIndex(df1)
julia> df3 = find_poi(file, config3)
18×10 DataFrame
Row │ elemtype elemid nodeid lat lon key value group influence range
│ Symbol Int64 Int64 Float64 Float64 String String Symbol Float64 Float64
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ node 1884054889 1884054889 42.3621 -71.0892 amenity cafe food 1.0 500.0
2 │ node 1884055322 1884055322 42.3617 -71.09 amenity cafe food 1.0 500.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
17 │ way 1052438049 9672086211 42.3624 -71.0878 amenity parking car 1.0 500.0
18 │ way 1052438049 9672086211 42.3624 -71.0878 parking surface car 1.0 500.0
14 rows omitted
```
Note that the default configuration works with the `AttractivenessMetaPOI` data format. If you want a different structure of data for this index you need to create a subtype of `MetaPOI` and use it in the constructor.

Let us consider some point on the map:
Now with this data we create a spatial attractiveness index in the following way:
```
lat, lon = mean(df1.lat), mean(df1.lon)
ix = AttractivenessSpatIndex(df3);
```

Let us consider a point on the map:
```
using Statistics
lat, lon = mean(df3.lat), mean(df3.lon)
```
We can use the API to calculate attractiveness of that location:
```
julia> attractiveness(ix, lat, lon)
(education = 42.73746118854219, entertainment = 30.385266049775055, healthcare = 12.491783858701343, leisure = 134.5949900134078, parking = 7.310719949554132, restaurants = 25.200347106553586, shopping = 6.89416203789267, transport = 12.090409181473555)
(car = 8.595822085195946, food = 5.151440338789913)
```
If, for the debugging purposes, we want to understand what data has been used to calculate that attractiveness use the `explain=true` parameter:
For this location we can see it is easy to find food and park your car nearby.

If, for debugging purposes, we want to understand what data has been used to calculate that attractiveness, we can use the `explain=true` parameter:
```
julia> attractiveness(ix, lat, lon ;explain=true).explanation
68×7 DataFrame
Row │ group influence range attractiveness poidistance lat lon
│ Symbol Float64 Float64 Float64 Float64 Float64 Float64
─────┼─────────────────────────────────────────────────────────────────────────────────
1 │ education 20.0 10000.0 16.9454 1527.31 42.3553 -71.105
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
68 │ shopping 5.0 500.0 0.618922 438.108 42.3625 -71.0834
66 rows omitted
julia> attractiveness(ix, lat, lon; explain=true)
(car = 8.595822085195946, food = 5.151440338789913, explanation = 18×7 DataFrame
Row │ group influence range attractiveness poidistance lat lon
│ Symbol Float64 Float64 Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────────────────────────────────────────
1 │ food 1.0 500.0 0.183414 408.293 42.3599 -71.0913
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
18 │ food 2.0 1000.0 1.44716 276.42 42.3627 -71.084
16 rows omitted)
```
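The `explanation` rows printed above are consistent with a linear distance decay, where each POI contributes `influence * (1 - poidistance / range)` while it lies within its range. The sketch below is inferred only from the printed numbers, not taken from the package source, and `linear_decay` is a hypothetical helper name rather than part of the OSMToolset API:

```julia
# Hypothetical helper inferred from the explanation table above;
# it is not a function exported by OSMToolset.
linear_decay(influence, poi_range, dist) =
    dist >= poi_range ? 0.0 : influence * (1 - dist / poi_range)

# Rows 1 and 18 of the explanation table: (influence, range, poidistance)
row1  = linear_decay(1.0, 500.0, 408.293)
row18 = linear_decay(2.0, 1000.0, 276.42)

println(round(row1; digits=6))    # compare with the attractiveness column, row 1
println(round(row18; digits=5))   # compare with the attractiveness column, row 18
```

Under this assumption a POI located exactly at the query point contributes its full `influence`, and a POI at or beyond `range` contributes nothing.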
The attractiveness function is fully configurable in how the attractiveness is actually calculated.
The available parameters can be used to define attractiveness dimension, aggregation function,
@@ -121,19 +195,22 @@ attractiveness function and how the distance on the map is calculated.
For example, let us take the maximum influence value rather than summing them:
```
julia> att = attractiveness(ix, lat, lon, aggregator = x -> length(x)==0 ? 0 : maximum(x))
(education = 19.245381074958622, entertainment = 17.69295158791498, healthcare = 6.245891929350671, leisure = 4.723681042516024, parking = 2.9623334286775806, restaurants = 4.596901824773207, shopping = 2.0103741801865715, transport = 6.407028429850689)
(car = 0.8840868352005442, food = 1.747669233262405)
```
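The `length(x)==0` guard in the aggregator matters because `maximum` throws an error on an empty collection, which would happen for a group with no POI in range. A minimal standalone illustration follows; the contribution values are made up rather than taken from the map, and the `init` keyword shown as an alternative assumes Julia 1.6 or later:

```julia
# The aggregator from the example above
agg = x -> length(x) == 0 ? 0 : maximum(x)

contributions = [0.18, 1.45, 0.62]   # made-up per-POI attractiveness values
best = agg(contributions)            # largest single contribution
none = agg(Float64[])                # 0 instead of an error

# Alternative with the same effect when all values are non-negative
agg_init = x -> maximum(x; init=0.0)
```

With `sum` no guard is needed for a typed empty vector, since `sum(Float64[])` returns `0.0`.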

We could also use the custom scraped `df2` for the attractiveness:

We could also use a DataFrame without metadata columns for the attractiveness:
```
ix2 = AttractivenessSpatIndex{NoneMetaPOI}(df2; get_range=a->300, get_group=a->:parking);
df4 = find_poi(file, ScrapePOIConfig(("amenity","parking"), "parking"))
ix4 = AttractivenessSpatIndex{NoneMetaPOI}(df4; get_range=a->300, get_group=a->:parking);
```
Note that since we did not have metadata we have manually provided `300` meters for the range and `:parking` for the group.

Now we can use this custom scraper to query the attractiveness:
```
julia> attractiveness(ix2, lat, lon; aggregator = sum, calculate_attractiveness = (a,dist) -> dist > 300 ? 0 : 300/dist )
(parking = 13.200370032301507,)
julia> attractiveness(ix4, lat, lon; aggregator = sum, calculate_attractiveness = (a,dist) -> dist > 300 ? 0 : 300/dist )
(parking = 30.235559263812686,)
```
Note that for this code to work we needed to specify how the attractiveness is calculated with respect to the metadata `a` (now an empty `struct`, since this is `NoneMetaPOI`).

2 changes: 1 addition & 1 deletion docs/src/reference.md
@@ -13,7 +13,7 @@ Scraping points-of-interest (POI)
```@docs
find_poi
ScrapePOIConfig
MetaPOI
AbstractMetaPOI
NoneMetaPOI
AttractivenessMetaPOI
```
4 changes: 2 additions & 2 deletions docs/src/visualize.md
@@ -112,8 +112,8 @@ function getplotdata(md, dfpoi)
end

md = get_map_data("Boston.osm"; use_cache=false, only_intersections=false);
config = DataFrame(key="amenity", values=["restaurant", "fast_food", "food_court", "pub", "bar", "cafe", "ice_cream"])
dfpoi = find_poi("Boston.osm"; scrape_config=ScrapePOIConfig{NoneMetaPOI}(config))
configdf = DataFrame(key="amenity", values=["restaurant", "fast_food", "food_court", "pub", "bar", "cafe", "ice_cream"])
dfpoi = find_poi("Boston.osm", ScrapePOIConfig{NoneMetaPOI}(configdf))


attdf = getplotdata(md, dfpoi)
3 changes: 2 additions & 1 deletion src/OSMToolset.jl
@@ -7,6 +7,7 @@ using StatsBase
using NamedTupleTools
using Parsers
using EzXML
import DataFrames: DataFrame
import OpenStreetMapX
import OpenStreetMapX: LLA, ENU, distance, MapData, center

@@ -24,7 +25,7 @@ export find_poi
export calc_tiling
export getbounds, Bounds
export ScrapePOIConfig
export MetaPOI
export AbstractMetaPOI
export NoneMetaPOI
export AttractivenessMetaPOI
export sample_osm_file

4 comments on commit 47c06de


@pszufe pszufe commented on 47c06de Jul 7, 2024

@JuliaRegistrator register()

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/110598

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

@JuliaRegistrator register

Release notes:

## Breaking changes

- blah

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.2.0 -m "<description of version>" 47c06de4da96e6746132f0cf45a9303e977a6a15
git push origin v0.2.0


@pszufe pszufe commented on 47c06de Jul 7, 2024

@JuliaRegistrator register()

A major map scraping API extension. See the README on how the library can be used now.

Breaking changes

  • scrape_config is no longer a keyword parameter. Instead of find_poi(file; scrape_config = my_scrape_config) just write find_poi(file, my_scrape_config)
  • The constructor ScrapePOIConfig(filename) is no longer available. Instead, use ScrapePOIConfig(CSV.read(filename, DataFrame))

@JuliaRegistrator

Registration pull request updated: JuliaRegistries/General/110598
