Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race when HTTP server boots #22

Open
tomtaylor opened this issue Aug 12, 2022 · 2 comments
Open

Data race when HTTP server boots #22

tomtaylor opened this issue Aug 12, 2022 · 2 comments

Comments

@tomtaylor
Copy link
Member

I was trying to track down a bug where infrequently connections to the PIP endpoint would return EOF in my HTTP Client, so I booted the server up with -race to see if this might be a server error. I don't think it is, but I did spot this in the logs as the server boots.

[spatial] app.go:179: Register /data/ handler
[spatial] app.go:204: Register /api/point-in-polygon handler
==================
WARNING: DATA RACE
Write at 0x00c0001946d0 by goroutine 8:
  github.com/whosonfirst/go-whosonfirst-spatial-www/server.(*HTTPServerApplication).RunWithFlagSet.func1()
      /Users/tom/projects/wof/go-whosonfirst-spatial-www/server/app.go:124 +0x84

Previous write at 0x00c0001946d0 by main goroutine:
  github.com/whosonfirst/go-whosonfirst-spatial-www/server.(*HTTPServerApplication).RunWithFlagSet()
      /Users/tom/projects/wof/go-whosonfirst-spatial-www/server/app.go:161 +0xd74
  github.com/whosonfirst/go-whosonfirst-spatial-www/server.(*HTTPServerApplication).Run()
      /Users/tom/projects/wof/go-whosonfirst-spatial-www/server/app.go:69 +0x1d0
  main.main()
      /Users/tom/projects/wof/go-whosonfirst-spatial-www/cmd/server/main.go:24 +0x4c

Goroutine 8 (running) created at:
  github.com/whosonfirst/go-whosonfirst-spatial-www/server.(*HTTPServerApplication).RunWithFlagSet()
      /Users/tom/projects/wof/go-whosonfirst-spatial-www/server/app.go:122 +0x9bc
  github.com/whosonfirst/go-whosonfirst-spatial-www/server.(*HTTPServerApplication).Run()
      /Users/tom/projects/wof/go-whosonfirst-spatial-www/server/app.go:69 +0x1d0
  main.main()
      /Users/tom/projects/wof/go-whosonfirst-spatial-www/cmd/server/main.go:24 +0x4c
==================
2022/08/12 08:58:19 index 101874113#:0 [-0.26, -0.15]x[52.88, 52.93]
2022/08/12 08:58:19 index 1175612693#:0 [-2.49, -2.25]x[53.42, 53.54]
2022/08/12 08:58:19 index 101872113#:0 [-1.97, -1.85]x[53.28, 53.36]
2022/08/12 08:58:19 index 101853557#:0 [-0.64, -0.58]x[51.68, 51.73]
2022/08/12 08:58:19 index 101854787#:0 [-0.39, -0.33]x[51.79, 51.84]
[etc]
@thisisaaronland
Copy link
Member

Okay, the first thing I can see is that the code (read: me) is wrapping to call to index data in an unnecessary Go routine:

https://github.com/whosonfirst/go-whosonfirst-spatial-www/blob/main/server/app.go#L122-L129

It is unnecessary because the code it is calling invokes its own Go routine:

func (p *SpatialApplication) IndexPaths(ctx context.Context, paths ...string) error {

So the code starts by saying: Spatial application, go an index these paths in the background.

And a little further down it says: Create a data (HTTP) handler using the spatial application's spatial database:

https://github.com/whosonfirst/go-whosonfirst-spatial-www/blob/main/server/app.go#L161

If you look carefully at the data handler you'll see it doesn't know anything about "spatial applications" but only about readers; specifically the spatial application implements the reader.Reader interface:

func NewDataHandler(r reader.Reader) (gohttp.Handler, error) {

https://github.com/whosonfirst/go-whosonfirst-spatial-rtree/blob/main/database.go#L589

That part is important. The data handler doesn't know anything about the spatial application or the spatial application's iterator which is indexing data in the background, notably the IsIndexing() method:

So it's very possible for the data handler to start getting requests for records that haven't been indexed yet, for example with a really big repo like GB postalcodes.

One bad habit that I've been trying to address going forward is not wrapping other people's errors in something that will help track down errors. Not doing this often leads to getting an error that just says io.EOF which while self-explanatory is often unhelpful. I mention that because both the (HTTP) data handler and the go-whosonfirst-spatial-rtree implementation of the Reader interface suffer from this:

https://github.com/whosonfirst/go-whosonfirst-spatial-rtree/blob/main/database.go#L620-L646

At the end of all of this my immediate hunch is that the data race is being triggered by the nested Go routines and the io.EOF error because... I'm not sure because it doesn't seem like the kind of error to be triggered even if a record hasn't been indexed.

So, I will create a new branch that does the following:

  • Removes the unnecessary Go routine for indexing records
  • Create a middleware (HTTP) handler that wraps the data handler and ensures that indexing is complete and return a suitable 5XX error if not

Hopefully one or both of those things will address the issues you're seeing.

@thisisaaronland
Copy link
Member

thisisaaronland commented Aug 12, 2022

Can you try the issue-22 branch and see if that helps.

https://github.com/whosonfirst/go-whosonfirst-spatial-www/compare/issue-22

For example:

$> go run -mod vendor cmd/server/main.go -spatial-database-uri rtree://  /usr/local/data/sfomuseum-data-whosonfirst/

%> curl -v http://localhost:8080/data/85632685
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /data/85632685 HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.79.1
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 503 Service Unavailable
< Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
< Date: Fri, 12 Aug 2022 18:19:02 GMT
< Content-Length: 30
< 
Service unavailable: indexing
* Connection #0 to host localhost left intact

Note that the go-whosonfirst-spatial-pip/api PIP handler was already checking whether indexing was happening:

if app.Iterator.IsIndexing() {
http.Error(rsp, "Indexing records", http.StatusServiceUnavailable)
return
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants