Use arrays rather than Dicts for Contour Chasing #51

Draft
wants to merge 15 commits into
base: master

sjkelly (Member) commented Feb 14, 2020

This builds on #50. Inspired by: https://www.researchgate.net/publication/282975362_Flying_Edges_A_High-Performance_Scalable_Isocontouring_Algorithm

This could serve as a basis for that implementation, which should yield even better performance once fully implemented.

The basic idea is to store each cell's case in a 2D Matrix and loop through this matrix once. Since the matrix is all UInt8, it is very compact in memory. We also avoid hashing overhead and do not need to store keys. Since we track the number of non-zero elements for large arrays, sparse contours (such as a single closed loop) should still perform better than the dictionary approach, though they may require more allocated memory.
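
A minimal sketch of the idea above, not the code in this PR: each cell's marching-squares case is packed into a UInt8 and stored in a dense matrix, together with a count of the cells the contour crosses. The names `cell_case` and `case_matrix` and the corner-to-bit ordering are assumptions made for illustration.

```julia
# Hypothetical helper: pack the four corner tests of cell (i, j) into the low 4 bits.
# The corner-to-bit ordering is an assumption, not necessarily the package's.
function cell_case(z::AbstractMatrix, h, i::Integer, j::Integer)
    c = 0x00
    z[i,     j    ] > h && (c |= 0x01)
    z[i + 1, j    ] > h && (c |= 0x02)
    z[i + 1, j + 1] > h && (c |= 0x04)
    z[i,     j + 1] > h && (c |= 0x08)
    return c
end

# Fill a dense UInt8 matrix with one case per cell and count the cells that
# the contour crosses (cases 0x00 and 0x0f have no crossing and are stored as 0).
function case_matrix(z::AbstractMatrix, h)
    nx, ny = size(z)
    cases = zeros(UInt8, nx - 1, ny - 1)
    ncross = 0
    for j in 1:ny-1, i in 1:nx-1   # i is the inner loop (column-major friendly)
        c = cell_case(z, h, i, j)
        if c != 0x00 && c != 0x0f
            cases[i, j] = c
            ncross += 1
        end
    end
    return cases, ncross
end

# Small 5x5 example: a unit circle sampled on an integer grid.
z = [sqrt(x^2 + y^2) - 1 for x in -2:1.0:2, y in -2:1.0:2]
cases, ncross = case_matrix(z, 0.0)
```

No keys or hashes are stored; a cell is looked up by its indices, and `ncross` tells a later tracing pass how many crossings remain to be visited.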

For reference:
v0.5.1:

contour
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "testdata" => BenchmarkTools.Trial: 
	  memory estimate:  3.14 MiB
	  allocs estimate:  31197
	  --------------
	  minimum time:     2.951 ms (0.00% GC)
	  median time:      3.283 ms (0.00% GC)
	  mean time:        3.464 ms (3.25% GC)
	  maximum time:     7.878 ms (0.00% GC)
	  --------------
	  samples:          1438
	  evals/sample:     1

This PR:

contour
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "testdata" => BenchmarkTools.Trial: 
	  memory estimate:  569.73 KiB
	  allocs estimate:  370
	  --------------
	  minimum time:     911.743 μs (0.00% GC)
	  median time:      1.029 ms (0.00% GC)
	  mean time:        1.068 ms (1.40% GC)
	  maximum time:     2.901 ms (58.94% GC)
	  --------------
	  samples:          4662
	  evals/sample:     1

…culation

The chief suspect in allocation (and triggering GC) was all the small arrays allocated
to handle ambiguous cases. With this solution we use the 5th bit (0x10) to indicate
an ambiguous case. The logic to disambiguate is handled in the processing of crossings.
This significantly reduces memory allocations and yields modest performance improvements.
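
To illustrate the flag described above, here is a minimal sketch, not the code from this commit. It assumes the usual marching-squares convention that the two saddle cases are 0x05 and 0x0a, and that the ambiguity is resolved by comparing the average of the four corners with the contour level. `AMBIGUOUS`, `CASE_MASK`, `flag_case`, `is_ambiguous`, `base_case`, and `center_above` are hypothetical names.

```julia
const AMBIGUOUS = 0x10   # 5th bit marks a saddle (ambiguous) cell
const CASE_MASK = 0x0f   # low 4 bits hold the corner case

# Mark saddle cases in place instead of allocating a small array for them.
# Assumption: 0x05 and 0x0a are the two ambiguous cases.
flag_case(c::UInt8) = (c == 0x05 || c == 0x0a) ? (c | AMBIGUOUS) : c

is_ambiguous(c::UInt8) = (c & AMBIGUOUS) != 0x00
base_case(c::UInt8) = c & CASE_MASK

# Deferred disambiguation, evaluated only when a crossing is processed:
# true when the average of the four corner values lies above the level h.
center_above(corners::NTuple{4,Real}, h) = sum(corners) / 4 > h

c = flag_case(0x05)
```

Because the flag lives in an otherwise unused bit of the same UInt8, flagged cells cost no extra storage and the saddle test is a single mask operation.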
This reverts commit b1a9150.
src/Contour.jl Outdated
@@ -187,6 +197,7 @@ end

# Maps cell type to crossing types for non-ambiguous cells
const edge_LUT = (SW, SE, EW, NE, 0x0, NS, NW, NW, NS, 0x0, NE, EW, SE, SW)
const start_edge_LUT = (0x00, 0x00, 0x01, 0x00, 0x01, 0x02, 0x00, 0x00, 0x01, 0x02, 0x00, 0x04, 0x00, 0x00)
Contributor

I think these should probably be refactored to use some named constants instead of just magic numbers.
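
For illustration only, a sketch of what such a refactoring could look like for the second tuple. The names `NO_EDGE`, `N`, `S`, `E`, and `start_edge_named`, and the meaning attached to each value, are assumptions; the diff does not say what the values stand for.

```julia
# Assumed names for the literal values; both names and meanings are illustrative.
const NO_EDGE = 0x00
const N = 0x01
const S = 0x02
const E = 0x04

# Same contents as the literal tuple in the diff, spelled with names.
const start_edge_named = (NO_EDGE, NO_EDGE, N, NO_EDGE, N, S, NO_EDGE, NO_EDGE,
                          N, S, NO_EDGE, E, NO_EDGE, NO_EDGE)
```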

Member Author

Thank you for looking at this. I agree. Even better, on the latest diff these aren't there at all.

I was playing around with collapsing the branching logic, but it turns out that the codegen was the same, since Julia inlined and LLVM unrolled the loops and merged everything.

Contributor

Yeah, I haven't actually spent any time with this codebase for several years, so someone else should probably review it too. That's why I left it as a flat comment, without either 👍 or 👎 - I want to leave the actual decision to someone else... :)

sjkelly (Member Author) commented Mar 2, 2020

Here is a case where the allocated memory is greater but performance is better:

using Contour
using BenchmarkTools

x = -1:0.0001:1
y = -1:0.0001:1
c(x,y) = sqrt(x^2+y^2) - 1

z = [c(xi,yi) for xi in x, yi in y];

@benchmark contours(collect(x), collect(y), z)

Master:

BenchmarkTools.Trial: 
  memory estimate:  103.07 MiB
  allocs estimate:  795684
  --------------
  minimum time:     37.911 s (0.03% GC)
  median time:      37.911 s (0.03% GC)
  mean time:        37.911 s (0.03% GC)
  maximum time:     37.911 s (0.03% GC)
  --------------
  samples:          1
  evals/sample:     1

PR:

BenchmarkTools.Trial: 
  memory estimate:  396.66 MiB
  allocs estimate:  291
  --------------
  minimum time:     12.295 s (0.00% GC)
  median time:      12.295 s (0.00% GC)
  mean time:        12.295 s (0.00% GC)
  maximum time:     12.295 s (0.00% GC)
  --------------
  samples:          1
  evals/sample:     1

#50 :

BenchmarkTools.Trial: 
  memory estimate:  48.22 MiB
  allocs estimate:  634
  --------------
  minimum time:     37.249 s (0.00% GC)
  median time:      37.249 s (0.00% GC)
  mean time:        37.249 s (0.00% GC)
  maximum time:     37.249 s (0.00% GC)
  --------------
  samples:          1
  evals/sample:     1
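
Taking the figures reported above at face value (each trial is a single sample, so these are rough), the ratios can be checked with a few lines of arithmetic. The variable names are only for illustration.

```julia
# Figures copied from the three trials above (seconds, MiB).
master = (time = 37.911, mem = 103.07)
pr     = (time = 12.295, mem = 396.66)
pr50   = (time = 37.249, mem = 48.22)

speedup_vs_master = master.time / pr.time   # roughly 3.1x faster than master
memory_vs_master  = pr.mem / master.mem     # roughly 3.8x more memory than master
speedup_vs_pr50   = pr50.time / pr.time     # roughly 3.0x faster than #50
```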
