Use arrays rather than Dicts for Contour Chasing #51

sjkelly · 2020-02-14T11:35:43Z

This builds on #50. Inspired by: https://www.researchgate.net/publication/282975362_Flying_Edges_A_High-Performance_Scalable_Isocontouring_Algorithm

This PR could serve as a basis for a Flying Edges implementation, which should yield even better performance once complete.

The basic idea is to store each cell's case in a 2D matrix and loop through that matrix once. Because the matrix is all UInt8, it is very compact in memory. We also avoid hashing overhead and do not need to store keys. Since we track the number of non-zero elements, sparse contours on large arrays (such as a single closed loop) should still perform better than the dictionary approach, though they may allocate more memory.
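A minimal sketch of the idea, not the code in this PR: classify every grid cell into a marching-squares case code, keep the codes in a `Matrix{UInt8}`, and count the non-zero entries. The names `cell_cases` and `next_cell` and the corner-to-bit assignment are assumptions made for illustration; `next_cell` resumes from a stored position so that repeated searches for the next unvisited cell add up to a single pass over the matrix.

```julia
# Hypothetical sketch; names and bit layout are illustrative, not Contour.jl's.
# Classify each cell of `z` against isovalue `h` and store the 4-bit case code.
function cell_cases(z::AbstractMatrix, h::Real)
    nx, ny = size(z)
    cells = zeros(UInt8, nx - 1, ny - 1)
    ncrossed = 0                      # cells the contour actually passes through
    @inbounds for j in 1:ny-1, i in 1:nx-1   # i is the inner loop (column-major)
        case = 0x00
        z[i,   j]   > h && (case |= 0x01)
        z[i+1, j]   > h && (case |= 0x02)
        z[i+1, j+1] > h && (case |= 0x04)
        z[i,   j+1] > h && (case |= 0x08)
        # 0x00 (all corners below) and 0x0f (all above) have no crossing.
        if case != 0x00 && case != 0x0f
            cells[i, j] = case
            ncrossed += 1
        end
    end
    return cells, ncrossed
end

# Return the linear index of the first non-zero cell at or after `start`,
# or 0 if none remain. Passing the previous result + 1 avoids rescanning.
function next_cell(cells::Matrix{UInt8}, start::Int)
    @inbounds for k in start:length(cells)
        cells[k] != 0x00 && return k
    end
    return 0
end

# Usage: a single raised sample in the middle of a 3x3 grid.
z = [0.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 0.0]
cells, ncrossed = cell_cases(z, 0.5)
first_cell = next_cell(cells, 1)
```

A chasing loop would zero each cell as it is consumed and stop once the non-zero count reaches zero, so no keys or hashes are needed.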

For reference:
v0.5.1:

contour
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "testdata" => BenchmarkTools.Trial: 
	  memory estimate:  3.14 MiB
	  allocs estimate:  31197
	  --------------
	  minimum time:     2.951 ms (0.00% GC)
	  median time:      3.283 ms (0.00% GC)
	  mean time:        3.464 ms (3.25% GC)
	  maximum time:     7.878 ms (0.00% GC)
	  --------------
	  samples:          1438
	  evals/sample:     1

This PR:

contour
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "testdata" => BenchmarkTools.Trial: 
	  memory estimate:  569.73 KiB
	  allocs estimate:  370
	  --------------
	  minimum time:     911.743 μs (0.00% GC)
	  median time:      1.029 ms (0.00% GC)
	  mean time:        1.068 ms (1.40% GC)
	  maximum time:     2.901 ms (58.94% GC)
	  --------------
	  samples:          4662
	  evals/sample:     1

…culation

The chief suspect for the allocations (and the GC they trigger) was the many small arrays allocated to handle ambiguous cases. This solution instead uses the 5th bit (0x10) of a cell's case value to indicate an ambiguous case; the disambiguation logic is handled while processing crossings. This significantly reduces memory allocations and yields a modest performance improvement.
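A rough illustration of the flag-bit approach follows. Only the `0x10` value comes from the description above; the helper names are hypothetical, and the sketch assumes the usual marching-squares convention in which cases 0x05 and 0x0a (diagonally opposite corners on the same side) are the ambiguous ones. The disambiguation step itself is not shown.

```julia
# Hypothetical helper names; only the 0x10 flag value is taken from the PR text.
const AMBIGUOUS = 0x10   # 5th bit: cell is an ambiguous (saddle) case
const CASE_MASK = 0x0f   # low four bits: the corner case code

# Saddle cells have diagonally opposite corners on the same side of the level.
is_saddle(case::UInt8) = case == 0x05 || case == 0x0a

# Tag the case code in place rather than allocating a small array per cell.
tag_case(case::UInt8) = is_saddle(case) ? (case | AMBIGUOUS) : case

# Queries used later, when crossings are processed.
is_ambiguous(cell::UInt8) = (cell & AMBIGUOUS) != 0x00
base_case(cell::UInt8) = cell & CASE_MASK
```

Because the flag lives in the same byte as the case code, the cell matrix stays a plain `Matrix{UInt8}` and ambiguous cells cost no extra allocation.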

This reverts commit 6e2b878.

This reverts commit b1a9150.

tomasaschan · 2020-02-17T08:09:09Z

src/Contour.jl

@@ -187,6 +197,7 @@ end

 # Maps cell type to crossing types for non-ambiguous cells
 const edge_LUT = (SW, SE, EW, NE, 0x0, NS, NW, NW, NS, 0x0, NE, EW, SE, SW)
+const start_edge_LUT = (0x00, 0x00, 0x01, 0x00, 0x01, 0x02, 0x00, 0x00, 0x01, 0x02, 0x00, 0x04, 0x00, 0x00)


I think these should probably be refactored to use some named constants instead of just magic numbers.
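For illustration only, the table could be written with named constants along these lines. The one-bit-per-edge encoding (`N = 0x01`, `S = 0x02`, `E = 0x04`, `W = 0x08`) and the name `NONE` are assumptions; the diff does not say what the raw values stand for, so the mapping below simply substitutes names for the same numbers.

```julia
# Assumed encoding: one bit per cell edge. These names and values are
# illustrative and may not match the package's own definitions.
const N, S, E, W = 0x01, 0x02, 0x04, 0x08
const NONE = 0x00   # hypothetical name for "no start edge"

# Same 14 entries as the table in the diff, with the magic numbers named.
const start_edge_LUT = (NONE, NONE, N, NONE, N, S, NONE,
                        NONE, N, S, NONE, E, NONE, NONE)
```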

sjkelly · 2020-02-17

Thank you for looking at this. I agree. Even better, on the latest diff these aren't there at all.

I was playing around with collapsing the branching logic, but it turns out that the codegen was the same, since Julia inlined and LLVM unrolled the loops and merged everything.

tomasaschan · 2020-02-17

Yeah, I haven't actually spent any time with this codebase for several years, so someone else should probably review it too. That's why I left it as a flat comment, without either 👍 or 👎 - I want to leave the actual decision to someone else... :)

sjkelly · 2020-03-02T02:13:47Z

Here is a case where this PR allocates more memory but still runs faster:

using Contour
using BenchmarkTools

x = -1:0.0001:1
y = -1:0.0001:1
c(x,y) = sqrt(x^2+y^2) - 1

z = [c(xi,yi) for xi in x, yi in y];

@benchmark contours(collect(x), collect(y), z)

Master:

BenchmarkTools.Trial: 
  memory estimate:  103.07 MiB
  allocs estimate:  795684
  --------------
  minimum time:     37.911 s (0.03% GC)
  median time:      37.911 s (0.03% GC)
  mean time:        37.911 s (0.03% GC)
  maximum time:     37.911 s (0.03% GC)
  --------------
  samples:          1
  evals/sample:     1

PR:

BenchmarkTools.Trial: 
  memory estimate:  396.66 MiB
  allocs estimate:  291
  --------------
  minimum time:     12.295 s (0.00% GC)
  median time:      12.295 s (0.00% GC)
  mean time:        12.295 s (0.00% GC)
  maximum time:     12.295 s (0.00% GC)
  --------------
  samples:          1
  evals/sample:     1

#50:

BenchmarkTools.Trial: 
  memory estimate:  48.22 MiB
  allocs estimate:  634
  --------------
  minimum time:     37.249 s (0.00% GC)
  median time:      37.249 s (0.00% GC)
  mean time:        37.249 s (0.00% GC)
  maximum time:     37.249 s (0.00% GC)
  --------------
  samples:          1
  evals/sample:     1

sjkelly added 13 commits February 13, 2020 20:38

4be7ce2  wip
631f177  pass tests
3903806  store last position to do scan in linear time
260fbaa  get that microsecond unit
b1a9150  reuse cell array
6e2b878  use a LUT for starting edge calc
4e04b3f  Revert "use a LUT for starting edge calc" (This reverts commit 6e2b878.)
e1bb0fc  Revert "reuse cell array" (This reverts commit b1a9150.)
d0607b8  use inbounds in a few spots
daef760  use some more inbounds
59bc3bc  reuse cell array
d9ee21a  move to collapse branches

tomasaschan reviewed Feb 17, 2020

sjkelly added 2 commits March 1, 2020 14:26

e4a3aa2  cleanup logic in get next edge
abed388  some more logic cleanups

sjkelly mentioned this pull request Mar 3, 2020

Performance Improvements and New Algorithms #53

sjkelly mentioned this pull request Dec 28, 2020

Naive methods and parallelizations? JuliaGeometry/Meshing.jl#82
