-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add histogram calculation for RasterBand #468
Conversation
src/raster/rasterband.rs
Outdated
self.max | ||
} | ||
/// Histogram values for each bucket | ||
pub fn histogram(&self) -> &[i32] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider naming buckets
or counts
? With this name, you'll end up with things like this:
let rb: Rasterband<'_> = ...;
let histogram = rb.get_histogram(...);
for count in histogram.histogram() { // <--- awkward.
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also found it weird but I was just being consistent with GDAL naming. Happy to change it if we are not aiming for that.
src/raster/rasterband.rs
Outdated
pub struct Histogram { | ||
min: f64, | ||
max: f64, | ||
n_buckets: i32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this usize
? Sometimes GDAL uses signed values when it doesn't make semantic sense, and there's precedence for rectifying this in Rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't contributed for a while and from memory we were using isize
. I do agree that we could rectify that.
How far do we want to take it? I could also use a NonZeroUsize
, which could be also useful when selecting a band from a dataset... (@lnicola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol... yeah, I've argued for NonZeroUsize
, but creating them is really annoying/verbose.
When I see isize
or i32
I immediately ask myself "What does a negative value mean?". If it has meaning, we should document it. If it doesn't, then I'd at least try in the bindings to coerce it into an unsigned type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wish constructing a NonZeroUsize
from a constant/literal was possible without unsafe
or bloating the binary. 😔
src/raster/rasterband.rs
Outdated
histogram: *mut i32, | ||
histogram_vec: Option<Vec<i32>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this storing the counts twice? Who owns histogram
?
It would really help to document why this is neccessary. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry. I should document that. I used that double storage mechanisms because with GDALGetDefaultHistogram
C owns histogram and with GDALGetRasterHistogram
Rust does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a restating of my idea from before. Does this capture the semantics? I should probably just compile and step through the tests, but I'm admittedly confused if I've gotten the borrowed vs owned cases switched...
pub struct Histogram {
min: f64,
max: f64,
n_buckets: i32,
counts: HistCounts
}
enum HistCounts {
Borrowed(*mut libc::c_void),
Owned(Vec<i32>)
}
...
impl Drop for HistCounts {
fn drop(&mut self) {
match self {
Self::Borrowed(p) => ???;
Self::Owned(v) => ???;
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries mate. I'm pushing those changes now.
src/raster/rasterband.rs
Outdated
/// # Arguments | ||
/// | ||
/// * `force` - If `true`, force the computation. If `false` and no default histogram is available, the method will return None | ||
pub fn get_default_histogram(&self, force: bool) -> Result<Option<Histogram>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For coherence with other methods/functions, I'd consider collapsing these two methods into a single one accepting a new HistogramOptions
with a Default
implementation setting the GDALGetDefaultHistogram
settings.
@lnicola Do you agree with me? Even though there are separate GDALGetDefaultHistogram
and GDALGetRasterHistogram
binding functions, we could be more idiomatic with the HistogramOptions
approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do prefer thin wrappers to avoid learning a new API but I'm happy to make it more idiomatic if you guys want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. Let's hear what some of the other contributors think. I'm on the extreme of "let's not leak C-language workarounds" and "let's leverage Rust to improve ergonomics". Others, I think, are more in your camp, so let's let someone else make the call. We've had related debates here.
src/raster/rasterband.rs
Outdated
max: f64, | ||
n_buckets: i32, | ||
histogram: *mut i32, | ||
histogram_vec: Option<Vec<i32>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think the counts should be usize
or u32
, regardless of what GDAL C returns, but not sure that's a consensus opinion. @lnicola @jdroenner your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw that @lnicola did something similar with raster index, so probably it makes sense to do it like that moving forward.
I've incorporated some of the changes suggested by @metasim, including:
I did not merge the calls for |
@spadarian Take a look at this PR against your repo for an alternative naming and associated documentation, which I think clarifies things for a future contributor. The other option (assuming |
We should also add |
I addressed all the comments... but CI failed on a unrelated test ( raster::processing::dem::slope::tests::test_slope) |
❗ |
Ah, there's a race between Should we use |
Thanks for merging! :) |
@metasim isn't that already merged? |
Oops! Yes. Sorry. Was looking at a non-refreshed page. |
little robustness fix in spadarian#13 |
Rebased @lnicola |
/// Pointer to GDAL allocated array and lenght of histogram counts. | ||
/// | ||
/// Requires freeing with [`VSIFree`][gdal_sys::VSIFree]. | ||
GdalAllocated(*mut u64, usize), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@metasim this seems to use u64
consistently for the counts, did you have some concerns about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, sounds fine. I just had comments about it to make sure we used the .*Ex
variants in boths cases.
CHANGES.md
if knowledge of this change could be valuable to users.This PR exposes 2 GDAL methods to calculate band histograms (default and user-defined).