-
Hello, dear community! I'm trying to figure out whether it is possible to do something with the help of Awkward. I have an ak array with
The lines that share the same values in the first two columns appear consecutively. I would like to know if there's a way to restructure this ak array so that the lines with the same values in the first two columns are kept separate from the other "groups".
-
Hi @CaueSousa! I'm interpreting your question as: you want to split an array of length
One thing that would be helpful to know: is it ever possible for two rows to match in one of the first two columns but not in the other? I.e., is this
or this
possible? Additionally, does the order of this array matter? Can we re-order the rows, as long as the groups are preserved?
-
Oddly enough, there is a function for this. You can use ak.run_lengths to determine how many values are equal (contiguously), and then ak.unflatten to make groups of those sizes. You might need to flatten and select an individual column before applying this operation, though. The links to documentation point to similar examples.
@jpivarski your answer is where my mind was going! I was unsure as to the requirement to split on both columns, but I accidentally found the solution :)
@CaueSousa I wasn't initially sure how easy this would be. It turns out that there's a neat trick (or at least, I think it's a trick!) that lets us do this using run_lengths without any complex slicing.

First, let's define an example array:
>>> import awkward as ak
>>> array = ak.Array(
...     [
...         # A
...         [0, 0, 1],
...         [0, 0, 2],
...         # B
...         [1, 1, 3],
...         # C
...         [2, 2, 4],
...         # D
...         [2, 3, 5],
...         [2, 3, 6],
...     ]
... )
Let's find the lengths of the "runs" of same-value items in the first column:
>>> runs_in_col_0 = ak.run_len…