Flight split based on condition #457

xoolive · 2024-07-09T15:25:43Z

No description provided.

junzis · 2024-07-11T15:00:39Z

Can we get this method into LazyTraffic?

For example, the following code throws an error: Method 'split' not implemented on Traffic or LazyTraffic

t = (
    t.iterate_lazy(iterate_kw=dict(by="6 hours"))
    .assign_id(name="{self.callsign}_{idx:>04}")
    .filter()
    .longer_than("1 hour")
    .pipe(enough_data)
    .split("30min", condition=no_split_enroute)
    .pipe(full_flight)
)

xoolive · 2024-07-12T09:38:07Z

The proper way to add a method to LazyTraffic is to add a decorator to a method without an implementation (use the ellipsis ...) in the Traffic class, like everything below this line:

traffic/src/traffic/core/traffic.py

Line 736 in a83e95b

@lazy_evaluation()

Note that the decorator will look for a method in Flight with the same name:

traffic/src/traffic/core/lazy.py

Line 341 in a83e95b

if not hasattr(Flight, f.__name__):
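To make the mechanism concrete, here is a toy sketch of such a decorator. It is not the code in lazy.py: the Flight, Traffic and LazyTraffic classes are minimal stand-ins, and the stack layout (a list of name/args/kwargs tuples) is an assumption made for illustration. It only shows the two ideas above: the guard on Flight, and the fact that the placeholder body is never executed.

```python
import functools
from typing import Any, Callable


class Flight:
    """Stand-in for the real Flight class (illustration only)."""

    def longer_than(self, minutes: float) -> bool:
        raise NotImplementedError  # not needed for this sketch


class LazyTraffic:
    """Stand-in: holds the stack of (method name, args, kwargs) to apply later."""

    def __init__(self, stack: list) -> None:
        self.stack = stack


def lazy_evaluation() -> Callable[[Callable[..., Any]], Callable[..., LazyTraffic]]:
    def decorator(f: Callable[..., Any]) -> Callable[..., LazyTraffic]:
        # Same guard as the line quoted above: Flight must define the method.
        if not hasattr(Flight, f.__name__):
            raise TypeError(f"Flight has no method named {f.__name__!r}")

        @functools.wraps(f)
        def wrapper(self: Any, *args: Any, **kwargs: Any) -> LazyTraffic:
            # The placeholder body (...) is never run; the call is only recorded.
            previous = getattr(self, "stack", [])
            return LazyTraffic(previous + [(f.__name__, args, kwargs)])

        return wrapper

    return decorator


class Traffic:
    @lazy_evaluation()
    def longer_than(self, minutes: float) -> LazyTraffic: ...


lazy = Traffic().longer_than(60)
print(lazy.stack)  # [('longer_than', (60,), {})]
```

Decorating a method whose name does not exist on Flight fails at class-definition time in this sketch, which is the kind of early failure the hasattr check is there for.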

Lazy methods

At the moment only two main categories of Flight methods are supported:

- those returning a boolean: if True, the input flight is passed as is to the next method, if False, the processing for the current flight stops;
- those returning an Optional[Flight]: if None, the processing for the current flight stops, if a Flight, the output flight is passed to the next method on the stack.

There is also summary() which returns a dictionary but it can only be used at the end of the stack: eval() will create a pd.DataFrame out of all the records. There is nothing in the code to ensure that no other method is stacked after it.
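The two categories boil down to a simple rule for what happens to each flight along the stack. A minimal sketch of that rule follows, with a stand-in Flight dataclass and a hypothetical run_stack helper (neither is the library's implementation):

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass(frozen=True)
class Flight:
    """Stand-in for the real Flight class (illustration only)."""

    flight_id: str
    duration_min: float


Step = Callable[[Flight], Union[bool, Optional[Flight]]]


def run_stack(flight: Flight, stack: list) -> Optional[Flight]:
    """Apply each step in order; stop as soon as one returns False or None."""
    current = flight
    for step in stack:
        out = step(current)
        if out is False or out is None:
            return None  # processing stops for this flight
        if out is not True:
            current = out  # an Optional[Flight] step produced a new flight
        # out is True: the flight moves on unchanged
    return current


def longer_than_1h(f: Flight) -> bool:
    return f.duration_min > 60


def first_hour(f: Flight) -> Optional[Flight]:
    # Hypothetical transformation returning a new (shorter) flight
    return Flight(f.flight_id, 60.0)


kept = run_stack(Flight("A", 90.0), [longer_than_1h, first_hour])
dropped = run_stack(Flight("B", 30.0), [longer_than_1h, first_hour])
```

Here kept is the shortened flight and dropped is None, because the boolean step rejected the second flight before the next step ran.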

The most important category that is not supported as is consists of methods returning a FlightIterator: since many segments can be returned, this would break the linearity of the stack.

Instead of

flight:
|_ method1 -> bool
|_ method2 -> None | Flight
|_ method3 -> None | Flight

Supporting them would create a tree of executions, with a risk of RecursionError and no possibility to merge branches later:

flight
|_ method1 -> bool
|_ method2 -> FlightIterator
   |_ [segment1]
   |  |_ method3 -> FlightIterator
   |     |_ ...
   |_ [segment2]
   |  |_ method3 -> FlightIterator
   |     |_ ...

A common use case with FlightIterator, however, is to use an aggregator right behind it (i.e. a function FlightIterator -> None | bool | Flight).

That pattern is supported in LazyTraffic, with many aggregators such as has, next, or all. They also accept the method as a string, to avoid passing lambdas, which cannot be serialized (and therefore cannot be sent to other processes). That's why instead of:

flight.aligned_on_ils("EHRD").next()  # None | Flight
traffic.aligned_on_ils("EHRD").next().eval()  # INVALID code
        ^__ returns a FlightIterator
                               ^__ brings the FlightIterator back to a Flight

You would write:

traffic.next('aligned_on_ils("EHRD")').eval()
        ^__ next returns an Optional[Flight]
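The serialization constraint can be checked with the standard library alone. The sketch below uses a hypothetical can_pickle helper and does not touch the traffic library; it only shows that a lambda is rejected by pickle while the string form of the same call goes through untouched:

```python
import pickle


def can_pickle(obj: object) -> bool:
    """Return True if obj survives pickle.dumps (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# The lambda is never called here, so it needs no real flight object.
as_lambda = lambda flight: flight.aligned_on_ils("EHRD").next()
as_string = 'aligned_on_ils("EHRD")'

print(can_pickle(as_lambda))  # False
print(can_pickle(as_string))  # True
```

Since multiprocessing relies on pickling to send work to other processes, the string form is the one that can travel.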

On your example

t = (
    t.iterate_lazy(iterate_kw=dict(by="6 hours"))
    .assign_id(name="{self.callsign}_{idx:>04}")  # you assign a flight_id
    .filter()
    .longer_than("1 hour")
    .pipe(enough_data)
    .split("30min", condition=no_split_enroute)  # this returns a FlightIterator
    .pipe(full_flight)
)

The problem here that needs clarification is that when you split the trajectories, all the pieces will have the same flight_id. I am not sure if it is what you want. You may have wanted to put the assign_id() at the end of the stack? It feels error-prone though. I am also not sure what the full_flight method does.
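To illustrate the concern with a toy example (the Segment dataclass is a stand-in, not a library type, and the identifiers are made up): after a split, every piece carries the identifier assigned before the split, unless it is rewritten with a template such as "{self.flight_id}_{i}".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """Stand-in for one piece of a split flight (illustration only)."""

    flight_id: str
    start_min: int
    stop_min: int


# Two pieces of the same original flight, as a split would produce them
pieces = [
    Segment("KLM123_0001", 0, 45),
    Segment("KLM123_0001", 80, 130),
]

# Both pieces still share the identifier assigned before the split
shared_ids = {p.flight_id for p in pieces}

# Rewriting the identifier with a template makes each piece unique again
template = "{self.flight_id}_{i}"
renamed = [
    replace(p, flight_id=template.format(self=p, i=i))
    for i, p in enumerate(pieces)
]
print([p.flight_id for p in renamed])  # ['KLM123_0001_0', 'KLM123_0001_1']
```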

That's the reason I am not sure I want to open Pandora's box with lazy flight iterators: this kind of situation.

Workaround

There is, however, a workaround in the following example, which uses the all() aggregator followed by a .drop(columns=['flight_id']), or here, where a new flight_id is generated in the all() aggregator.

For your situation, two things are unclear to me:

- how you want to apply the full_flight function (are the types valid in the first place?);
- whether .all('split("30min", condition=no_split_enroute)') would work. I haven't checked. If it doesn't, it would probably be worth implementing.

junzis · 2024-07-12T09:45:13Z

This is what I came up with in the end. Not pretty, but it works well.

from traffic.core import Flight, Traffic


def no_split_enroute(f1: Flight, f2: Flight) -> bool:
    return f1.data.iloc[-1].altitude < 25000 or f2.data.iloc[0].altitude < 25000


def resample_filter_split(flight: Flight) -> None | Flight:
    flight_ = Traffic.from_flights(
        [f.resample("2s").filter() for f in flight.split("10min")]
    )[0]  # 🌶️  I would recommend writing it Flight(pd.concat([...])) to avoid confusion with the [0] here

    return flight_.split("30min", condition=no_split_enroute).all(
        "{self.flight_id}_{i}"
    )  # 🌶️ This would be a perfect use case for flight_.all('split("30min", condition=no_split_enroute)', flight_id="{self.flight_id}_{i}")

def full_flight(flight: Flight) -> None | Flight:  # 🌶️  return a bool
    if (  # 🌶️ replace 'if' with 'return'
        flight.data.altitude.iloc[0] < 5000
        and flight.data.altitude.iloc[-1] < 5000
        and flight.altitude_max > 15000
    ):
        return flight  # 🌶️ remove
    return None  # 🌶️  remove

t = (
    t.iterate_lazy(iterate_kw=dict(by="6 hours"))
    .assign_id(name="{self.callsign}_{idx:>04}")
    .filter()
    .longer_than("1 hour")
    .pipe(resample_filter_split)
    .pipe(full_flight)
    .eval(max_workers=8, desc="processing")
)

xoolive · 2024-07-12T09:50:09Z

Commenting directly in your code above
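Taken together, the inline comments on full_flight amount to turning it into a plain predicate. A sketch of the result, assuming the flight exposes data.altitude and altitude_max as in the code above; fake_flight is a hypothetical stand-in used only to exercise the function without the library:

```python
from types import SimpleNamespace
from typing import Any


def full_flight(flight: Any) -> bool:
    # True only if the trajectory starts low, ends low, and climbs high in between
    altitude = flight.data.altitude
    return bool(
        altitude.iloc[0] < 5000
        and altitude.iloc[-1] < 5000
        and flight.altitude_max > 15000
    )


def fake_flight(altitudes: list) -> SimpleNamespace:
    # Hypothetical stand-in: a plain list supports the [0] / [-1] indexing used above
    return SimpleNamespace(
        data=SimpleNamespace(altitude=SimpleNamespace(iloc=altitudes)),
        altitude_max=max(altitudes),
    )


complete = full_flight(fake_flight([1000, 30000, 2000]))
truncated = full_flight(fake_flight([1000, 30000, 28000]))
```

A function returning a bool fits the first category of lazy methods: True lets the flight through unchanged, False stops the processing for that flight.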

xoolive · 2024-07-12T13:13:36Z

You can now replace

flight_ = Traffic.from_flights(
        [f.resample("2s").filter() for f in flight.split("10min")]
    )[0]

with

flight_ = flight.split('10min').map(lambda f: f.resample('2s').filter()).all()
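For readers new to the pattern, here is a toy sketch of what a map step followed by an aggregator does on an iterator of segments. SegmentIterator is a stand-in working on strings, not the library's FlightIterator, and dropping None results in map is an assumption of this sketch:

```python
from typing import Callable, Iterable, Iterator, Optional


class SegmentIterator:
    """Stand-in for a flight iterator (illustration only)."""

    def __init__(self, segments: Iterable[str]) -> None:
        self._segments = iter(segments)

    def __iter__(self) -> Iterator[str]:
        return self._segments

    def map(self, fun: Callable[[str], Optional[str]]) -> "SegmentIterator":
        # Lazily apply fun to each segment, skipping segments mapped to None
        mapped = (fun(seg) for seg in self._segments)
        return SegmentIterator(out for out in mapped if out is not None)

    def all(self) -> Optional[str]:
        # Toy aggregator: join whatever is left, None if nothing remains
        parts = list(self._segments)
        return "+".join(parts) if parts else None


merged = SegmentIterator(["seg1", "seg2"]).map(str.upper).all()
print(merged)  # SEG1+SEG2
```

The chain stays lazy until the aggregator consumes it, which is what lets the one-liner above replace the list comprehension wrapped in Traffic.from_flights(...)[0].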

junzis · 2024-07-12T13:41:25Z

Yes, it works! I think it is good to be merged 👍

split based on condition

40921c3

xoolive assigned junzis Jul 9, 2024

add .map() function on flight iterator

83a949c

xoolive merged commit cd03e2f into master Jul 12, 2024
4 checks passed

xoolive deleted the pr-split-condition branch July 12, 2024 13:42
