Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

android FSM sometimes gets stuck in the waiting_for_trip_start state #135

Closed
shankari opened this issue Oct 20, 2016 · 18 comments
Closed

android FSM sometimes gets stuck in the waiting_for_trip_start state #135

shankari opened this issue Oct 20, 2016 · 18 comments

Comments

@shankari
Copy link
Contributor

While mucking with turning tracking on and off for the power measurement paper, I found that the android FSM sometimes gets stuck in the waiting_for_trip_start state and does not respond to any inputs. Initially, only one of the phones was in that state, but today morning, both of them were. I manually killed the process and restarted it and that fixed it for one of the phones.

Need to investigate and figure out what the underlying issue is.

@shankari
Copy link
Contributor Author

I turned on tracking manually

10-20 08:52:49.502  20555-20555/? I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@5e0ded1, Intent { act=local.transition.start_tracking flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 08:52:49.510  20555-20555/? D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 3 action = local.transition.start_tracking
10-20 08:52:49.515  20555-20555/? D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 08:52:49.533  20555-20555/? D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.476978769519E9

I left home, got a geofence transition

10-20 08:56:51.476  20555-20555/? D/GeofenceExitIntentService﹕ onStartCommand called with intent Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.GeofenceExitIntentService (has extras) } flags 0 startId 1
10-20 08:56:51.480  20555-20555/? D/GeofenceExitIntentService﹕ onStart called with startId 1
10-20 08:56:51.486   20555-8493/? D/GeofenceExitIntentService﹕ geofence exit intent action = null
10-20 08:56:51.491   20555-8493/? D/GeofenceExitIntentService﹕ parsedEvent = com.google.android.gms.location.GeofencingEvent@88139d
10-20 08:56:51.497   20555-8493/? D/GeofenceExitIntentService﹕ got geofence intent callback with type 2 and location Location[fused 37.392256,-122.092221 acc=1046 et=+2d14h47m32s72ms]
10-20 08:56:51.506   20555-8493/? D/GeofenceExitIntentService﹕ Geofence exited! Intent = Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.GeofenceExitIntentService (has extras) } Starting ongoing monitoring...
10-20 08:56:51.546  20555-20555/? I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@5e0ded1, Intent { act=local.transition.exited_geofence flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 08:56:51.555  20555-20555/? D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 4 action = local.transition.exited_geofence
10-20 08:56:51.563  20555-20555/? D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 08:56:51.580  20555-20555/? D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.476979011566E9

I launched the app to check the state - it was WAITING_FOR_TRIP_START

10-20 09:13:55.914  20555-20555/? I/chromium﹕ [INFO:CONSOLE(106)] "[object Object]", source: file:///data/user/0/edu.berkeley.eecs.emission/app_384d8ef4-8bfb-11e6-9e86-223f96a6770c/js/control.js?ionicCachebuster=8619 (106)

So I killed the app.

10-20 09:19:55.089    2356-2366/? I/ActivityManager﹕ Killing 30585:edu.berkeley.eecs.emission:sync/u0a112 (adj 13): remove task
10-20 09:19:55.203    2356-3278/? I/ActivityManager﹕ Killing 20555:edu.berkeley.eecs.emission/u0a112 (adj 16): remove task

Both the service and the activity start back up

10-20 09:19:55.336    2356-3228/? W/ActivityManager﹕ Scheduling restart of crashed service edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineService in 89198428ms
10-20 09:19:58.307    2356-4880/? I/ActivityManager﹕ START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=edu.berkeley.eecs.emission/.MainActivity (has extras)} from uid 10034 on display 0
10-20 09:19:58.358    2356-3228/? I/ActivityManager﹕ Start proc 9142:edu.berkeley.eecs.emission/u0a112 for activity edu.berkeley.eecs.emission/.MainActivity
10-20 09:19:58.895    9142-9142/? I/System.out﹕ During plugin initialize, created usercacheedu.berkeley.eecs.emission.cordova.usercache.BuiltinUserCache@b289ed5
10-20 09:19:59.216    2356-2374/? I/ActivityManager﹕ Displayed edu.berkeley.eecs.emission/.MainActivity: +872ms

And I start getting location points

10-20 09:19:59.497    9142-9142/? D/LocationChangeIntentService﹕ onStart called with Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.LocationChangeIntentService (has extras) } startId 1
10-20 09:19:59.506    9142-9142/? D/LocationChangeIntentService﹕ onStart called with Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.LocationChangeIntentService (has extras) } startId 2
10-20 09:19:59.516    9142-9191/? D/LocationChangeIntentService﹕ FINALLY! Got location update, intent is Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.LocationChangeIntentService (has extras) }
10-20 09:19:59.534    9142-9191/? D/LocationChangeIntentService﹕ FINALLY! Got location update, intent is Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.LocationChangeIntentService (has extras) }

10-20 09:20:02.866    9142-9202/? D/ActivityRecognitionChangeIntentService﹕ FINALLY! Got activity update, intent is Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.ActivityRecognitionChangeIntentService (has extras) }
10-20 09:20:02.894    9142-9202/? D/ActivityRecognitionChangeIntentService﹕ FINALLY! Got activity update, intent is Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.ActivityRecognitionChangeIntentService (has extras) }

I verified from the UI, and the current state is ONGOING_TRIP

It's pretty clear that what's happening is that the code gets to the mApiClient.connect() and then the connection callback is never invoked. Could it be that mApiClient is null? I am not getting a null pointer backtrace.

Could it be that the connection callback is not re-invoked if the client is
already connected?

        Log.d(this, TAG, "service started with flags = "+flags+" startId = "+startId
                +" action = "+intent.getAction());
        if (flags == Service.START_FLAG_REDELIVERY) {
            Log.d(this, TAG, "service restarted! need to check idempotency!");
        }
        mTransition = intent.getAction();
        mPrefs = PreferenceManager.getDefaultSharedPreferences(this);
        mCurrState = mPrefs.getString(this.getString(R.string.curr_state_key),
            this.getString(R.string.state_start));
        Log.d(this, TAG, "after reading from the prefs, the current state is "+mCurrState);
        UserCacheFactory.getUserCache(this).putMessage(R.string.key_usercache_transition,
                new Transition(mCurrState, mTransition));

                // And then connect to the client. All subsequent processing will be in the onConnected
        // method
        mApiClient.connect();

        /*
         We are returning with START_REDELIVER_INTENT, so the process will be restarted with the
         same intent if it is killed. We need to think through the implications of this. If the
         process was killed before any of the actions were performed, then we are fine. If the
         process was killed after the actions were performed, then we are in trouble, because the
         actions are not necessarily idempotent.

         Some of the actions are clearly idempotent. For example, starting the activity recognition
         API. For example, deleting a geofence. It might be easiest to convert the actions to be
         idempotent, but that requires some careful work. TODO: Need to think through this carefully.
         */
        return START_REDELIVER_INTENT;
    }

    @Override
    public void onConnected(Bundle connectionHint) {
        Log.d(this, TAG, "TripDiaryStateMachineReceiver onConnected("+connectionHint+") called");
        ...

@shankari
Copy link
Contributor Author

Let's look to see when the service was created. The log has entries since
10-17. However, the service was only created once, right before the transitions
that actually worked. Although note that transitions between launching the
connection and the connection being complete were also dropped.

10-20 09:19:58.993    9142-9142/? I/TripDiaryStateMachineService﹕ Service created. Initializing one-time variables!
10-20 09:19:59.007    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 1 action = local.transition.start_tracking
10-20 09:19:59.010    9142-9142/? D/TripDiaryStateMachineService﹕ service restarted! need to check idempotency!
10-20 09:19:59.015    9142-9142/? D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 09:19:59.031    9142-9142/? D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.47698039902E9


10-20 09:19:59.039    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 2 action = local.transition.stop_tracking
10-20 09:19:59.043    9142-9142/? D/TripDiaryStateMachineService﹕ service restarted! need to check idempotency!
10-20 09:19:59.047    9142-9142/? D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 09:19:59.061    9142-9142/? D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.476980399051E9


10-20 09:19:59.065    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 3 action = local.transition.start_tracking
10-20 09:19:59.069    9142-9142/? D/TripDiaryStateMachineService﹕ service restarted! need to check idempotency!
10-20 09:19:59.073    9142-9142/? D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 09:19:59.085    9142-9142/? D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.476980399077E9


10-20 09:19:59.099    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 4 action = local.transition.exited_geofence
10-20 09:19:59.105    9142-9142/? D/TripDiaryStateMachineService﹕ service restarted! need to check idempotency!
10-20 09:19:59.111    9142-9142/? D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 09:19:59.121    9142-9142/? D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.476980399114E9


10-20 09:19:59.245    9142-9142/? D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver onConnected(null) called
10-20 09:19:59.250    9142-9142/? D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleAction(local.state.waiting_for_trip_start, local.transition.exited_geofence) called
10-20 09:19:59.264    9142-9142/? D/BuiltinUserCache﹕ Added value for key background/battery at time 1.476980399255E9
10-20 09:19:59.268    9142-9142/? D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleTripStart(local.transition.exited_geofence) called

@shankari
Copy link
Contributor Author

shankari commented Oct 20, 2016

So maybe this primarily affects transitions from the UI. When the UI is launched, and we try multiple transitions, the same service is reused. And with the current change, we reuse the api client. So if connect() does not invoke the callback when it is already connected, that would explain it. In which case, the fix is also fairly simple - just check the connected state and call handleAction() if it is already connected.

@shankari
Copy link
Contributor Author

In which case, the fix is also fairly simple - just check the connected state and call handleAction() if it is already connected.

This does not appear to work. I relaunched a debug version of the app on the phone, with this modified code.

        if (mApiClient.isConnected()) {
            Log.d(this, TAG, "client is already connected, can directly handle the action");
            handleAction(this, mApiClient, mCurrState, mTransition);
        } else {
            // And then connect to the client. All subsequent processing will be in the onConnected
            // method
            mApiClient.connect();
        }

And ran it with the debugger, and the isConnected() always returned false.
Will add the logging just in case and will see if I can reproduce later from a release build.

@shankari
Copy link
Contributor Author

shankari commented Oct 20, 2016

Hm, on repro, the service was created every time it worked. That's why the service was created every time. So maybe the fix from #135 (comment) would work after all.

Will try to repro and see if I can get service to restart without create.

Try 1

10-20 15:34:49.421  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@1259349, Intent { act=local.transition.start_tracking flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 15:34:49.429  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service created. Initializing one-time variables!
10-20 15:34:49.435  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.start_tracking
10-20 15:34:49.441  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.tracking_stopped
10-20 15:34:49.462  19930-19930/edu.berkeley.eecs.emission D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.477002889447E9
10-20 15:34:49.475  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver onConnected(null) called
10-20 15:34:49.478  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleAction(local.state.tracking_stopped, local.transition.start_tracking) called
10-20 15:34:49.491  19930-19930/edu.berkeley.eecs.emission D/BuiltinUserCache﹕ Added value for key background/battery at time 1.477002889483E9
10-20 15:34:49.497  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleTrackingStopped(local.transition.start_tracking) called
10-20 15:34:49.500  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState after handling action is local.state.start
10-20 15:34:49.508  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState saved in prefManager is local.state.start
10-20 15:34:49.514  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

Try 2

10-20 15:34:49.515  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ noarg constructor called
10-20 15:34:49.518  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@1259349, Intent { act=local.transition.initialize flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 15:34:49.523  19930-19930/edu.berkeley.eecs.emission I/System.out﹕ data = {"approval_date":"2016-07-14","protocol_id":"2014-04-6267"}
10-20 15:34:49.529  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ 2016-07-14 is the current consented version, sending msg to service...
10-20 15:34:49.531  19930-19930/edu.berkeley.eecs.emission E/chromium﹕ [ERROR:xwalk_autofill_client.cc(121)] Not implemented reached in virtual void xwalk::XWalkAutofillClient::OnFirstUserGestureObserved()
10-20 15:34:49.531  19930-19930/edu.berkeley.eecs.emission D/cr_Ime﹕ [InputMethodManagerWrapper.java:56] isActive: true
10-20 15:34:49.532  19930-19930/edu.berkeley.eecs.emission D/cr_Ime﹕ [InputMethodManagerWrapper.java:65] hideSoftInputFromWindow
10-20 15:34:49.539  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service created. Initializing one-time variables!
10-20 15:34:49.542  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.initialize
10-20 15:34:49.546  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.start
10-20 15:34:49.559  19930-19930/edu.berkeley.eecs.emission D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.477002889549E9
10-20 15:34:49.576  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver onConnected(null) called
10-20 15:34:49.580  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleAction(local.state.start, local.transition.initialize) called
10-20 15:34:49.601  19930-19930/edu.berkeley.eecs.emission D/BuiltinUserCache﹕ Added value for key background/battery at time 1.477002889585E9
10-20 15:34:49.605  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleStarted(local.transition.initialize) called
10-20 15:34:49.606  19930-26938/edu.berkeley.eecs.emission I/System.out﹕ Running in new thread!!
10-20 15:34:49.616  19930-26938/edu.berkeley.eecs.emission D/CreateGeofenceAction﹕ Last location would have been Location[fused 37.875621,-122.258698 acc=22 et=+22h46m50s597ms] if we hadn't reset it
10-20 15:34:49.620  19930-26938/edu.berkeley.eecs.emission D/CreateGeofenceAction﹕ Last location is Location[fused 37.875621,-122.258698 acc=22 et=+22h46m50s597ms] using it
10-20 15:34:49.630  19930-26938/edu.berkeley.eecs.emission D/CreateGeofenceAction﹕ creating geofence at location Location[fused 37.875621,-122.258698 acc=22 et=+22h46m50s597ms]
10-20 15:34:49.636  19930-26938/edu.berkeley.eecs.emission D/CreateGeofenceAction﹕ creating geofence at location 37.875621, -122.2586978
10-20 15:34:49.659  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState after handling action is local.state.waiting_for_trip_start
10-20 15:34:49.670  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState saved in prefManager is local.state.waiting_for_trip_start
10-20 15:34:49.677  19930-19930/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

@shankari
Copy link
Contributor Author

restart_tracking.txt.gz

@shankari
Copy link
Contributor Author

Looking at the prior times when the state change did not happen, I can confirm that the service was not created.

10-20 08:52:49.489  20555-20555/? I/TripDiaryStateMachineReceiver﹕ noarg constructor called
10-20 08:52:49.502  20555-20555/? I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@5e0ded1, Intent { act=local.transition.start_tracking flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 08:52:49.510  20555-20555/? D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 3 action = local.transition.start_tracking
10-20 08:52:49.515  20555-20555/? D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 08:52:49.533  20555-20555/? D/BuiltinUserCache﹕ Added value for key statemachine/transition at time 1.476978769519E9

@shankari
Copy link
Contributor Author

Aha! a difference between the two! startId = 1 versus startId = 3.
Let's look more closely into what that means and how to reproduce that condition...

10-20 15:34:49.542  19930-19930/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.initialize
10-20 08:52:49.510  20555-20555/? D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 3 action = local.transition.start_tracking

@shankari
Copy link
Contributor Author

quick grep shows the following restarts

10-20 08:52:49.510  20555-20555/? D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 3 action = local.transition.start_tracking
10-20 08:56:51.476  20555-20555/? D/GeofenceExitIntentService﹕ onStartCommand called with intent Intent { cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.GeofenceExitIntentService (has extras) } flags 0 startId 1
10-20 08:56:51.480  20555-20555/? D/GeofenceExitIntentService﹕ onStart called with startId 1
10-20 08:56:51.555  20555-20555/? D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 4 action = local.transition.exited_geofence
10-20 09:19:59.007    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 1 action = local.transition.start_tracking
10-20 09:19:59.039    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 2 action = local.transition.stop_tracking
10-20 09:19:59.065    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 3 action = local.transition.start_tracking
10-20 09:19:59.099    9142-9142/? D/TripDiaryStateMachineService﹕ service started with flags = 1 startId = 4 action = local.transition.exited_geofence

@shankari
Copy link
Contributor Author

from the docs, startId doesn't seem to mean much.

startId     int: A unique integer representing this specific request to start. Use with stopSelfResult(int)

But it is fairly clear that there are restarts with multiple different IDs.
In particular, the 8:52 call has a startId = 3 and the 8:56 call has a startId = 4.

Another thing to note is that several of the restarts (specially around 9:19) start with flags = 1, which corresponds to

START_FLAG_REDELIVERY
Added in API level 5

int START_FLAG_REDELIVERY

This flag is set in onStartCommand(Intent, int, int) if the Intent is a re-delivery of a previously delivered intent, because the service had previously returned START_REDELIVER_INTENT but had been killed before calling stopSelf(int) for that Intent.

Constant Value: 1 (0x00000001) 

I wonder if the restart was because I manually killed the app around 9am. But if so, why is the TripDiaryStateMachineService around for so long? And if it was a previous crash, then why did it crash and why aren't there any logs around it.

@shankari
Copy link
Contributor Author

shankari commented Oct 20, 2016

For the record, I call stopSelf(); here.

    public void setNewState(String newState) {
        Log.d(this, TAG, "newState after handling action is "+newState);
        SharedPreferences.Editor prefsEditor =
                PreferenceManager.getDefaultSharedPreferences(this).edit();
        prefsEditor.putString(this.getString(R.string.curr_state_key), newState);
        prefsEditor.commit();
        Log.d(this, TAG, "newState saved in prefManager is "+
                PreferenceManager.getDefaultSharedPreferences(this).getString(
                        this.getString(R.string.curr_state_key), "not found"));
        stopSelf();
    }

and when the service is stopped, it is destroyed, which didn't happen because we didn't get any logs around

I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

@shankari
Copy link
Contributor Author

So far, I can think of two potential causes - either the client was already connected or the usercache call crashed. if the cause is that the client was already connected, it can be fixed by the connected check. If it's the usercache crash, then we won't have the entry in the cache. Let's see if we do and try to distinguish between them.

@shankari
Copy link
Contributor Author

shankari commented Oct 20, 2016

If it's the usercache crash, then we won't have the entry in the cache.

Entries are in the cache.

2016-10-20T08:52:49.519000-07:00 local.transition.start_tracking
2016-10-20T08:56:51.566000-07:00 local.transition.exited_geofence

So it doesn't look like the service randomly crashed.
Phew!
So its probably the connection issue, but I don't know how to get the OS to restart the service instead of creating new ones to test it.

I will attempt to test for a bit to see if I can repro and then just merge the changes anyway and wait to see if I can repro in the real world

@shankari
Copy link
Contributor Author

shankari commented Oct 21, 2016

Ok, I can repro. It looks like, when I change the location tracking config, the service is created and
destroyed. Note that config changes re-initialize.

10-20 16:58:26.111  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service created. Initializing one-time variables!
10-20 16:58:26.115  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.initialize
10-20 16:58:26.122  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.start
10-20 16:58:26.156  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver onConnected(null) called10-20 16:58:26.161  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleAction(local.state.start, local.transition.initialize) called
10-20 16:58:26.183  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleStarted(local.transition.initialize) called
10-20 16:58:26.252  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState after handling action is local.state.waiting_for_trip_start
10-20 16:58:26.265  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState saved in prefManager is local.state.waiting_for_trip_start
10-20 16:58:26.272  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

And after that, the service just gets restarted. However, in this case, the action is handled correctly and the service is destroyed.

10-20 16:59:26.718  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.stop_tracking
10-20 16:59:26.723  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 16:59:26.755  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver onConnected(null) called
10-20 16:59:26.760  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ TripDiaryStateMachineReceiver handleAction(local.state.waiting_for_trip_start, local.transition.stop_tracking) called
10-20 16:59:26.764  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState after handling action is local.state.tracking_stopped
10-20 16:59:26.774  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ newState saved in prefManager is local.state.tracking_stopped
10-20 16:59:26.780  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

@shankari
Copy link
Contributor Author

Yup, confirmed that it was the re-config.

Switched to the control screen.

10-20 16:58:01.055  11493-11493/edu.berkeley.eecs.emission I/chromium﹕ [INFO:CONSOLE(29)] "Finished changing state from {"url":"/metrics","views":{"main-metrics":{"templateUrl":"templates/main-metrics.html","controller":"MetricsCtrl"}},"name":"root.main.metrics"} to {"url":"/control","views":{"main-control":{"templateUrl":"templates/control/main-control.html","controller":"ControlCtrl"}},"name":"root.main.control"}", source: file:///android_asset/www/js/controllers.js (29)

Turned off.

10-20 16:58:19.135  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ noarg constructor called
10-20 16:58:19.149  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@ee3224c, Intent { act=local.transition.stop_tracking flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 16:58:19.161  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service created. Initializing one-time variables!
10-20 16:58:19.182  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.stop_tracking

Turned on.

10-20 16:58:25.976  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ noarg constructor called
10-20 16:58:25.985  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@ee3224c, Intent { act=local.transition.start_tracking flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 16:58:25.994  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service created. Initializing one-time variables!
10-20 16:58:26.000  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.start_tracking
10-20 16:58:26.086  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

Reconfigured.

10-20 16:58:26.086  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ noarg constructor called
10-20 16:58:26.090  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@ee3224c, Intent { act=local.transition.initialize flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 16:58:26.097  11493-11493/edu.berkeley.eecs.emission I/System.out﹕ data = {"approval_date":"2016-07-14","protocol_id":"2014-04-6267"}
10-20 16:58:26.105  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ 2016-07-14 is the current consented version, sending msg to service...
10-20 16:58:26.111  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service created. Initializing one-time variables!
10-20 16:58:26.115  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.initialize
10-20 16:58:26.272  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

Tried to stop again.

10-20 16:59:26.693  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ noarg constructor called
10-20 16:59:26.700  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineReceiver﹕ TripDiaryStateMachineReciever onReceive(android.app.ReceiverRestrictedContext@ee3224c, Intent { act=local.transition.stop_tracking flg=0x10 cmp=edu.berkeley.eecs.emission/.cordova.tracker.location.TripDiaryStateMachineReceiver }) called
10-20 16:59:26.718  11493-11493/edu.berkeley.eecs.emission D/TripDiaryStateMachineService﹕ service started with flags = 0 startId = 1 action = local.transition.stop_tracking
10-20 16:59:26.780  11493-11493/edu.berkeley.eecs.emission I/TripDiaryStateMachineService﹕ Service destroyed. So long, suckers!

@shankari
Copy link
Contributor Author

And finally emission crashed.
logcat.emission.crash.gz

Turning duty cycling back on.

10-20 17:10:47.621 15039 15058 I System.out: data = {"accuracy":100,"accuracy_threshold":200,"android_geofence_responsiveness":5000,"filter_distance":-1,"filter_time":30000,"geofence_radius":100,"ios_use_remote_push_for_sync":false,"ios_use_visit_notifications_for_detection":false,"is_duty_cycling":true,"simulate_user_interaction":false,"trip_end_stationary_mins":5}
10-20 17:10:47.627 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 1 action = local.transition.exited_geofence
10-20 17:10:47.631 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!

It looks like the service then continuously respawns, trying to connect every time.
And then onConnected is called without the client actually being connected.
WTF!

10-20 17:10:47.750 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 2 action = local.transition.exited_geofence
10-20 17:10:47.754 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:47.760 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:47.774 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008647764E9
10-20 17:10:47.781 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 3 action = local.transition.exited_geofence
10-20 17:10:47.787 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:47.792 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:47.811 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008647796E9
10-20 17:10:47.823 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 4 action = local.transition.exited_geofence
10-20 17:10:47.831 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:47.842 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:47.859 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008647846E9
10-20 17:10:47.867 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 5 action = local.transition.exited_geofence
10-20 17:10:47.874 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:47.880 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:47.900 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008647885E9
10-20 17:10:47.913 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 6 action = local.transition.exited_geofence
10-20 17:10:47.920 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:47.928 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:47.947 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008647932E9
10-20 17:10:47.959 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 7 action = local.transition.exited_geofence
10-20 17:10:47.976 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:47.983 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:47.999 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008647989E9
10-20 17:10:48.013 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 8 action = local.transition.exited_geofence
10-20 17:10:48.017 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:48.021 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:48.047 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008648027E9
10-20 17:10:48.055 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 9 action = local.transition.exited_geofence
10-20 17:10:48.061 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:48.065 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:48.081 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.47700864807E9
10-20 17:10:48.101 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 10 action = local.transition.exited_geofence
10-20 17:10:48.107 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:48.112 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:48.130 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008648116E9
10-20 17:10:48.139 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 11 action = local.transition.exited_geofence
10-20 17:10:48.143 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:48.148 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:48.164 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008648153E9
10-20 17:10:48.172 15039 15039 D TripDiaryStateMachineService: service started with flags = 1 startId = 12 action = local.transition.exited_geofence
10-20 17:10:48.176 15039 15039 D TripDiaryStateMachineService: service restarted! need to check idempotency!
10-20 17:10:48.183 15039 15039 D TripDiaryStateMachineService: after reading from the prefs, the current state is local.state.waiting_for_trip_start
10-20 17:10:48.202 15039 15039 D BuiltinUserCache: Added value for key statemachine/transition at time 1.477008648187E9
10-20 17:10:48.268 15039 15039 D TripDiaryStateMachineService: TripDiaryStateMachineReceiver onConnected(null) called
10-20 17:10:48.269 15039 15039 D AndroidRuntime: Shutting down VM
10-20 17:10:48.270 15039 15039 E AndroidRuntime: FATAL EXCEPTION: main
10-20 17:10:48.270 15039 15039 E AndroidRuntime: Process: edu.berkeley.eecs.emission, PID: 15039
10-20 17:10:48.270 15039 15039 E AndroidRuntime: java.lang.IllegalStateException: Cannot invoke getConnectionResult unless GoogleApiClient is connected
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.internal.zzmg.getConnectionResult(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at edu.berkeley.eecs.emission.cordova.tracker.location.TripDiaryStateMachineServiceOngoing.onConnected(TripDiaryStateMachineServiceOngoing.java:123)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.common.internal.zzk.zzk(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.internal.zzmg.zzi(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.internal.zzme.zzpi(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.internal.zzme.onConnected(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.internal.zzmi.onConnected(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.internal.zzlz.onConnected(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.common.internal.zzj$zzg.zzqv(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.common.internal.zzj$zza.zzc(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.common.internal.zzj$zza.zzv(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.common.internal.zzj$zzc.zzqx(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.google.android.gms.common.internal.zzj$zzb.handleMessage(Unknown Source)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at android.os.Handler.dispatchMessage(Handler.java:102)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at android.os.Looper.loop(Looper.java:148)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at android.app.ActivityThread.main(ActivityThread.java:5417)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at java.lang.reflect.Method.invoke(Native Method)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
10-20 17:10:48.270 15039 15039 E AndroidRuntime:    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)

@shankari
Copy link
Contributor Author

ok, so I think that it is because of this. In this case, while we are running the service, we broadcast something else, so while the receiver is handling it, it restarts the service without creating it.

        if (actionString.equals(ctxt.getString(R.string.transition_start_tracking))) {
            setNewState(ctxt.getString(R.string.state_start));
            ctxt.sendBroadcast(new Intent(ctxt.getString(R.string.transition_initialize)));
        } else {

@shankari
Copy link
Contributor Author

shankari commented Oct 21, 2016

Also, although I am able to reproduce the start without create, I am not able to reproduce the lack of connected callback. Even in the case where it crashed, we got

10-20 17:10:48.268 15039 15039 D TripDiaryStateMachineService: TripDiaryStateMachineReceiver onConnected(null) called

setting the state to start and then initializing is bad in many ways (see #115). It was basically a somewhat lazy implementation that we can fix by factoring the code and calling the same action from two places.

shankari added a commit to shankari/e-mission-data-collection that referenced this issue Oct 21, 2016
If the state is tracking_stopped, don't automatically restart tracking:
- if you receive an initialize (e.g. on upgrade)
- if you receive a reconfigure

This fixes e-mission#115.

Also made some fixes to try and avoid e-mission#135, although I am not sure if they are
completely fixed. In particular,
- added a check for the isConnected() state and some additional logging. This
  is in case the reason that the transition was not handled was because the
  connection handler was never called.
- also changed the structure so that we no longer send a broadcast from the
  service. This should ensure that the TripDiaryStateMachineService (duty
  cycled and ongoing) is never restarted.

Fixed some logging as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant