Scheduler Longhaul Tests #238

Merged

yaron2 merged 15 commits into dapr:master from cicoyle:feat-scheduler-longhaul

Nov 20, 2024

Contributor

cicoyle commented Oct 21, 2024

Robust Jobs API test (all with ttl set)

schedule 100 oneshot jobs indefinitely (repeat = 1)
schedule 100 indefinite jobs indefinitely (repeat not set, trigger every 30s)
schedule repeat-job job indefinitely (repeat = 5, trigger every 1s due immediately)
indefinitely schedule and delete a create-delete-job job (repeat = 1, trigger every 1s)
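The four job shapes listed above can be sketched as plain data, independent of any SDK. This is an illustration only: `JobSpec` and the helper functions are hypothetical names, not types from the Dapr Go SDK or from this PR, and the TTL value and the `DueTime` on one-shot jobs are assumptions.

```go
package main

import "fmt"

// JobSpec is a hypothetical, SDK-independent description of a job.
// Its fields mirror the options named in the PR description.
type JobSpec struct {
	Name     string
	Schedule string // e.g. "@every 30s"; empty for one-shot jobs
	DueTime  string // e.g. "0s" for "due immediately"
	Repeats  uint32 // 0 means "not set", i.e. no repeat limit
	TTL      string // every job in the test sets a TTL
}

// oneshotJobs builds n jobs that fire exactly once (repeat = 1).
// The "0s" due time is an assumption; the PR only states repeat = 1.
func oneshotJobs(n int, ttl string) []JobSpec {
	jobs := make([]JobSpec, 0, n)
	for i := 0; i < n; i++ {
		jobs = append(jobs, JobSpec{
			Name:    fmt.Sprintf("oneshot-job-%d", i),
			DueTime: "0s",
			Repeats: 1,
			TTL:     ttl,
		})
	}
	return jobs
}

// indefiniteJobs builds n jobs with no repeat limit that trigger every 30s.
func indefiniteJobs(n int, ttl string) []JobSpec {
	jobs := make([]JobSpec, 0, n)
	for i := 0; i < n; i++ {
		jobs = append(jobs, JobSpec{
			Name:     fmt.Sprintf("indefinite-job-%d", i),
			Schedule: "@every 30s",
			TTL:      ttl,
		})
	}
	return jobs
}

// repeatJob triggers every 1s, is due immediately, and repeats 5 times.
func repeatJob(ttl string) JobSpec {
	return JobSpec{Name: "repeat-job", Schedule: "@every 1s", DueTime: "0s", Repeats: 5, TTL: ttl}
}

// createDeleteJob is the job that the test schedules and deletes in a loop.
func createDeleteJob(ttl string) JobSpec {
	return JobSpec{Name: "create-delete-job", Schedule: "@every 1s", Repeats: 1, TTL: ttl}
}

func main() {
	ttl := "40s" // assumed value; the PR only says a TTL is set
	fmt.Println(len(oneshotJobs(100, ttl)), len(indefiniteJobs(100, ttl)))
	fmt.Printf("%+v\n%+v\n", repeatJob(ttl), createDeleteJob(ttl))
}
```

In the real test each spec would be handed to the Jobs API on every loop iteration; that call is left out here on purpose.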

Fun Scheduler Actor Reminders test that simulates a player session: the player's health decays over time, and a periodic health pack (or the like) increases it. Once the player's health reaches 0, the player dies, both reminders (health decay and health increase) are unregistered, and then the player is revived and the reminders are started again. Note the SchedulerReminders config is set.
Workflow app that repeatedly starts, pauses, and resumes workflows. It defines a TestWorkflow that interacts with activities, handles external events, and logs the workflow's progress through stages. The app continuously monitors workflows and terminates them as necessary. Note the SchedulerReminders config is set.
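The player-session lifecycle from the actor reminders test can be modeled without Dapr at all. The sketch below is not the code from this PR: the `Player` type, the decay and heal amounts, the tick cadence, and `maxHealth` are all assumptions chosen to make the die / unregister / revive cycle visible, and reminders are reduced to a boolean flag.

```go
package main

import "fmt"

const maxHealth = 100 // assumed starting and maximum health

// Player is a hypothetical stand-in for the actor's state.
type Player struct {
	Health      int
	Deaths      int
	RemindersOn bool // true while both reminders are registered
}

// NewPlayer returns a player at full health with reminders registered.
func NewPlayer() *Player {
	return &Player{Health: maxHealth, RemindersOn: true}
}

// Decay models the health decay reminder firing. When health reaches 0
// the player dies and both reminders are unregistered.
func (p *Player) Decay(amount int) {
	p.Health -= amount
	if p.Health <= 0 {
		p.Health = 0
		p.Deaths++
		p.RemindersOn = false
	}
}

// Heal models the health increase reminder firing, capped at maxHealth.
func (p *Player) Heal(amount int) {
	p.Health += amount
	if p.Health > maxHealth {
		p.Health = maxHealth
	}
}

// Revive restores full health and registers the reminders again.
func (p *Player) Revive() {
	p.Health = maxHealth
	p.RemindersOn = true
}

// simulate runs the session for the given number of ticks. Decay fires on
// every tick and the health pack on every third tick; a dead player spends
// one tick being revived.
func simulate(p *Player, ticks int) {
	for t := 1; t <= ticks; t++ {
		if !p.RemindersOn {
			p.Revive()
			continue
		}
		p.Decay(10)
		if p.RemindersOn && t%3 == 0 {
			p.Heal(5)
		}
	}
}

func main() {
	p := NewPlayer()
	simulate(p, 30)
	fmt.Printf("health=%d deaths=%d reminders=%t\n", p.Health, p.Deaths, p.RemindersOn)
}
```

Because decay outpaces healing, the player is guaranteed to die eventually, which is what keeps the register/unregister path exercised in a long-running test.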

I added instructions for how to run locally.

cicoyle added 5 commits

October 21, 2024 14:06

robust jobs api longhaul test
e2e905a
Signed-off-by: Cassandra Coyle <[email protected]>

fix file name
c6bd947
Signed-off-by: Cassandra Coyle <[email protected]>

add actor reminder test apps: client + server
439192e
Signed-off-by: Cassandra Coyle <[email protected]>

update ports and add dockerfile
ab84339
Signed-off-by: Cassandra Coyle <[email protected]>

add workflow app
2cd4dd6
Signed-off-by: Cassandra Coyle <[email protected]>

cicoyle marked this pull request as ready for review

October 30, 2024 15:05

cicoyle added 2 commits

October 30, 2024 10:32

add bicep per app
07aaad2
Signed-off-by: Cassandra Coyle <[email protected]>

add workflow deploy yaml updates
ac21c48
Signed-off-by: Cassandra Coyle <[email protected]>

elena-kolevska requested changes

Contributor

elena-kolevska left a comment

Looks amazing @cicoyle! I ran the local multi-app run and it worked great. Before going into a more detailed review of the apps, I'll share a few notes so you can work on them in the meantime:

We need to add the app's deployment file, following this example: https://github.com/dapr/test-infra/blob/master/longhaul-test/hashtag-actor-deploy.yml
We also need a new GitHub workflow file that will deploy the apps (example: https://github.com/cicoyle/test-infra/blob/feat-scheduler-longhaul/.github/workflows/hashtag-actor.yml). This should be triggered every time there are code changes in the app directory and (re)deploy your app in the cluster.

Also, please double-check whether we need to specify persistence for etcd or whether we are using the default. I don't remember what we decided last.

cicoyle added 2 commits

October 30, 2024 11:37

add redeploy config and aks/app bicep files
9e2fde4
Signed-off-by: Cassandra Coyle <[email protected]>

tweak compose
16c161c
Signed-off-by: Cassandra Coyle <[email protected]>

elena-kolevska requested changes

Contributor

elena-kolevska left a comment

Submitting feedback for the scheduler jobs app. I'm checking the other two tomorrow.

scheduler-jobs/README.md (outdated, resolved)

scheduler-jobs/scheduler-jobs.go (outdated, resolved)

scheduler-jobs/README.md (outdated, resolved)

scheduler-jobs/scheduler-jobs.go (outdated, resolved)

scheduler-jobs/scheduler-jobs.go

Comment on lines +221 to +222

		log.Println("waiting a few seconds to let connections establish")
		time.Sleep(5 * time.Second)

Contributor

elena-kolevska Nov 10, 2024

Does the sdk provide a way to know this for certain? If not, we should probably think about adding one.

Contributor Author

cicoyle Nov 19, 2024

I'm uncertain. I think I added this because, in my local testing, I noticed a discrepancy where some jobs seemed to get dropped if they were scheduled too quickly after startup.

scheduler-jobs/scheduler-jobs.go (outdated, resolved)

scheduler-jobs/README.md (outdated, resolved)

scheduler-jobs/scheduler-jobs.go (outdated, resolved)

scheduler-jobs/scheduler-jobs.go (outdated, resolved)

scheduler-jobs/README.md (outdated, resolved)

elena-kolevska requested changes

Contributor

elena-kolevska left a comment

Adding a review for the scheduler-workflow app.
Looking at actor reminders next.

scheduler-workflow/workflow.go (outdated, resolved)

scheduler-workflow/workflow.go (outdated, resolved)

mikeee mentioned this pull request

v1.15 Release Planning dapr/dapr#8017

Open

38 tasks

cicoyle added 3 commits

November 19, 2024 16:27

update ctx + cleanup + readme
061cc23
Signed-off-by: Cassandra Coyle <[email protected]>

wf test updates
c4a4a42
Signed-off-by: Cassandra Coyle <[email protected]>

update actor reminder client code
342f5a1
Signed-off-by: Cassandra Coyle <[email protected]>

Contributor Author

cicoyle commented Nov 19, 2024

Thanks for the review @elena-kolevska - I've updated the code accordingly :) Ready for your review again

cicoyle requested a review from elena-kolevska

November 19, 2024 23:41

elena-kolevska requested changes

scheduler-actor-reminders/client/player-actor-client.go (outdated, resolved)

scheduler-actor-reminders/README.md


		This tests the Scheduler for the underlying storage for Actor Reminders.

		## How To Run the Code:

Contributor

elena-kolevska Nov 19, 2024

I wasn't able to run the code; I keep getting this error:

== APP == dapr client initializing for: 127.0.0.1:65181
== APP == 2024/11/19 23:05:27 error starting health increase reminder: error invoking register actor reminder playerActorType/player-1: rpc error: code = PermissionDenied desc = operations on actor reminders are only possible on hosted actor types
DEBU[0001] api error: code = Internal desc = error invoke actor method: rpc error: code = Canceled desc = context canceled  app_id=player-actor-client instance=Elenas-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.14.4

These are the commands I used:

dapr run --app-id player-actor --app-port 3007 --log-level debug --config ../dapr/config.yaml -- go run player-actor.go

and

dapr run --app-id player-actor-client --log-level debug --config ../dapr/config.yaml --placement-host-address=localhost:50007,localhost:50008,localhost:50009 -- go run player-actor-client.go

I tried running with 1.14.4 and latest builds from master.
What am I missing?

Contributor Author

cicoyle Nov 20, 2024

I'm able to run it as-is using the instructions from my readme and do not see that error. Did you try the exact commands from my readme? I'm also using v1.14.4.

I have seen that error before, outside the context of this PR (in the wf scheduler e2e PR in CI), but I am not able to repro the issue you are seeing, and I'm not familiar enough with actors to know what the issue is off the top of my head.

Contributor Author

cicoyle Nov 20, 2024

First, I would try the commands noted in the readme, since that's exactly what I've tested and what works for me. If that doesn't work for you, then I would try leaving --placement-host-address unset and see if that changes things. I noticed from your command that you are running placement in HA mode.

Contributor

elena-kolevska Nov 20, 2024

Yeah, I did run the original commands as they are in the readme, and I got this error:

== APP == 2024/11/20 17:25:02 error invoking actor method GetUser: error invoking binding playerActorType/player-1: rpc error: code = Internal desc = the state store is not configured to use the actor runtime. Have you set the - name: actorStateStore value: "true" in your state store component file?

Then I removed the --app-port 3008 argument for the client and got the error I shared above. Really weird. I'll look into it some more.

scheduler-actor-reminders/Dockerfile-client (outdated, resolved)

.github/workflows/scheduler-actor-reminders-client.yml (outdated, resolved)

.github/workflows/scheduler-actor-reminders-server.yml (outdated, resolved)

deploy/dapr-multi-app/dapr.yaml (resolved)

deploy/dapr-multi-app/dapr.yaml

+                - appID: scheduler-actor-reminders-client
+                  appDirPath: ../../scheduler-actor-reminders/client
+                  appPort: 3008

Contributor

elena-kolevska Nov 20, 2024

Suggested change

appPort: 3008

Contributor Author

cicoyle Nov 20, 2024

I know it's a client, but I think this is still needed for the connection with Dapr to work.

deploy/dapr-multi-app/dapr.yaml (resolved)

longhaul-test/scheduler-actor-reminder-client.yml Outdated

Comment on lines 49 to 50

		- name: dapr
		  containerPort: 3008

Contributor

elena-kolevska Nov 20, 2024

Suggested change

      
- name: dapr
  containerPort: 3008

Contributor Author

cicoyle Nov 20, 2024

I think this is still needed for Dapr to be able to communicate with my app in k8s.

longhaul-test/scheduler-workflow-deploy.yml (resolved)

Contributor

elena-kolevska commented Nov 20, 2024

Thanks for updating, @cicoyle. I went through the deployment-related files in the meantime and they mostly look good. I corrected the things I saw; you can just accept the changes, and we'll see if we got everything right when we try to deploy.

The only thing that's left is verifying that the reminders app problem I ran into is just some config thing on my side. Does the program run for you right now? I tried it on the earlier version, before your commits from today.

I'll look into your updates first thing tomorrow.

cicoyle and others added 3 commits

November 20, 2024 08:40

Update scheduler-actor-reminders/Dockerfile-client
d5168c2
Co-authored-by: Elena Kolevska <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

fix ports and infra config
eb5cab9
Signed-off-by: Cassandra Coyle <[email protected]>

fix docker-compose image name
1e5877a
Signed-off-by: Cassandra Coyle <[email protected]>

yaron2 approved these changes

yaron2 merged commit b0653a0 into dapr:master

9 checks passed
