-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Filestream] migrate state from previous ID to current #42624
base: main
Are you sure you want to change the base?
[Filestream] migrate state from previous ID to current #42624
Conversation
This commit enables Filestream inputs to migrate file states from previous IDs to its current ID, this is done by adding a `previous_ids` entry to the input configuration. We look in the store for all states that match an active file with one of the previous IDs and migrate this state to the new ID. The migrated old states are marked for removal from the store.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
…-from-other-filestream-id
…-from-other-filestream-id
I've handled the accidental file identity migration on cc980a2. Now if the file identity is different, then no entry will be migrated from the registry. |
@belimawr your new test is failing:
|
Thanks! I accidentally made the test rely on a hardcoded inode 🤦♂️ I'm gonna put it back to draft until I sort out all the tests. I also have some other things failing on Windows 😞 |
@@ -424,6 +424,7 @@ otherwise no tag is added. {issue}42208[42208] {pull}42403[42403] | |||
- The journald input is now generally available. {pull}42107[42107] | |||
- Add metrics for number of events and pages published by HTTPJSON input. {issue}42340[42340] {pull}42442[42442] | |||
- Filestram input now supports migrating state when changing its ID, for that set `previous_ids`. {issue}42472[42472] {pull}42624[42624] | |||
- Add `etw` input fallback to attach an already existing session. {pull}42847[42847] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it supposed to be here? It seems unrelated to this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That came when I merged main
onto my branch, I'll fix it. Thanks for spotting that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if runtime.GOOS == "windows" { | ||
t.Logf("[WARN] Could not remove temporatry directory '%s': %s", tempDir, err) | ||
} else { | ||
t.Errorf("could not remove temp dir '%s': %s", tempDir, err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it happening that often? I'm wondering if it'd be worth retrying within a given timeout and/or perhaps getting a callback that should try to kill the beat once again. For any test that does not want/need/can try to stop the beat again, the callback can be a noop. That way this function does not depend on the BeatProc
but still can attempt to stop it and retry the clean up.
It could receive the BeatProc
if it wouldn't compromise other tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it happening that often?
Often enough. While testing this PR on Windows, every time I added an runtime.GOOS == "windows"
to fix a failure, another would pop up. They all together make the test fail all the time.
I'm a bit confused on what would be the function of the callback you're suggesting. One of the ideas of this integration test framework is to remove as much work from the engineer writing the test as possible, the clean up part being one of the key automation. So it feels that adding a callback for the clean up would be adding burden on the engineer writing tests.
Moving it back to draft while we discuss: #36777 |
Proposed commit message
This commit enables Filestream inputs to migrate file states from previous IDs to its current ID, this is done by adding a
previous_ids
entry to the input configuration. We look in the store for all states that match an active file with one of the previous IDs and migrate this state to the new ID. The migrated old states are marked for removal from the store. States are only migrated if they have the same file identity.Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Disruptive User Impact
Author's Checklist
How to test this PR locally
Run the tests (on all platforms: Linux, Mac and Windows)
Manual test
Create a file with more than 1kb in size:
docker run -it --rm mingrammer/flog -f rfc5424 -n 15 > /tmp/flog.log
Run Filebeat with the following configuration:
filebeat.yml
Wait until the file is fully ingested (15 lines in the output file)
Stop Filebeat
Optionally, remove the logs, this will make the next steps easier.
rm -rf logs
Run Filebeat with the following configuration (note the change in
id
and the newprevious_ids
):filebeat.yml
Wait until Filebeat "read" the files to the end. Look for
End of file reached: /tmp/flog.log; Backoff now.
in the second execution logs.Ensure no new data was added to the output file
You can also look for log entries like:
Related issues
## Use cases## ScreenshotsLogs