Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test shim_copy_strip problem #187

Open
reidsunderland opened this issue Nov 20, 2024 · 0 comments
Open

test shim_copy_strip problem #187

reidsunderland opened this issue Nov 20, 2024 · 0 comments

Comments

@reidsunderland
Copy link
Member

reidsunderland commented Nov 20, 2024

I'm creating an issue for this because I'm confused by it and having a hard time troubleshooting it.

I'm using the maybe_fix_known_redirection_bug branch, with working redirection and I'm setting KNOWN_REDIRECTION_BUG=yes (with the modifications on this branch, setting this environment variable only causes the test script to be run using bash ./shim_post_copy.sh instead of ./shim_post_copy.sh, it doesn't actually change the behaviour of the tests anymore -- this variable should be renamed).

Note: If you run the same test with KNOWN_REDIRECTION_BUG unset, you still get the false positive GOOD result, but it makes sense in that case because redirection isn't working.

It works the same for me on Ubuntu 22 and RHEL 8, although only on RHEL 8 I also get another BAD: RESULT: BAD trees have 9 differences. I haven't investigated the tree difference yet.

This is the false positive in the shim_copy_strip test:

+ echo 2

+ echo #no post from touch, refused as repeat

+ touch test_file

2024-11-20 17:44:25,894 [INFO] shim published: { "pubTime":"20241120174425.88976816", "baseUrl":"file:/net/local/home/sunderlandr/repos/sarrac/shim_dirA", "relPath":"test_file", "topic":"v03.post", "identity":{  "method" : "sha512", "value" : "Y+Iuwvvuur8AXlj7+w7uYHxKpBcEWmigzGN2ewSONVkmjTXnLzZ9Oy29Xb3fEvxDl3YroUkmCzeVoD\nkXE73c1w=="  } , "mtime":"20241120174425.88138", "atime":"20241120174425.88138", "mode":"0644", "size":"2", "toto":"pig"}

RESULT: Good! 1 sha512 posts from 162 create test_file (again). as expected

The + echo 2 is actually echo 2 > test_file and that is supposed to be posted. But it's not posted, only the touch gets posted, and that's the false positive. The touch is supposed to be ignored because it's the same file being posted in <10 seconds (the shim_post_minterval setting).

Looking at the logs, they show that the original creation of test_file using redirection is ignored because shim is disabled while trying to post the log:

+ echo '#test 1 sha512 162 create test_file (again)'
#test 1 sha512 162 create test_file (again)
SR_SHIMDEBUG 4 2173023 18.5479  close fd=255!
SR_SHIMDEBUG 8 2173023 18.548  close fd=255 read-only, so no post, passing to built-in.
 startup SR_SHIMDEBUG 4 2181433 0.549613  close fd=3!
SR_SHIMDEBUG 8 2181433 0.549665  close fd=3 read-only, so no post, passing to built-in.
SR_SHIMDEBUG 4 2181433 3.54999 shim disabled during post of /home/ras000/shim_copy_strip.log
SR_SHIMDEBUG 4 2181433 3.55003 srshim_initialize shim starting..
2024-11-20 17:43:09,550 [DEBUG] sr_config_read about to open: /home/ras000/.config/sr3/cpost/local_post.conf
2024-11-20 17:43:09,550 [DEBUG] shim 3.24.11p1rc0 settings: action=foreground, hostname=fdsa, config_name=local_post
2024-11-20 17:43:09,550 [DEBUG]         follow_symlinks=no, realpath: Adjust=-1, Filter=yes, Post=yes
2024-11-20 17:43:09,550 [DEBUG]         force_polling=off, sleep=0, expire=86400, housekeeping=300, sanity_log_dead=450
2024-11-20 17:43:09,550 [DEBUG]         nodupe_ttl=0, nodupe_fileAgeMin=0, nodupe_fileAgeMax=2.592e+06,
2024-11-20 17:43:09,550 [DEBUG]         cache_file=off, cache_basis=path, accept_unmatch=on, messageRateMax=0
2024-11-20 17:43:09,550 [DEBUG]         loglevel=6, logReject=on, logRotateCount=5, logRotateInterval=86400
2024-11-20 17:43:09,550 [DEBUG]         fileEvents=006e, directory=(null), queuename=(null), sum=s
2024-11-20 17:43:09,550 [DEBUG]         statehost=no, v2compatRenameDoublePost=no
2024-11-20 17:43:09,550 [DEBUG]         message_ttl=0, post_exchange=xs_ras000_shimtest, post_exchangeSplit=0, post_exchangeSuffix=(null)
2024-11-20 17:43:09,550 [DEBUG]         source=ras000, to=fdsa, post_baseUrl=file:/home/ras000/sr3c_test/shim_dirA, post_baseDir=/home/ras000/sr3c_test/shim_dirA, realpath_post_baseDir=/home/ras000/sr3c_test/shim_dirA
2024-11-20 17:43:09,550 [DEBUG]         topicPrefix=v03, post_topicPrefix=v03.post, pid=-1
2024-11-20 17:43:09,550 [DEBUG] man sr3_cpost(1) for more information
2024-11-20 17:43:09,550 [DEBUG] post_broker: amqps://ras000:<pw>@fdsa:5671
SR_SHIMDEBUG 4 2181433 3.55094 srshim_initialize setup completed.
SR_SHIMDEBUG 4 2181433 3.55095 absolute 1 shimpost /home/ras000/shim_copy_strip.log, status=0
2024-11-20 17:43:09,550 [DEBUG] realpath_adjust -1, /home/ras000 -> /home/ras000
2024-11-20 17:43:09,550 [DEBUG] sr_isMatchingPattern, testing string: /home/ras000/shim_copy_strip.log vs mask: accept /home/ras000/sr3c_test/.* next=0x563030d8b030
2024-11-20 17:43:09,551 [DEBUG] sr_isMatchingPattern, testing string: /home/ras000/shim_copy_strip.log vs mask: accept /home/ras000/test/.* next=0x563030d8c3f0
2024-11-20 17:43:09,551 [DEBUG] sr_isMatchingPattern, testing string: /home/ras000/shim_copy_strip.log vs mask: reject .*                             next=(nil)
2024-11-20 17:43:09,551 [DEBUG] sr_isMatchingPattern: /home/ras000/shim_copy_strip.log matched mask: reject .*
2024-11-20 17:43:09,551 [INFO] sr_shim rejecting pattern: /home/ras000/shim_copy_strip.log
SR_SHIMDEBUG 4 2181433 3.55108 shim re-enabled after post of /home/ras000/shim_copy_strip.log
---> here's the redirect that SHOULD be posted --> + echo 2
SR_SHIMDEBUG 4 2173023 21.5519  dup2 newfd is open, so close it explicitly to potentially post.
SR_SHIMDEBUG 4 2173023 21.552  close fd=1!
SR_SHIMDEBUG 8 2173023 21.552 close /home/ras000/shim_copy_strip.log fd=1
---> here it says shim is disabled --> SR_SHIMDEBUG 4 2173023 21.5521 shim disabled during post of /home/ras000/shim_copy_strip.log
SR_SHIMDEBUG 4 2173023 21.5521 absolute 1 shimpost /home/ras000/shim_copy_strip.log, status=0
2024-11-20 17:43:09,552 [DEBUG] realpath_adjust -1, /home/ras000 -> /home/ras000
2024-11-20 17:43:09,552 [DEBUG] sr_isMatchingPattern, testing string: /home/ras000/shim_copy_strip.log vs mask: accept /home/ras000/sr3c_test/.* next=0x55d17f58f230
2024-11-20 17:43:09,552 [DEBUG] sr_isMatchingPattern, testing string: /home/ras000/shim_copy_strip.log vs mask: accept /home/ras000/test/.* next=0x55d17f590560
2024-11-20 17:43:09,552 [DEBUG] sr_isMatchingPattern, testing string: /home/ras000/shim_copy_strip.log vs mask: reject .*                             next=(nil)
2024-11-20 17:43:09,552 [DEBUG] sr_isMatchingPattern: /home/ras000/shim_copy_strip.log matched mask: reject .*
2024-11-20 17:43:09,552 [INFO] sr_shim rejecting pattern: /home/ras000/shim_copy_strip.log
---> here --> SR_SHIMDEBUG 4 2173023 21.5523 shim re-enabled after post of /home/ras000/shim_copy_strip.log

NOTE: I had tried modifying the makefile to put the log in ../ instead of ./ and that does cause the log to be rejected, but it still doesn't happen fast enough to block the posting of the close of test_file.

If I add sleep 5 before the echo 2 > test_file, then I do get a post triggered by the redirection (good) but the touch should be ignored due to minterval, and it's not being ignored (bad):

+ echo 2

2024-11-20 17:47:57,236 [INFO] shim published: { "pubTime":"20241120174757.23355785", "baseUrl":"file:/net/local/home/sunderlandr/repos/sarrac/shim_dirA", "relPath":"test_file", "topic":"v03.post", "identity":{  "method" : "sha512", "value" : "Y+Iuwvvuur8AXlj7+w7uYHxKpBcEWmigzGN2ewSONVkmjTXnLzZ9Oy29Xb3fEvxDl3YroUkmCzeVoD\nkXE73c1w=="  } , "mtime":"20241120174757.229263", "atime":"20241120174757.226254", "mode":"0644", "size":"2", "toto":"pig"}

+ echo '#no post from touch, refused as repeat'

+ touch test_file

2024-11-20 17:47:57,264 [INFO] shim published: { "pubTime":"20241120174757.25997961", "baseUrl":"file:/net/local/home/sunderlandr/repos/sarrac/shim_dirA", "relPath":"test_file", "topic":"v03.post", "identity":{  "method" : "sha512", "value" : "Y+Iuwvvuur8AXlj7+w7uYHxKpBcEWmigzGN2ewSONVkmjTXnLzZ9Oy29Xb3fEvxDl3YroUkmCzeVoD\nkXE73c1w=="  } , "mtime":"20241120174757.252254", "atime":"20241120174757.252254", "mode":"0644", "size":"2", "toto":"pig"}

RESULT: BAD! 2 sha512 posts, expected: {'sha512': 1} for 162 create test_file (again)

So there's at least two things to investigate:

  1. how do we deal with writes to the log being noticed by the shim and blocking other posts?
  2. minterval doesn't seem to be working?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant