-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query logger #161
Query logger #161
Conversation
ecAudit can now be configured to work as a pure query logger. In other, ecAudit is now compatible with the AllowAllAuthenticator and the AllowAllAuthorizer that comes with Cassandra. Align sample configuration files to use default values. Closes Ericsson#77
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just some minor suggestions.
Good that you keep updating/improving the documentation!
conf/audit.yaml
Outdated
whitelist_cache_validity_in_ms: 30000 | ||
# Example value: 30000 | ||
# | ||
#whitelist_cache_validity_in_ms: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we not keep the "30000" value on this line? So that the example line is "complete" and can be used by just removing the comment...
(similar to the whitelist-example)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. So the reason why I'm proposing all these small changes in the sample yaml files is that, when I created the Query Logger example in the documentation I was trying the example by setting it up. As a user I was kind of expecting that ecAudit would behave the same when 1) running without a audit.yaml file and 2) using the default audit.yaml file. But that was not the case. In fact, we are rather inconsistent.
So, my ambition with all the small changes in the sample yaml files was to set them up so that they reflect default behavior. But, like you're pointing out, there are still in inconsistencies wrt to examples and description of defaults and so on. And you do have a good point, somehow it would be nice to suggest good values to users as well.
In retrospect, this perhaps got a little more invasive than I was expecting. And I'm still not capturing all the essentials here. Perhaps I should create a separate ticket/pr for this?
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good - keeping it in a separate pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #162 for this.
conf/audit.yaml
Outdated
# | ||
logger_backend: | ||
- class_name: com.ericsson.bss.cassandra.ecaudit.logger.Slf4jAuditLogger | ||
parameters: | ||
# - log_format: "{?client:'${CLIENT_IP}'|?}user:'${USER}'{?|batchId:'${BATCH_ID}'?}|status:'${STATUS}'|operation:'${OPERATION}'" | ||
- log_format: "{?client:'${CLIENT_IP}'|?}user:'${USER}'{?|batchId:'${BATCH_ID}'?}|status:'${STATUS}'|operation:'${OPERATION}'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remember to remove the "{??}" around client when merge:ing to the later branches.... That field is only optional in c2.2.
Ok, it will still work fine like this (output will be the same), but we will save a few cpu cycles ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will try to remember that.
conf/eclog.yaml
Outdated
# ${CLIENT_PORT}, ${COORDINATOR_IP}, ${USER}, ${BATCH_ID}, ${STATUS}, ${OPERATION}, ${OPERATION_NAKED}, ${TIMESTAMP}, | ||
# and ${SUBJECT}. | ||
# | ||
log_format: "{?${TIMESTAMP}?}{?|${CLIENT_IP}?}{?:${CLIENT_PORT}?}{?|${COORDINATOR_IP}?}{?|${USER}?}{?|${STATUS}?}{?|${BATCH_ID}?}{?|${OPERATION}?}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thought about a (minor) drawback of having the config here instead of using the default in the binary.
If the code is updated (new field added), THEN this config has to be updated for the new field to show up (by now ecLog will print ALL fields it can find). Maybe not that big issue, but something to think about :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. But on the other hand, wouldn't it be nice for a user to know what the default setting is without looking into the code? So what's the best trade off here on maintenance vs. usability?
See my other comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it would be nice to get/see the default here... But should we then require this parameter to be configured? So that we remove the "duplicate" in the code? I guess we cannot since then we would break backwards compatibility... But it would be nice to only have the default at one place... Otherwise the is a risk it will differ (what is stated here and what is in the code)
But if we move that change into a different pr (as stated above) then the change will be more clear/isolated
doc/example_query_logger.md
Outdated
``` | ||
JVM_EXTRA_OPTS="$JVM_EXTRA_OPTS -Dcassandra.custom_query_handler_class=com.ericsson.bss.cassandra.ecaudit.handler.AuditQueryHandler" | ||
JVM_EXTRA_OPTS="$JVM_EXTRA_OPTS -Decaudit.filter_type=NONE" | ||
JVM_EXTRA_OPTS="$JVM_EXTRA_OPTS -da:net.openhft..." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean the openhft thing? There's more about it here: https://github.com/Ericsson/ecaudit/blob/release/c2.2/doc/install.md#cassandra-envsh
I figured it would be noisy to repeat info about it here. But perhaps it raises questions if I don't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, didn't remember that one :-)
Maybe we can skip it, or just a short note...
Sounds good to break out the change in audit.yaml (default values) to a separate PR. The other things looks good! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't had the time to try it out yet, but I'll try to take some time for that in the afternoon!
Remove changes in sample config files and discus this in a separate issue/PR.
doc/example_query_logger.md
Outdated
2020-03-27 10:42:01 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 46, 'Sweden', 'SE'); | ||
2020-03-27 10:42:35 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 47, 'Norway', 'NO'); | ||
2020-03-27 10:43:12 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 48, 'Poland', 'PL'); | ||
2020-03-27 10:43:28 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 49, 'Germany', 'DE'); | ||
2020-03-27 10:43:42 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 51, 'Peru', 'PE'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thought of one thing when looking at this example:
The "main reason" for having audit logging is to see WHO does WHAT... (?!)
But here we're missing the WHO part...
Ok, it's good to have simple examples but maybe we should have a more "realistic" use case (include the ${USER})
It would also be nice to nice to have some variations to the example audit -> maybe a select statement, and maybe an insert from another user - that fails...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of a pure Query Logger example would be to get log records without involving/requiring the whole authentication/authorization thing. This is similar to the Full Query Logger (fql) feature of Cassandra 4.0. With this setup, the user field would always be 'anonymous' which is kind of pointless to print. :-)
I'll add a SELECT statement for variation, and elaborate a bit on the purpose of this setup in the into text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just had a minor comment to one example in the doc
Add support for running ecAudit without the authentication/authorization overhead.
The actual changes to the code are small and trivial. But I had to add a new integration test module in order to verify functionality and to make sure that future patches doesn't break things. The new integration tests are basically a copy of the allow-all integration tests, with small modifications to the setup. This adds about 45 seconds to the build time, but I can't think of a better way to do it.