Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query logger #161

Merged
merged 5 commits into from
Apr 1, 2020
Merged

Query logger #161

merged 5 commits into from
Apr 1, 2020

Conversation

eperott
Copy link
Collaborator

@eperott eperott commented Mar 25, 2020

Add support for running ecAudit without the authentication/authorization overhead.

The actual changes to the code are small and trivial. But I had to add a new integration test module in order to verify functionality and to make sure that future patches doesn't break things. The new integration tests are basically a copy of the allow-all integration tests, with small modifications to the setup. This adds about 45 seconds to the build time, but I can't think of a better way to do it.

eperott added 3 commits March 25, 2020 14:47
ecAudit can now be configured to work as a pure query logger.
In other, ecAudit is now compatible with the AllowAllAuthenticator
and the AllowAllAuthorizer that comes with Cassandra.

Align sample configuration files to use default values.

Closes Ericsson#77
@eperott eperott requested review from etedpet and emolsson March 25, 2020 15:04
@coveralls
Copy link

coveralls commented Mar 25, 2020

Coverage Status

Coverage decreased (-0.03%) to 78.42% when pulling 01e8b19 on eperott:query_logger into ae98141 on Ericsson:release/c2.2.

Copy link
Contributor

@etedpet etedpet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just some minor suggestions.
Good that you keep updating/improving the documentation!

conf/audit.yaml Outdated
whitelist_cache_validity_in_ms: 30000
# Example value: 30000
#
#whitelist_cache_validity_in_ms:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not keep the "30000" value on this line? So that the example line is "complete" and can be used by just removing the comment...
(similar to the whitelist-example)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. So the reason why I'm proposing all these small changes in the sample yaml files is that, when I created the Query Logger example in the documentation I was trying the example by setting it up. As a user I was kind of expecting that ecAudit would behave the same when 1) running without a audit.yaml file and 2) using the default audit.yaml file. But that was not the case. In fact, we are rather inconsistent.

So, my ambition with all the small changes in the sample yaml files was to set them up so that they reflect default behavior. But, like you're pointing out, there are still in inconsistencies wrt to examples and description of defaults and so on. And you do have a good point, somehow it would be nice to suggest good values to users as well.

In retrospect, this perhaps got a little more invasive than I was expecting. And I'm still not capturing all the essentials here. Perhaps I should create a separate ticket/pr for this?

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - keeping it in a separate pr

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #162 for this.

conf/audit.yaml Outdated
#
logger_backend:
- class_name: com.ericsson.bss.cassandra.ecaudit.logger.Slf4jAuditLogger
parameters:
# - log_format: "{?client:'${CLIENT_IP}'|?}user:'${USER}'{?|batchId:'${BATCH_ID}'?}|status:'${STATUS}'|operation:'${OPERATION}'"
- log_format: "{?client:'${CLIENT_IP}'|?}user:'${USER}'{?|batchId:'${BATCH_ID}'?}|status:'${STATUS}'|operation:'${OPERATION}'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remember to remove the "{??}" around client when merge:ing to the later branches.... That field is only optional in c2.2.
Ok, it will still work fine like this (output will be the same), but we will save a few cpu cycles ;-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will try to remember that.

conf/eclog.yaml Outdated
# ${CLIENT_PORT}, ${COORDINATOR_IP}, ${USER}, ${BATCH_ID}, ${STATUS}, ${OPERATION}, ${OPERATION_NAKED}, ${TIMESTAMP},
# and ${SUBJECT}.
#
log_format: "{?${TIMESTAMP}?}{?|${CLIENT_IP}?}{?:${CLIENT_PORT}?}{?|${COORDINATOR_IP}?}{?|${USER}?}{?|${STATUS}?}{?|${BATCH_ID}?}{?|${OPERATION}?}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought about a (minor) drawback of having the config here instead of using the default in the binary.
If the code is updated (new field added), THEN this config has to be updated for the new field to show up (by now ecLog will print ALL fields it can find). Maybe not that big issue, but something to think about :-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. But on the other hand, wouldn't it be nice for a user to know what the default setting is without looking into the code? So what's the best trade off here on maintenance vs. usability?

See my other comment above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would be nice to get/see the default here... But should we then require this parameter to be configured? So that we remove the "duplicate" in the code? I guess we cannot since then we would break backwards compatibility... But it would be nice to only have the default at one place... Otherwise the is a risk it will differ (what is stated here and what is in the code)

But if we move that change into a different pr (as stated above) then the change will be more clear/isolated

```
JVM_EXTRA_OPTS="$JVM_EXTRA_OPTS -Dcassandra.custom_query_handler_class=com.ericsson.bss.cassandra.ecaudit.handler.AuditQueryHandler"
JVM_EXTRA_OPTS="$JVM_EXTRA_OPTS -Decaudit.filter_type=NONE"
JVM_EXTRA_OPTS="$JVM_EXTRA_OPTS -da:net.openhft..."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the openhft thing? There's more about it here: https://github.com/Ericsson/ecaudit/blob/release/c2.2/doc/install.md#cassandra-envsh

I figured it would be noisy to repeat info about it here. But perhaps it raises questions if I don't?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, didn't remember that one :-)
Maybe we can skip it, or just a short note...

doc/example_query_logger.md Outdated Show resolved Hide resolved
doc/example_query_logger.md Outdated Show resolved Hide resolved
doc/example_query_logger.md Outdated Show resolved Hide resolved
@etedpet
Copy link
Contributor

etedpet commented Mar 26, 2020

Sounds good to break out the change in audit.yaml (default values) to a separate PR. The other things looks good!

Copy link
Contributor

@emolsson emolsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had the time to try it out yet, but I'll try to take some time for that in the afternoon!

bin/configure_ccm_querylog_chronicle.sh Show resolved Hide resolved
bin/run_ccm_performance_test.sh Show resolved Hide resolved
doc/example_query_logger.md Outdated Show resolved Hide resolved
Remove changes in sample config files and discus this in a separate issue/PR.
Comment on lines 71 to 75
2020-03-27 10:42:01 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 46, 'Sweden', 'SE');
2020-03-27 10:42:35 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 47, 'Norway', 'NO');
2020-03-27 10:43:12 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 48, 'Poland', 'PL');
2020-03-27 10:43:28 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 49, 'Germany', 'DE');
2020-03-27 10:43:42 CET||SUCCEEDED|INSERT INTO country.by_code (code, name, iso) VALUES ( 51, 'Peru', 'PE');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought of one thing when looking at this example:
The "main reason" for having audit logging is to see WHO does WHAT... (?!)
But here we're missing the WHO part...
Ok, it's good to have simple examples but maybe we should have a more "realistic" use case (include the ${USER})

It would also be nice to nice to have some variations to the example audit -> maybe a select statement, and maybe an insert from another user - that fails...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of a pure Query Logger example would be to get log records without involving/requiring the whole authentication/authorization thing. This is similar to the Full Query Logger (fql) feature of Cassandra 4.0. With this setup, the user field would always be 'anonymous' which is kind of pointless to print. :-)

I'll add a SELECT statement for variation, and elaborate a bit on the purpose of this setup in the into text.

Copy link
Contributor

@etedpet etedpet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just had a minor comment to one example in the doc

bin/run_ccm_performance_test.sh Show resolved Hide resolved
@eperott eperott merged commit 81e6be5 into Ericsson:release/c2.2 Apr 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants