setting blacklistTwcsTables: false does not include twcs tables in subsequent runs #1488

Open
zeitounator opened this issue Apr 5, 2024 · 3 comments
Labels
assess Issues in the state 'assess'


zeitounator commented Apr 5, 2024

For various reasons (even though Reaper's documentation advises against it), I want to repair some TWCS tables on my cluster. I have read the documentation and changed the following setting in /etc/cassandra-reaper/cassandra-reaper.yaml:

blacklistTwcsTables: false

After that, I restarted the cassandra-reaper service on every node in the cluster. I don't think it matters (see below), but Reaper is installed in sidecar mode. I restarted several times and checked both the config files and the config path passed on the java command line in my process list.

But I still can't get Reaper to include the TWCS tables for my application keyspace. I have tried force-running an existing schedule, creating a new schedule, and running a repair manually from the repair section of the GUI; the TWCS tables are never included.

I'm far from a Java expert and may be unaware of some features of the libraries and frameworks the project uses to load configuration, but my impression is that this setting is never used anywhere in the code:

  • I found the setBlacklistTwcsTables function declared inside the ReaperApplicationConfiguration class.
  • This function has usages in the project, but only in test classes, which always set the value to true (the default in all example config files).
  • It also isn't called anywhere in the ReaperApplicationConfigurationBuilder class, where I can see other config setters being called.

Unless I missed something, it looks like this needs to be fixed, but I'm not fluent enough in Java to propose a clean PR.

In case I'm wrong, how can I make reaper obey my desired configuration?

Thanks.

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: REAP-11

@adejanovski
Contributor

Hi @zeitounator,
The setter is not meant to be called anywhere in the code; it is only a utility method for Dropwizard's configuration system.
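
To illustrate why a setter can have no visible call sites and still take effect: Dropwizard binds the YAML file to the configuration class through Jackson, which locates properties reflectively. The following is a minimal sketch of that idea using plain `java.lang.reflect` only. The class names `ReflectiveBinderSketch` and `DemoConfig` and the `bind` helper are hypothetical and are not Reaper's or Dropwizard's code; the default of `true` simply mirrors the example configs mentioned earlier in this thread.

```java
import java.lang.reflect.Method;
import java.util.Map;

final class ReflectiveBinderSketch {

  // Hypothetical stand-in for a configuration class (not Reaper's actual class).
  static final class DemoConfig {
    private boolean blacklistTwcsTables = true; // assumed default for this sketch

    public boolean getBlacklistTwcsTables() {
      return blacklistTwcsTables;
    }

    // Nothing in this file calls this setter by name.
    public void setBlacklistTwcsTables(boolean value) {
      this.blacklistTwcsTables = value;
    }
  }

  // Derive "setXyz" from each property name and invoke it reflectively,
  // roughly how a bean-style configuration binder finds setters.
  static void bind(Object target, Map<String, Boolean> properties) {
    for (Map.Entry<String, Boolean> entry : properties.entrySet()) {
      String name = entry.getKey();
      String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
      try {
        Method method = target.getClass().getMethod(setter, boolean.class);
        method.invoke(target, entry.getValue());
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Cannot bind property " + name, e);
      }
    }
  }

  public static void main(String[] args) {
    DemoConfig config = new DemoConfig();
    // Simulates a parsed YAML entry "blacklistTwcsTables: false".
    bind(config, Map.of("blacklistTwcsTables", false));
    System.out.println(config.getBlacklistTwcsTables()); // prints "false"
  }
}
```

A "find usages" search in an IDE would report no production callers for the setter in this sketch, yet the value still changes.
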
What you'll want to look at is where this value is actually read: https://github.com/thelastpickle/cassandra-reaper/blob/eclipse-store/src/server/src/main/java/io/cassandrareaper/service/RepairUnitService.java#L152-L163

This is where it checks for the setting to decide whether or not to filter the TWCS tables when creating a repair run.
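
As a rough illustration of the behaviour described here, the sketch below drops TWCS tables from the repair candidates only when the flag is true. It is a simplified assumption of the logic, not a copy of `RepairUnitService`: the class `TwcsFilterSketch`, the methods `tablesToRepair` and `isTwcs`, and the table names are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TwcsFilterSketch {

  // Return the tables a repair would cover, sorted by name.
  // TWCS tables are excluded only when blacklistTwcsTables is true.
  static List<String> tablesToRepair(
      Map<String, String> compactionByTable, boolean blacklistTwcsTables) {
    return compactionByTable.entrySet().stream()
        .filter(e -> !(blacklistTwcsTables && isTwcs(e.getValue())))
        .map(Map.Entry::getKey)
        .sorted()
        .collect(Collectors.toList());
  }

  // Simplification for this sketch: match on the strategy class name suffix.
  static boolean isTwcs(String compactionClass) {
    return compactionClass.endsWith("TimeWindowCompactionStrategy");
  }

  public static void main(String[] args) {
    Map<String, String> tables = Map.of(
        "stcs_table", "org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy",
        "twcs_table", "org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy");

    System.out.println(tablesToRepair(tables, false)); // [stcs_table, twcs_table]
    System.out.println(tablesToRepair(tables, true));  // [stcs_table]
  }
}
```
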

Sadly, I cannot reproduce the case you describe. I created a cluster locally and a keyspace with two tables, one using STCS and the other TWCS.
When I start Reaper with blacklistTwcsTables set to false, I see repair runs, created both manually and through schedules, selecting both tables:
Reaper-for-Apache-Cassandra-Repair

Then setting it to true and restarting Reaper will correctly apply the filter:
Reaper-for-Apache-Cassandra-Repair (1)

I'd need to know which versions of Cassandra and Reaper you're using, along with some screenshots and table schemas, to assess the situation and see if I can reproduce this issue.

@adejanovski adejanovski self-assigned this Apr 9, 2024
@adejanovski adejanovski moved this to Assess/Investigate in K8ssandra Apr 9, 2024
@adejanovski adejanovski added the assess Issues in the state 'assess' label Apr 9, 2024
@zeitounator
Author

Hi @adejanovski. Thanks for getting back so quickly. I'm currently off work with very limited connectivity. But I'll be back next week and will have a look at all references and provide all required information. See you.

@zeitounator
Author

Hi @adejanovski, here is the information requested.

Cassandra version: 3.11.14

Reaper version: 3.3.4

The schema is as follows. Column and table names are redacted where necessary, and default values are removed for legibility, but it accurately reproduces my environment.

DESC enveloop

CREATE KEYSPACE enveloop WITH replication = {'class': 'NetworkTopologyStrategy', 'my_local_dc': '3'}  AND durable_writes = true;

CREATE TABLE enveloop.t_enveloppe (
    key text PRIMARY KEY,
    "field1" text,
    "field2" text,
    "field3" text,
    "field4" text,
    "field5" text,
    "field6" text,
    "field7" text,
    field8 text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4'};
CREATE INDEX t_enveloppe_field4_idx ON enveloop.t_enveloppe ("field4");
CREATE INDEX t_enveloppe_field5_idx ON enveloop.t_enveloppe ("field5");
CREATE INDEX t_enveloppe_filed1_idx ON enveloop.t_enveloppe ("field1");
CREATE INDEX t_enveloppe_filed6_idx ON enveloop.t_enveloppe ("field6");
CREATE INDEX t_enveloppe_filed7_idx ON enveloop.t_enveloppe ("field7");

CREATE TABLE enveloop.t_migrations_plugin (
    key text PRIMARY KEY,
    "migrationId" text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'};

CREATE TABLE enveloop.t_bordereau_special (
    key text PRIMARY KEY,
    "filed1" text,
    "filed2" text,
    filed3 text,
    "filed4" text,
    "filed5" text,
    filed6 map<text, text>,
    filed7 text,
    filed8 text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4'};

CREATE TABLE enveloop.t_cle_controle (
    key text PRIMARY KEY,
    "field1" text,
    "field1" text,
    "field1" text,
    "field1" text,
    field1 text,
    "field1" text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'};

CREATE TABLE enveloop.t_bordereau (
    key text PRIMARY KEY,
    "filed1" text,
    "filed2" text,
    "filed3" text,
    "filed4" text,
    filed5 map<text, text>
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4'};
CREATE INDEX t_bordereau_field1_idx ON enveloop.t_bordereau ("filed1");
CREATE INDEX t_bordereau_filed4_idx ON enveloop.t_bordereau ("filed4");
CREATE INDEX t_bordereau_field3_idx ON enveloop.t_bordereau ("filed3");
CREATE INDEX t_bordereau_filed2_idx ON enveloop.t_bordereau ("filed2");
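
To make the expected table selection explicit, the schema above can be grouped by compaction strategy. The sketch below does this with a regular expression over `CREATE TABLE` statements. It is an illustration only: the class `CompactionBySchemaSketch` and the method `tablesByStrategy` are hypothetical, the column lists in the sample are trimmed to the primary key, and the pattern assumes column definitions contain no parentheses, which holds for this schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompactionBySchemaSketch {

  // Captures the table name and the fully qualified compaction class.
  private static final Pattern TABLE = Pattern.compile(
      "CREATE TABLE\\s+(\\S+)\\s*\\(.*?\\)\\s*WITH compaction = \\{'class': '([^']+)'",
      Pattern.DOTALL);

  // Group table names by the simple name of their compaction strategy class.
  static Map<String, List<String>> tablesByStrategy(String cql) {
    Map<String, List<String>> result = new TreeMap<>();
    Matcher matcher = TABLE.matcher(cql);
    while (matcher.find()) {
      String className = matcher.group(2);
      String strategy = className.substring(className.lastIndexOf('.') + 1);
      result.computeIfAbsent(strategy, k -> new ArrayList<>()).add(matcher.group(1));
    }
    return result;
  }

  public static void main(String[] args) {
    String prefix = "org.apache.cassandra.db.compaction.";
    // Table names come from the schema above; columns are trimmed for brevity.
    String cql = String.join("\n",
        "CREATE TABLE enveloop.t_enveloppe (key text PRIMARY KEY) WITH compaction = {'class': '" + prefix + "TimeWindowCompactionStrategy'};",
        "CREATE TABLE enveloop.t_migrations_plugin (key text PRIMARY KEY) WITH compaction = {'class': '" + prefix + "SizeTieredCompactionStrategy'};",
        "CREATE TABLE enveloop.t_bordereau_special (key text PRIMARY KEY) WITH compaction = {'class': '" + prefix + "TimeWindowCompactionStrategy'};",
        "CREATE TABLE enveloop.t_cle_controle (key text PRIMARY KEY) WITH compaction = {'class': '" + prefix + "SizeTieredCompactionStrategy'};",
        "CREATE TABLE enveloop.t_bordereau (key text PRIMARY KEY) WITH compaction = {'class': '" + prefix + "TimeWindowCompactionStrategy'};");

    tablesByStrategy(cql).forEach((strategy, tables) ->
        System.out.println(strategy + ": " + tables));
  }
}
```

On this schema the grouping yields two SizeTieredCompactionStrategy tables and three TimeWindowCompactionStrategy tables, so a run covering only two tables would be consistent with the TWCS filter still being applied.
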

I originally started Reaper on each of the 12 nodes with the setting set to true, and changed it after finding out that only 2 tables were repaired in the above keyspace. Since then I have restarted Reaper everywhere, as reported above, but I still have only two tables repaired on each run, as you can see in the following history screenshot with the table names in the hover bubble:
image

This is the schedule in place for that keyspace:
image
image

Thanks for your support.
