(Re-)Deployment fails with "com.datastax.driver.core.exceptions.AlreadyExistsException: Table reaper_db.running_repairs already exists" exception #1476
Comments
Faced a similar issue. Stack trace:
INFO [2024-03-08 10:11:22,462] [main] o.e.j.u.log - Logging initialized @1081ms to org.eclipse.jetty.util.log.Slf4jLog
INFO [2024-03-08 10:11:22,499] [main] i.c.s.InitializeStorage - Initializing the database and performing schema migrations
INFO [2024-03-08 10:11:22,515] [main] c.d.d.core - DataStax Java driver 3.11.0 for Apache Cassandra
INFO [2024-03-08 10:11:22,516] [main] c.d.d.c.GuavaCompatibility - Detected Guava >= 19 in the classpath, using modern compatibility layer
INFO [2024-03-08 10:11:22,532] [main] c.d.d.c.ClockFactory - Using native clock to generate timestamps.
INFO [2024-03-08 10:11:22,602] [main] c.d.d.c.NettyUtil - Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
INFO [2024-03-08 10:11:22,705] [main] c.d.d.c.Cluster - Cannot connect with protocol version V5, trying with V4
INFO [2024-03-08 10:11:22,807] [main] c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'datacenter' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
INFO [2024-03-08 10:11:22,808] [main] c.d.d.c.Cluster - New Cassandra host cassandra-datacenter-service/10.1.8.161:9042 added
INFO [2024-03-08 10:11:22,808] [main] c.d.d.c.Cluster - New Cassandra host cassandra-datacenter-service/10.1.7.221:9042 added
INFO [2024-03-08 10:11:22,808] [main] c.d.d.c.Cluster - New Cassandra host cassandra-datacenter-service/10.1.9.78:9042 added
INFO [2024-03-08 10:11:22,904] [main] o.c.c.m.MigrationRepository - Found 16 migration scripts
WARN [2024-03-08 10:11:22,905] [main] i.c.s.CassandraStorage - Starting db migration from 28 to 31…
WARN [2024-03-08 10:11:22,908] [main] i.c.s.CassandraStorage - Database migration is happenning with other reaper instances possibly running. Found []
INFO [2024-03-08 10:11:22,939] [main] o.c.c.m.MigrationRepository - Found 16 migration scripts
WARN [2024-03-08 10:11:22,946] [clustername-worker-0] c.d.d.c.Cluster - Re-preparing already prepared query is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once. Query='insert into schema_migration(applied_successful, version, script_name, script, executed_at) values(?, ?, ?, ?, ?)'
WARN [2024-03-08 10:11:22,949] [clustername-worker-0] c.d.d.c.Cluster - Re-preparing already prepared query is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once. Query='INSERT INTO schema_migration_leader (keyspace_name, leader, took_lead_at, leader_hostname) VALUES (?, ?, dateOf(now()), ?) IF NOT EXISTS USING TTL 300'
WARN [2024-03-08 10:11:22,951] [clustername-worker-0] c.d.d.c.Cluster - Re-preparing already prepared query is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once. Query='DELETE FROM schema_migration_leader where keyspace_name = ? IF leader = ?'
org.cognitor.cassandra.migration.MigrationException: Error during migration of script 029_adaptive_repairs.cql while executing 'ALTER TABLE repair_unit_v1 ADD timeout int;'
at org.cognitor.cassandra.migration.Database.execute(Database.java:269)
at java.base/java.util.Collections$SingletonList.forEach(Collections.java:4856)
at org.cognitor.cassandra.migration.MigrationTask.migrate(MigrationTask.java:68)
at io.cassandrareaper.storage.CassandraStorage.migrate(CassandraStorage.java:376)
at io.cassandrareaper.storage.CassandraStorage.initializeCassandraSchema(CassandraStorage.java:307)
at io.cassandrareaper.storage.CassandraStorage.initializeAndUpgradeSchema(CassandraStorage.java:265)
at io.cassandrareaper.storage.CassandraStorage.<init>(CassandraStorage.java:252)
at io.cassandrareaper.storage.InitializeStorage.initializeStorageBackend(InitializeStorage.java:65)
at io.cassandrareaper.ReaperDbMigrationCommand.run(ReaperDbMigrationCommand.java:53)
at io.cassandrareaper.ReaperDbMigrationCommand.run(ReaperDbMigrationCommand.java:30)
at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:98)
at io.dropwizard.cli.Cli.run(Cli.java:78)
at io.dropwizard.Application.run(Application.java:94)
at io.cassandrareaper.ReaperApplication.main(ReaperApplication.java:105)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Invalid column name timeout because it conflicts with an existing column
at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:50)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:35)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:293)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:58)
at org.cognitor.cassandra.migration.Database.executeStatement(Database.java:277)
at org.cognitor.cassandra.migration.Database.execute(Database.java:261)
... 13 more
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Invalid column name timeout because it conflicts with an existing column
at com.datastax.driver.core.Responses$Error.asException(Responses.java:181)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:215)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235)
at com.datastax.driver.core.RequestHandler.access$2600(RequestHandler.java:61)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:1011)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:814)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1290)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1208)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
at com.datastax.driver.core.InboundTrafficMeter.channelRead(InboundTrafficMeter.java:38)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
@FieteO, are you planning to raise a PR? I faced a similar issue, but in 029_adaptive_repairs.cql, so other migration scripts may have the same problem and should be checked as well.
@Shivam0609 Yes, I will give it a shot. I am not sure how to test it reproducibly, though.
Maybe one of the maintainers can run the E2E tests.
Hi folks, there are some calls in the migration scripts which we cannot make idempotent anyway. All the ALTER TABLE calls which add a new column are such calls.
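To illustrate the point about ALTER TABLE: since the statement itself cannot be guarded, one possible workaround is to tolerate the specific failure on the client side. The following is only a minimal sketch under stated assumptions, not how Reaper or the cassandra-migration library behaves. The class `TolerantMigration`, the method `apply`, and the `executor` callback (a stand-in for something like a driver session's execute call) are hypothetical names; the only thing taken from this thread is the error text "conflicts with an existing column" from the stack trace above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only.
class TolerantMigration {

    // Substring of the InvalidQueryException message seen in the stack trace.
    static final String COLUMN_EXISTS_MARKER = "conflicts with an existing column";

    // True when the statement is an "ALTER TABLE ... ADD ..." column addition.
    static boolean isAddColumn(String cql) {
        String s = cql.trim().toUpperCase(Locale.ROOT);
        return s.startsWith("ALTER TABLE ") && s.contains(" ADD ");
    }

    // Runs every statement through the executor (assumed to throw a
    // RuntimeException on failure). A column addition that fails because the
    // column already exists is recorded as skipped; any other failure is rethrown.
    static List<String> apply(List<String> statements, Consumer<String> executor) {
        List<String> skipped = new ArrayList<>();
        for (String cql : statements) {
            try {
                executor.accept(cql);
            } catch (RuntimeException e) {
                String message = e.getMessage() == null ? "" : e.getMessage();
                if (isAddColumn(cql) && message.contains(COLUMN_EXISTS_MARKER)) {
                    skipped.add(cql);
                } else {
                    throw e;
                }
            }
        }
        return skipped;
    }
}
```

Matching on the error message is brittle, so this is only meant to show the shape of the idea: column additions are treated as "already applied" when the server reports a conflict, while every other error still aborts the migration.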
I observed the following stack trace for a reaper deployment:
To me it looks like the schema creation job is not idempotent.
If it would help and already fix the issue, I could prepare a PR to change the CREATE TABLE in 027_concurrent_repairs_part2.cql to CREATE TABLE IF NOT EXISTS.
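For reference, the proposed change amounts to the textual rewrite sketched below. This is a hedged illustration, not the content of the actual migration script: the class `IdempotentCreate` and the method `makeIdempotent` are hypothetical, and in practice the fix would simply be an edit to the .cql file.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class IdempotentCreate {

    // Matches "CREATE TABLE" that is not already followed by "IF NOT EXISTS".
    // The possessive \s++ stops the lookahead from being bypassed by backtracking
    // when several spaces separate the keywords.
    private static final Pattern CREATE_TABLE =
        Pattern.compile("(?i)\\bCREATE\\s++TABLE\\s++(?!IF\\s+NOT\\s+EXISTS\\b)");

    // Returns the script with every unguarded CREATE TABLE made idempotent.
    static String makeIdempotent(String cql) {
        return CREATE_TABLE.matcher(cql).replaceAll("CREATE TABLE IF NOT EXISTS ");
    }
}
```

With a placeholder column list (the real schema of running_repairs is not shown in this thread), `makeIdempotent("CREATE TABLE running_repairs (id uuid PRIMARY KEY);")` yields a statement starting with `CREATE TABLE IF NOT EXISTS running_repairs`, and a statement that already carries the guard is returned unchanged.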