forked from apache/kafka
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pull] trunk from apache:trunk #1292
Open
pull
wants to merge
3,959
commits into
sudotty:trunk
Choose a base branch
from
apache:trunk
base: trunk
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Implements streams sticky assignor on the broker-side. Reviewers: Bill Bejeck <[email protected]>, Lucas Brutschy <[email protected]>
CoordinatorRecordSerde does not validate the version of the value to check whether the version is supported by the current version of the software. This is problematic if a future and unsupported version of the record is read by an older version of the software because it would misinterpret the bytes. Hence CoordinatorRecordSerde must throw an error if the version is unknown. This is also consistent with the handling in the old coordinator. Reviewers: Jeff Kim <[email protected]>
…18775) Reviewers: Ismael Juma <[email protected]>, Ken Huang <[email protected]>, Chia-Ping Tsai <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
Update `benchmark_test.py` to use KRaft. ``` > TC_PATHS="tests/kafkatest/benchmarks/core/benchmark_test.py" /bin/bash tests/docker/run_tests.sh ================================================================================ SESSION REPORT (ALL TESTS) ducktape version: 0.12.0 session_id: 2025-02-03--001 run time: 96 minutes 48.900 seconds tests run: 120 passed: 120 flaky: 0 failed: 0 ignored: 0 ================================================================================ ``` Reviewers: David Jacot <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>, Christo Lolov <[email protected]>
… task (#18717) During testing we discovered that the empty group count is not updated in group conversion, but when the new group is transition to other state, the empty group count is decremented. This could result in negative empty group count. We can have a new consumer group count implementation that follows the pattern we did for the classic group count. The timeout task periodically refreshes the metrics based on the current groups soft state. Reviewers: Jeff Kim <[email protected]>
A class to build a new target assignment based on the provided parameters. As a result, it yields the records that must be persisted to the log and the new member assignments as a map. Compared to the feature branch, I extended the unit tests (testing also standby and warm-up task logic) and adopted simplifications due to the TasksTuple class. Reviewers: Bruno Cadonna <[email protected]>, Bill Bejeck <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>, Ismael Juma <[email protected]>, Mickael Maison <[email protected]>
The statement gets logged in the INFO level and gets printed for every message produced to the __remote_log_metadata topic. Removed the log statement as it is needed only during debug session. And, we have another log at DEBUG level to capture this information. Reviewers: Luke Chen <[email protected]>, Christo Lolov <[email protected]>
…o early return (#18720) https://issues.apache.org/jira/browse/KAFKA-18575 solved a critical race condition by returning with CONCURRENT_TRANSACTIONS early when the transaction was still completing. In testing, it was discovered that this early return could cause performance regressions. Prior to KIP-890 the addpartitions call was a separate call from the producer. There was a previous change https://issues.apache.org/jira/browse/KAFKA-5477 that decreased the retry backoff to 20ms. With KIP-890 and making the call through the produce path, we go back to the default retry backoff which takes longer. Prior to 18575 we introduce a slight delay when sending to the coordinator, so prior to 18575, we are less likely to return quickly and get stuck in this backoff. However, based on results from produce benchmarks, we can still run into the default backoff in some scenarios. This PR reverts KAFKA-18575, and doesn't return early and wait until the coordinator for checking if a transaction is ongoing. Instead, it will fix the handling with the verification guard so we don't hit the edge condition. Also cleans up some of the verification text that was unclear. Reviewers: Jeff Kim <[email protected]>, Artem Livshits <[email protected]>
Introduced via KIP-1106. Reviewers: Lucas Brutschy <[email protected]>
Reviewers: Almog Gavra <[email protected]>, Anna Sophie Blee-Goldman <[email protected]>
Reviewers: Lucas Brutschy <[email protected]>, Bill Bejeck <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
…18700) Reviewers: Lucas Brutschy <[email protected]>
We need to re-enable the unclean shutdown detection when in ELR mode, which was inadvertently removed during the development process. Reviewers: David Mao <[email protected]>, Jun Rao <[email protected]>
…18788) Reviewers: Ken Huang <[email protected]>, TengYao Chi <[email protected]>, Chia-Ping Tsai <[email protected]>
Reviewers: Mickael Maison <[email protected]> Co-authored-by: u0184996 <[email protected]>
This patch fixes the Benchmark system tests. We misconfigured the quorum in bc7b870. ``` ================================================================================ SESSION REPORT (ALL TESTS) ducktape version: 0.12.0 session_id: 2025-02-04--001 run time: 57 minutes 27.169 seconds tests run: 62 passed: 62 flaky: 0 failed: 0 ignored: 0 ================================================================================ ``` Reviewers: PoAn Yang <[email protected]>, Christo Lolov <[email protected]>
This patch fixes the PerformanceService system test which was still using ZK. ``` ================================================================================ SESSION REPORT (ALL TESTS) ducktape version: 0.12.0 session_id: 2025-02-04--003 run time: 1 minute 42.629 seconds tests run: 4 passed: 4 flaky: 0 failed: 0 ignored: 0 ================================================================================ ``` Reviewers: Chia-Ping Tsai <[email protected]>
``` ================================================================================ SESSION REPORT (ALL TESTS) ducktape version: 0.12.0 session_id: 2025-02-04--005 run time: 4 minutes 0.023 seconds tests run: 4 passed: 4 flaky: 0 failed: 0 ignored: 0 ================================================================================ ``` Reviewers: Lianet Magrans <[email protected]>
Increase the maximum number of flaky tests we tolerate for the main test suite from 3 to 10. This will result in fewer failed builds. Reviewers: Chia-Ping Tsai <[email protected]>
#18165) Reviewers: Mickael Maison <[email protected]>, Chia-Ping Tsai <[email protected]>
The Streams membership manager is used client-side in the background thread of the async consumer. For each member /consumer, it is responsible for: * keeping the member state, * keeping assignments for the member, * reconciling the assignments of the member -- for example when tasks need to be revoked before other tasks are assigned * requesting invocations of assignment and revocation callbacks by the stream thread. The Streams membership manager is called by the background thread of the async consumer, directly in its event loop and from the Streams group heartbeat request manager. The Streams membership manager uses the Streams rebalance events processor to request assignment/revocation callback in the stream thread. Reviewers: Lucas Brutschy <[email protected]>, Bill Bejeck <[email protected]>
It appears this test was failing because the transaction was never aborting and the concurrent transactions errors would not go away. ccab9eb introduced the test failure because it requires the transaction to complete, but I suspect the lack of completion was happening before the change. The timeout for the write is based on the transactional timeout, and 100ms seemed too small -- thus the requests to update the state would often repeatedly time out. Also removed the loop since it was not necessary. Reviewers: Jeff Kim <[email protected]>, Calvin Liu <[email protected]>
…#18773) Reviewers: Bruno Cadonna <[email protected]>
Reviewers: Lianet Magrans <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
…ection (#19037) Reviewers: Chia-Ping Tsai <[email protected]>
…rs (#19039) Reviewers: TengYao Chi <[email protected]>, Ken Huang <[email protected]>, Chia-Ping Tsai <[email protected]>
…#19016) KIP-966 adds strict min ISR rule, so this PR improves the docs of min.insync.replicas to include that change. Reviewers: Ismael Juma <[email protected]>, Chia-Ping Tsai <[email protected]>
Given that now we support Java 17 on our brokers, this PR replace the use of : Collections.singletonList() and Collections.emptyList() with List.of() Collections.singletonMap() and Collections.emptyMap() with Map.of() Collections.singleton() and Collections.emptySet() with Set.of() Affected modules: server and coordinator-common Reviewers: Chia-Ping Tsai <[email protected]>
- Added a unit test to validate the exception hierarchy for all KIP-1050 transaction related exceptions. - RetriableException is correctly extended by all child classes - Included test for RetriableException exception with verification that all exceptions extending `RetriableException` do not inadvertently extend `RefreshRetriableException, preserving the intended behavior. Reviewers: Kirk True <[email protected]>, TaiJuWu <[email protected]>, TengYao Chi <[email protected]>, Ken Huang <[email protected]>, Justine Olshan <[email protected]>
…roker (#18997) In #16848, we added `kraft.version` to finalized features and got finalized features outside controller event handling thread. This may make finalized features stale when processing `registerBroker` event. Also, some cases like `QuorumControllerTest.testBalancePartitionLeaders` become flaky cause of outdated MV. This PR moves finalized features back to controller event handling thread to avoid the error. Reviewers: Ismael Juma <[email protected]>, Jun Rao <[email protected]>, Colin P. McCabe <[email protected]>, Chia-Ping Tsai <[email protected]>
…e configuration data in logs (#18349) Reviewers: Chia-Ping Tsai <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
…9044) `testActivationMessageForEmptyLogAtMv3_6WithTransactionAndElr` is tested with 4.0IV1, so the name should be corrected. Reviewers: Jun Rao <[email protected]>
There isn't any flaky test for SharePartitionManager in last 7 days, removing flaky annotation. Reviewers: Andrew Schofield <[email protected]>
The PR implements the SharePartitionMetrics as defined in KIP-1103, with one change. The metric `FetchLockRatio` is defined as `Meter` in KIP but is implemented as `HIstogram`. There was a discussion about same on KIP-1103 discussion where we thought that `FetchLockRatio` is pre-aggregated but while implemeting the rate from `Meter` can go above 100 as `Meter` defines rate per time period. Hence it makes more sense to implement metric `FetchLockRatio` as `Histogram`. Reviewers: Andrew Schofield <[email protected]>
…9036) The purpose of this PR is to remove the `@InterfaceStability.Evolving` from classes that were created over a year ago. Reviewers: Jun Rao <[email protected]>
Reviewers: Ken Huang <[email protected]>, Chia-Ping Tsai <[email protected]>
…=[val1,val2]` (#17771) Reviewers: Jhen-Yung Hsu <[email protected]>, TaiJuWu <[email protected]>, Chia-Ping Tsai <[email protected]>
…plica.alter.log.dirs.threads (#19038) Update the documentation for the `num.replica.alter.log.dirs.threads` configuration to clarify that its default value is determined by the number of log directories specified in the `log.dirs` configuration. Reviewers: Ken Huang <[email protected]>, Chia-Ping Tsai <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
Given that now we support Java 17 on our brokers, this PR replace the use of : - `Collections.singletonList()` and `Collections.emptyList()` with `List.of()` - `Collections.singletonMap()` and `Collections.emptyMap()` with `Map.of()` - `Collections.singleton()` and `Collections.emptySet()` with `Set.of()` Reviewers: Chia-Ping Tsai <[email protected]>
…eException (#19047) Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException as they were used by zk path. Reviewers: TengYao Chi <[email protected]>, Ken Huang <[email protected]>, Chia-Ping Tsai <[email protected]>
…than Short.MAX_VALUE (#19067) JIRA: KAFKA-18908 `ZooKeeper` mode allows large configuration values to be created through an append operation. `KRaft` mode restricts configuration values to a maximum size of `Short.MAX_VALUE`. We should document this behavioral change. Reviewers: Ken Huang <[email protected]>, Chia-Ping Tsai <[email protected]>
… larger (#19070) In ZooKeeper mode, users can append configurations to create values larger than Short.MAX_VALUE. However, this behavior is disallowed in KRaft mode. Additionally, a server error is returned to users. Creating a value this large is rare, so we don't plan to fix it for KRaft. This PR aims to tweak the error message. Reviewers: Ken Huang <[email protected]>, TengYao Chi <[email protected]>, Chia-Ping Tsai <[email protected]>
… and closing (#18752) To address the issue of not creating a checkpoint file during the restoring and closing process, called the GlobalStateUpdateTask.flushState() method in GlobalStateUpdateTask.initialize() and GlobalStateUpdateTask.close() methods. This will flush the state and create a checkpoint file thereby, avoiding the need to completely restore the entire state. Reviewers: Alieh Saeedi <[email protected]>, Matthias J. Sax <[email protected]>
There are 3 issues (at least) about the multithreaded issue on ConsumerRecords. Hence, it would be better to document it completely. Reviewers: Kirk True <[email protected]>, TengYao Chi <[email protected]>, Ken Huang <[email protected]>, Xuan-Zhang Gong <[email protected]>, Chia-Ping Tsai <[email protected]>
…eat (#18981) Implements auto-topic creation when handling the streams group heartbeat. Inside KafkaApis, the handler for streamsGroupHeartbeat uses the result of the streams group heartbeat inside the group coordinator to attempt to create all missing internal topics using AutoTopicCreationManager. CREATE TOPIC ACLs are checked. The unit tests class AutoTopicCreationManagerTest is brought back (it was recently deleted during a ZK removal PR), but testing only the kraft-related functionality. Reviewers: Bruno Cadonna <[email protected]> ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes)
…19043) The AbstractHeartbeatRequestManager and the StreamsGroupHeartbeatRequestManager, both use the HeartbeatRequestState to track the state of the heartbeat requests. Both heartbeat request managers have an implementation of HeartbeatRequestState as inner class. To deduplicate code this commit extracts the HeartbeatRequestState so that the same code can be used by both heartbeat request manager. Reviewers: Kirk True <[email protected]>, Chia-Ping Tsai <[email protected]>, Lucas Brutschy <[email protected]>
… check topic describe (#19055) 1、Client support for TopicAuthException in DescribeShareGroup and HB path 2、ShareConsumerImpl#sendAcknowledgementsAndLeaveGroup swallow TopicAuthorizationException and GroupAuthorizationException Reviewers: ShivsundarR <[email protected]>, Andrew Schofield <[email protected]>
…nder EOS (#17931)" (#19078) This reverts commit e883746. The commit that is reverted prevents Kafka Streams from re-initializing its transactional producer. If an exception that fences the transactional producer occurs, the producer is not re-initialized during the handling of the exception. That causes an infinite loop of ProducerFencedExceptions with corresponding rebalances. Reviewers: Lucas Brutschy <[email protected]>, David Jacot <[email protected]>
…s (KIP-1103) (#19059) The PR implements the ShareSessionCache and DelayedShareFetchMetrics as defined in KIP-1103. Reviewers: Andrew Schofield <[email protected]>
The new group coordinator prints the following line at fixed interval even if no groups were deleted: ``` Generated 0 tombstone records while cleaning up group metadata in 0 milliseconds. (org.apache.kafka.coordinator.group.GroupCoordinatorShard) ``` The time component has some value in its own but it may be better to not print if when there are not records in order to reduce the spam in the logs. Reviewers: Chia-Ping Tsai <[email protected]>
…n BrokerLifecycleManager (#19061) This PR reduces `maxInterval` for `resendExponentialBackoff` in `BrokerLifecycleManager` class from `broker.session.timeout.ms` to half of its value. Setting `maxInterval` to `broker.session.timeout.ms` caused brokers to be fenced if a resend attempt occurred near the timeout threshold, leading to unnecessary broker fencing. Reviewers: Colin P. McCabe <[email protected]>
The default checkout behavior for GitHub Actions is to use a special merge ref which is equivalent to the base branch with the PR merged into it. While this is crucial for checking compilation issues against trunk, it significantly diminishes our ability to use any build caching. This patch changes the JUnit test jobs to checkout the HEAD commit of the PR when building. The "Compile and Check" step still checks out the merge commit so we can keep that level of validation. Reviewers: Ismael Juma <[email protected]>, Chia-Ping Tsai <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )