
[improve][broker] Don't call ManagedLedger#asyncAddEntry in Netty I/O thread #23983

Open · wants to merge 10 commits into master
Conversation

@BewareMyPower (Contributor) commented Feb 13, 2025

Motivation

#23940 introduced a behavior change: the core logic of ManagedLedger#asyncAddEntry no longer switches threads, which means it is executed directly in the Netty I/O thread via PersistentTopic#asyncAddEntry.

The beforeAddEntry method calls the intercept and interceptWithNumberOfMessages methods of all broker entry interceptors and prepends a new broker entry metadata buffer to the original buffer (though the result is just a composite buffer, so no copy is involved).
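
For illustration, the prepend step works roughly like this (a minimal sketch using plain Netty APIs, not the actual ManagedLedgerInterceptorImpl code; buildMetadata is a hypothetical stand-in for the real intercept/interceptWithNumberOfMessages work):

// Sketch: prepend a broker entry metadata header without copying the payload.
// The header and the original buffer are combined into a CompositeByteBuf, so
// the "prepend" itself is cheap, but serializing the metadata still costs CPU
// time on whatever thread runs it.
ByteBuf prependBrokerEntryMetadata(ByteBufAllocator allocator, ByteBuf payload, int numberOfMessages) {
    // buildMetadata is hypothetical; it stands in for the interceptor calls.
    ByteBuf metadata = buildMetadata(allocator, numberOfMessages);
    CompositeByteBuf composite = allocator.compositeBuffer(2);
    composite.addComponents(true /* advance writerIndex */, metadata, payload);
    return composite;
}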

There is a risk that, when many producers send messages to the same managed ledger concurrently, asyncAddEntry could block the Netty I/O thread for some time and cause a performance regression.

Modifications

In PersistentTopic#publishMessage, expose a getExecutor() method on ManagedLedger and execute ManagedLedger#asyncAddEntry in that executor. The change from #12606 is moved to PersistentTopic as well, so that the buffer is retained before switching to another thread.
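
A minimal sketch of the pattern (simplified names, not the PR's exact code; it assumes getExecutor() returns a standard Executor and that asyncAddEntry retains the buffer again internally, as in the snippet quoted later in this thread):

// Sketch only: retain the buffer on the calling (Netty I/O) thread, then run
// ManagedLedger#asyncAddEntry on the ledger's own executor. Retaining before
// the hand-off is the #12606 change that this PR moves into PersistentTopic.
public void publishMessage(ByteBuf headersAndPayload, PublishContext publishContext) {
    headersAndPayload.retain();
    ledger.getExecutor().execute(() -> {
        ledger.asyncAddEntry(headersAndPayload, publishContext.getNumberOfMessages(),
                this /* AddEntryCallback */, publishContext);
        // asyncAddEntry takes its own retain, so the hand-off retain is released.
        headersAndPayload.release();
    });
}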

After that, only afterAddEntryToQueue needs to be synchronized with the other synchronized methods of ManagedLedgerImpl. P.S. I don't actually think synchronized is needed here, but the logic is not as trivial as beforeAddEntryToQueue and beforeAddEntry, so I kept it synchronized.

ManagedLedgerImpl#asyncAddEntry still doesn't switch threads, so it is still possible for a downstream application to serialize calls to asyncAddEntry, either by adding a lock (e.g. synchronized) or by executing this method in a single thread.
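
For example, a downstream caller could guard the call site with a per-ledger lock (a sketch, assuming per-ledger ordering is the only requirement; callback here is an AsyncCallbacks.AddEntryCallback):

// Sketch: serialize asyncAddEntry calls with a per-ledger lock, since
// ManagedLedgerImpl#asyncAddEntry no longer switches threads itself.
private final Object ledgerLock = new Object();

void addEntryOrdered(ManagedLedger ledger, ByteBuf buffer, int numberOfMessages,
                     AsyncCallbacks.AddEntryCallback callback, Object ctx) {
    synchronized (ledgerLock) {
        ledger.asyncAddEntry(buffer, numberOfMessages, callback, ctx);
    }
}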

Documentation

  • [ ] doc
  • [ ] doc-required
  • [x] doc-not-needed
  • [ ] doc-complete

Matching PR in forked repository

PR in forked repository: BewareMyPower#40

@github-actions github-actions bot added the doc-not-needed label Feb 13, 2025
@BewareMyPower BewareMyPower self-assigned this Feb 13, 2025
@BewareMyPower BewareMyPower added the type/enhancement and release/4.0.3 labels Feb 13, 2025
@BewareMyPower BewareMyPower marked this pull request as draft February 13, 2025 13:00
@BewareMyPower BewareMyPower marked this pull request as ready for review February 13, 2025 13:17
@BewareMyPower BewareMyPower marked this pull request as draft February 13, 2025 14:07
@BewareMyPower BewareMyPower marked this pull request as ready for review February 13, 2025 14:09
@merlimat (Contributor) left a comment

An extra context switch for each entry is costly, especially when you have many small entries and little or no batching. That's why we put it on the same thread.

If the interceptor needs to do expensive work, maybe only the interceptor part should be done in a different thread, so that it doesn't affect the case where no interceptor is used.
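
A hypothetical sketch of that suggestion (not code from this PR): hop threads only when an interceptor is configured, so the common no-interceptor path stays on the calling thread:

// Hypothetical sketch: pay the context switch only when interceptor work is
// configured; otherwise stay on the calling thread.
if (managedLedgerInterceptor == null) {
    internalAsyncAddEntry(addOperation); // fast path, no thread hop
} else {
    executor.execute(() -> {
        managedLedgerInterceptor.beforeAddEntry(addOperation, addOperation.getNumberOfMessages());
        internalAsyncAddEntry(addOperation);
    });
}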

@lhotari lhotari added the release/blocker label Feb 13, 2025
@lhotari (Member) commented Feb 13, 2025

> An extra context switch for each entry is costly, especially when you have many small entries and little or no batching. That's why we put it on the same thread.

@merlimat The thread switching was added in PR #9039, back in December 2020. The reason for making this change now is a performance concern with the #23940 changes, which removed the thread switching. This is the current asyncAddEntry after #23940:

public void asyncAddEntry(ByteBuf buffer, int numberOfMessages, AddEntryCallback callback, Object ctx) {
    if (log.isDebugEnabled()) {
        log.debug("[{}] asyncAddEntry size={} state={}", name, buffer.readableBytes(), state);
    }
    // retain buffer in this thread
    buffer.retain();
    // Jump to specific thread to avoid contention from writers writing from different threads
    final var addOperation = OpAddEntry.createNoRetainBuffer(this, buffer, numberOfMessages, callback, ctx,
            currentLedgerTimeoutTriggered);
    var added = false;
    try {
        // Use synchronized to ensure if `addOperation` is added to queue and fails later, it will be the first
        // element in `pendingAddEntries`.
        synchronized (this) {
            if (managedLedgerInterceptor != null) {
                managedLedgerInterceptor.beforeAddEntry(addOperation, addOperation.getNumberOfMessages());
            }
            final var state = STATE_UPDATER.get(this);
            beforeAddEntryToQueue(state);
            pendingAddEntries.add(addOperation);
            added = true;
            afterAddEntryToQueue(state, addOperation);
        }
    } catch (Throwable throwable) {
        if (!added) {
            addOperation.failed(ManagedLedgerException.getManagedLedgerException(throwable));
        } // else: all elements of `pendingAddEntries` will fail in another thread
    }
}

In Pulsar use cases, synchronization of CPU-intensive operations (or blocking I/O operations) in Netty I/O threads could cause performance regressions. In this case, it would impact use cases where a large number of producers produce to a single topic.
Blocking I/O threads has a broader impact, since it affects the Netty I/O of all connections sharing the same I/O thread.

Before #23940, the code looked like this:

public void asyncAddEntry(ByteBuf buffer, int numberOfMessages, AddEntryCallback callback, Object ctx) {
    if (log.isDebugEnabled()) {
        log.debug("[{}] asyncAddEntry size={} state={}", name, buffer.readableBytes(), state);
    }
    // retain buffer in this thread
    buffer.retain();
    // Jump to specific thread to avoid contention from writers writing from different threads
    executor.execute(() -> {
        OpAddEntry addOperation = OpAddEntry.createNoRetainBuffer(this, buffer, numberOfMessages, callback, ctx,
                currentLedgerTimeoutTriggered);
        internalAsyncAddEntry(addOperation);
    });
}

btw, in the Pulsar code base we have a problem with how IO threads are used: they are used to process work that shouldn't be handled on IO threads at all. I have created issue #23865 for this. There should be a separate thread pool for running blocking operations and CPU-intensive synchronized operations.
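
As a rough illustration of that direction (hypothetical code, not taken from #23865; Request, Response and doExpensiveWork are placeholders): hand blocking or CPU-heavy work to a dedicated pool so the event loop stays free:

// Hypothetical sketch: keep the Netty event loop free by running blocking or
// CPU-intensive work on a dedicated pool.
private final ExecutorService heavyWorkPool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

void handleRequest(ChannelHandlerContext ctx, Request request) {
    heavyWorkPool.execute(() -> {
        Response response = doExpensiveWork(request); // blocking / CPU-heavy
        // writeAndFlush is safe to call from any thread; Netty schedules it
        // on the channel's event loop.
        ctx.writeAndFlush(response);
    });
}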

@lhotari (Member) left a comment

Great work @BewareMyPower. Some comments added in this first pass.

@BewareMyPower BewareMyPower marked this pull request as draft February 14, 2025 01:56
@BewareMyPower BewareMyPower marked this pull request as ready for review February 14, 2025 02:39
@BewareMyPower (Contributor, Author) commented Feb 14, 2025

> @merlimat The thread switching was added in PR #9039, already in December 2020.

@merlimat @lhotari To correct this: the thread switch is actually much older behavior, introduced in #1521.

This PR intends to decouple ManagedLedger#asyncAddEntry from PersistentTopic#asyncAddEntry so that the managed ledger interface is more flexible for downstream protocol handlers to use.

After that, all write operations from the Pulsar client keep the original behavior of switching to the managed ledger's executor before calling ManagedLedger#asyncAddEntry.

However, for downstream use (for example, in my Kafka protocol handler implementation), PersistentTopic#publishMessage is not called in an I/O thread; it's called in an independent worker thread. I can then choose to call persistentTopic.getManagedLedger().asyncAddEntry(/* ... */) in order, either by adding the synchronized keyword or by using the same worker thread for the same topic (see the sketch below).
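
A sketch of the per-topic worker approach (hypothetical routing code, not the actual protocol handler): all writes for a topic hash to the same single-threaded executor, so asyncAddEntry is called in order without a lock:

// Hypothetical sketch: route every write for a given topic to the same
// single-threaded executor so calls stay ordered, lock-free.
private final ExecutorService[] workers = new ExecutorService[8];
{
    for (int i = 0; i < workers.length; i++) {
        workers[i] = Executors.newSingleThreadExecutor();
    }
}

void produce(PersistentTopic persistentTopic, ByteBuf buffer, int numberOfMessages,
             AsyncCallbacks.AddEntryCallback callback) {
    int idx = Math.floorMod(persistentTopic.getName().hashCode(), workers.length);
    workers[idx].execute(() -> persistentTopic.getManagedLedger()
            .asyncAddEntry(buffer, numberOfMessages, callback, null));
}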

The comment here makes sense to a certain extent, but it raises a new topic (thread switching vs. synchronized) that is beyond the scope of this PR. In any case, the existing thread-switching approach already achieves high publish performance, as verified by many benchmarks.

@BewareMyPower BewareMyPower marked this pull request as draft February 14, 2025 07:46
@BewareMyPower BewareMyPower marked this pull request as ready for review February 14, 2025 07:48