Support intra-broker throttling (replica.alter.log.dirs.io.max.bytes.per.second) #2145

Open
wants to merge 6 commits into
base: main

Conversation

dongjinleekr
Contributor

This PR resolves #1851.

@dongjinleekr
Contributor Author

This PR is a WIP; I will test this feature in our in-house fork (@naver) this week. Stay tuned!! 🙃

@dongjinleekr
Contributor Author

@aswinayyolath @mhratson @jiao-zhangS Could you kindly have a look? We adopted this patch to our in-house distribution and confirmed it works correctly. 🙏

- Add: setLogDirThrottles, setLogDirThrottledRateIfNecessary
- Rename: removeReplicationThrottledRateFromBroker → removeThrottledRatesFromBroker
…oker rebalancing

    - Rename: ReplicationThrottleHelper.clearThrottles → clearInterBrokerThrottles
    - Add ReplicationThrottleHelper.clearIntraBrokerThrottles
    - Executor.intraBrokerMoveReplicas now calls ReplicationThrottleHelper.setLogDirThrottles, ReplicationThrottleHelper.clearIntraBrokerThrottles
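To illustrate the set/clear pairing described in the commit notes above, here is a minimal Java sketch. The `LogDirThrottler` interface and the `IntraBrokerMoveSketch` class are hypothetical stand-ins, not the project's actual classes; only the method names `setLogDirThrottles` and `clearIntraBrokerThrottles` come from this PR, and their real signatures are not shown in this conversation. The `try`/`finally` is an assumption about one way to guarantee cleanup, not a claim about how the PR implements it.

```java
/** Hypothetical stand-in for the throttle helper; signatures are assumed for illustration. */
interface LogDirThrottler {
  void setLogDirThrottles(long bytesPerSec);
  void clearIntraBrokerThrottles();
}

final class IntraBrokerMoveSketch {
  private IntraBrokerMoveSketch() { }

  /**
   * Applies the log dir throttle (if configured), runs the intra-broker movement,
   * and clears the throttle afterwards even if the movement throws.
   * A null throttle is treated as "throttling disabled", so nothing is set or cleared.
   */
  static void moveWithThrottle(LogDirThrottler throttler, Long logDirThrottle, Runnable movement) {
    if (logDirThrottle != null) {
      throttler.setLogDirThrottles(logDirThrottle);
    }
    try {
      movement.run();
    } finally {
      if (logDirThrottle != null) {
        throttler.clearIntraBrokerThrottles();
      }
    }
  }
}
```

With a null throttle the helper is never touched, which matches the `if (_logDirThrottle != null)` guard visible in the diff under review.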

@cpaika cpaika left a comment

Will make things a lot more stable for JBOD clusters, good change

@@ -1707,6 +1729,11 @@ private void intraBrokerMoveReplicas() {
waitForIntraBrokerReplicaTasksToFinish();
inExecutionTasks = inExecutionTasks();
}

if (_logDirThrottle != null) {

For my understanding, why do the variable names here start with an underscore?

Contributor Author

For consistency with _replicationThrottle.

@dongjinleekr
Contributor Author

@cpaika @adkafka

Excuse me. Is there any issue with this PR? It initially worked on 2.5.138, and 2.5.141 was already released last month, but this PR has yet to be merged or released.

If any modifications are needed, don't hesitate to leave me a mention.

@adkafka

adkafka commented Nov 18, 2024

> @cpaika @adkafka
>
> Excuse me. Is there any issue with this PR? It initially worked on 2.5.138, and 2.5.141 was already released last month, but this PR has yet to be merged or released.
>
> If any modifications are needed, don't hesitate to leave me a mention.

I don't see any issues with this PR, but I'm not a maintainer of this project. We need a maintainer to review this and add their approval.

@aswinayyolath
Contributor

@CCisGG could you please review this PR?

@dongjinleekr
Contributor Author

@CCisGG Hello. Could you please have a look when you are free? 🙏

@CCisGG
Contributor

CCisGG commented Dec 8, 2024

Hi @mhratson, would you mind taking a look at this one? Thanks!

@imans777

imans777 commented Jan 4, 2025

Would anyone care to review this?
It's really a must-have feature.

Contributor

@CCisGG CCisGG left a comment

Added some comments/questions.

Some other questions I have:

  1. Did you test that, with your change, everything still works properly when no replication or intra-broker throttling is set (which I think is the default)?
  2. Is there any case where this intra-broker throttling can interfere with inter-broker throttling? E.g. what happens if both are enabled together?
  3. Please update Configuration.md for this new config.

* <code>default.log.dir.throttle</code>
*/
public static final String DEFAULT_LOG_DIR_THROTTLE_CONFIG = "default.log.dir.throttle";
public static final Long DEFAULT_DEFAULT_LOG_DIR_THROTTLE = null;
Contributor

Would it be more reasonable to use Long.MAX_VALUE rather than null? I'm a bit concerned about having a null default value for a LONG type. I'm not entirely sure what will happen here by default.

Contributor

Nvm. I just realized that the default replication throttle is also null. This is fine then.
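For readers wondering how a null default for a `Long` config behaves in practice, the following is a small sketch of the "null means disabled" convention discussed above. The `ThrottleConfigSketch` class, its plain `Map<String, String>` input, and both method names are hypothetical and only for illustration; the real project reads this value through its own config classes, which are not shown in this conversation. Only the key `default.log.dir.throttle` is taken from the diff.

```java
import java.util.Map;

final class ThrottleConfigSketch {
  // Config key taken from the diff under review.
  static final String DEFAULT_LOG_DIR_THROTTLE_CONFIG = "default.log.dir.throttle";

  private ThrottleConfigSketch() { }

  /**
   * Returns the configured throttle in bytes/sec, or null when the key is absent.
   * The return type stays boxed (Long) so that "unset" can be represented;
   * assigning a null Long to a primitive long would throw a NullPointerException.
   */
  static Long logDirThrottle(Map<String, String> props) {
    String raw = props.get(DEFAULT_LOG_DIR_THROTTLE_CONFIG);
    return raw == null ? null : Long.valueOf(raw.trim());
  }

  /** Hypothetical helper: throttling is considered enabled only when a rate is present. */
  static boolean throttlingEnabled(Long throttle) {
    return throttle != null;
  }
}
```

The main design point is that callers must check for null before using the value, as the `if (_logDirThrottle != null)` guard in this PR does. A `Long.MAX_VALUE` default would avoid the null check but would also write a throttle config to the brokers on every run.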

@@ -830,7 +834,7 @@ public synchronized void executeProposals(Collection<ExecutionProposal> proposal
requestedIntraBrokerPartitionMovementConcurrency, requestedClusterLeadershipMovementConcurrency,
requestedBrokerLeadershipMovementConcurrency, requestedExecutionProgressCheckIntervalMs, replicaMovementStrategy,
isTriggeredByUserRequest, loadMonitor);
-startExecution(loadMonitor, null, removedBrokers, replicationThrottle, isTriggeredByUserRequest);
+startExecution(loadMonitor, null, removedBrokers, replicationThrottle, logDirThrottle, isTriggeredByUserRequest);
Contributor

Just curious, would it make more sense to have a separate method for intra-broker throttling, since this parameter only applies for inter-broker movement?

@@ -918,7 +922,7 @@ public synchronized void executeDemoteProposals(Collection<ExecutionProposal> pr
initProposalExecution(proposals, demotedBrokers, concurrentSwaps, null, 0,
requestedClusterLeadershipMovementConcurrency, requestedBrokerLeadershipMovementConcurrency,
requestedExecutionProgressCheckIntervalMs, replicaMovementStrategy, isTriggeredByUserRequest, loadMonitor);
-startExecution(loadMonitor, demotedBrokers, null, replicationThrottle, isTriggeredByUserRequest);
+startExecution(loadMonitor, demotedBrokers, null, replicationThrottle, replicationThrottle, isTriggeredByUserRequest);
Contributor

What is the implication of setting the log dir throttle to the replication throttle here?

for (Map.Entry<String, Set<String>> entry : throttledReplicas.entrySet()) {
setThrottledReplicas(entry.getKey(), entry.getValue());
}
LOG.info("Setting a rebalance throttle of {} bytes/sec", throttleRate);
Contributor

Why do we remove the "if (throttlingEnabled())" check here?

@@ -336,7 +355,7 @@ private void removeThrottledReplicasFromTopic(String topic, Set<String> replicas
}
}

-private void removeThrottledRateFromBroker(Integer brokerId)
+private void removeThrottledRatesFromBroker(Integer brokerId)
Contributor

Does this method now remove both throttle rates? If so, we should add a comment to specify that.
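As a rough illustration of what a documented "remove both throttle rates" method could look like, here is a self-contained sketch. It operates on a plain in-memory map rather than a live broker, and the class `ThrottleRemovalSketch` and the method `removeThrottledRates` are hypothetical names, not the PR's code. The three keys are Kafka's dynamic broker configs for inter-broker throttling (`leader.replication.throttled.rate`, `follower.replication.throttled.rate`) and intra-broker throttling (`replica.alter.log.dirs.io.max.bytes.per.second`, from the PR title). Whether the renamed method in this PR removes exactly these three is an assumption based on the rename and the question above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ThrottleRemovalSketch {
  static final String LEADER_THROTTLED_RATE = "leader.replication.throttled.rate";
  static final String FOLLOWER_THROTTLED_RATE = "follower.replication.throttled.rate";
  static final String LOG_DIR_THROTTLED_RATE = "replica.alter.log.dirs.io.max.bytes.per.second";

  private ThrottleRemovalSketch() { }

  /**
   * Removes BOTH kinds of throttle rate from the given broker config:
   * the inter-broker replication rates (leader and follower) and the
   * intra-broker log dir rate. Keys that are not present are skipped.
   *
   * @param brokerConfig mutable view of one broker's dynamic config (assumed for illustration)
   * @return the keys that were actually removed, in a fixed order
   */
  static List<String> removeThrottledRates(Map<String, String> brokerConfig) {
    List<String> removed = new ArrayList<>();
    for (String key : List.of(LEADER_THROTTLED_RATE, FOLLOWER_THROTTLED_RATE, LOG_DIR_THROTTLED_RATE)) {
      if (brokerConfig.remove(key) != null) {
        removed.add(key);
      }
    }
    return removed;
  }
}
```

Returning the removed keys makes it easy to log which throttles were in effect, and the Javadoc states the "both rates" behavior explicitly, which is the clarification requested above.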


Successfully merging this pull request may close these issues.

disk balance
6 participants