Proposing a way to alter `ClusterInfo` #1157

garyparrot · 2022-11-21T17:42:36Z

這個 Issue 議題應該新增一個手法能夠讓我們對 ClusterInfo 套用一些變更，他背後的目的是：

現行想測試 Balancer，我們必須要開啟叢集才能拿到那個分佈，這個測試方法太麻煩，如果想測個特別的情境，還要想辦法寫程式去弄出他
Proposing a simple Balancer benchmark suite #1108 提到為了衡量 Balancer 的效果我們需要去生成虛假的分佈，有了這個修改 ClusterInfo 的機制，我們可以自己從 ClusterInfo.empty() 中修改出一個虛假的分佈，這對測試來說很方便
ClusterLogAllocation 的修改函數可以無縫換成一個 Alteration 的操作

我想到的程式碼概念如下：

public class Ignore {
  public static void main(String[] args) {
    ClusterInfo<ReplicaInfo> fakeCluster = ClusterInfo.empty()
        .apply(ClusterInfoAlteration.addNode(1, 2, 3))
        .apply(ClusterInfoAlteration.addFolders(1, "/ssd1", "/ssd2", "/ssd3"))
        .apply(ClusterInfoAlteration.addFolders(2, "/ssd1", "/ssd2", "/ssd3"))
        .apply(ClusterInfoAlteration.addFolders(3, "/ssd1", "/ssd2", "/ssd3"))
        .apply(ClusterInfoAlteration.addTopic("ThePipe", 100, (short) 2))
        .apply(ClusterInfoAlteration.addTopicSize(DataSize.GB.of(10), (tp) -> tp.partition() % 2 == 0));
  }
}

/**
 * A functional interface to transform a given {@link ClusterInfo} with specific logic
 */
@FunctionalInterface
public interface ClusterInfoAlteration<T extends ReplicaInfo> {

  /**
   * Alter the given source ClusterInfo with specific logic
   *
   * @param sourceCluster the source ClusterInfo
   * @return a new ClusterInfo that is being processed with specific logic based on the given source ClusterInfo
   */
  ClusterInfo<T> apply(ClusterInfo<T> sourceCluster);

  static <T extends ReplicaInfo> ClusterInfoAlteration<T> addNode(Integer... brokerIds) {
    return addNode(Arrays.asList(brokerIds));
  }

  static <T extends ReplicaInfo> ClusterInfoAlteration<T> addNode(List<Integer> brokerIds) {
    return (sourceCluster) -> sourceCluster;
  }

  static <T extends ReplicaInfo> ClusterInfoAlteration<T> addFolders(int brokerId, String... folders) {
    return addFolders(Map.of(brokerId, Set.copyOf(Arrays.asList(folders))));
  }

  static <T extends ReplicaInfo> ClusterInfoAlteration<T> addFolders(Map<Integer, Set<String>> folders) {
    return (sourceCluster) -> sourceCluster;
  }

  static <T extends ReplicaInfo> ClusterInfoAlteration<T> addTopic(String topicName, int partitionSize, short replicaFactor) {
    return (sourceCluster) -> sourceCluster;
  }

  static <T extends ReplicaInfo> ClusterInfoAlteration<T> addTopicSize(DataSize size, Predicate<TopicPartition> filter) {
    return (sourceCluster) -> sourceCluster;
  }

}

我覺得可以討論的問題

需要做成 Builder Pattern 嗎
通常我們會做一系列的變更，這個變更很長的時候，會不會中間很多計算都是白費的（大量的 ClusterInfo 建立之後又複製另外一個），需不需要套用某種優化。

The text was updated successfully, but these errors were encountered:

chia7712 · 2022-11-22T04:07:10Z

這個概念不錯，可以順便整合ClusterLogAllocation

我是建議用Builder pattern，它要能吃一個現存的 ClusterInfo，並且提供多種 setter，在串接的過程中不需要事先產生新的 ClusterInfo，這樣有幾個好處：

ClusterInfo 依然是 immutable
變更可以最後 build 的時候才套用
可以移除 ClusterLogAllocation

garyparrot · 2022-11-23T13:58:26Z

我會試著弄成 Builder 版本。

可以移除 ClusterLogAllocation

如果 Balancer 回傳一個 Plan，他的 Plan 的 ClusterInfo 裡面多了 3 個節點和 8 個根本不在叢集內的 data folder，我們有需要防禦這種詭異的使用情境嗎。如果用 ClusterLogAllocation 的話我們至少 API 可以給一點暗示，或是從物件函數上面避免這些奇怪的操作。

如果要移除的話，這部分我們可能沒辦法防禦，如果不防禦的話可能就 Balancer 實作自己注意，不要做這類奇怪的變更是嗎

chia7712 · 2022-11-23T14:22:27Z

如果用 ClusterLogAllocation 的話我們至少 API 可以給一點暗示，或是從物件函數上面避免這些奇怪的操作。

或者把 CLA 的方法也放到 builder 上面，概念上就是 builder 可以吃一個 ClusterInfo 作為基底，然後提供各種 setter and alter 做變更

garyparrot self-assigned this Nov 21, 2022

This was referenced Nov 23, 2022

Implementation of ClusterInfoBuilder #1165

Merged

Replace ClusterLogAllocation usage by ClusterInfoBuilder #1184

Merged

garyparrot closed this as completed in #1184 Dec 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposing a way to alter `ClusterInfo` #1157

Proposing a way to alter `ClusterInfo` #1157

garyparrot commented Nov 21, 2022

chia7712 commented Nov 22, 2022

garyparrot commented Nov 23, 2022

chia7712 commented Nov 23, 2022

Proposing a way to alter ClusterInfo #1157

Proposing a way to alter ClusterInfo #1157

Comments

garyparrot commented Nov 21, 2022

chia7712 commented Nov 22, 2022

garyparrot commented Nov 23, 2022

chia7712 commented Nov 23, 2022

Proposing a way to alter `ClusterInfo` #1157

Proposing a way to alter `ClusterInfo` #1157