Flink 源码分析(二):standalone 集群启动流程

更新至 Flink 1.9.0 版本

启动脚本

JobManager 启动脚本

执行 bin/jobmanager.sh start 启动 JobManager 流程:

bin/config.sh
bin/daemon.sh start standalonesession 

在 daemon.sh 脚本中,standalonesession 入口类为 org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint

TaskManager 启动脚本

执行 bin/taskmanager.sh start 启动 TaskManager 流程:

bin/config.sh
bin/daemon.sh start taskexecutor 

在 daemon.sh 脚本中,taskexecutor 入口类为 org.apache.flink.runtime.entrypoint.org.apache.flink.runtime.taskexecutor.TaskManagerRunner

JobManager 启动

在 StandaloneSessionClusterEntrypoint 的 main 方法里,调用 ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint clusterEntrypoint) 静态方法:

StandaloneSessionClusterEntrypoint entrypoint = new StandaloneSessionClusterEntrypoint(configuration);

ClusterEntrypoint.runClusterEntrypoint(entrypoint);

类 StandaloneSessionClusterEntrypoint 的继承体系:

调用 ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint clusterEntrypoint) 静态方法,调用 ClusterEntrypoint 的 void runCluster(Configuration configuration) 实例方法启动 JobManager。流程如下:

  1. 初始化服务;
  2. 创建集群组件。

初始化服务

调用 ClusterEntrypoint 的 void initializeServices(Configuration configuration) 实例方法初始化服务。

服务包括:

org.apache.flink.runtime.rpc.RpcService
org.apache.flink.runtime.highavailability.HighAvailabilityServices
org.apache.flink.runtime.blob.BlobServer
org.apache.flink.runtime.heartbeat.HeartbeatServices
org.apache.flink.runtime.metrics.MetricRegistry
org.apache.flink.runtime.dispatcher.ArchivedExecutionGraphStore

创建集群组件

抽象工厂设计模式创建集群组件:

final DispatcherResourceManagerComponentFactory dispatcherResourceManagerComponentFactory = createDispatcherResourceManagerComponentFactory(configuration);

clusterComponent = dispatcherResourceManagerComponentFactory.create(  
    configuration,
    commonRpcService,
    haServices,
    blobServer,
    heartbeatServices,
    metricRegistry,
    archivedExecutionGraphStore,
    new RpcMetricQueryServiceRetriever(metricRegistry.getMetricQueryServiceRpcService()),
    this);

类 SessionDispatcherResourceManagerComponentFactory 继承体系:

服务组件包括:

org.apache.flink.runtime.resourcemanager.ResourceManager
org.apache.flink.runtime.webmonitor.WebMonitorEndpoint
org.apache.flink.runtime.metrics.groups.JobManagerMetricGroup
org.apache.flink.runtime.dispatcher.Dispatcher

TaskManager 启动

TODO