Apacha Druid - xibryan/notes GitHub Wiki

  1. 查询流程 image

  2. 任务下发

image

1. 调用overlord接口创建supervisor
rest接口:POST [https://{{ip}}:26203/druid/indexer/v1/supervisor](https://%7B%7Bip%7D%7D:26203/druid/indexer/v1/supervisor)
代码入口:org.apache.druid.indexing.overlord.supervisor.SupervisorResource#specPost
流程:SupervisorResource.specPost -> SupervisorManager.createOrUpdateAndStartSupervisor -> SupervisorManager.createAndStartSupervisorInternal -> SeekableStreamSupervisor.start

2. overlord定时创建Task
KafkaSupervisor会定时创建RunNotice检查任务状态是否正常,是否需要创建新的任务。
代码入口:org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor#buildRunTask
RunNotice处理入口:org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor.RunNotice#handle

3. MiddleManager创建Peon进程
MiddleManager根据配置文件以及任务参数中的配置项拼装peon进程需要的参数,并拉起peon进程。
代码入口:org.apache.druid.indexing.overlord.ForkingTaskRunner#run

4. Peon进程读取kafka数据
代码入口:org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner#runInternal
kafka任务实现类:IncrementalPublishingKafkaIndexTaskRunner 继承自SeekableStreamIndexTaskRunner

5. overlord结束读取数据
overlord检查任务任务周期已经结束,会停止peon继续读取数据。peon结束读取后开始制作Segment。
代码入口:org.apache.druid.indexing.seekablestream.supervisor.SeekableStreamSupervisor#checkTaskDuration

6. Publish Segment并等待handoff
Segment制作完成后发布到HDFS上并等待Historical进程加载。
代码入口:org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner#publishAndRegisterHandoff
  1. 入库流程:

image