Airflow and core concepts

# Apache Airflow

Airflow is a platform created by the community to programmatically author, schedule, and monitor workflows.

# Core concepts

## DAG

> A DAG (Directed Acyclic Graph) is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies.

* DAG: Directed Acyclic Graph
* A collection of all the tasks you want to run
* Organized to reflect the relationships and dependencies between those tasks
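To make "directed" and "acyclic" concrete, here is a minimal plain-Python sketch that does not use Airflow itself. It models dependencies as a mapping from each task to its upstream tasks and uses the standard library's `graphlib` to derive a valid run order. The task names (`extract`, `transform`, `load`, `report`) and the helper functions are hypothetical, chosen only for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"transform"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True when the dependency graph contains no cycle."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)  # "extract" always precedes "transform", which precedes "load" and "report"
```

A graph such as `{"a": {"b"}, "b": {"a"}}` has no valid run order, which is why a workflow definition must be acyclic; `is_acyclic` returns `False` for it.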


### Here’s a basic example DAG:

![basic-dag.png]( "basic-dag.png")

## Operators

> The description of a single task; it is usually atomic. For example, the BashOperator is used to execute bash commands.

* Describes a single task
* Usually atomic
* BashOperator: used to execute bash commands


## Tasks

> A parameterised instance of an Operator; a node in the DAG.

* A parameterised instance of an Operator
* A node in the DAG
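The relationship between an Operator and a Task can be sketched in plain Python as a class and its parameterised instances. This is only a model, not Airflow's API: `EchoOperator`, its fields, and its `execute` method are hypothetical stand-ins for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EchoOperator:
    """Hypothetical stand-in for an operator: describes one atomic unit of work."""

    task_id: str  # unique name of the node in the DAG
    message: str  # parameter that distinguishes one task from another

    def execute(self) -> str:
        # A real operator would do its work here (for example, run a bash command).
        return f"[{self.task_id}] {self.message}"


# Two tasks: the same operator class, parameterised differently.
say_hello = EchoOperator(task_id="say_hello", message="hello")
say_bye = EchoOperator(task_id="say_bye", message="bye")

results = [task.execute() for task in (say_hello, say_bye)]
print(results)
```

The operator is written once; each task is one configured use of it, which is what makes a task a node that can be placed in a DAG.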


## Task Instances

> A specific run of a task; characterized as: a DAG, a Task, and a point in time. It has an indicative state: running, success, failed, skipped, ...

Much in the same way that a DAG is instantiated into a DAG Run each time it runs, the tasks under a DAG are instantiated into Task Instances.

An instance of a task is a specific run of that task for a given DAG (and thus for a given data interval). It is also the representation of a task that has state, indicating which stage of its lifecycle it is in.

* A specific run of a task
* Characterized by a DAG, a task, and a point in time

### The possible states for a Task Instance are

* none: The task has not yet been queued for execution (its dependencies are not yet met)
* scheduled: The scheduler has determined the task's dependencies are met and it should run
* queued: The task has been assigned to an Executor and is awaiting a worker
* running: The task is running on a worker (or on a local/synchronous executor)
* success: The task finished running without errors
* shutdown: The task was externally requested to shut down when it was running
* restarting: The task was externally requested to restart when it was running
* failed: The task had an error during execution and failed to run
* skipped: The task was skipped due to branching, LatestOnly, or similar
* upstream_failed: An upstream task failed and the Trigger Rule says we needed it
* up_for_retry: The task failed, but has retry attempts left and will be rescheduled
* up_for_reschedule: The task is a Sensor that is in reschedule mode
* deferred: The task has been deferred to a trigger
* removed: The task has vanished from the DAG since the run started

![task_lifecycle_diagram.png]( "task_lifecycle_diagram.png")
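The difference between `up_for_retry` and `failed` can be illustrated with a small plain-Python sketch. It is a simplified model rather than Airflow's scheduler logic: the `State` enum covers only a subset of the states listed above, and `state_after_error` and its parameters are hypothetical names.

```python
from enum import Enum


class State(str, Enum):
    """Subset of the task instance states listed above."""

    NONE = "none"
    SCHEDULED = "scheduled"
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    UP_FOR_RETRY = "up_for_retry"


# Simplified path of a task instance that succeeds on its first attempt.
HAPPY_PATH = [State.NONE, State.SCHEDULED, State.QUEUED, State.RUNNING, State.SUCCESS]


def state_after_error(tries_so_far: int, max_retries: int) -> State:
    """Return the state of a running task instance after it raises an error.

    tries_so_far counts attempts, including the one that just failed.
    """
    # One initial attempt plus max_retries retries are allowed in total.
    if tries_so_far <= max_retries:
        return State.UP_FOR_RETRY
    return State.FAILED


print(state_after_error(tries_so_far=1, max_retries=2).value)
print(state_after_error(tries_so_far=3, max_retries=2).value)
```

With two retries configured, the first and second failed attempts leave the instance in `up_for_retry`; only the third failure is final.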

