Implement accessors to read dataset events defined as inlet #39367

Merged
merged 1 commit into from
May 3, 2024

Conversation

uranusjr
Member

@uranusjr uranusjr commented May 2, 2024

This is kind of the other side of dataset_events implemented in #38481. The inlet_events context key allows the task to access past events associated with a dataset that’s defined in the task’s inlets, like this:

@task(inlets=my_ds)
def my_task(inlet_events):
    last_event_timestamp = inlet_events[my_ds][-1].timestamp

Note that inlets is not logically related to using the dataset to schedule a DAG. An inlet dataset may or may not also be present in the DAG’s schedule. Consequently, events accessed from inlet_events are not filtered by any scheduling logic; all past events are simply returned through a list-like interface.

This PR implements the basic structure, including a lazy list-like structure that queries the database on demand. I plan to add more changes in future PRs after this is merged (all targeting 2.10):

  • Rename dataset_events to outlet_events for consistency.
  • Allow slicing syntax e.g. inlet_events[ds][:-3].
  • Some refactoring to consolidate other list-like and lazy db access interfaces we provide elsewhere, most significantly LazyXComAccess.
  • Add documentation on this.
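
For readers unfamiliar with the "lazy list-like" idea mentioned above, here is a minimal sketch of the general pattern. It is not the code from this PR: the class name LazyEventList, the events table, and its uri and ts columns are all hypothetical, and an in-memory SQLite database stands in for Airflow's metadata database. The point is only that no query runs until the sequence is indexed or measured.

```python
import sqlite3
from collections.abc import Sequence


class LazyEventList(Sequence):
    """Hypothetical list-like view; a query runs only when it is accessed."""

    def __init__(self, conn, dataset_uri):
        # Nothing is fetched here; we only remember how to query later.
        self._conn = conn
        self._uri = dataset_uri

    def __len__(self):
        row = self._conn.execute(
            "SELECT COUNT(*) FROM events WHERE uri = ?", (self._uri,)
        ).fetchone()
        return row[0]

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing is listed as future work in the PR; left out of this sketch.
            raise TypeError("slicing is not supported in this sketch")
        size = len(self)
        if index < 0:
            index += size  # support negative indexes such as [-1]
        if not 0 <= index < size:
            raise IndexError(index)
        # Fetch exactly one row, ordered by timestamp.
        row = self._conn.execute(
            "SELECT ts FROM events WHERE uri = ? ORDER BY ts LIMIT 1 OFFSET ?",
            (self._uri, index),
        ).fetchone()
        return row[0]


# Stand-in database with made-up events for two datasets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (uri TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("my_ds", "2024-05-01"), ("my_ds", "2024-05-02"), ("other", "2024-05-03")],
)

events = LazyEventList(conn, "my_ds")
last_timestamp = events[-1]  # the query happens here, not at construction
```

Subclassing collections.abc.Sequence gives iteration and membership tests for free once __len__ and __getitem__ are defined, which is one way to get a list-like interface without loading every row up front.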

@uranusjr uranusjr requested review from potiuk, kaxil, XD-DENG and ashb as code owners May 2, 2024 10:23
@uranusjr uranusjr force-pushed the dataset-inlet-event-access branch from 9fd41f0 to 12be387 Compare May 2, 2024 10:39
@uranusjr uranusjr force-pushed the dataset-inlet-event-access branch from 12be387 to 0a33bbb Compare May 3, 2024 03:07
@eladkal eladkal added this to the Airflow 2.10.0 milestone May 3, 2024
@uranusjr uranusjr merged commit 2294001 into apache:main May 3, 2024
41 checks passed
@uranusjr uranusjr deleted the dataset-inlet-event-access branch May 3, 2024 13:42
@utkarsharma2 utkarsharma2 added the type:new-feature Changelog: New Features label Jun 3, 2024
4 participants