Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base date for fetching dag grid view must include selected run_id #34887

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

hterik
Copy link
Contributor

@hterik hterik commented Oct 12, 2023

Previously, if user set dag_run_id parameter in the url, that refers to a old run, which doesn't fit in the most recent 25 runs, then the requested run will not be selected.

This change fixes this by setting the base_date to a time where the run_id is known to exist if dag_run_id is provided as an explicit query parameter.

closes: #34723

@boring-cyborg boring-cyborg bot added area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues labels Oct 12, 2023
@hterik hterik force-pushed the fix_run_id_out_of_base_date_grid branch from e9e041b to 2a5f571 Compare October 17, 2023 07:34
@eladkal eladkal added this to the Airflow 2.7.3 milestone Oct 27, 2023
@jscheffl
Copy link
Contributor

I tried my best to review this PR but I was not able to get the UI running with breeze locally. GRID view did not show up anymore. Unfortunately I'm not really a TypeScript developer (noop level) and could not see the root cause.

Reading the code I fear a bit that most of the logic is made in backend in views.py. As already above I am a noop but would assume it is better to have the logic in the client side, similar like "If requested run_id not in runs loaded, then load an are where the run_id is contained.
But I'd leave a review to one of the UX experts... @pierrejeambrun / @bbovenzi

@sinwoobang
Copy link
Contributor

This bug significantly impacts the efficiency of navigating DagRuns, particularly when dealing with a high volume of items. The ability to quickly identify a specific DagRun is critical to my workflow, and the current issue hinders this process. A prompt resolution would greatly enhance productivity and is eagerly anticipated.

@hterik
Copy link
Contributor Author

hterik commented Nov 15, 2023

@ryanahamilton @ashb @bbovenzi @pierrejeambrun can you please help in reviewing or find someone else who can review this?

This has been pending for some time now and fixes a very annoying bug where one can not permalink to individual dagruns #34723

Copy link
Contributor

@bbovenzi bbovenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we need a custom context provider. I would prefer we try to have a useEffect inside of useGridData to detect only the first time runId is specified in the url params.

@hterik
Copy link
Contributor Author

hterik commented Nov 20, 2023

I don't see why we need a custom context provider. I would prefer we try to have a useEffect inside of useGridData to detect only the first time runId is specified in the url params.

It's been some time now so i don't remember the exact details, i think i tried that as first version but it didn't work for some reason. I wouldn't add context unnecessarily, as a I'm normally very anti global variables. :D

The context is needed to transfer the first selection between useSelection and useGridData. There are 14 different usages of these and it's today not 100% clear how they are connected, because the data is transferred between them using the already global variable useQuery and setSearchParams :)
The alternative would have been to use the return value of useSelection and prop-drill it into every use of useGridData instead.

The use of query params also introduces an ordering problem. If useGridData is created after useSelection. Then there is a risk that setSearchParams inside useSelection is called before any useGridData tries to read searchParams to lock the runId. I think this was actually the main problem that prevented me from storing the first selection inside useGridData, but i can't remember for sure.


(The placement of firstRunIdSetByUrl = useContext(DagRunSelectionContext) inside useSelection is purely cosmetic, to centralize all selection logic in one place. That is possible to easily move into useGridData)


If you see any better way to do this, i would be more than happy to use some other approach instead. Also feel free to overwrite this PR if you have some other example for this already.

@bbovenzi
Copy link
Contributor

I don't see why we need a custom context provider. I would prefer we try to have a useEffect inside of useGridData to detect only the first time runId is specified in the url params.

It's been some time now so i don't remember the exact details, i think i tried that as first version but it didn't work for some reason. I wouldn't add context unnecessarily, as a I'm normally very anti global variables. :D

The context is needed to transfer the first selection between useSelection and useGridData. There are 14 different usages of these and it's today not 100% clear how they are connected, because the data is transferred between them using the already global variable useQuery and setSearchParams :) The alternative would have been to use the return value of useSelection and prop-drill it into every use of useGridData instead.

The use of query params also introduces an ordering problem. If useGridData is created after useSelection. Then there is a risk that setSearchParams inside useSelection is called before any useGridData tries to read searchParams to lock the runId. I think this was actually the main problem that prevented me from storing the first selection inside useGridData, but i can't remember for sure.

(The placement of firstRunIdSetByUrl = useContext(DagRunSelectionContext) inside useSelection is purely cosmetic, to centralize all selection logic in one place. That is possible to easily move into useGridData)

If you see any better way to do this, i would be more than happy to use some other approach instead. Also feel free to overwrite this PR if you have some other example for this already.

That makes sense. You are right the ordering is a bit tricky since selection/griddata also edit the url params.

Let me pull this down locally and poke around today.

@bbovenzi
Copy link
Contributor

  1. Do you mind rebasing?

  2. Can we change the base date in the UI as well? Otherwise there is no way in the UI to reset the selection and get to the most recent dag runs.

@hterik
Copy link
Contributor Author

hterik commented Nov 29, 2023

Is it possible to take the base date changes separately as an improvement? This bug makes the most recent versions of airflow completely unusable for us, would be great to have some kind of fix in asap.
The situation where one navigates from old run to new run isn't even possible to reach today due to this bug so i don't think we are regressing anything? Maybe i'm missing something, can you explain the scenario a bit more in detail? Does this change break the base_date selector in any other way?

@bbovenzi
Copy link
Contributor

Is it possible to take the base date changes separately as an improvement? This bug makes the most recent versions of airflow completely unusable for us, would be great to have some kind of fix in asap. The situation where one navigates from old run to new run isn't even possible to reach today due to this bug so i don't think we are regressing anything? Maybe i'm missing something, can you explain the scenario a bit more in detail? Does this change break the base_date selector in any other way?

Sure I can do a follow up PR. Let's still rebase.

@hterik hterik force-pushed the fix_run_id_out_of_base_date_grid branch from 2a5f571 to 9522b33 Compare November 30, 2023 10:23
@hterik
Copy link
Contributor Author

hterik commented Nov 30, 2023

Rebased now.

I see the problem with base date now. The whole base date selector and the filter options stop working if a dag_run_id was present in the URL (though dag_run_id in url didn't work at all before so questionable if this is a regression).

I did a quick and dirty attempt of fixing it but couldn't find a good solution in short time.
Something like below did make both the selector and Clear filters button work. Except if one clicks the current date, because when base date is not present it assumes it is the same time. It feels like i'm slowly migrating essentially the whole useFilters and useSelection to the Main function, it's a bit out of my comfort zone to be making such big changes here. I also don't have much time to look deeper into this in the close future. If you feel like you have better control over this i would happy to hand it over.

const Main = () => {
  const [searchParams, setSearchParams] = useSearchParams();
  const [firstRunIdSetByUrl, setFirstRunIdSetByUrl] = useState(searchParams.get(RUN_ID));
  const [prevBasedate, setPrevBasedate] = useState(searchParams.get(BASE_DATE_PARAM));
  
  const baseDate = searchParams.get(BASE_DATE_PARAM);
  if (baseDate != prevBasedate) {    
    setPrevBasedate(baseDate)
    setFirstRunIdSetByUrl(null)
    searchParams.delete("dag_run_id")
    searchParams.delete("task_id")    
    setSearchParams(searchParams);
  }

@bbovenzi
Copy link
Contributor

@hterik Ok I'm happy to clean up this filters+main logic. Mind to rebase the PR first?

@ephraimbuddy
Copy link
Contributor

There's conflicts

@eladkal
Copy link
Contributor

eladkal commented Feb 16, 2024

@hterik Can you handle the rebase and resolve conflicts?

@jscheffl
Copy link
Contributor

Yeah, would be great - as this fix was pushed multiple times... next release will be cut in ~24h - if rebased and reviewed the chance is likely to get it in...

Previously, if user set dag_run_id parameter in the url, that
refers to a old run, which doesn't fit in the most recent 25 runs,
then the requested run will not be selected.

This change fixes this by setting the base_date to a time where
the run_id is known to exist if dag_run_id is provided as
an explicit query parameter.

closes: apache#34723
@hterik hterik force-pushed the fix_run_id_out_of_base_date_grid branch from 9522b33 to d7c403d Compare February 20, 2024 07:17
@eladkal eladkal requested a review from bbovenzi February 20, 2024 08:22
Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a test (visual inspection and test on UI level) and can confirm that this PR improves the situation - allowing to find a DAG run by search and then navigate to GRID.

Nevertheless I assume we really have follow-up issues (like mentioned before by @bbovenzi that if the navigation included the filter, that no attempt to rest filter or re-selection of a date is working. Nothing that I see is caused by this PR - so still I'd propose to merge, knowing that there need to be other fixes in the filtering following.

LGTM!

@bbovenzi bbovenzi merged commit a0ebabb into apache:main Feb 20, 2024
57 checks passed
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Feb 22, 2024
ephraimbuddy pushed a commit that referenced this pull request Feb 22, 2024
…4887)

Previously, if user set dag_run_id parameter in the url, that
refers to a old run, which doesn't fit in the most recent 25 runs,
then the requested run will not be selected.

This change fixes this by setting the base_date to a time where
the run_id is known to exist if dag_run_id is provided as
an explicit query parameter.

closes: #34723
(cherry picked from commit a0ebabb)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Links to specific dagruns don't work when older than 25 runs
6 participants