Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

窗口函数定义 #40

Open
astak16 opened this issue Jan 30, 2022 · 0 comments
Open

窗口函数定义 #40

astak16 opened this issue Jan 30, 2022 · 0 comments

Comments

@astak16
Copy link
Owner

astak16 commented Jan 30, 2022

窗口函数

create table employee(
  id int,
  month int,
  salary int
)

insert into employee values
(1,1,20),
(2,1,20),
(1,2,30),
(2,2,30),
(3,2,40),
(1,3,40),
(3,3,60),
(1,4,60),
(3,4,70),
(1,7,90),
(1,8,90);

基础

窗口函数的作用是用来简化排名问题,否则将会写一大堆晦涩难懂的代码。

窗口函数与聚合函数的区别是:

  • 窗口函数会为每一行返回一个相应的值
  • 聚合函数是将一组行计算后返回一个值。

排名函数rank()row_number()dense_rank() ,他们的用法可以看

分布函数: percent_rank()cume_dist()

偏移量函数: lead()lag()

一些其他函数:first_value()last_value()nth_value()ntile()

这些函数的详细介在这里:各种窗口函数

聚合函数也可以配合窗口函数使用,起到“累加/累计”的作用,截止到当前行的 和/最大值/最小值/平均值/个数

select *, sum(column) over(partition by id order by age) from table_name;

聚合函数和专用函数有一个显著的区别:聚合函数需要指定列名,专用函数不需要指定列名。

滑动窗口

语法: rows n preceding between n followingprecding 在前, following 在后,否则会报错, current row 前后都可以

  • rows:行号为基准
  • range:以 order by 为基准
  • n preceding: 前 n
  • n following:后 n
  • current row:当前行
  • unbounded preceding:窗口第一行
  • unbounded following:窗口最后一行

窗口区间

  • n preceding 当前行的前 n
    n = 1

    • 如果当前行在边界(第一行),前面一行会忽略
    • 如果当前行在第二行,会和第一行的 salary 计算
    • 如果当前行在第三行,会和第二行的 salary 计算
    • 以此类推
    select
      id, month, salary,
      sum(salary) over(partition by id order by month range 1 preceding) w_salary
    from employee

    9

  • n following 当前行的后 n 行,和 preceding 效果一样
    如果单独使用 n following 会报语法错误

    select
      id, month, salary,
      sum(salary) over(partition by id order by month range 1 following) w_salary
    from employee
  • current row 当前行,只计算当前行,和 w_salarysalary 一样

    select
      id, month, salary,
      sum(salary) over(partition by id order by month range current row) w_salary
    from employee

10

  • unbounded preceding 窗口第一行
    当前行到第一行所有的 salar 计算
    select
      id, month, salary,
      sum(salary) over(partition by id
                        order by month
                        range unbounded preceding
      ) w_salary
    from employee

11

  • unbounded following:窗口最后一行,和 unbounded preceding 效果一样
    如果单独使用会报错
    select
      id, month, salary,
      sum(salary) over(partition by id
                        order by month
                        range unbounded preceding
      ) w_salary
    from employee
  • 组合使用: between 2 preceding and 2 following
    当前行上两行,当前行下两行,以及当前行的 salary 累加
    select
      id, month, salary,
      sum(salary) over(partition by id
                        order by month
                        range between 2 preceding and 2 following
      ) w_salary
    from employee

12

rowsrange 的区别

  • rows 是每一行在表中的实际位置,可以脱离 order by 运行
    看下图中 w_salary 是将当前行和下面两行的 salary 进行计算。
    不过这里 id = 1, month = 7, 8 没有和 id = 2, month = 1salary 进行累加,是因为这里使用 partition by idid 进行分组
    select
      id, month, salary,
      sum(salary) over(partition by id
                        order by month
                        rows between 0 preceding and 2 following
      ) w_salary
    from employee

13

  • range 要配合 order by 使用,如果 order by <column_name>column_name 不连续,在计算时也不会跳过。
    看下面图中, id = 1, month = 3, 4w_salary100, 60 ,因为没有 month = 5 ,所以它不会去计算 month = 7
    select
      id, month, salary,
      sum(salary) over(partition by id
                        order by month
                        range between 0 preceding and 2 following
      ) w_salary
    from employee

14

  • 这中情况下 rangerows 没有区别:
    • 当前行到下边界,上 0 行到下边界值都是一样
      rows between 0 preceding and unbounded following
      range between 0 preceding and unbounded following
      rows between current row and unbounded following
      range between current row and unbounded following
    • 上边界到当前行,上编辑到下 0 行值都是一样
      rows between unbounded preceding and 0 following
      range between unbounded preceding and 0 following
      rows between unbounded preceding and current row
      range between unbounded preceding and current row
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant