窗口函数定义 #40

astak16 · 2022-01-30T07:43:48Z

窗口函数

create table employee(
  id int,
  month int,
  salary int
)

insert into employee values
(1,1,20),
(2,1,20),
(1,2,30),
(2,2,30),
(3,2,40),
(1,3,40),
(3,3,60),
(1,4,60),
(3,4,70),
(1,7,90),
(1,8,90);

基础

窗口函数的作用是用来简化排名问题，否则将会写一大堆晦涩难懂的代码。

窗口函数与聚合函数的区别是：

窗口函数会为每一行返回一个相应的值
聚合函数是将一组行计算后返回一个值。

排名函数： rank()、 row_number()、 dense_rank() ，他们的用法可以看

分布函数： percent_rank() 、cume_dist()

偏移量函数： lead()、lag()

一些其他函数：first_value()、last_value()、nth_value()、ntile()

这些函数的详细介在这里：各种窗口函数

聚合函数也可以配合窗口函数使用，起到“累加/累计”的作用，截止到当前行的 和/最大值/最小值/平均值/个数：

select *, sum(column) over(partition by id order by age) from table_name;

聚合函数和专用函数有一个显著的区别：聚合函数需要指定列名，专用函数不需要指定列名。

滑动窗口

语法： rows n preceding between n following ，precding 在前， following 在后，否则会报错， current row 前后都可以

rows：行号为基准
range：以 order by 为基准
n preceding: 前 n 行
n following：后 n 行
current row：当前行
unbounded preceding：窗口第一行
unbounded following：窗口最后一行

窗口区间

n preceding 当前行的前 n 行
当 n = 1
- 如果当前行在边界（第一行），前面一行会忽略
- 如果当前行在第二行，会和第一行的 salary 计算
- 如果当前行在第三行，会和第二行的 salary 计算
- 以此类推
```
select
  id, month, salary,
  sum(salary) over(partition by id order by month range 1 preceding) w_salary
from employee
```

n following 当前行的后 n 行，和 preceding 效果一样
如果单独使用 n following 会报语法错误

select
  id, month, salary,
  sum(salary) over(partition by id order by month range 1 following) w_salary
from employee

current row 当前行，只计算当前行，和 w_salary 和 salary 一样

select
  id, month, salary,
  sum(salary) over(partition by id order by month range current row) w_salary
from employee

unbounded preceding 窗口第一行
当前行到第一行所有的 salar 计算

select
  id, month, salary,
  sum(salary) over(partition by id
                    order by month
                    range unbounded preceding
  ) w_salary
from employee

unbounded following：窗口最后一行，和 unbounded preceding 效果一样
如果单独使用会报错

select
  id, month, salary,
  sum(salary) over(partition by id
                    order by month
                    range unbounded preceding
  ) w_salary
from employee

组合使用： between 2 preceding and 2 following
当前行上两行，当前行下两行，以及当前行的 salary 累加

select
  id, month, salary,
  sum(salary) over(partition by id
                    order by month
                    range between 2 preceding and 2 following
  ) w_salary
from employee

`rows` 和 `range` 的区别

rows 是每一行在表中的实际位置，可以脱离 order by 运行
看下图中 w_salary 是将当前行和下面两行的 salary 进行计算。
不过这里 id = 1, month = 7, 8 没有和 id = 2, month = 1 的 salary 进行累加，是因为这里使用 partition by id 按 id 进行分组
```
select
  id, month, salary,
  sum(salary) over(partition by id
                    order by month
                    rows between 0 preceding and 2 following
  ) w_salary
from employee
```

range 要配合 order by 使用，如果 order by <column_name> 的 column_name 不连续，在计算时也不会跳过。
看下面图中， id = 1, month = 3, 4 ， w_salary 为 100, 60 ，因为没有 month = 5 ，所以它不会去计算 month = 7 。
```
select
  id, month, salary,
  sum(salary) over(partition by id
                    order by month
                    range between 0 preceding and 2 following
  ) w_salary
from employee
```

这中情况下 range 和 rows 没有区别：

当前行到下边界，上 0 行到下边界值都是一样

rows between 0 preceding and unbounded following
range between 0 preceding and unbounded following
rows between current row and unbounded following
range between current row and unbounded following

上边界到当前行，上编辑到下 0 行值都是一样

rows between unbounded preceding and 0 following
range between unbounded preceding and 0 following
rows between unbounded preceding and current row
range between unbounded preceding and current row

The text was updated successfully, but these errors were encountered:

astak16 mentioned this issue Jan 30, 2022

30 行程和用户 #36

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

窗口函数定义 #40

窗口函数定义 #40

astak16 commented Jan 30, 2022 •

edited

Loading

窗口函数定义 #40

窗口函数定义 #40

Comments

astak16 commented Jan 30, 2022 • edited Loading

窗口函数

基础

滑动窗口

窗口区间

rows 和 range 的区别

astak16 commented Jan 30, 2022 •

edited

Loading

`rows` 和 `range` 的区别