>

  在开窗函数出现之前存在着很多用 SQL,SQL

- 编辑:银河国际手机客户端 -

  在开窗函数出现之前存在着很多用 SQL,SQL

从SQL Server 2005起,SQL Server开始支持窗口函数 (Window Function),以及到SQL Server 2012,窗口函数功能增强,目前为止支持以下几种窗口函数:

 

  1. 排序函数 (Ranking Function) ;

  2. 聚合函数 (Aggregate Function) ;

  3. 分析函数 (Analytic Function) ;

  4. NEXT VALUE FOR Function, 这是给sequence专用的一个函数;

从 转

 

 

一. 排序函数(Ranking Function)

开窗函数是在 ISO 标准中定义的。SQL Server 提供排名开窗函数和聚合开窗函数。

帮助文档里的代码示例很全。

  在开窗函数出现之前存在着很多用 SQL 语句很难解决的问题,很多都要通过复杂的相关子查询或者存储过程来完成。SQL Server 2005 引入了开窗函数,使得这些经典的难题可以被轻松的解决。

排序函数中,ROW_NUMBER()较为常用,可用于去重、分页、分组中选择数据,生成数字辅助表等等;

  窗口是用户指定的一组行。开窗函数计算从窗口派生的结果集中各行的值。开窗函数分别应用于每个分区,并为每个分区重新启动计算。

排序函数在语法上要求OVER子句里必须含ORDER BY,否则语法不通过,对于不想排序的场景可以这样变通;

  OVER 子句用于确定在应用关联的开窗函数之前,行集的分区和排序。PARTITION BY 将结果集分为多个分区。

drop table if exists test_ranking

create table test_ranking
( 
id int not null,
name varchar(20) not null,
value int not null
) 

insert test_ranking 
select 1,'name1',1 union all 
select 1,'name2',2 union all 
select 2,'name3',2 union all 
select 3,'name4',2

select id , name, ROW_NUMBER() over (PARTITION by id ORDER BY name) as num
from test_ranking

select id , name, ROW_NUMBER() over (PARTITION by id) as num
from test_ranking
/*
Msg 4112, Level 15, State 1, Line 1
The function 'ROW_NUMBER' must have an OVER clause with ORDER BY.
*/

--ORDERY BY后面给一个和原表无关的派生列
select id , name, ROW_NUMBER() over (PARTITION by id ORDER BY GETDATE()) as num
from test_ranking

select id , name, ROW_NUMBER() over (PARTITION by id ORDER BY (select 0)) as num
from test_ranking

 

 

一、排名开窗函数

二. 聚合函数 (Aggregate Function)

1. 语法

SQL Server 2005中,窗口聚合函数仅支持PARTITION BY,也就是说仅能对分组的数据整体做聚合运算;

Ranking Window Functions

< OVER_CLAUSE > :: =

   OVER ( [ PARTITION BY value_expression , ... [ n ] ]

          <ORDER BY_Clause> )

SQL Server 2012开始,窗口聚合函数支持ORDER BY,以及ROWS/RAGNE选项,原本需要子查询来实现的需求,如: 移动平均 (moving averages), 总计聚合 (cumulative aggregates), 累计求和 (running totals) 等,变得更加方便;

 

 

注意:ORDER BY 子句指定对相应 FROM 子句生成的行集进行分区所依据的列。value_expression 只能引用通过 FROM 子句可用的列。value_expression 不能引用选择列表中的表达式或别名。value_expression 可以是列表达式、标量子查询、标量函数或用户定义的变量。

代码示例1:总计/小计/累计求和

 

drop table if exists test_aggregate;

create table test_aggregate
(
event_id      varchar(100),
rk            int,
price         int
)

insert into test_aggregate
values
('a',1,10),
('a',2,10),
('a',3,50),
('b',1,10),
('b',2,20),
('b',3,30)


--1. 没有窗口函数时,用子查询
select a.event_id, 
       a.rk,  --build ranking column if needed
       a.price, 
     (select sum(price) from test_aggregate b where b.event_id = a.event_id and b.rk <= a.rk) as totalprice 
  from test_aggregate a


--2. 从SQL Server 2012起,用窗口函数
--2.1 
--没有PARTITION BY, 没有ORDER BY,为全部总计;
--只有PARTITION BY, 没有ORDER BY,为分组小计;
--只有ORDER BY,没有PARTITION BY,为全部累计求和(RANGE选项,见2.2)
select *,
     sum(price) over() as TotalPrice,
     sum(price) over(partition by event_id) as SubTotalPrice,
       sum(price) over(order by rk) as RunningTotalPrice
  from test_aggregate a

--2.2 注意ORDER BY列的选择,可能会带来不同结果
select *,
     sum(price) over(partition by event_id order by rk) as totalprice 
  from test_aggregate a
/*
event_id    rk    price    totalprice
a    1    10    10
a    2    10    20
a    3    50    70
b    1    10    10
b    2    20    30
b    3    30    60
*/

select *,
     sum(price) over(partition by event_id order by price) as totalprice 
  from test_aggregate a
/*
event_id    rk    price    totalprice
a    1    10    20
a    2    10    20
a    3    50    70
b    1    10    10
b    2    20    30
b    3    30    60
*/

--因为ORDER BY还有个子选项ROWS/RANGE,不指定的情况下默认为RANGE UNBOUNDED PRECEDING AND CURRENT ROW 
--RANGE按照ORDER BY中的列值,将相同的值的行均视为当前同一行
select  *,sum(price) over(partition by event_id order by price) as totalprice from test_aggregate a
select  *,sum(price) over(partition by event_id order by price range between unbounded preceding and current row) as totalprice from test_aggregate a

--如果ORDER BY中的列值有重复值,手动改用ROWS选项即可实现逐行累计求和
select  *,sum(price) over(partition by event_id order by price rows between unbounded preceding and current row) as totalprice from test_aggregate a

2. 示例

 

  可参考 

代码示例2:移动平均

 

--移动平均,举个例子,就是求前N天的平均值,和股票市场的均线类似
drop table if exists test_moving_avg

create table test_moving_avg
(
ID    int, 
Value int,
DT    datetime
)

insert into test_moving_avg 
values
(1,10,GETDATE()-10),
(2,110,GETDATE()-9),
(3,100,GETDATE()-8),
(4,80,GETDATE()-7),
(5,60,GETDATE()-6),
(6,40,GETDATE()-5),
(7,30,GETDATE()-4),
(8,50,GETDATE()-3),
(9,20,GETDATE()-2),
(10,10,GETDATE()-1)

--1. 没有窗口函数时,用子查询
select *,
(select AVG(Value) from test_moving_avg a where a.DT >= DATEADD(DAY, -5, b.DT) AND a.DT < b.DT) AS avg_value_5days
from test_moving_avg b

--2. 从SQL Server 2012起,用窗口函数
--三个内置常量,第一行,最后一行,当前行:UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING, CURRENT ROW 
--在行间移动,用BETWEEN m preceding AND n following (m, n > 0)
SELECT *,
       sum(value) over (ORDER BY DT ROWS BETWEEN 5 preceding AND CURRENT ROW) moving_sum,
       avg(value) over (ORDER BY DT ROWS BETWEEN 4 preceding AND CURRENT ROW) moving_avg1,
       avg(value) over (ORDER BY DT ROWS BETWEEN 5 preceding AND 1 preceding) moving_avg2,
       avg(value) over (ORDER BY DT ROWS BETWEEN 3 preceding AND 1 following) moving_avg3
FROM  test_moving_avg
ORDER BY DT

 

 

二、聚合开窗函数

三. 分析函数 (Analytic Function)

1. 语法

代码示例1:取当前行某列的前一个/下一个值

Aggregate Window Functions

< OVER_CLAUSE > :: =

   OVER ( [ PARTITION BY value_expression , ... [ n ] ] )

drop table if exists test_analytic

create table test_analytic
(
SalesYear         varchar(10),
Revenue           int,
Offset            int
)

insert into test_analytic
values
(2013,1001,1),
(2014,1002,1),
(2015,1003,1),
(2016,1004,1),
(2017,1005,1),
(2018,1006,1)

--当年及去年的销售额
select *,lag(Revenue,1,null) over(order by SalesYear asc) as PreviousYearRevenue from test_analytic
select *,lag(Revenue,Offset,null) over(order by SalesYear asc) as PreviousYearRevenue from test_analytic
select *,lead(Revenue,1,null) over(order by SalesYear desc) as PreviousYearRevenue from test_analytic

--当年及下一年的销售额
select *,lead(Revenue,1,null) over(order by SalesYear asc) as NextYearRevenue from test_analytic
select *,lead(Revenue,Offset,null) over(order by SalesYear asc) as NextYearRevenue from test_analytic
select *,lag(Revenue,1,null) over(order by SalesYear desc) as NextYearRevenue from test_analytic

--可以根据offset调整跨度

 

 

2. 示例

代码示例2:分组中某列最大/最小值,对应的其他列值

  下例将根据 SalesOrderID 进行分区,然后为每个分区分别统计SUM、AVG、COUNT、MIN、MAX。

假设有个门禁系统,在员工每次进门时写入一条记录,记录了“身份号码”,“进门时间”,“衣服颜色",查询每个员工最后一次进门时的“衣服颜色”。

SELECT SalesOrderID, ProductID, OrderQty

   ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'

   ,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'

   ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'

   ,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'

   ,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'

FROM SalesOrderDetail

WHERE SalesOrderID IN(43659,43664);

drop table if exists test_first_last

create table test_first_last
(
EmployeeID             int,
EnterTime              datetime,
ColorOfClothes         varchar(20)
)

insert into test_first_last
values
(1001, GETDATE()-9, 'GREEN'),
(1001, GETDATE()-8, 'RED'),
(1001, GETDATE()-7, 'YELLOW'),
(1001, GETDATE()-6, 'BLUE'),
(1002, GETDATE()-5, 'BLACK'),
(1002, GETDATE()-4, 'WHITE')

--1. 用子查询
--LastColorOfColthes
select * from test_first_last a
where not exists(select 1 from test_first_last b where a.EmployeeID = b.EmployeeID and a.EnterTime < b.EnterTime)

--LastColorOfColthes
select *
from 
(select *, ROW_NUMBER() over(partition by EmployeeID order by EnterTime DESC) num
from test_first_last ) t
where t.num =1


--2. 用窗口函数
--用LAST_VALUE时,必须加上ROWS/RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING,否则结果不正确
select *, 
       FIRST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime DESC) as LastColorOfClothes,
       FIRST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime ASC) as FirstColorOfClothes,
       LAST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime ASC ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as LastColorOfClothes,
       LAST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime DESC ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as FirstColorOfClothes
from test_first_last

--对于显示表中所有行,并追加Last/First字段时用窗口函数方便些
--对于挑选表中某一行/多行时,用子查询更方便

 

 

  下例首先由 SalesOrderID 分区进行聚合,并为每个 SalesOrderID 的每一行计算 ProductID 的百分比)。

四. NEXT VALUE FOR Function

SELECT SalesOrderID, ProductID, OrderQty

   ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'

   ,CAST(1.0 * OrderQty / SUM(OrderQty) OVER(PARTITION BY SalesOrderID)

       *100 AS DECIMAL(5,2))AS 'Percent by ProductID'

FROM SalesOrderDetail

WHERE SalesOrderID IN(43659,43664);

drop sequence if exists test_seq

create sequence test_seq
start with 1
increment by 1;

GO

drop table if exists test_next_value

create table test_next_value
(
ID         int,
Name       varchar(10)
)

insert into test_next_value(Name)
values
('AAA'),
('AAA'),
('BBB'),
('CCC')

--对于多行数据获取sequence的next value,是否使用窗口函数都会逐行计数
--窗口函数中ORDER BY用于控制不同列值的计数顺序
select *, NEXT VALUE FOR test_seq from test_next_value
select *, NEXT VALUE FOR test_seq OVER(ORDER BY Name DESC) from test_next_value

 

 

3. SQL Server 2012 增加功能

参考:

  SQL Server 2012 为聚合函数提供了窗口排序和框架支持,可以将 OVER 子句与函数一起使用,以便计算各种聚合值,例如移动平均值、累积聚合、运行总计或每组结果的前 N 个结果。

SELECT - OVER Clause (Transact-SQL)

  更多详情,请参考 

 

SQL Server Windowing Functions: ROWS vs. RANGE

 

三、分析开窗函数

  可参考 

 

 

四、NEXT VALUE FOR 函数

  通过将 OVER 子句应用于 NEXT VALUE FOR 调用,NEXT VALUE FOR 函数支持生成排序的序列值。 通过使用 OVER 子句,可以向用户保证返回的值是按照 OVER 子句的 ORDER BY 子子句的顺序生成的。

  例如:

SELECT NEXT VALUE FOR Test.CountBy1 OVER (ORDER BY LastName) AS ListNumber,

   FirstName, LastName

FROM Person.Contact ;

  详情请参考 

本文由计算机数据发布,转载请注明来源:  在开窗函数出现之前存在着很多用 SQL,SQL