给经过两次转置的结果集添加列标题

最近读了 O'Reilly 的技术译文书《SQL经典实例》,原书最后一章有一篇标题为《给经过两次转置的结果集添加列标题》的实例,该实例处理后的数据如下图所示,其结果有点类似电影结尾的职员表,或是以前校园里张贴的成绩排行榜。

SQL 经典实例

作者的想法很棒,同时也觉得作者的实现似乎有些过于复杂和抽象了,不太容易阅读和理解,在此以另一种方法实现,也算是对我自己学习结果的一次检验。(这里的标题仅仅是为匹配原书标题而取的,实际上我并不知道这个场景叫什么比较合适。)

创建数据表:

create table it_app (deptno int NULL,ename VARCHAR(30) NULL);
create table it_sch as select * from it_app where 1<>1;

写入示例数据:

INSERT INTO it_app VALUES (100, 'Clay');
INSERT INTO it_app VALUES (100, 'Mark');
INSERT INTO it_app VALUES (100, 'Jim');
INSERT INTO it_app VALUES (200, 'Lily');
INSERT INTO it_app VALUES (200, 'Lucy');
INSERT INTO it_app VALUES (200, 'Judah');
INSERT INTO it_app VALUES (300, 'Scott');
INSERT INTO it_app VALUES (300, 'Mary');
INSERT INTO it_app VALUES (100, 'Oracle');

INSERT INTO it_sch VALUES (500, 'Kate');
INSERT INTO it_sch VALUES (500, 'Steve');
INSERT INTO it_sch VALUES (500, 'Kettle');
INSERT INTO it_sch VALUES (400, 'Matt');
INSERT INTO it_sch VALUES (400, 'Lary');
INSERT INTO it_sch VALUES (400, 'Danny');

最终的 SQL 实现:

with temp_table as (
-- 整理数据,拼接上部门号, 序号取0(依据 row_number() 从1开始的事实)
select 'app' as mark, deptno, to_char(deptno) as ename, 0 as row_num from it_app
GROUP BY deptno
UNION ALL
select 'app' as mark, deptno,ename,row_number() over(PARTITION by deptno ORDER BY ename) row_num
from it_app
UNION ALL
-- 整理数据, 同上
select 'sch' as mark, deptno, to_char(deptno) as ename, 0 row_num from it_sch
GROUP BY deptno
UNION ALL
select 'sch' as mark, deptno, ename, row_number() over(PARTITION by deptno ORDER BY ename) row_num from it_sch
), tmp_data_src as (
-- 重新进行一次排序编号
select mark, deptno, ename, row_number() over (partition by mark order by deptno,row_num asc) as row_num from temp_table
)
-- 完成转置
select
	max(case mark when 'app' then ename end) as apps,
	max(case mark when 'sch' then ename end) as research
from tmp_data_src
GROUP BY row_num
ORDER BY row_num;

实际上我的这个方案也有它的缺点:较原书的实现方式本实现会各多读一次数据表。

对原书实现方案的改进

原书的实现似乎有 bug: 生成 row_number() 时只用了 id 一个字段,显然还需要加上 ename 字段,否则没法保证多截取的那行跟被替换为 deptno 的那行是同一个员工,下面是我在原作者的基础上做了调整后的实现,简化了层级结构,且实现了按姓名升序排列:

with temp_level as (
	select level as id from dual connect by level < 3 -- 每行需要计算的次数
), temp_table as (
	-- 将最后一行的雇员名称替换为部门号
	select c.mark,c.deptno,c.ename,c.ttl_row,c.row_order,decode(c.row_order, c.ttl_row, TO_CHAR(c.DEPTNO), c.ename) as ename2,
				 row_number() over (partition by c.mark ORDER BY c.mark, c.deptno asc) as last_rownum
	from (
		select a.mark, a.deptno, a.ename, a.ttl_row,b.id,row_number() over (partition by a.DEPTNO order by b.id, a.ename) as row_order 
		from (
			select 'app' as mark, deptno, ename, count(1) over (partition by DEPTNO) as ttl_row --这个字段用于计算行数使用需要再此基础上+1行
			from it_app
			union all
			select 'sch' as mark, deptno, ename, count(1) over (partition by DEPTNO) as ttl_row
			from it_sch
		) a,
		temp_level b
	) c
	where c.row_order > c.ttl_row - 1
)
select
	max(case d.mark when 'app' then ename2 end) as apps,
	max(case d.mark when 'sch' then ename2 end) as research
from temp_table d
GROUP BY d.last_rownum
ORDER BY d.last_rownum

发表评论