从一个游戏SQL的优化说起 - l1t1/note GitHub Wiki
快过年了,中国人的风俗里有一个打麻将,麻将共有3种花色,每个花色分为1-9条、1-9筒、1-9万各四张共108张牌,规定4人每轮摸进一张,从第14张起打出不要的一张牌,不吃不碰,纯自摸胡牌。胡牌格式为1对+3张(碰或顺子)*4。用SQL输出第一个胡牌的人手中的牌。 在duckdb v1.13中的建表语句如下:
CREATE TABLE games(i INTEGER, j VARCHAR);
with t as (select chr(126983+i::int) pic from generate_series(0,26)t(i))insert into games select 2, string_agg(pic,''order by random())j from t,generate_series(1,4);
表中数据示例
|i| j |
│2│ 🀘🀊🀏🀛🀜🀖🀒🀛🀝🀎🀌🀈🀞🀠🀑🀙🀠🀔🀌🀒🀖🀓🀐🀐🀝🀗🀔🀎🀜🀑🀚🀘🀙🀙🀐🀛🀍🀔🀐🀋🀓🀕🀌🀇🀌🀘🀖🀡🀕🀙🀠🀞🀉🀝🀎🀈🀍🀑🀟🀚🀗🀏🀡🀈🀔🀊🀒🀚🀏🀝🀇🀉🀏🀘🀗🀓🀊🀖🀋🀜🀋🀈🀉🀓🀜🀕🀟🀇🀚🀗🀠🀋🀑🀛🀍🀕🀡🀞🀍🀞🀉🀒🀊🀟🀟🀎🀇🀡 │
问题1:判断手里14张牌是否胡牌。先把牌按花色分组,每种花色里找1对或3张,如果14张组成了1对+3张*4,则胡牌。但是手中牌直接胡的可能性很低,需要打出和摸进,打什么最后能胡是不容易预测的,如果每种组合都尝试,那么效率必然很低。程宁同学想到一个优化点,因为不吃不碰纯自摸,那么一开始,4家将要摸到的牌实际上已经定了,各人27张牌,只要在这里面挑出最先胡的14张即可。他的思路如下SQL所示:
with
--将108张牌分配给4个人,用轮流、顺序分配的方案
tmp0 as (
select a.i AS category,
b.i as seq,
substring(c.j,(a.i * 27) + b.i ,1) chr,
ASCII(chr) - 126982 n ,
(n-1)//9 hs,
case n%9 when 0 then 9 else n%9 end pd
from range(4) a(i) ,range(1,28) b(i),games c ) ,
--第一步,能凑成对子、2张顺子情况
tmp1 as (
select a.*,
(select min(seq) from tmp0 b where a.category = b.category and a.hs = b.hs and a.pd = b.pd and b.seq > a.seq) dz,
(select min(seq) from tmp0 b where a.category = b.category and a.hs = b.hs and a.pd = b.pd +1 and b.seq > a.seq) sz1,
(select min(seq) from tmp0 b where a.category = b.category and a.hs = b.hs and a.pd = b.pd -1 and b.seq > a.seq) sz2
from tmp0 a
order by a.category,a.seq ),
--第二步,加能凑成三张顺子的情况
tmp2 as (
select a.*,
(select min(seq) from tmp0 b where a.category = b.category and a.hs = b.hs and a.pd = b.pd +2 and b.seq > a.seq and sz1 is not null) sz11,
(select min(seq) from tmp0 b where a.category = b.category and a.hs = b.hs and a.pd = b.pd -2 and b.seq > a.seq and sz2 is not null ) sz22,
(select min(seq) from tmp0 b where a.category = b.category and a.hs = b.hs and a.pd = b.pd and b.seq > a.dz and dz is not null) sz
from tmp1 a
order by a.category,a.seq ),
--生成所有可以形成3张组合的牌
tmp_3z as (
select category,LIST_SORT([seq,sz1,sz11]) n from tmp2 where sz11 is not null
union
select category,LIST_SORT([seq,sz2,sz22]) n from tmp2 where sz22 is not null
union
select category,LIST_SORT([seq,dz,sz]) n from tmp2 where sz is not null ),
--生成所有对子组合
tmp_dz as (
select category,LIST_SORT([seq,dz]) n from tmp1 where dz is not null ),
tmp_all as (
select a.category,a.n sz1,b.n sz2 ,c.n sz3 ,d.n dz ,
LIST_REVERSE(LIST_SORT(ARRAY_DISTINCT(ARRAY_CONCAT(ARRAY_CONCAT(ARRAY_CONCAT(a.n,b.n),c.n),d.n)))) as n
from tmp_3z a
left join tmp_3z b on a.category = b.category
left join tmp_3z c on a.category = c.category
left join tmp_dz d on a.category = d.category
where a.n<b.n and b.n<c.n ) ,
--用最少摸牌次数能胡牌的组合
tmp_min as (
select min(n) n from tmp_all where ARRAY_LENGTH(n) = 11 )
select STRING_AGG(chr,'' order by chr) as ret from tmp1 a
where a.seq in (select UNNEST(n) from tmp_min)
and a.category =
(select category from tmp_all b
where b.n=(select n from tmp_min) )
为简化问题,上述代码只考虑了11张牌(1对+3张*3)的情况,很容易扩展成14张。针对示例数据的运行结果如下:
│ 🀌🀍🀎🀔🀕🀖🀙🀙🀚🀛🀜 │
Run Time (s): real 1.092 user 1.164000 sys 0.792000
上述算法的关键是: 1.利用where a.n<b.n and b.n<c.n保证a b c 是顺序的,即消除了同一组合的不同排列,达到了去重复。 2.利用LIST_REVERSE(LIST_SORT(ARRAY_DISTINCT(ARRAY_CONCAT(ARRAY_CONCAT(ARRAY_CONCAT(a.n,b.n),c.n),d.n))))将选出的牌按摸牌顺序倒序排列,从而找到最先胡的一手牌。