Get the max of each partition in SQL
Post date: Oct 15, 2013 8:28:04 PM
Here is the example written in sqldf:
# ============ exaMPLE: show the maximum for each id =================== # > df # id size # 1 1 big # 2 1 small # 3 1 bigger # 4 2 biggest # 5 2 small # 6 1 large # 7 2 small # ---- we can order by the column ------ df1 <- sqldf(' select id, size, (case when size="small" then 1 when size="big" then 2 when size="bigger" then 3 when size="biggest" then 4 else 0 end) as size_num from df order by id ASC, size_num ASC ') # > df1 # id size size_num # 1 1 large 0 # 2 1 small 1 # 3 1 big 2 # 4 1 bigger 3 # 5 2 small 1 # 6 2 small 1 # 7 2 biggest 4 # --- then we can get only the last row of each partition ---- df2 <- sqldf(' select * from df1 group by id ') # > df2 # id size size_num # 1 1 bigger 3 # 2 2 biggest 4 # ======= In fact, we can just combine the two queries into the same ====== df2 <- sqldf(' select B.* from (select id, size, (case when size="small" then 1 when size="big" then 2 when size="bigger" then 3 when size="biggest" then 4 else 0 end) as size_num from df order by id ASC, size_num ASC) B group by B.id ') # > df2 # id size size_num # 1 1 bigger 3 # 2 2 biggest 4
xxxx