Calculating the average of table columns in a for loop in PostgreSQL

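I come from the Python world, where many things are colorful and easy. Now I am trying to get into SQL, because I want to challenge myself outside of pandas and gain meaningful SQL experience. That said, I have the following question. I have this snippet: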

do 
$do$
declare i varchar(50); 
declare average int; 
begin
    for i in (
        select column_name
        FROM information_schema.columns
        where table_schema = 'public'
        and table_name = 'example_table' 
        and column_name like '%suffix') loop 
            --raise notice 'Value: %', i; 
            select AVG(i) as average from example_table; 
            raise notice 'Value: %', i;
        end loop; 
end; 
$do$

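As I understood from the SQL documentation, for loops can only be used inside a do block, and certain variables must be declared. I did this for the variable i, which holds the name of the column I want to iterate over. What I want is the average of each column, added as a row to a table with two columns: one for the feature (the i variable) and one for the average of that column. I thought the snippet above would do that, but I get an error message saying Function avg(character varying) does not exist. When I use AVG on a single column outside the for loop, it does return the average of that numeric column, but inside the for loop I am told that the aggregate function does not exist. Could anyone help me solve this?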

Update: I took a step back to make the story shorter:

select column_name
        FROM information_schema.columns
        where table_schema = 'public'
        and table_name = 'my_table' 
        and column_name like '%wildcard';

This snippet yields a table with a single column called column_name, listing all the columns that fulfill the conditions in the where clause. I just want to add a second column with the average value of each of those columns.

Answer from a uj5u.com user:

If you only need this for a single table, you can use:

select x.col, avg(x.value::numeric)
from example_table t
 cross join lateral (
    select col, value
    from jsonb_each(to_jsonb(t)) as e(col, value)
    where jsonb_typeof(e.value) = 'number'
 ) x
group by x.col;

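The "magic" is to convert each row of the table into a JSON value. That is what to_jsonb(t) does (t is the alias given to the table in the main query). So we get something like {"name": "Bla", "value": 3.14, "length": 10, "some_date": "2022-03-02"}. Each column name becomes a key in the JSON value.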

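This JSON is then turned into one row per column (= key) using the function jsonb_each(), keeping only the rows (= columns) that hold a numeric value. The derived table therefore returns one row for every numeric column of every row in the table. The outer query then simply aggregates per column. The drawback is that you need to write one query per table.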
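To tie this back to the question, which only asked about columns whose names end in a given suffix, the same query can also filter on the key name. The following is a minimal sketch, assuming the table example_table and the pattern '%suffix' from the question, and PostgreSQL 11 or later for the direct cast from jsonb to numeric:

```sql
-- Step 1 (inspection only): the unpivoted rows produced by the derived table,
-- one row per original row and numeric column.
select e.col, e.value
from example_table t
  cross join lateral jsonb_each(to_jsonb(t)) as e(col, value)
where jsonb_typeof(e.value) = 'number'
limit 10;

-- Step 2: aggregate, keeping only the columns whose name ends in 'suffix'.
select x.col as column_name, avg(x.value::numeric) as average
from example_table t
  cross join lateral (
    select e.col, e.value
    from jsonb_each(to_jsonb(t)) as e(col, value)
    where jsonb_typeof(e.value) = 'number'
      and e.col like '%suffix'
  ) x
group by x.col;
```

The second query returns the two-column shape the question asked for (column name and average) without any dynamic SQL, because the column names travel as data (JSON keys) rather than as identifiers.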

If you need some kind of report for all tables in a schema, you can use a variation of this answer:

with all_selects as (
  select table_schema, table_name, 'select '||string_agg(format('avg(%I) as %I', column_name, column_name), ', ')||format(' from %I.%I', table_schema, table_name) as query
  from information_schema.columns
  where table_schema = 'public'
    and data_type in ('bigint', 'integer', 'double precision', 'smallint', 'numeric', 'real')
  group by table_schema, table_name
), all_aggregates as (
   select table_schema, table_name, 
          query_to_xml(query, true, true, '') as result
   from all_selects
)
select ag.table_schema, ag.table_name, r.column_name, nullif(r.average, '')::numeric as average
from all_aggregates ag
  cross join xmltable('/row/*' passing result
     columns column_name text path 'local-name()', 
             average text path '.') as r

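This is a bit tricky. The first part, all_selects, builds one query for each table in the schema public that applies the avg() aggregate to every column that can contain a number (where data_type in (...)).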

So, for example, this returns a string such as select avg(value) as value, avg(length) as length from public.example_table
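Before running anything, it can help to look at the statements that this first step generates. The sketch below is just the all_selects part on its own, preceded by two small format() calls that show how the %I placeholder treats identifiers. The column names price_suffix and "Total Cost" are hypothetical and only serve as illustration:

```sql
-- %I formats its argument as an SQL identifier and adds double quotes only when needed.
select format('avg(%I) as %I', 'price_suffix', 'price_suffix'); -- plain lower-case name
select format('avg(%I) as %I', 'Total Cost', 'Total Cost');     -- name with a space and capitals

-- Preview the generated statements without executing them.
select table_name,
       'select ' || string_agg(format('avg(%I) as %I', column_name, column_name), ', ')
         || format(' from %I.%I', table_schema, table_name) as query
from information_schema.columns
where table_schema = 'public'
  and data_type in ('bigint', 'integer', 'double precision', 'smallint', 'numeric', 'real')
group by table_schema, table_name;
```

Using %I rather than plain string concatenation keeps the generated SQL valid for mixed-case or otherwise unusual column names.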

The next step is to run each of these queries through query_to_xml() (sadly there is no built-in query_to_jsonb()).

query_to_xml() returns something like:

<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <value>12.345</value>
  <length>42</length>
</row>

So there is one tag per column, holding the result of the avg(..) function.

The final select then uses xmltable() to turn each tag in the XML result into a row that returns the column name and the value.
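The xmltable() step can be tried in isolation on a hand-written document. This is a sketch only: the CTE name sample and the element note_id are made up for illustration, and the XML is modeled on the sample output above rather than taken from a real run. The nil element stands in for a column whose average is NULL, which is presumably why the final select wraps the value in nullif(..., '') before casting:

```sql
-- A hand-written stand-in for one query_to_xml() result.
with sample(result) as (
  values ('<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
             <value>12.345</value>
             <length>42</length>
             <note_id xsi:nil="true"/>
           </row>'::xml)
)
select r.column_name,
       nullif(r.average, '')::numeric as average  -- empty text becomes NULL before the cast
from sample
  cross join xmltable('/row/*' passing result
     columns column_name text path 'local-name()',
             average text path '.') as r;
```

The XPath expression '/row/*' selects every child element of row, local-name() yields the tag name, and '.' yields the element's text content.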


Of course you can also do this in PL/pgSQL:

do 
$do$
declare 
  l_rec record;
  l_sql text;
  l_average numeric;
begin
    for l_rec in 
        select table_schema, table_name, column_name
        from information_schema.columns
        where table_schema = 'public'
          and data_type in ('bigint', 'integer', 'double precision', 'smallint', 'numeric', 'real')
    loop 
      l_sql := format('select avg(%I) from %I.%I', l_rec.column_name, l_rec.table_schema, l_rec.table_name);
      execute l_sql
         into l_average;
      raise notice 'Average for %.% is: %', l_rec.table_name, l_rec.column_name, l_average;
    end loop; 
end; 
$do$

Note the condition on the column data_type, which ensures that only columns that can be averaged are processed. This approach is, however, more costly, as it runs one query per column rather than one per table.
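The original attempt failed because AVG(i) was handed the varchar variable itself (the column's name as a string), not the column it names. An identifier has to be spliced into the statement text and run with execute, as the block above does. If the result should come back as a table with one row per column instead of notices, the same loop can be wrapped in a set-returning function. This is a sketch under assumptions: the function name column_averages and its parameters are hypothetical, not part of the original answer:

```sql
-- Hypothetical helper: returns (column_name, average) for every numeric
-- column of the given table whose name matches p_pattern.
create or replace function column_averages(p_schema text, p_table text, p_pattern text)
  returns table (column_name text, average numeric)
  language plpgsql
as
$$
declare
  l_col text;
begin
  for l_col in
      select c.column_name   -- qualified to avoid a clash with the output column
      from information_schema.columns c
      where c.table_schema = p_schema
        and c.table_name = p_table
        and c.column_name like p_pattern
        and c.data_type in ('bigint', 'integer', 'double precision', 'smallint', 'numeric', 'real')
  loop
    column_name := l_col;
    -- %I splices the name in as an identifier, so avg() sees the column, not a string.
    execute format('select avg(%I)::numeric from %I.%I', l_col, p_schema, p_table)
       into average;
    return next;  -- emit the current (column_name, average) pair as a row
  end loop;
end;
$$;

-- Usage with the names from the question:
select * from column_averages('public', 'example_table', '%suffix');
```

Like the do block, this runs one query per column, so the same cost caveat applies.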


Tags: sql postgresql average
