I already have the answer in PROC SQL, but I need a DATA step version of my code. I would be grateful if someone could help me convert it.
PROC SQL;
CREATE TABLE CARS AS
SELECT Origin, Type, Cylinders, DriveTrain, COUNT(*) AS COUNT
FROM SASHELP.CARS
group by Origin, Type, Cylinders, DriveTrain;
QUIT;
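Answer from a uj5u.com user:
As long as none of your key variables have missing values and the complete summary table fits in your available memory, you can use a DATA step HASH object.
This removes the need to pre-sort the data.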
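data _null_;
set sashelp.cars end=eof;
*create the hash object once, on the first iteration;
if _n_=1 then do;
declare hash h(ordered:'yes');
rc=h.definekey('Origin','Type','Cylinders','DriveTrain');
rc=h.definedata('Origin','Type','Cylinders','DriveTrain','count');
rc=h.definedone();
end;
*find() returns nonzero when the key is not in the hash yet, so start a new count;
if h.find() then count=0;
*sum statement adds 1 to the running count for this key;
count+1;
rc=h.replace();
*after the last observation write the summary table;
if eof then rc=h.output(dataset:'cars2');
run;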
Answer from a uj5u.com user:
A DATA step is not the appropriate tool here; PROC FREQ is the SAS solution.
proc freq data=sashelp.cars;
table origin*type*cylinders*drivetrain / out=cars list;
run;
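One detail worth checking with the PROC FREQ approach: the OUT= data set carries a PERCENT column in addition to COUNT, and, as far as I know, observations with a missing value in any table variable are left out by default, whereas GROUP BY in PROC SQL keeps them as their own group. A minimal sketch that should bring the output closer to the PROC SQL table, assuming the output name cars_freq (hypothetical, chosen so the CARS table from the question is not overwritten):

```sas
proc freq data=sashelp.cars noprint;
*MISSING treats missing values as a valid level of each table variable;
*cars_freq is a hypothetical output name and PERCENT is dropped to match the PROC SQL columns;
table origin*type*cylinders*drivetrain / missing out=cars_freq(drop=percent);
run;
```

NOPRINT only suppresses the printed table; drop it if the listing is also wanted.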
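For completeness, here is a DATA step approach. It is strongly not recommended:
- First sort the data set by the grouping variables
- Use BY-group processing in the DATA step to identify the groups of interest
- Use RETAIN to hold the value across rows
- Use FIRST./LAST. to reset and accumulate the counter and to output the total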
*sort for BY statement is required;
proc sort data=sashelp.cars out=cars_sorted;
by origin type cylinders drivetrain;
run;
data cars_count;
set cars_sorted;
by origin type cylinders drivetrain;
*RETAIN tells SAS to keep this variable across rows, otherwise it resets for each observation;
retain count;
*if first in category set count to 0;
if first.drivetrain then count=0;
*increment count for each record (implicit retain so RETAIN is not actually required here);
count+1;
*if last of the group then output the total count for that group;
if last.drivetrain then output;
*keep only variables of interest;
keep origin type cylinders drivetrain count;
run;
*display results;
proc print data=cars_count;
run;
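To confirm that the DATA step result agrees with the PROC SQL version from the question, a comparison along these lines could be run. This is only a sketch: sql_counts is a hypothetical table name, and the ORDER BY clause is added so that both tables are in the same row order before PROC COMPARE matches them observation by observation.

```sas
*rebuild the PROC SQL summary under a hypothetical name (sql_counts);
proc sql;
create table sql_counts as
select Origin, Type, Cylinders, DriveTrain, count(*) as count
from sashelp.cars
group by Origin, Type, Cylinders, DriveTrain
order by Origin, Type, Cylinders, DriveTrain;
quit;
*compare the two summaries, matching variables by name;
proc compare base=sql_counts compare=cars_count;
run;
```

The same check can be pointed at cars2 from the hash answer by changing the COMPARE= data set.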
