您现在的位置:首页 > 教案格式 > 正文

主成分分析_成份分析_主成分分析的主要步骤(4)

2016-11-26 11:08 网络整理 教案网

输出中,第一部分为简单统计量(均值和标准差),第二部分为协方差的特征值(注意我们在过程中用了COV选项,无此选项用相关阵),从这里可以看到贡献率(Proportion)和累计贡献率(Cumulative),第三部分为特征向量。按本结果的特征向量值及用COV选项规定使用协方差阵,我们可以知道两个主成份如此计算:

		PRIN1 = 0.326866 (JULY-75.92) + 0.945071 (JANUARY-32.55)
		PRIN2 = 0.945071 (JULY-75.92)+ (-0.326866) (JANUARY-32.55)

如果没有用COV选项,原始变量还需要除以标准差。由系数可见,第一主成份是两个月份的加权平均,代表了一个地方的气温水平,第二主成份系数一正一负,反应了冬季和夏季的气温差别。

例2.美国各种类型犯罪的主成份分析

在数据集CRIME中有美国各个州的各种类型犯罪的犯罪率数据。希望对这些犯罪率数据进行主成份分析以概括犯罪情况。程序如下:

/* EXAMPLE 2*/
DATA CRIME;
   TITLE '各州每十万人的犯罪率';
   INPUT STATE $1-15 MURDER RAPE ROBBERY ASSAULT BURGLARY LARCENY AUTO;
   CARDS;
ALABAMA        14.2 25.2  96.8 278.3 1135.5 1881.9 280.7
ALASKA         10.8 51.6  96.8 284.0 1331.7 3369.8 753.3
ARIZONA         9.5 34.2 138.2 312.3 2346.1 4467.4 439.5
ARKANSAS        8.8 27.6  83.2 203.4  972.6 1862.1 183.4
CALIFORNIA     11.5 49.4 287.0 358.0 2139.4 3499.8 663.5
COLORADO        6.3 42.0 170.7 292.9 1935.2 3903.2 477.1
CONNECTICUT     4.2 16.8 129.5 131.8 1346.0 2620.7 593.2
DELAWARE        6.0 24.9 157.0 194.2 1682.6 3678.4 467.0
FLORIDA        10.2 39.6 187.9 449.1 1859.9 3840.5 351.4
GEORGIA        11.7 31.1 140.5 256.5 1351.1 2170.2 297.9
HAWAII          7.2 25.5 128.0  64.1 1911.5 3920.4 489.4
IDAHO           5.5 19.4  39.6 172.5 1050.8 2599.6 237.6
ILLINOIS        9.9 21.8 211.3 209.0 1085.0 2828.5 528.6
INDIANA         7.4 26.5 123.2 153.5 1086.2 2498.7 377.4
IOWA            2.3 10.6  41.2  89.8  812.5 2685.1 219.9
KANSAS          6.6 22.0 100.7 180.5 1270.4 2739.3 244.3
KENTUCKY       10.1 19.1  81.1 123.3  872.2 1662.1 245.4
LOUISIANA      15.5 30.9 142.9 335.5 1165.5 2469.9 337.7
MAINE           2.4 13.5  38.7 170.0 1253.1 2350.7 246.9
MARYLAND        8.0 34.8 292.1 358.9 1400.0 3177.7 428.5
MASSACHUSETTS   3.1 20.8 169.1 231.6 1532.2 2311.3 1140.1
MICHIGAN        9.3 38.9 261.9 274.6 1522.7 3159.0 545.5
MINNESOTA       2.7 19.5  85.9  85.8 1134.7 2559.3 343.1
MISSISSIPPI    14.3 19.6  65.7 189.1  915.6 1239.9 144.4
MISSOURI        9.6 28.3 189.0 233.5 1318.3 2424.2 378.4
MONTANA         5.4 16.7  39.2 156.8  804.9 2773.2 309.2
NEBRASKA        3.9 18.1  64.7 112.7  760.0 2316.1 249.1
NEVADA         15.8 49.1 323.1 355.0 2453.1 4212.6 559.2
NEW HAMPSHIRE   3.2 10.7  23.2  76.0 1041.7 2343.9 293.4
NEW JERSEY      5.6 21.0 180.4 185.1 1435.8 2774.5 511.5
NEW MEXICO      8.8 39.1 109.6 343.4 1418.7 3008.6 259.5
NEW YORK       10.7 29.4 472.6 319.1 1728.0 2782.0 745.8
NORTH CAROLINA 10.6 17.0  61.3 318.3 1154.1 2037.8 192.1
NORTH DAKOTA    0.9  9.0  13.3  43.8  446.1 1843.0 144.7
OHIO            7.8 27.3 190.5 181.1 1216.0 2696.8 400.4
OKLAHOMA        8.6 29.2  73.8 205.0 1288.2 2228.1 326.8
OREGON          4.9 39.9 124.1 286.9 1636.4 3506.1 388.9
PENNSYLVANIA    5.6 19.0 130.3 128.0  877.5 1624.1 333.2
RHODE ISLAND    3.6 10.5  86.5 201.0 1489.5 2844.1 791.4
SOUTH CAROLINA 11.9 33.0 105.9 485.3 1613.6 2342.4 245.1
SOUTH DAKOTA    2.0 13.5  17.9 155.7  570.5 1704.4 147.5
TENNESSEE      10.1 29.7 145.8 203.9 1259.7 1776.5 314.0
TEXAS          13.3 33.8 152.4 208.2 1603.1 2988.7 397.6
UTAH            3.5 20.3  68.8 147.3 1171.6 3004.6 334.5
VERMONT         1.4 15.9  30.8 101.2 1348.2 2201.0 265.2
VIRGINIA        9.0 23.3  92.1 165.7  986.2 2521.2 226.7
WASHINGTON      4.3 39.6 106.2 224.8 1605.6 3386.9 360.3
WEST VIRGINIA   6.0 13.2  42.2  90.9  597.4 1341.7 163.3
WISCONSIN       2.8 12.9  52.2  63.7  846.9 2614.2 220.7
WYOMING         5.4 21.9  39.7 173.9  811.6 2772.2 282.0
;
PROC PRINCOMP OUT=CRIMCOMP;
RUN;

PROC SORT;
   BY PRIN1;
PROC PRINT;
   ID STATE;
   VAR PRIN1 PRIN2 MURDER RAPE ROBBERY ASSAULT BURGLARY LARCENY AUTO;
   TITLE2 '各州按第一主成份作为总犯罪率排列';
PROC SORT;
   BY PRIN2;
PROC PRINT;
   ID STATE;
   VAR PRIN1 PRIN2 MURDER RAPE ROBBERY ASSAULT BURGLARY LARCENY AUTO;
   TITLE2 '各州按第二主成份作为金钱犯罪与犯罪对比的排列';
PROC GPLOT;
   PLOT PRIN2*PRIN1=STATE;
   TITLE2 'PLOT OF THE FIRST TWO PRINCIPAL COMPONENTS';
PROC GPLOT;
   PLOT PRIN3*PRIN1=STATE;
   TITLE2 'PLOT OF THE FIRST AND THIRD PRINCIPAL COMPONENTS';
RUN;