
A pandas Command Handbook for Beginners, with Exercises [30,000-Character Detailed Guide]

2021-09-15 09:29:37 Backend Development

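Hi everyone, I'm 辣條 (Latiao).

pandas is a core data-analysis library for Python. It provides fast, flexible, and expressive data structures designed to make working with relational and labeled data simple and intuitive. pandas is commonly used for matrix data with row and column labels and for tabular data similar to SQL or Excel tables. It is applied in finance, statistics, social science, engineering, and other fields for data wrangling and cleaning, data analysis and modeling, and data visualization and tabulation.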

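Exercise index

Exercise                                 Content                                 Dataset
Exercise 1 - Getting to know your data   Exploring Chipotle fast-food data       chipotle.tsv
Exercise 2 - Filtering and sorting       Exploring Euro 2012 data                Euro2012_stats.csv
Exercise 3 - Grouping                    Exploring alcohol consumption data      drinks.csv
Exercise 4 - Apply functions             Exploring US crime data, 1960 - 2014    US_Crime_Rates_1960_2014.csv
Exercise 5 - Merging                     Exploring fictitious name data          Data built by hand in the exercise
Exercise 6 - Statistics                  Exploring wind speed data               wind.data
Exercise 7 - Visualization               Exploring Titanic disaster data         train.csv
Exercise 8 - Creating data frames        Exploring Pokemon data                  Data built by hand in the exercise
Exercise 9 - Time series                 Exploring Apple stock price data        Apple_stock.csv
Exercise 10 - Deleting data              Exploring Iris flower data              iris.csv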

Exercise 1 - Getting to know your data

Exploring Chipotle fast-food data

Step 1: Import the necessary libraries

In [7]:

# Run the following code
import pandas as pd
 

Step 2: Import the dataset from the following path

In [5]:

# Run the following code
path1 = "./exercise_data/chipotle.tsv"    # chipotle.tsv

Step 3: Store the dataset in a data frame named chipo

In [8]:

# Run the following code
chipo = pd.read_csv(path1, sep = '\t')

Step 4: View the first 10 rows

In [9]:

# Run the following code
chipo.head(10)

Out[9]:

   order_id  quantity  item_name                              choice_description                                 item_price
0         1         1  Chips and Fresh Tomato Salsa           NaN                                                $2.39
1         1         1  Izze                                   [Clementine]                                       $3.39
2         1         1  Nantucket Nectar                       [Apple]                                            $3.39
3         1         1  Chips and Tomatillo-Green Chili Salsa  NaN                                                $2.39
4         2         2  Chicken Bowl                           [Tomatillo-Red Chili Salsa (Hot), [Black Beans...  $16.98
5         3         1  Chicken Bowl                           [Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou...  $10.98
6         3         1  Side of Chips                          NaN                                                $1.69
7         4         1  Steak Burrito                          [Tomatillo Red Chili Salsa, [Fajita Vegetables...  $11.75
8         4         1  Steak Soft Tacos                       [Tomatillo Green Chili Salsa, [Pinto Beans, Ch...  $9.25
9         5         1  Steak Burrito                          [Fresh Tomato Salsa, [Rice, Black Beans, Pinto...  $9.25
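The loading steps above can be tried without the data file. The following is a minimal sketch, assuming a made-up three-row sample (`sample_tsv` is hypothetical) that follows the same tab-separated layout as chipotle.tsv.

```python
import io
import pandas as pd

# Hypothetical three-row sample in the same tab-separated layout as chipotle.tsv
sample_tsv = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tIzze\t[Clementine]\t$3.39 \n"
    "1\t1\tSide of Chips\t\t$1.69 \n"
    "2\t2\tChicken Bowl\t[Rice]\t$16.98 \n"
)

# sep="\t" tells read_csv that fields are separated by tabs, not commas
sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

print(sample.shape)    # (rows, columns)
print(sample.head(2))  # the empty choice_description field is read as NaN
```

Without `sep="\t"`, `read_csv` would look for commas instead and would not split the tab-separated fields into the expected five columns.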

Step 6: How many columns are in the dataset?

In [236]:

# Run the following code
chipo.shape[1]

Out[236]:

5

Step 7: Print the names of all columns

In [237]:

# Run the following code
chipo.columns

Out[237]:

Index(['order_id', 'quantity', 'item_name', 'choice_description',
       'item_price'],
      dtype='object')

Step 8: How is the dataset indexed?

In [238]:

# Run the following code
chipo.index

Out[238]:

RangeIndex(start=0, stop=4622, step=1)

Step 9: Which item was ordered the most?

In [239]:

# Run the following code (corrected)
# The string 'sum' selects pandas' own aggregation instead of Python's built-in sum
c = chipo[['item_name','quantity']].groupby(['item_name'],as_index=False).agg({'quantity':'sum'})
c.sort_values(['quantity'],ascending=False,inplace=True)
c.head()

Out[239]:

              item_name  quantity
17         Chicken Bowl       761
18      Chicken Burrito       591
25  Chips and Guacamole       506
39        Steak Burrito       386
10    Canned Soft Drink       351
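The same "sum per group, then take the largest" idea can be sketched on a small made-up table. The `orders` frame below is hypothetical; only its column names match chipo.

```python
import pandas as pd

# Hypothetical order lines; only the column names match the chipo data frame
orders = pd.DataFrame({
    "item_name": ["Chicken Bowl", "Izze", "Chicken Bowl", "Steak Burrito", "Izze"],
    "quantity": [2, 1, 1, 1, 1],
})

# Total quantity per item
totals = orders.groupby("item_name")["quantity"].sum()

# idxmax returns the label of the largest total; nlargest returns the top rows
top_item = totals.idxmax()
top_three = totals.nlargest(3)
print(top_item)
print(top_three)
```

`nlargest` avoids sorting the whole result when only the top few rows are needed.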

Step 10: How many distinct items were ordered in the item_name column?

In [240]:

# Run the following code
chipo['item_name'].nunique()

Out[240]:

50

Step 11: In choice_description, which item was ordered most often?

In [241]:

# Run the following code (it has some minor issues)
chipo['choice_description'].value_counts().head()

Out[241]:

[Diet Coke]                                                                          134
[Coke]                                                                               123
[Sprite]                                                                              77
[Fresh Tomato Salsa, [Rice, Black Beans, Cheese, Sour Cream, Lettuce]]                42
[Fresh Tomato Salsa, [Rice, Black Beans, Cheese, Sour Cream, Guacamole, Lettuce]]     40
Name: choice_description, dtype: int64
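One thing to keep in mind with `value_counts` is that it counts rows, not units ordered. The sketch below uses a made-up `orders` frame (hypothetical data) to show how a quantity-weighted total can rank descriptions differently.

```python
import pandas as pd

# Hypothetical data: three single-can Coke rows and one Sprite row of five cans
orders = pd.DataFrame({
    "choice_description": ["[Coke]", "[Coke]", "[Coke]", "[Sprite]"],
    "quantity": [1, 1, 1, 5],
})

# Counts how many rows mention each description
row_counts = orders["choice_description"].value_counts()

# Sums the quantity column per description instead
unit_totals = orders.groupby("choice_description")["quantity"].sum()

print(row_counts.idxmax())   # most frequent by row
print(unit_totals.idxmax())  # most ordered by units
```

Which of the two is the right answer depends on whether "most ordered" means order lines or units.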

Step 12: How many items were ordered in total?

In [242]:

# Run the following code
total_items_orders = chipo['quantity'].sum()
total_items_orders

Out[242]:

4972

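Step 13: Convert item_price to a floating-point number

In [243]:

# Run the following code
# Strip surrounding whitespace and the leading "$" before converting.
# The original slice x[1:-1] also dropped the last character, which is only
# correct when every value ends with one trailing character such as a space.
dollarizer = lambda x: float(x.strip().lstrip('$'))
chipo['item_price'] = chipo['item_price'].apply(dollarizer)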
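The conversion can also be written with vectorized string methods instead of `apply`. This is a minimal sketch on a made-up `prices` Series (hypothetical values), assuming the only non-numeric characters are a leading "$" and surrounding whitespace.

```python
import pandas as pd

# Hypothetical price strings, with and without a trailing space
prices = pd.Series(["$2.39 ", "$16.98 ", "$10.98"])

# Remove surrounding whitespace, drop the leading "$", then cast to float
as_float = prices.str.strip().str.lstrip("$").astype(float)
print(as_float)
```

If a value contains anything else (for example a thousands separator), `astype(float)` raises a `ValueError`, which makes bad input visible rather than silently wrong.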

Step 14: What was the revenue over the period covered by the dataset?

In [244]:

# Run the following code (already corrected)
chipo['sub_total'] = round(chipo['item_price'] * chipo['quantity'],2)
chipo['sub_total'].sum()

Out[244]:

39237.02

Step 15: How many orders were placed over the period covered by the dataset?

In [245]:

# Run the following code
chipo['order_id'].nunique()

Out[245]:

1834

Step 16: What is the average total per order?

In [246]:

# Run the following code (already corrected)
chipo[['order_id','sub_total']].groupby(by=['order_id']
).agg({'sub_total':'sum'})['sub_total'].mean()

Out[246]:

21.39423118865867
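The average order value is the mean of the per-order totals, which equals total revenue divided by the number of distinct orders. A small sketch with made-up numbers (the `lines` frame is hypothetical):

```python
import pandas as pd

# Hypothetical order lines: three orders with one or two lines each
lines = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3],
    "item_price": [2.0, 3.0, 10.0, 4.0, 1.0],
    "quantity": [1, 2, 1, 1, 1],
})

# Line total, then one total per order, then the mean over orders
lines["sub_total"] = lines["item_price"] * lines["quantity"]
order_totals = lines.groupby("order_id")["sub_total"].sum()
average_order = order_totals.mean()

# Equivalent shortcut: revenue divided by the number of distinct orders
shortcut = lines["sub_total"].sum() / lines["order_id"].nunique()
print(average_order, shortcut)
```

Taking the mean of `sub_total` directly would instead give the average per order line, which is a different quantity.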

Step 17: How many different items were sold?

In [247]:

# Run the following code
chipo['item_name'].nunique()

Out[247]:

50

Exercise 2 - Filtering and sorting

Exploring Euro 2012 data

Step 1 - Import the necessary libraries

In [248]:

# Run the following code
import pandas as pd

Step 2 - Import the dataset from the following path

In [249]:

# Run the following code
path2 = "./exercise_data/Euro2012_stats.csv"      # Euro2012_stats.csv

Step 3 - Name the dataset euro12

In [250]:

# Run the following code
euro12 = pd.read_csv(path2)
euro12

Out[250]:

TeamGoalsShots on targetShots off targetShooting Accuracy% Goals-to-shotsTotal shots (inc. Blocked)Hit WoodworkPenalty goalsPenalties not scored...Saves madeSaves-to-shots ratioFouls WonFouls ConcededOffsidesYellow CardsRed CardsSubs onSubs offPlayers Used
0Croatia4131251.9%16.0%32000...1381.3%41622909916
1Czech Republic4131841.9%12.9%39000...960.1%5373870111119
2Denmark4101050.0%20.0%27100...1066.7%25388407715
3England5111850.0%17.2%40000...2288.1%4345650111116
4France3222437.9%6.5%65100...654.6%3651560111119
5Germany10323247.8%15.6%80210...1062.6%63491240151517
6Greece581830.7%19.2%32111...1365.1%67481291121220
7Italy6344543.0%7.5%110200...2074.1%1018916160181819
8Netherlands2123625.0%4.1%60200...1270.6%35303507715
9Poland2152339.4%5.2%48000...666.7%48563717717
10Portugal6224234.3%9.3%82600...1071.5%739010120141416
11Republic of Ireland171236.8%5.2%28000...1765.4%43511161101017
12Russia593122.5%12.5%59200...1077.0%34434607716
13Spain12423355.9%16.0%100010...1593.8%1028319110171718
14Sweden5171947.2%13.8%39300...861.6%35517709918
15Ukraine272621.2%6.0%38000...1376.5%48314509918

16 rows × 35 columns

Step 4: Select only the Goals column

In [251]:

# Run the following code
euro12.Goals

Out[251]:

0      4
1      4
2      4
3      5
4      3
5     10
6      5
7      6
8      2
9      2
10     6
11     1
12     5
13    12
14     5
15     2
Name: Goals, dtype: int64

Step 5: How many teams took part in Euro 2012?

In [252]:

# Run the following code
euro12.shape[0]

Out[252]:

16

Step 6: How many columns are in the dataset?

In [253]:

# Run the following code
euro12.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 35 columns):
Team                          16 non-null object
Goals                         16 non-null int64
Shots on target               16 non-null int64
Shots off target              16 non-null int64
Shooting Accuracy             16 non-null object
% Goals-to-shots              16 non-null object
Total shots (inc. Blocked)    16 non-null int64
Hit Woodwork                  16 non-null int64
Penalty goals                 16 non-null int64
Penalties not scored          16 non-null int64
Headed goals                  16 non-null int64
Passes                        16 non-null int64
Passes completed              16 non-null int64
Passing Accuracy              16 non-null object
Touches                       16 non-null int64
Crosses                       16 non-null int64
Dribbles                      16 non-null int64
Corners Taken                 16 non-null int64
Tackles                       16 non-null int64
Clearances                    16 non-null int64
Interceptions                 16 non-null int64
Clearances off line           15 non-null float64
Clean Sheets                  16 non-null int64
Blocks                        16 non-null int64
Goals conceded                16 non-null int64
Saves made                    16 non-null int64
Saves-to-shots ratio          16 non-null object
Fouls Won                     16 non-null int64
Fouls Conceded                16 non-null int64
Offsides                      16 non-null int64
Yellow Cards                  16 non-null int64
Red Cards                     16 non-null int64
Subs on                       16 non-null int64
Subs off                      16 non-null int64
Players Used                  16 non-null int64
dtypes: float64(1), int64(29), object(5)
memory usage: 4.5+ KB

Step 7: Store the columns Team, Yellow Cards and Red Cards in a separate data frame named discipline

In [254]:

# Run the following code
discipline = euro12[['Team', 'Yellow Cards', 'Red Cards']]
discipline

Out[254]:

                   Team  Yellow Cards  Red Cards
 0              Croatia             9          0
 1       Czech Republic             7          0
 2              Denmark             4          0
 3              England             5          0
 4               France             6          0
 5              Germany             4          0
 6               Greece             9          1
 7                Italy            16          0
 8          Netherlands             5          0
 9               Poland             7          1
10             Portugal            12          0
11  Republic of Ireland             6          1
12               Russia             6          0
13                Spain            11          0
14               Sweden             7          0
15              Ukraine             5          0

Step 8: Sort the data frame discipline by Red Cards first, then by Yellow Cards

In [255]:

# Run the following code
discipline.sort_values(['Red Cards', 'Yellow Cards'], ascending = False)

Out[255]:

                   Team  Yellow Cards  Red Cards
 6               Greece             9          1
 9               Poland             7          1
11  Republic of Ireland             6          1
 7                Italy            16          0
10             Portugal            12          0
13                Spain            11          0
 0              Croatia             9          0
 1       Czech Republic             7          0
14               Sweden             7          0
 4               France             6          0
12               Russia             6          0
 3              England             5          0
 8          Netherlands             5          0
15              Ukraine             5          0
 2              Denmark             4          0
 5              Germany             4          0
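When the sort keys need different directions, `ascending` also accepts one flag per key. A minimal sketch on a made-up `cards` frame (hypothetical teams and counts):

```python
import pandas as pd

# Hypothetical disciplinary records
cards = pd.DataFrame({
    "Team": ["A", "B", "C", "D"],
    "Yellow Cards": [9, 7, 16, 7],
    "Red Cards": [1, 1, 0, 0],
})

# Red Cards descending first; ties are then broken by Yellow Cards ascending
ranked = cards.sort_values(["Red Cards", "Yellow Cards"], ascending=[False, True])
print(ranked)
```

`sort_values` returns a new data frame, so `cards` itself keeps its original row order.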

Step 9: Compute the mean number of yellow cards per team

In [256]:

# Run the following code
round(discipline['Yellow Cards'].mean())

Out[256]:

7.0

Step 10: Find the teams that scored more than 6 goals

In [257]:

# Run the following code
euro12[euro12.Goals > 6]

Out[257]:

TeamGoalsShots on targetShots off targetShooting Accuracy% Goals-to-shotsTotal shots (inc. Blocked)Hit WoodworkPenalty goalsPenalties not scored...Saves madeSaves-to-shots ratioFouls WonFouls ConcededOffsidesYellow CardsRed CardsSubs onSubs offPlayers Used
5Germany10323247.8%15.6%80210...1062.6%63491240151517
13Spain12423355.9%16.0%100010...1593.8%1028319110171718

2 rows × 35 columns

Step 11: Select the teams whose names start with the letter G

In [258]:

# Run the following code
euro12[euro12.Team.str.startswith('G')]

Out[258]:

TeamGoalsShots on targetShots off targetShooting Accuracy% Goals-to-shotsTotal shots (inc. Blocked)Hit WoodworkPenalty goalsPenalties not scored...Saves madeSaves-to-shots ratioFouls WonFouls ConcededOffsidesYellow CardsRed CardsSubs onSubs offPlayers Used
5Germany10323247.8%15.6%80210...1062.6%63491240151517
6Greece581830.7%19.2%32111...1365.1%67481291121220

2 rows × 35 columns

Step 12: Select the first 7 columns

In [259]:

# Run the following code
euro12.iloc[: , 0:7]

Out[259]:

TeamGoalsShots on targetShots off targetShooting Accuracy% Goals-to-shotsTotal shots (inc. Blocked)
0Croatia4131251.9%16.0%32
1Czech Republic4131841.9%12.9%39
2Denmark4101050.0%20.0%27
3England5111850.0%17.2%40
4France3222437.9%6.5%65
5Germany10323247.8%15.6%80
6Greece581830.7%19.2%32
7Italy6344543.0%7.5%110
8Netherlands2123625.0%4.1%60
9Poland2152339.4%5.2%48
10Portugal6224234.3%9.3%82
11Republic of Ireland171236.8%5.2%28
12Russia593122.5%12.5%59
13Spain12423355.9%16.0%100
14Sweden5171947.2%13.8%39
15Ukraine272621.2%6.0%38

Step 13: Select all columns except the last 3

In [260]:

# Run the following code
euro12.iloc[: , :-3]

Out[260]:

TeamGoalsShots on targetShots off targetShooting Accuracy% Goals-to-shotsTotal shots (inc. Blocked)Hit WoodworkPenalty goalsPenalties not scored...Clean SheetsBlocksGoals concededSaves madeSaves-to-shots ratioFouls WonFouls ConcededOffsidesYellow CardsRed Cards
0Croatia4131251.9%16.0%32000...01031381.3%4162290
1Czech Republic4131841.9%12.9%39000...1106960.1%5373870
2Denmark4101050.0%20.0%27100...11051066.7%2538840
3England5111850.0%17.2%40000...22932288.1%4345650
4France3222437.9%6.5%65100...175654.6%3651560
5Germany10323247.8%15.6%80210...11161062.6%63491240
6Greece581830.7%19.2%32111...12371365.1%67481291
7Italy6344543.0%7.5%110200...21872074.1%1018916160
8Netherlands2123625.0%4.1%60200...0951270.6%3530350
9Poland2152339.4%5.2%48000...083666.7%4856371
10Portugal6224234.3%9.3%82600...21141071.5%739010120
11Republic of Ireland171236.8%5.2%28000...02391765.4%43511161
12Russia593122.5%12.5%59200...0831077.0%3443460
13Spain12423355.9%16.0%100010...5811593.8%1028319110
14Sweden5171947.2%13.8%39300...1125861.6%3551770
15Ukraine272621.2%6.0%38000...0441376.5%4831450

16 rows × 32 columns

Step 14: Find the shooting accuracy (Shooting Accuracy) of England, Italy and Russia

In [261]:

# Run the following code
euro12.loc[euro12.Team.isin(['England', 'Italy', 'Russia']), ['Team','Shooting Accuracy']]

Out[261]:

       Team Shooting Accuracy
3   England             50.0%
7     Italy             43.0%
12   Russia             22.5%
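The accuracy values are stored as text such as "50.0%", so they cannot be compared numerically as they are. The sketch below rebuilds a four-row frame by hand (the `teams` frame and the `accuracy_pct` column name are illustrative) and converts the strings to floats.

```python
import pandas as pd

# Small hand-built frame using values shown in the outputs above
teams = pd.DataFrame({
    "Team": ["England", "Italy", "Russia", "Spain"],
    "Shooting Accuracy": ["50.0%", "43.0%", "22.5%", "55.9%"],
})

# Row filter with isin, column filter with a list; copy() so we can add a column
mask = teams["Team"].isin(["England", "Italy", "Russia"])
selected = teams.loc[mask, ["Team", "Shooting Accuracy"]].copy()

# Drop the trailing "%" and cast to float to allow numeric comparison
selected["accuracy_pct"] = selected["Shooting Accuracy"].str.rstrip("%").astype(float)
best = selected.loc[selected["accuracy_pct"].idxmax(), "Team"]
print(best)
```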

Exercise 3 - Grouping

Exploring alcohol consumption data

Step 1: Import the necessary libraries

In [262]:

# Run the following code
import pandas as pd

Step 2: Import the data from the following path

In [10]:

# Run the following code
path3 ='./exercise_data/drinks.csv'    #'drinks.csv'

Step 3: Name the data frame drinks

In [11]:

# Run the following code
drinks = pd.read_csv(path3)
drinks.head()

Out[11]:

       country  beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol continent
0  Afghanistan              0                0              0                           0.0        AS
1      Albania             89              132             54                           4.9        EU
2      Algeria             25                0             14                           0.7        AF
3      Andorra            245              138            312                          12.4        EU
4       Angola            217               57             45                           5.9        AF

Step 4: Which continent drinks more beer on average?

In [12]:

# Run the following code
drinks.groupby('continent').beer_servings.mean()

Out[12]:

continent
AF     61.471698
AS     37.045455
EU    193.777778
OC     89.687500
SA    175.083333
Name: beer_servings, dtype: float64
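The result above lists five continent codes. By default `read_csv` treats the literal text NA as a missing value, and `groupby` skips rows whose key is missing, so if a file codes a continent as "NA" (for example North America), those rows would be left out. A minimal sketch with a made-up two-row CSV (hypothetical `csv_text`), showing the default behaviour and the `keep_default_na=False` option:

```python
import io
import pandas as pd

# Hypothetical CSV in which one continent code is the text "NA"
csv_text = "country,beer_servings,continent\nCanada,240,NA\nFrance,127,EU\n"

# Default parsing: "NA" becomes NaN, so groupby would drop that row
default = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False keeps "NA" as an ordinary string
kept = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(default["continent"].isna().sum())
print(kept.groupby("continent")["beer_servings"].mean())
```

Note that `keep_default_na=False` also stops empty fields from being read as NaN, so it is best combined with an explicit `na_values` list when the file has real gaps.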

Step 5: Print the descriptive statistics of wine consumption (wine_servings) for each continent

In [13]:

# Run the following code
drinks.groupby('continent').wine_servings.describe()

Out[13]:

           count        mean        std  min   25%    50%     75%    max
continent
AF          53.0   16.264151  38.846419  0.0   1.0    2.0   13.00  233.0
AS          44.0    9.068182  21.667034  0.0   0.0    1.0    8.00  123.0
EU          45.0  142.222222  97.421738  0.0  59.0  128.0  195.00  370.0
OC          16.0   35.625000  64.555790  0.0   1.0    8.5   23.25  212.0
SA          12.0   62.416667  88.620189  1.0   3.0   12.0   98.50  221.0

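Step 6: Print the mean consumption of each type of alcohol for each continent

In [15]:

# Run the following code
# numeric_only=True skips the text column country, which pandas 2.0 and later
# no longer drops silently (without it, mean() raises a TypeError)
drinks.groupby('continent').mean(numeric_only=True)

Out[15]:

           beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol
continent
AF             61.471698        16.339623      16.264151                      3.007547
AS             37.045455        60.840909       9.068182                      2.170455
EU            193.777778       132.555556     142.222222                      8.617778
OC             89.687500        58.437500      35.625000                      3.381250
SA            175.083333       114.750000      62.416667                      6.308333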

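Step 7: Print the median consumption of each type of alcohol for each continent

In [268]:

# Run the following code
# numeric_only=True skips the text column country (required in pandas 2.0 and later)
drinks.groupby('continent').median(numeric_only=True)

Out[268]:

           beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol
continent
AF                  32.0              3.0            2.0                          2.30
AS                  17.5             16.0            1.0                          1.20
EU                 219.0            122.0          128.0                         10.00
OC                  52.5             37.0            8.5                          1.75
SA                 162.5            108.5           12.0                          6.85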

Step 8: Print the mean, minimum and maximum spirit consumption for each continent

In [269]:

# Run the following code
drinks.groupby('continent').spirit_servings.agg(['mean', 'min', 'max'])

Out[269]:

                 mean  min  max
continent
AF          16.339623    0  152
AS          60.840909    0  326
EU         132.555556    0  373
OC          58.437500    0  254
SA         114.750000   25  302
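`agg` also accepts keyword arguments, which lets each result column be given an explicit name. A minimal sketch on a made-up `toy` frame; the output column names (`mean_servings` and so on) are illustrative choices, not part of the exercise.

```python
import pandas as pd

# Hypothetical rows; column names match the drinks data frame
toy = pd.DataFrame({
    "continent": ["EU", "EU", "AF", "AF"],
    "spirit_servings": [132, 138, 0, 57],
})

# Named aggregation: keyword = output column name, value = aggregation to apply
summary = toy.groupby("continent")["spirit_servings"].agg(
    mean_servings="mean",
    min_servings="min",
    max_servings="max",
)
print(summary)
```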

Exercise 4 - Apply functions

Exploring US crime data, 1960 - 2014

Step 1: Import the necessary libraries

In [16]:

# Run the following code
import numpy as np
import pandas as pd

Step 2: Import the dataset from the following path

In [27]:

# Run the following code
path4 = './exercise_data/US_Crime_Rates_1960_2014.csv'    # "US_Crime_Rates_1960_2014.csv"

Step 3: Name the data frame crime

In [28]:

# Run the following code
crime = pd.read_csv(path4)
crime.head()

Out[28]:

YearPopulationTotalViolentPropertyMurderForcible_RapeRobberyAggravated_assaultBurglaryLarceny_TheftVehicle_Theft
01960179323175338420028846030957009110171901078401543209121001855400328200
11961182992000348800028939031986008740172201066701567609496001913000336000
21962185771000375220030151034507008530175501108601645709943002089600366800
319631884830004109500316970379250086401765011647017421010864002297800408300
419641911410004564600364220420040093602142013039020305012132002514400472800

Step 4: What is the data type of each column?

In [29]:

# Run the following code
crime.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55 entries, 0 to 54
Data columns (total 12 columns):
Year                  55 non-null int64
Population            55 non-null int64
Total                 55 non-null int64
Violent               55 non-null int64
Property              55 non-null int64
Murder                55 non-null int64
Forcible_Rape         55 non-null int64
Robbery               55 non-null int64
Aggravated_assault    55 non-null int64
Burglary              55 non-null int64
Larceny_Theft         55 non-null int64
Vehicle_Theft         55 non-null int64
dtypes: int64(12)
memory usage: 5.2 KB

Notice that the data type of Year is int64, while pandas has a dedicated data type for time series. Let's convert it.

Step 5: Convert the data type of Year to datetime64

In [30]:

# Run the following code
crime.Year = pd.to_datetime(crime.Year, format='%Y')
crime.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55 entries, 0 to 54
Data columns (total 12 columns):
Year                  55 non-null datetime64[ns]
Population            55 non-null int64
Total                 55 non-null int64
Violent               55 non-null int64
Property              55 non-null int64
Murder                55 non-null int64
Forcible_Rape         55 non-null int64
Robbery               55 non-null int64
Aggravated_assault    55 non-null int64
Burglary              55 non-null int64
Larceny_Theft         55 non-null int64
Vehicle_Theft         55 non-null int64
dtypes: datetime64[ns](1), int64(11)
memory usage: 5.2 KB
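The `format='%Y'` argument matters here. A minimal sketch on a made-up `years` Series, comparing the conversion with and without it:

```python
import pandas as pd

# Hypothetical integer years, like the Year column before conversion
years = pd.Series([1960, 1961, 1962])

# With format="%Y" each integer is read as a four-digit year
as_dates = pd.to_datetime(years, format="%Y")

# Without a format, integers are interpreted as nanoseconds since 1970-01-01
naive = pd.to_datetime(years)

print(as_dates.dt.year.tolist())
print(naive.dt.year.tolist())
```

The second call does not fail; it silently produces timestamps a few microseconds after the Unix epoch, which is why stating the format explicitly is the safer habit.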

Step 6: Set the Year column as the index of the data frame

In [31]:

# Run the following code
crime = crime.set_index('Year', drop = True)
crime.head()

Out[31]:

PopulationTotalViolentPropertyMurderForcible_RapeRobberyAggravated_assaultBurglaryLarceny_TheftVehicle_Theft
Year
1960-01-01179323175338420028846030957009110171901078401543209121001855400328200
1961-01-01182992000348800028939031986008740172201066701567609496001913000336000
1962-01-01185771000375220030151034507008530175501108601645709943002089600366800
1963-01-011884830004109500316970379250086401765011647017421010864002297800408300
1964-01-011911410004564600364220420040093602142013039020305012132002514400472800

Step 7: Delete the column named Total

In [32]:

# Run the following code
del crime['Total']
crime.head()

Out[32]:

PopulationViolentPropertyMurderForcible_RapeRobberyAggravated_assaultBurglaryLarceny_TheftVehicle_Theft
Year
1960-01-0117932317528846030957009110171901078401543209121001855400328200
1961-01-0118299200028939031986008740172201066701567609496001913000336000
1962-01-0118577100030151034507008530175501108601645709943002089600366800
1963-01-01188483000316970379250086401765011647017421010864002297800408300
1964-01-01191141000364220420040093602142013039020305012132002514400472800

In [33]:

# 'YS' (year start) replaces the alias 'AS', which recent pandas releases deprecate
crime.resample('10YS').sum()

Out[33]:

PopulationViolentPropertyMurderForcible_RapeRobberyAggravated_assaultBurglaryLarceny_TheftVehicle_Theft
Year
1960-01-0119150531754134930451609001061802367201633510215852013321100265477005292100
1970-01-0121211932989607930913838001922305545704159020470212028486000531578009739900
1980-01-0123713700691407432811704890020643986563953831097619130330734947204025311935411
1990-01-01261282525817527048119053499211664998827574893010568963267500157767936614624418
2000-01-0129479691171396805610094436916306892249942303668652124215651766797029111412834
2010-01-011570146307607201744095950728674210591749809376414210125170304016983569080

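Step 8: Group the data frame by decade of Year and sum the values

Note: summing the Population column directly is not correct, because population is a yearly headcount rather than a count of events. The maximum within each decade is used for it instead.

In [34]:

# More about .resample:
# (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.resample.html)
# More about offset aliases:
# (http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases)
# Run the following code
# '10YS' means 10-year bins anchored at year start; 'YS' replaces the alias 'AS',
# which recent pandas releases deprecate
crimes = crime.resample('10YS').sum() # resample the time series per decade


# Use resample to get the maximum of the "Population" column
population = crime['Population'].resample('10YS').max()

# Update "Population"
crimes['Population'] = population

crimes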

Out[34]:

PopulationViolentPropertyMurderForcible_RapeRobberyAggravated_assaultBurglaryLarceny_TheftVehicle_Theft
Year
1960-01-012013850004134930451609001061802367201633510215852013321100265477005292100
1970-01-012200990009607930913838001922305545704159020470212028486000531578009739900
1980-01-012482390001407432811704890020643986563953831097619130330734947204025311935411
1990-01-0127269081317527048119053499211664998827574893010568963267500157767936614624418
2000-01-013070065501396805610094436916306892249942303668652124215651766797029111412834
2010-01-01318857056607201744095950728674210591749809376414210125170304016983569080
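The "sum everything, then overwrite Population" approach can also be expressed in one call by passing a per-column mapping to `agg`. This is a sketch on made-up yearly data (the `toy` frame is hypothetical), not the exercise's own solution.

```python
import pandas as pd

# Hypothetical yearly data for 1960-1979: a growing population and one event per year
idx = pd.date_range("1960-01-01", periods=20, freq="YS")
toy = pd.DataFrame({"Population": range(100, 120), "Murder": [1] * 20}, index=idx)

# One aggregation per column: maximum for the headcount, sum for the event count
decades = toy.resample("10YS").agg({"Population": "max", "Murder": "sum"})
print(decades)
```

Each row of `decades` is labelled with the first day of its decade, as in the output above.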

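Step 9: Which was the most dangerous decade to live in the US?

In [279]:

# Run the following code
# idxmax on the yearly data frame crime returns, for each column, the single year with the highest value
crime.idxmax(0)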

Out[279]:

Population           2014-01-01
Violent              1992-01-01
Property             1991-01-01
Murder               1991-01-01
Forcible_Rape        1992-01-01
Robbery              1991-01-01
Aggravated_assault   1993-01-01
Burglary             1980-01-01
Larceny_Theft        1991-01-01
Vehicle_Theft        1991-01-01
dtype: datetime64[ns]

Exercise 5 - Merging

Exploring fictitious name data

Step 1: Import the necessary libraries

In [280]:

# Run the following code
import numpy as np
import pandas as pd

Step 2: Create the data frames from the following raw data

In [281]:

# Run the following code
raw_data_1 = {
        'subject_id': ['1', '2', '3', '4', '5'],
        'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'], 
        'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']}

raw_data_2 = {
        'subject_id': ['4', '5', '6', '7', '8'],
        'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'], 
        'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']}

raw_data_3 = {
        'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
        'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]}

Step 3: Name the data frames data1, data2 and data3

In [282]:

# Run the following code
data1 = pd.DataFrame(raw_data_1, columns = ['subject_id', 'first_name', 'last_name'])
data2 = pd.DataFrame(raw_data_2, columns = ['subject_id', 'first_name', 'last_name'])
data3 = pd.DataFrame(raw_data_3, columns = ['subject_id','test_id'])

Step 4: Combine data1 and data2 along the rows and name the result all_data

In [283]:

# Run the following code
all_data = pd.concat([data1, data2])
all_data

Out[283]:

  subject_id first_name last_name
0          1       Alex  Anderson
1          2        Amy  Ackerman
2          3      Allen       Ali
3          4      Alice      Aoni
4          5     Ayoung   Atiches
0          4      Billy    Bonder
1          5      Brian     Black
2          6       Bran   Balwner
3          7      Bryce     Brice
4          8      Betty    Btisan
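In the result above the index labels 0 to 4 appear twice, because `concat` keeps each input's own index. A minimal sketch with two made-up two-row frames (`top` and `bottom` are hypothetical) showing the `ignore_index=True` option, which renumbers the rows:

```python
import pandas as pd

# Hypothetical two-row frames with the same columns
top = pd.DataFrame({"subject_id": ["1", "2"], "first_name": ["Alex", "Amy"]})
bottom = pd.DataFrame({"subject_id": ["4", "5"], "first_name": ["Billy", "Brian"]})

# Default: the original index labels are kept, so they repeat
stacked = pd.concat([top, bottom])

# ignore_index=True discards them and builds a fresh 0..n-1 index
renumbered = pd.concat([top, bottom], ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```

Repeated labels are legal, but `stacked.loc[0]` then returns two rows, which can be surprising later on.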

Step 5: Combine data1 and data2 along the columns and name the result all_data_col

In [284]:

# Run the following code
all_data_col = pd.concat([data1, data2], axis = 1)
all_data_col

Out[284]:

  subject_id first_name last_name subject_id first_name last_name
0          1       Alex  Anderson          4      Billy    Bonder
1          2        Amy  Ackerman          5      Brian     Black
2          3      Allen       Ali          6       Bran   Balwner
3          4      Alice      Aoni          7      Bryce     Brice
4          5     Ayoung   Atiches          8      Betty    Btisan

Step 6: Print data3

In [285]:

# Run the following code
data3

Out[285]:

  subject_id  test_id
0          1       51
1          2       15
2          3       15
3          4       61
4          5       16
5          7       14
6          8       15
7          9        1
8         10       61
9         11       16

Step 7: Merge all_data and data3 on the values of subject_id

In [286]:

# Run the following code
pd.merge(all_data, data3, on='subject_id')

Out[286]:

  subject_id first_name last_name  test_id
0          1       Alex  Anderson       51
1          2        Amy  Ackerman       15
2          3      Allen       Ali       15
3          4      Alice      Aoni       61
4          4      Billy    Bonder       61
5          5     Ayoung   Atiches       16
6          5      Brian     Black       16
7          7      Bryce     Brice       14
8          8      Betty    Btisan       15

Step 8: Join data1 and data2 on subject_id, keeping only the rows that match in both

In [287]:

# Run the following code
pd.merge(data1, data2, on='subject_id', how='inner')

Out[287]:

  subject_id first_name_x last_name_x first_name_y last_name_y
0          4        Alice        Aoni        Billy      Bonder
1          5       Ayoung     Atiches        Brian       Black

Step 9: Merge all values in data1 and data2, with matching records from both sides where available

In [288]:

# Run the following code
pd.merge(data1, data2, on='subject_id', how='outer')

Out[288]:

  subject_id first_name_x last_name_x first_name_y last_name_y
0          1         Alex    Anderson          NaN         NaN
1          2          Amy    Ackerman          NaN         NaN
2          3        Allen         Ali          NaN         NaN
3          4        Alice        Aoni        Billy      Bonder
4          5       Ayoung     Atiches        Brian       Black
5          6          NaN         NaN         Bran     Balwner
6          7          NaN         NaN        Bryce       Brice
7          8          NaN         NaN        Betty      Btisan
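With an outer merge it can help to record where each row came from. The `indicator=True` option of `pd.merge` adds a `_merge` column for that. A minimal sketch on two made-up frames (`left` and `right` are hypothetical):

```python
import pandas as pd

# Hypothetical frames that share only subject_id "4"
left = pd.DataFrame({"subject_id": ["3", "4"], "first_name": ["Allen", "Alice"]})
right = pd.DataFrame({"subject_id": ["4", "6"], "first_name": ["Billy", "Bran"]})

# indicator=True adds a _merge column: left_only, right_only or both
merged = pd.merge(left, right, on="subject_id", how="outer", indicator=True)
print(merged)

# Map each key to its origin for a quick check
origin = dict(zip(merged["subject_id"], merged["_merge"].astype(str)))
print(origin)
```

Filtering on `_merge == "both"` reproduces the inner join, so one outer merge with the indicator covers both questions.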

Exercise 6 - Statistics

Exploring wind speed data

Step 1: Import the necessary libraries

In [289]:

# Run the following code
import pandas as pd
import datetime

Step 2: Import the data from the following path

In [35]:

# Run the following code
path6 = "./exercise_data/wind.data"  # wind.data

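Step 3: Store the data in a variable named data and turn the first three columns into a proper date index

In [293]:

# Run the following code
# r"\s+" is a raw string, so the regex escape reaches pandas unchanged.
# Note: the nested-list form of parse_dates is deprecated in recent pandas releases.
data = pd.read_table(path6, sep = r"\s+", parse_dates = [[0,1,2]]) 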
data.head()

Out[293]:

Yr_Mo_DyRPTVALROSKILSHABIRDUBCLAMULCLOBELMAL
02061-01-0115.0414.9613.179.29NaN9.8713.6710.2510.8312.5818.5015.04
12061-01-0214.71NaN10.836.5012.627.6711.5010.049.799.6717.5413.83
22061-01-0318.5016.8812.3310.1311.176.1711.25NaN8.507.6712.7512.71
32061-01-0410.586.6311.754.584.542.888.631.795.835.885.4610.88
42061-01-0513.3313.2511.426.1710.718.2111.926.5410.9210.3412.9211.83

Step 4: The year 2061? Do we really have data for that year? Create a function and use it to fix this bug

In [294]:

# Run the following code
def fix_century(x):
    year = x.year - 100 if x.year > 1989 else x.year
    return datetime.date(year, x.month, x.day)

# apply the function fix_century on the column and replace the values to the right ones
data['Yr_Mo_Dy'] = data['Yr_Mo_Dy'].apply(fix_century)

# data.info()
data.head()

Out[294]:

Yr_Mo_DyRPTVALROSKILSHABIRDUBCLAMULCLOBELMAL
01961-01-0115.0414.9613.179.29NaN9.8713.6710.2510.8312.5818.5015.04
11961-01-0214.71NaN10.836.5012.627.6711.5010.049.799.6717.5413.83
21961-01-0318.5016.8812.3310.1311.176.1711.25NaN8.507.6712.7512.71
31961-01-0410.586.6311.754.584.542.888.631.795.835.885.4610.88
41961-01-0513.3313.2511.426.1710.718.2111.926.5410.9210.3412.9211.83
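The same century correction can be sketched without a row-by-row function by subtracting a `pd.DateOffset` where the year is too large. This is an alternative under the same threshold (years after 1989 are shifted back) applied to a made-up `dates` Series; it is not the exercise's own solution.

```python
import pandas as pd

# Hypothetical dates: two parsed into the wrong century, one already correct
dates = pd.Series(pd.to_datetime(["2061-01-01", "2078-12-31", "1975-06-15"]))

# where() keeps a value when the condition is True and takes the replacement otherwise
fixed = dates.where(dates.dt.year <= 1989, dates - pd.DateOffset(years=100))
print(fixed)
```

Unlike `datetime.date` objects returned by an applied function, the result stays a datetime64 column, so no later `pd.to_datetime` call is needed.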

Step 5: Set the date as the index; mind the data type, which should be datetime64[ns]

In [295]:

# Run the following code
# convert Yr_Mo_Dy to the datetime64 type
data["Yr_Mo_Dy"] = pd.to_datetime(data["Yr_Mo_Dy"])

# set 'Yr_Mo_Dy' as the index
data = data.set_index('Yr_Mo_Dy')

data.head()
# data.info()

Out[295]:

RPTVALROSKILSHABIRDUBCLAMULCLOBELMAL
Yr_Mo_Dy
1961-01-0115.0414.9613.179.29NaN9.8713.6710.2510.8312.5818.5015.04
1961-01-0214.71NaN10.836.5012.627.6711.5010.049.799.6717.5413.83
1961-01-0318.5016.8812.3310.1311.176.1711.25NaN8.507.6712.7512.71
1961-01-0410.586.6311.754.584.542.888.631.795.835.885.4610.88
1961-01-0513.3313.2511.426.1710.718.2111.926.5410.9210.3412.9211.83

Step 6: For each location, how many values are missing?

In [296]:

# Run the following code
data.isnull().sum()

Out[296]:

RPT    6
VAL    3
ROS    2
KIL    5
SHA    2
BIR    0
DUB    3
CLA    2
MUL    3
CLO    1
BEL    0
MAL    4
dtype: int64

Step 7: For each location, how many non-missing values are there?

In [297]:

# Run the following code
data.shape[0] - data.isnull().sum()

Out[297]:

RPT    6568
VAL    6571
ROS    6572
KIL    6569
SHA    6572
BIR    6574
DUB    6571
CLA    6572
MUL    6571
CLO    6573
BEL    6574
MAL    6570
dtype: int64
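The subtraction above is equivalent to `DataFrame.count()`, which counts the non-missing values per column directly. A minimal check on a made-up `toy` frame (hypothetical readings):

```python
import numpy as np
import pandas as pd

# Hypothetical readings with one missing value in RPT
toy = pd.DataFrame({"RPT": [15.04, np.nan, 18.50], "VAL": [14.96, 16.88, 6.63]})

missing = toy.isnull().sum()   # missing values per column
present = toy.count()          # non-missing values per column

print(missing.tolist(), present.tolist())
```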

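Step 8: Compute the mean wind speed over the whole dataset

In [298]:

# Run the following code
# This is the mean of the per-location means. It can differ slightly from the mean over
# all individual readings when locations have different numbers of missing values.
data.mean().mean()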

Out[298]:

10.227982360836924
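The difference between a mean of column means and a mean over all readings only shows up when columns have different numbers of missing values. A deliberately extreme made-up example (the `toy` frame is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data: column A has one valid reading, column B has three
toy = pd.DataFrame({"A": [1.0, np.nan, np.nan], "B": [4.0, 4.0, 4.0]})

# Mean of the two column means: (1 + 4) / 2
mean_of_means = toy.mean().mean()

# Mean over every valid reading: (1 + 4 + 4 + 4) / 4
overall_mean = np.nanmean(toy.to_numpy())

print(mean_of_means, overall_mean)
```

In the wind data only a handful of values are missing per location, so the two figures would be very close there.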

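Step 9: Create a data frame named loc_stats that computes and stores the minimum, maximum, mean and standard deviation of the wind speed for each location

In [299]:

# Run the following code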
loc_stats = pd.DataFrame()

loc_stats['min'] = data.min() # min
loc_stats['max'] = data.max() # max 
loc_stats['mean'] = data.mean() # mean
loc_stats['std'] = data.std() # standard deviations

loc_stats

Out[299]:

      min    max       mean       std
RPT  0.67  35.80  12.362987  5.618413
VAL  0.21  33.37  10.644314  5.267356
ROS  1.50  33.84  11.660526  5.008450
KIL  0.00  28.46   6.306468  3.605811
SHA  0.13  37.54  10.455834  4.936125
BIR  0.00  26.16   7.092254  3.968683
DUB  0.00  30.37   9.797343  4.977555
CLA  0.00  31.08   8.495053  4.499449
MUL  0.00  25.88   8.493590  4.166872
CLO  0.04  28.21   8.707332  4.503954
BEL  0.13  42.38  13.121007  5.835037
MAL  0.67  42.54  15.599079  6.699794
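The four statistics can also be collected in one call with `DataFrame.agg` and then transposed so that each location is a row. A minimal sketch on a made-up two-column `toy` frame (hypothetical readings):

```python
import pandas as pd

# Hypothetical readings for two locations
toy = pd.DataFrame({"RPT": [1.0, 3.0], "VAL": [2.0, 6.0]})

# agg with a list returns one row per statistic; .T flips it to one row per location
loc_stats = toy.agg(["min", "max", "mean", "std"]).T
print(loc_stats)
```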

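Step 10: Create a data frame named day_stats that computes and stores the minimum, maximum, mean and standard deviation of the wind speed across all locations for each day

In [300]:

# Run the following code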
# create the dataframe
day_stats = pd.DataFrame()

# this time we determine axis equals to one so it gets each row.
day_stats['min'] = data.min(axis = 1) # min
day_stats['max'] = data.max(axis = 1) # max 
day_stats['mean'] = data.mean(axis = 1) # mean
day_stats['std'] = data.std(axis = 1) # standard deviations

day_stats.head()

Out[300]:

             min    max       mean       std
Yr_Mo_Dy
1961-01-01  9.29  18.50  13.018182  2.808875
1961-01-02  6.50  17.54  11.336364  3.188994
1961-01-03  6.17  18.50  11.641818  3.681912
1961-01-04  1.79  11.75   6.619167  3.198126
1961-01-05  6.17  13.33  10.630000  2.445356

Step 11: For each location, compute the mean wind speed in January

Note: January 1961 and January 1962 are both treated as January here, so the result pools every January in the data.

In [301]:

# Run the following code
# create a new column 'date' from the index values
data['date'] = data.index

# create one column for each component of the date
data['month'] = data['date'].apply(lambda date: date.month)
data['year'] = data['date'].apply(lambda date: date.year)
data['day'] = data['date'].apply(lambda date: date.day)

# select all rows from month 1 and assign them to january_winds
january_winds = data.query('month == 1')

# take the mean of january_winds; .loc restricts it to the location columns so month, year and day are left out
january_winds.loc[:,'RPT':"MAL"].mean()

Out[301]:

RPT    14.847325
VAL    12.914560
ROS    13.299624
KIL     7.199498
SHA    11.667734
BIR     8.054839
DUB    11.819355
CLA     9.512047
MUL     9.543208
CLO    10.053566
BEL    14.550520
MAL    18.028763
dtype: float64
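With a datetime index, the month can be read straight from `index.month` without helper columns. The sketch below uses a made-up three-row `toy` frame (hypothetical values) and shows both the pooled January mean and a per-year variant.

```python
import pandas as pd

# Hypothetical readings: two Januaries and one February
idx = pd.to_datetime(["1961-01-01", "1961-02-01", "1962-01-01"])
toy = pd.DataFrame({"RPT": [15.0, 14.0, 9.0]}, index=idx)

# Keep only the rows whose index falls in January
jan = toy[toy.index.month == 1]

pooled = jan.mean()                          # all Januaries together
per_year = jan.groupby(jan.index.year).mean()  # one mean per calendar year

print(pooled)
print(per_year)
```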

Step 12: Downsample the records to a yearly frequency

In [302]:

# Run the following code
data.query('month == 1 and day == 1')

Out[302]:

RPTVALROSKILSHABIRDUBCLAMULCLOBELMALdatemonthyearday
Yr_Mo_Dy
1961-01-0115.0414.9613.179.29NaN9.8713.6710.2510.8312.5818.5015.041961-01-01119611
1962-01-019.293.4211.543.502.211.9610.412.793.545.174.387.921962-01-01119621
1963-01-0115.5913.6219.798.3812.2510.0023.4515.7113.5914.3717.5834.131963-01-01119631
1964-01-0125.8022.1318.2113.2521.2914.7914.1219.5813.2516.7528.9621.001964-01-01119641
1965-01-019.5411.929.004.386.085.2110.256.085.718.6312.0417.411965-01-01119651
1966-01-0122.0421.5017.0812.7522.1715.5921.7918.1216.6617.8328.3323.791966-01-01119661
1967-01-016.464.466.503.216.673.7911.383.837.719.0810.6720.911967-01-01119671
1968-01-0130.0417.8816.2516.2521.7912.5418.1616.6218.7517.6222.2527.291968-01-01119681
1969-01-016.131.635.411.082.541.008.502.424.586.349.1716.711969-01-01119691
1970-01-019.592.9611.793.426.134.089.004.467.293.507.3313.001970-01-01119701
1971-01-013.710.794.710.171.421.044.630.751.541.084.219.541971-01-01119711
1972-01-019.293.6314.544.256.754.4213.005.3310.048.548.7119.171972-01-01119721
1973-01-0116.5015.9214.627.418.2911.2113.547.7910.4610.7913.379.711973-01-01119731
1974-01-0123.2116.5416.089.7515.8311.469.5413.5413.8316.6617.2125.291974-01-01119741
1975-01-0114.0413.5411.295.4612.585.588.128.969.295.177.7111.631975-01-01119751
1976-01-0118.3417.6714.838.0016.6210.1313.179.0413.135.7511.3814.961976-01-01119761
1977-01-0120.0411.9220.259.139.298.0410.755.889.009.0014.8825.701977-01-01119771
1978-01-018.337.127.713.548.507.5014.7110.0011.8310.0015.0920.461978-01-01119781

Step 13: Downsample the records to a monthly frequency

In [303]:

# Run the following code
data.query('day == 1')

Out[303]:

RPTVALROSKILSHABIRDUBCLAMULCLOBELMALdatemonthyearday
Yr_Mo_Dy
1961-01-0115.0414.9613.179.29NaN9.8713.6710.2510.8312.5818.5015.041961-01-01119611
1961-02-0114.2515.129.045.8812.087.1710.173.636.505.509.178.001961-02-01219611
1961-03-0112.6713.1311.796.429.798.5410.2513.29NaN12.2120.62NaN1961-03-01319611
1961-04-018.386.348.336.759.339.5411.678.2111.216.4611.967.171961-04-01419611
1961-05-0115.8713.8815.379.7913.4610.179.9614.049.759.9218.6311.121961-05-01519611
1961-06-0115.929.5912.048.7911.546.049.758.299.3310.3410.6712.121961-06-01619611
1961-07-017.216.837.714.428.464.796.716.005.797.966.968.711961-07-01719611
1961-08-019.595.095.544.638.295.254.215.255.375.418.389.081961-08-01819611
1961-09-015.581.134.963.044.252.254.632.713.676.004.795.411961-09-01919611
1961-10-0114.2512.877.878.0013.007.755.839.007.085.2911.794.041961-10-011019611
1961-11-0113.2113.1314.338.5412.1710.2113.0812.1710.9213.5420.1720.041961-11-011119611
1961-12-019.677.758.003.966.002.757.252.505.585.587.7911.171961-12-011219611
1962-01-019.293.4211.543.502.211.9610.412.793.545.174.387.921962-01-01119621
1962-02-0119.1213.9612.2110.5815.7110.6315.7111.0813.1712.6217.6722.711962-02-01219621
1962-03-018.214.839.004.836.002.217.961.874.083.924.085.411962-03-01319621
1962-04-0114.3312.2511.8710.3714.9211.0019.7911.6714.0915.4616.6223.581962-04-01419621
1962-05-019.629.543.583.338.753.752.252.581.672.377.293.251962-05-01519621
1962-06-015.886.298.675.215.004.255.915.414.799.255.2510.711962-06-01619621
1962-07-018.674.176.926.718.175.6611.179.388.7511.1210.2517.081962-07-01719621
1962-08-014.585.376.042.297.873.714.462.584.004.797.217.461962-08-01819621
1962-09-0110.0012.0810.969.259.297.627.418.757.679.6214.5811.921962-09-01919621
1962-10-0114.587.8319.2110.0811.548.3813.2910.638.2112.9218.0518.121962-10-011019621
1962-11-01  16.88  13.25  16.00   8.96  13.46  11.46  10.46  10.17  10.37  13.21  14.83  15.16  1962-11-01  11  1962  1
1962-12-01  18.38  15.41  11.75   6.79  12.21   8.04   8.42  10.83   5.66   9.08  11.50  11.50  1962-12-01  12  1962  1
1963-01-01  15.59  13.62  19.79   8.38  12.25  10.00  23.45  15.71  13.59  14.37  17.58  34.13  1963-01-01   1  1963  1
1963-02-01  15.41   7.62  24.67  11.42   9.21   8.17  14.04   7.54   7.54  10.08  10.17  17.67  1963-02-01   2  1963  1
1963-03-01  16.75  19.67  17.67   8.87  19.08  15.37  16.21  14.29  11.29   9.21  19.92  19.79  1963-03-01   3  1963  1
1963-04-01  10.54   9.59  12.46   7.33   9.46   9.59  11.79  11.87   9.79  10.71  13.37  18.21  1963-04-01   4  1963  1
1963-05-01  18.79  14.17  13.59  11.63  14.17  11.96  14.46  12.46  12.87  13.96  15.29  21.62  1963-05-01   5  1963  1
1963-06-01  13.37   6.87  12.00   8.50  10.04   9.42  10.92  12.96  11.79  11.04  10.92  13.67  1963-06-01   6  1963  1
...
1976-07-01   8.50   1.75   6.58   2.13   2.75   2.21   5.37   2.04   5.88   4.50   4.96  10.63  1976-07-01   7  1976  1
1976-08-01  13.00   8.38   8.63   5.83  12.92   8.25  13.00   9.42  10.58  11.34  14.21  20.25  1976-08-01   8  1976  1
1976-09-01  11.87  11.00   7.38   6.87   7.75   8.33  10.34   6.46  10.17   9.29  12.75  19.55  1976-09-01   9  1976  1
1976-10-01  10.96   6.71  10.41   4.63   7.58   5.04   5.04   5.54   6.50   3.92   6.79   5.00  1976-10-01  10  1976  1
1976-11-01  13.96  15.67  10.29   6.46  12.79   9.08  10.00   9.67  10.21  11.63  23.09  21.96  1976-11-01  11  1976  1
1976-12-01  13.46  16.42   9.21   4.54  10.75   8.67  10.88   4.83   8.79   5.91   8.83  13.67  1976-12-01  12  1976  1
1977-01-01  20.04  11.92  20.25   9.13   9.29   8.04  10.75   5.88   9.00   9.00  14.88  25.70  1977-01-01   1  1977  1
1977-02-01  11.83   9.71  11.00   4.25   8.58   8.71   6.17   5.66   8.29   7.58  11.71  16.50  1977-02-01   2  1977  1
1977-03-01   8.63  14.83  10.29   3.75   6.63   8.79   5.00   8.12   7.87   6.42  13.54  13.67  1977-03-01   3  1977  1
1977-04-01  21.67  16.00  17.33  13.59  20.83  15.96  25.62  17.62  19.41  20.67  24.37  30.09  1977-04-01   4  1977  1
1977-05-01   6.42   7.12   8.67   3.58   4.58   4.00   6.75   6.13   3.33   4.50  19.21  12.38  1977-05-01   5  1977  1
1977-06-01   7.08   5.25   9.71   2.83   2.21   3.50   5.29   1.42   2.00   0.92   5.21   5.63  1977-06-01   6  1977  1
1977-07-01  15.41  16.29  17.08   6.25  11.83  11.83  12.29  10.58  10.41   7.21  17.37   7.83  1977-07-01   7  1977  1
1977-08-01   4.33   2.96   4.42   2.33   0.96   1.08   4.96   1.87   2.33   2.04  10.50   9.83  1977-08-01   8  1977  1
1977-09-01  17.37  16.33  16.83   8.58  14.46  11.83  15.09  13.92  13.29  13.88  23.29  25.17  1977-09-01   9  1977  1
1977-10-01  16.75  15.34  12.25   9.42  16.38  11.38  18.50  13.92  14.09  14.46  22.34  29.67  1977-10-01  10  1977  1
1977-11-01  16.71  11.54  12.17   4.17   8.54   7.17  11.12   6.46   8.25   6.21  11.04  15.63  1977-11-01  11  1977  1
1977-12-01  13.37  10.92  12.42   2.37   5.79   6.13   8.96   7.38   6.29   5.71   8.54  12.42  1977-12-01  12  1977  1
1978-01-01   8.33   7.12   7.71   3.54   8.50   7.50  14.71  10.00  11.83  10.00  15.09  20.46  1978-01-01   1  1978  1
1978-02-01  27.25  24.21  18.16  17.46  27.54  18.05  20.96  25.04  20.04  17.50  27.71  21.12  1978-02-01   2  1978  1
1978-03-01  15.04   6.21  16.04   7.87   6.42   6.67  12.29   8.00  10.58   9.33   5.41  17.00  1978-03-01   3  1978  1
1978-04-01   3.42   7.58   2.71   1.38   3.46   2.08   2.67   4.75   4.83   1.67   7.33  13.67  1978-04-01   4  1978  1
1978-05-01  10.54  12.21   9.08   5.29  11.00  10.08  11.17  13.75  11.87  11.79  12.87  27.16  1978-05-01   5  1978  1
1978-06-01  10.37  11.42   6.46   6.04  11.25   7.50   6.46   5.96   7.79   5.46   5.50  10.41  1978-06-01   6  1978  1
1978-07-01  12.46  10.63  11.17   6.75  12.92   9.04  12.42   9.62  12.08   8.04  14.04  16.17  1978-07-01   7  1978  1
1978-08-01  19.33  15.09  20.17   8.83  12.62  10.41   9.33  12.33   9.50   9.92  15.75  18.00  1978-08-01   8  1978  1
1978-09-01   8.42   6.13   9.87   5.25   3.21   5.71   7.25   3.50   7.33   6.50   7.62  15.96  1978-09-01   9  1978  1
1978-10-01   9.50   6.83  10.50   3.88   6.13   4.58   4.21   6.50   6.38   6.54  10.63  14.09  1978-10-01  10  1978  1
1978-11-01  13.59  16.75  11.25   7.08  11.04   8.33   8.17  11.29  10.75  11.25  23.13  25.00  1978-11-01  11  1978  1
1978-12-01  21.29  16.29  24.04  12.79  18.21  19.29  21.54  17.21  16.71  17.83  17.75  25.70  1978-12-01  12  1978  1

216 rows × 16 columns

Exercise 7 - Visualization

Exploring the Titanic disaster data

Step 1 Import the necessary libraries

In [304]:

# Run the following code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

%matplotlib inline

Step 2 Import the data from the following path

In [36]:

# Run the following code
path7 = '../input/pandas_exercise/pandas_exercise/exercise_data/train.csv'  # train.csv

Step 3 Load the data into a DataFrame named titanic

In [306]:

# Run the following code
titanic = pd.read_csv(path7)
titanic.head()

Out[306]:

   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S
4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN    S

Step 4 Set PassengerId as the index

In [307]:

# Run the following code
titanic.set_index('PassengerId').head()

Out[307]:

             Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
PassengerId
1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S
5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN    S
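
Note that `set_index` returns a new DataFrame, so the call above only displays the re-indexed data and leaves `titanic` itself unchanged. The following is a minimal sketch of that behavior; the three-row `demo` frame is made up for illustration and is not taken from train.csv.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the Titanic data
demo = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Survived": [0, 1, 1],
})

# set_index returns a new object; demo keeps its default RangeIndex
indexed = demo.set_index("PassengerId")
print(demo.index.name)     # None
print(indexed.index.name)  # PassengerId

# Assign the result back if the new index should persist
demo = demo.set_index("PassengerId")
print(demo.loc[2, "Survived"])  # label-based lookup by PassengerId
```

If later steps should look rows up by PassengerId, assigning the result back (as in the last lines) is one way to keep the new index.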

Step 5 Draw a pie chart showing the proportion of male and female passengers

In [308]:

# Run the following code
# count the male and female passengers
males = (titanic['Sex'] == 'male').sum()
females = (titanic['Sex'] == 'female').sum()

# put them into a list called proportions
proportions = [males, females]

# Create a pie chart
plt.pie(
    # using proportions
    proportions,
    
    # with one label per slice
    labels = ['Males', 'Females'],
    
    # with no shadows
    shadow = False,
    
    # with colors
    colors = ['blue','red'],
    
    # with the first slice (Males) exploded out
    explode = (0.15 , 0),
    
    # with the start angle at 90 degrees
    startangle = 90,
    
    # with each slice labeled by its percentage, to one decimal place
    autopct = '%1.1f%%'
    )

# Use equal axis scaling so the pie is drawn as a circle
plt.axis('equal')

# Set the title
plt.title("Sex Proportion")

# View the plot
plt.tight_layout()
plt.show()
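
The two counts passed to `plt.pie` can also be obtained in a single call with `value_counts`, which may be handy when a column has more than two categories. This is a small sketch; the `sex` Series is a made-up sample standing in for `titanic['Sex']`.

```python
import pandas as pd

# Hypothetical sample standing in for titanic['Sex']
sex = pd.Series(["male", "female", "male", "male", "female"], name="Sex")

# Counts per category, sorted from most to least frequent
counts = sex.value_counts()

# Fractions per category (they sum to 1)
shares = sex.value_counts(normalize=True)

print(counts)
print(shares)
```

With this approach, `counts.values` and `counts.index` could be passed to `plt.pie` as the sizes and labels.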

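Step 6 Draw a scatter plot of Fare against passenger Age, colored by Sex

In [309]:

# Run the following code
# create the scatter plot with seaborn's lmplot, with the regression line turned off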
lm = sns.lmplot(x = 'Age', y = 'Fare', data = titanic, hue = 'Sex', fit_reg=False)

# set title
lm.set(title = 'Fare x Age')

# get the axes object and tweak it
axes = lm.axes
axes[0,0].set_ylim(-5,)
axes[0,0].set_xlim(-5,85)

Out[309]:

(-5, 85)

Step 7 How many people survived?

In [310]:

# Run the following code
titanic.Survived.sum()

Out[310]:

342
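
Because `Survived` is coded as 0 or 1, its sum is the number of survivors and its mean is the survival rate. The sketch below illustrates this on a made-up four-row frame (`demo` is hypothetical, not the real data) and also breaks the rate down by `Sex`.

```python
import pandas as pd

# Hypothetical four-row frame with the same column names as the Titanic data
demo = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Survived": [0, 1, 1, 0],
})

# Sum of a 0/1 column counts the ones
n_survived = demo["Survived"].sum()

# Mean of a 0/1 column is the share of ones
rate = demo["Survived"].mean()

# Survival rate within each group
by_sex = demo.groupby("Sex")["Survived"].mean()

print(n_survived, rate)
print(by_sex)
```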

Step 8 Draw a histogram of ticket fares

In [311]:

# Run the following code
# sort the fares from highest to lowest (the order does not affect the histogram)
df = titanic.Fare.sort_values(ascending = False)

# create the bin edges with numpy: 0, 10, ..., 590
binsVal = np.arange(0,600,10)

# create the plot
plt.hist(df, bins = binsVal)

# Set the title and labels
plt.xlabel('Fare')
plt.ylabel('Frequency')
plt.title('Fare Paid Histogram')

# show the plot
plt.show()
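
To inspect the numbers behind a histogram without drawing it, `np.histogram` takes the same kind of bin edges and returns the per-bin counts. A minimal sketch follows: the first five fares come from the `head()` output above, while the sixth value (120.0) and the shorter 0 to 90 bin range are assumptions added to show that values outside the bin range are left out of the counts.

```python
import numpy as np

# First five fares from the head() output, plus one hypothetical
# out-of-range value (120.0) added for illustration
fares = np.array([7.25, 71.2833, 7.925, 53.1, 8.05, 120.0])

# Bin edges 0, 10, ..., 90 give nine bins of width 10
bins = np.arange(0, 100, 10)

# counts[i] is the number of fares in [edges[i], edges[i+1])
# (the last bin also includes its right edge)
counts, edges = np.histogram(fares, bins=bins)

print(counts)
print(counts.sum(), "of", len(fares), "fares fall inside the bin range")
```

The same caveat applies to `plt.hist`: any fare above the last bin edge would not appear in the plot, so it can be worth checking the maximum fare against the chosen range.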


Please credit the source when reposting. Article link: https://www.uj5u.com/houduan/300248.html
