I have the following problem. I want to draw three box plots from three different datasets. My code:
fig, (ax1, ax2, ax3) = plt.subplots(1, 3)
fig.suptitle(f'Pocet requestu (podobne IP dohromady). 2021{koncovka}')
ax1.boxplot(data_count_G["ip_count"])
ax1.set_xlabel(f"Google bot, n = {len(data_count_G)}")
ax1.set_ylabel("requests from ip")
ax2.boxplot(data_count_S["ip_count"])
ax2.set_xlabel(f"Seznam bot, n = {len(data_count_S)}")
ax2.set_ylabel("requests from ip")
ax3.boxplot(data_count_nGS["ip_count"])
ax3.set_xlabel(f"Bez Google bota, n = {len(data_count_nGS)}")
ax3.set_ylabel("requests from ip")
plt.tight_layout()
plt.savefig('box_request_count_GSnG.png', bbox_inches='tight')
plt.close()
However, the result looks like this:

When I run data_count_nGS.info():
<class 'pandas.core.frame.DataFrame'>
Int64Index: 92774 entries, 0 to 20899956
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   zacatek_ip  92773 non-null  object
 1   ip_count    92773 non-null  float64
dtypes: float64(1), object(1)
memory usage: 2.1 MB
When I run data_count_nGS.describe():
            ip_count
count   92773.000000
mean      209.073351
std      1430.188719
min         1.000000
25%        70.000000
50%       107.000000
75%       194.000000
max    253248.000000
Could the problem be the size of the last dataframe (92774 rows)? How can I solve this?
Answer from a uj5u.com community member:
You probably need to drop the NaN values, for example ax3.boxplot(data_count_nGS["ip_count"].dropna()). You could also try seaborn's boxplot, which drops NaN values automatically.
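To illustrate why a single NaN matters, here is a minimal sketch. It assumes, as far as I know, that Matplotlib derives the box from NumPy percentiles, and it uses a made-up toy Series (`ip_count`) in place of the real `data_count_nGS["ip_count"]` column:

```python
import numpy as np
import pandas as pd

# Hypothetical toy column: four counts plus one missing value,
# standing in for data_count_nGS["ip_count"].
ip_count = pd.Series([70.0, 107.0, 194.0, 300.0, np.nan])

# np.percentile propagates NaN, so every quartile comes out as NaN
# and there is nothing sensible to draw as a box.
q_with_nan = np.percentile(ip_count.to_numpy(), [25, 50, 75])

# After dropna() the quartiles are ordinary numbers again.
clean = ip_count.dropna()
q_clean = np.percentile(clean.to_numpy(), [25, 50, 75])

print("with NaN:   ", q_with_nan)
print("after dropna:", q_clean)
```

In the question, info() reports 92774 entries but only 92773 non-null values, so the column contains exactly one such missing value.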
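When the max is far above the 75th percentile, the box shrinks to a very thin line, because a few outliers lie very far away and stretch the axis. You may want to change the axis limits to get a better view of the main box.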
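The numbers from the question's describe() output make this concrete. The sketch below assumes Matplotlib's default whisker rule (whiskers extend at most 1.5 times the interquartile range beyond the box) and only does the arithmetic; the variable names are my own:

```python
# Quartiles and max copied from the describe() output in the question.
q1, q3, data_max = 70.0, 194.0, 253248.0

# Interquartile range: the height of the box itself.
iqr = q3 - q1

# With the default rule (assumed whis=1.5), the upper whisker cannot
# go beyond this value; everything above it is drawn as an outlier.
upper_fence = q3 + 1.5 * iqr

# Share of the default y-axis taken up by box plus whiskers when the
# axis has to reach all the way up to the largest outlier.
fraction = upper_fence / data_max

print(f"IQR = {iqr}, upper fence = {upper_fence}")
print(f"box and whiskers use about {fraction:.2%} of the axis height")
```

So the box and whiskers sit below roughly 380 while the axis runs to 253248, which is why they collapse into a line.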
The following example code tries to simulate this situation:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
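# Simulate a long-tailed distribution: most values are small, a few are very large
df = pd.DataFrame({"ip_count": np.round(((np.random.rand(100_000) ** 3) + 1) ** 19)})
# Put a NaN in the last row, like the single missing value in the question's data
df.iloc[-1, :] = np.nan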
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(12, 5))
ax1.boxplot(df["ip_count"].dropna())
ax1.set_title('default ylim, full range')
ax2.boxplot(df["ip_count"].dropna())
ax2.set_ylim(np.percentile(df["ip_count"].dropna(), [0, 80]))
ax2.set_title('ylim from 0th to 80th percentile')
sns.boxplot(y=df["ip_count"], ax=ax3)
ax3.set_ylim(np.percentile(df["ip_count"].dropna(), [0, 80]))
ax3.set_title('seaborn with ylim\nfrom 0th to 80th percentile')
plt.tight_layout()
plt.show()
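Besides setting the y limits by hand, two other options may be worth trying; this is only a sketch, not part of the original answer. It uses boxplot's `showfliers=False` argument to hide the outlier markers, and `set_yscale("log")` to compress the long tail. The data is synthetic and heavy-tailed in the spirit of the example above, and the seed and output filename are arbitrary choices:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: we only write a PNG, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic heavy-tailed counts (all values >= 1, so a log axis is safe).
rng = np.random.default_rng(0)
ip_count = np.round((rng.random(100_000) ** 3 + 1) ** 19)

fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(8, 5))

# Option 1: do not draw outliers, so the axis only has to fit the whiskers.
ax1.boxplot(ip_count, showfliers=False)
ax1.set_title("showfliers=False")

# Option 2: keep the outliers but use a logarithmic y axis.
ax2.boxplot(ip_count)
ax2.set_yscale("log")
ax2.set_title("log-scaled y axis")

fig.tight_layout()
fig.savefig("box_alternatives.png", bbox_inches="tight")  # hypothetical filename
plt.close(fig)
```

Hiding the outliers gives the cleanest box but no longer shows that extreme values exist; the log axis keeps them visible at the cost of a less intuitive scale.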

