Trying to remove items from a list of pd.Series with pop, but it doesn't work properly

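I'm trying to create a class that finds outliers in a dataset using statistical methods such as the interquartile range and the z-score. I'd like to know why some of the empty items are removed from my list of pd.Series while others are not, even though they all satisfy the "empty" condition.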

import numpy as np
import pandas as pd
from scipy import stats

class OutliersDetector:
  def __init__(self, X):
    self.outliers = []
    self.X = X

  def detect_range(self):
    self.reset_outliers()
    # to implement
    self.remove_empty_items()

  def detect_zscore(self):
    self.reset_outliers()
    zscore = np.abs(stats.zscore(self.X))
    threshold_std = 3
    for index, col_name in enumerate(self.X.columns): # X and zscore always have the same shape
      col = zscore[:, index]
      self.outliers.append( pd.Series(col[col >= threshold_std], name=col_name) )
    self.remove_empty_items()

  # none of the if statements i tried worked
  def remove_empty_items(self):
    for index, item in enumerate(self.outliers):
      #if item.size == 0:
      #if len(item.index) == 0:
      if item.empty:
        print("[no outliers] {}".format(item.name))
        self.outliers.pop(index)

  def reset_outliers(self):
    self.outliers = []

  def show_outliers(self):
    for item in self.outliers:
      print("[name]: {}\n[outliers]: {}\n".format(item.name, item.size))

outliers_detector = OutliersDetector(X_train_transformed)
outliers_detector.detect_zscore()
print("\noutliers found: ")
outliers_detector.show_outliers()

Output: Rainfall, Month, Location, and WindDir9am should not be printed under "outliers found", because their size is 0, but...

[no outliers] RainToday
[no outliers] Year
[no outliers] Day
[no outliers] WindGustDir
[no outliers] WindDir3pm
[no outliers] Sunshine
[no outliers] Humidity3pm
[no outliers] Cloud9am

outliers found:
[name]: Rainfall
[outliers]: 0

[name]: Evaporation
[outliers]: 289

[name]: Month
[outliers]: 0

[name]: Location
[outliers]: 0

[name]: WindDir9am
[outliers]: 0

How can I fix this?

Answer from a uj5u.com user:

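In remove_empty_items you modify self.outliers while iterating over it. Each pop shifts the remaining items one position to the left while the loop index keeps advancing, so the item right after every removed one is never checked. That is why some empty Series survive. Your code should build a new list instead of modifying the current one: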

  def remove_empty_items(self):
    non_empty_outliers = []
    for item in self.outliers:
      if item.empty:
        print("[no outliers] {}".format(item.name))
      else:
        non_empty_outliers.append(item)
    self.outliers = non_empty_outliers
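
To see the skipping effect in isolation, here is a minimal sketch that does not need pandas. It assumes plain lists standing in for the pd.Series objects (an empty list plays the role of an empty Series); the function names and the sample data are made up for illustration and are not part of the original code.

```python
def remove_empty_buggy(items):
    """Mimic the question's loop: pop from the list while enumerating it."""
    items = list(items)  # work on a copy so the caller's list is untouched
    for index, item in enumerate(items):
        if len(item) == 0:
            # After pop, the next item slides into `index` and is never visited.
            items.pop(index)
    return items


def remove_empty_fixed(items):
    """Build a new list that keeps only the non-empty items."""
    return [item for item in items if len(item) > 0]


# Hypothetical data: [] stands in for an empty Series.
columns = [[], [], [4.2], [], [], [3.1, 3.7]]

buggy = remove_empty_buggy(columns)
fixed = remove_empty_fixed(columns)

print(buggy)  # [[], [4.2], [], [3.1, 3.7]]  (every second consecutive empty item survives)
print(fixed)  # [[4.2], [3.1, 3.7]]
```

With real Series objects, the same idea could be written as `self.outliers = [s for s in self.outliers if not s.empty]`, which is equivalent to the loop above apart from the print call.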

