Python Code Optimization Tips and Tricks

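1-Profile your code
    - 1.1. Use the timeit module
    - 1.2. Use cProfile for more advanced profiling
        - 1.2.1. How to read cProfile results
2-Use generators and sort with keys
3-Optimize your loops
    - 3.1. Optimizing for loops in Python
4-Use hashing
5-Avoid global variables
6-Use external packages or libraries
7-Use built-in operators
8-Limit method calls inside loops
9-String optimization
10-Optimize if statements
11-Use decorators for caching
12-Use "while 1" for infinite loops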

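Python is a powerful programming language, and there is a lot we can do to make our code lighter and faster. This goes beyond features such as multiprocessing, and most of it is easy to put into practice. Below are some of the best Python code optimization tips and tricks.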

1-Profile your code

Optimizing code without knowing where its bottlenecks are is naive. So first, profile your code using either of the two methods below.

1.1. Use the timeit module

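The traditional way to profile is with Python's timeit module, which records how long a snippet of code takes to run. timeit.timeit() returns the total time in seconds for all runs of the statement (1,000,000 runs by default).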

import timeit

subStrings = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']


def simpleString(subStrings):
    finalString = ''
    for part in subStrings:
        finalString += part
    return finalString


def formatString(subStrings):
    finalString = "%s%s%s%s%s%s%s" % (subStrings[0], subStrings[1],
                                      subStrings[2], subStrings[3],
                                      subStrings[4], subStrings[5],
                                      subStrings[6])
    return finalString


def joinString(subStrings):
    return ''.join(subStrings)


print('joinString() Time   : ' + str(
    timeit.timeit('joinString(subStrings)', setup='from __main__ import joinString, subStrings')))
print('formatString() Time : ' + str(
    timeit.timeit('formatString(subStrings)', setup='from __main__ import formatString, subStrings')))
print('simpleString() Time : ' + str(
    timeit.timeit('simpleString(subStrings)', setup='from __main__ import simpleString, subStrings')))

Results:

joinString() Time   : 0.223614629
formatString() Time : 0.49615162100000004
simpleString() Time : 0.47305408300000007

The example above shows that the join method is more efficient than the other approaches.
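A single timeit figure can be noisy. As a minimal sketch of one common way to reduce that noise, timeit.repeat can run the measurement several times so that the smallest total can be taken. The helper name join_parts and the run counts below are illustrative choices, not part of the original benchmark.

```python
import timeit

parts = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']


def join_parts(items):
    """Concatenate the items with str.join (hypothetical helper for this sketch)."""
    return ''.join(items)


# timeit accepts a callable; number is how many calls make up one measurement.
# repeat returns one total time (in seconds) per measurement round.
timings = timeit.repeat(lambda: join_parts(parts), number=10_000, repeat=5)
best = min(timings)
print('best of 5 rounds: %.6f s for 10,000 calls' % best)
```

The minimum is usually the most stable figure because the slower rounds tend to reflect interference from other processes rather than the code itself.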

1.2. Use cProfile for more advanced profiling

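cProfile has been part of the Python standard library since Python 2.5 and provides a good set of profiling features. You can hook it into your code in several ways: wrap a function call in its run method to measure performance, or run an entire script from the command line with cProfile enabled through Python's "-m" option.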

import cProfile


def add():
    """
    cProfile.run() takes the statement as a string (single or double quotes);
    the string can also be a bare expression such as '10 + 10'.
    """
    return 10 + 10

cProfile.run('add()')

Results:

         3 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

You can also run it from a terminal: python -m cProfile -s cumtime xxx.py

1.2.1. How to read cProfile results

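Finding what is actually making the code slow is the whole point of the output, so you need to know the key fields of a cProfile report before drawing conclusions:
1. ncalls: the number of times the function was called;
2. tottime: the total time spent in the function itself, excluding calls to sub-functions;
3. percall: (the first percall) tottime divided by ncalls;
4. cumtime: the time spent in the function and all the sub-functions it calls, from entry to return;
5. percall: (the second percall) the average time per call, equal to cumtime divided by the number of primitive calls;
6. filename:lineno(function): where each function is defined.
The report lets you pinpoint the cause. tottime and cumtime matter most, and ncalls is sometimes useful too; the remaining fields take some practice to interpret.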
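The fields above can also be sorted and filtered from code rather than from the command line. The following is a minimal sketch using cProfile.Profile together with the standard pstats module; slow_sum is a hypothetical workload invented for this illustration.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares using a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report into a string, sorted by cumulative time, top 5 rows only.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('cumulative').print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that dominate the overall run at the top, which is usually where to start looking.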

2-Use generators and sort with keys

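Generators are an excellent tool for memory optimization. A generator function returns an iterator that produces one result at a time instead of returning all results at once. A good example is generating a large quantity of numbers and summing them.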
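The "generate many numbers and sum them" case might look like the sketch below. The function name squares and the size 100,000 are assumptions made for illustration; the exact byte counts printed depend on the Python build.

```python
import sys


def squares(limit):
    """Yield the squares 0, 1, 4, ... one at a time (hypothetical helper)."""
    for i in range(limit):
        yield i * i


# The list holds every value at once; the generator holds only its current state.
as_list = [i * i for i in range(100_000)]
as_gen = squares(100_000)

print('list size in bytes     :', sys.getsizeof(as_list))
print('generator size in bytes:', sys.getsizeof(as_gen))

# sum() pulls values from the generator one by one.
total = sum(as_gen)
print('total:', total)
```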
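Likewise, when sorting the items in a list, use keys and the default sort() method wherever possible. In the example below, we sort the list by the index selected through the key argument.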

import operator

test = [(11, 52, 83), (61, 20, 40), (93, 72, 51)]
print("Before sorting:", test)

test.sort(key=operator.itemgetter(0))
print("After sorting[1]: ", test)

test.sort(key=operator.itemgetter(1))
print("After sorting[2]: ", test)

test.sort(key=operator.itemgetter(2))
print("After sorting[3]: ", test)

Results:

Before sorting: [(11, 52, 83), (61, 20, 40), (93, 72, 51)]
After sorting[1]:  [(11, 52, 83), (61, 20, 40), (93, 72, 51)]
After sorting[2]:  [(61, 20, 40), (11, 52, 83), (93, 72, 51)]
After sorting[3]:  [(61, 20, 40), (93, 72, 51), (11, 52, 83)]

3-Optimize your loops

Most programming languages stress the need to optimize loops, and Python does give us ways to make loops run faster.

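You may like using loops, but they come at a cost. The Python interpreter spends considerable effort interpreting the for loop construct, so it is often better to replace it with built-in functions such as map.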

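Beyond that, how far you can optimize depends on how well you know Python's built-in features. The following example walks through different approaches to optimizing a loop.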

3.1. Optimizing for loops in Python

import timeit
import itertools

Zipcodes = ['121212','232323','434334']
newZipcodes = ['  131313 ',' 242424   ',' 212121 ','  323232','342312  ',' 565656 ']

def updateZips(newZipcodes, Zipcodes):
    """
    Example-1: the baseline, a for loop that strips whitespace from each entry in newZipcodes
    :param newZipcodes:
    :param Zipcodes:
    :return:
    """
    for zipcode in newZipcodes:
        Zipcodes.append(zipcode.strip())

def updateZipsWithMap(newZipcodes, Zipcodes):
    """
    Example-2: the same work collapsed into one line with a map object, to see how much is gained
    :param newZipcodes:
    :param Zipcodes:
    :return:
    """
    Zipcodes += map(str.strip, newZipcodes)

def updateZipsWithListCom(newZipcodes, Zipcodes):
    """
    Example-3: a list comprehension
    :param newZipcodes:
    :param Zipcodes:
    :return:
    """
    Zipcodes += [zipcode.strip() for zipcode in newZipcodes]

def updateZipsWithGenExp(newZipcodes, Zipcodes):
    """
    Example-4: finally, the for loop converted to a generator expression
    :param newZipcodes:
    :param Zipcodes:
    :return:
    """
    # Note: chain() is lazy, so no strip() runs until the returned iterator is consumed.
    return itertools.chain(Zipcodes, (zipcode.strip() for zipcode in newZipcodes))



print('updateZips() Time            : ' + str(timeit.timeit('updateZips(newZipcodes, Zipcodes)', setup='from __main__ import updateZips, newZipcodes, Zipcodes')))

Zipcodes = ['121212','232323','434334']
print('updateZipsWithMap() Time     : ' + str(timeit.timeit('updateZipsWithMap(newZipcodes, Zipcodes)', setup='from __main__ import updateZipsWithMap, newZipcodes, Zipcodes')))

Zipcodes = ['121212','232323','434334']
print('updateZipsWithListCom() Time : ' + str(timeit.timeit('updateZipsWithListCom(newZipcodes, Zipcodes)', setup='from __main__ import updateZipsWithListCom, newZipcodes, Zipcodes')))

Zipcodes = ['121212','232323','434334']
print('updateZipsWithGenExp() Time  : ' + str(timeit.timeit('updateZipsWithGenExp(newZipcodes, Zipcodes)', setup='from __main__ import updateZipsWithGenExp, newZipcodes, Zipcodes')))

updateZips() Time            : 1.043096744
updateZipsWithMap() Time     : 0.7813633710000001
updateZipsWithListCom() Time : 0.9924229019999999
updateZipsWithGenExp() Time  : 0.5045337760000002

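In this benchmark the generator-expression version reports the lowest time, but note that updateZipsWithGenExp only builds a lazy itertools.chain object: no strip() call runs until the result is iterated, so its figure is not directly comparable with the other three. Among the versions that actually update Zipcodes, map is fastest here, followed by the list comprehension and the plain for loop. The four examples are timed together so you can compare the gain from each approach.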
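When reading the generator-expression timing, keep in mind that itertools.chain is lazy. The sketch below makes that visible by recording each call; strip_and_record and the sample data are hypothetical names introduced only for this illustration.

```python
import itertools

raw = ['  131313 ', ' 242424   ', ' 212121 ']
existing = ['121212']
calls = []


def strip_and_record(text):
    """Strip whitespace and record that the work actually happened."""
    calls.append(text)
    return text.strip()


lazy = itertools.chain(existing, (strip_and_record(z) for z in raw))
calls_before = len(calls)   # nothing has been stripped yet

merged = list(lazy)         # consuming the iterator does the real work
calls_after = len(calls)

print(calls_before, calls_after, merged)
```

A fair comparison with the loop, map, and list-comprehension versions would therefore time the consumption of the iterator as well as its creation.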

4-Use hashing

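Python uses hash tables to manage sets. Whenever we add an element to a set, the interpreter uses the element's hash value to determine its position in the memory allocated for the set.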

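Because Python resizes the hash table automatically, lookups and insertions stay constant time (O(1)) on average regardless of the set's size. That is what makes set operations fast.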

Set operations in Python include union, intersection, and difference, so use them wherever they fit your code; they are usually faster than iterating over lists.

Syntax            Operation     Description
------            ---------     -----------
set(l1)|set(l2)   Union         Set with all l1 and l2 items.
set(l1)&set(l2)   Intersection  Set with common l1 and l2 items.
set(l1)-set(l2)   Difference    Set with l1 items not in l2.
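The three operations in the table can be tried on two small lists, as in the sketch below. The sample values and the variable names are illustrative assumptions.

```python
l1 = [1, 2, 3, 4]
l2 = [3, 4, 5]

union = set(l1) | set(l2)
common = set(l1) & set(l2)
only_in_l1 = set(l1) - set(l2)

print(sorted(union))       # [1, 2, 3, 4, 5]
print(sorted(common))      # [3, 4]
print(sorted(only_in_l1))  # [1, 2]

# Membership tests: build the set once, then each lookup is a hash probe
# rather than a scan of the whole list.
allowed = set(l1)
hits = [x for x in l2 if x in allowed]
print(hits)                # [3, 4]
```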

5-Avoid global variables

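This is not specific to Python: almost every language discourages excessive or unrestrained use of global variables, because they can cause non-obvious side effects in your code. On top of that, Python is comparatively slow at accessing variables outside the local scope.
Using few global variables is sound design, since it helps you keep track of scope and avoid unnecessary memory use. Python also retrieves local variables faster than global ones.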
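One way to check the local-versus-global claim on your own machine is a small timing comparison like the sketch below. The function names and loop sizes are assumptions; the local version is typically somewhat faster in CPython, but the gap varies by version, so treat the printed figures as indicative only.

```python
import timeit

counter_limit = 1000  # module-level (global) name


def sum_with_global():
    """Reads the global name on every loop-condition check."""
    total = 0
    i = 0
    while i < counter_limit:
        total += i
        i += 1
    return total


def sum_with_local(limit=counter_limit):
    """Same loop, but the limit is bound to a local name."""
    total = 0
    i = 0
    while i < limit:
        total += i
        i += 1
    return total


print('global:', timeit.timeit(sum_with_global, number=2000))
print('local :', timeit.timeit(sum_with_local, number=2000))
```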

6-Use external packages or libraries

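Some Python libraries have a C counterpart with the same features as the original, and code written in C runs faster. For example, in Python 2 try cPickle instead of pickle (in Python 3, pickle uses its C implementation automatically when available). You can also try Cython.
You can also consider PyPy, an alternative Python implementation with a JIT (just-in-time) compiler that can make Python code run very fast. You can even tune it for extra performance.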

7-Use built-in operators

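Python is an interpreted language built on high-level abstractions, so use built-ins wherever possible. Built-ins are precompiled and fast, which makes your code more efficient, whereas lengthy iterations that go through the interpreter step by step are slow.
Likewise, make good use of built-ins such as map, which can speed things up considerably.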
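As a small illustration of the idea, the sketch below places a hand-written loop next to the equivalent built-ins. manual_total is a hypothetical helper written for comparison; sum, max, and map are the standard built-ins.

```python
values = list(range(1, 1001))


def manual_total(numbers):
    """Hand-written accumulation, interpreted one bytecode step at a time."""
    total = 0
    for n in numbers:
        total += n
    return total


# The built-ins run their loops in C inside the interpreter.
builtin_total = sum(values)
largest = max(values)
as_text = list(map(str, values[:3]))

print(manual_total(values), builtin_total, largest, as_text)
```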

8-Limit method calls inside loops

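When operating in a loop, cache the method in a local name instead of looking it up on the object on every iteration; otherwise the repeated method lookups become expensive.
For example: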

>>> for it in range(10000):
...     myLib.findMe(it)

>>> findMe = myLib.findMe
>>> for it in range(10000):
...     findMe(it)
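Since myLib above is only a placeholder, here is a self-contained sketch of the same pattern using list.append. The function names are hypothetical, and recent CPython versions have narrowed the gap between the two forms, so measure before adopting this in real code.

```python
def collect_uppercase(words):
    """Looks up result.append through the attribute on every iteration."""
    result = []
    for word in words:
        result.append(word.upper())
    return result


def collect_uppercase_cached(words):
    """Binds the method to a local name once, before the loop."""
    result = []
    append = result.append   # look up the bound method once
    for word in words:
        append(word.upper())
    return result


sample = ['sun', 'mon', 'tue']
print(collect_uppercase(sample))
print(collect_uppercase_cached(sample))
```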

9-String optimization

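String concatenation is slow, so avoid doing it repeatedly inside a loop. Use Python's join method instead, or use string formatting to build the final string in one step.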

As for regular expressions in Python, they run fast, but in some cases basic string methods such as isalpha()/isdigit()/startswith()/endswith() are the better choice.
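The sketch below pairs two simple checks written with re against the equivalent string methods. The function names and the 'img_' prefix are assumptions for illustration. One caveat: str.isdigit() also accepts some non-ASCII digit characters, so the two digit checks agree only for inputs like the ASCII samples used here.

```python
import re

DIGITS_RE = re.compile(r'[0-9]+')


def is_digits_regex(text):
    """True when the whole string consists of ASCII digits."""
    return DIGITS_RE.fullmatch(text) is not None


def is_digits_method(text):
    """Same check for ASCII input, using the built-in string method."""
    return text.isdigit()


def has_prefix_regex(name):
    return re.match(r'img_', name) is not None


def has_prefix_method(name):
    return name.startswith('img_')


for sample in ['12345', '12a45', '']:
    print(repr(sample), is_digits_regex(sample), is_digits_method(sample))
for sample in ['img_001.png', 'doc_001.txt']:
    print(sample, has_prefix_regex(sample), has_prefix_method(sample))
```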

10-Optimize if statements

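Like most programming languages, Python evaluates boolean conditions lazily (short-circuit evaluation). For an "and" condition, as soon as one operand is false, the remaining operands are not evaluated.
Prefer if x: over if x == True: for comparisons.
if done is not None is faster than if done != None.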
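Short-circuiting and the identity check can both be observed directly, as in the sketch below. cheap_check, expensive_check, and AlwaysEqual are hypothetical names created for this illustration.

```python
calls = []


def cheap_check(value):
    calls.append('cheap')
    return value > 0


def expensive_check(value):
    calls.append('expensive')
    return sum(range(value)) % 2 == 0


# The first operand is False, so the second is never evaluated.
outcome = cheap_check(-5) and expensive_check(-5)
calls_when_first_fails = list(calls)
print(outcome, calls_when_first_fails)


class AlwaysEqual:
    """An object whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
equals_none = (obj == None)    # consults __eq__, which can be overridden
is_none = (obj is None)        # pure identity check, cannot be overridden
print(equals_none, is_none)
```

Putting the cheapest or most selective condition first lets the expensive one be skipped as often as possible, and "is None" avoids both the method call and any surprises from a custom __eq__.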

11-Use decorators for caching

When I used the algorithm below to find the 36th Fibonacci number, fibonacci(36), the computation took about 12 s and 48,315,636 function calls.

import cProfile
import timeit

def fibonacci(n):
  if n == 0: # There is no 0'th number
    return 0
  elif n == 1: # We define the first number as 1
    return 1
  return fibonacci(n - 1) + fibonacci(n-2)
#
# print('fibonacci() Time   : ' + str(
#     timeit.timeit('fibonacci(36)', setup='from __main__ import fibonacci, n')))
cProfile.run('fibonacci(36)')

         48315636 function calls (4 primitive calls) in 12.584 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   12.584   12.584 <string>:1(<module>)
48315633/1   12.584    0.000   12.584   12.584 vv.py:4(fibonacci)
        1    0.000    0.000   12.584   12.584 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Things change once we bring in caching from the standard library, and it takes only a few lines of code.

import cProfile
import functools

@functools.lru_cache(maxsize=128)
def fibonacci(n):
  if n == 0:
    return 0
  elif n == 1:
    return 1
  return fibonacci(n - 1) + fibonacci(n-2)

cProfile.run('fibonacci(100)')

         104 function calls (4 primitive calls) in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
    101/1    0.000    0.000    0.000    0.000 vv.py:6(fibonacci)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Computing the 100th number takes 0.000 s and 104 function calls.
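To see why so few calls are needed, the cache statistics exposed by functools.lru_cache can be inspected. The sketch below uses a shortened function name, fib, which is an assumption of this example rather than the article's code.

```python
import functools


@functools.lru_cache(maxsize=128)
def fib(n):
    """Fibonacci with memoization: each n is computed only once."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


value = fib(100)
info = fib.cache_info()   # named tuple: hits, misses, maxsize, currsize
print(value)
print(info)
```

Each of the 101 distinct arguments (0 through 100) is a cache miss exactly once; every other call is answered from the cache, which matches the 101 fibonacci calls in the profile above.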

12-Use "while 1" for infinite loops

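If you are listening on a socket, you will probably want an infinite loop. The usual way to write one is while True. That works, but in Python 2 you can get the same effect slightly faster with while 1, because the condition is a plain numeric constant. This applies to Python 2 only.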

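The reason is that in Python 2.x True is not a keyword but a built-in global constant of type bool with the value 1, so the interpreter still has to load the value of True at run time. In other words, True can be reassigned: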

Python 2.7 (r27:82508, Jul  3 2010, 21:12:11) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> True = 4
>>> True
4

In Python 3.x:

Python 3.1.2 (r312:79147, Jul 19 2010, 21:03:37) 
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> True = 4
  File "<stdin>", line 1
SyntaxError: assignment to keyword

Please credit the source when reposting. Article link: https://www.uj5u.com/houduan/112865.html
