Replacing the colors in an image with the closest palette colors using numpy

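I have a list of colors and a function closest_color(pixel, colors) that compares a given pixel's RGB value against my color list and returns the closest color in the list.

I need to apply this function to an entire image. When I apply it pixel by pixel (with two nested for loops), it is slow. Is there a better way to do this with numpy?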

A uj5u.com user replied:

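The task is to turn the picture into a palettized version of itself. You define a palette, and then for each pixel you need to find the nearest-neighbor match for that pixel's color in the palette. The lookup gives you an index, which you can then convert into that pixel's palette color.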
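The last step described above, turning an index map back into colors, can be sketched with numpy integer-array indexing. The palette and the 2x2 index map below are made-up values for illustration; they are not taken from the original answer.

```python
import numpy as np

# Hypothetical 3-color RGB palette and a 2x2 map of palette indices.
palette = np.array([[0, 0, 0], [255, 255, 255], [0, 0, 255]], dtype=np.uint8)
indices = np.array([[0, 1],
                    [2, 1]])

# Indexing the palette with an integer array replaces each index with
# the corresponding palette row: shape (2, 2) becomes (2, 2, 3).
paletted = palette[indices]
print(paletted.shape)
```

Whichever method produces the index map (FLANN, argmin, or a hand-written loop), this one indexing operation converts it into the final image.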

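This can be done with FLANN, and it does not take much code. On my old computer the lookup takes two seconds.

The advantage of this approach is that it handles "large" palettes (more than a handful of colors) without needing a lot of memory.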

Full notebook: Replacing the colors in an image with the closest palette colors using numpy

This is not the fastest solution. I can imagine something that needs neither a FLANN index structure nor a lot of memory. See my other answer, which uses numba.
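One way such a low-memory approach might look in plain numpy is to loop over the palette entries and keep a running best distance and best index per pixel. This is a sketch under my own assumptions, not the author's code: the function name nearest_palette_index is hypothetical, and the image is assumed to be an (H, W, 3) integer array.

```python
import numpy as np

def nearest_palette_index(im, palette):
    """Per pixel, return the index of the nearest palette color (squared Euclidean distance)."""
    im = im.astype(np.int64)                        # avoid uint8 wrap-around on subtraction
    palette = np.asarray(palette, dtype=np.int64)   # (K, 3)
    best_dist = np.full(im.shape[:2], np.iinfo(np.int64).max, dtype=np.int64)
    best_index = np.zeros(im.shape[:2], dtype=np.intp)
    for k, color in enumerate(palette):
        dist = ((im - color) ** 2).sum(axis=2)      # (H, W) distances to color k
        closer = dist < best_dist                   # pixels where color k beats the best so far
        best_dist[closer] = dist[closer]
        best_index[closer] = k
    return best_index

# Tiny made-up image and palette for illustration.
im = np.array([[[10, 10, 10], [250, 250, 250]],
               [[0, 0, 200], [5, 5, 120]]], dtype=np.uint8)
palette = [[0, 0, 0], [255, 255, 255], [0, 0, 255]]
print(nearest_palette_index(im, palette))
```

The temporaries here are only a few (H, W) or (H, W, 3) arrays, so memory does not grow with the palette size, at the cost of a Python-level loop over the palette.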

A uj5u.com user replied:

This is not as fast as I expected. It uses np.argmin as an index into a pre-built color container.

import numpy as np
from PIL import Image
import requests

# get some image
im = Image.open(requests.get("https://upload.wikimedia.org/wikipedia/commons/thumb/7/77/Big_Nature_(155420955).jpeg/800px-Big_Nature_(155420955).jpeg", stream=True).raw)
newsize = (1000, 1000)
im = im.resize(newsize)
# im.show()
im = np.asarray(im)

# Ignore the setup above.
# Reshape the image to (1000, 1000, 1, 3). The singleton axis makes it easy
# to subtract the image from the color container.
image = im.reshape(im.shape[0], im.shape[1], 1, 3)



# test colors
colors = [[0,0,0],[255,255,255],[0,0,255]]

# Create color container 
## It has same dimensions as image (1000,1000,number of colors,3)
colors_container = np.ones(shape=[image.shape[0],image.shape[1],len(colors),3])
for i,color in enumerate(colors):
    colors_container[:,:,i,:] = color



def closest(image, color_container):
    shape = image.shape[:2]
    total_shape = shape[0] * shape[1]
    num_colors = color_container.shape[2]

    # Euclidean distance from every pixel to every color.
    # Resulting shape: (x, y, number of colors)
    distances = np.sqrt(np.sum((color_container - image) ** 2, axis=3))

    # Index of the nearest color for each pixel, i.e. the position along
    # the third axis of color_container. Flattened from (x, y) to (x*y,).
    min_index = np.argmin(distances, axis=2).reshape(-1)
    # Flat pixel positions, paired with min_index below.
    natural_index = np.arange(total_shape)

    # Flatten the container to (x*y, number of colors, 3) for easy indexing.
    # Use the parameter here, not the global colors_container.
    reshaped_container = color_container.reshape(-1, num_colors, 3)

    # For each pixel position, pick the color at its nearest-color index.
    color_view = reshaped_container[natural_index, min_index].reshape(shape[0], shape[1], 3)
    return color_view

# NOTE: do not make both arrays uint8, or the subtraction wraps around.
# Here the color container is float64, so the difference is computed in float.
result_image = closest(image,colors_container)

Image.fromarray(result_image.astype(np.uint8)).show()
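As a possible simplification, numpy broadcasting can stand in for the pre-built color container: a (K, 3) palette broadcasts against an (H, W, 1, 3) image directly. The sketch below is an assumption-laden variant, not the answer's own code; the name closest_broadcast is hypothetical. It also skips the square root, since argmin gives the same result on squared distances.

```python
import numpy as np

def closest_broadcast(im, colors):
    """Map each pixel of an (H, W, 3) image to its nearest color in `colors`."""
    palette = np.asarray(colors, dtype=np.float32)         # (K, 3)
    pixels = im.astype(np.float32)[:, :, np.newaxis, :]    # (H, W, 1, 3)
    sqdist = ((pixels - palette) ** 2).sum(axis=3)         # (H, W, K), broadcast over K
    index = np.argmin(sqdist, axis=2)                      # (H, W) nearest-color indices
    return palette[index].astype(np.uint8)                 # (H, W, 3)

# Tiny made-up image; the colors match the test colors used above.
im = np.array([[[10, 10, 10], [250, 250, 250]],
               [[0, 0, 200], [5, 5, 120]]], dtype=np.uint8)
colors = [[0, 0, 0], [255, 255, 255], [0, 0, 255]]
result = closest_broadcast(im, colors)
print(result[1, 0])
```

The intermediate difference array still has H * W * K * 3 elements, so this saves the explicit container and the float64 storage but does not remove the memory cost for large palettes.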

A uj5u.com user replied:

Here are two variants that use numba, a JIT compiler for Python code.

import numpy as np
from numba import njit, prange

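The first variant uses more numpy primitives (np.argmin) and therefore "more" memory. Perhaps that little bit of memory makes a difference, or perhaps numba calls the numpy routines as they are and cannot optimize them.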

@njit(parallel=True)
def lookup1(palette, im):
    palette = palette.astype(np.int32)
    (rows,cols) = im.shape[:2]
    result = np.zeros((rows, cols), dtype=np.uint8)
    
    for i in prange(rows):
        for j in range(cols):
            sqdists = ((im[i,j] - palette) ** 2).sum(axis=1)
            index = np.argmin(sqdists)
            result[i,j] = index

    return result

I get about 180-190 ms per run with lena.jpg and a 125-color palette.

The second variant uses more hand-written code in place of most numpy primitives, which makes it faster.

@njit(parallel=True)
def lookup2(palette, im):
    (rows,cols) = im.shape[:2]
    result = np.zeros((rows, cols), dtype=np.uint8)
    
    for i in prange(rows): # parallelize over this
        for j in range(cols):
            pb,pg,pr = im[i,j] # take pixel apart
            bestindex = -1
            bestdist = 2**20
            for index in range(len(palette)):
                cb,cg,cr = palette[index] # take palette color apart
                dist = (pb-cb)**2 + (pg-cg)**2 + (pr-cr)**2
                if dist < bestdist:
                    bestdist = dist
                    bestindex = index
            
            result[i,j] = bestindex
    
    return result

30 ms per run!

I think this is within an order of magnitude of the theoretical maximum. I derive that from the math operations required.

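Per palette entry: A = 10 ops (3 subtractions, 3 squares, 3 additions, 1 comparison)
Per pixel: B = 1375 ops, from len(palette) * (A + 1), the extra 1 being an index increment
Per row: C = 704512 ops, from ncols * (B + 1), the extra 1 being an index increment
Per image: D = 360710656 ops, from nrows * (C + 1), the extra 1 being an index increment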

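So at 30 ms per run, my ancient hyperthreaded quad-core delivers about 12000 MIPS (I won't say flop/s, because there is no floating point involved). That works out to nearly one instruction per cycle. I'm sure the code is missing some SIMD vectorization... One could investigate what LLVM makes of these loops, but I won't bother with that now.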
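The estimate above can be reproduced with a few lines of arithmetic. The 125-color palette, the 10 ops per palette entry, and the 30 ms runtime come from the text; the 512 x 512 image size is my assumption, chosen because it matches the per-row and per-image totals quoted.

```python
palette_size = 125        # from the text
rows = cols = 512         # assumed image size; consistent with C = 704512
runtime_s = 0.030         # 30 ms per run, from the text

A = 10                        # ops per palette entry
B = palette_size * (A + 1)    # ops per pixel, +1 for the index increment
C = cols * (B + 1)            # ops per row
D = rows * (C + 1)            # ops per image

mips = D / runtime_s / 1e6    # millions of operations per second
print(B, C, D, round(mips))
```

With these inputs the totals agree with the figures in the text, and the throughput comes out at roughly 12000 million operations per second.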

Writing some of this code in cython might address that, because it lets you constrain the variable types further.
