Avoiding conditional statements in loops

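One section of my f90 program takes up a significant amount of compute time. I am basically looping through three matrices (all the same size, with dimensions as large as 250×250) and making sure the values stay within the interval [-1.0, 1.0]. I know it is considered best practice to avoid conditional statements in loops, but I can't figure out how to rewrite this block of code for optimal performance. Is there a way to "unravel" the loop, or to use some built-in function to "vectorize" the conditional statements?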

do ind2 = 1, size(u_mat,2)
    do ind1 = 1,size(u_mat,1)
        ! Dot product 1 must be bounded between [-1,1]
        if (b1_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b1_dotProd(ind1,ind2) = 1.0_dp
        else if (b1_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b1_dotProd(ind1,ind2) = -1.0_dp
        end if
        ! Dot product 2 must be bounded between [-1,1]
        if (b2_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b2_dotProd(ind1,ind2) = 1.0_dp
        else if (b2_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b2_dotProd(ind1,ind2) = -1.0_dp
        end if
        ! Dot product 3 must be bounded between [-1,1]
        if (b3_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b3_dotProd(ind1,ind2) = 1.0_dp
        else if (b3_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b3_dotProd(ind1,ind2) = -1.0_dp
        end if
    end do
end do

For what it's worth, I am compiling with ifort.

Reply from a uj5u.com user:

You can use the intrinsic min and max functions for this.

Since both are elemental, you can apply them to a whole array at once, as in

b1_dotProd = max(-1.0_dp, min(b1_dotProd, 1.0_dp))

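Some processors have instructions that allow min and max to be implemented without branches, but whether the compiler actually uses them, and whether the result is actually any faster, depends on the compiler's implementation of min and max. At the very least, this version is a lot more concise.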
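As a minimal sketch of how this could be applied to all three arrays from the question, the min/max expression can be wrapped in an elemental function. The module name clamp_mod and the function name clamp_unit are hypothetical, and dp is assumed here to be real64, since the question does not show how dp is defined.

```fortran
module clamp_mod
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    private
    public :: clamp_unit

    ! Assumption: dp is double precision. It is kept private so that it
    ! does not clash with the dp already defined in the calling program.
    integer, parameter :: dp = real64

contains

    ! Clamp a value to the interval [-1, 1].
    ! Being elemental, the function accepts scalars and arrays of any rank.
    elemental function clamp_unit(x) result(res)
        real(dp), intent(in) :: x
        real(dp) :: res
        res = max(-1.0_dp, min(x, 1.0_dp))
    end function clamp_unit

end module clamp_mod
```

With this module in scope, the double loop in the question would reduce to three whole-array assignments of the form b1_dotProd = clamp_unit(b1_dotProd). Whether that beats the explicit loop under ifort still has to be measured.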

Reply from a uj5u.com user:

The answer from @veryreverie is definitely correct, but there are two things to consider.

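A where statement is another sensible choice. Since it is still a conditional selection, the same caveat about

"whether this is actually done and whether it is actually any faster, but it is at least a lot more concise"

still applies.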

An example is:

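    pure function clamp(X) result(res)
        ! Note: default-kind real and rank-1 input, unlike the arrays in the question
        real, intent(in) :: X(:)
        real :: res(size(X))
        where (X < -1.0)
            res = -1.0
        else where (X > 1.0)
            res = 1.0
        else where
            ! Values already inside [-1, 1] are copied unchanged
            res = X
        end where
    end function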
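The function above takes default-kind, rank-1 input, while the arrays in the question are rank-2 and of kind dp. A possible in-place variant for that case is sketched below. The names where_clamp_mod and clamp_inplace are hypothetical, and dp is assumed to be real64.

```fortran
module where_clamp_mod
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    private
    public :: clamp_inplace

    ! Assumption: dp is double precision; kept private to avoid clashing
    ! with the dp already defined in the calling program.
    integer, parameter :: dp = real64

contains

    ! Clamp every element of a rank-2 array to [-1, 1] in place.
    ! Elements already inside the interval are left untouched.
    pure subroutine clamp_inplace(a)
        real(dp), intent(inout) :: a(:,:)
        where (a < -1.0_dp)
            a = -1.0_dp
        elsewhere (a > 1.0_dp)
            a = 1.0_dp
        end where
    end subroutine clamp_inplace

end module where_clamp_mod
```

Working in place avoids the temporary result array, but the where construct is still a conditional selection, so the caveat about branching and speed applies here as well.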

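If you want to normalize to strictly 1 or -1, I would actually consider changing the data type to integer. Then you can use tests such as a == 1 without having to think about floating-point equality issues. Depending on your code, I would also consider the case where the dot product is close to zero. Of course, this point only applies if you are interested in the sign alone.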

    pure function get_sign(X) result(res)
        real, intent(in) :: X(:)
        integer :: res(size(X))
        ! Or use another appropriate choice to test for near-zero values
        where (abs(X) < epsilon(X) * 10.)
            res = 0
        else where (X < 0.0)
            res = -1
        else where (X > 0.0)
            res =  1
        end where
    end function
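If the near-zero threshold needs to be chosen by the caller rather than derived from epsilon, one possible variation is an elemental function with an explicit tolerance. This is only a sketch: the names sign_mod, sign_with_tol and tol are hypothetical, and dp is assumed to be real64.

```fortran
module sign_mod
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    private
    public :: sign_with_tol

    ! Assumption: dp is double precision; kept private to avoid clashing
    ! with the dp already defined in the calling program.
    integer, parameter :: dp = real64

contains

    ! Return -1, 0 or 1. Values with abs(x) <= tol count as zero.
    ! Note: a NaN input fails both comparisons and ends up in the last branch.
    elemental function sign_with_tol(x, tol) result(res)
        real(dp), intent(in) :: x
        real(dp), intent(in) :: tol
        integer :: res
        if (abs(x) <= tol) then
            res = 0
        else if (x < 0.0_dp) then
            res = -1
        else
            res = 1
        end if
    end function sign_with_tol

end module sign_mod
```

Because the result is an integer, exact comparisons are safe. For example, count(sign_with_tol(b1_dotProd, 1.0e-12_dp) == 1) would count the positive entries, where the tolerance value is purely illustrative.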
