在 Haskell：我可以將整數直接讀入陣列嗎？

函式readvandreadv.readInt負責決議整數并將它們保存到二維串列中，大約需要 50-70 MB，而不是大約 16 MB =（10 ^{6 個}整數）×（每個整數 8 個位元組每個鏈接 8 個位元組）。

有沒有希望我可以得到低于 256 MB 的總記憶體？我已經在使用Text包進行輸入。也許我應該完全避免使用串列并直接從標準輸入讀取整數到陣列。我們怎樣才能做到這一點？或者，是其他地方的問題嗎？

代碼

{-# OPTIONS_GHC -O2 #-}
module CF1606D where
import qualified Data.Text as T
import qualified Data.Text.IO as TI
import qualified Data.Text.Read as TR
import Control.Monad
import qualified Data.List as DL
import qualified Data.IntSet as DS
import Control.Monad.ST
import Data.Array.ST.Safe
import Data.Int (Int32)
import Data.Array.Unboxed

solve :: IO ()
solve =  do
  ~[n,m] <- readv      
  -- 2D list
  input <- {-# SCC input #-} replicateM (fromIntegral n) readv     
  let
      ints = [1..]
      sorted = DL.sortOn (head.fst) (zip input ints)
      (rows,indices) = {-# SCC rows_inds #-} unzip sorted    
      -- 2D list converted into matrix:
      matrix = mat (fromIntegral n) (fromIntegral m) rows           
      infinite = 10^7
      asc x y = [x,x 1..y]
      desc x y = [y,y-1..x]    
      -- Four prefix-matrices:
      tlMax = runSTUArray $ prefixMat max 0 asc asc (subtract 1) (subtract 1) =<< matrix
      blMin = runSTUArray $ prefixMat min infinite desc asc ( 1) (subtract 1) =<< matrix
      trMin = runSTUArray $ prefixMat min infinite asc desc (subtract 1) ( 1) =<< matrix
      brMax = runSTUArray $ prefixMat max 0 desc desc ( 1) ( 1) =<< matrix    
      good _ (i,j)
        | tlMax!(i,j) < blMin!(i 1,j) && brMax!(i 1,j 1) < trMin!(i,j 1) = Left (i,j)
        | otherwise = Right ()
      {-# INLINABLE good #-}
      nearAns = foldM good () [(i,j)|i<-[1..n-1],j<-[1..m-1]]
      ans = either (\(i,j)-> "YES\n"    color n (take i indices)    " "    show j) (const "NO") nearAns
  putStrLn ans

type I = Int32
type S s = (STUArray s (Int, Int) I)
type R = Int -> Int -> [Int]
type F = Int -> Int

mat :: Int -> Int -> [[I]] -> ST s (S s)
mat n m rows = newListArray ((1,1),(n,m)) $ concat rows

prefixMat :: (I->I->I) -> I -> R -> R -> F -> F -> S s -> ST s (S s)
prefixMat opt worst ordi ordj previ prevj mat = do
  ((ilo,jlo),(ihi,jhi)) <- getBounds mat
  pre <- newArray ((ilo-1,jlo-1),(ihi 1,jhi 1)) worst
  forM_ (ordi ilo ihi) $ \i-> do
    forM_ (ordj jlo jhi) $ \j -> do
      matij <- readArray mat (i,j)
      prei <- readArray pre (previ i,j)
      prej <- readArray pre (i, prevj j)
      writeArray pre (i,j) (opt (opt prei prej) matij)
  return pre

color :: Int -> [Int] -> String
color n inds = let
  temp = DS.fromList inds
  colors = [if DS.member i temp then 'B' else 'R' | i<-[1..n]]
  in colors

readv :: Integral t => IO [t]
readv = map readInt . T.words <$> TI.getLine where
  readInt = fromIntegral . either (const 0) fst . TR.signed TR.decimal
{-# INLINABLE readv #-}

main :: IO ()
main = do
  ~[n] <- readv
  replicateM_ n solve

上面代碼的快速描述：

讀取n每個具有m整數的行。
按第一個元素對行進行排序。
Now compute four 'prefix matrices', one from each corner. For top-left and bottom-right corners, it's the prefix-maximum, and for the other two corners, it's the prefix-minimum that we need to compute.
Find a cell [i,j] at which these prefix matrices satisfy the following condition: top_left [i,j] < bottom_left [i,j] and top_right [i,j] > bottom_right [i,j]
For rows 1 through i, mark their original indices (i.e. position in the unsorted input matrix) as Blue. Mark the rest as Red.

Sample input and Commands

Sample input: inp3.txt.

Command:

> stack ghc -- -main-is CF1606D.main -with-rtsopts="-s -h -p -P" -rtsopts -prof -fprof-auto CF1606D
> gtime -v ./CF1606D < inp3.txt > outp
    ...
    ...
    MUT     time    2.990s  (  3.744s elapsed)    #    RTS -s output
    GC      time    4.525s  (  6.231s elapsed)    #    RTS -s output
    ...
    ...
    Maximum resident set size (kbytes): 408532    #    >256 MB (gtime output)

> stack exec -- hp2ps -t0.1 -e8in -c CF1606D.hp && open CF1606D.ps

Question about GC: As shown above in the RTS -s output, GC seems to be taking longer than the actual logic execution. Is this normal? Is there a way to visualize the GC activity over time? I tried making matrices strict but that didn't have any impact.

Probably this is not a functional-friendly problem at all (although I'll be happy to be disproved on this). For example, Java uses GC too but there are lots of successful Java submissions. Still, I want to see how far I can push. Thanks!

uj5u.com熱心網友回復：

與普遍看法相反，Haskell 對此類問題非常友好。真正的問題是arrayGHC 附帶的庫完全是垃圾。另一個大問題是在 Haskell 中每個人都被教導在應該使用陣列的地方使用串列，這通常是代碼緩慢和記憶體膨脹程式的主要來源之一。因此，GC 需要很長時間也就不足為奇了，因為分配的東西太多了。下面是對下面提供的解決方案提供的輸入的運行：

   1,483,547,096 bytes allocated in the heap
         566,448 bytes copied during GC
      18,703,640 bytes maximum residency (3 sample(s))
       1,223,400 bytes maximum slop
              32 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1399 colls,     0 par    0.009s   0.009s     0.0000s    0.0011s
  Gen  1         3 colls,     0 par    0.002s   0.002s     0.0006s    0.0016s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time    0.484s  (  0.517s elapsed)
  GC      time    0.011s  (  0.011s elapsed)
  EXIT    time    0.001s  (  0.002s elapsed)
  Total   time    0.496s  (  0.530s elapsed)

下面提供的解決方案使用了一個陣列庫massiv，這使得無法提交給 codeforces。然而，希望目標是在 Haskell 上變得更好，而不是在某些網站上獲得積分。

紅藍矩陣可以分為兩個階段：讀取和求解

讀

閱讀尺寸

在main函式中，我們只讀取陣列的總數和每個陣列的維度。我們也列印結果。這里沒有什么令人興奮的。（請注意，鏈接的檔案inp3.txt具有比在該問題所定義的限制更大的陣列：n*m <= 10^6）

import Control.Monad.ST
import Control.Monad
import qualified Data.ByteString as BS
import Data.Massiv.Array as A hiding (B)
import Data.Massiv.Array.Mutable.Algorithms (quicksortByM_)
import Control.Scheduler (trivialScheduler_)

main :: IO ()
main = do
  t <- Prelude.read <$> getLine
  when (t < 1 || t > 1000) $ error $ "Invalid t: "    show t
  replicateM_ t $ do
    dimsStr <- getLine
    case Prelude.map Prelude.read (words dimsStr) of
      -- Test file fails this check: && n * m <= 10 ^ (6 :: Int) -> do
      [n, m] | n >= 2 && m > 0 && m <= 5 * 10 ^ (5 :: Int) -> do
        mat <- readMatrix n m
        case solve mat of
          Nothing -> putStrLn "NO"
          Just (ix, cs) -> do
            putStrLn "YES"
            putStr $ foldMap show cs
            putStr " "
            print ix
      _ -> putStrLn $ "Unexpected dimensions: "    show dimsStr

讀取陣列

將輸入加載到陣列中是原始問題中問題的主要來源：

無需依賴text, ascii 字符是問題預期的唯一有效輸入。
輸入被讀入串列串列。該串列串列是記憶體開銷的真正來源。
排序串列的速度慢得離譜，而且很耗記憶體。

通常在這種情況下，使用類似conduit. 特別是，將輸入讀取為位元組流并將這些位元組決議為數字將是最佳解決方案。話雖如此，在問題描述中對每個陣列的寬度有嚴格的要求，因此我們可以將輸入逐行讀取為 aByteString然后決議每行中的數字（為簡單起見假設為無符號）并寫入這些數字同時進入陣列。這確保在這個階段我們將只分配結果陣列和一行作為位元組序列。這可以通過像那樣的決議庫來完成attoparsec，但問題很簡單，只需臨時進行即可。

type Val = Word

readMatrix :: Int -> Int -> IO (Matrix P Val)
readMatrix n m = createArrayS_ (Sz2 n m) readMMatrix

readMMatrix :: MMatrix RealWorld P Val -> IO ()
readMMatrix mat =
  loopM_ 0 (< n) (  1) $ \i -> do
    line <- BS.getLine
    --- ^ reads at most 10Mb because it is known that input will be at most
    -- 5*10^5 Words: 19 digits max per Word and one for space: 5*10^5 * 20bytes
    loopM 0 (< m) (  1) line $ \j bs ->
      let (word, bs') = parseWord bs
       in bs' <$ write_ mat (i :. j) word
  where
    Sz2 n m = sizeOfMArray mat
    isSpace = (== 32)
    isDigit w8 = w8 >= 48 && w8 <= 57
    parseWord bs =
      case BS.uncons bs of
        Just (w8, bs')
          | isDigit w8 -> parseWordLoop (fromIntegral (w8 - 48)) bs'
          | otherwise -> error $ "Unexpected byte: "    show w8
        Nothing -> error "Unexpected end of input"
    parseWordLoop !acc bs =
      case BS.uncons bs of
        Nothing -> (acc, bs)
        Just (w8, bs')
          | isSpace w8 -> (acc, bs')
          | isDigit w8 -> parseWordLoop (acc * 10   fromIntegral (w8 - 48)) bs'
          | otherwise -> error $ "Unexpected byte: "    show w8

解決

這是我們實施實際解決方案的步驟。我沒有嘗試修復這個 SO 問題中提供的解決方案，而是繼續翻譯了問題中鏈接的C 解決方案。我走這條路的原因有兩個：

C 解決方案是非常必要的，我想證明命令式陣列操作對 Haskell 來說并不是那么陌生，所以我嘗試創建一個盡可能接近的翻譯。
我知道解決方案有效

請注意，應該可以使用arraypackage重寫下面的解決方案，因為最終需要的只是read,write和allocate操作。

computeSortBy ::
     (Load r Ix1 e, Manifest r' e)
  => (e -> e -> Ordering)
  -> Vector r e
  -> Vector r' e
computeSortBy f vec = 
  withLoadMArrayST_ vec $ quicksortByM_ (\x y -> pure $ f x y) trivialScheduler_

solve :: Matrix P Val -> Maybe (Int, [Color])
solve a = runST $ do
  let sz@(Sz2 n m) = size a
      ord :: Vector P Int
      ord = computeSortBy 
              (\x y -> compare (a ! (y :. 0)) (a ! (x :. 0))) (0 ..: n)
  mxl <- newMArray @P sz minBound
  loopM_ (n - 1) (>= 0) (subtract 1) $ \ i ->
    loopM_ 0 (< m) (  1) $ \j -> do
      writeM mxl (i :. j) (a ! ((ord ! i) :. j))
      when (i < n - 1) $
        writeM mxl (i :. j) 
          =<< max <$> readM mxl (i :. j) <*> readM mxl (i   1 :. j)
      when (j > 0) $
        writeM mxl (i :. j) 
          =<< max <$> readM mxl (i :. j) <*> readM mxl (i :. j - 1)
  mnr <- newMArray @P sz maxBound
  loopM_ (n - 1) (>= 0) (subtract 1) $ \ i ->
    loopM_ (m - 1) (>= 0) (subtract 1) $ \ j -> do
      writeM mnr (i :. j) (a ! ((ord ! i) :. j))
      when (i < n - 1) $
        writeM mnr (i :. j) 
          =<< min <$> readM mnr (i :. j) <*> readM mnr (i   1 :. j)
      when (j < m - 1) $
        writeM mnr (i :. j) 
          =<< min <$> readM mnr (i :. j) <*> readM mnr (i :. j   1)
  mnl <- newMArray @P (Sz m) maxBound
  mxr <- newMArray @P (Sz m) minBound
  let goI i
        | i < n - 1 = do
          loopM_ 0 (< m) (  1) $ \j -> do
            val <- min (a ! ((ord ! i) :. j)) <$> readM mnl j
            writeM mnl j val
            when (j > 0) $
              writeM mnl j . min val =<< readM mnl (j - 1)
          loopM_ (m - 1) (>= 0) (subtract 1) $ \j -> do
            val <- max (a ! ((ord ! i) :. j)) <$> readM mxr j
            writeM mxr j val
            when (j < m - 1) $
              writeM mxr j . max val =<< readM mxr (j   1)
          let goJ j
                | j < m - 1 = do
                    mnlVal <- readM mnl j
                    mxlVal <- readM mxl (i   1 :. j)
                    mxrVal <- readM mxr (j   1)
                    mnrVal <- readM mnr ((i   1) :. (j   1))
                    if mnlVal > mxlVal && mxrVal < mnrVal
                      then pure $ Just (i, j)
                      else goJ (j   1)
                | otherwise = pure Nothing
          goJ 0 >>= \case
            Nothing -> goI (i   1)
            Just pair -> pure $ Just pair
        | otherwise = pure Nothing
  mAns <- goI 0
  Control.Monad.forM mAns $ \ (ansFirst, ansSecond) -> do
    resVec <- createArrayS_ @BL (Sz n) $ \res ->
        iforM_ ord $ \i ordIx -> do
          writeM res ordIx $! if i <= ansFirst then R else B
    pure (ansSecond   1, A.toList resVec)

轉載請註明出處，本文鏈接：https://www.uj5u.com/shujuku/374170.html

標籤：arrays performance haskell stdin

上一篇：Haskell中的伯努利函式

下一篇：如何在Haskell中計算檔案中的單詞數？