我最近拿起了 Haskell,我正在嘗試撰寫一個簡單的程式來對首字母縮寫詞進行排序。基本上,我有一個首字母縮寫詞串列 - 每個首字母縮寫詞都是一對字串,例如(“DH”,“Diffie-Hellman”) - 我想對它們進行排序并以以下格式列印它們(首字母縮寫詞,含義,索引)。
例如,如果
acronyms = [("ECDLP", "elliptic curve discrete logarithm problem"),
("DH", "Diffie-Hellman"),
("KDF", "key derivation function")]
那么輸出應該是
("DH","Diffie-Hellman",1)
("ECDLP","Elliptic Curve Discrete Logarithm Problem",2)
("KDF","Key Derivation Function",3)
目前,我擁有的代碼如下。
main :: IO ()
main =
mapM_
print
(zipWith
(curry (\((a, b), c) -> (a, b, c)))
(sort . nub $ map (capitalise <$>) acronyms)
[1 ..])
這里的 capitalize 是我寫的一個函式,它把字串中每個單詞的第一個字符大寫。總的來說,這個程式正在運行,但我想獲得一些關于如何改進它的反饋。最后,有什么辦法可以讓這curry (\((a, b), c) -> (a, b, c))部分成為無點風格?
謝謝
uj5u.com熱心網友回復:
代碼審查
首先,這curry實際上只是增加了復雜性。您可以撰寫一個沒有它的帶有多個引數的 lambda。而不是
curry (\((a, b), c) -> (a, b, c))
考慮
\(a, b) c -> (a, b, c)
現在,您似乎認為一切都應該是無點的。這對于休閑編程來說是一個很好的練習,并且在代碼高爾夫(目標是盡可能縮短代碼)中很有用,但是當您撰寫通用軟體時,它通常是一個糟糕的設計選擇(目標是使代碼盡可能可讀)。所以我建議引入一些區域變數和一些額外的函式來照顧中介。
我們可以將壓縮部分拆分為一個單獨的函式
enumerateAcronyms :: [(a, b)] -> [(a, b, Int)]
enumerateAcronyms xs = zipWith (\(a, b) c -> (a, b, c)) xs [1..]
請注意,雖然我們將此函式用作[(String, String)] -> [(String, String, Int)],但我將其寫為[(a, b)] -> [(a, b, Int)]。這種更通用的型別為我們如何處理輸入提供了非常有力的保證。也就是說,我們不會修改元組的前兩個元素;我們要做的就是移動它們并將整數與它們相關聯。
的Functor實體(,) a很奇怪。我花了一分鐘才意識到這(capitalize <$>)意味著“對元組的第二個元素執行此操作”。所以你應該明確地這樣做
\(x, y) -> (x, capitalize y)
或使用更符合您意圖的名稱。在這種情況下,我們實際上有兩個選擇:Control.Arrow.second或Data.Bifunctor.second。它們做同樣的事情,只是基于不同的抽象(前者抽象函式型別,后者抽象資料結構)。
second capitalize
我個人的偏好是永遠不要使用特定于串列的函式map,而總是使用通用函子fmap。這樣,如果您想將此函式擴展為通用函式并處理序列或其他一些資料結構,那么您要做的作業就更少了。
nub是 O(n^2),如果您要保留串列的順序,這很好。但是,如果您要對串列進行排序,我們可以通過自己滾動來在 O(n) 中得到它。因此,讓我們撰寫自己的nub.
同樣,我可以將mapM_, 需要Monad, 替換為相同traverse_的 ,它只需要Applicative并且因此在更一般的情況下有效。
最后,我不喜歡將長序列的操作鏈接在一起。有些人這樣做,但我不喜歡從右到左閱讀我的代碼,所以而不是
main :: IO ()
main = traverse_ print (enumerateAcronyms (uniqSort $ map (second capitalise) acronyms))
我可能會寫
main :: IO ()
main =
let uniqAcronyms = uniqSort $ map (second capitalise) acronyms
in traverse_ print (enumerateAcronyms uniqAcronyms)
總共有類似的東西
import Data.Char
import Data.List
import Data.Bifunctor
import Data.Foldable
acronyms :: [(String, String)]
acronyms = [("ECDLP", "elliptic curve discrete logarithm problem"),
("DH", "Diffie-Hellman"),
("KDF", "key derivation function")]
capitaliseFirst :: String -> String
capitaliseFirst [] = []
capitaliseFirst (x:xs) = toUpper x : xs
capitalise :: String -> String
capitalise = unwords . fmap capitaliseFirst . words
enumerateAcronyms :: [(a, b)] -> [(a, b, Int)]
enumerateAcronyms xs = zipWith (\(a, b) c -> (a, b, c)) xs [1..]
uniqSort :: Ord a => [a] -> [a]
uniqSort = go . sort
where go [] = []
go [x] = [x]
go (x:x':xs)
| x == x' = go (x':xs)
| otherwise = x : go (x':xs)
main :: IO ()
main =
let uniqAcronyms = uniqSort $ map (second capitalise) acronyms
in traverse_ print (enumerateAcronyms uniqAcronyms)
This is much longer than what you wrote, but it's also going to be much clearer to the average Haskell programmer. There's a lot less to have to keep in your head when reading any particular function, so the code is much easier to break down. If you're coming from Java or some other languages, you've seen the opposite side of this: Code is far too verbose and split across ten files, which makes it hard to see what's going on. But in Haskell, we run into the opposite issue: Code can get too short and snazzy and it gets hard to reason about. There's a happy medium to be reached in the middle.
Pointfree
Now, to answer your initial question: Can we pointfree curry (\((a, b), c) -> (a, b, c))? I don't recommend doing it in production code, as it's definitely not readable, but let's explore. First, as mentioned, the curry isn't necessary. This is just
\(a, b) c -> (a, b, c)
which is
\(a, b) c -> (,,) a b c
then it's just a matter of moving variables around and getting rid of the tuple. c can be eliminated, hence
\(a, b) -> (,,) a b
Now we have a 2-tuple on the left and two arguments on the right. The difference between those is uncurry, hence
\(a, b) -> uncurry (,,) (a, b)
and then we eliminate the last element
uncurry (,,)
but of course, I have no idea what that does at a glance, whereas \(a, b) c -> (a, b, c) is pretty self-explanatory.
In general, we can pointfree any Haskell function provided we have catamorphisms for any data structures used (this is maybe for Maybe, or foldr for lists), as well as some basic quality-of-life functions in the prelude. You can use other tricks like the reader monad or liftA2 to make things easier or shorter, but fundamentally you don't need things like that. It's just a matter of being able to move variables around the various Haskell syntax elements. You convert recursion into fix, pattern matching into catamorphisms, and arguments that are in the "wrong" order into several flip calls.
Lambdabot過去能夠使用這種系統化的方法自動對大多數 Haskell 運算式進行無點分析。我不確定它是否還在。
uj5u.com熱心網友回復:
并行串列推導完全消除了元組(、、、zipWithlambdacurry和(<$>))的麻煩,我認為這使它更具可讀性。此外,由于您無論如何都在排序,所以您不需要nub- 只需grouping 相鄰的相等元素就足夠了。所以:
{-# Language ParallelListComp #-}
main = mapM_ print
[ (acronym, capitalise meaning, i)
| (acronym, meaning):_ <- group (sort acronyms)
| i <- [1..]
]
uj5u.com熱心網友回復:
我會先讓它不那么無意義!那curry是一團糟,一事無成。只需撰寫一個直接將元組與索引組合的 lambda。
main :: IO ()
main = mapM_ print . zipWith output [1 ..] . sort . nub . map (fmap capitalise) $ acronyms
where output idx (acronym, meaning) = (acronym, meaning, idx)
這很好,但我認為元組的 fmap 實體足以令人驚訝,在我希望其他人閱讀的代碼中,我也會命名該函式:
main :: IO ()
main = mapM_ print . zipWith output [1 ..] . sort . nub . map capitaliseDefinition $ acronyms
where output idx (acronym, meaning) = (acronym, meaning, idx)
capitaliseDefinition (acronym, meaning) = (acronym, capitalise meaning)
您也可以更現代,并使用traverse_而不是mapM_. 不過,我很同情mapM_.
轉載請註明出處,本文鏈接:https://www.uj5u.com/qiye/417920.html
標籤:
下一篇:在電子郵件正文中粘貼影像
