These notes are excerpted from https://www.cnblogs.com/liqingwen/p/5814204.html and record what I learned for later reference.
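1. Count the occurrences of a word in a string
Note that to perform the count, the Split method is called first to create an array of words. Split has a performance cost; if counting words is the only operation performed on the string, consider using the Matches or IndexOf method instead.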
    using System;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: count the occurrences of a word in a string

            const string text = @"Historically, the world of data and the world of objects" +
                                @" have not been well integrated. Programmers work in C# or Visual Basic" +
                                @" and also in SQL or XQuery. On the one side are concepts such as classes," +
                                @" objects, fields, inheritance, and .NET Framework APIs. On the other side" +
                                @" are tables, columns, rows, nodes, and separate languages for dealing with" +
                                @" them. Data types often require translation between the two worlds; there are" +
                                @" different standard functions. Because the object world has no notion of query, a" +
                                @" query can only be represented as a string without compile-time type checking or" +
                                @" IntelliSense support in the IDE. Transferring data from SQL tables or XML trees to" +
                                @" objects in memory is often tedious and error-prone.";

            const string searchWord = "data";

            // Split the string into an array of words
            var source = text.Split(new[] { '.', '?', '!', ' ', ';', ':', ',' }, StringSplitOptions.RemoveEmptyEntries);

            // Create the query; the comparison ignores case
            var query = from word in source
                        where string.Equals(word, searchWord, StringComparison.InvariantCultureIgnoreCase)
                        select word;

            // Count the matches (this executes the query)
            var wordCount = query.Count();
            Console.WriteLine($"{wordCount} occurrence(s) of the search word \"{searchWord}\" were found.");

            Console.Read();

            #endregion
        }
    }
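The Matches and IndexOf alternatives mentioned above could look roughly like the following. This is a minimal sketch, not part of the original article: the class name WordCounter and both method names are hypothetical. Note that the two approaches are not equivalent, since the IndexOf version counts substrings and would also count "data" inside "database".

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCounter
{
    // Counts whole-word, case-insensitive matches with a regular expression.
    public static int CountWithRegex(string text, string word)
    {
        // \b anchors the match at word boundaries; Regex.Escape guards
        // against special characters in the search word.
        var pattern = $@"\b{Regex.Escape(word)}\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    // Counts case-insensitive substring occurrences with IndexOf.
    public static int CountWithIndexOf(string text, string word)
    {
        if (string.IsNullOrEmpty(word)) return 0;

        var count = 0;
        var index = text.IndexOf(word, StringComparison.OrdinalIgnoreCase);
        while (index >= 0)
        {
            count++;
            // Continue searching after the end of the current match.
            index = text.IndexOf(word, index + word.Length, StringComparison.OrdinalIgnoreCase);
        }
        return count;
    }
}
```

Neither version allocates an array of words, which is the cost that Split incurs.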

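2. Query for sentences that contain a specified set of words
This example shows how to find the sentences in a block of text that contain a match for every word in a specified set. The array of search terms is hard-coded here, but it could also be populated dynamically at run time.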
    using System;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: query for sentences that contain a specified set of words

            const string text = @"Historically, the world of data and the world of objects " +
                                @"have not been well integrated. Programmers work in C# or Visual Basic " +
                                @"and also in SQL or XQuery. On the one side are concepts such as classes, " +
                                @"objects, fields, inheritance, and .NET Framework APIs. On the other side " +
                                @"are tables, columns, rows, nodes, and separate languages for dealing with " +
                                @"them. Data types often require translation between the two worlds; there are " +
                                @"different standard functions. Because the object world has no notion of query, a " +
                                @"query can only be represented as a string without compile-time type checking or " +
                                @"IntelliSense support in the IDE. Transferring data from SQL tables or XML trees to " +
                                @"objects in memory is often tedious and error-prone.";

            // Split the text block into an array of sentences
            var sentences = text.Split('.', '?', '!');

            // Define the search terms; this list could be populated dynamically at run time
            string[] wordsToMatch = { "Historically", "data", "integrated" };

            var query = from sentence in sentences
                        let t = sentence.Split(new char[] { '.', '?', '!', ' ', ';', ':', ',' },
                                               StringSplitOptions.RemoveEmptyEntries)
                        // Remove duplicates, intersect, and compare the counts
                        where t.Distinct().Intersect(wordsToMatch).Count() == wordsToMatch.Length
                        select sentence;

            foreach (var sentence in query)
            {
                Console.WriteLine(sentence);
            }

            Console.Read();

            #endregion
        }
    }

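The query first splits the text into sentences, and then splits each sentence into an array of strings holding its words. For each of these arrays, the Distinct<TSource> method removes all duplicate words, and then the query performs an Intersect<TSource> operation on the word array and the wordsToMatch array. If the count of the intersection equals the count of the wordsToMatch array, all of the words were found in the sentence, and the original sentence is returned.
In the call to Split, the punctuation marks are used as separators so that they are removed from the string. Otherwise, a token such as "Historically," would not match "Historically" in the wordsToMatch array. Depending on the punctuation that occurs in the source text, you may need additional separators.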
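The matching step described above can be isolated into a small helper. The sketch below is an illustration rather than the article's code: SentenceFilter and ContainsAllWords are hypothetical names, and the case-insensitive comparer is an added assumption (the original comparison is case-sensitive). Because Intersect already returns distinct elements, the separate Distinct call is not strictly needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SentenceFilter
{
    private static readonly char[] Separators = { '.', '?', '!', ' ', ';', ':', ',' };

    // Returns true when the sentence contains every word in wordsToMatch.
    public static bool ContainsAllWords(string sentence, IEnumerable<string> wordsToMatch)
    {
        var comparer = StringComparer.OrdinalIgnoreCase;

        // A set removes duplicate search terms so the count comparison stays valid.
        var wanted = new HashSet<string>(wordsToMatch, comparer);

        // Splitting on punctuation strips it, so "Historically," becomes "Historically".
        var words = sentence.Split(Separators, StringSplitOptions.RemoveEmptyEntries);

        // Intersect yields each common word once.
        return words.Intersect(wanted, comparer).Count() == wanted.Count;
    }
}
```

A query such as `sentences.Where(s => SentenceFilter.ContainsAllWords(s, wordsToMatch))` would then select the matching sentences.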
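3. Query for characters in a string
Because the String class implements the generic IEnumerable<T> interface (as IEnumerable<char>), any string can be queried as a sequence of characters. This is not a common use of LINQ, however; for complex pattern matching, use the Regex class.
The following example queries a string to find the digits it contains.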
    using System;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: query for characters in a string

            const string source = "ABCDE99F-J74-12-89A";

            // Select only the characters that are digits
            var digits = from character in source
                         where char.IsDigit(character)
                         select character;

            Console.Write("Digit:");
            foreach (var digit in digits)
            {
                Console.Write($"{digit} ");
            }
            Console.WriteLine();

            // Select all characters before the first "-"
            var query = source.TakeWhile(x => x != '-');
            foreach (var character in query)
            {
                Console.Write(character);
            }

            Console.Read();

            #endregion
        }
    }
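To actually count the digits, or to pull out runs of consecutive digits with the Regex class as suggested above, something like the following could be used. This is a sketch only; DigitQueries and its members are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DigitQueries
{
    // Counts the digit characters by treating the string as a sequence of chars.
    public static int CountDigits(string source)
    {
        return source.Count(c => char.IsDigit(c));
    }

    // Extracts each run of consecutive digits, e.g. "99" rather than '9', '9'.
    public static List<string> DigitRuns(string source)
    {
        return Regex.Matches(source, @"\d+")
                    .Cast<Match>()              // MatchCollection is non-generic in .NET Framework
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

For the sample string "ABCDE99F-J74-12-89A", CountDigits returns 8 and DigitRuns returns the four runs "99", "74", "12", and "89".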

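4. Combine LINQ queries with regular expressions
This example shows how to use the Regex class to create a regular expression for more complex matching in text strings. A LINQ query makes it easy to filter exactly the files you want to search with the regular expression and to shape the results.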
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: combine LINQ queries with regular expressions

            // Adjust this path to match your Visual Studio version
            const string folder = @"C:\Program Files (x86)\Microsoft Visual Studio\";
            var fileInfos = GetFiles(folder);

            // Create a regular expression that finds the two URLs below
            // (the dots are escaped so that they match literal dots only)
            var searchTerm = new Regex(@"http://(www\.w3\.org|www\.npmjs\.org)");

            // Search every ".html" file and keep the files with at least one match.
            // Note: the range variable in the inner query must be typed explicitly,
            // because in .NET Framework MatchCollection is not a generic IEnumerable collection.
            var query = from fileInfo in fileInfos
                        where fileInfo.Extension == ".html"
                        let text = File.ReadAllText(fileInfo.FullName)
                        let matches = searchTerm.Matches(text)
                        where matches.Count > 0
                        select new
                        {
                            name = fileInfo.FullName,
                            matchValue = from Match match in matches select match.Value
                        };

            Console.WriteLine($"The term \"{searchTerm}\" was found in:");
            Console.WriteLine();

            foreach (var q in query)
            {
                // Trim the root folder from the path of the matching file
                Console.WriteLine($"name==>{q.name.Substring(folder.Length - 1)}");

                // Print the matched values
                foreach (var v in q.matchValue)
                {
                    Console.WriteLine($"matchValue==>{v}");
                }

                // Print a blank line
                Console.WriteLine();
            }

            Console.Read();

            #endregion
        }

        /// <summary>
        /// Gets the file information for all files under the specified path
        /// </summary>
        /// <param name="path">Root folder to search, including subfolders</param>
        /// <returns>One FileInfo per file found</returns>
        private static IList<FileInfo> GetFiles(string path)
        {
            var files = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories);
            return files.Select(file => new FileInfo(file)).ToList();
        }
    }
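The same filter-then-shape pattern can be tried without touching the file system. The sketch below is an assumption-laden variant, not the article's code: it takes documents as name/content pairs in a dictionary, and RegexLinq and FindMatches are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexLinq
{
    // For each document with at least one match, returns its name
    // and the distinct matched values.
    public static Dictionary<string, List<string>> FindMatches(
        IDictionary<string, string> documents, Regex searchTerm)
    {
        var query = from doc in documents
                    let matches = searchTerm.Matches(doc.Value)
                    where matches.Count > 0
                    select new
                    {
                        Name = doc.Key,
                        // Cast<Match>() plays the role of the explicitly typed range variable.
                        Values = matches.Cast<Match>().Select(m => m.Value).Distinct().ToList()
                    };

        return query.ToDictionary(x => x.Name, x => x.Values);
    }
}
```

Keeping the regular expression as a parameter means the same helper can be reused for other search terms.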

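5. Find the set difference between two lists
This example shows how to use LINQ to compare two lists of strings and output the lines that are in text1.txt but not in text2.txt.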
text1.txt:
    Bankov, Peter
    Holm, Michael
    Garcia, Hugo
    Potra, Cristina
    Noriega, Fabricio
    Aw, Kam Foo
    Beebe, Ann
    Toyoshima, Tim
    Guy, Wey Yuan
    Garcia, Debra
text2.txt:
    Liu, Jinghao
    Bankov, Peter
    Holm, Michael
    Garcia, Hugo
    Beebe, Ann
    Gilchrist, Beth
    Myrcha, Jacek
    Giakoumakis, Leo
    McLin, Nkenge
    El Yassir, Mehdi
    using System;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: find the set difference between two lists

            // Create the data sources
            var text1 = File.ReadAllLines(@"..\..\text1.txt");
            var text2 = File.ReadAllLines(@"..\..\text2.txt");

            // Create the query; method syntax is required here
            var query = text1.Except(text2);

            // Execute the query
            Console.WriteLine("The following lines are in text1.txt but not text2.txt");
            foreach (var name in query)
            {
                Console.WriteLine(name);
            }

            Console.Read();

            #endregion
        }
    }

Note: Some types of query operations (such as Except<TSource>, Distinct<TSource>, Union<TSource>, and Concat<TSource>) can only be expressed in method-based syntax.
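Query syntax and method syntax can still be mixed in one pipeline. The following sketch filters with a query expression and then calls Except as a method on the result. SetDifference and OnlyInFirst are hypothetical names, and the trimming and case-insensitive comparison are added assumptions that the original example does not make.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetDifference
{
    // Returns the lines of 'first' that do not appear in 'second'.
    public static List<string> OnlyInFirst(IEnumerable<string> first, IEnumerable<string> second)
    {
        // Query syntax handles filtering and projection...
        var cleaned = from line in first
                      where !string.IsNullOrWhiteSpace(line)
                      select line.Trim();

        // ...but Except has no query keyword, so it is called as a method.
        return cleaned
            .Except(second.Select(s => s.Trim()), StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```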
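6. Sort or filter text data by any word or field
The following example shows how to sort lines of structured text, such as comma-separated values, by any field in the line. The field can be specified dynamically at run time.
Assume that the fields in scores.csv represent a student's ID number followed by four test scores.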
scores.csv:
    111, 97, 92, 81, 60
    112, 75, 84, 91, 39
    113, 88, 94, 65, 91
    114, 97, 89, 85, 82
    115, 35, 72, 91, 70
    116, 99, 86, 90, 94
    117, 93, 92, 80, 87
    118, 92, 90, 83, 78
    119, 68, 79, 88, 92
    120, 99, 82, 81, 79
    121, 96, 85, 91, 60
    122, 94, 92, 91, 91
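    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: sort or filter text data by any word or field

            // Create the data source
            var scores = File.ReadAllLines(@"..\..\scores.csv");

            // Can be changed to any value from 0 to 4
            const int sortIndex = 1;

            // Demonstrates returning a query variable (not the query results) from a method
            foreach (var score in SplitSortQuery(scores, sortIndex))
            {
                Console.WriteLine(score);
            }

            Console.Read();

            #endregion
        }

        /// <summary>
        /// Splits each line into fields and sorts the lines by one field
        /// </summary>
        /// <param name="scores">Lines of comma-separated values</param>
        /// <param name="num">Zero-based index of the field to sort by</param>
        /// <returns>A deferred query over the sorted lines</returns>
        private static IEnumerable<string> SplitSortQuery(IEnumerable<string> scores, int num)
        {
            // Note: the fields are compared as strings, not as numbers
            var query = from line in scores
                        let fields = line.Split(',')
                        orderby fields[num] descending
                        select line;
            return query;
        }
    }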
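The sort above compares the fields as strings, which happens to work for the sample data because all scores have two digits. If a field could hold values of different lengths (for example 9 or 100), a numeric sort is safer. A minimal sketch, with CsvSorter and SortByNumericField as hypothetical names:

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class CsvSorter
{
    // Sorts CSV lines in descending order by the numeric value of one field.
    public static IEnumerable<string> SortByNumericField(IEnumerable<string> lines, int fieldIndex)
    {
        return from line in lines
               let fields = line.Split(',')
               // Parse the field so that 100 sorts above 75, which a string comparison would not do.
               orderby int.Parse(fields[fieldIndex].Trim(), CultureInfo.InvariantCulture) descending
               select line;
    }
}
```

Like SplitSortQuery, this returns a deferred query; the lines are parsed only when the result is enumerated.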

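7. Reorder the fields of a delimited file
A comma-separated value (CSV) file is a text file that is often used to store spreadsheet data or other tabular data represented by rows and columns. By using the Split method to separate the fields, it is very easy to query and manipulate CSV files with LINQ. In fact, the same technique can be used to reorder the parts of any structured line of text; it is not limited to CSV files.
In the following example, assume that the three columns represent students' last name, first name, and ID. The rows are in alphabetical order by last name. The query produces a new sequence in which the ID column comes first, followed by a second column that combines the student's first name and last name, with the rows reordered by the ID field. The example writes the result to the console and does not modify the original data.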
spread.csv:
    Adams,Terry,120
    Fakhouri,Fadi,116
    Feng,Hanying,117
    Garcia,Cesar,114
    Garcia,Debra,115
    Garcia,Hugo,118
    Mortensen,Sven,113
    O'Donnell,Claire,112
    Omelchenko,Svetlana,111
    Tucker,Lance,119
    Tucker,Michael,122
    Zabokritski,Eugene,121
    using System;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: reorder the fields of a delimited file

            // Data source
            var lines = File.ReadAllLines(@"..\..\spread.csv");

            // Move the field in column 2 (ID) to the front,
            // then combine columns 1 and 0 (first name, last name) in reverse order
            var query = from line in lines
                        let t = line.Split(',')
                        orderby t[2]
                        select $"{t[2]} {t[1]} {t[0]}";

            foreach (var item in query)
            {
                Console.WriteLine(item);
            }

            Console.Read();

            #endregion
        }
    }
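If the reordered rows should be written to a new file rather than to the console, the query can be passed to File.WriteAllLines. The sketch below is one possible arrangement, not the article's code: CsvReorder, MoveIdFirst, and Save are hypothetical names, the output keeps a comma between the two columns, and the ID is sorted numerically as an added assumption.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvReorder
{
    // Turns "Last,First,ID" rows into "ID,First Last" rows, ordered by ID.
    public static IEnumerable<string> MoveIdFirst(IEnumerable<string> lines)
    {
        return from line in lines
               let t = line.Split(',')
               orderby int.Parse(t[2])
               select $"{t[2]},{t[1]} {t[0]}";
    }

    // Reads the input file and writes the reordered rows to a separate file,
    // leaving the original file untouched.
    public static void Save(string inputPath, string outputPath)
    {
        File.WriteAllLines(outputPath, MoveIdFirst(File.ReadAllLines(inputPath)));
    }
}
```

A call such as `CsvReorder.Save(@"..\..\spread.csv", @"..\..\spread_by_id.csv")` would produce the new file; the output file name here is only an example.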

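8. Combine and compare string collections
This example shows how to merge files that contain lines of text and then sort the results. Specifically, it shows how to perform a simple concatenation, a union, and an intersection on two sets of text lines.
Note: text1.txt and text2.txt are the same as in section 5.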
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: combine and compare string collections

            var text1 = File.ReadAllLines(@"..\..\text1.txt");
            var text2 = File.ReadAllLines(@"..\..\text2.txt");

            // Simple concatenation and sort; duplicates are preserved
            var concatQuery = text1.Concat(text2).OrderBy(x => x);
            OutputQueryResult(concatQuery, "Simple concatenate and sort, duplicates are preserved:");

            // Union based on the default string comparer; duplicate names are removed
            var unionQuery = text1.Union(text2).OrderBy(x => x);
            OutputQueryResult(unionQuery, "Union removes duplicate names:");

            // Find the names that occur in both files
            var intersectQuery = text1.Intersect(text2).OrderBy(x => x);
            OutputQueryResult(intersectQuery, "Merge based on intersect:");

            // Find the matching field in each list, merge the two results with Concat,
            // and then sort with the default string comparer
            const string nameMatch = "Garcia";
            var matchQuery1 = from name in text1
                              let t = name.Split(',')
                              where t[0] == nameMatch
                              select name;
            var matchQuery2 = from name in text2
                              let t = name.Split(',')
                              where t[0] == nameMatch
                              select name;

            var temp = matchQuery1.Concat(matchQuery2).OrderBy(x => x);
            OutputQueryResult(temp, $"Concat based on partial name match \"{nameMatch}\":");

            Console.Read();

            #endregion
        }

        /// <summary>
        /// Prints the query results
        /// </summary>
        /// <param name="querys">The query to execute and print</param>
        /// <param name="title">Heading printed before the results</param>
        private static void OutputQueryResult(IEnumerable<string> querys, string title)
        {
            Console.WriteLine(Environment.NewLine + title);
            foreach (var query in querys)
            {
                Console.WriteLine(query);
            }

            Console.WriteLine($"Total {querys.Count()} names in list.");
        }
    }

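9. Populate object collections from multiple sources
Do not try to join in-memory data or data in the file system with data that is still in a database. Such cross-domain joins can yield undefined results because database queries and other types of sources may define join operations differently. In addition, if the amount of data in the database is large enough, such an operation risks an out-of-memory exception. To join data from a database with in-memory data, first call ToList or ToArray on the database query, and then perform the join on the returned collection.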
    using System;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: populate object collections from multiple sources

            // Each line of spread.csv holds a last name, a first name, and an ID,
            // separated by commas. For example: Omelchenko,Svetlana,111
            var names = File.ReadAllLines(@"..\..\spread.csv");
            // Each line of scores.csv holds an ID and four test scores,
            // separated by commas. For example: 111,97,92,81,60
            var scores = File.ReadAllLines(@"..\..\scores.csv");

            // Merge the data sources into an anonymous type.
            // Note: the list of int exam scores is created on the fly.
            // The first item of the split string is skipped because it is the student ID, not a score.
            var students = from name in names
                           let t1 = name.Split(',')
                           from score in scores
                           let t2 = score.Split(',')
                           where t1[2] == t2[0]
                           select new
                           {
                               // spread.csv stores the last name first
                               FirstName = t1[1],
                               LastName = t1[0],
                               ID = Convert.ToInt32(t1[2]),
                               ExamScores = (from scoreText in t2.Skip(1)
                                             select Convert.ToInt32(scoreText)).ToList()
                           };

            foreach (var student in students)
            {
                Console.WriteLine($"The average score of {student.FirstName} {student.LastName} is {student.ExamScores.Average()}.");
            }

            Console.Read();

            #endregion
        }
    }
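The nested from/where pair compares every name line with every score line. For in-memory data, a join clause expresses the same match on the ID key. The sketch below assumes a named Student type instead of the anonymous type; Student, StudentLoader, and Load are hypothetical names. If one side came from a database, it would first be materialized with ToList or ToArray as described above, which is not shown here.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class Student
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<int> ExamScores { get; set; }
}

public static class StudentLoader
{
    // names:  "Last,First,ID" lines; scores: "ID, s1, s2, s3, s4" lines.
    public static List<Student> Load(IEnumerable<string> names, IEnumerable<string> scores)
    {
        var query = from name in names
                    let n = name.Split(',')
                    // Match the ID in column 2 of names with column 0 of scores.
                    join score in scores on n[2].Trim() equals score.Split(',')[0].Trim()
                    let s = score.Split(',')
                    select new Student
                    {
                        Id = int.Parse(n[2]),
                        FirstName = n[1].Trim(),
                        LastName = n[0].Trim(),
                        // Skip the ID column; the remaining fields are the scores.
                        ExamScores = s.Skip(1).Select(x => int.Parse(x)).ToList()
                    };

        return query.ToList();
    }
}
```

The final ToList call executes the query once, so later code can enumerate the students repeatedly without re-parsing the lines.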

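10. Use group to split a file into many files
This example shows one way to merge the contents of two files and then create a set of new files that organize the data in a new way.
Note: text1.txt and text2.txt are the same as in section 5.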
    using System;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: use group to split a file into many files

            var text1 = File.ReadAllLines(@"..\..\text1.txt");
            var text2 = File.ReadAllLines(@"..\..\text2.txt");

            // Union: concatenate and remove duplicate names
            var mergeQuery = text1.Union(text2);

            // Group the names by the first letter of the last name
            var query = from name in mergeQuery
                        let t = name.Split(',')
                        group name by t[0][0] into g
                        orderby g.Key
                        select g;

            // Note the nested foreach loops
            foreach (var g in query)
            {
                var fileName = @"testFile_" + g.Key + ".txt";
                Console.WriteLine(g.Key + ":");

                // Write one file per group
                using (var sw = new StreamWriter(fileName))
                {
                    foreach (var name in g)
                    {
                        sw.WriteLine(name);
                        Console.WriteLine(" " + name);
                    }
                }
            }

            Console.Read();

            #endregion
        }
    }
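The grouping step can be separated from the file writing, which makes it easier to check in isolation. The following is a sketch under stated assumptions: NameGrouper and GroupByInitial are hypothetical names, empty lines are skipped, and the initial is upper-cased so that "beebe" and "Beebe" land in the same group (the original groups on the raw character).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NameGrouper
{
    // Merges two name lists without duplicates and groups the names by initial.
    public static SortedDictionary<char, List<string>> GroupByInitial(
        IEnumerable<string> first, IEnumerable<string> second)
    {
        var groups = from name in first.Union(second)
                     where name.Length > 0
                     group name by char.ToUpperInvariant(name[0]) into g
                     select g;

        // A sorted dictionary keeps the groups in key order, like the orderby clause.
        var result = new SortedDictionary<char, List<string>>();
        foreach (var g in groups)
        {
            result[g.Key] = g.ToList();
        }
        return result;
    }
}
```

Each entry could then be written out with something like `File.WriteAllLines($"testFile_{entry.Key}.txt", entry.Value)`.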

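11. Join content from dissimilar files
This example shows how to join data from two comma-delimited files that share a common value used as a matching key. The technique is useful when you have to combine data from two spreadsheets, or from a spreadsheet and a file in another format, into a new file. The example can be adapted to any kind of structured text.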
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: join content from dissimilar files

            var names = File.ReadAllLines(@"..\..\spread.csv");
            var scores = File.ReadAllLines(@"..\..\scores.csv");

            // This query joins the two different spreadsheets on the ID
            var query = from name in names
                        let t1 = name.Split(',')
                        from score in scores
                        let t2 = score.Split(',')
                        where t1[2] == t2[0]
                        orderby t1[0]
                        select $"{t1[0]},{t2[1]},{t2[2]},{t2[3]},{t2[4]}";

            // Print the results
            OutputQueryResult(query, "Merge two spreadsheets:");

            Console.Read();

            #endregion
        }

        /// <summary>
        /// Prints the query results
        /// </summary>
        /// <param name="querys">The query to execute and print</param>
        /// <param name="title">Heading printed before the results</param>
        private static void OutputQueryResult(IEnumerable<string> querys, string title)
        {
            Console.WriteLine(Environment.NewLine + title);
            foreach (var query in querys)
            {
                Console.WriteLine(query);
            }

            Console.WriteLine($"Total {querys.Count()} names in list.");
        }
    }

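12. Compute column values in a CSV text file
This example shows how to perform aggregate computations such as Sum, Average, Min, and Max on the columns of a .csv file. The same approach can be applied to other types of structured text.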
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            #region LINQ: compute column values in a CSV text file

            var scores = File.ReadAllLines(@"..\..\scores.csv");

            // Specify the column to compute
            const int examNum = 3;

            // +1 skips the first column (the ID)
            // Compute a single column
            SingleColumn(scores, examNum + 1);

            Console.WriteLine();

            // Compute multiple columns
            MultiColumns(scores);

            Console.Read();

            #endregion
        }

        /// <summary>
        /// Computes statistics for a single column
        /// </summary>
        /// <param name="lines">Lines of comma-separated values</param>
        /// <param name="examNum">Zero-based index of the column to compute</param>
        private static void SingleColumn(IEnumerable<string> lines, int examNum)
        {
            Console.WriteLine("Single Column Query:");

            // Query steps:
            // 1. Split the string
            // 2. Convert the value of the chosen column to int
            var query = from line in lines
                        let t = line.Split(',')
                        select Convert.ToInt32(t[examNum]);

            // Compute the statistics for the chosen column
            var average = query.Average();
            var max = query.Max();
            var min = query.Min();

            Console.WriteLine($"Exam #{examNum}: Average:{average:##.##} High Score:{max} Low Score:{min}");
        }

        /// <summary>
        /// Computes statistics for multiple columns
        /// </summary>
        /// <param name="lines">Lines of comma-separated values</param>
        private static void MultiColumns(IEnumerable<string> lines)
        {
            Console.WriteLine("Multi Column Query:");

            // Query steps:
            // 1. Split the string
            // 2. Skip the ID column (the first column)
            // 3. Convert every score in the current line to int and select the whole sequence as one row
            var query1 = from line in lines
                         let t1 = line.Split(',')
                         let t2 = t1.Skip(1)
                         select (from t in t2
                                 select Convert.ToInt32(t));

            // Execute the query and cache the results to improve performance
            var results = query1.ToList();
            // Find the number of columns in the results
            var count = results[0].Count();

            // Compute the statistics for each column
            for (var i = 0; i < count; i++)
            {
                var query2 = from result in results
                             select result.ElementAt(i);

                var average = query2.Average();
                var max = query2.Max();
                var min = query2.Min();

                // #1 is the first exam
                Console.WriteLine($"Exam #{i + 1} Average: {average:##.##} High Score: {max} Low Score: {min}");
            }
        }
    }
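In MultiColumns, the outer list is cached but each inner row is still a deferred query, so ElementAt re-parses the row on every access. One possible variation, sketched here with hypothetical names (ColumnStats, PerExam) and assuming C# 7 or later for the tuple syntax, parses each row once into an int array and returns the statistics instead of printing them.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ColumnStats
{
    // Returns one (Average, Max, Min) entry per score column; the ID column is skipped.
    public static List<(double Average, int Max, int Min)> PerExam(IEnumerable<string> lines)
    {
        // Parse every row exactly once into an int[] of scores.
        var rows = lines
            .Select(line => line.Split(',').Skip(1).Select(f => int.Parse(f)).ToArray())
            .ToList();

        if (rows.Count == 0)
        {
            return new List<(double Average, int Max, int Min)>();
        }

        // Assumes every row has as many score columns as the first row.
        return Enumerable.Range(0, rows[0].Length)
            .Select(i => (Average: rows.Average(r => r[i]),
                          Max: rows.Max(r => r[i]),
                          Min: rows.Min(r => r[i])))
            .ToList();
    }
}
```

The caller can format the entries as needed, for example with the same `{average:##.##}` format string used above.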

