我一直在玩@ Chintoo的 C# 出色的 ETL 系統。我需要比較兩個 CSV 檔案,其中一個 CSV 檔案定義為動態增長的主表,另一個是饋送器“詳細資訊”表。
詳細表可能在新記錄、已更改記錄或主 CSV 檔案中不再存在 (DELETED) 的記錄方面存在差異。
輸出應該是替換或更新主表的第三個表 - 所以它是一個不斷增長的 CSV 檔案。
兩個表都有唯一的 ID 列和標題行。
主 CSV
ID,name
1,Danny
2,Fred
3,Sam
細節
ID,name
1,Danny
<-- record no longer exists
3,Pamela <-- name change
4,Fernando <-- new record
到目前為止,我一直在指這個fiddle,以及下面的代碼:
using System;
using ChoETL;
using System.Linq;
public class Program
{
public static void Main()
{
var input1 = ChoCSVReader.LoadText(csv1).WithFirstLineHeader().ToArray();
var input2 = ChoCSVReader.LoadText(csv2).WithFirstLineHeader().ToArray();
Console.WriteLine("NEW records\n");
using (var output = new ChoCSVWriter(Console.Out).WithFirstLineHeader())
{
output.Write(input2.OfType<ChoDynamicObject>().Except(input1.OfType<ChoDynamicObject>(),
new ChoDynamicObjectEqualityComparer(new string[] { "id" })));
}
Console.WriteLine("\n\nDELETED records\n");
using (var output = new ChoCSVWriter(Console.Out).WithFirstLineHeader())
{
output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(),
new ChoDynamicObjectEqualityComparer(new string[] { "id" })));
}
Console.WriteLine("\n\nCHANGED records\n");
using (var output = new ChoCSVWriter(Console.Out).WithFirstLineHeader())
{
output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(),
new ChoDynamicObjectEqualityComparer(new string[] { "id", "name" })));
}
}
static string csv1 = @"
ID,name
1,Danny
2,Fred
3,Sam";
static string csv2 = @"
ID,name
1,Danny
3,Pamela
4,Fernando";
}
輸出
NEW records
ID,name
4,Fernando
DELETED records
ID,name
2,Fred
CHANGED records
ID,name
2,Fred
3,Sam
CHANGED 記錄不起作用。另外,我需要一個狀態,所以我希望它看起來像這樣:
CHANGED records
ID,name,status
1,Danny,NOCHANGE
2,Fred,DELETED
3,Pamela,CHANGED
4,Fernando,NEW
謝謝
uj5u.com熱心網友回復:
這是您可以如何使用 Cinchoo ETL
string csv1 = @"ID,name
1,Danny
2,Fred
3,Sam";
string csv2 = @"ID,name
1,Danny
3,Pamela
4,Fernando";
var r1 = ChoCSVReader.LoadText(csv1).WithFirstLineHeader().ToArray();
var r2 = ChoCSVReader.LoadText(csv2).WithFirstLineHeader().ToArray();
using (var w = new ChoCSVWriter(Console.Out).WithFirstLineHeader())
{
var newItems = r2.OfType<ChoDynamicObject>().Except(r1.OfType<ChoDynamicObject>(), new ChoDynamicObjectEqualityComparer(new string[] { "ID" }))
.Select(r =>
{
var dict = r.AsDictionary();
dict["Status"] = "NEW";
return new ChoDynamicObject(dict);
}).ToArray();
var deletedItems = r1.OfType<ChoDynamicObject>().Except(r2.OfType<ChoDynamicObject>(), new ChoDynamicObjectEqualityComparer(new string[] { "ID" }))
.Select(r =>
{
var dict = r.AsDictionary();
dict["Status"] = "DELETED";
return new ChoDynamicObject(dict);
}).ToArray();
var changedItems = r2.OfType<ChoDynamicObject>().Except(r1.OfType<ChoDynamicObject>(), ChoDynamicObjectEqualityComparer.Default)
.Except(newItems.OfType<ChoDynamicObject>(), new ChoDynamicObjectEqualityComparer(new string[] { "ID" }))
.Select(r =>
{
var dict = r.AsDictionary();
dict["Status"] = "CHANGED";
return new ChoDynamicObject(dict);
}).ToArray();
var noChangeItems = r1.OfType<ChoDynamicObject>().Intersect(r2.OfType<ChoDynamicObject>(), ChoDynamicObjectEqualityComparer.Default)
.Select(r =>
{
var dict = r.AsDictionary();
dict["Status"] = "NOCHANGE";
return new ChoDynamicObject(dict);
}).ToArray();
var finalResult = Enumerable.Concat(newItems, deletedItems).Concat(changedItems).Concat(noChangeItems).OfType<dynamic>().OrderBy(r => r.ID);
w.Write(finalResult);
}
Console.WriteLine();
輸出:
ID,name,Status
1,Danny,NOCHANGE
2,Fred,DELETED
3,Pamela,CHANGED
4,Fernando,NEW
小提琴示例:https : //dotnetfiddle.net/mrHpFx
轉載請註明出處,本文鏈接:https://www.uj5u.com/qianduan/368364.html
上一篇:如何在python中操作JSON檔案使第一列成為第一行?
下一篇:如何在awk腳本中隱藏csv內容
