您的位置:首页 > 其它

【原创】StreamInsight查询系列(十六)——查询模式之左外联接

2011-09-06 00:26 465 查看
上篇文章介绍了查询模式中窗口比率部分,这篇博文将介绍StreamInsight中的左外联接(Left Outer Join)。

测试数据准备

为了方便测试查询,我们首先准备一个静态的测试数据源:

var time = DateTime.Parse("10/23/2011 09:37:00 PM");
var inputEvents = new[]
{
new { Timestamp = time+TimeSpan.FromMinutes(1), accountNo = "800001", branchName="43rd Street", customerName="John", amountWithdrawn = 100.0 },
new { Timestamp = time+TimeSpan.FromMinutes(2), accountNo = "800002", branchName="23rd Street", customerName="Mark", amountWithdrawn = 250.0 },
new { Timestamp = time+TimeSpan.FromMinutes(3), accountNo = "800003", branchName="43rd Street", customerName="Chandler", amountWithdrawn = 500.0 },
new { Timestamp = time+TimeSpan.FromMinutes(4), accountNo = "800004", branchName="43rd Street", customerName="Dave", amountWithdrawn = 380.0 },
new { Timestamp = time+TimeSpan.FromMinutes(4), accountNo = "800005", branchName="43rd Street", customerName="Mike", amountWithdrawn = 5000.0 },
new { Timestamp = time+TimeSpan.FromMinutes(5), accountNo = "800006", branchName="23rd Street", customerName="Sarah", amountWithdrawn = 80.0 },
new { Timestamp = time+TimeSpan.FromMinutes(6), accountNo = "800007", branchName="43rd Street", customerName="Ross", amountWithdrawn = 500.0 },
new { Timestamp = time+TimeSpan.FromMinutes(8), accountNo = "800008", branchName="43rd Street", customerName="Joey", amountWithdrawn = 1500.0 }
};

var branchInfo = new[]
{
new { branchName = "23rd Street", city = "Some Town", zip = "11111" }
};

接下去将上述数据源转变为点类型复杂事件流(其中branchStream将作为引用事件流):

var branchStream = branchInfo.ToPointStream(Application, t =>
PointEvent.CreateInsert(time, t),
AdvanceTimeSettings.StrictlyIncreasingStartTime);
// 将branchStream事件流中的所有时间持续时间延伸至无穷大值,方便后续作为引用流
var refStream = branchStream.AlterEventDuration(e => TimeSpan.MaxValue);

var inputStream = inputEvents.ToPointStream(Application, t =>
PointEvent.CreateInsert(t.Timestamp, t),
AdvanceTimeSettings.StrictlyIncreasingStartTime);

左外联接

熟悉SQL的读者一定很熟悉左外联接(Left Outer Join),这里再举一个左外联接的简单例子:

假设存在两张表A和B,表A记录了人员编号id及人员姓名Name,如下:

id

name

1张三
2李四
3王五
表B记录了人员编号id及对应职业job,如下:

id

job

1学生
2教师
4程序员
将表A与表B进行左外联接

SELECT A.*, B.* FROM A

LEFT JOIN B ON A.id = B.id

结果如下:

idnameidjob
1张三1学生
2李四2教师
3王五NULLNULL
好了复习完了左外联接的基础知识,让我们再回顾一下之前学习过的StreamInsight内部联接:

var innerJoinStream = from left in inputStream
join right in refStream on left.branchName equals right.branchName
select new
{
accountNo = left.accountNo,
amountWithdrawn = left.amountWithdrawn,
customerName = left.customerName,

branchName = right.branchName,
branchCity = right.city,
branchZip = right.zip
};

结果显示了inputStream和refStream中branchName相同且事件重叠的两个事件:



之所以介绍内部联接,是因为实现StreamInsight中的左外联接要用到内部联接。我们可以把左外联接看做是内部联接和左反半部联接的并集。

下面进行左外半部联接查询找出inputStream存在但是refStream没有的事件:

var lasjStream = from left in inputStream
where (from right in refStream
where left.branchName == right.branchName
select right).IsEmpty()
select new
{
accountNo = left.accountNo,
amountWithdrawn = left.amountWithdrawn,
customerName = left.customerName,

branchName = left.branchName,
branchCity = "MISSING",
branchZip = "MISSING"
};

结果如下:



最后左外联接的结果就是内部链接和左反半部联接的结果并集:

var leftOuterJoinStream = innerJoinStream.Union(lasjStream);

左外联接的最终结果如下:



下一篇将介绍StreamInsight查询模式中如何应对瞬变及报警泛滥。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: 
相关文章推荐