{"id":304,"date":"2013-05-15T16:48:41","date_gmt":"2013-05-15T16:48:41","guid":{"rendered":"http:\/\/www.timestored.com\/b\/?p=304"},"modified":"2013-05-15T16:50:27","modified_gmt":"2013-05-15T16:50:27","slug":"reconcile-tickerplant-feeds-using-longest-common-subsequence-matching","status":"publish","type":"post","link":"https:\/\/www.timestored.com\/b\/reconcile-tickerplant-feeds-using-longest-common-subsequence-matching\/","title":{"rendered":"Reconcile tickerplant feeds using longest common subsequence matching"},"content":{"rendered":"<p>Sanket Agrawal just posted some very cool code to the k4 mailing list for finding the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Longest_common_subsequence_problem\">longest common\u00a0sub-sequences<\/a>. This actually solves a problem that some colleagues had where they wanted to combine two tickerplant feeds that never seemed to match up. Here&#8217;s an example:<\/p>\n<p><code><br \/>\nq)\/ assuming t is our perfect knowledge table<\/p>\n<p>q)t<br \/>\ntime sym size price<br \/>\n------------------------<br \/>\n09:00 A 8 7.343062<br \/>\n09:01 A 0 5.482385<br \/>\n09:02 B 5 8.847715<br \/>\n09:03 A 1 6.78881<br \/>\n09:04 B 5 3.432312<br \/>\n09:05 A 0 0.2801381<br \/>\n09:06 A 2 3.775222<br \/>\n09:07 B 3 1.676582<br \/>\n09:08 B 7 7.163578<br \/>\n09:09 B 4 3.300548<br \/>\n<\/code><br \/>\nLet us now create two tables, u and v, neither of which contain all the data.<br \/>\n<code><br \/>\nq)u:t except t 7 8<br \/>\nq)v:t except t 1 2 3 4<br \/>\nq)u<br \/>\ntime sym size price<br \/>\n------------------------<br \/>\n09:00 A 8 7.343062<br \/>\n09:01 A 0 5.482385<br \/>\n09:02 B 5 8.847715<br \/>\n09:03 A 1 6.78881<br \/>\n09:04 B 5 3.432312<br \/>\n09:05 A 0 0.2801381<br \/>\n09:06 A 2 3.775222<br \/>\n09:09 B 4 3.300548<br \/>\nq)v<br \/>\ntime sym size price<br \/>\n------------------------<br \/>\n09:00 A 8 7.343062<br \/>\n09:05 A 0 0.2801381<br \/>\n09:06 A 2 3.775222<br \/>\n09:07 B 3 1.676582<br \/>\n09:08 B 7 7.163578<br \/>\n09:09 B 4 3.300548<\/p>\n<p><\/code><\/p>\n<p>We can find the indices that differ using Sankets difftables function<br \/>\n<code><br \/>\nq)p:diffTables[u;v;t `sym;`price`size]<br \/>\nq)show p 0; \/ rows in u that are not in v<br \/>\n1 2 3 4<br \/>\nq)show p 1; \/ rows in v that are not in u<br \/>\n3 4<\/p>\n<p>\/ combine together again<\/p>\n<p>q)`time xasc (update src:`u from u),update src:`v from v p 1<br \/>\ntime sym size price src<br \/>\n----------------------------<br \/>\n09:00 A 8 7.343062 u<br \/>\n09:01 A 0 5.482385 u<br \/>\n09:02 B 5 8.847715 u<br \/>\n09:03 A 1 6.78881 u<br \/>\n09:04 B 5 3.432312 u<br \/>\n09:05 A 0 0.2801381 u<br \/>\n09:06 A 2 3.775222 u<br \/>\n09:07 B 3 1.676582 v<br \/>\n09:08 B 7 7.163578 v<br \/>\n09:09 B 4 3.300548 u<br \/>\nq)t~`time xasc u,v p 1<br \/>\n1b<br \/>\n<\/code><\/p>\n<p>The code can be downloaded at:<br \/>\n<a href=\"http:\/\/code.kx.com\/wsvn\/code\/contrib\/sagrawal\/lcs\/miller.q\">http:\/\/code.kx.com\/wsvn\/code\/contrib\/sagrawal\/lcs\/miller.q<\/p>\n<p><\/a><a href=\"http:\/\/code.kx.com\/wsvn\/code\/contrib\/sagrawal\/lcs\/myers.q\">http:\/\/code.kx.com\/wsvn\/code\/contrib\/sagrawal\/lcs\/myers.q<\/a><\/p>\n<p>Thanks Sanket.<\/p>\n<p>Algorithm details:<\/p>\n<p>Myers O(ND): <a href=\"http:\/\/www.xmailserver.org\/diff2.pdf\">http:\/\/www.xmailserver.org\/diff2.pdf<\/a><\/p>\n<p>Miller O(NP):\u00a0 <a href=\"http:\/\/www.bookoff.co.jp\/files\/ir_pr\/6c\/np_diff.pdf\">http:\/\/www.bookoff.co.jp\/files\/ir_pr\/6c\/np_diff.pdf<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Sanket Agrawal just posted some very cool code to the k4 mailing list for finding the longest common\u00a0sub-sequences. This actually solves a problem that some colleagues had where they wanted to combine two tickerplant feeds that never seemed to match up. Here&#8217;s an example: q)\/ assuming t is our perfect knowledge table q)t time sym [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0},"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/posts\/304"}],"collection":[{"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/comments?post=304"}],"version-history":[{"count":5,"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/posts\/304\/revisions"}],"predecessor-version":[{"id":309,"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/posts\/304\/revisions\/309"}],"wp:attachment":[{"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/media?parent=304"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/categories?post=304"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.timestored.com\/b\/wp-json\/wp\/v2\/tags?post=304"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}