Is there a good way to check rules against n columns?

Suppose you have a table RULES with 3 columns A, B, and C. As data enters the system, I want to know whether any row of the RULES table matches my data, with the condition that a NULL in a RULES column matches any value for that column. The obvious SQL is:

SELECT * FROM RULES
WHERE (A = :a OR A IS NULL)
  AND (B = :b OR B IS NULL)
  AND (C = :c OR C IS NULL)

So if I have rules:

RULE    A        B        C
1       50       NULL     NULL
2       51       xyz      NULL
3       51       NULL     123
4       NULL     xyz      456

An input of (50, xyz, 456) will match rules 1 and 4.
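
The example above can be reproduced with a minimal sketch, assuming SQLite through Python's standard `sqlite3` module. The `RULE_ID` key column, the column types (integer A and C, text B), and the helper name `matching_rules` are assumptions made for illustration; the question does not specify them.

```python
import sqlite3

# In-memory database; RULE_ID and the column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RULES (RULE_ID INTEGER PRIMARY KEY, A INTEGER, B TEXT, C INTEGER)"
)
conn.executemany(
    "INSERT INTO RULES (RULE_ID, A, B, C) VALUES (?, ?, ?, ?)",
    [
        (1, 50, None, None),
        (2, 51, "xyz", None),
        (3, 51, None, 123),
        (4, None, "xyz", 456),
    ],
)

# The query from the question: a NULL rule column acts as a wildcard.
MATCH_SQL = """
SELECT RULE_ID FROM RULES
WHERE (A = :a OR A IS NULL)
  AND (B = :b OR B IS NULL)
  AND (C = :c OR C IS NULL)
ORDER BY RULE_ID
"""


def matching_rules(a, b, c):
    """Return the ids of every rule that matches the input, not just the first."""
    rows = conn.execute(MATCH_SQL, {"a": a, "b": b, "c": c}).fetchall()
    return [row[0] for row in rows]


print(matching_rules(50, "xyz", 456))  # [1, 4]
```

Because the query returns every matching row, it already produces the complete list of rules rather than stopping at the first hit.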

Question: Is there a better way to do this? With only 3 fields this is no problem. But the actual table will have 15 columns and I worry about how well that SQL scales.

Speculation: An alternative SQL statement I came up with involves adding an extra column to the table that holds a count of how many fields are not null. (So in the example, this column's value for rules 1-4 is 1, 2, 2 and 2 respectively.) With this "col_count" column, the select could be:

SELECT * FROM RULES
WHERE (CASE WHEN A = :a THEN 1 ELSE 0 END)
    + (CASE WHEN B = :b THEN 1 ELSE 0 END)
    + (CASE WHEN C = :c THEN 1 ELSE 0 END)
    = COL_COUNT
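
As a rough check that the two formulations agree, here is a sketch that runs both against the sample rules, again assuming SQLite via Python's `sqlite3`. The CASE terms are summed with `+` and compared to `COL_COUNT`. The `RULE_ID` column, the column types, and the helper name `run` are assumptions for illustration, and `COL_COUNT` is computed in Python at insert time.

```python
import sqlite3

SAMPLE_RULES = [
    (1, 50, None, None),
    (2, 51, "xyz", None),
    (3, 51, None, 123),
    (4, None, "xyz", 456),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RULES (RULE_ID INTEGER PRIMARY KEY, "
    "A INTEGER, B TEXT, C INTEGER, COL_COUNT INTEGER)"
)
# COL_COUNT = number of non-NULL rule columns (1, 2, 2, 2 for the sample).
conn.executemany(
    "INSERT INTO RULES VALUES (?, ?, ?, ?, ?)",
    [
        (rid, a, b, c, sum(v is not None for v in (a, b, c)))
        for rid, a, b, c in SAMPLE_RULES
    ],
)

OR_NULL_SQL = """
SELECT RULE_ID FROM RULES
WHERE (A = :a OR A IS NULL)
  AND (B = :b OR B IS NULL)
  AND (C = :c OR C IS NULL)
ORDER BY RULE_ID
"""

# A rule matches when every non-NULL column equals the input,
# i.e. the number of equal columns equals COL_COUNT.
COUNT_SQL = """
SELECT RULE_ID FROM RULES
WHERE (CASE WHEN A = :a THEN 1 ELSE 0 END)
    + (CASE WHEN B = :b THEN 1 ELSE 0 END)
    + (CASE WHEN C = :c THEN 1 ELSE 0 END)
    = COL_COUNT
ORDER BY RULE_ID
"""


def run(sql, a, b, c):
    """Execute one of the two queries and return the matching rule ids."""
    return [row[0] for row in conn.execute(sql, {"a": a, "b": b, "c": c})]


print(run(OR_NULL_SQL, 50, "xyz", 456))  # [1, 4]
print(run(COUNT_SQL, 50, "xyz", 456))    # [1, 4]
```

This only shows that the results agree on the sample data; it says nothing about which form is faster on a 15-column table, which would still need measuring on the target database.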

Unfortunately, I don't have enough sample data to find out which of these approaches would perform better. Before I start creating random rules, I thought I'd ask here whether there is a better approach.

Note: Data mining techniques and column constraints are not feasible here. The data must be checked as it enters the system so that it can be flagged pass/fail immediately. Also, the users control the addition and removal of rules, so I can't convert the rules into column constraints or other data definition statements.

One last thing, in the end I need a list of all the rules that the data fails to pass. The solution cannot abort at the first failure.

Thanks.

5 answers

#1


The first query you provided is perfect. I really doubt that adding the column you were speaking of would give you any more speed, since the NOT NULL property of every entry is checked anyway: a comparison with NULL never evaluates to true (it yields UNKNOWN, which a WHERE clause treats like false). So I would guess that x=y is expanded to x IS NOT NULL AND x=y internally. Maybe someone else can clarify that.

All other optimizations I can think of would involve precalculation or caching. You can create [temporary] tables matching certain rules or add further columns holding matching rules.
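
One way to read the caching idea is to load the rules into memory once and match incoming rows in application code. The following is only a sketch under that assumption; `CACHED_RULES` and `matching_rules` are hypothetical names, and the cache would need to be refreshed whenever users add or remove rules.

```python
from typing import List, Optional, Sequence, Tuple

# Each cached rule is (rule id, pattern); None in a pattern is a wildcard.
Rule = Tuple[int, Tuple[Optional[object], ...]]

# Hypothetical cache, e.g. filled once from "SELECT ... FROM RULES".
CACHED_RULES: List[Rule] = [
    (1, (50, None, None)),
    (2, (51, "xyz", None)),
    (3, (51, None, 123)),
    (4, (None, "xyz", 456)),
]


def matching_rules(values: Sequence[object],
                   rules: Sequence[Rule] = CACHED_RULES) -> List[int]:
    """Return ids of all rules whose non-None columns equal the input values."""
    return [
        rule_id
        for rule_id, pattern in rules
        if all(p is None or p == v for p, v in zip(pattern, values))
    ]


print(matching_rules((50, "xyz", 456)))  # [1, 4]
```

This works the same way for 15 columns, since the pattern length is not hard-coded, but whether it beats the database query depends on the number of rules and would have to be measured.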
