Is there a good way to check rules against n columns?

Suppose you have a table RULES with 3 columns A, B, and C. As data enters the system, I want to know whether any row of the RULES table matches my data, where a NULL in a RULES column acts as a wildcard that matches any value. The obvious SQL is:

```
SELECT * FROM RULES
WHERE (A = :a OR A IS NULL)
AND (B = :b OR B IS NULL)
AND (C = :c OR C IS NULL)
```

So if I have rules:

```
RULE    A        B        C
1       50       NULL     NULL
2       51       xyz      NULL
3       51       NULL     123
4       NULL     xyz      456
```

An input of (50, xyz, 456) will match rules 1 and 4.
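To make the example concrete, here is a minimal sketch that loads the four sample rules into an in-memory SQLite database and runs the query above. The `RULE_ID` column name, the column types, and the `matching_rules` helper are assumptions made for illustration; only the WHERE clause comes from the question.

```python
import sqlite3

# In-memory database holding the four sample rules (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RULES (RULE_ID INTEGER PRIMARY KEY, A INTEGER, B TEXT, C INTEGER)"
)
conn.executemany(
    "INSERT INTO RULES (RULE_ID, A, B, C) VALUES (?, ?, ?, ?)",
    [
        (1, 50, None, None),
        (2, 51, "xyz", None),
        (3, 51, None, 123),
        (4, None, "xyz", 456),
    ],
)

# Same WHERE clause as in the question; a NULL rule column matches anything.
QUERY = """
SELECT RULE_ID FROM RULES
WHERE (A = :a OR A IS NULL)
  AND (B = :b OR B IS NULL)
  AND (C = :c OR C IS NULL)
ORDER BY RULE_ID
"""


def matching_rules(a, b, c):
    """Return the ids of all rules that match the input (hypothetical helper)."""
    rows = conn.execute(QUERY, {"a": a, "b": b, "c": c}).fetchall()
    return [row[0] for row in rows]


print(matching_rules(50, "xyz", 456))  # [1, 4]
```

Because the query returns every matching row rather than stopping at the first one, the same pattern extends to 15 columns by adding one `(col = :col OR col IS NULL)` term per column.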


Question: Is there a better way to do this? With only 3 fields this is no problem. But the actual table will have 15 columns and I worry about how well that SQL scales.

Speculation: An alternative SQL statement I came up with involves adding an extra column to the table that holds a count of how many fields are not null. (So in the example, this column's value for rules 1-4 is 1, 2, 2 and 2 respectively.) With this "col_count" column, the select could be:

```
SELECT * FROM RULES
WHERE (CASE WHEN A = :a THEN 1 ELSE 0 END)
+ (CASE WHEN B = :b THEN 1 ELSE 0 END)
+ (CASE WHEN C = :c THEN 1 ELSE 0 END)
= COL_COUNT
```
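As a rough check of this idea, the sketch below builds the same sample rules with a `COL_COUNT` column and runs the counting query with the three CASE expressions added together using `+`. The `RULE_ID` column, the column types, and the `matching_rules_by_count` helper are illustrative assumptions, and the sketch says nothing about which variant is faster on a real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RULES ("
    "RULE_ID INTEGER PRIMARY KEY, A INTEGER, B TEXT, C INTEGER, COL_COUNT INTEGER)"
)

sample_rules = [
    (1, 50, None, None),
    (2, 51, "xyz", None),
    (3, 51, None, 123),
    (4, None, "xyz", 456),
]

# COL_COUNT = number of non-NULL fields in the rule (1, 2, 2, 2 for the sample).
conn.executemany(
    "INSERT INTO RULES VALUES (?, ?, ?, ?, ?)",
    [rule + (sum(v is not None for v in rule[1:]),) for rule in sample_rules],
)

# A rule matches when every non-NULL field equals the input, i.e. when the
# number of equal fields is the same as COL_COUNT.
COUNT_QUERY = """
SELECT RULE_ID FROM RULES
WHERE (CASE WHEN A = :a THEN 1 ELSE 0 END)
    + (CASE WHEN B = :b THEN 1 ELSE 0 END)
    + (CASE WHEN C = :c THEN 1 ELSE 0 END)
    = COL_COUNT
ORDER BY RULE_ID
"""


def matching_rules_by_count(a, b, c):
    """Return ids of matching rules using the COL_COUNT variant (hypothetical helper)."""
    rows = conn.execute(COUNT_QUERY, {"a": a, "b": b, "c": c}).fetchall()
    return [row[0] for row in rows]


print(matching_rules_by_count(50, "xyz", 456))  # [1, 4]
```

One cost of this design is that `COL_COUNT` must be kept in sync whenever users add or edit a rule, for example by computing it at insert time as done here.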

Unfortunately, I don't have enough sample data to find out which of these approaches would perform better. Before I start creating random rules, I thought I'd ask here whether there was a better approach.

Note: Data mining techniques and column constraints are not feasible here. The data must be checked as it enters the system and so it can be flagged pass/fail immediately. And, the users control the addition or removal of rules so I can't convert the rules into column constraints or other data definition statements.

One last thing, in the end I need a list of all the rules that the data fails to pass. The solution cannot abort at the first failure.

Thanks.

5 solutions

#1

The first query you provided is perfect. I really doubt that adding the column you were speaking of would give you any more speed, because the nullness of every entry is checked anyway: a comparison with NULL never evaluates to true (strictly speaking it yields UNKNOWN, which a WHERE clause treats like false). So I would guess that `x=y` is effectively evaluated as `x IS NOT NULL AND x=y` internally. Maybe someone else can clarify that.
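The NULL behaviour this answer relies on can be observed directly. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in (an assumption, since the question does not name a database); it shows that comparing with NULL produces NULL rather than true, and that a `CASE WHEN` therefore falls through to its `ELSE` branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A comparison with NULL evaluates to NULL (unknown), returned to Python as None.
comparison = conn.execute("SELECT NULL = 50").fetchone()[0]

# Negating an unknown result is still unknown.
negated = conn.execute("SELECT NOT (NULL = 50)").fetchone()[0]

# CASE WHEN only takes the THEN branch for a true condition, so NULL counts as 0.
case_value = conn.execute(
    "SELECT CASE WHEN NULL = 50 THEN 1 ELSE 0 END"
).fetchone()[0]

print(comparison, negated, case_value)  # None None 0
```

This is why the `OR A IS NULL` term is needed in the first query, and why a NULL rule column contributes 0 in the counting variant.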

All other optimizations I can think of would involve precalculation or caching. You can create [temporary] tables matching certain rules or add further columns holding matching rules.
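One possible reading of the caching idea is to load the rules into application memory once and match incoming data there, reloading only when users change a rule. The sketch below is an assumption about how that might look, not something the answer spells out; `CACHED_RULES` and `matching_rule_ids` are hypothetical names, and `None` stands in for a NULL rule column.

```python
from typing import List, Optional, Sequence, Tuple

# Cached copy of the RULES table: (rule id, pattern), where None means "match anything".
CACHED_RULES: List[Tuple[int, Tuple[Optional[object], ...]]] = [
    (1, (50, None, None)),
    (2, (51, "xyz", None)),
    (3, (51, None, 123)),
    (4, (None, "xyz", 456)),
]


def matching_rule_ids(values: Sequence[object]) -> List[int]:
    """Return ids of every cached rule whose non-None fields all equal the input."""
    return [
        rule_id
        for rule_id, pattern in CACHED_RULES
        if all(p is None or p == v for p, v in zip(pattern, values))
    ]


print(matching_rule_ids((50, "xyz", 456)))  # [1, 4]
```

The check scans every rule, so it always yields the complete list of matches rather than stopping at the first one. Whether it beats the SQL query depends on how many rules there are and how often they change.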
