Groups | Blog | Home
all groups > sql server (alternate) > april 2004 >

sql server (alternate) : Reverse Wildcard Searches Impossible?


robertbrown1971 NO[at]SPAM yahoo.com
4/30/2004 12:19:57 PM
I have researched newsgroups and the web very thoroughly and
unsuccessfully for a solution to what I believe is a very common
problem. I know it's easy to do wildcard match against data in DB
(using LIKE and "%" and "?").

But is it possible to match a concrete string against a database of
wildcarded data? ("%" and LIKE do not work). For example:


CREATE TABLE blacklist (
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
pattern VARCHAR(255) NOT NULL
/* ... */
);

INSERT INTO blacklist (pattern) VALUES ('%foobar.com');

Is there a select query that would match address mars2.foobar.com
against this row?

Some people claim that the following query will work. I have tried it
and it's not true on either Oracle or SQL server.

SELECT * FROM blacklist WHERE 'mars2.foobar.com' LIKE pattern;

Some people suggest breaking up the blacklist table into N varchar
fields for each domain segment and then representing a wildcard
character as a NULL and use isNull to match it. This does work to an
extent. However, a) it seems really ugly, b) does not allow arbitrary
wildcarding (eg %mars%foobar.com), and c) this is something the DB
should do out of the box.

Please help! Humanity will be greatful as there's currently no
David Best
4/30/2004 1:08:20 PM
[quoted text, click to view]

Works fine for me on Oracle:

C:\>sqlplus
Connected to Oracle9i Enterprise Edition Release 9.2.0.1.0

SQL> create table blacklist (pattern varchar2(255));

Table created

SQL> insert into blacklist (pattern) values ('%foobar.com');

1 row inserted

SQL> select * from blacklist where 'mars2.foobar.com' like pattern;

PATTERN
----------------------------------------------------------------------------
----
%foobar.com


Erland Sommarskog
4/30/2004 10:19:49 PM
David Best (davebest@usa.net) writes:
[quoted text, click to view]

And the same example (save the funny varchar2) works on MS SQL Server too.

And should work on about any DBMS, as this is core SQL.

--
Erland Sommarskog, SQL Server MVP, sommar@algonet.se

Books Online for SQL Server SP3 at
Jim Kennedy
5/1/2004 4:03:14 AM

[quoted text, click to view]

It should work, but the query is going to be very inefficient.
Jim

Erland Sommarskog
5/1/2004 8:28:24 AM
Jim Kennedy (kennedy-downwithspammersfamily@attbi.net) writes:
[quoted text, click to view]

Yes, if there is an index on pattern it is not going to be useful,
since the match is at the end of the string. But that is not really
the same that it is very ineffecient. If you have a million entries,
you will certainly notice the toll. But with thousand? Not very much.
And thousand is a more likely number than a million.

For this particular case there exists a possible way to speed things up.
Since we search for the end of the string, you could have:

CREATE TABLE blacklist (pattern varchar(225) NOT NULL PRIMARY KEY,
revpattern AS reverse(pattern));
CREATE UNIQUE INDEX revix ON blacklist (revpattern);
go
INSERT blacklist VALUES ('%@example.com')
go
SELECT pattern FROM blacklist
WHERE reverse('spammer@example.com') LIKE revpattern

However computed columns is not standard SQL, and may not work on all
DBMSs. The above works in SQL Server.

--
Erland Sommarskog, SQL Server MVP, sommar@algonet.se

Books Online for SQL Server SP3 at
Gert-Jan Strik
5/1/2004 12:02:58 PM
The query will still need a table (or index) scan, because the column is
to the right of the LIKE operator...

Gert-Jan


[quoted text, click to view]

--
AddThis Social Bookmark Button