sql server mseq:
I'm working on a mailing list and want to select records where the both the SSN number and Address fields are unique, since I may want to send a piece of mail to more than one address for a person, but not multiple pieces to the same address for that person. I've created a test table and some test data to try to get my syntax right (below) but so far have had no luck. I'm not certain if I can do this using DISTINCT or the UNIQUE keyword, nor how to get my syntax right. Could someone help me with this? Thanks in advance :-) Marc ************* CREATE TABLE ***************************** USE [Northwind] GO /****** Object: Table [dbo].[MY_CONTACTS_TBL] Script Date: 08/17/2006 09:07:19 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING ON GO CREATE TABLE [dbo].[MY_CONTACTS_TBL]( [UID] [smallint] IDENTITY(1,1) NOT NULL, [SSN] [bigint] NULL, [FNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, [LNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, [ADDRESS1] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, [ADDRESS2] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, [CITY] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, [STATE] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, [ZIP] [int] NULL, CONSTRAINT [PK_MY_CONTACTS_TBL] PRIMARY KEY CLUSTERED ( [UID] ASC )WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY] ) ON [PRIMARY] GO SET ANSI_PADDING OFF ******************* TABLE DATA ***************************** UID,SSN,FNAME,LNAME,ADDRESS1,ADDRESS2,CITY,STATE,ZIP 1,123456789,Marc,Archuleta,123 Sesame St,apartment 1,Columbus,OH,43215 2,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 3,345678901,John2,Doe,456 Dough St,,Columbus,OH,43215 4,456789012,John,DoeTwo,789 Dough St,,Columbus,OH,43215 5,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 6,234567890,John,Doe,123 Pumpernickel Ave,,New York,NY,10016
First, a suggestion.=20 Make the ZipCode field a varchar() -either 9 or 10, depending upon what = you do with the dash. If you use an int, leading zeros (upper Northeast = States) is lost, and 9 digit zipcodes (e.g., 98765-4321) will do the = math and store the result. Similar leading zero problem with SSN The use of DISTINCT serves two purposes. First, it makes sure that there = are no duplicates by testing all of the selected columns as a unit = -ignoring capitalization differences (with the standard = SQL_Latin1_General_CP1_CI_AS collation). Second, it provides a 'sorted' = output since it has to accomplish an ORDER BY just to remove duplicates. = However, if there is any character, including space, period, etc., = difference, the comparison fails. CREATE TABLE dbo.#MY_CONTACTS_TBL ( UID smallint IDENTITY(1,1) NOT NULL , SSN char(9) NULL , FName varchar(50) , LName varchar(50) , Address1 varchar(50) , Address2 varchar(50) , City varchar(50) , State varchar(50) , Zipcode varchar(10) NULL ) GO INSERT INTO #MY_CONTACTS_TBL VALUES ( = '123456789','Marc','Archuleta','123 Sesame St','apartment = 1','Columbus','OH','43215' ) INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 = Dough St',NULL,'Columbus','OH','43215' ) INSERT INTO #MY_CONTACTS_TBL VALUES ( '345678901','John2','Doe','456 = Dough St',NULL,'Columbus','OH','43215' ) INSERT INTO #MY_CONTACTS_TBL VALUES ( '456789012','John','DoeTwo','789 = Dough St',NULL,'Columbus','OH','43215' ) INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','john','doe','123 = dough St',NULL,'columbus','oh','43215' ) INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 = Pumpernickel Ave',NULL,'New York','NY','10016' ) INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 = Dough St.',NULL,'Columbus','OH','43215' ) SELECT DISTINCT FName + ' ' + LName , Address1 , isnull( Address2, '' ) , City + ', ' + State + ' ' + ZipCode FROM #MY_CONTACTS_TBL --=20 Arnie Rowland, Ph.D. Westwood Consulting, Inc Most good judgment comes from experience.=20 Most experience comes from bad judgment.=20 - Anonymous [quoted text, click to view] "archuleta37" <archuleta37@discussions.microsoft.com> wrote in message = news:269374D2-A26E-42B4-8CA1-0BF057600B39@microsoft.com... > I'm working on a mailing list and want to select records where the = both the=20 > SSN number and Address fields are unique, since I may want to send a = piece of=20 > mail to more than one address for a person, but not multiple pieces to = the=20 > same address for that person. I've created a test table and some test = data to=20 > try to get my syntax right (below) but so far have had no luck. >=20 > I'm not certain if I can do this using DISTINCT or the UNIQUE = keyword, nor=20 > how to get my syntax right. Could someone help me with this? >=20 > Thanks in advance :-) > Marc >=20 >=20 > ************* CREATE TABLE ***************************** > USE [Northwind] > GO > /****** Object: Table [dbo].[MY_CONTACTS_TBL] Script Date: = 08/17/2006=20 > 09:07:19 ******/ > SET ANSI_NULLS ON > GO > SET QUOTED_IDENTIFIER ON > GO > SET ANSI_PADDING ON > GO > CREATE TABLE [dbo].[MY_CONTACTS_TBL]( > [UID] [smallint] IDENTITY(1,1) NOT NULL, > [SSN] [bigint] NULL, > [FNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > [LNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > [ADDRESS1] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > [ADDRESS2] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > [CITY] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > [STATE] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > [ZIP] [int] NULL, > CONSTRAINT [PK_MY_CONTACTS_TBL] PRIMARY KEY CLUSTERED=20 > ( > [UID] ASC > )WITH (PAD_INDEX =3D OFF, IGNORE_DUP_KEY =3D OFF) ON [PRIMARY] > ) ON [PRIMARY] >=20 > GO > SET ANSI_PADDING OFF >=20 > ******************* TABLE DATA ***************************** > UID,SSN,FNAME,LNAME,ADDRESS1,ADDRESS2,CITY,STATE,ZIP > 1,123456789,Marc,Archuleta,123 Sesame St,apartment 1,Columbus,OH,43215 > 2,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 > 3,345678901,John2,Doe,456 Dough St,,Columbus,OH,43215 > 4,456789012,John,DoeTwo,789 Dough St,,Columbus,OH,43215 > 5,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 > 6,234567890,John,Doe,123 Pumpernickel Ave,,New York,NY,10016
Arnie, You make some good points about the table design. Also, thanks for shedding a bit more light on how the DISTINCT keyword works. Based on the query you wrote, I see my mistake was to include the UID field in my query. I don't think I will actually need that field so that should do the the trick. But out of curiosity, I wonder how the query could be written if I did need the UID field. Would I use multiple select statements and the IN keyword or somthing like that? [quoted text, click to view] "Arnie Rowland" wrote: > First, a suggestion. > > Make the ZipCode field a varchar() -either 9 or 10, depending upon what you do with the dash. If you use an int, leading zeros (upper Northeast States) is lost, and 9 digit zipcodes (e.g., 98765-4321) will do the math and store the result. > > Similar leading zero problem with SSN > > The use of DISTINCT serves two purposes. First, it makes sure that there are no duplicates by testing all of the selected columns as a unit -ignoring capitalization differences (with the standard SQL_Latin1_General_CP1_CI_AS collation). Second, it provides a 'sorted' output since it has to accomplish an ORDER BY just to remove duplicates. However, if there is any character, including space, period, etc., difference, the comparison fails. > > CREATE TABLE dbo.#MY_CONTACTS_TBL > ( UID smallint IDENTITY(1,1) NOT NULL > , SSN char(9) NULL > , FName varchar(50) > , LName varchar(50) > , Address1 varchar(50) > , Address2 varchar(50) > , City varchar(50) > , State varchar(50) > , Zipcode varchar(10) NULL > ) > GO > > INSERT INTO #MY_CONTACTS_TBL VALUES ( '123456789','Marc','Archuleta','123 Sesame St','apartment 1','Columbus','OH','43215' ) > INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 Dough St',NULL,'Columbus','OH','43215' ) > INSERT INTO #MY_CONTACTS_TBL VALUES ( '345678901','John2','Doe','456 Dough St',NULL,'Columbus','OH','43215' ) > INSERT INTO #MY_CONTACTS_TBL VALUES ( '456789012','John','DoeTwo','789 Dough St',NULL,'Columbus','OH','43215' ) > INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','john','doe','123 dough St',NULL,'columbus','oh','43215' ) > INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 Pumpernickel Ave',NULL,'New York','NY','10016' ) > INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 Dough St.',NULL,'Columbus','OH','43215' ) > > SELECT DISTINCT > FName + ' ' + LName > , Address1 > , isnull( Address2, '' ) > , City + ', ' + State + ' ' + ZipCode > FROM #MY_CONTACTS_TBL > > -- > Arnie Rowland, Ph.D. > Westwood Consulting, Inc > > Most good judgment comes from experience. > Most experience comes from bad judgment. > - Anonymous > > > "archuleta37" <archuleta37@discussions.microsoft.com> wrote in message news:269374D2-A26E-42B4-8CA1-0BF057600B39@microsoft.com... > > I'm working on a mailing list and want to select records where the both the > > SSN number and Address fields are unique, since I may want to send a piece of > > mail to more than one address for a person, but not multiple pieces to the > > same address for that person. I've created a test table and some test data to > > try to get my syntax right (below) but so far have had no luck. > > > > I'm not certain if I can do this using DISTINCT or the UNIQUE keyword, nor > > how to get my syntax right. Could someone help me with this? > > > > Thanks in advance :-) > > Marc > > > > > > ************* CREATE TABLE ***************************** > > USE [Northwind] > > GO > > /****** Object: Table [dbo].[MY_CONTACTS_TBL] Script Date: 08/17/2006 > > 09:07:19 ******/ > > SET ANSI_NULLS ON > > GO > > SET QUOTED_IDENTIFIER ON > > GO > > SET ANSI_PADDING ON > > GO > > CREATE TABLE [dbo].[MY_CONTACTS_TBL]( > > [UID] [smallint] IDENTITY(1,1) NOT NULL, > > [SSN] [bigint] NULL, > > [FNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > > [LNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > > [ADDRESS1] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > > [ADDRESS2] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > > [CITY] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > > [STATE] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, > > [ZIP] [int] NULL, > > CONSTRAINT [PK_MY_CONTACTS_TBL] PRIMARY KEY CLUSTERED > > ( > > [UID] ASC > > )WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY] > > ) ON [PRIMARY] > > > > GO > > SET ANSI_PADDING OFF > > > > ******************* TABLE DATA ***************************** > > UID,SSN,FNAME,LNAME,ADDRESS1,ADDRESS2,CITY,STATE,ZIP > > 1,123456789,Marc,Archuleta,123 Sesame St,apartment 1,Columbus,OH,43215 > > 2,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 > > 3,345678901,John2,Doe,456 Dough St,,Columbus,OH,43215 > > 4,456789012,John,DoeTwo,789 Dough St,,Columbus,OH,43215 > > 5,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 > > 6,234567890,John,Doe,123 Pumpernickel Ave,,New York,NY,10016
That would be more troublesome, since there would be two or more UIDs = for a duplicated set of fields. Which one to pick? You could use a GROUP BY, and decide that you want the max( UID ) or = min( UID ). Something like this: SELECT max( UID ) , FullName =3D ( FName + ' ' + LName ) , Address1 , Address2 =3D ( isnull( Address2, '' )) , CityState =3D ( City + ', ' + State + ' ' + ZipCode ) FROM #MY_CONTACTS_TBL GROUP BY FNAME , LNAME , Address1 , Address2 , City , State =20 , ZipCode --=20 Arnie Rowland, Ph.D. Westwood Consulting, Inc Most good judgment comes from experience.=20 Most experience comes from bad judgment.=20 - Anonymous [quoted text, click to view] "archuleta37" <archuleta37@discussions.microsoft.com> wrote in message = news:CEAE9B75-4BB7-428C-8B1C-2A012F145DBC@microsoft.com... > Arnie, >=20 > You make some good points about the table design. Also, thanks for = shedding=20 > a bit more light on how the DISTINCT keyword works. Based on the query = you=20 > wrote, I see my mistake was to include the UID field in my query. I = don't=20 > think I will actually need that field so that should do the the trick. = But=20 > out of curiosity, I wonder how the query could be written if I did = need the=20 > UID field. Would I use multiple select statements and the IN keyword = or=20 > somthing like that? >=20 >=20 >=20 >=20 >=20 > "Arnie Rowland" wrote: >=20 >> First, a suggestion.=20 >>=20 >> Make the ZipCode field a varchar() -either 9 or 10, depending upon =
what you do with the dash. If you use an int, leading zeros (upper = Northeast States) is lost, and 9 digit zipcodes (e.g., 98765-4321) will = do the math and store the result. [quoted text, click to view] >>=20 >> Similar leading zero problem with SSN >>=20 >> The use of DISTINCT serves two purposes. First, it makes sure that =
there are no duplicates by testing all of the selected columns as a unit = -ignoring capitalization differences (with the standard = SQL_Latin1_General_CP1_CI_AS collation). Second, it provides a 'sorted' = output since it has to accomplish an ORDER BY just to remove duplicates. = However, if there is any character, including space, period, etc., = difference, the comparison fails. [quoted text, click to view] >>=20 >> CREATE TABLE dbo.#MY_CONTACTS_TBL >> ( UID smallint IDENTITY(1,1) NOT NULL >> , SSN char(9) NULL >> , FName varchar(50) >> , LName varchar(50) >> , Address1 varchar(50) >> , Address2 varchar(50) >> , City varchar(50) >> , State varchar(50) >> , Zipcode varchar(10) NULL >> ) >> GO >>=20 >> INSERT INTO #MY_CONTACTS_TBL VALUES ( =
'123456789','Marc','Archuleta','123 Sesame St','apartment = 1','Columbus','OH','43215' ) [quoted text, click to view] >> INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 = Dough St',NULL,'Columbus','OH','43215' ) >> INSERT INTO #MY_CONTACTS_TBL VALUES ( '345678901','John2','Doe','456 = Dough St',NULL,'Columbus','OH','43215' ) >> INSERT INTO #MY_CONTACTS_TBL VALUES ( =
'456789012','John','DoeTwo','789 Dough St',NULL,'Columbus','OH','43215' = ) [quoted text, click to view] >> INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','john','doe','123 = dough St',NULL,'columbus','oh','43215' ) >> INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 = Pumpernickel Ave',NULL,'New York','NY','10016' ) >> INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 = Dough St.',NULL,'Columbus','OH','43215' ) >>=20 >> SELECT DISTINCT >> FName + ' ' + LName >> , Address1 >> , isnull( Address2, '' ) >> , City + ', ' + State + ' ' + ZipCode >> FROM #MY_CONTACTS_TBL >>=20 >> --=20 >> Arnie Rowland, Ph.D. >> Westwood Consulting, Inc >>=20 >> Most good judgment comes from experience.=20 >> Most experience comes from bad judgment.=20 >> - Anonymous >>=20 >>=20 >> "archuleta37" <archuleta37@discussions.microsoft.com> wrote in = message news:269374D2-A26E-42B4-8CA1-0BF057600B39@microsoft.com... >> > I'm working on a mailing list and want to select records where the = both the=20 >> > SSN number and Address fields are unique, since I may want to send = a piece of=20 >> > mail to more than one address for a person, but not multiple pieces = to the=20 >> > same address for that person. I've created a test table and some = test data to=20 >> > try to get my syntax right (below) but so far have had no luck. >> >=20 >> > I'm not certain if I can do this using DISTINCT or the UNIQUE = keyword, nor=20 >> > how to get my syntax right. Could someone help me with this? >> >=20 >> > Thanks in advance :-) >> > Marc >> >=20 >> >=20 >> > ************* CREATE TABLE ***************************** >> > USE [Northwind] >> > GO >> > /****** Object: Table [dbo].[MY_CONTACTS_TBL] Script Date: = 08/17/2006=20 >> > 09:07:19 ******/ >> > SET ANSI_NULLS ON >> > GO >> > SET QUOTED_IDENTIFIER ON >> > GO >> > SET ANSI_PADDING ON >> > GO >> > CREATE TABLE [dbo].[MY_CONTACTS_TBL]( >> > [UID] [smallint] IDENTITY(1,1) NOT NULL, >> > [SSN] [bigint] NULL, >> > [FNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, >> > [LNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, >> > [ADDRESS1] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, >> > [ADDRESS2] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, >> > [CITY] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, >> > [STATE] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, >> > [ZIP] [int] NULL, >> > CONSTRAINT [PK_MY_CONTACTS_TBL] PRIMARY KEY CLUSTERED=20 >> > ( >> > [UID] ASC >> > )WITH (PAD_INDEX =3D OFF, IGNORE_DUP_KEY =3D OFF) ON [PRIMARY] >> > ) ON [PRIMARY] >> >=20 >> > GO >> > SET ANSI_PADDING OFF >> >=20 >> > ******************* TABLE DATA ***************************** >> > UID,SSN,FNAME,LNAME,ADDRESS1,ADDRESS2,CITY,STATE,ZIP >> > 1,123456789,Marc,Archuleta,123 Sesame St,apartment = 1,Columbus,OH,43215 >> > 2,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 >> > 3,345678901,John2,Doe,456 Dough St,,Columbus,OH,43215 >> > 4,456789012,John,DoeTwo,789 Dough St,,Columbus,OH,43215 >> > 5,234567890,John,Doe,123 Dough St,,Columbus,OH,43215 >> > 6,234567890,John,Doe,123 Pumpernickel Ave,,New York,NY,10016
Don't see what you're looking for? Try a search.
|