Groups | Blog | Home
all groups > sql server clients > april 2005 >

sql server clients : Expert advised needed.


David Gugick
4/25/2005 12:00:00 AM
[quoted text, click to view]

How did you partition the table? Did you place each partition on its own
array? Also, your table design does not lead to efficient table access.
Your row size already exceeds the maximum allowed by SQL Server of 8060
bytes (yours has a maximum of 16,100 bytes).

I'm guessing that all you've done is add more work for SQL Server. You
should post your DDL for the individual tables (are there check
constraints on the tables for modifying data?)

How do you normally access the table (i/e/ what columns do you access
most frequently?) What you might want to do is create a Customer Note
table that links to the Customer table and move the nvarchar(4000)
columns there. I don't really understand the table design, but if your
long text is long, then you not only risk exceeding the max row size,
but are creating low-density pages, which make the table huge. You could
also change the nvarchar(4000) columns to ntext if they are not accessed
frequently.

Where are your indexes? I think we need more information here...




--
David Gugick
Imceda Software
www.imceda.com
John
4/25/2005 9:30:56 PM
Dear DBAs.....

I would really appreciate if any one give me good advise for that.

I had a table CustomerHistory in which there are 15 Billions of records.
Basically in that table we are archiving the data for last 5 years and we
use that table for reporting purpose.

Cust_ID INT
TableName VARCHAR(50)
ColName VARCHAR(50)
Note NVARCHAR(4000)
Comments NVARCHAR(4000)
CreationDate Datetime

In the report CustomerHistory table joins some more tables and yield the
result in 4 seconds but I thought that would be speed up if I can do
Partitioning the table so I have created following tables.

1) Customer_2001
2) Customer_2002
3) Customer_2003
4) Customer_2004
5) Customer_2005

after that I have rename the table name to CustomerHistory_Bk and created
one view like that


Create View CustomerHistory
Select
Cust_ID ,
TableName ,
ColName ,
Note ,
Comments ,
CreationDate
From CustomerHistory_2001
Union All
Select
Cust_ID ,
TableName ,
ColName ,
Note ,
Comments ,
CreationDate
From CustomerHistory_2002
Union All
Select
Cust_ID ,
TableName ,
ColName ,
Note ,
Comments ,
CreationDate
From CustomerHistory_2003
Union All
Select
Cust_ID ,
TableName ,
ColName ,
Note ,
Comments ,
CreationDate
From CustomerHistory_2004

Union All
Select
Cust_ID ,
TableName ,
ColName ,
Note ,
Comments ,
CreationDate
From CustomerHistory_2005


After that when I run that report so report took 47 seconds... Can any one
add his or her comments regarding this.... what should I do ......

Thanks and have a great day !



Andrew J. Kelly
4/25/2005 9:43:59 PM
Have a look in BooksOnLine under "partitioned views".

--
Andrew J. Kelly SQL MVP


[quoted text, click to view]

John
4/25/2005 9:50:59 PM
Thanks for your reply but I want to know how can I optimized that views... I
have gone through BOL that's why put that question over here...


[quoted text, click to view]

John
4/26/2005 12:00:00 AM

MedicalHistory Table
Seq int
LTkId int
LIId int
TbName nvarchar(50)
FldName nvarchar(50)
Bf nvarchar(4000)
Aft nvarchar(4000)
CDate datetime
SysChDate datetime

No identity column defined. NULL NULL NULL
No rowguidcol column defined.

Indexes:
MedicalHistory_SysChDate_Ind nonclustered located on PRIMARY SysChDate
MedicalHistory_LIId_Ind nonclustered located on PRIMARY LIId
MedicalHistory_LTkId_Ind nonclustered located on PRIMARY LTkId
PK_MedicalHistory nonclustered, unique, primary key located on PRIMARY Seq,
LTkId, LIId

15 Billions record in MedicalHistory table, After that I have breaked that
table like that.

select * into MedicalHistory_2002 from MedicalHistory where year(CDate) =
2002
GO
alter table MedicalHistory_2002
add Constraint MedicalHistory_2002_CDate_Chk check(year(CDate) = 2002)
GO
CREATE INDEX [MedicalHistory_LIId_2002_Ind] ON
[dbo].[MedicalHistory_2002]([LIId])
GO
CREATE INDEX [MedicalHistory_LTkId_2002_Ind] ON
[dbo].[MedicalHistory_2002]([LTkId])
GO
CREATE INDEX [MedicalHistory_SysChDate_2002_Ind] ON
[dbo].[MedicalHistory_2002]([SysChDate])
GO


select * into MedicalHistory_2003 from MedicalHistory where year(CDate) =
2003
GO
alter table MedicalHistory_2003
add Constraint MedicalHistory_2003_CDate_Chk check(year(CDate) = 2003)
GO
CREATE INDEX [MedicalHistory_LIId_2003_Ind] ON
[dbo].[MedicalHistory_2003]([LIId])
GO
CREATE INDEX [MedicalHistory_LTkId_2003_Ind] ON
[dbo].[MedicalHistory_2003]([LTkId])
GO
CREATE INDEX [MedicalHistory_SysChDate_2003_Ind] ON
[dbo].[MedicalHistory_2003]([SysChDate])
GO

select * into MedicalHistory_2004 from MedicalHistory where year(CDate) =
2004
GO
alter table MedicalHistory_2004
add Constraint MedicalHistory_2004_CDate_Chk check(year(CDate) = 2004)
GO
CREATE INDEX [MedicalHistory_LIId_2004_Ind] ON
[dbo].[MedicalHistory_2004]([LIId])
GO
CREATE INDEX [MedicalHistory_LTkId_2004_Ind] ON
[dbo].[MedicalHistory_2004]([LTkId])
GO
CREATE INDEX [MedicalHistory_SysChDate_2004_Ind] ON
[dbo].[MedicalHistory_2004]([SysChDate])
GO

select * into MedicalHistory_2005 from MedicalHistory where year(CDate) =
2005
GO
alter table MedicalHistory_2005
add Constraint MedicalHistory_2005_CDate_Chk check(year(CDate) = 2005)
GO
CREATE INDEX [MedicalHistory_LIId_2005_Ind] ON
[dbo].[MedicalHistory_2005]([LIId])
GO
CREATE INDEX [MedicalHistory_LTkId_2005_Ind] ON
[dbo].[MedicalHistory_2005]([LTkId])
GO
CREATE INDEX [MedicalHistory_SysChDate_2005_Ind] ON
[dbo].[MedicalHistory_2005]([SysChDate])
GO

SP_RENAME 'MEDICALHISTORY','MEDICALHISTORY_BACKUP'

CREATE VIEW MedicalHistory
SELECT Seq ,
LTkId ,
LIId ,
TbName ,
FldName ,
Bf ,
Aft ,
CDate ,
SysChDate
From MedicalHistory_2002
Union All
SELECT Seq ,
LTkId ,
LIId ,
TbName ,
FldName ,
Bf ,
Aft ,
CDate ,
SysChDate
From MedicalHistory_2003
Union All
SELECT Seq ,
LTkId ,
LIId ,
TbName ,
FldName ,
Bf ,
Aft ,
CDate ,
SysChDate
From MedicalHistory_2004
Union All
SELECT Seq ,
LTkId ,
LIId ,
TbName ,
FldName ,
Bf ,
Aft ,
CDate ,
SysChDate
From MedicalHistory_2005


The most important column in that table is "LIId"..... that would be ample
for you to give me some more advised.

Thanks
[quoted text, click to view]

David Gugick
4/26/2005 12:00:00 AM
[quoted text, click to view]
I posted DBCC SHRINKFILE. You used DBCC SHRINKDATABASE.

--
David Gugick
Imceda Software
Hugo Kornelis
4/26/2005 12:00:00 AM
[quoted text, click to view]
(snip)

Hi John,

After reading thorugh the discussion, I think you're missing a few
requirements for partitioned views.

One very important requirement for partitioned views is the existance of a
partitioning column. This is a column that SQL Server can use to decide in
which partition a row belongs. The partitioning column should:
a) be part of the primary key (in each of the partitions), and
b) have a CHECK constraint (in each of the partitions), that does not
overlap with any of the CHECK constraints in the other partitions.

Since you want to partition by year, you'd have to use the column CDate as
partitioning column. But that one doesn't appear to be part of the primary
key. If you can't fix that, you can't partition on the year part of CDate,
period. But if you can change the Primary Key to include CDate (without
violating data integrity), then you still have more changes to make.

If you use SELECT INTO to create the partitions, then there is no CHECK
constraint. You'll have to use CREATE TABLE statements to create the
tables and declare the constraints, then use INSERT INTO ... SELECT to put
in the data.
Another reason to use CREATE TABLE instead of SELECT INTO is to be able to
control which file group each partition is created on. To maximize the
advantage of using a partitioned view, you should create each partition on
it's own drive.
In the end, your script should look something like this:

CREATE TABLE MedHist2002
(CDate datetime NOT NULL
,some other columns
,PRIMARY KEY (CDate, ....)
,CHECK (CDate >= '20020101' AND CDate < '20030101')
) ON Partition1
CREATE TABLE MedHist2003
(CDate datetime NOT NULL
,some other columns
,PRIMARY KEY (CDate, ....)
,CHECK (CDate >= '20030101' AND CDate < '20040101')
) ON Partition2
((etc.))
go
INSERT INTO MedHist2002 (CDate, ...)
SELECT CDate, ...
FROM MedicalHistory
WHERE CDate >= '20020101' AND CDate < '20030101'
INSERT INTO MedHist2003 (CDate, ...)
SELECT CDate, ...
FROM MedicalHistory
WHERE CDate >= '20030101' AND CDate < '20040101'
((etc))
go
sp_rename 'MedicalHistory', 'MedicalHistoryBackup'
go
CREATE VIEW MedicalHistory
AS
SELECT CDate, ...
FROM MedHist2002
UNION ALL
SELECT CDate, ...
FROM MedHist2003
UNION ALL
((etc.))
go

Best, Hugo
--

David Gugick
4/26/2005 9:45:00 AM
[quoted text, click to view]

I'm afraid I won't be able to help much more. I still don't know where
these tables are located. Partitioning the original table into a view on
the same array is not likely to add any performance benefits. You
didn't answer about your long nvarchar columns or how you primarily
access the table.

My guess, and this is completely a guess here based on limited
information, is that you should just redesign the main table. If those
long columns can be NULL and are not fetched with every query (which you
never posted), I would move them as I stated in my previous post to a
related table (even if it's a 1:1 with the main table).

Also, I thought you said the view joins with another table, but I didn't
see that posted and qouldn't recommend it anyway for the partiioned
view.

If you put the old table back and get rid of the partioned view, you can
recover your lost 5GB of drive space :-)


--
David Gugick
Imceda Software
www.imceda.com
John
4/26/2005 10:20:26 AM
Hi David, for testing purpose I have removed all the partitioned tables and
views but still my data file is too high to 8 GB.... I have shrink the log
and data file but no effects and then I have used sp_spaceused command it
gave me that result

database_name database_size unallocated space
----------------------------------------------------------------------------
---------------------------------------- ------------------ ----------------
--
Health 6919.44 MB 1781.97 MB


reserved data index_size
unused
------------------ ------------------ ------------------ ------------------
5259744 KB 1963968 KB 3208520 KB 87256 KB

any idea what should I do now ?

Thanks

[quoted text, click to view]

John
4/26/2005 10:27:42 AM
David

I have used the following command :

drop table MedicalHistory_2001
drop table MedicalHistory_2002
drop table MedicalHistory_2003
drop table MedicalHistory_2004
drop table MedicalHistory_2005

but datafile size is still the same, is there any command to reclaim the
space ?

[quoted text, click to view]

David Gugick
4/26/2005 11:40:00 AM
[quoted text, click to view]

DBCC SHRINKFILE

--
David Gugick
Imceda Software
John
4/26/2005 2:42:40 PM
Dbcc shrinkdatabase('Health')

[Microsoft][ODBC SQL Server Driver]Unspecified error occurred on SQL Server.
Connection may have been terminated by the server.
Server: Msg 8966, Level 16, State 2, Line 1
Could not read and latch page (1:413364) with latch type SH. Latch failed.


any idea?
[quoted text, click to view]

John
4/26/2005 4:47:11 PM
I know I also used that DBCC SHRINKFILE(datafile_name) that didn't any
effects that's why I used DBCC ShrinkDatabase command.

Thanks

[quoted text, click to view]

David Gugick
4/26/2005 7:12:06 PM
[quoted text, click to view]

John,

You need to post exact commands you are executing so we can best assist
you. With DBCC SHRINKFILE, you should be including the target size of
the data file. SQL Server will attempt to relocate pages in order to
free the used space. If you're not expecting a lot of growth over the
next few months, reduce the data file to the currently allocated amount
plus 250MB or so to leave some free space.


--
David Gugick
Imceda Software
www.imceda.com
Hugo Kornelis
4/27/2005 12:00:00 AM
[quoted text, click to view]

Hi John,

Well, the partitioning column doesn't have to *be* the primary key, it
suffices if it is *part of* the primary key.

But if CData is not logically part of the primary key, then I guess that
you just can't use this column for partitioning. You'll either have to
keep all data in one table, or use another column (one that IS part of the
primary key) as your partitioning column.

As David indicates in his first reply in this thread, all you've done with
your current partitioning choice is add more work for SQL Server.

Best, Hugo
--

John
4/27/2005 8:28:39 AM
I used DBCC ShrinkFile('Health_Data')

Thanks

[quoted text, click to view]

John
4/27/2005 8:32:08 AM
Thanks Hugo, thanks for your reply but the thing is that I can't set CData
as a primary key b/c date may be duplicate.....

Thanks

[quoted text, click to view]

Hugo Kornelis
4/28/2005 12:00:00 AM
[quoted text, click to view]
(snip)

Hi John,

I'm afraid I don't understand you. If MedicalHistory is already a view,
then you can't partition it - only tables can be partitioned.

[quoted text, click to view]

No, that's not what I'm suggesting. If Seq, LIId, LTkId is your primary
key, then you should definitely NOT add CDate to it - this would result
in duplicate data, and a whole lot of trouble when you find that out
(probably 6 months or more after the fact) and get to clean up the mess.

What I mean is that you should first identify ALL candidate keys (both
composite and simple) for the table. Then check if any of these contain
a column that might be suitable for partitioning. If there is, make that
candidate key the primary key (and declare the other candidate keys as
UNIQUE), then partition the table. If there isn't, then you'll just have
to accept that this table can't be partitioned.

[quoted text, click to view]

Nothing you do is guaranteed to make things faster. Performance of a SQL
Server depends on many factors, such as the specs and setup of the
hardware, overall server activity, various installation options, other
simultaneous query executions, size, amount and distribution of data,
etc. The only way to find out if a change will improve performance is to
test it. First on a test database, that is setup on a machine that
mimics the production machine as closely as possible, then (if it seems
to be working) promote it to the production machine and monitor the real
performance for a while.

Don't forget to check overall performance. Sometimes, a change that
benefits one part of the application hurts another part. This is often
overlooked.

Best, Hugo
--

John
4/28/2005 8:26:29 AM
thanks for your reply, please tell me one thing like in the table
MedicalHistory but now this is view, has contain composite primary
key(Seq,LIId,LTkId) and I follow the same rule while creating the view, so
according to your previous email you suggest me to make CDate as primary key
like ((Seq,LIId,LTkId, CDate).... am I right ? so that performance would be
fast?

Waiting for your reply.


[quoted text, click to view]

John
4/29/2005 10:03:11 AM
Before creating the Partitioned view I used the following query

select top 1 * from MedicalHistory order by 1 desc (

It took 2 seconds but after partitioned the view I used the following query.

select top 1 * from MedicalHistory order by 1 desc

It took more than 3 minutes.

Can you tell me where I doing wrong.

Thanks

[quoted text, click to view]

John
4/29/2005 10:09:58 AM
Thanks Hugo,,,, please check this trigger I have created....

CREATE TRIGGER MedicalHistory_Insert_TRG ON MedicalHistory INSTEAD OF INSERT
AS
Declare @Year INT,
@Seq INT,
@LInfoId INT,
@LTkId INT,
@TbName NVARCHAR(50),
@FldName NVARCHAR(50),
@Bf NVARCHAR(4000),
@Af NVARCHAR(4000),
@CDate DATETIME

SELECT
@Year = year(CDate),
@Seq = [Seq],
@LInfoId = LInfoId,
@LTkId = LTkId,
@TbName = TbName,
@FldName = FldName,
@Bf = Bf,
@Af = [Af],
@CDate = CDate
FROM INSERTED

IF (@Year = 2001)
BEGIN
IF NOT EXISTS
(

SELECT * FROM MedicalHistory_2001
WHERE [Seq] = @Seq
AND LInfoId = @LInfoId
AND LTkId = @LTkId
)
BEGIN
INSERT INTO MedicalHistory_2001
(
[Seq] ,
LInfoId ,
LTkId ,
TbName ,
FldName ,
Bf ,
[Af] ,
CDate
)
SELECT
@Seq ,
@LInfoId ,
@LTkId ,
@TbName ,
@FldName ,
@Bf ,
@Af ,
@CDate
END
END
ELSE IF (@Year = 2002)
BEGIN

IF NOT EXISTS
(

SELECT * FROM MedicalHistory_2002
WHERE [Seq] = @Seq
AND LInfoId = @LInfoId
AND LTkId = @LTkId
)
BEGIN

INSERT INTO MedicalHistory_2002
(
[Seq] ,
LInfoId ,
LTkId ,
TbName ,
FldName ,
Bf ,
[Af] ,
CDate
)
SELECT
@Seq ,
@LInfoId ,
@LTkId ,
@TbName ,
@FldName ,
@Bf ,
@Af ,
@CDate
END

END
ELSE IF (@Year = 2003)
BEGIN
IF NOT EXISTS
(

SELECT * FROM MedicalHistory_2003
WHERE [Seq] = @Seq
AND LInfoId = @LInfoId
AND LTkId = @LTkId
)
BEGIN

INSERT INTO MedicalHistory_2003
(
[Seq] ,
LInfoId ,
LTkId ,
TbName ,
FldName ,
Bf ,
[Af] ,
CDate
)
SELECT
@Seq ,
@LInfoId ,
@LTkId ,
@TbName ,
@FldName ,
@Bf ,
@Af ,
@CDate
END
END
ELSE IF (@Year = 2004)
BEGIN
IF NOT EXISTS
(

SELECT * FROM MedicalHistory_2004
WHERE [Seq] = @Seq
AND LInfoId = @LInfoId
AND LTkId = @LTkId
)
BEGIN

INSERT INTO MedicalHistory_2004
(
[Seq] ,
LInfoId ,
LTkId ,
TbName ,
FldName ,
Bf ,
[Af] ,
CDate
)
SELECT
@Seq ,
@LInfoId ,
@LTkId ,
@TbName ,
@FldName ,
@Bf ,
@Af ,
@CDate
END
END
ELSE IF (@Year = 2005)
BEGIN
IF NOT EXISTS
(

SELECT * FROM MedicalHistory_2005
WHERE [Seq] = @Seq
AND LInfoId = @LInfoId
AND LTkId = @LTkId
)
BEGIN

INSERT INTO MedicalHistory_2005
(
[Seq] ,
LInfoId ,
LTkId ,
TbName ,
FldName ,
Bf ,
[Af] ,
CDate
)
SELECT
@Seq ,
@LInfoId ,
@LTkId ,
@TbName ,
@FldName ,
@Bf ,
@Af ,
@CDate
END
END


Is it okie ?

Thanks
[quoted text, click to view]

Hugo Kornelis
5/9/2005 12:00:00 AM
[quoted text, click to view]

Hi John,

Sorry for the delay in replying. My vacation is to blame.

As I already told you, if the partitioning column is not part of the
primary key, then you can't create a partitioned view. Your other post
suggests that you are trying to roll your own version, which is fine if
all you want to achieve is to divide the data over multiple table, e.g.
to balance the data over multiple disks, to make archiving historic data
easier or to enable some backup scheme. But you should be aware that
this comes at the price of reduced performance. You don't have a
partitioned view, you have just a bunch of tables and a UNION view that
binds them together.

In a "real" partitioned view, SQL Server can use the partitioning column
to quickly locate which of the underlying table(s) should be searched.
As a result, performance can improve.
In your case, there is no "real" partitioned view. SQL Server doesn't
know how the data is distributed across the tables, so it has to do much
more work to satisfy your query.

If the above sounds a bit vague, that is because the information you
have given is vague as well. I don't know how your tables and views
look, nor what keys and indexes you have or how your data is
distributed. For more specific feedback, you'll have to post SQL scripts
that I can copy and paste into QA to recreate the tables and views on my
test database. See http://www.aspfaq.com/5006 for some excellent
instructions.

[quoted text, click to view]

The only thing I can tell for sure is that you should never use SELECT *
in production code. Allways list the columns you need, and never include
more columns than those that you really need. Reducing the number of
columns always saves on network traffic, and sometimes allows the
optimizer to pick a much quicker execution plan. And SELECT * can prove
to be a maintenance nightmare in the future!

Best, Hugo
--

Hugo Kornelis
5/9/2005 12:00:00 AM
[quoted text, click to view]

Hi John,

No, it isn't. As I'm sure you've already discovered during your tests.

First, I don't know why you do this. If you manage to get this working,
then you definitely won't get any of the benefits of a partitioned view.
As both David and me have already pointed out to you, you are just
letting SQL Server work harder. For a real partitioned view, you don't
have to write a trigger. You just insert into the view and SQL Server
makes sure the data gets written to the correct table.

Anyway, there are also two big errors in your trigger, and you are
forgetting something.

The first is that the trigger will not deal properly with multi-row
inserts. Even if your current application never produces multi-row
inserts, some future change will, and then you're forked.

The second problem is that you have no appropriate handling for
duplicate rows. You check for non-existance before inserting, but don't
handle the case of existance. No error messages, no rollback, no
inserting of the data in some "handle manually" table - nothing. The
transaction to insert the data will appear to succeed, but without
actualy inserting the data. When you finally discover that, over the
years, 0.01% of all rows have disappeared without a trace, you and your
manager will have a field day....

Finally, you forgot to add a trigger to handle updates (don't forget to
include code for the special case of a change to the CDate column, that
might cause the row to move from one table to another) and a trigger to
handle deletes.

Best, Hugo
--

AddThis Social Bookmark Button