why should there be only one clustered index in a table?
The clustered index sorts the *data*. You can only sort that data one way, so you can only have one clustered index. -- Tibor Karaszi, SQL Server MVP http://www.karaszi.com/sqlserver/default.asp http://sqlblog.com/blogs/tibor_karaszi [quoted text, click to view] "Shocky" <yusuf.bhiwandiwala@gmail.com> wrote in message news:1172300846.418994.248200@z35g2000cwz.googlegroups.com... > why should there be only one clustered index in a table? >
Did you read the index chapters in Books Online. They explain the index tree and that for a clustered index, the leaf level of the index *is* the data. Here's a SQL Server 2005 Books Online URL describing a clustered index: ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/26b28045-c3c2-465a-b564-bf2189e93fdc.htm -- Tibor Karaszi, SQL Server MVP http://www.karaszi.com/sqlserver/default.asp http://sqlblog.com/blogs/tibor_karaszi [quoted text, click to view] "Shocky" <yusuf.bhiwandiwala@gmail.com> wrote in message news:1172304346.924023.3700@j27g2000cwj.googlegroups.com... > >> The clustered index sorts the *data*. You can only sort that data one way, so you can only have >> one >> clustered index. > > > Am still not clear >
[quoted text, click to view] > The clustered index sorts the *data*. You can only sort that data one way, so you can only have one > clustered index.
Am still not clear
Our company had a client-project that contained a double-cluster, but their situation was quite different (we have not found a reason to re-duplicate their solution). My old SQL 2000 cert software demonstrated this. Why they didn't show two unique indexes together.. hm. My apologies in advance, I use the "Sandwich" analogy a lot. If meat and bread were clustered together: All these would be distinct: tuna and wheat tuna and white tuna and frenchbread Let's say if every option was filled out in the table, the indexes would display itself as if it was a 'cross join'. Regards, C- [quoted text, click to view] "Tibor Karaszi" wrote: > Did you read the index chapters in Books Online. They explain the index tree and that for a > clustered index, the leaf level of the index *is* the data. Here's a SQL Server 2005 Books Online > URL describing a clustered index: > > ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/26b28045-c3c2-465a-b564-bf2189e93fdc.htm > > > -- > Tibor Karaszi, SQL Server MVP > http://www.karaszi.com/sqlserver/default.asp > http://sqlblog.com/blogs/tibor_karaszi > > > "Shocky" <yusuf.bhiwandiwala@gmail.com> wrote in message > news:1172304346.924023.3700@j27g2000cwj.googlegroups.com... > > > >> The clustered index sorts the *data*. You can only sort that data one way, so you can only have > >> one > >> clustered index. > > > > > > Am still not clear > > >
[quoted text, click to view] >> why should there be only one clustered index in a table? <<
SQL Server is still based on physically contigous sequential storage. The clusterd index is based on physically sequential storage ordering, and there can be only one sort order. How many ways can you sort a magnetic tape?
Hi Joe No, this has been discussed here many times. The clustered index does not store the data PHYSICALLY sorted, it is only LOGICALLY sorted.. -- HTH Kalen Delaney, SQL Server MVP http://sqlblog.com [quoted text, click to view] "--CELKO--" <jcelko212@earthlink.net> wrote in message news:1172324918.636808.144360@p10g2000cwp.googlegroups.com... >>> why should there be only one clustered index in a table? << > > SQL Server is still based on physically contigous sequential storage. > The clusterd index is based on physically sequential storage ordering, > and there can be only one sort order. How many ways can you sort a > magnetic tape? > >
The Books Online Tibor referenced has a good architectural description of clustered indexes. Basically, clustered index leaf nodes are the actual table data rows and are ordered on each page by the index key. You can have only one clustered index on a table because data rows can be ordered in only one way at a time. If you also create a non-clustered index or indexed view containing all table columns, you'll effectively get the same behavior as multiple clustered indexes. However, this is not commonly done because of the overhead of duplicating the entire table. -- Hope this helps. Dan Guzman SQL Server MVP [quoted text, click to view] "Shocky" <yusuf.bhiwandiwala@gmail.com> wrote in message news:1172304346.924023.3700@j27g2000cwj.googlegroups.com... > >> The clustered index sorts the *data*. You can only sort that data one >> way, so you can only have one >> clustered index. > > > Am still not clear >
On 24 Feb, 16:38, "Kalen Delaney" <replies@public_newsgroups.com> [quoted text, click to view] wrote: > Hi Joe > > No, this has been discussed here many times. The clustered index does not > store the data PHYSICALLY sorted, it is only LOGICALLY sorted.. >
A fact which demonstrates that the commonly quoted excuse for having only one clustered index ("that the data can only be sorted one way") is bogus. Obviously any number of logical sorts are possible and the same data is in practice recorded in multiple places in different indexes. I imagine that the explanation for allowing just one clustered index is that SQL Server uses cluster keys as bookmarks. I suppose that adding the complexity to allow several different keys for bookmarks would bring no benefit other than the possibilities of covering more queries - something that can be achieved anyway through a non- clustered index. That seems to make sense to me anyway. -- David Portas, SQL Server MVP Whenever possible please post enough code to reproduce your problem. Including CREATE TABLE and INSERT statements usually helps. State what version of SQL Server you are using and specify the content of any error messages. SQL Server Books Online: http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx --
The data is sorted according to the clustered index, it is just not stored physically in that sorted order. Joe repeatedly makes claims to the way SQL Server physically stored the data that are just not true at all. Whether it is physical or logical, the data in the table (which is the leaf level of the clustered index) can only be sorted one way with the current architecture. Each page can only point to one 'next' page, and the chain of pages linked together is the logically sorted table. -- HTH Kalen Delaney, SQL Server MVP http://sqlblog.com [quoted text, click to view] "David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> wrote in message news:1172351478.658536.99680@k78g2000cwa.googlegroups.com... > On 24 Feb, 16:38, "Kalen Delaney" <replies@public_newsgroups.com> > wrote: >> Hi Joe >> >> No, this has been discussed here many times. The clustered index does not >> store the data PHYSICALLY sorted, it is only LOGICALLY sorted.. >> > > A fact which demonstrates that the commonly quoted excuse for having > only one clustered index ("that the data can only be sorted one way") > is bogus. Obviously any number of logical sorts are possible and the > same data is in practice recorded in multiple places in different > indexes. > > I imagine that the explanation for allowing just one clustered index > is that SQL Server uses cluster keys as bookmarks. I suppose that > adding the complexity to allow several different keys for bookmarks > would bring no benefit other than the possibilities of covering more > queries - something that can be achieved anyway through a non- > clustered index. That seems to make sense to me anyway. > > -- > David Portas, SQL Server MVP > > Whenever possible please post enough code to reproduce your problem. > Including CREATE TABLE and INSERT statements usually helps. > State what version of SQL Server you are using and specify the content > of any error messages. > > SQL Server Books Online: > http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx > -- >
[quoted text, click to view] > I guess David is suggesting that you could have more than one set of link > pointers per page....
Yes, that is why I said [quoted text, click to view] >> the data in the table ... can only be sorted one way with the current
architecture -- HTH Kalen Delaney, SQL Server MVP http://sqlblog.com [quoted text, click to view] "Erland Sommarskog" <esquel@sommarskog.se> wrote in message news:Xns98E263834A02Yazorman@127.0.0.1... > Kalen Delaney (replies@public_newsgroups.com) writes: >> Whether it is physical or logical, the data in the table (which is the >> leaf level of the clustered index) can only be sorted one way with the >> current architecture. Each page can only point to one 'next' page, and >> the chain of pages linked together is the logically sorted table. > > I guess David is suggesting that you could have more than one set of link > pointers per page. He discusses the issue of the clustered index keys > being > row locators for the NC index, but that would be addressed with one of the > clustered indexes being promoted to primary clustered index. > > Obviously, in such a solution there would still be a low number of > possible > clustered indexes. (And nevermind that the current space architecture does > not have any space for them.) > > It would not be possible to have more han one of these multiple clustered > index to be contiguous, though. > > > -- > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Shocky (yusuf.bhiwandiwala@gmail.com) writes: [quoted text, click to view] > why should there be only one clustered index in a table?
There can only be one clustered index in a table, because the leaf node of the index is the data pages. However, you can define any number of non-clustered index which includs all columns in the table, which would mean that in practice you would have more than one clustered index. But note that this mean that the data is more than one place, and space consumption rises as a consequence. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
[quoted text, click to view] On 24 Feb, 23:38, Erland Sommarskog <esq...@sommarskog.se> wrote: > Kalen Delaney (replies@public_newsgroups.com) writes: > > Whether it is physical or logical, the data in the table (which is the > > leaf level of the clustered index) can only be sorted one way with the > > current architecture. Each page can only point to one 'next' page, and > > the chain of pages linked together is the logically sorted table. > > I guess David is suggesting that you could have more than one set of link > pointers per page. He discusses the issue of the clustered index keys being > row locators for the NC index, but that would be addressed with one of the > clustered indexes being promoted to primary clustered index. >
Actually I hadn't thought of that Erland. The data in a single index can only be sorted one way in the current architecture. But, that same data can appear in multiple indexes and there could easily be several indexes covering the total set of columns in the table. Those columns could be added at leaf level using the INCLUDE clause or they could be part of the index key itself. So in my view "sorted one way" isn't an adequate explanation of why only one clustered index is permitted and I'm not surprised that it baffles some people. It seems like a better explanation is the need for SQL Server to identify each data row uniquely in every index. A clustered index causes the index key to be substituted in place of the RIDs that are otherwise used in nonclustered indexes. There is no benefit I can think of to having more than one such key, therefore only one clustered index is necessary. Unless this subtle distinction is understood it seems quite reasonable for the OP to speculate about multiple clustered indexes, each one containing the same set of data at leaf level - as Gert-Jan points out, to some extent that is what the INCLUDE keyword achieves. Let's be careful to distinguish between a Table, which is part of the database's Logical Model, and an Index, which is part of the physical implementation of that model. Hopefully we will agree that the phrase "logically sorted table" is a contradiction in terms! -- David Portas, SQL Server MVP Whenever possible please post enough code to reproduce your problem. Including CREATE TABLE and INSERT statements usually helps. State what version of SQL Server you are using and specify the content of any error messages. SQL Server Books Online: http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx --
Kalen Delaney (replies@public_newsgroups.com) writes: [quoted text, click to view] > Whether it is physical or logical, the data in the table (which is the > leaf level of the clustered index) can only be sorted one way with the > current architecture. Each page can only point to one 'next' page, and > the chain of pages linked together is the logically sorted table.
I guess David is suggesting that you could have more than one set of link pointers per page. He discusses the issue of the clustered index keys being row locators for the NC index, but that would be addressed with one of the clustered indexes being promoted to primary clustered index. Obviously, in such a solution there would still be a low number of possible clustered indexes. (And nevermind that the current space architecture does not have any space for them.) It would not be possible to have more han one of these multiple clustered index to be contiguous, though. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
David Portas (REMOVE_BEFORE_REPLYING_dportas@acm.org) writes: [quoted text, click to view] > It seems like a better explanation is the need for SQL Server to > identify each data row uniquely in every index. A clustered index > causes the index key to be substituted in place of the RIDs that are > otherwise used in nonclustered indexes. There is no benefit I can > think of to having more than one such key, therefore only one > clustered index is necessary. Unless this subtle distinction is > understood it seems quite reasonable for the OP to speculate about > multiple clustered indexes, each one containing the same set of data > at leaf level - as Gert-Jan points out, to some extent that is what > the INCLUDE keyword achieves.
Unfortunately, I think that explanation has to count as an after- construction. The fact that the clustered key is the is the row locator for the NC indexes is a novelty added in SQL 7. And it was not possible to have multiple clustered indexes in SQL 6.5 or earlier either. To have multiple clustered indexes, you need to have either 1) duplicated data pages (which can be achieved with all-inclusive NC indexes or indexed views) 2) multiple next/prev pointers on page level. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
[quoted text, click to view] Erland Sommarskog wrote: > > Kalen Delaney (replies@public_newsgroups.com) writes: > > Whether it is physical or logical, the data in the table (which is the > > leaf level of the clustered index) can only be sorted one way with the > > current architecture. Each page can only point to one 'next' page, and > > the chain of pages linked together is the logically sorted table. > > I guess David is suggesting that you could have more than one set of link > pointers per page. He discusses the issue of the clustered index keys being > row locators for the NC index, but that would be addressed with one of the > clustered indexes being promoted to primary clustered index. > > Obviously, in such a solution there would still be a low number of possible > clustered indexes. (And nevermind that the current space architecture does > not have any space for them.) > > It would not be possible to have more han one of these multiple clustered > index to be contiguous, though. > > -- > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx Before SQL Server 2005, the feature that was mentioned most as being special for the clustered index, was the fact that the leaf level of the clustered index will always contain the table data. And since creating an index of all columns of the table was not considered equal (or simply not possible), this was a big difference. But as you all know, SQL Server 2005 supports the INCLUDE keyword, which makes it possible to include all non-index columns to the leaf level of a nonclustered index. A clustered index has the property that the data will be sorted based on the index order. A nonclustered index with all non-indexed columns included will do exactly the same thing. So that property is no long unique to a clustered index. So although SQL Server will still only allow you create one clustered index using the CLUSTERED keyword, for all practical purposes, you can add as many as you want (up to a few hundred). You just have to omit the CLUSTERED keyword, use the INCLUDE feature and you will have an index where the data is clustered on the index order.
Yes, I agree that the terms clustered and nonclustered can be very confusing. Also, when you start looking deeper into the way data structures are kept track of in SQL Server 2005, the differentiation between heaps and b-trees as the two types of possible organization becomes clearer. Note however, that a PRIMARY KEY CONSTRAINT does not have to imply the table is organized as a B-TREE. You can specify as PRIMARY KEY as nonclustered, or if you already have a clustered index, and you ALTER the table to define a PRIMARY KEY, it will be nonclustered. Also, I may be overlooking something, but you don't need SQL 2005 to have additional indexes that contain all the table's columns, IF the table has fewer than 16 columns and less than 900 bytes per row. -- HTH Kalen Delaney, SQL Server MVP http://sqlblog.com [quoted text, click to view] "Steve Kass" <skass@drew.edu> wrote in message news:OypAyeQWHHA.5108@TK2MSFTNGP06.phx.gbl... > The use of the word CLUSTERED causes all kinds of confusion. > > Tables are organized in one of two ways: B-TREE or HEAP. This > is determined indirectly. If one index on the table is created with the > CLUSTERED keyword, or if a PRIMARY KEY constraint is present > in the CREATE TABLE statement, a table is organized as a B-TREE. > Otherwise, the table is organized as a HEAP. > > B-TREE organization requires that some sequence of columns be > identified as the key columns of the B-tree, and the mechanism > SQL Server provides for identifying these columns is in the list of > key columns of the (sole) clustered index on the table. There can > be only one specification of the way the B-tree is organized, therefore > only one index can be designated as CLUSTERED. > > The purpose of the keyword CLUSTERED, therefore, is to declare that > a) the table is to be organized as a B-tree, and b) to specify the key > columns > for the B-tree. Thus the reason there can be only one clustered index on > a table is because a B-tree cannot be ordered in multiple ways. > > When a table is organized as a B-tree, the B-tree that serves as the > primary store of the table data is referred to as the clustered index > of the table. All indexes have B-tree structure, but only one ("the" > clustered index) can have the following two properties 1) Contains > all columns, i.e., all the table's data; 2) has as its key columns the > columns whose values are used as row locators in all indexes. > > In SQL Server 2005, it is possible to create more than one B-trees > with property 1) above via CREATE INDEX .. INCLUDE. It is > impossible, however, to have two different B-trees each having > property 2) above, because indexes only have room to store one > row locator. The additional "complete table" B-trees that satisfy > property 1) might be called additional clustered indexes, since they > function in many ways as "the" clustered index of a table. > > If I could invent the syntax, I would not use CLUSTERED/NONCLUSTERED. > I would require a table's organization to be specified when it is > created, and then indexes would just be indexes: > > CREATE TABLE T ( > ... > ) AS HEAP > > CREATE TABLE X( > ... > ) AS BTREE(ColA,ColB) > > -- Steve Kass > -- Drew University > -- http://www.stevekass.com > -- A1D8F830-B056-48DC-9A20-FB4931E7B380 > > Shocky wrote: > >>why should there be only one clustered index in a table? >> >>
[quoted text, click to view] On 25 Feb, 18:06, Steve Kass <s...@drew.edu> wrote: > > If I could invent the syntax, I would not use CLUSTERED/NONCLUSTERED. > I would require a table's organization to be specified when it is > created, and then indexes would just be indexes: > > CREATE TABLE T ( > ... > ) AS HEAP > > CREATE TABLE X( > ... > ) AS BTREE(ColA,ColB) > > -- Steve Kass > -- Drew University > -- http://www.stevekass.com > -- A1D8F830-B056-48DC-9A20-FB4931E7B380 > Already I don't like your syntax. The failure to separate "implementation language" from DDL is one of the worst offences of most SQL DBMSs (Oracle and Microsoft anyway). SQL was not intended to be used in that way. So many problems and limitations stem from that one simple design error. -- David Portas, SQL Server MVP Whenever possible please post enough code to reproduce your problem. Including CREATE TABLE and INSERT statements usually helps. State what version of SQL Server you are using and specify the content of any error messages. SQL Server Books Online: http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx --
[quoted text, click to view] On 25 Feb, 20:15, Steve Kass <s...@drew.edu> wrote: > > This would by no means be the first implementation detail > to be part of CREATE TABLE (not that I'm wedded to this > particular syntax, either). What about FILLFACTOR, > ON filegroup, etc.? >
FILLFACTOR and ON filegroup are other equally bad transgressions IMO. [quoted text, click to view] > What I'd really like to see is for the word CLUSTERED to > go away (if not deprecated, be never needed), and that it > be completely clear what table columns are in an index from > the CREATE INDEX and CREATE TABLE statements. Is that > something you'd agree with, and if so, what suggestions > do you have? >
I agree that CLUSTERED / NONCLUSTERED ought to be deprecated. I think you've summarized the problem very well: we need a new syntax that better defines heap vs B-tree and the other index attributes. I'd just prefer to see that syntax kept separate from constraint and table definitions. -- David Portas, SQL Server MVP Whenever possible please post enough code to reproduce your problem. Including CREATE TABLE and INSERT statements usually helps. State what version of SQL Server you are using and specify the content of any error messages. SQL Server Books Online: http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx --
The use of the word CLUSTERED causes all kinds of confusion. Tables are organized in one of two ways: B-TREE or HEAP. This is determined indirectly. If one index on the table is created with the CLUSTERED keyword, or if a PRIMARY KEY constraint is present in the CREATE TABLE statement, a table is organized as a B-TREE. Otherwise, the table is organized as a HEAP. B-TREE organization requires that some sequence of columns be identified as the key columns of the B-tree, and the mechanism SQL Server provides for identifying these columns is in the list of key columns of the (sole) clustered index on the table. There can be only one specification of the way the B-tree is organized, therefore only one index can be designated as CLUSTERED. The purpose of the keyword CLUSTERED, therefore, is to declare that a) the table is to be organized as a B-tree, and b) to specify the key columns for the B-tree. Thus the reason there can be only one clustered index on a table is because a B-tree cannot be ordered in multiple ways. When a table is organized as a B-tree, the B-tree that serves as the primary store of the table data is referred to as the clustered index of the table. All indexes have B-tree structure, but only one ("the" clustered index) can have the following two properties 1) Contains all columns, i.e., all the table's data; 2) has as its key columns the columns whose values are used as row locators in all indexes. In SQL Server 2005, it is possible to create more than one B-trees with property 1) above via CREATE INDEX .. INCLUDE. It is impossible, however, to have two different B-trees each having property 2) above, because indexes only have room to store one row locator. The additional "complete table" B-trees that satisfy property 1) might be called additional clustered indexes, since they function in many ways as "the" clustered index of a table. If I could invent the syntax, I would not use CLUSTERED/NONCLUSTERED. I would require a table's organization to be specified when it is created, and then indexes would just be indexes: CREATE TABLE T ( .... ) AS HEAP CREATE TABLE X( .... ) AS BTREE(ColA,ColB) -- Steve Kass -- Drew University -- http://www.stevekass.com -- A1D8F830-B056-48DC-9A20-FB4931E7B380 [quoted text, click to view] Shocky wrote: >why should there be only one clustered index in a table? > >
[quoted text, click to view] Kalen Delaney wrote: >Yes, I agree that the terms clustered and nonclustered can be very >confusing. Also, when you start looking deeper into the way data structures >are kept track of in SQL Server 2005, the differentiation between heaps and >b-trees as the two types of possible organization becomes clearer. > >Note however, that a PRIMARY KEY CONSTRAINT does not have to imply the table >is organized as a B-TREE. You can specify as PRIMARY KEY as nonclustered, or >if you already have a clustered index, and you ALTER the table to define a >PRIMARY KEY, it will be nonclustered. > >
Thanks. That's right - the only use of PRIMARY KEY that forces B-tree organization [quoted text, click to view] >Also, I may be overlooking something, but you don't need SQL 2005 to have >additional indexes that contain all the table's columns, IF the table has >fewer than 16 columns and less than 900 bytes per row. > > >
True. I didn't say what I was thinking, which was that only in 2005 can you create a second B-tree that really works like a clustered index for retrieval. With INCLUDE, you can not only make sure the index contains all columns, but you can do so without having to make all the columns key columns, doing which causes the index to be unnecessarily bloated above the leaf level. This is a valuable property of "clustered indexes" that you only get once per table before 2005, and I should have mentioned it as the property that distinguishes 2005 from 2000 here (I only mentioned the "all columns" property, which is not the distinctive one.). In 2005, INCLUDE provides this ability to organize any index this way: as a B-tree based on a subset of columns. As a result, in 2005 one more thing we used to only get with CLUSTERED is now available for any index. Thanks for pointing this out.
On 24 Feb 2007 05:48:38 -0800, "--CELKO--" <jcelko212@earthlink.net> [quoted text, click to view] wrote: >>> why should there be only one clustered index in a table? << > >SQL Server is still based on physically contigous sequential storage.
Strictly speaking, this isn't even close to accurate. You don't have to have any clustered index on a table, in which case they are just heaps in logically sequential storage, which has no relation to physically contiguous storage on any normal disk volume. Within a filegroup records can be interleaved with other tables' records and with indexes, and using anything beyond a single filegroup is relatively uncommon in practice. Apparently SQL2000/2005 supports raw partitions, but I have never seen one used, and they are certainly not required, and I am not aware that SQLServer uses their contiguous/sequential nature to any advantage. Josh
On 23 Feb 2007 23:07:26 -0800, "Shocky" <yusuf.bhiwandiwala@gmail.com> [quoted text, click to view] wrote: >why should there be only one clustered index in a table?
is there any reason why you would like more, or are you just asking? J.
David, I'm suggesting that the decision to represent a table as a heap or as a B-tree is an important decision that should not be addressed indirectly in the language. It could be addressed directly (as I suggested) or not in DDL at all (or simply not at all, period). I just think that the notion of CLUSTERED has caused a huge amount of confusion. It makes it harder to explain what an index is. Right now, An index may or may not include some or all of the column values for columns unnamed when the index is created. If the index is clustered, it includes all. If the index is nonclustered, what information the index includes depends on whether or not the table is stored as a heap or not, and if not, what the clustered index or clustered primary key column list is. It sure would be easier to say that an index contains 1) the columns you list as key columns when you create it (keyed), 2) plus any columns you INCLUDE when you create it (nonkeyed), 3) plus the row locator 4) plus nothing else Right now, what's in an index is hard to explain, and it can change if you add or drop a different index. I agree that the whole notion of indexes should best be separate from DDL, but this issue of heap vs. b-tree is currently in limbo, which is worse. This would by no means be the first implementation detail to be part of CREATE TABLE (not that I'm wedded to this particular syntax, either). What about FILLFACTOR, ON filegroup, etc.? What I'd really like to see is for the word CLUSTERED to go away (if not deprecated, be never needed), and that it be completely clear what table columns are in an index from the CREATE INDEX and CREATE TABLE statements. Is that something you'd agree with, and if so, what suggestions do you have? SK [quoted text, click to view] David Portas wrote: >On 25 Feb, 18:06, Steve Kass <s...@drew.edu> wrote: > > >>If I could invent the syntax, I would not use CLUSTERED/NONCLUSTERED. >>I would require a table's organization to be specified when it is >>created, and then indexes would just be indexes: >> >>CREATE TABLE T ( >>... >>) AS HEAP >> >>CREATE TABLE X( >>... >>) AS BTREE(ColA,ColB) >> >>-- Steve Kass >>-- Drew University >>-- http://www.stevekass.com >>-- A1D8F830-B056-48DC-9A20-FB4931E7B380 >> >> >> > >Already I don't like your syntax. The failure to separate >"implementation language" from DDL is one of the worst offences of >most SQL DBMSs (Oracle and Microsoft anyway). SQL was not intended to >be used in that way. So many problems and limitations stem from that >one simple design error. > >-- >David Portas, SQL Server MVP > >Whenever possible please post enough code to reproduce your problem. >Including CREATE TABLE and INSERT statements usually helps. >State what version of SQL Server you are using and specify the content >of any error messages. > >SQL Server Books Online: > http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx >-- > >
Another thing: if we didn't have this CLUSTERED word, maybe no one would complain when they didn't get their unORDERed results in "clustered index order". SK [quoted text, click to view] David Portas wrote: >On 25 Feb, 18:06, Steve Kass <s...@drew.edu> wrote: > > >>If I could invent the syntax, I would not use CLUSTERED/NONCLUSTERED. >>I would require a table's organization to be specified when it is >>created, and then indexes would just be indexes: >> >>CREATE TABLE T ( >>... >>) AS HEAP >> >>CREATE TABLE X( >>... >>) AS BTREE(ColA,ColB) >> >>-- Steve Kass >>-- Drew University >>-- http://www.stevekass.com >>-- A1D8F830-B056-48DC-9A20-FB4931E7B380 >> >> >> > >Already I don't like your syntax. The failure to separate >"implementation language" from DDL is one of the worst offences of >most SQL DBMSs (Oracle and Microsoft anyway). SQL was not intended to >be used in that way. So many problems and limitations stem from that >one simple design error. > >-- >David Portas, SQL Server MVP > >Whenever possible please post enough code to reproduce your problem. >Including CREATE TABLE and INSERT statements usually helps. >State what version of SQL Server you are using and specify the content >of any error messages. > >SQL Server Books Online: > http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx >-- > >
[quoted text, click to view] "Gert-Jan Strik" <sorry@toomuchspamalready.nl> wrote in message news:45E1E0B2.67E9E9E9@toomuchspamalready.nl...
<snip> [quoted text, click to view] > P.S. Of course, earlier I should have posted the disclaimer that some > data types cannot be included using the INCLUDE clause of CREATE INDEX > command, such as text and varchar(max) >
Text, ntext and image can't be included using the INCLUDE clause. But varchar(max), nvarchar(max) and varbinary(max) can (at least according to BOL, I admit I've never tried it). Tom <snip>
Steve, No disrespect, but I don't see the difference, whether you specify heap or B-tree during table creation or afterwards using index creation. With the current architecture, the ramifications are the same. One thing is not entirely clear though. Your syntax for specifying a table as B-tree would not allow a non-unique set of columns, would it? That seems a major drawback! As for your property 2: a nonclustered index that includes all other columns does not need a row locator to the data in the heap/clustered index. Obviously, it is not very space efficient and it will have additional cost (overhead), but you can even create a heap with a nonclustered index that includes all columns. This creates both a heap *and* a B-tree organised table. So in the end, I would describe the clustered index as the index that includes the data at the leaf level "for free", so without data duplication and without additional data management overhead. Gert-Jan P.S. Of course, earlier I should have posted the disclaimer that some data types cannot be included using the INCLUDE clause of CREATE INDEX command, such as text and varchar(max) [quoted text, click to view] Steve Kass wrote: > > The use of the word CLUSTERED causes all kinds of confusion. > > Tables are organized in one of two ways: B-TREE or HEAP. This > is determined indirectly. If one index on the table is created with the > CLUSTERED keyword, or if a PRIMARY KEY constraint is present > in the CREATE TABLE statement, a table is organized as a B-TREE. > Otherwise, the table is organized as a HEAP. > > B-TREE organization requires that some sequence of columns be > identified as the key columns of the B-tree, and the mechanism > SQL Server provides for identifying these columns is in the list of > key columns of the (sole) clustered index on the table. There can > be only one specification of the way the B-tree is organized, therefore > only one index can be designated as CLUSTERED. > > The purpose of the keyword CLUSTERED, therefore, is to declare that > a) the table is to be organized as a B-tree, and b) to specify the key > columns > for the B-tree. Thus the reason there can be only one clustered index on > a table is because a B-tree cannot be ordered in multiple ways. > > When a table is organized as a B-tree, the B-tree that serves as the > primary store of the table data is referred to as the clustered index > of the table. All indexes have B-tree structure, but only one ("the" > clustered index) can have the following two properties 1) Contains > all columns, i.e., all the table's data; 2) has as its key columns the > columns whose values are used as row locators in all indexes. > > In SQL Server 2005, it is possible to create more than one B-trees > with property 1) above via CREATE INDEX .. INCLUDE. It is > impossible, however, to have two different B-trees each having > property 2) above, because indexes only have room to store one > row locator. The additional "complete table" B-trees that satisfy > property 1) might be called additional clustered indexes, since they > function in many ways as "the" clustered index of a table. > > If I could invent the syntax, I would not use CLUSTERED/NONCLUSTERED. > I would require a table's organization to be specified when it is > created, and then indexes would just be indexes: > > CREATE TABLE T ( > ... > ) AS HEAP > > CREATE TABLE X( > ... > ) AS BTREE(ColA,ColB) > > -- Steve Kass > -- Drew University > -- http://www.stevekass.com > -- A1D8F830-B056-48DC-9A20-FB4931E7B380 > > Shocky wrote: > > >why should there be only one clustered index in a table? > > > >
Don't see what you're looking for? Try a search.
|