If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
Thread Tools | Display Modes |
#21
|
|||
|
|||
Primary Keys
Lyle Fairfield wrote: The answer, for Access/Jet the PK determines the (non-maintained) clustered index, the physical ordering on disk. From http://msdn2.microsoft.com/en-us/library/wd9d69b1.aspx THE CLUSTERED PROPERTY IS IGNORED FOR DATABASES THAT USE THE MICROSOFT JET DATABASE ENGINE BECAUSE THE JET DATABASE ENGINE DOES NOT SUPPORT CLUSTERED INDEXES. No need to shout. Try reading more widely: New features in Jet Version 3.0: http://support.microsoft.com/default.aspx?id=137039 Quote: "Compacting the database now results in the indices being stored in a clustered-index format. While the clustered index isn't maintained until the next compact, performance is still improved ... The new clustered-key compact method is based on the primary key of the table. New data entered will be in time order." ACC2000: Defragment and Compact Database to Improve Performance http://support.microsoft.com/default...b;en-us;209769 Quote: "A disk defragmenter will place all files, including the database file into contiguous clusters on a hard disk ... If a primary key exists in the table, compacting re-stores table records into their Primary Key order. This provides the equivalent of Non-maintained Clustered Indexes" I think the phrase 'not supported' is used to convey the fact that in Jet you cannot specify the clustered index independent of the PRIMARY KEY as you can in, say, SQL Server. It may just mean that there is no syntax for CLUSTERED INDEX. Regardless of what 'not suuported' means, clustered indexes definitely exist for Jet and PRIMARY KEY is the way to leverage them. Jamie. -- |
#22
|
|||
|
|||
Primary Keys
onedaywhen wrote: No need to shout. BTW I tend to put SQL keywrods in uppercase e.g. PRIMARY KEY. Sorry if you thought I was shouting. Jamie. -- |
#23
|
|||
|
|||
Primary Keys
"onedaywhen" wrote in
oups.com: If you've understood the above you should come to the conclusion that a sole autonumber column will never make a good PRIMARY KEY in Access/Jet, because a random/incrementing integer/GUID does not make a good clustered index. I'd suggest that anyone who uses their autonumber column in a BETWEEN or GROUP BY construct has got something wrong in design and/or queries. I'd further suggest that anyone who uses BETWEEN or GROUP BY constructs which do not include columns that comprise their PKs are likely to have made a poor choice of PK. A random PK would result in the placement of records on as many data pages as possible, thus improving concurrency. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#24
|
|||
|
|||
Primary Keys
David W. Fenton wrote: If you've understood the above you should come to the conclusion that a sole autonumber column will never make a good PRIMARY KEY in Access/Jet, because a random/incrementing integer/GUID does not make a good clustered index. A random PK would result in the placement of records on as many data pages as possible, thus improving concurrency. Good point, I was thinking too narrowly. If my table had columns for surname, initials and telephone number and my queries predominantly use BETWEEN on the surname column, having the table physically ordered on telephone number may make my queries perform worse than if the physical order was on surname (can you imagine trying to use a paper copy telephone directory ordered on telephone number g ?!) As I said, the choice of PK should be determined by the SQL DML e.g. you are interested in page locks for updates in a multi-user environment, I'm interested in query performance, etc. Jamie. -- |
#25
|
|||
|
|||
Primary Keys
Jamie Collins wrote: If my table had columns for surname, initials and telephone number and my queries predominantly use BETWEEN on the surname column, having the table physically ordered on telephone number may make my queries perform worse than if the physical order was on surname (can you imagine trying to use a paper copy telephone directory ordered on telephone number g ?!) As I said, the choice of PK should be determined by the SQL DML e.g. you are interested in page locks for updates in a multi-user environment, I'm interested in query performance, etc. Oops! I meant to add: In other words I want to fetch rows on the same page and contiguous pages; you want to maximise the chances of the rows each user will be interested in are on different pages (am I correct?) I think in my simple contacts example physically ordering on surname would provide good concurrency as well. Whatever, it's clear we are both thinking about the Jet implementation (i.e. contiguous storage on disk) when considering PKs. Can everyone else say the same? Jamie. -- |
#26
|
|||
|
|||
Primary Keys
Regardless of what 'not suuported' means, clustered indexes definitely
exist for Jet and PRIMARY KEY is the way to leverage them. One can get some of the advantages of clustered indexes by choosing a meaningful primary key, by compacting ... and, perhaps, by defragging. This is a far cry from the convenience and power or a clustered index. |
#27
|
|||
|
|||
Primary Keys
Lyle Fairfield wrote: One can get some of the advantages of clustered indexes by choosing a meaningful primary key, by compacting ... and, perhaps, by defragging. We seem to be in agreement here i.e. what to consider when choosing a PK. This is a far cry from the convenience and power [of] a clustered index. The gap doesn't appear that wide to me but I'm willing to learn otherwise. Details? Thanks, Jamie. -- |
#28
|
|||
|
|||
Primary Keys
Jamie Collins wrote: Lyle Fairfield wrote: One can get some of the advantages of clustered indexes by choosing a meaningful primary key, by compacting ... and, perhaps, by defragging. We seem to be in agreement here i.e. what to consider when choosing a PK. This is a far cry from the convenience and power [of] a clustered index. The gap doesn't appear that wide to me but I'm willing to learn otherwise. Details? Thanks, Jamie. It appears wide to me. A clustered index in SQL-Server is maintained. No compacting or defragging is required. In Jet, if the table/file is small, the performance advantages will, probably, not be noticed. If the table/file is large, there will be a penalty (time/resources) in compacting. Even if we compact, there is, TTBOMK, no guarantee that the compacting will use contiguous sectors, although we might hope for that if the disk is new. Pages containing consecutive (from the pimary key point of view) might be distant from each other (from the disk's point of view). Defragging is likely to cure this. But defragging is slow. And defragging may result in the space after the MDB/E being used by another file. So that, as soon as we update or insert another record it may again be remoteness (from the disk's point of view) from records with "adjacent" primary keys. And to correct this remoteness we may have to compact and defrag. In a maintained clustered index all of this (we hope) is planned and managed by the database engine. This seems to me to be wide gap. I understand that with a static database, the gap would be considerably less. In general, I agree with you that if one were doing a lot of sql work based on = 'Mainwaring; and = 'Milne' it might, depending on other needs of the db, it might be efficient to use some primary key that helped with identifying the records wanted or processed. I think I have said sufficient about this ...if you reply you shall have the field to yourself. |
#29
|
|||
|
|||
Primary Keys
"Jamie Collins" wrote in
ps.com: David W. Fenton wrote: If you've understood the above you should come to the conclusion that a sole autonumber column will never make a good PRIMARY KEY in Access/Jet, because a random/incrementing integer/GUID does not make a good clustered index. A random PK would result in the placement of records on as many data pages as possible, thus improving concurrency. Good point, I was thinking too narrowly. If my table had columns for surname, initials and telephone number and my queries predominantly use BETWEEN on the surname column, having the table physically ordered on telephone number may make my queries perform worse than if the physical order was on surname (can you imagine trying to use a paper copy telephone directory ordered on telephone number g ?!) As I said, the choice of PK should be determined by the SQL DML e.g. you are interested in page locks for updates in a multi-user environment, I'm interested in query performance, etc. Given that with Jet you can only have the one clustered index, I really think this is a pretty irrelevant consideration. I never use the BETWEEN operator on anything but date fields, which in the vast majority of tables could not possibly ever be a candidate for PK (and very seldom even a candidate for inclusion in a compound natural key, which wouldn't give you the clustered index benefit, anyway, unless the date was the first field of the compound key). So, I just don't see any practical benefit in your pointing out the performance advantage of the clustered index. Have you tested SEEKs on non-PK indexes? -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#30
|
|||
|
|||
Primary Keys
"Jamie Collins" wrote in
ps.com: Jamie Collins wrote: If my table had columns for surname, initials and telephone number and my queries predominantly use BETWEEN on the surname column, having the table physically ordered on telephone number may make my queries perform worse than if the physical order was on surname (can you imagine trying to use a paper copy telephone directory ordered on telephone number g ?!) As I said, the choice of PK should be determined by the SQL DML e.g. you are interested in page locks for updates in a multi-user environment, I'm interested in query performance, etc. Oops! I meant to add: In other words I want to fetch rows on the same page and contiguous pages; you want to maximise the chances of the rows each user will be interested in are on different pages (am I correct?) . . . No. I want to minimize the chance that two users will be editing data on the same data page. . . . I think in my simple contacts example physically ordering on surname would provide good concurrency as well. Whatever, it's clear we are both thinking about the Jet implementation (i.e. contiguous storage on disk) when considering PKs. Can everyone else say the same? Well, either way, it's irrelevant for newly added records before the database is compacted, since those are all going to land in their own data page that is not written back in PK order. As long as you've got records added after the last compact, there's always going to be some data pages (and index data pages) that are not in the final order (whether your natural key order or my random Autonumber order), and thus the performance gain can never be fully realized in an actively used database. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
Thread Tools | |
Display Modes | |
|
|