If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
Thread Tools | Display Modes |
#61
|
|||
|
|||
GROUP BY vs DISTINCT
Where? Here. You don't see it? Well, re-initialize your newsreader. Baz
reports it, and, since I was not believing its experimentation was right, I decided to do it myself and got a similar result that I also reported. Vanderghast, Access MVP. "David W. Fenton" wrote in message . 1... "Michel Walsh" vanderghast@VirusAreFunnierThanSpam wrote in : In Jet, the second query uses the same plan than MS SQL Server, so say the second query, in Jet, also takes 1 UET, but the first query, in Jet, easily takes up to 10 times that. Where do you get that information? I didn't see anyone post any such results. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#62
|
|||
|
|||
What GROUP BY is not...
I really don't know about what Jet does in the step "group" of its query
plan. MS SQL Server is more detailed and uses the index (if it exists, and if the number of records is high enough) of the field to be DISTINCT/GROUP BY. If you implant that strategy yourself, in this case, and if the index exists, since the reading of the records (note the query is covered by the index) is done already sorted, so you can dump the result list onto a stack rather than on a sorted list. Indeed, to find if a record is already in the "selected/grouped" list, you only have to compare it with the top value on the stack. If the index does not exist, you have to relay on stats (if they are available): does the result will have much less rows than the initial data, or not. If the number of row is much smaller in the result, may be preferable to use a sorted list (faster to sort, a little bit longer to search) than to sort the whole initial set first, then use a stack (longer to sort the whole initial set, faster to 'search' for existence). Again, if the index exists, the query being covered by the index, you don't need to touch the data at all, just the index, so, you have less IO (since index would be, generally, more compact, than the 'table', it will resides on less 'pages', so less requirements to hit the hard disk). You see, that is what SQL is: it knows different way of doing the described set, and decide which strategy it will take. If you want to do the same, clearly, you should weight the similar alternatives, and 'hard code' multiple solutions, not just one (in general). Vanderghast, Access MVP "Warrio" wrote in message ... one question then : Does the jet engine sort the selected data before grouping them? If I had to write the function my self, I would create two queries one for a small amount of data and the other for considerable amount of data. Because sorting the data would be useful to avoid make a useless loop on the list of data to display to check if it's displayed already or not, because it's the same as the previous record of the sorted data... |
#63
|
|||
|
|||
What GROUP BY is not...
"David Cox" wrote in
: Most peoples cars could run faster if they opened the hood and made a few adjustments. . There are a lot of Access users out there that do not want to "open the hood", or know anything about what SQL is or does. For those less skilled users learning how to use Group By on the QBE grid will be far more productive in the short to medium term than learning how to edit SQL to add "DISTINCT". Er, you don't have to edit SQL to do that. Using my test query on an old slow PC I would need to run the query over 20 times before it saved me time. I agree that it's not really a significant difference for easy test cases. The other, more worrying, thing about distinct is that I can see no indication that it is in the query when viewed from the grid design window. Stranger still when I added a field x:1 to the design grid the "Group By" version ran faster than Distinct. by a factor around 8:7 DISTINCT shows 1 record for N duplicates. GROUP BY shows 1 record for N duplicates I think I have discovered, at last, why Microsoft did not feel the need for a "DISTINCT" button. I think you're imagining things. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#64
|
|||
|
|||
What GROUP BY is not...
"Michel Walsh" vanderghast@VirusAreFunnierThanSpam wrote in
: Vanderghast, Access MVP "David W. Fenton" wrote in message . 1... "Warrio" wrote in : when I'll have the choice, I'll use the DISTINCT function with a jet until it gets repaired. I don't know that it needs to be "repaired." Because it is far from being optimal. Well, when used WRONG, perhaps. But have you tested it for its actual performance in aggregating data? If Jet would aggregate data more slowly if it optimized GROUP BY to DISTINCT, then I'm glad that it's slower when used WRONGLY as the equivalent of DISTINCT. But nobody has tested that. I see no reason to, since I have nothing to prove here. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#65
|
|||
|
|||
What GROUP BY is not...
"Michel Walsh" vanderghast@VirusAreFunnierThanSpam wrote in
: You are also missing that GROUP BY WITH AGGREGATE can use the similar query plan than the one used by DISTINCT and GROUP BY without aggregate (as you can check my 'claim' using MS SLQ Server), where the aggregates are done for a small fraction (less than half of one percent of the whole query execution). Jet is inefficient in both cases, I mean, with or without aggregate. Have you tested that Jet's GROUP BY for aggregates has a performance problem? I NEVER said that result that are identical implies the execution plan should be the same. I said LOGICALLY EQIVALENT description SHOULD be optimized the same way. And DISTINCT and GROUP BY without aggregate ARE logically equivalent descriptions. But they *aren't* logically equivalent. I'm done here. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#66
|
|||
|
|||
What GROUP BY is not...
1- Yes, they are as slow as without aggregate, while they should run in a
time almost equivalent as DISTINCT, for simple aggregate (that is, not for overly complex arithmetic expression, VBA function, sub-query, ... don't add words I don't say). 2- They are equivalent, as mentioned by Keith Hare, Convenor of the International SQL Standards. I try to find a public reference on the web at the moment, no success yet, but they are logically equivalent. And you are wrong. 3- Allen Browne reports a case when DISTINCT, in JET, is buggy, on the reported case, and YOU HAVE TO use GROUP BY, without aggregate, to really get a list of "distinct" values: http://allenbrowne.com/bug-12.html Vanderghast, Access MVP "David W. Fenton" wrote in message . 1... "Michel Walsh" vanderghast@VirusAreFunnierThanSpam wrote in : You are also missing that GROUP BY WITH AGGREGATE can use the similar query plan than the one used by DISTINCT and GROUP BY without aggregate (as you can check my 'claim' using MS SLQ Server), where the aggregates are done for a small fraction (less than half of one percent of the whole query execution). Jet is inefficient in both cases, I mean, with or without aggregate. Have you tested that Jet's GROUP BY for aggregates has a performance problem? I NEVER said that result that are identical implies the execution plan should be the same. I said LOGICALLY EQIVALENT description SHOULD be optimized the same way. And DISTINCT and GROUP BY without aggregate ARE logically equivalent descriptions. But they *aren't* logically equivalent. I'm done here. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
#67
|
|||
|
|||
What GROUP BY is not...
"David W. Fenton" wrote in message . 1... "David Cox" wrote in : Most peoples cars could run faster if they opened the hood and made a few adjustments. . There are a lot of Access users out there that do not want to "open the hood", or know anything about what SQL is or does. For those less skilled users learning how to use Group By on the QBE grid will be far more productive in the short to medium term than learning how to edit SQL to add "DISTINCT". Er, you don't have to edit SQL to do that. My apologies to the group. It is possible to specify Unique records in the properties window of a query, brought up by left clicking in the design window. I do not like this implementation, so I had never used it, and forgot it existed. a} This is quite a different query with the option selected, but it looks the same in the design window. b} It is potentially misleading. Unique means to me records that are not duplicated, I would naturally tend to interpret distinct the same way. I would prefer to see the word FIRST, or FIRSTOFGROUP used for this feature. Given the way it is I would find it easier to explain to a newbie that Access coceptually uses the first value it finds in a group by default, and expect it to be actually optimised to work that way. Using my test query on an old slow PC I would need to run the query over 20 times before it saved me time. I agree that it's not really a significant difference for easy test cases. The other, more worrying, thing about distinct is that I can see no indication that it is in the query when viewed from the grid design window. Stranger still when I added a field x:1 to the design grid the "Group By" version ran faster than Distinct. by a factor around 8:7 DISTINCT shows 1 record for N duplicates. GROUP BY shows 1 record for N duplicates I think I have discovered, at last, why Microsoft did not feel the need for a "DISTINCT" button. I think you're imagining things. -- David W. Fenton http://www.dfenton.com/ usenet at dfenton dot com http://www.dfenton.com/DFA/ |
Thread Tools | |
Display Modes | |
|
|