A Microsoft Office (Excel, Word) forum. OfficeFrustration

If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

Go Back   Home » OfficeFrustration forum » Microsoft Access » Running & Setting Up Queries
Site Map Home Register Authors List Search Today's Posts Mark Forums Read  

GROUP BY vs DISTINCT



 
 
Thread Tools Display Modes
  #21  
Old June 1st, 2007, 09:22 PM posted to microsoft.public.access.queries
David W. Fenton
external usenet poster
 
Posts: 3,373
Default GROUP BY vs DISTINCT

"Warrio" wrote in
:

Were you the one who said that life is too short??


The test demonstrated exactly what he had forecast, that DISTINCT is
the logical way to get DISTINCT values, and that GROUP BY is for a
different purpose (and would thus very likely not be as fast).

He only did the test because you stop you from continuing to natter
on about it.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/
  #22  
Old June 1st, 2007, 09:23 PM posted to microsoft.public.access.queries
David W. Fenton
external usenet poster
 
Posts: 3,373
Default GROUP BY vs DISTINCT

"Michel Walsh" wrote in
:

and here is the query plan, again, the same, for both statements
(excuse the HTML format, I suspect that without it, it turns out
un-readable):

|--Sort(DISTINCT ORDER BY[pubs].[dbo].[authors].[au_fname]
|ASC))

|--Index Scan(OBJECT[pubs].[dbo].[authors].[aunmind]))


Er, that's a SQL Server query plan, not a Jet one.

Who said we were using SQL Server as the back end?

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/
  #23  
Old June 1st, 2007, 09:40 PM posted to microsoft.public.access.queries
David W. Fenton
external usenet poster
 
Posts: 3,373
Default GROUP BY vs DISTINCT

"Michel Walsh" wrote in
:

and here is the query plan, again, the same, for both statements
(excuse the HTML format, I suspect that without it, it turns out
un-readable):

|--Sort(DISTINCT ORDER BY[pubs].[dbo].[authors].[au_fname]
|ASC))

|--Index Scan(OBJECT[pubs].[dbo].[authors].[aunmind]))


And if you run it in Jet, the SHOWPLAN gives this (Jet 3.5 or 4.0,
run both wiht LastName indexed and indexed (non-unique)):

--- Query1 ---
[SELECT DISTINCT tblPerson.LastName
FROM tblPerson;]

- Inputs to Query -
Table 'tblPerson'
- End inputs to Query -

store result in temporary table


--- Query2 ---
[SELECT tblPerson.LastName
FROM tblPerson
GROUP BY tblPerson.LastName;]

- Inputs to Query -
Table 'tblPerson'
- End inputs to Query -

01) Group table 'tblPerson'

This shows that the answer to the question depends on the database
engine being used, and, as expected, Jet handles the two completely
differently.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/
  #24  
Old June 1st, 2007, 10:09 PM posted to microsoft.public.access.queries
Warrio
external usenet poster
 
Posts: 38
Default GROUP BY vs DISTINCT

it's not that I want to waste my time saying that you are right or write
saying the only way is testing...

but the good thing in computing is that everything has a reason, somewhere
hidden or invisible, but it does exist! and saying that testing will give
the right answer is like gambling! how many parameters can influence your
testing????

even if you make it on a new machine, without any connection and only access
running! you can't have the same result.

come on, computing is one of the rare exact science, don't make something
esle!


"David W. Fenton" a écrit dans le message de
news: ...
"John Spencer" wrote in
:

I am not sure you are correct in your evaluation. I think the
only way to know would be to test with data and to run the tests
multiple times.


Well, surely the way to tell if they are parsed the same is to use
SHOWPLAN and find out. If they are parsed the same, then there isn't
any need to test with data. If they aren't, then you can only answer
the question with testing.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/



  #25  
Old June 2nd, 2007, 09:46 AM posted to microsoft.public.access.queries
Baz
external usenet poster
 
Posts: 380
Default GROUP BY vs DISTINCT

If you are so convinced that Michel is right, on the basis of no evidence
whatsoever, I suggest that YOU better try it.

If someone here told you that the moon is made of green cheese I guess you
would believe that too.

"Warrio" wrote in message
...
Were you the one who said that life is too short?? appearently you have a
lot of time to waste
and about your test, maybe when you were testing the query with GROUP BY,
you had your windows update downloading files

try it 100 times during different times a day and please get back to me


"Baz" a écrit dans le message de

news:
...
Except he's wrong, and so are you. I just tried it (as you easily could
too
if you wanted) on a table with 100,000 records where the field being
selected is a single containing random numbers. DISTINCT is

significantly
faster if the field is not indexed, and an order of magnitude faster if
the
field is indexed.

So now what?

You must have a very special kind of insight if this is going to give

you
a
"vision" of how memory is allocated.

"Warrio" wrote in message
...
Great Thanks Michel for your answer! exactly what I was looking for!

I'll have probably to think about it again once or twice so I can

really
have a vision of how the memory is allocated!

John, the best way would be to have the code that's behind these GROUP

BY
and DISTINCT

the test won't concern a big time difference, plus there is too many
variables within the same machine, you'd have only an average.

thanks again.


"John Spencer" a écrit dans le message de news:
...
I am not sure you are correct in your evaluation. I think the only

way
to
know would be to test with data and to run the tests multiple times.


--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
.

"Michel Walsh" vanderghast@VirusAreFunnierThanSpam wrote in message
...
GROUP BY will NOT be slower, since it forces you to specify which
field(s) you really need to define 'unique-ness', it will be faster,
or
take the same time, at worst, than DISTINCT, but note that GROUP BY

is
more rigid than DISTINCT, since any expression in the SELECT clause

must
then be either in the GROUP BY, either aggregated.


SELECT DISTINCT lastName, firstName, telNumber FROM somewhere

will be slower than

SELECT lastName, firstName, MAX(telNumber) FROM somewhere GROUP BY
lastName, firstName



since not only distinct will uselessly compare the telNumber field,
but
it will also reserve memory to keep it as 'group identifier'; the

second
solution neither does the comparison, neither has to reserve that
extra
memory.



Vanderghast, Access MVP


"Warrio" wrote in message
...
Hello!

Which of GROUP BY and DISTINCT is the quickest? and why?

Thanks for any relevant answer!

PS: Please don't tell me that there is no difference because it's a
matter of one millionth second.













  #26  
Old June 2nd, 2007, 09:51 AM posted to microsoft.public.access.queries
Baz
external usenet poster
 
Posts: 380
Default GROUP BY vs DISTINCT

My God what a pair of idiots (I mean Walsh and Warrio, not you David).

Probably better to let these two fools massage each other's egos in French
while we go and do something more worthwhile.

"David W. Fenton" wrote in message
. 1...
"Michel Walsh" wrote in
:

and here is the query plan, again, the same, for both statements
(excuse the HTML format, I suspect that without it, it turns out
un-readable):

|--Sort(DISTINCT ORDER BY[pubs].[dbo].[authors].[au_fname]
|ASC))

|--Index Scan(OBJECT[pubs].[dbo].[authors].[aunmind]))


Er, that's a SQL Server query plan, not a Jet one.

Who said we were using SQL Server as the back end?

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/



  #27  
Old June 2nd, 2007, 09:56 AM posted to microsoft.public.access.queries
Baz
external usenet poster
 
Posts: 380
Default GROUP BY vs DISTINCT

If you really believe that garbage then you have no chance of getting an
answer to your question. Oh, but I'm forgetting, you have already accepted
Michel's answer, even though he hasn't a clue what he's talking about.

"Warrio" wrote in message
...
it's not that I want to waste my time saying that you are right or write
saying the only way is testing...

but the good thing in computing is that everything has a reason, somewhere
hidden or invisible, but it does exist! and saying that testing will give
the right answer is like gambling! how many parameters can influence your
testing????

even if you make it on a new machine, without any connection and only

access
running! you can't have the same result.

come on, computing is one of the rare exact science, don't make something
esle!


"David W. Fenton" a écrit dans le message

de
news: ...
"John Spencer" wrote in
:

I am not sure you are correct in your evaluation. I think the
only way to know would be to test with data and to run the tests
multiple times.


Well, surely the way to tell if they are parsed the same is to use
SHOWPLAN and find out. If they are parsed the same, then there isn't
any need to test with data. If they aren't, then you can only answer
the question with testing.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/





  #28  
Old June 2nd, 2007, 10:07 AM posted to microsoft.public.access.queries
Baz
external usenet poster
 
Posts: 380
Default GROUP BY vs DISTINCT

It's a good thing he doesn't because speculation and misinformation are no
use to anyone.

"3stone" a écrit dans le message de news:
...
|
| Vanderghast, Access MVP
|

Un grand bonjour amical à Michel !

(dommage qu'il ne vient plus sur ce forum...)

--
A+
Pierre (3stone) Access MVP
Perso:
http://www.3stone.be/
MPFA: http://www.mpfa.info/ (infos générales)



  #29  
Old June 4th, 2007, 02:24 PM posted to microsoft.public.access.queries
Michel Walsh
external usenet poster
 
Posts: 2,404
Default GROUP BY vs DISTINCT

Indeed, I assumed that Jet would solve these simple basic queries the same
way MS SQL Server would do, but it seems Jet optimizer fails to optimize the
second query.

In MS SQL Server, there is no difference between the queries plan, even if
there are multiple fields (indexed, or not) between the DISTINCT and the
logical equivalent GROUP BY.


Vanderghast, Access MVP

"David W. Fenton" wrote in message
. 1...
"Michel Walsh" wrote in
:

and here is the query plan, again, the same, for both statements
(excuse the HTML format, I suspect that without it, it turns out
un-readable):

|--Sort(DISTINCT ORDER BY[pubs].[dbo].[authors].[au_fname]
|ASC))

|--Index Scan(OBJECT[pubs].[dbo].[authors].[aunmind]))


And if you run it in Jet, the SHOWPLAN gives this (Jet 3.5 or 4.0,
run both wiht LastName indexed and indexed (non-unique)):

--- Query1 ---
[SELECT DISTINCT tblPerson.LastName
FROM tblPerson;]

- Inputs to Query -
Table 'tblPerson'
- End inputs to Query -

store result in temporary table


--- Query2 ---
[SELECT tblPerson.LastName
FROM tblPerson
GROUP BY tblPerson.LastName;]

- Inputs to Query -
Table 'tblPerson'
- End inputs to Query -

01) Group table 'tblPerson'

This shows that the answer to the question depends on the database
engine being used, and, as expected, Jet handles the two completely
differently.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/



  #30  
Old June 4th, 2007, 02:47 PM posted to microsoft.public.access.queries
Michel Walsh
external usenet poster
 
Posts: 2,404
Default GROUP BY vs DISTINCT

Half of the pair will humbly tell you that if it just happen that Jet is ...
crippled... in the particular case, that is surely not 'by design', since
clearly, other SQL optimizer achieve to find the SAME plan of execution for
a DISTINCT and the logically equivalent GROUP BY.


There is nothing 'built-in' the concepts of DISTINCT or of GROUP BY that may
leads you to think they logically 'differ', exception made for some syntax
particularities like DISTINCT which can be used with *, while GROUP BY
cannot (due to extra validations carried over when GROUP BY is explicitly
used).



Vanderghast, Access MVP.


"Baz" wrote in message
...
My God what a pair of idiots (I mean Walsh and Warrio, not you David).

Probably better to let these two fools massage each other's egos in French
while we go and do something more worthwhile.

"David W. Fenton" wrote in message
. 1...
"Michel Walsh" wrote in
:

and here is the query plan, again, the same, for both statements
(excuse the HTML format, I suspect that without it, it turns out
un-readable):

|--Sort(DISTINCT ORDER BY[pubs].[dbo].[authors].[au_fname]
|ASC))

|--Index Scan(OBJECT[pubs].[dbo].[authors].[aunmind]))


Er, that's a SQL Server query plan, not a Jet one.

Who said we were using SQL Server as the back end?

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/





 




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Forum Jump


All times are GMT +1. The time now is 11:38 AM.


Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 OfficeFrustration.
The comments are property of their posters.