If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. |
|
|
Thread Tools | Display Modes |
#31
|
|||
|
|||
IIf statement
I am sorry Jamie, but I did not see the posts that you posted this morning
until now. -- Barry Guidry "Jamie Collins" wrote: Barry wrote: Sorry, after analyzing the results a bit more I realized that it is also prioritizing the results with a not null [Qual], rather than the opposite. I realized this after adding another row of null [Qual] data with same [Lab_id] as a row with a not null [Qual]. And, it seems to be excluding the row of data that contained only one row of data in a select single [Lab_id] group. Barry, If you are going to change the test data set (e.g. for the better) please post it (rather than describe it) along with the expected results. I took the time to post SQL DDL code to create the test table and the test data using INSERT statements; it would be nice if you could make similar efforts to keep us up to speed. I suspect I'm not the only one finding the changing spec/data situation a little frustrating, albeit a good challenge g. Jamie. -- |
#32
|
|||
|
|||
IIf statement
Jamie, the test table query resulted in an error, but I ran the proposed
solution on my original dataset. The result was the only remaining issue that it sums together the results with null and non-null quals. I do not know if you saw my other posts from today but I changed your first recommended expression a bit with an outcome close to what we are looking for. Please look at it here http://www.microsoft.com/communities...968&sloc=en-us -- Barry Guidry "Jamie Collins" wrote: Barry wrote: we must make an intelligent decision as to a priority of summing the [Result] values if there is any null value in [Qual] within the [Lab_id] group, rather than taking the max of the [Result] if all [Qual] values in a [Lab_id] This is an example of the test data: NAME DATE LAB_ID ANALYTE RESULT QUAL RW-6 2/24/2006 711721 o-Xylene 0.4 ND RW-5 2/24/2006 711722 o-Xylene 0.4 ND RW-1 2/24/2006 711723 m+p-Xylene 2800 RW-1 2/24/2006 711723 o-Xylene 1900 Duplicate 2/24/2006 711724 m+p-Xylene 2500 Duplicate 2/24/2006 711724 o-Xylene 1700 Here's my latest attempt: Create the test table: CREATE TABLE Xylenes ( NAME VARCHAR(12) NOT NULL, [DATE] DATETIME NOT NULL, LAB_ID CHAR(6) NOT NULL, ANALYTE VARCHAR(15) NOT NULL, [RESULT] DECIMAL(17, 2) NOT NULL, QUAL CHAR(2) ) ; Create test data: INSERT INTO Xylenes ( NAME, [DATE], LAB_ID, ANALYTE, [RESULT], QUAL) VALUES ('RW-6', #2006-02-24 00:00:00#, '711721', 'o-Xylene', 0.4, 'ND') ; INSERT INTO Xylenes ( NAME, [DATE], LAB_ID, ANALYTE, [RESULT], QUAL) SELECT DT1.NAME, DT1.[DATE], DT1.LAB_ID, DT1.ANALYTE, DT1.[RESULT], DT1.QUAL FROM ( SELECT 'RW-5' AS NAME, #2006-02-24 00:00:00# AS [DATE], '711722' AS LAB_ID, 'o-Xylene' AS ANALYTE, 0.4 AS [RESULT], 'ND' AS QUAL FROM Xylenes UNION ALL SELECT 'RW-1', #2006-02-24 00:00:00#, '711723', 'm+p-Xylene', 2800, NULL FROM Xylenes UNION ALL SELECT 'RW-1', #2006-02-24 00:00:00#, '711723', 'o-Xylene', 1900, NULL FROM Xylenes UNION ALL SELECT 'Duplicate', #2006-02-24 00:00:00#, '711724', 'm+p-Xylene', 2500, NULL FROM Xylenes UNION ALL SELECT 'Duplicate', #2006-02-24 00:00:00#, '711724', 'o-Xylene', 1700, NULL FROM Xylenes ) AS DT1 ; The proposed solution: SELECT X1.LAB_ID, 'sum_of' AS aggregate_type, SUM(X1.[RESULT]) AS aggregate_value FROM Xylenes AS X1 WHERE EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NULL ) GROUP BY X1.LAB_ID UNION ALL SELECT X1.LAB_ID, 'max_of', MAX(X1.[RESULT]) FROM Xylenes AS X1 WHERE NOT EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NULL ) GROUP BY X1.LAB_ID ORDER BY 1 ; Jamie. -- |
#33
|
|||
|
|||
IIf statement
Yes, Jamie that seems to do it! I have added data to test it with under
various circumstances and it checked out on all scenarios ran so far. I thank you very much for your expertise and for your patience. I will still need to use it to eventually come up with (I guess something like) a 'Make-Table Query' to generate the total results from this query combined with a field or two of generic data auto-filled in, but I think I will think about it a while and start a new thread for that possibly. Thank you very much! -- Barry Guidry AMO Environmental Decisions, Inc. "Jamie Collins" wrote: Barry wrote: I have gotten even closer to my end result by adapting the SQL expression to this: SELECT X1.LAB_ID, 'max_of' AS aggregate_type, MAX(IIF(X1.QUAL IS NULL, NULL, X1.[RESULT])) AS aggregate_value FROM Xylenes AS X1 WHERE NOT EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NULL ) GROUP BY X1.LAB_ID UNION ALL SELECT X1.LAB_ID, 'sum_of', SUM(IIF(X1.QUAL IS NOT NULL, NOT NULL, X1.[RESULT])) FROM Xylenes AS X1 WHERE EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NOT NULL ) GROUP BY X1.LAB_ID ORDER BY 1; by reversing the order of the "SUM" and "MAX" functions, as well as the "WHERE/WHERE NOT EXISTS", to yield these results: LAB_ID aggregate_type aggregate_value 711721 sum_of 2 711722 sum_of 711722 max_of 0.4 711723 sum_of 4700 from this set of test data (added a couple more rows to original dataset): SAMPLE DATE LAB_ID ANALYTE RESULT QUAL RW-6 2/24/2006 711721 o-Xylene 0.4 ND RW-6 2/24/2006 711721 o-Xylene 2 RW-5 2/24/2006 711722 o-Xylene 0.4 ND RW-1 2/24/2006 711723 m+p-Xylene 2800 RW-1 2/24/2006 711723 o-Xylene 1900 RW-1 2/24/2006 711723 m+p-Xylene 1 ND Duplicate 2/24/2006 711724 m+p-Xylene 2500 Duplicate 2/24/2006 711724 o-Xylene 1700 So, the SQL is accurately selecting and sums the results with null [Qual]'s over those with not null [Qual]'s (very good). The only issue that remains is that it did not include a SUM of the two rows of data with the [Lab_id] = "711724"; it appears for some reason that it was due to both rows of data having null [Qual]'s. Maybe because it is only comparing null/not null [Qual]'s and not additionally only summing the results of data with all null [Qual]'s in the same group [Lab_id], right?? This? SELECT X1.LAB_ID, 'max_of' AS aggregate_type, MAX(IIF(X1.QUAL IS NULL, NULL, X1.[RESULT])) AS aggregate_value FROM Xylenes AS X1 WHERE NOT EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NULL ) GROUP BY X1.LAB_ID UNION ALL SELECT X1.LAB_ID, 'sum_of', SUM(IIF(X1.QUAL IS NULL, X1.[RESULT], NULL)) FROM Xylenes AS X1 WHERE EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NULL ) GROUP BY X1.LAB_ID ORDER BY 1; Jamie. -- |
#34
|
|||
|
|||
IIf statement
Barry wrote: Jamie that seems to do it! I have added data to test it with under various circumstances and it checked out on all scenarios ran so far. I thank you very much Thanks to John Spencer. In fact, you may want to consider his construct e.g. SELECT X1.LAB_ID , IIF(EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NULL) , SUM(IIF(X1.QUAL IS NULL, X1.[RESULT], NULL)) , MAX(IIF(X1.QUAL IS NULL, NULL, X1.[RESULT]))) AS aggregate_value FROM Xylenes AS X1 GROUP BY X1.LAB_ID; I'm not sure whether I prefer John's correlated subquery or my UNION ALL approach. Jamie. -- |
#35
|
|||
|
|||
IIf statement
That makes two of us. I would probably try both and see if I got the same
results. If so, then I would try them both on large datasets and see what kind of performance I got with them. I suspect that the Union query might be faster But since you can never be sure what the query engine will do when developing the query plan, who knows. It is probably only academic for small datasets. "Jamie Collins" wrote in message ups.com... Barry wrote: Jamie that seems to do it! I have added data to test it with under various circumstances and it checked out on all scenarios ran so far. I thank you very much Thanks to John Spencer. In fact, you may want to consider his construct e.g. SELECT X1.LAB_ID , IIF(EXISTS ( SELECT * FROM Xylenes AS X2 WHERE X2.LAB_ID = X1.LAB_ID AND X2.QUAL IS NULL) , SUM(IIF(X1.QUAL IS NULL, X1.[RESULT], NULL)) , MAX(IIF(X1.QUAL IS NULL, NULL, X1.[RESULT]))) AS aggregate_value FROM Xylenes AS X1 GROUP BY X1.LAB_ID; I'm not sure whether I prefer John's correlated subquery or my UNION ALL approach. Jamie. -- |
#36
|
|||
|
|||
IIf statement
John Spencer wrote: I'm not sure whether I prefer John's correlated subquery or my UNION ALL approach. That makes two of us. I would probably try both and see if I got the same results. If so, then I would try them both on large datasets and see what kind of performance I got with them. Agreed. Jamie. -- |
Thread Tools | |
Display Modes | |
|
|