View Single Post
  #23  
Old December 6th, 2005, 04:50 PM posted to microsoft.public.access,microsoft.public.access.queries,microsoft.public.access.formscoding
external usenet poster
 
Posts: n/a
Default General solution for missing sequence numbers

You're right, they don't know. Nor do they care. They're scientists, not
computer techs. They expect a database to fill a need, not dictate to them
in matters that do not interest them.

The numbering convention is something used all over the world in this
particular field (parasitic lichenology, to be specific) and calling them
stupid for using a numbering convention seems pointless to me. Records are
generally not deleted, this entire issue came about because gaps in the
numbering appeared due to misuse of a tool not designed for the job.
Duplicates of a sample are sometimes sent to other institutions and the
database now has tools for handling that, but the only time an actual record
might be deleted is if a sample is lost or accidentally destroyed, and in
that case the number is NOT re-used. In fact, I doubt that even then would
the record be deleted, it would probably simply have a remark put in a
comment field to the effect that the sample is no longer in existence, but
the information is still valuable.

And sequential numbering is not 'just wrong'. Thousands of things in daily
life are numbered so, streets, houses, rooms, movie tickets, store receipts.
Can you imagine a library where book catalog numbers were not in sequence?
Or zip codes assigned randomly, instead of at least being geographically
grouped? Certainly, users occasionally have unrealistic expectations and
some education is often necessary. But expecting an entire scientific
community to alter their standards to suit my notions of order is
ridiculous.

--
Pete



"Rob Oldfield" píše v diskusním příspěvku
...

In that situation the issue isn't about how to 'fill in the gaps', it's

how
to explain to a particular group of scientists that they don't have a clue
about how relational databases work. Are they really stupid enough to not
understand that, although my highest ID number is 10000, that I only have
1000 samples? Even if I point out that 9000 of those records are marked

as
'not really a sample' (i.e. the idea of disallowing deletions and marking
the record as inactive instead)?

Expecting sequential numbering is just wrong. It's up to those of us who
actually know about the issues raised by John Vinson to not allow those

who
don't to tell us how to put databases together.


"Peter Danes" wrote in message
...
Such situations are common, for a variety of reasons. Depends on the
database and the user and what the data is for. The particular example

that
inspired this outburst is a mycological database, where the numbers are

used
to sequentially number the scientist's samples. She told me that

numbering
is important for others in the field to know roughly how many samples a
particular researcher has, and for internal inventory purposes, that

they
don't expect to have holes in the numbering sequence.

If someone who has 1,000 samples in their collection publishes something
about their sample number 10,000 and it is known that the person does

not
have anywhere near 10,000 samples, it would be viewed as odd at the very
least, possibly unethical and such a person would find himself not taken
seriously by other researchers. One or two numbers amiss in this

situation
is obviously not a major concern.

And for the internal inventory controls, if someone sees sample 152 next

to
150, they are going to wonder where is number 151. The inventory methods
used expect sequential numbering and a missing number is an indication

of
something wrong. For inventory numbers in the original database, she

used
the record number that appears in the text box of Access's navigation
control in conjunction with an autonumber field. You may guess what sort

of
hash resulted from that. I started out trying to fix a few things for

her
and wound up doing almost a complete re-write of the entire thing and

this
numbering issue is one of the things that surfaced. She wants to be able

to
fill in all the gaps as well as add new numbers to the end as she

collects
new samples.

Pete


"David C. Holley" píse v diskusním príspevku
...
What is the specific *NEED* to find the missing numbers?

Peter Danes wrote:
I occasionally need to determine a number that I don't have in a
sequence, either the first missing one in a gap in a set of

sequential
numbers or the next one in line at the end of a numbered series.

Always
it meant some fumbling around, with either VBA at first or later with

SQL
when I got good enough at it, establishing the proper join parameters

and
such. For SQL experts, this is probably routine and trivial, but for

me
it was always a bit of a chore. The last straw came with a database

which
I recently wrote, where the converted data had such a numbered

series,
and the owner wanted to be able to do both, fill in missing numbers

in
the gaps AND add new numbers at the end.

Walking home from a bar last night, I got to thinking about it and
realized that both problems are actually fairly similar and that a

simple
and general solution is possible.
I put together a simple table containing one field with the following
entries:

1,2,3,4, 8,9,10, 15,16,17,18, 20, 22,23,24,25, 28,29,30

Missing a


5,6,7, 11,12,13,14, 19, 21, 26,27 and 31 on up.

This is the dataset used for all of the following examples.


Finding the next new number at the end of a series with SQL is

trivial;
here is a simplified version of a statement that I found somewhere in

the
discussion groups a few years ago:

SELECT Max(MyTable.MySeqFld)+1 FROM MyTable;

This will return a one-record, one-field recordset containing exactly

one
value: 31, which is one greater than the largest value so far used in
that field. This is what you would want to use instead of Access's
autonumber, if the field is to contain meaningful sequence numbering,
rather than just a unique identifier.

Locating gaps is a little more complicated: it involves a self-join

from
N to N+1 and finding where N+1 doesn't exist, indicating a gap at

that
point.

SELECT MT1.MySeqFld+1
FROM MyTable AS MT1 LEFT JOIN MyTable AS MT2 ON
MT1.MySeqFld+1=MT2.MySeqFld
WHERE MT2.MySeqFld IS NULL;

This generates a recordset of 5, 11, 19, 21, 26, 31, where each value

is
the first missing value in a gap, including the "open gap" at the

end,
and that's where the trick to a general solution begins. Since these
situations normally call for either the first (lowest number) gap or

last
(end of recordset) gap, you need either the first or last record

returned
by this query. Sorting and using the TOP predicate gives you exactly
that.

SELECT TOP 1 MT1.MySeqFld+1
FROM MyTable AS MT1 LEFT JOIN MyTable AS MT2 ON
MT1.MySeqFld+1=MT2.MySeqFld
WHERE MT2.MySeqFld IS NULL
ORDER BY MT1.MySeqFld;

This will again return a one-record, one-field recordset containing
exactly one value: 5, the first missing number in the first gap in

the
sequence. Ascending sort order is the default, so the smallest number

is
the first returned.

SELECT TOP 1 MT1.MySeqFld+1
FROM MyTable AS MT1 LEFT JOIN MyTable AS MT2 ON
MT1.MySeqFld+1=MT2.MySeqFld
WHERE MT2.MySeqFld IS NULL
ORDER BY MT1.MySeqFld DESC;

This will return a one-record, one-field recordset containing exactly

one
value: 31, the same "one greater than the highest value so far used

in
that field" that is returned by the first simple example. Specifying

the
descending order here is necessary, since we want the last (greatest)
record from the set and Access SQL does not have a BOTTOM predicate.

Finally, an even more general statement can be used:

SELECT TOP 1 MT1.MySeqFld+1
FROM MyTable AS MT1 LEFT JOIN MyTable AS MT2 ON
MT1.MySeqFld+1=MT2.MySeqFld
WHERE MT2.MySeqFld IS NULL
ORDER BY ((INSTR("LF",[First or Last (F or L)]) *2)-3)*MT1.MySeqFld;

This expects one parameter, F or L and will return either the first
missing number or the next number at the end of the line. The INSTR
expression evaluates to either -1 or 1 (or -3 if the parameter

supplied
is neither F nor L, but that has the same effect as -1 in this

instance),
that is then used as a multiplier for the sort field, so the sort is
either by the field or by the negative of the field (or 3 times the
negative of the field), giving either ascending or descending order

and
with the TOP 1 predicate again returns exactly the one value of

interest.