#21
How do you mulitply in a field?
On 16 Aug, 17:58, "BruceM" wrote:
> Regarding autonumber or other arbitrary PKs

Wait up. PRIMARY KEY designation is *always* arbitrary, but I think you meant 'artificial key' (or similar). Do you mean keys in the logical model or in the physical implementation...?

> understand that I *never* argued in favor of creating a table for which the "only unique constraint is defined on a sole autonumber column".

Do I not recall correctly a thread where you had an entity type 'trainer' (natural person) where there was no industry standard identifier, and the compound of the available attributes (company identifier, person full name) did not yield uniqueness, so you omitted a key from your logical model and used an autonumber PRIMARY KEY in your Access implementation? Apologies in advance if I've recalled incorrectly (Google Groups archive search seems to be broken just now). Also, note that I was addressing the group rather than you personally; again, sorry if I was not clear.

> If I have a listing of vendors, what is the *unchanging* unique constraint?

In the logical model, a good key should be *stable*; immutability is the ideal, but real life is not always ideal. I too wish all your vendors were issued with an unchanging identifier. In which sector do you operate, e.g. DUNS number? In mine, incorporated bodies must be registered with a government agency which issues a public and unchanging (for all practical purposes) identifier. Although there are incentives for businesses to incorporate, I still need to model unincorporated businesses; fortunately, we have "passing off" laws which make trading name (yes, it can change) plus trading period a reasonably stable key. I also need to model businesses from all other jurisdictions, for which an enterprise key of our own design is used. I've found trusted sources of potentially useful identifiers wary of revealing details (I found the 'DX Exchange' people in the UK most unhelpful). Yes, there are no easy answers, autonumber included.

My main problem with autonumbers is that they do not exist in the logical model, therefore they are not a logical key. This was the problem I recalled from your 'trainers' scenario. A record's uniqueness is one thing. The means of identifying it in relationships is another. In the physical implementation I've no problem with you or anyone using a so-called surrogate if you also have a natural key or enterprise key which exists external to the database. I wouldn't recommend doing it myself for a variety of reasons, e.g. doing so makes data harder to read.

> maybe cascading updates of multi-field keys are not a problem. I would rather avoid them

Ideally (that word again), I think the SQL DBMS should store the key value only once and use an internal surrogate, in the way you do by hand with autonumber, to point to it; this way an ON UPDATE CASCADE action would physically change only one value. Such SQLs exist, but professionally I need to use Microsoft's products, at least in the immediate future; Microsoft doesn't have them, so I go without. So my ideal surrogate would be hidden and I'd still see the real key values in the referencing tables, but that's not the reality for me. With your way (by hand with autonumber) you either have to work with the meaningless values or use a JOIN (or three, or thirteen) to see the real values, so it's not for me, thanks, and good luck to you. You will not change my mind on the subject, nor I yours. One thing that tickles me about the 'autonumber PK' advocates is that they most often use the incremental Long Integer flavour, presumably because it makes the data easier to read! If I cannot dissuade you from autonumber, I urge you to choose random: it will improve concurrency (let's not start on the whole 'physical clustering on disk' thing), you will be less inclined to expose the autonumber to users and, having to type the values over and over, you may come to appreciate a well designed key (fixed width, check digit, etc).
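[Editor's aside: to make the "well designed key (fixed width, check digit)" idea concrete, here is one common scheme, the Luhn mod-10 check digit used by payment card numbers. This is only an illustrative sketch, not anything from the thread; the helper names are invented.]

```python
def luhn_check_digit(body: str) -> str:
    """Compute the Luhn mod-10 check digit for a string of digits."""
    total = 0
    # Walk right to left; double every other digit starting with the rightmost.
    for i, ch in enumerate(reversed(body)):
        d = int(ch)
        if i % 2 == 0:       # doubled positions, counted from the right
            d *= 2
            if d > 9:        # Luhn rule: subtract 9 from two-digit results
                d -= 9
        total += d
    return str((10 - total % 10) % 10)

def make_key(seq: int, width: int = 7) -> str:
    """Fixed-width key: zero-padded sequence number plus a trailing check digit."""
    body = str(seq).zfill(width)
    return body + luhn_check_digit(body)

def is_valid_key(key: str) -> bool:
    """Recompute the check digit and compare it with the key's last character."""
    return key[-1] == luhn_check_digit(key[:-1])
```

A key like this catches single mistyped digits and most adjacent transpositions on entry, which is the point of typing keys rather than meaningless autonumbers.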
It's 'cruel to be kind'; making you type replication IDs (GUIDs) would just be cruel <g>.

Jamie.

--
#22
>> (i.e. a Make Table query in Access), that block of data is NOT part of the data storage model for your application.
> I don't get your 'Make Table query' example. IMO no application should be creating permanent tables on the fly.

I think that one of the main uses of a make-table query is to reassemble data that were previously stored in a spreadsheet or some such. In that case the data wrangling is indeed not part of the data storage model, but rather a form of data entry.
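[Editor's aside: the "reassemble spreadsheet data" use of a make-table query can be sketched with SQL's CREATE TABLE ... AS SELECT. Here sqlite3 stands in for Jet, and the table and column names are invented for illustration.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A staging table, as if freshly imported from a spreadsheet.
con.execute("CREATE TABLE RawImport (Vendor TEXT, Amount REAL)")
con.executemany("INSERT INTO RawImport VALUES (?, ?)",
                [("Acme", 100.0), ("Acme", 50.0), ("Bolt Co", 75.0)])

# The make-table step: a one-off restructuring into a working table --
# a form of data entry, not something the finished application
# should be doing on the fly.
con.execute("""
    CREATE TABLE VendorTotals AS
    SELECT Vendor, SUM(Amount) AS Total
    FROM RawImport
    GROUP BY Vendor
""")
rows = con.execute(
    "SELECT Vendor, Total FROM VendorTotals ORDER BY Vendor").fetchall()
print(rows)  # [('Acme', 150.0), ('Bolt Co', 75.0)]
```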
#23
Hi again. You were sort of overlooked in all of that, weren't you?
Let's rethink this a bit. The purpose of the database is to store the fees. The calculations can be done wherever they are needed. If you want to have a printout of records within a period of time (a month, for instance) you can add a calculated (unbound) text box to the report footer. For its Control Source:

=Sum([SomeField])

where SomeField is a numeric field for which you wish to calculate the total. If you want to limit the report to a block of time, in the report's Record Source query you can add a criterion to the ProjectDate field (or whatever you call it):

Between [Enter start date] And [Enter end date]

What version of Access are you using? You should be able to create a query, save it with a name, and select that named query as the report's (or form's) Record Source. You can use a technique similar to the one I described for a report to get a sum in a form. In a report you can group the records (by month, for instance) and calculate the sum for each month. You can do running sums in reports, and you can use queries to calculate totals in various ways. What exactly do you need to do? In describing this, describe the table structure and relationships, as well as the real-world situation behind the database.

"wazabbbi" wrote in message ...
> I'm a newbie at this whole Access thing. The purpose of this database is to calculate fees and total fees collected for a project. I wanted to be able to store the result in the same table. I was able to have the result calculated on the form, but the version of Access I have is so old that I can only run reports from a table. To further complicate things, I have another field in which I want to have the total of the inspection fee and two other fields displayed and stored. In Excel it's simple enough to do, but I'm having trouble figuring out how to do it in Access. Eventually I'd like to run reports showing total fees collected for a set period of time, but also for a particular applicant (our applicants may have multiple projects). Can you guys please elaborate as to the best way for me to accomplish my goal. Thanks =)

"Jamie Collins" wrote:
> On Aug 16, 2:16 pm, "Roger Carlson" wrote:
>> I don't see why this should be contradictory. The fact is the calculated column is not "stored" in a Query, so the cases are not even similar.
> Since when did normalization have anything to do with physical storage?! Consider that in some SQLs (e.g. Oracle) VIEWs can be materialized, i.e. their data *are* physically stored. I repeat: normalization applies to logical tables.
>> By the same reasoning, we should never store a Join of two tables because that would produce data redundancy.
> A JOIN in a query does not necessarily result in a denormalized structure; you have to consider the whole: SELECT clause, WHERE clause, GROUP BY clause, etc. But yes, the reasoning is the same: redundant data in a VIEW, regardless of how it was derived, constitutes denormalization.
> Jamie.
--
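[Editor's aside: the "total fees within a period" technique BruceM describes can be shown as plain SQL, with the `Between [Enter start date] And [Enter end date]` prompts replaced by ordinary query parameters. sqlite3 stands in for Access/Jet here, and the Projects/FeeCollected/ProjectDate names are only illustrative.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Projects (ProjectDate TEXT, FeeCollected REAL)")
con.executemany("INSERT INTO Projects VALUES (?, ?)", [
    ("2007-08-01", 200.0),
    ("2007-08-15", 150.0),
    ("2007-09-02", 300.0),
])

# Same shape as the Access criteria row
#   Between [Enter start date] And [Enter end date]
# applied to the report's Record Source: the total is derived on
# demand rather than stored in the table.
total, = con.execute(
    "SELECT SUM(FeeCollected) FROM Projects "
    "WHERE ProjectDate BETWEEN ? AND ?",
    ("2007-08-01", "2007-08-31"),
).fetchone()
print(total)  # 350.0
```

Grouping by applicant or by month is the same query with a GROUP BY clause added.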
#24
On 17 Aug, 11:12, "Graham R Seach" wrote:
> Technically, normalisation applies only to the logical relation, not to a physical table or any view of it.

We need a conceptual layer. The one I am employing is SQL (the language)...

> Views are not virtual tables

In SQL terms, that's a misstatement, because a SQL VIEW is a virtual table.

> If you open a "table" to look at its data, you're visualising that data through an additional layer called a view

Again, in SQL terms, that's a misstatement, because a SQL VIEW is a table (lowercase) but a SQL TABLE is not a SQL VIEW. FWIW my source of reference is the SQL-92 specification.

> As I would hope you know, any visualisation of the data contained within a table, is a view

In your entire post, if you substituted your term 'view' for my term 'resultset' then we would be in broad agreement. As it is, you seem to use 'view' to mean 'SQL VIEW' and 'resultset' interchangeably, which I find more than a little confusing, to be honest. In application design I often write a vanilla SQL query and think to myself, "I'm sure this would be useful in other situations," and would therefore look to create a persisted object in my SQL DBMS. But how to decide whether to expose it as a SQL VIEW or a SQL PROCEDURE? Often the choice of PROCEDURE is a no-brainer (contains control-of-flow procedural code, aggregated results make no sense without delimiting parameter values, etc.) but it only becomes a VIEW if I can achieve the same degree of normalization as a comparable base table. I simply do not apply normalization to a PROCEDURE's resultset.

> Just as an aside; in a later post you state that we should always aim for the highest normal form. That's not entirely accurate or desirable. I'm sure that when you think about it, you'll agree. DKNF or (heaven forbid) Lossless Joins are not for the faint of heart; nor are they anywhere near practical in 99.99% of cases.

Yes, I put it badly. I'm not sure I even believe in 6NF <g>.

> Even in data warehouses, we rarely go beyond 4NF.

I think you are mistaken. I was taught that 5NF is always achievable and that data warehouses are often purposely denormalized. A single-column lookup table is in 5NF, and that's a very common design in OLTP!

Jamie.

--
#25
"Jamie Collins" wrote in message
s.com...
> On 16 Aug, 17:58, "BruceM" wrote:
>> Regarding autonumber or other arbitrary PKs
> Wait up. PRIMARY KEY designation is *always* arbitrary, but I think you meant 'artificial key' (or similar). Do you mean keys in the logical model or in the physical implementation...?

Yes, I meant "artificial". You have completely lost me when you say the PK designation is always arbitrary. Are you saying the designation is arbitrary? I think I use the term PK where you use "unique constraint". Perhaps there is a difference between the two, but in any case, when I refer to a PK I mean the field or fields that are used for relationships. The record's uniqueness is an attribute of the record aside from the fact that it has an autonumber or other artificial key. To put it another way, the autonumber PK is a convenience, not the only thing that is unique about the record.

>> understand that I *never* argued in favor of creating a table for which the "only unique constraint is defined on a sole autonumber column".
> Do I not recall correctly a thread where you had an entity type 'trainer' (natural person) where there was no industry standard identifier, and the compound of the available attributes (company identifier, person full name) did not yield uniqueness, so you omitted a key from your logical model and used an autonumber PRIMARY KEY in your Access implementation? Apologies in advance if I've recalled incorrectly (Google Groups archive search seems to be broken just now).

No, you misunderstood somewhat, but you have quite a memory. There were several issues there. There are enough natural fields to ensure uniqueness, but that would mean about a six-field key. For most purposes the FirstName, LastName, MI are enough, but do not provide a guarantee. Another issue was that most training is done by employees (supervisors, etc.), but some training is done by outside people. If I use EmployeeID (or a multi-field natural key) to identify the trainer, I would need to enter outside trainers into the Employee table (not desirable) or into a Trainer table with similar fields, but in that case I would need to come up with a single-field PK that would not conflict with the one in the Employee table, or else create another multi-field PK. Or I could just store the name, with enough detail that there is no ambiguity about the person's identity, and be done with it. Since most outside trainers conduct a session or two only, storing the names in their own table, or creating a record in the Employee table, seems to make little sense.

> Also, note that I was addressing the group rather than you personally; again, sorry if I was not clear.
>> If I have a listing of vendors, what is the *unchanging* unique constraint?
> In the logical model, a good key should be *stable*; immutability is the ideal, but real life is not always ideal. I too wish all your vendors were issued with an unchanging identifier. In which sector do you operate, e.g. DUNS number? In mine, incorporated bodies must be registered with a government agency which issues a public and unchanging (for all practical purposes) identifier. Although there are incentives for businesses to incorporate, I still need to model unincorporated businesses; fortunately, we have "passing off" laws which make trading name (yes, it can change) plus trading period a reasonably stable key. I also need to model businesses from all other jurisdictions, for which an enterprise key of our own design is used.

Some vendors are incorporated bodies, and some are local machinists who construct tooling. In any case, the typical situation is to enter the proposed vendor into the database, then to obtain tax ID or other such information. Needing a tax ID before the vendor can be entered into the database is not a real-world option. A "reasonably stable" key worries me. Again, remember that I am talking about the field or combination of fields that are related to other tables. I link to an employee table from several databases. If the employee's name is part of the key, an employee whose name changes means that several databases now need to be updated. For that matter, when they change the employee ID number, as happened recently, it becomes rather awkward and time-consuming.

> I've found trusted sources of potentially useful identifiers wary of revealing details (I found the 'DX Exchange' people in the UK most unhelpful). Yes, there are no easy answers, autonumber included. My main problem with autonumbers is that they do not exist in the logical model, therefore they are not a logical key. This was the problem I recalled from your 'trainers' scenario. A record's uniqueness is one thing. The means of identifying it in relationships is another. In the physical implementation I've no problem with you or anyone using a so-called surrogate if you also have a natural key or enterprise key which exists external to the database. I wouldn't recommend doing it myself for a variety of reasons, e.g. doing so makes data harder to read.
>> maybe cascading updates of multi-field keys are not a problem. I would rather avoid them
> Ideally (that word again), I think the SQL DBMS should store the key value only once and use an internal surrogate, in the way you do by hand with autonumber, to point to it; this way an ON UPDATE CASCADE action would physically change only one value. Such SQLs exist, but professionally I need to use Microsoft's products, at least in the immediate future; Microsoft doesn't have them, so I go without. So my ideal surrogate would be hidden and I'd still see the real key values in the referencing tables, but that's not the reality for me. With your way (by hand with autonumber) you either have to work with the meaningless values or use a JOIN (or three, or thirteen) to see the real values, so it's not for me, thanks, and good luck to you.

The surrogate key, when there is one, is hidden from view. I can see it if I choose, but it is not exposed to the user. The user, however, will see enough detail to know which Jamie Collins they are selecting when there are two employees with that name. The user can identify the unique record when needed without ever seeing the surrogate key.

> You will not change my mind on the subject, nor I yours. One thing that tickles me about the 'autonumber PK' advocates is that they most often use the incremental Long Integer flavour, presumably because it makes the data easier to read! If I cannot dissuade you from autonumber, I urge you to choose random: it will improve concurrency (let's not start on the whole 'physical clustering on disk' thing), you will be less inclined to expose the autonumber to users and, having to type the values over and over, you may come to appreciate a well designed key (fixed width, check digit, etc). It's 'cruel to be kind'; making you type replication IDs (GUID) would just be cruel <g>.

I don't care what the autonumber PK looks like, or whether it is random or sequential or whatever. The choice has nothing whatever to do with making the data easier to read, because I do not read the autonumber field except during the development stages. There is no temptation to expose it to the users. I can imagine limited situations where it would be OK to do so. For instance, a calls database may use a number as a reference for the call, in the same way that electronic banking uses an apparently arbitrary number as a transaction reference. But this is the infrequent exception. I don't know about replication IDs, so if I am creating a mess for some future situation about which I have no understanding now, so be it.

> Jamie.
--
#26
John W. Vinson wrote:
> Ummm.... Access 2.0, back 15 years ago, let you base reports on queries. And I'm pretty certain that 1.0 and 1.1 did also.

<picking nits> Access 2.0 was AFAIK dated 1994, thus it was 13 years ago. </picking nits>

Tony

--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
#27
Jamie,
The original point you tried to make was that it is wrong to include calculated fields in a persisted view (capitalisation notwithstanding), due to the rules of normalisation. I'm sorry, but you are incorrect. It is OK to include calculated fields in such views because they do not form part of the logical model, which is where the rules of normalisation apply. We normalise relations - not tables. Normalisation is not performed at the physical level (where tables and views exist). You have to remember that data modelling is an implementation-independent exercise, and strictly speaking, you should not enforce the constraints imposed by one level on the objects of another, because those objects simply do not exist in the former.

In physical terms, views are not virtual tables, because they do not store data, nor do they (necessarily) represent formal entities. One should not assume the wrong interpretation of the word "virtual" in SQL-92. People (including those who wrote the standard) use the word with abandon, but the intention was merely to provide a word/phrase which aids in understanding a concept - not to define a new class of object. That is one salient point!

> ...a SQL VIEW is a table (lowercase) but a SQL TABLE is not a SQL VIEW.

A SQL view is not a table; it is the mechanism by which a dataset can be visualised, which may include data from one or more SQL tables. I agree, a SQL table is not a SQL view; it is the physical implementation of a relation, the contents of which cannot be viewed without a SQL view.

> ...but it only becomes a VIEW if I can achieve the same degree of normalization as a comparable base table...

No, and that's where you misunderstand the concepts involved. Normalisation is not applied to tables (in the physical implementation) - only to relations (at the logical phase, which is independent of physical implementation). Relations are logical - tables are physical. Normalisation is a logical activity - not a physical one. When I design a logical model, I leave it to the DBA to create the physical model in a way that ensures my logical design can be implemented using the chosen DBMS technology. Strictly speaking, I don't care what technology the DBA will use to implement the database, and (s)he does not need to worry about normalisation, because by the time (s)he gets the logical design, it's already done.

> I simply do not apply normalization to a PROCEDURE's resultset.

Of course not.

> I'm not sure I even believe in 6NF

Me either <smirk>, although I've done quite a bit of research with them.

> A single column lookup table is in 5NF...

Yes, point taken.

Regards,
Graham R Seach
Microsoft Access MVP
Sydney, Australia

"Jamie Collins" wrote in message ups.com...
> On 17 Aug, 11:12, "Graham R Seach" wrote:
>> Technically, normalisation applies only to the logical relation, not to a physical table or any view of it.
> We need a conceptual layer. The one I am employing is SQL (the language)...
>> Views are not virtual tables
> In SQL terms, that's a misstatement, because a SQL VIEW is a virtual table.
>> If you open a "table" to look at its data, you're visualising that data through an additional layer called a view
> Again, in SQL terms, that's a misstatement, because a SQL VIEW is a table (lowercase) but a SQL TABLE is not a SQL VIEW. FWIW my source of reference is the SQL-92 specification.
>> As I would hope you know, any visualisation of the data contained within a table, is a view
> In your entire post, if you substituted your term 'view' for my term 'resultset' then we would be in broad agreement. As it is, you seem to use 'view' to mean 'SQL VIEW' and 'resultset' interchangeably, which I find more than a little confusing, to be honest. In application design I often write a vanilla SQL query and think to myself, "I'm sure this would be useful in other situations," and would therefore look to create a persisted object in my SQL DBMS. But how to decide whether to expose it as a SQL VIEW or a SQL PROCEDURE? Often the choice of PROCEDURE is a no-brainer (contains control-of-flow procedural code, aggregated results make no sense without delimiting parameter values, etc.) but it only becomes a VIEW if I can achieve the same degree of normalization as a comparable base table. I simply do not apply normalization to a PROCEDURE's resultset.
>> Just as an aside; in a later post you state that we should always aim for the highest normal form. That's not entirely accurate or desirable. I'm sure that when you think about it, you'll agree. DKNF or (heaven forbid) Lossless Joins are not for the faint of heart; nor are they anywhere near practical in 99.99% of cases.
> Yes, I put it badly. I'm not sure I even believe in 6NF <g>.
>> Even in data warehouses, we rarely go beyond 4NF.
> I think you are mistaken. I was taught that 5NF is always achievable and that data warehouses are often purposely denormalized. A single-column lookup table is in 5NF, and that's a very common design in OLTP!
> Jamie.
--
#28
Hey Jamie,
I believe you misunderstood my post, in conjunction with me probably not using clear language to explain... What I said:

Normalization is the level of efficiency with which your data model stores data that is INPUT. A View is an OUTPUT of the normalized data and has no business being part of the data storage model.

Said much better by someone else:

In any case, you have to make the distinction between (a) the storage of data, and (b) the use of that data. Pure storage requires normalisation. Use of the data requires whatever form is necessary to convert that data into information or knowledge.

Also, the OUTPUT does not guide my data model...

See ya,
Brent
#29
On Aug 17, 1:35 pm, "BruceM" wrote:
> You have completely lost me when you say the PK designation is always arbitrary. Are you saying the designation is arbitrary?

Yes. Consider my favourite example: a temporal database with a business rule, "at no time can an employee have two salaries." The logical model identifies a table EmployeeSalaryHistory comprising employee_number, salary_amount, start_date and end_date; all columns required; end_date = maximum date (#9999-12-31 23:59:59# in implementation) is used to represent a row in the current state. The logical model's candidate keys are:

(employee_number, start_date)
(employee_number, end_date)
(employee_number, start_date, end_date)

In implementation we can put a SQL UNIQUE constraint on all of the above, but I choose to omit the third one because it's already covered by the first two; this is an example of how something in the logical model may not relate directly to a single corresponding object in the implementation. There's something missing, of course: the above candidate keys only ensure each *period* is unique, whereas the business rule is "at no time". To cut a long (and hopefully familiar) story short, we need a sequenced key where overlapping periods are prevented (in implementation a CHECK constraint may be used, plus some other embellishments). It is this sequenced key that I would consider to be the logical primary key (lowercase) of the table. As described, we have implemented a table with a primary key but no SQL (the language) PRIMARY KEY (PK) designation.

Just about everyone says that every SQL table should have a PK. Take a look in the ANSI SQL spec:

1) "none of the values in the specified column or columns be the null value" (section 4.10).
2) PK is a unique constraint (section 4.10.2).
3) there can only be one PK per table (section 11.7).
4) For the references specification of a FOREIGN KEY: "If the referenced table and columns does not specify a reference column list, then the table descriptor of the referenced table shall include a unique constraint that specifies PRIMARY KEY" (section 11.8) [i.e. if you don't specify the columns involved, those of the PK will be used.]

That's it as far as SQL (the language) is concerned. The unwritten implication is that PRIMARY KEY will have an 'implementation specific' meaning. For Access/Jet, they are:

1) Determines clustering (physical ordering) on disk on compact.
2) PK columns appear as bold text in the Relationships diagram.
3) If you don't specify a PK the Access interface will nag you, "Although a primary key isn't required, it's highly recommended. A table must have a primary key for you to define a relationship between this table and other tables in the database [incorrect!]. Do you want [Access] to create a primary key now?"

The choice is arbitrary. Whereas it is objectively demonstrable whether a table is in, say, 3NF, you can't look at a table and say, "Hey, you've picked the wrong candidate for PK there," because the choice of PK is subjective, determined by personal preference, prejudice and ignorance. Back to the EmployeeSalaryHistory: we have two candidate keys defined using UNIQUE. Do I really have to choose one to promote to PK? Just about everyone says every table should have a PRIMARY KEY ...or do they really mean primary key? FWIW David Portas, SQL Server MVP, seems to have similar dilemmas:

Down with Primary Keys? http://blogs.conchango.com/davidport...eys_3F00_.aspx

I say that yes, every table should have a PRIMARY KEY, because otherwise at the very least you are missing out on the implementation's meaning; worse, though, you would be at the mercy of the implementation designers' default behaviour, e.g. in Access/Jet another NOT NULL (?) UNIQUE will be used for physical ordering (clustering), but this is undocumented, and because I have no idea how the other candidates are eliminated (first one defined chronologically? one defined on first columns in left-to-right order? one defined on first columns in chronologically created order? do nullable columns affect the choice?) I'll make an explicit choice based on my own criteria. I've previously mentioned a 'telephone directory' style table where phone number is the real-world unique key and clustering would favour last name; because in Access/Jet there is no explicit way of specifying clustering, PRIMARY KEY (last_name, phone_number) would be best for these aims. Thus, in this scenario the PRIMARY KEY isn't even a candidate key!

> when I refer to a PK I mean the field or fields that are used for relationships.

Do you not have tables for which more than one key is referenced by other tables?

Jamie.

--
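[Editor's aside: Jamie's point that the UNIQUE constraints only make each *period* unique, without stopping periods overlapping, can be demonstrated with an interval-overlap query. This sqlite3 sketch uses his column names; in a real implementation he suggests a CHECK constraint plus other embellishments rather than an after-the-fact query.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE EmployeeSalaryHistory (
        employee_number TEXT NOT NULL,
        salary_amount   REAL NOT NULL,
        start_date      TEXT NOT NULL,
        end_date        TEXT NOT NULL,
        UNIQUE (employee_number, start_date),
        UNIQUE (employee_number, end_date)
    )
""")
# Both UNIQUE constraints are satisfied, yet the two periods overlap,
# so "at no time can an employee have two salaries" is violated.
con.executemany("INSERT INTO EmployeeSalaryHistory VALUES (?, ?, ?, ?)", [
    ("E1", 30000, "2007-01-01", "2007-06-30"),
    ("E1", 32000, "2007-06-01", "9999-12-31"),
])

# Classic interval-overlap test: two periods overlap when each
# starts before the other ends.
overlaps = con.execute("""
    SELECT a.employee_number, a.start_date, b.start_date
    FROM EmployeeSalaryHistory AS a
    JOIN EmployeeSalaryHistory AS b
      ON a.employee_number = b.employee_number
     AND a.start_date < b.start_date
     AND a.end_date   > b.start_date
""").fetchall()
print(overlaps)  # [('E1', '2007-01-01', '2007-06-01')]
```

A sequenced key would make this query return no rows by construction.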
#30
On Aug 20, 8:02 am, wrote:
> you have to make the distinction between (a) the storage of data, and (b) the use of that data. Pure storage requires normalisation. Use of the data requires whatever form is necessary to convert that data into information or knowledge. ... Also, the OUTPUT does not guide my data model ...

I think I see what you are getting at. A report is output, hence doesn't need to be normalized. A stored procedure's resultset is output, hence doesn't need to be normalized. However, a SQL VIEW is a table, and tables should be normalized.

Jamie.

--
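[Editor's aside: both sides of the input/output distinction can be illustrated in one sketch: the base table stores only raw inputs, and the derived total lives in a view, computed on demand, so it can never disagree with the stored figures. sqlite3 stands in for Jet; the Fees/TotalFee names are invented, loosely following the original poster's fee scenario.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The base table stores only the raw inputs (normalized storage)...
con.execute("""
    CREATE TABLE Fees (
        ProjectID      TEXT PRIMARY KEY,
        InspectionFee  REAL NOT NULL,
        PermitFee      REAL NOT NULL,
        FilingFee      REAL NOT NULL
    )
""")
con.execute("INSERT INTO Fees VALUES ('P1', 100.0, 40.0, 10.0)")

# ...and the total is derived in a view rather than stored, so there
# is no redundant copy of it to fall out of sync.
con.execute("""
    CREATE VIEW FeeTotals AS
    SELECT ProjectID,
           InspectionFee + PermitFee + FilingFee AS TotalFee
    FROM Fees
""")
row = con.execute("SELECT ProjectID, TotalFee FROM FeeTotals").fetchone()
print(row)  # ('P1', 150.0)
```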