Using InterBase Data Types
by Paul Reeves Article that first appeared in the UK Borland User Group Magazine
Choosing data types in any database platform always appears straightforward until you discover that a data type you have used since you were knee high is not available on the current platform you are working. When one comes to InterBase for the first time it appears that some important types are missing or difficult to implement. Not so. It is just a matter of knowing what to do.
What, no boolean?
InterBase does not have a native boolean data type. However, they are not difficult to implement. IB offers us domains, and the first step is to define a domain of boolean. The second step is to define the datatype of the domain. There are two ways, each with their own advantages. My preference is to use a SMALLINT (16 bits), defaulting to zero, with a check constraint to ensure only the values of zero or one are entered. ie:
CREATE DOMAIN D_BOOLEAN AS SMALLINT DEFAULT 0
CHECK (VALUE BETWEEN 0 AND 1);
Once you have defined this domain you can forever use it as a boolean datatype without further concern. It is particularly suitable from a Delphi point of view, as Pascal booleans work in a similar manner.
Alternatively, you can define the domain as a CHAR(1) and ensure appropriate single character values are stored with the check constraint. If 'T' and 'F' or 'Y' and 'N' are more meaningful for your application then use this approach.
Which should I use - VARCHAR or CHAR?
On first reading the manual it seems obvious. VARCHAR should store just the number of characters entered in a column and CHAR should pad the entered characters with spaces. Alas, it doesn't work like that. Indeed, VARCHAR columns require more storage than CHAR columns. Why? Because when storing a VARCHAR InterBase adds two bytes that state just how big the VARCHAR actually is.
Confusingly, the manual also states that when storing a VARCHAR or CHAR it adds extra padding to bring the length up to that specifed for the column type. What the manual doesn't state, though, is that after padding CHARs and VARCHARs on the server, it then compresses the data prior to storing. Your database is not filled with trailing blanks, whichever datatype you use.
So a CHAR will in fact be stored in a smaller space. However, when you do a select on a VARCHAR column, InterBase strips the padding and returns the stored value. When you do a select on a CHAR column, InterBase returns the value and the necessary padding. Thus the two bytes saved in storage of a CHAR must be balanced against the subsequent need to strip the spaces on the client side. As far as storage goes, then, there is little difference in the two, but the effect is slightly different.
In practical terms all this boils down to a rule. Only use CHARs if you are storing strings of at most a few characters. And the exception to the rule is when you are working with intermediate tables that are required to export data to fixed length prn files. Then the padding will be a positive advantage. There is also the exception to the exception. If you want to concatenate something like middle_initial, that may often be null and do not wish to strip the padding then a VARCHAR(n) would be appropriate.
How do I work with money ?
InterBase may appear awkward when it comes to working accurately with currency values. The problems of working with floating point decimals in a binary environment are well known, and effectively render dubious any system that interfaces with traditional double-entry book-keeping. While FLOAT and double precision types offer almost unlimited storage for currency values they are going to cause a lot of trouble when sums and comparisons are made.
The simplest solution for a lazy mind, like mine (and others, no doubt) would be to work exclusively in pennies (ignoring fractions of course), storing the data as integers and scaling the displayed values to two places to show the major currency unit. Unfortunately, in InterBase, integers are only 32-bit, so using this solution is not practical if large sums are involved, or if one is working in a hyper-inflated currency. If your application will never deal with aggregated sums of more than £21,474,836.48, then integers are a safe bet for currency storage. InterBase Software Corp has been lobbied intensively to introduce a 64-bit INTEGER and have promised that it will be introduced with a bug-fix release of version 5 sometime in the coming year. It will not be in v5.0, but its need is acknowledged and will be be addressed.
Editors Note
64-bit INTEGER is now available in InterBase 6.0
So, for larger sums or fractions of a penny, what is one to do? On perusal of the manual the solution appears to lie in the NUMERIC or DECIMAL data type, but further reading does not really help one to distinguish between the two. Neither does it satisfactorily explain the effect of the two parameters that one can use, unless you read very closely.
To summarise the manual:
- Precision means the total number of digits stored - inclusive of both sides of the decimal point.
- Scale is the number of those digits that are to the right of the decimal point.
- Numeric means that the largest value stored is dictated by (precision-scale).
- Decimal means that numbers with larger than precision-scale will be stored; decimal specifies at least in contrast to numerics' the most.
If you are still confused, read the several pages of explanation in the manual and then come back to the above summary.
Having cleared up the semantics of these datatypes, which should one use? The convention amongst developers appears to favour NUMERIC over DECIMAL and I cannot comment further on that. The next step is to choose an appropriate precision and scale. To achieve an effect similar to our maximum integer above and still store the data in an integer datatype (thus not requiring any further conversion) one would use NUMERIC(9,3) giving a maximum of £999,999.999. Not quite enough for anything larger than a very small business. To store values up to £999,999,999.999 requires NUMERIC(12,3). Enough for me but perhaps not enough for roubles or lira though. If you use NUMERIC(15,4) you get values up to £99,999,999,999.9999 (ie 99 billion) That should do the trick for most people, combining size with the ability to handle calculations down to a hundredth of a penny.
The BDE has a nice feature - BCD support. Combined with NUMERIC(15,4) Delphi apps have no trouble dealing with and displaying these values correctly. So, define a domain called Currency with this datatype, use it for all such columns, make sure that BCD is true in the database params and then forget about it. BCD stands for Binary-coded decimal. It is a sophisticated means of storing floats in binary.
For those of you that want larger integers for currency storage and you want it now you may be interested in checking out Jason Wharton's IBObjects. By using a neat trick, IBO allows you to work transparently with 53-bit integers.