My Lock File Has Grown Too Large - 30 Days from Paradox to InterBase
by Skip Rowland. Borland Developers Conference 1999.
The overwhelming majority of Paradox apps (or any other file-server apps, for that matter) involve 50 or fewer tables. This case study takes a look at what happened when a small app grew beyond Paradox's capabilities. First, I'll give you an overview of the situation, then I'll cover the problems, solutions, band-aids, bruises, and successes encountered along our "do or die" migration to a better place.
Overview
This case study concerns a long-term client of mine that is a full-service marine company, offering shipping, barging, and tug services world-wide. They started with a Paradox for DOS-based billing and utilization system back in '93. In January of '97, they approached me to take a more in-depth look at their database requirements. They were feeling pressure, internally from increased management needs and externally from increased industry regulation. What they needed was a system that would tie all departments (Sales, Dispatch, Billing, Operations, Crew Management, HR, and Risk Management) together into a central database.
At the time, they were using Paradox for DOS, Professional File, Excel, WordPerfect, and a number of other tools, on a Novell 3.11 network with about 30 workstations. Five base apps were identified:
- Dispatch/Billing
- Operations
- Crewing
- HR
- Risk Management
In time, three more were identified:
- Sales Management
- Operations Management
- General Administration
And I'm sure more apps will surface as things move forward.
The initial proposal required migrating to Windows 95, which in turn required upgrading nearly all of their hardware and software. Even though this was inevitable, it was something they had been putting off as long as possible. Client/Server was proposed, but turned down as something they were not ready for; consequently, Paradox v7 was chosen for data storage.
Development was started in February of '97 and proceeded typically: create a database, build a prototype, go back to the drawing board. Each time, we got a little closer to giving the user what they wanted, but could not articulate.
As development progressed, more and more areas of interest/concern were identified, each leading to more and more lookup tables. The database that started with 25 tables grew to about 400 tables.
At first, things operated smoothly. The Billing/Dispatch app was the easiest port since it had been fully defined and in use for nearly four years. Unfortunately, things started to go downhill in June, '98, after the HR app was rolled out.
Problems
Each app could stand on its own and operate acceptably. However, once an app started competing with the other apps, everything started to fall apart. The problem was exacerbated by users with 16 MB of RAM also trying to run Word, Excel, email, and any number of other apps simultaneously. Sometimes just the database applications would lock up, sometimes the whole machine. Sometimes the db apps would load completely, sometimes the load would fail. We got a lot of the dreaded "insufficient memory to complete operation" messages. In short, we were experiencing a lot of inconsistent, weird, and unacceptable behavior.
Band-Aids
As I said, each app could run on its own, as a departmental solution. But, since that defeated the whole vision of an enterprise solution, splitting up and recombining the data on a regular basis was not even an option. Initially, it was my responsibility to make the whole thing work as it was, using Delphi against Paradox. I went through each app back and forth, eliminating any extraneous overhead, tightening down everything I could. The base architecture was taken straight from the KnowledgeBase:
- Executables were in different folders than the data
- Net dirs and Priv dirs were set in the executable
- DbiSetLockRetry was set to 20 (a minimal sketch of this base setup appears after this list)
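For reference, here is a minimal sketch of that base setup done in code; the folder paths are illustrative, and it assumes the default BDE Session from DBTables.

uses
  DBTables, BDE;  { TSession, Check, and the DbiSetLockRetry API }

procedure ConfigureBDE;
begin
  { point the BDE at a shared network control directory and a local private
    directory; both paths are illustrative }
  Session.NetFileDir := 'F:\VMS\NetCtl';
  Session.PrivateDir := 'C:\Temp\VMSPriv';
  Session.Open;
  { keep retrying lock requests rather than failing immediately }
  Check(DbiSetLockRetry(20));
end;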
This wasn't enough. First, we upgraded the BDE from 3.51 to 4.01, and then we jacked up the BDE configuration settings.
MAXBUFSIZE = 8192
MAXFILEHANDLES = 500
SHAREDMEMSIZE = 8192
This worked for a while. Unfortunately, due to prototyping and testing cycles, only the Dispatch/Billing app was being fully utilized. By September, '98, though, the HR and Operations departments were starting to come online and really use the applications. That led to Black Monday. About 10 AM, September 21, all four apps seized simultaneously. It was obvious that band-aids were not enough.
Solutions
The only truly acceptable solution was to migrate to Client/Server. InterBase was the only logical choice, based on cost, availability, maintenance, and my exposure to it. Because so many people were dependent on the applications, this had to be done as quickly as possible. In fact, I was initially given 30 days to do it.
There were some additional considerations facing the client, namely upgrading their network operating system. They were using NetWare 3.11 and they knew they had to move. Their options were NT4, wait for NT5, NW4.11, or NW5. Ultimately, NW4.11 was the only viable option. Since they don't have an in-house network administrator, they felt that NT4 was too risky and that they wouldn't have enough time for NT5 to shake out before the year 2000. Since there was no certified version of InterBase for NW5, that left only NW4.11.
Some Additional Band-Aids
Before I could start on the migration, I still had to do something to make the current applications at least accessible. The biggest problem was that the lock files would grow so large (2.5+ MB) that the BDE could barely navigate them. The interim solutions included:
- Splitting data into multiple directories and using multiple aliases
- Writing a "lock file manager" that kept an eye on the gross lock files sizes. Whenever it reached 700, everybody would be asked to exit the database applications. Usually, the lock files would have to be manually deleted--the BDE wasn't doing its normal housekeeping.
- We upgraded the BDE again, to 4.51, and jacked up the configurations even further.
MAXBUFSIZE = 8192
MAXFILEHANDLES = 500
SHAREDMEMSIZE = 8192
This was ugly, but at least people could get to their data.
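For illustration, a minimal sketch of the kind of lock-file watcher mentioned above might look like the following; what to do once the total gets too big is left out, and the names are illustrative. It assumes the data folder path is passed in without a trailing backslash.

uses
  SysUtils;

function LockFileBytes(const DataDir: string): Integer;
var
  SR: TSearchRec;
begin
  Result := 0;
  if FindFirst(DataDir + '\*.lck', faAnyFile, SR) = 0 then
  try
    repeat
      Inc(Result, SR.Size);   { PDOXUSRS.LCK, PARADOX.LCK, etc. }
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;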
The Actual Migration Process
There are four phases to migrating a Delphi app from Paradox to InterBase:
- Getting set up
- Database migration
- Codebase migration
- Server optimization
Getting set up involves installing InterBase, getting familiar with WISQL and using it, and then configuring the BDE.
For a clean start, I uninstalled the InterBase that had been installed along with Delphi. I did a full install of IBLocal with the default settings. Then I created a folder named PDox2IB where I dropped in shortcuts to InterBase Server Manager, WISQL, Database Desktop, and the BDE Administrator.
The first thing I wanted was an InterBase password with fewer keystrokes.
- Start Server Manager
- Log in as user SYSDBA with the password masterkey
- Select User Security and modify the SYSDBA password to "bo"
Note
To see changes to passwords, you have to log out and log back in.
Next, I launched WISQL in order to create a new InterBase database to migrate to.
- From the menu, File|Create Database
- DatabaseName: C:\Program Files\Borland\IntrBase\VMX\VM.gdb
- Password: bo
Next, through Windows Explorer, I created a new data folder containing all the tables (remember, one of the band-aids was to break them out under app-specific aliases).
Then, I launched the BDE Administrator to create new aliases for my source and destination databases.
- On the Databases tab, right-click and select New; leave the type as STANDARD
- Change the name to VMS
- Change PATH to C:\Apps3\EES\BigData
- Again, on the Databases tab, right-click and select New; this time, select INTRBASE
- Change the name to VMX
- Change SERVER NAME to C:\Program Files\Borland\IntrBase\VMX\VM.gdb
- Close the BDE Administrator
I was now set up to migrate the data.
Database Migration
First, this is NOT a one-step process. You are setting yourself up for an exercise in frustration if you expect it to magically happen the first time. Delphi C/S includes a program called Datapump.exe; it is installed in the BDE folder and referenced as "Data Migration Wizard." The datapump is a good tool, but it has limitations -- do not expect it to be the be-all, end-all. Sometimes, you have to roll up your sleeves and write tools to do it yourself. If you do not have the datapump, you will have to write your own. For now, though, we'll assume you can use the datapump.
The biggest problems: Field Names and Field Types.
Tip
Plan for it to take more than a couple of iterations. As you use Datapump to identify your problems, it is best to make incremental changes to both your source database and your source code base.
The Big Picture: you will have to do it AT LEAST TWICE: once for development, and once for the production database! You really should document your conversion process so that the production-side migration is a smooth process that is the result of development-side polishing.
Two of the biggest changes your database will undergo have to do with:
- Field Names
- Secondary Indices
You can try to handle this preemptively, or you can go a few rounds with the datapump. The datapump produces a couple of Paradox tables--one before "upsizing" and one after. The "before" table shows the changes that will be made to your database, and the "after" table shows what actually happened.
Look for fieldname changes
It's best to modify the source tables and then the code before the final migration. That way, you'll know your code will work after the migration.
- The most common one is changing Spaces to "_" (Delivery Date becomes DELIVERY_DATE)
- The next most common one is having a field named "Date", which is a reserved word
- Other reserved words that will kill you are: VALUE, ORDER, GROUP, SET
- Finally, handle illegal characters (#, @, etc).
You might want to initially drop all secondary indexes, except for the ones required by the code. Most secondary indices on Paradox tables are used to allow users to change the sort order of the data they are viewing. This is unnecessary in a C/S environment, as you will be using queries instead of tables to view data. Secondary indices in a C/S environment are used to speed up queries, and until you know how your database is going to be used, you really don't need the overhead of creating and maintaining lots of secondary indices, especially descending ones.
The Basic, Manual Process
Repeat
- Through Windows Explorer, launch Datapump
- Select a source alias (VMS)
- Select target alias (VMX)
- Login with SYSDBA password
- Select all tables
- Push the upsize button
- Choose "Write copy of this report to file"; this creates a Paradox table
- Close Datapump
- Using Database Desktop, query out failures
- Until there are no failures
To see your results, launch WISQL, then File|Connect to Database
- C:\Program Files\Borland\IntrBase\VMX\VM.gdb
- Login with SYSDBA password
To see what the Datapump did,
- Extract|SQL Metadata for Database
- To see the contents of a table, SELECT * FROM [tablename]
- Close WISQL
Note
If you have executed even a single SELECT, you will be prompted to commit your work. It doesn't cost anything either way to say yes or no.
The Production Data Conversion
We'll assume that the datapump will actually do the job. In reality, it may or may not work. It will create your IB tables and indices, but beyond that, I had inconsistent results with the actual data porting. So, I built a tool that:
- replicates my Paradox database, without the data, into a new folder
- lets me collect all the source data into my conversion-source folder
- moves my conversion-source database into my IB database (using TBatchMove)
Between steps 2 and 3, I use Datapump to build my IB database from my empty, replicated database.
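Step 3 is where a little Delphi code does the heavy lifting. The following is a minimal sketch of a per-table batch move; it assumes a Paradox alias (here VMS) pointing at the conversion-source folder, the VMX InterBase alias from earlier, and that the datapump has already created the target tables. PumpTable is an illustrative name.

uses
  DBTables;

procedure PumpTable(const TableName: string);
var
  Src, Dst: TTable;
  Mover: TBatchMove;
begin
  Src := TTable.Create(nil);
  Dst := TTable.Create(nil);
  Mover := TBatchMove.Create(nil);
  try
    Src.DatabaseName := 'VMS';     { Paradox conversion-source alias }
    Src.TableName := TableName;
    Dst.DatabaseName := 'VMX';     { InterBase alias }
    Dst.TableName := TableName;    { table already created by the datapump }
    Mover.Source := Src;
    Mover.Destination := Dst;
    Mover.Mode := batAppend;       { structures exist; just move the rows }
    Mover.Execute;
  finally
    Mover.Free;
    Dst.Free;
    Src.Free;
  end;
end;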
Codebase Migration
The goals here are simple:
- Make it compile
- Make it run
- Make it run correctly
Even if you are an absolute optimist, and you know you are using nothing but plain vanilla components, you've still got some work to do. Any app with any sophistication to it is not going to run the first time, no matter what you've been told.
What I'm going to cover now are hard-fought lessons that I uncovered during my forced migration. You may be in a situation that doesn't require some steps, but you never know when your situation will change... There are a number of different models for getting to your InterBase data, and each of them would make their own paper. What follows is the model I settled on out of sheer necessity. It works, even if it may not be the best.
Create a new root folder for your InterBase projects (I used ibProjects).
- In ibProjects, create a folder for this project (mine was ees).
- In the project folder, create a data folder and a source folder.
- If you are using w2w (Woll2Woll), copy the w2w main folder from wherever it is to the ibProjects folder. Do this for any other 3rd-party components that have their own folders. Remember, you are porting, migrating: everything goes initially, then only the required stays.
- Also, delete the .dof, .dsm, and .dsk files.
Using a text editor--not Delphi--strip the project file down to nearly nothing at first. When you open Delphi, you do not want forms or data modules opening automatically. As you scrub and sanitize your modules, you can add them back to the project.
The Application Login
The first thing you'll do is create an automatic connection to your IB database. This will allow you to connect without an annoying login, and it will also shield your users from yet another annoying login.
In the BDE (with Delphi closed) rename your application's Paradox alias to something else.
Then, inside Delphi,
File|New Datamodule
Drop in a TDatabase component
Change its name to DBConnection
Set Alias to the new BDE Alias for your IB database
Set DatabaseName to the old Paradox BDE Alias
Double-click on it to open its property editor
Push the Defaults button
Make it look like the following:
SERVER NAME=c:\apps4\ees\vm.gdb
USER NAME=SYSDBA
PASSWORD=mypassword
Check off keep connected
Make sure prompt is unchecked
Name the data module dmKey
Save the unit as IBKey
Open your project. If you've stripped out everything, the only thing that should open is the project source file (the .dpr). View the Project Manager, push the Add button, and add IBKey to your project. Inside your .dpr, the first thing you want to happen is for this data module to be created (see the sketch below).
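For illustration, here is a minimal sketch of what the stripped-down .dpr might look like at this point, assuming the project is called ees and has a single main form unit; MainFrm and TMainForm are illustrative names. The connection data module is created first.

program ees;

uses
  Forms,
  IBKey in 'IBKey.pas' {dmKey: TDataModule},
  MainFrm in 'MainFrm.pas' {MainForm};

begin
  Application.Initialize;
  Application.CreateForm(TdmKey, dmKey);        { connection data module first }
  Application.CreateForm(TMainForm, MainForm);
  Application.Run;
end.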
What is really cool is that other datamodules will use this connection WITHOUT having to reference the datamodule in a uses clause!!!
Now you are ready to add the project units back in, one by one, scrubbing all of the Paradox-related stuff out of them. I suggest you start with the .dfm files, because sometimes forms won't open if the fields aren't properly defined.
IN THE CODE (both .dfm and .pas files)
Problems and Solutions
- AsBoolean: change to TStringField; in the DFM, specify Size = 1
- AsDateField: change to TDateTimeField
- IndexName: change to IndexFieldNames
- DatabaseName: make sure they all reference the old Paradox alias. In the initial pass, you should have only one alias.
- .db, .dbf, etc.: strip the extensions off ALL table names
- TDBCheckBox: make sure ValueChecked and ValueUnchecked correspond to the size of the field. If the field is defined as CHAR(1) or VARCHAR(1), then the value "TRUE" will blow up.
If you use Woll2Woll's components, there are a few more changes you'll need to make.
- wwDBComboLookup: reset the lookup properties
- Lookup property: verify ALL aliases and table names in the lookup strings, especially in the DFMs. Otherwise, your form will not open, even though it compiles!
At this point, your application will probably run. It was my experience that some apps ran almost acceptably with nothing but TTable, but some were absolutely horrendous, nowhere near acceptable. That leads to the next step, Server Optimization.
Server Optimization
For me this was the fun part. I actually had my applications running. Now it was time to make them sing by moving the code out of the exe and onto the server. In many ways, this is an on-going process. In this paper, I can highlight some of the essential areas. As my work progressed, I uncovered many more advanced tips.
A word of caution: The first goal is to make your current app run; the second goal is to optimize the code to take advantage of the C/S back end. It is very tempting to get ahead of yourself and start optimizing while you are migrating. That's ok, but don't let it get you off track...
While there are enough issues to make for a book (I'm working on that...), for the scope of this paper, I'll deal with the following primary areas for server optimization:
- General Overview
- Database Synchronization
- TTables to TQueries
- Key Generators
- The C/S Event Model as seen by Delphi
General Overview
First up: code placement. The rule of thumb is: if it involves database behavior, put it on the server; otherwise, put it in Delphi. For example, validate data on the server in before-insert and before-update triggers, and use Delphi's AfterPost to refresh screens. Keep in mind, the user is manipulating UI values, not database values. They are not database values until they are successfully posted to the database. That's why you'll have to rely on Delphi UI events and BDE database events. It would be nice if OnNewRecord were a database event, but it can't be. That means you'll have to use Delphi to handle your new-record initializations.
It might be easier if you viewed the UI as simply a template for controlling data access. "On New Record" is an artificial event that allows you to prepare the user for completing the data entry template. You can use stored procs to retrieve default values and the like, but all the same, you are simply preparing the screen for the user to complete. The server doesn't come into play until you send the captured data up for posting. Then you get the results (if any) of the before-post trigger. Again, unfortunately, there is no way to string them together into a consolidated "these are all the issues" message. The server can only send them back one at a time (unless you have time to program an elaborate if/then/else matrix on the server).
Also, even if the same rules apply, the events are different when posting a new record or updating an old one. The Before Post trigger only applies to updating an existing row. To apply those rules to a new record, you'll need a Before Insert trigger.
There is no such thing as Refresh on queries (because the resulting dataset is not uniquely keyed). If you are using the BDE, the only way to refresh is to place a bookmark, close and re-open the query, and then go back to the bookmark. This can be a serious hit, but it comes with the territory. Through the BDE, you can specify soft commits, which will refresh the current cursor without having to re-execute the entire query. The downside is that other users will not see your changes to the database.
You have to pay attention when working with queries, both in terms of balancing performance and keeping your UI in sync. When working with TTables, the BDE does a lot of work keeping linked cursors synchronized. With queries, it's another story: deleting a row does not automatically cause another query to refresh to reflect the deletion. This is especially true when working with views. You have to handle refreshing yourself, as shown in the sketch below.
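Here is a minimal sketch of the bookmark/close/re-open refresh just described; it assumes a live TQuery and allows for the bookmarked row having disappeared after the re-open.

uses
  DB, DBTables;

procedure RefreshLiveQuery(Query: TQuery);
var
  BM: TBookmark;
begin
  BM := Query.GetBookmark;        { remember where the user was }
  try
    Query.DisableControls;
    try
      Query.Close;
      Query.Open;                 { re-execute to pick up server-side changes }
      try
        Query.GotoBookmark(BM);   { may fail if the row is gone }
      except
        Query.First;
      end;
    finally
      Query.EnableControls;
    end;
  finally
    Query.FreeBookmark(BM);
  end;
end;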
Linking a query via a DataSource is NOT the same as linking TTables via MasterSource and MasterFields. If you add a row to a "linked detail," you still have to provide the master linking values. With TTables, the BDE does this for you, automatically filling in the linked key fields. With queries, you have to do this manually in order to avoid key violation errors.
Basically, you need to adopt the following mentality: let the server do the work, let your program reflect the state of the data on the server.
Database Synchronization
OK, one of the decisions you'll have to make has to do with how you pull data off the server. One of the things you lose from Paradox is the "autorefresh," where all the users saw each other's changes almost immediately. You have to configure the BDE to handle your situation.
In the BDE Administrator, on the Configuration tab, drill down to Drivers, Native, INTRBASE and set DRIVER FLAGS to 512. This is the setting for Repeatable Read/Hard Commit. It carries a performance hit, but the trade-off is avoiding deadlocks (two or more transactions competing for a lock), and it allows other users to see your changes, and vice versa.
TTABLES to TQueries
The main difference between Paradox and C/S apps is that Paradox apps tend to be navigational apps, where the user is given a grid/list of data in one view and a panel with edit controls to make changes. When the user adds a new row or makes a change to an existing row, the changes are immediately reflected in the grid. In a C/S app, the list usually shows the result of one query, while the edit controls are tied to another query. Consequently, changes made through the second query are NOT automatically reflected in the grid. Even though Refresh compiles as a method of TQuery, it does not work. To refresh the list, you have to close the query and open it again. And even that isn't quite so simple: if you were editing, you can set a bookmark first; if you added a new row, you'll have to go find the new addition yourself...
- Start by placing TQuery components beside your TTable components
- Add the following to the SQL property: SELECT * FROM [TableName]
- Set RequestLive to True
- Change the DataSet property of the TTable's DataSource to point to the TQuery
To link two TQueries together in a Master/Detail relationship, use the following SQL for the Detail TQuery:
- SELECT * FROM [TableName] WHERE (JOBID=:JOBID)
- Then set the Detail TQuery's DataSource property to the DataSource associated with the Master TQuery.
This will work, but keep in mind that "SELECT * FROM ATABLE" is really no different from, and no better than, a TTable! In time, you will want to optimize your queries to select only the rows and fields that you absolutely need. What I prefer is building VIEWS that the user can filter as a way of letting them navigate the database; when they find what they are looking for, they push an Edit button that takes them to a form with all the editable fields. The fields are populated with the results of a "SELECT * FROM ATABLE WHERE (KEYFIELD=:KEYFIELD)" query, where the parameter KEYFIELD is taken from the current row in the view.
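As a rough illustration of that view-plus-Edit-button pattern, here is a minimal sketch; qryView, qryEdit, frmEdit, and the KEYFIELD name are illustrative, and qryEdit is assumed to hold the parameterized SELECT shown above.

procedure TListForm.btnEditClick(Sender: TObject);
begin
  qryEdit.Close;
  { take the key from the row the user is sitting on in the view }
  qryEdit.ParamByName('KEYFIELD').AsInteger :=
    qryView.FieldByName('KEYFIELD').AsInteger;
  qryEdit.Open;        { SELECT * FROM ATABLE WHERE (KEYFIELD = :KEYFIELD) }
  frmEdit.ShowModal;   { the edit form's controls are bound to qryEdit }
end;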
Key Generators
This is very simple. Create a generator, write a proc to get a value from the generator, then get that value in the Delphi OnNewRecord event.
CREATE GENERATOR ZKEYS;
CREATE PROCEDURE SP_NEW_KEY
RETURNS (ID INTEGER)
AS
BEGIN
ID = GEN_ID (ZKEYS, 1);
END
procedure TForm1.Query1NewRecord(DataSet: TDataSet);
begin
  { GetNewKey wraps a call to SP_NEW_KEY; it is shown later in this paper }
  Query1.FieldByName('FIELD1').AsInteger := GetNewKey;
end;
The C/S Event Model as seen by Delphi
The event stream is different between Delphi, the BDE, and IB (or any other C/S backend).
- "Before" events on the server mean before the action can complete.
- "Before" events in Delphi and the BDE mean before the action starts (or mode changes).
This means your Delphi "BeforeInsert" does not correspond to your IB "Before Insert". The following is the actual chain of events:
- Delphi Before Insert
- Delphi On New Record
- Delphi After Insert
- Delphi Before Post
- InterBase Before Insert
- InterBase After Insert
- Delphi After Post
or
- Delphi Before Edit
- Delphi After Edit
- Delphi Before Post
- InterBase Before Update
- InterBase After Update
- Delphi After Post
I have wrestled with the Stored Procs vs Triggers question for many, many hours. At one time, I believed in using Stored Procs for everything and avoiding Triggers at all costs. I was put off by 1) the amount of work necessary to write and maintain triggers, and 2) the idea that events could chain way beyond my ability to control them. That was before I figured out how to use them.
Over time, I have developed a model that allows me to get the most out of a C/S backend, while maintaining the warm-fuzzy of a navigable, file-server app.
/* this table's single row holds a carriage return/line feed pair */
CREATE TABLE ZCR (CR CHAR(2));

/* error id generator and error log used by the triggers below */
CREATE GENERATOR ZERRORS;
CREATE TABLE ZERRORLOG (
  ERRORID INTEGER NOT NULL,
  ERRORMSG VARCHAR(255));

CREATE TABLE X (
  FIELD1 INTEGER NOT NULL,
  FIELD2 ...,
  ...
  VALIDATED INTEGER,
  PRIMARY KEY (FIELD1));
CREATE TRIGGER TR_X_BI FOR X
ACTIVE BEFORE INSERT
AS
DECLARE VARIABLE CR CHAR(2);
DECLARE VARIABLE ERRS VARCHAR (255);
BEGIN
ERRS = '';
SELECT CR FROM ZCR INTO :CR;
IF (NEW.FIELD2 IS NULL) THEN
ERRS = 'Field2 is required.'||:CR;
IF (NEW.FIELDn IS NULL) THEN
ERRS = :ERRS||'Fieldn is required.'||:CR;
IF (:ERRS > '') THEN
BEGIN
NEW.VALIDATED = GEN_ID (ZERRORS, 1);
INSERT INTO ZERRORLOG (ERRORID, ERRORMSG) VALUES (NEW.VALIDATED,
:ERRS);
END
ELSE
NEW.VALIDATED = NULL;
END
CREATE TRIGGER TR_X_AI FOR X
ACTIVE AFTER INSERT
AS
BEGIN
  IF (NEW.VALIDATED IS NOT NULL) THEN
  BEGIN
    /* after-insert handling of a row that failed validation goes here */
  END
END
Inside my Delphi application, I'll use the following:
function GetNewKey: LongInt;
var
  sp: TStoredProc;
begin
  sp := TStoredProc.Create(nil);
  try
    sp.DatabaseName := Framework.MasterAlias;
    sp.StoredProcName := 'SP_NEW_KEY';
    sp.Params.CreateParam(ftInteger, 'ID', ptOutput);
    sp.Prepare;
    sp.ExecProc;
    Result := sp.ParamByName('ID').AsInteger;
    sp.UnPrepare;
  finally
    sp.Free;
  end;
end;
procedure TForm1.Query1NewRecord(DataSet: TDataSet);
begin
  Query1.FieldByName('FIELD1').AsInteger := GetNewKey;
end;
procedure TForm1.Query1AfterInsert(DataSet: TDataSet);
begin
  { copy the new key into the query's own parameter so that the
    close/open in AfterPost re-selects the newly inserted row }
  Query1.ParamByName('FIELD1').AsInteger := Query1.FieldByName('FIELD1').AsInteger;
end;
procedure TForm1.Query1AfterPost(DataSet: TDataSet);
begin
  with Query1 do
  begin
    { you have to close/open in order to see changes made by the server }
    DisableControls;
    Close;
    Open;
    EnableControls;
    if not FieldByName('VALIDATED').IsNull then
    begin
      { VALIDATED holds an error id; fetch the message the trigger logged }
      SP_GET_ERROR.ParamByName('ID').AsInteger := FieldByName('VALIDATED').AsInteger;
      SP_GET_ERROR.Prepare;
      SP_GET_ERROR.ExecProc;
      Framework.ReportServerError(SP_GET_ERROR.ParamByName('ERRMSG').AsString);
      SP_GET_ERROR.UnPrepare;
    end;
  end;
end;
This is only a rough sketch of my basic framework; however, you can see that it is designed to make the server do most of the work.
Bruises
The migration did not go completely smoothly. We had timing issues in terms of obtaining and installing the new network operating system. Some of the apps ran acceptably enough with TTables to go right into production rather than wait for server optimization; some had to wait. Also, even though we had apps running and we had gotten past the lock file problems, the users were starting to want more and more from the applications. It became hard to balance optimizing what they had originally against adding new features before all the old ones were in place. And all of that on top of solving the little problems that the new environment brought with it...
As of this draft, IB 4.2.2 is the latest version available for NW4.11. There are a number of bugs fixed and enhancements added in subsequent versions that have not been back-ported. Hopefully, though, there will be an IB 5.5 release for NW4.11 in the first half of '99.
The biggest problems I've had to deal with:
- Run-away queries. You can't let a user loose with a query unless you've tested that it completes successfully.
- Metadata changes with users logged in are a no-no in 4.2.2. You can (and probably will) corrupt your database.
- Fine-tuning behavior: it is not automatic. It requires a different mindset for both the programmer and the user, especially if they are used to having access to "all" of the data.
- There are a number of configuration issues that you must consider (hard commits, soft commits, repeatable reads, max rows allowed, to name a few).
- There are some bugs between the BDE, SQL-Links, and IB. They are rare and inconsistent. Just keep your eyes open.
- Sweeping and garbage collection. Turn off Auto-sweeping and do it yourself. Otherwise, your users can get caught in a huge sweep during the busiest time of the day.
- Protocol selection. TCP/IP is the best, but requires work to set up and configure. SPX is the easiest, but performance takes a hit. (We had to use SPX because of some peripheral hardware requirements.) Some machines will not even connect on the first attempt, but then connect faithfully and speedily on every subsequent attempt.
Successes
What actually happened and where we are today:
- Started on 9/22 band-aiding the Delphi/Paradox apps; this took about two weeks
- Started IB in earnest on 10/3
- Ready on 11/1 with Dispatch/Billing, Operations, Crewing, and HR
- First brick wall: NOS upgrade had not been done
- Second brick wall: finding correct protocol and setting up work-stations
- All four completed apps fully migrated and in production, working in harmony, with no more lock files, "index out of date" errors, or dead-dog performance
- Almost all server optimization completed
Now we are in a position to actually separate data entry functionality from management views, and reporting is much simpler. All in all, we have a solid foundation to build on.