FBOpt Database Optimiser
FBOpt is a command-line tool for optimising Firebird databases. The tool addresses some aspects of the internal structure of databases that the Firebird engine itself cannot rectify and some issues that survive even a backup and restore with gbak. It is created and distributed by IBPhoenix on the IBPhoenix Developer DVD (starting from issue 29).
Internal Database Structure
Efficiency of the internal database structure plays an important role in the performance of the Firebird engine. The layout of internal structures has a direct impact on the complexity and speed of operations needed for processing them, which translates to utilisation of CPU and memory. Data distribution has a primary impact on the number of I/O operations and on cache utilisation. While the layout of data structures is fixed in the code, data distribution is variable, depending on the actual operations performed on the database. In current Firebird implementations, the engine pays only minimal attention to the storage and distribution of both user and internal data. The engine’s primary focus is on the speed of the current operation, rather than the overall balanced performance. Over time this can lead to degradation of performance as the internal database layout becomes progressively less efficient.
Eventually the drop in performance requires intervention which, for Firebird, means a rebuild of the database from a logical backup (gbak). A gbak restore fixes some problems, namely:
- Row fragmentation, restoring each row on a single page if a single row is not bigger than database page
- Table and index fragmentation, data and index pages are compacted to hold as much data as possible, restoring all pages that belong to a single database table or index consecutively.
Over time this can lead to degradation of performance as the internal database layout becames less and less effective. Eventually the drop in performance requires intervention, which for Firebird means a rebuild of the database from a logical backup (gbak).
However, restore from backup fixes only the following problems, namely:
- Row fragmentation, so each row is stored on single page (if a single row is not bigger than database page)
- Table and index fragmentation, data and index pages are compacted to hold as much data as possible, and all pages that belong to single database table or index are stored consecutively.
A gbak restore does not address these problems:
- Index clustering: jumps to different data pages while walking through an index
- Data density on data pages: all data pages could be either highly packed or have space reserved for updates
- Physical location of user data and index structures: all user data is stored first, in creation order, then all the indices are added
- Prevention of data fragmentation
FBOpt was designed and created to address these problems.
Optimisation with FBOpt
FBOpt is basically a database copy utility. Its operation is similar to piping gbak backup to gbak restore via standard output, except that FBOpt has no need to create an intermediate backup data stream and can apply “tweaks” to copied data. Because FbOpt uses the same low-level BLR interface as gbak for communicating with the Firebird engine, it is as fast as gbak. However, some steps that are required by gbak are unnecessary in a data copy utility. Also, some operations are handled more efficiently by FBOpt - especially large BLOBs and ARRAYs - so it can often be faster than gbak. The performance gain is usually a few per cent, but it could be significant when the database contains many huge BLOB values. The real value of FbOpt is the additional operations that it can perform on newly-created databases.
With FBOpt, an administrator can specify
- the physical order of rows in individual tables
- the space reserved (or not reserved) for individual tables
- the physical order of tables in a newly created database
- clusters of tables and their indices, storing index structures together with their source tables.
Physical Order of Rows
Specifying the order of rows can significantly improve performance for queries that read data in a range of key values or in a given key order.
If there is an index with the same key, it has the highest possible “clustering factor” and there are no unnecessary jumps to different data pages while walking through the index. Any potential loss of performance from using such an index for returning data in key order, rather than internal sort order, is eliminated.
Space Reservation
Specifying the space reservation level for individual tables allows the option to mark tables that do not change for higher storage density while keeping necessary space reserved for tables that are frequently updated. High data density saves disk space and increases query performance by reducing the number of I/O operations.
Physical Order of Tables and Clustering
Specifying the physical order of tables in a database allows storage of tables storing static data before tables that are updated or extended, so the volatile part of the database is at the end of the file. Combining this with table clustering makes it possible to keep static data more or less separated from volatile data which, over time, helps reduce the build-up of fragmentation.