Product Proposal: InterBase 7.0 Major Release
November 5, 1999
Important
This document contains information on what was initially proposed for InterBase V7.0 - pre December 1999.
Executive Summary
- This is a proposal and specification for the next major release of the InterBase product.
- Target FCS date for the product is approximately Third Quarter 2001.
Product And Market Summary
Product Components
Product Name: InterBase 7.0.0
InterBase SuperServer II: New build based on version 6.0.0 with major new enhancement for SMP scaleability and other features.
The version string for these versions are WI-V7.0.0, LI-V7.0.0, SO-V7.0.0
- Server hardware:Windows/Linux/FreeBSD:Intel Pentium/Pentium II/Pentium III single CPUIntel Pentium/Pentium II/Pentium IIImulti-CPU (SMP)Solaris:Sun SPARC/UltraSPARC single CPUSun SPARC/UltraSPARC multi-CPU (SMP)
- Server build compiler:Windows: Microsoft Visual C++, latest version ca. 2001Linux/FreeBSD: GNU egcs compiler, latest version ca. 2001Solaris: Sun Sparcworks C Compiler, latest version ca. 2001
Client compiler: Same per platform
Bundle InterBase Clients: WI-V7.0.0, LI-V7.0.0, SO-V7.0.0
Bundle InterClient: InterClient 2+
Bundle Documentation: InterBase 7.0 documentation in PDF format, with Adobe Acrobat Reader for Win32, Solaris, Linux, and FreeBSD.
Platform Specification
Platform Certification Requirements
Certify the product on the following OS configurations:
- Windows 98
- Windows NT 4.0, current service pack
- Windows 2000
- Red Hat Linux for Intel, current release
- S.u.S.E. Linux for Intel, current release
- FreeBSD for Intel, current release in Linux compatibility mode
- Solaris 2.6 for 32-bit SPARC hardware
- Solaris 7 for 64-bit UltraSPARC hardware
- For Core Team consideration:
Solaris 8 for 64-bit UltraSPARC hardware
Test matrix of clients and servers on heterogeneous platforms:
MTBF testing to ensure high uptime and reliability
RDBMS Competition
- Oracle 8i
- Microsoft SQL Server
- IBM DB2
Strategic Justification
Market Justification
In order to remain competitive in the RDBMS market, InterBase needs to evolve and enhance certain features. Key market drivers are the growing popularity of Internet, Java, XML, SMP, and mobile computing.
Release And Project Scope
Remedy Defects
Features To Implement
Short Term Features: To be backported to form InterBase Minor Release "CodeName1"
Some features are in demand for a short term minor release of InterBase. R&D should implement these features first, and then make a source branch for the porting team to build and test a stable 6.1 product. This is also an opportunity to release high priority bug fixes.
Local Access
SuperServer needs to implement a local access mechanism for performance reasons. XNET is a partially finished portable IPC implementation.
Built-in SQL functions
Though UDFs offer an elegant means for users to extend the SQL features of the server, most users prefer not to be required to implement common functions themselves. They request us to have a set of frequently-used functions as built in to the server. Most other RDBMS products do this. In order to offer value to users and remain competitive, we need to do this.
- Implement all built-in functions from ANSI SQL 92 (COALESCE, CONVERT, LOWER, NULLIF, OVERLAY, POSITION, SUBSTRING, TRANSLATE, TRIM). COALESCE is the one that users ask for most frequently, probably because it cannot be duplicated with a UDF. We already support SQL functions UPPER and EXTRACT.
- Implement functions to perform Blob operations, such as: searching, concatenate, substring, char type to Blob, Blob to char type, etc.
- Consider functions from the proposed new versions of ANSI SQL, such as the regular-expression matching function SIMILAR.
- Implement functions supported by ib_udf.dll, udflib.c, Greg Deatz' FreeUDFLib, and Robert Schieck's udflib. This does not necessarily mean to use Greg or Rob's source code, but just to implement functions matching the semantics of functions in these libraries.
- Research the set of functions that are supported by our principle competitors, and duplicate as many of these functions as make sense. Design a proposal and have Core Team approve it. See 7.0 functions for some research on competitors' functions.
- Implement functions as built-in to ibserver.exe without requiring DECLARE EXTERNAL FUNCTION syntax to enable access to the functions.
Optimizer improvements
We can always use a more intelligent optimizer. Sriram already has some specific fixes that include rearchitecting the way we handle NULLs in indexes. Any optimizer improvements which R&D judges can fit in the schedule will be considered by Core Team.
Enhance external tables
External tables currently support padded-field text files. Common data export tools do not support this type of file, preferring instead a field-delimited format.
Support text files with user-defined symbol delimiter between fields. For instance, tab-delimited or comma-delimited. Record separator would be newline, by default. A suggested implementation calls for R&D to introduce new syntax such as:
CREATE TABLE Foo [ EXTERNAL [ DELIMITED [ DELIMITOR 'symbol' ] [ SEPARATOR 'symbol' ] ] [ FILE ] 'pathname' ] ( ... )
Investigate what it would take to implement at least a rudimentary XBASE and/or Paradox file format support. This is a request for research and a "nice to have" category feature.
Java UDFs
JNI and embedded JVM to enable UDFs to be written in Java (Paul Ostler's research project).
OLE DB
Bundle a native OLE DB provider solution. We do not necessarily have to implement this in house; we can license a solution from an outside party, but we still must perform acceptance testing, documentation, and integration.
Blob Parameters
Test and document use of Blobs as input and output parameters for Stored Procedures. This apparently works already in the engine, we just need to certify it.
Read-ahead I/O
Charlie Caro has designed a "prefetch" implementation for optimizing I/O in the server. Estimates are that it could increase some kinds of operations 30-40%.
IBX Support
Ravi Kumar has requested several minor features for optimization of IBX, including returning RDB$DB_KEY in the XSQLDA, etc.
Temporary tables education
Users frequently demand the use of temporary tables that are visible only in a session, and which evaporate automatically when the session terminates.
Cleanup implementation of IBCONFIG parsing
For example, string config arguments are required to be quotes with double-quotes. This is inconsistent with the new IB 6.x semantics of doublequotes as identifier delimiters.
Also, ibconfig lines are limited to 80 characters, including parameter name and whitespace. In a world of long pathnames and perhaps multiple arguments for some ibconfig entries, this limit is too restrictive.
Comparative analysis
Conduct benchmarks to quantitatively measure InterBase performance relative to prior versions of InterBase, as well as versus competitors. For example, we need TPC benchmarks to publish on www.tpc.org.
Qualititive comparisons are required; technical comparisons of InterBase features versus the competitors' features. Where does InterBase have differentiating advantages?
Medium term features: To be backported to form InterBase Minor Release "CodeName2"
Additional features that are also in high demand but require a more lengthy development effort should similarly form a source branch for the porting team to create a stable 6.2 product.
Localized message database
Most of the structure in our message database needed to allow the message database to contain text in multiple languages is done. But the data has been neglected and not maintained. We need to revisit this in order to take care of the international needs. Doing this includes the following steps:
- Clean up the message database.
- Perform some editorial review of our message text.
- Outsource translation.
- Distribute localized versions of interbase.msg.
- Provide an Install API option, an ib_config parameter, and a GUI method of switching the message file.
Shadowing to a mapped drive on Wintel
Users want to write their shadow files to a mapped drive in a Wintel environment. We can write a shadow file to an NFS-mounted filesystem in a UNIX environment. Without this feature, some users criticize the shadowing feature, calling it "useless." They are willing to accept the immense performance penalty for performing I/O over a LAN, but they want a "hot backup" on a separate host for the case of a complete server failure.
Another shadow feature request: allow clients to connect to a shadow database as though it were a read-only database.
General support for nifty GDML features:
Create table based on a query or table
This is a SQL feature request to make it easier to copy a table. CREATE TABLE FOO AS SELECT * FROM BAR; The table FOO would be created automatically with datatypes matching the fields of the query, and fill the new table with the data in the query result set.
Support the GDML column alias
GDML allowed columns to have an alternative name, an "alias." We don't support this feature in SQL, but it is supported in the engine.
Support the GDML first n query
GDML allowed users to request the first n rows of a more lengthy query. This is a handy feature that customers have requested, even without knowing that we have the capability in the engine. Surface this functionality through SQL.
Integration with operating system security
Add options for integration with authentication systems including Microsoft Domains, Microsoft ADS, Novell NDS, NIS, and Hesiod and Kerberos on appropriate platforms. Given the number of such security services, perhaps it is better to create a plug-in type interface for InterBase security, so that third parties can write their own security bindings. Be careful that this doesn't open a security hole such that hackers can install their own security plug-in and gain access to everything.
Data import/export tool
We should provide a user-friendly data import tool or wizard that loads data as quickly as the insert-from-external-table method. This would help customers who need to upsize their Paradox, Visual dBASE, or Access databases. This can leverage from the work done with external tables in CodeName1.
Comparative analysis
Qualititive comparisons are required; technical comparisons of InterBase features versus the competitors' features. Where does InterBase have differentiating advantages?
Long term features: To form full InterBase Major Release "CodeName3"
Major features that require architectural or major ODS changes, or which require significant development time must be released in a full 7.0 project. However, for some of these features, work must begin immediately for research, design, and implementation.
Product managability enhancements
GFIX enhancements
Detect and correct more cases of database corruption
Asynchronous query cancel
This is a high-demand feature, to allow the DBA to monitor, identify, and forcibly kill runaway queries. Clients should also be able to kill their own queries. This feature should allow SYSDBA to cancel a query even if the user count is full; don't deny SYSDBA because user attachments have been exceeded.
Client termination
This is a high-demand feature, to allow the DBA to monitor, identify, and forcibly detach individual client connections. This feature should allow SYSDBA to terminate a client connection even if the user count is full; don't deny SYSDBA because user attachments have been exceeded.
Server-level or database-level security privileges
We've proposed this for allowing control over who can create metadata, who can back up a database, etc. There is an increasing demand for security as people deploy on the Internet.
Secure deployable databases
This has been called "Database Security II." We should have a mechanism for prevention of database kidnappers. One suggestion is to create a key pair in a database and its matching isc4.gdb so that the database cannot be moved to another system. Another suggestion is to store passwords in the gdb file, not in isc4.gdb. This level of security should be optional on per-database basis. There should also be a failsafe, because highly protected databases could become permanently locked if the key is somehow lost.
64-bit file I/O
The limitation of 2GB per database file is a hardship for some users with large databases (for example, there is a user in Brazil with a single 182GB database made up of 91+ files). By using 64-bit file pointers on platforms that support them, InterBase would support files larger than any practical requirement.
Performance and scalability enhancements
Multi-threaded server (Superserver II)
Implement the InterBase Superserver as a fully threaded engine, without a mutex around the engine. This should increase throughput and performance greatly. This is also needed to allow InterBase Superserver to benefit from SMP hardware, which is now very popular.
One additional requirement is to allow the DBA to configure the number of CPUs and which CPUs ibserver utilizes.
Multithreaded client interface
Customers often demand to be able to share database connection handles and transaction handles between multiple threads in an application.
Query optimizer improvement
We have many anecdotal reports that the query optimizer in InterBase 5.x is inferior to the optimizer in 4.x for many types of queries. We'll never create a perfect query optimizer for all cases, but it should be an ongoing task to improve the cost estimation algorithms to serve the commonly used query types. This relieves users of the responsibility to generate PLANs to override the optimizer. There are situations when it is inconvenient for the user to provide a PLAN, such as when using a machine query generator like Crystal Reports. InterBase is supposed to be self-tuning. In every release, InterBase should do a better job of creating its own PLANs.
Developer and DBA tools
Debugger for stored procedures & triggers
There exist contributed tools that do some form of debugging on stored procedures and triggers. We should evaluate these and distribute them if they meet our standards.
Diagnosting/debugging tool
Tech Support and QA both need some mechanism to help deduce the possible causes for crashes or other erroneous behavior. For example, when an important user has a recurring crash, it would be helpful for speedy diagnosis to be able to log the server context and state when the crash occurs. One suggestion for this implementation would include graphical diagnostic monitoring tools (such as a Windows-gds_lock_print with dynamically updating display).
New features
Bidirectional scrollable cursors
This is important for navigating through live datasets, if we want to implement that feature. The alternative is to require navigable datasets to be cached on the client in a TClientDataset. This feature has been implemented in the code, but ifdef'd out. It needs to be activated, tested, and documented.
Event technology enhancements
We should make a quantum leap in event technology, to provide a new paradigm in "push" technology. Implement hierarchical events and allow data arguments to be passed with an event.
For example:
POST FOO.BAR.BAZ ARG1 ARG2
Allow clients to listen for classes of events. For example:
EVENT_WAIT ANY EVENT_WAIT FOO EVENT_WAIT FOO.BAR EVENT_WAIT FOO.BAR.BAZ
These statements would all receive the event posted in the example above. In other words, instead of matching an event based simply on exact string, the event matching algorithm would match based on exact string, i.e. with the regular expression /^FOO$/, but also match with the regular expression /^FOO..*/.
Other event-related feature suggestions:
- Make sure there are components for Delphi/C++Builder/JBuilder that take advantage of the new Event technology.
- Create an "event cursor" mechanism so the client could retrieve the event name/class, info about the event (who posted it, for example), the number of arguments, and the data of the arguments themselves.
- Deferred events so a client could log on and get a sequence of events that have been queuing up.
- Permit different asynchronous event handlers for different event classes, i.e. one event handler for event FOO, a different event handler for event FOO.BAR, and a different event handler for event LIME.
- Permit multiple asynchronous event handlers for any given event.
- Allow events within or outside a transaction.
- Add SQL security to GRANT certain users or roles the privilege to receive notification of an event class.
Expression indexes
Permit indexes to be defined for an arbitrary scalar expression, not just a simple column or set of columns. This would provide a method to perform case-insensitive sorting. It could also be a way to encourage dBASE users to migrate to InterBase without losing functionality. This feature has been implemented in the code, but ifdef'd out. It needs to be activated, tested, and documented.
Comparative analysis
Conduct benchmarks to quantitatively measure InterBase performance relative to prior versions of InterBase, as well as versus competitors. For example, we need TPC benchmarks to publish on www.tpc.org.
Qualititive comparisons are required; technical comparisons of InterBase features versus the competitors' features. Where does InterBase have differentiating advantages?
Schedule
The porting development team will port and certify the product as quickly as is practical, starting 11/8/99.