Keyword Migration - Functional Specification

Description

The parser that InterBase uses to validate SQL syntax makes a distinction between reserved words and object names. Each time the parser requests a new token from the lexical analyzer, a call is made to HSHD_lookup to determine if the token about to be returned is a keyword. If it is a keyword, then the token is returned to the parser and the parser performs the checks necessary to determine if it is looking for a keyword. If the call to HSHD_lookup does not return anything, meaning the proposed token is not a keyword, then the token SYMBOL is returned.

This keyword lookup is one of the areas that causes incompatibilities between versions of the product: customers are forced to update their applications if they created database objects whose names conflict with new keywords.

There are a few solutions to this problem:

  • Force customers to upgrade all clients and servers at the same time. In addition, force them to remove any reserved words from their database schemas. This is the solution which was used in the past and the one that InterBase V6.0 should avoid at all costs.
  • Remove the keyword restrictions from the parser. This change would require changes to the parser grammar so that each terminating symbol could be reduced to include either a SYMBOL token or a keyword token. With this solution, much of the parser grammar would need to change in order to avoid shift/reduce and reduce/reduce conflicts. Many of these conflicts stem from the rules used to recognize data type information.
  • Version the keywords in the parser so that older clients can connect to a new server and still use existing object names. This change is the least risky to implement and will provide the same amount of flexibility as changing the parser.

The third solution will be used as it is the least risky and most flexible. This is how it works:

Each reserved word has a parser version associated with it. This version determines how a particular keyword is handled and is completely transparent to the user. If the client's parser version is less than the parser version of the keyword, then the token about to be returned by the lexical analyzer was not a keyword when the client was created, and it is therefore treated as a SYMBOL token. If the client's parser version is greater than or equal to that of the keyword, then the token is returned as a keyword.
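The version check described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the engine's actual code: the names `keyword_entry`, `lookup_keyword`, `classify_token`, `TOK_SYMBOL` and `TOK_KEYWORD` are hypothetical, the keyword table is a made-up sample, and `lookup_keyword` merely stands in for the role HSHD_lookup plays in the description.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative token kinds; not the engine's real token codes. */
enum token_kind { TOK_SYMBOL, TOK_KEYWORD };

/* A reserved word and the parser version that introduced it. */
struct keyword_entry {
    const char *text;
    int parser_version;
};

/* Hypothetical sample table; the versions are assumptions for the example. */
static const struct keyword_entry KEYWORDS[] = {
    { "SELECT",    1 },
    { "TABLE",     1 },
    { "TIMESTAMP", 2 },  /* assumed to be new in parser version 2 */
};

/* Stand-in for the keyword hash lookup: returns the entry, or NULL
   if the text is not a keyword. Matching here is exact (case-sensitive). */
static const struct keyword_entry *lookup_keyword(const char *text)
{
    size_t i;
    for (i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++) {
        if (strcmp(KEYWORDS[i].text, text) == 0)
            return &KEYWORDS[i];
    }
    return NULL;
}

/* Decide how the lexer should report a token to the parser. */
enum token_kind classify_token(const char *text, int client_parser_version)
{
    const struct keyword_entry *kw = lookup_keyword(text);

    if (kw == NULL)
        return TOK_SYMBOL;   /* not a keyword in any version */
    if (client_parser_version < kw->parser_version)
        return TOK_SYMBOL;   /* keyword is newer than the client */
    return TOK_KEYWORD;      /* client already knows this keyword */
}
```

With this sample table, a client at parser version 1 that names a column TIMESTAMP would have the token classified as a symbol, while a client at parser version 2 would have it classified as a keyword.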

With this implementation, old clients cannot access new features of the server unless those features are implemented in such a way that they do not interfere with existing clients and the version of each new keyword is set appropriately.

User Interface/Usability

This functionality is implemented completely inside the engine. However, the way in which keywords are recognized will change based on the remote protocol of the client. Though the remote protocol does not explicitly define the client version, PROTOCOL_VERSION10 was introduced in InterBase V6.0 and will be used to determine the parser version.

Remote Protocol   Parser Version   Functionality
< 10              1                All tokens listed as keywords and introduced in InterBase V6.0 are valid symbol names. New tokens for datatypes can not be used. New SQL features or data types can not be used.
>= 10             2                No tokens listed as keywords will be allowed to be used as symbol names. All new SQL features and data types can be used.
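The mapping in the table above amounts to a single comparison against the protocol version. The sketch below assumes that PROTOCOL_VERSION10 has the numeric value 10, as the table's "< 10" and ">= 10" rows suggest; the function name `parser_version_for_protocol` and the `PARSER_VERSION_*` constants are hypothetical and used only for illustration.

```c
/* Assumed value, taken from the "< 10" / ">= 10" rows of the table. */
#define PROTOCOL_VERSION10 10

/* Hypothetical names for the two parser versions in the table. */
enum {
    PARSER_VERSION_1 = 1,  /* V6.0 keywords remain usable as symbol names */
    PARSER_VERSION_2 = 2   /* V6.0 keywords are reserved */
};

/* Derive the parser version from the client's remote protocol version. */
int parser_version_for_protocol(int protocol_version)
{
    if (protocol_version < PROTOCOL_VERSION10)
        return PARSER_VERSION_1;
    return PARSER_VERSION_2;
}
```

A client connecting with remote protocol 9 would therefore be handled with parser version 1, and a client at protocol 10 or later with parser version 2.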

This restriction based on client dialect is made since there is no way for older clients to send the version of the parser when preparing a statement. We plan on adding an internal parameter to DSQL_PREPARE which provides the parser version. This would remove the dependency between client dialect and parser version.

Requirements and Constraints

This feature will exist on all platforms. The constraints are listed in the table above. With the current implementation, the client dialect must be increased each time a keyword is added to the product.

Migration Issues

This feature allows older (pre-InterBase V6.0) clients to connect to an InterBase V6.0 server without being restricted by keywords that were not available in an InterBase V5.0 server. The migration path for customers then becomes:

  • Backup databases with old version.
  • Install new server.
  • Restore databases with new version.
  • Upgrade old clients and applications (if needed).

This feature should only be a necessity for customers migrating to InterBase V6.0. In the future, customers will be able to use delimited identifiers around keywords in their applications.

For NT installations of the product, an assumption is being made that the client and server are upgraded at the same time. This would mean that local access to the server will automatically change once the server is upgraded.