Database Parameter - isc_dpb_utf8_filename

By Paul Beach

Adriano dos Santos Fernandes... (README.connection_string_charset1.txt)

"Before Firebird 2.5, filenames that were used in the connection string were always passed from the client to the server without any conversion. On the server, the filenames are used with the Operating System's own API functions without any conversion too. This creates a situation where filenames using non-ASCII characters do not interoperate well when the client and the server are on different Operating Systems, or even on the same Operating System using different codepages.

The problem was addressed in Firebird 2.5 in the following way: The filename is considered, by default, to be on the Operating System codepage.

A new DPB (database parameter block) item was introduced, named isc_dpb_utf8_filename. It is used to address the problem outlined above: when it is present, Firebird considers the passed filename as being in UTF-8 format.

If a 2.5 (or newer) client is communicating with a remote server older than 2.5, and isc_dpb_utf8_filename was used, the client converts the filename from UTF-8 to the client codepage and passes that filename to the server. The client removes isc_dpb_utf8_filename DPB.

This guarantees backward compatibility when people are using the same codepage on the client and the server.

If a 2.5 (or newer) client is communicating with a 2.5 (or newer) server, and isc_dpb_utf8_filename was not used, the client converts the filename from the OS codepage to UTF-8 and inserts the isc_dpb_utf8_filename DPB. If isc_dpb_utf8_filename was used, the client just passes the original filename within the DPB to the server. So the client always passes to the server a UTF-8 filename and the isc_dpb_utf8_filename DPB.

The filename received on the server is subject to the same rules above. But note that a 2.5 client may automatically convert the filename and insert the DPB.

Clients older than 2.5 do not, so the received filenames are going to be considered as being the server codepage. We guarantee backward compatibility when the client and server codepage are the same.

The Operating System codepage used for conversions is:

  • Windows: the Windows ANSI code page
  • Others: UTF-8"
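
The rules quoted above for a 2.5 (or newer) client can be summarised as a small decision function. This is only a sketch for illustration: client_action and plan_filename are hypothetical names, not part of the Firebird API, and the case of an older server without the DPB item is not spelled out in the quoted text, so it is assumed here that the name is passed through unchanged.

```c
#include <stdbool.h>

/* Hypothetical type, for illustration only; not part of the Firebird API. */
typedef struct {
    bool convert_to_utf8;     /* client converts OS codepage -> UTF-8      */
    bool convert_to_codepage; /* client converts UTF-8 -> client codepage  */
    bool flag_sent;           /* isc_dpb_utf8_filename sent to the server  */
} client_action;

/* What a 2.5 (or newer) client does with the filename before sending it. */
client_action plan_filename(bool server_is_25_or_newer, bool utf8_flag_given)
{
    client_action a = { false, false, false };

    if (server_is_25_or_newer) {
        /* The server always receives a UTF-8 name plus the DPB item. */
        a.convert_to_utf8 = !utf8_flag_given;
        a.flag_sent = true;
    } else {
        /* An older server does not know the DPB item, so it is removed.
           Assumption: without the item the name is passed unchanged. */
        a.convert_to_codepage = utf8_flag_given;
        a.flag_sent = false;
    }
    return a;
}
```

For example, plan_filename(true, false) describes a 2.5 client talking to a 2.5 server without the item: the client converts the name to UTF-8 and adds isc_dpb_utf8_filename itself.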

Creating a database with a UTF8 name

So how do you create a database with a UTF8 name?

Simply put, the answer is to call the isc_create_database function with isc_dpb_utf8_filename set in the database parameter block (DPB):

ISC_STATUS isc_create_database (
ISC_STATUS *status_vector,
short db_name_length,
char *db_name,
isc_db_handle *db_handle,
short parm_buffer_length,
char *parm_buffer,
short db_type);

Note

db_type is unused.
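
When isc_dpb_utf8_filename is set, the bytes passed in db_name are treated as UTF-8, so it can be worth checking them before the call. The following is a minimal sketch of such a check. is_valid_utf8 is a hypothetical helper, not a Firebird function, and it only tests structural validity (lead and continuation bytes, overlong forms, surrogates and range).

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if the NUL-terminated string s is
   structurally valid UTF-8. */
bool is_valid_utf8(const char *s)
{
    /* Smallest code point that legitimately needs 1, 2, 3 or 4 bytes. */
    static const unsigned long min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *) s;

    while (*p) {
        unsigned long cp;
        int extra, i;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return false;  /* stray continuation byte or invalid lead byte */
        p++;

        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return false;  /* also catches a premature end of string */
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min_cp[extra]) return false;               /* overlong form */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;     /* surrogate     */
        if (cp > 0x10FFFF) return false;                    /* out of range  */
    }
    return true;
}
```

In C source, a name such as café.fdb can be written with explicit UTF-8 bytes as "caf\xC3\xA9.fdb", which this check accepts. The same name in a single-byte codepage, "caf\xE9.fdb", is rejected.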

The code

Using the DPB, the code does the following:

  • Sets the username and password
  • Sets the page size of the database
  • Sets the sweep interval to 0
  • Sets isc_dpb_utf8_filename, so that the filename is treated as UTF-8
  • Sets the database character set to UTF8
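
Each item in the list above is stored in the DPB as a one-byte tag, a one-byte length and the value bytes, after a leading version byte. A minimal sketch of bounds-checked helpers for building such a buffer follows. The names dpb_builder, dpb_init, dpb_add_string and dpb_add_int are hypothetical and are not part of ibase.h. The little-endian layout of numeric values is stated here as an assumption about what the API expects.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type; not part of ibase.h. */
typedef struct {
    unsigned char buf[256];
    size_t len;
    int overflow;   /* set to 1 if an item did not fit and was skipped */
} dpb_builder;

/* Start a new buffer with the version byte. */
void dpb_init(dpb_builder *b, unsigned char version)
{
    b->len = 0;
    b->overflow = 0;
    b->buf[b->len++] = version;
}

/* Check that tag + length + value_len bytes still fit. */
static int dpb_reserve(dpb_builder *b, size_t value_len)
{
    if (value_len > 255 || b->len + 2 + value_len > sizeof b->buf) {
        b->overflow = 1;
        return 0;
    }
    return 1;
}

/* Append a string item: tag, length, bytes (no terminating NUL). */
void dpb_add_string(dpb_builder *b, unsigned char tag, const char *value)
{
    size_t n = strlen(value);

    if (!dpb_reserve(b, n))
        return;
    b->buf[b->len++] = tag;
    b->buf[b->len++] = (unsigned char) n;
    memcpy(b->buf + b->len, value, n);
    b->len += n;
}

/* Append a numeric item as four bytes, least significant byte first
   (assumed encoding for numeric DPB values). */
void dpb_add_int(dpb_builder *b, unsigned char tag, unsigned long value)
{
    int i;

    if (!dpb_reserve(b, 4))
        return;
    b->buf[b->len++] = tag;
    b->buf[b->len++] = 4;
    for (i = 0; i < 4; i++)
        b->buf[b->len++] = (unsigned char) ((value >> (8 * i)) & 0xFF);
}
```

With ibase.h included, the tags would be the isc_dpb_* constants, for example dpb_add_string(&b, isc_dpb_set_db_charset, "UTF8"), and the finished buffer would be passed to isc_create_database as (char *) b.buf with length (short) b.len, after checking that b.overflow is still 0.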

Code:

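#include <stdio.h>
#include <string.h>
#include "ibase.h"

int main(void)
{
    ISC_STATUS_ARRAY status_vector;

    /* The handle must be zero before isc_create_database is called. */
    isc_db_handle db1 = 0;

    char dpb_buffer[256], *dpb;
    const char *p;
    int i;

    const char *database = "test.fdb";
    const char *user_name = "SYSDBA";
    const char *password = "masterkey";
    const char *charset = "UTF8";
    const long pagesize = 8192;

    short dpb_length;

    /* The DPB starts with a version byte, followed by items of the
       form: tag, length, value. */
    dpb = dpb_buffer;
    *dpb++ = isc_dpb_version1;

    /* Tell Firebird that the database filename is in UTF-8. */
    *dpb++ = isc_dpb_utf8_filename;
    *dpb++ = (char) strlen(database);
    for (p = database; *p;)
        *dpb++ = *p++;

    *dpb++ = isc_dpb_user_name;
    *dpb++ = (char) strlen(user_name);
    for (p = user_name; *p;)
        *dpb++ = *p++;

    *dpb++ = isc_dpb_password;
    *dpb++ = (char) strlen(password);
    for (p = password; *p;)
        *dpb++ = *p++;

    /* Default character set of the new database. */
    *dpb++ = isc_dpb_set_db_charset;
    *dpb++ = (char) strlen(charset);
    for (p = charset; *p;)
        *dpb++ = *p++;

    /* Numeric values are stored as little-endian integers, not as text. */
    *dpb++ = isc_dpb_page_size;
    *dpb++ = 4;
    for (i = 0; i < 4; i++)
        *dpb++ = (char) ((pagesize >> (8 * i)) & 0xFF);

    /* Sweep interval 0, stored as a one-byte integer. */
    *dpb++ = isc_dpb_sweep_interval;
    *dpb++ = 1;
    *dpb++ = 0;

    dpb_length = (short) (dpb - dpb_buffer);

    printf("Creating Database\n");

    isc_create_database(
        status_vector,
        (short) strlen(database),
        database,
        &db1,
        dpb_length,
        dpb_buffer,
        0);

    /* A status vector that starts with 1 and a non-zero code is an error. */
    if (status_vector[0] == 1 && status_vector[1])
    {
        isc_print_status(status_vector);
        return 1;
    }

    /* Release the attachment created together with the database. */
    isc_detach_database(status_vector, &db1);

    return 0;
}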

The result

isql:

SQL> connect 'test.fdb';
Database:  'test.fdb'
SQL> show database;
Database: test.fdb
     Owner: SYSDBA
PAGE_SIZE 8192
Number of DB pages allocated = 149
Sweep interval = 0
Forced Writes are ON
Transaction - oldest = 1
Transaction - oldest active = 1
Transaction - oldest snapshot = 1
Transaction - Next = 4
ODS = 11.2
Default Character set: UTF8
SQL>