Date: March 2010
Like most database servers, Pervasive PSQL keeps track of data on database objects, such as tables, columns, views, and indexes, in so-called DDF files (sometimes called catalog or system tables). These files are automatically created by PSQL during installation and/or when a database is created. This type of data is often called (technical) meta data.
Sometimes, we need to access the meta data in our applications. It might be that we want to know the names of the columns of a specific table, or the data type and maximum length of a particular column. But how do we query those DDF files? There are four different approaches to query them. First, you can use Pervasive Control Center to view the properties of the database objects. The advantage of this approach is that you don’t have to understand the structure of each DDF file. PCC understands it. Retrieving the meta data is purely a matter of point and click. The disadvantage of this approach is that you can’t use it if you want to query the meta data from within an application.
The second approach is through a set of system stored procedures supported by Pervasive PSQL. These procedures allow us to display meta data from the DDF files. For example, with the following statement we get a list of all the tables in the current database:
CALL PSP_TABLES (,,'USER TABLE')
The result might look like this:
TABLE_QUALIFIER TABLE_OWNER TABLE_NAME TABLE_TYPE REMARKS FILE_LOCATION
--------------- ----------- ----------------- ---------- ------- ---------------------
TENNIS ? PLAYERS USER TABLE ? PLAYERS.mkd
TENNIS ? TEAMS USER TABLE ? TEAMS.mkd
TENNIS ? MATCHES USER TABLE ? MATCHES.mkd
TENNIS ? PENALTIES USER TABLE ? PENALTIES.mkd
TENNIS ? COMMITTEE_MEMBERS USER TABLE ? COMMITTEE_MEMBERS.mkd
The disadvantage of this approach is that you can’t query the results of the procedures. These procedures work as predefined queries and the returned results are final results. For example, if we only want to see tables with at least four columns, or with a name that starts with the letter D, that’s not possible. So, it’s not a very flexible approach.
The third approach is the die-hard approach, where we write our own SELECT statements and query the DDF files directly. In principle, querying the DDF files is like querying ordinary tables. However, their internal structures are somewhat complex and rather cryptic. A lot of codes are used to describe certain properties. In other words, writing queries on DDF files is hard, and you have to understand the internal structures of the DDF files very well.
The fourth approach is that we define views on the DDF files. Those views will present a more straightforward structure to the applications. For my book The SQL Guide to Pervasive PSQL, I created a number of those views. In a simple and transparent way those views give access to the actual, underlying DDF files. In addition, because they are views, we can write any query on those views to get whatever we want.
For example, with the following long and complex statement we create a view that shows the properties of all the columns of each table in the current database:
CREATE VIEW COLUMNS (TABLE_NAME, COLUMN_NAME, COLUMN_NO, DATA_TYPE,
CHAR_LENGTH, PRECISION, SCALE, NULLABLE) AS
SELECT UPPER(TAB.XF$NAME), UPPER(COL.XE$NAME),
(SELECT COUNT(*)
FROM X$FIELD AS F
WHERE F.XE$FILE = TAB.XF$ID
AND F.XE$OFFSET <= COL.XE$OFFSET
AND F.XE$DATATYPE NOT IN (227,255)),
CASE XE$DATATYPE
WHEN 0 THEN 'STRING'
WHEN 1 THEN
CASE XE$SIZE
WHEN 1 THEN 'TINYINT'
WHEN 2 THEN 'SMALLINT'
WHEN 4 THEN 'INTEGER'
WHEN 8 THEN 'BIGINT'
ELSE 'UNKNOWN INTEGER'
END
WHEN 2 THEN
CASE XE$SIZE
WHEN 4 THEN 'REAL'
WHEN 8 THEN 'DOUBLE'
ELSE 'UNKNOWN REAL'
END
WHEN 3 THEN 'DATE'
WHEN 4 THEN 'TIME'
WHEN 5 THEN 'DECIMAL'
WHEN 8 THEN 'NUMERIC'
WHEN 9 THEN
CASE XE$SIZE
WHEN 4 THEN 'BFLOAT4'
WHEN 8 THEN 'BFLOAT8'
ELSE 'UNKNOWN BFLOAT'
END
WHEN 11 THEN 'VARCHAR'
WHEN 14 THEN
CASE XE$SIZE
WHEN 1 THEN 'UNSIGNED1 BINARY'
WHEN 2 THEN 'UNSIGNED2 BINARY'
WHEN 4 THEN 'UNSIGNED4 BINARY'
WHEN 8 THEN 'UNSIGNED8 BINARY'
ELSE 'UNKNOWN UNSIGNED BINARY'
END
WHEN 15 THEN
CASE XE$SIZE
WHEN 2 THEN 'AUTOINC2'
WHEN 4 THEN 'AUTOINC4'
ELSE 'UNKNOWN AUTOINC'
END
WHEN 16 THEN 'BIT'
WHEN 17 THEN 'NUMERICSTS'
WHEN 18 THEN 'NUMERICSA'
WHEN 19 THEN 'CURRENCY'
WHEN 20 THEN 'TIMESTAMP'
WHEN 21 THEN 'CLOB/BLOB'
WHEN 25 THEN 'WSTRING'
WHEN 26 THEN 'WZTRING'
WHEN 27 THEN 'GUID'
WHEN 30 THEN 'DATETIME'
ELSE CONCAT('DATATYPE UNKNOWN = ', CAST(XE$DATATYPE AS CHAR(3)))
END,
XE$SIZE, XE$SIZE, XE$DEC,
CASE
WHEN XE$FLAGS & 4 = 4 THEN 'YES'
ELSE 'NO'
END
FROM X$FIELD AS COL, X$FILE AS TAB
WHERE COL.XE$FILE = TAB.XF$ID
AND TAB.XF$FLAGS & 16 <> 16
AND COL.XE$DATATYPE NOT IN (227,255)
Let’s create a new table to show what a query on the above view looks like.
CREATE TABLE PLAYERS (
PLAYERNO INTEGER NOT NULL,
NAME CHAR(15) NOT NULL,
INITIALS CHAR(3) NOT NULL,
BIRTH_DATE DATE,
SEX CHAR(1) NOT NULL,
JOINED SMALLINT NOT NULL,
STREET VARCHAR(30) NOT NULL,
HOUSENO CHAR(4),
POSTCODE CHAR(6),
TOWN VARCHAR(30) NOT NULL,
PHONENO CHAR(13),
LEAGUENO CHAR(4),
PRIMARY KEY (PLAYERNO))
Now, query the COLUMNS view:
SELECT *
FROM COLUMNS
WHERE TABLE_NAME = 'PLAYERS'
The result is:
TABLE_NAME COLUMN_NAME COLUMN_NO DATA_TYPE CHAR_LENGTH PRECISION SCALE NULLABLE
---------- ----------- --------- --------- ----------- --------- ----- --------
PLAYERS SEX 5 STRING 1 1 0 NO
PLAYERS BIRTH_DATE 4 DATE 4 4 0 YES
PLAYERS JOINED 6 SMALLINT 2 2 0 NO
PLAYERS INITIALS 3 STRING 3 3 0 NO
PLAYERS NAME 2 STRING 15 15 0 NO
PLAYERS PLAYERNO 1 INTEGER 4 4 0 NO
PLAYERS STREET 7 VARCHAR 31 31 0 NO
PLAYERS HOUSENO 8 STRING 4 4 0 YES
PLAYERS POSTCODE 9 STRING 6 6 0 YES
PLAYERS TOWN 10 VARCHAR 31 31 0 NO
PLAYERS PHONENO 11 STRING 13 13 0 YES
PLAYERS LEAGUENO 12 STRING 4 4 0 YES
Here is another example: How many integer columns does the PLAYERS table have?
SELECT COUNT(*)
FROM COLUMNS
WHERE TABLE_NAME = 'PLAYERS'
AND DATA_TYPE IN ('TINYINT', 'SMALLINT', 'INTEGER', 'BIGINT')
The result is:
COUNT(*)
-------
2
With the first three approaches this query would not have been possible. And can you imagine how complex this query would have been if we had written it directly on the DDF files? Now, it’s not that complex at all, and the query is quite readable.
An additional advantage of the fourth approach over the one where developers write their own queries directly on the DDF files, is that not all the developers have to understand the structures of the DDF files, only the ones who write the view. And the chance that someone makes a mistake with understanding how it works is minimized.
The view definition shown above is just an example. You can change it anyway you want. For example, if you prefer different names for the data types, or if you don’t want to show the table and column names in capitals, just change the CREATE VIEW statement. But hopefully this view is an acceptable starting point.
If your applications need to access the meta data in the DDF files in a simple way, we recommend you create your own views, move the logic that understands the DDF files from the applications to the view definitions.
Finally, if you can improve or extend the definition of the above view, or of one of the view definitions in my book, please let me know. You can send you improvements to sql@r20.nl.