About the maturity of Apache Derby

Engineering Insights & Enterprise solutions

For those of you that didn’t hear about it yet, Apache Derby is a Java relational database engine that can both run as a network server or embedded in a Java application. You might have encountered Derby under its previous name, Cloudscape DB, shipped with the recent J2EE reference implementations. Sun also plans to include this product in JDK 6 but is branded as Java DB.

I recently did some work on porting a tool to use the Apache Derby database engine, besides the existing MySQL 4.1 support. The database behind the application was of medium size: around 50 tables and reasonably complex queries. I used at that time the most recent 10.1.3.1 version and also used development versions of the 10.2.1.3 version.

While overall Derby is a great idea and a good implementation, it is important to know that it is still immature. While doing the port I have encountered the following problem categories:

unsupported SQL features,
internal server errors,
fatal server errors,
stored function limitations,
differences between the networked and embedded JDBC drivers.

Unsupported SQL features range from simpler to annoying ones. Alias visibility is a great issue here, so for instance the following SQL query is not valid in Derby:

SELECT SUM(table1.field1) AS my_sum1
FROM table1
GROUP BY my_sum1

Select item aliases are not visible in the GROUP BY clause, but _are_ visible in the ORDER BY clause!

Internal server errors range from plain NullPointerExceptions (in the case of the COALESCE built-in function) to some protocol errors. Once protocol errors appear, though less frequent, one needs to restart the server (or the application, if you’re running in embedded mode). A frequent source of internal server errors is when one tries to overcome the alias limitation by putting group expressions in the GROUP BY clause (e.g. GROUP BY SUM(table1.field1).
Hopefully, I encountered fatal server errors only in one scenario, and I managed to get over it but unfortunately not in a reproducible manner. This particular server error was caused by a more complex query that produced an internal server error. The server did not recover from it so all subsequent valid queries were raising the same error.

It is important to know that Derby has a limited number of built-in functions, so one needs to frequently resort to custom stored functions. The problem with this is that, besides the lack of proper documentation, this support is quite limited. Stored functions can be defined only in Java. That would be mostly fine if it would work well. But for instance, if you want a stored function that returns an INTEGER value, you must specify the return type of the associated Java method as int. This means that basically, you cannot return NULL integers from your function!

I have encountered some differences between the networked and embedded JDBC drivers. Actually only one, but which convinced me not to complicate things and to overall avoid the embedded driver. The problem is related to the way NULL values are specified for prepared statement parameters. If the java.sql.PreparedStatement.setNull(int parameterIndex, int sqlType) the method is called with sqlType 0 (done by default by iBATIS) then it will raise an error, but only with the embedded driver.
There are several other limitations that would need a lot more detail. The good news is that for most of these problems there are bug reports and developers are working to fix them, but I think that there’s still a lot to do to stabilize the whole thing. I managed to successfully avoid all the pitfalls and do the port, but the effort was too big in my opinion.
I don’t say that you shouldn’t use Derby for your projects. It’s just that it is important to know its limitations and be sure that you don’t have a better alternative. As a side note, compared with Derby the Hypersonic SQL database engine is very much more unstable and featureless.