The JVM Fanboy Blog 

Entries tagged [series]

Apache Derby part 3: Stored Procedures

by vincent


Posted on Tuesday Aug 04, 2015 at 12:54AM in Relational Database


In the previous part, we created a small database to store movie titles and the media type. In this part we will create and load a Stored Procedure into that database to make it really easy to add movies using the "ij" command-line tool with one simple statement.

First a warning. Most dedicated DBA's will probably hate you when you ask them to enable Java stored procedures support in their database. I don't think it's really considered common practice, but many enterprise-grade database systems offer stored procedures written in Java. I can think of many reasons not to do this, but there are probably cases where it can be a good solution. Since Derby is a database management system implemented fully in Java anyway, it's make a bit more sense here. There's a nice article on the official Oracle Nashorn team blog about this subject.

We will create a simple stored procedure that can be used to easily add a movie using a single CALL AddMovie('Movie Title', 'Media Name') statement. What it will do is simply look up the media record by name, then create the specified movie record. So it saves you from looking up the media id manually or using a subquery to do it with one query. I will never recommend to use stored procedures like this in an application (in this case for performance reasons, applications should look up the id of the media once and cache that), but if I'd plan to add many records manually using SQL and a command-line tool like "ij", then I'd seriously consider adding a utility stored procedure like this for convenience. That is, if the table contained more fields :)

Read the long full article for all sample code

Read More

Apache Derby part 2: Getting started

by vincent


Posted on Tuesday Jul 21, 2015 at 12:07AM in Relational Database


In the first part of this series, I've introduced the Apache Derby database management system. Let's now download Derby and get it up and running by creating a small database.

I am a dedicated NetBeans IDE user (in fact, I am a registered NetBeans project contributor, thanks to my involvement with the NBPython project). I'll be using NetBeans throughout the rest of the series, but on this post I'll only be using the interactive commandline tool "ij".

Note that I assume the reader has basic skills to use Java and SQL, this is not a Java/SQL tutorial.

Downloading Derby

Go to http://db.apache.org/derby/ and choose "Download". At the time of this writing 10.11.1.1 (August 26, 2014) is the latest Official Released Release. Click on the version, then download the bin distrubution. In my case, the filename was db-derby-10.11.1.1-bin.zip.

Setting up Derby

Copy and unzip the zip file in a directory that pleases you. If you are using Windows, preferably do not put it in the Program Files directory, unless you know how to do this with administrator rights. In the examples below, I choose to unzip the ZIP file in my c:\Java directory (I am writing this on a Windows machine).

Open Command Window in Windows (Terminal in Linux) and change current directory to the unzipped directory) Derby's manual suggests setting the DERBY_HOME environmental variable is required. This is also a good moment to check if Java is installed correctly on your machine.

On my Windows machine:

cd \Java\db-derby-10.11.1.1-bin
set DERBY_HOME=\Java\db-derby-10.11.1.1-bin
java -version

if the last command produced a Java version number that is 1.7 or 1.8 or higher you'll be good to go. Otherwise you'll have to install or re-install Java on your machine.

On Linux, you'll do something like (assuming you have a "java" directory in your home directory and you unzipped the ZIP file there. Substitute your username for USER):


cd ~/java/db-derby-10.11.1.1-bin
export DERBY_HOME=/home/USER/java/db-derby-10.11.1.1-bin
java -version

If you plan to use the Derby commandline tools extensively, you'll have to add the bin directory to the path and ensure the DERBY_HOME environmental variable is set on each boot. This is out of scope for this tutorial, IMO.

Now create a directory to store the databases. I'd recommend to not store those in the db-derby-10.11.1.1-bin directory, to make upgrading your version of Derby easier.

I have created a "derby_databases" directory in my java directory.

Windows (Linux uses will know how to do this, I assume) :)
mkdir \java\derby_databases

Running ij and creating our first database in embedded mode

If we set up everything correctly, we should now be able to start "ij" the interactive Derby CLI utility.

Windows: (Linux users, start "./ij")

cd \java\db-derby-10.11.1.1-bin\bin
ij.bat

A simple "ij version 10.11" screen should appear with ij> prompt. Type "help;" here - do not forget the semicolumn ; - and press enter

ij version 10.11
ij> help;
 Supported commands include:

  PROTOCOL 'JDBC protocol' [ AS ident ];
                               -- sets a default or named protocol
  DRIVER 'class for driver';   -- loads the named class
  CONNECT 'url for database' [ PROTOCOL namedProtocol ] [ AS connectionName ];
                               -- connects to database URL
                               -- and may assign identifier
  SET CONNECTION connectionName; -- switches to the specified connection
  SHOW CONNECTIONS;            -- lists all connections
  AUTOCOMMIT [ ON | OFF ];     -- sets autocommit mode for the connection
  DISCONNECT [ CURRENT | connectionName | ALL ];
                               -- drop current, named, or all connections;
                               -- the default is CURRENT

  SHOW SCHEMAS;                -- lists all schemas in the current database
  SHOW [ TABLES | VIEWS | PROCEDURES | FUNCTIONS | SYNONYMS ] { IN schema };
                               -- lists tables, views, procedures, functions or synonyms
  SHOW INDEXES { IN schema | FROM table };
                               -- lists indexes in a schema, or for a table
  SHOW ROLES;                  -- lists all defined roles in the database, sorted
  SHOW ENABLED_ROLES;          -- lists the enabled roles for the current
                               -- connection (to see current role use
                               -- VALUES CURRENT_ROLE), sorted
  SHOW SETTABLE_ROLES;         -- lists the roles which can be set for the
                               -- current connection, sorted
  DESCRIBE name;               -- lists columns in the named table
--- cut for brevity ----

Now create our first database... Substitute the c:\java\derby_databases\ with the directory you have created to store your databases.

ij> CONNECT 'jdbc:derby:c:\java\derby_databases\derby_test;create=true';

If you somehow lost your connection while doing the exercise below, and you want to connect to your previously created database, you can simply remove the create=true part from above's string to re-open the existing database

.

Unlike some other embeddable databases, like SQLite, Derby does not store a database in one file. Rather, it creates a subdirectory with the database name and store all related files in that directory. It is very important that the files in that directory are NEVER edited. With a single edit you can completely damage your databases.

Let's find out what schema's are created by default and which tables are added to the default "APP" schema.

ij> show schemas;
TABLE_SCHEM
------------------------------
APP
NULLID
SQLJ
SYS
SYSCAT
SYSCS_DIAG
SYSCS_UTIL
SYSFUN
SYSIBM
SYSPROC
SYSSTAT

11 rows selected

ij> show tables in APP;
TABLE_SCHEM         |TABLE_NAME                    |REMARKS
------------------------------------------------------------------------

0 rows selected

The schemas starting with SYS* are for internal use. The "APP" schema is the one we will be using for our first table. As you can see, Derby did not create any table in this schema. Let's create an extremely simplistic database to store your DVD/Bluray collection.

ij> set schema "APP";
0 rows inserted/updated/deleted

ij> CREATE TABLE media (id INT NOT NULL GENERATED ALWAYS AS IDENTITY, name VARCHAR(32), CONSTRAINT media_pk PRIMARY KEY (id));
0 rows inserted/updated/deleted

ij> CREATE TABLE movie (
ID INT NOT NULL GENERATED ALWAYS AS IDENTITY, 
name VARCHAR(255) NOT NULL,  
media INT, 
CONSTRAINT movie_pk PRIMARY KEY (id), 
CONSTRAINT movie_media_fk FOREIGN KEY (media) REFERENCES media ON DELETE SET NULL ON UPDATE RESTRICT);
0 rows inserted/updated/deleted

ij> SHOW TABLES IN APP;
TABLE_SCHEM         |TABLE_NAME                    |REMARKS
------------------------------------------------------------------------
APP                 |MEDIA                         |
APP                 |MOVIE                         |
2 rows selected

Note that although we used names in lower case, all names have been converted to UPPERCASE. This happened because we did not put names between quotes. The SQL standard dictates that unquoted names are always converted to uppercase, but many modern database relaxes this rule a bit (or can be configured that way). In Derby this is, as far as I am aware, not configurable. If you do not want table, column, schema, constraint etc. names to be stored in uppercase, then you will always have to put each and every name between two double quotes ("media" or "Media"). Beware that using unquoted names will not work in that case anymore, so Derby is case sensitive only when quotes are used, in all other cases it will convert the name to uppercase.

Finally let's add some records to our database.

ij> INSERT INTO media (name) VALUES ('DVD');
1 row inserted/updated/deleted
ij> INSERT INTO media (name) VALUES ('Bluray');
1 row inserted/updated/deleted
ij> SELECT * FROM media;
ID         |NAME
--------------------------------------------
1          |DVD
2          |Bluray


 ij> INSERT INTO movie (name, media) VALUES ('Sharknado', 2);
1 row inserted/updated/deleted
ij> INSERT INTO movie (name, media) VALUES ('Kung Fury', 1);
1 row inserted/updated/deleted
ij> INSERT INTO movie (name, media) VALUES ('Mega Shark vs Giant Octopus', 1);
1 row inserted/updated/deleted
ij> SELECT * FROM movie;
ID         |NAME
                                                            |MEDIA
--------------------------------------------------------------------------------------------------------------------------------------------------------
1          |Sharknado
                                                            |2
2          |Kung Pow!
                                                            |1
3          |Mega Shark vs Giant Octopus
                                                            |1

3 rows selected

ij> SELECT MOVIE.id, MOVIE.name, MEDIA.name AS "media" FROM MOVIE INNER JOIN MEDIA ON MOVIE.MEDIA=MEDIA.ID ORDER BY MOVIE.NAME;
ID         |NAME
                                                            |media

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------
2          |Kung Pow!
                                                            |DVD

3          |Mega Shark vs Giant Octopus
                                                            |DVD

1          |Sharknado
                                                            |Bluray


3 rows selected

ij>quit;

Now note that in the last query output, the third header column name is displayed in lowercase , because the alias name "media" was put between double quotes.

To be continued

Now that we have Derby up and running, let's continue with the fun stuff: configuring NetBeans and creating our first stored procedure in Java.


Apache Derby part 1: Introduction

by vincent


Posted on Tuesday Jul 07, 2015 at 12:08AM in Relational Database


Apache Derby part 1: Introduction

This will be a series of blog posts about Apache Derby, an open-source "traditional" related database management system, fully implemented in Java.

I've been a happy Derby user for a few years now (in fact, this blog is powered by a Derby database) and I'd like to share my enthusiasm. As I don't see Derby mentioned a lot in recent blogs, articles etc., I thought maybe this article will be of some help for people who are looking for more information. I've used Derby successfully in several smaller-scale web applications.

This first part of the series is a generic introduction, the following parts will be much more technical and actually contain code :) .

History

The early years

The history of Derby is widely documented.

In 1997 a start-up called Cloudscape Inc released a commercial database management system called Cloudscape (initially it was released with the generic "JDBMS" name). Two years later Informix Software acquired Cloudscape. In 2001 IBM acquired their database assets and renamed it to IBM Cloudscape.

IBM donated the code to Apache Software Foundation in 2004 and one year later it became an open-source sub-project of the Apache's DB top-level project. IBM continued to sell IBM Cloudscape by taking a snapshot of the then-current Apache Derby version, adding installers, offering support contracts and adding additional proprietary components that provided ODBC and Microsoft MDAC compatibility, among other things. IBM retired the IBM Cloudscape product in 2007. The additional proprietary components are no longer available.

Sun/Oracle JavaDB

Sun joined the project when the code was donated to the Apache Software Foundation. They took a snapshot of the then-current Apache Derby version, added it to the Java Developers Kit (JDK) and called it the "JavaDB" component. They did not modify one single line of code, the project name is still Apache Derby, but Sun/Oracle call it JavaDB in their documentation. So every user of a recent JDK has access to a full copy of Apache Derby. JavaDB was Derby 10.6 for Sun's JDK 6, 10.8 for Oracle's JDK 7 and 10.11 for JDK 8.

While JavaDB is very convenient while testing Apache Derby, I'd normally recommend to use the latest available Derby version for production use (10.11.1.1 released on 27th of August 2014 at the time of this writing).

Why choose Apache Derby?

Apache Derby was not created to replace generic, full enterprise-ready DBMS'es like Oracle Database, IBM DB2, Postgresql, etc. Derby was, in my opinion, also not envisioned with "big data" applications in mind. One of the reasons is that Derby can store its database on one logical disk only (unless RAID is used...). Also, LOB fields are 'limited' to 2GB each. 

An old 2005 presentation sheet from the IBM website about IBM Cloudscape (that was still on-line at the time of this writing) suggests that Derby should be used when the total stored data is expected to be 50GB or less. Note that this is not a documented maximum, there does not seem to be a hard-coded maximum database size (apart from the mentioned 2 GB per LOB field). I should note many of the shortcomings mentioned in the presentation sheet have been fixed by the Derby developers in the last 10 years.

If your applications can live without storing terabytes worth of data and serving hundreds of users concurrently, here are some of the highlights that I am aware of:

  • Derby can run in two modes: Client/Server and Embedded mode, more on this below.

  • The engine makes full use of JVM's powerful multi-threading, features. I'd call the performance very good (but remember the name of this blog, I'm a bit biased on JVM performance probably). I'd like to create some proper benchmarks, but I'll have to research how to do this carefully.

  • Stored Procedures can be written in any JVM language that can compile classes that implement Interfaces. Those classes are then loaded by Derby and can be triggered using SQL or JDBC code.

  • You can create Derby-style table functions by creating custom classes that implement the standard JDBC "Resultset" interface. Those classes are loaded in Derby and are available to SQL queries like normal, but read-only, Derby database tables. This is a great feature to import foreign data from external text/XML files, other databases (perhaps even NoSQL(!) databases), etc. The user can use the full SQL syntax to filter the data dynamically returned by the object. This is very powerful stuff. "Yes SQL", please! :)

  • A more optimized version of table functions are also available, the class must additionally implement one of Derby's own interfaces. Classes that will do this will be much quicker when the WHERE-clause is used for example, but give up some flexibility as the columns and their data-types must have been declared beforehand.

  • Custom data types are supported, but classes must implement the standard Serializable, or, preferably, the Externalizable interface, so this will in many cases not work on your classes without some additional work.

  • In Embedded mode, Derby can load JAR files stored in the database as if they were on the user's ClassPath.

  • Temporary in-memory only databases can easily be created. Those databases are removed from memory when disconnecting or shutting down the server. Those are supported both in Embedded and in Server/Client mode.

  • The newest version contains (currently experimental) built-in support for the Apache Lucene high-performance search indexer project.

  • SQL support is getting better and better with each release. New is the MERGE statement that INSERTS, UPDATES and DELETES records with one single statement.

  • Derby comes with built-in support for the Java Management Extensions (JMX) , to remotely monitor the Derby processes in real-time.

  • For testing and development purposes only, Derby comes with a simple Servlet web-application that can be used to do some common administration tasks

Embedded / Client/Server Modes

Embedded Mode 

Derby comes with a JDBC driver that contains its complete database engine embedded in it. The JAR file containing this driver is only about 2.5 MB (yes, still a lot bigger than the public-domain SQLite written in C, but still very acceptable). In Embedded Mode applications store their databases on the local file-system, or in temporary in-memory only databases.  Databases are available on a single JVM instance only, but can still accept multiple connections and fully supports multi-threading (but, as mentioned, those are not accessable from other JVM instances). 

As mentioned in the previous section, in this mode Derby can load JAR files stored in the database as if they were on the user's classpath. This, like stored procedures written in Java, requires some extremely careful planning on the developer's side, but could probably be useful in some specific situations. Also, the driver has some additional features to load read-only databases directly from local JAR files that are available on the user's classpath.

Client/Server Mode

In Client/Server Mode, a Derby Network Server runs on a server that is available via a standard TCP/IP port. A specific Derby Network JDBC Driver is now used to connect client connections to the databases. In this mode databases are available to any user that can connect to it, from any JVM instance.

Several options of user authentication are available, including storing name/password in simple plain text Java properties file format, using a LDAP server or storing usernames and passwords directly in the Derby database. User authorization can be controlled per database by standard SQL GRANT/REVOKE commands, or users can be grouped in either READ/WRITE or READ-ONLY groups. The authorization features could be improved, from what I understand any user that has server access level can shutdown the Network Server with a single command... :(

A standard Java Security Manager can be used to restrict the rights of the database server.

Derby in use

  • Derby only supports the JDBC standard to let clients connect to its databases. Many non-JVM programming languages, like Node.js and Python, have libraries that provide basic JDBC support.

  • As mentioned, Derby's SQL support is strong. It is quite strict though. If you, like me, are familiar with default Postgresql setups, it will take some time to get used to put every schema, table and column names in your SQL queries between double quotes, because otherwise they are automatically stored in UPPERCASE format. Once you use quoted names, working with unquoted names does not work any more ("SELECT * FROM person", does not work when the table name is "Person"). This restriction is part of the official SQL standard, but most DBMS'es are more forgiving or can be configured to be completely case insensitive. As far as I know this is not possible with Derby. I believe, however, that being forced to write strict SQL queries that adhere fully to the official SQL standard is not necessarily a bad thing.

  • I believe most major ORM libraries offer Apache Derby support. Oracle's JavaDB component probably helps here. For example, the popular Hibernate ORM comes with a Derby dialect.

  • Derby comes with the "ij" command-line tool that can be used to create databases, write queries, etc. Since this is a default Java console application, I really miss some basic features like cursor keys support. I believe this can be added by adding some external JAR files to the classpath, but could not get this to work yet unfortunately. This makes using this application a frustrating experience to me every time I boot it up.

  • Luckily Derby is supported by a lot of Java developer and JDBC SQL tools. For example, I am very happy with NetBeans IDE's Databases features on the Services tab. I find it a joy to add records, tables to, and execute queries on my Derby database this way. I'm sure the other famous IDEs offer similar features as well.  I had reasonable success with SQuirreL SQL Client as well.

  • The quality of the standard documentation is generally quite good, especially for an open source project. I presume this is because of the commercial roots of Derby. The organization of the information sometimes leaves something to be desired, as content is spread between different PDF/HTML files and sometimes content you'd expect to be in one manual is actually described in one of the others.

To be continued...

In the next installment of this series, i plan to write about setting up and configuring Derby.