In my last post I included some sample data which was useful for playing around with queries
That sample data was generated with dbmonster, a nice tool I discovered recently which comes in handy when you need to populate a database table to test your queries, the engine, schema, etc.
In a nutshell, here are the strengths I like about it:
- It's written in Java, so it'll feed any database with a JDBC driver, and AFAIK, that's just about any RDBMS you might need to use
- It's not too slow (it's actually pretty fast if you don't go crazy on long text based datatypes)
- It includes several data generators, and being written in Java, rolling out my own should be easy.
As for the weaknesses:
- It can be unbearably slow with long text columns
- The generated text data might not represent a very real installation (again, look at my previous post for a quick sample). Still, the fact that you can easily write and plug your own generator(s) is enough to give kudos to the developers for a good architecture. Even if a feature isn't available now, it's painless to add in the future
It has no way to manage relational integrity, hence, it doesn't really feed RDBMS, just JDBC enabled databases(yes it does, please see the comments)
My last point means that, if you have table A and table B, both related through a foreign key constraint, dbmonster doesn't provide a way for you to generate keys for table A and then reuse these keys
while inserting the foreign keys in table B. Not that I've found so far, at least.
Related with this limitation, tables are generated one at the time. Still, if you need a tool to populate tables, even very large ones, with random values, heck, up until a few days ago, I was doing this from
bash or python, so it's definitely an improvement. I've discussed this with colleagues and some proprietary tools are either available or in the works that would handle referential integrity while
generating data for database testing. If anyone knows an Open Source one that does, it'd be great. Otherwise, that would sure make one hell of a good project :)
Ok, so enough rant, here's some data: You get it from here.
Start off by creating a dbmonster.properties file. Here's a sample:
dbmonster.jdbc.driver=com.mysql.jdbc.Driver dbmonster.jdbc.url=jdbc:mysql://localhost/sample_db dbmonster.jdbc.username=mysql_user dbmonster.jdbc.password=mysql_password # for Oracle and other schema enabled databases #dbmonster.jdbc.schema=schema_name # maximal number of (re)tries dbmonster.max-tries=50 # default rows number for SchemaGrabber dbmonster.rows=1000 # progres monitor class dbmonster.progress.monitor=pl.kernelpanic.dbmonster.ProgressMonitorAdapter
Dbmonster is now ready to reach your database.
The file is pretty self explanatory. However, here are some pointers:
Now let's run it:
Besides the obvious difference in the number of rows, the second generation had larger text values (char(250) vs char(40)). You can see how that affected generation time. It should have finished (proportionally) in
In conclusion, depending on the size of the database you need to generate, and the use you intend for it (forget testing a huge schema with referential constraints), dbmonster can certainly aid
Before it's ready to feed it, though, you need to create an xml file (ah, yes, the plague of the Java world, and I've been a Java guy for most of my working years...). Call it what you want,
being a consistent kind type of man, I've been using
<?xml version="1.0"?>
<!DOCTYPE dbmonster-schema PUBLIC "-//kernelpanic.pl//DBMonster Database Schema DTD 1.1//EN" "http://dbmonster.kernelpanic.pl/dtd/dbmonster-schema-1.1.dtd">
<dbmonster-schema>
<name>Sample DB</name>
<table name="jobs" rows="150000">
<key databaseDefault="true">
<generator type="pl.kernelpanic.dbmonster.generator.MaxKeyGenerator">
<property name="columnName" value="id"/>
</generator>
</key>
<column name="title" databaseDefault="false">
<generator type="pl.kernelpanic.dbmonster.generator.StringGenerator">
<property name="allowSpaces" value="true"/>
<property name="excludeChars" value=""/>
<property name="maxLength" value="250"/>
<property name="minLength" value="15"/>
<property name="nulls" value="0"/>
</generator>
</column>
<column name="link" databaseDefault="false">
<generator type="pl.kernelpanic.dbmonster.generator.StringGenerator">
<property name="allowSpaces" value="true"/>
<property name="excludeChars" value=""/>
<property name="maxLength" value="65500"/>
<property name="minLength" value="150"/>
<property name="nulls" value="0"/>
</generator>
</column>
<column name="description" databaseDefault="false">
<generator type="pl.kernelpanic.dbmonster.generator.StringGenerator">
<property name="allowSpaces" value="true"/>
<property name="excludeChars" value=""/>
<property name="maxLength" value="65500"/>
<property name="minLength" value="150"/>
<property name="nulls" value="0"/>
</generator>
</column>
<column name="city" databaseDefault="false">
<generator type="pl.kernelpanic.dbmonster.generator.StringGenerator">
<property name="allowSpaces" value="true"/>
<property name="excludeChars" value=""/>
<property name="maxLength" value="250"/>
<property name="minLength" value="15"/>
<property name="nulls" value="0"/>
</generator>
</column>
<column name="postdate" databaseDefault="false">
<generator type="pl.kernelpanic.dbmonster.generator.DateTimeGenerator">
<property name="nulls" value="0"/>
</generator>
</column>
<column name="company_id" databaseDefault="false">
<generator type="pl.kernelpanic.dbmonster.generator.NumberGenerator">
<property name="nulls" value="0"/>
</generator>
</column>
<column name="country_id" databaseDefault="false">
<generator type="pl.kernelpanic.dbmonster.generator.NumberGenerator">
<property name="nulls" value="0"/>
</generator>
</column>
</table>
</dbmonster-schema>
export CLASSPATH=dbmonster*.jar:mysql*jar
java pl.kernelpanic.dbmonster.Launcher -s sample_db.xml
fipar@telecaster:~/soft/dbmonster-core-1.0.3$ java -classpath mysql-connector-java-5.1.6-bin.jar:dbmonster-core-1.0.3.jar pl.kernelpanic.dbmonster.Launcher -s sample_db.xml
2009-08-05 13:43:27,244 INFO DBMonster - Let's feed this hungry database.
2009-08-05 13:43:27,697 INFO DBCPConnectionProvider - Today we are feeding: MySQL 5.0.75-0ubuntu10.2-log
2009-08-05 13:43:27,783 INFO Schema - Generating schema
about 1:50, yet it took dbmonser almost 10 minutes to generate this data.
you in stress testing any database engine.