New REST API for jBPM 6

24 October 2013

Most of the team has been working very hard for the last couple of months on the Drools/jBPM/OptaPlanner 6.0 release. One thing that’s changed with this release is that the umbrella project has been given a new name: KIE. You can find out more about that elsewhere.

I wanted to quickly introduce the new REST API — and ask for feedback, should you be so inclined. Some of the community have already been kind enough to submit JIRAs with suggestions, which is great!

I’ve been documenting the REST API here: https://github.com/droolsjbpm/droolsjbpm-integration/wiki/Rest-API

If you use the (new) jbpm-console war or the kie-wb war, the REST API is available via those wars.
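
To give a flavor of what that looks like, here is a minimal sketch of calling one of the process-related endpoints from Java. The deployment id, process id, URL and credentials below are placeholders, and the exact path is an assumption on my part — see the wiki page above for the actual endpoint reference.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Scanner;

public class StartProcessViaRest {

    public static void main(String[] args) throws Exception {
        // Placeholder deployment id and process definition id -- substitute your own.
        String url = "http://localhost:8080/jbpm-console/rest"
                + "/runtime/org.test:my-project:1.0/process/my.process.id/start";

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        // The console REST endpoints are secured, so send HTTP Basic credentials.
        String auth = Base64.getEncoder()
                .encodeToString("user:password".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);

        System.out.println("HTTP " + conn.getResponseCode());
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            // The response describes the newly started process instance.
            System.out.println(scanner.hasNext() ? scanner.next() : "");
        }
    }
}
```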

Again, if you have any ideas or suggestions, please feel free to leave a comment or submit a JIRA issue.

Thanks!

Categories: jBPM

An update (Perl) script for updating GitHub repositories

14 August 2013

One of the challenges of the drools/jbpm/kie project is that there are lots of repositories — even more so with the 6.x branch. Of course, there are the core repositories (drools, jbpm), the shared API repositories, the integration repositories, and then once you start looking at the applications, most of which are based on the new UberFire framework, the list just keeps on growing.

There’s also the fact that I regularly dive into the code of, say, Hibernate, HornetQ, RESTEasy or WildFly. Of course, then there are git repositories with examples, Arquillian container repositories...

You get the point.

What I wanted to share with you is a script that I use to regularly update all of these repositories. You can find it here, where I’ve pasted it into a GitHub gist.

First off, the script assumes that it’s in the parent directory of your repositories. That is to say, if all of your GitHub repositories are in /home/me/workspace, then the script assumes that it’s been started there.

This script does the following for every repository that you give it:

  1. Changes directory into the repository.
  2. Runs git remote update.
  3. Checks to see that the current (git) branch is master.
  4. Checks to see if there are any changes to the repository (any file changes that are unstaged, in the index or otherwise not committed).
  5. Calls git merge --ff origin/master.
  6. Prints the output of git status.
  7. Calls git prune.
  8. Calls git gc.

The list of repositories can be found at the end of the script, underneath the __DATA__ tag in the Perl script. You can also add comments in the list.

Of course, you can restart the script halfway if you want to by calling ./update.pl -f <repo> (which restarts “from” that repo) or ./update.pl -a <repo> (which restarts “after” that repo). Enjoy!

Categories: Other

Why it doesn’t pay to write unit tests

21 November 2012

I’ve unfortunately had to do a lot of traveling in the last couple of days and I’ve been reading No one makes you shop at Wal-Mart on the plane. Among other things, the book describes the economic model underlying the idea of a ‘public good’.

A public good, as opposed to a private good, is basically a resource that is enjoyed freely by a group. A clean environment or a quiet neighborhood is a good example of this.

A private good, in this example, might be the right to play your music as loud as you want to. However, if everyone starts doing that, the public good of a quiet neighborhood will soon disappear. In this scenario, everyone in the neighborhood has to pay the cost of setting a limit on how loud and when you can play your music in order to preserve the public good of a quiet neighborhood. This idea is related to the Prisoner’s Dilemma, for those of you curious about that.

With regards to software development, a set of well-running unit tests is a public good while the act of writing a unit test is actually a private cost. I think this is self-explanatory to most of the developers reading this, but I’ll explain it just to be sure.

Writing a unit test that is of good quality is not advantageous to a developer writing or modifying code for the following reasons:

The productivity of a developer is measured based on 2 things:

  1. The number and quality of features she produces
  2. The number of bugs she fixes

The first measurement, the number of features produced, is weighted more heavily: that is to say that creating a feature is, in general, seen as more productive than fixing a bug.

However, writing a good unit test does not directly contribute to either the creation of features or bug fixes.

While writing a unit test might be helpful when creating or modifying a feature, it’s not necessary. Every decent developer out there is perfectly capable of writing software without having to write a unit test. In fact, writing a unit test costs time which a developer might otherwise have spent on writing more features or fixing bugs! The better the unit test is, the more time that a developer will have needed to spend on it, making good unit tests more “expensive” than lower quality unit tests.

Thus, from an individual developer’s perspective, it does not pay to write good unit tests, especially in the short term.

Furthermore, the unfortunate thing about “quality” is that the quality of a feature (or any piece of code) is something that can only be measured in relation to how long the code has existed. In other words, the quality of code is never immediately apparent and frequently only becomes apparent after a significant period of time and use. Often, by the time the (lesser) quality of the code becomes apparent, no one can remember or determine exactly who created the feature, and it’s not productive to search for that information.

But it always pays to have a high quality suite of unit tests. A well-written suite of unit tests does 3 things:

  • Most importantly, it will alert the developer to any problems caused by new or changed code.
  • Because an existing unit test will obviously use the existing API, it will alert the developer to problems with backwards compatibility if the developer changes the existing API.
  • Lastly, unit tests are functional examples of code use: they document how existing code should and can be used.

All of these benefits help a developer to write better quality features (in less time) and help not only with fixing bugs, but also with preventing bugs!
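
To make those points a little more concrete, here is a deliberately trivial, hypothetical JUnit example. The Calculator class is invented purely for illustration; the point is that the test pins down the existing API and documents how the code is meant to be used.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// A deliberately trivial, hypothetical class: it stands in for "the existing code".
class Calculator {
    int add(int a, int b) {
        return a + b;
    }
}

// The test exercises the public API. If someone renames add() or changes its
// signature, this test stops compiling or fails, flagging the backwards
// compatibility break. It also doubles as a runnable example of how the class
// is meant to be used.
public class CalculatorTest {

    @Test
    public void addShouldSumTwoNumbers() {
        Calculator calculator = new Calculator();
        assertEquals(5, calculator.add(2, 3));
    }

    @Test
    public void addShouldHandleNegativeNumbers() {
        Calculator calculator = new Calculator();
        assertEquals(-1, calculator.add(2, -3));
    }
}
```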

But, in a situation in which there are no external pressures on how a developer writes his or her code, there are no immediate reasons for a developer to write unit tests. This is especially true in situations in which the developer will only be working on a project for a relatively short period of time.

Of course, some developers might feel that writing tests helped them develop features more quickly — or that it might help them fix bugs more quickly. However, if at a certain point they have to justify their use of time to a superior (the project lead, project manager, etc.) and they explain that they were writing unit tests instead of writing new features or fixing bugs, they will get in trouble, especially if there’s less value placed on unit tests or refactoring.

At a company, obviously, this is where a project manager, project lead or even a CTO comes in. While it may be in the interest of the developer to create new features and fix bugs as quickly as possible, it’s probably equally important to the CTO and other managers that the quality of the software created meets certain standards. Otherwise, users of the software might become so disgruntled with the software that they’ll complain, leading to a negative reputation of the software, which may lead to the company going bankrupt!

It’s in the interests of the CTO and other managers to require software developers to write unit tests that are of a certain level of quality: namely, the unit tests should be good enough to assure that the software retains a positive reputation among its customer base. This is often a difficult limit to quantify but luckily often easier to qualify, I think.

Open source software is, however, a different story. There is no CTO and the highest authority in an open source project is often the lead of the project.

The question, then, is what determines the quality of a suite of unit tests in an open source project? To a large degree, the answer is obviously the attitude of the lead of the project. Attitude is often very hard to measure, unfortunately: it’s easy enough to say one thing and do another. If you ask the lead what she thinks, it’s hard to say if the answer she gives you represents the attitude that she communicates to the rest of her project.

Realistically, one of the most decisive factors determining the quality of the features of an open source project is simply the example set by the lead in the code and tests that he or she writes.

Categories: Other

Openshift at Devoxx 2012

14 November 2012

Good morning! If you’re at Devoxx today, feel free to come by and visit me at 14.00: I’m giving a talk on OpenShift. I’ve got to say this: I’ve rarely had so much fun preparing a talk.

The thing is, OpenShift just works. Sure, now and then you have to figure a few things out, but given all of the work it does for you — or rather, given all of the work you no longer need to do, it.. it rocks!

The talk is called “Openshift: State of the Union” and it’s a quick primer followed by a couple of demos. Now, I just need to hope they don’t randomly decide to do maintenance during my talk. ;D

I’ve just uploaded my slides for those curious: you can find them on slideshare.

See you at the talk!

Categories: Other

Persistence testing in jBPM

27 October 2012

In the last year or so, I’ve started 3 different persistence-related testing initiatives for jBPM. This is a quick summary of what’s already been created and what the plans are going forward.

Maven property injection cross-database testing

I’ve described the basic mechanics of this testing infrastructure here:
- jBPM 5 Database testing wiki page

The main jBPM project pom contains Maven properties that are injected into resource files placed in (test) resource directories for which Maven (property) filtering has been turned on.

In turn, when Maven processes the test resources, the properties in the filtered files are replaced with the values set in the pom. The filtered resource files are mainly a persistence.xml file and a datasource.properties file used to create data sources.
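
As a rough sketch of how those filtered values end up being used, a test can simply load the (already filtered) datasource.properties from the test classpath and open a connection with it. The property key names below are illustrative assumptions, not necessarily the ones the real file uses.

```java
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class FilteredDataSourceExample {

    public static void main(String[] args) throws Exception {
        // datasource.properties lives in src/test/filtered-resources; by the time the
        // test runs, Maven has replaced ${maven.jdbc.url} etc. with the real values
        // configured in the pom (or overridden via settings.xml / -D properties).
        Properties props = new Properties();
        try (InputStream in = FilteredDataSourceExample.class
                .getResourceAsStream("/datasource.properties")) {
            props.load(in);
        }

        // The key names below are illustrative; the real file may use different ones.
        String url = props.getProperty("url");
        String user = props.getProperty("user");
        String password = props.getProperty("password");

        try (Connection connection = DriverManager.getConnection(url, user, password)) {
            System.out.println("Connected to: " + connection.getMetaData().getURL());
        }
    }
}
```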

Unfortunately, there are still a couple of problems:

  • To start with, this is only really completely implemented in the drools-persistence-jpa and jbpm-persistence-jpa modules. It needs to be fully implemented or “turned on” in a number of other important modules, such as jbpm-bam, jbpm-bpmn, jbpm-human-task-core and jbpm-test.
  • Some developers have encountered problems in their environments due to the fact that the persistence.xml file (in src/test/filtered-resources) contains properties (like ${maven.jdbc.url}) instead of real values. I’m not sure what’s going on there, but I’d like to fix that problem if I can.
  • Lastly, this infrastructure doesn’t help us test with different ORM frameworks. The problem here is that it’s practically impossible to test using a specific ORM framework (for example Hibernate 3.3) while the other ORM frameworks (Hibernate 4.1, OpenJPA) are in the classpath. But if you want to test with a specific ORM framework, the first thing you need to do is to have it in the classpath. So how do I test against multiple (“specific”) ORM frameworks?
    • While Maven profiles are an answer to this, they add a lot of complexity to a setup. I don’t want to make a Maven profile for every ORM framework that we need to test against; instead, I’d like to make a single Maven profile that turns on the cross-database and cross-ORM framework testing.
  • In the coming months, it looks like we’ll try to make sure that this framework is turned on and executed in all of the modules where it’s applicable. While I expect that I’ll eventually remove the Maven filtering being used here, I think I’ll probably try to keep the property-based control: being able to run the test suite on a different database simply by injecting (settings.xml) or otherwise changing (in the pom.xml) properties is valuable.

Backwards compatible BLOB testing

One new thing that I encountered when learning the jBPM 5 (and Drools) code is the use of BLOBs in the database to store session and process instance state (and work item info). One of the great advantages of storing process instance state in a BLOB is that it avoids the complicated nest of tables that a BPM engine would otherwise bring with it — see the jBPM 3 database schema for a good example. :/

Using a BLOB also has the advantage that changes can be made to the underlying data model without having an impact on the database schema that the (end) user will use while working with jBPM 5. This, more than any other reason, is really _the_ reason to use BLOBs. I still like to think about how much easier that makes the work of developers working on Drools and jBPM. (Considering the complexity of the Drools RETE network, it makes even more sense to use a BLOB.)
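
For those who haven’t seen the pattern, here is a minimal sketch of what that looks like as a JPA entity. This is an illustration of the idea only, not the actual jBPM entity (the real classes carry more fields, such as versions and timestamps):

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Lob;

// Minimal sketch of the BLOB approach: the whole marshalled process instance is
// stored in a single BLOB column, so the internal object model can evolve
// without any change to the database schema.
@Entity
public class ProcessInstanceState {

    @Id
    @GeneratedValue
    private long id;

    @Lob
    private byte[] marshalledState; // filled by the marshalling code

    public long getId() {
        return id;
    }

    public byte[] getMarshalledState() {
        return marshalledState;
    }

    public void setMarshalledState(byte[] marshalledState) {
        this.marshalledState = marshalledState;
    }
}
```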

However, when the serialization (or “marshalling”) code was originally developed, the choice was made for a hand-crafted serialization algorithm instead of relying on pure Java serialization. This was done with good reason: Java serialization is slow, and the hand-crafted serialization algorithm was a lot faster.

At the beginning of this year, for reasons having to do with the evolution of both Drools and jBPM 5, Edson (Tirelli) replaced all of the serialization code for Drools and jBPM 5 with protobuf. That was definitely an improvement, mostly because protobuf makes forward and backwards compatibility way easier.

However, it made some of the “marshalling testing framework” work I had done up until that moment unusable: the marshalling testing framework relies on databases generated by previous versions to be able to test for backwards and forwards compatibility. Switching to protobuf unfortunately broke backwards compatibility one last time, but with the benefit that backwards compatibility can be ensured from then on.

At the moment, what needs to be done is the following:

  • The databases for the different branches (5.2, 5.3, 5.4 and now 6.0) of jBPM need to be regenerated.
  • The code for the marshalling framework (as well as other persistence testing classes) needs to be cleaned up a little bit.

Multiple ORM framework (ShrinkWrap-based) testing

I ran into a specific Hibernate 4 problem last month and it was unfortunately something that could also have affected compatibility with Hibernate 3. This meant that I needed to run the test suite multiple times with Hibernate 3 and then Hibernate 4 to check certain issues.

As a result, I ended up building the framework in the org.jbpm.dependencies [github link] package. This framework creates a jar containing the tests to be run with the specified database settings (persistence.xml) as well as the specific ORM framework that the tests should be run with. It then runs the tests in the jar.
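
A minimal sketch of that idea using the ShrinkWrap API is shown below. The class and resource names are placeholders rather than the actual org.jbpm.dependencies code; the point is simply that the test classes and an ORM-specific persistence.xml get packaged into a jar, which can then be run with the chosen ORM framework on the classpath.

```java
import java.io.File;

import org.jboss.shrinkwrap.api.ShrinkWrap;
import org.jboss.shrinkwrap.api.exporter.ZipExporter;
import org.jboss.shrinkwrap.api.spec.JavaArchive;

// Placeholder for the real persistence test classes.
class MyPersistenceTest { }

public class PersistenceTestJarBuilder {

    // Packages the persistence tests together with an ORM-specific persistence.xml
    // (e.g. ormVariant = "hibernate3" or "hibernate4").
    public static File buildTestJar(String ormVariant) {
        JavaArchive jar = ShrinkWrap.create(JavaArchive.class, "persistence-tests.jar")
                .addClass(MyPersistenceTest.class)
                .addAsManifestResource(ormVariant + "/persistence.xml", "persistence.xml");

        File jarFile = new File("target/" + ormVariant + "-persistence-tests.jar");
        jar.as(ZipExporter.class).exportTo(jarFile, true);
        return jarFile;
    }
}
```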

This is eventually the framework that I want to have used within all persistence-related jBPM modules. The code itself should probably be moved to the jbpm-persistence-jpa module.

Categories: jBPM, Persistence

How Guvnor and Designer talk to each other

4 October 2012

I just spent a good hour talking with Tihomir about Designer.

Specifically, he explained how the interaction between Designer and Guvnor works.

In order to finish my work on a persistence layer for Designer, one of the things I need to understand is how Designer interacts with Guvnor. At the moment, Designer basically uses Guvnor for persistence: everything that you modify, create and save in Designer is saved in the Guvnor instance that Designer runs in.

However, there’s been more and more interest in being able to run Designer standalone: running Designer independently and without a ‘backing’ Guvnor instance. What I’m working on is inserting a persistence “layer” into the Designer architecture so that users can choose whether to use Guvnor for this persistence or whether to use a standalone persistence layer for this (such as a database) — or some combination of the two.

But in order to do that work, it’s important to understand exactly how Guvnor interacts with Designer: what does Guvnor tell Designer and what does Designer store in Guvnor, and how and when does the communication about this happen?

And now I can explain that interaction to you.

Let’s start at the very beginning: you’ve installed Guvnor and Designer on your application server instance and everything is running. You have this brilliant idea for a new BPMN2 process and so you click, click around to create a new BPMN2 asset and in doing so, open Designer in a new IFrame within Guvnor.

Now, when you open Designer, if I understand correctly, Designer takes a bunch of the standard, default “assets” that it will use later and goes ahead and stores them in Guvnor. Some of these are stored in the global area and others are stored in the particular package that your asset will belong to. But how does Designer actually know what the package and asset name is of what it’s editing?

Let me focus on a detail here: when you actually open Designer in Guvnor, you’ll be opening a specific URL, that will look something like this:

http://localhost:8080/drools-guvnor/org.drools.guvnor.Guvnor/Guvnor.jsp?#AssetEditorPlace:972fc35a-a3cc-430c-b460-71477931ca5b

This results in the UUID that you see after AssetEditorPlace: being sent to Designer (on the server-side). Unfortunately, Guvnor doesn’t have a method for looking up an asset based on its UUID, so Designer needs to figure out the package and asset name of the asset it’s working with. Designer needs this information in order to interact with Guvnor. That means Designer does the following (a rough sketch of this lookup follows the list below):

List of assets for the defaultPackage package

  • Designer first requests the list of all packages available in Guvnor.
  • After that, Designer then requests the list of all assets in each package.
  • Then Designer keeps searching the lists of assets until it finds the UUID it’s been given.
  • This way, it can figure out what the package name and asset name (title) is of the asset (process) it’s editing.
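
In code, that search boils down to something like the sketch below. The GuvnorClient and AssetInfo types are hypothetical stand-ins for the calls Designer actually makes against Guvnor; they are not Guvnor’s real API.

```java
import java.util.List;

// Rough sketch of the lookup Designer has to do. The GuvnorClient methods are
// hypothetical stand-ins for the actual calls Designer makes to Guvnor.
public class AssetLookup {

    public static AssetInfo findByUuid(GuvnorClient guvnor, String uuid) {
        for (String pkg : guvnor.listPackages()) {
            for (AssetInfo asset : guvnor.listAssets(pkg)) {
                if (uuid.equals(asset.getUuid())) {
                    return asset; // now we know the package name and asset title
                }
            }
        }
        return null;
    }

    // Hypothetical interfaces, for illustration only.
    interface GuvnorClient {
        List<String> listPackages();
        List<AssetInfo> listAssets(String packageName);
    }

    interface AssetInfo {
        String getUuid();
        String getPackageName();
        String getTitle();
    }
}
```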

Naturally, I didn’t believe Tiho at first when he said this. My second reaction was to submit a Jira to fix this (GUVNOR-1951). I’ll make this better as soon as I finish this persistence stuff (or maybe as part of it... hmm. Depends on how quickly Guvnor adds the needed feature.)

In any case, at this point, you have your blank canvas in Designer and Designer knows where it can store its stuff in Guvnor.

You go ahead and create your brilliant BPMN2 process with the help of all of Designer’s awesomely helpful features.

And then you need to save the process. What next?

Saving the process in Guvnor

Right, you click on a menu in Guvnor, and then it gets complicated. To start with, clicking on “Save Changes” or “Save and Close” in the Guvnor menu calls some client-side JavaScript in Guvnor that then calls some client-side JavaScript from Designer.

The ORYX.EDITOR.getSerializedJSON() call

The screen print on the right, from Tiho’s IntelliJ IDE, shows where in the Guvnor (GWT) Java class the code for this call lives. The call shown is ORYX.EDITOR.getSerializedJSON(), in case you don’t have superhuman vision and can’t read the text in the picture.

This calls JavaScript in Designer that retrieves the JSON model of the BPMN2 process in the canvas. (Designer actually stores the BPMN2 process information on the client side in a JSON data structure, which gets translated to BPMN2 XML on the server side.)

Once the Guvnor JavaScript (client-side) has gotten this JSON representation of the BPMN2 process back, it then sends a request to a Designer servlet that translates the JSON to BPMN2. Guvnor doesn’t really care about JSON — it certainly can’t read it and doesn’t know what to do with it, so it relies on Designer to translate this JSON to BPMN2 that it can store in its repository.
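
Sketched in Java (the real call happens from Guvnor’s client-side JavaScript), that round trip looks roughly like the following. The servlet path is a placeholder I made up for illustration; the actual URL is part of Designer’s configuration and isn’t shown here.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class JsonToBpmn2 {

    // Posts the JSON returned by ORYX.EDITOR.getSerializedJSON() to Designer's
    // translation servlet and returns the BPMN2 XML. The servlet path below is a
    // placeholder, not the real Designer URL.
    public static String translate(String designerBaseUrl, String processJson) throws Exception {
        URL url = new URL(designerBaseUrl + "/jsonToBpmn2"); // placeholder path
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");

        try (OutputStream out = conn.getOutputStream()) {
            out.write(processJson.getBytes(StandardCharsets.UTF_8));
        }

        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }
}
```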

The Guvnor to Designer JSON to BPMN 2 "Translator" call

Again, for those of you without superhuman vision, the screen print fragment above is the Guvnor GWT code that makes sure that the retrieved JSON is translated to BPMN2.

Once the Guvnor client-side JavaScript (derived from Guvnor GWT code) has gotten the BPMN2 back, it then sends that XML back to the Guvnor server-side to be stored in the repository (under the correct package name and asset title).

And that’s how it all works!

Of course, I’m no GWT expert, so I might have glossed over or incorrectly reported some details — please do let me know what you don’t understand or if I made any mistakes (that means you too, Tiho :) ).

Regardless, the above summarizes most of the interaction between Guvnor and Designer. The idea with the persistence layer I’m adding to Designer is that much of this logic in Designer will be centralized. Having the logic in a central place in the code in Designer will then allow us to expose a choice to the user (details pending) on where and how he or she wants to store the process and associated asset data.

Lastly, there are definitely opportunities to improve the performance of this logic. For example, using Infinispan as a server-side cache, even when Designer is used with Guvnor, is an idea that occurred to me, although I haven’t thought enough yet about whether or not it’s a good idea. We’ll see what else I run into as I get further into this...

Categories: jBPM

Basic tutorial on Event Sourcing and CQRS

22 September 2012

I recently learned about the Event Sourcing (and CQRS) design patterns. Unfortunately, reading through Martin Fowler’s bliki definition just didn’t help. What helped the most were two things:

Rinat Abdullin’s Decide-Act-Report (CQRS) diagram

  1. Matt Givney’s contact-mgr-cgrs example
  2. Brian Ritchie’s CQRS presentation

CQRS relies on Event Sourcing in order to work.

Event Sourcing requires the following:

  • You have an event store in which you store events.
  • You have Aggregate Roots, which are a way of structuring your data so that fields, attributes and other associated data are always linked to the underlying “root” data structure. For example, the ‘pages’ in a ‘book’ would never change without the ‘book’ data structure knowing about it.

Event Sourcing means that every time you change any data — any state, in fact, which also happens to be data — you generate an event.

Matt’s contact manager example application does just that: in all of the methods which change any data, an event is 1. created, 2. filled with the new data, and 3. added to the event store. It’s a little more complicated than that, but only because you also have to take into account the transactional logic necessary when changing data or state.

What’s interesting about this is that you can then recreate state by “replaying” the events — from the beginning, if you have to. Replaying the events means that all of the incremental changes made are reapplied, after which you get your current state back!
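
As a minimal sketch of the idea (leaving out transactions, snapshots and concurrency), using the book-and-pages example from above:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of Event Sourcing: state changes are captured as events in an
// event store, and the current state can always be rebuilt by replaying those
// events from the beginning.
public class EventSourcingSketch {

    // The event: an incremental change, not the resulting state.
    static class PageAdded {
        final String pageText;
        PageAdded(String pageText) { this.pageText = pageText; }
    }

    // The aggregate root: its pages only ever change through the book itself.
    static class Book {
        private final List<String> pages = new ArrayList<>();

        void apply(PageAdded event) {
            pages.add(event.pageText);
        }

        List<String> getPages() { return pages; }
    }

    public static void main(String[] args) {
        List<PageAdded> eventStore = new ArrayList<>();

        // Every change produces an event that is appended to the event store...
        Book book = new Book();
        for (String text : new String[] { "Chapter 1", "Chapter 2" }) {
            PageAdded event = new PageAdded(text);
            eventStore.add(event);   // 1. store the event
            book.apply(event);       // 2. apply it to the current state
        }

        // ...so the current state can be rebuilt at any time by replaying the events.
        Book replayed = new Book();
        for (PageAdded event : eventStore) {
            replayed.apply(event);
        }
        System.out.println(replayed.getPages()); // prints [Chapter 1, Chapter 2]
    }
}
```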

CQRS is a design pattern that makes distributed and concurrent systems easier to design and maintain. The diagram above illustrates the pattern particularly well: the user decides what she wants, resulting in a change being acted out on the state. That change is then also translated into an event which updates the report, which the user can then query if she’s curious.

By separating your “write” data model from your “read” data model, you can respond to queries much more quickly and you can also make sure to use costly database transactions only for operations that need them, for example. You do then run the risk that the data you query is not 100% accurate with regards to when it was queried: your “read” data model is always updated after your “write” data model.

Of course, Brian Ritchie’s CQRS presentation explains it better than I could (Sorry Udi, you came up with it, but Brian just explained it better, imho).


Why am I writing about this? Well, CQRS certainly doesn’t really apply to my work, but Event Sourcing does. Being able to rewind (and re-forward) state is something that I think a BPM engine should do. I’m also curious if Event Sourcing could help when architecting any BPM engines that deal with ACM.

Categories: Other