In the last 2 weeks or so, I’ve been busy troubleshooting a performance problem on WebSphere 8.5, and it was a pain in the rear to figure out how to generate dumps.. so, here’s what I figured out, that others may profit and enjoy it!
There are 2 ways to generate dumps (and 3 different types of dumps) on WebSphere.
- Open the console (probably at https://localhost:9043/ibm/console/login.do )
- Open (click on) the “Troubleshooting” group on the left side.
- Open (click on) “Java dumps and cores”
- Voila, select “Heap dump”, “Java core” or “System dump”!
- If you’re on Linux and are getting errors when generating a “System dump” (otherwise known as a core dump), you probably have to modify systemd’s core dump preferences (and/or a core dump preference file in /etc/sysctl.d/).
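On many Linux distributions, systemd-coredump intercepts core files via the kernel.core_pattern sysctl. As a sketch (the file name and pattern below are just examples; check your distribution’s defaults before changing anything), a drop-in like this makes the kernel write plain core files instead of piping them to systemd-coredump:

```
# /etc/sysctl.d/50-coredump.conf  (example file name)
# Write plain core files named core.<executable>.<pid> into the
# process's working directory, instead of piping to systemd-coredump:
kernel.core_pattern = core.%e.%p
```

Apply it with sudo sysctl --system, and also make sure the core file size limit isn’t zero (ulimit -c unlimited) in the shell or service that starts WebSphere.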
Otherwise, you can also make WebSphere generate dumps when it receives a SIGQUIT signal. However, you have to make sure that the IBM JDK receives some parameters in order to do this.
- Open the console (probably at https://localhost:9043/ibm/console/login.do )
- Open (click on) the “Servers” group on the left side.
- Open (click on) the “Server Types” group, under which you’ll see “Websphere application servers” — click on that as well.
- Click on your server in order to go to your server preferences.
- Scroll down until you see “Java and Process Management” under the “Server Infrastructure” heading on the right side.
- Open (click on) “Java and Process Management” and click on “Process Definition”
- Click on “Java Virtual Machine” on the right of the page, under the “Additional Properties” heading.
- On the “Configuration” tab of the “Java Virtual Machine”, you’ll see a “Generic JVM arguments” text box.
- Add the following arguments to the text box:
-Xdump:system:events=abort+user+vmstop,label=core.%pid-%Y%m%d.%H%M%S.dmp,request=exclusive+prepwalk+compact -Xdump:heap:events=abort+user+vmstop,label=heapdump.%pid-%Y%m%d.%H%M%S.phd,request=exclusive+compact -Xdump:java:events=abort+user+vmstop+slow,label=javacore.%pid-%Y%m%d.%H%M%S.txt,request=exclusive+compact
Once you’ve started WebSphere, you can then generate these dumps by sending a SIGQUIT signal to the process, for example, like so:
kill -QUIT <websphere-process-id>
kill -3 <websphere-process-id>
(I unfortunately don’t know how you would do that on Windows.. :/ )
These dumps will probably be generated in the home directory of your WebSphere installation (or somewhere under it, depending on how you configured your WebSphere installation).
If you need more info about these options, then I recommend running either of the following (IBM JDK) java commands:
- java -Xdump:events
- java -Xdump:request
Or otherwise, use Google to search for things like “Xdump websphere”.
Once you’ve generated the dumps, your 2 main tools will be:
- Eclipse Memory Analyzer + the DTFJ plugin
- The ISA (IBM Support Assistant) support tool suite for Websphere (v5 as of early 2016)
- In particular, the “Memory Analyzer” tool is equivalent to MAT + DTFJ.
The ISA support tool suite is better and has a far broader array of tools to use, but is (in my experience) harder to find your way around.
At this point, I wish you the best of luck in figuring out your problem, whether it be a memory leak or a performance issue! While I’ve sometimes been very lucky in finding issues, at other times it can be very similar to looking for a needle in a haystack.
For almost 2 years now, I’ve been playing around with different ideas about BPM engine architectures — in my “free” time, of course, which wasn’t that often.
The initial idea that I started with was separating execution from implementation in the BPM engine. However, that idea isn’t self-explanatory, so let me explain more about what I mean.
If we look at the movement (migration) of an application from a single (internal) server to a cloud instance, this idea is evident: the application code is our implementation while the execution of the application is moving from a controlled environment to a remote environment (the cloud) that we have no control over.
A BPM engine, conceptually, can be split into the following ideas:
- The interpretation of a workflow representation and the subsequent construction of an internal representation
- The execution, node for node, of the workflow based on the internal representation
- The maintenance and persistence of the internal workflow representation
A BPM engine is in this sense very similar to an interpreter for a programming language, except for the added responsibility of being able to “restart” a process instance after it has paused for human tasks, timers or other asynchronous actions.
The separation of execution from implementation then ends up meaning that the code to execute a certain node, whether it’s a fork, join, timer or task node, is totally separate from and independent of the code used to parse, interpret and maintain the internal workflow representation. In other words, the engine shouldn’t have to know where it is in the process in order to move forward: it only needs to know what the current node is.
While this is to some degree self-evident, the idea also lends itself easily to distributed systems.
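As a minimal sketch of this separation (all names here are hypothetical, not any real BPM engine’s API): the engine holds only the internal representation and the current node, and dispatches execution to an executor registered per node type, which knows nothing about the rest of the workflow.

```java
import java.util.HashMap;
import java.util.Map;

// An executor knows how to run ONE kind of node; it returns the id of
// the next node (or null when the process ends). It never sees the
// workflow representation itself.
interface NodeExecutor {
    String execute(String nodeId);
}

class Engine {
    private final Map<String, NodeExecutor> executors = new HashMap<>();

    void register(String nodeType, NodeExecutor executor) {
        executors.put(nodeType, executor);
    }

    // The engine only needs the current node and a lookup from node id
    // to node type (part of the internal representation) to move forward.
    String step(String nodeId, Map<String, String> typeOf) {
        return executors.get(typeOf.get(nodeId)).execute(nodeId);
    }
}
```

Adding a new node type then means registering a new executor; nothing in the engine’s stepping logic changes.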
Recently, I came across an interesting talk by Fred George on Microservices at Baruco 2012. He describes the general idea behind Microservices: in short, microservices are 100 line services that are meant to be disposable and (enormously) loosely coupled. Part of the idea behind microservices is that the procedural “god” class disappears, as well as what we typically think of when we think of the control flow of a program. Instead, the application becomes an Event Driven Cloud (did I just coin that term? ;D ). I specifically used the term cloud because the idea of a defined application disappears: microservices appear and disappear depending on demand and usage.
A BPM engine based on this architecture can then be used to help provide an overview of, or otherwise a translation between, a very varied and populated landscape of microservices and the business knowledge possessed by actual people.
But what we don’t want is a “god” service: in other words, we don’t want a single instance or thread that dictates what happens. In some sense, we’re coming back to one of the core ideas behind BPM: the control flow should not be hidden in the code; it should be decoupled from the code and made visible and highly modifiable.
At this point, we come back to the separation of execution from implementation. What does this translate to in this landscape?
One example is the following. In this example, I’m using the word “event” very liberally: in essence, the “event” here is a message or packet of information to be submitted to the following service in the example. Of course, I’m ignoring the issue of routing here, but I will come back to that after the example:
- A “process starter” service parses a workflow representation and produces an event for the first node.
- The first workflow node is executed by the appropriate service, which returns an event asking for the next node in the process.
- The “process state” service checks off the first node as completed, and submits an event for the following node.
- Steps 2 and 3 repeat until the process is completed.
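The steps above can be sketched roughly as follows (all names hypothetical). A queue stands in for the routing layer I’m glossing over; each “service” is reduced to a step in a loop, and the “process state” service simply checks nodes off as their completion events come back:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class EventLoop {
    // nodes: the parsed workflow, in execution order (a plain sequence,
    // for simplicity). Returns the list of completed nodes.
    static List<String> run(List<String> nodes) {
        Queue<String> events = new ArrayDeque<>();
        List<String> completed = new ArrayList<>();
        if (!nodes.isEmpty()) {
            events.add(nodes.get(0));         // "process starter" emits the first event
        }
        while (!events.isEmpty()) {
            String node = events.poll();      // the node's service executes it
            completed.add(node);              // "process state" checks it off
            int next = nodes.indexOf(node) + 1;
            if (next < nodes.size()) {
                events.add(nodes.get(next));  // event for the following node
            }
        }
        return completed;
    }
}
```

In a real microservice landscape each loop iteration would be a separate service reacting to a message, but the control flow would be exactly this: no single component ever holds more than the current event.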
There are a couple of advantages that spring to mind here:
- Extending the process engine is as simple as
- introducing new services for new node types
- introducing new versions of the “process starter” and “process state” services
- Introducing new versions of the process engine becomes much easier
- Modifying the process definition of an ongoing process instance becomes a possibility!
However, there are drawbacks and challenges:
- As I mention above, we do run into the added overhead of routing the messages to the correct service.
- Performance is a little slower here, but then again, we’re doing BPM: performance penalties for BPM are typically found in the execution of specific nodes, not in the BPM engine itself.
- This is similar to databases, where the choice between a robust database and a cache solution depends on your needs.
In particular, reactive programming (vert.x) seems to be a paradigm that would lend itself to this approach on a smaller scale (within the same server instance, so to speak), while still allowing this approach to scale.
One of the challenges of the drools/jbpm/kie project is that there are lots of repositories — even more so with the 6.x branch. Of course, there are the core repositories (drools, jbpm), the shared API repositories, the integration repositories, and then once you start looking at the applications, most of which are based on the new UberFire framework, the list just keeps on growing.
There’s also the fact that I regularly dive into the code of say, Hibernate, Hornetq, RestEasy or Wildfly. Of course, then there are git repositories with examples, Arquillian container repositories..
You get the point.
What I wanted to share with you is a script that I use to regularly update all of these repositories. You can find it here, where I’ve pasted it into a github gist.
First off, the script assumes that it’s in the parent directory of your repositories. That is to say, if all of your github repositories are in /home/me/workspace, then the script assumes that it’s been started there.
This script does the following for every repository that you give it:
- Changes directory into the repository.
- Runs git remote update.
- Checks that the current (git) branch is master.
- Checks to see if there are any changes to the repository (any file changes that are unstaged, in the index or otherwise not committed).
- Runs git merge --ff origin/master and prints its output.
The list of repositories can be found at the end of the script, underneath the __DATA__ tag in the perl script. You can also add comments in the list.
Of course, you can restart the script halfway if you want to by calling ./update.pl -f <repo> (which restarts “from” that repo) or ./update.pl -a <repo> (which restarts “after” that repo). Enjoy!
I’ve unfortunately had to do a lot of traveling in the last couple of days and I’ve been reading No one makes you shop at Wal-Mart on the plane. Among other things, the book describes the economic model underlying the idea of a ‘public good’.
A public good, as opposed to a private good, is basically a resource that is enjoyed freely by a group. A clean environment or a quiet neighborhood is a good example of this.
A private good, in this example, might be the right to play your music as hard as you want to. However, if everyone starts doing that, the public good of a quiet neighborhood will soon disappear. In this scenario, everyone in the neighborhood has to pay the cost of setting a limit on how hard and when you can play your music in order to preserve the public good of a quiet neighborhood. This idea is related to the Prisoner’s Dilemma, for those of you curious about that.
With regards to software development, a set of well-running unit tests is a public good while the act of writing a unit test is actually a private cost. I think this is self-explanatory to most of the developers reading this, but I’ll explain it just to be sure.
Writing a unit test that is of good quality is not advantageous to a developer writing or modifying code for the following reasons:
The productivity of a developer is measured based on 2 things:
- The number and quality of the features that she produces
- The number of bugs she fixes
The first measurement, the number of features produced, is weighted more heavily: that is to say that creating a feature is, in general, seen as more productive than fixing a bug.
However, writing a good unit test does not directly contribute to either the creation of features or bug fixes.
While writing a unit test might be helpful when creating or modifying a feature, it’s not necessary. Every decent developer out there is perfectly capable of writing software without having to write a unit test. In fact, writing a unit test costs time which a developer might otherwise have spent on writing more features or fixing bugs! The better the unit test is, the more time that a developer will have needed to spend on it, making good unit tests more “expensive” than lower quality unit tests.
Thus, from an individual developer’s perspective, it does not pay to write good unit tests, especially in the short term.
Furthermore, the unfortunate thing about “quality” is that the quality of a feature (or any piece of code) is something that can only be measured in relation to how long the code has existed. In other words, the quality of code is never immediately apparent and frequently only becomes apparent after a significant period of time and use. Often, by the time the (lesser) quality of the code becomes apparent, no one can remember or determine exactly who created the feature, and it’s not productive to search for that information.
But it always pays to have a high quality suite of unit tests. A well-written suite of unit tests does 3 things:
- Most importantly, it will alert the developer to any problems caused by new or changed code.
- Because an existing unit test will obviously use the existing API, it will alert the developer to problems with backwards compatibility if the developer changes the existing API.
- Lastly, unit tests are functional examples of code use: they document how existing code should and can be used.
All of these benefits help a developer to write better quality features (in less time) and help not only with fixing bugs, but also with preventing bugs!
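As a toy illustration of the second and third points (the Money class and its test are entirely hypothetical): the test below documents how the API is meant to be used, and it would fail to compile or pass if someone changed add() in a backwards-incompatible way.

```java
// A hypothetical value class under test.
class Money {
    private final long cents;
    Money(long cents) { this.cents = cents; }
    Money add(Money other) { return new Money(cents + other.cents); }
    long cents() { return cents; }
}

class MoneyTest {
    // Doubles as documentation: this is exactly how a caller uses Money.
    static void testAddKeepsCents() {
        Money total = new Money(150).add(new Money(50));
        assert total.cents() == 200;
    }
}
```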
But, in a situation in which there are no external pressures on how a developer writes his or her code, there are no immediate reasons for a developer to write unit tests. This is especially true in situations in which the developer will only be working on a project for a relatively short period of time.
Of course, some developers might feel that writing tests helped them develop features more quickly — or that it might help them fix bugs more quickly. However, if at a certain point they have to justify their use of time to a superior (the project lead, project manager, etc.) and they explain that they were writing unit tests instead of writing new features or fixing bugs, they will get in trouble, especially if there’s less value placed on unit tests or refactoring.
At a company, obviously, this is where a project manager, project lead or even a CTO comes in. While it may be in the interest of the developer to create new features and fix bugs as quickly as possible, it’s probably equally important to the CTO and other managers that the quality of the software created meets certain standards. Otherwise, users of the software might become so disgruntled with the software that they’ll complain, leading to a negative reputation of the software, which may lead to the company going bankrupt!
It’s in the interests of the CTO and other managers to require software developers to write unit tests that are of a certain level of quality: namely, the unit tests should be good enough to assure that the software retains a positive reputation among its customer base. This is often a difficult limit to quantify but luckily often easier to qualify, I think.
Open source software is, however, a different story. There is no CTO and the highest authority in an open source project is often the lead of the project.
The question, then, is what determines the quality of a suite of unit tests in an open source project? To a large degree, the answer is obviously the attitude of the lead of the project. Attitude is often very hard to measure, unfortunately: it’s easy enough to say one thing and do another. If you ask the lead what she thinks, it’s hard to say if the answer she gives you represents the attitude that she communicates to the rest of her project.
Realistically, one of the most decisive factors determining the quality of the features of an open source project is simply the example set by the lead in the code and tests that he or she writes.
The thing is, OpenShift just works. Sure, now and then you have to figure a few things out, but given all of the work it does for you — or rather, given all of the work you no longer need to do, it.. it rocks!
The talk is called “Openshift: State of the Union” and it’s a quick primer followed by a couple of demos. Now, I just need to hope they don’t randomly decide to do maintenance during my talk. ;D
I’ve just uploaded my slides for those curious: you can find them on slideshare.
See you at the talk!
I recently learned about the Event Sourcing (and CQRS) design patterns. Unfortunately, reading through Martin Fowler’s bliki definition just didn’t help. What helped the most were two things:
CQRS relies on Event Sourcing in order to work.
Event Sourcing requires the following:
- You have an event store in which you store events.
- You have Aggregate Roots, which is a way of structuring data structures so that you can always link the fields and attributes and other associated data with the underlying “root” data structure. For example, the ‘pages’ in a ‘book’ would never change without the ‘book’ data structure knowing that.
Event Sourcing means that every time you change any data — any state, in fact, which also happens to be data — you generate an event.
Matt’s contact manager example application does just that: in all of the methods which change any data, an event is 1. created, 2. filled with the new data, and 3. added to the event store. It’s a little more complicated than that, but only because you also have to take into account the transactional logic necessary when changing data or state.
What’s interesting about this is that you can then recreate state by “replaying” the events — from the beginning, if you have to. Replaying the events means that all of the incremental changes made are reapplied, after which you get your current state back!
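A minimal event-sourcing sketch (hypothetical names, and far simpler than Matt’s example: no transactions, and the “events” are bare deltas): every change is appended to an event store as it is applied, and replaying the store from the beginning recreates the current state.

```java
import java.util.ArrayList;
import java.util.List;

class Counter {
    // The event store: real systems persist richer event objects.
    private final List<Integer> eventStore = new ArrayList<>();
    private int state = 0;

    void add(int delta) {
        eventStore.add(delta);   // create the event, fill it, append it
        state += delta;          // apply the change to current state
    }

    int state() { return state; }

    // Reapply every incremental change in order to rebuild the state.
    int replay() {
        int s = 0;
        for (int delta : eventStore) s += delta;
        return s;
    }
}
```

Because replay is deterministic, you can also stop it partway to see what the state was at any earlier point, which is exactly the rewind property discussed below.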
CQRS is a design pattern that makes distributed and concurrent systems easier to design and maintain. The diagram above illustrates the problem particularly well: The user decides what she wants, resulting in a change being acted out on the state. That change then is also translated into an event which updates the report, which the user can then query if she’s curious.
By separating your “write” data model from your “read” data model you can respond to queries much more quickly and you can also make sure to use costly database transactions for operations that need it, for example. You do then run the risk that the data you query is not 100% accurate with regards to when it was queried: your “read” data model is always updated after your “write” data model.
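That read/write separation can be sketched like this (names hypothetical, with a queue standing in for asynchronous event delivery): commands only touch the write model and record an event, queries only touch the read model, and the read model catches up when events are propagated.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class Cqrs {
    private final Map<String, Integer> writeModel = new HashMap<>();
    private final Map<String, Integer> readModel = new HashMap<>();
    private final Queue<String> pendingEvents = new ArrayDeque<>();

    // Command side: change state and record an event for the read side.
    void set(String key, int value) {
        writeModel.put(key, value);
        pendingEvents.add(key);
    }

    // Query side: served from the (possibly stale) read model.
    Integer query(String key) { return readModel.get(key); }

    // In a real system this runs asynchronously; here it's explicit,
    // which makes the eventual-consistency window easy to see.
    void propagate() {
        while (!pendingEvents.isEmpty()) {
            String key = pendingEvents.poll();
            readModel.put(key, writeModel.get(key));
        }
    }
}
```

A query issued between set() and propagate() sees the old value: that window is exactly the “not 100% accurate with regards to when it was queried” trade-off mentioned above.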
Why am I writing about this? Well, CQRS certainly doesn’t really apply to my work, but Event Sourcing does. Being able to rewind (and re-forward) state is something that I think a BPM engine should do. I’m also curious if Event Sourcing could help when architecting any BPM engines that deal with ACM.
When I was in college, I saw the movie “The Matrix” a couple times. I only watched it 3 times maybe, but I never watch movies multiple times (Okay, except for my favorite movie ever, but that’s another story).
In any case, I would get so psyched to code after watching that. When I had a project I needed to code for school that I just wasn’t enjoying that much — or when I just wasn’t motivated to work, I’d watch the Matrix and it would just psyche me up to build the worlds I was building in my code.
I think it was the sense of discovering and creating your own worlds that “The Matrix” conveys — the way it brought me in touch with that feeling in myself, and how I’ve always enjoyed that.
The (“leaked”) Valve handbook does that for me now — if I had theoretically read it, of course. But if I had read it, I would say that it is infused with that joy of creating. And I can’t help but be contaminated by that when I read it.