Wednesday, June 13, 2007

Pentaho on Oracle's App Server (OC4J)

Wow! It has really been 4 months since my last post?? Moving over to development has cut into the time I had for blogging, documenting, communicating, you name it! We are coding like crazy.

Well, I'm back because I am heading out to ODTUG Kaleidoscope next week, and in preparation for the show, I decided to set up Pentaho on Oracle's Java Edition App Server, which is OC4J, which is based on the Orion app server. I was pleased that I managed the migration in less than a day, and I wanted to share the steps with all those folks who are too impatient to wait for this to get into our J2EE deployment distribution :)

Mind you, it takes a bit of tweaking, but it is certainly very do-able, and all server features are stable (minus the portal stuff, I didn't get a chance to address moving the portal over). Here is the repro of where I started, what I tweaked and what I came out with. Enjoy!

Where I Started

I started out by downloading a Pentaho J2EE deployment distribution from the Pentaho downloads site. The version I used for this exercise was This distribution is found on our downloads page under Pentaho Open BI Suite Web Archive (.war). I know, the name implies a .war distribution, but trust me, it's the deployments zip file.

Unpack this distribution to a work directory of your choosing. This distribution has an Ant build script that lets you build several different .war files and .ear files configured for different app servers and RDBMSs. Next, I'll detail the tweaks I had to make to get the Orion build target working, which is sufficient to build an .ear file appropriate for Oracle's OC4J.

I also downloaded the sample solutions distribution and the sample data (Hypersonic) distribution, so I would have stuff to test against. You can get both of these distributions from the Pentaho downloads site as well.

What I Tweaked

About two years ago, we had started to incorporate build scripts for the Orion application server in anticipation of great community demand for this build. However, with the multitude of projects we have taken on, and a surprising lack of banging from the community, we never found time nor priority to finish that build, until now.

I really want to be able to demo our stuff on Oracle at the user conference, so I took on the task of cleaning up and repairing the Orion build. NOTE that you will be able to get these fixes in a near future build, as soon as we get our M4 release out the door and I can check this stuff in.

The Build

For the brave and impatient, here's what it needs out of the gate:

  1. We'll call the root of your work directory [pentaho_j2ee_deployments]. In this directory, you will find a build.xml file. Open that file in your favorite text editor.
  2. Find the ant target named "build-orion2.0.5-ear". Delete that target entirely. Replace the target with the following XML:

    <!-- ===================================================================
         target: build-orion2.0.5-ear
         =================================================================== -->
    <target name="build-orion2.0.5-ear" depends="zip-pentaho-style-war">

      <!-- build pentaho.war with the Tomcat/HSQLDB configuration -->
      <antcall target="war-pentaho-tomcat">
        <param name="rdbms" value="hsqldb" />
      </antcall>
      <mkdir dir="${build.ears.dir}/orion" />

      <!-- appxml is an assumption: it points at the application.xml edited later in this post -->
      <ear destfile="${build.ears.dir}/orion/pentaho.ear"
           appxml="pentaho-res/ear/application.xml">
        <manifest>
          <attribute name="Implementation-Title"
                     value="${impl.title}" />
          <attribute name="Implementation-Version"
                     value="${impl.version}" />
          <attribute name="Implementation-Vendor"
                     value="${impl.vendor}" />
        </manifest>
        <fileset dir="${build.wars.dir}">
          <include name="pentaho-style.war" />
          <include name="sw-style.war" />
        </fileset>
        <fileset dir="${build.wars.dir}/tomcat/hsqldb"
                 includes="pentaho.war" />
        <metainf dir="pentaho-res/orion" includes="*.xml" />
      </ear>
    </target>


  3. Save and close the build.xml file.
  4. In the same directory, open the file. Add the following line to this file:
  5. Save and close the file.

Those few steps fix up the build files so that you can build a pentaho.ear for OC4J. You will of course need Ant in your system path, and a JDK of at least version 1.4.2. I will assume that if you are climbing this mountain, those are easy steps you already know how to set up :)

Don't run the build yet! There are several configuration file tweaks that have to be added in order for this app to be configured properly.

The web.xml File

We need to make a few minor changes in the Pentaho web.xml.

First, the OC4J container's default application is running on port 8888. Pentaho's default port is set to 8080. So I changed Pentaho's default port to be 8888, since that seemed to be the easiest road. NOTE that you want to add your changes to two web.xml files, since it is duplicated in the deployer distribution.

You need to modify BOTH

- [pentaho_j2ee_deployments]/pentaho-precompiled-jsps/pentaho.war/WEB-INF/web.xml


- [pentaho_j2ee_deployments]/pentaho-webapp/WEB-INF/web.xml !!

  1. Open both web.xml files.
  2. Find the base-url param-name, with the value http://localhost:8080/pentaho/. Change the value to http://localhost:8888/pentaho/.

We also need to make sure the server can find the Pentaho solutions directory. If you haven't yet, unpack the solutions distribution that you downloaded to a work directory of your choice.

  1. Find the param-name pentaho-solutions, and replace the value with the absolute path to the pentaho-solutions directory that you just unpacked. The value should look something like d:\work\pentaho-solutions.
  2. Save and close both web.xml files.
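
For anyone who would rather script the two web.xml edits than make them by hand, here is a minimal Python sketch. It assumes each setting appears as a <param-name> immediately followed by its <param-value> (the usual context-param layout) and that the files read cleanly as UTF-8. The function names set_param and patch_web_xml are mine, not part of the Pentaho distribution.

```python
import re
from pathlib import Path


def set_param(xml_text, name, new_value):
    """Return xml_text with the <param-value> that follows <param-name>name replaced.

    Assumes the layout <param-name>NAME</param-name><param-value>...</param-value>,
    with only whitespace between the two elements.
    """
    pattern = re.compile(
        r"(<param-name>\s*" + re.escape(name) + r"\s*</param-name>\s*<param-value>)"
        r".*?(</param-value>)",
        re.DOTALL,
    )
    # A function replacement keeps backslashes in Windows paths literal.
    updated, count = pattern.subn(
        lambda m: m.group(1) + new_value + m.group(2), xml_text
    )
    if count == 0:
        raise ValueError("param-name %r not found" % name)
    return updated


def patch_web_xml(path, base_url, solutions_dir):
    """Rewrite one web.xml in place with the new base-url and solutions path."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    text = set_param(text, "base-url", base_url)
    text = set_param(text, "pentaho-solutions", solutions_dir)
    path.write_text(text, encoding="utf-8")
```

Call patch_web_xml once for each of the two web.xml files, passing http://localhost:8888/pentaho/ and the absolute path to your own pentaho-solutions directory. Working on the raw text rather than a parsed tree leaves comments and formatting in the file untouched.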

The application.xml File

The application.xml file lives in the [pentaho_j2ee_deployments]/pentaho-res/ear directory.
  1. Open the application.xml file.
  2. Delete all modules under the comment <!-- additional web apps -->, as well as the web-uri module for the sw-style.war above the comment. The only modules your application.xml should have left are the pentaho.war and the pentaho-style.war.
  3. Save and close the application.xml file.
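
If you want to double-check the result before building, a small Python sketch like the one below lists the web modules that are still declared. It assumes the modules are declared through standard <web-uri> elements; the function names are hypothetical helpers of my own.

```python
import xml.etree.ElementTree as ET


def web_modules(application_xml):
    """Return the <web-uri> values declared in an application.xml document."""
    root = ET.fromstring(application_xml)
    uris = []
    for element in root.iter():
        # Drop any "{namespace}" prefix so DTD- and schema-based files both work.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "web-uri" and element.text:
            uris.append(element.text.strip())
    return uris


def unexpected_modules(application_xml, expected=("pentaho.war", "pentaho-style.war")):
    """Return the web modules that should have been deleted but are still present."""
    return [uri for uri in web_modules(application_xml) if uri not in expected]
```

An empty list from unexpected_modules means only pentaho.war and pentaho-style.war are left. Pass in the file contents, for example the bytes read from [pentaho_j2ee_deployments]/pentaho-res/ear/application.xml.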

Note that the sw-style.war provides the styles for the Steel Wheels samples in the solutions, but I did this quick and dirty, so I left out as many extras as possible. You may want to include the sw-style.war and see if it works. I left it out.

The orion-web.xml File

The orion-web.xml file does not exist yet, so you need to create one.
  1. Add an orion-web.xml file to [pentaho_j2ee_deployments]/pentaho-webapp/WEB-INF.
  2. Add the following XML to the file:
    <orion-web-app>
      <resource-ref-mapping location="jdbc/HibernateDS" name="jdbc/Hibernate"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/SampleData"/>
      <resource-ref-mapping location="jdbc/QuartzDS" name="jdbc/Quartz"/>
      <resource-ref-mapping location="jdbc/SharkDS" name="jdbc/Shark"/>
      <resource-ref-mapping location="jdbc/SampleDataAdminDS" name="jdbc/SampleDataAdmin"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/datasource1"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/datasource2"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/datasource3"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/datasource4"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/solution1"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/solution2"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/solution3"/>
      <resource-ref-mapping location="jdbc/SampleDataDS" name="jdbc/solution4"/>
    </orion-web-app>
  3. Save and close the orion-web.xml.
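
A quick way to see whether the mappings cover everything the web app asks for is to compare them against the resource references in web.xml. The Python sketch below does that under two assumptions of mine: web.xml declares its data sources through standard <res-ref-name> elements, and orion-web.xml is well-formed with a single root element. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip a leading "{namespace}" from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def declared_refs(web_xml):
    """Return every <res-ref-name> value found in a web.xml document."""
    root = ET.fromstring(web_xml)
    return [
        e.text.strip()
        for e in root.iter()
        if _local(e.tag) == "res-ref-name" and e.text
    ]


def mapped_refs(orion_web_xml):
    """Return {name: location} for each <resource-ref-mapping> element."""
    root = ET.fromstring(orion_web_xml)
    return {
        e.get("name"): e.get("location")
        for e in root.iter()
        if _local(e.tag) == "resource-ref-mapping"
    }


def unmapped_refs(web_xml, orion_web_xml):
    """Return the resource references in web.xml that have no mapping."""
    mapped = mapped_refs(orion_web_xml)
    return [name for name in declared_refs(web_xml) if name not in mapped]
```

Any name that unmapped_refs returns is a reference the container will not be able to resolve through orion-web.xml.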

Build the pentaho.ear File

You've got everything you need now to build the pentaho.ear file. Go to a command prompt, navigate to your [pentaho_j2ee_deployments] directory, and execute:

ant build-orion2.0.5-ear

OC4J Server Configuration Changes

Now that you have your .ear file, you will obviously want to deploy it through OC4J Server Console. First, shut down your OC4J instance, because we have a few mods that we need to make to the server configuration.

TopLink Conflicts

Now, I know this is not nice, but since we have no use for Toplink with Pentaho, and the antlr.jar conflicts with our hibernate3 library, you will need to delete the Toplink directory under [OC4J_Home]. This causes no harm as long as you're not using Toplink for some other reason. If so, then maybe you can investigate further with Oracle how to avoid the library conflicts.

This will cause the java_sso webapp in the default module deployed in the server to fail. I simply undeployed the app.

Out of Memory Errors

Pentaho is a big engine, and as such uses a bit of memory. I found that I was running out of memory early and often. So I modified the JVMARGS setting in the [OC4J_Home]/bin/oc4j.cmd file to include:

set JVMARGS=%OC4J_JVM_ARGS% -Xms128m -Xmx512m -XX:MaxPermSize=128m

Add the hsqldb.jar to the [OC4J_Home]/lib Directory

Since we will be testing against the Hypersonic sample data, we need to add the JDBC driver for Hypersonic to the [OC4J_Home]/lib directory. You can find the hsqldb.jar in [pentaho_j2ee_deployments]/pentaho-third-party.

Note that when you deploy the .ear file, you MUST set an additional entry on the classloader's classpath for this jar!

Mod the data-sources.xml File

And finally, we need to add our datasources to the [OC4J_Home]/j2ee/home/config/data-sources.xml file. Add the following XML to the file, before the closing </data-sources> tag:


Problem in the Solutions' XSLTs

I don't get this at all, but Oracle has some sort of built-in XSLT processor that complains when your XSLT tries to reference a Java class as the value for a namespace. So what you need to do is go through the [pentaho-solutions]/system/custom/xsl directory, and check every xsl for this occurrence (I would say at least 20 of them have it) and make the following mod to those XSLTs:

  1. Open the xslt file.
  2. Find the Java class reference in the namespace declaration. They are always at the top of the file, and look similar to xmlns:msg="org.pentaho.messages.Messages" . Note that xmlns:msg isn't the only possibility, just an example for you to follow.
  3. Prepend the java classname value with So the fixed example would be xmlns:msg="" .
  4. Repeat the steps above for every xslt file in that directory that has the namespace occurrence.
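
Hunting through the directory by hand is tedious, so here is a minimal Python sketch that reports which stylesheets declare a namespace whose value is a bare Java class name. It only finds the declarations; it does not rewrite them. The *.xsl file pattern and the function names are my assumptions, so adjust them to match your files.

```python
import re
from pathlib import Path

# Matches xmlns:prefix="some.dotted.ClassName". URI-style values contain ":" or "/"
# and therefore do not match.
JAVA_NS = re.compile(r'xmlns:([\w.-]+)\s*=\s*"([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+)"')


def java_namespaces(xsl_text):
    """Return (prefix, class_name) pairs for namespaces that name a Java class."""
    return JAVA_NS.findall(xsl_text)


def scan_directory(xsl_dir):
    """Map each stylesheet name in xsl_dir to the Java-class namespaces it declares."""
    hits = {}
    for path in sorted(Path(xsl_dir).glob("*.xsl")):
        found = java_namespaces(path.read_text(encoding="utf-8", errors="replace"))
        if found:
            hits[path.name] = found
    return hits
```

Point scan_directory at [pentaho-solutions]/system/custom/xsl and work through the files it lists.
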
There is one syntax problem in one of the xslt's as well. JBoss and Xalan don't seem to care about it, but OC4J will error out, so we need to fix that.

  1. Open the file [pentaho-solutions]/system/custom/xsl/files-list.xml.
  2. Search for the following lines of XML, and delete them. Note they are not right next to each other.
    <xsl:param name="target" />
    <xsl:with-param name="target" select="$target"/>

Deploying the Pentaho .ear File

Finally!! You can deploy your .ear file now. Start your OC4J server back up, and use the Server Console to deploy. Here are a few notes on the deployment options:

  1. You do not need to set up a security provider in the deployer wizard. Pentaho uses the standard J2EE security via Acegi, and no custom extensions to the configuration are necessary.
  2. Make sure that you map a path on the classloader to the hsqldb.jar that you moved into OC4J's lib directory!
  3. You may see some log4j exceptions on start-up, but as long as you see a statement toward the end of the console log that says "Pentaho BI Platform ready", then you should be ready to go!

Navigate in your browser to http://localhost:8888/pentaho/Home, and test it out!

What I've Got - Pentaho on OC4J!

So, now I'm stoked because I can talk intelligently to Oracle tools users about Pentaho on Oracle. I think I might even dive in and use Kettle to move our sample data over to Oracle Express.

Very soon all this should be doc'ed and available through our SVN repository... hopefully any early comers who use this tip will let me know how it goes for them :)

And please, if you will be at the conference in Daytona next week, stop by the Pentaho booth and say hey! Can't wait to meet you all!


Saturday, February 03, 2007

Wrap up on the Pentaho Implementation Workshop

All in all, I had great expectations for the 3 day implementation workshop, and I wasn't disappointed!

The last session on Thursday, Dashboards and AJAX, was crammed full of great technical information. The most important info that I can give you all is that the current JSP based dashboards are going away, in exchange for dashboards built upon Pentaho AJAX components, currently under development. Of course, we all were very excited to hear about the new AJAX component architecture, and happy to hear that we can get to the code. The ETA for GA delivery is somewhere in the 3rd or 4th quarter of this year (with milestone builds available earlier most certainly), but always stomping on that hairy edge, I can't wait to dive into the code and contribute to implementation (in my spare time, ha ha! ).

James Dixon, Chief Geek for Pentaho, also went into a bit of the history of Pentaho Dashboards, which was really helpful in gaining perspective on why Dashboards require so much coding today. The philosophy and design goals for Dashboards (really for the platform, in general) are to remain delivery agnostic - meaning we want the platform output to be delivered via the client's choice of technology, not our own. So if you are a JSF shop, .NET shop, or Java applet guru, it won't matter to us, since we deliver the content from the platform in XML. You can take it from there, and transform that XML any way you wish. Well, that design goal is tough to stick to when you are implementing a Dashboard architecture, since Dashboards are heavy on UI, usually containing reports, charts, dials, gauges and numerous other widgets, in some portal type fashion specific to the user's point of view. So we went about component-izing all of the above mentioned widgets, and used a simple JSP (or not so simple JBoss Portal) to demonstrate what COULD be done. The response from our community has been to make Dashboards easier to build, and fortunately, AJAX has come along (or been there, depending on how you look at AJAX) so that we can deliver that ease of implementation.

I haven't had a chance to say much about the group that attended the class with me. I was tickled to finally meet some folks that I have been chatting with via email, some for more than a year now! Nice, intelligent, talented and truly passionate about BI - I could spend a lot more time with these folks, we share so many traits and interests (I know, I think very highly of me ;))! The attendees came from Canada, the Netherlands, Germany, France and the US, which speaks for the demand that our training generates, as well as the global presence that Pentaho has earned in a short 2.5 years. I can't believe sometimes that I am a part of something this big, and this bold! On a day to day basis, it feels like we are just a bunch of guys doing what we have done best for a long time - building BI. But when you gather your community, partners and teammates in a room like we did last week, it sure feels a whole lot bigger, a lot more significant. And so, I can someday explain to my daughter why my job makes me proud :)

Thursday, February 01, 2007

Workshop, Day 3

I know, I know, what happened to Day 2? Well, between busy sessions, picking up the kids, and teaching my night class at Brevard Community College, blogging fell by the wayside. For my three avid fans, I sincerely apologize, but by now you are used to it :)

So Day 2 of the workshop, in all honesty, was a bit of a firefighting exercise. Since we are working with the very latest code for the platform (and I mean VERY latest), we ran into a couple of problems during the Subscriptions session that prevented us from seeing the results of the subscribed bursting examples that we set up. But the content was good, and subscriptions in the stable builds are very, very powerful. We have the ability with the platform to relieve the administration and information overload that occurs with the typical scheduled reporting process. Subscriptions also prove to be flexible and easy to use, which makes them a nice tool for consumers of Pentaho reports, analysis, ETL and processes.

The Day 2 afternoon session was all about advanced deployments of the platform, as well as customizing deployments for each user's environment. Brian Hagan walked through the complex details of manually deploying web applications through JBoss and Tomcat, focusing on the touch points that are required when you have your own app server installation already in place. Overall a good session that could be helpful to anyone that struggles with J2EE deployments today.

Day 3 started out with Bill Seyler, a stellar Pentaho engineer, presenting the life cycle management features within Pentaho. For anyone who is not familiar with the term and what it means in Pentaho, life cycle management is versioning solution content for the platform. Bill covered the architecture of how Pentaho interacts with version control systems, which seems to be a very clean and simple implementation. The beauty of life cycle management in Pentaho is that, due to its simple interface, any version control system can be used, and multiple systems can even be used for one Pentaho deployment.

Next up was Anthony DeShazor, our engineering wrangler, talking about scalability and clustering. Much of this session covered the general topology and infrastructure issues that prevent almost any application from scaling. The point I took away is we can control what the Pentaho application does, but how you get at your data and how you deliver it out to consumers can bottleneck any good app. It was great to participate in this discussion, since many in the room are experienced in the field and had much to contribute. Anthony then took us through the JBoss Clustering presentation that James Dixon presented at JBossWorld late last year. It was a simple architectural discussion covering JBoss Clustering, ending with some pretty impressive benchmarks that proved Pentaho's ability to scale. The most interesting news that came out at the end of this session is that Pentaho has started to build a BI benchmarking bundle, based on the Transaction Processing Performance Council processes, and plans to release it to the open source community for benchmark responses. Feel free to email James ( and it will get forwarded) if you are interested in participating in that effort!

Our last session after lunch is Dashboarding and AJAX, a session that all the trainees, including myself are looking forward to. I'll fill you in tonight on how that session goes and how this all wraps up.

Tuesday, January 30, 2007

Implementation Workshop: Metadata

"My users think I'm a God." , Matt Harbert, DivX

I had to start with that quote, I was so pleased to hear it during discussions over lunch this afternoon during the workshop. I was talking with a couple of classmates about Kettle, the Pentaho Data Integration tool, reveling in stories of leaping tall buildings with Kettle as my booster pack. From across the room, Matt tossed the aforementioned words of glory, and I thought "well, if that doesn't just sum it all up". Kettle really is that good, and you'll only know if you dive into it, because, of course, I am a Pentaho-an, making me slightly biased ;)

These were the topics of discussion that opened the Metadata module of the workshop. Metadata is another project being architected and driven by Matt Casters, founder of the Kettle project. Jake Cornelius led the module, and did a nice job of showing us the Pentaho Metadata Editor, a handy tool that assists in building Pentaho metadata models. The Metadata Editor is not quite 2 months old, and from what we were shown, is proving to hold lots of potential. The core functionality is there, and the user interface is intuitive, once you learn the new jargon. Amidst a short array of funky behavior and a few bugs, the Metadata Editor's power shined through with its ability to model not only mappings to physical tables, but also extended formulas, formatting and style properties in a hierarchical fashion (termed "concepts"), and internationalization functions.

This is the end of day one, time to pop over and visit the dev guys on my way home. I have a renewed sense of excitement today, partly because this training is turning out to be even better than I had expected (and my expectations were pretty high), and partly because I'm training up for a brand new seat on the Pentaho ride :)

Implementation Workshop: Security Simplified

Well, we just finished the security module of the workshop, and I have to say, sans the network issues, I am really impressed. Mat Lowry, a Pentaho engineer who focuses on security during his day job, put together the content for the module today. Mat took a pretty complex set of topics (LDAP, Acegi, CAS and J2EE Container Security) and delivered just enough content to understand easily what Pentaho Security is made up of, and what Pentaho adds to the standard technologies available to you in a J2EE environment.

It seems we have done a very nice job of separating the wrangling of authentication and authorization from the functionality of the BI platform. I plan to follow up this workshop with a deeper dive into Acegi, as Mat has gotten me really excited about what it and the Spring framework can do. I'm taking away the relief that Acegi can handle a good 80% to 90% of my web resource security problems, without me having to write more code. I like it, I like it.

My take on the hands-on lab is that it really made me think, and I was pleasantly surprised to find that I understood the concepts Mat covered, and could apply them in the 60 minute lab exercise that was given to us. This was not your typical training class exercise where, with loads of screenshots and step-by-step instructions, you could achieve one simple implementation of security. This was more like an "A train leaves Tampa at 400 miles an hour at the same time a train leaves Daytona at 200 miles an hour, when will they meet" type of exercise. Now, I think I've mentioned before that I'm a very bright, but pretty simple person, and frankly this type of exercise reminds me how little focus I have. Once we stopped chatting, and I could read the lab carefully, I had no trouble implementing my own switch over from memory based security to LDAP based security within the Pentaho platform.

I have to admit, I feel a tad bit smarter than I used to :)

Pentaho Implementation Workshop

This week, I get the pleasure of sitting in on the Pentaho Implementation Workshop, a hands-on, in-depth training session covering many advanced implementation features in the Pentaho BI platform.

This training kicks off my new role as a Pentaho developer! I decided to move over to the engineering side after a year and a half leading the Pentaho community, which was a very rewarding experience. You know what they say, you have to go where your heart takes you :)

Here's the workshop agenda:

Dashboards and AJAX
Reporting User Interface
Life Cycle Management
Advanced Deployments

Community members here - Roland, Samuel, Fabrizio, welcome, and it's so nice to finally meet you in person :)

I'll be blogging on the workshop all week, so stay tuned!

Friday, January 12, 2007

Follow-on to Internationalization.... Vote!

If you read my previous post on internationalization, you know that I'm looking for a great Confluence solution for handling multiple translations of the Pentaho documentation inside our new wiki.

Go here, log in and VOTE for Atlassian to help us solve the problem!

And my apologies to Atlassian regarding my comment that they may not be responsive in their forums - I wasn't monitoring the thread they responded to, only four short days after I posted my dilemma. I will have crow for dinner :)

Thanks everybody!