Sunday, September 28, 2014

Hello Docker.

Docker. Hmmmm. I really want to love it. Everybody else loves it, so I should, right? I think maybe some of the "shiny" isn't so bright using DOCKER ON MY MAC. Although, Chris J. over at Viget wrote this blogpost that singularly walked me through each Mac-Docker gotcha with zero pain. Total stand-up guy, IMHO.

If you are not familiar, Docker plays favorites with Linux-based operating systems, and requires a virtual machine wrapper called boot2docker in order to run on a Mac or Windows OS. Not a huge hurdle, but it definitely feels heavier and a bit more maintenance-intensive ... two of the core pain points in traditional virtual environment deployments that Docker proposes to alleviate.

Beyond that silliness, there is a whole lot more *nix-based scripting than I expected. Somehow I thought the Dockerfile language would be richer, accommodating more decision-based caching. You know, something like cache this command but not this one. As I looked around and read a few comments from the Docker enthusiasts and Docker folks-proper, it seems there is a great desire to keep the Dockerfile and its DSL ... well ... simple. Limited? Is that a matter of perspective? I can appreciate simple I guess, but I still want to do hard stuff ... and thus I am pushed to the *nix script environment. This may just be a matter of stuffing myself into these new Docker jeans and waiting for them to stretch for comfort:)

One blessed moment of triumph I would like to share: I was able to write a Dockerfile that would accommodate pulling source from a private GitHub repository using SSH. This is NOT a difficult Docker exercise. This is a persnickety SSH exercise:) The Docker container needs to hold the private SSH key that pairs with the public key you have registered at GitHub. At least that is the approach I took. Please do let me know if there are easier / better / more secure alternatives.

So, the solution. The first few steps, I'm going to assume you know how to do, or can find guidance. They are not related to the container setup.

I'm going to tell you right up front that my solution does have a weakness (requirement?) that may not be altogether comfortable, and GitHub downright pooh-poohs it. In order to get the container to load without human intervention, you need to leave off the passphrase when you generate your SSH keys (Gretchen ducks.) I planned to revisit this thorn, but just simply ran out of time. Would love to hear alternatives to this small snafu. Anyway, if you're still in the game, then read on...

Here are the steps you should follow to get this container up and running.

  1. Generate a pair of SSH keys for GitHub, and register your public key at GitHub.
  2. Create a folder for your Docker project.
  3. Place your private SSH key file (id_rsa) in your Docker project folder.
  4. Create your Dockerfile, following the example below.
  5. Build your image, and run your container.
  6. Profit:)
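
For the shell-inclined, the steps above can be sketched roughly as three small helper functions. This is only a sketch under my own assumptions: the function names, the key path, the folder name and the image tag are all hypothetical placeholders, and step 4 (writing the Dockerfile into the project folder) still happens by hand before the build.

```shell
#!/bin/sh
# Step 1: generate an RSA key pair with an empty passphrase (-N "").
# The matching .pub file still has to be registered at GitHub by hand.
generate_key() {
    key_path="$1"
    ssh-keygen -t rsa -b 4096 -N "" -f "$key_path"
}

# Steps 2-3: create the project folder and copy the private key into it
# under the name the Dockerfile expects (id_rsa).
prepare_project() {
    project_dir="$1"
    key_path="$2"
    mkdir -p "$project_dir"
    cp "$key_path" "$project_dir/id_rsa"
    # Keep the copy readable and writable by its owner only.
    chmod 600 "$project_dir/id_rsa"
}

# Step 5: build the image from the project folder, then start a container.
build_and_run() {
    project_dir="$1"
    image_tag="$2"
    docker build -t "$image_tag" "$project_dir" && docker run --rm "$image_tag"
}
```

A run might then look like `generate_key "$HOME/.ssh/id_rsa_docker"`, followed by `prepare_project ./my-docker-project "$HOME/.ssh/id_rsa_docker"`, and finally `build_and_run ./my-docker-project my-image` once the Dockerfile is in place.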

The Dockerfile

FROM gmoran/my-env
MAINTAINER Gretchen Moran

RUN mkdir -p /root/.ssh

# Add this file ... this should be your private GitHub key ...
ADD id_rsa /root/.ssh/id_rsa

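RUN touch /root/.ssh/known_hosts
# Scan the host key for GitHub (assumed to be github.com) so SSH does not prompt at runtime
RUN sudo ssh-keyscan -t rsa -p 22 github.com >> /root/.ssh/known_hosts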

Running as root User

I am referencing the root user for this example, since that is the default user that Docker will use when you run the container. If you would like a bit more protection, you can create a user, and run the container as that user with the following Dockerfile instruction ...

USER pentaho

I created the 'pentaho' user as part of a Dockerfile used in the base image gmoran/my-env. IMPORTANT: Note that gmoran/my-env also downloads the OpenSSH daemon and starts it as part of the CMD Dockerfile command.

Adding the id_rsa File

The id_rsa file is the private SSH key generated as part of the first step in this process. You can find it in the directory you specified on creation, or in your ~/.ssh directory.

There are a number of ways to add this key to the container. I chose the simplest ... copy it to the container user's ~/.ssh directory. ~/.ssh/id_rsa is one of the default locations OpenSSH checks when authenticating, so our GitHub request picks the key up with no extra configuration.

Adding to the known_hosts File

We add GitHub's host key to the known_hosts file to avoid the nasty warning and prompt for this addition at runtime.

In my thrashing on this, I did find several posts in the ether that recommended disabling StrictHostKeyChecking, which hypothetically produces the same end result as manufacturing/mod'ing the known_hosts file. This could however leave this poor container vulnerable, since SSH would then accept whatever host key it is handed, so I chose the known_hosts route.
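
If you go the known_hosts route, it can be worth confirming that the scan really wrote an entry for the host. Here is one rough way to check, a sketch only: `has_known_host` is a hypothetical helper name of mine, and it assumes the entries are unhashed (the first field of each line is a plain, comma-separated list of host names).

```shell
#!/bin/sh
# Succeeds (exit status 0) when the given known_hosts file contains an
# unhashed entry whose host list includes exactly the given host name.
has_known_host() {
    known_hosts_file="$1"
    host="$2"
    awk -v host="$host" '
        {
            # The first field is a comma-separated list of host names.
            n = split($1, names, ",")
            for (i = 1; i <= n; i++) {
                if (names[i] == host) {
                    found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$known_hosts_file"
}
```

Inside the container, something like `has_known_host /root/.ssh/known_hosts github.com && echo "host key present"` would do the job.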

At the End of the Day ...

So at the end of the day, when I thought I would be honing my Docker skills, I actually came away with a stronger set of Unix scripting skills. Good for me all in all. I am excited about what Docker will become, and I do find the cache to be enough sugar to keep me drinking the Docker kool-aid.

I should say I appreciate not actually having to struggle with Docker. It is a nice, easy, straightforward tool with very few surprises (we won't talk about CMD versus ENTRYPOINT). Any time-consuming tasks in this adventure were directly related to my very intentional avoidance of shell scripting, which I now probably have a tiny bit more appreciation for as well.

In the words of the guy I like the most today, Chris Jones ... Good Guy Docker :) 

Tuesday, April 15, 2014

Pentaho Analytics with MongoDB

I love technology partnerships. They make our lives as technologists easier by introducing the cross sections of functionality that lie just under the surface of the products, easily missed by the casual observer. When companies partner to bring whole solutions to the market, ideally consumers get more power, less maintenance, better support and lower TCO.

Pentaho recognizes these benefits, and works hard to partner with technology companies that understand the value proposition of business analytics and big data. The folks over at MongoDB are rock stars with great vision in these spaces, so it was natural for Pentaho and MongoDB to partner up.

My colleague Bo Borland has written Pentaho Analytics with MongoDB,  a book that fast tracks the reader to all the goodness at your fingertips when partnering Pentaho Analytics and MongoDB for your analytics solutions.  He gets right to the point,  so be ready to roll up your sleeves and dig into the products right from page 1 (or nearly so).  This book is designed for technology ninjas that may have a bit of MongoDB and/or Pentaho background. In a nutshell, reading the book is a straight shot to trying out all of the integration points between the MongoDB database and the Pentaho suite of products.

You can get a copy of Pentaho Analytics with MongoDB here.  Also continue to visit the Pentaho wiki, as these products move fast.

Friday, March 07, 2014

Pentaho's Women in Tech: In Good Company

I was honored this week to be included in a blog series that showcases just a few of the great women I work with, in celebration of International Women's Day on March 8.

Check out the series; I think you'll find the common theme in the interviews interesting and inspiring. Pass on the links if you have girls in your life who could be interested in pursuing technology as a career.

Friday, December 14, 2012

Pentaho's 12 Days of Visualizations

If you are interested in the ultimate extendability of Pentaho's visualization layer, you'll love this fun holiday gift from Pentaho: 12 Days of Visualizations.  Check back each date marked for a new plugin that demonstrates Pentaho leveraging cool viz packages like Protovis, D3 and more.

Today's visualization: the Sunburst!

Merry Christmas and Happy New Year!

Thursday, September 27, 2012

Resolving "AppName is damaged and can't be opened." Don't move it to the trash!

I recently stumbled across this problem with one of Pentaho's applications. When the application was downloaded and installed on a Mac, launching the .app file resulted in "This app is damaged and can't be opened. Move to the trash".

Relatively quickly, with a few searches, we figured out that GateKeeper was the messenger, but why was she being so harsh? Our apps are unsigned (a signature improvement slated for the next release), but damaged? I was offended.

As it turns out, Apple has a decent support article that explains why you might get a "damaged..." message versus GateKeeper's standard message warning the user that the application is unsigned.

The answer to softening GateKeeper's tone (AKA getting her to only prompt with a security message rather than a "damaged" message) lies within the info.plist file within the .app. Kurtis, our .app builder, found that if he sets the following values, then the .app reverts to being a harmless unsigned .app.


I hope this solution saves someone else the heartache of deploying a "damaged" .app file.

kindest regards, 

Saturday, October 08, 2011

Pentaho and OpenMRS Integration

We have a great opportunity to explore how Pentaho can provide ETL, analytics, and reporting benefits to OpenMRS, an open source medical records platform and community interested in global health care.

Check out the first projects underway, and decide if you have time to participate:

Pentaho ETL and Designs for Dimensional Modeling

Cohort Queries as Pentaho Reporting Datasource
This project still needs a lead developer; we'd like to have these projects run in tandem.

To get involved, feel free to email me directly, or contact any of the OpenMRS mentors listed in the projects.

kindest regards and in His grace,

Tuesday, October 04, 2011

PCM11: Continuity and Change @ Pentaho

Last week, I enjoyed my third (of four) Pentaho Community meetup, this year held in Rome (Frascati), Italy. Jan Aertsen did a fantastic job summarizing the presentations; you can review them all here, including access to the presentation materials. At this particular juncture, I find myself in my longest commitment to a single company in my career. The entire ride has this very cool thread of continuity through tides of swift and constant change that come with being a bleeding edge software company.

When I look back over the past seven years, many times I focus solely on Pentaho milestones and growth, the markets we've entered and enjoyed success with, the new initiatives that take hold. PCM11 gave me a look at the global reach of success that Pentaho enjoys, creating opportunity and economy beyond the bounds of the company official. This is what makes open source make sense to me. This appeals to me.

The people that make up the Pentaho community are a talented, committed group of individuals who are growing in their own endeavors, many based on the community edition of the Pentaho BI Suite of tools. Many of our community colleagues have been committed to Pentaho from the earliest releases of 2004 and 2005. Their efforts are paying off, and while Pentaho the company doesn't get everything right, we've managed to earn the respect and partnership of some incredibly driven and talented people.

Another interesting phenomenon - community members becoming Pentaho employees, blurring any lines that get drawn at times between community and corporate.

From the ranks of the Pentaho community a well of talent has sprung - Slawo, Roland, Jan, Jens, and a handful of others. Pentaho is incredibly savvy in hiring from the community. Our community is the hotbed of Pentaho, DBA, big data, analytic and reporting knowledge, both from a project development perspective and from a solutions development perspective. How many software projects suffer from the writers not understanding the use cases? Not eating their own dog food? Well, the newest Pentaho developers have been at that bowl for some time, and the internal developers can help them keep that commitment with internal initiatives delivering Pentaho solution driven information.

And what of the other direction? Those leaving the formal Pentaho realm and working entirely community based? Well, that would be me. It's not like this is new news - I'm now infamous for my off-again, on-again relationship with formal employment :) Don't mistake me for irresponsible; I just have higher priorities. We all should be so blessed, right?

The great news is I also have reaped the benefits of a long series of lessons in BI, big data, analytics, reporting, visualizations, problem solving and code writing. So I take these lessons learned into the community and can begin to give back a little. To my fellow community members, to other open source projects, to Pentaho.

One project that has caught my attention is the OpenMRS project. OpenMRS is a medical records system platform widely deployed throughout the compromised countries of the world. OpenMRS is open source, and has a thriving community of developers, implementers, users and observers from well established world health organizations.

I intend to spend the last quarter of this year investigating integration points between Pentaho tooling and OpenMRS. OpenMRS could use more insight into their data; Pentaho is an excellent set of tools for turning raw data into information. I see synergies here :)

Soon, there will be a project page to stay informed if you're interested or would like to participate. I'll post back as soon as I have the leg work done. In the meantime, check out the OpenMRS site. It's a very rational site that gets you up to speed quickly on the project.

Cheers & all in His grace,