Puppet Modules and The Forge are Broken!


Puppet modules need to be as easy to produce as Ruby gems are. Puppet itself isn’t broken. Chef can be learnt from. There is low-hanging fruit for Puppetlabs.

Ready, set, go!

I recently completed the Puppetlabs module challenge: write a Puppet module for setting up Riak on a machine. In the course of doing this, I’ve been thinking about what Puppet is like to write modules for and how it ties into the greater software ecosystem. It should be said that I’m a relative newbie at configuration management, but I have worked with large software systems and have been active in open source projects for many years, so I have some idea of how it all ties together.

Puppet is a really nice tool for doing deployments. When I was choosing between Chef and Puppet, it won me over by having a declarative syntax rather than an imperative one, and I know from experience that a more modular and loosely coupled design is preferable in most cases. That turned out to be true: I was able to unit-test my Puppet module easily through rspec-puppet, by supplying an execution context (e.g. operating system and other system variables) and asserting on the graph that Puppet built to be applied. Contrast this with how it would be done in Chef, by running the recipe on a snapshotted virtual machine, and you’ll start to see the benefits.

Puppetlabs recently released Puppet 3.0, the next major version and the one that introduces semantic versioning for the whole product. It also does away with dynamic scoping of variables, forcing module authors to be explicit about where their module gets its data. I think this is good, because Hiera promises to make dynamic scoping much less necessary by introducing another, more structured means of getting the data: hierarchies over different data providers. Is nice, come buy!
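
To make the idea of a hierarchy concrete, here is a minimal Ruby sketch of a priority lookup, assuming each level of the hierarchy is just an in-memory hash. It is a toy model, not Hiera's implementation or API; the level names, keys and values (`node/riak01`, `riak::ring_size` and so on) are invented for illustration.

```ruby
# Toy model of a hierarchical data lookup: walk the levels from most
# specific to least specific and return the first value found.
# This is NOT Hiera's API; all names and data below are hypothetical.
HIERARCHY = [
  ['node/riak01',     { 'riak::ring_size' => 128 }],
  ['osfamily/Debian', { 'riak::package' => 'riak', 'riak::ring_size' => 64 }],
  ['common',          { 'riak::version' => '1.2.0', 'riak::ring_size' => 32 }]
].freeze

# Return the value for +key+ from the first level that defines it,
# or +default+ when no level does.
def lookup(key, default = nil, hierarchy = HIERARCHY)
  hierarchy.each do |_level, data|
    return data[key] if data.key?(key)
  end
  default
end

puts lookup('riak::ring_size')      # the most specific level wins
puts lookup('riak::version')        # falls through to the 'common' level
puts lookup('riak::missing', 'n/a') # default when no level has the key
```

The point of the sketch is that the module asks for a named key explicitly, rather than relying on whatever variable happens to be in scope where the class is declared.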

So I wrote my module just as the software was making a major transition; of course there would be complications. I’d like to start with how testing works:

Testing Puppet modules

rspec-puppet is a way of asserting on the resource graph, but it’s not enough on its own, because I also have to include the puppetlabs helpers to get the rake tasks and rspec-hiera to get Hiera support. Since Hiera is built into Puppet 3.0, I definitely think Hiera support should be built into the rspec-puppet gem, or rspec-puppet should simply rely on the puppet gem.

It turned out that I couldn’t easily create Hiera-based data with the rspec-hiera gem, because about 8 months ago Puppetlabs had removed a method that it relied on, and the testing infrastructure hadn’t been updated. Referencing the Hiera software as a gem wasn’t possible either, because it had a custom rake/build infrastructure that Bundler wouldn’t detect properly. Basically, I was stuck with the non-working tests I had, and there was no one to ask. If Puppetlabs expects people to start pushing lots and lots of modules to the Forge, they should take a page from how Ruby in general does it, and perhaps even go down the route taken by Vagrant: simply host everything as gems on rubygems.org, with some added metadata.

Also, it goes without saying that I think they should unit-test, lint and verify their own modules, because just opening them in Geppetto or running the linter on them produces multiple warnings.

Speaking of the linter: the anchor pattern seems to be strongly recommended, and modules written without it just don’t compose. Most Puppet modules simply do not use the anchor module, so they don’t compose well; in this sense Chef composes better, even when its recipes are written with worse code. On the other hand, Puppet modules written in a structured way will compose much better than Chef recipes.

This is actually a large problem, because part of the promise of configuration management is the ability to compose modules together to form your system!

As such, I believe that the linter should compile the resource graph, just like StackHammer does, and assert on it; for example, classes containing classes should anchor them. Perhaps this could be a flag to rspec-puppet.
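
As a rough illustration of what such a check could look like, the sketch below models a compiled graph as a plain Ruby hash and flags classes that declare other classes without also declaring begin/end anchors. The data layout, the `unanchored_classes` helper and the class names are assumptions made up for this example; neither the linter nor rspec-puppet exposes this API, and a real check would also have to verify the ordering relationships between the anchors and the inner classes.

```ruby
# Toy check: a class that declares other classes should also declare
# begin/end anchors so that the inner classes are contained.
# The hash layout is hypothetical, not Puppet's own graph format.
CATALOGUE = {
  'riak'          => ['Class[riak::package]', 'Class[riak::config]',
                      'Anchor[riak::begin]', 'Anchor[riak::end]'],
  'riak::package' => ['Package[riak]'],
  'ntp'           => ['Class[ntp::service]'] # declares a class, no anchors
}.freeze

# Return the names of classes that contain classes but lack both anchors.
def unanchored_classes(catalogue)
  offenders = catalogue.select do |name, resources|
    contains_class = resources.any? { |r| r.start_with?('Class[') }
    anchored = %w[begin end].all? do |edge|
      resources.include?("Anchor[#{name}::#{edge}]")
    end
    contains_class && !anchored
  end
  offenders.keys
end

unanchored_classes(CATALOGUE).each do |name|
  puts "warning: class #{name} contains classes but does not anchor them"
end
```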

Tooling support for Puppet

I used Geppetto to write the module - it’s a great tool that REALLY makes it much faster to write the modules! It should be featured prominently in the documentation, because I’m sure it would save newbies quite a lot of time.

rspec-puppet did what it promised, but I would like more documentation on how to actually assert on the resource graph, including much more documentation from Puppetlabs on what the resource graph looks like. Treat the graph classes/API as a public API and include it in the surface area that requires a major version bump to break. Deprecate methods that should no longer be used instead of just deleting them (like they did with the Hiera methods) and, again, document it.

Vim served me well for writing the rspec tests. I could have used RubyMine here as well, but at least the 3.x line of RM has had problems with any sort of ruby development that wasn’t Ruby-on-Rails development.

Publishing Puppet modules

The publishing story, even with puppetlabs_spec_helper, is pretty dire.

  • Puppet’s module command is overly picky and fails with completely non-obvious errors if your custom types (./lib/puppet/type/*) contain any errors.

  • There’s no way to declaratively specify what should be included in the module, so there’s no way I can exclude development-only files, e.g. Guardfile.

  • Since I can’t customize which files are left out of the module, I have to write custom Ruby to do that. That’s pretty unacceptable from the viewpoint of making it easy to publish one’s module.

  • The puppetlabs rake helpers don’t compose well, and there’s no way that I can remove some of them or customize them. E.g. a normal workflow could look like this:

    1. Hack
    2. Build module pkg/
    3. Hack
    4. Run linter

    The linter will complain about the copied files in pkg/ because it recursively searches the full directory tree, and I can’t modify the :lint task in rake, because it’s all hard-coded with few to no extension points.

  • The rake helpers for running the specs only delete the cloned Puppet modules if the specs succeed; otherwise they are just left there. I naturally want to run my tests for almost any code change, so the buggy deletion behaviour aside, I don’t want the cloned modules deleted at all. Again, no extension points, so I have to hack around it with a custom rake file.

  • Then there is publishing itself. I finally managed to publish the module, after a lot of hacking around tooling problems, and when I look at it on the Forge (haf/riak) it seems that my dependency on puppet-hiera is the empty string. Why? And how do I correct it? And why is that module at 0.3.0? And how do I separate that dependency between Puppet 2.7.x and 3.0? I don’t know, and the docs don’t seem to tell me either.

  • It’s pretty hard to find good documentation on how to use Vagrant with Puppet in a good way, e.g. what do I symlink where, and how do I do it? The issue at hand is running Vagrant in the directory structure of a Puppet module, with a Vagrantfile that can include dependent modules. Part of this is solved by the puppetlabs rake helpers, which clone the modules into a subfolder and symlink . into that same folder, but it’s totally non-obvious how this should be set up with Vagrant for testing. I can’t write a module based only on unit testing either, because the unit-testing support is too rough around the edges and seems to be an afterthought.

    It’s worth noting, though, that because Puppet loosely couples declaration from runtime, this should actually be pretty easy to retrofit, and I’m sure Puppetlabs can make the unit-testing support really nice.

    Step one is fixing the issues that I have reported in the module README, in this blog entry and across the web in other places.

    Step two could be to make it possible to unit-test easily inside a Vagrant box with Sahara for snapshot rollbacks, as Chef does, or alternatively to publish docs on testing with Vagrant in some other way.

  • If I want to point my module to a git repository of another module, it’s not as easy as adding it in a Gemfile. Why not? It could be.
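
Two of the workarounds mentioned in the list above, keeping development-only files out of the package and replacing a hard-coded rake task, can be sketched in plain Ruby. This is a minimal sketch, assuming a task named `:lint` has been defined by a helper library; the exclude lists, the `without_excluded` helper and the stand-in task bodies are hypothetical and only print what they would do. The only real APIs used are the ordinary rake DSL and `Rake::Task#clear`.

```ruby
require 'rake'

# Hypothetical lists of development-only files and directories.
EXCLUDED_FILES = %w[Guardfile Gemfile Gemfile.lock].freeze
EXCLUDED_DIRS  = %w[spec pkg].freeze

# Return only the paths that are neither an excluded file
# nor located under an excluded directory.
def without_excluded(paths)
  paths.reject do |path|
    EXCLUDED_FILES.include?(path) ||
      EXCLUDED_DIRS.any? { |dir| path.start_with?("#{dir}/") }
  end
end

# Stand-in for a task defined by a helper with no extension points.
task :lint do
  puts 'linting every .pp file, including the copies under pkg/'
end

# Rake::Task#clear drops the existing actions and prerequisites,
# so the task can then be redefined with different behaviour.
Rake::Task[:lint].clear
task :lint do
  # A real Rakefile would build this list with Dir.glob; it is
  # hard-coded here so the sketch runs anywhere.
  manifests = ['manifests/init.pp', 'pkg/riak/manifests/init.pp']
  without_excluded(manifests).each { |file| puts "would lint #{file}" }
end

Rake::Task[:lint].invoke
```

It works, but it is exactly the sort of custom Ruby that a declarative include/exclude list and a few extension points in the helpers would make unnecessary.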

I realize that enacting change quickly is hard; perhaps it would be a good idea to embrace the Ruby infrastructure that is already there? This could be done until there’s a critical mass of Puppet modules out there.

Puppet on Windows

I’ve worked on more .Net projects than on projects using other languages and frameworks, and I can tell that there’s a huge missing piece of infrastructure when developing on Windows. System Center just isn’t there for the ordinary developer, and 99.9% of us are ordinary developers, so there’s something missing. We don’t want to write long and hard-to-understand XML files, or equally nasty MSBuild XML files for Team Foundation Server, the so-called CI of Microsoft.

The missing piece is really deployment plus configuration management for those deployed applications. A single application is always “easy” to deploy, but once the number of applications and services starts increasing, a more powerful tool is needed.

I believe puppetlabs has a window of opportunity to establish themselves as the deployment platform of choice for .Net. We need:

  • A build system - albacore (check)
  • A versioning system - git (check)
  • A packaging system for applications - ???
  • A deployment system for packages - ???

I was hoping that the packaging system for applications would be a zip to start with (the simplest possible thing) and the deployment system a combination of Puppet and MCollective. In the longer term, perhaps I’ll create a task for albacore that wraps FPM and creates something “better”?

What I’m saying is this: Puppet and its language are versatile enough. Stop the development of features and start expanding tooling support, to gain market traction in shops that need to deploy Windows-based services, sites and apps.

That market is huge.

Closing Remarks

When I search for Puppet modules I don’t go to the Forge: because it’s too hard to publish a module there, people simply push the source code to GitHub, so that’s where I search. It should be otherwise!

I’m optimistic for the future, especially if Puppetlabs decide to dedicate a few development cycles to removing the rough edges for creating modules and writing some docs for deployment with Puppet+MC.

So in the end, the puppet module infrastructure is broken, but only in small ways that can be mended easily enough!


Henrik Feldt, Twitter @henrikfeldt