Garrettux

java.lang.NullPointerException when migrating jobs from Hudson to Jenkins

I was recently asked to stand up a Jenkins instance and migrate some jobs that had previously been created in Hudson. I simply copied ~hudson/jobs to ~jenkins on the new box, but the jobs wouldn't load. I looked at the Jenkins logs and saw some stack traces. The important parts are below:
SEVERE: Failed Loading job Production Deploy
java.lang.NullPointerException
    at hudson.model.Project.createTransientActions(Project.java:206)
So, with a little help from my friendly neighborhood internets, I found a post on Stack Overflow that helped. I added the <publishers/> tag to the job's config.xml as suggested, and still no luck. However, the most recent stack trace was pointing at a different culprit:
SEVERE: Failed Loading job Production Deploy
java.lang.NullPointerException
	at hudson.model.Project.getBuildWrappers(Project.java:121)
After comparing my busted config.xml to a working one, I noticed it was also missing a <buildWrappers/> tag at the end, which makes sense considering the code referenced by the stack trace. I added that, restarted Jenkins, and BOOMSHAKALAKA just like that I'm back in business.
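
If a lot of jobs are affected, the manual fix above can be scripted. Here is a minimal sketch in Python 3, assuming each config.xml has a single root element and that appending empty <publishers/> and <buildWrappers/> elements is enough, as it was here. The function name and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tags whose absence caused the NullPointerExceptions described above.
REQUIRED_TAGS = ("publishers", "buildWrappers")

def add_missing_tags(config_xml):
    """Return (fixed XML text, list of tags that had to be added)."""
    root = ET.fromstring(config_xml)
    added = []
    for tag in REQUIRED_TAGS:
        if root.find(tag) is None:
            # Append an empty element as a direct child of the root.
            ET.SubElement(root, tag)
            added.append(tag)
    return ET.tostring(root, encoding="unicode"), added

# Invented stand-in for a job config that is missing both tags.
sample = "<project><builders/></project>"
fixed, added = add_missing_tags(sample)
print(added)
print(fixed)
```

ElementTree does not preserve the XML declaration or the original formatting, so keep a backup of each config.xml before overwriting it.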

VirtualBox::Exceptions::COMException: Error in API call to save_settings: 2147500037

Thanks to Joel Jensen @ nervetree.com for posting a solution for this error (VirtualBox::Exceptions::COMException: Error in API call to save_settings: 2147500037), and thanks to Google for leading me to it. I stumbled across this exact error while working with Vagrant boxes, and I was scratching my head.

knife-vagrant

I've been working on the idea of a knife plugin that will spin up a Vagrant box and test a specific Chef runlist. Right now I've got it creating a VM and running Chef successfully. Next I want to start adding the test part - probably Cucumber tests. I'll be posting updates here as work progresses. If you're interested in the source, check it out at https://github.com/garrettux/knife-vagrant.

Running Vagrant on Windows

I've been tasked with creating VM templates for our developers to use. Most of them are already using VirtualBox on their desktop/laptop workstations, so a Vagrant/Chef combination seemed like a good choice. However, some of them run Windows, some run Linux, and some run Macs, so I'll have to see how each one works. So far I've been trying on Windows, using Cygwin to do the CLI stuff just because it's a little easier. Here are my notes on what I've done so far to make it work.

* install DevKit first: https://github.com/oneclick/rubyinstaller/wiki/development-kit
* in Cygwin, alias the batch file: alias vagrant='vagrant.bat'
* fix the gemspecs for the json gems (invalid date format): http://stackoverflow.com/questions/5771758/invalid-gemspec-because-of-the-date-format-in-specification
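
For the gemspec note, the linked Stack Overflow question is about gemspec files whose s.date value carries a full timestamp instead of a plain date. Here is a minimal Python 3 sketch of that rewrite, assuming the offending line looks like the sample below. The sample line and the function name are illustrative, not copied from a real gemspec.

```python
import re

# Matches "s.date = %q{YYYY-MM-DD <anything>}" and captures everything up to the date.
DATE_WITH_TIME = re.compile(r"(s\.date\s*=\s*%q\{\d{4}-\d{2}-\d{2}) [^}]*\}")

def fix_gemspec_date(text):
    """Drop the time portion from s.date values, leaving only the date."""
    return DATE_WITH_TIME.sub(r"\1}", text)

# Hypothetical gemspec line with the problematic timestamp.
line = "  s.date = %q{2011-04-21 00:00:00.000000000Z}"
print(fix_gemspec_date(line))
```

Lines that already hold a plain date are left untouched, so the function is safe to run over every gemspec in the specifications directory.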

Python Bottle REST API for pulling Oracle data into Chef

At Silverpop we're really starting to ramp up our implementation of Chef, and coming up with some really cool ways of doing things. Our application is primarily a set of Java webapps with Oracle providing the databases.

One thing I really wanted to do was auto-discovery of connection strings for the application configs. We have multiple application clusters, each cluster combining to serve one running instance of our application. We call these clusters pods.

Our DBAs maintain a database of all the TNSNames values for all our production databases. It's mainly for their own administrative use, but I wanted to pull that information into Chef. So, I wrote a little REST API that connects to the TNSNames database and returns the results of a given query in JSON format. I used Bottle as the web framework because it looked quick and easy to set up a one-off app like this.

There are three main pieces to the code below - a Daemon class (adapted from http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python), a couple of functions that express the SQL queries and translate the results, and the routes for the API that call those functions. Stuff like URLs and logins have been made all generic and whatnot.

#!/usr/local/bin/python
# dba_api.py
# exposes a REST API that returns connection strings for a pod in JSON format
# can be queried by environment or SID.
# Examples:
# http://dbserver/conn/pod/1
# http://dbserver/conn/stage/5
# http://dbserver/conn/sid/SID
# controlled like an init script, e.g. ./dba_api (stop|start|restart)


import sys, os, time, atexit
import logging
from signal import SIGTERM
import cx_Oracle
import json
import bottle
from bottle import route, run, request, abort

# connection string for the dba database
dbconn = 'USER/PASS@dbserver:PORT/SID'
progname = os.path.basename(sys.argv[0]).split('.')[0]

"""
daemon code from http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
"""
class Daemon:
        """
        A generic daemon class.

        Usage: subclass the Daemon class and override the run() method
        """
        def __init__(self, pidfile, stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'):
                self.stdin = stdin
                self.stdout = stdout
                self.stderr = stderr
                self.pidfile = pidfile

        def daemonize(self):
                """
                do the UNIX double-fork magic, see Stevens' "Advanced
                Programming in the UNIX Environment" for details (ISBN 0201563177)
                http://www.erlenstar.demon.co.uk/unix/faq_2.html#SEC16
                """
                try:
                        pid = os.fork()
                        if pid > 0:
                                # exit first parent
                                sys.exit(0)
                except OSError, e:
                        sys.stderr.write("fork #1 failed: %d (%s)\n" % (e.errno, e.strerror))
                        sys.exit(1)

                # decouple from parent environment
                os.chdir("/")
                os.setsid()
                os.umask(0)

                # do second fork
                try:
                        pid = os.fork()
                        if pid > 0:
                                # exit from second parent
                                sys.exit(0)
                except OSError, e:
                        sys.stderr.write("fork #2 failed: %d (%s)\n" % (e.errno, e.strerror))
                        sys.exit(1)

                # redirect standard file descriptors
                sys.stdout.flush()
                sys.stderr.flush()
                si = file(self.stdin, 'r')
                so = file(self.stdout, 'a+')
                se = file(self.stderr, 'a+', 0)
                os.dup2(si.fileno(), sys.stdin.fileno())
                os.dup2(so.fileno(), sys.stdout.fileno())
                os.dup2(se.fileno(), sys.stderr.fileno())

                # write pidfile
                atexit.register(self.delpid)
                pid = str(os.getpid())
                file(self.pidfile,'w+').write("%s\n" % pid)

        def delpid(self):
                os.remove(self.pidfile)

        def start(self):
                """
                Start the daemon
                """
                # Check for a pidfile to see if the daemon already runs
                try:
                        pf = file(self.pidfile,'r')
                        pid = int(pf.read().strip())
                        pf.close()
                except IOError:
                        pid = None

                if pid:
                        message = "pidfile %s already exist. Daemon already running?\n"
                        sys.stderr.write(message % self.pidfile)
                        sys.exit(1)

                # Start the daemon
                self.daemonize()
                self.run()

        def stop(self):
                """
                Stop the daemon
                """
                # Get the pid from the pidfile
                try:
                        pf = file(self.pidfile,'r')
                        pid = int(pf.read().strip())
                        pf.close()
                except IOError:
                        pid = None

                if not pid:
                        message = "pidfile %s does not exist. Daemon not running?\n"
                        sys.stderr.write(message % self.pidfile)
                        return # not an error in a restart

                # Try killing the daemon process
                try:
                        while 1:
                                os.kill(pid, SIGTERM)
                                time.sleep(0.1)
                except OSError, err:
                        err = str(err)
                        if err.find("No such process") > 0:
                                if os.path.exists(self.pidfile):
                                        os.remove(self.pidfile)
                        else:
                                print str(err)
                                sys.exit(1)

        def restart(self):
                """
                Restart the daemon
                """
                self.stop()
                self.start()

        def run(self):
                """
                You should override this method when you subclass Daemon. It will be called after the process has been
                daemonized by start() or restart().
                """


def get_db_by_pod( pod, lsite ):
    # connect with the module-level dbconn string defined at the top of the script
    con = cx_Oracle.connect(dbconn)
    cur = con.cursor()
    cur.prepare('select sid, vip, port, host, port2 from db_info where product = :pod and businessline = :lsite')
    cur.execute(None, {'pod': pod, 'lsite': lsite})
    # lower-cased column names become the keys of each row dict
    desc = [d[0].lower() for d in cur.description]
    results = dict(enumerate([dict(zip(desc,line)) for line in cur.fetchall()]))
    cur.close()
    con.close()
    return results

def get_db_by_sid( sid ):
    # connect with the module-level dbconn string defined at the top of the script
    con = cx_Oracle.connect(dbconn)
    cur = con.cursor()
    cur.prepare('select sid, vip, port, host, port2 from db_info where sid = :sid')
    cur.execute(None, {'sid': sid})
    # lower-cased column names become the keys of each row dict
    desc = [d[0].lower() for d in cur.description]
    results = dict(enumerate([dict(zip(desc,line)) for line in cur.fetchall()]))
    cur.close()
    con.close()
    return results


@route('/conn/sid/:sid', method='GET')
def get_conn(sid):
    sid = sid.upper()
    entity = get_db_by_sid(sid)
    if not entity:
        abort(404, 'No connection information found for %s' % sid)
    return entity

@route('/conn/pod/:pod', method='GET')
def get_pod(pod):
    pod = 'POD' + pod
    entity = get_db_by_pod(pod, pod)
    if not entity:
        abort(404, 'No connection information found for %s' % pod)
    return entity

@route('/conn/stage/:pod', method='GET')
def get_stage(pod):
    # named get_stage so it no longer shadows get_pod above
    pod = 'POD' + pod
    entity = get_db_by_pod(pod, 'STAGE')
    if not entity:
        abort(404, 'No connection information found for %s' % pod)
    return entity

class MyDaemon(Daemon):
    def run(self):
        run(host='0.0.0.0', port=9090)

if __name__ == "__main__":
        daemon = MyDaemon('/var/run/%s.pid' % progname)
        if len(sys.argv) == 2:
                if 'start' == sys.argv[1]:
                        daemon.start()
                elif 'stop' == sys.argv[1]:
                        daemon.stop()
                elif 'restart' == sys.argv[1]:
                        daemon.restart()
                else:
                        print "Unknown command"
                        sys.exit(2)
                sys.exit(0)
        else:
                print "usage: %s start|stop|restart" % sys.argv[0]
                sys.exit(2)

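The one-liner that shapes the query results is the least obvious part of the script. This small Python 3 sketch uses made-up column names and rows as stand-ins for cur.description and cur.fetchall(), with no Oracle connection involved, to show roughly what the API ends up returning.

```python
import json

# Stand-ins for cur.description and cur.fetchall(); all values are invented.
description = [("SID",), ("VIP",), ("PORT",), ("HOST",), ("PORT2",)]
rows = [("DB1", "db1-vip", 1521, "dbhost1", 1522),
        ("DB2", "db2-vip", 1521, "dbhost2", 1522)]

# Lower-case the column names, pair them with each row, and number the rows.
desc = [d[0].lower() for d in description]
results = dict(enumerate(dict(zip(desc, row)) for row in rows))

# JSON object keys are always strings, so the row numbers are serialized as "0", "1", ...
payload = json.dumps(results, sort_keys=True)
print(payload)
```

Because the row numbers are serialized as strings, a consumer of the JSON sees the rows keyed by "0", "1", and so on, not by integers.
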
Now, once that was up and running, I needed to be able to consume the JSON returned by the API into Chef. I created a cookbook called "tnsnames", consisting of one recipe. The node[:dba_api][:host] and node[:dba_api][:port] attributes are set in the roles we created for each pod.

ruby_block "tnsnames" do
  block do
    begin
      pod = node[:pod_name] or raise Chef::Exceptions::AttributeNotFound, "Could not determine pod name.  Ensure that the pod role is applied to this node."
      m = pod.match /(.+)([0-9])/
      lsite = m[1]
      podnum = m[2]
      rest_url = "http://#{node[:dba_api][:host]}:#{node[:dba_api][:port]}"
      rest = Chef::REST.new(rest_url)
      conn = rest.get_rest("conn/#{lsite}/#{podnum}")
      Chef::Log.debug(conn)
      node[:tnsnames] = conn
      Chef::Log.info("Got TNS names for #{pod} successfully from API at http://#{node[:dba_api][:host]}:#{node[:dba_api][:port]}/conn/#{lsite}/#{podnum}")
    rescue Chef::Exceptions::AttributeNotFound => e
      Chef::Log.error(e)
    rescue StandardError => e
      Chef::Log.error("Getting TNS names for #{pod} from DBA API failed: #{e}")
    end
  end
end

The end result is a node[:tnsnames] attribute: a hash of the SID, VIP, port number, and server hostname for the database.
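
The recipe splits the pod name into a site prefix and a pod number with /(.+)([0-9])/. This quick Python 3 sketch runs the same pattern to show how it behaves. The pod names are examples, and the stricter pattern in the last call is only an assumption about what a multi-digit pod number would need, not something from the recipe.

```python
import re

def split_pod(pod, pattern=r"(.+)([0-9])"):
    """Split a pod name into (site prefix, pod number) using the given pattern."""
    m = re.match(pattern, pod)
    if m is None:
        raise ValueError("cannot parse pod name: %r" % pod)
    return m.group(1), m.group(2)

print(split_pod("stage5"))                    # ('stage', '5')
# The greedy (.+) keeps every digit but the last one in the prefix.
print(split_pod("pod12"))                     # ('pod1', '2')
# Hypothetical stricter pattern: lazy prefix, one or more trailing digits.
print(split_pod("pod12", r"(.+?)([0-9]+)$"))  # ('pod', '12')
```

With single-digit pod numbers the original pattern does exactly what the recipe needs. The difference only shows up once a pod number reaches two digits.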

Chef nodes not being indexed

There's a character escaping bug in chef-solr-indexer 0.9.x that prevents a node from being indexed when Ohai returns certain escapable characters in a key (as opposed to a value). More detail in this JIRA. For the nodes not being indexed, you will get errors like this in /var/log/chef/solr-indexer.log:
[Wed, 04 May 2011 17:23:21 -0400] INFO: Indexing node 856a0f74-65a9-4d55-931f-d2ee555b9544 from chef status error Broken pipe}
You can match the id (856a0f74-65a9-4d55-931f-d2ee555b9544) in the errors to the node in CouchDB. This should be fixed in Chef 0.10.0, but a workaround for 0.9.x is to install the fast_xs gem on the Chef server, then bounce chef-solr and chef-solr-indexer.
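
To collect the ids of the affected nodes, the log can be scanned for lines like the one above. Here is a minimal Python 3 sketch, assuming every failure line has the same "Indexing node <id> ... error" shape as that single sample. The function name is made up.

```python
import re

# A 36-character node id followed, later on the same line, by the word "error".
NODE_ERROR = re.compile(r"Indexing node ([0-9a-f-]{36}) .*error")

def failed_node_ids(lines):
    """Return the unique node ids from indexer error lines, in order seen."""
    ids = []
    for line in lines:
        m = NODE_ERROR.search(line)
        if m and m.group(1) not in ids:
            ids.append(m.group(1))
    return ids

# The sample line from the log excerpt above.
sample = ["[Wed, 04 May 2011 17:23:21 -0400] INFO: Indexing node "
          "856a0f74-65a9-4d55-931f-d2ee555b9544 from chef status error Broken pipe}"]
print(failed_node_ids(sample))
```

Passing an open file handle for /var/log/chef/solr-indexer.log instead of the sample list would work the same way, since the function only iterates over lines.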

Setting passwords in bulk with expect

Yes, this is probably a topic that's been covered many times before in many places, but I just had to do it for the first time, so I figured I'd write it up. We have an email server that our QA analysts use as a destination for mail sent from our application. Recently the requirement came up to have 100 mailboxes (in this case, Unix user accounts) created quickly. I figured a quick and dirty approach would be to handle the "passwd" command with expect, and wrap it in a bash loop with useradd. So with a little googling I found this guy:

[root@qamail ~]# cat user_bulk.expect
#!/usr/bin/expect
spawn passwd [lindex $argv 0]
set password [lindex $argv 1]
expect "password:"
send "$password\r"
expect "password:"
send "$password\r"
expect eof
Once I made sure it worked the way I wanted, I fired up the loop:

[root@qamail ~]# for a in `seq 1 100`; do useradd qa${a}; ./user_bulk.expect qa${a} somepassword; done
spawn passwd qa1
Changing password for user qa1.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
spawn passwd qa2
Changing password for user qa2.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
spawn passwd qa3
Changing password for user qa3.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
And so on. I was especially glad I took the time to do this, because immediately after I created the accounts with a super-secure password (not shown above), QA asked me to change it, for all 100 accounts, to something simple and hackable that they could remember. Go figure.
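
An alternative to driving passwd with expect is chpasswd, which reads user:password pairs from standard input. That is a different technique from the one used above. This minimal Python 3 sketch only builds the input lines, assuming the same qa1 through qa100 naming. The function name is made up.

```python
def chpasswd_lines(prefix, count, password):
    """Build 'user:password' lines in the format chpasswd reads from stdin."""
    return ["%s%d:%s" % (prefix, n, password) for n in range(1, count + 1)]

lines = chpasswd_lines("qa", 100, "somepassword")
print(lines[0])    # first generated pair
print(len(lines))  # number of accounts covered
```

Feeding the joined lines to chpasswd's standard input as root would set all the passwords in one call, which also makes a later bulk change a one-liner.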

VMware Project Onyx

Last night I stumbled on Project Onyx, a tool from VMware labs that turns mouse clicks into PowerCLI code.  Basically it's a proxy that runs on your local machine and intercepts and interprets the SOAP traffic between your VI client and the server. I haven't done much with PowerCLI, but I've been spending a lot of time messing with the VMware VI Java API - specifically coding against it in JRuby.  So far, this is an awesome tool if you're trying to learn VMware automation.  I set up Onyx following the video on their site, and had it up and running within a few minutes.  In my first attempt I added some RAM to a VM and powered it on, and immediately Onyx gave me back this little gem:
# ------- ReconfigVM_Task -------

$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.changeVersion = "2010-12-07T18:12:36.952652Z"
$spec.memoryMB = 2048

$_this = Get-View -Id 'VirtualMachine-vm-4590'
$_this.ReconfigVM_Task($spec)

# ------- PowerOnMultiVM_Task -------

$vm = New-Object VMware.Vim.ManagedObjectReference[] (1)
$vm[0] = New-Object VMware.Vim.ManagedObjectReference
$vm[0].type = "VirtualMachine"
$vm[0].value = "vm-4590"

$_this = Get-View -Id 'Datacenter-datacenter-21'
$_this.PowerOnMultiVM_Task($vm, $null)
So far I'm really excited about the possibilities here.  I'll post code once I get some working examples that I've derived from the Onyx output.

My First Experiences with Vijava API, JRuby, and Chef API

Wow, that's a lot of stuff.  Nonetheless, my first learning experiences with JRuby, the VMware VI Java API, and coding against the Chef API were all intertwined. I've been working with Chef for about six months now, and I've really gotten to like it.  I started learning Ruby at about the same time, and the two definitely go hand in hand.  I've also been working with VMware for a few years, and recently decided it was time to start learning how to automate VMware administration, when I stumbled on Steve Jin's VI Java API.  At first glance I thought I couldn't do anything with it, because I know nothing about Java.  Then I found Steve's post about using JRuby, and things started to happen. So I wrote some code.  I'm not going to claim it's good code, but it works.  Basically just wrapped up some functions I thought I would use regularly in methods - in a file called vijava.rb that I "require" in other scripts to do one-off tasks:
require 'java'
require 'dom4j-1.6.1.jar'
require 'vijava2120100715.jar'

import java.net.URL
import com.vmware.vim25.ManagedObjectReference
import com.vmware.vim25.VirtualMachineSnapshotInfo
import com.vmware.vim25.VirtualMachineSnapshotTree
module VIJava
  include_package "com.vmware.vim25.mo"
end

def time_diff_ms(start, finish)
   (finish - start) * 1000.0
end

def vcenter_status(rootFolder)
  vms = VIJava::InventoryNavigator.new(rootFolder).searchManagedEntities("VirtualMachine")
  hosts = VIJava::InventoryNavigator.new(rootFolder).searchManagedEntities("HostSystem")
  pools = VIJava::InventoryNavigator.new(rootFolder).searchManagedEntities("ResourcePool")

  puts "#{hosts.length} Managed Hosts found"
  puts "#{pools.length} Resource Pools found"
  puts "#{vms.length} Virtual Machines found"
end

def print_vm_status(vm)
  puts "-" * 30
  puts "VM name: " + vm.getName()
  puts "Guest OS: " + vm.getConfig().getGuestFullName()
  puts "Multiple snapshots supported: " + "#{vm.getCapability().isMultipleSnapshotsSupported()}"
  puts "VMware Tools Status: " + "#{vm.getPropertyByPath("guest.toolsStatus")}"
  list_networks(vm)
end

def list_all_vms(vms)
  vms.each { |vm| print_vm_status(vm) }
end

def list_tools_status(vms)
  puts "Virtual Machines with tools status not ok:\n"
  count = 0
  vms.each do |vm|
    unless "#{vm.getPropertyByPath("guest.toolsStatus")}" == "toolsOk"
      print_vm_status(vm)
      count += 1
    end
  end
  puts "\n#{count} VMs found with tools status not ok"
end

def list_networks(vm)
  a = []
  vm.networks.each { |net| a.push(net.name) }
  puts "Networks: #{a.join(",")}"
end

def list_snapshots(vm)
  snap_info = vm.getSnapshot()
  if snap_info == nil
    puts "#{vm.getName()} has no snapshots"
  else
    snaptree = snap_info.getRootSnapshotList()
    puts "Snapshots: "
    get_snapshots(snaptree)
  end
end

# dont call get_snapshots method directly.  use list_snapshots(vm) instead.
def get_snapshots(snaptree)
  snaptree.each do |snapshot|
    puts snapshot.getName()
    child_snapshot_list = snapshot.getChildSnapshotList()
    unless child_snapshot_list == nil
      get_snapshots(child_snapshot_list)
    end
  end
end

def create_snapshot(vm, name, description, dump, quiesce)
  task = vm.createSnapshot_Task(name, description, dump, quiesce)
  # this seems bad, but task.waitForMe() == Task.SUCCESS gave me "uninitialized constant Task (NameError)"
  # also tried VIJava::Task.SUCCESS since com.vmware.vim25.mo is in module VIJava - and got this:
  # undefined method `SUCCESS' for Java::ComVmwareVim25Mo::Task:Class (NoMethodError)
  # so just waiting for a string seems wrong, but it also appears to work.
  if task.waitForMe() == "success"
    puts "Snapshot #{name} created for VM #{vm.getName()}"
  end
end

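# getSnapshotInTree is not defined in this file; it has to be provided elsewhere.
# removechild says whether child snapshots are removed too (the false default is an assumption).
def remove_snapshot(vm, snapshot, removechild = false)
  snap = VIJava::getSnapshotInTree(vm, snapshot)
  if snap == nil
    puts "snapshot #{snapshot} does not exist"
  else
    task = snap.removeSnapshot_Task(removechild)
    if task.waitForMe() == "success"
      puts "Removed snapshot: #{snapshot}"
    end
  end
end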

def remove_all_snapshots(vm)
  task = vm.removeAllSnapshots_Task()
  if task.waitForMe() == "success"
    puts "Removed all snapshots for #{vm.getName()}"
  end
end
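
The recursion in get_snapshots is easier to see without the VMware objects in the way. This minimal Python 3 sketch does the same depth-first walk over plain dicts. The dict layout and the snapshot names are invented for illustration, and the indentation is an addition so the nesting shows in the output.

```python
def snapshot_names(snaptree, depth=0):
    """Walk a snapshot tree depth-first and return indented snapshot names."""
    names = []
    for snapshot in snaptree:
        names.append("  " * depth + snapshot["name"])
        # A missing or None child list means this snapshot is a leaf.
        names.extend(snapshot_names(snapshot.get("children") or [], depth + 1))
    return names

# Invented tree: one root snapshot with two children, one of which has a child.
tree = [{"name": "base", "children": [
    {"name": "patched", "children": [{"name": "tested"}]},
    {"name": "experiment", "children": None},
]}]
for line in snapshot_names(tree):
    print(line)
```
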
So, that got me part of the way to where I wanted to be.  But I still had to hardcode a list of VMs to snapshot, which seems ugly.  Our application is grouped into clusters of VMs that, together, run one instance of our application.  We call these clusters "pods."  What if I want to snapshot an entire pod?  I don't want to have to hardcode a list of VMs every time.  That's where Chef comes in.  We're already using Chef to manage a good amount of information about our infrastructure, and each VM (node, in chef terms) has a role that identifies it as a member of a specific pod. So I should be able to give Chef a pod name, get a list of nodes, then tell the VI Java API to snapshot those VMs, right?  Yep - here's chef_api.rb:
require 'rubygems'
require 'chef'

def list_nodes_fqdn(rest)
  nodes = rest.get_rest("/nodes")
  nodes.keys.each do |key|
    node = rest.get_rest("/nodes/#{key}")
    attr = node.attribute
    puts attr[:fqdn]
  end
end

def get_nodes_in_pod(rest, pod)
  allnodes = []
  query_string = "role:#{pod}"
  search = rest.get_rest("/search/node?q=#{query_string}")
  search["rows"].each do |node|
    attr = node.attribute
    allnodes.push attr[:fqdn]
  end
  allnodes
end

# example usage
#Chef::Config.from_file("/home/mgarrett/.chef/knife.rb")
#rest = Chef::REST.new("http://CHEF.EXAMPLE.COM:4000")
#list_nodes_fqdn(rest)
#n = get_nodes_in_pod(rest, "stage2")
#puts n.inspect
So now I can send a search query to the Chef API with a pod name, get back a list of nodes, and tell VMware to make snapshots, in snapshot.rb:
require 'vijava'
require 'chef_api'

pod = ARGV[0]

unless ARGV[0]
  puts "Usage: #{$0} [pod]\nSnapshot all Virtual Machines in [pod]"
  exit
end

vi_api_url = "https://VCENTER.EXAMPLE.COM/sdk"
start = Time.now
service_instance = VIJava::ServiceInstance.new(URL.new(vi_api_url), "USERNAME", "PASSWORD", true)
stop = Time.now
puts "Time taken to instantiate API connection to #{vi_api_url}: #{time_diff_ms start, stop} ms"
rootFolder = service_instance.getRootFolder()

Chef::Config.from_file("/home/mgarrett/.chef/knife.rb")
rest = Chef::REST.new("http://CHEF.EXAMPLE.COM:4000")
vms = get_nodes_in_pod(rest, "#{pod}")
snapshot_name = Time.now.strftime("%m%d%Y")
snapshot_description = "Snapshot for DR"

vms.each do |hostname|
  node = rest.get_rest("/nodes/#{hostname}")
  attr = node.attribute
  if attr[:virtualization][:emulator] == "vmware" && attr[:virtualization][:role] == "guest"
    vm = VIJava::InventoryNavigator.new(rootFolder).searchManagedEntity("VirtualMachine", "#{hostname}")
    puts "#{hostname}:"
    create_snapshot(vm, snapshot_name, snapshot_description, false, false)
  else
    puts "#{hostname} is not a virtual machine, cannot perform snapshot operation"
    next
  end
end

service_instance.getServerConnection().logout()
Pretty sweet.  Oh, and I almost forgot - Chef is running on JRuby!  It was way easier than I thought it would be.  I just ran "jruby -S gem install chef jruby-openssl" and I was in business.
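
The guest check in snapshot.rb reads two nested attributes directly, which would likely raise on a node that has no virtualization data at all. This minimal Python 3 sketch shows the same check written defensively. It assumes such a node simply lacks the key, which is an assumption about what Ohai reports on physical hardware, and the hostnames are made up.

```python
def is_vmware_guest(attrs):
    """True when the node attributes describe a VMware guest."""
    # Treat a missing or empty virtualization section as "not a guest".
    virt = attrs.get("virtualization") or {}
    return virt.get("emulator") == "vmware" and virt.get("role") == "guest"

# Invented node attributes covering the three interesting cases.
nodes = {
    "web1.example.com": {"virtualization": {"emulator": "vmware", "role": "guest"}},
    "db1.example.com": {},  # physical box with no virtualization data
    "kvm1.example.com": {"virtualization": {"emulator": "kvm", "role": "host"}},
}
to_snapshot = [name for name, attrs in nodes.items() if is_vmware_guest(attrs)]
print(to_snapshot)
```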

VirtualBox Guest Additions on Ubuntu 10.10

I came across this helpful little tidbit today on the interwebs and figured I'd share: http://ubuntu-tutorials.com/2010/09/08/install-guest-additions-on-ubuntu-10-10-beta-workaround/