Thursday, 27 February 2014

Relocating Another Data Centre

I recently took part in another data centre relocation project. I was one of a number of project managers moving servers out of a 1,300-server data centre; I moved about 200 and decommissioned another 50. This time I was directly planning and executing moves, so my role was different from the one I had on my previous project. It was good to experience a move from another vantage point.

The project was successful in the end. I have to say that there were a number of lessons learned, which goes to prove that no matter how many times you do something, there's always something more to learn.

Unlike my previous experiences, there were three major organizations working together on this relocation: the customer and two IT service providers to the customer. All organizations had good, dedicated, capable people, but we all had, at a minimum, a couple of reporting paths. That in itself was enough to add complication and effort to the project.

The senior project manager identified this right from the start and made a real effort to compensate for and mitigate it. We ran a number of sessions to get everyone on the same page with respect to methodology. Our core team was cohesive and we all adhered to the methodology. In fact, across the project I think it's safe to say that the front-line people did as much as they could to push toward the project goals.

Despite our best efforts, we all, across the three organizations, had to devote significant effort to satisfying our own organization's needs. It's worth noting that much of this is simply necessary -- organizational governance is a big issue in the modern economy, and appearing to have management control is a business reality.

So if you're planning a relocation, take a look at the organizational structures that will be involved, and take them into account when planning your data centre relocation project.

Friday, 24 January 2014

Time Zone in Rails

There’s pretty good info out there about using time zones in Rails, and Rails itself does a lot of the heavy lifting. The Railscast pretty much covers it. It’s only missing a discussion of using Javascript to figure out the client browser’s time zone.

Time Zone from the Browser

To get the time zone from the browser, use the detect_timezone_rails gem. The instructions give you what you need to know to set up a form with an input field that will return the time zone that the browser figured out. That would work perfectly if you were implementing a traditional web site sign-up/sign-in form.
However, I needed to do something different. Since I’m using third party identity providers (Google, Twitter, Facebook, etc.) via the excellent Omniauth gems, I needed to be able to put the time zone as a parameter on the URL of the identity provider’s authorization request. Omniauth arranges for that parameter to come back from the identity provider, so it’s available to my app’s controller when I set up the session.
To add the parameter, I added this jQuery script to the head of the welcome page:
<script type="text/javascript">
  $(document).ready(function() {
    $('a.time_zone').each(function() {
      this.href = this.href + "?time_zone=" +
        encodeURIComponent($().get_timezone());
    });
  });
</script>
This added the time zone, appropriately escaped, to the URL for the identity provider (the href of the <a> elements). This worked because I had set each of the links to the identity providers to have class="time_zone", like this:
<div class="idp">
  <%= link_to image_tag("sign-in-with-twitter-link.png", alt: "Twitter"), 
    "/auth/twitter", 
    class: "time_zone" %></div>
In the controller, I did this (along with all the other logging in stuff):
if env["omniauth.params"] &&
  env["omniauth.params"]["time_zone"]
  tz = Rack::Utils.unescape(env["omniauth.params"]["time_zone"])
  if user.time_zone.blank? 
    user.time_zone = tz
    user.save!
    flash.notice = "Your time zone has been set to #{user.time_zone}." +
      " If this is wrong," +
      " please click #{view_context.link_to('here', edit_user_path(user))}" +
      " to change your profile."
  elsif user.time_zone != tz
    flash.notice = "It appears you are now in the #{tz} time zone. " +
      "Please click #{view_context.link_to(edit_user_path(user), 'here')}" +
      " if you want to change your time zone."
  end
else
  logger.error("#{user.name} (id: #{user.id}) logged in with no time zone from browser.")
end      
Of course, you may want to do something different in your controller.

Testing Time Zones

However you get your time zones, you need to be testing your app to see how it works with different time zones. YAML, at least for a Rails fixture, interprets something that looks like a date or time as UTC. So by default, that’s what you’re testing with. But that might not be the best thing.
I had read that a good trick for testing is to pick a time zone that isn’t the one your computer is in. Finding such a time zone might be hard if you have contributors around the world. I like the Samoa time zone for testing: Far away from UTC, not too many people living in the time zone, and it has DST.
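If you want to double-check what a named Rails zone maps to, here’s a quick console sketch (the exact offset you see depends on your tzdata):
zone = ActiveSupport::TimeZone['Samoa']   # Rails maps this name to Pacific/Apia
zone.utc_offset / 3600                    # 13 on recent tzdata, i.e. far from UTC
zone.parse('2014-01-30 12:59:43')         # a TimeWithZone in the Samoa zone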
If you want a particular time zone in your fixtures, you have to use ERB. For example, in my fixtures I might put this:
created_at: <%= Time.find_zone('Samoa').parse('2014-01-30T12:59:43.1') %>
And in the test files, something like this:
test "routines layout" do
  Time.zone = 'Samoa'
  correct_hash = {
    routines(:routine_index_one)=> {
      Time.zone.local(2014, 01, 30)=> [
        completed_routines(:routine_index_one_one)
      ],
      ...

Gotchas

I found a few gotchas that I hadn’t seen mentioned elsewhere:
  • Rails applies the time zone magic when it loads records from the database, so if you change your time zone after you retrieve the data, you have to force a requery (e.g. reload the record), or the cached times in the model will still be in the old zone. This shouldn’t be a problem when running tests, but it is when you’re using the console to figure things out
  • You can’t use database functions to turn times into dates, as these won’t use the time zone. No group by to_date(...) or anything like that. Both gotchas are sketched below
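Here’s a minimal console sketch of both gotchas. It assumes a Routine model with a created_at column; the model name is just for illustration:
# Gotcha 1: attribute values are cached with the zone they were first read in.
Time.zone = 'UTC'
routine = Routine.first        # hypothetical model, standing in for whatever you have
routine.created_at             # a TimeWithZone in UTC
Time.zone = 'Samoa'
routine.created_at             # may still come back in UTC -- the cast value was cached
routine.reload.created_at      # force a requery; now converted using the Samoa zone

# Gotcha 2: don't ask the database to convert times to dates; do it in Ruby,
# where the time zone conversion has already been applied.
Routine.all.group_by { |r| r.created_at.to_date }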

Tuesday, 7 January 2014

Self-Referential, Polymorphic, STI, Decorated, Many-to-Many Relationship in Rails 4

Preamble

I wanted to model connections à la connections in LinkedIn or Facebook in a Rails application. This means a many-to-many association between instances of the same class. That caused me some grief trying to get it hooked up right because you can’t rely on Rails to figure everything out.
The other trick in my application is that the people involved in the connections might be users who have registered to use the application, or they might be people created by a registered user, but who aren’t registered to use the application.
Concretely, and hopefully more clearly, I have “users”, who have registered, and I have people who can be involved in connections. In my app the people who aren’t registered users are “patients”.
In the course of trying to get this all to work I stumbled across three approaches to this type of problem:
  1. Polymorphic classes
  2. Single Table Inheritance (STI)
  3. Decorator pattern
Combining the many-to-many association with the two classes took a lot of work to get straight. The Rails Guides were a great starting point, but I find that specifying Rails associations can be tricky if it’s not completely straightforward, and especially when you start chaining them together.
In the end, I decided to go with the Decorator pattern. But I’ll start with the one I threw out first: Polymorphic.

Polymorphic

I got pretty far with polymorphic associations, but I couldn’t figure out how I was going to get a list of all people (patients and users) connected to another person. I could either get all patients or all users from the methods that the Rails associations gave me, but not a list of all together.
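For reference, here’s roughly the kind of polymorphic setup I was experimenting with (the names are hypothetical, not my final code). The catch is that a has_many :through over a polymorphic source needs a source_type, so each association only ever returns one class:
class Link < ActiveRecord::Base
  belongs_to :party_a, polymorphic: true   # may point at a User or a Patient
  belongs_to :party_b, polymorphic: true
end

class User < ActiveRecord::Base
  has_many :links, as: :party_a
  has_many :connected_users, through: :links, source: :party_b, source_type: "User"
  has_many :connected_patients, through: :links, source: :party_b, source_type: "Patient"
  # No single generated association returns users and patients together.
end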
I realized in writing the preamble above that I probably should have realized that what I was trying to model wasn’t really a polymorphic situation. Polymorphic in the examples I saw was used to connect an object to another object from any one of a number of unrelated classes. Of course, hindsight is 20/20.
This post convinced me that trying to get a list of all people wasn’t going to come naturally from a polymorphic approach, so I stopped pursuing it.

Single Table Inheritance

I got fired up about single table inheritance (STI) as I was reading about how to make the polymorphic approach work. A good brief write up is here: http://blog.thirst.co/post/14885390861/rails-single-table-inheritance. The Railscast is here: http://railscasts.com/episodes/394-sti-and-polymorphic-associations (sorry, it’s a pro Railscast so it’s behind a paywall).
Others say you shouldn’t use STI because it can cause problems. One problem is that if an object’s type can change, and change because of user input, it’s hard to handle: the views and controllers are tied to a particular class, so you can’t change the object’s type based on user input.
So here’s the code. First, create the models:
rails g model person name:string type:string provider:string uid:string
rails g model link person_a:references person_b:references b_is:string
person.rb
class Person < ActiveRecord::Base
  has_many :links, foreign_key: "person_a_id"
  has_many :people, through: :links, source: :person_b
  scope :patients, -> { where(type: "Patient") }
  scope :users, -> { where(type: "User") }
end
user.rb (obviously there will be functionality here, but this is what I needed to get the associations to work):
class User < Person
end
patient.rb (as with user.rb, functionality will come later):
class Patient < Person
end
link.rb
class Link < ActiveRecord::Base
  belongs_to :person_a, class_name: "Person"
  belongs_to :person_b, class_name: "Person"
end
It was a little hard to get the associations to work. The key to making the has_many :links,... in person.rb work was the class_name: "Person" option on the belongs_to associations in link.rb.
With the above, I can do things like:
person = Person.find(1)
person.people.first.name
person.people.patients.first.name
person.people.users.first.name
That’s all pretty sweet, and I really considered using this approach. In fact, I may return to it. There’s a lot left to do with my application. However, I’m pretty sure that I will need to deal with cases like a registered user corresponding to multiple patients (e.g. people get created under different names). Eventually I need a way to consolidate them.

Decorator

In the end, perhaps the simplest approach was the best. I just decorated a person with an instance of a user when the person is a registered user. (This allows multiple people per user, which might be useful for consolidating duplicate people.)
Here’s what I did:
Generate the models:
rails g model link person_a:references person_b:references b_is:string
rails g model person user:references name:string
rails g model user uid:string name:string provider:string
person.rb
require 'person_helper'

class Person < ActiveRecord::Base
  belongs_to :user
  has_many :links, foreign_key: :person_a_id
  has_many :people, through: :links, source: :person_b

  include PersonHelper
end
I thought the person model should have has_one instead of belongs_to, but that would put the foreign key in the wrong model.
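To see why, here’s roughly what the person migration generated above (rails g model person user:references name:string) looks like in Rails 4; the user_id foreign key lands in the people table, which is exactly what belongs_to :user expects:
class CreatePeople < ActiveRecord::Migration
  def change
    create_table :people do |t|
      t.references :user, index: true   # the user_id column lives on people
      t.string :name

      t.timestamps
    end
  end
end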
user.rb
require 'person_helper'

class User < ActiveRecord::Base
  has_many :identities, class_name: "Person"
  has_many :links, through: :identities
  has_many :people, through: :links, source: :person_b

  include PersonHelper
end
lib/person_helper.rb
module PersonHelper
  def users
    people.select { |person| ! person.user_id.nil? }
  end

  def patients
    people.select { |person| person.user_id.nil? }
  end
end
link.rb
class Link < ActiveRecord::Base
  belongs_to :person_a, class_name: "Person"
  belongs_to :person_b, class_name: "Person"
end
With the above, I can do things like:
person = Person.find(1)
person.people.first.name
person.patients.first.name
person.users.first.name
user = User.find(2)
user.users.first.name
Again, sweet. Same number of files as the STI version. Instead of subclassing, common functionality is handled by a mixin module.

Postscript

Another thing people don’t seem to like about STI is that it’s easy to end up with a big table full of all sorts of columns used only in a few places. Most modern database management systems aren’t going to waste a significant amount of space for unused columns, so I’m not sure what the problem is.
However, it got me wondering whether there’s a way in Rails to have more than one table under a model. Or more to the point, could you have a table for the base model class and a different table for each of the subclasses, and have Rails manage all the saving and retrieving?
I’m sure I’m not the first person to think of this. But I’m not going to go looking for it right now.

Other Resources

Rails 4 guides on associations: http://guides.rubyonrails.org/association_basics.html and migrations: http://guides.rubyonrails.org/migrations.html.
Ryan Bates’ Railscast on self-referential associations: http://railscasts.com/episodes/163-self-referential-association, and on polymorphic associations: http://railscasts.com/episodes/154-polymorphic-association.

Thursday, 2 January 2014

Moving to rbenv and Installing Rails on Linux Mint 13

I'm back to doing a bit of Rails. As always, the world has moved on. Rails is at 4.0.2, and Ruby 2.0 is out. The Rails folks are recommending rbenv to manage different Ruby versions and their gems. I knew I still had some learning to do to use rvm properly, so I decided to invest that time in learning rbenv instead, since that's what the mainstream was using.

First, I had to remove the rvm lines at the end of my ~/.bashrc, ~/.profile, and ~/.bash_profile, and restart all my terminal windows.

I followed the rbenv installation instructions here: https://github.com/sstephenson/rbenv#installation, including the optional ruby-build installation.

Then, I did:

rbenv install -l

which showed 2.0.0-p353 as the newest production version of MRI. So I did:

rbenv install 2.0.0-p353
rbenv rehash # Either this or the next was necessary to avoid trying to install Rails in the system gem directories.
rbenv global 2.0.0-p353
gem install rails
rbenv rehash # Don't forget this again

Now I was ready to test a new application:

rails new example
cd example
rails server

Then I pointed a browser to: http://localhost:3000, and voilà.

I'm not sure I want to leave the rbenv global in place...

Sunday, 15 September 2013

Editing Screencasts in Linux

My son wants to start putting titles on the Minecraft screencasts he makes. I had tried to use OpenShot to edit his screencasts a few months ago, but the quality degraded so much as to be useless.

I took some time this weekend to dig into the issue. Here's what I learned. But first, here are the tools I'm using.

We're using Linux Mint 13. We used to use RecordMyDesktop, the standard version that comes with the distro, to capture the raw screencast. Recently, I rolled my own screencaster. (More on that elsewhere.) For editing, I use OpenShot, again the version that comes with the distro.

Directly uploading the output of RecordMyDesktop to YouTube produces acceptable videos, so I was reasonably confident that RecordMyDesktop wasn't the problem. I suspected that the process of importing the screencast to OpenShot and then exporting it again was converting formats one or more times along the way, with the attendant loss of quality. So my first task was to figure out the input format.

With a bit of Googling, I found the command line tool that I needed: avprobe. Install the right package:

sudo apt-get install libav-tools

Then get the file information:

avprobe filename.ogv

The output I got was:

Input #0, ogg, from 'testing-5.ogv':
  Duration: 00:00:06.33, start: 0.000000, bitrate: 963 kb/s
    Stream #0.0: Data: skeleton
    Stream #0.1: Video: theora, yuv420p, 864x512 [PAR 1:1 DAR 27:16], 15 fps, 15 tbr, 15 tbn, 15 tbc
    Stream #0.2: Audio: vorbis, 22050 Hz, mono, s16, 89 kb/s
Unsupported codec with id 0 for input stream 0

Next, I opened OpenShot and created a new profile: Edit-> Preferences, then click on the "Profiles" tab, then click on the "Manage Profiles" button, then click on the plus sign to add a new profile. I made it look like this, and then clicked "Save":


The output of avprobe didn't map exactly to what OpenShot was looking for, so I had to guess a bit at what to fill in. I suspected that "PAR" from avprobe was the "Pixel Ratio" and that "DAR" was the "Aspect Ratio". The frame rate was "15 fps" (frames per second). The size was given by "864x512" in the avprobe output.

If you're doing your own video, note that the size is going to depend on the size of your Minecraft window when you captured the raw screencast, and also on whether you chose to capture the window border or not. And the aspect ratio will also change based on the size.

Next I created a new project selecting the profile I just created in the "Project Profile" field. Then I added clips and edited the video. Finally, I was ready to export.

I selected File-> Export Video... and went straight to the "Advanced" tab. Then I changed the "Profile", "Video Settings", and "Audio Settings" to look like this:


Then I clicked "Export Video", and I had a video that would upload with acceptable quality to YouTube.

Sunday, 3 February 2013

Work-flow Diagram for Data Centre Relocation

I wrote here about the work-flow for planning and executing the move of a group of one or more servers from one data centre to another. Here's the picture:

Work-flow for a Data Centre Relocation

I've relocated a couple of data centres, and I've just started working on another. The last one moved over 600 servers, about half physical and half virtual. We moved them over five months, counting from when the first production workload went live in the new data centre. Our team consisted of five PMs working directly with the server, network and storage admins, and application support teams.

[Update: Check out the visual representation of this post here.]

We knew we had a lot of work to do in a short time, and we were working in a diverse and dynamic environment that was changing as we tried to move it. We needed a flexible and efficient way to move the data centre. One thing that really helped was a work-flow for the PMs to use with the various technical and user teams, one that allowed each team to focus on doing what it needed to do.

Early in the project we collected all the inventory information we could to build up a list of all the servers, whether they were physical or virtual, make and model, O/S, etc., and put it in the Master Device List (MDL). We then did a high-level breakdown into work packets or affinity groups in consultation with the application support folks. These work packets were then doled out to the individual PMs.

Each PM then began the detailed planning process for the work packet. Starting from a template, the PM began building the relocation plan, which was simply a spreadsheet with a few tabs:
  • One tab was the plan itself, a minute-by-minute description of the tasks that had to be done, and who was responsible for doing them, during the period immediately around the relocation. Many plans also included the prerequisite tasks in the days preceding the relocation
  • Another tab was the list of servers, and the method by which they would be moved. We had a number of possible move methods, but basically they boiled down to virtual-to-virtual (copying a virtual machine across the network), lift and shift (physically moving a server), and leap frog (copying the image from a physical server across the network to another, identical physical server)
  • The third tab was a list of contact information for everyone mentioned in the plan, along with the approvers for the hand-over to production, escalation points, and any other key stakeholders
At this point many PMs also nailed down a tentative relocation date and time for the work packet and put it in the relocation calendar, a shared calendar in Exchange. The relocation calendar was the official source of truth for the timing of relocations. Some PMs preferred to wait until they had more information. My personal preference is to nail down the date early, as you have more choice about when to move.

The PM then got the various admins to gather or confirm the key information for the server build sheet and the server IP list.

The server build sheet contained all the information needed to build the new server in the new data centre. For a virtual machine, this was basically the number and size of mounted storage volumes including the server image itself. This information was key for planning the timing of the relocation, and in the case of VMs with extra attached storage volumes, made sure that everything got moved.

For physical servers the build sheet had everything needed for a VM, plus all the typical physical server information needed by the Facilities team to assign an available rack location and to rack and connect the server in the new data centre.

The server IP list simply listed all the current IPs used by the server, and their purpose. Most of our servers had one connection each to two separate redundant networks for normal data traffic, along with another connection to the backup network, and finally a fourth connection to the out-of-band management network ("lights-out operation" card on the server). Some servers had more, e.g. for connections to a DMZ or ganging two connections to provide more throughput.

The PM iterated through these documents with the admins and support staff until they were ready. One thing that often changed over the course of planning was the list of servers included in the work packet. Detailed analysis often discovered dependencies that brought more servers into the work packet. Or the volume of work proved to be too much to do in the available maintenance window and the work packet had to be split into two. Or the move method turned out to be inappropriate. We encouraged this, as our goal of minimizing or eliminating downtime and risk was paramount.

When the plan was done the Facilities team took the server build sheet and arranged for the physical move and connection of servers. The Network team took the server IP list and used it to assign the new IPs, and prepare the required network configuration and firewall rules.

The network admins put the new IPs into the same server IP list sheet, which was available to everyone, so for example the server admins could assign the new IPs at the time of the relocation.

At the time of the relocation, everyone did their tasks according to the relocation plan, and the PM coordinated everything. For simple single-server, single-application relocations, the team typically moved and tested the server without intervention from the PM.

Finally, the Backup and Monitoring teams used the server list in the relocation plan to turn backups and monitoring off for the relocated servers at the old data centre, and to turn backups and monitoring on for the relocated servers at the new data centre.

It wasn't all roses. We had a few challenges.

We set a deadline for the PMs to have the server build sheets and server IP lists completed two weeks before the relocation, to give time for the Facilities team to plan transport and workloads for the server room staff, and for the Network team to check all the firewall rules and ensure that the new configuration files were right. We often missed that deadline, and were saved by great people in the Facilities and Network teams, but not without a lot of stress to them.

There was some duplication of information across the documents, and it could be tedious to update. As an old programmer, I had to stop myself several times from running off and building a little application in Ruby on Rails to manage the process. But we were a relocation project, not a software development project, so we sucked it up and just worked with the tools we had.

In summary, we had a repeatable, efficient work-flow that still allowed us to accommodate the unique aspects of each system we were moving. We needed five key documents:
  • Master device list (MDL), a single spreadsheet for the whole project
  • Relocation calendar, a single shared calendar in Exchange
  • Relocation plan, per work packet
  • Server build sheet, per server, or per work packet with a tab per server
  • Server IP list, a single document for the whole project (which grew as we went)
The PMs were working with various teams that knew how to do, and were very efficient at, certain repeatable tasks:
  • Communicating outages to the user base (Communication Lead)
  • Moving a physical server and connecting it in the new data centre, or installing a new server as a target for an electronic relocation of a physical server (Facilities team)
  • Moving a virtual machine or a physical machine image, and its associated storage (Server and Storage team)
  • Reconfiguring the network and firewall for the relocated servers, including DNS changes (Network team, although for simple moves the server admin often did the DNS changes)
  • Acceptance testing (Test Lead who organized testing)
  • Changing backups and monitoring (Backup team and Monitoring team)