MMS2R working in Rails 2.3.x and 3.x

Posted by Mike Tue, 17 Aug 2010 18:53:00 GMT

I should have been more vocal about this change in MMS2R. Back in February of this year I released MMS2R version 3.0.0 and starting with that version it was dependent upon the Mail gem, rather than TMail. As you might be aware, Mail is the mail gem that ActionMailer is using in Rails 3. Your "legacy" Rails 2.*.* application can still get the benefits of the latest MMS2R versions even though TMail is used by the older ActionMailers.

Here is an example of patching ActionMailer::Base so that we can ignore the TMail object it passes to its #receive method and instantiate a Mail object that we can use with MMS2R.

class MailReceiver < ActionMailer::Base

  # RAILS 2.*.* ONLY!!!

  # patch ActionMailer::Base to put an ActionMailer::Base#raw_email
  # accessor on the created instance
  class << self
    alias :old_receive :receive
    def receive(raw_email)
      send(:define_method, :raw_email) { raw_email }
      self.old_receive(raw_email)
    end
  end

  ##
  # Ingest email/MMS here

  def receive(tmail)
    # completely ignore the tmail object rails passes in Rails 2.*

    mail = Mail.new(self.raw_email)
    mms = MMS2R::Media.new(mail, :logger => Rails.logger)

    # do something
  end
end
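The class-level patch is easier to follow outside of Rails. Below is a minimal sketch in plain Ruby, assuming a hypothetical FakeMailerBase that stands in for ActionMailer::Base and a hypothetical DemoReceiver subclass; neither name comes from Rails or MMS2R. It shows how the aliased class method captures the raw text in a closure before the original method hands a parsed object to the instance.

```ruby
# Hypothetical stand-in for ActionMailer::Base, for illustration only.
class FakeMailerBase
  def self.receive(raw_email)
    # The real Rails 2 method parses raw_email into a TMail object here.
    new.receive("parsed: #{raw_email[0, 10]}")
  end
end

class DemoReceiver < FakeMailerBase
  class << self
    alias_method :old_receive, :receive

    def receive(raw_email)
      # Capture the raw text in a closure and expose it on instances.
      send(:define_method, :raw_email) { raw_email }
      old_receive(raw_email)
    end
  end

  def receive(_parsed)
    # Ignore the parsed object and work from the raw source instead.
    raw_email
  end
end
```

Note that the accessor is redefined on every call, so two messages processed concurrently in one process could see each other's raw text. That is a limitation of this sketch and worth checking in a threaded setup.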

Here is a Gist of the above code where you can fork your own copy, etc. http://gist.github.com/486883.

Not to brag or anything, but I heard twitpic is using MMS2R in part of its application.

Thank you and enjoy!


Getting Passenger to play nice with Interlock, cache_fu, and Memcached

Posted by Mike Wed, 03 Feb 2010 20:22:00 GMT

If you are running your Rails application with Phusion Passenger AND you are caching using Interlock AND/OR cache_fu AND you are using the memcache-client library to connect to your memcache server, then you’ll be seeing plenty of MemCache::MemCacheError errors that might look like these:

MemCache::MemCacheError: No connection to server (localhost:11211 DEAD (Timeout::Error: IO timeout), will retry at Mon Dec 10 07:47:23 -0800 2010)

or

MemCache::MemCacheError: IO timeout

If you are using the memcached library to connect to your memcache server then you might be seeing a number of Memcached::ATimeoutOccurred errors that look like this:

Memcached::ATimeoutOccurred: Key {"interlock::controller:action:action:id:tag"=>"localhost:11211:8"}

If you are in the former group with memcache-client errors, then don’t believe the examples you’ve seen: memcache-client doesn’t work well with the way Passenger spawns Rails processes. Don’t even try it; use the memcached library instead.

If you are using Interlock and cache_fu in the same application then you need to have the Interlock plugin loaded before cache_fu. Do so in config/environment.rb like so:

# load the Interlock plugin first so it will load the memcache client specified
# in memcache.yml otherwise cache_fu will load memcache-client
config.plugins = [ :interlock, :all ]

Also, when Passenger spawns a new instance of your application you must reconnect your memcache client from within Passenger’s starting_worker_process event. However, the commonly cited code example for that event is vague, so here is what it should look like within the Rails::Initializer block in config/environment.rb:

Rails::Initializer.run do |config|
  # gem and plugin configs above ....
  if defined?(PhusionPassenger)
    PhusionPassenger.on_event(:starting_worker_process) do |forked|
      if forked
        # We're in smart spawning mode ...
        if defined?(CACHE)
          Rails.logger.info('resetting memcache client')
          CACHE.reset
          Object.send(:remove_const, 'CACHE')
          Interlock::Config.run!
        end
      else
        # We're in conservative spawning mode. We don't need to do anything.
      end
    end
  end
end

What is happening in the code above is that we are closing (with reset) the current memcached connection and then forcing Interlock to initiate a new memcached connection within its helper method Interlock::Config.run!. run! will not fire if the global constant CACHE has already been assigned, which is why the constant is removed first.
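The guard on the constant is the part that is easy to miss. Here is a minimal sketch in plain Ruby, assuming a hypothetical DemoConfig module and DEMO_CACHE constant that mimic the behavior described above; it is not Interlock's actual code.

```ruby
module DemoConfig
  # Hypothetical stand-in for Interlock::Config.run!: it only builds a
  # client when the global constant has not been assigned yet.
  # Returns true when a new client was built, false when it was skipped.
  def self.run!
    return false if Object.const_defined?(:DEMO_CACHE)
    Object.const_set(:DEMO_CACHE, Object.new)
    true
  end
end

DemoConfig.run!                           # first call assigns DEMO_CACHE
DemoConfig.run!                           # second call is a no-op
Object.send(:remove_const, :DEMO_CACHE)   # what the Passenger hook does
DemoConfig.run!                           # now a fresh client is built
```

Without the remove_const step the forked worker would keep the client object it inherited from the parent process.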

The last thing we need to do is put some timeout protection around the Memcached::Rails client when it is getting and setting values from the memcache server. Interlock has a locking mechanism when it’s writing to the memcache server and will try to perform the write up to five times if the server doesn’t acknowledge that the write has occurred. If a timeout exception bubbles up from the runtime then the purpose of the lock is defeated and the write cannot be retried. The same can be said for reads. Your application shouldn’t have a rendering error if a single read fails to complete from the memcache server. With Interlock, if a memcached read returns a nil value then all that happens is that the code in the behavior_cache and view_cache blocks is executed. Read and write caching errors should not be imposed upon the user’s experience, in my opinion.

To do this, make an initializer named config/initializers/memcached_rails.rb; the name reveals the purpose of the file. It will alias method chain the Memcached::Rails get and set operations so that they return nil instead of bubbling up a timeout error when one occurs. As I already pointed out, if Interlock receives a nil value from a read or a write it will proceed and execute the view_cache and/or behavior_cache blocks you have specified in your application. Memcached::Rails get and set operations underpin Interlock’s reads and writes.

class Memcached::Rails
  def get_with_timeout_protection(*args)
    begin
      get_without_timeout_protection(*args)
    rescue Memcached::ATimeoutOccurred => e
      if (RAILS_ENV == "production" ||  RAILS_ENV == "staging")
        nil
      else
        raise e
      end 
    end   
  end     
          
  def set_with_timeout_protection(*args)
    begin
      set_without_timeout_protection(*args)
    rescue Memcached::ATimeoutOccurred => e
      if (RAILS_ENV == "production" ||  RAILS_ENV == "staging")
        nil
      else
        raise e
      end
    end
  end
  
  alias_method_chain :get, :timeout_protection
  alias_method_chain :set, :timeout_protection
end
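alias_method_chain comes from ActiveSupport, so here is a sketch of what it does using only plain Ruby's alias_method. DemoClient and DemoTimeout are hypothetical names that stand in for Memcached::Rails and Memcached::ATimeoutOccurred, and this version always swallows the timeout rather than checking the environment.

```ruby
# Stands in for Memcached::ATimeoutOccurred.
class DemoTimeout < StandardError; end

# Stands in for Memcached::Rails.
class DemoClient
  def get(key)
    raise DemoTimeout, "timed out reading #{key}" if key == "slow"
    "value for #{key}"
  end

  def get_with_timeout_protection(*args)
    get_without_timeout_protection(*args)
  rescue DemoTimeout
    nil # treat a timeout as a cache miss
  end

  # Manual equivalent of: alias_method_chain :get, :timeout_protection
  alias_method :get_without_timeout_protection, :get
  alias_method :get, :get_with_timeout_protection
end
```

Callers keep calling get; the original implementation stays reachable as get_without_timeout_protection.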

If you are using cache_fu only, you might not have to be as forceful as I’ve been with Interlock in explicitly resetting the memcached client. I’m not certain how to explicitly set memcached as the client library in a cache_fu-only environment either; from the code I’ve reviewed it seems to have a preference for memcache-client. Post your experiences in the comments for others to learn from if you are in a cache_fu-only environment and use these techniques to overcome timeout errors.


will_paginate and PostgreSQL slow count(*)

Posted by Mike Mon, 21 Dec 2009 08:33:00 GMT

When a PostgreSQL database table has many rows of data, selecting the count(*) of the rows can be very slow. For instance, a table with over one million rows on an average performing virtual host can take over 5 seconds to complete. This is because PostgreSQL walks through (scans) all the rows in the table. It can be faster if the count(*) includes conditions on table columns that are indexed.

Assuming a Rails application is using the will_paginate plugin to enumerate over the large table, the rendering of each page will take many seconds to complete. This is due to the expense of scanning all the rows in the table, as count(*) is used in the calculation of the will_paginate navigation. Slow counting is a known and accepted slow-performing operation in PostgreSQL.

If the Rails application is paginating over all the rows in a table, then a fast approximation technique can be used in place of the count(*) operation. Assuming a Foo model, we mark its conditions as ‘1=1’ so that the will_paginate plugin will use our approximation. The pagination would look like the following:

Foo.paginate(:page => params[:page], :conditions => "1=1")

In a Rails initializer that we call config/initializers/will_paginate_postgresql_count.rb we use a Rails alias method chain to implement our own wp_count method. wp_count is a protected method defined in WillPaginate::Finder::ClassMethods that determines the count of rows in a table, which in turn determines how the will_paginate navigation is rendered. Our version selects the reltuples value from PostgreSQL’s pg_class where relkind equals ‘r’ and relname is the name of the table used by our Foo class. The pg_class catalog lists tables and most everything else that has columns or is otherwise similar to a table. The pg_class approximation will only be used when the conditions given to paginate are ‘1=1’; otherwise the original wp_count method is called. The code is saved in a Gist and listed below.

# add this file as config/initializers/will_paginate_postgresql_count.rb
# in a Rails application
 
module WillPaginate
  module Finder
    module ClassMethods
 
      # add '1=1' to paginate conditions as marker such that the select from the pg_class
      # is used to approximate simple rows count, e.g.
      # Foo.paginate(:page => params[:page], :conditions => "1=1")
      def wp_count_with_postgresql(options, args, finder, &block)
        if options[:conditions] == ["1=1"] || options[:conditions] == "1=1"
          # counting rows in PostgreSQL is slow so use the pg_class table for
          # approximate rows count on simple selects
          # http://wiki.postgresql.org/wiki/Slow_Counting
          # http://www.varlena.com/GeneralBits/120.php
          ActiveRecord::Base.count_by_sql "SELECT (reltuples)::integer FROM pg_class r WHERE relkind = 'r' AND relname = '#{self.table_name}'"
        else
          wp_count_without_postgresql(options, args, finder, &block)
        end
 
      end
 
      alias_method_chain :wp_count, :postgresql
    end
  end
end
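To see why an approximate count is good enough here, consider what the count is used for. The sketch below is plain Ruby with two hypothetical helpers, approximate_count? and total_pages; it assumes the page count is the row count divided by the page size, rounded up. Keep in mind that reltuples is an estimate that PostgreSQL refreshes during VACUUM and ANALYZE, so it can lag behind the real row count.

```ruby
# Marker convention from the post: '1=1' means "no real filter", so the
# pg_class estimate may be used instead of count(*).
def approximate_count?(conditions)
  conditions == "1=1" || conditions == ["1=1"]
end

# Number of pages needed for a given row count and page size.
def total_pages(total_entries, per_page)
  (total_entries / per_page.to_f).ceil
end
```

With 30 rows per page, an estimate that is off by a hundred rows in a million only moves the last page number by a few pages, which is rarely visible in the navigation.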


Shoulda macros for rendered partials and globbed routes

Posted by Mike Thu, 19 Nov 2009 18:19:00 GMT

Here are a couple of Shoulda macros that I’ve been using. One is to validate that partials are rendered in a view and the other is to validate globbed routes.

should_render_partial

As the name implies, this macro validates that a partial has been rendered. The macro was born out of the need to test partials being rendered in an implementation of the Presenter Pattern that I wrote. The presenter I wrote was for Appstatz.com, a site my friend Shane Vitarana created for tracking iPhone application sales and downloads. The graphs on the site are displayed with the Bluff JavaScript graphing library. The data used by Bluff in each kind of graph was rendered with a composition of different partials. The presenter decided which partials were rendered based on the state of the application, so tests were written to validate that the implementation of the pattern acts as expected.

should_render_partial takes an absolute or relative path, as a string or a symbol, as its argument. Using it is as easy as this example:

class FoosControllerTest < ActionController::TestCase
  context "a beautiful Bluff graph" do
    setup do
      @foo = Factory :foo
      get :show, :id => @foo.id
    end

    should_render_partial 'layouts/_logo'
    should_render_partial :_data
    should_render_partial :_summary
  end
end

The code for should_render_partial is listed below and a Gist of the code is listed at http://gist.github.com/237938

# shoulda validation that a partial has been rendered by a view
 
class Test::Unit::TestCase
 
   def self.should_render_partial(partial)
     should "render partial #{partial.inspect}" do
       assert_template :partial => partial.to_s
     end
   end
 
end

Shoulda loads macros from the test/shoulda_macros/ directory, so add the macro code to a file in that directory.
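If you haven't written a macro before, the idea is simply a class method that generates a named test. Here is a minimal sketch in plain Ruby, assuming a hypothetical MiniCase class with a simplified should method; it is not Shoulda's implementation, just the shape of the technique.

```ruby
class MiniCase
  # Simplified stand-in for Shoulda's `should`: registers a named test method.
  def self.should(name, &block)
    define_method("test: should #{name}", &block)
  end

  # A macro is just a class method that calls `should` with a generated name.
  def self.should_be_even(number)
    should "be even for #{number.inspect}" do
      raise "#{number} is odd" unless number.even?
    end
  end

  # Run every registered test and return the names of those that passed.
  def self.run
    instance = new
    public_instance_methods(false).grep(/\Atest:/).select do |name|
      begin
        instance.send(name)
        true
      rescue StandardError
        false
      end
    end
  end
end

class EvenCase < MiniCase
  should_be_even 2
  should_be_even 3
end
```

EvenCase.run returns only the test generated for 2, since the one generated for 3 raises.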

should_route_glob

Again, as the name implies, should_route_glob tests that a globbing route specified in config/routes.rb is acting as expected. A globbed route should be the last route mapping in the config/routes.rb file because it will greedily respond to all requests. This kind of routing is used in content management systems and I’ve also seen it used in specialized 404 handlers. For instance, if an application is ported to Rails, adding a final controller route that accepts all requests is useful for tracking down legacy requests. These requests, with their URI and parameters, would be stored in a table so they can be inspected later. Using this technique one can easily find legacy routes that are not being handled by the new controllers, or unexpected routes that are exposed by buggy Ajax requests, odd user input, etc.

The routing (implying a controller named Foo) and its functional test are listed below.

ActionController::Routing::Routes.draw do |map|
  # GET /a/b/c will be exposed to the action as an array in params[:path] and it
  # will have already been delimited by the '/' character in the requested path
  map.any '*path', :controller => 'foos', :action => 'index'
end

The test code is as follows.

class FoosControllerTest < ActionController::TestCase
  should_route_glob :get, '/a/b/c', :action => 'index', :path => 'a/b/c'.split('/')
end
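The :path option in the test mirrors what the route comment describes: the globbed part of the request arrives already split on '/'. A tiny sketch of that splitting in plain Ruby, using a hypothetical glob_segments helper (it ignores URL decoding and is not Rails' own code):

```ruby
# Split a request path into the segments a '*path' glob would expose
# in params[:path], per the comment in the route definition.
def glob_segments(request_path)
  request_path.sub(%r{\A/}, "").split("/")
end
```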

The code for should_route_glob is listed below and a Gist of the code is listed at http://gist.github.com/237987. This code may be a bit verbose, as it appears that (as of 11/18/2009) Shoulda is handling globbed routes better. Add a comment if you improve this should_route_glob macro. Shoulda loads macros from the test/shoulda_macros/ directory, so add the code to a file in that directory.

class Test::Unit::TestCase

  def self.should_route_glob(method, path, options)
    unless options[:controller]
      options[:controller] = self.name.gsub(/ControllerTest$/, '').tableize
    end
    options[:controller] = options[:controller].to_s
    options[:action] = options[:action].to_s

    populated_path = path.dup
    options.each do |key, value|
      options[key] = value if value.respond_to? :to_param
      populated_path.gsub!(key.inspect, value.to_s)
    end

    should_name = "route #{method.to_s.upcase} #{populated_path} to/from #{options.inspect}"

    should should_name do
      assert_routing({:method => method, :path => populated_path}, options)
    end
  end

end


Verbose Git Dirty Prompt

Posted by Mike Fri, 23 Oct 2009 20:22:00 GMT

I’ve updated my bashrc with a verbose dirty git prompt which is an extension of Henrik Nyh’s Show Git dirty state post. I wanted my prompt to also indicate additional git states with a single character. For instance I want to know when the local branch is ahead of the remote branch and I do that with the plus ‘+’ character. So if there are modified files in the project, and the local project is ahead of the remote I’ll see ‘☭’ for dirty and ‘+’ for ahead in my bash prompt such as the following

mike@daisy 10006 ~/projects/apps/example(master☭+)$

These are the character codes I used for the git dirty state in the project.

  • ‘☭’ – files have been modified
  • ‘?’ – there are untracked files in the project
  • ‘*’ – a new file has been added to the project but not committed
  • ‘+’ – the local project is ahead of the remote
  • ‘>’ – file has been moved or renamed


The code for my git dirty prompt is in this Gist http://gist.github.com/217120 and listed below.

# origin of work http://henrik.nyh.se/2008/12/git-dirty-prompt
function parse_git_dirty {
  status=`git status 2> /dev/null`
  dirty=`    echo -n "${status}" 2> /dev/null | grep -q "Changed but not updated" 2> /dev/null; echo "$?"`
  untracked=`echo -n "${status}" 2> /dev/null | grep -q "Untracked files" 2> /dev/null; echo "$?"`
  ahead=`    echo -n "${status}" 2> /dev/null | grep -q "Your branch is ahead of" 2> /dev/null; echo "$?"`
  newfile=`  echo -n "${status}" 2> /dev/null | grep -q "new file:" 2> /dev/null; echo "$?"`
  renamed=`  echo -n "${status}" 2> /dev/null | grep -q "renamed:" 2> /dev/null; echo "$?"`
  bits=''
  if [ "${dirty}" == "0" ]; then
    bits="${bits}☭"
  fi
  if [ "${untracked}" == "0" ]; then
    bits="${bits}?"
  fi
  if [ "${newfile}" == "0" ]; then
    bits="${bits}*"
  fi
  if [ "${ahead}" == "0" ]; then
    bits="${bits}+"
  fi
  if [ "${renamed}" == "0" ]; then
    bits="${bits}>"
  fi
  echo "${bits}"
}

function parse_git_branch {
  git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e "s/* \(.*\)/(\1$(parse_git_dirty))/"
}

export PS1='\[\033[00;32m\]\u\[\033[01m\]@\[\033[00;36m\]\h\[\033[01m\] \! \[\033[00;35m\]\w\[\033[00m\]\[\033[01;30m\]$(parse_git_branch)\[\033[00m\]\$ '
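The mapping from status text to prompt characters is easy to test on its own. Below is a sketch of the same logic in Ruby, with a hypothetical git_dirty_bits helper that takes the text of git status as a string. The "Changes not staged for commit" pattern is an addition of mine, on the assumption that newer git releases print that wording in place of "Changed but not updated"; everything else mirrors the bash function above.

```ruby
# Ordered list of [pattern, flag]; the order matches the bash function.
GIT_FLAGS = [
  [/Changed but not updated|Changes not staged for commit/, "☭"],
  [/Untracked files/, "?"],
  [/new file:/, "*"],
  [/Your branch is ahead of/, "+"],
  [/renamed:/, ">"]
].freeze

# Build the string of state characters for a given `git status` output.
def git_dirty_bits(status_text)
  GIT_FLAGS.map { |pattern, flag| status_text =~ pattern ? flag : "" }.join
end
```

Keeping the patterns in one table makes it simple to add another state later without touching the function.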


more about logging directly to script/console

Posted by Mike Wed, 30 Sep 2009 23:25:00 GMT

Logging to the Rails console is an old technique but I’ve noticed many of the examples on the interwebs are broken with the latest 2.3.X series of Rails releases. I believe Jamis Buck’s Watching ActiveRecord Do Its Thing is the popular origin of this technique.

Here is an example that lets you toggle logging on and off in your console. This technique was made popular in Recipe #38 of Advanced Rails Recipes but the original code example is no longer working for me with Rails 2.3.X and MySQL.

Logging in the console

When I start my console and get a count from one of my models, only the count is returned. Also, when I perform a GET to the root of the app, only the 200 status code is returned. However, when I turn on logging in the console with the loud_logger method I’ve defined, the output of both the ActiveRecord log and the ActionController log is printed to the console.

.irbrc and .railsrc

To do the above you’ll have to put the following code in your $HOME/.irbrc and $HOME/.railsrc. I have a .railsrc so that Rails-specific code is loaded from there, leaving generic code for all irb sessions in .irbrc. If you were not aware, the Rails console is just an extension of the Ruby irb interactive console. By the way, I store my dot files in my $HOME directory using git; the Huba Huba err.the_blog post is a popular origin of this idea.

This code is at the top of my irbrc and it loads the railsrc only if being invoked by the Rails console.

ARGV.concat [ "--readline", "--prompt-mode", "simple" ]
load File.dirname(__FILE__) + '/.railsrc' if $0 == 'irb' && ENV['RAILS_ENV']

This code is an extension of Advanced Rails Recipes #38. The loud_logger method sends all logging to the console. The default_logger method puts all logging back to the Rails default, which is log/development.log in development mode. The quiet_logger method turns off all logging, both to the console and to log/development.log. Execute any of these within the console and the described behavior takes effect.

def loud_logger
  set_logger_to Logger.new(STDOUT)
end

def quiet_logger
  set_logger_to nil
end

def default_logger
  set_logger_to RAILS_DEFAULT_LOGGER
end

def set_logger_to(logger)
  def logger.flush; end unless logger.respond_to?(:flush)
  ActiveRecord::Base.logger = logger
  ActionController::Base.logger = logger
  ActiveRecord::Base.clear_all_connections!
end
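The toggle itself is nothing more than swapping a logger object. Here is a sketch outside of Rails using only Ruby's standard Logger, with a hypothetical DemoApp module playing the role of ActiveRecord::Base and hypothetical set_demo_logger_to and demo_query helpers.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for a framework class that holds a logger.
module DemoApp
  class << self
    attr_accessor :logger
  end
end

# Swap the logger; passing nil silences logging entirely.
def set_demo_logger_to(logger)
  DemoApp.logger = logger
end

# Pretend query that logs only when a logger is present.
def demo_query
  DemoApp.logger.info("SELECT count(*) FROM foos") if DemoApp.logger
  42
end
```

Calling set_demo_logger_to Logger.new(STDOUT) corresponds to loud_logger, and set_demo_logger_to nil corresponds to quiet_logger.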


Migrating Legacy Typo 4.0.3 to Typo 5.3.X

Posted by Mike Mon, 07 Sep 2009 07:26:00 GMT

I completed the process of migrating this blog from Typo 4.0.3 to Typo 5.3.X. These are my notes on the process I undertook to complete the migration. I had self-hosted the blog on a Linode slice and part of the migration was to switch the hosting to Dreamhost. I wanted to retain control of the Rails stack for the blog, but I no longer wanted to maintain the server and base application stack.

Source code

The source for the blog is actually Frédéric de Villamil’s 5.3.X master Typo branch on Github at git@github.com:fdv/typo. If tracking that branch isn’t necessary for your blog, you can skip past the git notes and install and maintain the source code in another prescribed manner.

Below are the steps to initialize a new git repository, add fdv’s master Typo branch as a remote repository, and finally merge in fdv’s master branch. You would do so if you planned to frequently pull in the master changes to Typo as it’s being developed by the community, or if you had another remote branch you wanted to pull in changes from. Remember, at this point we are working locally.

mkdir mynewblog
cd mynewblog
git init
touch README
git add .
git commit -a -m 'start of my typo blog'
git remote add -f fdv git://github.com/fdv/typo.git
git checkout -b fdv/master
git pull fdv master
git checkout master
git merge fdv/master

Also, you’ll want to install the gems that Typo relies upon, and freeze in Rails 2.3.3:

sudo rake gems:install
rake rails:freeze:edge RELEASE=2.3.3
git add vendor/rails
git commit -m 'freezing in Rails 2.3.3' vendor/rails

Finally, move the git repository you’ve just initialized to your preferred place to host your projects. Perhaps a private Github repository. I host some of my personal projects on a remote server and just pull from it over ssh.

Migrating data

I dumped the production data from my old Typo 4.0.3 blog such that I could migrate it in my local environment.

mysqldump -u root --opt my_old_typo_db > /tmp/old.sql

I then scp’d the old data locally and imported it into a new database that was used for the local migration to Typo 5.3.X.

mysqladmin -u root create typo_development
scp mike@olderserver:/tmp/old.sql /tmp/
mysql -u root typo_development < /tmp/old.sql
cp config/database.yml.example config/database.yml
# edit database.yml with local settings
rake db:migrate

One small gotcha for me was that I was using the “recent comments” sidebar from Typo 4.0.3 and I had to manually remove it from the stored settings in the database via the mysql prompt. Use the Rails dbconsole script to bring up a mysql console.

ruby script/dbconsole

Now delete the recent comments configuration.

delete from sidebars where type='RecentCommentsSidebar';

All of the data should be migrated correctly from 4.0.3 to 5.3.X at this point. Post a comment if you’ve encountered an issue doing your own migration.

Extras

Pink

Pink is punk, and I upgraded the Pink Theme to be Typo 5.3.X compatible. The Pink Theme had been orphaned after 4.0.3 so I had to make some code changes so it would operate in a Typo 5.3.X environment. This is the github project page for Pink http://github.com/monde/pink

I then added Pink as a git submodule so its code would remain independent of my project, yet still be available when the app was deployed.

git submodule add git@github.com:monde/pink.git themes/pink
git submodule update

See the Capistrano notes below for additional information about git submodules and Capistrano.

Hoptoad

I’ve had good success with the Hoptoad exception notifier so I added it to my project as well.

ruby script/plugin install git://github.com/thoughtbot/hoptoad_notifier.git
# edit your config/initializers/hoptoad.rb settings
git add vendor/plugins/hoptoad_notifier/ config/initializers/hoptoad.rb
git commit -m 'adding hoptoad notifier and its initializer' vendor/plugins/hoptoad_notifier/ config/initializers/hoptoad.rb

I’m not sure if others are deploying plugins as submodules, but I prefer to freeze plugins into my Rails app.

Capistrano

I used the Deploying Rails On Dreamhost with Passenger Rails Tips article and Github’s Deploying with Capistrano article to guide my Capistrano setup. After doing a "capify ." to initialize Capistrano in the project, I added a couple of extra settings and tasks to config/deploy.rb that make the setup specifically tailored for Typo’s configuration on Dreamhost.

First, make Capistrano fetch all the submodules your project is dependent upon during deployment in config/deploy.rb:

set :git_enable_submodules, 1

The tasks below are also required. The first is the common touch of the tmp/restart.txt file in the current directory that signals Passenger to reload the application. The second task does three things. First, it links database.yml from the shared directory into the release directory. Second, it links the shared public/files directory into the release directory. The public/files directory is where Typo saves any files that are uploaded as a part of its minimal content management system. Use this strategy so that the files themselves are not dependent upon deployment or stored in your source code repository. The third is something specific to my blog. I use Get Clicky to track visitor statistics. My blog is currently using the Pink theme and I didn’t want to make Pink dependent on my Get Clicky configuration. Therefore I just copy over a modified Pink layout with my Get Clicky settings whenever a new version of the site is deployed.

namespace :deploy do
  task :restart do
    run "touch #{current_path}/tmp/restart.txt" 
  end
end

desc "link in shared database.yml, etc. with symbolic links"
task :link_in_shared_files do
  run "ln -s #{shared_path}/config/database.yml #{release_path}/config/database.yml"
  run "ln -s #{shared_path}/public/files #{release_path}/public/files"
  run "cp -f #{shared_path}/themes/pink/layouts/default.html.erb #{release_path}/themes/pink/layouts/"
end

after "deploy:update_code", "link_in_shared_files"

Notice that I made the link_in_shared_files task run after the standard Capistrano deploy:update_code task has fired.
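If you want to try the linking strategy locally before deploying, here is a sketch in plain Ruby using FileUtils. The link_shared helper is hypothetical, not part of Capistrano. Unlike the plain ln -s calls above, it removes anything already sitting at the target path first, so it can be run more than once against the same release directory.

```ruby
require "fileutils"
require "tmpdir"

# Link files that live in shared/ into a freshly checked out release.
def link_shared(shared_path, release_path, relative_paths)
  relative_paths.each do |rel|
    target = File.join(release_path, rel)
    FileUtils.mkdir_p(File.dirname(target))
    FileUtils.rm_rf(target) # drop any checked-in copy or stale link
    FileUtils.ln_s(File.join(shared_path, rel), target)
  end
end
```

For example, link_shared(shared, release, ["config/database.yml", "public/files"]) covers the first two steps of the task, given shared and release directory paths of your own.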

Redirects

In the virtual host settings for my blog’s old location I promiscuously redirect each request exactly to the new location.

# redirect old apache server:
RedirectMatch permanent  ^(.*)$ http://plasti.cx$1

These are 301 permanent redirects, so that Google and the other search engines will permanently update their indexes to the domain the blog now resides upon.

The End

So far I’m happy with this setup. For me, it’s easy to deploy and maintain. Please post any experiences you’ve had with Typo migrations or Typo hosting so that others might benefit from your experience as well.