10 June 2013

TL;DR Sometimes the encoding problem isn’t in your code, it’s in your cache: clear the cache when upgrading to Ruby 1.9.

Today I pushed to production an app that had been running on Ruby 1.8.7-p72. Now it’s running on Ruby 1.9.3-p392. We were ready to upgrade on Friday, but decided to wait until today, Monday, just in case. There weren’t any issues or much code to change, but it’s better to be safe, right?

Shortly after we sent the success email to the team, we started getting bug reports of 500 errors. The logs showed two kinds: ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT) and invalid multibyte escape.
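For context, Ruby 1.9 raises the first error when it has to combine two strings whose encodings differ and both contain non-ASCII bytes. Here is a minimal sketch that reproduces it outside Rails; the variable names are illustrative and not from our app:

```ruby
# A UTF-8 string with a non-ASCII character (the \u escape forces UTF-8).
utf8 = "caf\u00E9"

# The same bytes, but labelled as raw binary (ASCII-8BIT).
binary = "caf\xC3\xA9".dup.force_encoding(Encoding::ASCII_8BIT)

begin
  utf8 + binary
  error = nil
rescue Encoding::CompatibilityError => e
  error = e
end

puts error.class
puts error.message

# Strings that contain only ASCII bytes are compatible whatever their label,
# which is why the problem only shows up on some records.
ascii_binary = "cafe".dup.force_encoding(Encoding::ASCII_8BIT)
combined = utf8 + ascii_binary
puts combined.encoding
```

This would explain why only some pages returned a 500: content that was pure ASCII concatenated fine, and only content with accented or otherwise non-ASCII characters blew up.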

We first tried changing the encoding of our files, but that caused exceptions on app startup:

find . -type f -name '*.haml' | xargs vim +"argdo se bomb | se fileencoding=utf-8 | w"
# or
git ls-files | xargs vim +"argdo se bomb | se fileencoding=utf-8 | w"

For whatever reason, trying to recode the files with iconv or recode didn’t do anything.

We added the Rack UTF8 Sanitizer middleware, but that didn’t help.

So we added magic comments to all our relevant files:

prepend() {
   printf '%s\n' H 1i "${1}" . wq | ed -s "${2}"
}
find {app,config,lib,public} -name '*.rb' | while read -r x ; do  prepend '# -*- encoding: utf-8 -*-' "$x" ;  done
find {app,config,lib,public} -name '*.haml' | while read -r x ; do  prepend '-# -*- encoding: utf-8 -*-' "$x" ;  done
find {app,config,lib,public} -name '*.erb' | while read -r x ; do  prepend '<%# -*- encoding: utf-8 -*- %>' "$x" ;  done

Still broken.

The fix for the invalid multibyte escape error seemed pretty ugly, but worked easily enough:

# help out copy and pasting errors of good-looking email addresses
# by stripping out non-ASCII characters
def clean_ascii_text(text)
  # avoids invalid multi-byte escape error
  ascii_text = text.encode( 'ASCII', invalid: :replace, undef: :replace, replace: '' )
  # see http://www.ruby-forum.com/topic/183413
  pattern = Regexp.new('[\x80-\xff]', nil, 'n')
  ascii_text.gsub(pattern, '')
end
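To show what that kind of cleanup does to a pasted address, here is a small sketch. The helper name `strip_non_ascii` and the sample address are made up for illustration, and it relies only on `String#encode`, so it is a simplified stand-in rather than our exact method:

```ruby
# Hypothetical helper: drop every character that has no ASCII equivalent.
def strip_non_ascii(text)
  text.encode('ASCII', invalid: :replace, undef: :replace, replace: '')
end

# An address pasted with an accented letter and a trailing non-breaking space.
pasted = "jos\u00E9@example.com\u00A0"
cleaned = strip_non_ascii(pasted)

puts cleaned
puts cleaned.ascii_only?
```

Note that this silently changes the address (the accented letter disappears), so it only makes sense where a lossy cleanup is acceptable.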

We ensured our rails config set the encoding to utf8, that our database adapter was utf8, that our database tables were collated as utf8. No dice.

We added this to the top of config/application.rb:

Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
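As far as I understand it, these defaults only influence how Ruby labels text that passes through ordinary text-mode I/O. Data read as raw bytes keeps its binary label, which may be why the setting made no difference for us. A rough sketch, using a throwaway temp file as a stand-in for real data:

```ruby
require 'tempfile'

Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

# Write the UTF-8 bytes for "café" to a temporary file.
file = Tempfile.new('encoding-demo')
file.binmode
file.write("caf\xC3\xA9")
file.close

# Text-mode reads pick up the default external encoding.
from_disk = File.read(file.path)
puts from_disk.encoding

# Binary reads ignore the defaults and come back as ASCII-8BIT.
raw = File.binread(file.path)
puts raw.encoding

file.unlink
```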

No dice.

We had a breakthrough when we realized that re-saving the items whose pages returned a 500 made those pages work. Then we realized that in config/environments/production.rb we had set the cache store as:

 config.cache_store = :file_store, "public/system/cache"

The errors were coming from ASCII files in the file system cache. We cleared the cache and everything worked. That took 3 hours. I hope this saves you some time.
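Here is a rough simulation of what I think was going on. It assumes the file store serializes entries with Marshal, and it fakes a "written under 1.8" entry by dumping a string with no encoding information. None of this is our actual cache code:

```ruby
# Simulate an entry written by Ruby 1.8: a binary string is marshalled
# without any encoding information attached.
old_bytes = "caf\xC3\xA9".dup.force_encoding(Encoding::ASCII_8BIT)
stale_entry = Marshal.dump(old_bytes)

# Under 1.9 the entry loads back as ASCII-8BIT.
cached = Marshal.load(stale_entry)
puts cached.encoding

# Mixing it into a UTF-8 template fragment fails.
fresh = "na\u00EFve "
begin
  fresh + cached
  failed = false
rescue Encoding::CompatibilityError
  failed = true
end
puts failed

# An entry regenerated under 1.9 keeps its UTF-8 label and concatenates fine.
regenerated = Marshal.load(Marshal.dump("caf\u00E9"))
ok = fresh + regenerated
puts ok
```

That matches what we saw: re-saving an item rewrote its cache entry, and the page started working. To drop everything at once, `Rails.cache.clear` from a console should empty the configured store, or you can delete the contents of the cache directory.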