<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Brightbox Blog &#187; filesystem</title>
	<atom:link href="http://blog.brightbox.co.uk/tag/filesystem/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.brightbox.co.uk</link>
	<description></description>
	<lastBuildDate>Fri, 02 Dec 2011 12:56:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Expiring an entire page cache tree atomically</title>
		<link>http://blog.brightbox.co.uk/posts/expiring-an-entire-page-cache-tree-atomically</link>
		<comments>http://blog.brightbox.co.uk/posts/expiring-an-entire-page-cache-tree-atomically#comments</comments>
		<pubDate>Mon, 16 Nov 2009 16:35:13 +0000</pubDate>
		<dc:creator>John Leach</dc:creator>
				<category><![CDATA[tech]]></category>
		<category><![CDATA[atomic]]></category>
		<category><![CDATA[cache]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[delete]]></category>
		<category><![CDATA[expire]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[page cache]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[ruby]]></category>

		<guid isPermaLink="false">http://blog.brightbox.co.uk/?p=815</guid>
		<description><![CDATA[As you&#8217;ll all know, Rails has page caching baked right in &#8211; the first time an action is hit, it writes an HTML file of the result to the filesystem. Subsequent hits are served directly from the HTML file at high speed by the web server without ever involving your Rails app. Expiring the cache [...]]]></description>
			<content:encoded><![CDATA[<p>As you&#8217;ll all know, <a href="http://guides.rubyonrails.org/caching_with_rails.html#page-caching">Rails has page caching baked right in</a> &#8211; the first time an action is hit, it writes an HTML file of the result to the filesystem. Subsequent hits are served directly from the HTML file at high speed by the web server without ever involving your Rails app.</p>
<p>Expiring the cache is just a case of deleting the HTML file. But what if you want to expire an entire tree of cache files? Say you change something in a header or footer, so every single page needs expiring at once.</p>
<p>The usual way to do this is to just <a href="http://ruby-doc.org/core/classes/FileUtils.html#M004337">delete</a> the entire page cache tree, with <code>FileUtils.rm_rf</code>.  This works pretty well, but with a big tree under high load you can hit race conditions.  Whilst your <code>rm_rf</code> process is deleting the tree, file by file, your web server is still looking in there for page cache files and Rails is still trying to write them, so stale pages can keep being served and freshly written pages can be deleted straight away.<br />
<span id="more-815"></span></p>
<p>This is easily solvable.  On a POSIX-compliant filesystem, like ext3, the rename operation is atomic &#8211; it either happens or it doesn&#8217;t, and there is no in-between state where the directory is half renamed.  So, before running the <code>rm_rf</code>, you <a href="http://ruby-doc.org/core/classes/FileUtils.html#M004330">rename</a> your highest-level cache directory to something temporary.  This means the cache expiry is instantaneous, even if you have 100 MB of page cache, and you won&#8217;t get Rails writing new page cache files into it whilst you delete it.</p>
<p>It&#8217;s a good idea to use a robust temporary filename format so two processes don&#8217;t end up renaming a cache directory to the same thing at the same time, especially if your page cache is on a shared filesystem.</p>
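<p>As a minimal sketch of one such format: the helper below combines the hostname, the process id, a timestamp and some random bytes. The name <code>unique_cache_suffix</code> and the use of <code>SecureRandom</code> are assumptions made for illustration, not part of Rails.</p>

```ruby
require 'socket'
require 'securerandom'

# Hypothetical helper: build a suffix that is very unlikely to collide
# across hosts, processes and repeated calls within the same second.
def unique_cache_suffix
  [Socket.gethostname, Process.pid, Time.now.to_i, SecureRandom.hex(8)].join('-')
end

puts unique_cache_suffix
```

<p>The hostname and process id separate nodes and processes from each other, while the random part covers two expiries started by the same process within the same second.</p>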
<p>An example snippet of code follows. It assumes you want to expire pages from the controller named <code>entries</code>.</p>
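<pre><code>require 'socket'
require 'fileutils'
# Unique suffix: hostname, pid, timestamp and a random number joined into one string
tmp_cache_dir = [Socket.gethostname, Process.pid, Time.now.to_i, rand(0xffff)].join
page_cache_tree = File.join(ApplicationController.page_cache_directory, 'entries')
# Build the temporary path once so the rename and the delete target the same directory
expired_tree = page_cache_tree + '-' + tmp_cache_dir
FileUtils.mv(page_cache_tree, expired_tree)
FileUtils.rm_rf(expired_tree)</code></pre>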
<p>This renames <code>RAILS_ROOT/public/entries</code> to something like <code>RAILS_ROOT/public/entries-hostname22351125837834752773</code> (which should be sufficiently unique across a number of nodes in a cluster as to avoid collisions) and then deletes it.</p>
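<p>The same idea can be wrapped in a small helper and tried outside Rails. This is only a sketch: <code>expire_tree_atomically</code> is a hypothetical name, and it runs against a throwaway temporary directory instead of a real page cache. It calls <code>File.rename</code> directly, because <code>FileUtils.mv</code> falls back to copy-and-delete when the destination is on another filesystem, which would quietly lose the atomicity.</p>

```ruby
require 'fileutils'
require 'securerandom'
require 'tmpdir'

# Hypothetical helper: move a cache tree aside in one rename, then delete it.
# Returns the temporary path it used, or nil if there was nothing to expire.
def expire_tree_atomically(tree)
  doomed = "#{tree}-#{Process.pid}-#{SecureRandom.hex(8)}"
  begin
    File.rename(tree, doomed) # atomic; raises instead of copying across filesystems
  rescue Errno::ENOENT
    return nil                # no cache tree yet, so nothing to do
  end
  FileUtils.rm_rf(doomed)     # the slow part, now out of the way of readers and writers
  doomed
end

# Demonstration against a throwaway directory rather than a real Rails app.
Dir.mktmpdir do |root|
  tree = File.join(root, 'entries')
  FileUtils.mkdir_p(File.join(tree, '2009'))
  File.write(File.join(tree, '2009', 'post.html'), 'cached page')

  doomed = expire_tree_atomically(tree)
  puts File.exist?(tree)   # false
  puts File.exist?(doomed) # false
end
```

<p>Rescuing <code>Errno::ENOENT</code> lets two expiries race harmlessly: whichever process loses the rename simply finds nothing left to expire.</p>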
<p>If you want to expire the entire page cache, you&#8217;ll need to change the default from <code>RAILS_ROOT/public</code> as you can&#8217;t rename and delete that (it has images and javascripts etc. too!).  Change it to something like <code>RAILS_ROOT/public/page_cache</code>. You&#8217;ll need to update your web server config to consider this new path too.</p>
<p>Remember that rename is only atomic within a single filesystem, so if you symlink your page cache directory from your RAILS_ROOT onto a shared filesystem, the temporary directory you rename to must live on that same shared filesystem, and the delete should happen there too.</p>
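<p>A quick sanity check for this is to compare device numbers before renaming. The sketch below assumes a hypothetical helper called <code>same_filesystem?</code>. <code>File.stat</code> follows symlinks, so a symlinked page cache directory is judged by where it really lives. Matching device numbers are a necessary condition and not a guarantee, so the rename itself should still be allowed to fail loudly.</p>

```ruby
require 'tmpdir'

# Hypothetical helper: true when both paths report the same device number,
# which is a precondition for renaming one into the other atomically.
def same_filesystem?(path_a, path_b)
  File.stat(path_a).dev == File.stat(path_b).dev
end

Dir.mktmpdir do |root|
  page_cache = File.join(root, 'page_cache')
  Dir.mkdir(page_cache)
  # A freshly created subdirectory shares its parent's device.
  puts same_filesystem?(root, page_cache) # true
end
```

<p>In practice the simplest way to stay on one filesystem is to create the temporary name as a sibling of the real cache directory, as the snippet earlier in this post does.</p>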
<p>Also, this has the side effect of working around a bug with our shared filesystem, <a href="http://gluster.com">GlusterFS</a>, which got upset with <a href="http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112">multiple concurrent directory tree deletes</a> (this is now fixed though).</p>]]></content:encoded>
			<wfw:commentRss>http://blog.brightbox.co.uk/posts/expiring-an-entire-page-cache-tree-atomically/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

