<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Beyond Syntax &#187; linux</title>
	<atom:link href="http://www.beyond-syntax.com/category/linux/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.beyond-syntax.com</link>
	<description>Looking beyond syntactical meaning</description>
	<lastBuildDate>Thu, 01 Jul 2010 22:45:23 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Performance Monitoring with OProfile</title>
		<link>http://www.beyond-syntax.com/2010/07/performance-monitoring-with-oprofile/</link>
		<comments>http://www.beyond-syntax.com/2010/07/performance-monitoring-with-oprofile/#comments</comments>
		<pubDate>Thu, 01 Jul 2010 21:32:00 +0000</pubDate>
		<dc:creator>Michael Schultz</dc:creator>
				<category><![CDATA[computers]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[guide]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[oprofile]]></category>
		<category><![CDATA[performance monitoring]]></category>

		<guid isPermaLink="false">http://www.beyond-syntax.com/?p=162</guid>
		<description><![CDATA[oprofile is a low overhead, open-source tool that hooks into Linux and can keep track of CPU event monitoring information.  This is a fairly general statement and for this post I&#8217;ll be using the Intel Penryn microarchitecture, which should have similar event counters to most recent Intel processors.  You can get the canonical [...]]]></description>
			<content:encoded><![CDATA[<p><a title="oprofile home page" href="http://oprofile.sourceforge.net/">oprofile</a> is a low-overhead, open-source tool that hooks into Linux and records CPU event monitoring information.  That is a fairly general statement, so for this post I&#8217;ll be using the Intel Penryn microarchitecture, which should have similar event counters to most recent Intel processors.  You can get the canonical list of event counters from Intel&#8217;s own documentation in Chapter 30, Performance Monitoring, of Volume 3B, System Programming Guide (available from <a title="Intel 64 and IA-32 Architectures Software Developer's Manuals" href="http://www.intel.com/products/processor/manuals/">Intel&#8217;s site</a>).  Alternatively, the Japan Advanced Institute of Science and Technology has an <a href="http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/index.htm">interactive version</a> with all the events for most Intel processors.</p>
<p><span id="more-162"></span></p>
<h3>Event Counters</h3>
<p>In case you are unaware, almost every processor manufactured in recent history has a collection of event counters that are incremented when particular processor events occur.  These events include clock cycles ticking by, instructions being retired, thermal thresholds being passed, and second-level cache misses.</p>
<p>So far, I&#8217;ve only really used the CPU clock cycles, level 1 cache line replacement, and instructions retired event counters.  Your needs might not match mine, so venture over to the Programmer Manual when you need something else!</p>
<h4>Event Ratios</h4>
<p>Related to the event counters are event ratios.  These simple ratios can help you find specific performance issues in your program.  For example, if your program does a lot of memory accesses, the processor may need to replace cache lines frequently.  But cache line replacements occur naturally in any program, so how do we tell when they are excessive?  Simple!  We can just use the ratio of L1 cache replacements to the number of instructions retired.  Then we&#8217;ll have an idea of how many times per instruction an L1 cache line is replaced.</p>
<h3>Using <code>oprofile</code></h3>
<p>First, you&#8217;ll have to be running Linux, then you&#8217;ll want to install the &#8220;oprofile&#8221; package.  Since this software installs kernel modules for monitoring, you&#8217;ll also need root/sudo access to allow the module to be loaded and unloaded for monitoring sessions.  Here,  I&#8217;ll be running as a user and using the <code>sudo</code> command when needed.</p>
<h4><code>opcontrol</code></h4>
<p><code>opcontrol</code> is the main program that lets you interact with the kernel.  If you need a down-and-dirty list of the events available for monitoring, <code>opcontrol --list-events</code> will show you all the event counters at your disposal.</p>
<p>On my processor, the default event to monitor is CPU_CLK_UNHALTED, which will tell me where the processor spent most of its time executing.  If you want to monitor different events, you can specify which event(s) to monitor at the command line.</p>
<pre>$ sudo opcontrol --event L1D_REPL:10000 --event INST_RETIRED:10000</pre>
<p>The <code>:10000</code> after each counter specifies the trigger threshold for raising a processor interrupt.  In other words, every 10,000 instructions retired, the processor raises an interrupt that oprofile catches, incrementing the sample counter for that event.  So, if you see that oprofile has 1 sample of the INST_RETIRED counter, then the processor has seen 10,000 such events.</p>
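<p>As a small sketch of that arithmetic, a sample count can be scaled back up to an approximate event count in the shell.  The function name <code>events_from_samples</code> is made up for illustration and is not part of oprofile.</p>

```shell
# Hypothetical helper (not part of oprofile): estimate how many raw events
# a sample count represents, given the threshold passed to --event.
# usage: events_from_samples SAMPLES COUNT
events_from_samples() {
    echo $(( $1 * $2 ))
}

events_from_samples 1 10000    # one sample at a threshold of 10,000
```

<p>The result is only an estimate, since events that have not yet reached the next threshold are not counted.</p>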
<p>Now that we have the event counters configured, we can start the monitoring.</p>
<pre>$ sudo opcontrol --start</pre>
<p>Since the system is doing other activities, it is best if what you want to monitor can monopolize the system for a while.  In my case, I built a simple program that purposefully causes the L1 cache to have a lot of misses (<a href="http://dev.beyond-syntax.com/l1thrash/l1thrash.c">l1thrash source code</a>).  I&#8217;ll also set the program to execute on one processor (CPU 1).</p>
<pre>$ taskset 02 ./l1thrash</pre>
<p>After it finishes executing, stop oprofile from running and save the profile session on the disk.</p>
<pre>$ sudo opcontrol --stop
$ sudo opcontrol --save l1thrash</pre>
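<p>The configure, start, run, stop, and save steps above can be strung together in a small wrapper.  This is only a sketch: the function names and the <code>DRY_RUN</code> switch are invented for illustration, and the events are the two used in this post.</p>

```shell
# Sketch of one profiling session, using only the opcontrol steps shown above.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: profile_session SESSION_NAME COMMAND [ARGS...]
profile_session() {
    session=$1
    shift
    run sudo opcontrol --event L1D_REPL:10000 --event INST_RETIRED:10000
    run sudo opcontrol --start
    run "$@"                               # the workload being profiled
    run sudo opcontrol --stop
    run sudo opcontrol --save "$session"
}
```

<p>For example, <code>profile_session l1thrash taskset 02 ./l1thrash</code> would repeat the session from this section; with <code>DRY_RUN=1</code> set it only prints the five commands.</p>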
<p>Now we have our profile saved to disk and we can view it with <code>opreport</code>.</p>
<h4><code>opreport</code></h4>
<p>Finally, we get to see how the program handled!  Since we were smart and saved our profile to a session, we&#8217;ll have to specify that at the command line.  You might want to pipe the output to less since it can be long at times.  On my eight core system the output looks ugly.</p>
<pre>$ opreport session:l1thrash
CPU: Core 2, speed 2494.04 MHz (estimated)
Counted L1D_REPL events (Cache lines allocated in the L1 data cache) with a unit mask of 0x0f (No unit mask) count 10000
Samples on CPU 0
Samples on CPU 1
Samples on CPU 2
Samples on CPU 3
Samples on CPU 4
Samples on CPU 5
Samples on CPU 6
Samples on CPU 7
    cpu:0|            cpu:1|            cpu:2|            cpu:3|            cpu:4|            cpu:5|            cpu:6|            cpu:7|
  samples|      %|  samples|      %|  samples|      %|  samples|      %|  samples|      %|  samples|      %|  samples|      %|  samples|      %|
------------------------------------------------------------------------------------------------------------------------------------------------
      541 95.7522      2969  0.9630       301 92.6154       484 69.9422       797 92.6744       707 88.2647       675 90.3614       707 89.8348 vmlinux
        7  1.2389        21  0.0068         6  1.8462         6  0.8671         9  1.0465         6  0.7491         6  0.8032         5  0.6353 oprofile
        6  1.0619         3 9.7e-04         6  1.8462         1  0.1445         3  0.3488         2  0.2497         1  0.1339         4  0.5083 nf_ses_watch
        5  0.8850        16  0.0052         6  1.8462         7  1.0116        25  2.9070        23  2.8714        30  4.0161        27  3.4307 libc-2.5.so
        3  0.5310         2 6.5e-04         1  0.3077         1  0.1445         2  0.2326         4  0.4994         0       0         1  0.1271 libpython2.4.so.1.0
        1  0.1770         0       0         0       0         0       0         0       0         0       0         0       0         0       0 e1000e
        1  0.1770         0       0         0       0         0       0         0       0         0       0         0       0         0       0 irqbalance
        1  0.1770         1 3.2e-04         1  0.3077         0       0         0       0         0       0         0       0         0       0 sshd
        0       0         2 6.5e-04         0       0         0       0         6  0.6977         8  0.9988         8  1.0710        18  2.2872 bash
        0       0         0       0         0       0         0       0         1  0.1163         0       0         0       0         1  0.1271 gawk
        0       0         0       0         0       0         3  0.4335         1  0.1163         2  0.2497         1  0.1339         0       0 bnx2
        0       0         0       0         3  0.9231         0       0         0       0         0       0         0       0         0       0 ehci_hcd
        0       0    305283 99.0179         0       0         0       0         1  0.1163         4  0.4994         1  0.1339         2  0.2541 l1thrash
        0       0        10  0.0032         0       0         0       0        14  1.6279        12  1.4981        19  2.5435        13  1.6518 ld-2.5.so
        0       0         3 9.7e-04         0       0         0       0         1  0.1163         1  0.1248         2  0.2677         2  0.2541 libcrypto.so.0.9.8b
        0       0         0       0         1  0.3077         0       0         0       0         0       0         0       0         0       0 libm-2.5.so
        0       0         0       0         0       0         0       0         0       0         0       0         1  0.1339         0       0 libpthread-2.5.so
        0       0         0       0         0       0         0       0         0       0         0       0         1  0.1339         0       0 syslogd
        0       0         0       0         0       0         0       0         0       0         1  0.1248         0       0         0       0 which
        0       0         0       0         0       0         1  0.1445         0       0         0       0         0       0         0       0 libcups.so.2
        0       0         0       0         0       0         0       0         0       0         0       0         2  0.2677         0       0 libusb-0.1.so.4.4.4
        0       0         0       0         0       0       189 27.3121         0       0        30  3.7453         0       0         7  0.8895 oprofiled
        0       0         1 3.2e-04         0       0         0       0         0       0         1  0.1248         0       0         0       0 cupsd</pre>
<p>You may notice that the rows are sorted in descending order by the number of samples, but only by the first column (CPU 0).  On CPU 1 (where we ran <code>l1thrash</code>) the order isn&#8217;t close to correct.  Luckily, we know that the bulk of our program only ran on CPU 1, so we can reissue the <code>opreport</code> command specifying that we only care about that processor.</p>
<pre>$ opreport session:l1thrash cpu:1
CPU: Core 2, speed 2494.04 MHz (estimated)
Counted INST_RETIRED.ANY_P events (number of instructions retired) with a unit mask of 0x00 (No unit mask) count 10000
Counted L1D_REPL events (Cache lines allocated in the L1 data cache) with a unit mask of 0x0f (No unit mask) count 10000
INST_RETIRED:1...|   L1D_REPL:10000|
  samples|      %|  samples|      %|
------------------------------------
  1834500 91.0882    305283 99.0179 l1thrash
   154499  7.6713      2969  0.9630 vmlinux
    21655  1.0752        21  0.0068 oprofile
     2176  0.1080        16  0.0052 libc-2.5.so
      442  0.0219        10  0.0032 ld-2.5.so
      435  0.0216         2 6.5e-04 bash
      108  0.0054         3 9.7e-04 libcrypto.so.0.9.8b
       47  0.0023         3 9.7e-04 nf_ses_watch
       43  0.0021         1 3.2e-04 sshd
       35  0.0017         2 6.5e-04 libpython2.4.so.1.0
       10 5.0e-04         0       0 libavahi-common.so.3.4.3
       10 5.0e-04         1 3.2e-04 cupsd
        9 4.5e-04         0       0 libcups.so.2
        7 3.5e-04         0       0 bnx2
        3 1.5e-04         0       0 libavahi-core.so.4.0.5
        1 5.0e-05         0       0 libpthread-2.5.so
        1 5.0e-05         0       0 timemodule.so</pre>
<p>That looks better!  Since we&#8217;ve narrowed down the output to one CPU, we now get to see both events that we monitored too.  You can see that the majority of the time was spent in our <code>l1thrash</code> program, but how did it do?</p>
<p>We know that the number of samples is the number of times that the event counter on the processor hit 10,000 for both counters.  So, we find that our <code>l1thrash</code> program caused <img src='http://s.wordpress.com/latex.php?latex=%28305283%29%2810000%29%20%3D%203052830000&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(305283)(10000) = 3052830000' title='(305283)(10000) = 3052830000' class='latex' /> level 1 cache replacements and retired <img src='http://s.wordpress.com/latex.php?latex=%281834500%29%2810000%29%20%3D%2018345000000&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(1834500)(10000) = 18345000000' title='(1834500)(10000) = 18345000000' class='latex' /> instructions.  Egads!  Is that good or bad?  Well, now we can throw in our ratio calculation for the L1 data cache miss:</p>
<p style="text-align:center;"><img src='http://s.wordpress.com/latex.php?latex=L1_%7Bmiss%7D%20%3D%20%5Cfrac%7BL1D%5C_REPL%7D%7BINST%5C_RETIRED%7D%20%3D%20%5Cfrac%7B305283%7D%7B1834500%7D%20%3D%20%5Csim%2016.6%5C%25&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='L1_{miss} = \frac{L1D\_REPL}{INST\_RETIRED} = \frac{305283}{1834500} = \sim 16.6\%' title='L1_{miss} = \frac{L1D\_REPL}{INST\_RETIRED} = \frac{305283}{1834500} = \sim 16.6\%' class='latex' /></p>
<p>That seems pretty bad to me!  We can also see that the Linux kernel (<code>vmlinux</code>) had a ratio of 2,969:154,499, or about 1.9%, which is a fairly typical miss ratio.</p>
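<p>The same ratio can be computed from the shell.  This is a sketch with made-up helper names (<code>ratio_percent</code> and <code>report_ratios</code>); the second helper assumes the two-event <code>opreport</code> layout shown above, with INST_RETIRED samples in the first column and L1D_REPL samples in the third.</p>

```shell
# Hypothetical helper: ratio of two sample counts as a percentage.
# Both events were sampled every 10,000 occurrences, so the thresholds
# cancel and the sample counts can be divided directly.
# usage: ratio_percent L1D_REPL_SAMPLES INST_RETIRED_SAMPLES
ratio_percent() {
    awk -v num="$1" -v den="$2" 'BEGIN { printf "%.1f\n", 100 * num / den }'
}

# Hypothetical filter: print the ratio for every image in a report.
# usage: opreport session:l1thrash cpu:1 | report_ratios
# Header lines are skipped because they do not have exactly five fields
# starting with a sample count.
report_ratios() {
    awk 'NF == 5 && $1 ~ /^[0-9]+$/ && $1 != 0 {
        printf "%-24s %.1f%%\n", $5, 100 * $3 / $1
    }'
}

ratio_percent 305283 1834500    # l1thrash
ratio_percent 2969 154499       # vmlinux
```

<p>The two calls at the end print 16.6 and 1.9, matching the figures worked out by hand.</p>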
<h3>A Second Example</h3>
<p>This is a real example of a program I am actively trying to improve.  The program is a kernel module (<code>nf_ses_watch</code>) designed to intercept packets at a decent rate, but it is not performing well.  Here I&#8217;ll use the default CPU_CLK_UNHALTED event monitor to see where the processor spends most of its time.</p>
<pre>$ # I've already loaded the kernel module and started my packet generator
$ sudo opcontrol --event default
$ sudo opcontrol --start
$ # I'll wait about 30 seconds so there are enough samples to be meaningful
$ sudo opcontrol --stop
$ sudo opcontrol --save bombard</pre>
<p>Now I have my saved session and can look at the profile.  I&#8217;ve also taken the time to set the interrupt affinity of the Ethernet device to a specific processor (CPU 7), so now we can see if all the time was spent in my code or Linux code.</p>
<pre>$ opreport session:bombard cpu:7
CPU: Core 2, speed 2494.04 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10000
CPU_CLK_UNHALT...|
  samples|      %|
------------------
   737746 86.6169 nf_ses_watch
    88183 10.3533 vmlinux
    16810  1.9736 e1000e
     3680  0.4321 oprofiled
     2594  0.3046 oprofile
     1578  0.1853 libc-2.5.so
      900  0.1057 bash
       78  0.0092 ld-2.5.so
       52  0.0061 ophelp
       26  0.0031 libavahi-common.so.3.4.3
       22  0.0026 libavahi-core.so.4.0.5
       13  0.0015 gawk
        9  0.0011 libcrypto.so.0.9.8b
        9  0.0011 libpython2.4.so.1.0
        9  0.0011 sshd
        8 9.4e-04 bnx2
        4 4.7e-04 libpthread-2.5.so
        3 3.5e-04 grep
        2 2.3e-04 ipv6
        2 2.3e-04 auditd
        1 1.2e-04 cat
        1 1.2e-04 libdl-2.5.so
        1 1.2e-04 libm-2.5.so
        1 1.2e-04 libpcre.so.0.0.1
        1 1.2e-04 dirname
        1 1.2e-04 automount</pre>
<p>Wow!  Over 86% of the time we were executing code in the <code>nf_ses_watch</code> kernel module (my code)!  Let&#8217;s see if we can dig a little deeper.  oprofile has already done the work for us: it tracks which symbol within a piece of code was active when each sample was taken, and the <code>--symbols</code> option to <code>opreport</code> shows that breakdown (this results in a very long list of kernel symbols).  But, in the case of a kernel module, <code>opreport</code> doesn&#8217;t know where to find the symbol names, so we have to tell it where the kernel module lives with <code>--image-path</code>.</p>
<pre>$ opreport session:bombard cpu:7 --symbols --image-path ~/nf_ses_watch/kmod | head
warning: /bnx2 could not be found.
warning: /e1000e could not be found.
warning: /ipv6 could not be found.
warning: /oprofile could not be found.
warning: /sbin/auditd could not be read.
CPU: Core 2, speed 2494.04 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10000
warning: could not check that the binary file /home/mjschultz/mon/module/kmod/nf_ses_watch.ko has not been modified since the profile was taken. Results may be inaccurate.
samples  %        image name               app name                 symbol name
733996   86.1767  nf_ses_watch.ko          nf_ses_watch             do_rip_entry
16810     1.9736  e1000e                   e1000e                   (no symbols)
10308     1.2102  vmlinux                  vmlinux                  rb_get_reader_page
9785      1.1488  vmlinux                  vmlinux                  read_hpet
8701      1.0216  vmlinux                  vmlinux                  ring_buffer_consume
3606      0.4234  vmlinux                  vmlinux                  netif_receive_skb
3530      0.4144  vmlinux                  vmlinux                  kfree</pre>
<p><em>(I&#8217;ve piped the output through <code>head</code> to keep it reasonable.)</em>  We can see the real dirt here!  By a huge margin, the <code>do_rip_entry</code> symbol in my <code>nf_ses_watch</code> module executes more than the Ethernet driver that is handling the raw packets.  So that is where I&#8217;ll be looking when I try to resolve my bug.</p>
<h3>Conclusions</h3>
<p>If you are looking to optimize your program, oprofile is a great tool to use.  The default event monitor (CPU clock cycles on most processors) can give you an idea of what part of your program is using most of the processor time.  Once you know that, you can focus your efforts on reducing the number of cycles spent in that function.  But don&#8217;t forget about all those other events too.  If you have a memory-intensive application, maybe you could reduce the memory contention and get an effective speedup with almost no refactoring!</p>
<p><em>(I&#8217;ve tried my best to be accurate with this information and I welcome any explicit corrections or clarifications.)</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.beyond-syntax.com/2010/07/performance-monitoring-with-oprofile/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The nth Backup Solution</title>
		<link>http://www.beyond-syntax.com/2010/02/the-nth-backup-solution/</link>
		<comments>http://www.beyond-syntax.com/2010/02/the-nth-backup-solution/#comments</comments>
		<pubDate>Fri, 19 Feb 2010 20:01:24 +0000</pubDate>
		<dc:creator>Michael Schultz</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[os x]]></category>
		<category><![CDATA[backups]]></category>
		<category><![CDATA[cron]]></category>

		<guid isPermaLink="false">http://www.beyond-syntax.com/?p=147</guid>
		<description><![CDATA[In the past, I had developed my own backup solution.  Unfortunately, over time it didn&#8217;t work out (mainly from changing systems, moving, using a laptop instead of a desktop, and maintaining it).  However, I still like the idea of incremental backups as well as a mirrored version of my files (it saves space and lets [...]]]></description>
			<content:encoded><![CDATA[<p>In the past, I had developed my own <a href="http://www.beyond-syntax.com/2007/10/automatic-backups-using-cron-and-tar/">backup solution</a>.  Unfortunately, over time it didn&#8217;t work out (mainly from changing systems, moving, using a laptop instead of a desktop, and maintaining it).  However, I still like the idea of incremental backups as well as a mirrored version of my files (it saves space and lets me keep a history going back some number of days).</p>
<p><span id="more-147"></span>Now that I&#8217;m somewhat settled (and a little wiser), I decided to once more try my hand at a solid backup plan.  This was mainly motivated by a recent reinstall of my wife&#8217;s system (no lost data, just operating system upgrade).  Since I don&#8217;t have vast amounts of time on my hands, I didn&#8217;t want to forward port my old solution to get it to work on Linux and Mac OS X, so I looked for new solutions.  I recalled <a href="http://www.mscs.mu.edu/~brylow/">my advisor</a> from Marquette mentioning <a href="http://rdiff-backup.nongnu.org/">rdiff-backup</a> as what he put on his wife&#8217;s machine during her dissertation days.</p>
<p>As it turns out, rdiff-backup does most of what I wanted out of my backup solution and, in fact, does it a little better.  The main issue I had with my system was that it would periodically (monthly) take a snapshot of my home directory, and after that it would periodically (weekly) build incremental diffs based off that snapshot.  What this boils down to is that, after a catastrophic failure, I would roll back to the most recent snapshot, then progress forward in time to the most recent incremental file.  Not bad, but if you want better-than-weekly granularity it could be a lot of work.  Obviously, I had scripted this part, but it is still wasted time.  With rdiff-backup, it would be a single copy operation to restore to the most recent version.  If you wanted older versions you could roll back through the incremental diffs (again, it is automated).</p>
<p>The other feature that I needed was the ability to remove backups/incremental data older than some time frame (monthly).  Again, rdiff-backup gives me this ability at the command line.  Other bonuses include the fact that it is cross-platform (via macports or most Linux repositories), written in Python, and not maintained by me!</p>
<p>With the basic service in place, it was time to make it automated.  Again, linked off rdiff-backup&#8217;s page is an article on <a href="http://arctic.org/~dean/rdiff-backup/unattended.html">how to do unattended backups</a>.  Besides the typical unattended SSH-keypair-without-a-passphrase and protecting-the-account steps, it introduced me to a new trick using SSH config (one which, for some reason, I had never put together despite knowing how to do it).</p>
<pre>Host athena-backup
	Hostname athena.olympus
	User backups
	IdentityFile ~/.ssh/backups_rsa
	Compression yes
	Protocol 2</pre>
<p>Now, if I try to <code>ssh athena-backup</code>, it&#8217;ll automatically use the correct identity file and user name which saves me from having to specify it on the command line (which you can&#8217;t typically do with wrapper functionality).  More importantly, it doesn&#8217;t break normal SSHing onto that host since we made it a special host (that&#8217;s the part I never put together, even though I knew it was possible).</p>
<p>The next issue I never took the time to think about before was my having moved from desktop to laptop (thereby making 1:00am backups worthless, since the laptop isn&#8217;t always on).  Because rdiff-backup does a roll-back model instead of my roll-forward model, I decided to do hourly backups to my home machine, so I&#8217;ll likely catch at least one of these a day.  But I&#8217;m not always at home!  Getting around that is trivial: I&#8217;ll just ping the backup server before trying.  If it doesn&#8217;t respond, I don&#8217;t back up.  This is done through:</p>
<pre>ping -c1 -t1 $SERVER &gt; /dev/null 2&gt;&amp;1</pre>
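<p>where <code>$SERVER</code> is just the name of the backup server.  It pings the host once with a timeout of 1 second; if it succeeds the backup continues, otherwise the script exits.  (The <code>-t</code> timeout is the Mac OS X/BSD spelling; on Linux, <code>ping -t</code> sets the TTL and the per-reply timeout is <code>-W</code>.)</p>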
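<p>Putting the ping check and the SSH host alias together might look like the sketch below.  This is not the attached <code>rdiff-backup.sh</code>: the variable names, the <code>/backups/laptop</code> destination, and the one-month retention are assumptions made for illustration, so check the rdiff-backup man page before relying on the exact options.</p>

```shell
# Sketch only: not the attached rdiff-backup.sh.
# PING_HOST is the real host name, SSH_HOST the alias from the SSH config
# above, and DEST a made-up path on the backup server.
PING_HOST=${PING_HOST:-athena.olympus}
SSH_HOST=${SSH_HOST:-athena-backup}
DEST=${DEST:-/backups/laptop}
RDIFF_BACKUP=${RDIFF_BACKUP:-rdiff-backup}

# Succeeds only if the server answers a single ping within about a second.
# -t1 is the timeout on Mac OS X/BSD ping; Linux ping spells it -W1.
server_reachable() {
    ping -c1 -t1 "$1" > /dev/null 2>&1
}

# usage: backup_home   (meant to be run hourly from cron)
backup_home() {
    server_reachable "$PING_HOST" || return 0   # not at home: skip quietly
    "$RDIFF_BACKUP" "$HOME" "${SSH_HOST}::${DEST}" || return 1
    # Drop increments older than one month.
    "$RDIFF_BACKUP" --remove-older-than 1M "${SSH_HOST}::${DEST}"
}
```

<p>Skipping quietly when the server is unreachable keeps an hourly cron job from producing output every time the laptop is away from home.</p>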
<p>Of course, setting up the cronjob is as simple as:</p>
<pre>0 */1 * * * $HOME/.crontab/rdiff-backup.sh</pre>
<p>Hopefully this time around the backup solution is more robust than before.</p>
<p>Attachment: <a href="http://dev.beyond-syntax.com/scripts/rdiff-backup.sh">rdiff-backup.sh</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.beyond-syntax.com/2010/02/the-nth-backup-solution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Remote Instance of Firefox via SSH -X</title>
		<link>http://www.beyond-syntax.com/2009/07/remote-instance-of-firefox-via-ssh-x/</link>
		<comments>http://www.beyond-syntax.com/2009/07/remote-instance-of-firefox-via-ssh-x/#comments</comments>
		<pubDate>Mon, 27 Jul 2009 18:46:57 +0000</pubDate>
		<dc:creator>Michael Schultz</dc:creator>
				<category><![CDATA[bash]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[firefox]]></category>
		<category><![CDATA[shell]]></category>

		<guid isPermaLink="false">http://www.beyond-syntax.com/?p=95</guid>
		<description><![CDATA[Firefox is a pretty decent web browser.  However, it can be a bit more clever than I want it at times.  For example, if I want to SSH into a remote machine and launch an instance of Firefox (to take on the remote machine&#8217;s IP address or access localhost), I would have to [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Read about the Firefox web browser" href="http://www.getfirefox.com/">Firefox</a> is a pretty decent web browser.  However, it can be a bit more clever than I want it at times.  For example, if I want to SSH into a remote machine and launch an instance of Firefox (to take on the remote machine&#8217;s IP address or access localhost), I would have to close the local instance then launch the remote instance.  That is annoying and unacceptable behaviour.</p>
<p>Luckily, the solution is fairly straightforward.  Once you have SSH&#8217;d into a remote host (using <code>ssh -X</code>), you simply need to run <code>firefox -no-remote</code>.  Of course you may want to tack on <code>&gt; /dev/null</code> and an ampersand <code>&amp;</code> to ignore the output and background the task. (Thanks to <a href="http://www.theopensourcerer.com/2007/11/15/remote-firefox-over-xssh/">The Open Sourcer</a>.)</p>
<p>With Firefox 2.x this behaviour was somewhat undocumented, but with Firefox 3.x, running <code>firefox --help</code> from the command line shows the <code>-no-remote</code> option.  It also seems that the default (i.e. <code>-remote</code>) is &#8220;documented&#8221; on Mozilla&#8217;s site for <a href="http://www.mozilla.org/unix/remote.html">Remote Control of UNIX Mozilla</a>.</p>
<p>If you wanted to make the <code>-no-remote</code> behaviour the default when SSH&#8217;d into remote machines, you could simply add a few lines to your bash profile to alias the <code>firefox</code> command.</p>
<pre># If we're forwarding X over SSH, make firefox execute on this machine
if [ -n "$SSH_CONNECTION" -a -n "$DISPLAY" ]; then
    alias firefox='firefox -no-remote'
fi</pre>
<p>At least that is what I did.</p>
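<p>A variation, if you also want the <code>/dev/null</code> redirect and the ampersand built in, is to use a shell function instead of a plain alias.  This is a sketch: the function name <code>firefox_no_remote</code> and the <code>FIREFOX_BIN</code> override are made up for illustration.</p>

```shell
# Sketch: launch a separate Firefox instance, discard its output, and
# put it in the background. FIREFOX_BIN is a hypothetical override for
# pointing at a different binary.
firefox_no_remote() {
    command "${FIREFOX_BIN:-firefox}" -no-remote "$@" > /dev/null 2>&1 &
}

# If we're forwarding X over SSH, make "firefox" use the function.
if [ -n "$SSH_CONNECTION" ] && [ -n "$DISPLAY" ]; then
    alias firefox=firefox_no_remote
fi
```

<p>Any arguments, such as a URL, are passed straight through to Firefox.</p>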
]]></content:encoded>
			<wfw:commentRss>http://www.beyond-syntax.com/2009/07/remote-instance-of-firefox-via-ssh-x/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Quick Introduction to Makefiles</title>
		<link>http://www.beyond-syntax.com/2009/02/a-quick-introduction-to-makefiles/</link>
		<comments>http://www.beyond-syntax.com/2009/02/a-quick-introduction-to-makefiles/#comments</comments>
		<pubDate>Thu, 05 Feb 2009 02:38:54 +0000</pubDate>
		<dc:creator>Michael Schultz</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[make]]></category>
		<category><![CDATA[makefiles]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.beyond-syntax.com/?p=19</guid>
		<description><![CDATA[Today at the Marquette Student ACM meeting, I gave a short presentation (PDF) about development on Linux.  Specifically using Makefiles.  As promised I have uploaded it to this site and I will give a little more information in this post.

Variables
The two main types of variables in a Makefile are &#8220;recursively expanded&#8221; (=; equal to) and [...]]]></description>
			<content:encoded><![CDATA[<p>Today at the <a href="http://acm.mscs.mu.edu/">Marquette Student ACM</a> meeting, I gave a short <a href="http://www.beyond-syntax.com/uploads/2009/02/linux-dev.pdf">presentation</a> (PDF) about development on Linux.  Specifically using Makefiles.  As promised I have uploaded it to this site and I will give a little more information in this post.</p>
<p><span id="more-19"></span></p>
<p><strong>Variables</strong></p>
<p>The two main types of variables in a Makefile are &#8220;recursively expanded&#8221; (<code>=</code>; equal to) and &#8220;simply expanded&#8221; (<code>:=</code>; set equal to, thanks Algol).  Recursively expanded is by far the most common and, realistically, the most confusing.  This form of variable does not perform the substitution until the last possible moment (lazy evaluation), so if you have the line <code>SOURCES = ${CONFIG} demo.c</code>, make will remember that you use <code>${CONFIG}</code> until its value must be known.  If the value of CONFIG changes, the newest value of CONFIG will be used when SOURCES is evaluated.  Simple expansion occurs when the variable is declared (eager evaluation): if another variable (say, CFLAGS) is referenced in a simply expanded declaration, it is replaced with the current value of CFLAGS.</p>
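<p>To see the difference in action, here is a small sketch that feeds a five-line makefile to GNU make on standard input (<code>-f -</code>).  The variable names <code>LAZY</code> and <code>EAGER</code> and the function name are made up for the demonstration.</p>

```shell
# Sketch: compare "=" and ":=" by changing CONFIG after both variables
# have been defined. Assumes GNU make is installed as "make".
expansion_demo() {
    printf '%s\n' \
        'CONFIG = old.c' \
        'LAZY = ${CONFIG} demo.c' \
        'EAGER := ${CONFIG} demo.c' \
        'CONFIG = new.c' \
        'all: ; @echo "lazy: ${LAZY}"; echo "eager: ${EAGER}"' \
        | make -s -f -
}
```

<p>Running <code>expansion_demo</code> prints <code>lazy: new.c demo.c</code> followed by <code>eager: old.c demo.c</code>: the recursively expanded variable picks up the later value of CONFIG, while the simply expanded one kept the value CONFIG had when it was declared.</p>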
<p><strong>Targets</strong></p>
<p>A target occurs on the left-hand side of a rule; typically this is the name of the file you want to build.  There are special cases, most commonly &#8220;clean.&#8221;  Usually when someone runs <code>make clean</code> they want all the object files generated by a previous make to be removed.  However, if you were to create a file named <code>clean</code>, the rule would never execute because the file clean is always up-to-date.  This can be remedied with a PHONY target.</p>

<div class="wp_syntax"><div class="code"><pre class="make" style="font-family:monospace;"><span style="color: #990000;">.PHONY</span><span style="color: #004400;">:</span> clean
clean<span style="color: #004400;">:</span>
	rm <span style="color: #004400;">-</span>f <span style="color: #004400;">*.</span>o</pre></div></div>

<p>This declares clean as a phony target, which tells make to run the rule whenever it is requested and to ignore any file named clean.</p>
<p><strong>More on Variables</strong></p>
<p>The important &#8220;automatic&#8221; variables are talked about in the presentation (<code>$@</code>, <code>$&lt;</code>, <code>$^</code>, <code>$+</code>, and <code>$?</code>).  Also useful is the % pattern character.  For example, the rule <code>%.o: %.c</code> in a Makefile tells make that any file ending in .o depends on the file of the same name ending in .c (i.e. a rule <code>foo.o: foo.c</code> automatically exists).  Make will then execute the same commands (say <code>gcc -m32 -Os -c -o foo.o foo.c</code>) for every file ending in .o.  This is a perfect example of why using automatic variables is a great idea: the one recipe has to name a different target and source each time.</p>
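<p>Here is a sketch of a pattern rule using two of those automatic variables.  To keep it harmless, the recipe only echoes the compiler command it would run; the function name and the throwaway directory are invented for the demonstration, and GNU make is assumed.</p>

```shell
# Sketch: one pattern rule serves both foo.o and bar.o. $@ expands to the
# target and $< to its first prerequisite. The recipe only echoes the gcc
# command, so no compiler is needed.
pattern_demo() {
    dir=$(mktemp -d) || return 1
    touch "$dir/foo.c" "$dir/bar.c"
    printf '%s\n' \
        '%.o: %.c ; @echo "gcc -m32 -Os -c -o $@ $<"' \
        | make -s -C "$dir" -f - foo.o bar.o
    rm -rf "$dir"
}
```

<p>Running <code>pattern_demo</code> prints one gcc command per object file, with <code>foo.o foo.c</code> in the first and <code>bar.o bar.c</code> in the second.</p>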
<p>Well, I think that is everything I have to say about Makefiles in a short amount of time.  If you have any questions please feel free to post in the comments.  I&#8217;ll try my best to answer them promptly!</p>
<p><em>(While most of the content here is off the top of my head, I did reference the <a href="http://www.gnu.org/software/make/manual/make.html">GNU make</a> page.  They have everything you wanted to know about make.)</em></p>
<p>Attachment: <a href="http://www.beyond-syntax.com/uploads/2009/02/linux-dev.pdf">Software Development using GNU/Linux</a> (PDF)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.beyond-syntax.com/2009/02/a-quick-introduction-to-makefiles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Automatic backups using cron and tar</title>
		<link>http://www.beyond-syntax.com/2007/10/automatic-backups-using-cron-and-tar/</link>
		<comments>http://www.beyond-syntax.com/2007/10/automatic-backups-using-cron-and-tar/#comments</comments>
		<pubDate>Sat, 06 Oct 2007 17:49:58 +0000</pubDate>
		<dc:creator>Michael Schultz</dc:creator>
				<category><![CDATA[bash]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[backups]]></category>
		<category><![CDATA[cron]]></category>
		<category><![CDATA[shell]]></category>
		<category><![CDATA[tar]]></category>

		<guid isPermaLink="false">http://www.beyond-syntax.com/?p=47</guid>
		<description><![CDATA[This post is an import from a presentation I did in October of 2007.  Since I&#8217;ve made this presentation, I&#8217;ve stopped using my own script and suggest you use another tool for backups.  I hear rdiff-backup is good.  However, I believe this is still a good introduction to bash scripting, cron, and [...]]]></description>
			<content:encoded><![CDATA[<p><em>This post is adapted from a presentation I gave in October of 2007.  Since then I have stopped using my own script, and I suggest you use a dedicated tool for backups.  I hear <a href="http://www.gnu.org/savannah-checkouts/non-gnu/rdiff-backup/">rdiff-backup</a> is good.  However, I believe this is still a good introduction to <code>bash</code> scripting, <code>cron</code>, and <code>tar</code>.</em></p>
<p><span id="more-47"></span></p>
<h3>Original Presentation</h3>
<p>Although it may be less useful without the accompanying speaker, the <a href="http://www.beyond-syntax.com/uploads/2009/02/backup.pdf">original presentation</a> is available.</p>
<h3>Source code for the shell script</h3>
<p>I have made the source code (<a href="http://www.beyond-syntax.com/uploads/2009/02/backup.sh">backup.sh</a>) available for download. The comments at the top of the file describe how to add the script to your crontab.</p>
<h3>Description of the script</h3>
<p>For me, the best way to learn something is to take it line by line and that is what I&#8217;m going to do below. Naturally I will combine lines which are similar to save space. Since the target audience is someone who has never seen a shell script, some information may seem unimportant to you.</p>
<p>The source code shown in pieces on this page may not agree with the source available for download. Odds are I made a change to the script and decided it was not worth updating this page. If you notice something that is greatly different, please contact me.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/sh</span></pre></div></div>

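<p>Selection of a shell interpreter. This <em>must</em> be the first line in the file and be prefixed with <code>#!</code>, the &#8216;hash-bang&#8217; (or &#8216;shebang&#8217; for short).</p>
<p>I use <code>/bin/sh</code> since it seems to be the most universal amongst systems. It should be noted that on many systems <code>/bin/sh</code> is the same as <code>/bin/bash</code>, but not on all of them. The script below uses the <code>function</code> keyword, which is a <code>bash</code> extension, so on a system whose <code>/bin/sh</code> is a stricter shell (<code>dash</code>, for example) it may fail; changing the first line to <code>#!/bin/bash</code> avoids the problem.</p>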

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #007800;">SNAPDIR</span>=<span style="color: #000000; font-weight: bold;">/</span>var<span style="color: #000000; font-weight: bold;">/</span>snapshots<span style="color: #000000; font-weight: bold;">/</span><span style="color: #007800;">$USER</span>
<span style="color: #007800;">RMT_DIR</span>=<span style="color: #ff0000;">&quot;user@hostname:~/snapshots&quot;</span>
<span style="color: #007800;">RMT_OPTIONS</span>=<span style="color: #ff0000;">&quot;-i <span style="color: #007800;">$HOME</span>/.ssh/id_dsa&quot;</span></pre></div></div>

<p>Setting some simple variables. Note that there are no spaces between the variable name, the equal sign, and the value; your script will not work with spaces between these three items.</p>
<p>Here I set the snapshot directory (where snapshots should be stored) to be <code>/var/snapshots/$USER</code>. <code>$USER</code> is an environment variable that holds the name of the user running the script.</p>
<p>Next are <code>RMT_DIR</code> and <code>RMT_OPTIONS</code>, which are quoted. Quotes make sure any spaces become part of the value instead of ending it. Again, <code>$HOME</code> is an environment variable; it holds the user&#8217;s home directory.</p>
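<p>A quick way to convince yourself of what the quotes do is to count words. The helper <code>count_args</code> and the variable <code>DEMO_OPTIONS</code> below are invented for this illustration and are not part of the script.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical helper: report how many arguments it received.
count_args() {
    echo "$#"
}
DEMO_OPTIONS="-i /tmp/demo_key"          # no spaces around =, quotes keep the space
QUOTED=$(count_args "${DEMO_OPTIONS}")   # the value stays one word
UNQUOTED=$(count_args ${DEMO_OPTIONS})   # the value is split on the space
echo "quoted: ${QUOTED}, unquoted: ${UNQUOTED}"</pre></div></div>

<p>This prints <code>quoted: 1, unquoted: 2</code>. The splitting is what lets the script pass <code>${RMT_OPTIONS}</code> unquoted later on: <code>-i</code> and the key path arrive as two separate arguments.</p>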

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #007800;">RMT_CMD</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">which</span> <span style="color: #c20cb9; font-weight: bold;">scp</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">DATE</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">date</span> +<span style="color: #000000; font-weight: bold;">%</span>Y<span style="color: #000000; font-weight: bold;">%</span>m<span style="color: #000000; font-weight: bold;">%</span>d<span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">TAR</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">which</span> <span style="color: #c20cb9; font-weight: bold;">tar</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">MKDIR</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">which</span> <span style="color: #c20cb9; font-weight: bold;">mkdir</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">CHMOD</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">which</span> <span style="color: #c20cb9; font-weight: bold;">chmod</span><span style="color: #7a0874; font-weight: bold;">&#41;</span></pre></div></div>

<p>This group of variables uses command substitution: the command inside <code>$(...)</code> runs in a &#8220;sub-shell&#8221; and its output becomes the value of the variable. For example, <code>$(which scp)</code> executes <code>which scp</code> on the system and assigns the path it prints to <code>RMT_CMD</code>.</p>
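<p>The sketch below shows the same idea on its own. As an assumption on my part it uses <code>command -v</code>, the shell&#8217;s built-in way of locating a program, instead of <code>which</code>; the helper name <code>find_tool</code> and the tool name that is deliberately missing are made up.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical helper: print the path of a tool, much like which does.
find_tool() {
    command -v "$1"
}
TODAY=$(date +%Y%m%d)          # an 8-digit date such as 20071006
if SH_PATH=$(find_tool sh); then
    echo "sh found at ${SH_PATH} on ${TODAY}"
else
    echo "sh not found"
fi
if MISSING=$(find_tool no-such-tool-for-demo); then
    echo "unexpectedly found ${MISSING}"
else
    echo "no-such-tool-for-demo is not installed, so the variable is empty"
fi</pre></div></div>

<p>An empty result is exactly what the script relies on later when it tests whether <code>RMT_CMD</code> is the empty string.</p>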

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #007800;">LAST_FULL</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">stat</span> <span style="color: #660033;">-f</span> <span style="color: #ff0000;">&quot;%Dc %Sc&quot;</span> <span style="color: #660033;">-t</span> <span style="color: #ff0000;">&quot;%Y%m%d&quot;</span> <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span>full-<span style="color: #000000; font-weight: bold;">*</span>.tar.gz \
            <span style="color: #000000;">2</span><span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>dev<span style="color: #000000; font-weight: bold;">/</span>null <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sort</span> <span style="color: #660033;">-n</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">tail</span> -n1<span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">LAST_TS</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #800000;">${LAST_FULL}</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{ print $1}'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">LAST_DATE</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #800000;">${LAST_FULL}</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{ print $2}'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span></pre></div></div>

<p>In the final group of variables we use &#8220;pipes&#8221;, which feed the output of one command into the input of the next. For <code>LAST_FULL</code> we first <code>stat</code> files of the pattern &#8220;full-*.tar.gz&#8221; in the <code>${SNAPDIR}</code> directory. <em>(N.B. <code>${SNAPDIR}</code> dereferences the <code>SNAPDIR</code> variable we set earlier. The curly braces are not strictly necessary; however, I use them when referring to local variables.)</em> For <code>stat</code> I am specifying that the output should be of the form &#8220;&lt;timestamp&gt; &lt;YYYYMMDD&gt;&#8221;; we then <code>sort</code> the output numerically on the first column and, finally, keep only the last line, which belongs to the newest file. The <code>-f</code> and <code>-t</code> options shown here are the BSD form of <code>stat</code>; GNU <code>stat</code> spells its format option <code>-c</code> and uses different format sequences.</p>
<p>Once the last full snapshot time is known, we split it into two variables (<code>LAST_TS</code> and <code>LAST_DATE</code>), again using pipes and awk.</p>
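<p>Because <code>stat</code> differs between systems, here is the same pipeline run on canned input instead. The function <code>sample_listing</code> and its three lines are made-up sample data in the &#8220;&lt;timestamp&gt; &lt;YYYYMMDD&gt;&#8221; shape described above, and the <code>_DEMO</code> variable names are mine.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Made-up sample data in the same "timestamp YYYYMMDD" shape that stat prints.
sample_listing() {
    printf '%s\n' '1191000000 20070928' '1191250000 20071001' '1190400000 20070921'
}
# sort -n orders by the leading number, tail -n1 keeps the largest (newest)
LAST_FULL_DEMO=$(sample_listing | sort -n | tail -n1)
# awk splits the line on whitespace: $1 is the timestamp, $2 the date
LAST_TS_DEMO=$(echo "${LAST_FULL_DEMO}" | awk '{ print $1 }')
LAST_DATE_DEMO=$(echo "${LAST_FULL_DEMO}" | awk '{ print $2 }')
echo "newest: ts=${LAST_TS_DEMO} date=${LAST_DATE_DEMO}"</pre></div></div>

<p>This prints <code>newest: ts=1191250000 date=20071001</code>, the line with the largest timestamp.</p>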

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #000000; font-weight: bold;">!</span> <span style="color: #660033;">-d</span> <span style="color: #800000;">${SNAPDIR}</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
	<span style="color: #800000;">${MKDIR}</span> <span style="color: #800000;">${SNAPDIR}</span>
	<span style="color: #800000;">${CHMOD}</span> go-rwx <span style="color: #800000;">${SNAPDIR}</span>
<span style="color: #000000; font-weight: bold;">fi</span></pre></div></div>

<p>We&#8217;ll start by making sure the snapshots directory exists. If it doesn&#8217;t, make the directory and remove all permissions for everyone but this user.</p>
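<p>The same check can be tried safely in a scratch directory. The function name <code>ensure_private_dir</code> and the <code>DEMO_ROOT</code> variable are hypothetical; the script itself does this inline.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical helper: create the directory in $1 only if it is missing,
# then take away all permissions from group and others.
ensure_private_dir() {
    if [ ! -d "$1" ]; then
        mkdir "$1"
        chmod go-rwx "$1"
    fi
}
DEMO_ROOT=$(mktemp -d)
ensure_private_dir "${DEMO_ROOT}/snapshots"
# Show the first ten characters of the listing: file type and permission bits
ls -ld "${DEMO_ROOT}/snapshots" | cut -c1-10
rm -rf "${DEMO_ROOT}"</pre></div></div>

<p>With a typical umask the listing should read <code>drwx------</code>: the owner keeps full access while group and others have none.</p>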

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> incr <span style="color: #7a0874; font-weight: bold;">&#123;</span>
	<span style="color: #800000;">${TAR}</span> czf <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span>incr-<span style="color: #800000;">${DATE}</span>.tar.gz \
	       <span style="color: #660033;">--exclude-from</span> <span style="color: #007800;">$HOME</span><span style="color: #000000; font-weight: bold;">/</span>.snap-exclude \
	       <span style="color: #660033;">--listed-incremental</span>=<span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #007800;">$USER</span>-<span style="color: #800000;">${LAST_DATE}</span>.snar \
	       <span style="color: #007800;">$HOME</span> <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>dev<span style="color: #000000; font-weight: bold;">/</span>null <span style="color: #000000;">2</span><span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>dev<span style="color: #000000; font-weight: bold;">/</span>null
&nbsp;
	<span style="color: #800000;">${CHMOD}</span> go-rwx <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span>incr-<span style="color: #800000;">${DATE}</span>.tar.gz
&nbsp;
	<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">${RMT_CMD}</span>&quot;</span> <span style="color: #000000; font-weight: bold;">!</span>= <span style="color: #ff0000;">&quot;&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
		<span style="color: #800000;">${RMT_CMD}</span> <span style="color: #800000;">${RMT_OPTIONS}</span> <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span>incr-<span style="color: #800000;">${DATE}</span>.tar.gz \
		       <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #007800;">$USER</span>-<span style="color: #800000;">${LAST_DATE}</span>.snar \
		       <span style="color: #800000;">${RMT_DIR}</span>
	<span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #7a0874; font-weight: bold;">&#125;</span></pre></div></div>

<p>This function creates an incremental backup. Using <code>tar</code>, <code>c</code>reate a g<code>z</code>ipped <code>f</code>ile at <code>${SNAPDIR}/incr-${DATE}.tar.gz</code>. Since you may not want <em>all</em> of your home directory backed up, files matching the patterns listed in the <code>.snap-exclude</code> file are excluded. Now the most important part: <code>--listed-incremental</code> names the snar (&#8220;snapshot archive&#8221;) file, which records what <code>tar</code> saw the last time it ran. If a file is newer than the state recorded in the snar, it is added to the tarball. The last argument to <code>tar</code> is simply the directory to back up. <code>&gt;</code> and <code>2&gt;</code> redirect standard output and standard error to <code>/dev/null</code>, thus suppressing all output.</p>
<p>For the sake of security, we revoke all access to the file for everyone except the current user.</p>
<p>The final step is to check whether <code>RMT_CMD</code> is set; if it is, the script runs it to copy the archive and the snar file to the remote directory. <code>scp</code> works well for this step, as would <code>rsync</code>.</p>
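<p>If you would like to watch <code>--listed-incremental</code> work before trusting it with your home directory, the following sketch runs it twice in a scratch directory. Everything in it (the function name, the <code>data</code> directory, the file names) is invented for the demonstration, and it assumes GNU <code>tar</code>, which is what provides this option.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical demo in a scratch directory; assumes GNU tar.
incr_demo() {
    work=$(mktemp -d)
    mkdir "$work/data"
    touch "$work/data/a.txt"
    # First run: the .snar file does not exist yet, so everything is archived
    tar czf "$work/full.tar.gz" --listed-incremental="$work/demo.snar" -C "$work" data
    sleep 1    # make sure the next file gets a clearly newer timestamp
    touch "$work/data/b.txt"
    # Second run with the same .snar: only what changed since then is archived
    tar czf "$work/incr.tar.gz" --listed-incremental="$work/demo.snar" -C "$work" data
    # List the contents of the incremental archive
    tar tzf "$work/incr.tar.gz"
    rm -rf "$work"
}
incr_demo</pre></div></div>

<p>The listing of the second archive should mention <code>b.txt</code> but not <code>a.txt</code>, since <code>a.txt</code> has not changed since the first run.</p>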

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> full <span style="color: #7a0874; font-weight: bold;">&#123;</span>
	<span style="color: #800000;">${TAR}</span> czf <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span>full-<span style="color: #800000;">${DATE}</span>.tar.gz \
	       <span style="color: #660033;">--exclude-from</span> <span style="color: #007800;">$HOME</span><span style="color: #000000; font-weight: bold;">/</span>.snap-exclude \
	       <span style="color: #660033;">--listed-incremental</span>=<span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #007800;">$USER</span>-<span style="color: #800000;">${DATE}</span>.snar \
	       <span style="color: #007800;">$HOME</span> <span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>dev<span style="color: #000000; font-weight: bold;">/</span>null <span style="color: #000000;">2</span><span style="color: #000000; font-weight: bold;">&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>dev<span style="color: #000000; font-weight: bold;">/</span>null
&nbsp;
	<span style="color: #800000;">${CHMOD}</span> go-rwx <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span>full-<span style="color: #800000;">${DATE}</span>.tar.gz
	<span style="color: #800000;">${CHMOD}</span> go-rwx <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #007800;">$USER</span>-<span style="color: #800000;">${DATE}</span>.snar
&nbsp;
	<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">${RMT_CMD}</span>&quot;</span> <span style="color: #000000; font-weight: bold;">!</span>= <span style="color: #ff0000;">&quot;&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
		<span style="color: #800000;">${RMT_CMD}</span> <span style="color: #800000;">${RMT_OPTIONS}</span> <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span>full-<span style="color: #800000;">${DATE}</span>.tar.gz \
		       <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #007800;">$USER</span>-<span style="color: #800000;">${DATE}</span>.snar \
		       <span style="color: #800000;">${RMT_DIR}</span>
	<span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #7a0874; font-weight: bold;">&#125;</span></pre></div></div>

<p>Creating a full backup is not much different from creating an incremental backup. The only difference is the <code>--listed-incremental</code> file: it is named after today&#8217;s date, so it does not exist yet and <code>tar</code> creates a new snapshot archive, thus starting with a fresh backup and fresh timestamps. The reason for this is explained in the &#8220;Recovering the data&#8221; section.</p>
<p>The rest of the function is mostly the same as an incremental backup.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> normal <span style="color: #7a0874; font-weight: bold;">&#123;</span>
	<span style="color: #666666; font-style: italic;"># Make a full backup if no backup exists</span>
	<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #000000; font-weight: bold;">!</span> <span style="color: #660033;">-f</span> <span style="color: #800000;">${SNAPDIR}</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #007800;">$USER</span>-<span style="color: #800000;">${LAST_DATE}</span>.snar <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
		full;
	<span style="color: #000000; font-weight: bold;">else</span>
		<span style="color: #007800;">ELAPSED</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span>$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">date</span> +<span style="color: #000000; font-weight: bold;">%</span>s<span style="color: #7a0874; font-weight: bold;">&#41;</span> - <span style="color: #800000;">${LAST_TS}</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		<span style="color: #007800;">SNAP_FRAME</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #000000;">7</span> <span style="color: #000000; font-weight: bold;">*</span> <span style="color: #000000;">24</span> <span style="color: #000000; font-weight: bold;">*</span> <span style="color: #000000;">60</span> <span style="color: #000000; font-weight: bold;">*</span> <span style="color: #000000;">60</span> - <span style="color: #000000;">3600</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;"># Check if it has been over a week since a full snapshot</span>
		<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #800000;">${ELAPSED}</span> <span style="color: #660033;">-gt</span> <span style="color: #800000;">${SNAP_FRAME}</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
			<span style="color: #666666; font-style: italic;"># make a full snapshot</span>
			full;
			<span style="color: #666666; font-style: italic;"># clean up files older than 4 weeks</span>
			clean;
		<span style="color: #000000; font-weight: bold;">else</span>
			incr;
		<span style="color: #000000; font-weight: bold;">fi</span>
&nbsp;
		<span style="color: #7a0874; font-weight: bold;">unset</span> ELAPSED SNAP_FRAME
	<span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #7a0874; font-weight: bold;">&#125;</span></pre></div></div>

<p>Here we have the main &#8220;brain&#8221; of the program. It begins by checking whether a backup exists; if none does, the script makes a full backup. If at least one full backup exists, we work out how long it has been since the last full backup and compare that to how frequently full backups should be made. <code>SNAP_FRAME</code> holds that interval in seconds: 7 days * 24 hours / day * 60 minutes / hour * 60 seconds / minute = 604800, minus 3600 (1 hour of slack for time delays), which gives 601200. If more time than that has passed, create a full backup and clean out the old files. Otherwise just create an incremental backup.</p>
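<p>The arithmetic is easy to check in isolation. In this sketch the helper <code>choose_backup</code> and the two timestamps are made up; it only prints which kind of backup the schedule would pick instead of running one.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">SNAP_FRAME_DEMO=$((7 * 24 * 60 * 60 - 3600))   # 601200 seconds
# Hypothetical helper: $1 = current time, $2 = time of the last full backup,
# both in seconds since the epoch (what date +%s prints)
choose_backup() {
    elapsed=$(($1 - $2))
    if [ "${elapsed}" -gt "${SNAP_FRAME_DEMO}" ]; then
        echo "full"
    else
        echo "incr"
    fi
}
choose_backup 1191900000 1191250000    # 650000 s elapsed: prints full
choose_backup 1191500000 1191250000    # 250000 s elapsed: prints incr</pre></div></div>

<p>The hour of slack means a job that starts a few minutes early on the seventh day still counts as &#8220;a week later&#8221; and triggers the full backup.</p>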

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> clean <span style="color: #7a0874; font-weight: bold;">&#123;</span>
	<span style="color: #c20cb9; font-weight: bold;">true</span>
<span style="color: #7a0874; font-weight: bold;">&#125;</span></pre></div></div>

<p>The script isn&#8217;t perfect. I have yet to determine a good way to clean out old files (one that isn&#8217;t tied to either <code>scp</code> or <code>rsync</code>).</p>
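<p>For what it is worth, pruning the <em>local</em> snapshot directory is the easy half. The sketch below is only an idea and not what the script does: <code>clean_local</code> is a hypothetical name, the 28 days is my reading of &#8220;4 weeks&#8221;, and it leaves the copies on the remote host untouched, which is the part I have not solved.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical sketch: prune local snapshots only; remote copies are untouched.
clean_local() {
    # $1 = snapshot directory. -mtime +28 matches files that were last
    # modified more than 28 days ago; matching files are removed.
    find "$1" -type f \( -name '*.tar.gz' -o -name '*.snar' \) -mtime +28 -exec rm -f {} +
}
# Example call (commented out so nothing is deleted by accident):
# clean_local "/var/snapshots/$USER"</pre></div></div>

<p>Before using anything like this, swap the <code>-exec rm -f {} +</code> part for <code>-print</code> and check that the list contains only files you are willing to lose.</p>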

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> usage <span style="color: #7a0874; font-weight: bold;">&#123;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;usage: $0 [type]&quot;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;[type] can be one of the following:&quot;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;  normal - follow the daily incremental and weekly backup schedule&quot;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;  incr   - create an incremental backup of <span style="color: #007800;">$HOME</span> to <span style="color: #007800;">${SNAPDIR}</span>&quot;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;  full   - create a full backup of <span style="color: #007800;">$HOME</span> to <span style="color: #007800;">${SNAPDIR}</span>&quot;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;  clean  - cleanup backups older than one month&quot;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;  usage  - display this screen&quot;</span>
	<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;  --help - display this screen&quot;</span>
&nbsp;
	<span style="color: #7a0874; font-weight: bold;">exit</span> <span style="color: #000000;">1</span>
<span style="color: #7a0874; font-weight: bold;">&#125;</span></pre></div></div>

<p>This simple function displays the usage information if requested. <code>$0</code> is the script name as typed by the user.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">case</span> <span style="color: #ff0000;">&quot;$1&quot;</span> <span style="color: #000000; font-weight: bold;">in</span>
	<span style="color: #ff0000;">'normal'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		normal;
		<span style="color: #000000; font-weight: bold;">;;</span>
	<span style="color: #ff0000;">'incr'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		incr;
		<span style="color: #000000; font-weight: bold;">;;</span>
	<span style="color: #ff0000;">'full'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		full;
		<span style="color: #000000; font-weight: bold;">;;</span>
	<span style="color: #ff0000;">'clean'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		clean;
		<span style="color: #000000; font-weight: bold;">;;</span>
	<span style="color: #ff0000;">'--help'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		usage;
		<span style="color: #000000; font-weight: bold;">;;</span>
	<span style="color: #ff0000;">'usage'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		usage;
		<span style="color: #000000; font-weight: bold;">;;</span>
	<span style="color: #000000; font-weight: bold;">*</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
		normal;
		<span style="color: #000000; font-weight: bold;">;;</span>
<span style="color: #000000; font-weight: bold;">esac</span></pre></div></div>

<p>Finally, the driver of the program, a case statement which reads the first argument (<code>$1</code>) and executes the desired function. The default operation is to run in normal mode, but the user is able to force an incremental update, full update, or clean out old files.</p>
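<p>If <code>case</code> is new to you, this stripped-down version may help. The function <code>dispatch</code> is a hypothetical stand-in that echoes a label instead of running a backup; it also shows that two patterns can share one branch with <code>|</code>, which the script above does not use.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical stand-in for the script's driver: echo instead of backing up.
dispatch() {
    case "$1" in
        'incr')
            echo "incremental backup"
            ;;
        'full')
            echo "full backup"
            ;;
        '--help' | 'usage')
            echo "usage text"
            ;;
        *)
            echo "normal schedule"
            ;;
    esac
}
dispatch full        # prints: full backup
dispatch             # no argument falls through to *: prints normal schedule
dispatch --help      # prints: usage text</pre></div></div>

<p>The <code>*</code> pattern matches anything, including an empty argument, which is why running the script from cron with no arguments ends up in normal mode.</p>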
<h3>Setting up a cronjob</h3>
<p><code>cron</code> is a simple utility that exists on almost all UNIX or UNIX-like systems. Its daemon wakes up every minute to check whether any user has a &#8220;cronjob&#8221; that is due; if so, it runs it.</p>
<p>User level cronjobs are maintained by a program called <code>crontab</code>. To view your current crontab, type <code>crontab -l</code>; to edit it, use <code>crontab -e</code>.</p>
<p>I personally like to keep all my user level cronjobs in one place, <code>$HOME/.cronjobs/</code>. This folder contains two files: <code>crontab</code> and <code>backup.sh</code>. <code>crontab</code> is a text file which holds the cronjobs I&#8217;d like to have run, while <code>backup.sh</code> is the file described above.</p>
<p>My crontab file looks something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># User level crontab</span>
<span style="color: #666666; font-style: italic;"># min hr mday month wday command</span>
00    <span style="color: #000000;">13</span>  <span style="color: #000000; font-weight: bold;">*</span>    <span style="color: #000000; font-weight: bold;">*</span>     <span style="color: #000000; font-weight: bold;">*</span>    <span style="color: #000000; font-weight: bold;">/</span>path<span style="color: #000000; font-weight: bold;">/</span>to<span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>directory<span style="color: #000000; font-weight: bold;">/</span>.cronjobs<span style="color: #000000; font-weight: bold;">/</span>backup.sh</pre></div></div>

<p>This means I back up my files every day at precisely 1:00pm by running the file located at <code>/path/to/home/directory/.cronjobs/backup.sh</code>. The file can then be installed as my crontab using the following command. Be aware that this replaces whatever crontab was installed before.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">crontab <span style="color: #000000; font-weight: bold;">&lt;</span> crontab</pre></div></div>
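<p>If the five schedule columns are hard to keep straight, this little sketch labels them for a given line. The function <code>explain_cron_line</code> is something I made up for illustration; it only splits the text and never touches your real crontab.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical helper: label the five schedule fields of one crontab line.
explain_cron_line() {
    set -f          # stop the shell from expanding the * fields as globs
    set -- $1       # split the line on whitespace into $1, $2, ...
    set +f
    echo "minute=$1 hour=$2 mday=$3 month=$4 wday=$5"
    shift 5         # drop the schedule fields, leaving only the command
    echo "command=$*"
}
explain_cron_line '00 13 * * * /path/to/home/directory/.cronjobs/backup.sh'</pre></div></div>

<p>For the line from my crontab this prints <code>minute=00 hour=13 mday=* month=* wday=*</code> followed by the command, where each <code>*</code> means &#8220;every&#8221;.</p>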

<h3>Recovering the data</h3>
<p>If you ever need to restore your backed up data all you need to do is find the most recent full backup (we&#8217;ll say <code>full-20071001.tar.gz</code>) and all the incremental backups since then (in our example <code>incr-2007100[2-5].tar.gz</code>). You&#8217;ll start by extracting the full backup to the correct folder via <code>tar xzf full-20071001.tar.gz</code>, followed by the incremental backups oldest to newest. Effectively you are restoring your entire home folder from n-days ago and applying the differences from each succeeding day. The commands should go as below (where <code>$</code> is the shell prompt).</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">$ <span style="color: #c20cb9; font-weight: bold;">tar</span> xzf full-20071001.tar.gz
$ <span style="color: #c20cb9; font-weight: bold;">tar</span> xzf incr-20071002.tar.gz
$ <span style="color: #c20cb9; font-weight: bold;">tar</span> xzf incr-20071003.tar.gz
$ <span style="color: #c20cb9; font-weight: bold;">tar</span> xzf incr-20071004.tar.gz
$ <span style="color: #c20cb9; font-weight: bold;">tar</span> xzf incr-20071005.tar.gz</pre></div></div>

<p>After which you should have your home directory restored exactly as it appeared at 1:00pm on October 5, 2007.</p>
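<p>Typing one command per day gets old quickly, so here is a rough sketch of a loop that does the same thing. It is not part of backup.sh: <code>restore_all</code> is a hypothetical helper, it assumes GNU <code>tar</code> and a directory that holds exactly one full backup plus its incrementals, and it passes <code>--listed-incremental=/dev/null</code>, which the GNU tar manual recommends when extracting incremental archives.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"># Hypothetical helper: $1 = directory with the archives, $2 = restore target.
# Assumes GNU tar and exactly one full-*.tar.gz in $1.
restore_all() {
    # The glob sorts by name, so the full backup comes first and the dated
    # incrementals follow from oldest to newest
    for archive in "$1"/full-*.tar.gz "$1"/incr-*.tar.gz; do
        if [ -f "${archive}" ]; then
            echo "extracting $(basename "${archive}")"
            tar xzf "${archive}" --listed-incremental=/dev/null -C "$2"
        fi
    done
}
# Example call (both paths are placeholders):
# restore_all /path/to/snapshots /path/to/restore/target</pre></div></div>

<p>The order matters for the same reason as in the manual commands above: each incremental archive only contains what changed since the previous one.</p>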
<p>Attachment: <a href="http://www.beyond-syntax.com/uploads/2009/02/backup.pdf">Automatic backups using <code>cron</code> and <code>tar</code></a> (PDF)<br />
Attachment: <a href="http://dev.beyond-syntax.com/scripts/backup.sh">backup.sh</a> (Shell script)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.beyond-syntax.com/2007/10/automatic-backups-using-cron-and-tar/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
