Musings on Christianity, Politics, and Computer Science Geekery

Quick Bar-Chart of disk usage

Today I went looking for a command I had used a long time ago, and ran into a much more interesting one instead.  At the time, I must have needed to find out which files were the biggest disk hogs and whether there was a long tail (i.e. how many of the 3.7M files in this directory--not my fault, by the way--were inconsequential).  That brings us to this wonderful "one-line" command:

find /dir/ -name "*.xml" -exec du -s {} \; | perl -ne 'if (/^(\d+)\s+(.*)/) { $h{$2} = $1; if ($max < $1) { $max = $1; } if (length($2) > $maxfname) { $maxfname = length($2); } } END { map { $barlen = ($h{$_} / $max) * 50; $bar = "*" x $barlen; printf ("%" . $maxfname . "s" . "(%5d): %s", $_, $h{$_}, $bar); print "\n"; } sort { $h{$b} <=> $h{$a} } keys %h }' 2> /dev/null > report.txt

Specifically, it finds every XML file under /dir/ and runs the Linux du command on each one to get its size on disk.  That list of sizes and filenames is piped to a hacky Perl script that records each size, tracks the largest one, and then prints the files sorted from largest to smallest, each with a horizontal histogram bar scaled to the largest file (at most 50 *s wide).  Lastly, the output is saved to report.txt.

That's quite a quick and dirty trick, but it produces nice command-line output like this:

/dir/w6bz9whg.xml(36560): **************************************************
/dir/w6km312r.xml(31772): *******************************************
/dir/w68d03gz.xml(27728): *************************************
/dir/w6vt5fhv.xml(27076): *************************************
/dir/w6m07v80.xml(17420): ***********************
/dir/w68m0zj8.xml(15276): ********************
/dir/w6mq7qpz.xml(15052): ********************
/dir/w6vq30tq.xml(13808): ******************
/dir/w6tb51hr.xml(13160): *****************
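
For anyone who finds the Perl hard to read, here is a rough Python sketch of the same idea.  It is not a drop-in replacement: the function names (collect_sizes, bar_chart) are made up for illustration, and it assumes the file's apparent size in KiB is close enough to what du reports (du measures blocks actually used on disk, so the numbers can differ).

```python
from pathlib import Path


def collect_sizes(root, pattern="*.xml"):
    """Map each matching file under root to its size in KiB (rounded up).

    Assumption: apparent size (st_size) stands in for du's disk usage.
    """
    sizes = {}
    for path in Path(root).rglob(pattern):
        if path.is_file():
            sizes[str(path)] = -(-path.stat().st_size // 1024)  # ceiling division
    return sizes


def bar_chart(sizes, width=50):
    """Return report lines sorted from largest to smallest.

    Each bar is scaled so the largest entry gets `width` asterisks;
    fractional bar lengths are truncated, as in the Perl version.
    """
    if not sizes:
        return []
    largest = max(sizes.values())
    name_width = max(len(name) for name in sizes)
    lines = []
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        bar = "*" * int(size / largest * width) if largest else ""
        # Right-align the filename, like printf's %<N>s in the one-liner.
        lines.append(f"{name:>{name_width}}({size:5d}): {bar}")
    return lines
```

Something like print("\n".join(bar_chart(collect_sizes("/dir/")))) would then print the report, and redirecting that to report.txt matches the last step of the one-liner.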


Choosing a Laptop

I've had many people over the years ask me what to look for when purchasing a good laptop.  That has changed over the years as we have seen a shift into multi-core computation and reliance on SSD technology.  So, here is a current run-down of buying tips (in order):

  1. Buy just inside your budget, but do spend as much as you can for it, since that will probably make it last as long as it can for you.  Buying the cheapest means you'll likely need to replace it earlier.
  2. Memory:  If you are comparing multiple computers, at this point go with the one with the most RAM.  4GB is standard now for mid-level laptops, 8GB is even better.  2GB is doable for chromebooks and it works for a netbook, but it's not good for doing anything "big" on the computer.  (For any real work I use my desktop or laptop with 8GB of RAM each).
  3. Cores: The next thing to look at is the number of cores in the processor.  It goes hand in hand with the RAM, in that you want as many as possible.  If I had to choose, I'd take more RAM over more cores.  This information is likely in the fine print of the computer details, where it will say "x-core processor." I'm not too worried about the brand (Intel or AMD) at this point.  Typical low-end laptops have 2 cores.  For longevity, I'd go for at least 4 cores if possible.
  4. Type and speed of processor:  This is mostly secondary to the number of cores.  Intel has the reputation of being the best, followed by AMD.  However, AMD's processors are cheaper to buy, giving you more options in the lower-cost machines that could potentially be "faster" than their Intel counterparts.  (For desktops, I buy AMD to get more cores and speed for the price).  That is to say, after RAM and number of cores, I'd pick the Intel i7 or i5 line over the AMD chips (A-series processors), but I'd pick AMD's A10 or A8 over the Intel i3, Pentium, Celeron, or Atom models.  At this point, don't buy a laptop with an ARM processor (that day will come soon).  Beyond that, get the fastest processor of the best line you can (higher GHz).  Since the multi-core revolution, I'd say the number of cores wins over the speed of each core, since it allows the machine to do more at once, even if slowly.
  5. Hard drive: SSDs are faster. Period.  However, they're expensive and therefore smaller in size.  My netbook has a 32GB SSD, and it's been full for months, so I can't do much with it.  If you want to store music, documents, and a lot more, get one with a rotational HDD.  It may be a little slower, but it can store a lot more.  Plus, down the road you can replace the HDD with an SSD fairly cheaply (about a $100 upgrade).
  6. Brand: Last but not least, get a brand you know.  Dell and Lenovo seem to be the go-to PC brands, and they've been around and solid for a while.  Asus and Acer are also great brands.  I'd personally stay away from HP for now, and from obscure off-brands.

Hardness and Political Choices

Right image: 2012 Election Results (1), Left image: Hardest Places to Live in the US (2).


A few weeks ago, the New York Times posted a great article on the hardest places to live in the United States, based on education, median income, unemployment rate, disability rate, and a few other factors.  It is an incredible article, and I recommend reading it.  As soon as I saw their graphic, I wondered if there was a connection between political persuasion and hardness.  To look at this, I grabbed Mark Newman's version of the 2012 election results and a nice image comparator so that the two maps can be compared on top of each other.  The results are interesting!

For clarification: On the election results (left), each district is colored on a gradient from blue to red based on percentage of the vote for the winning candidate (purple would mean an even split Obama/Romney).  For the NYTimes hardness results (right), districts are colored on a gradient from orange to green, where orange is worse (harder to live) and green is better.

What to make of it?  I don't know.  In the north-east (areas including New England, Kentucky, Michigan, Illinois, and parts of Virginia), it appears that the more liberal areas are usually the easier places to live, and the harder places to live are usually more conservative.  However, in the mid-west (the entire middle of the country west to California), it appears to be just the opposite.  In any case, it's interesting to think about!


1.  Newman, Mark. "Maps of the 2012 US Presidential Election Results." N.p., 8 Nov. 2012. Web. 14 Oct. 2014.

2.  Flippen, Alan. "Where Are the Hardest Places to Live in the U.S.?" The New York Times. The New York Times, 25 June 2014. Web. 14 Oct. 2014.

VI Tricks

I may be stuck in the past, or like punishment, but my editor of choice is still VIM.  However, certain tricks seem to be hard to find on Google searches, so I'm going to compile them here:

  • Creating custom commands and keyboard mappings is easy in VIM.  To create a custom command, define it in the .vimrc file.  The % character inserts the current buffer's filename into the shell command.
    command CommandName execute "!shellcommand %"
    This command can be run in VIM using the standard :CommandName convention. To map this new command to a keyboard shortcut, use the map command in the .vimrc file.
    map <F5> :CommandName<CR>

Command Line Tricks

I'm always using command line shortcuts for various tasks, and I often have to look the tricks up again whenever I need to do something remotely fancy.  Here are some of my most-used helpful hints:

  • To remove the leading spaces and tabs from each line of text on standard input (so feed it the input through a pipe), this sed command works well (GNU sed understands \t inside the brackets; other sed implementations may need a literal tab):
    sed -e 's/^[ \t]*//'
  • Reformatting XML/HTML files so that line returns inside tags are removed:
    xmllint --format --noblanks infile.xml > outfile.xml
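
On a machine where sed doesn't understand \t, the first trick above is easy to approximate in Python.  This is only a minimal sketch, and strip_leading_whitespace is a name made up for this example.

```python
import sys


def strip_leading_whitespace(lines):
    """Yield each line with its leading spaces and tabs removed.

    Only spaces and tabs are stripped, so an otherwise blank line
    keeps its trailing newline.
    """
    for line in lines:
        yield line.lstrip(" \t")


# To use it as a filter in a pipe, write the result back out:
#   sys.stdout.writelines(strip_leading_whitespace(sys.stdin))
```

Since it takes any iterable of lines, the same function also works on a list of strings or an open file.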

© 2017 Mininook
