{"id":5352,"date":"2014-11-18T18:37:58","date_gmt":"2014-11-18T08:37:58","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=5352"},"modified":"2018-12-11T13:38:45","modified_gmt":"2018-12-11T03:38:45","slug":"not-so-squeezy","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2014\/11\/18\/not-so-squeezy\/","title":{"rendered":"Not so squeezy"},"content":{"rendered":"<p>It&#8217;s funny, the little tools you build up over the years as someone heavily involved in backup, particularly when it comes to testing.<\/p>\n<p>I have two tools that help me with filesystem and performance testing &#8211; one I call&nbsp;<em>generate-filesystem<\/em>, and one called <em>genbf<\/em> (generate big file).<\/p>\n<p>The&nbsp;<em>genbf<\/em> tool came about when I wanted files that were&nbsp;highly resistant to being compressed \u2013 and indeed, to subsequently being&nbsp;<em>deduplicated<\/em> as well. Sure, bigasm can produce good results, but it isn&#8217;t guaranteed to produce highly random data. That&#8217;s where genbf comes in. Best of all, it&#8217;s fast. For example, a 1GB file on my 12-core lab server gets created in under 10 seconds:<\/p>\n<pre>[pmdg@orilla test]$ <strong>date; genbf.pl -s 1024 -f test.dat; date<\/strong>\nTue Nov 18 19:08:24 AEDT 2014\nProgress:\n     Pre-generating random data chunk. 
(This may take a while.)\n     0% of random data chunk generated.\n     10% of random data chunk generated.\n     20% of random data chunk generated.\n     30% of random data chunk generated.\n     40% of random data chunk generated.\n     50% of random data chunk generated.\n     60% of random data chunk generated.\n     70% of random data chunk generated.\n     80% of random data chunk generated.\n     90% of random data chunk generated.\n Creating 1024 MB file test.dat\nWrote data file in 5121 chunks.\nTue Nov 18 19:08:33 AEDT 2014<\/pre>\n<p>OK, OK, a 1GB file can be created quickly if you&#8217;re just pulling in from \/dev\/zero, but here&#8217;s the&nbsp;file size before and after compression (a reduction of under 1%):<\/p>\n<pre>[pmdg@orilla test]$ <strong>ls -al test.dat<\/strong> \n-rw-rw-r-- 1 pmdg pmdg 1073741824 Nov 18 19:08 test.dat\n[pmdg@orilla test]$ <strong>pbzip2 -r test.dat<\/strong>\n[pmdg@orilla test]$ <strong>ls -al test.dat.bz2<\/strong> \n-rw-rw-r-- 1 pmdg pmdg 1065615793 Nov 18 19:08 test.dat.bz2<\/pre>\n<p>(If you haven&#8217;t heard of&nbsp;<em><a title=\"pbzip2 homepage\" href=\"http:\/\/compression.ca\/pbzip2\/\" target=\"_blank\">pbzip2<\/a><\/em>, enlighten yourself and support the author. It&#8217;s brilliant.)<\/p>\n<p>When it comes to subsequently&nbsp;sending the generated data to Data Domain, the deduplication is extremely low &#8211; 20 x 1GB files using the standard setting above, for instance, yield almost a full additional 20GB of occupied space.<\/p>\n<p>If you want to try it out, you can <a title=\"genbf\" href=\"https:\/\/nsrd.info\/utils\/genbf.zip\" target=\"_blank\">download it from here<\/a>. 
(You&#8217;ll need Perl on your system.)&nbsp;Standard usage is below:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2014\/11\/genbf.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-5355\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2014\/11\/genbf-1024x580.png\" alt=\"genbf usage\" width=\"695\" height=\"393\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2014\/11\/genbf-1024x580.png 1024w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2014\/11\/genbf-300x170.png 300w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2014\/11\/genbf-900x510.png 900w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2014\/11\/genbf.png 1298w\" sizes=\"auto, (max-width: 695px) 100vw, 695px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s funny, the little tools you build up over the years as someone heavily involved in backup, particularly when it&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[20,25],"tags":[242,301,1198],"class_list":["post-5352","post","type-post","status-publish","format-standard","hentry","category-scripting","category-tidbits","tag-compression","tag-deduplication","tag-random-file"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.m
e\/pKpIN-1ok","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/5352","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=5352"}],"version-history":[{"count":4,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/5352\/revisions"}],"predecessor-version":[{"id":7447,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/5352\/revisions\/7447"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=5352"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=5352"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=5352"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}