[Tfug] NAS again

Bexley Hall bexley401 at yahoo.com
Tue Feb 4 01:05:49 MST 2014


Hi Zack,

On 2/3/2014 3:12 PM, Zack Williams wrote:
> On Mon, Feb 3, 2014 at 12:19 PM, Bexley Hall<bexley401 at yahoo.com>  wrote:
>> This is a different usage model than, for example, a traditional
>> file server that you treat as *secondary* storage -- "on-line".
>
> Not hardware, but you might consider using git-annex to manage this
> sort of thing:
>
> http://git-annex.branchable.com
>
> It's designed for large binary files that aren't on local storage or
> even powered on all the time.

Currently, I have some cheap little scripts that give me what
amounts to "locate(1) on steroids".  E.g., I sort out where every
*file* (on the filesystems of interest) resides.  Then, sequentially
"mount" each file that is a composite object (i.e., an ISO, TAR,
ZIP, ARC, ARJ, RAR, etc. archive) and enumerate its contents
"under" that file.
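
FWIW, the gist of that pass is only a few lines of Python.  This sketch
only handles ZIP and TAR natively (ISO, ARC, ARJ, RAR, etc. would need
external tools); the function names are just illustrative, not my actual
scripts:

```python
import os
import tarfile
import zipfile

# Extensions treated as "composite objects".  Only ZIP and TAR are
# handled natively here; other formats would need external helpers.
ARCHIVE_EXTS = (".zip", ".tar", ".tar.gz", ".tgz")

def enumerate_members(path):
    """List an archive's contents "under" the archive's own path."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return [path + "/" + name for name in zf.namelist()]
    if tarfile.is_tarfile(path):
        with tarfile.open(path) as tf:
            return [path + "/" + m.name for m in tf.getmembers()]
    return []

def build_index(root):
    """Walk one mounted filesystem, indexing every file; archives
    additionally contribute their members, nested under the file."""
    index = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            full = os.path.join(dirpath, fname)
            index.append(full)
            if fname.lower().endswith(ARCHIVE_EXTS):
                index.extend(enumerate_members(full))
    return index
```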

Only the file systems that are actually mounted are affected in the
resulting "index".  So, if I haven't mounted /ResearchPapers, then none
of the index entries referencing files under it are touched.  OTOH, if
a file that was previously "found" under /OldProjects is no longer
under /OldProjects WHEN /OldProjects IS MOUNTED, then the file is
elided from the index.

I.e., I assume nothing happens to the archives when they are "not
mounted" (more specifically, nothing SIGNIFICANT happens to them).
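
That refresh rule reduces to a simple merge -- keep entries under
unmounted roots untouched, replace everything under mounted roots with
a fresh scan.  A rough sketch (the function names are hypothetical):

```python
def refresh_index(old_index, mounted_roots, scan):
    """Merge a fresh scan of the mounted filesystems into the old index.

    Entries under unmounted roots are assumed unchanged; entries under
    a mounted root that no longer show up in the scan are elided.
    'scan' is a callable returning the current entries for one root.
    """
    def under_mounted(entry):
        return any(entry == r or entry.startswith(r + "/")
                   for r in mounted_roots)

    # Keep everything we can't see right now...
    new_index = [e for e in old_index if not under_mounted(e)]
    # ...and rebuild the parts we can.
    for root in mounted_roots:
        new_index.extend(scan(root))
    return new_index
```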

The index gets very large very quickly.  E.g., there are some 56,000
"files" on my technical archive (I happened to have that spinning,
currently).  The commercial software archive has *bigger* files
but probably holds far more of them than I would care to guess (recall all
"updates", release notes, etc. also get tucked on there with each
"application" -- how many copies of Firefox, Thunderbird, Adobe Reader,
etc.?)

I had thought about piping all this into a true database so I could
do more complex queries (potentially including FTS features) but
decided I didn't want to make a career out of tracking stuff!  It
works well enough -- as long as my (organic) memory doesn't fade!  :>
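
For the record, the "true database" route wouldn't take much -- SQLite's
FTS5 extension (assuming a build that includes it, which is typical)
gives full-text queries over the index in a handful of lines.  Names
here are made up for illustration:

```python
import sqlite3

def make_db(entries):
    """Load index entries into an in-memory SQLite FTS5 table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE files USING fts5(path)")
    db.executemany("INSERT INTO files VALUES (?)",
                   [(e,) for e in entries])
    return db

def query(db, terms):
    """Full-text search over the indexed paths (FTS5 MATCH syntax)."""
    return [row[0] for row in
            db.execute("SELECT path FROM files WHERE files MATCH ?",
                       (terms,))]
```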

> As for hardware, it sounds like you'd be better off with a drive
> attached to an external enclosure that's hooked to one of your
> always-on machines, and scripts controlling relays that unmount then
> turn power off to the drives when not in use.

The machines that are "always on" are truly underpowered.  They do
trivial things like local name resolution, time service, etc.
I suspect they would be hard pressed to pull bytes off an external
drive at even 100Mbps (network) speeds.  And, I'm not sure it is
possible to power down/up an external USB drive without an insertion
event (on both sides of the interface).

The usage pattern I've established works well.  The problem is all
the boxes are different.  Different user names, passwords, configuration
parameters, implementations, bugs, etc.  I'd just like to find a
"small box" that I could use as the basis for multiple *identical*
storage systems.

I'll start watching for such things at the UofA auctions. Often, they
dump "many" of the same machine in a "lot".  Find something smallish
and I should be all set!

Ages ago, World Care had a boatload of these:
    <http://www.snotmonkey.com/work/ezgo/>   (first picture)
I suspect I could probably find a small 2-4 drive enclosure (with
power supply) and tuck the guts of one of these in the belly.  But,
I think they are all gone :<   (they also have bad caps and recapping
is a PITA on these boards -- no thermal reliefs!)

Ooops!  There's my timer -- biscotti have to come out of the oven!

--don



