[Tfug] Version Control

Bexley Hall bexley401 at yahoo.com
Sat Mar 30 09:45:53 MST 2013


Hi Yan,

>> Thanks!  So, are you nominating this/these as the "official"
>> candidate(s) that I should use to represent git in my comparison?
>
> Well, I am not involved in the git project or anything in any way, so I
> can't really speak for them, but gitk is what I was pointed to when I
> needed to make sense of something through a GUI, so that's what I'd point
> someone to if they were in that situation as well :-). At the very least,
> they are the GUI tools that are maintained by the GIT project itself. There
> are plenty of capable, regularly-updated third-party GUIs, but as you've
> pointed out, there is less guarantee that such third-party GUIs will
> continue being developed.

OK, I'll start watching the lists associated with SVN & GIT to see
what is considered "-STABLE".

> As for the automatically figuring out what diff tool to use thing, git can
> definitely use file extensions (as shown here:
> http://lars-tesmer.com/blog/2010/09/20/git---how-to-get-better-diffs-for-images/)
> for automatic file type identification. As you've said, you'd prefer a
> fully automated approach (so you can check in jpegs as "blah.ps" and still
> have git realize they're jpegs).

File extensions are just stupid.  I wasn't named Don.human.male.
I should be able to pick a name that fits whatever criteria *I*
select instead of some arbitrary criteria that someone else has
selected with NO CONSISTENCY!

On *my* system(s) -- i.e., I can't speak to what *other* uses
these may see:
- .m is used by Mathematica, Maple and MatLab for various types
   of "Math" files that are probably incompatible with each other
   (I've never tested this)
   Plus BRIEF macro sources
   And Limbo module interface declarations
- .nb used by Corel and Mathematica to declare (incompatible)
   "NoteBooks"
- .dat is used by everyone for all sorts of different "DATa"
- .lib is used by all sorts of different "LIBraries" of
   completely incompatible "items" (avoiding the term "objects")
- .dwg used for incompatible "DraWinGs"
- .doc for incompatible "DOCuments" of various types
etc.  And, that doesn't count files that *don't* have extensions!

So, the "extension" is meaningless:  this file *might* be a <foo>
object... or, a <bar> object... or a <baz> object... or, none of
the above!

I think the old Macs had the right idea (CreatorID), though it was
probably not an effective system for handling compatibility among
different apps, etc.  But storing metadata outside of the name just
makes more sense, IMO.

[I am now exploring using a RDBMS in place of a filestore so that
all objects *can* carry their own metadata and be organized per
the needs of the individual apps that create and *consume* them.]

> I've never really had the need for that,
> and don't know exactly how one would go about accomplishing such a thing,
> but there is at least part of a chapter of the git book dedicated to
> diffing in general (http://git-scm.com/book/ch7-2.html), so you could read
> up on it if it's critically important. A quick search yielded that the
> initial setup will likely contain a decent amount of manual work. One

Yes.  I think you need the ability to "tag" each object with a
"type".  I.e., each object is tagged as "untyped" until you
assign a type to it.  This then allows handlers to be attached
for that particular type (which the VCS can then invoke as required).

So, for a lazy user, everything is a BLOB.  For a user very
familiar with the items being imported to the repository, you
could tolerate file extensions as indicators **ON IMPORT**
and use a small table/relation to map extensions to types:
	.c	C source code
	.m	Limbo module interface
	.b	Limbo implementation file
	.dwg	AutoCAD drawing
Or, a script that applies file(1) to each object and looks up the
results of file(1)'s analysis in a table/relation that maps those
into formal "object types".

Then, query the import set to see which objects still are "untyped"
for you to manually determine their types.
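
A minimal sketch of that import step, in Python.  The table contents,
the MIME-to-type mapping and every function name here are assumptions
made up for illustration; the only external tool relied on is file(1)
with its --brief and --mime-type options.  Extensions are consulted
first (and only if you choose to trust them), then file(1), and
whatever is left stays "untyped":

```python
import os
import subprocess

# Hypothetical extension -> type table, using the examples above.
EXTENSION_TYPES = {
    ".c": "C source code",
    ".m": "Limbo module interface",
    ".b": "Limbo implementation file",
    ".dwg": "AutoCAD drawing",
}

# Hypothetical mapping from file(1) MIME output to formal object types.
MIME_TYPES = {
    "image/jpeg": "JPEG image",
    "application/postscript": "PostScript document",
}

UNTYPED = "untyped"


def sniff_mime(path):
    """Ask file(1) for a MIME type; return None if it is missing or fails."""
    try:
        result = subprocess.run(
            ["file", "--brief", "--mime-type", path],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def classify(path, sniff=sniff_mime, trust_extensions=True):
    """Return an object type for path, or UNTYPED if nothing matches."""
    if trust_extensions:
        ext = os.path.splitext(path)[1].lower()
        if ext in EXTENSION_TYPES:
            return EXTENSION_TYPES[ext]
    return MIME_TYPES.get(sniff(path), UNTYPED)


def import_set(paths, **kwargs):
    """Tag every path in the import set with a type."""
    return {p: classify(p, **kwargs) for p in paths}


def still_untyped(typed):
    """Query the import set for objects that need a manual decision."""
    return sorted(p for p, t in typed.items() if t == UNTYPED)
```

Passing trust_extensions=False gives the "fully automated" behaviour,
where a JPEG checked in as "blah.ps" is typed by content alone.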

The first file hierarchy that I "inherited" suffered from a "file
type" (extension) that was unknown to the VCS that it silently
gagged upon...

> solution would be to set up a default differ (as in here:
> http://jeetworks.org/node/90) that would determine the filetypes involved
> using "file" or something and then call the appropriate application.
> This'll achieve the behavior you want, but now you're stuck maintaining
> your diffing script (which may or may not be a headache).

Yes.  That's the situation I am in currently.  What I have "works"
(for me) but means *I* have to maintain it (and it would impose itself
on anyone else who wanted to use my repository once I release
everything).
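
For reference, a rough sketch of what such a "default differ" could
look like for git, in Python.  This is not my actual script; the
handler table and function names are assumptions for illustration.
It relies on git's external-diff convention of calling the program
with seven arguments (path old-file old-hex old-mode new-file new-hex
new-mode) and on file(1) to guess the type:

```python
import subprocess
import sys

# Hypothetical handler table: MIME-type prefix -> comparison command.
# Type-specific tools would be registered here.
HANDLERS = [
    ("text/", ["diff", "-u"]),
]
FALLBACK = ["cmp"]  # byte-wise comparison for anything unrecognised


def sniff_mime(path):
    """Ask file(1) for a MIME type; return "" if it is missing or fails."""
    try:
        out = subprocess.run(
            ["file", "--brief", "--mime-type", path],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return ""
    return out.stdout.strip()


def pick_command(mime):
    """Choose the comparison command for a MIME type."""
    for prefix, command in HANDLERS:
        if mime.startswith(prefix):
            return command
    return FALLBACK


def main(argv):
    # argv: script path old-file old-hex old-mode new-file new-hex new-mode
    _, _path, old_file, _, _, new_file, _, _ = argv
    command = pick_command(sniff_mime(new_file))
    subprocess.run(command + [old_file, new_file])
    # diff(1) and cmp(1) exit non-zero when files differ; git treats a
    # non-zero exit from the external differ as a failure, so report success.
    return 0


if __name__ == "__main__" and len(sys.argv) == 8:
    sys.exit(main(sys.argv))
```

It would be hooked in through the GIT_EXTERNAL_DIFF environment
variable or the diff.external configuration setting.  The maintenance
burden is exactly the HANDLERS table: every new object type means
another entry that somebody has to keep working.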

> At some point, you've mentioned ad-hoc text files littering the hierarchy.
> Git's a bit better than CVS and SVN in this sense, as it at least keeps all
> of its stuff in the project's root directory (mostly in .git/, but custom
> configuration such as submodules, per-project diff configs, and so forth
> spill out into other dotfiles in this directory). This, however, has the
> notable downside of being unable to check out a subdirectory of a
> repository (like you can in SVN), so you should be aware of that.

Can these files be R/O and still use the repository?  Or, are they
used to track the dynamic state of the repository?  I.e., could
you copy the hierarchy onto a read-only medium and still *use* it?
(with the ASSURANCE that the repository's integrity is physically
"uncompromisable"?)

> The stuff
> *inside* the .git directory is described in the git book (
> http://git-scm.com/book/en/Git-Internals) if you want to interface with it.
> I think the general consensus is that if you are doing that, you're doing
> something wrong.

So, how do "other (non-git) tools" determine the state of the repository
and the files that it governs?  Or, is *all* of that information exposed
via formal git(1) interfaces (the output of which is *stable* enough
that I could parse it without worrying that the next version of git will
alter those reports in some way that would break my scripts)?
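
One partial answer, offered as a sketch rather than a survey of git's
interfaces: "git status --porcelain" is documented as an easy-to-parse
format that stays stable across git versions and regardless of user
configuration.  The Python below parses that (version 1) format; the
function names are made up for illustration, and paths that git
quotes because of unusual characters are not unquoted here:

```python
import subprocess


def parse_porcelain(text):
    """Split `git status --porcelain` (v1) output into (status, path) pairs."""
    entries = []
    for line in text.splitlines():
        if len(line) < 4:
            continue  # blank or malformed line
        # Two status characters, one space, then the path.
        status, path = line[:2], line[3:]
        # Renames and copies are reported as "old -> new"; keep the new name.
        if status[0] in "RC" and " -> " in path:
            path = path.split(" -> ", 1)[1]
        entries.append((status, path))
    return entries


def repository_status(repo_dir="."):
    """Run git in repo_dir and return the parsed status entries."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    # Do not strip the output: a leading space is part of the status field.
    return parse_porcelain(result.stdout)
```

For example, the text " M src/main.c" followed by "?? notes.txt"
parses to a modified-in-worktree entry and an untracked entry.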

> An interesting (though probably not relevant) thing is that Perforce
> supports git clients.

I saw something called "git fusion" but haven't looked into it.
I also see they have a "sandbox" tool that appears to give some
form of support for remote repositories (?)

[Again, I haven't begun to explore the capabilities of any of these
VCS's in detail, yet.  I'm still in the "gathering opinions" stage]

> Finally, git is not perfect (examples:
> http://steveko.wordpress.com/2012/02/24/10-things-i-hate-about-git/,
> http://www.forouzani.com/disadvantages-of-git.html). I feel it's better
> than the other things I've used (significant point: I haven't used
> Mercurial to any great extent), and using a commercial application (like
> Perforce) for something so integral to your workflow seems really weird to
> me, but I guess some people do it.

If you assume the stipulation that versioning is a key requirement
for *all* electronic "documents"/objects, then you either have to
expect all of the tools that create and maintain those objects to
incorporate robust versioning IN A MUTUALLY COMPATIBLE MANNER
*or* have some "free-standing" tool that handles *any* of their
objects in a consistent manner.  (or, some mega-app that does
*all* of your object creation/maintenance/versioning activities
under one umbrella!  Ain't gonna happen...)

I can't see app vendors cooperating to ensure they all embrace a
compatible means of versioning in their products.  So, it seems
the only realistic solution is a separate tool that does this
"after-the-fact".  That can somehow be made aware of the needs
and characteristics of those different object types.

> Personally, I wouldn't trust Perforce to
> stay around (for example, Google has a stunning history of axing products
> after purchasing their parent company).

Would you trust Linus to not change his mind regarding the VCS
*he* wants to embrace?  When you've got an army of UNPAID droids
doing your bidding, it's pretty easy to rationalize your decisions.
OTOH, if he was CEO of Linus, Inc and was *paying* all those
developers, how keen would he be on porting their repository to
some other tool -- given that he'd have to justify to stockholders
the expense for doing same?

I've "tolerated" cvs for 25+ years, now.  And, I only have to justify
the cost of making the change to another VCS to *myself*.  OTOH, I
know that the cost of porting the repository will be 100% *mine*
(what other things could I do with that time/money??)

I suspect Perforce isn't going to go away anytime soon -- even if
the company disappears overnight.  People have too much tied up
in their repositories to just shrug it off.

FOSS projects don't typically have moral or legal requirements
to keep their repositories "usable".  They have no *liability*
for the users and uses of their products.

OTOH, Toyota might have to *prove* that the software controlling
their autos CAN'T POSSIBLY cause the vehicle to accelerate uncommanded.
In which case, they need a verifiable record of which version of the
software was running on which version of the hardware in which
particular vehicle claiming to have exhibited this behavior, etc.

Big incentive to keep your "object repository" accessible!



