Modeling Project Releng/SVN Support
Overview
Currently, all Modeling projects store their source code in CVS, and there are a number of tools that take advantage of CVS's logs, in particular:
Affected Tools
These tools query a MySQL database that contains the parsed CVS logs, so they never touch the CVS logs directly. Even so, the information that SVN provides is neither a strict superset nor a strict subset of what CVS provides, and mapping it onto the current database schema is infeasible.
The Good
Tracking updates in SVN is much easier than tracking updates in CVS. In particular:
- new "branches" and "tags" can be parsed out without needing to parse the entire log
- all changes in a particular commit are tied together
- files can be formally moved/renamed
Overall, a "Search SVN" tool should be easier to develop from scratch than Search CVS was.
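As a rough illustration of the points above, the sketch below parses the XML that `svn log --xml -v` emits. It keeps the changed paths of each commit together and picks out copies, which is how new tags and branches show up in the log. The `PathChange` and `Commit` classes, the function names, and the sample log are made up for this example and are not part of any existing tool.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PathChange:
    action: str                 # "A", "M", "D" or "R", as reported by svn
    path: str
    copied_from: Optional[str]  # source path when this entry is a copy
    copied_rev: Optional[int]   # source revision when this entry is a copy


@dataclass
class Commit:
    revision: int
    author: str
    message: str
    changes: List[PathChange]


def parse_svn_log(xml_text: str) -> List[Commit]:
    """Turn `svn log --xml -v` output into one Commit per revision."""
    commits = []
    for entry in ET.fromstring(xml_text).findall("logentry"):
        changes = []
        for p in entry.findall("paths/path"):
            rev = p.get("copyfrom-rev")
            changes.append(PathChange(
                action=p.get("action", ""),
                path=(p.text or "").strip(),
                copied_from=p.get("copyfrom-path"),
                copied_rev=int(rev) if rev is not None else None,
            ))
        commits.append(Commit(
            revision=int(entry.get("revision")),
            author=entry.findtext("author", default=""),
            message=entry.findtext("msg", default=""),
            changes=changes,
        ))
    return commits


def find_copies(commits: List[Commit]) -> List[Tuple[int, str, str]]:
    """Return (revision, source, destination) for every copied path."""
    return [(c.revision, ch.copied_from, ch.path)
            for c in commits for ch in c.changes
            if ch.copied_from is not None]


# Hypothetical log excerpt: r41 edits two files, r42 tags /trunk.
SAMPLE = """<log>
<logentry revision="41">
<author>alice</author>
<paths>
<path action="M">/trunk/src/Foo.java</path>
<path action="M">/trunk/src/Bar.java</path>
</paths>
<msg>Fix Foo and Bar together</msg>
</logentry>
<logentry revision="42">
<author>bob</author>
<paths>
<path action="A" copyfrom-path="/trunk" copyfrom-rev="41">/tags/R1_0</path>
</paths>
<msg>Tag release</msg>
</logentry>
</log>"""
```

With the sample above, `find_copies(parse_svn_log(SAMPLE))` returns `[(42, "/trunk", "/tags/R1_0")]`, and the two files changed in revision 41 stay grouped under a single `Commit`. Whether a given copy is a tag, a branch, or just a copy still depends on the repository's layout convention.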
The Bad
SVN is very flexible. In particular:
- there is no formal concept of tags
- there is no formal concept of branches
- SVN revisions are per-repository, not per-file
Tags and Branches
The majority of the complexity in parsing the CVS logs and doing queries on the resulting data comes from handling branches and tags properly. The fact that SVN's approach to both is completely different (and vastly more flexible) makes matters difficult.
SVN has the convention of laying out a repository with three top level directories, like so:
/trunk
/tags
/branches
Active development is done on /trunk. For a release, /trunk is copied to a new directory in /tags (copies are essentially free in SVN). When there is a desire to branch, /trunk is copied to a new directory in /branches, and work on that branch is done on the copy.
Why is this bad? It's bad (for a potential "Search SVN" tool) because none of this is enforced by SVN, and there's nothing special about those directories beyond the convention the committers follow. Given n committers, it's inevitable that one will end up with at least n+1 different commit conventions. Supporting an ever-growing number of commit conventions is a losing battle.
Revisions
The SVN approach to revisions is actually preferable when starting from scratch, but it's so completely different from the CVS approach that mapping one onto the other is a bad idea.
Supporting SVN
I would suggest a fresh start rather than extending the above tools to support SVN. The approach and the data presented would be fairly different, bolting SVN support onto the existing tools would be more work than starting afresh, and the end result would be less useful than two distinct tools.
Additionally, I think it's crucially important that all projects that want support should follow the same commit conventions, such that branches and tags can be found algorithmically rather than heuristically.
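To make "algorithmically rather than heuristically" concrete, here is a minimal sketch that assumes every project follows the /trunk, /tags, /branches layout described earlier, with those three directories at the top level of the repository. The `Location` type and the `classify` function are hypothetical names chosen for this example. Any path outside the agreed layout is reported as "other" instead of being guessed at.

```python
from typing import NamedTuple, Optional


class Location(NamedTuple):
    kind: str            # "trunk", "tag", "branch" or "other"
    name: Optional[str]  # tag or branch name; None for trunk and other
    rest: str            # path relative to that line of development


def classify(path: str) -> Location:
    """Classify a repository path under the assumed three-directory layout."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return Location("other", None, "")
    head, tail = parts[0], parts[1:]
    if head == "trunk":
        return Location("trunk", None, "/".join(tail))
    if head in ("tags", "branches") and tail:
        kind = "tag" if head == "tags" else "branch"
        return Location(kind, tail[0], "/".join(tail[1:]))
    # Anything else does not follow the convention; do not guess.
    return Location("other", None, "/".join(parts))
```

For example, `classify("/tags/R1_0/src/Foo.java")` yields `Location(kind="tag", name="R1_0", rest="src/Foo.java")`. A repository that nests the three directories under a per-project prefix would need a different rule, which is exactly why a single shared convention matters.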
However, regardless of the approach taken to support SVN, there will be a non-trivial amount of development work to do.