From a5eaf0439c92ab98f78e557e62dea956bed33904 Mon Sep 17 00:00:00 2001
From: Matthias Beyer
Date: Tue, 13 Feb 2018 22:49:31 +0100
Subject: [PATCH] Update documentation for new implementation

---
 doc/src/05100-lib-entryref.md | 70 +++++++++++++----------------------
 1 file changed, 25 insertions(+), 45 deletions(-)

diff --git a/doc/src/05100-lib-entryref.md b/doc/src/05100-lib-entryref.md
index 459735f0..37dd8764 100644
--- a/doc/src/05100-lib-entryref.md
+++ b/doc/src/05100-lib-entryref.md
@@ -3,27 +3,24 @@
 This library crate contains functionality to generate _references_ within the
 imag store.
 
-It can be used to create references to other files on the filesystem (reachable
-via a filesystem path). It differs from `libimagentrylink`/external linking as
+A reference is a "pointer" to a file or directory on the filesystem and outside
+the store.
+It differs from `libimagentrylink`/external linking as
 it is designed exclusively for filesystem references, not for URLs.
 
-A reference can have several properties, for example can a reference track the
-content of a filesystem path by hashing the content with a hashsum (SHA1) and
-one can check whether a file was changed by that.
-As files can get big (think of `debian.iso`) _partial hashing_ is supported
-(think of "hash the first 2048 bytes of a file).
-
-The library contains functionality to re-find a moved file automatically by
-checking the content hash which was stored before.
-
-Permission changes can be tracked as well.
+A reference is created with a unique identifier, like a hash. The implementation
+how this hash is calculated can be defined by the user of `libimagentryref`.
 
 So this library helps to resemble something like a _symlink_.
 
+### Usage
+
+Users have to implement the `UniqueRefPathGenerator` trait which should
+implement a hashing functionality for pathes.
+
 ### Limits
 
-Please understand that this is _not_ intended to be a version control system or
-something like that.
+This is _not_ intended to be a version control system or something like that.
 We also can not use _real symlinks_ as we need imag-store-objects to be able to
 link stuff.
 
@@ -31,39 +28,22 @@ link stuff.
 
 This library offers functionality to refer to content outside of the store.
 It can be used to refer to _nearly static stuff_ pretty easily - think of a
-Maildir - you add new mails by fetching them, but you mostly do not remove mails
-and if you do you end up with a "null pointer" in the store, which can then be
-handled properly.
-
-As this library supports custom hashes (you don't have to hash the full file,
-you can also parse the file and hash only _some_ content) this is pretty
-flexible.
-For example if you want to implement a imag module which tracks a certain kind
-of files which constantly change... but the first 5 lines do never change
-after the file is created - you can write a custom hasher that only uses the
-first 5 lines for the hash.
-
-### Internals
-
-Internally, in the store, the file gets created under
-`/ref/`.
-If the content of the file is hashed, we can still re-find the file via the
-content hash (which is stored in the header of the store entry).
-
-The reference object can, after the path was re-found, be updated.
+Maildir - you add new mails by fetching them, but you mostly do not remove
+mails.
+If mails get moved, they can be re-found via their hash, because Maildir objects
+hardly change. Or because the hash implementation which is used to refer to them
+hashes only the `Message-Id` and that does not change.
 
 ### Long-term TODO
 
-Things which have to be done here or are not yet properly tested:
+Not implemented yet:
 
-- [ ] Testing of different Hashers
-- [ ] Testing of re-finding of objects, including:
-  - [ ] Can a moved file automatically be found by content hash?
-  - [ ] Does a store-reference get updated automatically if it was moved,
-        including links (as in `libimaglink`)?
-  - [ ] If the content of a file changes, does the content hash get updated
-        automatically?
-
-("automatically" is a strechable term here, as these things have to be triggered
-by the user anyways)
+- [ ] Re-finding of files via their hash.
+      This must be implemented with several things in mind
+      * The user of the library should be able to provide a way how the
+        filesystem is searched. Basically a Functor which yields pathes to
+        check based on the original path of the missing file.
+        This enables implementations which do only search a certain subset
+        of pathes, or does depth-first-search rather than
+        breadth-first-search.
 
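
The re-finding TODO in the patched document describes a user-supplied functor that yields candidate paths derived from the original path of a missing file. The following is a minimal sketch of that idea, not the `libimagentryref` API: `ContentHasher`, `MessageIdHasher`, `FakeFs`, `refind` and `maildir_siblings` are hypothetical names, the filesystem is replaced by an in-memory map so the example runs without touching disk, and the standard library's `DefaultHasher` stands in for whatever hash a real implementation would use. It also assumes a moved mail keeps its file name, which real Maildir implementations do not guarantee.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a user-defined hashing strategy.
/// This is NOT the real `UniqueRefPathGenerator` trait.
pub trait ContentHasher {
    fn unique_hash(content: &[u8]) -> String;
}

/// Hashes only the `Message-Id` header line, so the hash stays stable
/// when the mail is moved or its other headers change.
/// Mails without such a header all hash the empty string (and collide).
pub struct MessageIdHasher;

impl ContentHasher for MessageIdHasher {
    fn unique_hash(content: &[u8]) -> String {
        let text = String::from_utf8_lossy(content);
        let id = text
            .lines()
            .find(|l| l.to_ascii_lowercase().starts_with("message-id:"))
            .unwrap_or("");
        let mut hasher = DefaultHasher::new();
        id.trim().hash(&mut hasher);
        format!("{:016x}", hasher.finish())
    }
}

/// In-memory stand-in for a filesystem: path -> file content (assumption).
pub type FakeFs = HashMap<PathBuf, Vec<u8>>;

/// Tries to re-find a file whose hash is `wanted`.
/// `search` is the caller-provided functor: given the original path, it
/// yields the candidate paths to check, in the order they should be tried.
pub fn refind<H, S, I>(fs: &FakeFs, original: &Path, wanted: &str, search: S) -> Option<PathBuf>
where
    H: ContentHasher,
    S: FnOnce(&Path) -> I,
    I: IntoIterator<Item = PathBuf>,
{
    search(original)
        .into_iter()
        .find(|p| fs.get(p).map_or(false, |c| H::unique_hash(c) == wanted))
}

/// Example search strategy: look for the same file name in the sibling
/// Maildir directories `new`, `cur` and `tmp`.
pub fn maildir_siblings(original: &Path) -> Vec<PathBuf> {
    let name = match original.file_name() {
        Some(n) => n,
        None => return Vec::new(),
    };
    let root = match original.parent().and_then(Path::parent) {
        Some(r) => r,
        None => return Vec::new(),
    };
    ["new", "cur", "tmp"]
        .iter()
        .map(|dir| root.join(dir).join(name))
        .collect()
}
```

Because the search strategy is just a value passed to `refind`, a caller could swap `maildir_siblings` for a depth-first or breadth-first walk over a chosen subtree without changing the matching logic, which is the flexibility the TODO item asks for.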