doc: Rewrite README for libimagentryref

Matthias Beyer 2018-10-23 15:26:57 +02:00
parent 820ac41443
commit a41479c0ec


@@ -3,52 +3,74 @@
This library crate contains functionality to generate _references_ within the
imag store.
A reference is a "pointer" to a file or directory on the filesystem and outside
the store.
It differs from `libimagentrylink`/external linking as
it is designed exclusively for filesystem references, not for URLs.
A reference is created with a unique identifier, like a hash. How this hash is
calculated can be defined by the user of `libimagentryref`.
So this library helps to resemble something like a _symlink_.
### Usage
Users have to implement the `UniqueRefPathGenerator` trait which should
implement a hashing functionality for paths.
### Problem
The problem this library solves is the following: A user wants to refer to a
file which exists on her filesystem from within imag.
But unfortunately, the user has several devices and the filesystem layout (the
way the $HOME is organized) is not the same on every device.
With this library, the user is able to refer to a file, but without specifying
the whole path.
Each device can have a different "base path"; files are re-found via their
hashes and file names, assuming that the files are equal on different devices or
have at least the same name.
### User Story / Usecase
Alice has a music library on her workstation and on her notebook. On her
workstation, the music collection is at `/home/alice/music`; on the notebook, it
exists in `/home/al/media/music`.
From within imag, Alice wants to create a link to a file
`$music_store/Psy_trance_2018_yearmix.mp3`.
`libimagentryref` helps her, because she can provide a "base path" in the
imag configuration file of each device and then link the file. imag only stores
data about the file and its relative path, but not its absolute path.
When moving the imag store from the workstation to the notebook, the base path
for the music collection is not `/home/alice/music` anymore, but
`/home/al/media/music`, and imag can find the file automatically.
### Solution, Details
libimagentryref stores the following data:
```toml
[ref]
filehash.sha1 = "<sha1 hash of the file>"
relpath = "/Psy_trance_2018_yearmix.mp3"
collection = "music"
```
The filehash is stored so that libimagentryref can re-find the file whenever it
was moved. The hash is stored under a `sha1` key so that hashes can be upgraded
to other hashing algorithms later.
`relpath` is the part of the path that, when joined with the "base" path from
the configuration, results in the full path of the file on the current machine.
The `collection` key names the entry in the imag config file that holds this
base path.
The configuration section for the collections looks like this:
```toml
[ref.basepathes]
music = "/home/alice/music"
documents = "/home/alice/doc"
```
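
How the stored header and the configured base paths combine into a full path can be sketched in Rust. This is only an illustration under assumptions, not the libimagentryref API: `RefData` and `resolve` are hypothetical names, and a plain `HashMap` stands in for the collection section of the imag configuration. Note that `PathBuf::join` discards the base when its argument is absolute, so the sketch strips a leading `/` from `relpath` first.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical in-memory form of the `[ref]` header section.
pub struct RefData {
    pub sha1: String,
    pub relpath: PathBuf,
    pub collection: String,
}

/// Join the base path configured for `r.collection` with `r.relpath`.
/// Returns `None` if no base path is configured for that collection.
pub fn resolve(basepaths: &HashMap<String, PathBuf>, r: &RefData) -> Option<PathBuf> {
    let base = basepaths.get(&r.collection)?;
    // Joining an absolute path would replace `base` entirely, so a leading
    // "/" (as in the example header above) is removed before joining.
    let rel = r
        .relpath
        .strip_prefix("/")
        .unwrap_or(r.relpath.as_path());
    Some(base.join(rel))
}
```

With `music` mapped to `/home/al/media/music`, resolving the example header above would give `/home/al/media/music/Psy_trance_2018_yearmix.mp3`; on the workstation the same header resolves below `/home/alice/music`.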
libimagentryref provides functionality to get the file.
libimagentryref also offers functionality to find files using _only_ their
filename or _only_ their filehash, and to correct the other value (filehash or
filename, respectively), either automatically or explicitly.
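
The filename-or-filehash lookup can be pictured as a comparison of the stored values against one candidate file. The following is a minimal sketch under assumptions: `Candidate`, `Match` and `compare` are hypothetical names and not part of libimagentryref, and the hash is treated as an opaque string.

```rust
/// Name and content hash of one file found on disk (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub name: String,
    pub hash: String,
}

/// Result of comparing a stored reference against one candidate.
#[derive(Debug, PartialEq)]
pub enum Match {
    /// Name and hash both agree; nothing to correct.
    Exact,
    /// Same name, different content: the stored hash can be corrected.
    HashOutdated { new_hash: String },
    /// Same content, different name: the stored name can be corrected.
    NameOutdated { new_name: String },
    /// Neither agrees, so the candidate cannot be identified as the file.
    NoMatch,
}

pub fn compare(stored_name: &str, stored_hash: &str, c: &Candidate) -> Match {
    match (c.name == stored_name, c.hash == stored_hash) {
        (true, true) => Match::Exact,
        (true, false) => Match::HashOutdated { new_hash: c.hash.clone() },
        (false, true) => Match::NameOutdated { new_name: c.name.clone() },
        (false, false) => Match::NoMatch,
    }
}
```

The `NoMatch` arm is the case where a file was both renamed and modified: neither stored value identifies it anymore.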
### Limits
As soon as the file is renamed _and_ modified, this fails.
This also does not cover the use case where the same file has different names on
different machines.
This is _not_ intended to be a version control system or something like that.
We also cannot use _real symlinks_ as we need imag-store-objects to be able to
link stuff.
### Usecase
This library offers functionality to refer to content outside of the store.
It can be used to refer to _nearly static stuff_ pretty easily - think of a
Maildir - you add new mails by fetching them, but you mostly do not remove
mails.
If mails get moved, they can be re-found via their hash, because Maildir objects
hardly change. Or because the hash implementation which is used to refer to them
hashes only the `Message-Id` and that does not change.
### Long-term TODO
Not implemented yet:
- [ ] Re-finding of files via their hash.
This must be implemented with several things in mind:
* The user of the library should be able to provide a way how the
filesystem is searched. Basically a functor which yields paths to
check based on the original path of the missing file.
This enables implementations which only search a certain subset
of paths, or do a depth-first search rather than a
breadth-first search.
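
One way to picture such a user-provided search strategy is as a producer of candidate paths that a generic re-find routine walks in order. This is a rough sketch under assumptions, since the feature is listed as not implemented: `ancestor_candidates` and `refind` are hypothetical names, and the match check (for example a hash comparison) is passed in as a closure.

```rust
use std::path::{Path, PathBuf};

/// Example strategy: look for the same file name in each parent directory
/// above the one the file was originally in, nearest first.
pub fn ancestor_candidates(original: &Path) -> Vec<PathBuf> {
    let name = match original.file_name() {
        Some(n) => n,
        None => return Vec::new(),
    };
    original
        .ancestors()
        .skip(2) // skip the file itself and its original directory
        .map(|dir| dir.join(name))
        .collect()
}

/// Walk the candidates in the order the strategy yields them and return the
/// first one accepted by `is_match`.
pub fn refind<I, F>(candidates: I, is_match: F) -> Option<PathBuf>
where
    I: IntoIterator<Item = PathBuf>,
    F: Fn(&Path) -> bool,
{
    candidates.into_iter().find(|p| is_match(p.as_path()))
}
```

Because `refind` only consumes an iterator, a depth-first or breadth-first directory walk could be plugged in the same way without changing the matching logic.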
### Known problems
The functionality this library provides fails to work when syncing the imag
store between two devices where the data layout is different on each device.