mod_storage_xmlarchive/README.markdown @ 2854:687b19cad4f5
mod_storage_xmlarchive/README: Add description of how data is stored
author | Kim Alvefur <zash@zash.se> |
---|---|
date | Thu, 28 Dec 2017 22:30:56 +0100 |
parent | 2818:88474dd1af48 |
child | 2855:7713cd4fff8f |
--- a/mod_storage_xmlarchive/README.markdown Fri Dec 08 21:14:10 2017 +0100
+++ b/mod_storage_xmlarchive/README.markdown Thu Dec 28 22:30:56 2017 +0100
@@ -63,3 +63,27 @@
 Where `$DIR` is `to` or `from`, `$STORE` is e.g. `archive` or `archive2`
 for MAM and `muc_log` for MUC logs. Finally, `$JID` is the JID of the user
 or MUC room to be migrated, which can be repeated.
+
+Data structure
+==============
+
+Data is split into three kinds of files and messages are grouped by day.
+Prosody's `util.datamanager` is used, so all special characters in these
+filenames are escaped and reside under `hostname/store` in Prosody's data
+directory, commonly `/var/lib/prosody`.
+
+`username.list`
+: A list of dates in `YYYY-MM-DD` format.
+
+`username@YYYY-MM-DD.list`
+: Index containing metadata for messages stored on that day.
+
+`username@YYYY-MM-DD.xml`
+: Messages in textual XML format, separated by newlines.
+
+This makes it fairly simple and fast to find messages by timestamp.
+Queries that are not time-based, but limited to a specific contact, may
+be expensive as potentially the entire archive will be read.
+
+Each archive ID is of the form `YYYY-MM-DD-random`, making lookups by
+archive ID just as simple as time-based queries.
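
To illustrate why a lookup by archive ID is as cheap as a time-based query, here is a minimal Python sketch (not the module's Lua code) that maps an ID of the form `YYYY-MM-DD-random` to the three files described above. The function names `split_archive_id` and `day_files` are hypothetical, and the sketch deliberately skips the escaping that `util.datamanager` applies, so the paths are the logical names rather than what is literally on disk.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Dict, Tuple


def split_archive_id(archive_id: str) -> Tuple[date, str]:
    """Split an ID of the form YYYY-MM-DD-random into (day, random part)."""
    # The date prefix is always 10 characters, followed by a hyphen.
    if len(archive_id) < 12 or archive_id[10] != "-":
        raise ValueError("expected YYYY-MM-DD-random, got %r" % archive_id)
    day = date.fromisoformat(archive_id[:10])  # raises ValueError on a bad date
    return day, archive_id[11:]


def day_files(data_dir: str, host: str, store: str,
              username: str, day: date) -> Dict[str, PurePosixPath]:
    """Return the logical (unescaped) file names for one user and one day.

    Assumption: escaping of special characters by util.datamanager is
    NOT reproduced here, so real on-disk names will differ.
    """
    base = PurePosixPath(data_dir) / host / store
    stamp = day.isoformat()
    return {
        "dates": base / f"{username}.list",              # list of days
        "index": base / f"{username}@{stamp}.list",      # per-day metadata
        "messages": base / f"{username}@{stamp}.xml",    # per-day messages
    }


day, suffix = split_archive_id("2017-12-28-abc123")
files = day_files("/var/lib/prosody", "example.com", "archive", "alice", day)
print(files["messages"])
```

Because the day is encoded in the ID itself, only one index file and one XML file need to be opened; a query filtered by contact has no such shortcut and may have to walk every day listed in `username.list`.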