Diffstat (limited to 'README.md')
-rw-r--r-- README.md 69
1 files changed, 69 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..98e3ac1
--- /dev/null
+++ b/README.md
@@ -0,0 +1,69 @@
+# Soup.io backup scripts
+
+## Usage
+
+This Soup.io backup solution consists of two scripts:
+
+## fetch-pages
+
+Crawls through the Soup pages (which consist of 20 posts each) and downloads
+them to a given output directory.
+
+Usage: fetch-pages URL [ OUTDIR ]
+
+URL is the base domain of your Soup (e.g. 'kitchen.soup.io').
+
+OUTDIR defaults to the current directory. A directory called 'pages' will be
+created inside the output directory.
+
+
+## fetch-enclosures
+
+Tries to download all enclosed images and videos of the previously downloaded
+pages.
+
+Usage: fetch-enclosures [ OUTDIR ]
+
+OUTDIR defaults to the current directory. A directory called 'enclosures' will be
+created inside the output directory; the output of fetch-pages is expected in
+the 'pages' directory inside OUTDIR.
+
+
+## Bugs and missing features
+
+* A failed page download will interrupt fetch-pages. fetch-pages can't resume
+ the backup at the point where it failed; either the base URL or LIMIT needs to
+ be adjusted in the script, or previously downloaded pages need to be removed so
+ that the LIMIT calculation allows downloading the missing pages.
+* fetch-enclosures could be adjusted to try multiple asset servers on failures.
+ Simply re-running fetch-enclosures works in case of transient failures, because
+ the script only attempts to retrieve missing files.
+* Adding a script to extract the HTML code of individual posts from the pages
+ might be interesting to allow mirroring Soups that aren't primarily made up of
+ images and videos to other blog systems.
+
+
+## LICENSE
+
+Copyright (c) 2017, Matthias Schiffer <mschiffer@universe-factory.net>
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright notice,
+ this list of conditions and the following disclaimer.
+ 2. Redistributions in binary form must reproduce the above copyright notice,
+ this list of conditions and the following disclaimer in the documentation
+ and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
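
Note on the "Bugs and missing features" section of the README added above: re-running fetch-enclosures after a transient failure can be automated with a small wrapper. The following is only a sketch and not part of this commit. The `retry` function and the `RETRY_DELAY` variable are hypothetical names, and the sketch assumes that fetch-enclosures exits with a non-zero status when a download fails, which the README does not state.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository: re-run a command until it
# succeeds or the given number of attempts has been used up.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        # Pause between attempts; RETRY_DELAY is an assumed knob (seconds).
        sleep "${RETRY_DELAY:-5}"
    done
    return 0
}

# Example (assumes fetch-enclosures is in the current directory and that
# fetch-pages already stored its output in ./backup/pages):
#   retry 3 ./fetch-enclosures backup
```

Because the README says the script only attempts to retrieve missing files, each extra attempt should only fetch what earlier runs failed to download. The same wrapper would not help fetch-pages, which by the README's own account cannot resume where it failed.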