replicating & filtering larissa's couchdb

log in to the LRS VM

$ ssh user@machine.ic.uva.nl

get the design doc

$ curl -X GET localhost:5984/larissa/_design/statements | python -m json.tool > ddoc.json

(this is a regular couchDB request: _design/statements is the doc's whole id)

thanks to tjeerd for cluing me in to python -m json.tool for prettyprinting JSON

edit it to include a filter function

// before
{
    // ...
    "filters": {},
    // ...
}

// code up a filter
    "filters": {
        "testfilter": function(doc,req){
            // couchdb calls this function once for every doc
            // it should return true for every doc that passes the filter
            // and false for all others
            // (couchdb will error out if the function misses any docs)
            // don't forget a condition to catch the design doc itself

            if (doc.verb && doc.verb.indexOf('blat') === -1) {
                return true;
            } else {
                return false;
            }
        }
    },

// the actual value of the "testfilter" property should be a (minified) string, though:
{
    // ...
    "filters": {
        "testfilter": "function(doc,req){if(doc.verb && doc.verb.indexOf('blat')===-1){return true;}else{return false;}}"
    },
    // ...
}
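
if you'd rather not minify by hand, the filter can be written as a normal function, tried on a few docs, and turned into a string by javascript itself. a rough sketch, assuming node is available: the sample docs are made up, and toFilterString is a hypothetical helper, not something couchdb ships.

```javascript
// the filter from above, as an anonymous function
// (no comments inside: the whitespace squeeze below would break // comments)
var testfilter = function (doc, req) {
    if (doc.verb && doc.verb.indexOf('blat') === -1) {
        return true;
    } else {
        return false;
    }
};

// hypothetical helper: collapse the function source onto one line
// naive: it also squeezes whitespace inside string literals
function toFilterString(fn) {
    return fn.toString().replace(/\s+/g, ' ');
}

// made-up sample docs, including a stand-in for the design doc
var samples = [
    { _id: 'a', verb: 'http://example.org/verbs/read' },
    { _id: 'b', verb: 'http://example.org/verbs/blat' },
    { _id: '_design/statements', filters: {} }
];

// print which docs would pass the filter
samples.forEach(function (doc) {
    console.log(doc._id, testfilter(doc, {}));
});

// print the "filters" fragment ready to paste into ddoc.json
console.log(JSON.stringify({ testfilter: toFilterString(testfilter) }, null, 4));
```

note that the stand-in design doc comes out false, because it has no verb property.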

PUT this design doc back as a new revision

the JSON already includes a _rev property, which couchdb needs in order to accept a new revision. it should also contain "language": "javascript", so edit that in if for some reason it doesn't
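
the same checks can be scripted instead of eyeballed. a minimal sketch, assuming node: addFilter is a hypothetical helper, and the design doc below is made up to look like what the GET returns (in practice it would come from JSON.parse on the contents of ddoc.json).

```javascript
// hypothetical helper: return a copy of a design doc with one filter added
function addFilter(ddoc, name, source) {
    if (!ddoc._rev) {
        // without _rev, couchdb rejects the PUT of an existing doc as a conflict
        throw new Error('design doc has no _rev, GET it again');
    }
    var patched = JSON.parse(JSON.stringify(ddoc)); // cheap deep copy
    patched.filters = patched.filters || {};
    patched.filters[name] = source;
    if (!patched.language) {
        patched.language = 'javascript';
    }
    return patched;
}

// made-up design doc and the minified filter string from above
var ddoc = { _id: '_design/statements', _rev: '1-abc', views: {}, filters: {} };
var source = "function(doc,req){if(doc.verb && doc.verb.indexOf('blat')===-1){return true;}else{return false;}}";

// print the patched doc, ready to be saved and PUT
console.log(JSON.stringify(addFilter(ddoc, 'testfilter', source), null, 4));
```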

$ curl -X PUT -H "Content-Type: application/json" localhost:5984/larissa/_design/statements -d @ddoc.json

also remove your local copy of the design doc: if you want to edit it again, you'll need to GET it again because the _rev will have changed

$ rm ddoc.json

check that it's correct if you feel like it

$ curl -X GET localhost:5984/larissa/_design/statements | python -m json.tool | less

issue the actual replication command

$ curl -X POST -H "Content-Type: application/json" localhost:5984/_replicate \
-d '{"source":"larissa","target":"larissa_rep","create_target":true,"filter":"statements/testfilter"}'

important:

check that this worked as well if you feel like it

$ curl -X GET localhost:5984/_all_dbs
$ curl -X GET localhost:5984/larissa/_all_docs
$ curl -X GET localhost:5984/larissa_rep/_all_docs
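
comparing the two _all_docs listings by eye gets old fast. a small sketch, assuming node and that both responses have been saved and parsed: missingIds is a hypothetical helper and the two listings are made up, only their shape (rows with id, key, value.rev) matches what _all_docs returns.

```javascript
// hypothetical helper: ids present in the source listing but not in the replica
function missingIds(sourceAllDocs, targetAllDocs) {
    var seen = new Set(targetAllDocs.rows.map(function (row) { return row.id; }));
    return sourceAllDocs.rows
        .map(function (row) { return row.id; })
        .filter(function (id) { return !seen.has(id); });
}

// made-up responses in the shape _all_docs returns
var larissa = { total_rows: 3, offset: 0, rows: [
    { id: '_design/statements', key: '_design/statements', value: { rev: '2-def' } },
    { id: 'a', key: 'a', value: { rev: '1-abc' } },
    { id: 'b', key: 'b', value: { rev: '1-abc' } }
] };
var larissaRep = { total_rows: 1, offset: 0, rows: [
    { id: 'a', key: 'a', value: { rev: '1-abc' } }
] };

// print the ids the filter kept out of the replica
console.log(missingIds(larissa, larissaRep));
```

the ids it prints should be exactly the docs you meant to filter out. with a filter like testfilter that only passes docs with a verb, the design doc itself will be among them.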

the old switcheroo

find your config files

$ couchdb -c
//-> /etc/couchdb/default.ini
//-> /etc/couchdb/local.ini

find your db files

$ sudo su - couchdb
$ grep database_dir /etc/couchdb/local.ini
//-> database_dir = /data/couches/lrs/couchdb

$ ls -al /data/couches/lrs/couchdb
//-> drwxr-xr-x 4 couchdb couchdb  4096 Dec 15 16:17 .
//-> drwxr-xr-x 3 couchdb couchdb  4096 Nov 20 16:53 ..
//-> drwxrwxr-x 2 couchdb couchdb  4096 Dec 15 14:59 .delete
//-> -rw-rw-r-- 1 couchdb couchdb 20588 Dec 15 16:17 larissa.couch
//-> drwxrwxr-x 3 couchdb couchdb  4096 Dec 15 15:01 .larissa_design
//-> -rw-rw-r-- 1 couchdb couchdb 12393 Dec 15 16:17 larissa_rep.couch
//-> -rw-r--r-- 1 couchdb couchdb  4194 Nov 20 16:53 _replicator.couch
//-> -rw-r--r-- 1 couchdb couchdb  4194 Nov 20 16:53 _users.couch

back up the old db

$ cp /data/couches/lrs/couchdb/larissa.couch backups/
$ exit

delete original, replicate in reverse with #nofilter (protip: alias this shit to shorter commands, and cut downtime to an absolute minimum)

$ curl -X DELETE localhost:5984/larissa
//-> {"ok":true}
$ curl -X POST -H "Content-Type: application/json" localhost:5984/_replicate \
-d '{"source":"larissa_rep","target":"larissa","create_target":true}'

test that it worked

$ curl -X GET localhost:5984/larissa/_all_docs

congrats you're a db-maintenance god