replicating & filtering larissa's couchdb
log in to the LRS VM
$ ssh user@machine.ic.uva.nl
(ssh prompts for the password; it does not accept the user:pass@host form)
get the design doc
$ curl -X GET localhost:5984/larissa/_design/statements | python -m json.tool > ddoc.json
(this is a regular couchDB request: _design/statements is the doc's whole id)
thanks to tjeerd for cluing me in to python -m json.tool for prettyprinting JSON
edit it to include a filter function
// before
{
// ...
"filters": {},
// ...
}
// code up a filter
"filters": {
"testfilter": function(doc,req){
// will loop over all docs
// should return true for every doc that passes the filter
// and false for all others
// (couchdb will error out if the function misses any docs)
// don't forget a condition to catch the design doc itself
if (doc.verb && doc.verb.indexOf('blat') === -1) {
return true;
} else {
return false;
}
}
},
// the actual value of the "testfilter" property should be a (minified) string, though:
{
// ...
"filters": {
"testfilter": "function(doc,req){if(doc.verb && doc.verb.indexOf('blat')===-1){return true;}else{return false;}}"
},
// ...
}
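a quick way to sanity-check the filter before putting it in the design doc is to run it locally with node. this is only a sketch: the sample docs are made up (the real statements in larissa may look different), and it relies on Function.prototype.toString() to get the string form, which is not minified but is still a valid string value once JSON-encoded.

```javascript
// made-up sample docs; the last one stands in for the design doc (no verb)
const sampleDocs = [
  { _id: 'a', verb: 'viewed' },
  { _id: 'b', verb: 'blat' },
  { _id: '_design/statements', language: 'javascript' }
];

// same logic as the filter above, as an anonymous function expression
const testfilter = function (doc, req) {
  if (doc.verb && doc.verb.indexOf('blat') === -1) {
    return true;
  } else {
    return false;
  }
};

// ids of the docs that would pass the filter
const passed = sampleDocs
  .filter(function (doc) { return testfilter(doc, {}); })
  .map(function (doc) { return doc._id; });
console.log(passed); // [ 'a' ]

// the design doc needs the function source as a string, not a function
const ddocFragment = { filters: { testfilter: testfilter.toString() } };
console.log(JSON.stringify(ddocFragment, null, 2));
```

the design doc stand-in fails the filter because it has no verb, which is the "condition to catch the design doc itself" mentioned in the comments.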
PUT this design doc back as a new revision
the JSON already includes a _rev property, which is necessary to add a new revision; additionally it should contain "language": "javascript", so edit that in if for some reason it doesn't
$ curl -X PUT -H "Content-Type: application/json" localhost:5984/larissa/_design/statements -d @ddoc.json
also remove your local copy of the design doc: if you want to edit it again, you'll need to GET it again because the _rev will have changed
$ rm ddoc.json
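if you'd rather script the edit than do it by hand, something like the following could work. it is a sketch under assumptions: withFilter is a hypothetical helper (not part of couchdb), and the fetched object with its _rev value is a made-up stand-in for what the GET returns.

```javascript
// hypothetical helper: returns a copy of a fetched design doc with one filter added
function withFilter(ddoc, name, source) {
  const copy = JSON.parse(JSON.stringify(ddoc)); // deep copy, leaves the input alone
  copy.language = copy.language || 'javascript'; // add "language" only if missing
  copy.filters = copy.filters || {};
  copy.filters[name] = source;                   // filter source must be a string
  return copy;                                   // _id and _rev are carried over as-is
}

// made-up stand-in for the fetched design doc; the real _rev will differ
const fetched = { _id: '_design/statements', _rev: '1-abc', filters: {} };

const updated = withFilter(
  fetched,
  'testfilter',
  "function(doc,req){if(doc.verb && doc.verb.indexOf('blat')===-1){return true;}else{return false;}}"
);

console.log(updated._rev);     // 1-abc (unchanged, needed for the PUT)
console.log(updated.language); // javascript
console.log(Object.keys(updated.filters)); // [ 'testfilter' ]
```

the point is that _rev is left untouched: couchdb uses it to decide whether the PUT is an update of the current revision.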
check that it's correct if you feel like it
$ curl -X GET localhost:5984/larissa/_design/statements | python -m json.tool | less
issue the actual replication command
$ curl -X POST -H "Content-Type: application/json" localhost:5984/_replicate \
-d '{"source":"larissa","target":"larissa_rep","create_target":true,"filter":"statements/testfilter"}'
important:
- you POST right to the couchdb instance, not to a specific DB
- the value of create_target is a boolean, not a string
- the filter lives under the namespace of the design doc
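to make the boolean-vs-string point concrete, here is a small sketch that builds the same request body in javascript and shows what the wrong version looks like. the variable names are only for illustration.

```javascript
// the replication request body, built as an object
const body = {
  source: 'larissa',
  target: 'larissa_rep',
  create_target: true,             // boolean true, no quotes
  filter: 'statements/testfilter'  // <design doc name>/<filter name>
};
const json = JSON.stringify(body);
console.log(json);
// {"source":"larissa","target":"larissa_rep","create_target":true,"filter":"statements/testfilter"}

// the mistake to avoid: a string instead of a boolean
const wrong = JSON.stringify({ create_target: 'true' });
console.log(wrong); // {"create_target":"true"}
```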
check that this worked as well if you feel like it
$ curl -X GET localhost:5984/_all_dbs
$ curl -X GET localhost:5984/larissa/_all_docs
$ curl -X GET localhost:5984/larissa_rep/_all_docs
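eyeballing two _all_docs listings gets tedious fast, so here is a sketch of how the two responses could be compared. missingIds is a hypothetical helper, and both responses are made up; they only follow the shape _all_docs returns (total_rows, offset, rows).

```javascript
// hypothetical helper: ids present in the source response but absent from the target
function missingIds(sourceResp, targetResp) {
  const inTarget = new Set(targetResp.rows.map(function (row) { return row.id; }));
  return sourceResp.rows
    .map(function (row) { return row.id; })
    .filter(function (id) { return !inTarget.has(id); });
}

// made-up responses for larissa and larissa_rep
const sourceResp = { total_rows: 3, offset: 0, rows: [
  { id: '_design/statements', key: '_design/statements', value: { rev: '2-aaa' } },
  { id: 'doc1', key: 'doc1', value: { rev: '1-bbb' } },
  { id: 'doc2', key: 'doc2', value: { rev: '1-ccc' } }
] };
const targetResp = { total_rows: 1, offset: 0, rows: [
  { id: 'doc1', key: 'doc1', value: { rev: '1-bbb' } }
] };

const filteredOut = missingIds(sourceResp, targetResp);
console.log(filteredOut); // [ '_design/statements', 'doc2' ]
```

with the filter from this walkthrough the design doc shows up in the list too, since it has no verb and so doesn't pass.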
the old switcheroo
find your config files
$ couchdb -c
//-> /etc/couchdb/default.ini
//-> /etc/couchdb/local.ini
find your db files
$ sudo su - couchdb
$ cat /etc/couchdb/local.ini | grep database_dir
//-> database_dir = /data/couches/lrs/couchdb
$ ls -al /data/couches/lrs/couchdb
//-> drwxr-xr-x 4 couchdb couchdb 4096 Dec 15 16:17 .
//-> drwxr-xr-x 3 couchdb couchdb 4096 Nov 20 16:53 ..
//-> drwxrwxr-x 2 couchdb couchdb 4096 Dec 15 14:59 .delete
//-> -rw-rw-r-- 1 couchdb couchdb 20588 Dec 15 16:17 larissa.couch
//-> drwxrwxr-x 3 couchdb couchdb 4096 Dec 15 15:01 .larissa_design
//-> -rw-rw-r-- 1 couchdb couchdb 12393 Dec 15 16:17 larissa_rep.couch
//-> -rw-r--r-- 1 couchdb couchdb 4194 Nov 20 16:53 _replicator.couch
//-> -rw-r--r-- 1 couchdb couchdb 4194 Nov 20 16:53 _users.couch
back up the old db
$ cp /data/couches/lrs/couchdb/larissa.couch backups/
$ exit
delete original, replicate in reverse with #nofilter (protip: alias this shit to shorter commands, and cut downtime to an absolute minimum)
$ curl -X DELETE localhost:5984/larissa
//-> {"ok":true}
$ curl -X POST -H "Content-Type: application/json" localhost:5984/_replicate \
-d '{"source":"larissa_rep","target":"larissa","create_target":true}'
test that it worked
$ curl -X GET localhost:5984/larissa/_all_docs