notes on kettle jobs

generally:

FNWI/CJKR

get cmd args

get latest timestamp

    generate rows (input/generate rows)
    set header (scripting/modified java script value)
    statements from LRS (lookup/rest client)
    parse json (input/json input)
    max of timestamp (statistics/memory group by)
    write to log (utility/write to log)
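for sanity-checking what the get latest timestamp transformation should come up with, the same lookup can be roughed out from the command line. This is only a sketch, not what the transformation does internally: it assumes the LRS exposes a standard xAPI statements endpoint, and LRS_URL, LRS_USER, LRS_PASS and the 1.0.3 version header are placeholders, not the real values. The grep/sed parsing is crude, and timestamps are compared as plain strings, which only works when they all share one format and timezone.

```shell
#!/bin/sh
# Placeholders (assumptions): point these at the real LRS before use.
LRS_URL="${LRS_URL:-https://lrs.example.org/xapi}"
LRS_USER="${LRS_USER:-user}"
LRS_PASS="${LRS_PASS:-pass}"

# Fetch one page of statements. xAPI expects the version header on requests;
# the version number used here is an assumption.
fetch_statements() {
    curl -s -u "$LRS_USER:$LRS_PASS" \
         -H "X-Experience-API-Version: 1.0.3" \
         "$LRS_URL/statements?limit=100"
}

# Read statement JSON on stdin and print the largest "timestamp" value.
# String sort only; assumes uniform ISO 8601 timestamps in one timezone.
latest_timestamp() {
    grep -o '"timestamp"[[:space:]]*:[[:space:]]*"[^"]*"' \
        | sed 's/.*"\([^"]*\)"$/\1/' \
        | LC_ALL=C sort \
        | tail -n 1
}
```

usage would be something like `fetch_statements | latest_timestamp`, then compare the result with what the write to log step reports.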

user data to lrs

    courses and consenting students (input/table input)
    pre pare down (transform/select values)
    more queries (input/table input)
    row to statement (scripting/modified java script value)
    pare down (transform/select values)
    statement to LRS (lookup/rest client)
    write to log (utility/write to log)

Annemarie

user data to csv

    get filenames (job/get variables)
    read course/user ids (input/text file input)
    join rows (joins/join rows (cartesian product))
    dupe (scripting/modified java script value)
    query db for participants (input/table input)
    query db for anon activity/dsll (input/table input)
    anon data to vectors (statistics/memory group by)
    join rows 2+3 (joins/join rows (cartesian product))
    aggregate (scripting/modified java script value)
    csv output (output/text file output)

mail

deploy notes

[sudo] crontab -e
0 * * * * sh path/to/kitchen.sh -norep -file path/to/job.kjb arg arg2 >> cron.log 2>&1

or, if you’re doing a one-time thing:

nohup sh path/to/kitchen.sh -norep -file path/to/job.kjb arg arg2 &
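
if the cron log gets hard to read, a small wrapper around kitchen.sh can stamp each run and record a failing exit status. A minimal sketch, assuming the same placeholder paths as above; KITCHEN, JOB, LOG and the run_job name are made up for illustration and are not part of the actual setup.

```shell
#!/bin/sh
# Hypothetical wrapper around kitchen.sh for use from cron.
# KITCHEN, JOB and LOG are placeholder paths, not the real ones.
KITCHEN="${KITCHEN:-path/to/kitchen.sh}"
JOB="${JOB:-path/to/job.kjb}"
LOG="${LOG:-cron.log}"

run_job() {
    # Stamp each run so successive hourly runs can be told apart in the log.
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') starting $JOB $*" >> "$LOG"
    sh "$KITCHEN" -norep -file "$JOB" "$@" >> "$LOG" 2>&1
    rc=$?
    # Kitchen exits with 0 when the job finished without errors.
    if [ "$rc" -ne 0 ]; then
        echo "=== kitchen exited with status $rc" >> "$LOG"
    fi
    return "$rc"
}
```

saved as a script with `run_job "$@"` as its last line, the crontab entry would call that script with the job arguments instead of calling kitchen.sh directly.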

version control

it’s all on subversion:

https://source.ic.uva.nl/repos/svn/odg-1/Cameron/etl/kettle
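
to get a working copy on a new machine, something like the following should do. A sketch only: the fetch_jobs name and the default target directory `kettle` are arbitrary choices, and it assumes the svn command-line client is installed and you have access to the repository.

```shell
#!/bin/sh
# Repository URL from these notes; the target directory name is an arbitrary choice.
REPO_URL="https://source.ic.uva.nl/repos/svn/odg-1/Cameron/etl/kettle"

# Check out a fresh working copy, or update it if one already exists.
fetch_jobs() {
    target="${1:-kettle}"
    if [ -d "$target/.svn" ]; then
        svn update "$target"
    else
        svn checkout "$REPO_URL" "$target"
    fi
}
```

calling `fetch_jobs` with no argument checks out into ./kettle; pass a path to put it elsewhere.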