bash - Save Sqoop incremental import ID
I have a lot of Sqoop jobs running on AWS EMR, and I need to turn off the instance.
Is there a way to save the last ID of an incremental import, maybe locally, and upload it to S3 via a cron job?
My first idea is that, when I create the job, I send a request to Redshift, where my data is stored, and get the last id or last_modified value via a bash script.
Another idea is to take the output of sqoop job --show $jobid, filter out the last_id parameter, and use it to create the job again.
But I don't know whether Sqoop offers an easier way to do this.
As per the Sqoop docs:
If an incremental import is run from the command line, the value which should be specified as --last-value in a subsequent incremental import will be printed to the screen for your reference. If an incremental import is run from a saved job, this value will be retained in the saved job. Subsequent runs of sqoop job --exec someincrementaljob will continue to import only newer rows than those previously imported.
So you don't need to store anything yourself. Sqoop's metastore takes care of saving the last value and makes it available to the next incremental import job.
For example, create the job:
sqoop job \
  --create new_job \
  -- \
  import \
  --connect jdbc:mysql://localhost/testdb \
  --username xxxx \
  --password xxxx \
  --table employee \
  --incremental append \
  --check-column id \
  --last-value 0
Then start the job with the --exec parameter:
sqoop job --exec new_job
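If you want to check what the saved job has recorded, or keep a copy outside the machine, something like the sketch below may help. By default the saved-job metastore lives under $HOME/.sqoop on the machine that runs Sqoop, so it does not survive the loss of that instance on its own. The sketch assumes that `sqoop job --show` prints a line of the form `incremental.last.value = <value>`, so check that against the output of your Sqoop version. The function names, the temporary file path, and the S3 bucket are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read the last imported value recorded in a saved Sqoop job.
# Assumption: `sqoop job --show <job>` prints a line such as
#   incremental.last.value = 42
# Verify this against your Sqoop version before relying on it.

# Reads `sqoop job --show` output on stdin and prints the recorded value
# (prints nothing if the property is not present).
extract_last_value() {
  sed -n 's/^[[:space:]]*incremental\.last\.value[[:space:]]*=[[:space:]]*//p' | head -n 1
}

# Prints the last value for the given saved job; returns 1 if none is found.
# Note: Sqoop may prompt for the database password when showing a job.
show_last_value() {
  local job="$1" value
  value="$(sqoop job --show "$job" | extract_last_value)"
  [ -n "$value" ] || return 1
  printf '%s\n' "$value"
}

# Example usage (hypothetical bucket name; requires the AWS CLI):
#   show_last_value new_job > /tmp/new_job.last_value \
#     && aws s3 cp /tmp/new_job.last_value s3://my-bucket/sqoop/new_job.last_value
```

The parsing is kept in its own function so it can be tried against captured output without a running Sqoop installation. A value saved this way could later be passed as --last-value when recreating the job on a new instance.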