An admin application sends a request to start the reporting ETL workflow.
The decider assigns the first task to on-premises data extraction workers to extract data from a transactional database.
Upon completion, the decider assigns the next task to the EMR Starter to launch an EMR ETL cluster to validate and transform the extracted data set.
Upon completion, the decider assigns the last task to the Amazon Redshift Data Loader to load the transformed data into Amazon Redshift.
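The decider's sequencing in the steps above can be sketched as a small pure function. This is only an illustration, not the Amazon SWF API: the activity names `ExtractData`, `StartEmrEtl`, and `LoadRedshift` and the function `next_activity` are hypothetical, chosen to mirror the three workers described.

```python
from typing import List, Optional

# Hypothetical activity names, one per worker described in the steps above.
PIPELINE = ["ExtractData", "StartEmrEtl", "LoadRedshift"]


def next_activity(completed: List[str]) -> Optional[str]:
    """Return the next activity to schedule, or None when the workflow is done.

    `completed` is the list of activities that have already finished,
    as a decider would learn from the workflow history.
    """
    for activity in PIPELINE:
        if activity not in completed:
            return activity
    return None
```

Called with an empty history, the function returns the extraction step; once all three activities are complete it returns `None`, which a real decider would treat as the signal to close the workflow.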
This workflow uses SWF in place of cron, which automates failure handling and scaling. The same approach works if you want to run your cron job on a pool of on-premises machines: with a pool, no single machine is a single point of failure, which is not possible with the traditional operating system cron.
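The idea of removing the single point of failure can be illustrated with a local simulation. This sketch does not call Amazon SWF; the function `run_on_pool` and the host names in the usage note are hypothetical, and the retry loop stands in for the service handing a failed or timed-out task to another worker.

```python
from typing import Callable, List


def run_on_pool(job: Callable[[str], str], workers: List[str]) -> str:
    """Try the job on each worker in turn and return the first successful result.

    Raises RuntimeError only if every worker in the pool fails.
    """
    errors = []
    for worker in workers:
        try:
            return job(worker)
        except Exception as exc:  # any failure moves the job to the next worker
            errors.append(f"{worker}: {exc}")
    raise RuntimeError("all workers failed: " + "; ".join(errors))
```

For example, if a job raises on `host-a` but succeeds on `host-b`, `run_on_pool(job, ["host-a", "host-b"])` still returns the result from `host-b`, whereas a cron entry on `host-a` alone would simply have failed.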