When assessing the two solutions, reviewers found Apache Airflow easier to use and administer. However, reviewers found AWS Step Functions easier to set up and preferred doing business with AWS Step Functions overall.
Step Functions is perfect for implementing stateful behavior while using stateless workers (Lambda functions). It's fairly easy to add conditional operations or operations on array-like inputs. This makes Step Functions very powerful.
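To make the point about conditional operations and array-like inputs concrete, here is a minimal sketch of a state machine definition in Amazon States Language, built as a Python dictionary. It pairs a `Choice` state (the conditional) with a `Map` state (one stateless worker invocation per array element). The state names, the `items` input field, the concurrency limit, and the Lambda ARN are all hypothetical placeholders and not taken from the review.

```python
import json

# Hypothetical Lambda ARN (assumption); replace with a real function before deploying.
PROCESS_ITEM_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-item"


def build_definition(lambda_arn: str) -> dict:
    """Build an Amazon States Language definition as a plain dict.

    Assumes the execution input looks like {"items": [...]}.
    """
    return {
        "Comment": "Sketch: branch on input, then fan out over an array",
        "StartAt": "HasItems",
        "States": {
            # Conditional operation: continue only if the array has a first element.
            "HasItems": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.items[0]",
                        "IsPresent": True,
                        "Next": "ProcessEachItem",
                    }
                ],
                "Default": "NothingToDo",
            },
            # Array-like input: run the inner workflow once per element of $.items.
            "ProcessEachItem": {
                "Type": "Map",
                "ItemsPath": "$.items",
                "MaxConcurrency": 5,  # arbitrary limit chosen for illustration
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "ProcessItem",
                    "States": {
                        # Stateless worker: a Lambda function handles one item.
                        "ProcessItem": {
                            "Type": "Task",
                            "Resource": lambda_arn,
                            "End": True,
                        }
                    },
                },
                "End": True,
            },
            "NothingToDo": {"Type": "Succeed"},
        },
    }


if __name__ == "__main__":
    # Print the JSON that could be supplied as a state machine definition.
    print(json.dumps(build_definition(PROCESS_ITEM_ARN), indent=2))
```

The state machine tracks which step runs next and how many items are left, while each Lambda invocation only sees a single item, which is the stateful-orchestration-over-stateless-workers pattern the reviewer describes.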
Step Functions can only submit one Spark Streaming job to an EMR cluster. It should be enhanced to submit multiple Spark Streaming jobs to the same EMR cluster in parallel.
The best part of Airflow is that it is open source and can run in a distributed environment. It provides out-of-the-box operators (connectors) to plug in almost any data source.
The browser UI is very buggy and sometimes has issues during my projects, which can make my work fall behind.