Orchestera (https://orchestera.com/) - Fully managed Apache Spark clusters in your own AWS account with no additional compute markups, unlike EMR and Databricks.
Currently implemented:
- Automated scale in / scale out of nodes for Spark executors and drivers via Karpenter
- Jupyter notebook integration that works as a Spark driver for quick iteration and prototyping
- Simple JSON-based IAM permissions management via AWS Parameter Store
Work in progress this month:
- JupyterHub-based Spark notebook provisioning
- Spark History Server
- Spark History Server MCP support with a chat interface for debugging and diagnosing Spark pipelines
Open to feedback and connecting. Docs at https://docs.orchestera.com/