We have a similar use case: an all-Elixir codebase that needs Python for ML libraries. We decided to use IPC: Elixir spawns a process and communicates over stdio. https://github.com/akash-akya/ex_cmd makes it a breeze to stream stdin and stdout. This also has the added benefit of keeping the Python side completely stateless and all the domain logic on the Elixir side. Spawning a process might be slower than enqueuing a job, but in our case the job usually takes long enough to make that irrelevant.
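The stateless Python side of that pattern can be sketched roughly like this. The JSON-lines framing and the `predict` stub are illustrative assumptions, not ex_cmd's wire format (ex_cmd just streams raw bytes; the framing is up to you):

```python
import json
import sys

def predict(features):
    # Placeholder for the real ML call; here we just sum the inputs.
    return {"score": sum(features)}

def serve(stdin=sys.stdin, stdout=sys.stdout):
    # One JSON request per line in, one JSON response per line out.
    # No state survives between requests, so all domain logic can
    # stay on the Elixir side.
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        request = json.loads(line)
        response = predict(request["features"])
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()
```

On the Elixir side you'd stream lines into this process's stdin and read responses off stdout; killing the process loses nothing, since it holds no state.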
We also had a similar use case, so I built Snex[0] specifically for Elixir-Python interop. The Elixir side spawns interpreters in Ports managed by GenServers; the Python side has a thin asyncio runtime to run arbitrary user code. Declarative environments (uv), optimized serde for language-specific objects (like `%MapSet{}` <-> `set`), etc. Interpreters are meant to be long-lived, so you pay for initialization once.
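As a rough illustration of what such a serde mapping for language-specific objects could look like, here is a hypothetical tagged-JSON scheme (an assumption for this sketch, not Snex's actual wire format):

```python
import json

def encode(obj):
    # Tag Python sets so the other side can rebuild a %MapSet{}.
    if isinstance(obj, set):
        # Sort for a deterministic encoding of unordered data.
        return {"__type__": "set", "items": sorted(obj)}
    return obj

def decode(obj):
    # Rebuild a set from a tagged dict; pass everything else through.
    if isinstance(obj, dict) and obj.get("__type__") == "set":
        return set(obj["items"])
    return obj

def dumps(obj):
    return json.dumps(encode(obj))

def loads(s):
    # object_hook runs decode on every decoded JSON object.
    return json.loads(s, object_hook=decode)
```

This only tags top-level sets; a real implementation would recurse into nested containers (and handle many more types).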
It's a very different approach from ex_cmd, as it's not really focused on the streaming-data use case. Mine is very command/reply oriented, though commands can flow both ways (calling BEAM modules from Python). The assumption is that big data is passed around out of band; I may have to revisit that.
Similar use case here as well: I use Erlang ports to spawn a Python process. Error handling is a mess, but using Python as a thin scripting layer and Elixir for all the database/application/architecture work has worked out very well.
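For reference, the Python side of a port opened with `{:packet, 4}` (Erlang's length-prefixed framing: each message is a 4-byte big-endian length followed by the payload) can be sketched like this; `handle` is a placeholder, not anyone's actual code:

```python
import struct
import sys

def read_message(stream):
    # 4-byte big-endian length header, then the payload.
    header = stream.read(4)
    if len(header) < 4:
        return None  # port closed
    (length,) = struct.unpack(">I", header)
    return stream.read(length)

def write_message(stream, payload):
    stream.write(struct.pack(">I", len(payload)) + payload)
    stream.flush()

def handle(payload):
    return payload.upper()  # placeholder for the real work

def loop(stdin, stdout):
    # Read framed requests until the BEAM side closes the port.
    while True:
        msg = read_message(stdin)
        if msg is None:
            break
        write_message(stdout, handle(msg))
```

In a real script you'd call `loop(sys.stdin.buffer, sys.stdout.buffer)` so you get binary streams rather than text.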
I have one vibecoded ml pipeline now and I'm strongly considering just clauding it into Nx so I can ditch the python
Is this part of a web server or some other system where you could end up spawning N python processes instead of 1 at a time?
Honestly you saved yourself major possible headaches down the line with this approach.
At my work we run a fairly large webshop and have a ridiculous number of jobs running at all times. At this point most run in Sidekiq, but a sizeable portion remain in Resque simply because it does just that: starts a process per job.
Resque workers start by creating a fork, and that becomes the actual worker.
So when you allocate half your available RAM for the job, it's all discarded and returned to the OS when the fork exits, which is FANTASTIC.
Sidekiq, like most job queues, uses threads, which is great, but all RAM allocated to the process stays allocated, and generally unused. It's especially bad with plain malloc. We used jemalloc for a while, which helped since it allocates memory better for multithreaded applications, but the easiest fix is to just create a process.
I don't know how memory-intensive ML is; what generally screwed us over was image processing (ImageMagick and its many memory leaks) and... large CSV files. Come to think of it, you made an excellent architectural choice.
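The fork-then-discard pattern described above can be sketched in a few lines (illustrative only; `run_job_in_fork` is a hypothetical helper, not Resque's actual code):

```python
import os

def run_job_in_fork(job):
    # Run the memory-hungry job in a child process so everything it
    # allocates is returned to the OS when the child exits, instead of
    # lingering in a long-lived worker's heap.
    pid = os.fork()
    if pid == 0:
        # Child: allocate as much as the job needs; it all dies with us.
        try:
            job()
            os._exit(0)
        except Exception:
            os._exit(1)
    # Parent: wait for the child and report whether it succeeded.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

The trade-off is fork overhead per job and the need to pass results back out of band (exit status, a pipe, a file, the database), which is exactly why it only pays off for big, infrequent jobs.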
This might be of interest to others: last night I stumbled across Hornbeam, a library in a similar vein from the author of Gunicorn. It handles WSGI/ASGI apps and includes a specific wrapper for ML inference.
https://erlangforums.com/t/hornbeam-wsgi-asgi-server-for-run... https://github.com/benoitc/hornbeam