Same pattern in data engineering generally. LLMs default to the obvious row-by-row or download-then-insert approach and you have to steer them toward the efficient path (COPY, bulk loaders, server-side imports). Once you name the right primitive, they execute it correctly, permissions and all, as you found.
The deeper issue is that "efficient ingest" depends heavily on context that's implicit in your setup: file sizes, partitioning, schema evolution expectations, downstream consumers. A Lambda doing direct S3-to-Postgres import is fine for small/occasional files, but if you're dealing with high-volume event-driven ingestion you'll hit connection pool pressure fast on RDS. At that point the conversation shifts to something like a queue buffer or moving toward a proper staging layer (S3 → Redshift/Snowflake/Databricks with native COPY or autoloader). The LLM won't surface that tradeoff unless you explicitly bring it up. It optimizes for the stated task, not for the unstated architectural constraints.
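For reference, the "direct S3-to-Postgres import" case can be sketched roughly as below. This assumes RDS for PostgreSQL with the aws_s3 extension installed and an IAM role attached that can read the bucket. The helper name build_s3_import, the default CSV options, and the table/bucket/key values in the usage note are made up for illustration.

```python
from typing import NamedTuple


class ImportCall(NamedTuple):
    sql: str
    params: tuple


def build_s3_import(
    table: str,
    bucket: str,
    key: str,
    region: str,
    columns: str = "",  # empty string means "all columns, in table order"
    options: str = "(format csv, header true)",  # passed through to COPY
) -> ImportCall:
    """Build a parameterized call to aws_s3.table_import_from_s3.

    Argument order follows the extension's six-argument form:
    table_name, column_list, options, bucket, file_path, region.
    """
    sql = "SELECT aws_s3.table_import_from_s3(%s, %s, %s, %s, %s, %s)"
    return ImportCall(sql, (table, columns, options, bucket, key, region))
```

With a psycopg cursor this would run as cur.execute(call.sql, call.params), where call = build_s3_import("events", "my-bucket", "2024/01/events.csv", "us-east-1"). The database pulls the object itself, so the Lambda never streams rows. It still holds one connection per invocation, though, which is where the pool pressure comes from.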
Also, with Redshift: split large files before ingestion so the number of files matches the number of nodes (strictly, a multiple of the number of slices in the cluster), or combine a lot of small files into larger ones before putting them into S3. You can also use an Athena CTAS query to compact a lot of small files into bigger ones.
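A rough sketch of both directions, in plain Python with no AWS calls. The function names, the parts count, and the target_bytes threshold are placeholders to tune for your own cluster, not anything Redshift prescribes.

```python
def split_evenly(rows: list[str], parts: int) -> list[list[str]]:
    """Split rows into `parts` chunks whose sizes differ by at most one."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    base, extra = divmod(len(rows), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional row each.
        end = start + base + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def group_small_files(sizes: dict[str, int], target_bytes: int) -> list[list[str]]:
    """Greedily pack files, in the given order, into batches of up to target_bytes.

    A single file larger than target_bytes ends up in a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for name, size in sizes.items():
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each chunk from split_evenly would be written out as its own S3 object under a shared prefix so that COPY can load them in parallel. Each batch from group_small_files would be concatenated into one object before upload.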
So in my other case, the whole pipeline was:
Web crawler (internal customer website) using Playwright -> S3 -> SNS -> SQS -> Lambda (embed with Bedrock) -> S3 Vector Store.
Similar to what you said, I ran into Bedrock embedding service limits. Once I told it that, it knew how to adjust the Lambda concurrency limits. Of course, I also had to tell it to adjust the SQS poller so messages wouldn't pile up in flight and then land in the DLQ without ever being processed.
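Here is a back-of-the-envelope version of that tuning. It assumes the "poller" adjustment means the SQS event source mapping's maximum concurrency setting, and that each invocation makes its embedding calls one after another. The function name and the example numbers are hypothetical. The 6x visibility timeout and the receive count of 5 follow AWS's general guidance for SQS-triggered Lambdas.

```python
import math


def plan_consumer_settings(
    embed_rps_limit: float,    # embedding requests/second the account is allowed
    seconds_per_embed: float,  # typical latency of one embedding call
    function_timeout_s: int,   # configured Lambda timeout
) -> dict:
    """Derive SQS/Lambda settings that keep the consumer under a rate limit."""
    # Each worker issues about 1/seconds_per_embed requests per second, so
    # this many concurrent workers roughly saturate the limit.
    workers = math.floor(embed_rps_limit * seconds_per_embed)
    return {
        # The event source mapping's MaximumConcurrency cannot go below 2.
        "maximum_concurrency": max(2, workers),
        # Leave room for retries before a message becomes visible again.
        "visibility_timeout_s": 6 * function_timeout_s,
        # Tolerate a few throttled receives before the message hits the DLQ.
        "max_receive_count": 5,
    }
```

For example, plan_consumer_settings(20, 0.5, 60) gives a maximum concurrency of 10 and a visibility timeout of 360 seconds. In boto3 those values would go into update_event_source_mapping (ScalingConfig) and set_queue_attributes (VisibilityTimeout, RedrivePolicy). Capping only the function's reserved concurrency tends to cause exactly the symptom described: the poller keeps receiving messages, the invocations get throttled, and the receive count climbs until the message is moved to the DLQ.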