Also with Redshift: split the file before ingestion so the part count is a multiple of the cluster's slice count (COPY loads files in parallel, one per slice), or combine lots of small files into larger ones before landing them in S3, and/or use an Athena CTAS statement to compact many small files into one big file.
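A minimal sketch of that pre-split step, splitting a delimited file on line boundaries so each part is a valid load file. The function name and part naming are mine, not anything AWS-specific:

```python
import os

def split_for_copy(path: str, num_parts: int, out_dir: str) -> list[str]:
    """Split a delimited file into num_parts pieces on line boundaries,
    so Redshift COPY can load the pieces in parallel (one per slice)."""
    os.makedirs(out_dir, exist_ok=True)
    with open(path, "rb") as f:
        lines = f.readlines()
    per_part = -(-len(lines) // num_parts)  # ceiling division
    parts = []
    for i in range(num_parts):
        batch = lines[i * per_part:(i + 1) * per_part]
        if not batch:
            break  # fewer lines than parts; stop early
        part_path = os.path.join(out_dir, f"part_{i:04d}")
        with open(part_path, "wb") as out:
            out.writelines(batch)
        parts.append(part_path)
    return parts
```

Splitting on lines (not raw bytes) matters so you don't cut a CSV row in half mid-record.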
So in my other case, the whole pipeline was:
Web crawler (internal customer website) using Playwright -> S3 -> SNS -> SQS -> Lambda (embed with Bedrock) -> S3 Vector Store.
Similar to what you said, I ran into Bedrock embedding service limits. Once I told it that, it knew how to adjust the Lambda concurrency limits. Of course I also had to tell it to adjust the SQS settings so messages wouldn't pile up in flight and then land in the DLQ without ever being processed.
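For anyone tuning the same knobs: a back-of-envelope sketch of how the numbers relate. The function and its inputs are mine (not from this thread); the 6x visibility-timeout multiple is AWS's published guidance for SQS-triggered Lambdas, so a throttled-but-running invocation doesn't get its message redelivered and counted as a failure:

```python
def tune_sqs_lambda(bedrock_rpm: int, avg_call_s: float, fn_timeout_s: int) -> dict:
    """Rough sizing for a Bedrock-backed, SQS-triggered Lambda.

    bedrock_rpm  - your Bedrock requests-per-minute quota
    avg_call_s   - average seconds per embedding call
    fn_timeout_s - the Lambda function timeout
    """
    calls_per_min_per_worker = 60 / avg_call_s
    return {
        # cap concurrency so total call rate stays under the Bedrock quota
        "ReservedConcurrency": max(1, int(bedrock_rpm / calls_per_min_per_worker)),
        # AWS guidance: queue visibility timeout >= 6x the function timeout
        "VisibilityTimeout": 6 * fn_timeout_s,
    }
```

E.g. a 600 rpm quota with ~2 s calls and a 60 s timeout works out to 20 reserved concurrent executions and a 360 s visibility timeout.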
The file splitting tip for Redshift is solid. One thing that caught us in a similar SNS/SQS/Lambda/Bedrock setup was not having a DLQ on the Lambda event source. When Bedrock started throttling hard, messages dropped silently and our vector store ended up with gaps we didn't notice for almost a week. Worth adding if you haven't ... it's the kind of thing you only miss once.
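If it helps anyone wiring this up: the DLQ goes on the source queue as a redrive policy. A small sketch of the attribute payload you'd pass to `sqs.set_queue_attributes` (the queue/DLQ names are hypothetical; note SQS expects `maxReceiveCount` as a string inside the JSON):

```python
import json

def redrive_attributes(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Attributes for sqs.set_queue_attributes that route a message to the
    DLQ after max_receive_count failed receives, instead of retrying forever
    or dropping it silently."""
    return {
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": str(max_receive_count),
        })
    }
```

Then alarm on the DLQ's `ApproximateNumberOfMessagesVisible` so throttling shows up as a page, not a week-old gap in the vector store.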