Hacker News

simonw yesterday at 8:28 PM, 8 replies

I wonder how much this thing costs to run.

https://github.com/anthropics/defending-code-reference-harne... says:

> As a rough guideline, expect ~10K uncached input tokens/min and ~2K output tokens/min per agent. You can scale parallelism up to your account's ITPM limit (roughly 10 agents per 100K ITPM).

My guess would be hundreds of dollars with Opus and thousands of dollars with Mythos.
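For anyone who wants to plug in their own numbers, here is a rough back-of-the-envelope sketch of that arithmetic. The per-agent token rates come from the README quote above; the function names and the per-million-token prices are my own placeholders, not taken from any price list, and the sketch ignores cached input tokens entirely.

```python
# Rough per-agent rates quoted from the README guideline above.
INPUT_TOKENS_PER_MIN = 10_000   # uncached input tokens per agent per minute
OUTPUT_TOKENS_PER_MIN = 2_000   # output tokens per agent per minute


def max_agents(itpm_limit: int) -> int:
    """Agents that fit under an input-tokens-per-minute (ITPM) limit."""
    return itpm_limit // INPUT_TOKENS_PER_MIN


def scan_cost(agents: int, hours: float,
              input_price_per_mtok: float,
              output_price_per_mtok: float) -> float:
    """Estimated cost of running `agents` in parallel for `hours`.

    Prices are per million tokens. Cached input tokens are not modeled.
    """
    agent_minutes = agents * hours * 60
    input_tokens = agent_minutes * INPUT_TOKENS_PER_MIN
    output_tokens = agent_minutes * OUTPUT_TOKENS_PER_MIN
    total = (input_tokens * input_price_per_mtok
             + output_tokens * output_price_per_mtok)
    return total / 1_000_000


if __name__ == "__main__":
    agents = max_agents(100_000)  # 10, matching "10 agents per 100K ITPM"
    # Placeholder prices (hypothetical): 15 per Mtok in, 75 per Mtok out.
    print(scan_cost(agents, hours=2, input_price_per_mtok=15.0,
                    output_price_per_mtok=75.0))  # 360.0 with these placeholders
```

How long a scan actually runs, and what the real prices are, is the part I don't know, so treat the output as only as good as the inputs you give it.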


Replies

nikcub yesterday at 8:34 PM

It's becoming apparent that it takes more tokens to secure code than it does to write it.

It may even be an order of magnitude more.

niros_valtos today at 1:02 AM

I think Opus is already prohibitively expensive, so I'm not sure how that would compare to Mythos. Check this calculator: it shows that a company with 100 devs can hit ~2.5M in token costs annually, which is wild! https://ai-cost-calculator.arnica.io

binyu yesterday at 9:34 PM

Claude workflows in ultra code mode work in a very similar fashion, and they consume a moderate amount of the session usage limit, depending on the complexity of the task. With the API it would probably get expensive quickly, though.

eranation today at 12:59 AM

We actually created a calculator to estimate scanning costs (including whether you scan continuously or not): https://ai-cost-calculator.arnica.io

It's an estimate, so it might be wrong, but it gives the ballpark based on our experience. Happy to hear everyone's feedback.

Terretta yesterday at 10:09 PM

If you compare to their managed service, that estimate is likely 1/10th of what you should expect, depending on the codebase.

But even this larger number, in turn, can be about 1/10th the cost of a formal engagement to discover the type of findings it seems to be going for: things that do not show up from PR reviews or even /security-review without the pre-work steps in the open-source framework guided by an expert. That's not counting the time and delay to figure out how to do that engagement.

Bluntly: if it matters, while this is a month's vibing budget for a single scan, it is also "pennies on the dollar" dirt cheap.

At the same time, its findings still need an expert. Its suggestions may be helpful or actively harmful, depending on the quality of the pre-work.

Recommendation to IT department heads: spend a couple grand on this, use the scare page to rustle up the budget to build a relationship with a red team that can find, triage, help remediate if needed, and train your in-house team to be "security minded".

mmaney13 today at 1:34 AM

Just another example of an overextension of technology in a scenario where applying a proper harness would suffice.

Reminiscent of the early days of tax automation, when importing a W-2 cost hundreds of dollars until people realized that typing in six boxes' worth of data was easy and that the automation fee ate up their entire tax refund.

Analemma_ yesterday at 8:30 PM

I mean, you don't need to run it all the time, right? You do it once over your entire existing codebase to start and then once over the diff in your CI/CD pipeline when you make a new change. I'm sure it's not literally that simple but I doubt these need to churn 24/7/365 either.

kolesnikov-arch today at 7:12 AM

[flagged]