Hacker News

moored, yesterday at 4:28 PM

How do you test these skills for consistency over time, or is that not needed?


Replies

theshrike79, yesterday at 5:10 PM

The same way you'd test a human following written instructions over time.

Check the results.

pizzafeelsright, yesterday at 6:31 PM

My experience has been that if the skill is broken down into a function, possibly paired with a validator in another stage, the results are about 99.9% deterministic.

I have not yet tested this at scale but give me six months.
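
For anyone wondering what "a function paired with a validator" might look like, here is a rough sketch, not anything from a real project. The names `validate_summary` and `run_skill`, the JSON shape with `title` and `tags`, and the retry count are all assumptions made up for illustration. The skill itself is just any callable that takes text and returns a string.

```python
import json
from typing import Callable


def validate_summary(raw: str) -> dict:
    """Validator stage (hypothetical schema): a JSON object with a
    non-empty 'title' string and a 'tags' list of strings."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    title = data.get("title")
    if not isinstance(title, str) or not title.strip():
        raise ValueError("missing or empty 'title'")
    tags = data.get("tags")
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("'tags' must be a list of strings")
    # Normalize so that equivalent outputs compare equal across runs.
    return {"title": title.strip(), "tags": sorted(set(tags))}


def run_skill(skill: Callable[[str], str], text: str, max_attempts: int = 3) -> dict:
    """Call the skill, validate its output, and retry on validation failure."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return validate_summary(skill(text))
        except ValueError as exc:
            last_error = exc  # remember why the last attempt was rejected
    raise RuntimeError(
        f"skill failed validation after {max_attempts} attempts"
    ) from last_error
```

The validator does not make the skill itself deterministic. It only rejects outputs that fall outside the expected shape and normalizes the ones that pass, so whatever reaches the next stage is predictable.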