Finch 💰 is an enterprise-grade benchmark that measures whether AI agents can truly handle real-world finance & accounting work.
FinWorkBench/Finch
✨ Built from real enterprise data (Enron + financial institutions), not synthetic tasks
✨ Tests end-to-end finance workflows
✨ Multimodal & cross-file reasoning
✨ Expert-annotated (700+ hours) and genuinely challenging