Modern LLMs are excellent at looking right, and for bank-statement extraction, looking right is the failure mode you cannot catch by eye. Here are the 20 lines of Python that separate a plausible answer from a reconciled one.
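The post's own 20 lines are not reproduced in this blurb. As a rough sketch of what a reconciliation check of this kind might look like, assuming the extraction yields an opening balance, a closing balance, and a list of signed transaction amounts as strings (the function name `reconcile` and this input shape are hypothetical, not taken from the post):

```python
from decimal import Decimal

def reconcile(opening, closing, amounts):
    """Return (ok, difference) for the check opening + sum(amounts) == closing.

    Assumed input shape (hypothetical): all values are decimal strings,
    credits positive and debits negative.
    """
    # Decimal built from strings avoids binary floating-point drift on currency.
    total = sum((Decimal(a) for a in amounts), Decimal(opening))
    diff = total - Decimal(closing)
    return diff == 0, diff

# A statement that balances.
ok, diff = reconcile("1000.00", "1175.50", ["250.00", "-74.50"])

# The same statement with one digit misread (-74.50 extracted as -14.50):
# every row still looks plausible, but the totals no longer agree.
bad_ok, bad_diff = reconcile("1000.00", "1175.50", ["250.00", "-14.50"])
```

A nonzero difference does not say which row is wrong, only that the extracted rows cannot all be right, which is the signal a visual check misses.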
You can't hit the real Binance API in tests. What to use instead: an ExchangeClient interface, mockgen mocks, httptest for parsing real responses, and testcontainers for PostgreSQL integration tests.