Note: This is something I think should exist and I’m writing about it to flesh out what the problems might be and why no one’s built it. If you’re either a) working on it or b) deeply knowledgeable about it and know something I’m wrong about, please hit reply or leave a comment.
A problem whose time has come: an easy way to programmatically retrieve itemized receipt data that doesn’t require you to be a closed loop network [1]. Here’s why.
Across a bunch of industries and workflows, a lot of human bandwidth is spent discerning the purpose or nature of a transaction, and then accurately categorizing that transaction. In many cases this comes with significant dead-weight loss. Some examples:
Consumer use cases
Disputes
Consumers often see a purchase on a credit or debit card, don’t recognize it, and file a dispute. With itemized receipts in their banking app, they’re less likely to file it. The actual cost of a dispute is immense. It includes:
Fees a bank pays its processor to handle a dispute, which can be $25 - $50 per dispute interaction regardless of the size of the transaction (a dispute can in principle have any number of interactions, but from what I have seen, 3 interactions per dispute is a reasonable average)
Provisions for Reg E to ensure the customer has access to funds if the dispute is eligible
The amount of human time the issuer’s staff spends to process the dispute, interact with the customer etc
The amount of human time the merchant’s staff spends to process the dispute
Fees a merchant pays to their acquirer to process a dispute (for example Stripe charges merchants $15 per dispute)
Some issuers optimize their dispute processes by simply writing off a large percentage of transactions below the cost of processing (for example, if a dispute comes in that’s $3, it’s actually cheaper to just make the funds available to the cardholder and move on than to require a human to respond). Maybe AI & companies like Casap [2] will change this? But overall, it isn’t crazy to assume the average dispute costs the system over $100 in aggregated human time and processing fees.
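To sanity check that number, here is a rough sketch of the fee arithmetic using only the figures quoted above: 3 interactions, $25 - $50 per interaction on the issuer side, and a $15 merchant-side dispute fee. The function name is made up for illustration, and human time on both sides is deliberately left out because I don’t have a number for it.

```python
def dispute_processing_fees(interactions: int,
                            issuer_fee_per_interaction: float,
                            merchant_dispute_fee: float) -> float:
    """Processing fees only. Staff time at the issuer and merchant is excluded."""
    return interactions * issuer_fee_per_interaction + merchant_dispute_fee


# Low and high ends of the per-interaction fee range quoted above.
low = dispute_processing_fees(3, 25.0, 15.0)
high = dispute_processing_fees(3, 50.0, 15.0)
print(f"Fees alone: ${low:.0f} to ${high:.0f} per dispute")
```

Under those inputs, fees alone land between $90 and $165 per dispute before counting any human time, which is why a system-wide average above $100 seems plausible.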
Budgeting
Consumers spend so much time and effort budgeting, but even that budgeting is generally at the level of a transaction rather than at the level of an item. For example, most budgeting automation today relies on a combo of MCC and then merchant-level categorization. This creates really strange experiences anytime a merchant provides multiple services. For example, you can buy snacks from Office Depot and it will be categorized as office supplies [3]. We lose lots of precision with this aggregation.
HSA & FSA Healthcare spend
A massive area of dead weight loss is HSA/FSA cards. Much more of our spend is HSA/FSA eligible than we’re able to take credit for as consumers in the US. To be clear, everybody wants this; the government wants more HSA/FSA utilization, patients want to spend tax free dollars, providers want the patient to feel like it’s cheaper, and the benefits administrators want the interchange.
But at the point of sale the patient is more likely to pull up their primary credit/debit card because of muscle memory, and even if they remember, they’re not sure if the Cheetos in their shopping cart would make the transaction decline, which would be embarrassing, despite the fact that the majority of the items they’re paying for are 213(d) eligible.
SMB + Corporate use cases
Fleet & Fuel Cards
Fleet operators that provide fuel cards to drivers have huge problems with the fact that they can’t see the underlying items they’re providing discounts on without L2/L3 data (which is slow). This problem contributed to the growth of closed loop card networks. As open loop spend grows in importance in the fuel card segment, the operators are losing this visibility. Companies like WEX & Fleetcor would be potential buyers from this segment.
Expense management [2]
For SMBs and 1099 workers, whether spend is tax deductible depends on what the spend was on. This is one reason it’s important to keep receipts. For large corporates, receipts help them ensure employees adhere to expense policies, help detect fraudulent spend, and help provide insight into cost-of-goods-sold.
In the mobile era adding receipts is easier than ever, but is still fairly manual. Services like Found, Brex and Mercury all have mobile apps that make it easy to photograph a receipt. Many use OCR to scan the receipt data and match it to the transaction, minimizing data entry along the way. Here’s an example of how this works using Found:
Ultimately however, this process consumes a meaningful amount of human attention in small increments along the way. For 1099 workers improper documentation can become a tax issue down the line, and for corporates, it means they have crappy spend control.
Transaction categorization
This is really a subcategory of expense management. It’s the process of selecting which category the transaction belongs in. Existing attempts at automation will sometimes use the MCC on the transaction to categorize it, but most software packages have moved away from it as it is quite blunt. For instance, if I expense gas and some snacks for $75 at a gas station, that entire transaction could be categorized as transportation or fuel (depending on your taxonomy), which gives the company an imprecise view into their actual fuel spend.
Here’s an example of transaction categorization in Mercury:
With detailed receipt data, you’d know exactly what was purchased and for how much, and could completely automate these flows and save employees a ton of time. This would also give accounting departments much more precise data to use to design and execute on expense policies (eg measuring how much spend is in policy and out with a high amount of precision) and catch outright fraud.
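To make the gas station example concrete, here is a minimal sketch comparing transaction-level and item-level categorization. Everything in it is an assumption for illustration: the line items, the $62.50 / $12.50 split, the category names, and the idea that items arrive already tagged with a category.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical line items for the $75 gas station purchase; the split is made up.
receipt_items = [
    {"description": "Unleaded fuel", "category": "fuel", "amount": Decimal("62.50")},
    {"description": "Chips", "category": "meals_and_snacks", "amount": Decimal("4.50")},
    {"description": "Energy drinks", "category": "meals_and_snacks", "amount": Decimal("8.00")},
]


def transaction_level_category(items, mcc_category):
    """What you get today: the whole amount lands in one MCC-derived bucket."""
    return {mcc_category: sum((i["amount"] for i in items), Decimal("0"))}


def totals_by_category(items):
    """What item-level data allows: each line item lands in its own bucket."""
    totals = defaultdict(Decimal)
    for item in items:
        totals[item["category"]] += item["amount"]
    return dict(totals)


print(transaction_level_category(receipt_items, "fuel"))
print(totals_by_category(receipt_items))
```

With these made-up items, the transaction-level view reports $75.00 of fuel, while the item-level view reports $62.50 of fuel and $12.50 of snacks.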
How to bootstrap this
Processor + Point of Sale aggregators
The fast way to bootstrap this is to use “open” endpoints made available by the large modern payment processors. Square, Stripe, and Adyen have sites you can go to online and enter transaction information to extract receipt data:
Square: https://squareup.com/receipts
Amazon: https://www.amazon.com/receipts
The lookups are relatively onerous (require exact details of the card and transaction, and sometimes require a captcha).
In an authenticated environment you can probably use one of the many transaction IDs that come with payment card transactions as a key, plus maybe one other field, as all you need to look up the transaction.
These receipts will often include item data on what was purchased in the transaction, like so:
You could easily convert the receipt data into a JSON blob for a developer customer to consume/ingest into their system.
For the aggregators with open endpoints on the web, you could bootstrap this by building some RPA that would do the retrieval. This is easiest for issuers who actually own all the card details; in the past you didn’t need to enter card details to hit these endpoints, but I assume someone tried this at some point, and now you almost universally need to enter the full PAN and a CAPTCHA for receipt lookup.
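As a sketch of what that blob might look like, here is one possible shape. The field names and sample values are hypothetical and not taken from any processor’s actual receipt format; amounts are kept as decimal strings to avoid float rounding.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: str  # decimal string, e.g. "5.00"


@dataclass
class Receipt:
    transaction_id: str  # hypothetical key shared with the card transaction
    merchant: str
    currency: str
    total: str
    items: List[LineItem] = field(default_factory=list)


def receipt_to_json(receipt: Receipt) -> str:
    """Serialize a receipt, including nested line items, to a JSON string."""
    return json.dumps(asdict(receipt), indent=2)


example = Receipt(
    transaction_id="txn_123",
    merchant="Example Coffee",
    currency="USD",
    total="7.50",
    items=[LineItem("Latte", 1, "5.00"), LineItem("Muffin", 1, "2.50")],
)
print(receipt_to_json(example))
```

Keying the blob on a transaction identifier is what would let an issuer or expense tool attach it to the matching card transaction without a human in the loop.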
Square, Stripe and others already make it easy to get email or text receipts. You can think of the issuer, expense management system, accounting software, or PFM as another receipt endpoint to facilitate, one that’s always “requesting” receipts for any cards it issues.
For more traditional merchants (e.g. CVS, Walgreens etc) I know this is possible because companies like Solutran already have some of these integrations, specifically for HSA/FSA spend. It’s just not generally available yet.
High volume digital merchants
Merchants like airlines (United, Delta etc), rideshare (Uber, Lyft etc), delivery (Seamless, DoorDash) and others whose purchases are primarily a) card-not-present with b) a card on file will often send an email receipt by default. In all these cases, where the transaction is for a corporate/business purchase, the ultimate home of that receipt is literally always the expense management system. Today, that expense management step requires a human to do things. If the issuer were just an endpoint in all these cases (for all subscribed issuers), similar to the cardholder’s email address or phone number, you’d have far higher compliance with expense policies, more spend precision and far less fraud.
Who would pay for this
Building this requires building a new ecosystem; you have to give the payment processors & merchants financial incentive to invest in the engineering required to surface these data in a fast/useful way. Initially that incentive might be funded by venture, but ultimately for a new ecosystem to work, the counterparties paying for the new services have to value it higher (and pay more) than it costs the counterparties developing & providing the data, with some margin left over to fund the company putting the ecosystem together.
It’s not hyper clear to me who would pay for the consumer use cases (like I think there’s a world where consumers might pay, which is the dream of every PFM in the world, but I’ve just not seen anything that points to this being super scalable). In contrast it’s extraordinarily clear that the ultimate paying customers are the entities that provide expense management. This includes:
Corporate card issuers
Amex (in a class of its own as it’s so large)
Travel Cards like Navan
SMB Cards like Found, Mercury, Ramp, Brex
Expense Management software like Concur, Expensify, Divvy/BILL & Airbase
Accounting Software like Quickbooks, Freshbooks, Harvest and more
For all the players in this ecosystem, a large amount of time their users spend in the system is uploading receipts and categorizing transactions. A well built receipt infrastructure could cut this activity meaningfully, freeing up a significant amount of human time and giving precision to accounting teams. A product that automatically categorized expenses would feel like magic, and it’s only really possible with high precision if the receipt data is also available.
One other thing this could unlock is item specific promotions, which enables manufacturers to participate in card linked offers (not just retailers). Most often in the card linked offers world, retailers reward cardholders for spend at a certain retailer:
In a world where the issuer/card is aware of what actually was purchased, a manufacturer could reward spend on a specific item (imagine Sony rewarding a user for buying a specific accessory, regardless of whether the accessory was purchased at Best Buy, and being able to track the purchase all the way through from promotion to transaction with no manual work from the user). This gives manufacturers a high fidelity channel for direct response advertising, and expands the pool of dollars available in the card linked offers world.
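A minimal sketch of how that matching could work, assuming receipts carry some product identifier per line item. The identifier, reward amount and receipt shape are all hypothetical.

```python
from decimal import Decimal

# Hypothetical manufacturer offer: keyed on a product, not on a retailer.
offer = {"sponsor": "Sony", "product_id": "ACC-001", "reward": Decimal("5.00")}


def reward_for_receipt(receipt, offer):
    """Return the reward if any line item matches the offer's product, else 0."""
    for item in receipt["items"]:
        if item["product_id"] == offer["product_id"]:
            return offer["reward"]
    return Decimal("0")


# The same product bought at two different retailers, plus an unrelated purchase.
best_buy = {"merchant": "Best Buy", "items": [{"product_id": "ACC-001"}]}
other_store = {"merchant": "Some Other Retailer", "items": [{"product_id": "ACC-001"}]}
unrelated = {"merchant": "Best Buy", "items": [{"product_id": "TV-999"}]}

for receipt in (best_buy, other_store, unrelated):
    print(receipt["merchant"], reward_for_receipt(receipt, offer))
```

The point of the sketch is that the merchant name never enters the match; only the item does, which is exactly what transaction-level data can’t support today.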
Challenges
A few obvious challenges stand out right away. First, building new ecosystems is hard. You’re essentially inserting a new priority onto the roadmap of a bunch of companies that are already fiercely competitive with each other and trying to figure out ways to differentiate. This is true on both sides of the ecosystem; it’s unlikely that Ramp wants to sign on to support a system that Brex would benefit from, and even if they do, they’re unlikely to want to pay for something it’s not immediately clear their customers value. (This isn’t a knock on any of these companies - just an observation of the competitive dynamics having been around several companies trying to build new ecosystems).
Second, some companies on the issuer side already have premium-esque tiers where this functionality can live, so there already exists an economic model that might support the new development overall. But for most of them, being an early adopter means taking the risk that their end customers don’t value this, or crucially, don’t value it enough that it’s financially worth the cost vs the alternative (which is that cardholders do this categorization and attach receipts manually).
A third (again common to building new ecosystems) is how large it could get financially. If you’re monetizing on a transaction basis, you’d need to charge sub 1c per transaction to make this work. Generously if you assume 1c per transaction, you need 100 million transactions to hit $1m in annual revenue. Not impossible. But you’re paying a significant amount of that out to the ecosystem (merchants, acquirers, point of sale companies) to fund the development. You’d just want to have conviction that the profit pool is large enough to justify your spend.
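The arithmetic here is simple enough to sketch. Amounts are in cents to keep everything in integers; the 1c price and $1m target come from the paragraph above, while the 60% ecosystem payout share is a number I made up purely to show the shape of the problem.

```python
def transactions_needed(target_revenue_cents: int, price_cents: int) -> int:
    """Transactions required to hit a revenue target (rounded up)."""
    return -(-target_revenue_cents // price_cents)  # ceiling division


def retained_cents(revenue_cents: int, payout_share_percent: int) -> int:
    """Revenue left after paying merchants, acquirers and point of sale companies."""
    return revenue_cents * (100 - payout_share_percent) // 100


one_million_dollars = 100_000_000  # $1m expressed in cents
needed = transactions_needed(one_million_dollars, price_cents=1)
kept = retained_cents(one_million_dollars, payout_share_percent=60)  # hypothetical share
print(f"{needed:,} transactions; ${kept / 100:,.0f} retained")
```

At 1c per transaction that is 100,000,000 transactions for $1m, and under the made-up 60% payout only $400,000 of it stays with the company building the ecosystem.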
Last (and I think this is the core problem) - the entities that benefit the most are issuers who will save on fraud and disputes, and businesses who will give time back to employees, improve accounting and save on expense fraud. For issuers, fraud and disputes are a cost center. For businesses, the time saved for employees and the accounting benefits are hard to measure. Expense fraud is easier to quantify and justify. It’s likely that the aggregate amount saved in employee time, and in fraud and dispute reductions, is meaningful, but will not be valued highly enough to compensate the entities on the merchant side sufficiently for them to invest in making this available. In addition, many merchants view this data as incredibly strategic to building an understanding of their users; Amazon removed receipts from email so Google couldn’t get access via Gmail for this reason. This can be solved by selective access to specific issuers (e.g. I doubt Amazon cares if receipts are used to categorize transactions or do fraud detection on your Mercury card, but they definitely care that Chase doesn’t get access to them).
Fundamentally, no matter how large this can get, and no matter how many businesses you can build on top of it, if the first use case can’t fund the entire ecosystem being built, it won’t be built at all. If you can solve this fundamental challenge you have a shot.
Thanks to Aaron Frank, Rory O’Reilly, Oban MacTavish, Timothy Thairu and Immad Akhund for reading drafts.
[1] You can kind of get at this data with the Level 2/Level 3 data from the card networks (some detail here https://www.marqeta.com/blog/data-details-what-is-level-1-2-and-3-data), but the adoption of those is relatively minimal, and in my interactions with the card networks over the years my read is 1) those services are relatively underinvested and not a priority, and 2) the integration work is material for both merchants and issuers to utilize it.
[2] In full disclosure, I’m an investor in Casap, Mercury and Found.
[3] This dynamic leads to the single funniest employee hack I’ve heard of. To control costs at a very large company (Fortune 5), the ability to buy snacks from most vendors (e.g. Amazon, Target etc) was taken away from most employees. However, one of the large office supply stores (one of Office Depot or Office Max, can’t remember exactly which) remained an approved vendor, so employees would just buy snacks from there, and they’d be classified as office supplies! Item level data from receipts would prevent this from happening!
Check out Banyan.
https://www.banyan.com/
Check out Silver
https://www.withsilver.app/