April 19, 2015

The missing MtGox bitcoins

We received a lot of positive feedback on the release of our preliminary investigation into the Willy bot, even though it was already quite old (written circa August 2014). We also said we hoped to release more of our information over time, even though we have to take some care doing so. But while we have good relationships that let us responsibly acquire and exchange information and share efforts, and the reception from people online has been very encouraging, official interest has been pretty low.

In recent weeks, there has been a lot of news surrounding the U.S. agents accused of large bitcoin thefts in early 2013, and their interactions with Silk Road as well as MtGox. In the wake of this, there has been intense speculation that what happened at MtGox may have been related to these agents.

With the next creditors' meeting also on the horizon, this seems like a pretty good time for me to step in and share a bit more of what we know. First of all: no, we don't think these agents are main characters in the story of the missing MtGox bitcoins, and the story doesn't begin in 2013 either.

— Kim Nilsson, lead investigator   (available for comments: kim@wizsec.com)


Updates since initial publication:

  • The proper term for MtGox's financial status would have been "insolvency", not "fractional reserve". Thanks for pointing this out.
  • The graphs related to MtGox's BTC holdings have been updated to include older, heuristically identified outputs, filling in most of the "gap" and giving a clearer picture of holdings over time.
  • Minor language cleanups.



Executive summary

Most or all of the missing bitcoins were stolen straight out of the MtGox hot wallet over time, beginning in late 2011. As a result, MtGox was technically insolvent for years (knowingly or not), and was practically depleted of bitcoins by 2013. A significant number of stolen bitcoins were deposited onto various exchanges, including MtGox itself, and probably sold for cash (which at the bitcoin prices of the day would have been substantially less than the hundreds of millions of dollars they were worth at the time of MtGox's collapse).



Background

After a string of reported problems with bitcoin withdrawals, the MtGox exchange made big news when it collapsed in early 2014 and declared bankruptcy. Ever since, what actually happened has been the topic of vivid speculation, but little actual evidence has surfaced, and over a year later there are still more questions than answers. Some, like us, have made their own unofficial attempts at independent investigations of MtGox; ours is probably both the longest-running and the one that has made the furthest progress.

Note that unlike our last report, this one is more of a peek into ongoing investigations, and as such there will be plenty of questions that for the moment go unanswered.

Missing or stolen?

An early question posed and debated by many, including ourselves, was whether all the missing MtGox bitcoins were ever actually real; that is, did MtGox at any point actually hold the coins in question, or were there faked deposit entries merely making it look that way? Having a look at the leaked MtGox data from 2014, if we for a moment assume that the logged deposits and withdrawals in it are genuine, we can graph how many bitcoins MtGox should, in principle, have held in deposits over time:


Since the data does not go all the way back to the beginning of MtGox, all we would initially see are changes in total expected BTC deposits over time, not the actual totals. We can however make a decent estimate by noting that after the 2011 hack, Mark Karpeles performed a proof-of-holdings transaction of 424242.42424242 BTC, so MtGox must have held at least that much at the time. Shifting our graph up accordingly, we note that it now ends at MtGox's collapse at around 950,000 BTC, matching the supposed total holdings stated elsewhere in leaked data, hinting that we're not too far off the mark.
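As a rough illustration of the shifting step, here is a minimal sketch and not our actual tooling. It assumes the leaked log has already been reduced to date-sorted (date, signed amount) pairs, with deposits positive and withdrawals negative; the function name and the data layout are made up for this example. Note that it pins the curve exactly at the proof-of-holdings amount, which is really only a lower bound.

```python
from decimal import Decimal

# Amount of the 2011 proof-of-holdings transaction mentioned above.
PROOF_AMOUNT = Decimal("424242.42424242")

def expected_holdings(log, anchor_date, anchor_amount=PROOF_AMOUNT):
    """Turn a log of balance changes into estimated absolute holdings.

    log: list of (date, signed_amount), sorted by date (hypothetical layout).
    anchor_date: date at which holdings are assumed to equal anchor_amount.
    Returns a list of (date, estimated_total_btc).
    """
    running = Decimal("0")
    relative = []
    for date, amount in log:
        running += Decimal(amount)
        relative.append((date, running))

    # Relative balance as of the last log entry on or before the anchor date.
    at_anchor = Decimal("0")
    for date, total in relative:
        if date > anchor_date:
            break
        at_anchor = total

    # Shift the whole curve so it passes through the anchor point.
    offset = anchor_amount - at_anchor
    return [(date, total + offset) for date, total in relative]
```

ISO-formatted date strings (or datetime.date objects) work as dates here, since all that is needed is that they compare in chronological order.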

The next, much harder step is to figure out how many bitcoins MtGox actually held over time, and for this we need to know all the bitcoin addresses (public keys) used by MtGox. This is no small requirement, and so a multi-pronged approach was necessary:

  • See if the trustee will agree to release the list. Nope.
  • Try to match up the logged deposits and withdrawals from the leaked data with transactions on the blockchain, and from there start flagging addresses belonging to MtGox.
  • Further deduce ownership via clustering analysis, identifying additional addresses owned by the same entity by observing which addresses are ever spent together.
  • Improve and fill in gaps in the data through insider sources.
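The clustering step in the list above is commonly done with the "spent together" heuristic: addresses that appear as inputs of the same transaction are assumed to share an owner. The sketch below shows that idea with a small union-find structure. It is an illustration under that assumption, not our actual pipeline, and the function name and input layout are hypothetical.

```python
def cluster_addresses(transactions, seed_addresses):
    """Expand a set of known addresses via the spent-together heuristic.

    transactions: iterable of lists, each holding the input addresses
                  of one transaction (hypothetical layout).
    seed_addresses: addresses already attributed to the entity.
    Returns every address that ends up in a cluster containing a seed.
    """
    parent = {}

    def find(addr):
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for input_addresses in transactions:
        inputs = list(input_addresses)
        for addr in inputs:
            find(addr)  # register every address, even lone inputs
        for other in inputs[1:]:
            union(inputs[0], other)

    seed_roots = {find(addr) for addr in seed_addresses}
    return {addr for addr in list(parent) if find(addr) in seed_roots}
```

The heuristic can over-merge (for example when several parties jointly fund one transaction), which is one reason cross-checking against other sources matters.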

After much work, our end result is a surprisingly dependable list of over 2 million MtGox addresses, albeit fairly bare-bones with no knowledge of their intended purpose, associated user accounts etc. We have confirmed with our sources that this list is more or less complete, though there are still some gaps, mostly related to older internal manual transactions where bitcoins were moved around (e.g. cold storage). We can still estimate these legacy holdings by tracing bitcoins that later end up back in strongly identified MtGox addresses.

Our current best-effort estimate of all bitcoins held by MtGox can be seen in the following graph:

Notes: The "legacy" bitcoins in this graph are typically bitcoins that moved through multiple manual transactions (e.g. cold storage and proof-of-solvency), before merging back into normal wallets. The well-known "lost" 200,000 BTC show up clearly here as legacy bitcoins that were never merged back, matching the official story. As such, those bitcoins were in effect unavailable to MtGox.

(The original version of this graph can be found here.)

What appears to happen is that as early as August 2011 we see a small discrepancy between expected holdings and actual holdings, which by the end of that year has rapidly grown to several hundred thousand BTC. Furthermore, it appears to keep growing over time, until by the middle of 2013 there are practically no (spendable) bitcoins left at all in MtGox. Even with the lower, more manageable bitcoin prices back in the day, a BTC shortage of this magnitude would have cost millions of dollars to cover. Barring any major hidden fiat or BTC reserves, the all but inescapable conclusion is that knowingly or not, MtGox had been technically insolvent since at least 2012.

Plotting the difference between expected and actual bitcoins by itself paints yet another interesting and revealing picture:

(The original version of this graph can be found here.)

This is a pretty eye-catching pattern, suggesting that bitcoins more or less continuously went missing over time, but at a decreasing pace. Again by the middle of 2013, the curve goes more or less flat, matching the hypothesis that by that time there may not have been any more bitcoins left to lose. Apart from some minor discontinuities early on (which may be due to inaccuracies when estimating "legacy" bitcoins), the rate of loss from October 2011 onwards seems unusually smooth and at the same time not obviously related to any readily available factors such as remaining BTC holdings, transaction volumes or the BTC price.
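For readers who want to reproduce this kind of plot from their own data, the computation itself is simple. The sketch below assumes two hypothetical series keyed by date (expected holdings from the logs, actual holdings from the blockchain) and returns the shortfall plus its change since the previous data point, which is the "rate of loss" discussed above.

```python
def shortfall_series(expected, actual):
    """Compute the missing-bitcoin curve from two holdings series.

    expected, actual: dicts mapping a sortable date to a BTC amount
                      (hypothetical layout).
    Returns a list of (date, shortfall, change_since_previous), using
    only dates present in both series; the first change is None.
    """
    rows = []
    previous = None
    for date in sorted(set(expected) & set(actual)):
        shortfall = expected[date] - actual[date]
        change = None if previous is None else shortfall - previous
        rows.append((date, shortfall, change))
        previous = shortfall
    return rows
```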

Worth pointing out is that, thanks to having matched up most of the deposit/withdrawal logs earlier, we can at this point at least rule out the possibility of any large-scale fake deposits — the bitcoins going into MtGox were real, meaning the discrepancy was likely rather caused by bitcoins leaving MtGox without going through valid withdrawals. The next task was to try to determine if these represented some kind of (major) glitch in MtGox's operation, or an intentional theft.

Tracing the coins

As touched upon earlier, apart from mere deposits and withdrawals, an exchange like MtGox would also perform internal transactions, such as:

  • Moving coins into cold storage
  • Splitting or merging bitcoin outputs to keep the wallet operating smoothly
  • Manual transactions (e.g. proof-of-solvency, reissuing failed withdrawals etc.)

Since none of these types of transactions were logged in any data we have access to, merely scraping the blockchain for transactions that spend MtGox bitcoins will yield false positives in the form of these as well. However, with painstaking work they can be accounted for and carefully filtered out while sifting for relevant data. How painstaking? Well, the total number of MtGox-related transactions runs into the millions, and even heavily filtered down there are still thousands of potentially interesting transactions to go through.

One recurring pattern eventually stood out: MtGox bitcoins would suddenly get sent to a new non-MtGox address, without any withdrawal log entry, often in fairly recognizable amounts of a few hundred BTC at a time. Shortly afterwards, these addresses in turn would get gathered up into bigger addresses holding a few thousand BTC. From there, the coins would get deposited in chunks of some hundred BTC at a time onto various bitcoin exchanges.
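The first stage of that pattern (coins leaving MtGox addresses with no matching withdrawal record) can be sketched as a simple filter. This is only an illustration: the Tx type, the field names and the idea of matching withdrawals by transaction ID are assumptions made for the example, and it does nothing to weed out the internal transactions mentioned earlier unless their addresses are already in the known set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: tuple    # addresses being spent from
    outputs: tuple   # (address, btc) pairs

def unlogged_outflows(txs, mtgox_addresses, logged_withdrawal_txids):
    """Flag transactions that move coins out of known MtGox addresses
    without a corresponding withdrawal log entry.

    Returns a list of (txid, btc_leaving_the_known_address_set).
    """
    suspicious = []
    for tx in txs:
        if tx.txid in logged_withdrawal_txids:
            continue  # accounted for by a logged withdrawal
        if not any(addr in mtgox_addresses for addr in tx.inputs):
            continue  # does not spend MtGox coins at all
        leaving = sum(
            btc for addr, btc in tx.outputs if addr not in mtgox_addresses
        )
        if leaving > 0:
            suspicious.append((tx.txid, leaving))
    return suspicious
```

Anything this flags is merely a candidate; the later stages (gathering into bigger addresses, then deposits onto exchanges) still have to be followed by hand or with further tooling.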


This kind of activity would be hard to interpret as anything but intentional theft. The following destinations have so far been observed for the stolen bitcoins:

  • MtGox itself
  • BTC-e
  • Bitcoinica
  • Other as-of-yet unidentified wallets

The use of bitcoin exchanges like this can often be an attempt to obfuscate the trail via mixing or laundering, but in this case we believe those bitcoins were simply sold off for cash. (They would, if instead held for another 1-2 years, have appreciated in value by up to a hundredfold.) The possibility remains that other stolen bitcoins may have been kept somewhere (as bitcoins); this can hopefully be clarified by further tracing their movements.

So far, an estimated 300,000 BTC have been identified as taken out of MtGox in this fashion between late 2011 and the end of 2012, but this number may well rise as the investigation continues (preferably by other people!). For example, with a different, more relaxed search pattern, the estimated number of bitcoins that at any point disappear anywhere balloons to 800,000 BTC. The actual number is probably somewhere in between.

For reference, the expected bitcoin holdings when MtGox collapsed were roughly 950,000 BTC. Out of those, 200,000 BTC were later found in an old wallet, and possibly up to 100,000 BTC belonged to MtGox itself (and would presumably be excluded during bankruptcy proceedings), leaving some 650,000 BTC ultimately missing.

Cold storage

"But wait, how can anyone have managed to steal all of MtGox's bitcoins?", I hear you say. Wouldn't the majority of coins have been secured in cold storage? While we would assume so, frankly we don't know enough details about the handling and historical use of MtGox's cold storage, and are forced to speculate.

One theory that springs to mind is that their cold storage may simply have been compromised, either physically by someone with on-site access, or somehow electronically through some security flaw in the key generation process. However, there are other possible explanations.

Our understanding is that MtGox did not have continuous monitoring of its cold storage, which consisted of paper wallets generated ahead of time and stored away. These locked-up paper wallets would then gradually and automatically be filled one by one by the system, by depositing surplus bitcoins out from the hot wallet. Vice versa, whenever the hot wallet ran low, staff would manually scan a paper wallet, refilling the hot wallet with stored bitcoins.

One possibility is that without any monitoring of the storage or comparing incoming and outgoing amounts, MtGox staff may have blindly kept pouring their cold storage into their leaking hot wallet, assuming that they were just dealing with frequent swings in deposits/withdrawals and that on average the cold storage was being refilled at roughly the same rate they were draining it.

A reminder to all bitcoin businesses out there: Always. Monitor. Your. Bitcoins.
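Even a very crude reconciliation would flag the scenario described above. The sketch below is a hypothetical example and not a description of any system MtGox had: it assumes a record of cold storage movements tagged "fill" (hot wallet to cold storage) or "drain" (cold storage to hot wallet), and raises a flag once more has been drained than filled by some margin.

```python
def cold_storage_net_flow(events, alert_threshold=0):
    """Compare what went into cold storage with what came out.

    events: iterable of (direction, btc), where direction is "fill"
            or "drain" (hypothetical labels).
    alert_threshold: how many BTC of net drain to tolerate.
    Returns (net_flow, alert); net_flow is negative when more was
    drained than filled over the period covered by the events.
    """
    events = list(events)
    filled = sum(btc for direction, btc in events if direction == "fill")
    drained = sum(btc for direction, btc in events if direction == "drain")
    net = filled - drained
    return net, net < -alert_threshold
```

Run over a rolling window, a persistently negative net flow is exactly the signal that the hot wallet is leaking rather than just seeing swings in deposits and withdrawals.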

Implications

A large number of stolen MtGox bitcoins appear to have been sold off at MtGox and other exchange markets, which would have pushed the bitcoin price down somewhat. However, since the coins were moved relatively slowly over time, and this was back before the bitcoin price exploded, the net monetary effect on the market may well have been pretty limited.

While a decent chunk of MtGox deposits were in a sense "fake" after all (that is, made using stolen funds), the net result for the creditors remains fairly unchanged; if the thieves deposited stolen coins onto MtGox, sold them for cash and then withdrew that money, we're still left with creditors who bought those coins for real money in good faith.

Notably, the fact that the coins were real, stolen and at significant risk of already having been spent would seem to further dim any hopes for creditors of recovering any more bitcoins. Realistically, we are left to hope the payout percentage might improve as invalid and illegitimate claims could potentially be filtered out.

So... Who did it then?

...To be continued.

No really, I'm not trying to make some cliffhanger ending here. The truth is that as independent investigators, something like narrowing down actual suspects is close to impossible without proper access to all available data. There will be follow-ups and refinements to this report, but as it stands, if anyone is to continue where we leave off and anyone is to eventually get caught, then it's up to officials and law enforcement at this point (as well it should be).




What about Willy?

Willy, the automated bitcoin bulk buying bot described in the Willy Report and our previous blog post, came along much later during 2013. By this point, most of MtGox's deposited bitcoins were already long gone, and as such Willy and similar irregularities cannot have been responsible for the bulk of the missing bitcoins. They do, however, play a part in that they transformed a large number of missing bitcoins into missing fiat instead (and later vice versa). The possibility exists that this kind of manipulation may have been the main purpose behind Willy as a way of coping with the practical problems caused by such a massive bitcoin shortage. This is left for later investigations to clarify.

Can I get a copy of your data?

Not at this time. Not only have we received some of our information confidentially from our sources, but as we have said earlier, our primary goal is to directly or indirectly aid law enforcement in this matter, and some care when handling potentially sensitive data is always prudent. Also, I'm doing all of this for free, and going through and cleaning up big data for publication is not the best way to spend what little time I have. We may, however, share some of the processed data later. (Please be patient.)

Besides, most of you would probably agree that the law should have at least a head start at catching the bad guys before we let slip the dogs of the internet.