I'm a big fan of Interactive Brokers. Cheap trading, great APIs, lightning fast execution with no frills, designed for the professional trader. Perfect for people like me. Here's the background for this article: I started writing short-dated out-of-the-money call options on my stock and ETF positions last year as a way to hedge my downside risk (while limiting upside) and maybe even make a little extra income on the side in this volatile and frequently sideways market. Recently I was looking over my brokerage statements and trying to get a grasp on my realized and mark-to-market PnL (profit-and-loss) this year and last since we're talking about equity positions plus options that expired in each month since the beginning of the year. After poring over the various forms of reports I can get from IB I concluded the quickest-and-dirtiest (not to mention the most fun) way was to write a little Python program to do it for me.
I'll spare you the munging bit of getting the data out of the HTML activity statement using BeautifulSoup and skip to where I've scraped the relevant data into a DataFrame. Here's some faux data for illustration purposes:
In [65]: data
Out[65]:
Symbol Underlying Kind MTM Realized
0 AAPL AAPL Stock 120.5 108.5
1 CTXS CTXS Stock 101.5 91.39
2 NFLX NFLX Stock -193.5 -174.1
3 SPY SPY Stock 67.77 60.99
4 USO USO Stock -3.355 -3.019
5 VMW VMW Stock 39.13 35.21
6 AAPL 18JUN11 350.0 C AAPL Option -52.76 -47.48
7 AAPL 18JUN11 355.0 C AAPL Option 211.2 190.1
8 CTXS 18JUN11 87.5 C CTXS Option 65.17 58.65
9 CTXS 18JUN11 90.0 C CTXS Option -33.16 -29.84
10 NFLX 18JUN11 270.0 C NFLX Option 80.38 72.34
11 NFLX 18JUN11 280.0 C NFLX Option 76.82 69.14
12 SPY 16APR11 132.0 C SPY Option 54.25 48.83
13 USO 16APR11 44.0 C USO Option -122.4 -110.1
14 USO 16APR11 45.0 C USO Option 73.06 65.76
15 USO 18JUN11 40.0 C USO Option 50.96 45.86
16 VMW 18JUN11 100.0 C VMW Option 25.59 23.03
17 VMW 18JUN11 105.0 C VMW Option 91.11 82
So, I want to aggregate the Realized and MTM columns by Underlying and by whether it was an equity or option instrument. Were this data in a SQL table, this operation could be expressed extremely concisely:
SELECT Underlying, Kind,
SUM(Realized) AS Realized, SUM(MTM) AS MTM
FROM MTMReport
GROUP BY Underlying, Kind
ORDER BY Underlying
But since I like doing interactive data analysis in Python (and loath writing lots of SQL queries), I'd rather do this with pandas and also produce some easier-to-read console output. Here's what I ended up with:
mtm = {}
realized = {}
for kind, group in data.groupby('Kind'):
mtm_by_underlying = group['MTM'].groupby(group['Underlying'])
realized_by_underlying = group['Realized'].groupby(group['Underlying'])
mtm[kind] = mtm_by_underlying.sum()
realized[kind] = realized_by_underlying.sum()
# Convert to DataFrame and fill NA values with 0
mtm = DataFrame(mtm).fillna()
realized = DataFrame(realized).fillna()
And it outputs exactly what I'm looking for:
In [77]: mtm
Out[77]:
Option Stock
AAPL 158.4 120.5
CTXS 32.01 101.5
NFLX 157.2 -193.5
SPY 54.25 67.77
USO 1.668 -3.355
VMW 116.7 39.13
In [79]: mtm.sum(1)
Out[79]:
AAPL 278.941528842
CTXS 133.551244568
NFLX -36.2669453776
SPY 122.022253941
USO -1.68704136356
VMW 155.829013419
It occurred to me after writing this code that, with some changes to the GroupBy functionality in pandas, this could be made *much easier*. Ideally I'd like to write:
grouped = frame.groupby('Underlying', 'Kind')
mtm = grouped['MTM'].sum()
realized = grouped['Realized'].sum()
I'll get to work on this in the pandas repo.
Here's the Gist of the full script