Bio

I am an entrepreneur and open source software developer focusing on analytical computing. I am currently a Principal Architect at Posit PBC.

I co-founded Voltron Data and now serve on its advisory board. I created or co-created the pandas, Apache Arrow, and Ibis projects. I am a Member of The Apache Software Foundation (ASF) and have authored three editions of Python for Data Analysis.

In the past, I was with Ursa Computing, Ursa Labs (with Posit’s help), Two Sigma Investments, Cloudera, DataPad, and AQR Capital Management. I received my S.B. in mathematics from MIT in 2007. I grew up in Knoxville, TN and Akron, OH, and currently live in Nashville, TN.

Projects

  • Python for Data Analysis: Book authored for O’Reilly Media. Three editions (2012, 2017, 2022) and translations into many languages.
  • Apache Arrow: I am a co-creator and PMC member, focusing on the C++ and Python implementations. I helped design the core Arrow format and helped start the Flight, Nanoarrow, and ADBC subprojects.
  • pandas: Python data analytics toolkit. I created the project in 2008 and turned it over to its open source community in 2013. I remain the “Benevolent Dictator for Life,” but this is mostly a ceremonial role.
  • Ibis: A Python DSL toolkit bringing together the best ideas of data frames and SQL. Created in 2015 at Cloudera.
  • Apache Parquet: I am a PMC member and a principal author of the C++ implementation and Python bindings.