Wes McKinney

Joining Posit’s Polyglot Data Science Mission

work

open source

posit

TL;DR I am joining Posit today as a Principal Architect where I will advocate for…

Nov 6, 2023

Wes McKinney

Voltron Data Update: Transitions

startups

work

TL;DR I am transitioning out of my full-time CTO role at Voltron Data so that I can expand my portfolio of entrepreneurial and open source data projects. While no longer serving in a full-time operational role, I will remain engaged as a…

Oct 23, 2023

Wes McKinney

The Road to Composable Data Systems: Thoughts on the Last 15 Years and the Future

retrospective

thoughts

A new joint VLDB paper on Composable Data Management Systems with Meta, Databricks, Sundeck, and others is out!…

Sep 1, 2023

Wes McKinney

Joining Forces for an Arrow-Native Future

work

apache arrow

Too often people say “let’s do something together” in passing, and don’t. There’s the occasional inter-project collaboration, but rarely will people take that next step.…

Aug 5, 2021

Wes McKinney

Ursa Labs March 2019 Report

ursa labs

work

The first quarter of 2019 has now wrapped up. In March we spent a good amount of time focused on…

Apr 4, 2019

Wes McKinney

Ursa Labs February 2019 Report

ursa labs

work

The team had a busy 28 days this February. The Apache Arrow community is discussing a 0.13 release…

Mar 6, 2019

Wes McKinney

Ursa Labs January 2019 Report

ursa labs

work

Ursa Labs had a busy January that went by too quickly. After a high-intensity 3 months of development, we helped release Apache Arrow 0.12 on January 20th. A good chunk of our time was spent fighting fires (in packaging and builds) related to the continued…

Feb 5, 2019

Wes McKinney

Leaving NYC for Nashville

For ten out of the last eleven years, I’ve lived in two places: New York City and San Francisco. The last two years have been in NYC. After founding Ursa Labs, a not-for-profit open source development group, I felt it was time to make my home somewhere that isn’t either of those places. After some contemplation and consulting…

Dec 3, 2018

Wes McKinney

Announcing Ursa Labs’s partnership with NVIDIA

ursa labs

work

I’m excited to announce that NVIDIA AI Labs has signed on as a supporter of Ursa Labs. NVIDIA’s new open source RAPIDS data science platform uses Apache Arrow for an interoperable representation of tabular data (data frames). We are looking forward to collaborating on our respective development roadmaps and growing the ecosystem…

Oct 10, 2018

Wes McKinney

Announcing Ursa Labs: an innovation lab for open source data science

work

ursa labs

apache arrow

Funding open source software development is a complicated subject. I’m excited to announce that I’ve founded Ursa Labs (https://ursalabs.org), an independent development lab with the mission…

Apr 19, 2018

Wes McKinney

Some comments to Daniel Abadi’s blog about Apache Arrow

apache arrow

databases

Well-known database systems researcher Daniel Abadi published a blog post yesterday asking Apache Arrow…

Nov 1, 2017

Wes McKinney

Feather format update: Whence and Whither?

apache arrow

Earlier this year…

Oct 16, 2017

Wes McKinney

Apache Arrow and the “10 Things I Hate About pandas”

pandas

apache arrow

This post is the first of many to come on Apache Arrow, pandas, pandas2, and the general trajectory of my…

Sep 21, 2017

Wes McKinney

Extreme IO performance with parallel Apache Parquet in Python

parquet

apache arrow

In this post, I show how Parquet can…

Feb 10, 2017

Wes McKinney

Streaming Columnar Data with Apache Arrow

apache arrow

Over the past couple weeks, Nong Li and I added a streaming binary format to Apache Arrow, accompanying the existing random access / IPC file format. We have implementations in Java and C++, plus Python bindings. In this post, I explain how the…

Jan 27, 2017

Wes McKinney

2017 Outlook: pandas, Arrow, Feather, Parquet, Spark, Ibis

apache arrow

pandas

ibis

parquet

work

2017 is shaping up to be an exciting year in Python data development. In this post I’ll give you a flavor of what to expect from my end. In follow up blog posts, I plan to…

Dec 27, 2016

Wes McKinney

From Arrow to pandas at 10 Gigabytes Per Second

apache arrow

pandas

In this post I discuss some recent work in Apache Arrow to accelerate converting to pandas objects from general Arrow columnar memory.

Dec 27, 2016

Kinesis Advantage2: Impressions

work

gear

I discuss my impressions of the newest version of the classic Kinesis Advantage contoured mechanical keyboard

Dec 4, 2016

Wes McKinney

GitHub’s one-dimensional view of open source contributions

rants

open source

TL;DR One of the most harmful parts of the GitHub platform is the code contribution calendar. This “hacker score card” overemphasizes the…

Nov 6, 2016

Feather: it’s about metadata

apache arrow

Summary: Feather’s good performance is a side effect of…

Apr 26, 2016

Why pandas users should be excited about Apache Arrow

apache arrow

pandas

I’m super excited to be involved in the new open source Apache Arrow community initiative. For Python (and R, too!), it will help enable…

Feb 22, 2016

Wes McKinney

The problem with the data science language wars

rants

data science

python

I really enjoyed the cheeky blog post by my pal Rob Story.

Nov 2, 2015

Wes McKinney

What’s changed

personal

Some reflections from turning 30.

Mar 23, 2015

Wes McKinney

Thoughts on joining Cloudera

work

After some unanticipated media leaks (here and here), I was very excited to finally share that my team and I are joining Cloudera. You can find out all the concrete details in those articles, but I wanted to give a bit more intimate perspective on the move and what…

Oct 6, 2014

Wes McKinney

Introducing vbench, new code performance analysis and monitoring tool

python

benchmarks

Do you know how fast your code is? Is it faster than it was last week? Or a month ago? How do you know if you accidentally made a function slower by changes elsewhere?…

Dec 18, 2011

Wes McKinney
