Archives for Wes McKinney
Linux Developer Laptops: Dell's Precision 5500 series reigns supreme
Ursa Labs February 2019 Report
The Ultrarich's dirty secret: not paying taxes
Is it time to stop using sentinel values for null / "NA" values?
Announcing Ursa Labs's partnership with NVIDIA
Announcing Ursa Labs: an innovation lab for open source data science
Some comments to Daniel Abadi's blog about Apache Arrow
Feather format update: Whence and Whither?
Apache Arrow and the "10 Things I Hate About pandas"
Making Smart Phones Dumb Again
Software patents are evil, but BSD+Patents is probably not the solution
Extreme IO performance with parallel Apache Parquet in Python
Streaming Columnar Data with Apache Arrow
Development update: High speed Apache Parquet in Python with Apache Arrow
Native Hadoop file system (HDFS) connectivity in Python
2017 Outlook: pandas, Arrow, Feather, Parquet, Spark, Ibis
From Arrow to pandas at 10 Gigabytes Per Second
Kinesis Advantage2: Impressions
GitHub's one-dimensional view of open source contributions
Kinesis Savant Elite 2 Foot pedals
Feather and Apache Arrow: Grokking file formats vs. in-memory representations
Rejoinder: the problem with conda-forge right now
conda-forge and PyData's CentOS moment
On Software Demos and Potemkin Villages
Avoid unsigned integers in C++ if you can
Compiling DataFrame code is harder than it looks
Do average consumers still need Dropbox?
Why pandas users should be excited about Apache Arrow
Analyzing Interactive Brokers XML Flex Statements with pandas
The problem with the data science language wars
Spying on instance methods with Python's mock module
Strata NYC 2013 and PyData 2013 Talks
I'm moving to San Francisco. And hiring
Whirlwind tour of pandas in 10 minutes
Update on upcoming pandas v0.10, new file parser, other performance wins
A new high performance, memory-efficient file parser engine for pandas
Intro to Python for Financial Data Analysis at General Assembly
Easy, high performance time zone handling in pandas 0.8.0
Mastering high performance data algorithms I: Group By
A O(n log n) NA-friendly time series "as of" using array operations
The need for an embedded array expression compiler for NumPy
vbench Lightning Talk Slides from PyCon 2012
Even easier frequency tables in pandas 0.7.0
Contingency tables and cross-tabulations in pandas
NYCPython 1/10/2012: A look inside pandas design and development
High performance database joins with pandas DataFrame, more benchmarks
Some pandas Database Join (merge) Benchmarks vs. R base::merge
Introducing vbench, new code performance analysis and monitoring tool
Talk at Rice Stats on structured data analysis, pandas, 11/21/2011
pandas talk at PyHPC 2011 workshop in SC11, thoughts on hash tables
Filtering out duplicate pandas.DataFrame rows
PyHPC 2011 Pre-print paper on pandas
Fast and easy pivot tables in pandas 0.5.0
Performance quirk: making a 1D object ndarray of tuples
Python for Financial Data Analysis with pandas
Speeding up pandas's file parsers with Cython
Python, R, and the allure of magic
The pandas escaped the zoo: Python's pandas vs. R's zoo benchmarks
Faster time series alignment / joins for pandas, beating R's xts package
NYC Open Statistical Programming Meetup on 9/14/2011
GroupBy-fu: improvements in grouping and aggregating data in pandas
A Roadmap for Rich Scientific Data Structures in Python