Welcome to my personal website, where I write about Python, pandas, Emacs, C, and anything development-wise that I may find interesting. In my daily line of work I am a consultant trying to bring best practices from software engineering into the realm of data engineering. For consulting requests, check out my company website or email me at will_ayd@innobi.io
This article shows you how to use pdb to debug your Python applications. While pdb is not the most visually appealing option, knowing it becomes very useful when working with gdb, which will be covered in the next article on debugging Python extensions.
This article shows you how to step into Python extensions, which are often used to wrap C/C++ libraries for interoperability or optimized performance. Because pdb cannot step through Python extensions, we opt for gdb instead.
This article shows you how to use cygdb to debug Cython extensions. While this may look daunting at first glance, the knowledge of pdb and gdb we gained in the previous two articles makes it much easier to step through Cython!
It is common practice in the Python world to write C/C++ extensions to optimize performance, but what do you do when that is not enough? How could you find bottlenecks within your extensions? Use callgrind of course!
Here we will see how a Python developer can consider Rust as a viable alternative to Cython. Rust abstracts a lot of the same things that Cython does, albeit with a different architecture and syntax. Though a truly apples-to-apples comparison is difficult, this article will show you just how well it compares.
Under active development, Arrow ADBC drivers represent a new way of interacting with databases that is tailored more towards analytics. Traditional ODBC/JDBC connections have been stable for general use, but ADBC represents a better, more performant, and type-preserving architecture that should become more commonplace in ETL workflows soon.