All Hands on Data #7
Happy second half of the year! Here are the articles the team is reading as we head into the rest of 2022:
The State of Data Engineering 2022
LakeFS recently published their overview of the "state of data" companies and touched on how participants in the space are building out features to consolidate their place in the marketplace. One theme that stood out for orchestration was the trend towards data lineage and observability features. - John Forstmeier
People-first Data Stacks
I like the reminder to always keep in mind the people who are using the tools. It is always helpful to remember that not everyone you're dealing with in business thinks about or understands things the same way. - Jon Davidson
Essential Techniques to Style Pandas DataFrames
Provides great examples for adding some pop to Pandas DataFrames, and it comes with a cheat sheet! - Eric Elsken
Everything is a Funnel, but SQL doesn't get it
While this article is technically a sales pitch, I thought it did a really good job of explaining where SQL struggles when it comes to the most common form of analysis: funnels. It opened my eyes to what could be possible if we reworked how SQL interprets these events as an ordered sequence rather than treating each of them equally. - Blake Burch
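To make the funnel idea concrete, here is a minimal sketch in plain Python rather than SQL. The event log, the step names, and the `funnel_counts` helper are all hypothetical and are not taken from the article. The point is that a funnel cares about the order of events per user, which in SQL typically takes a self-join or window function per step.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, event_name, timestamp). All values are made up.
EVENTS = [
    ("u1", "visit", 1),
    ("u1", "signup", 2),
    ("u1", "purchase", 5),
    ("u2", "visit", 1),
    ("u2", "purchase", 3),  # skipped signup, so this purchase is not counted
    ("u3", "signup", 1),    # signup before visit does not count
    ("u3", "visit", 2),
]

# Hypothetical funnel definition: steps must happen in this order.
FUNNEL = ["visit", "signup", "purchase"]


def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in order."""
    # Group events per user so each user's history can be walked separately.
    by_user = defaultdict(list)
    for user_id, name, ts in events:
        by_user[user_id].append((ts, name))

    counts = [0] * len(steps)
    for user_events in by_user.values():
        reached = 0  # index of the next step this user still has to hit
        for _, name in sorted(user_events):  # walk events in time order
            if reached < len(steps) and name == steps[reached]:
                counts[reached] += 1
                reached += 1
    return dict(zip(steps, counts))


print(funnel_counts(EVENTS, FUNNEL))
```

An event only counts here if the user already completed every earlier step, which is the "not all events are equal" behavior the blurb above describes.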
Linear Regression for Data Science
As a former math teacher, I can remember teaching linear regression to my students and trying to push why the ideas in regression are important. I think it is great that a topic you learn in middle school carries all the way up into the Data Science field. In this article, Benjamin takes a look at the level of knowledge needed to go from that middle school level to the data science level. - Steven Johnson
Your data lies: Be a data-driven Luddite
Perhaps a bit heretical for our profession, the author recommends against jumping at new tools when working with data, in favor of "under the hood" and "hands on" work with data to derive information from it. This is important because data can't always be trusted, in terms of either structure or content, and should be treated as guilty until proven innocent and shown to be able to produce insight. - John Forstmeier