All Hands on Data #35
Welcome to our 35th installment of AHoD! We have our first community submission this week. If you want to be included in a future AHoD, click the button below!
Exploratory programming: what it is, why it matters
Data teams are different from dev teams. Tools and tactics that help us build software aren’t designed for exploring data and sharing insights. Exploratory programming is. This post looks at why exploratory programming is ideal for data teams and how to unlock its value. - Kevin White
Data is a Public Good
The author questions the use of public data in popular large language and generative models without payment to the data providers. The article raises several interesting methods for measuring and potentially enforcing profit distribution to contributors; however, cited sources like GitHub and Wikipedia would also need to update their terms and licenses to accommodate this. It's not impossible, but it would be a pretty significant shift. - John Forstmeier
Prediction: AI Is Just Getting Started. In 2023, It Will Begin to Power Influencer Content
If you use Instagram, I'm sure your feed has recently been filled with AI-generated images via Lensa. You've also probably seen stories, articles, or even code generated by ChatGPT. So it looks like the next era of social media might be AI-generated content? I'm only hoping the article is right in saying it will be used to "enhance" content, rather than our feeds filling up with nothing that is real! - Joseph McDermott
Goodbye Data Science
When I graduated college with my degree to transition from teaching to the data field a couple of years ago, I asked the consultant that I was working with at my internship to define what a Data Scientist is. He told me that a Data Scientist is a person who knows how to do everything in the data stack and can be self-sufficient. He added that the Data Science title is more about prestige than actual requirements. I feel like this article is a good insight into what being a Data Scientist is like and why some people may want to avoid going towards that type of role. - Steven Johnson
Can 'radioactive data' save the internet from AI's influence?
Casey brings up an interesting idea for identifying AI-generated content in the future. The thought of somehow altering AI-generated content to attach a "radioactive" identifier to it is really interesting, even if it might not be possible. - Jon Davidson
Maximum Subarray Sum Using SQL
I love examples like this where we start with a working solution and continue to build new solutions that expand on growing knowledge, and where you can test each stage to guarantee accuracy. I definitely don't go to SQL for solving problems like this, but it is certainly a fun experiment. Also, TIL the LAG and LEAD functions!! - Eric Elsken
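For readers who want to try the same "start with a working solution, then refine it and test each stage" approach outside SQL, here is a minimal Python sketch. It is not taken from the article: the function names and the sample list are assumptions made for illustration, and the refined version uses Kadane's algorithm rather than the article's window-function queries.

```python
def max_subarray_brute(values):
    """Baseline: check every non-empty contiguous slice, O(n^2)."""
    best = values[0]
    for start in range(len(values)):
        running = 0
        for end in range(start, len(values)):
            running += values[end]
            best = max(best, running)
    return best


def max_subarray_kadane(values):
    """Refinement: Kadane's algorithm, a single O(n) pass."""
    best = current = values[0]
    for v in values[1:]:
        # Either extend the current run or start a new one at v.
        current = max(v, current + v)
        best = max(best, current)
    return best


# Hypothetical sample data; the refined version is checked against the baseline.
sample = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
assert max_subarray_brute(sample) == max_subarray_kadane(sample)
```

Keeping the slow baseline around gives each faster version something to be tested against, which is the same habit the article follows in SQL.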
Unimpressed With Your Scatter and Bar Plots? Give These Four Classic Alternatives A Try.
I immediately gravitated to this article based on the title alone. I used to visualize data often and, frankly, I wasn't great at it, mostly because I found the visualizations tired. But hexbins? What an awesome name and beautiful image. And that waterfall chart? <chef's kiss> - Katt Baum
Modern Polars
I've been hearing about the rise of Polars and Rust, but I didn't realize Polars could be imported into Python and used with syntax very similar to Pandas. This is a fantastic resource for seeing the differences between Pandas and Polars, side by side, with examples for most typical use cases. - Blake Burch
ChatGPT as a Python Programming Assistant
Using ChatGPT as a programming tool is going to be the new normal, just as "googling" something is today. This article provides some practical examples of how to use ChatGPT for refactoring, simple script generation, and more. In my usage so far, I've seen that while ChatGPT does not produce perfect code that works 100 percent out of the box, it fills the gap and can bring me closer to the solution. - Wes Poulsen