All Hands on Data #99
You may have 99 problems, but All Hands on Data ain't one! If the terrible 50 Cent reference didn't scare you away, keep scrolling to learn more about what's going on in data.
Article Recommendations
Why did Golang lose to Rust for Data Engineering?
I think this is an interesting discussion for a few reasons. First, Rust and Go are both excellent languages, but it is true that Rust has a bigger footprint in the DE space. That said, neither language is the optimal choice for standard tasks in data engineering. SQL and Python are still king and are not going to be directly replaced by Rust. I think the main effect of Rust will be felt in new libraries that improve on existing solutions in the Python ecosystem (e.g., Polars). - Wes Poulsen
Where did we come from? Exploring the explosion of interest in data and data tooling
I started my data journey when the "Modern" Data Stack was the goal for all companies. Companies looking to be data-forward wanted not only best-in-breed tools, but also the "explosion" of tools that each accomplished a niche task. With the rise of AI, what is considered "best in breed" has shifted, and this article does a great job of discussing how the AI era was ushered into the data tooling space. - Angel Catalan
Stack Overflow signs deal with OpenAI to supply data to its models
OpenAI has made an agreement with Stack Overflow to access its data. It's a reversal of Stack Overflow's prior stance against its data being used to train models, but it makes sense since generative AI models have taken a chunk of its traffic. I for one am excited because I can now do CTRL+C, CTRL+V from ChatGPT alone instead of from both it and Stack Overflow. - John Forstmeier
The Limits of Data
This article provides a deep dive into why you should always be skeptical of the data you're presented. That data likely contains biases and lacks context, both of which are inherent in the very nature of data collection. This was a fantastic read, and worth keeping in mind in a world where it seems like data dominates every decision we make. - Blake Burch
Podcast Recommendations
Microsoft Research Podcast - Abstracts: May 6, 2024
This episode discusses MathVista, a new benchmark for evaluating how well large language models can reason about math problems presented with both text and images. Dr. Michel Galley explains why multimodal reasoning is an important but largely unexplored capability, how his team collected visual math reasoning datasets, and what they found when testing models like GPT-4.
Machine Learning Street Talk (MLST) - Can Machines Replace Us? (AI vs Humanity)
AI expert Maria Santacaterina discusses the promise and perils of AI, arguing that we must ensure it serves humanity rather than replacing it. She provides an insightful critique of AI's failings and advocates for a 'human-centered' approach focused on empowering peopl