All Hands on Data #96
The Stanley Cup Playoffs have just begun! As we anticipate the crowning of this year's champions, explore our selection of articles and podcasts to pass the time.
Article Recommendations
Microsoft, OpenAI plan $100 billion data-center project, media report says
In the continuing data war between the tech giants, Microsoft and OpenAI are planning a massive $100 billion data center project. Bigger, better, faster, stronger. - John Forstmeier
From Data Scientist to ML / AI Product Manager
What are the differences between an ML/AI Product Manager and a software Product Manager? Both have to communicate complex technical concepts to stakeholders and translate stakeholders' desires into viable product decisions that engineers can easily act on. However, the unique challenges an ML/AI PM faces are interesting: maintaining a deep understanding of machine learning and data science in an ever-changing landscape, continuously measuring model performance through rapid growth, and ensuring the ethical use of AI. That last one is crucial, as the ethical considerations of AI are far-reaching and nuanced. - Katt Baum
Why pandas feels clunky when coming from R
I completely agree with this article that R's data manipulation libraries are superior to pandas. Of course you can accomplish the same task in either, but I have always felt that the dplyr/tidyverse syntax is more straightforward, primarily because it resembles SQL. I use Python every day and am not an R user anymore (and haven't been for about 2.5 years), but if something comes up where I need to analyze tabular data or make a few charts, I would much rather do that in R than in pandas/matplotlib. For literally any other task, Python all the way. - Wes Poulsen
State of Analytics Engineering by dbt
For non-technical users, understanding the state of analytics engineering is crucial for several reasons:
Business Alignment: It helps in understanding how analytics is being used to drive business decisions and strategies, enabling non-technical users to align their efforts with broader organizational goals.
Data Literacy: Being aware of trends and challenges in analytics engineering enhances data literacy among non-technical users, empowering them to interpret and utilize data effectively in their roles.
Collaboration: It facilitates better collaboration between technical and non-technical teams by providing insights into the tools, processes, and methodologies used in analytics engineering.
Informed Decision-Making: Non-technical users can make more informed decisions when they understand the capabilities and limitations of analytics systems, leading to better outcomes for their projects and initiatives.
- Jack Ryan
Podcast Recommendations
R for the Rest of Us
Statistician Will Landau discusses how he uses R at Eli Lilly and as an open source developer. He provides an overview of his R package, targets, which helps streamline data analysis workflows and ensure reproducibility.
R Weekly Highlights
This R Weekly podcast episode discusses integrating AI chatbots into RStudio, writing tests to fail fast, and checking R packages across platforms with the updated rhub package. The hosts review useful tools for R developers.