All Hands on Data #85
Are pyramids the best way to present information? I'm not sure I can prove that just yet, but you can see a data ROI pyramid along with other articles from this week in the data world below.
Best Practices for Creating Domain-Specific AI Models
This article gives a great set of recipes and advice for making your company's datasets more useful in domain-specific applications. Having dealt with a narrow, internal dataset in the past, I would have found this hugely beneficial before our efforts. - Eric Elsken
This dashboard could've been a…spreadsheet?
While you may think of Beauty and the Beast as the "Tale as Old as Time", I would argue the debate between spreadsheets and dashboards may be the ultimate tale. Elena discusses how she handled this debate when she worked at Peloton in this article. - Steven Johnson
Data Storage and Indexing
Just in case you were curious about the specific implementation tradeoffs of the various database solutions you're considering, here's the article for you. It gives a great, and fairly succinct, breakdown of different storage techniques. It'll come in handy as you're looking through the Shipyard Blueprints while setting up your data pipeline. - John Forstmeier
The Data ROI Pyramid: A Method for Measuring & Maximizing Your Data Team
There are many articles about how to measure the value-add of a data team, but this one stood out to me. Moses offers a new formula for calculating data ROI: (data product value - data downtime) / data investment. She breaks down what goes into each variable, provides suggestions on how to quantify them, and outlines the levers a leader can use to optimize each. This article is a well-organized guide on how to demonstrate value AND increase it. - Katt Baum
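To make the formula concrete, here is a minimal Python sketch of the calculation. The function name, parameter names, and dollar figures are hypothetical and chosen only for illustration; the article itself covers how to estimate each input.

```python
def data_roi(product_value: float, downtime_cost: float, investment: float) -> float:
    """Return (data product value - data downtime) / data investment."""
    if investment <= 0:
        raise ValueError("data investment must be positive")
    return (product_value - downtime_cost) / investment


# Hypothetical annual figures, in dollars.
roi = data_roi(product_value=1_200_000, downtime_cost=200_000, investment=500_000)
print(f"Data ROI: {roi:.2f}")  # prints "Data ROI: 2.00"
```

With these made-up numbers, every dollar invested in the data team returns two dollars of net value. Reducing downtime or raising product value moves the result up without changing the investment.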
What data scientists overlook when it comes to knowledge graphs
This Data Science Central article underlines the often-undervalued potential of knowledge graphs in data science, emphasizing that they require a deep understanding of the problem and a change in architecture. It also notes the critical role leadership plays in promoting knowledge sharing and using knowledge graphs to boost AI initiatives within organizations. - Johnathan Rodriguez
How to Enhance Data Quality in Your Data Pipeline
Data quality is more than just accurate data; it also covers completeness, consistency, reliability, and timeliness. This article dives deep into the data pipeline process and shows how deliberate, ongoing improvements to that pipeline raise data quality over time. It's a process that requires attention, but the end result is knowing that you are providing reliable and useful data. - Reed Cowan
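As a rough illustration of how a few of these dimensions could be measured, here is a small Python sketch. It is not taken from the article: the function name, field names, thresholds, and sample rows are all assumptions made up for the example.

```python
from datetime import datetime, timedelta


def quality_report(rows, required_fields, max_age, now):
    """Score a batch of records on completeness, timeliness, and duplicate IDs."""
    total = len(rows)
    # Completeness: share of rows where every required field is filled in.
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields) for r in rows
    )
    # Timeliness: share of rows updated within the allowed age.
    fresh = sum(
        1 for r in rows if r.get("updated_at") and now - r["updated_at"] <= max_age
    )
    # Consistency (one narrow check): how many rows reuse an existing ID.
    duplicate_ids = total - len({r.get("id") for r in rows})
    return {
        "completeness": complete / total if total else 1.0,
        "timeliness": fresh / total if total else 1.0,
        "duplicate_ids": duplicate_ids,
    }


# Hypothetical sample batch.
now = datetime(2024, 1, 10)
rows = [
    {"id": 1, "email": "a@example.com", "updated_at": datetime(2024, 1, 9)},
    {"id": 2, "email": "", "updated_at": datetime(2024, 1, 9)},
    {"id": 2, "email": "c@example.com", "updated_at": datetime(2023, 12, 1)},
    {"id": 4, "email": "d@example.com", "updated_at": datetime(2024, 1, 10)},
]
report = quality_report(rows, ["id", "email", "updated_at"], timedelta(days=7), now)
print(report)
```

Running a check like this on each pipeline run and tracking the numbers over time is one simple way to turn the dimensions above into something you can monitor.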