All Hands on Data #47
Thrilled to have you with us for this week's AHoD adventure: discover our team's top article picks of the week below!
How to (mostly) gain back control of your data? - Self-Hosting explained and alternatives
For the privacy-minded, the author details several steps they took to introduce self-hosted resources to control their own data. Maybe I should rethink posting my social security card in my email signature... - John Forstmeier
DataLang: A New Programming Language for Data Scientists... Created by ChatGPT?
van Rossum created Python in 3 months. JavaScript took 10 days of Eich's time. DataLang? Hours. Ok, that's an exaggeration, but this experiment is still fascinating. Mayo collaborated with ChatGPT to create a language optimized for Data Scientists. For a couple hours of work, I think it is a pretty decent first effort. What do you think? - Katt Baum
A Guide to Top Natural Language Processing Libraries
Ever wonder how voice assistants are able to understand what you're saying and respond? This is a great, comprehensive introductory guide to various NLP libraries that you can add to your application, possibly to create a voice assistant of your own! - Reed Cowan
OpenAI's hunger for data is coming back to bite it
With the fascination with OpenAI and how quickly they're developing, it's surprisingly not surprising to see they're skirting around data protection laws. For those of us in the data space, privacy comes first and foremost, so this could be an interesting read. - Angel Catalan
The Truth about Prefect, Mage, and Airflow.
This is a great synopsis of some of the major players in the orchestration field; if you are unfamiliar with Airflow, Prefect or Mage (I personally didn't know much about Mage before) then this is a great read to gain some understanding about the strengths and differences between each. Keep in mind the four bullet points at the end:
Scalable; push you towards best practices; reliable; and don't require vast amounts of management and a deep understanding of the system architecture simply to use them at scale.
Using these criteria, see how Shipyard stacks up as well. - Wes Poulsen
The Marketing Behind MongoDB
As someone who wasn't working in the tech field during the rise of MongoDB, I find the parallels between it and tooling in the modern data stack uncanny. While this article is almost 6 years old, it tells the historical tale of how MongoDB rose to popularity as a developer tool through hype cycles, influencer marketing, and bad engineering decisions. Seriously, give this one a read and see if you can't discover some similarities. Everything old is new again. - Blake Burch
Future of Education: Application not Regurgitation of Knowledge - Part III
I've already shared the previous 2 parts of this 3-part series. I think Bill wraps up his thoughts on how to transform education to meet the challenges we're facing in a solid way. #7 is a major point that I think all of us need to take to heart. Teaching younger generations about how their data is harvested and used is incredibly important. - Jon Davidson