For the past few weeks I’ve been testing Amazon Athena as an alternative to standing up Hadoop and Spark for ad hoc analytical queries. During that research, I’ve been looking closely at file formats for the kind of data I store in S3 for Athena. I have typically been happy with Apache Parquet as my go-to because of its popularity and guarantees, but some research pointed me to Apache ORC and its advantages in this context.
I loved the emotion displayed in the show this season. Both Donna and Cameron had a great deal of emotion on display, and not in a clichéd “girls like to cry” kind of way. With the growth of their company and the investor interest in it, the test of their partnership was on full display, and Donna failed in the worst way possible.
In a recent project, I wanted to do text searches over a large unstructured dataset (100 GB) in memory and I was able to do it in Spark once I provisioned a machine with enough memory. I was able to do it quickly and efficiently, but I was bugged that I couldn't compress the data and had to spin up a master with that much memory.
On my last trip to the farmers market, I spent $20 on a few peaches and a personal watermelon. The fruit was untouchable but didn’t get me through the weekend (it was that good). I felt good about my decision to support local business and to buy fruit that was in season, but I couldn’t help but feel elite. The atmosphere makes me feel like I’m not only supporting local companies but that I’m better than everyone who isn’t.
I bought an Apple Watch Series 2 for Christmas. Most of my motivation for getting it was the trouble I had syncing my workout data. I was using a Garmin Vivosmart watch plus a heart rate monitor to get workout data.