1.3 Billion NYC Taxi Rows into MemSQL Cloud
Feed: MemSQL Blog. Author: Seth Luersen. Experience teaches us that when loading data into a database, in whatever form (normalized, denormalized, schema-less, hierarchical, key-value, document, etc.)...
Apache Spark Streaming
Feed: Featured Blog Posts – Data Science Central. Author: Shreya Gupta. Introduction to Apache Spark Streaming? A data stream is an unbounded sequence of data arriving continuously. Streaming divides...
Attunity Accelerates Data Loading and Transformation for Data Lakes
Feed: Hortonworks Blog – Hortonworks. Author: Nadeem Asghar. Attunity is a long-time Hortonworks partner that provides data optimization and data integration software to help Hortonworks customers...
New in Cloudera Enterprise 5.5: Improvements to HUE for Automatic HA Setup...
Feed: CDH – Cloudera Engineering Blog. Author: Justin Kestelyn. Cloudera Enterprise 5.5 improves the life of the admin through a deeper integration between HUE and Cloudera Manager, as well as a rebase...
Inside Wargaming.net’s Data-driven, Real-time Rules Engine
Feed: CDH – Cloudera Engineering Blog. Author: Justin Kestelyn. In this post, engineers from Wargaming.net, the online game developer and publisher, describe the design of their real-time...
Quality Assurance at Cloudera: Fault Injection and Elastic Partitioning
Feed: CDH – Cloudera Engineering Blog. Author: Justin Kestelyn. In this installment of our series about how quality assurance is done at Cloudera, learn about the important role of fault injection in...
CAS data modeling for performance
Feed: SAS Blogs. Author: Stephen Foerster. The Cloud Analytic Server (CAS for short) is SAS’ latest high-performance, scalable, in-memory analytic data server. In this post, I’d like to discuss the...
Apache Impala (incubating) in CDH 5.7: 4x Faster for BI Workloads on Apache...
Feed: CDH – Cloudera Engineering Blog. Author: Justin Kestelyn. Impala 2.5, now shipping in CDH 5.7, brings significant performance improvements and some highly requested features. Impala has proven to...
Update Hive Tables the Easy Way
Feed: Hortonworks Blog – Hortonworks. Author: Carter Shanklin. This is part 1 of a 2-part series on how to update Hive tables the easy way. Historically, keeping data up to date in Apache Hive required...
Data Ingestion with Spark and Kafka
Feed: Planet big data. Author: Meg Blanchette. August 15th, 2017. An important architectural component of any data platform is the set of pieces that manage data ingestion. In many of today’s “big data”...
Understanding overfitting: an inaccurate meme in supervised learning
Feed: R-bloggers. Author: msuzen. Preamble: There is a lot of confusion among practitioners regarding the concept of overfitting. It seems like a kind of urban legend or meme, a folklore is...
How-to: Detect and Report Web-Traffic Anomalies in Near Real-Time
Feed: CDH – Cloudera Engineering Blog. Author: Justin Kestelyn. This framework based on Apache Flume, Apache Spark Streaming, and Apache Impala (incubating) can detect and report on abnormal bad HTTP...
From Data Lake to Data Warehouse: Enhancing Customer 360 with Amazon Redshift...
Feed: AWS Big Data Blog. Achieving a 360° view of your customer has become increasingly challenging as companies embrace omni-channel strategies, engaging customers across websites, mobile, call...
Update Hive Tables the Easy Way Part 2
Feed: Hortonworks Blog – Hortonworks. Author: Carter Shanklin. This is part 2 of a 2-part series on how to update Hive tables the easy way. Managing Slowly Changing Dimensions: In Part 1, we showed how...
Automatically Dropping Old Partitions in MySQL and MariaDB
Feed: Planet MySQL. Author: Geoff Montee. A MariaDB Support customer recently asked how they could automatically drop old partitions after 6 months. MariaDB and MySQL do not have a mechanism to do this...
Log Buffer #519: A Carnival of the Vanities for DBAs
Feed: Planet MySQL. Author: The Pythian Group. This Log Buffer Edition covers Oracle, SQL Server and MySQL. Oracle: Fix index corruption found using analyze table validate structure; Customizing a...
Azure Management Libraries for Java – v1.2
Feed: Microsoft Azure Blog. Author: Asir Selvasingh. We released 1.2 of the Azure Management Libraries for Java. This release adds support for additional security and deployment features, and more...
Automatically Dropping Old Partitions in MySQL and MariaDB: Part 2
Feed: Planet MySQL. Author: Geoff Montee. In a previous blog post, I showed how a DBA could configure MySQL or MariaDB to automatically drop old partitions. Some readers provided some feedback on some...
Automatic Partition Maintenance in MySQL and MariaDB: Part 3
Feed: Planet MySQL. Author: Geoff Montee. In part 1 and part 2 of this blog series, I showed how a DBA could configure MySQL or MariaDB to automatically drop old partitions. Some readers mentioned that...
MySQL Support Engineer’s Chronicles, Issue #8
Feed: Planet MySQL. Author: Valeriy Kravchuk. This week is special and full of anniversaries for me. This week 5 years ago I left Oracle behind and joined Percona… Same week 5 years ago I had written...