Replication at Speed – System of Record Capabilities for MemSQL 7.0
Feed: MemSQL Blog. Author: Nate Horan. System of record capability is the holy grail for transactional databases. Companies need to run their most trusted workloads on a database that has many ways to...
Analyze Google Analytics data using Upsolver, Amazon Athena, and Amazon...
Feed: AWS Big Data Blog. In this post, we present a solution for analyzing Google Analytics data using Amazon Athena. We’re including a reference architecture built on moving hit-level data from Google...
Mapping the Underlying Social Structure of Reddit
Feed: R-bloggers. Author: Posts on Data Science Diarist. Reddit is a popular website for opinion sharing and news aggregation. The site consists of thousands of user-made forums, called subreddits,...
Introduction to the Partition By Window Function
Feed: Databasejournal.com – Feature Database Articles. T-SQL window functions perform calculations over a set of rows (known as a “window”) and return a single value for each row from the...
Cary Huang: A Guide to Basic Postgres Partition Table and Trigger Function
Feed: Planet PostgreSQL. 1. Overview. Table partitioning, introduced after Postgres version 9.4, provides several performance improvements under extreme loads. Partitioning refers to splitting one...
Evaluating Model Performance by Building Cross-Validation from Scratch
Feed: R-bloggers. Author: Lukas Feick. Cross-validation is a widely used technique to assess the generalization performance of a machine learning model. Here at STATWORX, we often discuss performance...
Design Principles for Big Data Performance
Feed: Featured Blog Posts – Data Science Central. Author: Stephanie Shen. The evolution of the technologies in Big Data in the last 20 years has presented a history of battles with growing data volume....
Orchestrate Amazon Redshift-Based ETL workflows with AWS Step Functions and...
Feed: AWS Big Data Blog. Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that offers fast query performance using the same SQL-based tools and business...
The Beauty of a Shared-Nothing SQL DBMS for Skewed Database Sizes
Feed: MemSQL Blog. Author: Eric Hanson. The limitations of a typical, traditional relational database management system (RDBMS) have forced all sorts of compromises on data processing systems: from...
Training, validation and testing for supervised machine learning models
Feed: SAS Blogs. Author: Beth Ebersole. Validating and testing our supervised machine learning models is essential to ensuring that they generalize well. SAS Viya makes it easy to train, validate, and...
What is DRBD?
Feed: Liquid Web. Author: Jake Fellows. Did you know that it is possible for your server to crash while your website remains online? Highly reliable databases are critical for online services to...
Best practices to scale Apache Spark jobs and partition data with AWS Glue
Feed: AWS Big Data Blog. AWS Glue provides a serverless environment to prepare (extract and transform) and load large amounts of datasets from a variety of sources for analytics and data processing...
How to Migrate Oracle Workloads to VMware Cloud on AWS
Feed: AWS Partner Network (APN) Blog. Author: Jayaraman VelloreSampathkumar. By Jayaraman VelloreSampathkumar, Sr. Partner Solutions Architect at AWS. VMware Cloud on AWS is becoming a preferred choice...
Beena Emerson: Benchmark Partition Table – 1
Feed: Planet PostgreSQL. With the addition of declarative partitioning in PostgreSQL 10, it only made sense to extend the existing pgbench benchmarking module to create partitioned tables. A recent...
Hubert ‘depesz’ Lubaczewski: Waiting for PostgreSQL 13 – pgbench: add...
Feed: Planet PostgreSQL. On 3rd of October 2019, Amit Kapila committed patch: pgbench: add --partitions and --partition-method options. These new options allow users to partition the pgbench_accounts...
Column Histograms on Percona Server and MySQL 8.0
Feed: Planet MySQL. Author: Corrado Pandiani. From time to time you may have experienced that MySQL was not able to find the best execution plan for a query. You felt the query should have been faster....
Sept 2019: “Top 40” New R Packages
Feed: R-bloggers. Author: R Views. [This article was first published on R Views, and kindly contributed to R-bloggers]. Want to share your...
Infrastructure repair with Bolt
Feed: Puppet Blog Feed. Author: Cas Donoghue. Not to beat on our own drum or anything, but Puppet is a great tool for managing and configuring your entire infrastructure. It allows you to ensure...
Temporal Tables Part 3: Managing Historical Data Growth
Feed: Clustrix Blog. Author: Alejandro Infanzon. This is part 3 of a 5-part series; if you want to start from the beginning, see Temporal Tables Part 1: Introduction & Use Case Example. Up until now,...
Julian Markwort: Introduction and How-To: etcd clusters for Patroni
Feed: Planet PostgreSQL. etcd is one of several solutions to a problem that is faced by many programs that run in a distributed fashion on a set of hosts, each of which may fail or need rebooting at...