partitions – Cloud Data Architect

Lessons Learned from Building and Testing a Data Ingest Workflow at Scale

October 10, 2017, 10:33 pm

Feed: AWS Government, Education, & Nonprofits Blog. Author: publicsector. Published 11 OCT 2017 in Public Sector. Dean Kleissas, Research Engineer working on the IARPA MICrONS...

View Article

Practical Machine Learning with R and Python – Part 2

October 12, 2017, 8:57 pm

Feed: R-bloggers. Author: Tinniam V Ganesh. In this 2nd part of the series “Practical Machine Learning with R and Python – Part 2”, I continue where I left off in my first post Practical Machine...

View Article

Rethinking data marts in the cloud

October 18, 2017, 5:10 am

Feed: All – O’Reilly Media. Author: Greg Rahn. Check out Greg Rahn’s session, “Rethinking data marts in the cloud: Common architectural patterns for analytics” at the Strata...

View Article

Migrating to MySQL 8.0 for WordPress – episode 3: query optimization

October 25, 2017, 11:36 pm

Feed: Planet MySQL. Author: Frederic Descamps. Now that MySQL 8.0.3 RC1 is installed and we have seen how to verify the workload, it’s time to see if we can optimize some of the queries. As explained...

View Article

Big SQL Best Practice and Guidelines – Data Ingestion with LOAD

October 26, 2017, 8:01 am

Feed: Hadoop Dev. Author: Nailah Bissoon. In some cases, data is in a certain format which needs to be converted. If the data is coming from the warehouse in text format and must be changed to a...

View Article

Advanced ODS Graphics: A deeper dive into documents, dynamics, and data objects

November 2, 2017, 6:08 pm

Feed: SAS Blogs. Author: Warren F. Kuhfeld. My second blog described modifying dynamic variables in ODS Graphics. Little did I know the extent to which it would launch a series of blogs, papers,...

View Article

Big SQL Best Practices – Data Ingestion

November 3, 2017, 1:29 pm

Feed: Hadoop Dev. Author: Nailah Bissoon. There are various methods to ingest data into Big SQL. This blog gives an overview of each of these options and provides some best practices for data ingestion...

View Article

Reserved Words

November 3, 2017, 2:14 pm

Feed: Planet MySQL. Author: Peter Gulutzan. In the 1990s C. J. Date said: “The rule by which it is determined within the standard that one key word needs to be reserved while another need not be is not...

View Article

#AzureSQLDW: Hub and Spoke series Integration with Azure Analysis Services

November 13, 2017, 3:00 am

Feed: Microsoft Azure Blog. Author: Arnaud Comet. This blog post is co-authored with Ellis Hiroki Butterfield, Program Manager for SQL DW and Josh Caplan, Program Manager for Azure Analysis Services....

View Article

Big SQL v5 Best Practices and Guidelines – Performance

November 13, 2017, 9:56 am

Feed: Hadoop Dev. Author: Nailah Bissoon. This blog outlines some Best Practices to improve Big SQL performance in Big SQL v5. The tips are ordered according to when we believe they should be...

View Article

Extracting knowledge from knowledge graphs using Facebook PyTorch BigGraph

May 6, 2019, 12:50 am

Feed: Featured Blog Posts – Data Science Central. Author: Sergey Zelvenskiy. Machine learning gives us the ability to train a model, which can convert data rows into labels in such a way that similar...

View Article

Planet scale operational analytics and AI with Azure Cosmos DB

May 7, 2019, 4:00 am

Feed: Microsoft Azure Blog. Author: Rimma Nehme. We’re excited to announce new Azure Cosmos DB capabilities at Microsoft Build 2019 that enable anyone to easily build intelligent globally distributed...

View Article

Partition Management in Hadoop

May 7, 2019, 6:00 am

Feed: Hadoop – Cloudera Engineering Blog. Author: Shelby Khan. Guest blog post written by Adir Mashiach. In this post I’ll talk about the problem of Hive tables with a lot of small partitions and files...

View Article

Introducing the MemSQL Kubernetes Operator

May 8, 2019, 5:59 am

Feed: MemSQL Blog. Author: Carl Sverre. Kubernetes has taken the world by storm, transforming how applications are developed, deployed, and maintained. For a time, managing stateful services with...

View Article

Small Files, Big Foils: Addressing the Associated Metadata and Application...

May 9, 2019, 3:08 pm

Feed: Hadoop – Cloudera Engineering Blog. Author: Shelby Khan. Small files are a common challenge in the Apache Hadoop world and when not handled with care, they can lead to a number of complications....

View Article

Kafka Replication: The case for MirrorMaker 2.0

May 13, 2019, 11:38 am

Feed: CDH – Cloudera Engineering Blog. Author: Renu Tewari. Apache Kafka has become an essential component of enterprise data pipelines and is used for tracking clickstream event data, collecting logs,...

View Article

AWS Glue crawlers now support existing Data Catalog tables as sources

May 14, 2019, 3:35 am

Feed: Recent Announcements. With this release, crawlers can now take existing tables as sources, detect changes to their schema and update the table definitions, and register new partitions as new data...

View Article

Microsoft 365 boosts usage analytics with Azure Cosmos DB – Part 2

May 16, 2019, 1:01 am

Feed: Microsoft Azure Blog. Author: Parul Matah. This post is part of a 2-part series about how organizations are using Azure Cosmos DB to meet real world needs, and the difference it’s making for...

View Article

Test data quality at scale with Deequ

May 16, 2019, 2:39 am

Feed: AWS Big Data Blog. You generally write unit tests for your code, but do you also test your data? Incorrect or malformed data can have a large impact on production systems. Examples of data...

View Article

Unsupervised learning and its role in the knowledge discovery process

May 24, 2019, 12:50 am

Feed: Featured Blog Posts – Data Science Central. Author: Ariful Islam. Unlike supervised learning, unsupervised learning does not work with labeled data; it does not show the machine the correct...

View Article