Splice Machine – The Real-Time SQL-on-Hadoop Database

Splice Machine is the only transactional SQL-on-Hadoop database for real-time Big Data applications. via Splice Machine – The Real-Time SQL-on-Hadoop Database.

Yahoo’s New Long Game: Contextual Search

For instance, a normal search for sushi might turn up a Wikipedia page or various websites about sushi. If one were to look up sushi from a phone through a contextualized mobile search, it could conceivably return nearby sushi restaurants with reviews, advertisements and coupons. The reason for Mayer to get excited is twofold: Nobody …

HBase vs Cassandra

Larry Thomas and I carried out this comparative study in May 2012; Larry prepared the Cassandra material. This information is NOT intended to be a tutorial for either Apache Cassandra or Apache HBase. We did our best to provide accurate information. Please comment or email me if you find any errors. I …

Apache Cassandra™ 2.0 | DataStax Cassandra 2.0 Documentation

Comparing the Hadoop File System (HDFS) with the Cassandra File System (CFS) : DataStax

The Hadoop Distributed File System (HDFS) is one of many different components and projects contained within the community Hadoop™ ecosystem. The Apache Hadoop project defines HDFS as: “the primary storage system used by Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely …

CitusDB: Scalable Analytics Database – FAQ

CitusDB is an analytics database that modifies and extends PostgreSQL for scalability. Users talk to CitusDB’s master node as they would to a regular database, and the master node partitions the data and queries across worker nodes in the cluster. The specifics of the underlying architecture closely resemble those of Hadoop. via CitusDB: …

Scaling Distributed Counters | WhyNosql

Distributed counters are an important feature that many distributed databases offer. For an ad network, distributed counters matter for many reasons; for example, real-time ad impression and click data can be used for ad optimization. HBase and Cassandra both support distributed counters. via Scaling Distributed Counters | WhyNosql.
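To make the idea of a distributed counter concrete, here is a minimal sketch in plain Python. It assumes a simple grow-only design in which every node increments only its own slot and the total is the sum of all slots. This is an illustration only and is not how HBase or Cassandra implement counters; the class name, node names and method names are all hypothetical.

```python
class ShardedCounter:
    """Grow-only counter: one slot per node, total = sum of slots.

    Hypothetical illustration; not the HBase or Cassandra implementation.
    """

    def __init__(self, node_ids):
        # Each node owns exactly one slot and only ever increments that slot.
        self.counts = {node_id: 0 for node_id in node_ids}

    def increment(self, node_id, amount=1):
        if amount < 0:
            raise ValueError("this sketch only supports non-negative increments")
        self.counts[node_id] += amount

    def merge(self, other_counts):
        # Slots only grow, so taking the per-node maximum is safe even if
        # the same remote state is merged more than once.
        for node_id, count in other_counts.items():
            self.counts[node_id] = max(self.counts.get(node_id, 0), count)

    def value(self):
        return sum(self.counts.values())


# Two replicas of the same counter, each receiving ad impressions locally.
replica_a = ShardedCounter(["node-a", "node-b"])
replica_b = ShardedCounter(["node-a", "node-b"])
replica_a.increment("node-a", 3)
replica_b.increment("node-b", 2)

# Exchange state; both replicas then agree on the total.
replica_a.merge(replica_b.counts)
replica_b.merge(replica_a.counts)
total_impressions = replica_a.value()
```

Keeping one slot per node means concurrent increments on different nodes never overwrite each other, which is the property an impression or click counter needs.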

Aggregation — MongoDB Manual 2.4.9

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single-purpose aggregation methods and commands. via Aggregation …
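As a rough picture of what "group values from multiple documents together" and "return a single result" mean, the sketch below groups a handful of documents by one field and sums another. It is plain Python over dictionaries, not MongoDB code; the function name, documents and field names are made up for illustration.

```python
from collections import defaultdict


def group_sum(documents, key_field, value_field):
    """Group documents by key_field and sum value_field within each group."""
    totals = defaultdict(int)
    for doc in documents:
        totals[doc[key_field]] += doc[value_field]
    return dict(totals)


# Hypothetical sample documents.
orders = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 7},
]

totals = group_sum(orders, "customer", "amount")
print(totals)
```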

tonyheupel/node-ravendb

Integrating Hadoop into Business Intelligence | Tableau Software

TDWI recognizes that Hadoop usage is a minority practice today, but assumes that mainstream usage of Hadoop within business intelligence (BI) and data warehousing (DW) applications will become common across many industries within a few years. This Webinar provides an overview of Hadoop products and best practices in the context of BI/DW applications so that …

The Daily Kebab

The ramblings of a technomuse

Tag Archives: BigData