Andrew Purtell
Andrew's posts

Post has attachment
Phoenix v2.1 is released.

From James Taylor (@JamesPlusPlus):

The Phoenix team is pleased to announce the immediate availability of Phoenix 2.1 [1].

More than 20 individuals contributed to the release. Here are some of the new features now available:

* Secondary Indexing [2] to create and automatically maintain global indexes over your primary table (see the usage sketch after the feature list).
   - Queries automatically use an index when more efficient, turning your full table scans into point and range scans.
   - Multiple columns may be indexed in ascending or descending sort order.
   - Additional primary table columns may be included in the index to form a covered index.
   - Available in two flavors:
        o Server-side index maintenance for mutable data.
        o Client-side index maintenance optimized for write-once, append-only use cases.

* Row Value Constructors [3], a standard SQL construct to efficiently locate the row at or after a composite key value (see the paging sketch after the feature list).
   - Enables a query-more capability to efficiently step through your data.
   - Optimizes IN list of composite key values to be point gets.

* Map-reduce based CSV Bulk Loader [4] to build Phoenix-compliant HFiles and load them into HBase.

* MD5 hash and INVERT built-in functions
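
As a minimal sketch of how a global index might be used from the Phoenix JDBC driver -- the METRICS table, its columns, and the "localhost" ZooKeeper quorum are illustrative assumptions, not part of the release notes:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class IndexSketch {
        public static void main(String[] args) throws Exception {
            // Assumes the Phoenix client jar is on the classpath;
            // "localhost" is the ZooKeeper quorum of the HBase cluster.
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                conn.createStatement().execute(
                    "CREATE TABLE metrics (host VARCHAR NOT NULL, ts DATE NOT NULL, " +
                    "cpu DECIMAL, mem DECIMAL CONSTRAINT pk PRIMARY KEY (host, ts))");

                // Global index on ts in descending sort order; cpu is included so
                // the query below is covered and answered entirely from the index.
                conn.createStatement().execute(
                    "CREATE INDEX metrics_by_ts ON metrics (ts DESC) INCLUDE (cpu)");

                // Without the index this would be a full table scan; Phoenix
                // rewrites it automatically into a range scan over metrics_by_ts.
                PreparedStatement ps = conn.prepareStatement(
                    "SELECT host, cpu FROM metrics WHERE ts > ?");
                ps.setDate(1, java.sql.Date.valueOf("2013-09-01"));
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + " " + rs.getBigDecimal(2));
                    }
                }
            }
        }
    }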

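In the same vein, a sketch of the row value constructor "query more" pattern against the same assumed METRICS table: the composite-key comparison lets each page of results resume just past the last row already returned, and an IN list of composite keys is executed as point gets.

    import java.sql.Connection;
    import java.sql.Date;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class QueryMoreSketch {
        // Fetch the next page of rows strictly after (lastHost, lastTs).
        static ResultSet nextPage(Connection conn, String lastHost, Date lastTs) throws Exception {
            PreparedStatement ps = conn.prepareStatement(
                "SELECT host, ts, cpu FROM metrics " +
                "WHERE (host, ts) > (?, ?) " +      // row value constructor
                "ORDER BY host, ts LIMIT 1000");
            ps.setString(1, lastHost);
            ps.setDate(2, lastTs);
            return ps.executeQuery();               // scan starts right after the given key
        }

        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                // First page: no starting key yet.
                ResultSet rs = conn.createStatement().executeQuery(
                    "SELECT host, ts, cpu FROM metrics ORDER BY host, ts LIMIT 1000");
                String lastHost = null;
                Date lastTs = null;
                while (rs.next()) {                 // remember where this page ended
                    lastHost = rs.getString(1);
                    lastTs = rs.getDate(2);
                }
                if (lastHost != null) {
                    try (ResultSet next = nextPage(conn, lastHost, lastTs)) {
                        while (next.next()) {       // repeat until a page comes back empty
                            System.out.println(next.getString(1) + " " + next.getDate(2));
                        }
                    }
                }
                // An IN list of composite key values is turned into point gets, e.g.:
                //   SELECT cpu FROM metrics WHERE (host, ts) IN ((?, ?), (?, ?))
            }
        }
    }
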
Phoenix 2.1 requires HBase 0.94.4 or above, with 0.94.10 or above required for mutable secondary indexing. For the best performance, we recommend HBase 0.94.12 or above. 

Regards,

James
@JamesPlusPlus
http://phoenix-hbase.blogspot.com/

[1] https://github.com/forcedotcom/phoenix/wiki/Download
[2] https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing
[3] https://github.com/forcedotcom/phoenix/wiki/Row-Value-Constructors
[4] https://github.com/forcedotcom/phoenix/wiki/Bulk-CSV-loading-through-map-reduce 

Post has attachment
"HFT (high-frequency trading) systems operate and evolve at astounding speeds. Moore's law is of little comfort when compared with the exponential increase in market-data rates and the logarithmic decay in demanded latency. As an example, during a period of six months the requirement for a functional trading system went from a "tick-to-trade" latency of 250 microseconds to 50. To put that in perspective, 50 microseconds is the access latency for a modern solid-state drive. [...] The goal of this article is to introduce the problems on both sides of the wire. Today a big Wall Street trader is more likely to have a Ph.D from Caltech or MIT than an MBA from Harvard or Yale. The reality is that automated trading is the new marketplace, accounting for an estimated 77 percent of the volume of transactions in the U.K. market and 73 percent in the U.S. market. As a community, it's starting to push the limits of physics."

Detailed and fascinating read. 

Also depressing... looks like a substantial fraction of our resources for technological development is captured by this local optimum. 

Post has shared content
Space Pens vs. Pencils

Via +I fucking love science over on another platform.

Things are very rarely as simple as they seem... it's easy to mock when you don't know the context.

Post has attachment
"We propose a new approach to mitigating predictive privacy harms – that of a right to procedural data due process. In the Anglo-American legal tradition, procedural due process prohibits the government from depriving an individual’s rights to life, liberty, or property without affording her access to certain basic procedural components of the adjudication process – including the rights to review and contest the evidence at issue, the right to appeal any adverse decision, the right to know the allegations presented and be heard on the issues they raise."

Post has attachment
This WSJ opinion piece explicitly links the NSA scandal with Apache Hadoop and Accumulo.