Profile

David Smiley
Works at MITRE
Attended Northeastern University
Lives in Lowell, MA
186 followers | 17,338 views

Stream

David Smiley

Shared publicly  - 
 
If you index polygons (or other non-point shapes) in Lucene, Solr, or ElasticSearch, then I'm sure you'll be interested in this.

David Smiley

Shared publicly  - 
 
Cool to see that CloudSearch is based on Solr.
 
Amazon Web Services announced a new release of their CloudSearch, now based on Apache Solr/Lucene!  Very cool.

David Smiley

Shared publicly  - 
 
In about a week I'll be a full-time independent Lucene/Solr search consultant! MITRE has been great to me but I will now be able to do more of what I love, and have more time to improve Lucene, Solr, Spatial4j, and perhaps ElasticSearch too.

David Smiley

Shared publicly  - 
 
Great video to share with anyone unfamiliar with the open-source model.

David Smiley

Shared publicly  - 
 
Submitted my first ElasticSearch Pull Request!

David Smiley

Shared publicly  - 
 
I really like Mike's latest post about Lucene's Suggesters, as it covers nearly all of them. There has been great progress here in the last couple of years!

David Smiley

Shared publicly  - 
 
I'm super impressed with the technology behind SIREn (open-source ASL), based on Lucene/Solr.  In a nutshell, it enables documents to have a rich structure akin to JSON/XML documents and lets you query in ways not previously possible with Lucene/Solr's flat schema.  And they added faceting on this, with the ability to pivot from different linked relations.  Hats off to them!
3 comments
 
Cool!

David Smiley

Shared publicly  - 
 
Awesome faceting speedups by Toke.  Looking forward to this in SOLR-5894.
 
So I tried doing the sparse faceting thing on our 1TB, 420M document web index on the URL field (300M unique values). Problem was that I did not have any queries, so I just used random Danish words from a dictionary. That gave a very low hit-rate and thus favoured the sparse implementation. But hey, it was quite a kick to see just how fast it responds in a best-case scenario.

#solr #lucene #faceting  
This post is a follow-up to Sparse facet counting on a real index. Where the last post explored using a sparse counter for faceting on author on Statsbibliotekets index of library material, this pos...
Scott Stults's profile photo
 
Wow. This should be in Coursera.

David Smiley

Shared publicly  - 
 
Cool to see one of my favorite bloggers use my "Solr Text Tagger"
People
In his circles
136 people
Have him in circles
186 people
Janet Kearns's profile photo
Mauricio Scheffer's profile photo
Noble Paul's profile photo
Chris Moesel's profile photo
James Wrabel's profile photo
Mark Miller's profile photo
Work
Occupation
Software Engineer
Employment
  • MITRE
    Software Engineer, 1997 - present
Places
Currently
Lowell, MA
Previously
Lowell, MA - Boston, MA - Franklin, MA - Barrington, RI - Williamsburg, VA - Somerville, MA
Story
Introduction
Proud father of two girls, and married to Sylvie.

I'm a software engineer and I love programming! Lately I have specialized in Lucene/Solr, and geospatial. I wrote the first book on Solr with Eric Pugh.

In my spare time (ha!) I watch professional Starcraft 2 e-sports.
Bragging rights
Wrote the first book on Apache Solr.
Education
  • Northeastern University
    Computer Science, 1996 - 2000
Basic Information
Gender
Male
David Smiley's +1's are the things they like, agree with, or want to recommend.
SirenDB
sirendb.com

the next generation. Structured Document Search. system. SIREn is a schemaless structured document search system that combines free text sea

The Log: What every software engineer should know about real-time data's...
engineering.linkedin.com

We're seeking intelligent problem solvers who are inspired and motivated to change the world.

Codd’s Relational Vision: Has NoSQL Come Full Circle? | Javalobby
java.dzone.com

Recently, I spoke at NoSQL Matters in Barcelona about database history. As somebody with a history background, I was pretty excited to dig i

Piplin – A DSL for Describing Silicon in Clojure
www.infoq.com

David Greenberg introduces Piplin, a DSL that allows a subset of Clojure to be automatically converted into a hardware description, which ca

Podcast: ratings, rankings, and the advantage of being born lucky - O'Re...
radar.oreilly.com

Is popularity just a matter of simple luck--of some early advantage compounded by human preference for things that are already popular? A pa

Maven and wildcard exclusions | Smartjava.org
www.smartjava.org

Once in a while I run into the problem where I don't want to exclude a single dependency from a maven dependency, but all dependencies. For

Solr block-join support
blog.griddynamics.com

As you may already know block join support has been committed into Solr and will be available starting from 4.5. Here Solr catches up with E

Optional Dependency Strategies for Java Libraries - Blog - Axel Fontaine...
axelfontaine.com

In software, contrary to common belief, lines of code are a liability not an asset. As you gradually accumulate code, little by little the p

Salmon Run: Dictionary Backed Named Entity Recognition with Lucene and L...
sujitpal.blogspot.com

Domain-specific Concept Search (such as ours) typically involves recognizing entities in the query and matching them up to entities that mak

Lux: Lux - The XML Search Engine
luxdb.org

Lux. The XML Search Engine. Lux is an open source XML search engine formed by fusing two excellent technologies: the Apache Lucene/Solr sear

Leaflet — an open-source JavaScript library for interactive maps
leafletjs.com

Leaflet is a modern, lightweight open-source JavaScript library for mobile-friendly interactive maps.

Index Sorting with Lucene
shaierera.blogspot.com

When you index documents with Lucene, you often index them in some arbitrary order, usually by a first-come first-served manner. Most applic

There's No Economic Imperative to Reconsider an Open Internet by Benoît ...
papers.ssrn.com

The debate on the neutrality of Internet access isn’t new, and if its intensity varies over time, it has for a long while tainted the relati

Home | Dropwizard
dropwizard.codahale.com

Dropwizard is a Java framework for developing ops-friendly, high-performance, RESTful web services. Developed by Yammer to power their JVM-b

Efficient compressed stored fields with Lucene
blog.jpountz.net

Whatever you are storing on disk, everything usually goes perfectly well until your data becomes too large for your I/O cache. Until then, m

Why Rackspace Is Suing The Most Notorious Patent Troll In America - The ...
www.rackspace.com

Today we drove a stake into the ground in our dogged fight against patent trolls – we sued the most notorious patent troll in America.

Twitter / dep4b: RT @tastapod: OH: I'm going ...
twitter.com

Instantly connect to what's most important to you. Follow your friends, experts, favorite celebrities, and breaking news.

Open Source - DocumentCloud
www.documentcloud.org

As we work on DocumentCloud, we're constantly building pieces of infrastructure that could be useful for other organizations that work with

DocumentCloud's VisualSearch.js
documentcloud.github.io

VisualSearch.js. Created by Samuel Clay, @samuelclay. VisualSearch.js enhances ordinary search boxes with the ability to autocomplete facete