Profile

Hugh Cayless
Works at Duke University
Attended Cornell University
Lived in Barbados
315 followers|8,856 views

Stream

Hugh Cayless

Shared publicly  - 
 
New Year's resolutions:

* start my own business
* run a marathon
* think of other ways to terrify myself :-)
Hugh Cayless, Josef Komenda
3 comments
 
Cool - I think pre-reg starts in May.

Hugh Cayless

Shared publicly  - 
 
Dan Chudnov originally shared:
 
The charges against Aaron Swartz hit home. I know little about it other than what was published yesterday. But every aspect of the case touches my professional career directly, if only conceptually.

JSTOR was developed at UMich when I was a library school student there; I was an early user, and somewhat knew people working on it and working at JSTOR later. Ten years ago I was a librarian at MIT Libraries; I somewhat know the people there who were probably involved in dealing with this when it happened. I somewhat know Aaron; I first met him at an O’Reilly conference in 2001, then again in 2007, and enjoyed his talk on Open Library at code4lib 2008 (see http://code4lib.org/conference/2008/swartz). I currently work at a large library where among other things I’m somewhat involved in a project that gives away everything about it: all the data, all the metadata, all the software necessary to run exactly the same service anywhere else in the world. Indeed when Aaron and I crossed paths at Science Foo Camp in 2007 we talked about just that project; he noted the conversation on his blog (see http://www.aaronsw.com/weblog/scifoo07). I’ve served on a federal grand jury; the burden of proof is low for indictment (probable cause), and most cases with indictments never make it to trial, where the burden of proof is high for a guilty verdict (beyond a reasonable doubt).

When Aaron and I had that conversation at SciFoo ‘07, I was excited to tell him we had 10 TB of data in NDNP/Chronicling America to give away. As of the latest update to the site -- today -- we now manage a half-petabyte of live data (including backup copies) for NDNP alone, roughly 40 TB of which is publicly available and free, with many modes of simple, no-registration-required web access available via a lovely web API (see http://chroniclingamerica.loc.gov/about/api/). My main contributions to NDNP were writing the first version of that API document and being project manager for the backend system that manages the data inventory and workflow, but even having had such a minor hand in the NDNP efforts, which give away so much, my involvement in the project has made me prouder to be a librarian than anything else I’ve done in the past fifteen years.

In 2006 I wrote a piece about “greedy librarians” that somewhat anticipated this trying-to-download-it-all situation (see http://onebiglibrary.net/story/greedy-librarian-moonshot). In it I argued that when, say, a new student arrives on a university campus, they should be able to download the whole library and take it with them, get regular updates, and share their navigatings through it with their friends over their college years. I stand by that argument. Having worked on projects like NDNP since then, I’m all the more convinced that it’s the right way to go whenever possible. We should want to enable people who want to work with entire collections, to smooth their paths, to provide update mechanisms for them and their copies of data we make available.

For NDNP and similar large data projects where I work, occasionally somebody fires up bots that try to pull down all the data using brute force methods. Sometimes this exceeds our capacity to serve the data, so when possible, we attempt to contact the remote person and talk them down into other methods. Some of our data may be purchased in bulk, more efficiently; some of it may be downloaded in bulk, more patiently. In the best case, on a project like NDNP, the person on the other side of the bots reacts well to us saying “it’s okay to take this data, but you’re preventing everyone else from using the site; please slow down your bots” and they slow down their bots. Then we watch the server’s load drop back down, and then get back to our wonderful daily drudge work of moving more data through the pipeline and making it available, happy to know that somebody wants the data. Like I said, this is a best case example, and it’s a real example.

I don’t know what Aaron was trying to accomplish (other than getting a local copy of lots of JSTOR); I don’t know what the people at JSTOR or MIT Libraries think about it, and I don’t know what the US Attorney’s office is thinking. I’m not signing any petitions, though. At every step of the process, it’s easy for me to imagine what might be on the minds of the people involved: I could see a grand jury member seeing probable cause in the charges as written; I could see the staff at JSTOR or MIT Libraries with mixed emotions about seeing services to their communities limited by somebody who wants to make extensive use of the service; I can visualize Aaron holding a bike helmet over his face as he accesses a network switch. It isn’t my place to judge any of these actions or reactions. I’m not qualified.

Instead, I’m going to stay focused on making data available, at scale, for free, along with software that makes it easy to access and use. I’m a librarian. This is what I do. I hope to do it for thirty more years.
Jerry Spiller
 
Good read on the JSTOR situation.

Hugh Cayless

Shared publicly  - 
 
#teifuture

If we were going to reimagine the structure of the TEI Consortium, what might it look like? I'm going to jot down a few ideas/complaints/musings. Some of this is in reaction to Martin Mueller's (http://ariadne.northwestern.edu/mmueller/teiletter.pdf) and Stuart Yeates' (http://opensourceexile.blogspot.com/2011/08/thoughts-on-letter-about-tei-from.html) thoughts.

1. I think the current setup, where institutions with paid memberships are the only enfranchised members of the community, is self-limiting. It means that the TEI leadership is closed to outsiders and that therefore new ideas and new blood don't get into the mix as much as they should. I don't see why the Council at least shouldn't have representatives elected by the community as a whole.

2. Declining institutional subscriptions are a related problem. When the leadership is this closed off, there aren't sufficient incentives to attract new subscribers (i.e. there's no good way to hook new paying members in). There are also no real incentives to attract individual members.

3. I think the separation of the Board and Council is a good idea and should stand. Maintenance of the standard is not the same sort of task as maintenance of the organization. And I think having a small number of experts be the gatekeepers for the standard works well.

4. Further, I think the fact that Council in-person meetings are subsidized is a very good thing, as it allows people whose organizations wouldn't support their travel to attend. But note that "outsiders" don't presently have a shot at getting onto the Council because of the voting structure.

5. I think some method should be found to recognize and reward major technical contributors to the TEI. In open source projects, they are made "committers"—which is rather like being on the Council. My worry is that people like this don't really stand much chance now of being elected, and might not even if elections are opened up.

6. The Council is pretty open, but I've more than once had the experience of making feature requests and then never hearing about their fate until I happen to think of asking a Council member. The system of handling and reporting back on feature requests could be improved.

7. The level of technical sophistication isn't always what I'd want it to be: there are, for example, chunks of the Guidelines that don't work (I'm looking at you, TEI XPointer Schemes) and have yet either to be revised or implemented.
Martin Mueller, Martin Holmes, Lou Burnard, Sebastian Rahtz
13 comments
 
1. +Sebastian Rahtz I've used <fs>. I must admit I didn't particularly enjoy it, but it was the best tool for a specific job (related to a morphological dictionary).

2. +Martin Mueller and +Hugh Cayless I'm on Council, and I don't think I ever thought of the board as Kafkaesque, but I do admit to not really giving it any thought at all; up to now, I saw that as a positive thing, because I could focus on technical work and not worry about politics. When the Board is functioning well, it's a real benefit to Council that they are free not to think about all that stuff, but I now see that when it's a bit dysfunctional we all suffer and we all have to respond somehow.

Hugh Cayless

Shared publicly  - 
 
A glimpse at what Humanities Higher Ed should do, rather than (or at least in addition to) preparing students for a career most of them will never have.

http://chronicle.com/blognetwork/tenuredradical/2011/07/digital-dreams-a-case-for-producing-more-history-ph-ds/
Rebecca Davis
 
I really liked this point: "but if all conversations are aimed at why digital publication is the same, or as good as, analogue publication, then we are not pushing ourselves intellectually about the distinctiveness of the digital world, what its promises and limitations for a more democratic public sphere are, and how historians need to change their view of what scholarship is to meet the challenges that the Internet poses."

which makes me think I also need a more granular +1 button
People
Have him in circles
315 people
Simon Spero
Seth Denbo
Leah Riley
Josh Greenberg
Dean Irvine
John Unsworth
Work
Employment
  • Duke University
    Senior Digital Humanities Developer, 2013 - present
  • New York University
    Digital Library Analyst/Programmer, 2009 - 2013
Places
Previously
Barbados - Chapel Hill, NC - Garden City, NY - Ithaca, NY
Links
Story
Tagline
Digital Classicist / Scholar-Programmer
Education
  • Cornell University
    Classics, History, 1987 - 1991
  • University of North Carolina at Chapel Hill
    Classics, 1992 - 1999
  • University of North Carolina at Chapel Hill
    Information Science, 2001 - 2005
Basic Information
Gender
Male