Profile

Paul Mantz
Worked at Zmanda, Inc.
Attended The University of Chicago
Lives in Chicago, IL
324 followers|2,546 views

Stream

Paul Mantz

Shared publicly  - 
1
1

Paul Mantz

Shared publicly  - 
 
Cadbury 100 Year Egg
1

Paul Mantz

Shared publicly  - 
 
Does anyone have any good tools for accurate encoding detection and/or conversion? I have a corpus of about 15,000 files that is maybe 90% us-ascii, but a significant chunk (about 1,500) has very disparate encodings, everything from latin-1 to in-is13194-devanagari.

As for current tools, I have been using a combination of `file -I`, Emacs, and `enca`. The last has been next to worthless, though I suspect this is because it was installed by homebrew on OS X with no OS internationalization integration. Emacs has actually done a fantastic job so far of detecting and reporting encodings (I get warned with the encoding whenever I open something that isn't ASCII or UTF-8), but I don't have the experience (or the time to acquire it) to reverse-engineer the elisp and turn this into a batch job.
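A batch first pass over a corpus like this can be sketched with nothing but the Python standard library. This is a minimal sketch, not a real detector: it only separates files that decode cleanly as ASCII or UTF-8 from everything else, so the remaining files still need a proper detection tool or manual inspection. The names `classify` and `triage` and the bucket label `other` are hypothetical, chosen for illustration.

```python
from pathlib import Path


def classify(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a byte string.

    'other' only means "not valid ASCII or UTF-8"; it does not
    identify which encoding is actually in use.
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"


def triage(root: str) -> dict:
    """Walk `root` recursively and bucket every regular file by classify()."""
    buckets = {"ascii": [], "utf-8": [], "other": []}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            buckets[classify(path.read_bytes())].append(path)
    return buckets
```

Calling `triage("corpus/")` (the directory name is a placeholder) returns a dictionary of path lists, and the `other` list is the subset that needs closer attention.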

+Sai +Max Bane +Andrew Chalfant +"mitcho" Michael 芳貴 Erlewine I hope one of you has some info to share!
1
5 comments
 
On further inspection, it seems like the real kicker will be finding out exactly what 8-bit encoding these files are using :-/
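Single-byte encodings are hard to tell apart automatically because most byte sequences decode without error under many of them. One hedged way to narrow things down by eye is to decode the same snippet under several candidates and compare the results. The sketch below assumes a hypothetical helper `preview` and an arbitrary candidate list; the codecs named are ones that ship with Python, and the list would need adjusting for the corpus at hand.

```python
# Candidate single-byte codecs to compare; this list is an assumption,
# not a claim about what the files actually use.
CANDIDATES = ["latin-1", "cp1252", "iso-8859-2", "koi8-r", "mac_roman"]


def preview(data: bytes, candidates=CANDIDATES, width: int = 60) -> dict:
    """Decode a window around the first non-ASCII byte under each candidate.

    Returns {codec_name: decoded_text or None}. None means the window
    contains a byte that the codec leaves undefined. An empty dict
    means the data is pure ASCII.
    """
    first_high = next((i for i, b in enumerate(data) if b >= 0x80), None)
    if first_high is None:
        return {}
    start = max(0, first_high - width // 2)
    window = data[start:start + width]
    results = {}
    for name in candidates:
        try:
            results[name] = window.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results
```

Printing the dictionary for one file from each suspicious group makes it easier to spot which candidate yields readable text; a human still has to make the final call.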

I didn't get much of a chance to use chardet, but I am a bit leery since both the Python and Ruby versions have been unmaintained since 2007 and 2008, respectively. If I re-approach the problem it will definitely be one of the first things I reach for.

Paul Mantz

Shared publicly  - 
 
I love it when two of my favorite things collide: transcendental art and electronic music.
1

Paul Mantz

Shared publicly  - 
 
I think the next app I write for fun is going to procedurally generate Skrillex albums.
3
7 comments
 
Reddit also steals from Bash....

/just sayn

Paul Mantz

Shared publicly  - 
1
 
I like turtles!

Paul Mantz

Shared publicly  - 
 
Link says it all.
1

Paul Mantz

Shared publicly  - 
 
Writing unit tests in Costa Rica. My priorities are misplaced.
1

Paul Mantz

Shared publicly  - 
 
It turns out that CyanogenMod now supports the Samsung Epic 4G in both a CM7.2 (Gingerbread) nightly build and a CM9 (Ice Cream Sandwich) alpha. The 7.2 build felt smoother, but is there anything I'd miss in ICS if I wait for the dust to clear on it first?
1
2 comments
 
You're lucky; Nvidia is being stingy and won't help with docs for Tegra. ICS on my G2X is buggy at best. I stick with MIUI (and a custom kernel) and am pretty happy with it. The CM releases are nice but kind of ho-hum for me.

Paul Mantz

Shared publicly  - 
 
Don't know what to say other than this shit is raw and a glorious celebration of life, however alien it is.
1
People
Have him in circles
324 people
Joshua Leners's profile photo
Colin McFaul's profile photo
Becca Overby's profile photo
Kevin Sartin's profile photo
Geoffrey Topham's profile photo
Clare Phelps's profile photo
Work
Occupation
Software Engineer
Employment
  • Zmanda, Inc.
  • The Academic Approach
Places
Currently
Chicago, IL
Links
YouTube
Other profiles
Contributor to
Story
Introduction
Perl hacker, dad.
Education
  • The University of Chicago
Basic Information
Gender
Male