Profile

Gustavo Barbieri
Works at Intel
Attended UNICAMP
Lives in Valinhos-SP, Brazil
767 followers|43,979 views

Stream

Gustavo Barbieri

As I said in the previous post, opencv is not fun. What is fun is writing simple and fast code for a very specific purpose, and that led me to try face detection on my own. After all, I just need a rough match so I can align the crop (0.0-1.0 on both the x and y axes) and avoid cutting off faces that sit at the 1/4 or 3/4 grid lines.

The start was already interesting: the algorithms work on grayscale (8-bit) input, often acquired by converting RGB to Y (luminance). After some research I found that the naive (R+G+B)/3 is suboptimal, as the components should be weighted differently... okay, I knew that, but I didn't remember the weights. Anyway, that was my "baseline" implementation.

Then I found that the TIFF spec has some weights, which libjpeg-turbo simplifies as 0.29900 * R + 0.58700 * G + 0.11400 * B. This was my second implementation, just like that, using the FPU.
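For reference, the weighted sum above could look roughly like the sketch below. The function name `luma_float`, the 0xBBGGRR pixel packing (inferred from the masks in the shift-only code in this post) and the round-to-nearest step are assumptions for illustration, not necessarily what the original second implementation did.

```c
#include <stdint.h>

/* Hypothetical helper: floating-point luminance of one packed pixel.
 * Assumes the pixel is packed as 0x00BBGGRR (red in the low byte). */
static uint8_t luma_float(uint32_t color)
{
    const double r = (double)(color & 0xff);
    const double g = (double)((color >> 8) & 0xff);
    const double b = (double)((color >> 16) & 0xff);

    /* Weights as quoted in the post; adding 0.5 rounds to nearest
     * instead of truncating (an assumption of this sketch). */
    return (uint8_t)(0.29900 * r + 0.58700 * g + 0.11400 * b + 0.5);
}
```

With these weights pure green contributes the most and pure blue the least, which is exactly what the naive (R+G+B)/3 ignores.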

Of course the FPU would be slow (more than I expected), and libjpeg already does a fixed-point implementation with 16 bits of precision, which is simple to implement and gives very close results. libjpeg-turbo uses lookup tables (storing the products for all 256 values of R, G and B), but I dislike allocating memory for these, and the cost varies by platform: on x86 the tables will likely fit into the L2 cache and be fast, while on others they won't and the cache misses will be a PITA. So my 3rd version was without a lookup table, doing the calculation for every pixel: (19595 * R + 38469 * G + 7471 * B) >> 16.
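A minimal sketch of that table-free fixed-point conversion, using the constants exactly as quoted above. The names `luma_fixed` and `rgb_to_gray` and the 0xBBGGRR packing are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit fixed-point luminance of one pixel.
 * Assumes the pixel is packed as 0x00BBGGRR (red in the low byte). */
static uint8_t luma_fixed(uint32_t color)
{
    const uint32_t r = color & 0xff;
    const uint32_t g = (color >> 8) & 0xff;
    const uint32_t b = (color >> 16) & 0xff;

    /* Each weight is scaled by 65536; the final shift drops the
     * fractional bits (truncation, no rounding). */
    return (uint8_t)((19595u * r + 38469u * g + 7471u * b) >> 16);
}

/* Convert `count` packed pixels to 8-bit grayscale, one by one. */
static void rgb_to_gray(const uint32_t *src, uint8_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = luma_fixed(src[i]);
}
```

Note that these three constants add up to 65535 rather than 65536, so with plain truncation pure white (0xffffff) comes out as 254, not 255. That is close enough for a rough face match.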

Then my experience with this kind of software told me that all those multiplications would be expensive. Mulling it over during my flight, I came up with a series that uses only shifts to approximate the result. It is not as close as the fixed-point implementation above, but it is good enough for my purposes:
        /* approximate each weight by the nearest power-of-two fraction:
         *     0.29900 -> 1/4 = 0.250
         *     0.58700 -> 1/2 = 0.500
         *     0.11400 -> 1/8 = 0.125
         *     0.99999 ->     = 0.875 (sum)
         *
         * Series to bring the sum near 1.0:
         *     0.875 * (1 + 1/8 + 1/64) = 0.998046875
         *
         * color is packed as 0xBBGGRR; each component is moved to 16.16
         * fixed point (<< 16) and scaled by its weight in a single shift:
         * r = 0x0000ff, (((r >>  0) << 16) >> 2) = r << 14
         * g = 0x00ff00, (((g >>  8) << 16) >> 1) = g <<  7
         * b = 0xff0000, (((b >> 16) << 16) >> 3) = b >>  3
         */
        const unsigned int r = (color & 0x0000ff) << 14;
        const unsigned int g = (color & 0x00ff00) << 7;
        const unsigned int b = (color & 0xff0000) >> 3;
        const unsigned int c = (r + g + b);
        *dst = (c + (c >> 3) + (c >> 6)) >> 16;

But was this last version faster? What is your guess for the fastest version? I can tell you the FPU version was slower by a 6-7x margin, and that the one I expected to be the fastest wasn't!
2 comments
 
I did not do any SIMD and I compiled with -O0, yet the fixed-point version was the fastest alternative! Indeed, it was faster to multiply 3 times, sum and then shift than to just sum and divide by 3!

I hoped the shift-only version (the last one) would be faster than the one that multiplies, but it isn't on x86. Maybe on ARM, but I have nothing at hand to test it.

Gustavo Barbieri

While I like the web services that pop up on the internet now and then, and I use them extensively, for some kinds of data I fear these services will vanish in the future and let me down. We all know this has happened several times in the past: things like "multiply" and "orkut" were growing and then disappeared.

So I keep my own services for some of those, like my blog and photo gallery. In the blog I write some tech articles, and the photo gallery is just a backup of photos I want to keep, as I usually post them to Flickr and Facebook as well. In both cases they are only updated by me or family members, so new content is rarely added.

The software I was using was http://wordpress.org for the blog and http://galleryproject.org/ for the photos. They are based on PHP and supported by most web hosting providers. While they are good for more complex and dynamic uses, for me they are a source of constant updates and of attempts to breach server security. The net result is that they cause me more work than they save.

Since I'm moving my servers from shared dreamhost.com hosting to a private Amazon AWS server (thanks to +Osvaldo Santana Neto for the hint about a local data center in São Paulo), I also decided to simplify these services and replace the dynamic stuff with pre-generated static files.

With some effort I managed to generate simple HTML pages for each post of my blog. Then I wrote a Python script that parses them and generates index.html and some JSON files with the archive, tags and categories. With some JavaScript I could add back the dynamic behavior, but this time on the client's machine, which saves me from server load and security issues. The result is that http://blog.gustavobarbieri.com.br/ is now much more server and client friendly, consumes less bandwidth and is easier to cache on both sides. To add new data I just write a new HTML file and run the script to parse it and update the relevant files, such as indexes, categories, tags, archives and feed. No login, no PHP, no attempts to breach security and no brute-force login attempts polluting my httpd logs.

But the worst part was the gallery, as it was super slow. Due to its horrible upload system I already used rsync to upload files to the server and then processed them using "add files from server". The thumbnail generation would often fail, the PHP execution would be aborted and so on. Of course the thumbnailing is always implemented in the worst possible way, usually by calling a huge tool like imagemagick or netpbm for each file and desired size. These tools are great, but for simple tasks such as generating a smaller version of an image you don't need them.

More than being slow to add new pictures, gallery is slow to navigate. The UI feels outdated and you constantly need to load new HTML to go to the next page. Google showed me that some projects provide JavaScript galleries based on JSON information, which are completely static on the server and much more dynamic for the user, as they are more responsive. See http://galleria.io/ for one good example.

That left me with the thumbnail generation and creating the navigation between albums. While there are some tools such as http://sigal.saimon.org/ and https://github.com/wavexx/fgallery, they did not provide all I needed in terms of speed (sigal, fgallery) or multiple albums (fgallery). However, they gave me hope and good ideas (like fgallery's idea of centering cropped photos based on face detection). They were also quite painful to get running on a "stable" server: my AWS server runs CentOS 6.5, where the newest software may be from 1999 :-P.

One of the largest sources of slowness in these tools is that they are naive: they open the actual image at its full size, scale it down to the desired target size, then repeat the whole process if there are multiple sizes of a given image. Even for the average programmer it should be clear that repeating an expensive process over and over again (ie: open file, read headers) is not good. But if you know graphics and the JPEG standard, you know you don't need to load all of the image's pixels if a reduction close to your target would do. JPEG stores the image as DCT-coded blocks, and a decoder such as libjpeg can reconstruct each block directly at 1/2, 1/4 or 1/8 of its size. Say you want the image scaled to 1:4: the decoder produces each block at that reduced size and skips the pixels it would otherwise have to compute. The image arrives already scaled for you, saving disk reads, memory and the CPU cycles needed to scale many more pixels than you need.
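As a rough sketch of the first step, the helper below picks the largest reduction a JPEG decoder could apply while still leaving at least the target size for the final, cheaper rescale. The function name `pick_scale_denom` is hypothetical, and this is not necessarily how epeg chooses its factor. With libjpeg the result would typically be assigned to the `scale_denom` field of `struct jpeg_decompress_struct` (with `scale_num` set to 1) before calling `jpeg_start_decompress()`.

```c
/* Hypothetical helper: choose the largest power-of-two reduction
 * (1, 2, 4 or 8) such that the decoded image is still at least as
 * large as the requested target in both dimensions. */
static unsigned int pick_scale_denom(unsigned int src_w, unsigned int src_h,
                                     unsigned int dst_w, unsigned int dst_h)
{
    unsigned int denom = 1;

    /* A zero-sized target makes no sense; decode at full size. */
    if (dst_w == 0 || dst_h == 0)
        return 1;

    /* Keep halving while the next reduction still covers the target.
     * Integer division rounds down, so the choice is conservative. */
    while (denom < 8 &&
           src_w / (denom * 2) >= dst_w &&
           src_h / (denom * 2) >= dst_h)
        denom *= 2;

    return denom;
}
```

For a 4000x3000 photo and a 1000x750 thumbnail this returns 4, so the decoder hands over a 1000x750 image and no further scaling is needed at all.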

Luckily I know +Carsten Haitzler and remembered that some years ago he wrote http://svn.enlightenment.org/svn/e/OLD/epeg/, which does very efficient JPEG thumbnailing. The software was deprecated as its features were incorporated directly into the Evas JPEG image loader, but it still works perfectly, and if matched with libjpeg-turbo it will run even faster. Bonus points because it is very small and only depends on libjpeg, which makes it great for a "stable" server environment like CentOS 6.5.

As I have been doing computer graphics related work for the past 11 years, I felt obligated to do something better, so I decided to create new software: egg (efficient gallery generator; it starts with "e" as a tribute to the Enlightenment project), which uses epeg to do its work. I should publish it soon; it will ship as a single binary that generates things the way I need (but should be usable by others). Later on I plan to add support for PNG and raw images, as they are also part of my library, and for video thumbnailing (likely using libavcodec).

My goal with egg is to be efficient in every possible way, like avoiding useless memory allocations, using efficient directory walking such as openat(2)/fstatat(2)/mkdirat(2), instructing the kernel on the usage pattern of file and memory with posix_fadvise(3)/posix_madvise(3), using CPU vector instructions and so on.
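To make the directory-walking part concrete, here is a minimal sketch, assuming a Linux/glibc system. The function name `count_regular_files` is hypothetical and this is not egg's actual code; it only shows `fstatat(2)`, `openat(2)` and `posix_fadvise(3)` working relative to an already-open directory descriptor, so no path strings need to be built.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count the regular files directly inside the
 * directory referred to by `dirfd` that can be opened for reading.
 * Returns -1 on error. The caller keeps ownership of `dirfd`. */
static long count_regular_files(int dirfd)
{
    long count = 0;
    struct dirent *de;
    DIR *dir;
    int dupfd = dup(dirfd); /* fdopendir() takes over the descriptor */

    if (dupfd < 0)
        return -1;
    dir = fdopendir(dupfd);
    if (!dir) {
        close(dupfd);
        return -1;
    }

    while ((de = readdir(dir)) != NULL) {
        struct stat st;
        int fd;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        /* stat relative to dirfd; symlinks are not followed */
        if (fstatat(dirfd, de->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
            continue;
        if (!S_ISREG(st.st_mode))
            continue;

        fd = openat(dirfd, de->d_name, O_RDONLY);
        if (fd < 0)
            continue;
        /* Hint that the file will be read once, front to back
         * (length 0 means "to the end of the file"). */
        posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
        /* ... a real tool would decode the image here ... */
        close(fd);
        count++;
    }

    closedir(dir);
    return count;
}
```

Recursing into subdirectories would follow the same pattern: `openat()` the child with `O_DIRECTORY` and call the function again on the new descriptor.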

The output will be only images and JSON with extra information which can be converted to something else at client side (ie: to use with http://galleria.io) or server side.

Note: googling for epeg I found there are some bindings for it, so you can use it from your own server infrastructure: Perl (https://github.com/tokuhirom/image-epeg), ObjC (http://lists.apple.com/archives/cocoa-dev/2004/Jan/msg00955.html). It should be easy to add Python or PHP bindings.
3 comments
 
Oh, I forgot the trick for efm!
Gustavo Barbieri

I am currently implementing GSM ARFCN range encoding and I do this by writing the algorithm and a test application. Somehow my test application ended in a segmentation fault after all tests ran. The f...
2 comments
 
My first option has always been to run the tests with LD_PRELOAD=/usr/lib/libefence.so. It doesn't have the same granularity as mudflap, but you can try it without recompiling the code.
Have him in circles
767 people
Michael Bouchaud's profile photo
Marccel Balance's profile photo
Fernando Grassi de Oliveira's profile photo
Michael Heath's profile photo
Kasper Souren's profile photo
Joao Barbosa's profile photo
André Luiz Carvalho's profile photo
Daniel Monteiro (MontyOnTheRun)'s profile photo
Abu Fathan's profile photo

Gustavo Barbieri

Doing "egg" led me to investigate opencv for face detection, as done by fgallery (http://www.thregr.org/~wavexx/software/fgallery/)... what to say? It's not so hard to use, but man, to find a face in a string of pixels I need GObject and XML... not to mention the lack of personality with all the C-C++ back and forth.

So my software will not use opencv, as I disliked the project... too bad nobody did a sane alternative (or did I miss it?)
2 comments
 
That's why I'm not using fgallery; instead I'm writing egg. I'll start with the part that I know and like most: fast pixel banging. So far I was able to rescale my whole 8.9GB gallery in 4 minutes on an old Core2 Duo MacBook with a 5400rpm disk.

I'll skip the face detection for now and proceed to generate the JSON representing each album and the hierarchy. Then I'll integrate with galleria.io (JS/CSS); I have a modified theme that works great on desktop and on my iPhone. After that I'll resume working on the fancy bits, such as face detection for better cropping.

Gustavo Barbieri

Since it seems g+ is just for tech stuff, I'll try to post some of the stuff I'm working on that I haven't had the motivation to write a blog post about yet. Posts to follow this one.
Work
Occupation
Software Architect & Owner at ProFUSION
Employment
  • Intel
    Manager, 2013 - present
  • ProFUSION
    Owner, 2008 - present
  • INdT
    Software Engineer, 2006 - 2008
  • IBM
    Software Engineer (Intern), 2003 - 2005
Places
Currently
Valinhos-SP, Brazil
Previously
Campinas-SP, Brazil - Sertãozinho-SP, Brazil - Recife-PE, Brazil - Paulínia-SP, Brazil
Story
Introduction
Computer engineer that loves free and open source software.

Owner of ProFUSION, a company focused on embedded systems. Areas of expertise include graphics, multimedia, connectivity and other infrastructure blocks.
Education
  • UNICAMP
    2001 - 2005
  • COC
    1997 - 1998
  • Liceu Albert Sabin
    1999 - 2000
  • SEMAR
    1989 - 1996
Basic Information
Gender
Male
Gustavo Barbieri's +1's are the things they like, agree with, or want to recommend.
Video Game é bom para relaxar |
blog.drpepper.com.br

Careful, miss... you'll get caught up in it too! =X

Um prático indicador para avaliação de fundos
mercadoineficiente.wordpress.com

The investment and fund evaluation market is made up of a series of performance indicators, among which stand out as the mo

billiob's blog - Terminology at the EFL Dev Day 2013
billiob.net

Slides from my talk about terminology at the EFL Dev Day 2013

Optimizing hash table with kmod as testbed | Politreco
www.politreco.com

One thing that caught my interest lately was the implementation of hash tables, particularly the algorithms we are currently using for calcu

Welcome to Chromium's Ozone-Wayland
vignatti.com

The following message was sent out this morning -- I'm copying it here and attaching a cute screenshot of my desktop :) --- Ozone is a set o

Login
phab.enlightenment.org

Login or Register with Facebook. Login or register for Phabricator using your Facebook account. Login or Register with Facebook ». Login or

A Double Dose Of The W
e19releasemanager.wordpress.com

As some of you may have gathered, there's been a lot of work happening on E19 in recent times. I'd say that the amount of work happening on

Google is killing Free Software — Swfblag
blogs.gnome.org

I'm not sure I should presume intent because of Hanlon's razor, but a lot of smart people concerned about Free Software work at Goog

WulffMorgenthaler.com – Daily strip 09.05.2012
feedproxy.google.com

Entertainment - Since 2002. Wulff & Morgenthaler's Personal humoristic social commentary on life, nostalgia and the World in general

xkcd: Share Buttons
xkcd.com

... Perry Bible Fellowship, Questionable Content, Buttercup Festival. Warning: this comic occasionally contains strong language (whic

Pawned | Fredo and Pidjin. The Webcomic.
www.pidjin.net

World's funniest webcomic by Eugen Erhan & Tudor Muscalu. Shop · Contact · Fredo and Pid'jin · Prev Next. Pawned. Prev Next. EPISODE SYNOPSI

Dilbert comic strip for 03/17/2012 from the official Dilbert comic strip...
feedproxy.google.com

The Official Dilbert Website featuring Scott Adams Dilbert strips, animation, mashups and more starring Dilbert, Dogbert, Wally, The Pointy

New EFL release cycle 1.0/1.2/1.6 ALPHA
enlightenment.org

Mar 24, 2012 at 10:00 AM. Carsten Haitzler - Mar 24, 2012 at 10:00 AM. We'd like to announce a new release cycle alpha release of severa

Enlightenment
plus.google.com

Enlightenment Window Manager

PHD Comics: Academic Homepage
www.phdcomics.com

Link to Piled Higher and Deeper

Leandro Pereira - Presenting EasyUI
tia.mat.br

Presenting EasyUI. Introduction. I've been working at ProFUSION on a project called EasyUI for the past few months. This library is based on

ProFUSION | PROJETO PING
profusion.mobi

ProFUSION, a company specialized in developing embedded software on the Linux platform, started the Ping Project at PUCC in May

ProFUSION | EDBus – EFL D-Bus wrapper
profusion.mobi

As I have wrote in a previous post, I will talk about the newest EFL library: EDBus. EDBus is a D-Bus wrapper, that provides easy access to

ProFUSION | ProFUSION is part of the GENIVI alliance
profusion.mobi

GENIVI is a non-profit industry alliance committed to drive the broad adoption of an In-Vehicle Infotainment (IVI) open-source development p

ProFUSION | First EFL Developer Day
profusion.mobi

The first EFL Developer Day will happen during Linuxcon Europe on November 5, 2012 in Barcelona, Spain at the same location. Enlightenment i