You gotta be kidding me Facebook...

(note: this is a rant... but I need to scream out loud for a second :))

So, as some of you know, I'm currently doing a Facebook study of how the many changes have actually affected brands' organic reach and impact over time.

It's a very simple study in concept. I'm looking at each post, over time, in relation to how many people had liked that page at the moment it was published. Then I calculate ratios for things like reach and engagement.

In reality though, Facebook has made outputting data from their insights dashboard the most complicated thing ever.

First of all, Facebook limits you to exporting only 180 days at a time. The problem with this, of course, is that it sounds like six months, but it isn't. Between July 1 and Dec 31 there are 183 days ... three too many. So exporting 180 days suddenly becomes rather complicated. Why, Facebook?
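If you still want contiguous coverage, you end up slicing your date range into chunks of at most 180 days before exporting. A minimal sketch of that slicing (the 180-day cap comes from the export limit above; the helper function and the year used are my own, just for illustration):

```python
from datetime import date, timedelta

def export_windows(start, end, max_days=180):
    """Split the inclusive range [start, end] into windows of at most max_days days."""
    windows = []
    cur = start
    while cur <= end:
        # Inclusive window: a 180-day window spans cur .. cur + 179 days.
        stop = min(cur + timedelta(days=max_days - 1), end)
        windows.append((cur, stop))
        cur = stop + timedelta(days=1)
    return windows

# July 1 - Dec 31 (2015 picked arbitrarily) doesn't fit in one export,
# so you get one full 180-day window plus a small leftover window.
print(export_windows(date(2015, 7, 1), date(2015, 12, 31)))
```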

Then, the amount of data is so big that I cannot import it into Apple Numbers, as it exceeds the maximum column limit. Okay, I admit, this is not Facebook's fault... but still. Why can't I just choose what I need? I'm dreaming about Google Analytics custom reports right now :)

Secondly, even if you delete the columns you don't need (manually) before importing, the amount of data is still so big that Numbers crashes (repeatedly). Again, not Facebook's fault... but custom reports... please!

Okay, so I gave up on Apple Numbers and decided to simply import the data directly into my own database server, from which I would do all the analysis I want.

This should be simple. From Facebook you get a comma-separated file. The concept of such a file is the simplest thing in the entire world: each row of data is on a separate line, and each field is separated by a comma. Split on those and you have a very simple import script.

Yeah.... right...

The first problem is that a Facebook post can contain both commas and newlines within the post text itself. So you get this:

24525,3252,"And, then there is this.

And this, that, and those.",share,5234,242

Three lines and way too many commas. So I wrote a script that could handle that. It detects which commas and line breaks are part of the actual post text and which aren't.
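For what it's worth, Python's standard csv module already understands quoted fields that contain commas and newlines, so this part may not need a hand-rolled detector. A minimal sketch against the three-line record above (the field values are copied from the example; everything else is mine):

```python
import csv
import io

# The troublesome record from above: a quoted field containing
# both commas and newlines.
raw = '24525,3252,"And, then there is this.\n\nAnd this, that, and those.",share,5234,242\n'

# csv.reader follows standard CSV quoting rules, so the three physical
# lines parse as one logical row with six fields.
rows = list(csv.reader(io.StringIO(raw)))
print(rows[0])
```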

Now I had my import script, so I clicked the button to import all my comma-separated files. And I was really happy... for about 2 seconds.

Remember how Facebook only allows you to export 180 days at a time? Well, it also turns out that the number of fields, as well as their placement, changes slightly between exports. And this despite the only difference between them being the date range.

The first file might look like this:

[id],[post],[type],[people],[impressions],[comments],[likes],[shared]

Simple, right? But the next file (same output) now looks like this:

[id],[post],[type],[people],[impressions],[ answers ],[comments],[likes],[shared]

And the third file looks like this:
[id],[post],[type],[people],[impressions],[ answers ],[comments],[ other ],[likes],[shared]

Hey.. what? Stop!!

Fourth file looks like this:
[id],[post],[type],[people],[impressions],[comments],[likes],[ other ],[shared]

Arrgghh... [now sobbing in a corner of my office]

Notice how the 'answers' field is gone (again), and the 'other' field has shifted position. And this happens inconsistently across all the files!!!

The only way to fix this is to manually adjust my import script on a per-file basis, to make sure I import the right data into the right fields in the database. Because if I don't, my analysis will be crap.
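One way around the per-file fiddling (assuming every export at least includes a header row) is to map columns by name rather than by position, e.g. with Python's csv.DictReader. The field names below are just the placeholders from the examples above; the helper itself is a sketch of mine, not the author's actual script:

```python
import csv

# The columns the analysis actually needs, by header name.
# Extra columns (like the on-again, off-again 'answers' and 'other')
# are simply ignored, wherever they happen to sit.
WANTED = ["id", "post", "type", "people", "impressions",
          "comments", "likes", "shared"]

def load_rows(path):
    """Yield dicts containing only the wanted fields from one export file."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # A column missing from this particular export becomes None.
            yield {name: row.get(name) for name in WANTED}
```

With this, the same loader works on every export regardless of how the columns have been reshuffled, and a missing column shows up as None instead of silently shifting data into the wrong field.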


My plan for this week was to import the data on Tuesday and then do the analysis on Thursday and Friday. Instead, I have spent the past two days just coding scripts to import the data.

This is now done, but damn Facebook. Total facepalm!