Facebook API is good... but weird.

Note: This post is offered retrospectively. The work noted here was completed a week ago, so some actions may have been forgotten.


I wanted to include a calendar for tournaments on Command Deck. At the moment, there's no data storage on the page; it's all polled live. This means I have to rely on an external source of data for tournament scheduling. As there's no events page or anything for the application, I'll have to borrow the feed from other event lists.

The first problem, of course, is where do I poll for data? Facebook group events are the obvious route, but which group? As the Facebook application isn't requesting any specific permissions beyond a user token, it'll have to be a public group. I'm already a member of "X-Wing UK Events" and can see there are about 1,000 members and a reasonable supply of events listed. As this whole exercise is more about coding practice than usability, it seems like a good candidate. So, Facebook, what can you offer us developers?

Turns out, quite a lot, but never really exactly what you need. One of the most popular ways to gather information from Facebook is the Graph API. As Facebook puts it: "The primary way for apps to read and write to the Facebook social graph". There's an in-depth guide on using the API, but some of the nuance is apparently down to the developer to figure out.

Fortunately, there's an API explorer available to test responses. Throw requests at the API, and you'll get a bunch of JSON back, relevant to the request. Seems simple. What's interesting is that all requests are addressed by the object's unique ID, so a request URL is confusing to read. Notice in the image that simply performing a GET on the ID for the group returns the basic data for the group. Excellent...


We can get events by suffixing the request with either ?fields=events or /events. The method is slightly different, but the results are the same. We get a full JSON field of every event associated with the group, yes? Well... not quite.
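As a rough illustration of those two request forms, here is a minimal Python sketch that only builds the URLs. The helper names, the placeholder group ID and the token are all made up for the example, and it isn't the code from the application itself.

```python
from urllib.parse import urlencode

GRAPH_ROOT = "https://graph.facebook.com"


def events_edge_url(group_id, access_token):
    """Form 1: ask for the events edge directly, i.e. /{group-id}/events."""
    query = urlencode({"access_token": access_token})
    return f"{GRAPH_ROOT}/{group_id}/events?{query}"


def events_field_url(group_id, access_token):
    """Form 2: ask for the group object with its events field expanded."""
    query = urlencode({"fields": "events", "access_token": access_token})
    return f"{GRAPH_ROOT}/{group_id}?{query}"


if __name__ == "__main__":
    # "123" and "TOKEN" are placeholders, not a real group or token.
    print(events_edge_url("123", "TOKEN"))
    print(events_field_url("123", "TOKEN"))
```

As far as I can tell, the only practical difference is where the list sits in the response: at the top level for the first form, nested under an "events" key for the second.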



Before I go on, I want you to imagine a long, ranting post full of expletives and frustrations, with posts to Quora and Stack Overflow about the issue. As I'm writing this post after the fact, you only get the highlights, but essentially, no matter what parameters I was passing to the API, I wasn't getting future-dated events. I was getting expired events, and the "from" and "until" parameters seemed to be working. Oddly, this particular function of the graph isn't widely talked about. Evidently, there aren't many people polling specific group calendars and hitting issues like this. I wasn't going mad. There were definitely events coming up that I wanted to see in the response.

Fast-forward a few days, and I notice something that ultimately led me to a workable solution. 

First (Left) we have the Event Listings for the group (as taken today). Then see the calendar view (Right). These are both taken from the same group page at the same time. Notice there's only a single entry in the calendar view? That's also the most recent event listed in the JSON for the group events request. There's nothing wrong with the request, or how I'm polling the data. The data simply doesn't exist. None of the upcoming events for the group actually exist as group events. Evidently, when a group admin creates an event for a group, that event is considered linked to that group. However, if a group member shares an event hosted by another group, or user (including themselves), that event is listed in the group, but has no link to it. There is -nothing- available in the API to directly poll events listed on a group event page, if those events aren't actually owned by the group.

The "correct" way to do this, according to some, is to iterate through the events calendar of each group member, and poll their user page for relevant events. That's insane.

Here's what I did (Also insane):

Poll the group for its feed content. This opens up every single existing post ever made to the group. I'll let that sink in for a moment. The only way to reliably collate events shared to a group is to grab everything. I'm only interested in upcoming events, but I can't really filter the feed based on date, because events posted further in advance than the window being polled won't be picked up.
Fortunately, the API is quick, and is only returning text. As the application is hosted on Azure, and it's pulling JSON from Facebook, it should be really, really quick. You know, as quick as one can expect given the potential enormity of the task.

Polling the feed returns 2 objects (or rather, 1 object containing 2 child objects). The first is a collection of feed objects, which is basically the content and metadata of every post made to the group. The second is a pagination object. Each request returns X objects, as well as links to poll for the next or previous subset of returned data. This is actually really helpful! Facebook seamlessly handles the task of pagination without any input from the developer. All we need to do on our end is recursively work through the pages. That saves a huge headache, and makes each transaction small enough to handle. I imagine I'll be making better use of this later.
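To make the page-walking concrete, here is a minimal Python sketch. It assumes the response shape described above, with the posts under a "data" key and the pagination links under "paging" ("next" and "previous"). The function names are hypothetical, and it uses a plain loop rather than recursion so a very long feed can't exhaust the call stack.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch one page from the API and decode it (assumes a plain JSON body)."""
    with urlopen(url) as response:
        return json.load(response)


def iter_feed_posts(first_url, fetch=fetch_json):
    """Yield every post in the feed, following paging.next until it runs out."""
    url = first_url
    while url:
        page = fetch(url)
        for post in page.get("data", []):
            yield post
        # The last page has no "next" link, which ends the loop.
        url = page.get("paging", {}).get("next")
```

Passing the fetch function in as a parameter means the loop can be tried against canned pages without touching the network.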


Obviously, most posts to a group are NOT event listings (actually, they probably are for this group, but the statement still stands). However, every post relating to a shared link contains a link field. Not just a link text, but an actual JSON field titled "link". As every Facebook event is reached through "https://www.facebook.com/events/", it becomes incredibly trivial to filter the events from the other posts. Check presence of link field, if found, check link data for relevant prefix. Simple!
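That check fits in a few lines. This is a sketch in Python rather than the application's own code. It assumes each post is a dictionary decoded from the feed JSON, and it skips repeated links, which is my own addition for cases where the same event is shared more than once.

```python
EVENT_PREFIX = "https://www.facebook.com/events/"


def event_links(posts):
    """Return the distinct event links found in a list of feed posts."""
    links = []
    for post in posts:
        link = post.get("link")  # many posts have no "link" field at all
        if link and link.startswith(EVENT_PREFIX) and link not in links:
            links.append(link)
    return links
```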

Finally, it's worth noting that event updates don't contain the link field, but that's OK. The link to the event remains the same, and we've already got that, so we'll get whatever the updated details are when we poll the event. Here's the code I've used to collate the events...


Here's what it's doing in case it's not clear:
1. Search through all posts to the group. Use a Do/While loop to work through every page.
2. Keep a List of all the posts with a "link" field that points to anything at facebook.com/events
3. Work through each detected link, cleanse the URL into a link we can throw at the API
4. Get details from the API about each relevant event. We want location, time, name, and attendees.
5. Drop any event that doesn't provide a start or end time. We then drop expired events, keeping only events that have a start or end time advertised as being today, or later.
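Steps 3 and 5 can be sketched roughly as follows, in Python and not the code from the application. It assumes the event ID is the run of digits after /events/ in the link, that events carry "start_time" and "end_time" fields, and that timestamps look like 2015-06-13T10:00:00+0100. Any other format is treated as missing. All function names are made up for the example.

```python
import re
from datetime import date, datetime

EVENT_ID_PATTERN = re.compile(r"facebook\.com/events/(\d+)")
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"  # assumed timestamp format


def event_id_from_link(link):
    """Step 3: reduce a shared event URL to the ID we can ask the API about."""
    match = EVENT_ID_PATTERN.search(link)
    return match.group(1) if match else None


def parse_time(value):
    """Return a datetime, or None if the value is missing or unparseable."""
    if not value:
        return None
    try:
        return datetime.strptime(value, TIME_FORMAT)
    except ValueError:
        return None


def is_upcoming(event, today):
    """Step 5: keep events whose start or end date is today or later."""
    times = (parse_time(event.get(key)) for key in ("start_time", "end_time"))
    return any(t is not None and t.date() >= today for t in times)


def upcoming_events(events, today=None):
    """Drop events with no usable times, then drop the expired ones."""
    today = today or date.today()
    return [event for event in events if is_upcoming(event, today)]
```

An event with neither time is dropped automatically, because there is nothing to compare against today's date.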

The result we get matches exactly (if you'll excuse the sorting) the list of events as seen on the page.
Ignore the presentation. This is currently just a proof of concept, i.e. seeing a list of events meets the intention so far.

