
Flickr Interesting ISOs with httr & rCharts + slidify

I have rewritten this old post to use Hadley Wickham's httr instead of Rflickr for two reasons:

  1. Rflickr is not working for me anymore
  2. httr is a very helpful package for navigating the "what was scary to me" world of http and oauth

In addition to the changes above, I will also demonstrate use of the pipeR package from Kun Ren, who has been quite prolific lately. I feel pretty strongly that I will be rewriting this post one more time in the near future to employ more of his rlist package.

I will incorporate the original text into the content below.

The information available from the Flickr API is incredibly rich. The Atlantic article "This is the World on Flickr" motivated me to open up R and do some analysis on the Flickr Explore list. As you might expect, I'll be using my new favorite tools, rCharts and slidify.

ISO Speed Popularity

I have always wondered what ISO speeds occur most frequently on Explore. I never imagined that I could answer my question with R. As usual, we will start by loading all the necessary packages.

Load Our Packages
# analyze EXIF data for interesting list
library(httr)
#updated to use version 0.4 devtools::install_github("renkun-ken/pipeR/0.4")
library(pipeR)  
library(jsonlite)
# rlist provides list.map and list.table, which are used further down
library(rlist)

If you do not have a free noncommercial API key, apply for one here. Trust me, it is very easy, so don't let this be an excuse not to try it out. I put mine in a little secrets.Rdata file that I will load with the following code before starting a session.
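For anyone who has not built such a file before, here is a minimal sketch of how secrets.Rdata could be created. The object names api_key and secret match what the authorization code expects; the values are placeholders, and the temporary path is only an assumption so the sketch leaves nothing behind.

```r
# Placeholder credentials: replace with the key and secret Flickr issues you
api_key <- "YOUR_API_KEY"
secret  <- "YOUR_API_SECRET"

# The post loads "secrets.Rdata" from the working directory;
# a temporary file is used here so the sketch has no side effects
secrets_file <- file.path(tempdir(), "secrets.Rdata")
save(api_key, secret, file = secrets_file)

# load() restores both objects under their original names
restored <- new.env()
load(secrets_file, envir = restored)
ls(restored)
```

Keeping the file out of version control is the whole point of the exercise, so it is worth adding it to .gitignore.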

Authorize Our flickr
load("secrets.Rdata")

flickr.app <- oauth_app("r to flickr",api_key,secret)
flickr.endpoint <- oauth_endpoint(
  request = "https://www.flickr.com/services/oauth/request_token"
  , authorize = "https://www.flickr.com/services/oauth/authorize"
  , access = "https://www.flickr.com/services/oauth/access_token"
)

tok <- oauth1.0_token(
  flickr.endpoint
  , flickr.app
  , cache = F
)

Since this is more a proof of concept than an ambitious scientific study, I'll just look back three days.

#use this to specify how many days to analyze
daysAnalyze = 3

My code gets a little sloppy here, but it does work. Originally I was forced to use a whole lot of lapply, but httr plays nicely with JSON, and jsonlite converts it to traditional R data structures. I hope my comments will help you understand each of the steps.

Get the Interesting

This will give us a data frame, interesting, with about 100 photos for each day. We could get up to 500 per day if we are ambitious with the per_page API option.
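As a rough sketch of how the per_page option could be supplied, the parameters can be collected in a list and handed to the query argument of httr's GET instead of being pasted into the URL with sprintf. The helper names interesting_query and get_interesting are hypothetical and are not part of httr or the Flickr API; the wrapper is defined but not called, since it needs a real key and network access.

```r
# Build the query parameters for flickr.interestingness.getList.
# per_page is the Flickr option mentioned above (default 100, maximum 500).
interesting_query <- function(api_key, day, per_page = 100) {
  stopifnot(per_page >= 1, per_page <= 500)
  list(
    method = "flickr.interestingness.getList",
    api_key = api_key,
    date = format(day, "%Y-%m-%d"),
    per_page = per_page,
    format = "json",
    nojsoncallback = 1
  )
}

# Hypothetical wrapper: httr::GET() encodes the list as the query string.
# It is only defined here, not run.
get_interesting <- function(api_key, day, per_page = 100) {
  httr::GET(
    url = "https://api.flickr.com/services/rest/",
    query = interesting_query(api_key, day, per_page)
  )
}

# Inspect the parameters for one day without touching the network
q <- interesting_query("YOUR_API_KEY", as.Date("2014-05-01"), per_page = 500)
str(q)
```

Letting GET assemble the query string also takes care of URL encoding, which sprintf does not.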

#get a list and then make it a data frame 
interesting <- lapply(1:daysAnalyze, function(i){
    GET(url=sprintf(
      "https://api.flickr.com/services/rest/?method=flickr.interestingness.getList&api_key=%s&date=%s&format=json&nojsoncallback=1"
      # the URL has two %s placeholders: the API key and the date
      , api_key
      , format( Sys.Date() - i, "%Y-%m-%d")
      )
    ) %>>%
      content( as = "text" ) %>>%
      jsonlite::fromJSON () %>>%
      ( .$photos$photo ) %>>%
      ( data.frame(
        date = format( Sys.Date() - i, "%Y-%m-%d")
        ,.
        ,stringsAsFactors=F
      )) %>>%
      return
  }
) %>>%
  # combine all the days into a data frame
  ( do.call(rbind, .) )

Get EXIF for the Interesting

Now that we have some interesting photos, we can use flickr.photos.getExif for all sorts of meta information embedded in the EXIF.

  #for each photo try to get the exif information
  #Flickr allows users to block EXIF
  #so use try to bypass error
  #in case you are wondering, yes we could use dplyr here
  exifData <- lapply(
    1:nrow(interesting)
    ,function(photo){  # now we will process each photo
      exif <-
        GET(url=sprintf(
          "https://api.flickr.com/services/rest/?method=flickr.photos.getExif&api_key=%s&photo_id=%s&secret=%s&format=json&nojsoncallback=1"
          , api_key
          , interesting[photo,"id"]
          , interesting[photo,"secret"]
          )
        ) %>>%
          content( as = "text" ) %>>%
          jsonlite::fromJSON ()
    }
  )# %>>% could chain it, but want to keep exifData

  #now that we have a list of EXIF for each photo
  #use list.map from rlist (in place of another lapply)
  #to extract the useful information
  iso <- exifData %>>% 
    # some photos will not have exif if their owners disable it
    # and the api call will give us a stat "fail" instead of "ok"
    list.map(
      f(p,index) -> {
        ifelse (
          p$stat == "ok"
          , p$photo$exif %>>%
              (.[which(.[,"label"]=="ISO Speed"),"raw"])  %>>% 
              as.numeric 
          , NA
        ) %>>%
          {
            data.frame(
              interesting[ index, c( "date", "id" )]
              , "iso" = .
            )
          }
      }
    ) %>>%
    list.table( date, iso ) %>>%
    data.frame( stringsAsFactors = F)
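To make the "ok" versus "fail" handling easier to follow, here is a minimal base R sketch that runs on mock responses instead of live API calls. The layout of the mock objects (a stat field, and photo$exif with label and raw columns) is a simplified assumption based on the fields the code above reads; real responses carry more fields and may nest raw differently. The helper extract_iso is hypothetical.

```r
# Mock parsed responses (made-up values, simplified structure)
ok_photo <- list(
  stat = "ok",
  photo = list(exif = data.frame(
    label = c("Make", "ISO Speed", "Aperture"),
    raw = c("Canon", "400", "2.8"),
    stringsAsFactors = FALSE
  ))
)
# the owner has blocked EXIF, so the call reports a failure
blocked_photo <- list(stat = "fail", message = "Permission denied")
# EXIF is available but carries no ISO entry
no_iso_photo <- list(
  stat = "ok",
  photo = list(exif = data.frame(
    label = "Make", raw = "Canon", stringsAsFactors = FALSE
  ))
)

# Return the ISO speed as a number, or NA when it is unavailable
extract_iso <- function(p) {
  if (!identical(p$stat, "ok")) return(NA_real_)
  exif <- p$photo$exif
  raw <- exif[exif$label == "ISO Speed", "raw"]
  if (length(raw) == 0) return(NA_real_)
  as.numeric(raw[1])
}

iso_values <- vapply(
  list(ok_photo, blocked_photo, no_iso_photo),
  extract_iso,
  numeric(1)
)
iso_values
```

Using vapply with numeric(1) guarantees exactly one value per photo, so the result lines up row for row with the photos it came from.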

Plot Our Results

Now that we have a data.frame with ISO speeds, let's use rCharts to analyze it. I will use dimplejs.
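The charts plot a column called Freq, which comes from tabulating the ISO values by date. A base R sketch of that counting step, on made-up readings and with table() standing in for the rlist call used in this post, could look like this:

```r
# Mock ISO readings for two days (made-up values, not Flickr data)
readings <- data.frame(
  date = c("2014-05-01", "2014-05-01", "2014-05-01",
           "2014-05-02", "2014-05-02"),
  iso = c(100, 100, 400, 100, 800),
  stringsAsFactors = FALSE
)

# table() counts each date/ISO pair;
# as.data.frame() names the count column Freq by default
iso_counts <- as.data.frame(
  table(date = readings$date, iso = readings$iso),
  stringsAsFactors = FALSE
)
iso_counts
```

Every date/ISO combination gets a row, including combinations that never occurred, which show up with a Freq of 0.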

# Thanks to http://tradeblotter.wordpress.com/
# Qualitative color schemes by Paul Tol
tol4qualitative <- c("#4477AA", "#117733", "#DDCC77", "#CC6677")

require(rCharts)
iso %>>% ( dPlot(
  y = "Freq",
  x = "iso",
  groups = "date",
  data = .,
  type = "bar",
  height = 400,
  width =600
  ,xAxis = list( orderRule = sort( .$iso ) )
  ,defaultColors = tol4qualitative
) ) %>>% ( .$show("inline") )
#  using {} instead of () for our enclosure
#  might be more understandable
iso %>>% {
  dPlot(
    y = "Freq",
    x = c("iso","date"),
    groups = "date",
    data = .,
    type = "bar",
    height = 400,
    width =600
    ,xAxis = list( orderRule = sort( .$iso ) )
    ,defaultColors = tol4qualitative
  )
} %>>% ( .$show("inline") )
iso %>>% ( dPlot(
  y = "Freq",
  x = c("date","iso"),
  groups = "date",
  data = .,
  type = "bar",
  height = 400,
  width =600
  ,xAxis = list( grouporderRule = sort( .$iso ) )
  ,defaultColors = tol4qualitative 
) )  %>>% ( .$show("inline") )
iso %>>% ( dPlot(
  y = "Freq",
  x = "iso",
  groups = "date",
  data = .,
  type = "line",
  height = 400,
  width =600
  ,xAxis = list( orderRule = sort( .$iso ) )
  ,defaultColors = tol4qualitative   
) )  %>>% ( .$show("inline") )
iso %>>%  ( dPlot(
  y = "Freq",
  x = c("date","iso"),
  groups = "date",
  data = .,
  type = "area",
  height = 400,
  width =600
  ,xAxis = list( grouporderRule = sort( .$iso ) )
  ,defaultColors = tol4qualitative 
) )  %>>% ( .$show("inline") )

As you might already know, I love R, especially with rCharts and slidify.

Now I can add that I love httr and pipeR.

Thanks to all those who contributed knowingly and unknowingly to this post.