
Flickr Interesting ISOs with Rflickr & rCharts + slidify

The information available from the Flickr API is incredibly rich. The Atlantic article "This is the World on Flickr" motivated me to open up R and do some analysis on the Flickr Explore list. As you might expect, I'll be using my new favorite tools, rCharts and slidify, and I will add one I have not mentioned before: Rflickr.

ISO Speed Popularity

I have always wondered what ISO speeds occur most frequently on Explore. I never imagined that I could answer my question with R. As usual, we will start by loading all the necessary packages.

# analyze EXIF data for interesting list
library(lubridate)
#if you have not installed Rflickr
#install.packages("Rflickr", repos = "http://www.omegahat.org/R", type="source")
library(Rflickr)
data(FlickrFunctions)

If you do not have a free noncommercial API key, apply for one on Flickr. Trust me, it is very easy, so don't let this be an excuse not to try it out. I put mine in a little secrets.Rdata file that I will load with the following code before starting a session.

load("secrets.Rdata")

tok = authenticate(api_key, secret)

s <- flickrSession(secret, tok, api_key)
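If you have not built a secrets file before, here is a minimal sketch of how one like mine could be created. The object names api_key and secret match what the code above expects; the two strings are placeholders, not real credentials.

```r
# one-time setup: keep the credentials out of the analysis script
# both strings are placeholders; paste in your own key and secret
api_key <- "YOUR_API_KEY"
secret  <- "YOUR_API_SECRET"
save(api_key, secret, file = "secrets.Rdata")

# later sessions restore both objects by name
load("secrets.Rdata")
```

Keeping this file out of version control is the whole point of the exercise.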

Since this is more a proof of concept than an ambitious scientific study, I'll just look back three days.

#use this to specify how many days to analyze
daysAnalyze = 3
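To see which dates that setting translates to, here is a small sketch in base R. It assumes the API wants dates as YYYY-MM-DD strings, which is what as.character() on a Date produces in the loop further down; the dates variable is only for illustration.

```r
# list the dates a three-day look-back covers, most recent first
daysAnalyze <- 3
dates <- format(Sys.Date() - seq_len(daysAnalyze))  # "YYYY-MM-DD" strings
dates
```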

My code gets a little sloppy here, but it does work. Sorry for all the lapply calls. I hope my comments will help you understand each of the steps.

#initialize a data frame to collect the ISO counts for each day
df <- data.frame()
for(i in 1:daysAnalyze) {
  interesting <- s$flickr.interestingness.getList(date=as.character(today()-ddays(i)))
  print(today()-ddays(i))  #debug print what day we are getting
  print(length(interesting)) #debug print the count of photos
  #for each photo try to get the exif information
  #Flickr allows users to block EXIF
  #so use try to bypass error
  exifData <- lapply( 
    seq_along(interesting),
    function(x){
      exif <- try(s$flickr.photos.getExif(interesting[[x]]["id"]))
      if (inherits(exif, "try-error")) exif = NA
      return(exif)
    }
  )

  #now that we have a list of EXIF for each photo
  #that allows it
  #use another lapply
  #to extract the useful information
  exifData.df <- lapply(
    exifData,
    function(x){
      if (!all(is.na(x))) {
        exif.df <- do.call(rbind,lapply(
          1:(length(x)-1),
          function(y) {
            df <- data.frame(
              t(
                data.frame(
                  x[[y]][".attrs"]
                )
              ),
              x[[y]]["raw"],
              stringsAsFactors = FALSE
            )
            rownames(df)<-y
            if("clean" %in% names(x[[y]])) {
              df$clean = x[[y]]["clean"]
            } else df$clean = NA
            return(as.vector(df))
          })
        )
      } else exif.df <- rep(NA,5)
      return(exif.df)
    }
  )

  #one more lapply to just get the ISO speed if available
  isospeeds <- unlist(lapply(
    exifData.df,
    function(x){
      if(!all(is.na(x))) {
        iso = x[which(x[,"label"]=="ISO Speed"),"raw"]
      } else iso = NA
      return(as.numeric(iso))
    }
  ))

  #make one data.frame with a Frequency(count) of ISO speeds
  df <- rbind(
    df,
    data.frame(
      as.character(today()-ddays(i)),
      table(isospeeds)
    )
  )
}

[1] "2013-10-22"
[1] 101
[1] "2013-10-21"
[1] 101
[1] "2013-10-20"
[1] 101
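The try() guard in the loop is easier to see in isolation. This is only a sketch: fake_get_exif is a hypothetical stand-in for s$flickr.photos.getExif, and the rule that even ids fail is invented so the example runs without an API key.

```r
# fake_get_exif is a hypothetical stand-in for s$flickr.photos.getExif
fake_get_exif <- function(id) {
  if (id %% 2 == 0) stop("Permission denied")  # pretend even ids block EXIF
  list(list(.attrs = c(label = "ISO Speed"), raw = "200"))
}

# same pattern as the loop: failed calls become NA instead of stopping the run
results <- lapply(1:4, function(id) {
  exif <- try(fake_get_exif(id), silent = TRUE)
  if (inherits(exif, "try-error")) NA else exif
})

# which photos came back without EXIF?
blocked <- vapply(results, function(r) identical(r, NA), logical(1))
blocked
```

Passing silent = TRUE keeps the error messages out of the console, which might be preferable when many photos block EXIF.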

#name columns for our df data.frame
colnames(df) <- c("date","iso","Freq")
#get rid of factors
#thanks http://stackoverflow.com/questions/3418128/how-to-convert-a-factor-to-an-integer-numeric-without-a-loss-of-information
df$iso <- as.character(df$iso)
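If the counting step is hard to follow inside the big loop, here is a toy version of it. The ISO values below are made up for two hypothetical days and are not Flickr data; toy and toy.df are names I introduce only for this sketch.

```r
# made-up ISO readings for two hypothetical days; NA marks a photo without EXIF
toy <- list(
  "2013-10-22" = c(100, 100, 200, NA, 400),
  "2013-10-21" = c(100, 800, 800, NA, NA)
)

toy.df <- data.frame()
for (d in names(toy)) {
  # table() drops NA by default, so photos without EXIF vanish from the counts
  toy.df <- rbind(toy.df, data.frame(d, table(toy[[d]])))
}
colnames(toy.df) <- c("date", "iso", "Freq")
toy.df$iso <- as.character(toy.df$iso)
toy.df
```

Each row of the result is one date and ISO combination with its count, which is the long format that dPlot expects.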

Plot Our Results

Now that we have a data.frame with ISO speeds, let's use rCharts to analyze it. I will use dimplejs.

# Thanks to http://tradeblotter.wordpress.com/
# Qualitative color schemes by Paul Tol
tol4qualitative = c("#4477AA", "#117733", "#DDCC77", "#CC6677")

library(rCharts)
#chart 1: bar chart of counts by iso, colored by date
dIso <- dPlot(
  y = "Freq",
  x = "iso",
  groups = "date",
  data = df,
  type = "bar",
  height = 400,
  width =600
)
dIso$xAxis( orderRule = "iso" )
dIso$defaultColors(
  #"#! d3.scale.category10() !#", 
  tol4qualitative,
  replace = TRUE
)
dIso

#chart 2: bars grouped by iso, then by date within each iso
dIso <- dPlot(
  y = "Freq",
  x = c("iso","date"),
  groups = "date",
  data = df,
  type = "bar",
  height = 400,
  width =600
)
dIso$xAxis( orderRule = "iso" )
dIso$defaultColors(
  #"#! d3.scale.category10() !#", 
  tol4qualitative,
  replace = TRUE
)
dIso

#chart 3: bars grouped by date, then by iso within each date
dIso <- dPlot(
  y = "Freq",
  x = c("date","iso"),
  groups = "date",
  data = df,
  type = "bar",
  height = 400,
  width =600
)
dIso$xAxis( grouporderRule = "iso" )
dIso$defaultColors(
  #"#! d3.scale.category10() !#", 
  tol4qualitative,
  replace = TRUE
)
dIso

#chart 4: line chart of counts by iso, one line per date
dIso <- dPlot(
  y = "Freq",
  x = "iso",
  groups = "date",
  data = df,
  type = "line",
  height = 400,
  width =600
)
dIso$xAxis( orderRule = "iso" )
dIso$defaultColors(
  #"#! d3.scale.category10() !#", 
  tol4qualitative,
  replace = TRUE
)
dIso

#chart 5: area chart grouped by date, then by iso within each date
dIso <- dPlot(
  y = "Freq",
  x = c("date","iso"),
  groups = "date",
  data = df,
  type = "area",
  height = 400,
  width =600
)
dIso$xAxis( grouporderRule = "iso" )
dIso$defaultColors(
  #"#! d3.scale.category10() !#", 
  tol4qualitative,
  replace = TRUE
)
dIso

As you might already know, I love R, especially with rCharts and slidify.