Learning about panel data in R

So far, reading data into the R environment has been simple (I’m using the wonderful RStudio, by the way). Install a little library, run about five lines of code and, boom, my PostgreSQL view is available as a data frame in R. Easy peasy.

library(RPostgreSQL)
library(plm)

## Load the PostgreSQL driver
drv <- dbDriver("PostgreSQL")

## Open a connection
con <- dbConnect(drv, host = "xxxxxxxx", dbname = "xxxxxxx", user = "xxxxxx", password = "xxxxxx")

## Read the whole contents of a table (here, a view) into a data frame
rs <- dbReadTable(con, "rlvw_cntry_year_aid_affected")

## Close the connection once the data is in memory
dbDisconnect(con)

More tricky is the whole panel data regression part. Panel data has two dimensions: a time dimension (in my case the years from about 1960 to 2008) and an “individuals” dimension, in my case countries. So I have aid-received data for all countries for each year in the set, which makes for a lot of observations overall.
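To make the two dimensions concrete, here is a minimal sketch of a panel in "long" format, using base R only. The country codes, years and aid figures are made up for illustration and are not taken from my actual view.

```r
# Toy panel in "long" format: one row per country-year pair.
# Country codes, years and aid values are invented for illustration.
panel <- expand.grid(
  country = c("AAA", "BBB", "CCC"),
  year    = 2000:2003,
  stringsAsFactors = FALSE
)
set.seed(1)
panel$aid <- round(runif(nrow(panel), min = 0, max = 100), 1)

# Size of each dimension
n_countries <- length(unique(panel$country))  # the "individuals" dimension
n_years     <- length(unique(panel$year))     # the time dimension

# A panel is balanced when every country appears exactly once in every year
is_balanced <- all(table(panel$country, panel$year) == 1)

head(panel)
```

The total number of observations is the number of countries times the number of years, which is why a panel covering decades gets large quickly.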

The hard part is using both dimensions when running a regression. In principle, the maths is not complicated. And in the statistics package I’m used to using, it’s straightforward (once you know how!): you tell the software which column represents your time dimension and which your individual dimension, and off you go. In R, I’m not sure yet how that stuff works, so it’s back to the reading board for me as I trawl through online tutorials etc. I’ll report back once I’ve worked out how to do it.
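For what it's worth, the plm package loaded above appears to follow the same idea: you name the individual and time columns through an `index` argument. The sketch below is only that, a sketch on simulated data. The column names `country`, `year`, `aid` and `affected` are assumptions made for illustration (I haven't listed the real columns of my view here), and the fixed-effects ("within") model is just one of several that plm offers.

```r
library(plm)

# Simulated panel: 4 hypothetical countries observed over 6 years.
set.seed(42)
toy <- expand.grid(
  country = c("AAA", "BBB", "CCC", "DDD"),
  year    = 2000:2005,
  stringsAsFactors = FALSE
)
toy$affected <- rnorm(nrow(toy))

# Each country gets its own fixed level; the true slope on 'affected' is 2.
country_effect <- c(AAA = 1, BBB = -2, CCC = 0.5, DDD = 3)
toy$aid <- 2 * toy$affected + country_effect[toy$country] +
  rnorm(nrow(toy), sd = 0.1)

# index = c(individual column, time column) tells plm about both dimensions.
# model = "within" requests the fixed-effects estimator.
fe <- plm(aid ~ affected, data = toy,
          index = c("country", "year"), model = "within")

summary(fe)
```

With real data, the same call would take the data frame read from PostgreSQL in place of `toy`, with the formula and index adjusted to whatever the columns are actually called.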


Author: Rob Levy

Economist at NEF. Former teaching Fellow in Economics at UCL and Bristol University. Recently submitted my PhD. We'll see what happens...
