In the course of our research projects we collect, transform, and produce datasets in order to run analyses, gain insights, build prototypes, and answer the questions we pose. This type of data is usually not published with papers or presented at conferences. Instead, it stays hidden inside the code or locked behind a password. Here we make these datasets publicly available and free for all researchers to use.

Twitter Debate Vancouver Transit Referendum

This dataset contains tweets related to the conversation about the Vancouver Transit Referendum between February 11 and March 11, 2015. It was collected using the following hashtags as filters: #cutcongestion, #transitreferendum, #yes4transit, #notranslinktax, #TransLink, #BCpoli, #vanpoli. After some cleaning and duplicate removal, the dataset contains a total of 8,755 tweets from 2,710 profiles, using 437 different hashtags. Each row includes several metadata fields, such as date, retweets, mentions, url, user id, user name, etc. For more information visit the project's page.

Download the CSV file
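
The hashtag filtering and duplicate removal described above can be illustrated with a minimal Python sketch. It is not the procedure actually used to build the dataset, which is not documented here. The record fields `id` and `text`, the case-insensitive matching, and the choice to treat tweets with the same `id` as duplicates are all assumptions made for illustration.

```python
import re

# Hashtags listed in the dataset description, lowercased for
# case-insensitive matching (an assumption of this sketch).
TRACKED = {
    "#cutcongestion", "#transitreferendum", "#yes4transit",
    "#notranslinktax", "#translink", "#bcpoli", "#vanpoli",
}

HASHTAG_RE = re.compile(r"#\w+")


def hashtags(text):
    """Return the set of lowercased hashtags found in a tweet's text."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def filter_and_dedupe(tweets):
    """Keep tweets that use a tracked hashtag, dropping repeated ids.

    `tweets` is a list of dicts with hypothetical keys "id" and "text".
    """
    seen = set()
    kept = []
    for tweet in tweets:
        if tweet["id"] in seen:
            continue  # duplicate of a tweet already kept
        if hashtags(tweet["text"]) & TRACKED:
            seen.add(tweet["id"])
            kept.append(tweet)
    return kept


# Made-up sample records, not taken from the dataset.
sample = [
    {"id": 1, "text": "Vote! #yes4transit #vanpoli"},
    {"id": 1, "text": "Vote! #yes4transit #vanpoli"},
    {"id": 2, "text": "Nice weather today #vancouver"},
    {"id": 3, "text": "Not convinced. #NoTransLinkTax"},
]
result = filter_and_dedupe(sample)
print([t["id"] for t in result])  # [1, 3]
```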

AMPds The Almanac of Minutely Power Dataset


This dataset contains electricity, water, and natural gas measurements at one-minute intervals, representing the multi-year consumption of a regular house. It provides a total of 1,051,200 readings per meter over 2 years of monitoring. Weather data from Environment Canada's YVR weather station is also included. This hourly weather data covers the same period as AMPds and includes a summary of climate normals observed from 1981 to 2010.
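
As a quick sanity check, the reading count quoted above is consistent with one reading per minute over two 365-day years. The short Python sketch below reproduces that arithmetic; the assumption that the monitoring period contains no leap day is ours and is not stated in the dataset description.

```python
MINUTES_PER_DAY = 24 * 60   # one reading per minute
DAYS_PER_YEAR = 365         # assumes no leap day in the monitoring period
YEARS = 2

readings_per_meter = YEARS * DAYS_PER_YEAR * MINUTES_PER_DAY
print(readings_per_meter)  # 1051200
```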

For more information and to download the dataset please visit


Updated on 2015-08-14T21:02:51+00:00, by Luciano.