Tools & Datasets

Text Analytics Tools

Indico API Credentials:
  • Documentation
  • username: hackreduce
  • password: big_data_hackathon
Indico will demo their API capabilities during networking at 6 pm on Friday, 3/13, and will be on site to help users with any questions.

The Rosette API, provided by Basis Technology:
  • A brief description: The Rosette API is a set of RESTful tools for analyzing, summarizing and classifying large amounts of unstructured text. Its functionality includes entity extraction and linking, sentiment analysis, classification, and a range of calls that analyze the structure and morphology of text in numerous languages. The documentation provides examples of how to make API calls in cURL and PHP.
  • Documentation
  • Shared API Key: cc31795098e06510e955a47253798cda

Data Sets

Listed below are some publicly available data sets: collections of unstructured text data generated by people in their daily lives. These data sets can be used to answer (or at least try to answer) numerous questions about people and their behavior that occupy marketers, strategists, business analysts and stock traders, among others. Participants are not restricted to these data sets: feel free to come to the event with your own data to work on!
MIT Human Dynamics Lab Data Sets
  • Badge Dataset: data on the performance, behavior, and interpersonal interactions of participating employees at a Chicago-area firm over one month.
  • Friends and Family Dataset: data showing how people make decisions, with emphasis on the social aspects involved.
  • Reality Mining Dataset: the first mobile data set with rich personal behavior and interpersonal interactions.
  • Social Evolution Dataset: data tracking the everyday life of a whole undergraduate dormitory via mobile phones.

The 4 datasets from the MIT Media Lab Human Dynamics Group require the following software:

  • Badge: Excel / CSV reader (e.g. R); 7zip (to unzip bz2 files)
  • Friends and Family: Excel / CSV reader (e.g. R); 7zip (to unzip bz2 files)
  • Reality: Matlab
  • Social Evolution: Excel / CSV reader (e.g. R); 7zip (to unzip bz2 files)

To unzip a bz2-compressed file, you can install 7zip (for Windows or Mac) or bzip2 (for Linux).
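
If you would rather not install a separate tool, the bz2 files can also be unpacked from Python, whose standard library includes a bz2 module. The snippet below is only a minimal sketch: the function name decompress_bz2 and the file name in the usage note are made up for illustration, and it assumes each download is a single bz2-compressed file rather than a multi-file archive.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src, dest=None):
    """Decompress a single .bz2 file and return the path of the output file.

    Hypothetical helper for illustration; not part of any dataset tooling.
    """
    src = Path(src)
    if dest is None:
        if src.suffix != ".bz2":
            raise ValueError("give an explicit dest for files not ending in .bz2")
        # Default output name: drop the trailing ".bz2" (data.csv.bz2 -> data.csv).
        dest = src.with_suffix("")
    dest = Path(dest)
    with bz2.open(src, "rb") as fin, open(dest, "wb") as fout:
        # Copy in chunks so large files are never loaded fully into memory.
        shutil.copyfileobj(fin, fout)
    return dest
```

For example, decompress_bz2("badge.csv.bz2") (a made-up file name) would write badge.csv next to the compressed file, which can then be opened in Excel or R.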

Available Morning of March 14th
New City of Boston Hubhack Data: 20 different City of Boston data sets, themed around “a week in the life of the city”: 911 calls, parking tickets, bus locations, etc. You can be one of the first to look at this soon-to-be-released information.