Coding Conventions - dssg/energywise GitHub Wiki

Python version

This project was developed on and tested with Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)], a 32-bit Windows build.

Library versions

For this project, the required packages are:

  • scikit-learn: 0.13.1
  • matplotlib: 1.2.1
  • cPickle: 1.71
  • numpy: 1.7.1
  • scipy: 0.12.0
  • ephem: 3.7.5.1
  • dateutil: 1.5-mpl
  • csv: 1.0
  • pytz: 2012d-mpl

Running the script versions.py will display your currently installed versions.
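
As a rough illustration of what such a version check can look like, here is a minimal sketch. It is not the project's actual versions.py; the names MODULE_NAMES, installed_versions, and print_versions are hypothetical. It assumes that scikit-learn is imported under the name sklearn and that a module's version, when available, is exposed as __version__.

```python
from __future__ import print_function

import importlib

# Import names for the packages listed above (hypothetical constant).
# scikit-learn is imported as "sklearn"; cPickle exists only on Python 2.
MODULE_NAMES = ("sklearn", "matplotlib", "cPickle", "numpy", "scipy",
                "ephem", "dateutil", "csv", "pytz")


def installed_versions(module_names=MODULE_NAMES):
    """Return a dict mapping each module name to a version string."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
            continue
        # Assumption: the version lives in __version__; not every module has it.
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions


def print_versions():
    """Print one "name: version" line per module, sorted by name."""
    for name, version in sorted(installed_versions().items()):
        print("{0}: {1}".format(name, version))
```

Calling print_versions() produces a report that can be compared against the list above.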

Importing Conventions

We use the following conventions for importing packages:

     import numpy as np
     import matplotlib.pyplot as plt
     import cPickle as pickle

Pickling

Every pickled object is stored in a .pkl file as a pair (2-tuple): the object itself comes first, followed by a description of it (a string). For example:

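import cPickle as pickle

# my_data is the object to store (here, the building array described in desc).
# Adjacent string literals are joined, so no indentation ends up in the text.
desc = ("A 2-dimensional, (n x 3) array of building information. "
        "The first column contains the name of the building (string). "
        "The second column is the zip+4, stored as an (int, int) pair. "
        "The third column is the square footage (float).")

foutn = "all_my_buildings.pkl"
# Open in binary mode and close the file once the dump is done.
with open(foutn, "wb") as fout:
    pickle.dump((my_data, desc), fout)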

This allows data and description to be loaded via:

# Read in binary mode ("rb"), matching the "wb" used when writing.
with open("all_my_buildings.pkl", "rb") as fin:
    data, desc = pickle.load(fin)

or the data to be loaded alone via:

# The description is discarded by assigning it to _.
with open("all_my_buildings.pkl", "rb") as fin:
    data, _ = pickle.load(fin)
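
The save and load steps above can be wrapped in two small helpers. This is only a sketch: save_with_desc and load_with_desc are hypothetical names, not functions from the project, and the fallback to the plain pickle module is an assumption added so the example also runs where cPickle is unavailable. The data in the round trip is made up.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2, following the import convention
except ImportError:
    import pickle  # assumption: fallback so the sketch also runs on Python 3


def save_with_desc(obj, desc, path):
    """Pickle the pair (obj, desc) to path."""
    with open(path, "wb") as fout:
        pickle.dump((obj, desc), fout)


def load_with_desc(path):
    """Return the (obj, desc) pair stored at path."""
    with open(path, "rb") as fin:
        obj, desc = pickle.load(fin)
    return obj, desc


# Round trip through a temporary file, using made-up data.
example_path = os.path.join(tempfile.mkdtemp(), "example_buildings.pkl")
save_with_desc([("Warehouse A", (60601, 1234), 52000.0)],
               "A list of (name, zip+4, square footage) tuples.",
               example_path)
data, desc = load_with_desc(example_path)
```

Unpacking inside load_with_desc means a file that does not hold a pair fails at load time rather than later in the analysis.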

Building Records

A "Building Record" is a dictionary with the following keys:

  • "bid": the building unique identifier. (int)
  • "naics": the NAICS code. (int) (Note: in previous versions, this key was "sic" and held the SIC code of the building.)
  • "btype": A description of the building's type, for example "Refrigerated Warehouse." (string)
  • "times": The time stamps for which we have data for this building. Each entry is a datetime object. (1-dimensional array of datetimes)
  • "temps": A tuple (pair). First: The temperatures in degrees Fahrenheit, in the same order as "times". (1-dimensional array of floats) Second: Flags indicating which temperatures are original data (as opposed to imputed). (1-dimensional array of booleans)
  • "kwhs": A tuple (pair). First: The energy consumption (in kWh) of the building, in the same order as "times". (1-dimensional array of floats) Second: Flags indicating which energy readings are original data (as opposed to imputed). (1-dimensional array of booleans)

For example, for a building record br, at time br["times"][15] the temperature was br["temps"][0][15] and the energy usage was br["kwhs"][0][15]. The flags br["temps"][1][15] and br["kwhs"][1][15] indicate whether those two values are original data.
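
The layout above can be checked mechanically. The following is a minimal sketch, not code from the project: REQUIRED_KEYS and check_building_record are hypothetical names, and the check only looks at keys and lengths, so it works equally for lists and 1-dimensional arrays.

```python
# Keys every building record is expected to carry (hypothetical constant).
REQUIRED_KEYS = ("bid", "naics", "btype", "times", "temps", "kwhs")


def check_building_record(br):
    """Raise ValueError if br does not follow the layout described above."""
    missing = [key for key in REQUIRED_KEYS if key not in br]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    n = len(br["times"])
    for key in ("temps", "kwhs"):
        pair = br[key]
        if len(pair) != 2:
            raise ValueError(key + " must be a (values, flags) pair")
        values, flags = pair
        # Values and flags are both aligned with "times".
        if len(values) != n or len(flags) != n:
            raise ValueError(key + " must have one entry per time stamp")
    return True
```

Running such a check right after loading a record catches records that still use the old "sic" key or whose flag arrays have drifted out of step with "times".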

Example code to extract the fields from a building record d:

bid                  = d["bid"]
naics                = d["naics"]
btype                = d["btype"]
times                = d["times"]
kwhs, kwhs_oriflag   = d["kwhs"]
temps, temps_oriflag = d["temps"]
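
Once the values and flags are unpacked, the flags can be used to keep only the original (non-imputed) readings. The sketch below assumes that a True flag marks original data, as the key descriptions suggest. The helper original_readings and the example record are hypothetical and use made-up values; plain lists stand in for the arrays so that the example has no dependencies.

```python
from datetime import datetime


def original_readings(br, key):
    """Return (time, value) pairs for the non-imputed entries of br[key].

    key is "temps" or "kwhs"; a True flag is assumed to mark original data.
    """
    values, flags = br[key]
    return [(t, v) for t, v, is_original in zip(br["times"], values, flags)
            if is_original]


# Made-up record for illustration only.
example = {
    "bid": 1,
    "naics": 493120,
    "btype": "Refrigerated Warehouse",
    "times": [datetime(2012, 1, 1, 0), datetime(2012, 1, 1, 1),
              datetime(2012, 1, 1, 2)],
    "temps": ([30.5, 31.0, 29.8], [True, False, True]),
    "kwhs": ([120.0, 118.5, 119.2], [True, True, False]),
}

# Keeps the readings at hours 0 and 2; the hour 1 temperature was imputed.
original_temps = original_readings(example, "temps")
```

With NumPy arrays, the same selection can be written as a boolean mask, for example temps[temps_oriflag].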