Coding Conventions - dssg/energywise GitHub Wiki
Python version:
This project was developed and tested on Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)].
Library versions
For this project, the required packages are:
- scikit-learn: 0.13.1
- matplotlib: 1.2.1
- cPickle: 1.71
- numpy: 1.7.1
- scipy: 0.12.0
- ephem: 3.7.5.1
- dateutil: 1.5-mpl
- csv: 1.0
- pytz: 2012d-mpl
Running the script versions.py will display your currently installed versions.
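The contents of versions.py are not reproduced on this page. As a rough sketch of how such a check might work, the snippet below imports each package by its import name and reads its __version__ attribute where one exists. The function name installed_versions and the PACKAGES list are assumptions made for illustration, not the project's actual script.

```python
import importlib

# Import names for the packages listed above (assumed; note that
# scikit-learn is imported as "sklearn"). cPickle only exists on Python 2.
PACKAGES = ["sklearn", "matplotlib", "cPickle", "numpy", "scipy",
            "ephem", "dateutil", "csv", "pytz"]


def installed_versions(names):
    """Map each import name to its version string.

    Packages that cannot be imported are reported as "not installed";
    packages without a __version__ attribute are reported as "unknown".
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
            continue
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions
```

Calling installed_versions(PACKAGES) and printing each entry would give a listing that can be compared by eye against the versions above.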
Importing Conventions
We use the following conventions for importing packages:
import numpy as np
import matplotlib.pyplot as plt
import cPickle as pickle
Pickling
Every pickled object is stored in a .pkl file as a pair (tuple). The first element is the object itself, and the second is a description of it (a string). For example:
desc = ("A 2-dimensional (n x 3) array of building information. "
        "The first column contains the name of the building (string). "
        "The second column is the zip+4, stored as an (int, int) pair. "
        "The third column is the square footage (float).")
foutn = "all_my_buildings.pkl"
# Open in binary mode and close the file once the pair has been written.
with open(foutn, "wb") as fout:
    pickle.dump((my_data, desc), fout)
This allows the data and its description to be loaded together via:
data, desc = pickle.load(open("all_my_buildings.pkl", "rb"))
or the data to be loaded alone via:
data, _ = pickle.load(open("all_my_buildings.pkl", "rb"))
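The save and load steps above can be wrapped in two small helpers so that every file follows the (object, description) convention. This is only a sketch: the names save_with_desc and load_with_desc are hypothetical, and the fallback to the plain pickle module is an assumption added so the snippet also runs where cPickle is unavailable.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2, as used in this project
except ImportError:
    import pickle  # assumed fallback for Python 3


def save_with_desc(obj, desc, path):
    """Pickle the (object, description) pair to path."""
    with open(path, "wb") as fout:
        pickle.dump((obj, desc), fout)


def load_with_desc(path):
    """Return the (object, description) pair stored at path."""
    with open(path, "rb") as fin:
        obj, desc = pickle.load(fin)
    return obj, desc


if __name__ == "__main__":
    # Round trip through a temporary file with made-up example data.
    path = os.path.join(tempfile.mkdtemp(), "example.pkl")
    save_with_desc([1.0, 2.0, 3.0], "Three example floats.", path)
    data, desc = load_with_desc(path)
    print(desc)
```

Opening the file in binary mode for both writing and reading keeps the helpers independent of the platform and of the pickle protocol in use.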
Building Records
A "Building Record" is a dictionary with the following keys:
- "bid": The building's unique identifier. (int)
- "naics": The NAICS code. (int) (Note: in previous versions, this key was "sic" and held the building's SIC code.)
- "btype": A description of the building's type, for example "Refrigerated Warehouse." (string)
- "times": The time stamps for which we have data for this building. Each entry is a datetime object. (1-dimensional array of datetimes)
- "temps": A tuple (pair). First: The temperatures in degrees Fahrenheit, in the same order as "times". (1-dimensional array of floats) Second: Flags indicating which temperatures are original data (as opposed to imputed). (1-dimensional array of booleans)
- "kwhs": A tuple (pair). First: The energy consumption (in kWh) of the building, in the same order as "times". Second: Flags indicating which energy readings are original data (as opposed to imputed). (1-dimensional array of booleans)
For example, for a building record br, at time br["times"][15], the temperature was br["temps"][0][15] and the energy usage was br["kwhs"][0][15].
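As a quick sanity check on the structure described above, a record can be tested for the expected keys and for alignment between "times" and the two (values, flags) pairs. The function check_building_record is a hypothetical helper, not part of the project, and the toy record uses made-up values with plain lists standing in for the 1-dimensional arrays.

```python
from datetime import datetime

REQUIRED_KEYS = ("bid", "naics", "btype", "times", "temps", "kwhs")


def check_building_record(br):
    """Return a list of problems found in br (empty if none were found)."""
    problems = []
    missing = [key for key in REQUIRED_KEYS if key not in br]
    if missing:
        problems.append("missing keys: " + ", ".join(missing))
        return problems
    n = len(br["times"])
    for key in ("temps", "kwhs"):
        pair = br[key]
        if len(pair) != 2:
            problems.append(key + " is not a (values, flags) pair")
            continue
        values, flags = pair
        # Values and flags must both line up with the time stamps.
        if len(values) != n or len(flags) != n:
            problems.append(key + " is not aligned with times")
    return problems


# Toy record with invented values, for illustration only.
example = {
    "bid": 1,
    "naics": 493120,
    "btype": "Refrigerated Warehouse",
    "times": [datetime(2012, 1, 1, 0), datetime(2012, 1, 1, 1)],
    "temps": ([31.0, 30.5], [True, False]),
    "kwhs": ([12.4, 11.9], [True, True]),
}
print(check_building_record(example))
```

The check only looks at keys and lengths; it does not verify element types or whether the time stamps are sorted.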
Example code to extract the fields from a building record d:
bid = d["bid"]
naics = d["naics"]
btype = d["btype"]
times = d["times"]
kwhs, kwhs_oriflag = d["kwhs"]
temps, temps_oriflag = d["temps"]
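Once the values and flags are unpacked, the flags can be used to drop imputed readings. The sketch below assumes that a flag of True marks an original reading, which is how the field descriptions read; original_only is a hypothetical helper and the sample values are invented.

```python
def original_only(times, values, flags):
    """Keep the (time, value) pairs whose flag marks original data."""
    return [(t, v) for t, v, is_original in zip(times, values, flags)
            if is_original]


# Invented sample data: the second reading is treated as imputed.
times = ["t0", "t1", "t2"]
kwhs, kwhs_oriflag = ([12.4, 11.9, 13.1], [True, False, True])
original_readings = original_only(times, kwhs, kwhs_oriflag)
print(original_readings)
```

When the values and flags are NumPy arrays, boolean indexing such as kwhs[kwhs_oriflag] should select the same values without an explicit loop.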