Introduction
Local Law 84 (LL84) established an energy benchmarking database for New York City. As climate change has become a higher priority for policymakers, the city decided to benchmark the amount of energy its buildings consume. LL84 requires owners of buildings larger than 50,000 sq ft to report certain data about their buildings to the city. Now that this data is available, the question becomes how to determine what makes a building perform well in terms of energy. Answering it requires accounting for the differences between buildings and comparing each building to its peers.
Literature Review
Benchmarking is a challenge that the city is interested in solving. Energy Star, a program backed by the EPA, created a score to rate energy performance, and others have also attempted to build scores for comparing buildings. Kontokosta (2012) analyzed the LL84 dataset to develop a benchmark for buildings based on their energy usage. He created two models to predict source energy use intensity (EUI), a measure of the total amount of raw fuel required to operate a building (Energy Star). Kontokosta first builds a base linear regression model from features shared by multifamily housing and office buildings, then adds further features depending on the building type. For office buildings, he found that building age was significant and that older buildings were more efficient. He also found that primary energy source, lot coverage, and building size were significant, with larger building size associated with higher energy use. Multifamily housing had similar significant variables; however, larger multifamily buildings were more efficient in terms of EUI. These results suggest which features are appropriate for comparing buildings within their peer groups.
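The two-stage structure described above can be written as a generic linear model. This is a sketch of the form only, assuming a plain additive specification. The symbols are introduced here for illustration, and any transformation of the dependent variable used in Kontokosta (2012) is not reproduced.

```latex
\mathrm{EUI}_i = \beta_0
  + \sum_{j=1}^{J} \beta_j \, x_{ij}
  + \sum_{k=1}^{K_t} \gamma^{(t)}_k \, z^{(t)}_{ik}
  + \varepsilon_i
```

Here x_{ij} are the base features shared by both building types, z^{(t)}_{ik} are the additional features used only for building type t (office or multifamily), and \varepsilon_i is the error term for building i.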
Data and Methods
This paper attempts to determine which features are appropriate for understanding the energy performance of buildings. The LL84 data first needed to be combined with the PLUTO data, which contains pertinent building information. Each borough has its own PLUTO dataset, and these were combined to cover all of NYC. The combined PLUTO data was then merged with the LL84 dataset on the borough, block, and lot number (BBL). This merge left 11,403 rows. After the initial merge, the data had to be cleaned. Any row that was missing a source EUI value or total floor area, or that belonged to a building type with fewer than 100 buildings, was removed, reducing the row count to 10,362. Rows were also removed if their source EUI was less than 5 or greater than 1,000, because such values are extreme relative to the average. Finally, the rows with the highest and lowest 1% of source EUI values were removed as outliers, leaving 10,060 rows in the dataset. Once the data was cleaned, descriptive statistics were used to get an overview of the data.
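The merge and cleaning steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code used for this paper. The function name and the field names (bbl, source_eui, floor_area, building_type) are hypothetical. Two further choices are assumptions: building type counts are taken after the rows with missing values are dropped, and the 1% trim is done by rank on each tail.

```python
from collections import Counter


def clean_benchmark(ll84_rows, pluto_by_borough, min_type_count=100,
                    eui_min=5.0, eui_max=1000.0, trim=0.01):
    """Merge LL84 rows with PLUTO rows on BBL and apply the cleaning rules.

    All field names are hypothetical placeholders for the real columns.
    """
    # Combine the per-borough PLUTO tables into one lookup keyed by BBL.
    pluto = {row["bbl"]: row for borough in pluto_by_borough for row in borough}

    # Inner join: keep only LL84 rows whose BBL also appears in PLUTO.
    rows = [{**pluto[r["bbl"]], **r} for r in ll84_rows if r["bbl"] in pluto]

    # Drop rows missing source EUI or total floor area.
    rows = [r for r in rows
            if r.get("source_eui") is not None and r.get("floor_area") is not None]

    # Drop building types with fewer than min_type_count buildings
    # (assumption: counted after the missing-value filter).
    counts = Counter(r["building_type"] for r in rows)
    rows = [r for r in rows if counts[r["building_type"]] >= min_type_count]

    # Drop rows with source EUI below eui_min or above eui_max.
    rows = [r for r in rows if eui_min <= r["source_eui"] <= eui_max]

    # Trim the lowest and highest `trim` share of rows by source EUI
    # (assumption: rank-based trimming; the result is sorted by source EUI).
    rows.sort(key=lambda r: r["source_eui"])
    k = int(len(rows) * trim)
    return rows[k:len(rows) - k]
```

Each filter corresponds to one sentence of the cleaning description, so the row count can be checked after every step against the figures reported above.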