A recent surge of interest in building energy consumption has generated a tremendous amount of energy data, which has encouraged the broad application of data-driven algorithms throughout the building industry. This article reviews the prevailing data-driven approaches used in building energy analysis under different archetypes and granularities, including methods for prediction (artificial neural networks, support vector machines, statistical regression, decision trees and genetic algorithms) and methods for classification (K-means clustering, self-organizing maps and hierarchical clustering). The review shows that data-driven approaches have successfully addressed a wide variety of building energy applications, such as load forecasting and prediction, energy pattern profiling, regional energy-consumption mapping, benchmarking of building stocks, global retrofit strategies and guideline making. Significantly, this review distills a few key tasks for modifying data-driven approaches when they are applied to building energy analysis. The conclusions drawn in this review could facilitate future micro-scale changes in energy use for a particular building through appropriate retrofits and the inclusion of renewable energy technologies. They also pave the way for exploring the potential for macro-scale energy reduction with consideration of customer demands. Together, these will be useful in establishing a better long-term strategy for urban sustainability.