The use of machine learning could improve the way we predict cost, quantity, schedule and outcomes for the built environment. Tristan Harvey-Rice and Edward Day of Aecom explain why the industry needs to leverage the benefits of machine learning to estimate projects faster, better and smarter


01 / What is machine learning?

Machine learning covers a range of techniques that allow computers to improve at a specific task automatically, without bespoke programming.

Machine learning’s deep analysis allows us to deal more effectively with imperfect data, find patterns within a dataset and produce models from those patterns. We can then use these models to make predictions based on the historic patterns identified. 

The application of machine learning has previously largely been restricted to highly complex problems where the associated investment can be justified – rarely in the built environment. But recent advances in processing power, storage capacity and cloud services have brought the computing capability and statistical expertise into the affordable reach of commercial organisations, and in much shorter timescales than before. 

Advances in techniques have also allowed a suite of algorithms to be packaged up for rapid deployment by analysts, without needing any expertise in the statistical principles employed.

As a result, those with a good understanding of data analysis can now use machine learning packages from companies including Microsoft, Oracle, Amazon and Google. The potential benefits to the construction industry of using machine learning as a predictive capability include:

  • Increasing the accuracy of estimating
  • Providing qualitative as well as quantitative predictions
  • Enabling schedule predictions, clash detection and automated prediction of the sustainability measures that industry is increasingly prioritising
  • Supporting faster estimating, with less resource requirement
  • Focusing resource on bespoke elements of an estimate and/or value engineering
  • Strengthening the accuracy of predictions of measures besides cost, for example, identifying the carbon efficiency achievable within a defined funding envelope and project specifications.

These are explored below.

02 / How can machine learning improve estimating in the built environment?

The built environment has a strong history of leading the global classification of cost information using the RICS New Rules of Measurement (NRM1/2/3), previously the Standard Method of Measurement (SMM). Organisations have been collating historic project cost data against these and other cost-breakdown structures, such as ICMS (International Construction Measurement Standards), for decades.

Using this data, users have been able to estimate the anticipated cost of a new project, to a quantified bandwidth of expected accuracy and with little resource, by common units of measure such as per m² of gross internal floor area (GIFA).

This methodology has driven how the industry collates, analyses and presents cost information. But it draws on very limited project-level information to derive a single overall average rate for future estimating, and relying on one unit of measure (commonly GIFA) produces inaccurate outputs.
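To make the traditional approach concrete, here is a minimal sketch in Python of an average-rate estimate. The historic figures are invented for illustration and are not Aecom data. The sketch assumes the "single overall average rate" is the mean of each project's cost per m² of GIFA (other averaging conventions exist), and the function names are hypothetical.

```python
def average_rate(historic):
    """Mean cost per m² across historic projects.

    `historic` is a list of (outturn_cost, gifa_m2) pairs.
    Assumption: the rate is the simple mean of per-project rates.
    """
    rates = [cost / gifa for cost, gifa in historic]
    return sum(rates) / len(rates)


def estimate_cost(rate_per_m2, gifa_m2):
    """Traditional estimate: one rate multiplied by one project-level measure."""
    return rate_per_m2 * gifa_m2


# Invented example data: (cost in £, GIFA in m²)
historic = [(3_000_000, 1_000), (5_000_000, 2_000), (8_000_000, 4_000)]

rate = average_rate(historic)           # 2500.0 (£ per m²)
estimate = estimate_cost(rate, 1_500)   # 3750000.0
print(rate, estimate)
```

Every new project of the same GIFA receives the same estimate here, whatever its storeys, location or specification, which is the limitation the article goes on to describe.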

The ability of machine learning algorithms to identify interrelationships between all available scope details, rather than just one project-level measure, allows for far more accurate predictions with less analytical effort. 

Other sectors, such as infrastructure, are already using more robust data in machine learning. For example, Yorkshire Water is open-sourcing much of its operational data, inviting techniques including machine learning to be applied to help identify ways of improving performance, such as analysing huge volumes of flow meter data to identify leaks and target pipe rehabilitation.

Aecom has already applied machine-learning techniques to infrastructure cost data, demonstrating in practice how cost prediction accuracy can be improved over traditional methods. 

Machine learning algorithms can be linked to cloud-based BIM platforms, such as BIM360, to provide real-time predictions that assist in other essential project aspects. These include:

Classification: By training machine learning approaches such as neural networks, Aecom has developed a tool that allows NRM codes (or any other global coding structure) to be added to any architectural BIM model (such as Revit). This allows for quicker extraction of quantities from the models to feed into costing tools such as CostX or a further machine learning model used to predict costs.  

Performance prediction: Machine-learning techniques are equally applicable to predicting schedule, outcome, sustainability or key ratio data, such as carbon footprint, wall-to-floor or net-to-gross ratios. For example, a machine-learning model drawing on historic educational data could identify the capacity of a secondary school achievable for a given funding envelope.

However, to fully reap the benefits of machine learning, the built environment first needs to collect more detailed data related to project estimates, tender returns and/or outturn costs.

Teams in the built environment also need to adopt the same classification structures at the estimating, tender-return and final-account stages of a project. At present, experts may produce an estimate using the NRM2 classification, while aligning their final account information to an elemental standard, such as the BCIS Standard Form of Cost Analysis. However, the elemental standard will only capture the costs at a high level, using project-level measures — such as GIFA — resulting in very little specific project information being collated.

Machine learning, depending on the algorithm, can analyse qualitative as well as quantitative data, so teams can input more valuable project information, such as site complexities, material specifications and standards (BREEAM ratings, etc). To facilitate a move beyond NRM, organisations in the built environment need to incorporate a new standard of qualitative and outcome data classifications.

Second, the industry needs to invest in training existing and new consultants in the data analysis skills required for machine learning. Increasingly, degree courses and graduate programmes are covering machine-learning methodology and these will need to continue to evolve their teaching as new technologies emerge.


Figure 1: Results of predicting project cost using machine learning vs traditional methods

03 / Testing machine learning as a cost prediction tool

Aecom has successfully prototyped several machine learning solutions, improving cost estimating accuracy by around 50% in the infrastructure sector.

To test whether the built environment could benefit from machine learning capabilities as infrastructure has, Aecom created a prototype based on data collected for 115 historic, new-build commercial offices. The original dataset only held a cost element breakdown against the project GIFA, producing a cost per m² for each element. We added further quantitative and qualitative information, such as number of storeys, basement, roof type, construction methodology, etc, and used the dataset to train a machine learning algorithm.

We randomly selected 81 projects from the sample to train the model, with the remaining 34 being used to test the estimating accuracy of the model when compared with the traditional project-level average rate method. We used our machine learning model to predict two variables: cost and contract schedule.
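A random hold-out split of this kind can be sketched in a few lines of Python. This is an illustration only, not Aecom's actual procedure: the project IDs are placeholders, and the fixed seed and function name are assumptions made so the example is repeatable.

```python
import random


def split_projects(project_ids, n_train, seed=0):
    """Randomly partition project IDs into a training set and a test set."""
    rng = random.Random(seed)      # fixed seed (an assumption) for repeatability
    shuffled = list(project_ids)   # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 115 historic office projects
project_ids = list(range(1, 116))
train_ids, test_ids = split_projects(project_ids, n_train=81)

print(len(train_ids), len(test_ids))  # 81 34
```

Keeping the 34 test projects entirely out of training is what makes the later accuracy comparison a fair test of predictive ability rather than of recall.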

For cost, the traditional approach, using average cost per m² based on GIFA, estimated the total value of the 34 projects as £204m against an actual outturn total cost of £176m, ie an over-estimate of 16%. In comparison, the machine learning model estimated the total cost of the projects at £182m, an error of +3%. The model was particularly effective against the traditional approach for unusual schemes, for example, ones with a small GIFA but high cost driven by location and specification or design choices.
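The two headline percentages follow directly from the totals quoted above. The short Python check below reproduces them; the function name is hypothetical and the figures are the rounded £m totals from the text.

```python
def aggregate_error_pct(predicted_total, actual_total):
    """Signed error of a predicted total against the outturn total, in percent."""
    return (predicted_total - actual_total) / actual_total * 100


actual = 176  # £m, outturn total for the 34 test projects

traditional = aggregate_error_pct(204, actual)  # average-rate method
ml = aggregate_error_pct(182, actual)           # machine learning model

print(f"{traditional:+.0f}% {ml:+.0f}%")  # +16% +3%
```

Because these are totals across the whole test set, over- and under-estimates on individual projects can cancel out, which is why project-level accuracy is reported separately.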

As Figure 1 (above) shows, the machine learning model was also more accurate at the individual project level, with 85% of projects achieving +/-20% predictive accuracy using the machine learning model versus 31% of projects using the traditional approach. Overall, the machine learning model gave a more accurate prediction for 91% of the sample projects.

The machine learning model also predicted the contract period for the 34 test projects with a high degree of accuracy. Traditional methods of estimating the contract period were very inaccurate, with an error of -43% when predicting the period based on total value, and an error of -32% when predicting it based on GIFA. The model, however, achieved much greater accuracy, with a +5% error in the predicted versus outturn contract period. Again, at the individual project level, the machine learning model was found to be more accurate, more of the time, achieving a prediction error of +/-20% for 68% of the sample, versus 21% and 18% when using a traditional approach based on value and GIFA respectively. Overall, the machine learning model gave a more accurate prediction for 79% of the sample projects.
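The project-level measures quoted here (the share of projects predicted within +/-20%, and the share for which the model beat the traditional method) can be computed as in the sketch below. The four projects are invented purely to show the calculation and do not come from the Aecom dataset; the function names are hypothetical.

```python
def share_within_tolerance(predicted, actual, tolerance=0.20):
    """Fraction of projects whose prediction is within +/-tolerance of outturn."""
    hits = sum(abs(p - a) / a <= tolerance for p, a in zip(predicted, actual))
    return hits / len(actual)


def share_where_model_wins(model, baseline, actual):
    """Fraction of projects where the model's absolute error is strictly smaller."""
    wins = sum(abs(m - a) < abs(b - a) for m, b, a in zip(model, baseline, actual))
    return wins / len(actual)


# Invented figures in £m for four projects (illustration only)
actual = [10.0, 20.0, 5.0, 8.0]
model = [11.0, 19.0, 6.5, 8.4]
baseline = [13.0, 21.0, 3.0, 10.0]

print(share_within_tolerance(model, actual))           # 0.75
print(share_within_tolerance(baseline, actual))        # 0.25
print(share_where_model_wins(model, baseline, actual)) # 0.75
```

Ties count as neither method winning in this sketch; how ties were treated in the original study is not stated.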

In all, the prototype has highlighted that collating and inputting additional quantitative and qualitative project information can have a significant impact on our predictive capabilities.


Figure 2: Applied iteratively, machine learning could facilitate increasingly sophisticated predictive capability

04 / What is the future of cost estimating and machine learning for the built environment?

Given the ability of machine learning to significantly enhance the built environment’s predictive capabilities, we foresee this technique becoming an integral feature in future design and estimating software solutions. 

As well as improving predictive capability and reducing estimating effort, machine learning techniques will be iteratively applied to the design and construct process to develop increasingly sophisticated integrated solutions. For example, machine learning-based predictive models could extract the required quantitative and qualitative data from BIM or estimating systems (CostX, Global Unite, Candy, Prism, etc) without manual intervention from cost consultants. Furthermore, machine learning algorithms could be employed to analyse measurable project outcomes to facilitate, for example, estimating the cost of a school based initially on the number of student places it must provide.

We believe that machine learning, both in helping to align BIM design objects to a standard classification system such as NRM and in developing predictive models, will be an incredibly powerful solution for the built environment, facilitating efficient and accurate prediction of project outcomes in terms of cost, schedule and performance.