Christian Schitton

Digital Transformation challenges in the real estate industry - the Data Level

Digital transformation is the new buzzword: it underpins changes in the global economy as well as newer international standards such as ESG. But when it comes to the practical part, i.e. implementing the changes needed to go digital, companies face many challenges. These challenges slow companies down, because too many resources are spent on the topic and too much pressure is put on employees.

In this article we will touch on the practical difficulties we see surfacing in the digital transformation efforts of real estate companies, with a focus on the data gathering process.

It seems that data collection has taken over most of the staff and time resources of many companies, which partly still rely on lists and manual handling.

Our goal is to highlight the challenges and possibilities when it comes to digitalisation and digital transformation in order to offer an overview for companies trying to achieve a higher level of digital governance.

Levels in the Transformation Process

Although different industries and different companies show quite a diverging picture with respect to digital transformation processes, they all go through, in a nutshell, three main levels:

  • data gathering level

  • data analytics / operational level

  • data result level

Of course, this is a very rough picture. Not only are those levels connected to each other, but each level in turn consists of several layers. Above all, the time component must not be forgotten.

Nevertheless, from a management point of view it is important to understand that those levels have to be in balance with each other.

The intensity with which each level is handled depends on the constraints a company is facing, e.g. the stage in the digital transformation process, availability of data sources, budgets covering the transformation, time frame, staffing and similar.

Data gathering level

Data gathering and data processing are crucial in an increasingly data driven business environment. Without an efficient use of data and, as a consequence, without moving an enterprise in the direction of digital transformation, it will be hard to manage it effectively and hence to thrive economically.

Data gathering means the collection of relevant data, the initial preparation of the raw data for further processing and a first exploratory data analysis. This is an extremely important level as it defines the quality and accuracy of all further steps.

However, precisely because this step is so important, the data collection procedure in real estate companies has to be well thought through. Otherwise it creates unnecessary, duplicate or missing data, which overloads staff and time resources and leaves room for human error.

The consequence, and this is unfortunately reality in many real estate companies which have started the digitalisation and digital transformation process, can be getting stuck in the data gathering process for quite a long period of time.

The main burden is the manual gathering of data in different lists and spreadsheets. This approach is becoming increasingly obsolete, given the advanced tech tools available to automate the task.

There are numerous tools to, for example:

  • read in PDF files and other documentation

  • scrape the internet for information

  • connect to internal and external databases

  • connect to APIs

  • handle, organise and use TSV, CSV and similar file formats

  • read in Excel files

  • organise the storage of gathered data in several different ways depending on data frequency, size, data frame needed and similar
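To give a feeling for how little code the file-handling item needs, here is a minimal sketch in Python using only the standard library. The sample data, the column names and the helper `read_delimited` are hypothetical and only stand in for real exports.

```python
import csv
import io


def read_delimited(handle, delimiter=","):
    """Read a delimited text stream into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter=delimiter))


# Hypothetical sample data standing in for real CSV and TSV exports.
csv_text = "property_id,city,area_sqm\nP1,Vienna,1200\nP2,Graz,850\n"
tsv_text = "property_id\tcity\tarea_sqm\nP3\tLinz\t430\n"

# The same reader handles both formats; only the delimiter changes.
rows = read_delimited(io.StringIO(csv_text))
rows += read_delimited(io.StringIO(tsv_text), delimiter="\t")

# Convert the numeric column once, at load time, instead of by hand later.
for row in rows:
    row["area_sqm"] = float(row["area_sqm"])

total_area = sum(row["area_sqm"] for row in rows)
print(len(rows), "properties,", total_area, "sqm in total")
```

In a real setting the `io.StringIO` objects would be replaced by opened files, and the conversion step would cover every numeric column of the export.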

As an example, the R package DBI (database interface) is a relatively simple interface between the open source R environment and external database management systems (DBMS). Around 30 DBMS (e.g. MySQL, MariaDB, SQLite) are supported through backend packages, making it possible to, for example:

  • manage the connection to a database

  • list the tables in a database

  • read a database table into an R data frame

  • run queries to organise the data

  • and more
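The four steps in this list can be sketched as follows. Note that this is not DBI itself: the sketch uses Python's standard-library `sqlite3` module as a stand-in, and the `leases` table with its columns and values is a hypothetical example.

```python
import sqlite3

# 1. Manage the connection (an in-memory database keeps the sketch self-contained).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be accessed by column name

# Hypothetical table and data, created here only so there is something to read.
conn.execute(
    "CREATE TABLE leases (property_id TEXT, tenant TEXT, monthly_rent REAL)"
)
conn.executemany(
    "INSERT INTO leases VALUES (?, ?, ?)",
    [
        ("P1", "Tenant A", 2500.0),
        ("P1", "Tenant B", 1800.0),
        ("P2", "Tenant C", 3200.0),
    ],
)
conn.commit()

# 2. List the tables in the database.
tables = [
    row["name"]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# 3. Read a whole table into an in-memory structure (a list of dicts here).
leases = [dict(row) for row in conn.execute("SELECT * FROM leases")]

# 4. Run a query that organises the data: total rent per property.
rent_by_property = {
    row["property_id"]: row["total_rent"]
    for row in conn.execute(
        "SELECT property_id, SUM(monthly_rent) AS total_rent "
        "FROM leases GROUP BY property_id ORDER BY property_id"
    )
}

conn.close()
print(tables, len(leases), rent_by_property)
```

Once the connection details are defined, every later run reads the current state of the database without anybody having to fill in a list.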

Or take this: in order to tackle huge data sets and/or make use of intensive computational power, the R environment offers access to Apache Spark by means of the sparklyr package. This broadens the range of potential applications enormously.

Another useful example for collecting, using and sharing data is pins. The pins package also comes from the R environment and is a way to share small to medium-sized data objects on a virtual board, while keeping track of updates and versions within the data framework.

When thinking about all of the technological options we have nowadays, it seems intuitive to use them...

However, the fact is that a lot of real estate companies instead engage a number of their specialised departments in a manual data collection process. As an example, Excel lists with a lot of input parameters are created and have to be filled in manually by those specialists. You can imagine how much of the company's resources this takes.

This is how a real estate company can end up in a digital transformation process which takes months, if not years, without achieving any significant progress. Moreover, the employees' experience is a very negative one, as their job becomes focused on filling in lists, correcting them and sending them around between departments.

It should be the other way round: digital transformation should aim to make workflows easier.

Data Preparation for further Processing

Up to now, we have only been talking about raw data. The real beauty and potential lies in the fact that any data can be prepared automatically so that the next steps run smoothly.

As soon as the task at hand and the operational tool are defined, any data read in from internal or external sources can be shaped and prepared accordingly in a completely automated way. Additional preparation steps, exploratory data analysis and other quality checks ensure the quality of the data flow already at this stage.
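Such a quality check can be very small. The following Python sketch, under the assumption that records arrive as dictionaries, flags duplicate keys and empty required fields. The function `quality_report` and the field names are hypothetical.

```python
def quality_report(records, key, required):
    """Flag duplicate keys and empty required fields in a list of dict records."""
    seen = set()
    duplicates = []
    missing = []
    for index, record in enumerate(records):
        value = record.get(key)
        if value in seen:
            duplicates.append(value)  # key already occurred in an earlier record
        seen.add(value)
        for field in required:
            if record.get(field) in (None, ""):
                missing.append((index, field))  # position and name of the gap
    return {"rows": len(records), "duplicates": duplicates, "missing": missing}


# Hypothetical records, as they might come out of a manually maintained list.
records = [
    {"property_id": "P1", "city": "Vienna", "area_sqm": "1200"},
    {"property_id": "P2", "city": "", "area_sqm": "850"},
    {"property_id": "P1", "city": "Vienna", "area_sqm": "1200"},
]

report = quality_report(records, key="property_id", required=["city", "area_sqm"])
print(report)
```

Run automatically on every data delivery, a report like this surfaces the duplicate and missing entries before they reach the operational level.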

Staying with the open source environment of R, the tidyverse collection of packages is the gateway to a world of data handling in almost infinite ways. This is the framework to prepare the data as it is needed for the operational level.

Or take the case of machine learning applications. Those tools can be quite data hungry, and the data has to be supplied in a certain form and shape for the respective machine learning tool to work properly (e.g. the eXtreme Gradient Boosting algorithm accepts only a numeric matrix as input). In a production stage, however, there is neither the time nor the possibility to handle the pre-processing of data manually.

Here, R offers an efficient solution in the form of the tidymodels framework. When implementing a machine learning application in this environment, there is an integrated stream (the recipes package) which takes care of data preparation and data handling in an automated way and which is connected via workflows with the overall machine learning process.

In other words, you pre-define in a separate module the steps for handling the incoming data. This module is then plugged into the overall machine learning application via a workflow. From that moment on, data handling and processing are done on an automated basis.
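The idea can be illustrated with a small Python sketch. It is not the tidymodels API: the classes `Recipe`, `Workflow` and `MeanModel` are hypothetical, and only the method names `prep` and `bake` are loosely borrowed from the vocabulary of the R recipes package. The recipe learns its parameters from training data and then turns any incoming rows into a purely numeric matrix.

```python
class Recipe:
    """Pre-defined preparation steps: mean imputation and one-hot encoding."""

    def __init__(self, numeric, categorical):
        self.numeric = numeric
        self.categorical = categorical
        self.means = {}
        self.levels = {}

    def prep(self, rows):
        """Learn column means and category levels from the training rows."""
        for col in self.numeric:
            values = [row[col] for row in rows if row[col] is not None]
            self.means[col] = sum(values) / len(values)
        for col in self.categorical:
            self.levels[col] = sorted({row[col] for row in rows})
        return self

    def bake(self, rows):
        """Apply the learned steps and return a numeric matrix (list of lists)."""
        matrix = []
        for row in rows:
            line = []
            for col in self.numeric:
                value = row[col]
                line.append(float(self.means[col] if value is None else value))
            for col in self.categorical:
                # Levels not seen during prep end up as all zeros.
                line.extend(
                    1.0 if row[col] == level else 0.0 for level in self.levels[col]
                )
            matrix.append(line)
        return matrix


class MeanModel:
    """Placeholder model: always predicts the mean of the training targets."""

    def fit(self, matrix, targets):
        self.mean = sum(targets) / len(targets)

    def predict(self, matrix):
        return [self.mean for _ in matrix]


class Workflow:
    """Plugs the recipe into the model so that preparation runs automatically."""

    def __init__(self, recipe, model):
        self.recipe = recipe
        self.model = model

    def fit(self, rows, targets):
        self.model.fit(self.recipe.prep(rows).bake(rows), targets)
        return self

    def predict(self, rows):
        return self.model.predict(self.recipe.bake(rows))


# Hypothetical training data with one missing area value.
train_rows = [
    {"area_sqm": 100.0, "usage": "office"},
    {"area_sqm": None, "usage": "retail"},
    {"area_sqm": 300.0, "usage": "office"},
]
train_targets = [10.0, 20.0, 30.0]

workflow = Workflow(Recipe(["area_sqm"], ["usage"]), MeanModel()).fit(
    train_rows, train_targets
)

new_rows = [{"area_sqm": None, "usage": "retail"}]
baked = workflow.recipe.bake(new_rows)
predictions = workflow.predict(new_rows)
print(baked, predictions)
```

The design point is the separation: the preparation steps are declared once, and every later call to `predict` applies them to new data without manual intervention. A real model would of course replace the placeholder.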

This is only a snapshot, of course. But you can see the potential and the importance of data gathering and data preparation for offering appropriate analytical possibilities. In the next article we will talk about the data analytics level and the challenges and opportunities companies are facing in that field.


Without doubt, pursuing the digital transformation, balancing its levels and finding the right intensity while taking a company's constraints into account are classic management tasks. Consequently, the success or failure of this transformation is management's responsibility as well.

It cannot be overemphasised how important the process of data collection is. After all, the quality of the input data is reflected in the quality of the results. However, this data gathering level does not have to be a huge burden for the IT infrastructure of a company, which is especially important for small to medium-sized enterprises.

While the necessity to force the pace of digital transformation is beyond question, it would also be a mistake to attempt a big leap when a company's resources do not back up that approach. It is rather a well orchestrated series of small steps, taking current constraints into account, which may lead to success in this area.

The main point here is that data not only has to be collected in as automated a way as possible, but also has to be prepared automatically and in a form that the operational level is able to work with. Open source frameworks, like R or Python, can be of tremendous assistance in this respect.

