Christian Schitton

Efficient Digital Transformation - It Starts With The Small Steps

Data Driven Decision Making


A recent article by one of the big deans of data science, Kirk Borne, addressed a very important topic: “How to Go from Data Paradox to Data Productivity with a Business Culture Transformation”.


In short, it takes more than just hoarding data to become an efficient data driven business venture.


The article says that “just overhauling your technology and processes alone won’t turn you into a data driven business”. What is needed is “to make certain cultural changes too”. The reference to the article is given below.


I would like to add that business leaders have to find the right combination of technical understanding, mathematical knowledge and industry background to help their enterprises achieve this efficient data driven environment, which is undoubtedly necessary to stay ahead in one’s business segment.


This is even more true for businesses in the financial and alternative investment segment, such as real estate, where a particularly deep industry background is essential.


In this article, I would like to focus on the technical part with something seemingly as mundane as the object classes needed as the backbone for writing algorithms in the world of data driven models.



Object Classes


Object classes (like vector, matrix, data frame…) are the first step and the very foundation on which even the most complex data driven models or the latest state-of-the-art machine learning applications are built.


Without having a tight grip on this foundation, it will be hard to run a data driven environment successfully.


It is those first, small steps which often define the outcome of the overall venture at a very early stage.


In my last article I mused about the difficulties I had, and the boredom I felt, when learning about different object classes as a complete beginner in programming.


At first, I skipped this part. I wanted to do the “big” coding as fast as I could get there.


Of course, reality sank in quite fast and I had to come back and put a lot of effort into studying this particular area of coding.


Here is a little example (from the perspective of the R language) of why one should not skip the “lame” first parts of those coding sessions: they are as important as the big algorithm chunks in any model.



Creating a Grid Space


Assume we have two vectors, vector 1 and vector 2, each of length 20.
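A minimal sketch of two such vectors in R. The names `vector_1` and `vector_2` and the even spacing are assumptions for illustration; the range of -15 to +15 is the one mentioned further down in the text.

```r
# Two numeric vectors of length 20, evenly spaced between -15 and +15.
# Names and even spacing are assumptions made for this illustration.
vector_1 <- seq(from = -15, to = 15, length.out = 20)
vector_2 <- seq(from = -15, to = 15, length.out = 20)

length(vector_1)  # 20
class(vector_1)   # "numeric"
```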

Now, we create a grid of all possible combinations of those two vectors. This is where the expand.grid() function comes in.

This function creates a data frame with one column per input vector and one row per combination, so the number of rows is the product of the lengths of the vectors involved.


In this case, the result is a data frame with 20^2 = 400 rows (i.e. observations) and 2 columns (for each of the two vectors). Involving a third vector with length = 20 would create a data frame with 20^3 = 8,000 rows and 3 columns.


Below is the grid of possible combinations based on the 2 vectors and their respective length of 20:
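A minimal sketch of how such a grid could be built and inspected. The vector names and the column names `x`, `y`, `z` are assumptions; only the length (20) and the range (-15 to +15) come from the text.

```r
# Hypothetical input vectors: length 20, range -15 to +15.
vector_1 <- seq(-15, 15, length.out = 20)
vector_2 <- seq(-15, 15, length.out = 20)
vector_3 <- seq(-15, 15, length.out = 20)

# Every combination of the two vectors, one row per combination.
grid_2d <- expand.grid(x = vector_1, y = vector_2)
class(grid_2d)  # "data.frame"
dim(grid_2d)    # 400 2

# A third vector multiplies the number of rows by its length.
grid_3d <- expand.grid(x = vector_1, y = vector_2, z = vector_3)
dim(grid_3d)    # 8000 3

# The first argument varies fastest: x cycles while y stays fixed.
head(grid_2d, 3)
```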

This is not really spectacular, is it? I told you, it can be boring.


Increasing the length of the vectors to 100 while keeping the range of the vectors the same (i.e. -15 to +15) would give us a data frame with 10,000 observations in 2 columns.
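The row count can be checked directly. This is only a sketch; the vector names are again assumptions.

```r
# Same range as before, but 100 points per vector.
vector_1 <- seq(-15, 15, length.out = 100)
vector_2 <- seq(-15, 15, length.out = 100)

grid <- expand.grid(x = vector_1, y = vector_2)

# The number of rows is the product of the vector lengths.
nrow(grid) == length(vector_1) * length(vector_2)  # TRUE
nrow(grid)                                         # 10000
range(grid$x)                                      # -15 15
```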

Still, this is not really spectacular…



Using the Grid Space


Of course, the grid space is not created for its own sake. It is here to serve a bigger task (the big picture, if you want to put it this way).


Look at the case where a function is laid over the grid space in order to see how this function behaves in this area.


The point is that this function could be the cost function of a machine learning model, which would have to be minimised in order to produce the best results for whatever the model is trained for.


Here is an example.


Let’s assume two vectors again:


and the following function:


f(x, y) = (x - 2.19)^2 + (y - 1.88)^2 + cos(3x + 1.41) + sin(4y - 1.73)


in other words:
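Written as an R function, this could look as follows. The name `cost_fn` is an assumption; the formula itself is the one given above.

```r
# The function from the text as an R function.
# The arithmetic is vectorised, so x and y may be single numbers
# or two vectors of equal length (evaluated pairwise).
cost_fn <- function(x, y) {
  (x - 2.19)^2 + (y - 1.88)^2 + cos(3 * x + 1.41) + sin(4 * y - 1.73)
}

cost_fn(2.19, 1.88)              # both quadratic terms are zero here
cost_fn(c(0, 1, 2), c(0, 1, 2))  # three values, evaluated pairwise
```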

We want to know which values this function takes within the grid space spanned by those two vectors. So, first we create the grid of all possible combinations of the two vectors.

This creates a data frame of 10,000 combined data points with 2 columns. Here are some samples of this grid space:
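A minimal sketch of this step. The text only fixes the length of the vectors (100 each); the range of -1 to 5 and the names `x_vals`, `y_vals` and `grid` are assumptions chosen for illustration.

```r
# Assumed input vectors: 100 points each, range -1 to 5 (an assumption).
x_vals <- seq(-1, 5, length.out = 100)
y_vals <- seq(-1, 5, length.out = 100)

# Grid space: all 100 * 100 combinations as a two-column data frame.
grid <- expand.grid(x = x_vals, y = y_vals)
dim(grid)  # 10000 2

# A few randomly chosen rows of the grid.
set.seed(1)  # only to make the sample reproducible
grid[sample(nrow(grid), 5), ]
```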

Then we “lay” the (cost) function over this grid:

Note that in this step we no longer address the vectors directly. What we need are the columns of the data frame produced by the expand.grid() function, which hold the paired observations of the grid space.


The result is a vector with 10,000 values, one for each point in the grid.


Below are some of the results:
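A minimal sketch of laying the function over the grid. The names `cost_fn`, `x_vals`, `y_vals`, `grid` and `cost_values`, as well as the range of -1 to 5, are assumptions.

```r
cost_fn <- function(x, y) {
  (x - 2.19)^2 + (y - 1.88)^2 + cos(3 * x + 1.41) + sin(4 * y - 1.73)
}

# Assumed input vectors and the grid built from them.
x_vals <- seq(-1, 5, length.out = 100)
y_vals <- seq(-1, 5, length.out = 100)
grid   <- expand.grid(x = x_vals, y = y_vals)

# Evaluate on the columns of the data frame, not on x_vals and y_vals.
cost_values <- cost_fn(grid$x, grid$y)
length(cost_values)  # 10000
head(cost_values)

# Passing the raw vectors instead pairs them element by element and
# returns only 100 values (the "diagonal" of the grid).
length(cost_fn(x_vals, y_vals))  # 100
```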

This vector of length 1e4, holding all the values the function takes in this grid space, is now available and can be processed further for whatever task is at hand.


But to get there, we had to handle data in different structures (vectors and a data frame) and keep track of how they depend on each other.



Local Minima, Global Minima


If we want to visualise the result, there are several options.


Here, we decided on a 3D plot created by the persp3D() function from the plot3D package.


For this plotting function, however, we cannot simply pass the vector of length 10,000 created above. It would not match the dimensions of the two input vectors (length 100 each), which define the axes of the plot.


In this case, we need a matrix with dimension 100 x 100 holding all the function values within the grid space.


In other words, the same number of observations is bundled in a different object class with a different dimensional structure so that the plotting function can work.
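A minimal sketch of this reshaping step, under the same assumptions as before (names and the range of -1 to 5 are illustrative). It relies on two facts: expand.grid() varies its first argument fastest, and matrix() fills column by column.

```r
cost_fn <- function(x, y) {
  (x - 2.19)^2 + (y - 1.88)^2 + cos(3 * x + 1.41) + sin(4 * y - 1.73)
}
x_vals <- seq(-1, 5, length.out = 100)
y_vals <- seq(-1, 5, length.out = 100)
grid   <- expand.grid(x = x_vals, y = y_vals)
cost_values <- cost_fn(grid$x, grid$y)

# Same 10,000 values, now as a 100 x 100 matrix.
z <- matrix(cost_values, nrow = length(x_vals), ncol = length(y_vals))
dim(z)  # 100 100

# Rows follow x and columns follow y: z[i, j] is f(x_vals[i], y_vals[j]).
all.equal(z[3, 7], cost_fn(x_vals[3], y_vals[7]))  # TRUE

# outer() builds the same matrix directly from the two vectors.
z_outer <- outer(x_vals, y_vals, cost_fn)
all.equal(z, z_outer)  # TRUE
```

Getting the orientation wrong here (for example by filling the matrix row by row) would silently swap the roles of x and y in the plot.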


Here is the visual:
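A sketch of how such a plot could be produced. The viewing angles, axis labels and the range of -1 to 5 are assumptions, and the base R persp() call is only a fallback for the case that the plot3D package is not installed.

```r
cost_fn <- function(x, y) {
  (x - 2.19)^2 + (y - 1.88)^2 + cos(3 * x + 1.41) + sin(4 * y - 1.73)
}
x_vals <- seq(-1, 5, length.out = 100)
y_vals <- seq(-1, 5, length.out = 100)

# 100 x 100 matrix of function values over the grid space.
z <- outer(x_vals, y_vals, cost_fn)

if (requireNamespace("plot3D", quietly = TRUE)) {
  # Coloured 3D surface; by default the colour follows the z values.
  plot3D::persp3D(x = x_vals, y = y_vals, z = z,
                  theta = 30, phi = 30,
                  xlab = "x", ylab = "y", zlab = "f(x, y)")
} else {
  # Base R fallback without colour mapping.
  persp(x = x_vals, y = y_vals, z = z,
        theta = 30, phi = 30,
        xlab = "x", ylab = "y", zlab = "f(x, y)")
}
```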

But this, too, is not an end in itself.

Staying with the cost function example, the visualisation clearly shows that the function has several local minima (blue areas in the graph) and one global minimum (deep blue area) within the grid space.
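On the grid itself, the minima can also be located numerically. The following is a sketch under the same assumptions as before (names and the range of -1 to 5 are illustrative); "local minimum" here simply means a grid point lower than its four direct neighbours.

```r
cost_fn <- function(x, y) {
  (x - 2.19)^2 + (y - 1.88)^2 + cos(3 * x + 1.41) + sin(4 * y - 1.73)
}
x_vals <- seq(-1, 5, length.out = 100)
y_vals <- seq(-1, 5, length.out = 100)
grid   <- expand.grid(x = x_vals, y = y_vals)
cost_values <- cost_fn(grid$x, grid$y)

# Global minimum on the grid: the row with the smallest function value.
best <- which.min(cost_values)
grid[best, ]
cost_values[best]

# Local minima on the grid: interior points lower than all four neighbours.
z <- matrix(cost_values, nrow = 100, ncol = 100)
n <- nrow(z)
m <- ncol(z)
inner <- z[2:(n - 1), 2:(m - 1)]
is_local_min <- inner < z[1:(n - 2), 2:(m - 1)] &
  inner < z[3:n, 2:(m - 1)] &
  inner < z[2:(n - 1), 1:(m - 2)] &
  inner < z[2:(n - 1), 3:m]
sum(is_local_min)  # number of local minima found on this grid
```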

And here we are. If this were the cost function of a machine learning model, we would already face the problem of how to find the global minimum without getting stuck in local minima, in order to get the most accurate model for the problem at hand.
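The problem can be made tangible with a small experiment. This is only a sketch: it uses base R's optim() with the BFGS method as a stand-in for a gradient-based optimiser, and the three starting points are arbitrary assumptions. A local optimiser returns whichever minimum it reaches from its starting point, so the results may differ between starts.

```r
# Same function as in the text, but taking one parameter vector p = c(x, y),
# which is the form optim() expects.
cost_vec <- function(p) {
  (p[1] - 2.19)^2 + (p[2] - 1.88)^2 + cos(3 * p[1] + 1.41) + sin(4 * p[2] - 1.73)
}

# Arbitrary starting points (assumptions for illustration).
starts <- list(c(0.5, 0), c(2.5, 1.5), c(1, 3.5))

# Run a local, gradient-based optimiser from each starting point.
results <- lapply(starts, function(s) optim(par = s, fn = cost_vec, method = "BFGS"))

# One column per start: the point reached and the function value there.
sapply(results, function(r) c(x = r$par[1], y = r$par[2], value = r$value))
```

Comparing the values in the last row shows whether a given start ended up in the global minimum or in one of the local ones.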


Discussions of algorithms such as Gradient Descent or Particle Swarm Optimization would be the natural follow-up…


In short, it started out rather simple but quickly moved into harder topics.



Conclusion


It is the small steps on which the heavy data structures, the complex statistical models and the latest state-of-the-art machine learning applications are built.


So, one has to have a tight grip on those smaller steps which build the foundation for the bigger picture.


In order to become an efficient data driven enterprise, decision makers have to find the right combination of technical background, mathematical knowledge and a deep understanding of the industrial background.


In achieving this, one should not disregard the seemingly small steps, which we illustrated here with a rather modest technical example.


Those small steps lead very quickly to more complex problems that are much harder to solve.



References


How to Go from Data Paradox to Data Productivity with a Business Culture Transformation — Top 3 Data Paradoxes by Kirk Borne, Ph.D. published on LinkedIn/ October 21, 2021