Abstraction

Dec. 26, 2019


Concept of abstraction and its use in software development and everyday life


For many people the term abstract is itself abstract, and so is the concept of abstraction. In the context of software development and computer science, I will define abstraction as the process of taking something, such as a procedure or a function, and hiding the minute details of how it is actually implemented. Tightly interwoven with this is the concept of an abstraction layer, which I will define here as one part of a structure made of multiple layers stacked on top of each other, much like a deck of cards, with each layer hiding the implementation of the layer beneath it. As we go from the higher-level layers to the lower-level ones, we get further into the precise details of how the system actually works. (You might already see how this ties in with software architecture and computing systems.)

The key to understanding complicated things is to know what not to look at and what not to compute and what not to think.
Gerald Jay Sussman

Let’s start with a simple definition, or really more of an example, of what I’m talking about. I will take the example from mathematics, as it is simple enough that most people will understand it.

One such example of abstraction in mathematical terms is exponentiation. For example, let’s define a function f(x) = x^3, where x is any integer and f raises it to the power of 3.

The function f(x) in this context is just an abstraction, and the function itself can be viewed as an abstraction layer built on top of the actual implementation, which in this case is the expression x^3. The expression x^3 is in turn just another abstraction layer that can be viewed as a representation of the expression x*x*x. We might also say that x is itself an abstraction: it stands for some integer and hides which concrete integer that is. You can go deeper if you want, but I will stop here, as I think this example is enough for now.
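The same layering can be sketched in Java. This is only an illustration: the names CubeLayers, multiply, cube and f are made up for this example, and each method hides the one beneath it.

```java
// A minimal sketch of the layers behind f(x) = x^3.
// All names here are illustrative and not taken from any library.
class CubeLayers {

    // Lowest layer shown here: plain multiplication.
    static long multiply(long a, long b) {
        return a * b;
    }

    // x^3 expressed as x * x * x, hiding the repeated multiplication.
    static long cube(long x) {
        return multiply(multiply(x, x), x);
    }

    // f(x) = x^3: callers only see f and never the layers beneath it.
    static long f(long x) {
        return cube(x);
    }

    public static void main(String[] args) {
        System.out.println(f(3)); // prints 27
    }
}
```

A caller of f never needs to know whether the cube is computed by multiplication or in some other way.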

From a more abstract definition to a more concrete one.

We are so used to abstractions in our everyday lives that we rarely give the notion any thought. In the big picture, you can view almost anything as an abstraction of multiple smaller things built on top of each other.

There’s great power in thinking about abstraction, especially when you are dealing with software systems and actually trying to build them.

As an example, we can take a simple method written in Java. Here we define a method that takes two integers, x and y, as input and returns an integer that is the difference of the squares of x and y.

public int differenceOfSquares(int x, int y) {
    return square(x) - square(y);
}

public int square(int x) {
    return x * x;
}

We can look at the method differenceOfSquares() and see that it is a pretty abstract structure in itself. It takes two abstract things as input: we have no idea what x and y are going to be. We then run both of those variables through an abstraction called square(). In the context of differenceOfSquares(), the method square() can be viewed as a black box, because we have no idea what its implementation is and we don’t really care. We only care that it returns the correct result, which in this case is the square of the input.

In every part of this piece of code, there are multiple levels of implementation that you don’t know about.

The body of the square() method is the implementation of that abstract thing: it takes one integer and returns the product of that integer multiplied by itself.

The beauty of understanding abstraction is that if you keep the concept in mind in everyday problem solving and programming, it helps you view your methods, classes and the whole codebase as a product of many much smaller discrete structures, and to set aside the details you are not concerned with. That in turn helps you build your software in a clearer and more concise way.
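To make the black-box idea concrete, here is a small sketch in which the caller works with two different implementations of squaring. The names BlackBoxSquare, squareByMultiplying and squareByAdding are hypothetical, and passing the implementation in as a parameter is just one way to show that the caller does not depend on it.

```java
import java.util.function.IntUnaryOperator;

// Sketch: differenceOfSquares only needs "something that squares an int".
class BlackBoxSquare {

    // One possible implementation: plain multiplication.
    static int squareByMultiplying(int x) {
        return x * x;
    }

    // Another possible implementation: add |x| to itself |x| times.
    static int squareByAdding(int x) {
        int n = Math.abs(x);
        int result = 0;
        for (int i = 0; i < n; i++) {
            result += n;
        }
        return result;
    }

    // The caller never looks inside the square operation it is given.
    static int differenceOfSquares(int x, int y, IntUnaryOperator square) {
        return square.applyAsInt(x) - square.applyAsInt(y);
    }

    public static void main(String[] args) {
        System.out.println(differenceOfSquares(5, 3, BlackBoxSquare::squareByMultiplying)); // prints 16
        System.out.println(differenceOfSquares(5, 3, BlackBoxSquare::squareByAdding));      // prints 16
    }
}
```

Both calls give the same answer, which is all the caller cares about.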

Going deep

When programming, particularly in object-oriented languages, you might have gotten used to manipulating lists with premade structures and objects. Let’s look at the data structure ArrayList, which we are very used to when building lists.

ArrayList is a data structure. This is what creating a new list looks like: ArrayList<Integer> list = new ArrayList<>(). The expression declares a new variable of the type ArrayList<Integer> and then assigns that variable a value, in this case a newly created ArrayList object.

What is that object really? What does that particular string ArrayList say?

It tells the runtime to create a new object, and here is the interesting part: it doesn’t actually store the object itself in the variable. Instead, the variable holds a reference, which you can think of as the memory location where the object actually lives. This is why printing an object whose class doesn’t override toString() gives something like java.lang.Object@1b6d3586, the class name followed by the object’s hash code in hexadecimal. That hash code identifies the object but is not guaranteed to be its memory address. In a way, the reference is just an abstraction of the actual object that is located somewhere in memory.

We can manipulate the ArrayList object we have just created through the methods built into its class, which can be viewed as the interface of the object. We don’t know, and don’t really care, how the methods are implemented; we just care about the end result. For us the ArrayList is just an abstraction, a black box.

We can go deeper than ArrayList. What that object allows us to do is manipulate arrays in a pretty simple way. ArrayList is implemented using the most primitive type of list we can have in a programming language, which is the array. So in the end ArrayList is just an abstraction built out of multiple smaller pieces, one of which is the array.
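A short sketch of how a variable holds a reference rather than the object itself. In Java specifically, the default printed form is the class name plus a hexadecimal hash code rather than a raw address. The class names ReferenceDemo and Box are hypothetical and exist only for this example.

```java
// Sketch: two variables referring to the same object.
class ReferenceDemo {

    // A hypothetical class that does not override toString().
    static class Box {
        int value = 42;
    }

    public static void main(String[] args) {
        Box a = new Box();
        Box b = a; // copies the reference, not the object

        // Default Object.toString(): class name, '@', hash code in hex.
        System.out.println(a);

        // A change made through b is visible through a.
        b.value = 7;
        System.out.println(a.value); // prints 7
        System.out.println(a == b);  // prints true
    }
}
```

The exact text printed by the first println varies from run to run, so no output is shown for it.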
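As a rough illustration of treating a list as a black box, the sketch below uses only the List interface, so the same code works with two different implementations from the standard library. The class name ListBlackBox and the method sum are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch: code written against the List interface, not an implementation.
class ListBlackBox {

    // Works with any List implementation; only the interface is used.
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayBacked = new ArrayList<>();
        List<Integer> linkBacked = new LinkedList<>();
        for (int i = 1; i <= 4; i++) {
            arrayBacked.add(i);
            linkBacked.add(i);
        }
        System.out.println(sum(arrayBacked)); // prints 10
        System.out.println(sum(linkBacked));  // prints 10
    }
}
```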
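To give a feel for how a list can be layered on top of an array, here is a heavily simplified sketch. SimpleIntList is a hypothetical class written for this post; the real java.util.ArrayList is more elaborate, and the growth strategy used here (doubling) is an assumption chosen for simplicity.

```java
import java.util.Arrays;

// A simplified, hypothetical growable list of ints backed by a plain array.
class SimpleIntList {
    private int[] elements = new int[2]; // small initial capacity for the demo
    private int size = 0;

    public void add(int value) {
        if (size == elements.length) {
            // The backing array is full: copy into one twice as large.
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        elements[size++] = value;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", size: " + size);
        }
        return elements[index];
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleIntList list = new SimpleIntList();
        for (int i = 0; i < 5; i++) {
            list.add(i * 10);
        }
        System.out.println(list.get(4)); // prints 40
        System.out.println(list.size()); // prints 5
    }
}
```

The user of the list calls add and get and never sees the array being replaced by a larger one.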

int[] array = new int[10];

The actual value of the variable ‘array’ in this case is not some magical array or list of numbers. Conceptually it is only an address, something like 0x00E3FE63, called a pointer (Java calls it a reference and never shows you the raw address). It’s called a pointer because it points to the actual memory location of the array.

Computer memory is at its core just an abstraction of an array of bytes, which is why we have to specify the size of the array: the size determines how much contiguous space is allocated from memory for the array. Otherwise we could accidentally touch parts of memory we shouldn’t when looking up an index. This is also the reason we have to specify the datatype of the array, so the compiler knows how much memory to allocate per element. In this case an int is 4 bytes, so we have to allocate 10*4 = 40 bytes for the elements. The actual implementation of the array lookup is some simple address arithmetic. If we didn’t know the datatype, we wouldn’t know how the elements are spaced out, and we wouldn’t be able to look up the array in a coherent way.

Once the elements of the array are in memory, let’s say we want to get the element array[0]. It sits at the memory location of the first byte of the array, which in a simplified form could be 1000. If we then want the element at index 2, array[2], it can be found at the memory location 1000 + (2*4) = 1008.

The memory location 1000 would in this case be an abstraction of the actual value stored there, which is the value of array[0], let’s say 5 in binary. That would be 00000000 00000000 00000000 00000101.

That is already pretty low level, but the binary notation of 0’s and 1’s is itself just an abstraction of actual elementary transistors that are either powered on or off.
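The lookup arithmetic can be written down as a tiny sketch. Java does not expose real memory addresses, so the base address 1000 is just the simplified number from the text, and ArrayAddressing and elementAddress are hypothetical names used only to illustrate the calculation.

```java
// Sketch of the address arithmetic behind an array lookup.
class ArrayAddressing {

    // Address of element `index`, given the address of element 0 and
    // the size of one element in bytes. Purely illustrative arithmetic.
    static long elementAddress(long baseAddress, int index, int elementSizeBytes) {
        return baseAddress + (long) index * elementSizeBytes;
    }

    public static void main(String[] args) {
        long base = 1000;            // simplified address from the text
        int intSize = Integer.BYTES; // 4 bytes per int

        System.out.println(elementAddress(base, 0, intSize)); // prints 1000
        System.out.println(elementAddress(base, 2, intSize)); // prints 1008
        System.out.println(10 * intSize);                     // prints 40
    }
}
```

The last line is the number of bytes needed for the ten int elements.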
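The bit pattern can be reproduced with a few lines of Java. This is a small sketch; BinaryView and toBits are hypothetical names, while Integer.toBinaryString is a standard library method.

```java
// Sketch: show the 32 bits of an int, grouped into four bytes.
class BinaryView {

    static String toBits(int value) {
        String bits = Integer.toBinaryString(value);
        // Left-pad with zeros to a full 32 bits.
        StringBuilder padded = new StringBuilder();
        for (int i = bits.length(); i < 32; i++) {
            padded.append('0');
        }
        padded.append(bits);
        // Insert a space between each group of 8 bits.
        return padded.substring(0, 8) + " " + padded.substring(8, 16) + " "
                + padded.substring(16, 24) + " " + padded.substring(24, 32);
    }

    public static void main(String[] args) {
        System.out.println(toBits(5)); // prints 00000000 00000000 00000000 00000101
    }
}
```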

Meaning of high level vs low level

If you’ve done any coding at all, you’re sure to have run into the concept of “high-level” vs “low-level” languages. Take as an example the programming language Python, which is viewed as a high-level language. The meaning of this concept ties in perfectly with the concept of abstraction. Python is dynamically typed, which means that you don’t need to specify the data type of a variable. In Python you can initialize a variable by simply writing x = 5, and at run time the interpreter attaches the correct data type to the value that was assigned to it.

In comparison, in statically typed languages such as Java or C, which sit at a lower level, the compiler needs to know beforehand what datatype a variable has, for example int x; otherwise you will get an error at compile time. You can’t just leave the datatype bookkeeping to some abstraction (a piece of code) inside the language implementation; you have to spell it out yourself. Thus what high-level vs low-level really means is the level of abstraction of a particular language. High level means there’s a high level of abstraction in the language: Python, for example, has many more of these abstraction layers built on top of each other than C or even Assembly, which live at a much lower level and where you can’t just ignore the implementation of some details.

This also ties in with the idea that abstractions really make things easier. A quick application can be built from the ground up much faster in Python than in C. That’s because a lot of the implementation of the structures can be left to the language itself. You don’t have to think about so many details, which frees up your mental capacity for figuring out the actual logic of the code.
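The difference between a type fixed by the compiler and a type carried by the value can be hinted at even inside Java. The sketch below is only a loose analogy to dynamic typing, not how Python works internally, and TypingDemo and runtimeTypeOf are hypothetical names.

```java
// Sketch: declared (compile-time) types versus run-time types in Java.
class TypingDemo {

    // Reports the run-time type of whatever value is passed in.
    static String runtimeTypeOf(Object value) {
        return value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        int x = 5;      // the type int is fixed when the code is compiled
        // x = "hello"; // uncommenting this line is a compile-time error

        Object y = 5;   // declared loosely; the value carries its own type
        System.out.println(runtimeTypeOf(y)); // prints Integer
        y = "hello";
        System.out.println(runtimeTypeOf(y)); // prints String
        System.out.println(x);                // prints 5
    }
}
```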
