Book - 2.3) Modelling - Abstraction

Abstraction

Dealing with Complexity

The world is complex, so people often form simpler representations of real-world artifacts and phenomena to deal with this complexity. These simplified representations focus on the essential aspects of something and ignore less important aspects. The simplification reduces the complexity to a level where the artifacts or phenomena can be:

  • understood,
  • explained,
  • expressed,
  • planned for,
  • analyzed, or
  • manipulated.

What is essential or not essential to a simplified representation depends on the use that one has in mind. Just as “beauty is in the eye of the beholder”, so also “what is essential is in the use.” A simplified representation made for one use may be useless for another use!

People use simplified representations in everyday life. Maps are a good example. The map of the Washington D.C. Metro system (see the following image) is a simplified representation of the more complex reality that forms the “real world” transportation system. It is designed to help travelers plan and complete a journey on the transportation system from a starting station to a destination station. It is a simplified representation because it deliberately leaves out a great deal of the complexity of the actual Metro system. For example, the map does not:

  • include facts about schedule or cost,
  • have the location of any station,
  • show which parts are above ground and which are below ground, or
  • show the interior layout of the train cars.

You can easily think of many other things about the Metro system that the map does not show. However, the map is perfect for its intended use—navigating between stations—and the simplification is critical to serving this use. Of course, it is not useful for other purposes. For example, the map is not drawn to scale, so it is not useful for the task of figuring out how much track is needed between two stations. You can likely think of other tasks for which this map is not useful.

Metro-Map.png

An Abstraction of the DC Metro System

The relationship between a real artifact and its simplified representation is illustrated in the painting below, done by Rene Magritte in the late 1920s. The painting, titled The Treachery of Images, depicts a smoking pipe and carries a seemingly contradictory caption painted by Magritte, which is French for “This is not a pipe.” Magritte is quoted as saying:

“The famous pipe. How people reproached me for it! And yet, could you stuff my pipe? No, it’s just a representation, is it not? So if I had written on my picture ‘This is a pipe’, I’d have been lying!” – Rene Magritte

Magritte-Pipe.png

An Abstraction of a Pipe

(from Torczyner, Harry. Magritte: Ideas and Images. p. 71)

Of course, Magritte is correct that the painting of a pipe is not the same as the actual pipe; it is merely a simplified representation of a pipe. It is a simplified representation because it includes some of the properties of the real pipe (shape, appearance) but the painting does not show the actual dimensions of the pipe, what it is made of, what it weighs, etc.

Summary: Simplified representations reduce the complexity of an entity to make it easier to work with and understand.

Abstractions in Computing

An abstraction is a simplified representation of an entity that is used in a computation. The term “entity” is chosen because abstractions can be formed about anything:

  • people
  • places
  • actions
  • objects
  • events
  • processes
  • ...

This means that abstraction is a powerful and basic tool.

A crucial fact is that to compute with abstractions, the abstractions have to be expressed in terms of information—something that computers are built to deal with. While Magritte made his simplified representation using oil and canvas, we cannot get oil and canvas into a computer (or at least, we don’t recommend it!). This use of abstraction—to simplify representations of real-world artifacts—is only one way in which the idea of abstraction is used in computation. We will later see different uses of abstraction in computation.

Making an abstraction of an entity means selecting the entity’s properties that can be expressed as information. The information is what we compute about. A simple example is a book. Its properties might include the book’s:

  • title
  • author
  • publisher

A book has many other properties as well, including:

  • a genre
  • how many pages it has
  • whether it is hardcover
  • a price
  • physical dimensions
  • a table of contents
  • a dedication
  • the cover artwork
  • an ISBN number

Even this longer list is by no means exhaustive. You could likely think of many other properties of a book.

The notion of a stakeholder is often used to decide which of the many properties of an entity are relevant to how an abstraction is to be used. The stakeholder is often defined in reference to an imagined person (or a class of people) who is seen as the audience or user for whom the abstraction is being defined. For example, if the representation of a book is for a librarian, then such properties as title, author, and ISBN number are relevant while the book’s dimensions, dedication, and cover art are not relevant. The properties that are relevant are those that are important for the work that the librarian does. The librarian in this example is the stakeholder. Notice that the idea of a “librarian” (or a stakeholder in general) does not refer to a specific person but to anyone who is a librarian or is playing the role of a librarian, or even to an automated library system.

Different stakeholders have a different sense of what is relevant. For example, if the representation of a book is for a delivery company shipping the book to purchasers, then the only properties of the book itself that are relevant for the book abstraction are its weight and dimensions. For this stakeholder, the book’s title, author, and ISBN number are not relevant. The book abstraction for the delivery company might also include other properties such as the customer’s address, the delivery date, a tracking number, and the current location of the book in the delivery system.
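
The idea that a stakeholder selects the relevant properties can be sketched in a few lines of Python. This is only an illustration: the book's values, the property names (such as weight_kg), and the helper abstract are all made up for this example.

```python
# One real-world book described by many properties.
# All values here are illustrative placeholders, not data from the text.
book = {
    "title": "An Example Book",
    "author": "A. Writer",
    "isbn": "000-0-00-000000-0",
    "weight_kg": 0.45,
    "dimensions_cm": (20.0, 13.0, 3.0),
    "dedication": "For my readers",
}

# Each stakeholder keeps only the properties relevant to its own work.
RELEVANT = {
    "librarian": ["title", "author", "isbn"],
    "delivery_company": ["weight_kg", "dimensions_cm"],
}


def abstract(entity, stakeholder):
    """Return the abstraction of `entity` for the given stakeholder."""
    return {prop: entity[prop] for prop in RELEVANT[stakeholder]}


librarian_view = abstract(book, "librarian")
delivery_view = abstract(book, "delivery_company")
print(librarian_view)
print(delivery_view)
```

The same book yields two different abstractions, and neither one contains the dedication because neither stakeholder needs it.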

While the term “abstraction” can seem very vague, the use of abstractions in computation is very concrete. Abstractions can be seen behind the web pages of sites that deal with various kinds of entities. For example, the following image shows a part of the amazon.com web page for a Harry Potter book. Notice that this web page demonstrates an abstraction of the book because it displays properties of the book that Amazon found relevant for their purposes of selling books to online customers. In this case, the online customer is the stakeholder. The properties shown on this part of the web page include several properties named above: title, author, cover art, and price.

Amazon-Harry-Potter-Book.png

Amazon’s Abstraction of a Book

Another web page that represents a different abstraction of the same Harry Potter book is given on the Virginia Tech Library System. The image below shows how a librarian might define the relevant properties of this book. As with the Amazon web page, the Virginia Tech Library System web page also shows the title and author of the book.

VT-Library-Harry-Potter-Book.png

The VT Library Abstraction of a Book

However, the abstractions of the Harry Potter book by Amazon and the Virginia Tech library are not the same. The Amazon abstraction contains a price while the library abstraction does not. Also, the Virginia Tech library abstraction contains a “call number” (a code for where to find the book in the library) while the Amazon abstraction does not contain this property. This again illustrates that different stakeholders have different requirements for their abstraction of the same real-world entity.

Summary:

  • Abstractions are simplified representations that are used in computing.
  • An abstraction gives all of the relevant properties of an entity.
  • What properties are relevant depends on the purpose for which the abstraction will be used.
  • There can be different abstractions for the same entity when the abstractions are used for different purposes.

Representing an Abstraction using Tables

The information properties of an abstraction can be represented in both a table form and in a text form. These two different, but entirely equivalent, representations are useful in different circumstances. A table form is often convenient to use when people are communicating with people about the design or documentation of an abstraction. A textual form is often convenient when a person is communicating with a computer (e.g., in programming) or when computers are communicating with each other (e.g., internet transactions).

A table representation of an abstraction organizes four elements of an abstraction (see the figure below) using the row and column structure of a table. In this representation, the table’s caption is a description of the abstraction while the rows and columns are used to hold properties and their associated values.

First, the description of the table contains a name by which we refer to the abstraction being defined and a stakeholder. Remember that the stakeholder is the imagined person (or set of people) whose information needs are used to determine which of the many properties of an entity being modeled are relevant to the abstraction. For example, in defining an abstraction of movies for an online vendor (e.g., Netflix) the description might use the name “Movie” and identify the stakeholder as an “Online Customer”. The abstraction will contain properties of movies deemed relevant to the online customer. Both parts of the description are necessary: the “Movie” abstraction could have another stakeholder (e.g., a “Librarian” or an “Agent”) who has different information needs; the “Online Customer” might also be a stakeholder for other abstractions (e.g., a “Book” abstraction or an “Appliance” abstraction).

Second, the top row of the table contains the properties of the abstraction. Each property is the label of a single column. Each property is uniquely named to reflect its distinct nature. For example, the “Movie” abstraction might have properties named “Title” and “Genre”. Therefore, the top row of the table for the “Movie” abstraction would have one column labelled “Title” and a separate column labelled “Genre”.

Third, cells in the table contain values for different properties. For example, a possible value for the “Title” property might be “Gone With the Wind”. The table is organized so that the value “Gone With the Wind” is in the column with the label “Title”. A possible value for the “Genre” property might be “Drama”. The “Drama” value would be in the “Genre” column.

Fourth, each row in the body of the table (the body does not include the top row) contains the values that collectively describe one instance of the abstraction. An instance is a concrete, specific entity that is being modeled. For example, in the “Movie” abstraction, the values “Gone With the Wind” and “Drama” would be in the same row because these two values describe the same concrete, specific entity.

Abstraction-Table.png

Representing an Abstraction using a Table
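
The four elements described above (description, property labels, values, and instance rows) can be sketched as a small Python structure. The dictionary layout and the helper value_of are assumptions made for this illustration only.

```python
# A table-form abstraction with its four elements.
movie_table = {
    # 1. The description: a name and a stakeholder.
    "name": "Movie",
    "stakeholder": "Online Customer",
    # 2. The top row: one uniquely named property per column.
    "properties": ["Title", "Genre"],
    # 3 and 4. The body: each row holds the values of one instance,
    # and each value sits in the column of its property.
    "rows": [
        ["Gone With the Wind", "Drama"],
    ],
}


def value_of(table, row_index, prop):
    """Return the value in the given row under the column labelled `prop`."""
    column = table["properties"].index(prop)
    return table["rows"][row_index][column]


genre = value_of(movie_table, 0, "Genre")
print(genre)  # Drama
```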

An example of the “Movie” abstraction for “Online Customers” is given in the figure below. In this example, there are five instances. Each instance has six properties: “Title”, “Year”, “Length”, “Genre”, “Format”, and “Price”.

Abstraction-Table-Example.png

Representing the Movie Abstraction using a Table

The ordering of the properties and the ordering of the instances in the table do not matter. For example, the example table above could equally well have been written as shown in the second example below.

Abstraction-Table-Different.png

An Equivalent Table for the Movie Abstraction

In comparing the two tables, notice that the order of the properties in the two tables is different. For example, the “Title” property in the first example is the label of the first column, while in the second example, this property is the label of the fifth column. Also notice that “Gone With the Wind” is the second instance in the first example, but is the fifth instance in the second example.

What is crucial in constructing a table for an abstraction is that the property-value relationships correctly capture each instance. For example, the instance with the value “Moneyball” for the “Title” property in the first table also has the value “Sports” for the “Genre” property. These property-value relationships are the same in the second table (check that this is the case).
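
One way to check this by program is to pair every value with its property label and then compare the two tables while ignoring order. The sketch below uses only two properties, and the column and row arrangements are hypothetical rather than copied from the figures; the helper names are also made up for this example.

```python
def instances(properties, rows):
    """Pair each value with its column label, giving one dict per row."""
    return [dict(zip(properties, row)) for row in rows]


def same_abstraction(a, b):
    """Compare two lists of instances, ignoring property and row order."""
    def as_set(insts):
        return {frozenset(inst.items()) for inst in insts}
    return as_set(a) == as_set(b)


first = instances(["Title", "Genre"],
                  [["Moneyball", "Sports"], ["Gone With the Wind", "Drama"]])

# Same information, with columns and rows in a different order.
second = instances(["Genre", "Title"],
                   [["Drama", "Gone With the Wind"], ["Sports", "Moneyball"]])

# Values placed under the wrong instance: the relationships are broken.
broken = instances(["Genre", "Title"],
                   [["Sports", "Gone With the Wind"], ["Drama", "Moneyball"]])

print(same_abstraction(first, second))  # True
print(same_abstraction(first, broken))  # False
```

Reordering columns or rows leaves the property-value pairs of each instance unchanged, while moving a value into another row changes what the table says.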

Each row in the table is termed an "instance" because it is intended to represent exactly one real-world entity. Similarly, each real-world entity that is included in our abstraction should be represented by exactly one instance (one row in the table). The following figure shows the correspondence between the instances in the Movie abstraction and the real-world movies they are meant to represent. Each real-world movie is depicted in this figure by a "movie poster" icon.

Abstraction-Instance-Correspondence.png

Each Instance Corresponds to One Real-World Entity
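
The rule that each real-world entity appears as exactly one row can also be checked by program. The sketch below assumes, purely for illustration, that a movie is identified by its “Title” and “Year” together; the text does not say which properties identify a movie, and the helper duplicated_entities is hypothetical.

```python
from collections import Counter


def duplicated_entities(instances, key=("Title", "Year")):
    """Return the entities that are represented by more than one row."""
    counts = Counter(tuple(inst[prop] for prop in key) for inst in instances)
    return [entity for entity, n in counts.items() if n > 1]


rows = [
    {"Title": "Moneyball", "Year": 2011},
    {"Title": "Sicko", "Year": 2007},
    {"Title": "Moneyball", "Year": 2011},  # the same movie entered twice
]

dupes = duplicated_entities(rows)
print(dupes)  # [('Moneyball', 2011)]
```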

 

Representing an Abstraction using Text

An abstraction can also be represented in a text form by taking advantage of the property-value relationship that is essential to the abstraction. In the text representation, a single property-value pair is written as

property:value

while an instance—a set of property-value pairs—is written as

{property1:value1, property2:value2, ... propertyn:valuen}

For example, we would write the instance of a “Movie” as:

{Title: Moneyball, Year: 2011, Length: 133 min., Genre: Sports, Format: Blu-ray, Price: $15.00}

The entire collection of instances can be written as a set of such descriptions using a notation like:

( {instance1}, {instance2}, ... {instancen} )

where the parentheses mark the beginning and the end of the entire abstraction. The abstraction is a sequence of instances.

For example, the “Movie” abstraction shown in the first table above can be written in text as follows:

( {Title: Moneyball, Year: 2011, Length: 133 min., Genre: Sports, Format: Blu-ray, Price: $15.00},
  {Title: Gone With the Wind, Year: 1939, Length: 219 min., Genre: Drama, Format: DVD, Price: $10.95},
  {Title: Jurassic Park, Year: 1993, Length: 127 min., Genre: SciFi, Format: DVD, Price: $12.50},
  {Title: Pirates of the Caribbean, Year: 2003, Length: 143 min., Genre: Comedy, Format: Blu-ray, Price: $17.50},
  {Title: Sicko, Year: 2007, Length: 116 min., Genre: Documentary, Format: Streaming, Price: $11.75}  )

We will use something like this text description when we see more about the elements of a programming language.
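
As a preview, here is how the same text form might look in Python, one language whose notation happens to be close to it. The choice of Python is an assumption made for this illustration, since the text does not name a language. Each instance becomes a dictionary of property:value pairs, and the whole abstraction becomes a list of instances.

```python
# The "Movie" abstraction for "Online Customers" as a list of dictionaries.
# Years are written as whole numbers; other values are kept as text.
movies = [
    {"Title": "Moneyball", "Year": 2011, "Length": "133 min.",
     "Genre": "Sports", "Format": "Blu-ray", "Price": "$15.00"},
    {"Title": "Gone With the Wind", "Year": 1939, "Length": "219 min.",
     "Genre": "Drama", "Format": "DVD", "Price": "$10.95"},
    {"Title": "Jurassic Park", "Year": 1993, "Length": "127 min.",
     "Genre": "SciFi", "Format": "DVD", "Price": "$12.50"},
    {"Title": "Pirates of the Caribbean", "Year": 2003, "Length": "143 min.",
     "Genre": "Comedy", "Format": "Blu-ray", "Price": "$17.50"},
    {"Title": "Sicko", "Year": 2007, "Length": "116 min.",
     "Genre": "Documentary", "Format": "Streaming", "Price": "$11.75"},
]

# A value is found by naming its property, not by its position.
dvd_titles = [movie["Title"] for movie in movies if movie["Format"] == "DVD"]
print(dvd_titles)  # ['Gone With the Wind', 'Jurassic Park']
```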

You will notice that the text description, while conveying exactly the same information as the table form, is harder to read than the table form. This is why table forms are used for human-to-human communication. However, computers are designed so that they are adept at working with and communicating through text descriptions.
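
Because the two forms carry the same information, a program can turn one into the other. The sketch below lays out a list of instances as an aligned text table. The function name to_table and the " | " column separator are choices made for this example, and it assumes every instance has the same properties.

```python
def to_table(instances):
    """Lay out a list of instances as an aligned text table."""
    properties = list(instances[0])  # column labels, taken from the first instance
    # Each column is as wide as its longest label or value.
    widths = {
        prop: max(len(prop), *(len(str(inst[prop])) for inst in instances))
        for prop in properties
    }
    lines = [" | ".join(prop.ljust(widths[prop]) for prop in properties)]
    for inst in instances:
        lines.append(
            " | ".join(str(inst[prop]).ljust(widths[prop]) for prop in properties)
        )
    return "\n".join(lines)


sample = [
    {"Title": "Moneyball", "Genre": "Sports"},
    {"Title": "Sicko", "Genre": "Documentary"},
]
table = to_table(sample)
print(table)
```

The printed result has one header line followed by one line per instance, which is the table form described earlier.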