Book - 3.4) Algorithms - Iteration in Blockly (Part 1)
First Steps with Iteration¶
So far we have only been dealing with single values that was provided by a data block— the current temperature or the current stock price. In other words, we have dealt only with a single instance of an abstraction where the abstraction has only one property. The next step is work with multiple instances of an abstraction where the abstraction has only one property. Two observations before we start. First, you will see that the number of instances in our examples is small. Even though we will be working with small example data sets to start with, the algorithms that work for a small data set will also work for a longer data set of the same kind — it will just take longer to run. Second, we will learn later how to work with abstractions that have multiple properties.
We also introduce here another programming concept - data structures. A data structure is a means of organizing data for use. Typical uses are to access elements of the data structure or to add new elements to the data structure.
The first data structure that we will see is that of a list. A list is simply a sequence of individual elements arranged one after the other. The list of nine numbers can be written like this:
[ 12, 222, 13, 42, 15, 16, 73, 1000, 0 ]
where the square brackets surround the list and each item in the list is separated from the next item by a comma. By convention, the list is read from left to right, so the leftmost item is the first item in the list and the rightmost item is the last item in the list. In the example above, the number 12 is the first item and the number 0 is the last item. A list can be used to organize the instances of an abstraction where each instance has a single property.
To work effectively with big data, two concepts are brought together:
- list: a data structure to contain the instances of the abstraction (i.e., the data set), and
- iteration: applying an algorithm to each instance of the abstraction (i.e., to each item of the list) one instance (item) at a time.
For simplicity, we will use the simpler term "list" rather than "list of the instances of the abstraction.
There is a general pattern for processing a list that represents a data stream. The three steps in the pattern are:
- Set the value of a variable to the stream data.
- Perform the iteration using the “for each” where the variable denotes the item in the data stream used for each step in the iteration.
- Apply some processing to the current item.
The general structure can be adapted to individual problems by specializing the kind of processing that is done on each item. The processing step is often one of four kinds:
- Uniform: apply the same processing to each item in the stream.
- Accumulate: compute a global property for the stream where the global property is updated by each item.
- Filter: select certain items in the stream to process and ignore the others.
- Transform: produce a new data stream from the original stream by transforming the individual items in some way.
We will look at examples of each of these four patterns. We will begin by examining the first two patterns, uniform and accumulate. After seeing how decision is represented in Blockly, we will examine the last two patterns, filter and transform.
Iteration: Uniform¶
The BlockPy workspace below shows an example of combining lists and iteration to process each item in a data set in a uniform way. This code simply prints each item of the data stream, one item per line. Run the code and see what values are in the data streams for different cities. Note: you may need to use the scroll bar on the right to be able to see all of the output.
The example above was one of uniform processing. Each item in the data set was printed. No items were excluded.
Iteration: Accumulate¶
An example of processing that involves accumulating is finding the average of the forecasts. To find the average, we need to know how many items there are in the data set and what the total temperature is over all the forecasts. The following BlockPy workspace shows a Blockly program for this.
In this program, two properties of the data set are being accumulated as each item in the set is examined. One property is the number of items in the set. This is the property represented by the variable “num-items”. The second property is the total of all of the temperatures in the data set. This is the property represented by the variable “total-temp”. On each step of the iteration, a “set-to” block is used to update the properties. The “set-to” block updates the selected variable to the value indicated by whatever block is plugged into the “set-to” block. Thus, “set num-items to num-items + 1” means that the value of “num-items” in increased by one and “set total-temp to total-temp + temp” means that the value of total-temp is increased by the value of “temp” (the forecast temperature on this step of the iteration). An important part of the accumulating pattern is that the variables being used to accumulate the global properties are initialized. In this program, the two variables “num-items” and “total-temp” are each initialized to zero. The initialization is necessary so that on the first iteration the two “set-to” blocks make sense. For example, with “num-items” initialized to zero, the block “set num-items to num-items + 1” on the first step of the iteration makes sense (updating the value of num-items to the result of 0+1).
This pattern of data stream processing is called “accumulate” because on each iteration, some variables are gathering increasingly accurate values for selected global properties of the data stream. In this case, the program is accumulating an increasingly more accurate value for the number of items in the stream and the total of the forecast temperatures.
Decision¶
In addition to working with specific temperatures, we often want to know if the temperature is above or below some threshold. Temperatures that are too high may be uncomfortably hot and temperatures that are too low may be uncomfortably cold. For this example, we will take any Fahrenheit temperature below 60 degrees to be “cool”, from 60 to 75 to be “mild”, from 75 to 85 to be “warm” and 85 and above to be “hot”. The work space below uses decision blocks (from the Decisions category) to output the current temperature and whether the current temperature is “cool” or “warm”.
There are two important things to notice about the code in the above work space. The first is the block used for the decision. This block has two slots, each of which holds the condition for an “if” test. The second condition is only tested if the first condition is found to be false. This logic is used because if a temperature is less than 60 degrees then it cannot also be between 75 and 85 degrees. Thus, if the temperature is less than 60 degrees, the decision block causes “Cool” to be printed and does not evaluate the second condition. However, if the temperature is not less than 60 degrees then it is necessary to evaluate the second condition to decide whether or not to print “Warm”.
It is slightly tricky in Blocky to create an “if..do..else if..do” decision block like the one used above. Here is how it is done:
- In the Decisions category select an “if..do” block. Notice that the block has a star in a blue box to the left of the word “if”.
- Click the star in the blue box. In the popup window there is an “if” form on the right, and on the left, an “else if” form and an “else” form. Drag the “else if” form from the left into the slot of “if” form on the right. When it snaps into place you will see the original “if..do” block change to the desired block.
- Click on the star in the blue box to close the popup window.
The second thing to notice about the code in the above workspace is the second condition. To test if the temperature is within a range, there are two separate but related conditions:
- is the temperature at or above the lower limit of the range,
- is the temperature less than the upper limit of the range.
To combine these two separate conditions into one compound condition, an “and” Boolean operator is used. A compound condition with an “and” operator in Boolean logic is true only if both of its two conditions are true (and false otherwise). A related Boolean operator, the “or” operator, is true if either of its two conditions is true (and false otherwise). The block for building compound and/or conditions is in the Logic category. The category also contains a third Boolean operator, “not”. This block has a single slot into which a condition can be plugged. The combination of the “not” and a condition form a compound condition which is true if the condition is false and false if the condition is true. For example, the compound condition
not (temp > 50)
is
- true if the value of the variable temp is 50 or less
- false if the value of the variable temp is greater than 50
Such negated conditions are sometimes used when the statement of a negative condition is easier to understand.