Data—strings and booleans

Objectives

  • Understand the concept of strings and how text can be represented.
  • Be able to use functions attached to data.
  • Understand boolean data and the use of logical operators.

Strings

Creating strings

We are first going to investigate strings, which are how Python represents text data. We’ve actually already come across strings, when we executed the following code:

print "Hello, world!"

The collection of characters that is enclosed in quotation marks (") defines a string. As we saw in the last lesson, we can assign such data to a variable:

hello_text = "Hello, world!"

print hello_text
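
As a small aside, Python also accepts single quotation marks (') around a string. The sketch below illustrates this; the variable names are made up for illustration, and print is written with parentheses around a single value so that the code runs under both Python 2 (as used in this lesson) and Python 3.

```python
# Double and single quotation marks both define a string.
double_quoted = "Hello, world!"
single_quoted = 'Hello, world!'

print(double_quoted)
print(single_quoted)

# Single quotes are handy when the text itself contains double quotes.
quoted_text = 'She said "Hello, world!"'
print(quoted_text)
```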

Accessing characters in strings

The variable hello_text refers to a string, which is a collection of characters. We can access the individual characters using square brackets and an index. Such indices start at 0; for example, to access the first character only we would append [0], as below:

hello_text = "Hello, world!"

print hello_text
print hello_text[0]
Hello, world!
H
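
To make the zero-based counting concrete, here is a minimal sketch that picks out a few more characters. The variable names are illustrative, and len is a built-in function not covered above that gives the number of characters in a string. A single-argument print with parentheses is used so that the code behaves the same in Python 2 and Python 3.

```python
hello_text = "Hello, world!"

# Indices count from 0, so index 4 is the fifth character.
first_char = hello_text[0]   # "H"
fifth_char = hello_text[4]   # "o"
eighth_char = hello_text[7]  # "w"

# The last valid index is one less than the number of characters.
n_chars = len(hello_text)
last_char = hello_text[n_chars - 1]  # "!"

print(first_char)
print(fifth_char)
print(eighth_char)
print(last_char)
```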

We can also access a range of characters by giving two indices separated by a colon (:) character. The first number specifies the start index. The second specifies the end index; what is extracted is up to, but not including, this index. For example, if the first index is 0 and the second index is 2, the characters that are returned are those at indices 0 and 1 (not 2). Putting it all together, we could extract the Hello component of the Hello, World! string by doing:

hello_text = "Hello, World!"

print hello_text[0:5]
Hello

We can also use a shortcut with this form of indexing: if we are starting at 0 then we do not need to specify the number, because Python assumes we want to start at 0. For example:

hello_text = "Hello, World!"

print hello_text[:5]
Hello
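
The same idea works at the other end: if the end index is left out, Python continues to the end of the string. The following sketch, with illustrative variable names, shows a range from the middle of the string alongside both shortcuts. The parenthesised single-argument print runs under both Python 2 and Python 3.

```python
hello_text = "Hello, World!"

# Characters at indices 7, 8, 9, 10 and 11 (12 is not included).
middle = hello_text[7:12]   # "World"

# Omitting the start index means "from the beginning".
start = hello_text[:5]      # "Hello"

# Omitting the end index means "to the end of the string".
rest = hello_text[7:]       # "World!"

print(middle)
print(start)
print(rest)
```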

Tip

This method of accessing components of a variable using square brackets is an important concept, and one that we will return to in future lessons.

Using operators with strings

We can also use some of the operators we’ve encountered already on strings: + joins two strings end to end, and * repeats a string a given number of times:

hello_text = "Hello, world!"

print hello_text + " From Python"

print hello_text * 2
Hello, world! From Python
Hello, world!Hello, world!
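
The results of these operators are themselves strings, so they can be stored in variables and reused. A brief sketch, with hypothetical variable names and a single-argument print that works in both Python 2 and Python 3:

```python
hello_text = "Hello, world!"

# + joins the two strings exactly as written; no space is added for us.
greeting = hello_text + " From Python"

# * repeats the string the given number of times.
divider = "-" * 20

print(divider)
print(greeting)
print(divider)
```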

Of course, some of the operators don’t make very much sense in the context of strings and Python will complain if we try to use them:

hello_text = "Hello, world!"

print hello_text ** 2
Traceback (most recent call last):
  File "<string>", line 3, in <module>
TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

As we can see, when Python encounters the offending line it causes an error—the program stops executing and displays an error message. The error message is somewhat informative, with Python indicating that it doesn’t know how to combine a string and an integer in a ‘pow(er)’ operation.
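
For readers curious about what happens behind the scenes, the sketch below triggers the same error without stopping the program. It relies on try and except, which are not covered in this lesson and are shown only as an illustration; the name error_message is made up. The parenthesised single-argument print runs under both Python 2 and Python 3.

```python
hello_text = "Hello, world!"

error_message = ""

try:
    # Raising a string to a power is not defined, so this line fails.
    result = hello_text ** 2
except TypeError as error:
    # Keep the text of the message so we can look at it afterwards.
    error_message = str(error)

print(error_message)
```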

Tip

Being able to interpret Python’s error messages is an important skill. When you encounter an error, give the associated message a close read. It will tell you the line number of the code that is causing problems, and give you some clues on the nature of the error. See Dealing with errors for further details.

An important point is that things that look like numbers are no longer treated as numbers once they are placed inside quotation marks; they acquire the properties of strings rather than of numbers. For example:

oranges = "2"

Here, the variable oranges does not refer to the number 2, but to a string that contains the character 2. What are the consequences of this? As we’ve seen, we can add numbers and add strings—we can see that Python treats oranges as a string by considering:

oranges = "2"

print oranges + oranges
22

We can see that it has added the character "2" to the character "2" to produce "22", rather than adding the number 2 to the number 2 to produce the number 4.
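
If the number is what we actually want, one option is to convert the string first. The sketch below uses the built-in int function, which is not introduced in this lesson, and illustrative variable names. The single-argument print with parentheses behaves the same in Python 2 and Python 3.

```python
oranges = "2"

# Adding strings joins their characters together.
as_text = oranges + oranges              # "22"

# int() builds a whole number from the characters in the string.
as_number = int(oranges) + int(oranges)  # 4

print(as_text)
print(as_number)
```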

Functions attached to data

Strings are a good example of an important aspect of Python programming—using functions that are ‘attached’ to data. Being ‘attached’ means that functions are associated with a variable and can be executed by putting a period character (.) in between the variable and function name. For example, if the variable apples has a function attached to it called pick, we could access it as apples.pick().

Every string that we define comes with a set of associated functions that can be very useful. For example, the upper function converts all the characters in the string to upper case:

hello_text = "Hello, world!"

print hello_text.upper()
HELLO, WORLD!
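
A couple of other functions attached to strings work in the same way. The sketch below shows lower and replace, with illustrative variable names; note that these functions give back a new string and leave the original untouched. The parenthesised single-argument print runs under both Python 2 and Python 3.

```python
hello_text = "Hello, world!"

# lower() converts all the characters to lower case.
lower_text = hello_text.lower()

# replace() swaps one piece of text for another.
replaced_text = hello_text.replace("world", "Python")

print(lower_text)
print(replaced_text)

# The original string has not been changed.
print(hello_text)
```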

Tip

As we encountered before, we can investigate what functions are attached to a given variable using Spyder by typing the variable name followed by a dot and then pressing TAB. Combined with using CTRL-i to show the help for a given function, this is a powerful way to determine the functionality attached to a variable.

Boolean data

Now we will consider another important type of data in Python, booleans. This is a straightforward type of data that can only be one of two values: True or False.

Tip

Note the capitalisation of True and False. Python is case-sensitive, so true is not the same as True.

For example, say we are running an experiment where we don’t want our program to use the whole of the screen and we are using the Windows operating system. To indicate this, we might define the following variables:

fullscreen = False

is_windows = True

We can combine booleans using logical operators (such as and, or, and not). For example, say we wanted to determine whether we want to run in fullscreen and are also using Windows:

fullscreen = False

is_windows = True

print fullscreen and is_windows
False

The and operator returns True only if both its inputs are True. On the other hand, the or operator returns True if at least one of its inputs is True:

fullscreen = False

is_windows = True

print fullscreen or is_windows
True
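
The behaviour of and and or can be summarised by trying the different combinations of inputs. The following sketch does this with illustrative variable names, using a single-argument print that works in both Python 2 and Python 3.

```python
# "and" needs both inputs to be True.
and_true_true = True and True        # True
and_true_false = True and False      # False
and_false_false = False and False    # False

# "or" needs at least one input to be True.
or_true_true = True or True          # True
or_true_false = True or False        # True
or_false_false = False or False      # False

print(and_true_false)
print(or_true_false)
```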

We can also do negation, which flips around our boolean value:

fullscreen = False

print fullscreen
print not fullscreen
False
True
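
Negation can be combined with the other logical operators. As a sketch, suppose we want to know whether we are running in a window (that is, not fullscreen) on Windows. The variable name windowed_on_windows is made up for illustration, and the parentheses around not fullscreen are there only to make the order of evaluation easy to read. The single-argument print runs under both Python 2 and Python 3.

```python
fullscreen = False
is_windows = True

# True only when we are not fullscreen and we are on Windows.
windowed_on_windows = (not fullscreen) and is_windows

print(windowed_on_windows)
```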

Booleans are particularly useful when we want to control our program flow, which we will see in a later lesson. This often involves testing if two things are equal, which involves boolean data. For example, say we define a subject identifier somewhere in our code:

subj_id = "p1001"

At some point in our code, we might want to know if the subject ID is “p1001”. We can test for this using the comparison operator, which is ==:

subj_id = "p1001"

print subj_id == "p1001"
True

Note the use of the double equals sign in the second line of code above. This is very important, and rather tricky. The single equals sign (=) on the first line of code means assignment. By using the double equals sign (==) in the second line of code, we are doing a comparison.

For example, the following assigns the value 2 to the variable oranges:

oranges = 2

By contrast, the following compares the value of the variable oranges to the value 2, giving either True or False:

oranges = 2

print oranges == 2
True
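
Comparisons are strict about the kind of data involved. The sketch below, using illustrative variable names, shows that a string containing the character 2 is not equal to the number 2, and that string comparisons are sensitive to capitalisation. The parenthesised single-argument print behaves the same in Python 2 and Python 3.

```python
oranges_number = 2
oranges_text = "2"

# A number compared with the same number is equal.
number_matches = oranges_number == 2

# A string is never equal to a number, even if it looks like one.
text_matches_number = oranges_text == 2

# Upper and lower case letters are different characters.
subj_id = "p1001"
same_case = subj_id == "p1001"
different_case = subj_id == "P1001"

print(number_matches)
print(text_matches_number)
print(same_case)
print(different_case)
```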