Python Data Structures Part 2: Tuples & Sets

This notebook is about Python tuples and sets.

Python tuples behave like lists that cannot be changed after creation, and sets are unordered collections that can only contain unique values.

  • lists
  • tuples
  • sets
  • dicts

Part of #100DaysofCode Python Edition. Follow along at https://jcutrer.com/100daysofcode

Python Docs Reference: https://docs.python.org/3/tutorial/datastructures.html


Tuples

A tuple consists of a number of values separated by commas, for instance:

In [3]:
t = 2015,2016,2017,2018,2019,2020
t
Out[3]:
(2015, 2016, 2017, 2018, 2019, 2020)

Tuples can be nested, just like lists.

In [8]:
t2 = ((1,2,3),(4,5,6),(7,8,9))
t2
Out[8]:
((1, 2, 3), (4, 5, 6), (7, 8, 9))

Tuples are immutable.

In [9]:
t2[0] = 101
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-0dcdb70712ab> in <module>
----> 1 t2[0] = 101

TypeError: 'tuple' object does not support item assignment

A TypeError exception is raised when you try to reassign a value in a tuple.

Tuples can contain mutable data structures such as lists, and the items inside those lists can still be changed.

In [11]:
t3 = (["John", "Mary", "Jack"], ["Paul", "Martha", "Sam"])
t3
Out[11]:
(['John', 'Mary', 'Jack'], ['Paul', 'Martha', 'Sam'])
In [12]:
t3[0][2] = "David"
t3
Out[12]:
(['John', 'Mary', 'David'], ['Paul', 'Martha', 'Sam'])

But if we try to replace the entire list we get an error.

In [13]:
t3[0] = ['Susan', 'Casey', 'Lee']
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-1f2817f32fb5> in <module>
----> 1 t3[0] = ['Susan', 'Casey', 'Lee']

TypeError: 'tuple' object does not support item assignment
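
The two cells above suggest that a tuple's immutability is shallow: the tuple fixes which objects it holds, not what is inside them. A minimal sketch of that idea follows; the variable names first_list, same_object and hashable are illustrative only. It also shows one practical consequence, which is that a tuple holding a list cannot be hashed.

```python
# A tuple fixes WHICH objects it holds, not the contents of those objects.
t3 = (["John", "Mary", "Jack"], ["Paul", "Martha", "Sam"])
first_list = t3[0]

t3[0].append("David")               # allowed: mutates the inner list in place
same_object = t3[0] is first_list   # the tuple still refers to the same list

# A tuple is hashable only if every item in it is hashable.
try:
    hash(t3)
    hashable = True
except TypeError:
    hashable = False

print(same_object)                          # True
print(hashable)                             # False
print(hash((1, 2, 3)) == hash((1, 2, 3)))   # True
```

This is why a tuple of lists cannot be used as a dict key or stored in a set, while a tuple of numbers or strings can.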

Tuples can easily be converted into lists.

In [14]:
t = ("On", "Off", "Bypass")
l = list(t)
print(l)
['On', 'Off', 'Bypass']

Lists can be converted to tuples as well.

In [16]:
mylist = ["Running", "Walking", "Jumping"]
mytuple = tuple(mylist)
print(mytuple)
('Running', 'Walking', 'Jumping')

Tuple Slicing

Slicing a tuple works the same way as slicing a list.

In [3]:
colors = ("red", "green", "blue", "yellow", "purple")
colors[:3]
Out[3]:
('red', 'green', 'blue')
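
A few more slice forms, sketched with the same colors tuple. Negative indices and a step value behave as they do for lists, and the result of slicing a tuple is always a new tuple. The variable names are illustrative only.

```python
colors = ("red", "green", "blue", "yellow", "purple")

last_two = colors[-2:]          # negative index counts from the end
every_other = colors[::2]       # a step of 2 skips every second item
reversed_colors = colors[::-1]  # a step of -1 walks backwards

print(last_two)           # ('yellow', 'purple')
print(every_other)        # ('red', 'blue', 'purple')
print(reversed_colors)    # ('purple', 'yellow', 'blue', 'green', 'red')
print(type(colors[:3]))   # <class 'tuple'>
```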

The parentheses surrounding a tuple are optional, so you will sometimes see tuples defined without them.

In [5]:
directions = "north", "east", "south", "west"
directions
Out[5]:
('north', 'east', 'south', 'west')

Parentheses alone do not create a tuple. A single value in parentheses evaluates to the type of that value; a one-item tuple needs a trailing comma, as in (100,).

In [6]:
type(("a string")) # not a tuple, just a parenthesized str
Out[6]:
str
In [7]:
type((100)) # not a tuple, just a parenthesized int
Out[7]:
int
In [8]:
type((1.333)) # not a tuple, just a parenthesized float
Out[8]:
float
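
For comparison, here is a short sketch of how a one-item tuple can be written. It is the comma, not the parentheses, that makes the tuple. The variable names are illustrative only.

```python
single = ("a string",)      # trailing comma makes a one-item tuple
also_single = 100,          # parentheses are optional here too
not_a_tuple = ("a string")  # no comma: this is just a str
empty = ()                  # the empty tuple is the one case with no comma

print(type(single))       # <class 'tuple'>
print(type(also_single))  # <class 'tuple'>
print(type(not_a_tuple))  # <class 'str'>
print(len(single), len(empty))  # 1 0
```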

If a single iterable is passed to tuple(), that iterable's items become the items of the tuple.

In this example we are simply converting a list to a tuple.

In [8]:
a = tuple(['a','b','c'])
print(a)
('a', 'b', 'c')

Here is another example using range(), which is also iterable.

In [9]:
print(tuple(range(1,10)))
(1, 2, 3, 4, 5, 6, 7, 8, 9)

namedtuple

A namedtuple is closely related to a tuple but also allows access to its items by name. This functionality comes from the collections module, which is part of the standard library.

In [17]:
from collections import namedtuple

Position = namedtuple('Position', ['lat', 'long'])

location = Position(34.34314, 114.38423)
print(location)
Position(lat=34.34314, long=114.38423)
In [18]:
print(location.lat)
print(location.long)
34.34314
114.38423
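
Because a namedtuple is still a tuple, indexing and unpacking keep working. The sketch below reuses the Position type from above and also shows the helper methods _replace() and _asdict() and the _fields attribute. The names moved and as_dict are illustrative only.

```python
from collections import namedtuple

Position = namedtuple('Position', ['lat', 'long'])
location = Position(34.34314, 114.38423)

lat, long = location                 # unpacks like a regular tuple
first = location[0]                  # index access still works
moved = location._replace(lat=35.0)  # returns a NEW Position; location is unchanged
as_dict = dict(location._asdict())   # field names mapped to values

print(Position._fields)  # ('lat', 'long')
print(first == lat)      # True
print(moved)             # Position(lat=35.0, long=114.38423)
print(as_dict)           # {'lat': 34.34314, 'long': 114.38423}
```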

Unpacking Tuples

Values in a tuple can be unpacked into separate variables with a single assignment.

In [13]:
coords = (39.13423, -110.33234)
(lat, long) = coords

print(lat)
print(long)
39.13423
-110.33234

This approach requires exactly as many variables on the left as there are items in the tuple.

In [16]:
coords = (39.13423, -110.33234)
(lat, long, elevation) = coords
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-20f625f4284d> in <module>
      1 coords = (39.13423, -110.33234)
----> 2 (lat, long, elevation) = coords

ValueError: not enough values to unpack (expected 3, got 2)

In the above example, a ValueError exception is raised because there is no third item in the tuple.
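
When the number of items is not known in advance, a starred name can collect the leftovers so the counts no longer have to match. This is a minimal sketch; the extra values in coords are made up for illustration.

```python
# Hypothetical tuple with extra trailing values
coords = (39.13423, -110.33234, 1520.0, "survey marker")

lat, long, *rest = coords            # rest collects whatever is left, as a list
first, *middle, last = (1, 2, 3, 4, 5)
lat_only, *others = (39.13423,)      # the starred name may end up empty

print(lat, long)    # 39.13423 -110.33234
print(rest)         # [1520.0, 'survey marker']
print(middle)       # [2, 3, 4]
print(others)       # []
```

Note that the starred name always receives a list, even though the source is a tuple.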


Sets

A set is an unordered collection with no duplicate elements.

Sets are constructed using {} curly brackets or the set() function. Empty braces {} create a dict, so an empty set must be written as set().

In [22]:
vehicles = {'car', 'truck', 'bus', 'motorcycle', 'bicycle'}
In [23]:
type(vehicles)
Out[23]:
set
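
A short sketch of the construction rules. Duplicates are dropped when a set is built, and empty braces give a dict rather than a set. The variable names are illustrative only.

```python
empty_braces = {}                    # this is an empty dict, not a set
empty_set = set()                    # the way to write an empty set
from_list = set([3, 1, 3, 2, 1])     # duplicates in the list are dropped
with_dupes = {'car', 'car', 'bus'}   # duplicates in a literal are dropped too

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
print(sorted(from_list))   # [1, 2, 3]
print(len(with_dupes))     # 2
```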

Passing a string to set() will result in a set of all the unique characters present in the string, including spaces and punctuation.

In [25]:
set('hello how are you doing today?')
Out[25]:
{' ',
 '?',
 'a',
 'd',
 'e',
 'g',
 'h',
 'i',
 'l',
 'n',
 'o',
 'r',
 't',
 'u',
 'w',
 'y'}

This can be very useful if you first split a larger block of text into words.

The following example will give you a set of all the unique words.

In [33]:
paragraph = """Hello, how are you doing today?
I hope you have a wonderful day.
Today is actually my birthday!"""

set(paragraph.split())
Out[33]:
{'Hello,',
 'I',
 'Today',
 'a',
 'actually',
 'are',
 'birthday!',
 'day.',
 'doing',
 'have',
 'hope',
 'how',
 'is',
 'my',
 'today?',
 'wonderful',
 'you'}

Let's improve on the above code by first removing all punctuation.

In [37]:
paragraph = """Hello, how are you doing today?
I hope you have a wonderful day.
Today is actually my birthday!"""

punctuations = ",.!?;"
for p in punctuations:
    paragraph = paragraph.replace(p, "")
                                   
set(paragraph.split())
Out[37]:
{'Hello',
 'I',
 'Today',
 'a',
 'actually',
 'are',
 'birthday',
 'day',
 'doing',
 'have',
 'hope',
 'how',
 'is',
 'my',
 'today',
 'wonderful',
 'you'}
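
The result above still treats 'Today' and 'today' as different words. One possible refinement, sketched here as an alternative rather than the notebook's own approach, is to strip punctuation with str.translate() and string.punctuation and to lowercase the text before splitting.

```python
import string

paragraph = """Hello, how are you doing today?
I hope you have a wonderful day.
Today is actually my birthday!"""

# Translation table that deletes every ASCII punctuation character
table = str.maketrans("", "", string.punctuation)

# Lowercase first so 'Today' and 'today' count as one word
words = set(paragraph.translate(table).lower().split())

print(sorted(words))   # sorted() gives a stable, readable order
print(len(words))      # 16
```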

Membership testing

Test if a value is present in a set.

In [24]:
'bicycle' in vehicles
Out[24]:
True

Comparing Sets

Two or more sets can be compared using these powerful math operations to find their differences and similarities.

  • issubset
  • issuperset
  • intersection
  • union
  • difference
  • symmetric_difference

Each set operation has both a named method and an operator.

Operation                     Equivalent
s.issubset(t)                 s <= t
s.issuperset(t)               s >= t
s.union(t)                    s | t
s.intersection(t)             s & t
s.difference(t)               s - t
s.symmetric_difference(t)     s ^ t

subsets

Tests whether every item in the left set is also in the right set.

In [47]:
# is 123 a subset of 123456?
{1,2,3} <= {1,2,3,4,5,6}
Out[47]:
True

supersets

Tests whether the left set contains every item in the right set.

In [49]:
# is 123 a superset of 123456?
{1,2,3} >= {1,2,3,4,5,6}
Out[49]:
False
In [52]:
# is 123456 a superset of 123?
{1,2,3,4,5,6} >= {1,2,3}
Out[52]:
True
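
Two details are worth a quick sketch here. The strict operators < and > test for a proper subset or superset, and the named methods accept any iterable while the operators require both sides to be sets. The names small and big are illustrative only.

```python
small = {1, 2, 3}
big = {1, 2, 3, 4, 5, 6}

print(small <= small)   # True: every set is a subset of itself
print(small < small)    # False: a proper subset must be strictly smaller
print(small < big)      # True
print(big > small)      # True

# Named methods accept any iterable, such as a list
print(small.issubset([1, 2, 3, 4]))   # True

# Operators only work between sets
try:
    small <= [1, 2, 3, 4]
    operator_worked = True
except TypeError:
    operator_worked = False
print(operator_worked)  # False
```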

union

Combines two sets, keeping each value only once.

In [54]:
{1,2,3,4,5} | {4,5,6,7,8}
Out[54]:
{1, 2, 3, 4, 5, 6, 7, 8}
In [66]:
{1,2,3,6} | {2,3,4,5}
Out[66]:
{1, 2, 3, 4, 5, 6}

intersection

Returns a set of values that are in both sets

In [56]:
{1,2,3,4,5,6} & {4,5,6,7,8,9}
Out[56]:
{4, 5, 6}

If there is no overlap, an empty set is returned

In [60]:
{1,2,3,4} & {7,8,9}
Out[60]:
set()

difference

Returns the values that are present in the left set but NOT in the right set.

In [75]:
{1,2,3,4,5,6,7,8} - {1,2,3}
Out[75]:
{4, 5, 6, 7, 8}
In [77]:
{1,2,3} - {2,3,4,5,6,7,8}
Out[77]:
{1}

symmetric_difference

Returns the values that are in exactly one of the two sets.

In [79]:
{1,2,3,4,5,6} ^ {4,5,6,7,8,9}
Out[79]:
{1, 2, 3, 7, 8, 9}

symmetric_difference is the opposite of intersection: it returns everything in the union that is not in the intersection.
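
That relationship can be checked directly. In this sketch the sets a and b are the ones from the example above, and each comparison confirms a different way of expressing the same result.

```python
a = {1, 2, 3, 4, 5, 6}
b = {4, 5, 6, 7, 8, 9}

sym = a ^ b

print(sym == (a | b) - (a & b))   # True: union minus intersection
print(sym == (a - b) | (b - a))   # True: both one-sided differences combined
print(sym.isdisjoint(a & b))      # True: shares nothing with the intersection
print(sym == a.symmetric_difference(b))  # True: method and operator agree
```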

set() methods

.add() adds a single item to a set. Adding a value that is already present has no effect.

In [7]:
s = {1,2,3}
print(type(s))

s.add(44)
s.add('some string')

print(s)
<class 'set'>
{1, 2, 3, 44, 'some string'}

The methods .remove() and .discard() both remove items from a set.

In [8]:
s = {1,2,3,4,5,6,7,8,9,10}
s.remove(1)
s.discard(5)
s
Out[8]:
{2, 3, 4, 6, 7, 8, 9, 10}

Attempting to remove() a non-existent item from a set raises a KeyError exception.

In [9]:
s.remove(100)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-9-d197555c5fbf> in <module>
----> 1 s.remove(100)

KeyError: 100

discard() silently does nothing when you try to remove the same non-existent item.

In [13]:
s.discard(100)

.clear() removes all items from a set

In [20]:
s = {1,2,3,4,5,6,7,8,9,10}

s.clear()
s
Out[20]:
set()

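.pop() removes and returns an arbitrary item from a set. Sets are unordered, so there is no "first" item; the 1 returned below is an implementation detail and should not be relied on.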

In [18]:
s = {1,2,3,4,5,6,7,8,9,10}
item = s.pop()

print(item)
print(s)
1
{2, 3, 4, 5, 6, 7, 8, 9, 10}
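
A small sketch of what can and cannot be assumed about .pop(). The only guarantees are that the returned item came from the set and that the set shrinks by one. Calling .pop() on an empty set raises a KeyError. The fruit names are made up for illustration.

```python
s = {"apple", "banana", "cherry"}
original = set(s)          # copy so we can compare afterwards

item = s.pop()             # which item comes back is not specified

print(item in original)    # True
print(item in s)           # False: it has been removed
print(len(s))              # 2

# pop() on an empty set raises KeyError
empty = set()
try:
    empty.pop()
    raised = False
except KeyError:
    raised = True
print(raised)              # True

# If a specific item is wanted, ask for it explicitly
smallest = min({3, 1, 2})
print(smallest)            # 1
```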

Convert a set to a comma-separated string. Because sets are unordered, the order of the items in the result is arbitrary.

In [27]:
weekdays = {"Monday", "Tuesday", "Wednesday", "Thursday", "Friday"}
','.join(weekdays)
Out[27]:
'Tuesday,Monday,Friday,Thursday,Wednesday'

If we attempt to do the same on a set that contains non-string values we run into a problem...

In [30]:
s = {1,2,3,4,5,6,7,8,9,10}
','.join(s)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-29f7768f7db1> in <module>
      1 s = {1,2,3,4,5,6,7,8,9,10}
----> 2 ','.join(s)

TypeError: sequence item 0: expected str instance, int found

A TypeError is raised because the items are integers, not strings.

We can fix this using the following generator expression, which passes each item through str() before joining.

In [33]:
s = {1,2,3,4,5,6,7,8,9,10}
','.join(str(i) for i in s)
Out[33]:
'1,2,3,4,5,6,7,8,9,10'
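
Since the iteration order of a set is not guaranteed, one way to get a predictable string is to sort the items before joining. This is a minimal sketch; the values in numbers are made up for illustration.

```python
weekdays = {"Monday", "Tuesday", "Wednesday", "Thursday", "Friday"}
numbers = {10, 2, 33, 4}

# sorted() returns a list in a defined order (alphabetical for strings)
print(','.join(sorted(weekdays)))   # Friday,Monday,Thursday,Tuesday,Wednesday

# Sort the integers numerically first, then convert each one to str
joined_numbers = ','.join(map(str, sorted(numbers)))
print(joined_numbers)               # 2,4,10,33
```

Sorting before converting matters for numbers: sorting the strings instead would place '10' before '2'.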

Thanks for following along. We have come to the end of the Tuples and Sets section. Next, we will look at dictionaries.

This notebook is part of my #100DaysofCode Python Edition project.