# Handy data types in the standard library
[](this doesn't explain how dict.setdefault and collections.defaultdict
work because they're not as simple as the things that are here and i
don't actually use them that much)
Now we know how to use lists, tuples and dictionaries. They are commonly
used data types in Python, and there's nothing wrong with them. In this
chapter we'll learn more data types that make some things easier. You
can always do everything with lists and dictionaries, but these data
types can do a lot of the work for you.
> If it looks like a duck and quacks like a duck, it must be a duck.
Many things in this tutorial are not really of a certain type, but they
behave like that type. For example, we'll learn about many classes that
behave like dictionaries. They are not dictionaries, but we can use them
as if they were dictionaries. This programming style is known as
**duck-typing**.
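Here's a small sketch of what duck-typing looks like in practice. The
`longest_name` function is a made-up example, not something from the
standard library. It never checks the type of its argument; it only needs
something we can loop over.

```python
def longest_name(names):
    # We don't check type(names). Anything we can loop over is fine.
    longest = ''
    for name in names:
        if len(name) > len(longest):
            longest = name
    return longest

print(longest_name(['Akuli', 'theelous3']))          # a list
print(longest_name(('Akuli', 'theelous3')))          # a tuple
print(longest_name({'Akuli': 1, 'theelous3': 2}))    # looping over a dictionary gives its keys
```

All three calls print `theelous3`, even though only the first argument is
a list.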
## Sets
Let's say we have a program that keeps track of people's names. We can
store the names in [a list](../basics/lists.md), and adding a new name
is as easy as appending to that list. Lists remember their order and it's
possible to add the same thing multiple times.
```python
>>> names = ['wub_wub', 'theelous3', 'RubyPinch', 'go|dfish', 'Nitori']
>>> names.append('Akuli')
>>> names.append('Akuli')
>>> names
['wub_wub', 'theelous3', 'RubyPinch', 'go|dfish', 'Nitori', 'Akuli', 'Akuli']
>>>
```
This is usually what we need, but sometimes it's not. Sometimes we just
want to store a bunch of things. We don't need to have the same thing
twice and we don't care about the order.
This is when sets come in. They are like lists without order or
duplicates, or keys of [dictionaries](../basics/dicts.md) without the
values. We can create a set just like a dictionary, but without `:`.
```python
>>> names = {'wub_wub', 'theelous3', 'RubyPinch', 'go|dfish', 'Nitori'}
>>> names
{'RubyPinch', 'theelous3', 'go|dfish', 'wub_wub', 'Nitori'}
>>> type(names)
<class 'set'>
>>> 'wub_wub' in names
True
>>>
```
We can also convert anything [iterable](../basics/loops.md#summary) to a
set [by calling the
class](../basics/classes.md#why-should-I-use-custom-classes-in-my-projects).
```python
>>> set('hello')
{'o', 'e', 'h', 'l'}
>>> type(set('hello'))
<class 'set'>
>>>
```
When we did `set('hello')` we lost one `l` and the set ended up in a
different order because sets don't contain duplicates or keep track of
their order.
Note that `{}` is a dictionary because dictionaries are used more often
than sets, so we need `set()` if we want to create an empty set.
```python
>>> type({'a', 'b'})
<class 'set'>
>>> type({'a'})
<class 'set'>
>>> type({})
<class 'dict'>
>>> type(set())    # set() is an empty set
<class 'set'>
>>>
```
Sets have a `remove` method just like lists have, but they have an `add`
method instead of `append`.
```python
>>> names = {'theelous3', 'wub_wub'}
>>> names.add('Akuli')
>>> names
{'wub_wub', 'Akuli', 'theelous3'}
>>> names.remove('theelous3')
>>> names
{'wub_wub', 'Akuli'}
>>>
```
That's the boring part. Now let's have a look at some really handy
things we can do with sets:
```python
>>> a = {'RubyPinch', 'theelous3', 'go|dfish'}
>>> b = {'theelous3', 'Nitori'}
>>> a & b # names in a and b
{'theelous3'}
>>> a | b # names in a, b or both
{'Nitori', 'theelous3', 'go|dfish', 'RubyPinch'}
>>> a ^ b # names in a or b, but not both
{'RubyPinch', 'Nitori', 'go|dfish'}
>>> a - b # names in a but not in b
{'go|dfish', 'RubyPinch'}
>>>
```
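These operators are also handy when our data starts out as lists. The
sketch below uses two made-up lists, `monday` and `tuesday`, and converts
them to sets first. The built-in `sorted` function is only used here to
turn the result back into a list with a predictable order.

```python
monday = ['wub_wub', 'theelous3', 'Nitori', 'theelous3']
tuesday = ['Nitori', 'Akuli', 'theelous3']

# Converting to sets throws away the duplicate 'theelous3' and the order.
both_days = set(monday) & set(tuesday)
only_monday = set(monday) - set(tuesday)

print(sorted(both_days))      # ['Nitori', 'theelous3']
print(sorted(only_monday))    # ['wub_wub']
```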
## Named tuples
It can be tempting to make a class that just contains a bunch of data
and that's it.
```python
class Website:
    def __init__(self, url, founding_year, free_to_use):
        self.url = url
        self.founding_year = founding_year
        self.free_to_use = free_to_use

github = Website('https://github.com/', 2008, True)
```
You should avoid making classes like this. This class has only one
method, so it doesn't really need to be a class. We could just use a
tuple instead:
```python
github = ('https://github.com/', 2008, True)
```
The problem with this is that if someone reading our code sees something
like `website[1] > 2010`, it's not nearly as clear as
`website.founding_year > 2010` would be.
In cases like this, `collections.namedtuple` is handy:
```python
>>> import collections
>>> Website = collections.namedtuple('Website', 'url founding_year free_to_use')
>>> github = Website('https://github.com/', 2008, True)
>>> github[1]
2008
>>> for thing in github:
...     print(thing)
...
https://github.com/
2008
True
>>> github.founding_year
2008
>>> github
Website(url='https://github.com/', founding_year=2008, free_to_use=True)
>>>
```
As you can see, our `github` behaves like a tuple, but things like
`github.founding_year` also work and `github` looks nice when we have a
look at it on the `>>>` prompt.
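Here's a slightly bigger sketch of how named tuples might be used in a
program. The second website and its founding year are made-up values that
are only there for illustration.

```python
import collections

Website = collections.namedtuple('Website', 'url founding_year free_to_use')

websites = [
    Website('https://github.com/', 2008, True),
    Website('https://example.com/', 1995, True),    # made-up example data
]

# Unpacking works because a named tuple is still a tuple.
url, founding_year, free_to_use = websites[0]
print(url, founding_year)

# The attribute names make conditions like this easy to read.
old_sites = []
for site in websites:
    if site.founding_year < 2000:
        old_sites.append(site.url)
print(old_sites)
```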
## Deques
To understand deques, we need to first learn about a list method I
haven't talked about earlier. It's called `pop` and it works like this:
```python
>>> names = ['wub_wub', 'theelous3', 'Nitori', 'RubyPinch', 'go|dfish']
>>> names
['wub_wub', 'theelous3', 'Nitori', 'RubyPinch', 'go|dfish']
>>> names.pop()
'go|dfish'
>>> names
['wub_wub', 'theelous3', 'Nitori', 'RubyPinch']
>>> names.pop()
'RubyPinch'
>>> names
['wub_wub', 'theelous3', 'Nitori']
>>>
```
The list shortens from the end by one when we pop from it, and we also
get the removed item back. So we can add an item to the end of a list
using `append`, and we can remove an item from the end using `pop`.
It's also possible to do these things at the beginning of a list, but
lists were not designed to be used that way and it would be slow if our
list were big. The `collections.deque` class makes appending and
popping from both ends easy and fast. It works a lot like a list, but it
also has `appendleft` and `popleft` methods.
```python
>>> names = collections.deque(['theelous3', 'Nitori', 'RubyPinch'])
>>> names
deque(['theelous3', 'Nitori', 'RubyPinch'])
>>> names.appendleft('wub_wub')
>>> names.append('go|dfish')
>>> names
deque(['wub_wub', 'theelous3', 'Nitori', 'RubyPinch', 'go|dfish'])
>>> names.popleft()
'wub_wub'
>>> names.pop()
'go|dfish'
>>> names
deque(['theelous3', 'Nitori', 'RubyPinch'])
>>>
```
The deque behaves a lot like lists do, and we can do `list(names)` if we
need a list instead of a deque for some reason.
Deques are often used as queues. This means that items are always added
to one end and popped from the other end.
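A queue like that could be sketched as follows. The names `queue` and
`served` are just made up for this example.

```python
import collections

queue = collections.deque()

# New people join the queue at the right end...
queue.append('wub_wub')
queue.append('theelous3')
queue.append('Nitori')

# ...and they are served from the left end, in the order they arrived.
served = []
while queue:    # an empty deque is treated as false, just like an empty list
    served.append(queue.popleft())

print(served)    # ['wub_wub', 'theelous3', 'Nitori']
```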
## Counting things
Back in [the dictionary chapter](../basics/dicts.md#examples) we learned
to count how many times each word appears in a sentence like this:
```python
sentence = input("Enter a sentence: ")
counts = {}
for word in sentence.split():
    if word in counts:
        counts[word] += 1
    else:
        counts[word] = 1
```
This code works just fine, but there are easier ways to do this. For
example, we could use the `get` method. Doing
`the_dict.get('hi', 'hello')` tries to give us `the_dict['hi']`, but it
gives us `'hello'` instead if `'hi'` is not in the dictionary.
```python
>>> the_dict = {'hi': 'this is working'}
>>> the_dict.get('hi', 'lol its not there')
'this is working'
>>> the_dict.get('hello', 'lol its not there')
'lol its not there'
>>>
```
So we could write code like this instead:
```python
sentence = input("Enter a sentence: ")
counts = {}
for word in sentence.split():
    counts[word] = counts.get(word, 0) + 1
```
Counting things like this is actually so common that there's [a
class](../basics/classes.md) just for that. It's called
`collections.Counter` and it works like this:
```python
>>> import collections
>>> words = ['hello', 'there', 'this', 'test', 'is', 'a', 'hello', 'test']
>>> counts = collections.Counter(words)
>>> counts
Counter({'test': 2, 'hello': 2, 'is': 1, 'this': 1, 'there': 1, 'a': 1})
>>>
```
Now `counts` is a Counter object. It behaves a lot like a dictionary,
and everything that works with a dictionary should also work with a
counter. We can also convert the counter to a dictionary by doing
`dict(the_counter)` if something doesn't work with a counter.
```python
>>> for word, count in counts.items():
...     print(word, count)
...
test 2
is 1
this 1
there 1
a 1
hello 2
>>>
```
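Counters can also do a couple of things that plain dictionaries can't.
The sketch below shows two of them: looking up a missing key gives 0
instead of an error, and the `most_common` method returns the items with
the biggest counts as a list of `(item, count)` tuples.

```python
import collections

words = 'hello there this test is a hello test'.split()
counts = collections.Counter(words)

print(counts['hello'])    # 2
print(counts['nope'])     # 0, a plain dictionary would raise KeyError here

# The two most common words and their counts.
top_two = counts.most_common(2)
print(top_two)
```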
## Combining dictionaries
We can combine strings, lists, tuples and sets easily.
```python
>>> "hello" + "world"
'helloworld'
>>> [1, 2, 3] + [4, 5]
[1, 2, 3, 4, 5]
>>> (1, 2, 3) + (4, 5)
(1, 2, 3, 4, 5)
>>> {1, 2, 3} | {4, 5}
{1, 2, 3, 4, 5}
>>>
```
But how about dictionaries? They can't be added together with `+`.
```python
>>> {'a': 1, 'b': 2} + {'c': 3}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
>>>
```
Dictionaries have an `update` method that adds everything from another
dictionary into it. So we can merge dictionaries like this:
```python
>>> merged = {}
>>> merged.update({'a': 1, 'b': 2})
>>> merged.update({'c': 3})
>>> merged
{'c': 3, 'b': 2, 'a': 1}
>>>
```
Or we can [write a function](../basics/defining-functions.md) like this:
```python
>>> def merge_dicts(dictlist):
...     result = {}
...     for dictionary in dictlist:
...         result.update(dictionary)
...     return result
...
>>> merge_dicts([{'a': 1, 'b': 2}, {'c': 3}])
{'c': 3, 'b': 2, 'a': 1}
>>>
```
Just like counting things, merging dictionaries is needed so often that
there's a class just for it in the `collections` module. It's called
`ChainMap`:
```python
>>> import collections
>>> merged = collections.ChainMap({'a': 1, 'b': 2}, {'c': 3})
>>> merged
ChainMap({'b': 2, 'a': 1}, {'c': 3})
>>>
```
Our `merged` is kind of like the Counter object we created earlier. It's
not a dictionary, but it behaves like a dictionary.
```python
>>> for key, value in merged.items():
...     print(key, value)
...
c 3
b 2
a 1
>>> dict(merged)
{'c': 3, 'b': 2, 'a': 1}
>>>
```
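The examples above don't have the same key in more than one dictionary.
Here's a sketch of what happens when they do, using made-up `defaults`
and `user_settings` dictionaries. A ChainMap looks for the key in its
first dictionary first, while `update` lets the dictionary added last
win. A ChainMap also doesn't copy anything, so changes to the original
dictionaries show up in it.

```python
import collections

defaults = {'color': 'red', 'size': 10}
user_settings = {'color': 'blue'}

# The ChainMap checks user_settings first and then defaults.
settings = collections.ChainMap(user_settings, defaults)
print(settings['color'])    # blue
print(settings['size'])     # 10

# The ChainMap only refers to the dictionaries, it doesn't copy them.
defaults['size'] = 12
print(settings['size'])     # 12

# With update, the dictionary that is added last wins.
merged = {}
merged.update(user_settings)
merged.update(defaults)
print(merged['color'])      # red
```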
Starting with Python 3.5 it's possible to merge dictionaries like this.
**Don't do this unless you are sure that no-one will need to run your
code on Python versions older than 3.5.**
```python
>>> first = {'a': 1, 'b': 2}
>>> second = {'c': 3, 'd': 4}
>>> {**first, **second}
{'d': 4, 'c': 3, 'a': 1, 'b': 2}
>>>
```
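If both dictionaries contain the same key, the value from the dictionary
on the right is used, just like with `update`. Here's a small sketch
with made-up values:

```python
first = {'a': 1, 'b': 2}
second = {'b': 20, 'c': 3}

merged = {**first, **second}
print(merged == {'a': 1, 'b': 20, 'c': 3})    # True

# The original dictionaries are not changed.
print(first == {'a': 1, 'b': 2})              # True
```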
## Summary
- Duck typing means requiring some behavior instead of some type. For
example, instead of making a function that takes a list we could make
a function that takes anything [iterable](../basics/loops.md#summary).
- Sets and the collections module are handy. Use them.
***
If you have trouble with this tutorial please [tell me about
it](../contact-me.md) and I'll make this tutorial better. If you
like this tutorial, please [give it a
star](../README.md#how-can-i-thank-you-for-writing-and-sharing-this-tutorial).
You may use this tutorial freely at your own risk. See
[LICENSE](../LICENSE).
[Previous](../basics/classes.md) | [Next](functions.md) |
[List of contents](../README.md#advanced)