# Set

Python has a built-in type called `set`. The `set` type has the following characteristics:

* A set is a collection that is unordered and unindexed.
* Set elements are unique. Duplicate elements are not allowed.
* A set itself is mutable, but the elements within a set must be immutable (hashable).
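
The characteristics listed above can be checked with a short sketch. The variable names here are illustrative only.

```python
# Duplicates collapse: a set keeps one copy of each element
colors = {'red', 'green', 'red', 'blue'}
print(len(colors))  # 3

# Sets are mutable: elements can be added after creation
colors.add('yellow')
print('yellow' in colors)  # True

# Elements must be immutable (hashable): a list is rejected
try:
    bad_set = {[1, 2], 3}
except TypeError as error:
    print(type(error).__name__)  # TypeError
```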

You can create sets in two ways:

1. using the `set()` constructor.
2. using **curly brackets** `{}`.

## `set()`

You can pass an **ITERABLE** object such as a list or tuple to `set(<iter>)`. This returns the elements of the list or tuple as a `set`, displayed in curly brackets `{}`. Any iterable object can be converted to a **set** using `set()`. You can think of `set()` as similar to the [`extend()`](/datascience/lists.md#extend) method of lists: it walks through the iterable and takes its items one by one.

A list within `set()`.

{% tabs %}
{% tab title="Code" %}

```python
numbers = set([1, 2, 3, 4, 5, 6, 7])

print(numbers)
```

{% endtab %}

{% tab title="Output" %}

```python
{1, 2, 3, 4, 5, 6, 7}
```

{% endtab %}
{% endtabs %}

A tuple within `set()`.

{% tabs %}
{% tab title="Code" %}

```python
numbers = set((1, 2, 3, 4, 5, 6, 7))

print(numbers)
```

{% endtab %}

{% tab title="Output" %}

```python
{1, 2, 3, 4, 5, 6, 7}
```

{% endtab %}
{% endtabs %}

A string within `set()`.

{% tabs %}
{% tab title="Code" %}

```python
my_letters = set('ABCDEF')

print(my_letters)
```

{% endtab %}

{% tab title="Output" %}

```python
{'E', 'F', 'C', 'B', 'A', 'D'}
```

{% endtab %}
{% endtabs %}
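
Lists, tuples and strings are not the only iterables that `set()` accepts. As a small sketch, the example below uses a `range`, a dictionary and a generator expression. The names `ages` and `squares` are made up for illustration. Because sets are unordered, the results are compared with `==` rather than printed directly.

```python
# range object
small_numbers = set(range(5))
print(small_numbers == {0, 1, 2, 3, 4})  # True

# dictionary: iterating over a dict yields its keys
ages = {'Anna': 31, 'Jan': 28}
names = set(ages)
print(names == {'Anna', 'Jan'})  # True

# generator expression: -2 and 2 both produce 4, so duplicates collapse
squares = set(n * n for n in [-2, -1, 0, 1, 2])
print(squares == {0, 1, 4})  # True
```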

When converting an iterable object to a set, the returned set is **deduplicated**.

{% tabs %}
{% tab title="Code" %}

```python
# Example 1
my_cities = set(['Krakow', 'Warsaw', 'Warsaw', 'Kielce'])

print(my_cities)


# Example 2
my_letters = set('AaBBCCDDEEE')

print(my_letters)


# Example 3
my_numbers = set('12345342')

print(my_numbers)
```

{% endtab %}

{% tab title="Outputs" %}

```python
# Example 1 output
{'Krakow', 'Kielce', 'Warsaw'}

# Example 2 output
{'D', 'E', 'B', 'C', 'a', 'A'}

# Example 3 output
{'4', '1', '3', '2', '5'}
```

{% endtab %}
{% endtabs %}

You can see that the outputs are unordered and deduplicated. The original order is not kept. `set()` only accepts an object that is **iterable**, such as a string, list or tuple. Integers, for example, are not iterable, so Python raises a `TypeError` when we try to create a set from an integer.

{% tabs %}
{% tab title="Code" %}

```python
my_numbers = set(12345342)

print(my_numbers)
```

{% endtab %}

{% tab title="Output" %}

```python
TypeError                                 Traceback (most recent call last)
<ipython-input-5-2b17322ba0b5> in <module>()
----> 1 my_numbers = set(12345342)
      2 
      3 print(my_numbers)

TypeError: 'int' object is not iterable
```

{% endtab %}
{% endtabs %}
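
If you still need a set built from an integer, there are two possible workarounds, depending on what you want the set to contain. This is only a sketch; which one fits depends on your intent.

```python
number = 12345342

# A set of the distinct digit characters: convert to a string first
digits = set(str(number))
print(digits == {'1', '2', '3', '4', '5'})  # True

# A set holding the integer itself: wrap it in a list
single = set([number])
print(single)  # {12345342}
```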

## Curly brackets `{}`

You can create a `set` using **curly brackets** `{}`. **Curly brackets** `{}` may contain only **IMMUTABLE** (hashable) objects. Each element has to be separated by a comma, similar to [lists](/datascience/lists.md) and [tuples](/datascience/tuples.md). In other words, a set can be created as `{<obj1>, <obj2>, <obj3>, ......, <objn>}`.

{% tabs %}
{% tab title="Code" %}

```python
# Example 1
my_cities = {'Krakow', 'Warsaw', 'Warsaw', 'Kielce'}

print(my_cities)


# Example 2
my_letters = {'AaBBCCDDEEE'}

print(my_letters)


# Example 3
my_numbers = {12345342}

print(my_numbers)
```

{% endtab %}

{% tab title="Outputs" %}

```python
# Example 1 output
{'Kielce', 'Warsaw', 'Krakow'}

# Example 2 output
{'AaBBCCDDEEE'}

# Example 3 output
{12345342}
```

{% endtab %}
{% endtabs %}

As you can see, the **curly brackets** do not iterate through iterable elements. Each object is placed in the set intact regardless of iterability.
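
The requirement that elements be immutable can be sketched as follows. Tuples and `frozenset` (the immutable variant of `set`) are accepted as elements, while a tuple that contains a list is rejected because the list inside it is mutable. The variable names are illustrative.

```python
# A tuple is immutable, so it can be a set element
points = {(0, 0), (1, 2), (0, 0)}
print(len(points))  # 2

# A tuple that contains a list is not hashable, so Python rejects it
try:
    broken = {(1, [2, 3])}
except TypeError as error:
    print(type(error).__name__)  # TypeError

# frozenset elements make a "set of sets" possible
groups = {frozenset({1, 2}), frozenset({2, 1})}
print(len(groups))  # 1
```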

## Empty set

A set can also be empty, just like an empty list or an empty tuple. You can create an empty set only with the built-in `set()` function, because Python interprets empty **curly brackets** `{}` as an empty dictionary.

{% tabs %}
{% tab title="Code" %}

```python
empty_set = set()

# Check empty_set type
print(type(empty_set))

print(empty_set)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(type(empty_set))
<class 'set'>

# print(empty_set)
set()
```

{% endtab %}
{% endtabs %}
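
To see the difference between `{}` and `set()` side by side, a minimal check could look like this:

```python
empty_braces = {}
print(type(empty_braces))  # <class 'dict'>

empty_set = set()
print(type(empty_set))     # <class 'set'>
print(len(empty_set))      # 0

# An empty set is falsy, so it can be tested directly
if not empty_set:
    print('The set is empty')
```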

## Mixed datatypes set

A set can hold mixed datatypes.

{% tabs %}
{% tab title="Code" %}

```python
# set function
mixed_set = set([34, 3.2, 'cat', 1.858, False, True, 'Name'])

print(mixed_set)

# Curly brackets
mixed_set_curly = {34, 3.2, 'cat', 1.858, False, True, 'Name'}

print(mixed_set_curly)
```

{% endtab %}

{% tab title="Output" %}

```python
# set function
{False, 1.858, 34, 3.2, True, 'cat', 'Name'}

# Curly brackets
{False, 1.858, 34, 3.2, True, 'cat', 'Name'}
```

{% endtab %}
{% endtabs %}
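
One caveat with mixed datatypes, not covered in the example above: Python treats `True` as equal to `1` and `False` as equal to `0`, and `1.0` is equal to `1` as well. Since set elements are unique, such values collapse into a single element. A short sketch:

```python
flags = {1, True, 0, False, 1.0}

# 1, True and 1.0 count as one element; 0 and False count as another
print(len(flags))       # 2
print(flags == {0, 1})  # True
```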

## How to add element(s) to a set?

Sets are unordered, so changing elements with indexing brackets is not possible. Sets are mutable, but we cannot perform slicing or indexing operations to access their elements. Python raises a `TypeError` when you use an indexing or slicing operation on a set.

{% tabs %}
{% tab title="Code" %}

```python
number_set = {1, 2, 3, 4}

print(number_set[:2])
```

{% endtab %}

{% tab title="Output" %}

```python
TypeError                                 Traceback (most recent call last)
<ipython-input-11-c24bd2d35a09> in <module>()
      1 number_set = {1, 2, 3, 4}
      2 
----> 3 print(number_set[:2])

TypeError: 'set' object is not subscriptable
```

{% endtab %}
{% endtabs %}

You can use the set method `add()` to add a single element. It takes exactly one argument (`add(<obj>)`).

{% tabs %}
{% tab title="Code" %}

```python
new_set = {9, 8, 7, 6}

new_set.add(5)

print(new_set)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(new_set)
{5, 6, 7, 8, 9}
```

{% endtab %}
{% endtabs %}
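
A few details of `add()` are easy to miss, so here is a small sketch with an illustrative `letters` set: adding an existing element changes nothing, an iterable argument is stored as one element, and the method returns `None` because it modifies the set in place.

```python
letters = {'a', 'b'}

# Adding an element that is already present changes nothing
letters.add('a')
print(len(letters))  # 2

# add() treats its argument as one element, even if it is iterable
letters.add('cd')
print('cd' in letters)  # True
print('c' in letters)   # False

# add() modifies the set in place and returns None
result = letters.add('e')
print(result)  # None
```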

You can use the set method `update()` to add several elements at once. `update()` requires an iterable (`update(<iter>)`), such as a list, tuple, string or another set.

{% tabs %}
{% tab title="Code" %}

```python
new_set = {9, 8, 7, 6}

new_set.update([5, 2, 4, 3])

print(new_set)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(new_set)
{2, 3, 4, 5, 6, 7, 8, 9}
```

{% endtab %}
{% endtabs %}
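
As a further sketch, `update()` iterates over whatever it receives, so a string is split into characters, and several iterables can be passed in a single call. The values below are arbitrary.

```python
new_set = {9, 8}

# A string is iterable, so each character is added separately
new_set.update('ab')
print(new_set == {9, 8, 'a', 'b'})  # True

# update() accepts several iterables in one call
new_set.update([7, 6], (5,), {4, 9})
print(new_set == {4, 5, 6, 7, 8, 9, 'a', 'b'})  # True
```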

## How to delete element(s) from a set?

You can delete an element from a set using `discard()` or `remove()`.

### remove function

`remove()` deletes the element if it is present and raises a `KeyError` if the element is absent.

{% tabs %}
{% tab title="Code" %}

```python
# Element is present
new_set = {9, 8, 7, 6}

new_set.remove(8)
print(new_set)


# Element is absent
new_set = {9, 8, 7, 6}

new_set.remove(5)
print(new_set)
```

{% endtab %}

{% tab title="Output" %}

```python
# Element is present
# print(new_set)
{9, 6, 7}


# Element is absent
# print(new_set)
KeyError                                  Traceback (most recent call last)
<ipython-input-23-25ec930c2723> in <module>()
      1 new_set = {9, 8, 7, 6}
      2 
----> 3 new_set.remove(5)
      4 
      5 print(new_set)

KeyError: 5
```

{% endtab %}
{% endtabs %}

### discard function

You can also use `discard()` to delete an element. If the element is a member of the set, `discard()` removes it; if the element is not a member of the set, it does nothing.

{% tabs %}
{% tab title="Code" %}

```python
# Element is present
new_set = {9, 8, 7, 6}

new_set.discard(8)
print(new_set)


# Element is absent
new_set = {9, 8, 7, 6}

new_set.discard(5)
print(new_set)
```

{% endtab %}

{% tab title="Output" %}

```python
# Element is present
# print(new_set)
{9, 6, 7}


# Element is absent
# print(new_set)
{8, 9, 6, 7}
```

{% endtab %}
{% endtabs %}
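
Which of the two to use depends on whether a missing element should count as an error. The sketch below, with a made-up `basket` set, shows three common patterns: catching the `KeyError` from `remove()`, using `discard()` silently, and checking membership with `in` first.

```python
basket = {'apple', 'pear'}

# remove(): catch the KeyError when absence is worth reporting
try:
    basket.remove('plum')
except KeyError as error:
    print('not found:', error)  # not found: 'plum'

# discard(): silent when the element is absent
basket.discard('plum')

# Alternative: check membership first with the in operator
if 'pear' in basket:
    basket.remove('pear')

print(basket)  # {'apple'}
```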

## pop function

You can use `pop()` on a set. `pop()` removes and returns an arbitrary element because sets are unordered.

{% tabs %}
{% tab title="Code" %}

```python
new_set = {9, 8, 5, 4, 7, 6}

print(new_set.pop())
print(new_set)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(new_set.pop())
4

# print(new_set)
{5, 6, 7, 8, 9}
```

{% endtab %}
{% endtabs %}
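
Because `pop()` removes the element it returns, it can be used to drain a set one element at a time. The sketch below uses an illustrative `tasks` set and also shows that calling `pop()` on an empty set raises a `KeyError`.

```python
tasks = {'wash', 'cook', 'read'}

# Drain the set: each pop() removes one arbitrary element
removed = []
while tasks:
    removed.append(tasks.pop())

# The removal order is arbitrary, so sort before printing
print(sorted(removed))  # ['cook', 'read', 'wash']
print(tasks)            # set()

# pop() on an empty set raises KeyError
try:
    tasks.pop()
except KeyError as error:
    print(type(error).__name__)  # KeyError
```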

## Set methods and operators

You can use Python set methods and operators to perform operations such as union, intersection, difference and symmetric difference.

### union

The union is the set made by combining the elements of two sets.

![The union of set 1 and set 2 is the whole area of both circles.](/files/-M36qqKXaub3g9CTDfzW)

You can use the **`union()`** method or the **`|`** operator.

{% tabs %}
{% tab title="Code" %}

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
new_set = set_1.union(set_2)
print(new_set)


# method 2
new_set_2 = set_1 | set_2
print(new_set_2)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(new_set)
{1, 3, 4, 5, 6, 7, 8, 9}

# print(new_set_2)
{1, 3, 4, 5, 6, 7, 8, 9}
```

{% endtab %}
{% endtabs %}

The `|` operator creates a union of two sets; both sides have to be sets, otherwise it raises an error. The `union()` method, in contrast, takes any iterable and converts it to a set before performing the union operation. See the example below; notice that the second operand, `set_2`, is a [tuple](/datascience/tuples.md).

{% tabs %}
{% tab title="Code" %}

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = (1, 4, 3, 5, 6)

# method 1
new_set = set_1.union(set_2)
print(new_set)


# method 2
print(set_1 | set_2)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(new_set)
{1, 3, 4, 5, 6, 7, 8, 9}


# print(set_1 | set_2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-5ce450bc75fa> in <module>()
      8 
      9 # method 2
---> 10 print(set_1 | set_2)

TypeError: unsupported operand type(s) for |: 'set' and 'tuple'
```

{% endtab %}
{% endtabs %}

As you can see, `union()` runs successfully but the `|` operator raises a `TypeError`.
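
If you prefer the operator form, one way around the `TypeError` is to convert the other iterable to a set yourself before applying `|`. A minimal sketch, where `values` stands in for the tuple:

```python
set_1 = {4, 5, 6, 7, 8, 9}
values = (1, 4, 3, 5, 6)

# Convert the tuple explicitly so both operands are sets
combined = set_1 | set(values)
print(combined == {1, 3, 4, 5, 6, 7, 8, 9})  # True

# The union is a new set; the original set is unchanged
print(set_1 == {4, 5, 6, 7, 8, 9})  # True
```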

### intersection

The intersection of two sets is the set of elements that are in both sets, i.e. the elements which are **overlapping**.

![Intersection of set 1 and set 2 (the overlapping section only)](/files/-M36vuppxg-_GpqAicnv)

You can use the **`intersection()`** method or the **`&`** operator to get the intersection of two sets.

{% tabs %}
{% tab title="Code" %}

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
new_set = set_1.intersection(set_2)
print(new_set)


# method 2
new_set_2 = set_1 & set_2
print(new_set_2)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(new_set)
{4, 5, 6}


# print(new_set_2)
{4, 5, 6}
```

{% endtab %}
{% endtabs %}
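
As with `union()`, the `intersection()` method accepts any iterable, while the `&` operator requires sets on both sides. A short sketch with made-up names:

```python
enrolled = {'Anna', 'Jan', 'Ola', 'Piotr'}
attended = ['Jan', 'Piotr', 'Zofia']  # a list, not a set

# The method converts the list for us
both = enrolled.intersection(attended)
print(both == {'Jan', 'Piotr'})  # True

# The operator does not
try:
    enrolled & attended
except TypeError as error:
    print(type(error).__name__)  # TypeError
```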

### difference

You can use the **`difference()`** method or the **`-`** operator to get the difference of two sets. The difference of set A and set B is the set of elements that are present in set A but not in set B. The difference of set B and set A is the reverse: the elements present in set B but not in set A.

#### set 1 difference

{% tabs %}
{% tab title="Code" %}

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
set_1_diff = set_1.difference(set_2)
print(set_1_diff)


# method 2
set_1_diff_op = set_1 - set_2
print(set_1_diff_op)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(set_1_diff)
{7, 8, 9}


# print(set_1_diff_op)
{7, 8, 9}
```

{% endtab %}
{% endtabs %}

#### set 2 difference

{% tabs %}
{% tab title="Code" %}

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
set_2_diff = set_2.difference(set_1)
print(set_2_diff)


# method 2
set_2_diff_op = set_2 - set_1
print(set_2_diff_op)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(set_2_diff)
{1, 3}


# print(set_2_diff_op)
{1, 3}
```

{% endtab %}
{% endtabs %}
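
Since the order of the operands matters, a difference is handy for questions such as "what is missing?" versus "what is extra?". The sketch below uses illustrative package names as plain strings; nothing is actually installed or imported.

```python
required = {'numpy', 'pandas', 'scipy'}
installed = {'numpy', 'requests'}

# In required but not in installed
missing = required - installed
print(missing == {'pandas', 'scipy'})  # True

# In installed but not in required
extra = installed - required
print(extra)  # {'requests'}

# Swapping the operands gives a different result
print(missing == extra)  # False
```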

### symmetric difference

The symmetric difference is the set that contains all the elements from set A and set B that are not shared. It can be seen as the opposite of [intersection](/datascience/sets.md#intersection).

You can use the `symmetric_difference()` method or the `^` operator.

{% tabs %}
{% tab title="Code" %}

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
sym_diff = set_1.symmetric_difference(set_2)
print(sym_diff)


# method 2
sym_diff_op = set_1 ^ set_2
print(sym_diff_op)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(sym_diff)
{1, 3, 7, 8, 9}


# print(sym_diff_op)
{1, 3, 7, 8, 9}
```

{% endtab %}
{% endtabs %}
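
The symmetric difference can also be expressed with the other operations on this page. The sketch below checks two equivalent formulations: the union minus the intersection, and the union of the two differences.

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

via_operator = set_1 ^ set_2

# Everything in either set, minus what they share
via_union_intersection = (set_1 | set_2) - (set_1 & set_2)

# What is only in set_1, combined with what is only in set_2
via_differences = (set_1 - set_2) | (set_2 - set_1)

print(via_operator == via_union_intersection == via_differences)  # True
```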

All the methods and operators above support combining more than two sets for [union](/datascience/sets.md#union), [intersection](/datascience/sets.md#intersection), [difference](/datascience/sets.md#difference) and [symmetric difference](/datascience/sets.md#symmetric-difference), with one exception: the `symmetric_difference()` method accepts exactly one argument.

{% tabs %}
{% tab title="Code" %}

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}
set_3 = {1, 5, 6, 10}

# method 1
sym_diff_op = set_1 ^ set_2 ^ set_3
print(sym_diff_op)


# method 2
sym_diff = set_1.symmetric_difference(set_2, set_3)
print(sym_diff)
```

{% endtab %}

{% tab title="Output" %}

```python
# print(sym_diff_op)
{3, 5, 6, 7, 8, 9, 10}


# print(sym_diff)
TypeError                                 Traceback (most recent call last)
<ipython-input-20-cc52d7471fff> in <module>()
      9 
     10 # method 2
---> 11 sym_diff = set_1.symmetric_difference(set_2, set_3)
     12 print(sym_diff)

TypeError: symmetric_difference() takes exactly one argument (2 given)
```

{% endtab %}
{% endtabs %}
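
For comparison, here is a sketch of the multi-set forms that do work with methods, followed by two possible workarounds for `symmetric_difference()`: chaining the calls, or folding over the sets with `functools.reduce`. Both workarounds are suggestions rather than the only options.

```python
from functools import reduce

set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}
set_3 = {1, 5, 6, 10}

# These methods accept several arguments
all_union = set_1.union(set_2, set_3)
all_common = set_1.intersection(set_2, set_3)
only_in_1 = set_1.difference(set_2, set_3)
print(all_union == {1, 3, 4, 5, 6, 7, 8, 9, 10})  # True
print(all_common == {5, 6})                       # True
print(only_in_1 == {7, 8, 9})                     # True

# Workaround 1: chain symmetric_difference() calls
chained = set_1.symmetric_difference(set_2).symmetric_difference(set_3)

# Workaround 2: fold over a list of sets
reduced = reduce(set.symmetric_difference, [set_1, set_2, set_3])

print(chained == reduced == (set_1 ^ set_2 ^ set_3))  # True
```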


