12 minutes
Data structures with Go - Part I
Data Structures With Go - Part I
This page has been migrated to Medium
Data structures are everywhere. Every developer should know them, starting from the most common ones.
The data structure priorly describes how the data is organised, accessed, associated and processed.
Using data structures you can keep your data in memory and efficiently access them.
You should pick the data structure that is the most suitable for your purposes to minimize space in memory and access time.
Some algorithms are designed upon certain data structures.
The critical part of developing such data structures in Go is the lack of generics. While other programming languages such as Java, C# etc. allow us to work with generics, Go has no way to do that, so we have to use the empty interface: interface{}
.
Let’s start with the most popular class of data structures: linear data structures.
A Linear Data Structure arranges the data into a sequence and follows some sort of order.
The linear data structure is a single level data structure while non-linear data structures are the multilevel data structure.
The most popular Linear Data Structures are:
In this post, we will learn those Linear Data Structures and we will develop them using Go.
The structures we will develop are only for educational purpose, you should not use them in production, also because some of them are already implemented in the Go’s standard library.
None of those structures is thread-safe. We will make them thread-safe in future posts (I wish).
Stack
Beyond cookies, a stack or LIFO (last in, first out) is an abstract data type that serves as a collection of elements, with two principal operations:
- push: adds an element to the collection. This method requires O(1);
- top: returns the last element that was added without removing it. Requires O(1);
- pop: removes the last element that was added. Requires O(1);
In a stack both the operations of push and pop takes place at the same end that is top of the stack. It can be implemented by using both array and linked list.
You can think a stack as a… stack. Yeah! Maybe the stack of plates is the best metaphor when thinking about this data structure.
In a stack of plates you can (and should at least if you don’t want to break all your plates) access only the plate on the top. So you can put a plate on the top of the stack (push), you can take a plate on the top of the stack and get rid of it (pop) or you can observe the plate on the top of the stack without removing it (top).
Now let’s develop our stack using Go!
Let’s start defining our structure:
package stack
// Stack is the structure which contains our elements
type Stack struct {
stack []interface{}
curInd int
}
The best way to implement a stack is by using an array and an int
which indexes the top of the stack.
To simplify stack’s operations we assume that when the index is -1
the stack is empty and when the stack is full we cannot insert other elements.
If you want to avoid the latter limitation, you should either control the size of the array dynamically or you should use another structure such as a linked list.
The following image shows the empty stack, i.e. the one with no elements and curInt
equals to -1
.
// NewStack returns a pointer to a new stack having:
// - the array initialized with the provided size
// - the current index initialized with -1 (empty stack)
func NewStack(size int) *Stack {
return &Stack{
stack: make([]interface{}, size),
curInd: -1,
}
}
// IsEmpty returns true if the stack is empty
func (s *Stack) IsEmpty() bool {
return s.curInd < 0
}
The function NewStack
takes an int
as argument and returns a pointer to a stack having both array and top’s index initialized.
The method IsEmpty
is quite simplistic: it returns true
if curInd
is less than 0
, otherwise it returns false.
Now let’s dive into the core methods of the stack:
// Top returns the element at the top of the stack without removing it.
// This method panics if the stack is empty
func (s *Stack) Top() interface{} {
if s.IsEmpty() {
panic("stack is empty")
}
return s.stack[s.curInd]
}
// Pop removes and returns the element at the top of the stack.
// This method panics if the stack is empty
func (s *Stack) Pop() interface{} {
if s.IsEmpty() {
panic("stack is empty")
}
el := s.stack[s.curInd]
s.stack[s.curInd] = nil
s.curInd--
return el
}
// Push inserts the element at the top of the stack.
// This method panics if the stack is full
func (s *Stack) Push(el interface{}) {
s.curInd++
s.stack[s.curInd] = el
}
The method Top
returns an interface but does not alter the stack. It first checks if the stack is empty, if so it panics, otherwise returns the element of the underlying array at the index specified by curInd
.
The method Pop
is similar to its relative Top
but “deletes” the element at the top. Actually, this method stores the element at curInd
in a temporary variable el
, sets the location at curInd
to nil
(garbage collector will do the tough job) and decrements curInd
, finally returns el
.
The method Push
does the opposite: it increments curInd
and sets the element at location curInd
of the stack
array to be the element taken as the argument.
The following image shows a stack with a single element, so curInd
is 0
(namely, the first location of the array):
The following image shows a stack with two elements, so curInd
is 1
:
Array
Array is a data structure used to store homogeneous elements at contiguous locations. Size of an array can be provided before storing data or the array can dynamically adapt its size to keep all the elements.
Operations on the array are:
- access: returns the element indexed by a given index i and requires O(1);
- search: returns the index of a given element and requires O(n) in the worst case;
- insertion: inserts an element in the array at a given index i. This method requires O(n) in the worst case, namely when insertion happens at the beginning of the array and requires shifting of all the elements;
- deletion: deletes an element in the array at a given index i. This method requires O(n) in the worst case, namely when deletion happens at the beginning of the array and requires shifting of all the elements.
Since Go already has the array
data structure, it does not make any sense to develop our array. However, the array
of Go does not have the search
method, so we’ll develop it!
package array
type SearchArray []interface{}
func (s SearchArray) Search(el interface{}) int {
for i, e := range s {
if e == el {
return i
}
}
return -1
}
The method Search
is very simplistic: it iterates over the elements of the array and, if this element is equal to the element el
passed as the argument, returns its index. If the method does not find the element el
returns -1
.
Linked List
A Linked List is a linear data structure (like arrays) where each element is a separate object. Each element, called node, of a list is comprising of two items: the data and a reference to the next node.
There are different kinds of Linked List, but we will explore only the most popular one: Single Linked List.
Operations on Linked Lists are:
- access: accessing elements in a linked list requires O(n) in the worst case. Actually, to access element i of a linked list requires O(i);
- search: searching elements in a linked list requires O(n) in the worst case, namely when the element we are looking for is the latter element of the linked list;
- insertion: insertion of an element at the position i requires O(i);
- deletion: deletion of an element at the position i requires O(i).
package linked_list
import (
"fmt"
"strings"
)
type Node struct {
el interface{}
next *Node
}
type LinkedList struct {
head *Node
}
The building blocks of our Linked List are Nodes. A node contains an element (el
) and a reference to the next node (next
) in the list. In that way we could iterate over our list starting from a node.
The LinkedList
itself only contains a reference to a node which is the first of the list (head
).
The following image explains the structure of a linked list:
Now let’s dive into its methods:
func (l LinkedList) Get(i int) interface{} {
cur := l.head
for pos := 0; pos < i; pos++ {
if cur == nil { // limit exceeded
return nil
}
cur = cur.next
}
return cur.el
}
func (l LinkedList) Search(e interface{}) int {
pos := 0
cur := l.head
for cur.el != e {
cur = cur.next
pos++
}
return pos
}
Get
and Search
are quite similar: we start from the head and move to the next node until we reach the right position or find the element we are looking for. A temporary variable called cur
(by convention) initially points to head
, then to its next
, then to the next
’s next
and so on.
Add
func (l *LinkedList) Add(i int, e interface{}) {
n := Node{el: e}
if i == 0 {
n.next = l.head
l.head = &n
return
}
prev := l.head
cur := l.head.next
pos := 1
for pos < i {
prev = prev.next
cur = cur.next
pos++
}
prev.next = &n
n.next = cur
}
The Add
method requires a detailed explanation. We first create a new node n
putting the element e
into it. Then we have to discriminate between two cases:
- if
i == 0
we have to change the head of the list, son
points to the old head andl.head
points ton
; - if
i > 0
have to keep two variables:prev
which is the head initially andcur
which is theprev
’s next. We must keep bothprev
andcur
because the insertion of a node requires to change both.pos
is the current position and starts to1
(yes, because we know thati > 0
). Untilpos < i
, we advance bothprev
andcur
making them pointing to theirnext
s and incrementingpos
. Whenpos
is equal toi
we updateprev
’snext
to point ton
andn
’snext
to point tocur
.
The following images may clarify this concept:
Our Linked List before Add
:
Our Linked List after Add
an element in position 1
:
Delete
func (l *LinkedList) Delete(i int) {
if i == 0 { // deleting the head
l.head = l.head.next
return
}
prev := l.head
cur := l.head.next
pos := 1
for pos < i {
prev = prev.next
cur = cur.next
pos++
}
prev.next = cur.next
}
The Delete
method is quite similar to Add
but does the opposite. The elimination of the variable from the memory is up to the garbage collector.
- if
i == 0
,l.head
points tol.head.next
; - if
i > 0
, we iterate until we reach the desired position, then we makeprev.next
to point tocur.next
: doing that we ensure that the element to be removed will never be pointed again by our list.
The following images may clarify this concept:
Our Linked List before Delete
:
Our Linked List after Delete
an element in position 1
:
Bonus: String
func (l LinkedList) String() string {
sb := strings.Builder{}
sb.WriteString("[ ")
for cur := l.head; cur != nil; cur = cur.next {
sb.WriteString(fmt.Sprintf("%v ", cur.el))
}
sb.WriteByte(']')
return sb.String()
}
The String
method shows how to iterate over a list using a for
loop.
Queue
A queue or FIFO (first in, first out) is an abstract data type that serves as a collection of elements, with two principal operations:
- enqueue: the process of adding an element to the collection. The element is added from the rear side. This method requires O(1);
- dequeue: the process of removing the first element that was added. The element is removed from the front side. Requires O(1).
It can be implemented by using both array and linked list. Our implementation will be based on a Linked List.
package queue
import (
"fmt"
"strings"
)
type Node struct {
el interface{}
next *Node
}
type Queue struct {
head *Node
tail *Node
}
The implementation for Queue
is obviously similar to the one for LinkedList
but we also store another pointer to the tail
which is the side from which we insert new elements.
func (q *Queue) Enqueue(e interface{}) {
n := Node{el: e}
if q.tail == nil {
q.head = &n
q.tail = &n
return
}
q.tail.next = &n
q.tail = &n
}
The Enqueue
method first creates a new Node n
, then, if q.tail == nil
(the queue is empty) makes both q.head
and q.tail
to point to the new node, otherwise makes both q.tail
and q.tail.next
to point to the new node.
It is worth noting that q.tail
is the latter inserted node, so its next
previously pointed to nil
.
func (q *Queue) Dequeue() interface{}{
if q.head == nil {
return nil
}
n := q.head
q.head = q.head.next
if q.head == nil {
q.tail = nil
}
return n.el
}
The Dequeue
method first checks if q.head == nil
, so we have an empty queue and it must return nil
, otherwise it stores the q.head
value to a temporary variable n
, checks if its next
pointer is nil (single element queue) (if so makes q.tail
pointing to nil to have an empty queue). Finally it returns the el
value of n
.
Conclusions
Different data structures bring different advantages. You should know when to pick one rather than another and this knowledge comes with experience. However, it is not hard to understand the benefits that come with those simple data structures:
- Stacks are great when you have to remember the steps you did. In fact, you can store (push) your steps in a stack and go back the same way (pop);
- Arrays are great when you need random access. In fact, they require only O(1) for retrieving an element at any desired position;
- LinkedLists are great when you need to insert or remove elements the front. By contrast, arrays are not suitable for those operations because they have to move most of the elements;
- Queues are great when you have to memorize elements and consume them later keeping the order you produced them.
I wish this post would help beginner developers using the correct data structure for their algorithms.
“It is easier to change the specification to fit the program than vice versa.”
Alan Perlis