Data structures with Go - Part I

Data Structures With Go - Part I

This page has been migrated to Medium

Data structures are everywhere. Every developer should know them, starting from the most common ones.

The data structure priorly describes how the data is organised, accessed, associated and processed.

Using data structures you can keep your data in memory and efficiently access them.

You should pick the data structure that is the most suitable for your purposes to minimize space in memory and access time.

Some algorithms are designed upon certain data structures.

The critical part of developing such data structures in Go is the lack of generics. While other programming languages such as Java, C# etc. allow us to work with generics, Go has no way to do that, so we have to use the empty interface: interface{}.

Let’s start with the most popular class of data structures: linear data structures.

A Linear Data Structure arranges the data into a sequence and follows some sort of order.

The linear data structure is a single level data structure while non-linear data structures are the multilevel data structure.

The most popular Linear Data Structures are:

Stack
Array
Linked List
Queue

In this post, we will learn those Linear Data Structures and we will develop them using Go.

The structures we will develop are only for educational purpose, you should not use them in production, also because some of them are already implemented in the Go’s standard library.

None of those structures is thread-safe. We will make them thread-safe in future posts (I wish).

Stack

Beyond cookies, a stack or LIFO (last in, first out) is an abstract data type that serves as a collection of elements, with two principal operations:

push: adds an element to the collection. This method requires O(1);
top: returns the last element that was added without removing it. Requires O(1);
pop: removes the last element that was added. Requires O(1);

In a stack both the operations of push and pop takes place at the same end that is top of the stack. It can be implemented by using both array and linked list.

You can think a stack as a… stack. Yeah! Maybe the stack of plates is the best metaphor when thinking about this data structure.

In a stack of plates you can (and should at least if you don’t want to break all your plates) access only the plate on the top. So you can put a plate on the top of the stack (push), you can take a plate on the top of the stack and get rid of it (pop) or you can observe the plate on the top of the stack without removing it (top).

Now let’s develop our stack using Go!

Let’s start defining our structure:

package stack

// Stack is the structure which contains our elements
type Stack struct {
	stack 	[]interface{}
	curInd 	int
}

The best way to implement a stack is by using an array and an int which indexes the top of the stack. To simplify stack’s operations we assume that when the index is -1 the stack is empty and when the stack is full we cannot insert other elements.

If you want to avoid the latter limitation, you should either control the size of the array dynamically or you should use another structure such as a linked list.

The following image shows the empty stack, i.e. the one with no elements and curInt equals to -1.

Empty stack

// NewStack returns a pointer to a new stack having:
// - the array initialized with the provided size
// - the current index initialized with -1 (empty stack)
func NewStack(size int) *Stack {
	return &Stack{
		stack: make([]interface{}, size),
		curInd: -1,
	}
}

// IsEmpty returns true if the stack is empty
func (s *Stack) IsEmpty() bool {
	return s.curInd < 0
}

The function NewStack takes an int as argument and returns a pointer to a stack having both array and top’s index initialized.

The method IsEmpty is quite simplistic: it returns true if curInd is less than 0, otherwise it returns false.

Now let’s dive into the core methods of the stack:


// Top returns the element at the top of the stack without removing it. 
// This method panics if the stack is empty
func (s *Stack) Top() interface{} {
	if s.IsEmpty() {
		panic("stack is empty")
	}
	return s.stack[s.curInd]
}

// Pop removes and returns the element at the top of the stack. 
// This method panics if the stack is empty
func (s *Stack) Pop() interface{} {
	if s.IsEmpty() {
		panic("stack is empty")
	}
	el := s.stack[s.curInd]
	s.stack[s.curInd] = nil
	s.curInd--
	return el
}

// Push inserts the element at the top of the stack. 
// This method panics if the stack is full
func (s *Stack) Push(el interface{}) {
	s.curInd++
	s.stack[s.curInd] = el
}

The method Top returns an interface but does not alter the stack. It first checks if the stack is empty, if so it panics, otherwise returns the element of the underlying array at the index specified by curInd.

The method Pop is similar to its relative Top but “deletes” the element at the top. Actually, this method stores the element at curInd in a temporary variable el, sets the location at curInd to nil (garbage collector will do the tough job) and decrements curInd, finally returns el.

The method Push does the opposite: it increments curInd and sets the element at location curInd of the stack array to be the element taken as the argument.

The following image shows a stack with a single element, so curInd is 0 (namely, the first location of the array):

One element stack

The following image shows a stack with two elements, so curInd is 1:

Two element stack

Array

Array is a data structure used to store homogeneous elements at contiguous locations. Size of an array can be provided before storing data or the array can dynamically adapt its size to keep all the elements.

Operations on the array are:

access: returns the element indexed by a given index i and requires O(1);
search: returns the index of a given element and requires O(n) in the worst case;
insertion: inserts an element in the array at a given index i. This method requires O(n) in the worst case, namely when insertion happens at the beginning of the array and requires shifting of all the elements;
deletion: deletes an element in the array at a given index i. This method requires O(n) in the worst case, namely when deletion happens at the beginning of the array and requires shifting of all the elements.

Since Go already has the array data structure, it does not make any sense to develop our array. However, the array of Go does not have the search method, so we’ll develop it!

package array

type SearchArray []interface{}

func (s SearchArray) Search(el interface{}) int {
	for i, e := range s {
		if e == el {
			return i
		}
	}
	return -1
}

The method Search is very simplistic: it iterates over the elements of the array and, if this element is equal to the element el passed as the argument, returns its index. If the method does not find the element el returns -1.

Linked List

A Linked List is a linear data structure (like arrays) where each element is a separate object. Each element, called node, of a list is comprising of two items: the data and a reference to the next node.

There are different kinds of Linked List, but we will explore only the most popular one: Single Linked List.

Operations on Linked Lists are:

access: accessing elements in a linked list requires O(n) in the worst case. Actually, to access element i of a linked list requires O(i);
search: searching elements in a linked list requires O(n) in the worst case, namely when the element we are looking for is the latter element of the linked list;
insertion: insertion of an element at the position i requires O(i);
deletion: deletion of an element at the position i requires O(i).

package linked_list

import (
	"fmt"
	"strings"
)

type Node struct {
	el 	 interface{}
	next *Node
}

type LinkedList struct {
	head 	*Node
}

The building blocks of our Linked List are Nodes. A node contains an element (el) and a reference to the next node (next) in the list. In that way we could iterate over our list starting from a node.

The LinkedList itself only contains a reference to a node which is the first of the list (head).

The following image explains the structure of a linked list:

Linked List

Now let’s dive into its methods:


func (l LinkedList) Get(i int) interface{} {
	cur := l.head
	for pos := 0; pos < i; pos++ {
		if cur == nil {	// limit exceeded
			return nil
		}
		cur = cur.next
	}
	return cur.el
}

func (l LinkedList) Search(e interface{}) int {
	pos := 0
	cur := l.head
	for cur.el != e {
		cur = cur.next
		pos++
	}
	return pos
}

Get and Search are quite similar: we start from the head and move to the next node until we reach the right position or find the element we are looking for. A temporary variable called cur (by convention) initially points to head, then to its next, then to the next’s next and so on.

Add


func (l *LinkedList) Add(i int, e interface{}) {
	n := Node{el: e}
	if i == 0 {
		n.next = l.head
		l.head = &n
		return
	}
	prev := l.head
	cur := l.head.next
	pos := 1
	for pos < i {
		prev = prev.next
		cur = cur.next
		pos++
	}
	prev.next = &n
	n.next = cur
}

The Add method requires a detailed explanation. We first create a new node n putting the element e into it. Then we have to discriminate between two cases:

if i == 0 we have to change the head of the list, so n points to the old head and l.head points to n;
if i > 0 have to keep two variables: prev which is the head initially and cur which is the prev’s next. We must keep both prev and cur because the insertion of a node requires to change both. pos is the current position and starts to 1 (yes, because we know that i > 0). Until pos < i, we advance both prev and cur making them pointing to their nexts and incrementing pos. When pos is equal to i we update prev’s next to point to n and n’s next to point to cur.

The following images may clarify this concept:

Our Linked List before Add:

Add Linked List

Our Linked List after Add an element in position 1:

Add Linked List

Delete


func (l *LinkedList) Delete(i int) {
	if i == 0 {	// deleting the head
		l.head = l.head.next
		return
	}
	prev := l.head
	cur := l.head.next
	pos := 1
	for pos < i {
		prev = prev.next
		cur = cur.next
		pos++
	}
	prev.next = cur.next
}

The Delete method is quite similar to Add but does the opposite. The elimination of the variable from the memory is up to the garbage collector.

if i == 0, l.head points to l.head.next;
if i > 0, we iterate until we reach the desired position, then we make prev.next to point to cur.next: doing that we ensure that the element to be removed will never be pointed again by our list.

The following images may clarify this concept:

Our Linked List before Delete:

Add Linked List

Our Linked List after Delete an element in position 1:

Add Linked List

Bonus: String


func (l LinkedList) String() string {
	sb := strings.Builder{}
	sb.WriteString("[ ")
	for cur := l.head; cur != nil; cur = cur.next {
		sb.WriteString(fmt.Sprintf("%v ", cur.el))
	}
	sb.WriteByte(']')
	return sb.String()
}

The String method shows how to iterate over a list using a for loop.

Queue

A queue or FIFO (first in, first out) is an abstract data type that serves as a collection of elements, with two principal operations:

enqueue: the process of adding an element to the collection. The element is added from the rear side. This method requires O(1);
dequeue: the process of removing the first element that was added. The element is removed from the front side. Requires O(1).

It can be implemented by using both array and linked list. Our implementation will be based on a Linked List.

package queue

import (
	"fmt"
	"strings"
)

type Node struct {
	el interface{}
	next *Node
}

type Queue struct {
	head *Node
	tail *Node
}

The implementation for Queue is obviously similar to the one for LinkedList but we also store another pointer to the tail which is the side from which we insert new elements.

func (q *Queue) Enqueue(e interface{}) {
	n := Node{el: e}
	if q.tail == nil {
		q.head = &n
		q.tail = &n
		return
	}
	q.tail.next = &n
	q.tail = &n
}

The Enqueue method first creates a new Node n, then, if q.tail == nil (the queue is empty) makes both q.head and q.tail to point to the new node, otherwise makes both q.tail and q.tail.next to point to the new node.

It is worth noting that q.tail is the latter inserted node, so its next previously pointed to nil.

func (q *Queue) Dequeue() interface{}{
	if q.head == nil {
		return nil
	}
	n := q.head
	q.head = q.head.next
	if q.head == nil {
		q.tail = nil
	}
	return n.el
}

The Dequeue method first checks if q.head == nil, so we have an empty queue and it must return nil, otherwise it stores the q.head value to a temporary variable n, checks if its next pointer is nil (single element queue) (if so makes q.tail pointing to nil to have an empty queue). Finally it returns the el value of n.

Conclusions

Different data structures bring different advantages. You should know when to pick one rather than another and this knowledge comes with experience. However, it is not hard to understand the benefits that come with those simple data structures:

Stacks are great when you have to remember the steps you did. In fact, you can store (push) your steps in a stack and go back the same way (pop);
Arrays are great when you need random access. In fact, they require only O(1) for retrieving an element at any desired position;
LinkedLists are great when you need to insert or remove elements the front. By contrast, arrays are not suitable for those operations because they have to move most of the elements;
Queues are great when you have to memorize elements and consume them later keeping the order you produced them.

I wish this post would help beginner developers using the correct data structure for their algorithms.

“It is easier to change the specification to fit the program than vice versa.”

Alan Perlis