Commonly Used Data Structures And Their Explanations
It will take approximately 4 minutes to read this article.
Smart Data Structures and dumb code works a lot better than the other way around.
Eric S. Raymond
Bad programmers worry about the code. Good programmers worry about data structures and their relationships.
It is very believable to guess in almost every programming interview, you guys had been asked to solve a problem using a data structure which is chosen by you to implement in your solution in best way.
Why all this noise. What is this shit and why this shit is so important.
Those quotes that I’ve written above are belongs maybe the two of the most important persons in the list that we have so far which is created for determining the most important people in the history of computers.
In computer science, a data structure is a particular way of organizing data in a computer so that it can be used efficiently.
The good news is that they’re basically just specialized formats for organizing and storing data.
In this post, mostly used Data Structure types will be examined for you guys to understand them in details, with examples.
If you are a newbie in Data Structures
Before diving into rest of the article, I would suggest you to organize a notebook which is going to have projects in it. And after finishing this article, try to use all of these data structures in the projects, in a way as professional as you can.
Knowing how to implement these data structures will give you a huge edge in your developer job search, and may come in handy when you’re trying to write high-performance code.
So Let’s start:
A linked list is one of the most basic data structure. In computer science, a linked list is a linear collection of data elements, called nodes, each pointing to the next node by means of a pointer.It is often compared to an array since many other data structures can be implemented with either an array or a linked list. They each have advantages and disadvantages.
Linked list representation
A linked list consists of a group of nodes which together represent a sequence. Each node contains two values: the actual data being stored (which can be basically any type of data) and a pointer (or link) to the next node in the sequence. There are also doubly linked lists where each node has a pointer to both the next item and the previous item in the list.
Adding an item, deleting an item and searching for an item in linked list are the basic operations that we can do on a linked list.
What you need to know:
- Designed to optimize insertion and deletion, slow at indexing and searching.
- Doubly linked list has nodes that reference the previous node.
- Circularly linked list is simple linked list whose tail, the last node, references the head, the first node.
I would suggest you to make some examples with linked list on your own. You can
- create a linked list class,
- add some items into it,
- delete some of the items,
- search for a specific item,
- change your linked list to doubly linked list,
- again adding, deleting and searching mechanisms,
- reversing your newly created doubly linked list can work for you.
Stacks are a member of Data Structures and have the Abstract Data Types, that used in most programming languages. They had LIFO(Last in First Out)algorithm, which allows you to use just top element of the Stack. Usually, Push and Pop functions has wide usage in the Stacks. Push function works for insertion, and Pop funtions works for removing the top object.
There are three main operations that can be performed on stacks:
- inserting an item into a stack (called ‘push’),
- deleting an item from the stack (called ‘pop’),
- displaying the contents of the stack (sometimes called ‘pip’).
A queue is a basic data structure that is used throughout programming. You can think of it as a line in a grocery store. The first one in the line is the first one to be served. Just like a queue. A queue is also called a FIFO (First In First Out) to demonstrate the way it accesses data.
This means that once a new element is added, all elements that were added before have to be removed before the new element can be removed.
A queue has just two main operations: enqueue and dequeue. Enqueue means to insert an item into the back of the queue and dequeue means removing the front item.
A set is an abstract data type that can store certain values, without any particular order, and no repeated values. Besides being able to add and remove elements to a set, there are a few other important set functions that work with two sets at once.
- Union — This combines all the items from two different sets and returns this as a new set (with no duplicates).
- Intersection — Given two sets, this function returns another set that has all items that are part of both sets.
- Difference — This returns a list of items that are in one set but NOT in a different set.
- Subset — This returns a boolean value that shows if all the elements in one set are included in a different set.
A map is a data structure and its majorly used for fast look ups or searching data. It stores data in the form of key and value pairs where every key is unique. Each key here maps to a value and hence the name map ! Maps allow the following things:
- the addition of a pair to the collection
- the removal of a pair from the collection
- the modification of an existing pair
- the lookup of a value associated with a particular key
Hash table and hash function representation
a hash table (hash map) is a data structure which implements an associative array abstract data type, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
The hash function usually takes a string as input and it outputs an numerical value. The hash function should always give the same output number for the same input. When two inputs hash to the same numerical output, this is called a collision. The goal is to have few collisions.
So when you input a key / value pair into a hash table, the key is run through the hash function and turned into a number. This numerical value is then used as the actual key that the value is stored by. When you try to access the same key again, the hashing function will process the key abd return the same numerical result. The number will then be used to look up the associated value. This provides very efficient O(1) lookup time on average.
Binary Search Tree
Binary search tree
A tree is a data structure composed of nodes.
A Binary Search Tree (BST) is a tree in which all the nodes follow the below-mentioned properties −
- The left sub-tree of a node has a key less than or equal to its parent node’s key.
- The right sub-tree of a node has a key greater than to its parent node’s key.
Thus, BST divides all its sub-trees into two segments; the left sub-tree and the right sub-tree.
It has the following characteristics:
- Each tree has a root node (at the top).
- The root node has zero or more child nodes.
- Each child node has zero or more child nodes, and so on.
A binary search tree adds these two characteristics:
- Each node has up to two children.
- For each node, its left descendents are less than the current node, which is less than the right descendents.
Binary search trees allow fast lookup, addition, and removal of items. The way that they are set up means that, on average, each comparison allows the operations to skip about half of the tree so that each lookup, insertion or deletion takes time proportional to the logarithm of the number of items stored in the tree.
I would suggest you to make some examples as below:
- Create a BST
- Add as many elements as you can to the tree
- Delete some of the elements
- Find a specific element in BST
- Use Depth First Search in BST
- Use Breadth First Search in BST
- Find minimum and maximum height of a BST
- It’s a complete tree (All levels are completely filled except possibly the last level and the last level has all keys as left as possible). This property of Binary Heap makes them suitable to be stored in an array.
- A Binary Heap is either Min Heap or Max Heap. In a Min Binary Heap, the key at root must be minimum among all keys present in Binary Heap. The same property must be recursively true for all nodes in Binary Tree. Max Binary Heap is similar to Min Heap.
Min and max heap representations
A graph data structure consists of a finite (and possibly mutable) set of vertices or nodes or points, together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph.
One example of graphs is a social network. The nodes are people and the edges are friendship.
There are two major types of graphs: directed and undirected. Undirected graphs are graphs without any direction on the edges between nodes. Directed graphs, in contrast, are graphs with a direction in its edges.
Two common ways to represent a graph are an adjacency list and an adjacency matrix.
Adjacency matrix graph
An adjacency list can be represented as a list where the left side is the node and the right side lists all the other nodes it’s connected to.
An adjacency matrix is a grid of numbers, where each row or column represents a different node in the graph. At the intersection of a row and a column is a number that indicates the relationship. Zeros mean there is no edge or relationship. Ones mean there is a relationship. Numbers higher than one can be used to show different weights.
When it comes to complexity, there is an amazing chart which shows all the Big-O notations of the data structures as shown below:
And also, I would courage you to walk around in this site a bit.