Beginner Codes
C Beginner Codes
Hello again! We will now get again into Programming and talk about Hashing and Hashtables. I will split the talk in two posts, the first (this one) will contain the Theory behind Hashing and the implementation using Chaining in C and the second one will contain an implementation of Hashtables using Linear Probing! So, why wait any more? Let’s get started!
Hashtables are another Datastructure beside Arrays, Lists and Trees that we talked off earlier. It is used when we want to quickly find a data record given a search key. The records get stored to the specific index the Hashfunction gave for the specific value. We mostly implement them using Static Arrays (that become greating with Rehashing if needed) and use simple modulo hashfunctions (XmodN) so that the values get stored depending on their modulo with the length of the Array. That way we know exactly were the value was stored and can get it much faster instead of having to search a whole Array, List or Tree!
The problem comes when there already is a value stored inside of the index that the value belongs to. Then we have a Collision that can be handled in many different ways! We will have to use another way of finding the index in which the value will be put in. Collision bring also a slight performance loss depending on how we solve our problem.
When two many values belong to an specific index we form clusters of records. Clustering is not good, cause it decreases the search time! Think about an Hashtable of 10 Indexes and that we store 5 values that are for example 5, 15, 25, 35, 45. Let’s say we solve the Collisions by putting the values one index after (Linear Probing, more into that later on) so that we put all in the Hashtable. When we now want to search for 45 we will now have to start off from 5 and then go to 15, 25, 35 and lastly to 45. What happens if we then want to put for example another value in that has a hashing index that is where we put the 5 Colliders in? Well, than it will be put at the end of this Cluster and the performance fells for another key. So, you can see directly that we lost the quick search and that the performance fell dramatically forming those Clusters.
But, what to we do when Collisions happen? The answers is we use specific Collision Handling Techniques that are good in specific datasets. When for example many values fall into the same key, some techniques may have better result then others. As we saw before the Collision Handling using Linear Probing was not so effective. Let’s now get into which Collision Handling Techniques are available and easy to setup!
In Code we will talk only about 2 of those Techniques, but here are those and some others that may come handy in your Hashtable datasets.
We split those in 2 Categories that are:
Let’s get more in depth in all of those:
Lastly I want to give some Tipps that help us make our Hashing Function better.
So, after all that Theory let’s now get into Chaining!
Chaining is pretty easy to implement. We have to create an Array whose indexes point to different Lists. We will use a simple modulo hashfunction with an prime number for that and the size of the Hashtable that can be set as an define parameter! I will use all the List stuff we already talked about. To insert we insert the value to the corresponding List. To search we simply have to search the specific List for the value. I will put the whole Code and some Comments for you to understand it better!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define N 31 // prime number for size of array
#define h(x) (x % m) //h(x) = x mod m
// Node for List
typedef struct Node{
int val;
struct Node *next;
}Node;
//function's declaration
Node *fill_table(Node *table, int table_range, int number);
Node *insert(Node *table, int hash_index, int val);
Node *del(Node *table,int table_range,int val);
void print_table(Node *table, int table_range);
void search_table(Node *table, int table_range, int val);
int main(){
Node *table; // hashtable
int n, i, j, choice, search;
int hash_num, val;
// allocate table
table = (Node*) malloc(N*sizeof(Node));
//make table "heads" NULL
for(i = 0; i < N; i++){
table[i].next = NULL;
}
printf("--h(x) = xmod%d--\n",N);
printf("\n\n");
while(1){
printf("1.Insert random numbers\n");
printf("2.Delete a number\n");
printf("3.Search a number\n");
printf("4.Show Hash Table\n");
printf("0.Exit Programm\n");
printf("\n--------\n");
printf("Choice: ");
scanf("%d",&choice);
switch(choice){
case 0: return;
case 1:
// insert random numbers
printf("Lot to insert: ");
scanf("%d", &n);
table = fill_table(table, N, n);
printf("Filled HashTable with %d random values\n", n);
printf("\n--------\n");
break;
case 2:
// delete a number
printf("Give a number: ");
scanf("%d",&search);
table = del(table, N, search);
printf("\n--------\n");
break;
case 3:
// search for a number
printf("Give a number: ");
scanf("%d",&search);
search_table(table, N, search);
printf("\n--------\n");
break;
case 4:
//print hashtable
printf("-HASHTABLE-\n\n");
print_table(table, N);
printf("\n--------\n");
break;
}
}
return 0;
}
// FUNCTIONS
Node *fill_table(Node *table, int table_range, int number){
int i;
int num;
for(i=0; i<number; i++){
num = rand()%10000; // random number from 0 to 9999
table = insert(table, num % table_range, num);
}
return table;
}
void print_table(Node *table, int table_range){
int i;
Node* cur;
for(i = 0; i < table_range; i++){ // for each list
if(table[i].next == NULL){ // if empty
printf("Table[%d] is empty!\n", i);
continue;
}
cur = table[i].next;
printf("Table[%d]", i);
while(cur!=NULL){ // else print all the values
printf("->%d",cur->val);
cur=cur->next; //cur=cur+1;
}
printf("\n");
}
}
Node *insert(Node *table, int hash_index, int val){
Node *nn, *cur;
nn = (Node*)malloc(sizeof(Node));
nn->val=val;
nn->next=NULL;
if(table[hash_index].next == NULL){
table[hash_index].next = nn;
return table;
}
cur = table[hash_index].next;
while(cur->next != NULL){
cur=cur->next; //cur=cur+1;
}
cur->next = nn;
return table;
}
void search_table(Node *table, int table_range, int val){
int index = val % table_range;
Node *cur;
if(table[index].next == NULL){ // if empty
printf("Value %d not found cause Table[%d] is emtpy!\n", val,index);
return;
}
cur = table[index].next;
while(cur!=NULL){
if(cur->val == val){
printf("Value %d was found!\n", val);
return;
}
cur = cur->next;
}
printf("Value %d not found in Hashtable!\n", val);
}
Node *del(Node *table, int table_range, int val){
int index = val % table_range;
Node *prev;
if(table[index].next == NULL){ // if empty
printf("Value %d not found cause Table[%d] is emtpy!\n", val,index);
return table;
}
if(table[index].next->val == val){ // if first item
printf("Value %d was found at table[%d] and Deleted!\n", val,index);
table[index].next = table[index].next->next;
return table;
}
prev = table[index].next;
while(prev->next!=NULL){
if(prev->next->val == val){
prev->next = prev->next->next;
printf("Value %d was found at table[%d] and Deleted!\n", val,index);
return table;
}
prev = prev->next;
}
printf("Value %d was not found in Hashtable!\n", val);
return table;
}
An Example run looks like this:
Hope you enjoyed it!
I will upload the Linear Probing Implementation also, later in the day!
See ya later!
C Beginner Codes
C Beginner Arrays
C Pointers, Strings and Files
C Dynamic Memory Allocation
C Structs and Switch Case
C Recursive Algorithms
C Linked Lists
C Binary Trees
C Queues using Arrays
C Stacks using Arrays
C Queues using Linked Lists
C Stacks using Linked Lists
C Advanced Lists and Queues
C Advanced Trees
C Stack-Queue Exercise using Dynamic Arrays
C Stack-Queue Exercise using Linked Lists
C Hashtables with Chaining
C Hashtables with Linear Probing
C Can I Run A Dual Monitor Setup
C Function Comparison