What is hashing?
To hash means to grind up, and that’s essentially what hashing is all about. The heart of a hashing algorithm is a hash function that takes your nice, neat data and grinds it into some random-looking integer.
The idea behind hashing is that some data either has no inherent ordering (such as images) or is expensive to compare (such as images). If the data has no inherent ordering, you can’t perform comparison searches.
If the data is expensive to compare, the number of comparisons used even by a binary search might be too many. So instead of looking at the data themselves, you’ll condense (hash) the data to an integer (its hash value) and keep all the data with the same hash value in the same place. This task is carried out by using the hash value as an index into an array.
To search for an item, you simply hash it and look at all the data whose hash values match that of the data you’re looking for. This technique greatly lessens the number of items you have to look at. If the parameters are set up with care and enough storage is available for the hash table, the number of comparisons needed to find an item can be made arbitrarily close to one.
One aspect that affects the efficiency of a hashing implementation is the hash function itself. It should ideally distribute data randomly throughout the entire hash table, to reduce the likelihood of collisions. Collisions occur when two different keys have the same hash value. There are two ways to resolve this problem. In open addressing, the collision is resolved by the choosing of another position in the hash table for the element inserted later. When the hash table is searched, if the entry is not found at its hashed position in the table, the search continues checking until either the element is found or an empty position in the table is found
The second method of resolving a hash collision is called chaining. In this method, a bucket or linked list holds all the elements whose keys hash to the same value. When the hash table is searched, the list must be searched linearly.
What is the quickest sorting method to use?
The answer depends on what you mean by quickest. For most sorting problems, it just doesn’t matter how quick the sort is because it is done infrequently or other operations take significantly more time anyway. Even in cases in which sorting speed is of the essence, there is no one answer. It depends on not only the size and nature of the data, but also the likely order. No algorithm is best in all cases.
There are three sorting methods in this author’s toolbox that are all very fast and that are useful in different situations. Those methods are quick sort, merge sort, and radix sort.
The Quick Sort
The quick sort algorithm is of the divide and conquer type. That means it works by reducing a sorting problem into several easier sorting problems and solving each of them. A dividing value is chosen from the input data, and the data is partitioned into three sets: elements that belong before the dividing value, the value itself, and elements that come after the dividing value. The partitioning is performed by exchanging elements that are in the first set but belong in the third with elements that are in the third set but belong in the first Elements that are equal to the dividing element can be put in any of the three setsthe algorithm will still work properly.
The Merge Sort
The merge sort is a divide and conquer sort as well. It works by considering the data to be sorted as a sequence of already-sorted lists (in the worst case, each list is one element long). Adjacent sorted lists are merged into larger sorted lists until there is a single sorted list containing all the elements. The merge sort is good at sorting lists and other data structures that are not in arrays, and it can be used to sort things that don’t fit into memory. It also can be implemented as a stable sort.
The Radix Sort
The radix sort takes a list of integers and puts each element on a smaller list, depending on the value of its least significant byte. Then the small lists are concatenated, and the process is repeated for each more significant byte until the list is sorted. The radix sort is simpler to implement on fixed-length data such as ints.
when should the volatile modifier be used?
The volatile modifier is a directive to the compiler’s optimizer that operations involving this variable should not be optimized in certain ways. There are two special cases in which use of the volatile modifier is desirable. The first case involves memory-mapped hardware (a device such as a graphics adaptor that appears to the computer’s hardware as if it were part of the computer’s memory), and the second involves shared memory (memory used by two or more programs running simultaneously).
Most computers have a set of registers that can be accessed faster than the computer’s main memory. A good compiler will perform a kind of optimization called redundant load and store removal. The compiler looks for places in the code where it can either remove an instruction to load data from memory because the value is already in a register, or remove an instruction to store data to memory because the value can stay in a register until it is changed again anyway.
If a variable is a pointer to something other than normal memory, such as memory-mapped ports on a peripheral, redundant load and store optimizations might be detrimental. For instance, here’s a piece of code that might be used to time some operation:
time_t time_addition(volatile const struct timer *t, int a)
x = 0;
then = t->value;
for (n = 0; n < x =" x">value - then;
In this code, the variable t-> value is actually a hardware counter that is being incremented as time passes. The function adds the value of a to x 1000 times, and it returns the amount the timer was incremented by while the 1000 additions were being performed. Without the volatile modifier, a clever optimizer might assume that the value of t does not change during the execution of the function, because there is no statement that explicitly changes it. In that case, there’s no need to read it from memory a second time and subtract it, because the answer will always be 0. The compiler might therefore optimize the function by making it always return 0.
If a variable points to data in shared memory, you also don’t want the compiler to perform redundant load and store optimizations. Shared memory is normally used to enable two programs to communicate with each other by having one program store data in the shared portion of memory and the other program read the same portion of memory. If the compiler optimizes away a load or store of shared memory, communication between the two programs will be affected.
When should the register modifier be used? Does it really help?
The register modifier hints to the compiler that the variable will be heavily used and should be kept in the CPU’s registers, if possible, so that it can be accessed faster.
There are several restrictions on the use of the register modifier.
First, the variable must be of a type that can be held in the CPU’s register. This usually means a single value of a size less than or equal to the size of an integer. Some machines have registers that can hold floating-point numbers as well.
Second, because the variable might not be stored in memory, its address cannot be taken with the unary & operator. An attempt to do so is flagged as an error by the compiler. Some additional rules affect how useful the register modifier is. Because the number of registers is limited, and because some registers can hold only certain types of data (such as pointers or floating-point numbers), the number and types of register modifiers that will actually have any effect are dependent on what machine the program will run on. Any additional register modifiers are silently ignored by the compiler.
Also, in some cases, it might actually be slower to keep a variable in a register because that register then becomes unavailable for other purposes or because the variable isn’t used enough to justify the overhead of loading and storing it.
So when should the register modifier be used? The answer is never, with most modern compilers. Early C compilers did not keep any variables in registers unless directed to do so, and the register modifier was a valuable addition to the language. C compiler design has advanced to the point, however, where the compiler will usually make better decisions than the programmer about which variables should be stored in registers.
In fact, many compilers actually ignore the register modifier, which is perfectly legal, because it is only a hint and not a directive.
How can you determine the size of an allocated portion of memory?
You can’t, really. free() can , but there’s no way for your program to know the trick free() uses. Even if you disassemble the library and discover the trick, there’s no guarantee the trick won’t change with the next release of the compiler.
Can static variables be declared in a header file?
You can’t declare a static variable without defining it as well (this is because the storage class modifiers static and extern are mutually exclusive). A static variable can be defined in a header file, but this would cause each source file that included the header file to have its own private copy of the variable, which is probably not what was intended.