“Everyone’s got an API. I want mine!”

First of all, for the record, I didn’t say “Everyone’s got an API. I want mine!”; it’s just the title of this post. Secondly, that is the wrong frame of mind for anyone to have about anything, because it’s not good to go into building something if you are primarily motivated to be “second best.” But I shouldn’t divert from the main topic here. I’m writing to shed some light on the API business. I’m mainly concerned with REST Web services because REST is by far the most popular and useful style of Web service out there. It is especially popular among JavaScript programmers because JSON is one of the formats in which REST APIs transfer state. SOAP, by contrast, does not support JSON: its messages are XML.

Some people might have heard about SOAP and REST Web services but might be puzzled about what these two really provide and which to choose to solve a specific problem. SOAP stands for Simple Object Access Protocol, while REST stands for Representational State Transfer. So that I don’t bore you with as many technical details as some others would, I will just say that the main pragmatic difference between the two is that a REST service can ship its data in a variety of formats such as JSON, XML, and (sometimes) YAML, while SOAP ships in, and depends on, XML as its message format. Applications that implement a REST architecture are usually said to be RESTful.
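
To make the format difference concrete, here is a small Python sketch that serializes the same record as JSON and as XML using only the standard library. The book record and the helper names (to_json, to_xml, from_xml) are made up for illustration; this is not taken from any particular REST or SOAP service, and a real SOAP message would also wrap the XML in an envelope.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, used only for illustration.
book = {"id": "42", "title": "Dune", "author": "Frank Herbert"}


def to_json(record):
    """Serialize the record the way a JSON-speaking REST API might."""
    return json.dumps(record, sort_keys=True)


def to_xml(record, root_tag="book"):
    """Serialize the same record as a flat XML element."""
    root = ET.Element(root_tag)
    for key in sorted(record):
        child = ET.SubElement(root, key)
        child.text = str(record[key])
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse the flat XML produced by to_xml back into a dict."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


print(to_json(book))
print(to_xml(book))
```

A JavaScript client can turn the JSON form into an object with a single JSON.parse call, which is a large part of why that format is so convenient on the browser side.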

Who uses REST and why do people use it? A lot of companies! All of Yahoo’s web services use REST, including Flickr. The Del.icio.us API uses it, as do pubsub, Bloglines, and Technorati, and both eBay and Amazon offer REST web services. Here are the defining qualities that make REST stand out:

  • Lightweight – excludes much of the redundancy that accompanies SOAP.
  • Easy to build – no special toolkits required.
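
As a rough illustration of the “no special toolkits” point, the sketch below serves a tiny read-only JSON API with nothing but Python’s standard library. The BOOKS data, the /books routes, and the names handle_get, BookHandler, and serve are all assumptions made up for this example; a production API would need much more (authentication, other HTTP verbs, error handling).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data; a real service would use a database.
BOOKS = {"1": {"id": "1", "title": "Dune"}}


def handle_get(path):
    """Map a request path to (HTTP status, JSON-serializable payload)."""
    parts = [p for p in path.split("/") if p]
    if parts == ["books"]:
        return 200, list(BOOKS.values())
    if len(parts) == 2 and parts[0] == "books" and parts[1] in BOOKS:
        return 200, BOOKS[parts[1]]
    return 404, {"error": "not found"}


class BookHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = handle_get(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering GET requests on localhost."""
    HTTPServer(("127.0.0.1", port), BookHandler).serve_forever()
```

Calling serve() and then visiting /books/1 on port 8000 in a browser should return the JSON for that book. The routing lives in a plain function (handle_get) so that it can be tried out without starting a server.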

Now let’s get to the real business: how do you build a REST API? (I hope you are convinced by now that REST is easier to produce and use than SOAP.) I could write a bunch of pages on how to write a REST API with a server-side language like PHP or Ruby, or on how to describe one with WSDL (Web Services Description Language, which is more commonly associated with SOAP), but I would be reinventing the wheel by doing so. Instead, I have a list of tutorials that I believe people can benefit from, just as I did, to learn how to create their own REST APIs. Please remember that when following these tutorials, you should take note of the helper libraries they rely on (PHP, for instance, ships a SoapServer class for the SOAP side). It is tedious, slow, and unwise to build your own library from the ground up without using any helper libraries. Take a look at this list of resources on API creation and development:

OK, that’s it, folks! Follow more than one of these tutorials (and make sure you listen to the Google tech video; it’s priceless!).

Python script to mirror a directory on another (unix/linux/mac)

For some time now, I’ve been trying to figure out how to mirror a directory recursively so that all files in the source directory (at any depth) also end up in the destination directory. In addition, if any files in the source directory change, we should be able to update the destination directory just by running the program again.

Precondition:
For the program to run, you need to make the variables, from_dir and to_dir, point to the source and destination directories respectively.

At the moment, the script works well and is thoroughly commented to explain how it works and what I intended. On the other hand, I believe the script could be improved in a few ways:

  • Accept input in the Python script so that the client only needs to enter the values of the from_dir and to_dir variables.
  • Make it cleaner by commenting more.
  • Or even package it into a class.
  • Or, if we (or you or I) feel very challenged (or less lazy), distribute it as a third-party Python package.

Please modify as you please! This code is in the public domain.

#!/usr/bin/env python3

# Python script to compare two directories, from_dir and to_dir,
# and copy the files from from_dir that are not in to_dir
# (or that are newer than their copies in to_dir)
# it does this recursively as it walks on directories
# author Daniel Alabi
# date
import os
import os.path
import shutil

# change from_dir and to_dir as appropriate
from_dir = "/home/daniel/jquery/"
to_dir = "/home/daniel/jquery-alias/"

# getExtDir gets the remaining path of the full
# path when "from_dir" has been removed from
# dir (returns None if dir is not inside from_dir)
def getExtDir(dir):
    # plain string comparison: a path is not a regular expression
    if dir.startswith(from_dir):
        return dir[len(from_dir):]
    return None

# getUpperLevelDir gets the full path of the directory
# that contains dirString (with a trailing "/")
def getUpperLevelDir(dirString):
    # handle the case where dirString already has a
    # trailing "/"
    if dirString[-1] == "/":
        dirString = dirString[0:-1]
    to = dirString.rfind("/")
    return dirString[0:to] + "/"

# updateDirs recursively updates to_here until
# it has the same files (up-to-date ones) and
# structure as from_here
def updateDirs(from_here, to_here):
    # walk the from_here directory
    for file_or_dir in os.listdir(from_here):
        # build full paths so we never depend on the
        # current working directory
        to_file_or_dir = os.path.join(to_here, file_or_dir)
        from_file_or_dir = os.path.join(from_here, file_or_dir)

        # if it is a dir, first of all attach a trailing forward slash
        # this will make it easier for the remaining part of
        # the program to deal with the directories, from_here and to_here
        if os.path.isdir(from_file_or_dir):
            to_file_or_dir += "/"
            from_file_or_dir += "/"

            # check if to_file_or_dir (as a dir) exists
            # if it doesn't make the directory
            if not os.path.exists(to_file_or_dir):
                os.mkdir(to_file_or_dir)

            # now we are sure that to_file_or_dir
            # exists and is a dir; recurse into it
            updateDirs(from_file_or_dir, to_file_or_dir)
        else:
            # it is a file

            # check if the file exists
            if os.path.exists(to_file_or_dir):
                # get the times when they were modified last
                modified_to = os.stat(to_file_or_dir).st_mtime
                modified_from = os.stat(from_file_or_dir).st_mtime

                # Since we just want to make sure that the file
                # in to_here (the directory we are going to) mirrors
                # the one in from_here, we check whether the one in
                # from_here was modified more recently
                # if it was, we copy it over the existing one in to_here
                if modified_from > modified_to:
                    # print useful messages to screen about the update
                    # i'm about to make
                    print(from_file_or_dir, "has been modified and",
                          to_here, "not updated")
                    print("Copying", from_file_or_dir, "from",
                          from_here, "to", to_here)

                    # copy2 overwrites the old copy and keeps the
                    # modification time, so nothing has to be removed
                    shutil.copy2(from_file_or_dir,
                                 getUpperLevelDir(to_file_or_dir))
            else:
                # the file to_file_or_dir does not exist
                # so we copy the file from from_here
                # to to_here

                # print useful message to screen about copying
                # from_file_or_dir to to_here
                print(from_file_or_dir, "does not exist in", to_here)
                print("Copying", from_file_or_dir, "from",
                      from_here, "to", to_here)

                shutil.copy2(from_file_or_dir,
                             getUpperLevelDir(to_file_or_dir))

# traditional main that calls
# updateDirs -- the recursive function
def main():
    print("***Updating files in", to_dir, "to mirror the ones in",
          from_dir, "***\n")

    # make sure the destination directory itself exists
    if not os.path.exists(to_dir):
        os.makedirs(to_dir)

    updateDirs(from_dir, to_dir)

    print("***FINISHED***")

if __name__ == "__main__":
    main()

End of Program

Note: There might be some indentation mistakes in the above script which might have occurred in the process of copying the source from my editor to the WordPress post textArea. Bear with me!
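
The first improvement suggested earlier (letting the client enter from_dir and to_dir instead of editing the script) could look roughly like the sketch below. It is a compact Python 3 variant rather than a drop-in patch: it swaps the hand-written recursion for os.walk and copies with shutil.copy2, and the names parse_args and mirror are my own assumptions, not part of the original script.

```python
import argparse
import os
import shutil


def parse_args(argv=None):
    """Read from_dir and to_dir from the command line."""
    parser = argparse.ArgumentParser(description="Mirror from_dir into to_dir.")
    parser.add_argument("from_dir", help="source directory")
    parser.add_argument("to_dir", help="destination directory")
    return parser.parse_args(argv)


def mirror(from_dir, to_dir):
    """Copy files that are missing from to_dir or newer in from_dir.

    Returns the list of destination paths that were written.
    """
    copied = []
    for root, dirs, files in os.walk(from_dir):
        # Path of the current directory relative to the source root.
        rel = os.path.relpath(root, from_dir)
        target_root = os.path.normpath(os.path.join(to_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            missing = not os.path.exists(dst)
            if missing or os.stat(src).st_mtime > os.stat(dst).st_mtime:
                # copy2 keeps the modification time, so an unchanged
                # file is not copied again on the next run.
                shutil.copy2(src, dst)
                copied.append(dst)
    return copied
```

To use it as a script, you would call args = parse_args() followed by mirror(args.from_dir, args.to_dir) at the bottom of the file, and then run it with the two directories as arguments.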

Why Use Smart Pointers in C++?

There are a lot of mechanisms through which programmers can make their code cleaner, conserve memory, and still harness the power of the programming language in use. Such mechanisms almost always have demerits coupled with their merits. One such mechanism, used primarily in intermediate-level programming languages such as C++, is the use of smart pointers, which help with automatic garbage collection and/or bounds checking. Smart pointers can save a programmer time and can help prevent a program from crashing, or from corrupting files on the system on which it runs, because of memory leaks or out-of-bounds writes. A smart pointer is an abstract data type that simulates an ordinary pointer while providing extra features, mostly centered around memory management. This article is about the use of smart pointers in C++.

A pointer in C/C++ stores the memory address of its pointee (for example, the address of an object, that is, an instance of a class).

Book* myPointer = new Book(); // myPointer can point to an instance of the Book class or of any subclass of the
// Book class

Every time we allocate memory on the heap (with new), we must remember to deallocate it in order to make that memory available again for use by other parts of the program. This can be done with the delete keyword like this:

delete myPointer; // or delete[] myPointers if myPointers points to an array allocated with new[]

Failure to delete a pointer’s pointee can result in a memory leak (where memory allocated from the free store is no longer available for use by other parts of the program). So a programmer must always keep track of his pointers and avoid letting them go out of scope without deallocating the memory used by the corresponding pointees. It is hard, especially in complicated or verbose source code, for the programmer to keep track of all the pointers in use and delete every one of them. Smart pointers simplify this task.

There are different ways to implement smart pointers, among them reference counting and operator overloading. In C++, smart pointers are typically implemented as a class template that mimics, by means of operator overloading, the behavior of traditional (raw) pointers (e.g., dereferencing and assignment) while providing additional memory management. One example is the set of smart pointers in Boost, a high-quality open-source collection of libraries that has been considered for inclusion in the standard C++ library. Boost provides the following smart pointer implementations:

  • shared_ptr<T>: a pointer to T that uses a reference count to determine when the object is no longer needed. shared_ptr is the generic, most versatile smart pointer offered by Boost.
  • scoped_ptr<T>: a pointer automatically deleted when it goes out of scope. No assignment is possible, but there are no performance penalties compared to “raw” pointers.
  • intrusive_ptr<T>: another reference-counting pointer. It provides better performance than shared_ptr, but requires the type T to provide its own reference counting mechanism.
  • weak_ptr<T>: a weak pointer, working in conjunction with shared_ptr to avoid circular references.
  • shared_array<T>: like shared_ptr, but the access syntax is for an array of T.
  • scoped_array<T>: like scoped_ptr, but the access syntax is for an array of T.
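
To show the bookkeeping behind a reference-counted pointer such as shared_ptr, here is a deliberately simplified model written in Python rather than C++. It is only a conceptual sketch: the names ControlBlock and SharedHandle are made up, copies and releases are explicit method calls (C++ does them in copy constructors and destructors), and this is not how Boost actually implements shared_ptr.

```python
class ControlBlock:
    """Shared bookkeeping: the resource, its deleter, and the owner count."""

    def __init__(self, resource, deleter):
        self.resource = resource
        self.deleter = deleter
        self.count = 0


class SharedHandle:
    """Toy stand-in for a reference-counted smart pointer."""

    def __init__(self, resource=None, deleter=None, _block=None):
        if _block is None:
            _block = ControlBlock(resource, deleter)
        self._block = _block
        self._block.count += 1  # this handle is now one more owner

    def copy(self):
        """Create another owner of the same resource."""
        return SharedHandle(_block=self._block)

    def get(self):
        return self._block.resource if self._block else None

    def use_count(self):
        return self._block.count if self._block else 0

    def reset(self):
        """Give up ownership; the last owner runs the deleter."""
        block, self._block = self._block, None
        if block is None:
            return
        block.count -= 1
        if block.count == 0:
            block.deleter(block.resource)


freed = []
first = SharedHandle("book", freed.append)
second = first.copy()
first.reset()   # one owner left, nothing is freed yet
second.reset()  # last owner gone, the deleter runs
print(freed)
```

The resource is released exactly once, when the last handle lets go of it, no matter which handle that happens to be.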

I can’t go through all these implementations in this article. For more details, check out the online documentation of the Boost C++ libraries. In addition, the C++ standard library provides a class template called auto_ptr (declared in the <memory> header) that provides some basic supplementary memory management capabilities for C++ raw pointers. Note that auto_ptr was deprecated in C++11 in favor of unique_ptr and removed in C++17.

Smart pointers do not only help with memory management; they also support intentional programming (where the code reflects the programmer’s intention, that is, the original conception he had when he was about to start coding). For example, suppose the programmer wants a function that returns a pointer (which is, by the way, useful for returning polymorphic objects), like this: Book* myPointerFunction();. With raw pointers, he would have to figure out who deletes the object that myPointerFunction() returns, and when. If the function returned a smart pointer instead, he wouldn’t have to worry about deleting that object, and the code would more readily portray his intention without side effects.

In programming languages like Java, C#, VB, and Python, the deleting of “pointers” (there are no explicit pointers) is done by their respective garbage collection algorithms and mechanisms, so programmers don’t have to worry about deleting pointers. Even within one of these languages, the garbage collection scheme can differ depending on the implementation. Take Python: the standard C implementation (CPython) uses reference counting to detect inaccessible objects, plus a separate mechanism to collect reference cycles, which periodically runs a cycle detection algorithm that looks for inaccessible cycles and deletes the objects involved. The gc module provides functions to force garbage collection, obtain debugging statistics, and tune the collector’s parameters. Jython, on the other hand, relies on the Java runtime, so the JVM’s garbage collector is used. The same applies to IronPython, which uses the CLR garbage collector. This difference can cause some subtle porting problems if your Python code depends on the behavior of the reference counting implementation.
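
The two CPython mechanisms described above can be observed with a short experiment. This is a sketch that assumes CPython specifically (Jython or IronPython may behave differently, which is exactly the porting caveat); the Node class and the two function names are invented for the example.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None


def refcount_frees_immediately():
    """Without a cycle, the object dies as soon as its last reference goes."""
    node = Node("a")
    probe = weakref.ref(node)  # a weak reference does not keep node alive
    del node
    return probe() is None


def cycle_needs_collector():
    """With a cycle, only the cycle detector can reclaim the objects."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the experiment
    try:
        a, b = Node("a"), Node("b")
        a.other, b.other = b, a  # a and b now reference each other
        probe = weakref.ref(a)
        del a, b
        alive_before = probe() is not None
        gc.collect()  # force a cycle-detection pass
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


print(refcount_frees_immediately())
print(cycle_needs_collector())
```

On CPython the first function reports that the object was freed right away, while in the second the pair stays alive until gc.collect() runs. Code that relies on the first behavior (for example, to close files promptly) is the kind that can break on other implementations.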

A major demerit of smart pointers is that their encapsulating nature can cause problems when they are used with a “rigid” framework that only accepts raw pointers. This can be very difficult to handle unless the smart pointer implementation in question has an implicit conversion to its raw pointer type (a dangerous thing indeed).

I must say that the disadvantages of using smart pointers are minor! On the other hand, smart pointers can ultimately save the programmer code-writing and debugging time and improve the quality of the software in which they are used. Overall, smart pointers are very useful when they are needed (that is, when there is a somewhat heavy, tangled, or non-explicit use of pointers) but must be used wisely and carefully.