I have just realized that I had missed a very comprehensive analysis of the DocGraph RX dataset by Janos Hajagos.
Highlights from the post:
The second core aspect of the work was to map semantic clinical drugs (TTY=SCD) to WHO’s ATC drug classification system. ATC drug codes are free to use for non-commercial purposes. RxNorm includes an accidental mapping to ATC drug codes but they are based on a single ingredient and not on the route, e.g., oral versus topical. A rather time-consuming process, in terms of writing SQL, was done to improve the mapping. The final result, while not 100% complete, allows drugs to be sorted by a synthetic ATC code. Certain branded drugs like Skelaxin (Metaxalone) are not part of the current ATC release. Whenever possible I try to map to the longest length ATC code. The advantage of using ATC to sort the drugs is that we put drugs that are similar spatially near each other. The MySQL queries for generating the refined drug database are on GitHub.
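To make the "longest ATC code" and sorting ideas concrete, here is a minimal Python sketch of my own. It is not Janos's code (his actual work is done in MySQL), and the function names, the drug labels, and the candidate code lists are all assumptions made up for illustration. It only shows why picking the most specific code and then sorting by it places related drugs next to each other.

```python
# Illustrative sketch only: names and sample data are hypothetical,
# not taken from the DocGraph RX analysis or its MySQL queries.

def pick_longest_atc(candidates):
    """Return the most specific (longest) ATC code from a list.

    Ties on length are broken by taking the alphabetically first code.
    """
    return min(candidates, key=lambda code: (-len(code), code))


def sort_by_atc(drug_to_candidates):
    """Choose one ATC code per drug, then sort drugs by that code.

    Drugs with no candidate codes are skipped, since the mapping
    is not expected to be 100% complete.
    """
    chosen = {
        drug: pick_longest_atc(codes)
        for drug, codes in drug_to_candidates.items()
        if codes
    }
    return sorted(chosen.items(), key=lambda item: item[1])


# Hypothetical candidate mappings; some entries have only a partial code.
candidates = {
    "atorvastatin oral tablet": ["C10AA", "C10AA05"],
    "metformin oral tablet": ["A10BA02"],
    "simvastatin oral tablet": ["C10", "C10AA01"],
    "unmapped branded drug": [],
}

ordered = sort_by_atc(candidates)
for drug, atc in ordered:
    print(atc, drug)
```

Because ATC codes are hierarchical (each extra character narrows the class), a plain string sort on the chosen code is enough to keep the two statins in this toy example adjacent and separate from the diabetes drug.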
This is truly unprecedented work!!!
(Update Feb 2014: I misspelled Hajagos in this article. How embarrassing.)