# My most used NumPy/SciPy functions

2013-01-04

I analyzed my recent Python scripts, and it appears that the most used NumPy/SciPy symbols are:

1. `asarray` by a large margin
2. `linspace`, used a third as often as `asarray`
3. `mean`
4. `dot`
5. `sqrt`
6. `roll`
7. `hstack`
8. `float32` (that's not a function, but a type)
9. `loadtxt`
10. `linalg.norm`
11. `arange` (I use it a quarter as often as `linspace`)
12. `array` (copy-by-default, much less used than `asarray`)
13. `where`
14. `diff`
15. `cumsum`
16. `savetxt` (used half as often as `loadtxt`)
17. `max`
18. `cross`
19. `sin`
20. `ones`
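
The "copy-by-default" remark on `array` can be illustrated with a minimal sketch. The variable names are made up for the example, and it assumes a NumPy version that provides `np.shares_memory`:

```python
import numpy as np

a = np.arange(5)

b = np.asarray(a)  # input is already an ndarray: returned as-is, no copy
c = np.array(a)    # copies by default: a new, independent buffer

b_is_a = b is a                            # True, same object
c_shares_memory = np.shares_memory(a, c)   # False, separate data

c[0] = 99          # modifies the copy only; a[0] is still 0

# Lists and nested lists are converted into a new array either way.
nested = np.asarray([[1, 2], [3, 4]])
```

So `asarray` is a cheap way to accept "anything array-like" in a function without paying for a copy when the caller already passes an array.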

Overall, the most used functions:

• convert anything to array (mostly lists and nested lists)
• do simple file input-output (`loadtxt`, `savetxt`)
• construct new arrays from existing blocks (`hstack`, `roll`...) and default arrays (`ones`, `zeros`, `linspace`, `arange`...)
• do basic linear algebra (`dot`, `cross`, `norm`)
• make common reductions (`mean`, `cumsum`, `max`, ...)
• broadcast function application (`sqrt`, `sin`, ...)
• define inter-element relations (`roll`, `diff`)
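
As a rough illustration of how these categories combine in a typical script, here is a small sketch that computes the arc length of a polyline. The data and variable names are invented for the example, and the "file" is an in-memory string so the snippet is self-contained:

```python
import io
import numpy as np

# Hypothetical input: points of a 2D polyline, one "x y" pair per line.
text = "0 0\n3 4\n3 10\n"
points = np.loadtxt(io.StringIO(text))            # simple file input

segments = np.diff(points, axis=0)                # inter-element relation
lengths = np.sqrt((segments ** 2).sum(axis=1))    # element-wise function application
arc_length = np.hstack([0.0, np.cumsum(lengths)]) # reduction + building from blocks
mean_length = lengths.mean()                      # common reduction
```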

This may be useful to know when designing a custom array-like API (I still wish I had time to write a Clojure wrapper around Commons Math).

I counted only qualified and explicit imports (`import numpy as np`, `from numpy import foo`). I didn't count scripts which use `from ... import *`. That's too much work for grep and sed. I also didn't count calls to array methods, since that would require static analysis of Python code.
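
For comparison, the same kind of counting could be sketched with Python's standard `ast` module instead of grep and sed. This is not the script used for the numbers above, only a minimal sketch under some assumptions: the names `count_symbols` and `_UsageCounter` are hypothetical, star imports are skipped just as in the original count, and rebinding or shadowing of imported names is ignored.

```python
import ast
from collections import Counter

PACKAGES = ("numpy", "scipy")


def _collect_bindings(tree):
    """Map each locally bound name to a symbol prefix, e.g. 'np' -> ''."""
    bindings = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                parts = alias.name.split(".")
                if parts[0] not in PACKAGES:
                    continue
                if alias.asname:   # import numpy.linalg as la -> 'la': 'linalg'
                    bindings[alias.asname] = ".".join(parts[1:])
                else:              # import numpy.linalg binds the name 'numpy'
                    bindings[parts[0]] = ""
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            parts = node.module.split(".")
            if parts[0] not in PACKAGES:
                continue
            for alias in node.names:
                if alias.name == "*":
                    continue       # star imports are not counted
                symbol = ".".join(parts[1:] + [alias.name])
                bindings[alias.asname or alias.name] = symbol
    return bindings


class _UsageCounter(ast.NodeVisitor):
    def __init__(self, bindings):
        self.bindings = bindings
        self.counts = Counter()

    def visit_Attribute(self, node):
        # Unwind a chain such as np.linalg.norm down to its base name.
        attrs, cur = [], node
        while isinstance(cur, ast.Attribute):
            attrs.append(cur.attr)
            cur = cur.value
        if isinstance(cur, ast.Name) and cur.id in self.bindings:
            prefix = self.bindings[cur.id]
            names = ([prefix] if prefix else []) + attrs[::-1]
            self.counts[".".join(names)] += 1
        else:
            self.generic_visit(node)

    def visit_Name(self, node):
        # Bare use of a name bound by 'from numpy import foo'.
        prefix = self.bindings.get(node.id)
        if prefix and isinstance(node.ctx, ast.Load):
            self.counts[prefix] += 1


def count_symbols(source):
    tree = ast.parse(source)
    counter = _UsageCounter(_collect_bindings(tree))
    counter.visit(tree)
    return counter.counts


sample = """
import numpy as np
from numpy.linalg import norm
from numpy import asarray

a = asarray([1, 2, 3])
b = np.asarray(a)
print(norm(b), np.linalg.norm(a), a.mean())
"""
counts = count_symbols(sample)
```

For the sample, `asarray` and `linalg.norm` are each counted twice, while `a.mean()` is ignored: deciding that `a` is an array still needs type information that a plain syntax tree does not provide.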

Do you know of any tools for such an analysis?