Full IDE environments like Code::Blocks and Visual Studio often will "index" your source code pre-emptively, so that searches are faster. They will keep track of where function and class definitons are, so that when you right-click a symbol and say "go to definition", the result feels almost instantaneous. The index is a noticeable speed improvement when working in a large project with thousands of source files.
I prefer to use a light weight code editor, but I liked this feature. Searching online, it didn't look like anyone had written a plugin like this for SciTE, the code editor I use. Also, I felt in the mood for writing something fast in C.
So, I wrote "SSIP", the simple source indexing processor. Now, if you are working in a C or C++ codebase (even a large one with many thousands of files), you can
- click on a function or method and press F12 to go to its definition
- Shift-F12 to go to the definition of a function/method/class
- Ctrl-F12 to list references to a function/method/class
My primary goal is speed -- I want this to be useful for large codebases. I've claimed this is "fast". What do I mean?
Looks good to me! :) I think I can safely say that most of the projects I work on have less than 40Mb of source code. Remember that these times are for the slowest initial index creation; subsequent runs only look at files that have been changed and run in milliseconds. And searching against this index takes only tens of milliseconds.
On GitHub I have a more full explanation, but here are a few of the more interesting choices along the way:
- I chose to write in C, eschewing C++ features like classes and exceptions
- Used prepared queries in SQLite for better performance.
- Benchmarked a multi-threaded version, but it was not significantly faster because problem is bounded by SQLite's insert speed.
- Because reading through a single file is fast, I ended up not storing only the filename and not position within the file, the position within the file could be found later.
- Because I am searching whole words, not partial words, I only needed to store the numerical hash.
- In fact, by using a 24-bit hash, I could allocate a block of 2^24 bytes of memory and use it as a simple and extremely fast hash table.
ssip's command line syntax:
The source code I've posted on GitHub here, released under the GPLv3.
Update 2016: my new project SciTE-with-Python has searching that is fast enough for all my needs currently. In SciTE-with-Python, F12 Searches for selected term (definition), Shift+F12 Searches for selected term (all references), and Ctrl+F12 Searches for selected term (declaration). Also, it works with Python in addition to C and C++.