Hopefully this will shed some light on what I've been talking about:
dictionary.pd
dictionary-help.pd
A few things to notice:
- Search performance should be at least 10x faster than the worst case for [text search].
- The total computation time for [dictionary] should be vastly more predictable than [text search].
However, there's still a clear bottleneck in this abstraction-- calculating the hash. I'm not sure whether it's the conversion from symbol to list or the recursive addition, but it is slow. So slow that if you reimplement it in C, as I did to test, you get another 10x speed increase.
So exposing the (C-level) symbol hashing to the user and wrapping it in an abstraction as I've done should give you a 100x speedup over the worst-case linear lookup time in your patch above with 10,000 items.
A more dramatic restatement of that-- since memory is cheap, this means you can create a much larger hashmap and handle an enormous number of key/value pairs in realtime, which would not be possible with the linear lookup times of [text].
Edit: the abstraction is just a demo. It will happily duplicate keys (preventable by adding a [text search] in the right object chain) and provides no mechanism to delete entries. While lookup is tuned for realtime performance, adding keys will cause intermittent memory allocation.
Edit2: This is basically the way t_symbol *gensym(char*) works in the Pd source code. It hashes the string to get an array index, then searches through the linked list of symbols at that index for the string. If it finds it, it returns the memory address; if not, it creates a new entry and returns that address. That's why you can compare two symbols in the Pd source with the "==" operator-- if they're both at the same address, they're the same string.