libdatrie 0.2.13
|
Trie data type and functions. More...
Macros | |
#define | trie_state_is_terminal(s) trie_state_is_walkable((s),0) |
Check for terminal state. More... | |
#define | trie_state_is_leaf(s) (trie_state_is_single(s) && trie_state_is_terminal(s)) |
Check for leaf state. More... | |
Typedefs | |
typedef struct _Trie | Trie |
Trie data type. | |
typedef Bool(* | TrieEnumFunc) (const AlphaChar *key, TrieData key_data, void *user_data) |
Trie enumeration function. More... | |
typedef struct _TrieState | TrieState |
Trie walking state. | |
typedef struct _TrieIterator | TrieIterator |
Trie iteration state. | |
Functions | |
Trie * | trie_new (const AlphaMap *alpha_map) |
Create a new trie. More... | |
Trie * | trie_new_from_file (const char *path) |
Create a new trie by loading from a file. More... | |
Trie * | trie_fread (FILE *file) |
Create a new trie by reading from an open file. More... | |
void | trie_free (Trie *trie) |
Free a trie object. More... | |
size_t | trie_get_serialized_size (Trie *trie) |
Get trie serialized size. More... | |
void | trie_serialize (Trie *trie, uint8 *ptr) |
Serializes trie data into a memory buffer (including mapping) More... | |
int | trie_save (Trie *trie, const char *path) |
Save a trie to file. More... | |
int | trie_fwrite (Trie *trie, FILE *file) |
Write trie data to an open file. More... | |
Bool | trie_is_dirty (const Trie *trie) |
Check pending changes. More... | |
Bool | trie_retrieve (const Trie *trie, const AlphaChar *key, TrieData *o_data) |
Retrieve an entry from trie. More... | |
Bool | trie_store (Trie *trie, const AlphaChar *key, TrieData data) |
Store a value for an entry to trie. More... | |
Bool | trie_store_if_absent (Trie *trie, const AlphaChar *key, TrieData data) |
Store a value for an entry to trie only if the key is not present. More... | |
Bool | trie_delete (Trie *trie, const AlphaChar *key) |
Delete an entry from trie. More... | |
Bool | trie_enumerate (const Trie *trie, TrieEnumFunc enum_func, void *user_data) |
Enumerate entries in trie. More... | |
TrieState * | trie_root (const Trie *trie) |
Get root state of a trie. More... | |
TrieState * | trie_state_clone (const TrieState *s) |
Clone a trie state. More... | |
void | trie_state_copy (TrieState *dst, const TrieState *src) |
Copy trie state to another. More... | |
void | trie_state_free (TrieState *s) |
Free a trie state. More... | |
void | trie_state_rewind (TrieState *s) |
Rewind a trie state. More... | |
Bool | trie_state_walk (TrieState *s, AlphaChar c) |
Walk the trie from the state. More... | |
Bool | trie_state_is_walkable (const TrieState *s, AlphaChar c) |
Test walkability of character from state. More... | |
int | trie_state_walkable_chars (const TrieState *s, AlphaChar chars[], int chars_nelm) |
Get all walkable characters from state. More... | |
Bool | trie_state_is_single (const TrieState *s) |
Check for single path. More... | |
TrieData | trie_state_get_data (const TrieState *s) |
Get data from terminal state. More... | |
TrieIterator * | trie_iterator_new (TrieState *s) |
Create a new trie iterator. More... | |
void | trie_iterator_free (TrieIterator *iter) |
Free a trie iterator. More... | |
Bool | trie_iterator_next (TrieIterator *iter) |
Move trie iterator to the next entry. More... | |
AlphaChar * | trie_iterator_get_key (const TrieIterator *iter) |
Get key for a trie iterator. More... | |
TrieData | trie_iterator_get_data (const TrieIterator *iter) |
Get data for the entry referenced by an iterator. More... | |
Trie data type and functions.
Trie is a kind of digital search tree, an efficient indexing method with O(1) time complexity for searching. Comparably as efficient as hashing, trie also provides flexibility on incremental matching and key spelling manipulation. This makes it ideal for lexical analyzers, as well as spelling dictionaries.
This library is an implementation of double-array structure for representing trie, as proposed by Junichi Aoe. The details of the implementation can be found at http://linux.thai.net/~thep/datrie/datrie.html
A Trie is associated with an AlphaMap, a map between actual alphabet characters and the raw characters used to walk through trie. You can define the alphabet set by adding ranges of character codes to it before associating it to a trie. And the keys to be added to the trie must comprise only characters in such ranges. Note that the size of the alphabet set is limited to 256 (TRIE_CHAR_MAX + 1), and the AlphaMap will map the alphabet characters to raw codes in the range 0..255 (0..TRIE_CHAR_MAX). The alphabet character ranges need not be continuous, but the mapped raw codes will be continuous, for the sake of compactness of the trie.
A new Trie can be created in memory using trie_new(), saved to file using trie_save(), and loaded later with trie_new_from_file(). It can even be embeded in another file using trie_fwrite() and read back using trie_fread(). After use, Trie objects must be freed using trie_free().
Operations on trie include:
#define trie_state_is_leaf | ( | s | ) | (trie_state_is_single(s) && trie_state_is_terminal(s)) |
Check for leaf state.
s | : the state to check |
Check if the given state is a leaf state. A leaf state is a terminal state that has no other branch.
#define trie_state_is_terminal | ( | s | ) | trie_state_is_walkable((s),0) |
Check for terminal state.
s | : the state to check |
Check if the given state is a terminal state. A terminal state is a trie state that terminates a key, and stores a value associated with it.
Trie enumeration function.
key | : the key of the entry |
data | : the data of the entry |
user_data | : the user-supplied data on enumerate call |
Delete an entry from trie.
trie | : the trie |
key | : the key for the entry to delete |
Delete an entry for the given key from trie.
Bool trie_enumerate | ( | const Trie * | trie, |
TrieEnumFunc | enum_func, | ||
void * | user_data | ||
) |
Enumerate entries in trie.
trie | : the trie |
enum_func | : the callback function to be called on each key |
user_data | : user-supplied data to send as an argument to enum_func |
Enumerate all entries in trie. For each entry, the user-supplied enum_func callback function is called, with the entry key and data. Returning FALSE from such callback will stop enumeration and return FALSE.
Trie * trie_fread | ( | FILE * | file | ) |
Create a new trie by reading from an open file.
file | : the handle of the open file |
Create a new trie and initialize its contents by reading from the open file. After reading, the file pointer is left at the end of the trie data. This can be useful for reading embedded trie index as part of a file data.
The created object must be freed with trie_free().
Available since: 0.2.4
void trie_free | ( | Trie * | trie | ) |
Free a trie object.
trie | : the trie object to free |
Destruct the trie and free its allocated memory.
int trie_fwrite | ( | Trie * | trie, |
FILE * | file | ||
) |
Write trie data to an open file.
trie | : the trie |
file | : the open file |
Write trie data to file which is opened for writing. After writing, the file pointer is left at the end of the trie data. This can be useful for embedding trie index as part of a file data.
Available since: 0.2.4
size_t trie_get_serialized_size | ( | Trie * | trie | ) |
Get trie serialized size.
trie | : the trie |
Returns size that would be occupied by a trie if it was serialized into a binary blob or file.
Available since: 0.2.13
Bool trie_is_dirty | ( | const Trie * | trie | ) |
Check pending changes.
trie | : the trie object |
Check if the trie is dirty with some pending changes and needs saving to keep the file synchronized.
void trie_iterator_free | ( | TrieIterator * | iter | ) |
Free a trie iterator.
iter | : the trie iterator to free |
Destruct the iterator iter and free its allocated memory.
Available since: 0.2.6
TrieData trie_iterator_get_data | ( | const TrieIterator * | iter | ) |
Get data for the entry referenced by an iterator.
iter | : an iterator |
Get value for the entry referenced by an iterator. Getting value from an un-iterated (or broken for any reason) iterator will result in TRIE_DATA_ERROR.
Available since: 0.2.6
AlphaChar * trie_iterator_get_key | ( | const TrieIterator * | iter | ) |
Get key for a trie iterator.
iter | : an iterator |
Get key for the current entry referenced by the trie iterator iter.
The return string must be freed with free().
Available since: 0.2.6
TrieIterator * trie_iterator_new | ( | TrieState * | s | ) |
Create a new trie iterator.
s | : the TrieState to start iteration from |
Create a new trie iterator for iterating entries of a sub-trie rooted at state s.
Use it with the result of trie_root() to iterate the whole trie.
The created object must be freed with trie_iterator_free().
Available since: 0.2.6
Bool trie_iterator_next | ( | TrieIterator * | iter | ) |
Move trie iterator to the next entry.
iter | : an iterator |
Move trie iterator to the next entry. On return, the iterator iter is updated to reference to the new entry if successfully moved.
Available since: 0.2.6
Create a new trie.
alpha_map | : the alphabet set for the trie |
Create a new empty trie object based on the given alpha_map alphabet set. The trie contents can then be added and deleted with trie_store() and trie_delete() respectively.
The created object must be freed with trie_free().
Trie * trie_new_from_file | ( | const char * | path | ) |
Create a new trie by loading from a file.
path | : the path to the file |
Create a new trie and initialize its contents by loading from the file at given path.
The created object must be freed with trie_free().
Retrieve an entry from trie.
trie | : the trie |
key | : the key for the entry to retrieve |
o_data | : the storage for storing the entry data on return |
Retrieve an entry for the given key from trie. On return, if key is found and o_data is not NULL, *o_data is set to the data associated to key.
Get root state of a trie.
trie | : the trie |
Get root state of trie, for stepwise walking.
The returned state is allocated and must be freed with trie_state_free()
int trie_save | ( | Trie * | trie, |
const char * | path | ||
) |
Save a trie to file.
trie | : the trie |
path | : the path to the file |
Create a new file at the given path and write trie data to it. If path already exists, its contents will be replaced.
void trie_serialize | ( | Trie * | trie, |
uint8 * | ptr | ||
) |
Serializes trie data into a memory buffer (including mapping)
trie | : the trie |
ptr | : a pointer to current position inside of a preallocated buffer. |
Write trie data to a current position in a buffer pointed by ptr. This can be useful for embedding trie index as part of a file data.
The size that the trie will occupy can be calculated using trie_get_serialized_size()
Available since: 0.2.13
Clone a trie state.
s | : the state to clone |
Make a copy of trie state.
The returned state is allocated and must be freed with trie_state_free()
Copy trie state to another.
dst | : the destination state |
src | : the source state |
Copy trie state data from src to dst. All existing data in dst is overwritten.
void trie_state_free | ( | TrieState * | s | ) |
Free a trie state.
s | : the state to free |
Free the trie state.
Get data from terminal state.
s | : a terminal state |
Get value from a terminal state of trie. Getting value from a non-terminal state will result in TRIE_DATA_ERROR.
Bool trie_state_is_single | ( | const TrieState * | s | ) |
Check for single path.
s | : the state to check |
Check if the given state is in a single path, that is, there is no other branch from it to leaf.
Test walkability of character from state.
s | : the state to check |
c | : the input character |
Test if there is a transition from state s with input character c.
void trie_state_rewind | ( | TrieState * | s | ) |
Rewind a trie state.
s | : the state to rewind |
Put the state at root.
Walk the trie from the state.
s | : current state |
c | : key character for walking |
Walk the trie stepwise, using a given character c. On return, the state s is updated to the new state if successfully walked.
Get all walkable characters from state.
s | : the state to get |
chars | : the storage for the result |
chars_nelm | : the size of chars[] in number of elements |
Get the list of all walkable characters from state s. At most chars_nelm walkable characters are stored in chars[] on return.
The function returns the actual number of walkable characters from s. Note that this may not equal the number of characters stored in chars[] if chars_nelm is less than the actual number.
Available since: 0.2.6
Store a value for an entry to trie.
trie | : the trie |
key | : the key for the entry to store |
data | : the data associated to the entry |
Store a data for the given key in trie. If key does not exist in trie, it will be appended. If it does, its current data will be overwritten.
Store a value for an entry to trie only if the key is not present.
trie | : the trie |
key | : the key for the entry to store |
data | : the data associated to the entry |
Store a data for the given key in trie. If key does not exist in trie, it will be inserted. If it does, the function will return failure and the existing value will not be touched.
This can be useful for multi-thread applications, as race condition can be avoided.
Available since: 0.2.4