Class: Iowa::Classifier
- Inherits: Object
- Defined in: ext/Classifier/classifier.c
Instance Method Summary
- #[](item) ⇒ Object
-
#insert("/someuri", SampleHandler.new) ⇒ nil
Registers the SampleHandler (one for all requests) at “/someuri”.
-
#delete("/someuri") ⇒ Object
Yep, just removes this URI and its handler from the trie.
- #handler_map ⇒ Object
-
#new ⇒ Classifier
constructor
Initializes a new Classifier object that you can use to associate URI sequences with objects.
- #inspect ⇒ Object
- #keys ⇒ Object
-
#resolve(item) ⇒ Object
Attempts to resolve either the whole URI or its longest registered prefix, returning the prefix (as script_name), the remaining path (as path_info), and the registered handler (usually an HttpHandler).
Constructor Details
#new ⇒ Classifier
Initializes a new Classifier object that you can use to associate URI sequences with objects. You can actually use it with any string sequence and any objects, but it’s mostly used with URIs.
It uses TST from www.octavian.org/cs/software.html to build a ternary search trie that holds all of the URIs. It uses this trie to do an initial search for a URI prefix, and then to break the URI into SCRIPT_NAME and PATH_INFO portions. Most of the time it will actually do two searches in order to find the right handler for the registered prefix portion.
# File 'ext/Classifier/classifier.c', line 58
VALUE Classifier_init(VALUE self)
{
  VALUE hash;

  // we create an internal hash to protect stuff from the GC
  hash = rb_hash_new();
  rb_ivar_set(self, id_handler_map, hash);

  return self;
}
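The trie lookup described above can be sketched in plain Ruby. This is only an illustration of how a ternary search trie finds the longest stored prefix of a string; it is not the TST C library the extension links against, and the names `TernaryTrie`, `Node`, and `longest_prefix` are hypothetical.

```ruby
# Hypothetical pure-Ruby ternary search trie, for illustration only.
class TernaryTrie
  # Each node holds one character plus lower, equal, and higher children.
  Node = Struct.new(:char, :lo, :eq, :hi, :value, :terminal)

  def initialize
    @root = nil
  end

  # Store value under key, descending one character at a time.
  def insert(key, value)
    raise ArgumentError, "empty key" if key.empty?
    @root = insert_at(@root, key, 0, value)
    nil
  end

  # Return [value, prefix_length] for the longest stored key that is a
  # prefix of str, or [nil, 0] when no stored key matches.
  def longest_prefix(str)
    node = @root
    i = 0
    best = [nil, 0]
    while node && i < str.length
      c = str[i]
      if c < node.char
        node = node.lo
      elsif c > node.char
        node = node.hi
      else
        i += 1
        best = [node.value, i] if node.terminal
        node = node.eq
      end
    end
    best
  end

  private

  def insert_at(node, key, i, value)
    c = key[i]
    node ||= Node.new(c, nil, nil, nil, nil, false)
    if c < node.char
      node.lo = insert_at(node.lo, key, i, value)
    elsif c > node.char
      node.hi = insert_at(node.hi, key, i, value)
    elsif i < key.length - 1
      node.eq = insert_at(node.eq, key, i + 1, value)
    else
      node.value = value
      node.terminal = true
    end
    node
  end
end

trie = TernaryTrie.new
trie.insert("/something/lik", :handler1)
trie.insert("/something/like/that", :handler2)
p trie.longest_prefix("/something/like")           # => [:handler1, 14]
p trie.longest_prefix("/something/like/that/too")  # => [:handler2, 20]
```

The search visits at most one chain of nodes per input character, and the returned prefix length is what lets a caller split the string into a matched part and a remainder.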
Instance Method Details
#[](item) ⇒ Object
# File 'ext/Classifier/classifier.c', line 198
VALUE Classifier_match(VALUE self, VALUE item)
{
  void *handler = NULL;
  int pref_len = 0;
  struct tst *tst = NULL;
  unsigned char *item_str = NULL;

  DATA_GET(self, struct tst, tst);
  item_str = (unsigned char *)StringValueCStr(item);

  /* longest-prefix search; pref_len is filled in but not used here */
  handler = tst_search(item_str, tst, &pref_len);

  /* return only the handler, or nil when nothing matched */
  if(handler) {
    return (VALUE)handler;
  }

  return Qnil;
}
#insert("/someuri", SampleHandler.new) ⇒ nil
Registers the SampleHandler (one for all requests) at “/someuri”. When Classifier#resolve is called with “/someuri” it’ll return SampleHandler immediately. When called with “/someuri/iwant” it’ll also return SampleHandler immediately, with no additional searches, but it will return “/iwant” as the path info.
You can actually reuse this class to insert nearly anything and quickly resolve it. This could be used for caching, fast mapping, etc. The downside is that it uses much more memory than a Hash, but it can be a lot faster. Its main advantage is that it works on prefixes, which is damn hard to get right with a Hash.
# File 'ext/Classifier/classifier.c', line 86
VALUE Classifier_insert(VALUE self, VALUE item, VALUE handler)
{
  int rc = 0;
  void *ptr = NULL;
  struct tst *tst = NULL;

  DATA_GET(self, struct tst, tst);
  rc = tst_insert((unsigned char *)StringValueCStr(item), (void *)handler, tst, 0, &ptr);

  if(rc == TST_DUPLICATE_KEY) {
    rb_raise(rb_eStandardError, "Handler already inserted with that name");
  } else if(rc == TST_ERROR) {
    rb_raise(rb_eStandardError, "Memory error inserting handler");
  } else if(rc == TST_NULL_KEY) {
    rb_raise(rb_eStandardError, "Value to insert was empty");
  }

  rb_hash_aset(rb_ivar_get(self, id_handler_map), item, handler);

  return Qnil;
}
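To see why prefix matching is awkward with a Hash, here is a rough sketch of the Hash-only approach. `HashPrefixMap` and its methods are hypothetical stand-ins, not part of Iowa; only the duplicate-key error message is taken from the C source above.

```ruby
# Hypothetical stand-in that keeps handlers in a plain Hash.
class HashPrefixMap
  def initialize
    @handlers = {}
  end

  # Mirrors the duplicate-key check shown in Classifier_insert.
  def insert(uri, handler)
    if @handlers.key?(uri)
      raise StandardError, "Handler already inserted with that name"
    end
    @handlers[uri] = handler
    nil
  end

  # Probe every prefix of uri, longest first. Each probe builds and
  # hashes a new substring, so one lookup can cost many Hash lookups.
  def longest_prefix(uri)
    uri.length.downto(1) do |len|
      prefix = uri[0, len]
      return [prefix, @handlers[prefix]] if @handlers.key?(prefix)
    end
    nil
  end
end

map = HashPrefixMap.new
map.insert("/someuri", :sample_handler)
p map.longest_prefix("/someuri/iwant")  # => ["/someuri", :sample_handler]
```

A Hash can only answer exact-key questions, so the sketch has to guess prefixes one at a time. A trie answers the same question in a single pass over the characters of the URI.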
#delete("/someuri") ⇒ Object
Yep, just removes this URI and its handler from the trie. Returns the removed handler, or nil if nothing was registered at that URI.
# File 'ext/Classifier/classifier.c', line 115
VALUE Classifier_delete(VALUE self, VALUE item)
{
  void *handler = NULL;
  struct tst *tst = NULL;

  DATA_GET(self, struct tst, tst);
  handler = tst_delete((unsigned char *)StringValueCStr(item), tst);

  if(handler) {
    rb_hash_delete(rb_ivar_get(self, id_handler_map), item);
    return (VALUE)handler;
  } else {
    return Qnil;
  }
}
#handler_map ⇒ Object
# File 'ext/Classifier/classifier.c', line 218
VALUE Classifier_handler_map(VALUE self)
{
  VALUE hm;
  hm = rb_ivar_get(self, id_handler_map);
  return rb_hash_dup(hm);
}
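Since the C code returns a duplicate of the internal hash, changing the returned map should not affect what is registered. The sketch below models that behaviour in plain Ruby; `HandlerMapHolder` is a hypothetical class, not the extension itself.

```ruby
# Hypothetical holder that mirrors the copy-on-read behaviour of handler_map.
class HandlerMapHolder
  def initialize
    @handler_map = {}
  end

  def insert(uri, handler)
    @handler_map[uri] = handler
    nil
  end

  # Return a shallow copy so callers cannot change the internal map.
  def handler_map
    @handler_map.dup
  end
end

holder = HandlerMapHolder.new
holder.insert("/someuri", :sample_handler)
copy = holder.handler_map
copy.delete("/someuri")
p holder.handler_map.key?("/someuri")  # => true
```

The copy is shallow: the Hash is new, but the handler objects inside it are the same ones that were registered.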
#inspect ⇒ Object
# File 'ext/Classifier/classifier.c', line 231
VALUE Classifier_inspect(VALUE self)
{
  /* rb_intern returns an ID, which may be wider than int */
  ID id_inspect = rb_intern("inspect");
  return rb_funcall(rb_ivar_get(self, id_handler_map), id_inspect, 0);
}
#keys ⇒ Object
# File 'ext/Classifier/classifier.c', line 225
VALUE Classifier_keys(VALUE self)
{
  /* rb_intern returns an ID, which may be wider than int */
  ID id_keys = rb_intern("keys");
  return rb_funcall(rb_ivar_get(self, id_handler_map), id_keys, 0);
}
#resolve("/someuri") ⇒ Object
#resolve("/someuri/pathinfo") ⇒ Object
#resolve("/notfound/orhere") ⇒ nil
#resolve("/") ⇒ Object
#resolve("/path/from/root") ⇒ Object
Attempts to resolve either the whole URI or its longest registered prefix, returning the prefix (as script_name), the remaining path (as path_info), and the registered handler (usually an HttpHandler). If no handler is registered at any prefix of the URI, it returns [nil, nil, nil].
Because the resolver uses a trie, you can register a handler at any character in the URI, and it will be used as long as it is the longest matching prefix. So, if you registered handler #1 at “/something/lik” and #2 at “/something/like/that”, then a search for “/something/like” would give you #1. A search for “/something/like/that/too” would give you #2.
This is very powerful, since it means you can also attach handlers to parts of the ; (semicolon) separated path params, to any part of the path, to odd characters, anything really. It is also very efficient: a lookup takes only as long as the URI has characters.
A slight modification to the CGI 1.2 standard is made for handlers registered at “/”. CGI expects all CGI scripts to be at some script path, so it doesn’t really say anything about a script that handles the root. To make this work, the resolver detects that the matched handler is at “/”, returns that as script_name, and simply returns the full URI as path_info.
It expects strings with no embedded NUL (‘\0’) characters. Don’t try other string-like stuff yet.
# File 'ext/Classifier/classifier.c', line 165
VALUE Classifier_resolve(VALUE self, VALUE item)
{
  void *handler = NULL;
  int pref_len = 0;
  struct tst *tst = NULL;
  VALUE result;
  unsigned char *item_str = NULL;

  DATA_GET(self, struct tst, tst);
  item_str = (unsigned char *)StringValueCStr(item);

  /* pref_len receives the length of the longest matching prefix */
  handler = tst_search(item_str, tst, &pref_len);

  result = rb_ary_new();

  if(handler) {
    /* script_name: the matched prefix */
    rb_ary_push(result, rb_str_substr(item, 0, pref_len));

    /* path_info: the full URI for a handler at "/", otherwise the remainder */
    if(pref_len == 1 && item_str[0] == '/') {
      rb_ary_push(result, item);
    } else {
      rb_ary_push(result, rb_str_substr(item, pref_len, RSTRING_LEN(item)));
    }

    rb_ary_push(result, (VALUE)handler);
  } else {
    /* no match: three nils */
    rb_ary_push(result, Qnil);
    rb_ary_push(result, Qnil);
    rb_ary_push(result, Qnil);
  }

  return result;
}
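The return contract of resolve, including the special case for “/”, can be modelled in a few lines of plain Ruby. This is a sketch under assumptions: `ResolveModel` is a hypothetical class, and a linear scan over the registered keys stands in for the trie search, so it shows the results only, not the performance.

```ruby
# Hypothetical model of the resolve contract; a linear scan replaces the trie.
class ResolveModel
  def initialize
    @handlers = {}
  end

  def insert(uri, handler)
    @handlers[uri] = handler
    nil
  end

  # Returns [script_name, path_info, handler], or [nil, nil, nil] on a miss.
  def resolve(uri)
    prefix = @handlers.keys.select { |k| uri.start_with?(k) }.max_by(&:length)
    return [nil, nil, nil] unless prefix

    # A handler registered at "/" receives the full URI as path_info.
    path_info = prefix == "/" ? uri : uri[prefix.length..-1]
    [prefix, path_info, @handlers[prefix]]
  end
end

model = ResolveModel.new
model.insert("/", :root)
model.insert("/someuri", :sample)
p model.resolve("/someuri/pathinfo")  # => ["/someuri", "/pathinfo", :sample]
p model.resolve("/path/from/root")    # => ["/", "/path/from/root", :root]
```

With nothing registered at “/”, a URI such as “/notfound/orhere” matches no prefix and the model returns three nils, as the C code above does.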