Friday, January 14, 2011

Writing Ruby Extensions in C - Part 8, Strings

This is the eighth in my series of posts about writing Ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about Ruby catch and throw blocks. The seventh post talked about dealing with numbers. This post will talk about strings.

Dealing with Strings


It is fairly easy to convert C-style strings to Ruby String objects, and vice versa. There are a few functions to know about:
  • rb_str_new(c_str, length) - take the char * c_str pointer and a length in, and return a Ruby String object. Note that c_str does *not* have to be NUL-terminated; this is one way to deal with binary data
  • rb_str_new2(c_str) - take the NUL-terminated char * c_str pointer in, and return a Ruby String object
  • rb_str_dup(ruby_string_object) - take ruby_string_object in and return a copy
  • rb_str_plus(string_object_1, string_object_2) - concatenate string_object_1 and string_object_2 and return the result without modifying either object
  • rb_str_times(string_object_1, fixnum_object) - concatenate string_object_1 with itself fixnum_object number of times and return the result
  • rb_str_substr(string_object, begin, length) - return the substring of string_object starting at position begin and running for length characters. A negative begin counts back from the end of the string. If length is less than 0, then nil is returned. If begin is past the end of the string or before the beginning of the string, then nil is returned. Otherwise, this function returns up to length characters starting at begin, though the result may be cut short if there are not enough characters left in the string
  • rb_str_cat(string_object, c_str, length) - take the char * c_str pointer and length in, and concatenate onto the end of string_object
  • rb_str_cat2(string_object, c_str) - take the NUL-terminated char * c_str pointer in, and concatenate onto the end of string_object
  • rb_str_append(string_object_1, string_object_2) - concatenate string_object_2 onto string_object_1
  • rb_str_concat(string_object, ruby_object) - concatenate ruby_object onto string_object. If ruby_object is a FIXNUM between 0 and 255, then it is first converted to a character before concatenation. Otherwise it behaves exactly the same as rb_str_append
  • StringValueCStr(ruby_object) - take ruby_object in, attempt to convert it to a String, and return the NUL-terminated C-style char *
  • StringValue(ruby_object) - take ruby_object in and attempt to convert it to a String. This is a macro that updates ruby_object in place, so ruby_object must be a variable. Assuming the conversion is successful, the C char * pointer for the string is available via the macro RSTRING_PTR(return_value) and the length of the string is available via the macro RSTRING_LEN(return_value). This is useful to retrieve binary data out of a String object
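
The begin and length rules for rb_str_substr are the easiest part of this list to get wrong, so here is a minimal plain-C sketch of just the index arithmetic. It models the behaviour described above for single-byte strings. Note that substr_bounds is a hypothetical helper written for illustration, not a function in the Ruby C API, and it makes no claim about how Ruby implements this internally.

```c
#include <assert.h>

/*
 * Hypothetical helper (NOT part of the Ruby C API): models the index rules
 * described for rb_str_substr on a string of str_len single-byte characters.
 *
 * Returns 1 and fills *out_start / *out_len when a substring exists,
 * or 0 in the cases where rb_str_substr is described as returning nil.
 */
int substr_bounds(long str_len, long begin, long length,
                  long *out_start, long *out_len)
{
    assert(str_len >= 0);

    if (length < 0)
        return 0;               /* negative length: nil */
    if (begin > str_len)
        return 0;               /* start is past the end: nil */

    if (begin < 0) {
        begin += str_len;       /* negative begin counts from the end */
        if (begin < 0)
            return 0;           /* start is before the beginning: nil */
    }

    if (length > str_len - begin)
        length = str_len - begin;   /* cut short at the end of the string */

    *out_start = begin;
    *out_len = length;
    return 1;
}
```

For a 22-character string, substr_bounds(22, -2, 5, &start, &len) reports a start of 20 and a length of 2, while a begin of 23 or -23 reports that no substring exists.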

An example should make most of this clear:

    VALUE result, str2, substr;

    result = rb_str_new2("hello");
    // result is now "hello"
    str2 = rb_str_dup(result);
    // result is now "hello", str2 is now "hello"
    result = rb_str_plus(result, rb_str_new2(" there"));
    // result is now "hello there"
    result = rb_str_times(result, INT2FIX(2));
    // result is now "hello therehello there"
    substr = rb_str_substr(result, 0, 2);
    // result is now "hello therehello there", substr is "he"
    substr = rb_str_substr(result, -2, 2);
    // result is now "hello therehello there", substr is "re"
    substr = rb_str_substr(result, -2, 5);
    // result is now "hello therehello there", substr is "re"
    // (substring was cut short because the length goes past the end of the string)
    substr = rb_str_substr(result, 0, -1);
    // result is now "hello therehello there", substr is Qnil
    // (length is negative)
    substr = rb_str_substr(result, 23, 1);
    // result is now "hello therehello there", substr is Qnil
    // (requested start point after end of string)
    substr = rb_str_substr(result, -23, 1);
    // result is now "hello therehello there", substr is Qnil
    // (requested start point before beginning of string)
    rb_str_cat(result, "wow", 3);
    // result is now "hello therehello therewow"
    rb_str_cat2(result, "bob");
    // result is now "hello therehello therewowbob"
    rb_str_append(result, rb_str_new2("again"));
    // result is now "hello therehello therewowbobagain"
    rb_str_concat(result, INT2FIX(33));
    // result is now "hello therehello therewowbobagain!"
    fprintf(stderr, "Result is %s\n", StringValueCStr(result));
    // "Result is hello therehello therewowbobagain!" is printed to stderr
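
The last line above prints through a NUL-terminated char *, which suits text but not binary data, since anything that walks a C string stops at the first NUL byte. As a rough sketch of the alternative, the plain-C helpers below work from a pointer and an explicit length, which in an extension would come from RSTRING_PTR and RSTRING_LEN after a call to StringValue. Both write_bytes and length_up_to_nul are hypothetical names made up for this illustration; they are not part of the Ruby C API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: how many bytes a NUL-terminated consumer (for
 * example a "%s" format) would see in a buffer of len bytes. This is
 * everything before the first NUL byte, or len if there is none.
 */
size_t length_up_to_nul(const char *ptr, size_t len)
{
    const char *nul = memchr(ptr, '\0', len);
    return nul ? (size_t)(nul - ptr) : len;
}

/*
 * Hypothetical helper: write exactly len bytes to out, embedded NUL
 * bytes included. Returns the number of bytes actually written.
 */
size_t write_bytes(FILE *out, const char *ptr, size_t len)
{
    assert(out != NULL && ptr != NULL);
    return fwrite(ptr, 1, len, out);
}
```

With the five bytes 'a', 'b', NUL, 'c', 'd', length_up_to_nul reports 2 while write_bytes still emits all 5, which is the difference between treating the data as a C string and treating it as a buffer with a known length.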

Update: modified the code to fit in the pre box.
